WO2019169945A1 - Argonaute protein mutants and uses thereof - Google Patents

Argonaute protein mutants and uses thereof

Info

Publication number
WO2019169945A1
Authority
WO
WIPO (PCT)
Prior art keywords
mutant
target dna
sequence
dna
kit
Prior art date
Application number
PCT/CN2019/070253
Other languages
English (en)
French (fr)
Inventor
张建光
毛爱平
Original Assignee
北京贝瑞和康生物技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201811505553.7A external-priority patent/CN110229799B/zh
Application filed by 北京贝瑞和康生物技术有限公司
Priority to EP19763666.5A priority Critical patent/EP3763812A4/en
Priority to US16/978,428 priority patent/US12098367B2/en
Publication of WO2019169945A1 publication Critical patent/WO2019169945A1/zh

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the present invention relates to a mutant based on a wild-type Argonaute protein (Ago) that lacks DNA cleavage activity but has DNA binding activity, and to uses based on the protein mutant, particularly in enriching target DNA and constructing a sequencing library.
  • the invention also relates to a kit comprising the protein mutant.
  • Efficient enrichment of target region DNA can effectively reduce sequencing costs and increase sequencing depth.
  • for applications that usually require deep sequencing, such as somatic mutation detection, the performance of target region enrichment is a major factor in determining sensitivity and specificity [1].
  • the mainstream target region enrichment methods mainly include (1) multiplex primer amplification and (2) nucleic acid probe hybridization capture [2].
  • interactions between primers and sequence differences between target sequences (such as GC content or the ability to form secondary structures) can seriously affect the amplification efficiency, uniformity, and specificity of the target sequence. Therefore, as the target region increases, the design difficulty of multiplex primer amplification increases rapidly, and the efficiency of enrichment usually decreases accordingly.
  • the commonly used multiplex primer amplification method utilizes a facing-primer design, so both ends of the target fragment to be enriched need to be known sequences; target sequences whose end sequences may be unknown (such as gene fusion sequences) therefore cannot be enriched.
  • primer amplification requires simultaneous targeting of primer pairs at both ends of the template DNA fragment to allow amplification, so for highly fragmented DNA (such as free DNA), primer amplification has very limited utilization of template DNA.
  • a single-stranded nucleic acid probe 80-120 nt
  • a molecular marker such as biotin label
  • nucleic acid probe hybridization-based capture method despite overcoming the multiple primer amplification
  • CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
  • Cas9 protein: under the guidance of an RNA, the Cas protein encoded by the Cas gene can bind a target dsDNA sequence in a targeted manner and cleave that sequence.
  • mutation of certain specific functional sites of the wild-type Cas protein abolishes its cleavage activity on the target DNA while retaining its sgRNA-guided target DNA binding activity [6].
  • the Cas9 protein mutant (dCas9) thus obtained is capable of capturing target DNA quickly and efficiently [7,8].
  • however, the capture of target DNA by dCas9 still has the following disadvantages: (1) the recognition sequence of dCas9 must have, at its 3' end, a protospacer adjacent motif (PAM) of three bases, usually NGG (N stands for any base), so the target DNA that dCas9 can capture cannot be an arbitrary sequence [5,6];
  • (2) the guide RNA required for dCas9 is usually close to 100 nucleotides long, and such long RNA sequences are difficult to synthesize [5,6];
  • producing the guide RNA required for dCas9 by plasmid expression or in vitro transcription is time-consuming and complicated, and also causes problems of unstable expression and contamination; in addition, RNA readily forms secondary structures that lead to failure;
  • dCas9 has a severe off-target effect, because the specificity of its target-site recognition depends on pairing of the gRNA with the 10-12 bases nearest the PAM, whereas mismatches in the remaining 8-10 bases distal to the PAM have little effect on recognition; this greatly affects the capture efficiency of dCas9 for target DNA.
  • the present invention provides an isolated Argonaute (Ago) protein mutant that has DNA binding activity but lacks DNA cleavage activity and can therefore be used for simple, efficient, and accurate target DNA enrichment. This addresses the problems of enriching target DNA with existing techniques (especially nucleic acid probe hybridization capture and dCas9-based capture): a limited range of target DNA, long turnaround time, complicated operation, low efficiency, and severe off-target effects.
  • the present invention provides a mutant of an isolated Ago protein having DNA binding activity but lacking DNA cleavage activity.
  • Ago proteins are widely found in eukaryotic and prokaryotic organisms and are proteins that have ribonuclease action under the guidance of RNA or DNA.
  • Eukaryotic Ago proteins are key proteins of the RNA interference (RNAi) machinery, which exert specific cleavage functions by binding 5'-phosphorylated small RNAs 20-30 bases long [9].
  • RNAi RNA interference
  • Eukaryotic Ago proteins are capable of forming an RNA-induced silencing complex (RISC) [9,10] with a series of accessory proteins, which induces post-transcriptional gene silencing by destabilizing mRNA or by translational repression. The complex plays an important role in various biological activities such as embryonic development, cell differentiation, stem cell maintenance, and transposon silencing.
  • RISC RNA-induced silencing complex
  • prokaryotic Ago proteins can also use small RNA or DNA as a leader sequence to specifically cleave RNA or DNA [9,10].
  • Ago proteins are all multidomain proteins, comprising the N-terminal domain, the PAZ domain, the MID domain, and the PIWI domain [9].
  • the Ago protein of prokaryotes is a bilobal structure in which the MID domain and the PIWI domain form one leaf, while the N-terminal domain and the PAZ domain form another leaf.
  • the PAZ domain binds to the 3' end of the leader sequence
  • the MID domain is used to recognize the 5' end of the leader sequence
  • owing to its RNase H-like fold, the PIWI domain can perform an RNase H-like endonuclease function to cleave target DNA [9].
  • the catalytic site responsible for RNase H enzymatic activity includes an aspartate-aspartate-histidine/lysine motif that binds a divalent metal ion, together with a glutamic acid (E) located in a structural subdomain.
  • E glutamic acid
  • the term "mutant of Ago protein” or “dAgo” is used interchangeably to refer to an Ago protein obtained by mutation that has DNA binding activity but lacks DNA cleavage activity.
  • the Ago protein is derived from a prokaryote, such as from a bacterium or an archaea.
  • the bacteria include, for example, the genus Marinitoga, Thermotoga, Rhodobacter, and Aquifex.
  • archaea include, for example, Pyrococcus, Methanocaldococcus, Thermus, Archaeoglobus.
  • the Ago protein is derived from a prokaryote selected from the group consisting of Pyrococcus furiosus, Thermus thermophilus, Methanocaldococcus jannaschii, Marinitoga piezophila, Thermotoga profunda, Rhodobacter sphaeroides, Aquifex aeolicus, and Archaeoglobus fulgidus.
  • amino acid sequence of the Ago protein is selected from the group consisting of SEQ ID NOs: 1-8.
  • mutation refers to a change in a given amino acid residue in a protein, such as an insertion, deletion or substitution of an amino acid.
  • “Deletion” refers to the absence of one or more amino acids in a protein.
  • “Insertion” refers to the addition of one or more amino acids to a protein.
  • “Substitution” refers to the replacement of one or more amino acids in the protein by other amino acid residues. Methods for mutating proteins are known in the art; for example, the coding sequence of a protein can be mutated by site-directed mutagenesis.
  • the Ago protein mutant has a mutation in the PIWI domain that results in a loss of DNA cleavage activity.
  • the mutation comprises a mutation at one or more of the following positions:
  • substitution means that the corresponding amino acid is substituted with alanine or glutamic acid.
  • positionally equivalent amino acid refers to an amino acid residue in a sequence corresponding to a given position of a reference sequence when the two sequences are optimally aligned.
  • the reference sequence may be, for example, SEQ ID NO: 1.
  • in SEQ ID NO: 2, the positions corresponding to amino acid residue positions 558, 596, 628 and 745 of SEQ ID NO: 1 are amino acid residues 478, 512, 546 and 660, respectively; and amino acid residues 628-770 of SEQ ID NO: 1 correspond to amino acid residues 546-685.
  • in SEQ ID NO: 3, the positions corresponding to amino acid residue positions 558, 596, 628 and 745 of SEQ ID NO: 1 are amino acid residues 504, 541, 570 and 688, respectively; and amino acid residues 628-770 of SEQ ID NO: 1 correspond to amino acid residues 570-713.
  • the positions corresponding to amino acid residue positions 558, 596, 628 and 745 of SEQ ID NO: 1 are amino acid residues 446, 482, 516 and 624, respectively; and amino acid residues 628-770 of SEQ ID NO: 1 correspond to amino acid residues 516-639.
  • the positions corresponding to amino acid residue positions 558, 596, 628 and 745 of SEQ ID NO: 1 are amino acid residues 439, 475, 509 and 617, respectively; and amino acid residues 628-770 of SEQ ID NO: 1 correspond to amino acid residues 509-637.
  • the position corresponding to amino acid residue position 628 of SEQ ID NO: 1 is the amino acid residue at position 554; and the position of amino acid residues 628-770 of SEQ ID NO: 1 is equivalent.
  • the positions corresponding to amino acid residue positions 558, 596, 628 and 745 of SEQ ID NO: 1 are amino acid residues 502, 464, 571 and 683, respectively; and amino acid residues 628-770 of SEQ ID NO: 1 correspond to amino acid residues 571-706.
  • the positions corresponding to amino acid residues 558 and 628 of SEQ ID NO: 1 are amino acid residues 174 and 205, respectively; and amino acid residues 628-770 of SEQ ID NO: 1 correspond to amino acid residues 205-427.
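  • the idea of a positionally equivalent residue can be illustrated with a short sketch. It assumes that the two sequences have already been aligned by an external tool and are supplied as gapped strings of equal length; the function name `equivalent_position` and the toy sequences are hypothetical and are not taken from the patent.

```python
from typing import Optional


def equivalent_position(aligned_ref: str, aligned_other: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based residue position in the reference to the other sequence.

    Both arguments are rows of one pairwise alignment, with '-' marking gaps.
    Returns None when the reference residue is aligned to a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("alignment rows must have the same length")
    ref_index = other_index = 0
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != "-":
            ref_index += 1
        if other_char != "-":
            other_index += 1
        if ref_char != "-" and ref_index == ref_pos:
            return other_index if other_char != "-" else None
    raise ValueError("ref_pos lies outside the reference sequence")


# Toy alignment (hypothetical sequences): the reference has two extra residues.
ref_row = "MKTDLE"
other_row = "M--DLE"
print(equivalent_position(ref_row, other_row, 4))  # residue D: reference 4 -> other 2
```

    With real sequences, the same lookup would be applied to the alignment of SEQ ID NO: 1 against another Ago sequence to find the residue matching, for instance, position 558.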
  • the Ago protein mutant may also include a mutation in the following domain: an N-terminal domain, a PAZ domain.
  • the mutation of the Ago protein mutant in the N-terminal domain and/or the PAZ domain may be a functionally conserved mutation or a mutation that does not affect Ago protein binding activity.
  • the term "functionally conserved mutation” refers to a mutation that does not alter the overall structure and function of the protein.
  • examples of conservative mutations include mutating a non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine to another non-polar residue; mutating a polar (hydrophilic) residue to another polar residue, such as between arginine and lysine, between glutamine and asparagine, or between glycine and serine; mutating a basic residue such as lysine, arginine or histidine to another basic residue; or mutating an acidic residue such as aspartic acid or glutamic acid to another acidic residue.
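  • as a rough illustration, the residue groups named above can be encoded as sets and used to test whether a substitution falls within one of them. This is only a sketch: the groups are limited to the residues listed in the text, and the function name `is_conservative` is hypothetical.

```python
# Residue groups named in the text (one-letter codes); not an exhaustive classification.
NONPOLAR = frozenset("IVLM")   # isoleucine, valine, leucine, methionine
BASIC = frozenset("KRH")       # lysine, arginine, histidine
ACIDIC = frozenset("DE")       # aspartic acid, glutamic acid
POLAR_PAIRS = (frozenset("RK"), frozenset("QN"), frozenset("GS"))


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the substitution stays within one group listed in the text."""
    if original == replacement:
        return False  # identical residues are not a mutation
    pair = {original, replacement}
    if any(pair <= group for group in (NONPOLAR, BASIC, ACIDIC)):
        return True
    return any(pair == polar_pair for polar_pair in POLAR_PAIRS)


print(is_conservative("I", "V"))  # both non-polar
print(is_conservative("D", "K"))  # acidic to basic
```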
  • the Ago protein mutant carries a specific marker, preferably a biotin marker.
  • the invention provides a method of enriching target DNA, comprising the steps of:
  • step (b) further comprises the following steps:
  • the leader sequence is designed for a specific sequence in the target DNA.
  • specific sequence refers to the specificity of the sequence relative to the DNA of interest, such specificity that the leader sequence designed for it can bind to the sequence without binding to other nucleotide sequences.
  • Methods for designing a leader sequence are known to those skilled in the art. For example, after human genomic repeat sequences are removed from the target DNA, specific regions are selected at a fixed interval (e.g., every 80 nucleotides), and the corresponding leader sequences are then designed according to the principle of complementary base pairing.
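  • the design procedure described above can be sketched as follows. The sketch assumes that repeats have been soft-masked upstream (lowercase letters) and simply skips any window that is not pure uppercase A/C/G/T; the function names, the 21-nt length default and the toy target are illustrative assumptions, not the patent's own implementation.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_leaders(target: str, leader_len: int = 21, interval: int = 80) -> List[Tuple[int, str]]:
    """Pick one window every `interval` nt and return (start, leader) pairs.

    Windows containing anything other than uppercase A/C/G/T (for example
    soft-masked repeats in lowercase, or N) are skipped.
    """
    leaders = []
    for start in range(0, len(target) - leader_len + 1, interval):
        window = target[start:start + leader_len]
        if set(window) <= set("ACGT"):
            # The leader is complementary to the selected window.
            leaders.append((start, reverse_complement(window)))
    return leaders


# Hypothetical 200-nt target with a soft-masked (lowercase) repeat at 80-119.
target = "ACGT" * 20 + "acgt" * 10 + "ACGT" * 20
for start, leader in design_leaders(target):
    print(start, leader)
```

    In this toy run the window starting at position 80 lies in the masked repeat and is skipped, so leaders are produced only for the windows at 0 and 160.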
  • the leader sequence is RNA or DNA. More preferably, the leader sequence is single stranded RNA (ssRNA) or single stranded DNA (ssDNA).
  • ssRNA single stranded RNA
  • ssDNA single stranded DNA
  • the leader sequence comprises a nucleotide modification, such as 5' phosphorylation, 5' hydroxylation.
  • the leader sequence comprises a 5' phosphorylation modification.
  • the leader sequence is 15-25 nucleotides in length, preferably 18-23 nucleotides, and most preferably 21 nucleotides.
  • the length of the leader sequence affects its efficiency in binding to dAgo. In particular, too short a leader sequence will affect the specificity of binding, and too long may result in the formation of an RNA secondary structure (in the case where the leader sequence is RNA) or cause difficulty in synthesis.
  • the leader sequence is substantially complementary to a specific sequence in the target DNA. In certain embodiments, the leader sequence has a mismatch of no more than 2 bases to the target DNA.
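  • the length range (15-25 nucleotides) and the mismatch limit (no more than 2 bases) stated above lend themselves to a simple check. The following is a minimal sketch under the assumption that leader and target site have the same length and pair along their whole length; the function names are hypothetical, and the target site is constructed from the leader purely for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(leader: str, target_site: str) -> int:
    """Count positions at which the leader fails to pair with the target site."""
    if len(leader) != len(target_site):
        raise ValueError("leader and target site must have the same length")
    perfect_leader = reverse_complement(target_site)
    return sum(a != b for a, b in zip(leader, perfect_leader))


def leader_is_acceptable(leader: str, target_site: str,
                         min_len: int = 15, max_len: int = 25,
                         max_mismatches: int = 2) -> bool:
    """Apply the length range and mismatch limit given in the text."""
    if not min_len <= len(leader) <= max_len:
        return False
    return count_mismatches(leader, target_site) <= max_mismatches


leader = "CTCCCAACCAAGCTCTCTTG"      # EGFR_E18_gD1 (SEQ ID NO: 9) from the examples
site = reverse_complement(leader)    # perfectly matching site, built for illustration
print(leader_is_acceptable(leader, site))
```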
  • the dAgo, the leader sequence and the target DNA binding are carried out at a temperature of 85-95 °C.
  • the binding of dAgo to the leader sequence is carried out at a temperature of about 93-95 °C and the binding to the target DNA is carried out at a temperature of about 85-87 °C.
  • the dAgo carries a specific marker, including but not limited to: a biotin tag, an S-Tag tag.
  • a specific marker is a biotin marker.
  • the capture medium includes, but is not limited to, magnetic beads, agarose beads (such as Sepharose or Agarose), preferably magnetic beads. Further, the capture medium carries a capture marker capable of binding to a specific tag carried by the dAgo, including but not limited to: a streptavidin marker, an S-Protein marker. Preferably, the capture medium carries a streptavidin marker.
  • the capture medium binds to the specific marker carried by dAgo through the capture label carried thereby, thereby capturing the dAgo-guide sequence-target DNA ternary complex.
  • the method of capture is known in the art, for example, incubating a biotin-labeled Ago protein with magnetic particles carrying streptavidin under appropriate conditions so that the biotin label binds to the streptavidin, thereby capturing the target DNA.
  • Those skilled in the art can adjust the specific conditions of capture, such as capture temperature, capture time, etc., depending on the particular experimental needs.
  • a method of isolating the target DNA from a captured dAgo-guide sequence-target DNA ternary complex is also known in the art, for example, incubating the magnetic beads that have captured the ternary complex under appropriate conditions to inactivate the streptavidin and release the bound ternary complex, then removing the bound protein with proteinase K so that the target DNA is separated from the ternary complex.
  • the present invention provides a method of constructing a sequencing library of a target DNA, comprising the following steps:
  • the present invention also provides a method of constructing a sequencing library of target DNA, comprising the following steps:
  • the enriched target DNA may remain on the capture medium, i.e., the target DNA is not isolated from the capture medium.
  • the enriched target DNA is a target DNA isolated from a capture medium.
  • the method of the invention may further comprise a pre-amplification step prior to the enrichment step.
  • the sequencing linker is a sequencing linker that matches a sequencing platform.
  • the specific conditions of the ligation reaction such as temperature and reaction time, etc., can be adjusted by a person skilled in the art according to the circumstances.
  • the primers used in the amplification step are universal primers.
  • the term "universal primer” refers to a primer pair that is capable of complementing the sequence at both ends of the sequencing linker and capable of amplifying the correct ligation product.
  • the invention also provides a kit for performing the method according to the invention comprising: dAgo, a leader sequence and a capture medium.
  • the leader sequence is RNA or DNA. More preferably, the leader sequence is single stranded RNA (ssRNA) or single stranded DNA (ssDNA).
  • the leader sequence comprises a nucleotide modification, such as 5' phosphorylation, 5' hydroxylation.
  • the leader sequence comprises a 5' phosphorylation modification.
  • the leader sequence is 15-25 nucleotides in length, preferably 18-23 nucleotides, and most preferably 21 nucleotides.
  • the length of the leader sequence affects its efficiency in binding to dAgo. In particular, too short a leader sequence will affect the specificity of binding, and too long may result in the formation of an RNA secondary structure (in the case where the leader sequence is RNA), or lead to difficulty in synthesis.
  • the leader sequence is substantially complementary to the target DNA. In certain embodiments, the leader sequence has a mismatch of no more than 2 bases to the target DNA.
  • the dAgo carries a specific marker, including but not limited to: a biotin tag, an S-Tag tag.
  • a specific marker is a biotin marker.
  • the capture medium includes, but is not limited to, magnetic beads, agarose beads (such as Sepharose or Agarose), preferably magnetic beads. Further, the capture medium carries a capture marker capable of binding to a specific tag carried by the dAgo, including but not limited to: a streptavidin marker, an S-Protein marker. Preferably, the capture medium carries a streptavidin marker.
  • the methods and kits of the present invention have the following advantages over prior art nucleic acid probe capture methods and dCas9 capture methods:
  • the conventional nucleic acid probe capture method relies on a hybridization reaction and requires a reaction time of up to 4 hours or even overnight.
  • the enrichment method of the present invention requires a relatively short time, generally 30-60 min.
  • the enrichment method of the present invention uses high-temperature washing to increase specificity and to reduce the number of washes, thereby avoiding loss of target DNA. The binding of the dAgo of the present invention to the leader sequence therefore allows rapid selection and binding of the target DNA, avoids the long turnaround time and complicated operation caused by hybridizing a single-stranded nucleic acid probe directly with the target DNA, and also avoids the errors that prolonged hybridization introduces into the target DNA, reducing the loss of target DNA.
  • the guide sequence of the present invention is designed for a specific sequence in the target DNA and is short (not more than 25 bases). It is therefore easy to synthesize and places fewer sequence requirements on the target DNA, so that the desired target fragments can be enriched to a greater extent, which increases detection efficiency.
  • the method for enriching target DNA according to the present invention is simple to operate, easy to control in quality and cost, and can be flexibly adjusted; it is particularly suitable for enriching highly fragmented DNA (for example, cfDNA or severely degraded DNA from FFPE samples).
  • Figure 1 Schematic diagram illustrating the method of enriching target DNA according to the present invention.
  • Figure 2 Amino acid sequence of the Ago protein (PfAgo) of Pyrococcus furiosus, SEQ ID NO: 1, in which the PIWI domain (amino acid residues 473-756) is underlined.
  • Figure 3 Amino acid sequence of the Ago protein (TtAgo) of Thermus thermophilus, SEQ ID NO: 2, in which the PIWI domain (amino acid residues 507-671) is underlined.
  • Figure 4 Amino acid sequence of the Ago protein (MjAgo) of Methanocaldococcus jannaschii, SEQ ID NO: 3, in which the PIWI domain (amino acid residues 426-699) is underlined.
  • Figure 5 Amino acid sequence of the Ago protein (MpAgo) of Marinitoga piezophila, SEQ ID NO: 4, in which the PIWI domain (amino acid residues 394-634) is underlined.
  • Figure 6 Amino acid sequence of the Ago protein (TpAgo) of Thermotoga profunda, SEQ ID NO: 5, in which the PIWI domain (amino acid residues 431-620) is underlined.
  • Figure 7 Amino acid sequence of the Ago protein (RsAgo) of Rhodobacter sphaeroides, SEQ ID NO: 6, in which the PIWI domain (amino acid residues 445-757) is underlined.
  • Figure 8 Amino acid sequence of the Ago protein (AaAgo) of Aquifex aeolicus, SEQ ID NO: 7, in which the PIWI domain (amino acid residues 419-694) is underlined.
  • Figure 9 Amino acid sequence of the Ago protein (AfAgo) of Archaeoglobus fulgidus, SEQ ID NO: 8, in which the PIWI domain (amino acid residues 110-406) is underlined.
  • Figure 10 Amino acid sequence alignment of the DEDX catalytic regions of the PIWI domain of hAGO2 (GenBank Gene ID: 27161), TtAgo, MjAgo, PfAgo, MpAgo, TpAgo, AaAgo, AfAgo and RsAgo.
  • the DEDX catalytic regions shown are amino acid residues 553-563/591-600/623-631/740-750 of SEQ ID NO: 1, and 473-483/511 of SEQ ID NO: 2, respectively.
  • Figure 11 Sequencing results of plasmids pPFA-1.1, pPFA-1.2, pPFA-1.3, pPFA-1.4 and pPFA-1.5.
  • Figure 12 Results of mass analysis of target DNA enriched according to the method of Example 2.
  • Figure 13 Representative sequencing results of sequencing libraries prepared according to the methods of Examples 3 and 4.
  • Step 1 Construct an expression vector
  • the biotin acceptor sequence was fused to the N-terminus of the known amino acid sequence of the Pyrococcus furiosus Ago protein (PfAgo) (SEQ ID NO: 1), and a nucleotide sequence codon-optimized for E. coli was designed and synthesized accordingly.
  • the nucleotide sequences of 6x His-Tag, PfAgo-BAS, IRES and BirA (E. coli biotin ligase) were cloned in that order into the pET-28a vector carrying the kanamycin resistance gene to obtain the vector pPFA-1.0.
  • Site-directed mutagenesis of pPFA-1.0 was performed using the Q5 Site-Directed Mutagenesis Kit (NEB, Cat# E0554S) according to the protocol.
  • the DNA obtained after the mutation was transformed into E. coli DH5α cells, which were cultured overnight at 37 °C on LB agar medium containing kanamycin.
  • Ten colonies of each mutation were selected and cultured in 4 mL of LB liquid medium containing kanamycin for 12-16 hours at 37 °C, and then 2 mL of the bacterial culture was taken to extract the plasmid using a Plasmid Mini Kit (Qiagen, Cat #27104).
  • the extracted plasmid was amplified using the universal primers on the plasmid (T7 promoter primer 5'-TAATACGACTCACTATAGGG-3' and T7 terminator primer 5'-GCTAGTTATTGCTCAGCGG-3', synthesized by IDT), and the amplified product was then sequenced (Beijing Ruibo Xingke Biotechnology Co., Ltd.). The sequencing results are shown in Figure 11.
  • the five plasmids confirmed to be mutated in the above step 2 were separately transformed into E. coli BL21 (DE3) cells.
  • the transformed cells were cultured overnight at 37 °C in LB medium containing 50 ug/mL kanamycin, then transferred to fresh LB medium and grown further until the OD600 reached 0.4-0.8.
  • IPTG was added to a final concentration of 500 uM and incubation was continued at 37 °C for 3-5 hours.
  • the culture was centrifuged at 6,000 g for 15 minutes, and the supernatant was removed. The resulting pellet was resuspended in Cell Lysis I (20 mM Tris pH 8.0, 1 M NaCl, 2 mM MnCl2) and sonicated. The lysate was centrifuged at 20,000 g for 30 minutes at 4 °C, and the supernatant was collected. The supernatant was purified on a nickel column at 4 °C, and the purified product was then desalted and concentrated with a protein ultrafiltration column (Pierce Protein Concentrators PES, 30K MWCO, Thermo Fisher Scientific) according to the protocol. The concentrated product is the expressed biotinylated PfAgo protein mutant. The expressed PfAgo protein mutant was mixed with an equal volume of glycerol and stored at -20 °C.
  • Example 2 Enrichment of target DNA according to the method of the present invention
  • the target DNA in this example is an exon 18-21 fragment of the EGFR gene derived from free DNA in plasma samples and genomic DNA in leukocytes isolated from normal human peripheral blood, respectively.
  • genomic DNA: 200 uL of leukocytes isolated from human peripheral blood were taken, and genomic DNA was extracted using a MagJET Whole Blood gDNA Kit (ThermoFisher, Cat# K2741) according to the kit instructions. Approximately 500 ng (30 uL) of the extracted genomic DNA was fragmented by sonication (Bioruptor Pico sonicator from Diagenode SA).
  • guide DNAs (gDNA) with a 5' phosphorylation modification were designed and synthesized based on the sequences of EGFR exons 18, 19, 20 and 21; the sequences are as follows:
  • gDNA name | gDNA sequence (5'-3') | SEQ ID NO
    EGFR_E18_gD1 | CTCCCAACCAAGCTCTCTTG | 9
    EGFR_E19_gD1 | TAGGGACTCTGGATCCCAGA | 10
    EGFR_E20_gD2 | TGAGGCAGATGCCCAGCAGG | 11
    EGFR_E21_gD1 | TCTGTGATCTTGACATGCTG | 12
  • Step 3 gDNA binds to the PfAgo protein mutant to form a binary complex.
  • Reagent name | Volume
    Buffer DA1 (2x) | 10 uL
    PfAgo protein mutant (5 uM) | 0.5 uL
    gDNA mixture (1 uM) | 5 uL
    ddH2O | 4.5 uL
  • the above reaction system was incubated at 95 °C for 10 minutes.
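  • the final concentrations implied by the Step 3 reaction table follow from simple dilution arithmetic (stock concentration × volume / total volume). The sketch below only restates the volumes and stock concentrations listed in the table; the variable and function names are illustrative.

```python
# Components from the Step 3 table: (name, stock concentration, unit, volume in uL).
components = [
    ("Buffer DA1", 2.0, "x", 10.0),
    ("PfAgo protein mutant", 5.0, "uM", 0.5),
    ("gDNA mixture", 1.0, "uM", 5.0),
    ("ddH2O", None, "", 4.5),  # diluent, no concentration of its own
]


def final_concentrations(parts):
    """Return the total volume and each component's concentration after mixing."""
    total = sum(volume for _, _, _, volume in parts)
    finals = {
        name: (stock * volume / total, unit)
        for name, stock, unit, volume in parts
        if stock is not None
    }
    return total, finals


total_volume, finals = final_concentrations(components)
print(f"total volume: {total_volume} uL")
for name, (value, unit) in finals.items():
    print(f"{name}: {value} {unit}")
```

    Under these numbers the 20 uL reaction contains 1x buffer, 0.125 uM PfAgo mutant and 0.25 uM gDNA mixture, i.e. a twofold molar excess of guide over protein.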
  • Step 4 The binary complex binds to the target DNA to form a ternary complex.
  • Step 5 Capture the ternary complex.
  • Step 6 Isolation of enriched target DNA
  • Buffer DA1 (1x) and 1 uL proteinase K (20 ug/uL) were added to the Dynabeads and incubated at 55 °C for 15 minutes. The sample was then cooled on ice, 2 volumes of Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat #A63880) were added, and the mixture was incubated for 10 minutes at room temperature. The beads were then collected on a magnet, the supernatant was removed, and the beads were washed twice with 80% ethanol; finally, the DNA was dissolved in 25 uL Tris solution (20 mM, pH 8.5).
  • the concentration of the purified DNA was measured on a Qubit 3 Fluorometer (ThermoFisher, Cat# Q33216) using Qubit dsDNA HS reagent (ThermoFisher, Cat# Q3323), while DNA purity was assessed by capillary electrophoresis (Agilent 2100 Bioanalyzer Instrument, Cat# G2939BA). Representative results are shown in Fig. 12.
  • the enriched target DNA has a length of about 200-1000 bp, a concentration of 61.5 pg/µl, and a molar concentration of 275.8 pmol/l; it is of good quality and meets the requirements for preparing a sequencing library.
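  • the mass and molar concentrations reported above can be cross-checked against the stated fragment length range. The sketch below assumes an average mass of 660 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the source; the function name is illustrative.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def mean_fragment_length_bp(mass_pg_per_ul: float, molar_pmol_per_l: float) -> float:
    """Estimate mean dsDNA fragment length from mass and molar concentration."""
    grams_per_litre = mass_pg_per_ul * 1e-6    # 1 pg/uL equals 1e-6 g/L
    mol_per_litre = molar_pmol_per_l * 1e-12   # pmol/L to mol/L
    grams_per_mol = grams_per_litre / mol_per_litre
    return grams_per_mol / AVG_BP_MASS_G_PER_MOL


length_bp = mean_fragment_length_bp(61.5, 275.8)
print(round(length_bp))  # about 338 bp
```

    An estimated mean of roughly 340 bp lies within the reported 200-1000 bp range, so the two concentration figures are mutually consistent under this assumption.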
  • Example 3 Construction of a sequencing library of target DNA according to the method of the present invention
  • Step 2 Connect the sequencing connector
  • the free DNA was end-repaired and A-tailed using a KAPA Hyper Prep Kit (Kapa Biosystems, Cat# KK8501) according to the kit protocol, and then ligated to a TruSeq adaptor suitable for the Illumina sequencing platform.
  • Step 3 Pre-amplification of the ligation product
  • the preamplification product was purified using 200 uL of Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat #A63880) according to the manufacturer's instructions.
  • the purified product was dissolved in 30 uL of Buffer DA1 (1x) (15 mM Tris pH 8.0, 0.5 mM MnCl 2 , 250 mM NaCl).
  • Step 4 Enrich the target DNA
  • gDNA name | gDNA sequence (5'-3') | SEQ ID NO
    EGFR_E18_gD1 | CTCCCAACCAAGCTCTCTTG | 9
    EGFR_E19_gD1 | TAGGGACTCTGGATCCCAGA | 10
    EGFR_E20_gD2 | TGAGGCAGATGCCCAGCAGG | 11
    EGFR_E21_gD1 | TCTGTGATCTTGACATGCTG | 12
  • Buffer DA1 (2x): 30 mM Tris pH 8.0, 1.0 mM MnCl2, 500 mM NaCl.
  • the above reaction system was incubated at 95 °C for 10 minutes.
  • Step 5 Amplify the enriched target DNA
  • Reagent name | Volume
    NEB Ultra II Q5 Master Mix (2x) | 25 uL
    P5/P7 universal primer mixture (20 uM each) | 2.5 uL
    Deionized water | 22.5 uL
  • Step 6 Purify the amplified target DNA
  • Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat# A63880) were added to the amplification product obtained in the above step 5, incubated at room temperature for 5 minutes, and then washed twice with 200 µl of 80% ethanol. After drying at room temperature, 30 µl of Buffer EB was added, and after standing for 5 min, the supernatant was collected. The resulting supernatant is the target DNA sequencing library that has been enriched and purified.
  • Example 4 Construction of a sequencing library of target DNA according to the method of the present invention
  • the enriched target DNA obtained according to step 6 of Example 2 was subjected to end-repair and A-tailing using the KAPA Hyper Prep Kit (Kapa Biosystems, Cat# KK8501) according to the kit instructions (the enriched target DNA bound to Dynabeads obtained in step 5 of Example 2 may also be used), and was then ligated to the TruSeq linker suitable for the Illumina sequencing platform to obtain the ligation product.
  • Reagent name | Volume
    NEB Ultra II Q5 Master Mix (2x) | 25 uL
    P5/P7 universal primer mixture (20 uM each) | 2.5 uL
    Deionized water | 22.5 uL
  • Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat# A63880) were added to the amplification product, incubated at room temperature for 5 minutes, and then washed twice with 200 µl of 80% ethanol. After drying at room temperature, 30 µl of Buffer EB was added, and after standing for 5 min, the supernatant was collected. The resulting supernatant is the target DNA sequencing library that has been enriched and purified.
  • the sequencing libraries obtained in Examples 3 and 4 were quantified using KAPA Library Quantification Kits (KAPA Biosciences, Cat# KK4835) according to the kit instructions on a StepOne Plus Real-Time PCR System (ThermoFisher, Cat#4376592).
  • the effective concentration of the sequencing library for quantitative detection is not less than 1 nM.
  • the Ago protein mutant of the present invention enriches the target DNA fragment in genomic DNA and free DNA by about 500-fold. Therefore, for genomic DNA and highly fragmented free DNA, the present invention utilizes Ago protein mutants to rapidly and efficiently enrich target DNA, thereby constructing a sequencing library that satisfies sequencing requirements.

Abstract

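The present invention relates to a mutant of an Argonaute protein that lacks DNA cleavage activity but retains DNA binding activity, wherein the mutation of the mutant is located in the PIWI domain. The invention also relates to uses of the protein mutant, in particular for enriching target DNA and constructing sequencing libraries. Accordingly, the invention further relates to a method for enriching target DNA, comprising the following steps: (a) designing a guide sequence against a specific sequence in the target DNA; (b) allowing the mutant according to the invention, the guide sequence and the target DNA to bind, to obtain a mutant-guide sequence-target DNA ternary complex; (c) capturing the mutant-guide sequence-target DNA ternary complex with a capture medium; (d) separating the target DNA from the captured ternary complex to obtain enriched target DNA.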

Description

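Argonaute protein mutant and use thereof
Technical Field
The present invention relates to a mutant based on a wild-type Argonaute protein (Ago) that lacks DNA cleavage activity but retains DNA binding activity, and to uses of the protein mutant, in particular for enriching target DNA and constructing sequencing libraries. The invention also relates to a kit comprising the protein mutant.
Background Art
Efficient enrichment of target-region DNA can effectively reduce sequencing cost and increase sequencing depth. For applications that usually require high-depth sequencing, such as somatic mutation detection, the performance of target-region enrichment is the main factor determining sensitivity and specificity [1].
At present, the mainstream target-region enrichment methods are (1) multiplex primer amplification and (2) nucleic acid probe hybridization capture [2]. (1) Target enrichment by multiplex primer amplification uses tens to thousands of primer pairs in a single polymerase-containing reaction to amplify the target sequences in the template DNA simultaneously, thereby enriching the target DNA. However, interactions between primers and sequence differences between targets (such as GC content or the ability to form secondary structures) seriously affect the efficiency, uniformity and specificity of amplification. As the target region grows, the design difficulty of multiplex amplification therefore increases rapidly, and enrichment efficiency usually falls accordingly. In addition, common multiplex amplification methods use opposing (face-to-face) primer designs, so both ends of the fragment to be enriched must be of known sequence; target sequences whose ends may be unknown (such as gene fusion sequences) cannot be enriched. Furthermore, amplification requires a primer pair that targets both ends of the template fragment, so for highly fragmented DNA (such as cell-free DNA) primer amplification makes very limited use of the template DNA. (2) Nucleic acid probe hybridization capture uses single-stranded nucleic acid probes (80-120 nt) carrying a molecular label (such as biotin), hybridizes them with the target DNA fragments in hybridization buffer at high temperature for an extended period (4-12 hours), and then enriches the target DNA by capturing the labeled probes hybridized to the DNA. The method requires stable, sustained reaction conditions and temperature, and the workflow is long and complicated. Studies have indicated that the reaction system can damage DNA during hybridization and thereby introduce mutations [3]. Moreover, the probes are generally long, which makes them difficult and costly to synthesize, and the target must contain a correspondingly long matching sequence in order to be enriched. Consequently, probe capture often performs poorly on shorter DNA (such as cell-free DNA) [4].
In summary, methods based on multiplex primer amplification struggle to enrich larger target regions effectively and cannot effectively enrich fusion-gene DNA; hybridization capture with nucleic acid probes overcomes many limitations of multiplex amplification but is complicated to perform, time-consuming, and inefficient at capturing short fragments.
In recent years, researchers have found that, compared with nucleic acid probe hybridization, certain programmable DNA-binding proteins bind target DNA faster and more specifically. For example, in the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-Cas (CRISPR-associated protein) system, the Cas protein encoded by a Cas gene can, guided by an RNA, bind specifically to a target dsDNA sequence and then cleave it. Researchers further found that mutating certain functional sites of a wild-type Cas protein (for example Cas9) abolishes its cleavage activity on target DNA while preserving its ability to bind target DNA as directed by the sgRNA guide [5,6]. The resulting Cas9 mutant (dCas9) can capture target DNA quickly and efficiently [7,8].
However, capturing target DNA with dCas9 still has the following disadvantages: (1) the recognition sequence of dCas9 must carry, at its 3' end, a protospacer adjacent motif (PAM), usually the three bases NGG (N being any base), so the target DNA that dCas9 can capture is not an arbitrary sequence [5,6]; (2) the guide RNA required by dCas9 is usually close to 100 nucleotides long, and such long RNA sequences are difficult to synthesize [5,6]; (3) producing the guide RNA by plasmid expression or in vitro transcription is time-consuming and complicated, brings unstable expression levels and contamination problems, and RNA readily forms secondary structures that render it ineffective; (4) dCas9 has severe off-target effects, because the specificity of target-site recognition depends on pairing between the gRNA and the 10-12 bp nearest the PAM, whereas mismatches in the remaining 8-10 bp distal to the PAM have little effect on recognition, which greatly impairs the efficiency with which dCas9 captures target DNA.
Therefore, a new method is needed that overcomes the above disadvantages of dCas9 and captures target DNA efficiently and accurately.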
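Summary of the Invention
The present invention provides an isolated Argonaute (Ago) protein mutant that has DNA binding activity but lacks DNA cleavage activity and can therefore be used for easy-to-perform, efficient and accurate enrichment of target DNA. It thereby addresses the limited target range, long duration, complicated operation, poor efficiency and severe off-target effects encountered when target DNA sequences are enriched with the prior art (in particular hybridization capture with nucleic acid probes and dCas9-based capture).
Accordingly, in a first aspect, the present invention provides an isolated mutant of an Ago protein that has DNA binding activity but lacks DNA cleavage activity.
Ago proteins are widespread in eukaryotes and prokaryotes and are proteins that act as ribonucleases under the guidance of RNA or DNA. Eukaryotic Ago proteins are key proteins of the RNA interference (RNAi) mechanism; they perform specific cleavage by binding 5'-phosphorylated small RNAs of 20-30 bases [9]. Eukaryotic Ago proteins can form the RNA-induced silencing complex (RISC) with a series of accessory proteins [9,10] and induce post-transcriptional gene silencing by destabilizing mRNA or by translational repression, thereby playing important roles in biological processes such as embryonic development, cell differentiation, stem cell maintenance and transposon silencing. Unlike eukaryotic Ago proteins, prokaryotic Ago proteins usually lack the accessory proteins needed to carry out RNAi [9]. Nevertheless, some prokaryotic Ago proteins can use small RNA or DNA as a guide sequence to cleave RNA or DNA specifically [9,10].
Ago proteins are all multi-domain proteins comprising an N-terminal domain, a PAZ domain, a MID domain and a PIWI domain [9]. Prokaryotic Ago proteins have a bilobed structure in which the MID and PIWI domains form one lobe and the N-terminal and PAZ domains form the other. In general, the PAZ domain binds the 3' end of the guide sequence and the MID domain recognizes its 5' end, while the PIWI domain, which has an RNase H-like fold, can act as an RNase H-like endonuclease to cleave the target DNA [9]. In the PIWI domain, the catalytic site responsible for the RNase H activity comprises an aspartate-aspartate-histidine/lysine motif that binds divalent metal ions, together with a glutamate (E) located in a structural subdomain called the "glutamate finger". These four amino acids and their neighboring sequences make up the DEDX region, a key feature of the PIWI domain of Ago proteins [9]. Although the overall sequences of Ago proteins differ considerably between species, the DEDX region of the PIWI domain is highly conserved (Figure 10) [9,11].
As used herein, the terms "mutant of an Ago protein" and "dAgo" are used interchangeably and refer to an Ago protein, obtained by mutation, that has DNA binding activity but lacks DNA cleavage activity. In the present invention, the Ago protein is derived from a prokaryote, for example a bacterium or an archaeon. Examples of bacteria include the genera Marinitoga, Thermotoga, Rhodobacter and Aquifex. Examples of archaea include the genera Pyrococcus, Methanocaldococcus, Thermus and Archaeoglobus.
In a specific embodiment, the Ago protein is derived from a prokaryote selected from: Pyrococcus furiosus, Thermus thermophilus, Methanocaldococcus jannaschii, Marinitoga piezophila, Thermotoga profunda, Rhodobacter sphaeroides, Aquifex aeolicus and Archaeoglobus fulgidus.
More preferably, the amino acid sequence of the Ago protein is selected from SEQ ID NO:1-8.
As used herein, the term "mutation" refers to a change of a given amino acid residue in a protein, for example an insertion, deletion or substitution of amino acids. "Deletion" refers to the absence of one or more amino acids from the protein. "Insertion" refers to the addition of one or more amino acids to the protein. "Substitution" refers to the replacement of one or more amino acids in the protein by another amino acid residue. Methods of mutating proteins are known in the art; for example, the corresponding coding sequence can be mutated by site-directed mutagenesis.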
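In one embodiment, the Ago protein mutant has a mutation in the PIWI domain that abolishes DNA cleavage activity. Preferably, the mutation comprises a mutation at one or more of the following positions:
- substitution of amino acid residues 558, 596, 628 and 745 of SEQ ID NO:1, or of amino acid residues at equivalent positions, or
- deletion of amino acids 628-770 of SEQ ID NO:1, or of amino acid residues at equivalent positions.
Preferably, the substitution replaces the corresponding amino acid with alanine or glutamic acid.
As used herein, the term "amino acid at an equivalent position" refers to the amino acid residue in a sequence that corresponds to a given position of a reference sequence when the two sequences are optimally aligned. Those skilled in the art know how to determine the amino acid position in one sequence that corresponds to a given position of a reference sequence. In the present invention, the reference sequence may be, for example, SEQ ID NO:1.
In SEQ ID NO:2, the residues equivalent to residues 558, 596, 628 and 745 of SEQ ID NO:1 are residues 478, 512, 546 and 660, respectively; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 546-685.
In SEQ ID NO:3, the residues equivalent to residues 558, 596, 628 and 745 of SEQ ID NO:1 are residues 504, 541, 570 and 688, respectively; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 570-713.
In SEQ ID NO:4, the residues equivalent to residues 558, 596, 628 and 745 of SEQ ID NO:1 are residues 446, 482, 516 and 624, respectively; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 516-639.
In SEQ ID NO:5, the residues equivalent to residues 558, 596, 628 and 745 of SEQ ID NO:1 are residues 439, 475, 509 and 617, respectively; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 509-637.
In SEQ ID NO:6, the residue equivalent to residue 628 of SEQ ID NO:1 is residue 554; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 554-777.
In SEQ ID NO:7, the residues equivalent to residues 558, 596, 628 and 745 of SEQ ID NO:1 are residues 502, 464, 571 and 683, respectively; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 571-706.
In SEQ ID NO:8, the residues equivalent to residues 558 and 628 of SEQ ID NO:1 are residues 174 and 205, respectively; the residues equivalent to residues 628-770 of SEQ ID NO:1 are residues 205-427.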
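The position correspondences listed above can be collected in a small lookup table. The following Python sketch only restates the numbers given in the text; the names `EQUIVALENT_POSITIONS` and `equivalent_position` are illustrative, and `None` marks positions for which the text gives no equivalent.

```python
# Point-mutation sites in the reference sequence SEQ ID NO:1 (PfAgo).
REFERENCE_POSITIONS = (558, 596, 628, 745)

# Equivalent residues in SEQ ID NO:1-8, in the same order as REFERENCE_POSITIONS.
# Values are copied from the text; None means "not stated in the text".
EQUIVALENT_POSITIONS = {
    1: (558, 596, 628, 745),
    2: (478, 512, 546, 660),
    3: (504, 541, 570, 688),
    4: (446, 482, 516, 624),
    5: (439, 475, 509, 617),
    6: (None, None, 554, None),
    7: (502, 464, 571, 683),
    8: (174, None, 205, None),
}


def equivalent_position(seq_id: int, reference_position: int):
    """Return the residue in SEQ ID NO:<seq_id> that the text lists as
    equivalent to `reference_position` of SEQ ID NO:1, or None if unlisted."""
    index = REFERENCE_POSITIONS.index(reference_position)
    return EQUIVALENT_POSITIONS[seq_id][index]


if __name__ == "__main__":
    for seq_id in sorted(EQUIVALENT_POSITIONS):
        print(seq_id, [equivalent_position(seq_id, p) for p in REFERENCE_POSITIONS])
```

A table like this only records the stated correspondences; deriving them independently would require aligning the actual sequences, which are not reproduced here.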
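Optionally, the Ago protein mutant may further comprise mutations in the following domains: the N-terminal domain and the PAZ domain. In this embodiment, the mutations in the N-terminal domain and/or PAZ domain may be functionally conservative mutations, or mutations that do not affect the binding activity of the Ago protein.
As used herein, the term "functionally conservative mutation" refers to a mutation that does not change the overall structure and function of the protein. Examples of conservative mutations include changing one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine to another non-polar residue; changing one polar (hydrophilic) residue to another polar residue, such as between arginine and lysine, between glutamine and asparagine, or between glycine and serine; changing one basic residue such as lysine, arginine or histidine to another basic residue; or changing one acidic residue such as aspartic acid or glutamic acid to another acidic residue.
In one embodiment, the Ago protein mutant carries a specific label, preferably a biotin label.
In a second aspect, the present invention provides a method for enriching target DNA, comprising the following steps:
(a) designing a guide sequence against a specific sequence in the target DNA;
(b) allowing the dAgo according to the invention, the guide sequence and the target DNA to bind, to obtain a dAgo-guide sequence-target DNA ternary complex;
(c) capturing the dAgo-guide sequence-target DNA ternary complex with a capture medium;
(d) separating the target DNA from the captured dAgo-guide sequence-target DNA ternary complex to obtain enriched target DNA.
In one embodiment, to increase the specificity and efficiency of binding between dAgo and the guide sequence, dAgo may first be bound to the guide sequence and then to the target DNA. In this embodiment, step (b) above therefore further comprises the following steps:
(b1) binding the dAgo according to the invention to the guide sequence to obtain a dAgo-guide sequence binary complex;
(b2) binding the dAgo-guide sequence binary complex to the target DNA sequence to obtain the dAgo-guide sequence-target DNA ternary complex.
In one embodiment, the guide sequence is designed against a specific sequence in the target DNA. As used herein, the term "specific sequence" means a sequence that is specific with respect to the target DNA, such that a guide sequence designed against it binds to that sequence and not to other nucleotide sequences. Methods of designing guide sequences are known to those skilled in the art; for example, after human genomic repeat sequences have been removed from the target DNA, a specific sequence is selected at a fixed spacing (for example, every 80 nucleotides), and the corresponding guide sequence is then designed according to the principle of complementary base pairing.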
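The design procedure described above (pick a site at a fixed spacing, then derive the guide by base complementarity) can be sketched in Python. This is a minimal illustration under stated assumptions, not the applicant's design tool: the function names are hypothetical, the default guide length of 21 nt is taken from the preferred length stated elsewhere in the text, and repeat masking is reduced to skipping windows that contain non-ACGT characters (for example bases already masked as N).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_guides(target: str, spacing: int = 80, guide_len: int = 21):
    """Tile `target` every `spacing` nt and return (start, guide) pairs.

    Each guide is the reverse complement of a `guide_len`-nt site, so it can
    base-pair with that site. Windows containing characters other than
    A/C/G/T (e.g. masked repeats) are skipped.
    """
    guides = []
    for start in range(0, len(target) - guide_len + 1, spacing):
        site = target[start:start + guide_len].upper()
        if set(site) <= set("ACGT"):
            guides.append((start, reverse_complement(site)))
    return guides


if __name__ == "__main__":
    demo_target = "ACGT" * 50  # hypothetical 200-nt sequence, for illustration only
    for start, guide in design_guides(demo_target):
        print(start, guide)
```

A real design would additionally check each candidate site for uniqueness against the genome, which this sketch does not attempt.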
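In one embodiment, the guide sequence is RNA or DNA. More preferably, the guide sequence is single-stranded RNA (ssRNA) or single-stranded DNA (ssDNA).
In one embodiment, the guide sequence comprises a nucleotide modification, for example 5' phosphorylation or 5' hydroxylation. Preferably, to improve the efficiency of binding between the guide sequence and dAgo, the guide sequence comprises a 5' phosphorylation modification.
In one embodiment, the guide sequence is 15-25 nucleotides long, preferably 18-23 nucleotides, and most preferably 21 nucleotides. The length of the guide sequence affects the efficiency of its binding to dAgo. Specifically, a guide sequence that is too short impairs binding specificity, whereas one that is too long may form RNA secondary structure (when the guide sequence is RNA) or be difficult to synthesize.
In one embodiment, the guide sequence is substantially complementary to the specific sequence in the target DNA. In certain embodiments, the guide sequence has no more than 2 base mismatches with the target DNA.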
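The length and mismatch limits stated above lend themselves to a simple check. The sketch below assumes that guide and target site are both written 5'->3' and have equal length; the function names and the decision to treat the limits as hard cut-offs are illustrative assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mismatches_to_site(guide: str, site: str) -> int:
    """Count positions at which `guide` fails to pair with `site`.

    Both sequences are written 5'->3', so a perfectly matching guide equals
    the reverse complement of the site.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    perfect_guide = site.upper().translate(_COMPLEMENT)[::-1]
    return sum(g != p for g, p in zip(guide.upper(), perfect_guide))


def guide_is_acceptable(guide: str, site: str,
                        min_len: int = 15, max_len: int = 25,
                        max_mismatches: int = 2) -> bool:
    """Apply the stated limits: 15-25 nt long and at most 2 mismatches."""
    return (min_len <= len(guide) <= max_len
            and mismatches_to_site(guide, site) <= max_mismatches)


if __name__ == "__main__":
    site = "ACGTACGTACGTACGTACGTA"   # hypothetical 21-nt target site
    guide = "TACGTACGTACGTACGTACGT"  # its reverse complement
    print(mismatches_to_site(guide, site), guide_is_acceptable(guide, site))
```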
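In one embodiment, the binding of dAgo, guide sequence and target DNA is carried out at 85-95°C. In the two-step binding embodiment, dAgo is bound to the guide sequence at about 93-95°C and to the target DNA at about 85-87°C.
In one embodiment, the dAgo carries a specific label, including but not limited to a biotin label or an S-Tag label. Preferably, the specific label is a biotin label.
In one embodiment, the capture medium includes but is not limited to magnetic beads and agarose beads (such as Sepharose or Agarose), preferably magnetic beads. Further, the capture medium carries a capture label capable of binding the specific label carried by dAgo, including but not limited to a streptavidin label or an S-Protein label. Preferably, the capture medium carries a streptavidin label.
In the present invention, the capture medium binds, through its capture label, the specific label carried by dAgo, and thereby captures the dAgo-guide sequence-target DNA ternary complex. Capture methods are known in the art; for example, the biotin-labeled Ago protein is incubated with streptavidin-carrying magnetic beads under suitable conditions so that the biotin label binds the streptavidin, thereby capturing the target DNA. Depending on the experimental needs, those skilled in the art can adjust the capture conditions, such as capture temperature and capture time.
In one embodiment, methods of separating the target DNA from the captured dAgo-guide sequence-target DNA ternary complex are also known in the art; for example, the magnetic beads carrying the captured ternary complex are incubated under suitable conditions to inactivate the streptavidin and release the bound ternary complex, after which the bound protein is removed with proteinase K to separate the target DNA from the ternary complex.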
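In a third aspect, the present invention provides a method for constructing a sequencing library of target DNA, mainly comprising the following steps:
(1) ligating the target DNA to sequencing adapters to obtain a ligation product;
(2) enriching, from the ligation product, the target DNA ligated to the sequencing adapters by the method according to the invention, to obtain enriched target DNA;
(3) amplifying the enriched target DNA to obtain the sequencing library.
In another embodiment, the present invention also provides a method for constructing a sequencing library of target DNA, mainly comprising the following steps:
(1) enriching the target DNA by the method according to the invention;
(2) ligating the enriched target DNA to sequencing adapters to obtain a ligation product;
(3) amplifying the ligation product to obtain the sequencing library.
In one embodiment, the enriched target DNA may remain on the capture medium, i.e. the target DNA need not be separated from the capture medium. In another embodiment, the enriched target DNA is target DNA that has been separated from the capture medium.
In one embodiment, the method of the invention may further comprise a pre-amplification step before the enrichment step.
In one embodiment, the sequencing adapters are adapters that match the sequencing platform. The specific conditions of the ligation reaction, such as temperature and reaction time, can be adjusted by those skilled in the art as required using routine techniques.
In one embodiment, the primers used in the amplification step are universal primers. As used herein, the term "universal primers" refers to a primer pair that is complementary to the sequences at both ends of the sequencing adapters and can amplify correctly ligated products.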
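In a fourth aspect, the present invention also provides a kit for carrying out the method according to the invention, comprising: dAgo, a guide sequence and a capture medium.
In one embodiment, the guide sequence is RNA or DNA. More preferably, the guide sequence is single-stranded RNA (ssRNA) or single-stranded DNA (ssDNA).
In one embodiment, the guide sequence comprises a nucleotide modification, for example 5' phosphorylation or 5' hydroxylation. Preferably, to improve the efficiency of binding between the guide sequence and dAgo, the guide sequence comprises a 5' phosphorylation modification.
In one embodiment, the guide sequence is 15-25 nucleotides long, preferably 18-23 nucleotides, and most preferably 21 nucleotides. The length of the guide sequence affects the efficiency of its binding to dAgo. Specifically, a guide sequence that is too short impairs binding specificity, whereas one that is too long may form RNA secondary structure (when the guide sequence is RNA) or be difficult to synthesize.
In one embodiment, the guide sequence is substantially complementary to the target DNA. In certain embodiments, the guide sequence has no more than 2 base mismatches with the target DNA.
In one embodiment, the dAgo carries a specific label, including but not limited to a biotin label or an S-Tag label. Preferably, the specific label is a biotin label.
In one embodiment, the capture medium includes but is not limited to magnetic beads and agarose beads (such as Sepharose or Agarose), preferably magnetic beads. Further, the capture medium carries a capture label capable of binding the specific label carried by dAgo, including but not limited to a streptavidin label or an S-Protein label. Preferably, the capture medium carries a streptavidin label.
The method and kit according to the invention allow target DNA to be enriched, and sequenced on next-generation high-throughput sequencing platforms, efficiently, quickly and simply. In particular, compared with the prior-art nucleic acid probe capture and dCas9 capture methods, the method and kit of the invention have the following advantages:
(1) Conventional nucleic acid probe capture relies on a hybridization reaction that takes 4 hours or even overnight. The enrichment method of the invention takes less time, generally 30-60 min. In addition, the enrichment method of the invention uses high-temperature washing, which increases specificity while reducing the number of washes and avoiding loss of target DNA. The binding of the dAgo of the invention to the guide sequence thus allows rapid selection and binding of target DNA, avoids the long duration and complicated operation of hybridizing single-stranded nucleic acid probes directly to the target DNA, avoids the errors that prolonged hybridization introduces into the target DNA, and reduces loss of target DNA.
(2) The guide sequence of the invention is designed against a specific sequence in the target DNA and is short (no more than 25 bases). It is therefore easy to synthesize and places few requirements on the sequence of the target DNA, so that the desired fragments are enriched to the greatest possible extent and detection efficiency increases.
(3) In short, the method for enriching target DNA according to the invention is simple to perform, easy to control in quality and cost, and can be adjusted flexibly. It is particularly suitable for enriching highly fragmented DNA (for example, cfDNA or severely degraded DNA from FFPE samples).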
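Brief Description of the Drawings
Figure 1: Schematic flow diagram of the method for enriching target DNA according to the invention.
Figure 2: Amino acid sequence SEQ ID NO:1 of the Ago protein of Pyrococcus furiosus (PfAgo); the PIWI domain (amino acid residues 473-756) is underlined.
Figure 3: Amino acid sequence SEQ ID NO:2 of the Ago protein of Thermus thermophilus (TtAgo); the PIWI domain (amino acid residues 507-671) is underlined.
Figure 4: Amino acid sequence SEQ ID NO:3 of the Ago protein of Methanocaldococcus jannaschii (MjAgo); the PIWI domain (amino acid residues 426-699) is underlined.
Figure 5: Amino acid sequence SEQ ID NO:4 of the Ago protein of Marinitoga piezophila (MpAgo); the PIWI domain (amino acid residues 394-634) is underlined.
Figure 6: Amino acid sequence SEQ ID NO:5 of the Ago protein of Thermotoga profunda (TpAgo); the PIWI domain (amino acid residues 431-620) is underlined.
Figure 7: Amino acid sequence SEQ ID NO:6 of the Ago protein of Rhodobacter sphaeroides (RsAgo); the PIWI domain (amino acid residues 445-757) is underlined.
Figure 8: Amino acid sequence SEQ ID NO:7 of the Ago protein of Aquifex aeolicus (AaAgo); the PIWI domain (amino acid residues 419-694) is underlined.
Figure 9: Amino acid sequence SEQ ID NO:8 of the Ago protein of Archaeoglobus fulgidus (AfAgo); the PIWI domain (amino acid residues 110-406) is underlined.
Figure 10: Amino acid sequence alignment of the DEDX catalytic regions in the PIWI domains of hAGO2 (GenBank Gene ID: 27161), TtAgo, MjAgo, PfAgo, MpAgo, TpAgo, AaAgo, AfAgo and RsAgo. The DEDX catalytic regions shown are amino acid residues 553-563/591-600/623-631/740-750 of SEQ ID NO:1, 473-483/511-519/541-549/655-665 of SEQ ID NO:2, 499-509/540-548/565-573/683-693 of SEQ ID NO:3, 441-451/481-489/511-521/619-629 of SEQ ID NO:4, 434-444/474-482/504-514/612-622 of SEQ ID NO:5, 524-534/695-703/549-559/461-471 of SEQ ID NO:6, 463-471/497-507/566-576/678-688 of SEQ ID NO:7 and 169-179/136-144/200-210/121-131 of SEQ ID NO:8.
Figure 11: Sequencing results for plasmids pPFA-1.1, pPFA-1.2, pPFA-1.3, pPFA-1.4 and pPFA-1.5.
Figure 12: Quality analysis of the target DNA enriched by the method of Example 2.
Figure 13: Representative sequencing results for the sequencing libraries prepared by the methods of Examples 3 and 4.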
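Detailed Description of Embodiments
The invention is described in detail below with reference to the drawings and the examples. It should be noted that those skilled in the art will understand that the drawings and examples are given for illustration only and do not limit the invention in any way.
Example 1: Preparation of the Ago protein mutants of the invention
Step 1: Construction of the expression vector
A biotin acceptor sequence was attached to the N-terminus of the known amino acid sequence of the Pyrococcus furiosus Ago protein (PfAgo; SEQ ID NO:1), and a nucleotide sequence codon-optimized for Escherichia coli (E. coli) was designed and synthesized accordingly. The nucleotide sequence, 6x His-Tag, PfAgo-BAS, IRES and BirA (E. coli biotin ligase) were cloned in tandem, in that order, into a pET-28a vector carrying a kanamycin resistance gene, giving vector pPFA-1.0.
Site-directed mutagenesis of pPFA-1.0 was performed with the Q5 Site-Directed Mutagenesis Kit (NEB, Cat#E0554S) according to the manufacturer's protocol. The mutagenized DNA was transformed into E. coli DH5α cells, which were cultured overnight at 37°C on LB agar medium containing kanamycin. For each mutation, 10 colonies were picked into 4 mL of LB liquid medium containing kanamycin and cultured with shaking at 37°C for 12-16 hours; plasmid was then extracted from 2 mL of culture with the Plasmid Mini Kit (Qiagen, Cat#27104).
Step 2: Verification by sequencing
The extracted plasmids were amplified with the universal primers on the plasmid (T7 promoter primer 5'-TAATACGACTCACTATAGGG-3' and T7 terminator primer 5'-GCTAGTTATTGCTCAGCGG-3', synthesized by IDT), and the amplification products were sequenced (Beijing Ruibo Xingke Biotechnology Co., Ltd.). The sequencing results are shown in Figure 11.
The following plasmids, confirmed to contain the mutations, were stored long-term at -20°C:
- plasmid pPFA-1.1, in which amino acid residue 558 is replaced by alanine (D558A);
- plasmid pPFA-1.2, in which amino acid residue 596 is replaced by alanine (E596A);
- plasmid pPFA-1.3, in which amino acid residue 628 is replaced by alanine (D628A);
- plasmid pPFA-1.4, in which amino acid residue 745 is replaced by alanine (H745A); and
- plasmid pPFA-1.5, in which amino acid residues 628-770 are deleted (Δ628-770).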
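Step 3: Vector transformation and expression of the PfAgo protein mutants
The 5 plasmids confirmed in step 2 above were each transformed into E. coli BL21 (DE3) cells. The transformed cells were cultured overnight with shaking at 37°C in LB broth containing 50 ug/mL kanamycin, transferred to fresh LB broth, and grown further until the OD600 reached 0.4-0.8. IPTG was added to a final concentration of 500 uM, and shaking culture was continued at 37°C for 3-5 hours.
The culture was centrifuged at 6,000 g for 15 minutes and the supernatant discarded. The pellet was resuspended in cell lysis buffer I (20 mM Tris pH 8.0, 1 M NaCl, 2 mM MnCl2) and disrupted by sonication. The lysate was centrifuged at 20,000 g for 30 minutes at 4°C and the supernatant collected. The supernatant was purified on a nickel column at 4°C, and the purified product was then desalted and concentrated with a protein ultrafiltration column (Pierce Protein Concentrators PES, 30K MWCO, ThermoFisher Scientific) according to the manufacturer's protocol. The concentrated product is the expressed, biotin-labeled PfAgo protein mutant. The expressed PfAgo protein mutant was mixed with an equal volume of glycerol and stored at -20°C.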
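Example 2: Enrichment of target DNA by the method of the invention
The target DNA in this example is the exon 18-21 fragments of the EGFR gene, taken from cell-free DNA in plasma samples and from genomic DNA of leukocytes isolated from normal human peripheral blood, respectively.
Step 1: DNA extraction
For cell-free DNA: cell-free DNA was extracted from 4 mL of human plasma with the QIAamp Circulating Nucleic Acid Kit (Qiagen, Cat#55114) according to the kit instructions and eluted with 45 uL of Elution Buffer.
For genomic DNA: genomic DNA was extracted from 200 uL of leukocytes isolated from human peripheral blood with the MagJET Whole Blood gDNA Kit (ThermoFisher, Cat#K2741) according to the kit instructions. About 500 ng (30 uL) of the extracted genomic DNA was fragmented by sonication (Bioruptor Pico sonicator, Diagenode SA).
Step 2: Design of guide DNA (gDNA)
5'-phosphorylated gDNAs were designed and synthesized from the sequences of EGFR exons 18, 19, 20 and 21, as follows:
gDNA name  gDNA sequence (5'-3')
EGFR_E18_gD1  CTCCCAACCAAGCTCTCTTG (SEQ ID NO:9)
EGFR_E19_gD1  TAGGGACTCTGGATCCCAGA (SEQ ID NO:10)
EGFR_E20_gD2  TGAGGCAGATGCCCAGCAGG (SEQ ID NO:11)
EGFR_E21_gD1  TCTGTGATCTTGACATGCTG (SEQ ID NO:12)
Each of the above gDNAs was dissolved in Buffer EB (20 mM Tris pH 8.0) at 100 uM. The gDNA solutions were then mixed in equal volumes and diluted 100-fold to give a 1 uM gDNA mixture.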
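The pooling arithmetic above is easy to misread, so a short worked calculation may help. Mixing four 100 uM stocks in equal volumes leaves each guide at 25 uM (100 uM in total), and the 100-fold dilution then gives 0.25 uM per guide, i.e. 1 uM total gDNA. The function name below is illustrative.

```python
def pooled_concentrations(stock_um: float, n_guides: int, dilution_factor: float):
    """Return (per-guide, total) concentration in uM after pooling `n_guides`
    stocks of equal concentration in equal volumes and diluting the pool."""
    per_guide_um = stock_um / n_guides / dilution_factor
    total_um = per_guide_um * n_guides
    return per_guide_um, total_um


if __name__ == "__main__":
    per_guide, total = pooled_concentrations(stock_um=100.0, n_guides=4,
                                             dilution_factor=100.0)
    print(f"per guide: {per_guide} uM, total gDNA: {total} uM")
```

On this reading, the stated "1 uM gDNA mixture" refers to the combined concentration of all four guides.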
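Step 3: Binding of gDNA to the PfAgo protein mutant to form the binary complex
Each PfAgo protein mutant (i.e. D558A, E596A, D628A, H745A and Δ628-770) was mixed with gDNA in a reaction prepared according to the following table:
Reagent  Volume
Buffer DA1 (2x)  10 uL
PfAgo protein mutant (5 uM)  0.5 uL
gDNA mixture (1 uM)  5 uL
ddH2O  4.5 uL
The reaction was incubated at 95°C for 10 minutes.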
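As a worked check of the table above, the final concentrations in the 20 uL binding reaction follow from C_final = C_stock x V / V_total. The sketch below uses only the volumes and stock concentrations given in the table; the variable and function names are illustrative.

```python
# Volumes (uL) and stock concentrations (uM) taken from the reaction table.
VOLUMES_UL = {
    "Buffer DA1 (2x)": 10.0,
    "PfAgo mutant": 0.5,
    "gDNA mixture": 5.0,
    "ddH2O": 4.5,
}
STOCK_UM = {"PfAgo mutant": 5.0, "gDNA mixture": 1.0}


def final_concentrations(volumes_ul, stock_um):
    """Return the total volume (uL) and the final concentration (uM) of each
    component that has a stock concentration."""
    total_ul = sum(volumes_ul.values())
    final_um = {name: stock_um[name] * volumes_ul[name] / total_ul
                for name in stock_um}
    return total_ul, final_um


total_ul, final_um = final_concentrations(VOLUMES_UL, STOCK_UM)
guide_to_protein = final_um["gDNA mixture"] / final_um["PfAgo mutant"]

if __name__ == "__main__":
    print(f"total volume: {total_ul} uL")
    for name, conc in final_um.items():
        print(f"{name}: {conc} uM")
    print(f"guide : protein molar ratio = {guide_to_protein}")
```

With these inputs the buffer ends up at 1x (10 uL of 2x in 20 uL), and the guide is in twofold molar excess over the protein.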
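Step 4: Binding of the binary complex to the target DNA to form the ternary complex
The 45 uL of cell-free DNA or 30 uL of sonicated genomic DNA obtained in step 1 above was added to the reaction of step 3 above, mixed, incubated at 87°C for 15 minutes, and then placed on ice.
Step 5: Capture of the ternary complex
Streptavidin Dynabeads M270 (Thermo Fisher, Cat#65305), pre-equilibrated with Buffer DA1 (1x), were added to the reaction of step 4 above and incubated at room temperature for 30 minutes. The Dynabeads were then washed 3 times with Buffer DA1 (1x) at room temperature, 3 minutes each time. At this point the enriched target DNA is bound to the Dynabeads.
Step 6: Separation of the enriched target DNA
50 uL of Buffer DA1 (1x) and 1 uL of proteinase K (20 ug/uL) were added to the Dynabeads and incubated at 55°C for 15 minutes. The mixture was then placed on ice; once cooled, 2 volumes of Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat#A63880) were added and incubated at room temperature for 10 minutes. The beads were collected on a magnet, the supernatant removed, and the beads washed twice with 80% ethanol; the DNA was finally eluted in 25 uL of Tris solution (20 mM, pH 8.5).
Step 7: Quality analysis of the enriched target DNA
The concentration of the purified DNA was measured with Qubit dsDNA HS reagent (ThermoFisher, Cat#Q3323) on a Qubit 3 Fluorometer (ThermoFisher, Cat#Q33216), and its purity was checked by capillary electrophoresis (Agilent 2100 Bioanalyzer Instrument, Cat#G2939BA). Representative results are shown in Figure 12: the enriched target DNA was about 200-1000 bp long, with a concentration of 61.5 pg/μl and a molar concentration of 275.8 pmol/l, which is of good quality and meets the requirements for library preparation and sequencing.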
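The reported mass and molar concentrations can be cross-checked against the stated size range by computing the mean fragment length they imply. The sketch below assumes an average mass of 660 g/mol per base pair of double-stranded DNA, a commonly used approximation that is not stated in the source; the function name is illustrative.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumption: average mass of one dsDNA base pair


def mean_fragment_length_bp(mass_pg_per_ul: float, molar_pmol_per_l: float) -> float:
    """Mean fragment length implied by a mass and a molar concentration."""
    mass_g_per_l = mass_pg_per_ul * 1e-12 / 1e-6   # pg/uL -> g/L
    mol_per_l = molar_pmol_per_l * 1e-12           # pmol/L -> mol/L
    molar_mass_g_per_mol = mass_g_per_l / mol_per_l
    return molar_mass_g_per_mol / AVG_BP_MASS_G_PER_MOL


if __name__ == "__main__":
    length = mean_fragment_length_bp(61.5, 275.8)
    print(f"implied mean fragment length: {length:.0f} bp")
```

Under that assumption the two figures imply a mean length of roughly 340 bp, which falls inside the reported 200-1000 bp range.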
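Example 3: Construction of a sequencing library of target DNA by the method of the invention
Step 1: Extraction of cell-free DNA
Cell-free DNA was extracted from 4 mL of human plasma with the QIAamp Circulating Nucleic Acid Kit (Qiagen, Cat#55114) according to the kit instructions and finally eluted with 45 uL of the Elution Buffer supplied with the kit.
Step 2: Ligation of sequencing adapters
Using the KAPA Hyper Prep Kit (Kapa Biosystems, Cat#KK8501) according to the manufacturer's protocol, the cell-free DNA was end-repaired and A-tailed and then ligated to TruSeq adapters suitable for the Illumina sequencing platform.
Step 3: Pre-amplification of the ligation product
The reaction was prepared according to the following table: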
Figure PCTCN2019070253-appb-000001
Pre-amplification was performed on a PCR instrument under the following conditions:
Figure PCTCN2019070253-appb-000002
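After amplification, the pre-amplification product was purified with 200 uL of Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat#A63880) according to the manufacturer's instructions. The purified product was dissolved in 30 uL of Buffer DA1 (1x) (15 mM Tris pH 8.0, 0.5 mM MnCl2, 250 mM NaCl).
Step 4: Enrichment of the target DNA
(1) 5'-phosphorylated guide DNAs (gDNAs) were designed and synthesized from the sequences of exons 18, 19, 20 and 21 of the EGFR gene, as follows:
gDNA name  gDNA sequence (5'-3')
EGFR_E18_gD1  CTCCCAACCAAGCTCTCTTG (SEQ ID NO:9)
EGFR_E19_gD1  TAGGGACTCTGGATCCCAGA (SEQ ID NO:10)
EGFR_E20_gD2  TGAGGCAGATGCCCAGCAGG (SEQ ID NO:11)
EGFR_E21_gD1  TCTGTGATCTTGACATGCTG (SEQ ID NO:12)
Each of the above gDNAs was dissolved in Buffer EB (20 mM Tris pH 8.0) at 100 uM. The gDNA solutions were then mixed in equal volumes and diluted 100-fold to give a 1 uM gDNA mixture.
(2) The PfAgo protein mutants (i.e. D558A, E596A, D628A, H745A and Δ628-770) were mixed with gDNA in a reaction prepared according to the following table:
Reagent  Volume
Buffer DA1 (2x)*  10 uL
PfAgo protein mutant (5 uM)  0.5 uL
gDNA mixture (1 uM)  5 uL
ddH2O  4.5 uL
*Buffer DA1 (2x): 30 mM Tris pH 8.0, 1.0 mM MnCl2, 500 mM NaCl
The reaction was incubated at 95°C for 10 minutes.
(3) The 30 uL of purified product obtained in step 3 was added to the reaction, mixed, incubated at 87°C for 15 minutes, and then placed on ice.
(4) Streptavidin Dynabeads M270 (Thermo Fisher, Cat#65305), pre-equilibrated with Buffer DA1 (1x), were added to the reaction and incubated at room temperature for 30 minutes. The Dynabeads were then washed 3 times with Buffer DA1 (1x) at room temperature, 3 minutes each time. At this point the enriched target DNA is bound to the Dynabeads.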
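Step 5: Amplification of the enriched target DNA
The following reagents were added to the Dynabeads obtained in step 4:
Reagent  Volume
NEB Ultra II Q5 Master Mix (2x)  25 uL
P5/P7 universal primer mixture (20 uM each)  2.5 uL
Deionized water  22.5 uL
Amplification was performed on a PCR instrument under the following conditions: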
Figure PCTCN2019070253-appb-000003
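Step 6: Purification of the amplified target DNA
An equal volume of Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat#A63880) was added to the amplification product obtained in step 5 above and incubated at room temperature for 5 minutes, followed by two washes with 200 μl of 80% ethanol. After the beads had dried at room temperature, 30 μl of Buffer EB was added and, after standing for 5 min, the supernatant was collected. The resulting supernatant is the enriched and purified target DNA sequencing library.
Example 4: Construction of a sequencing library of target DNA by the method of the invention
Using the KAPA Hyper Prep Kit (Kapa Biosystems, Cat#KK8501) according to the kit instructions, the enriched target DNA obtained in step 6 of Example 2 was end-repaired and A-tailed (the Dynabeads-bound enriched target DNA obtained in step 5 of Example 2 may also be used) and then ligated to TruSeq adapters suitable for the Illumina sequencing platform to obtain the ligation product.
The following reagents were added to the ligation product:
Reagent  Volume
NEB Ultra II Q5 Master Mix (2x)  25 uL
P5/P7 universal primer mixture (20 uM each)  2.5 uL
Deionized water  22.5 uL
Amplification was performed on a PCR instrument under the following conditions: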
Figure PCTCN2019070253-appb-000004
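After amplification, an equal volume of Agencourt Ampure XP magnetic beads (Beckman Coulter, Cat#A63880) was added to the amplification product and incubated at room temperature for 5 minutes, followed by two washes with 200 μl of 80% ethanol. After the beads had dried at room temperature, 30 μl of Buffer EB was added and, after standing for 5 min, the supernatant was collected. The resulting supernatant is the enriched and purified target DNA sequencing library.
Example 5: Sequencing
The sequencing libraries obtained in Examples 3 and 4 were quantified with the KAPA Library Quantification Kit (Kapa Biosystems, Cat#KK4835), according to the kit instructions, on a StepOnePlus Real-Time PCR System (ThermoFisher, Cat#4376592). The effective concentration of each sequencing library, as determined by quantification, was not less than 1 nM.
Based on the library concentration, an appropriate volume of each sequencing library was subjected to paired-end 150-base (150PE) sequencing on an Illumina NextSeq CN500 sequencer. Representative sequencing results are shown in Figure 13: the Ago protein mutant of the invention enriched the target DNA fragments in genomic DNA and cell-free DNA by about 500-fold. Therefore, for both genomic DNA and highly fragmented cell-free DNA, the invention uses Ago protein mutants to enrich target DNA rapidly and efficiently, and thereby to construct sequencing libraries that meet sequencing requirements.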
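The text reports an enrichment of about 500-fold but does not say how the figure was computed. One common way to express fold enrichment, shown here purely as an assumption, is the ratio of the on-target read fraction in the enriched library to the on-target fraction in an unenriched control. The function name and the read counts in the usage example are hypothetical and are not data from this document.

```python
def fold_enrichment(on_target_enriched: int, total_enriched: int,
                    on_target_input: int, total_input: int) -> float:
    """Ratio of on-target read fractions: enriched library over input control."""
    enriched_fraction = on_target_enriched / total_enriched
    input_fraction = on_target_input / total_input
    return enriched_fraction / input_fraction


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation:
    # 5% of reads on target after enrichment versus 0.01% before.
    value = fold_enrichment(50_000, 1_000_000, 100, 1_000_000)
    print(f"fold enrichment: {value:.0f}")
```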
It should be noted that, although some features of the invention have been illustrated by the above examples, they are not to be used to limit the invention, and those skilled in the art may make various modifications and changes to the invention. The reagents, reaction conditions and the like involved in constructing the sequencing library may be adjusted and changed according to specific needs. Accordingly, those skilled in the art may make simple substitutions without departing from the concept and principles of the invention, and all of these fall within the scope of protection of the invention.
References
[1] Garcia-Garcia, G. et al. Assessment of the latest NGS enrichment capture methods in clinical context. Sci Rep 6, 20948, doi:10.1038/srep20948 (2016).
[2] Bodi, K. et al. Comparison of commercially available target enrichment methods for next-generation sequencing. J Biomol Tech 24, 73-86, doi:10.7171/jbt.13-2402-002 (2013).
[3] Newman, A. M. et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat Biotechnol, doi:10.1038/nbt.3520 (2016).
[4] Samorodnitsky, E. et al. Evaluation of Hybridization Capture Versus Amplicon-Based Methods for Whole-Exome Sequencing. Hum Mutat 36, 903-914, doi:10.1002/humu.22825 (2015).
[5] Kuscu, C., Arslan, S., Singh, R., Thorpe, J. & Adli, M. Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nat Biotechnol 32, 677-683, doi:10.1038/nbt.2916 (2014).
[6] Wu, X. et al. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat Biotechnol 32, 670-676, doi:10.1038/nbt.2889 (2014).
[7] Liu, X. et al. In Situ Capture of Chromatin Interactions by Biotinylated dCas9. Cell 170, 1028-1043.e19, doi:10.1016/j.cell.2017.08.003 (2017).
[8] Fujita, T., Yuno, M. & Fujii, H. Efficient sequence-specific isolation of DNA fragments and chromatin by in vitro enChIP technology using recombinant CRISPR ribonucleoproteins. Genes Cells 21, 370-377, doi:10.1111/gtc.12341 (2016).
[9] Swarts, D. C. et al. The evolutionary journey of Argonaute proteins. Nat Struct Mol Biol 21, 743-753, doi:10.1038/nsmb.2879 (2014).
[10] Song, J. J., Smith, S. K., Hannon, G. J. & Joshua-Tor, L. Crystal structure of Argonaute and its implications for RISC slicer activity. Science 305, 1434-1437, doi:10.1126/science.1102514 (2004).
[11] Swarts, D. C. et al. Argonaute of the archaeon Pyrococcus furiosus is a DNA-guided nuclease that targets cognate DNA. Nucleic Acids Res 43, 5120-5129, doi:10.1093/nar/gkv415 (2015).
[12] Raines, R. T., McCormick, M., Van Oosbree, T. R. & Mierendorf, R. C. The S.Tag fusion system for protein purification. Methods Enzymol 326, 362-376 (2000).

Claims (39)

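  1. A mutant of an Argonaute protein that has DNA binding activity but lacks DNA cleavage activity, wherein the mutation of the mutant is located in the PIWI domain.
  2. The mutant of claim 1, wherein the Argonaute protein is derived from the genus Marinitoga, Thermotoga, Pyrococcus, Methanocaldococcus, Rhodobacter, Aquifex, Archaeoglobus or Thermus.
  3. The mutant of claim 1, wherein the Argonaute protein is derived from Pyrococcus furiosus, Thermus thermophilus, Methanocaldococcus jannaschii, Marinitoga piezophila, Rhodobacter sphaeroides, Aquifex aeolicus, Archaeoglobus fulgidus or Thermotoga profunda.
  4. The mutant of claim 1, wherein the amino acid sequence of the Argonaute protein is selected from SEQ ID NO:1-8.
  5. The mutant of claim 4, wherein the mutant comprises a mutation at one or more positions selected from:
    - substitution of amino acid residues 558, 596, 628 and 745 of SEQ ID NO:1, or of amino acid residues at equivalent positions, or
    - deletion of amino acids 628-770 of SEQ ID NO:1, or of amino acid residues at equivalent positions.
  6. The mutant of claim 5, wherein the substitution is a substitution with alanine or glutamic acid.
  7. The mutant of claim 1, wherein the mutant further comprises mutations located in the following domains: the N-terminal domain and the PAZ domain.
  8. The mutant of claim 1, wherein the mutant carries a specific label.
  9. The mutant of claim 8, wherein the specific label is a biotin label.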
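  10. A method for enriching target DNA, comprising the following steps:
    (a) designing a guide sequence against a specific sequence in the target DNA;
    (b) allowing the mutant of any one of claims 1-9, the guide sequence and the target DNA to bind, to obtain a mutant-guide sequence-target DNA ternary complex;
    (c) capturing the mutant-guide sequence-target DNA ternary complex with a capture medium;
    (d) separating the target DNA from the captured mutant-guide sequence-target DNA ternary complex to obtain enriched target DNA.
  11. The method of claim 10, wherein step (b) further comprises the following steps:
    (b1) binding the mutant according to the invention to the guide sequence to obtain a mutant-guide sequence binary complex;
    (b2) binding the dAgo-guide sequence binary complex to the target DNA sequence to obtain the mutant-guide sequence-target DNA ternary complex.
  12. The method of claim 10, wherein the guide sequence is RNA or DNA.
  13. The method of claim 10, wherein the guide sequence is single-stranded RNA (ssRNA) or single-stranded DNA (ssDNA).
  14. The method of claim 10, wherein the guide sequence comprises a nucleotide modification.
  15. The method of claim 14, wherein the modification is 5' phosphorylation or 5' hydroxylation.
  16. The method of claim 10, wherein the guide sequence is 15-25 nucleotides long.
  17. The method of claim 10, wherein the guide sequence is substantially complementary to the specific sequence in the target DNA.
  18. The method of claim 10, wherein the mutant carries a specific label.
  19. The method of claim 18, wherein the specific label is a biotin label.
  20. The method of claim 10, wherein the capture medium is magnetic beads.
  21. The method of claim 10, wherein the capture medium carries a capture label capable of binding the specific label carried by the mutant.
  22. The method of claim 21, wherein the capture label is a streptavidin label.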
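  23. A method for constructing a sequencing library of target DNA, comprising the following steps:
    (1) ligating the target DNA to sequencing adapters to obtain a ligation product;
    (2) enriching, from the ligation product, the target DNA ligated to the sequencing adapters by the method of any one of claims 10-22, to obtain enriched target DNA;
    (3) amplifying the enriched target DNA to obtain the sequencing library.
  24. A method for constructing a sequencing library of target DNA, comprising the following steps:
    (1) enriching the target DNA by the method of any one of claims 10-22;
    (2) ligating the enriched target DNA to sequencing adapters to obtain a ligation product;
    (3) amplifying the ligation product to obtain the sequencing library.
  25. The method of claim 23 or 24, further comprising a pre-amplification step before the enrichment step.
  26. The method of claim 23 or 24, wherein the sequencing adapters are adapters that match the sequencing platform.
  27. The method of claim 23 or 24, wherein the primers used in the amplification step are universal primers.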
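  28. A kit comprising the mutant of any one of claims 1-9.
  29. The kit of claim 28, further comprising a guide sequence and a capture medium.
  30. The kit of claim 29, wherein the guide sequence is RNA or DNA.
  31. The kit of claim 29, wherein the guide sequence is single-stranded RNA (ssRNA) or single-stranded DNA (ssDNA).
  32. The kit of claim 29, wherein the guide sequence comprises a nucleotide modification.
  33. The kit of claim 32, wherein the nucleotide modification is 5' phosphorylation or 5' hydroxylation.
  34. The kit of claim 29, wherein the guide sequence is 15-25 nucleotides long.
  35. The kit of claim 28, wherein the mutant carries a specific label.
  36. The kit of claim 35, wherein the specific label is a biotin label.
  37. The kit of claim 29, wherein the capture medium is magnetic beads.
  38. The kit of claim 29, wherein the capture medium carries a capture label capable of binding the specific label carried by the mutant.
  39. The kit of claim 38, wherein the capture label is a streptavidin label.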
PCT/CN2019/070253 2018-03-06 2019-01-03 Argonaute protein mutant and use thereof WO2019169945A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19763666.5A EP3763812A4 (en) 2018-03-06 2019-01-03 ARGONAUT PROTEIN MUTANT AND CORRESPONDING USE
US16/978,428 US12098367B2 (en) 2018-03-06 2019-01-03 Argonaute protein mutant and use thereof

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201810184689.6 2018-03-06
CN201810184689 2018-03-06
CN201811505553.7A CN110229799B (zh) 2018-03-06 2018-12-10 Argonaute protein mutant and use thereof
CN201811505553.7 2018-12-10

Publications (1)

Publication Number Publication Date
WO2019169945A1 true WO2019169945A1 (zh) 2019-09-12

Family

ID=67845466

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/070253 WO2019169945A1 (zh) 2018-03-06 2019-01-03 Argonaute protein mutant and use thereof

Country Status (1)

Country Link
WO (1) WO2019169945A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023169228A1 (zh) * 2022-03-11 2023-09-14 Shanghai Jiao Tong University Novel thermophilic endonuclease mutant, and preparation method and use thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108048532A (zh) * 2018-02-02 2018-05-18 Peking University Argonaute protein-based fluorescence in situ hybridization method and use thereof

Non-Patent Citations (14)

* Cited by examiner, † Cited by third party
Title
BODI, K. ET AL.: "Comparison of commercially available target enrichment methods for next-generation sequencing", J BIOMOL TECH, vol. 24, 2013, pages 73 - 86
FUJITA, T.YUNO, M.FUJII, H.: "Efficient sequence-specific isolation of DNA fragments and chromatin by in vitro enChIP technology using recombinant CRISPR ribonucleoproteins", GENES CELLS, vol. 21, 2016, pages 370 - 377, XP055621734, DOI: 10.1111/gtc.12341
GARCIA-GARCIA, G ET AL.: "Assessment of the latest NGS enrichment capture methods in clinical context", SCI REP, vol. 6, 2016, pages 20948, XP055623863, DOI: 10.1038/srep20948
KUSCU, C.ARSLAN, S.SINGH, R.THORPE, J.ADLI, M.: "Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease", NAT BIOTECHNOL, vol. 32, 2014, pages 677 - 683, XP055382577, DOI: 10.1038/nbt.2916
LIU, X. ET AL.: "In Situ Capture of Chromatin Interactions by Biotinylated dCas9", CELL, vol. 170, 2017, pages 1028 - 1043
NEWMAN, A. M. ET AL.: "Integrated digital error suppression for improved detection of circulating tumor DNA", NAT BIOTECHNOL, 2016
RAINES, R. T.MCCORMICK, M.VAN OOSBREE, T. R.MIERENDORF, R. C.: "The S.Tag fusion system for protein purification.", METHODS ENZYMOL, vol. 326, 2000, pages 362 - 376, XP009135533
SAMORODNITSKY, E. ET AL.: "Evaluation of Hybridization Capture Versus Amplicon-Based Methods for Whole-Exome Sequencing", HUM MUTAT, vol. 36, 2015, pages 903 - 914
See also references of EP3763812A4 *
SONG, J. J.SMITH, S. K.HANNON, G. J.JOSHUA-TOR, L.: "Crystal structure of Argonaute and its implications for RISC slicer activity", SCIENCE, vol. 305, 2004, pages 1434 - 1437, XP002470480, DOI: 10.1126/science.1102514
SWARTS, D. C. ET AL.: "Argonaute of the archaeon Pyrococcus furiosus is a DNA-guided nuclease that targets cognate DNA", NUCLEIC ACIDS RES, vol. 43, 2015, pages 5120 - 5129, XP055287460, DOI: 10.1093/nar/gkv415
SWARTS, D. C. ET AL.: "The evolutionary journey of Argonaute proteins", NAT STRUCT MOL BIOL, vol. 21, 2014, pages 743 - 753, XP055287457, DOI: 10.1038/nsmb.2879
WANG, Y.: "Nucleation, Propagation and Cleavage of Target RNAs in Ago Silencing Complexes", NATURE, vol. 461, no. 7265, 8 October 2009 (2009-10-08), XP055265388 *
WU, X. ET AL.: "Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells.", NAT BIOTECHNOL, vol. 32, 2014, pages 670 - 676, XP055241568, DOI: 10.1038/nbt.2889

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19763666

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019763666

Country of ref document: EP

Effective date: 20201006