WO2018040962A1 - Library construction method and SNP typing method - Google Patents

Library construction method and SNP typing method

Info

Publication number
WO2018040962A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
linker
site
snp
amplification
Prior art date
Application number
PCT/CN2017/098214
Other languages
English (en)
French (fr)
Inventor
盛司潼
黄思强
卫明
Original Assignee
广州康昕瑞基因健康科技有限公司
Priority date
Filing date
Publication date
Application filed by 广州康昕瑞基因健康科技有限公司
Publication of WO2018040962A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1093 General methods of preparing gene libraries, not provided for in other subgroups
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • The present invention relates to the field of molecular biology, and more particularly to a library construction method and an SNP typing method.
  • SNP: single nucleotide polymorphism.
  • Second-generation high-throughput sequencing is accurate and sensitive, and its range of application has expanded into many areas of life science and medical research. Using it to detect SNP sites is one of the current research hotspots.
  • In SNP typing methods based on second-generation high-throughput sequencing, when the SNP site to be tested is far from the end of the sequencing primer, the detection time is greatly prolonged and, because the method is limited by the sequencing read length, the accuracy is greatly reduced. In addition, when multiple SNP sites of multiple susceptibility genes are detected simultaneously, it is usually necessary to design several different sequencing primers for the sequences near the different SNP sites to be tested. Different sequencing primers easily interfere with one another, and it is difficult for each primer to anchor accurately at its specific position, which increases the difficulty of primer design and may reduce the accuracy of SNP typing.
  • The object of the present invention is to provide a library construction method and an SNP typing method that address two problems in the prior art: SNP typing accuracy is affected by the sequencing read length, and multiple SNP sites are difficult to detect simultaneously in the same system.
  • The present invention provides a library construction method, comprising the following steps:
  • A. PCR-amplifying the sample to be sequenced, which contains the SNP site to be tested, to obtain an amplification product;
  • C. digesting the ligation product with a Type IIS restriction endonuclease to obtain a first nucleic acid fragment containing the SNP site to be tested and linker 1, the digestion forming a first end on the first nucleic acid fragment;
  • D. ligating linker 2 to the first end of the first nucleic acid fragment to obtain a library molecule, linker 2 being a double-stranded nucleic acid molecule comprising a sequencing primer binding site.
  • The Type IIS restriction endonuclease recognition sequence is located on at least one amplification primer of the PCR primer set and is introduced into the ligation product by PCR amplification.
  • Alternatively, the Type IIS restriction endonuclease recognition sequence is located on linker 1 and is introduced into the ligation product by the ligation reaction.
  • One amplification primer of the primer set used for the PCR amplification contains a cleavable site or a cleavable sequence, and step B includes the following steps:
  • Linker 1 is ligated to the second end of the amplification product to obtain a ligation product comprising a Type IIS restriction endonuclease recognition sequence and a Type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the Type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases.
  • Step B1 and step B2 are carried out simultaneously in the same reaction system.
  • The sequencing primer binding site is located at the end of linker 2 that is joined to the first end.
  • The present invention also provides an SNP typing method comprising the step of sequencing a library molecule prepared according to any of the above methods.
  • The method further comprises the step of immobilizing the library molecule on a solid support.
  • A plurality of library molecules are obtained, mixed, and then sequenced.
  • The plurality of library molecules each contain a different tag sequence.
  • In the library molecule obtained, the distance between linker 2 and the SNP site to be tested is 0 to 5 bases.
  • Library molecules containing different SNP sites to be tested can subsequently be mixed and then sequenced.
  • The sequencing primer is fully complementary to the sequence on linker 2 of the library molecule, so the same sequencing primer can be used to sequence multiple SNP sites to be tested. This reduces the difficulty of sequencing primer design, ensures consistent primer anchoring efficiency at every SNP site, avoids the mutual interference that different primers cause during sequencing, and improves sequencing accuracy.
  • The library molecule can be conveniently purified and, during sequencing, can more conveniently be addressably immobilized on a solid support to facilitate the sequencing step.
  • FIG. 1 is a polyacrylamide gel electrophoresis image of the library molecule in the second embodiment of the present invention.
  • FIG. 2 is a polyacrylamide gel electrophoresis image of the library molecules in the fifth embodiment of the present invention.
  • The present invention provides a first embodiment, a library construction method comprising the following steps:
  • A. PCR-amplifying the sample to be sequenced, which contains the SNP site to be tested, to obtain an amplification product;
  • the distance between the Type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases;
  • C. digesting the ligation product with a Type IIS restriction endonuclease to obtain a first nucleic acid fragment containing the SNP site to be tested and linker 1, the digestion forming a first end on the first nucleic acid fragment;
  • D. ligating linker 2 to the first end of the first nucleic acid fragment to obtain a library molecule, linker 2 being a double-stranded nucleic acid molecule comprising a sequencing primer binding site.
  • The present invention obtains a library molecule comprising a Type IIS restriction endonuclease recognition sequence, in which the distance between linker 2 and the SNP site to be tested is 0 to 5 bases.
  • Because the sequencing primer is fully complementary to the sequence on linker 2 of the library molecule, the same sequencing primer can be used to sequence multiple SNP sites to be tested. This reduces the difficulty of sequencing primer design, ensures consistent primer anchoring efficiency at every SNP site, avoids the mutual interference that different primers cause during sequencing, and improves sequencing accuracy;
  • the detection can be completed with fewer sequencing steps, which greatly shortens the sequencing time; detection of the SNP site to be tested is not limited by the read length of the sequencing instrument, and accuracy can be improved.
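The claim about fewer sequencing steps can be illustrated with simple arithmetic. The sketch below assumes a simplified model in which each sequencing cycle reads exactly one base, starting with the base immediately after the sequencing primer; the function name and the 80-base comparison value are illustrative assumptions, not taken from the patent.

```python
def cycles_to_reach_snp(distance_bases: int) -> int:
    """Cycles needed to read the SNP, assuming one base is read per cycle.

    distance_bases is the number of bases between the end of the sequencing
    primer (on linker 2) and the SNP site; 0 means the SNP is the first base read.
    """
    if distance_bases < 0:
        raise ValueError("distance must be non-negative")
    return distance_bases + 1


# Range stated in the text: the SNP lies 0 to 5 bases from linker 2.
for distance in range(0, 6):
    print(distance, cycles_to_reach_snp(distance))

# Hypothetical comparison: an SNP 80 bases from the primer in a conventional library.
print(cycles_to_reach_snp(80))
```

Under this model, a library built by the method needs at most 6 cycles to reach the SNP, regardless of where the SNP sits in the original amplicon.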
  • the Type IIS restriction endonuclease is used to recognize the Type IIS restriction endonuclease recognition sequence and to cleave at the Type IIS restriction endonuclease cleavage site.
  • A Type IIS restriction endonuclease is a restriction endonuclease whose cleavage site lies outside its recognition sequence, including but not limited to: Acu I, Alw I, Bbs I, Bbv I, Bcc I, BceA I, BciV I, BfuA I, Bmr I, Bpm I, BpuE I, Bsa I, BseM II, BseR I, Bsg I, BsmA I, BsmB I, BsmF I, BspCN I, BspM I, BspQ I, BtgZ I, Ear I, Eci I, EcoP15 I, Fau I, Fok I, Hga I, Hph I, HpyA V
  • the sample to be sequenced is a nucleic acid molecule containing a SNP site to be tested, including but not limited to a DNA molecule, a cDNA molecule or an RNA molecule.
  • the PCR amplification described in step A can be either single molecule amplification or non-single molecule amplification.
  • the single molecule amplification is emulsion PCR, bridge PCR or emulsion bridge PCR.
  • The non-single molecule amplification is conventional PCR, real-time fluorescent quantitative PCR, asymmetric PCR, solid-phase PCR, in situ PCR, reverse transcription PCR, nested PCR, degenerate primer PCR, immuno-PCR, inverse PCR or touchdown PCR.
  • The Type IIS restriction endonuclease recognition sequence is located on at least one amplification primer of the PCR primer set and is introduced into the amplification product during the amplification reaction of step A through complementary pairing of the primer with the template strand; or,
  • the Type IIS restriction endonuclease recognition sequence is located on linker 1 and is introduced into the ligation product by ligation of linker 1.
  • Linker 1 may be ligated directly to an end of the amplification product; alternatively, the amplification product may be cleaved to form a second end, and linker 1 ligated to that second end.
  • At least one of the amplification primers comprises a cleavable site or a cleavable sequence;
  • step B comprises the following steps:
  • Linker 1 is ligated to the second end of the amplification product to obtain a ligation product comprising a Type IIS restriction endonuclease recognition sequence and a Type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the Type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases.
  • The cleavable site is U, and the cleavage agent is USER enzyme;
  • or the cleavable sequence is an RNA sequence, and the cleavage agent is RNase H;
  • or the cleavable sequence is a restriction endonuclease recognition sequence, and the cleavage agent is the corresponding restriction endonuclease.
  • The ligation of linker 1 and of linker 2 is carried out by a ligase.
  • The ligase may be selected from E. coli DNA ligase, T4 DNA ligase, thermostable DNA ligase and Tth DNA ligase. T4 DNA ligase is preferred: it is highly versatile and can ligate both sticky ends and blunt ends.
  • A step of inactivating the ligase in the reaction system is further included; this avoids re-ligation of the cleaved products during the enzymatic digestion of step C.
  • Step B1 and step B2 may be carried out stepwise or simultaneously in the same reaction system.
  • Preferably, step B1 and step B2 are carried out simultaneously in the same reaction system; compared with the stepwise scheme, this simplifies the library construction steps and improves library construction efficiency.
  • Step C and step D may be carried out stepwise or simultaneously in the same reaction system.
  • Preferably, step C and step D are carried out simultaneously in the same reaction system; compared with the stepwise scheme, this simplifies the library construction steps and improves library construction efficiency.
  • Step B1 and step C may be carried out simultaneously in the same system, and step B2 and step D may also be carried out simultaneously in the same system; or steps B1, B2, C and D are all carried out simultaneously in the same reaction system. Compared with the stepwise scheme, this simplifies the library construction steps and improves library construction efficiency.
  • The linker carries a biotin label at the end opposite to the end joined to the first end; the biotin label makes it convenient to separate the library molecules after step D.
  • The linker is pre-immobilized on a solid-phase carrier containing streptavidin or avidin via its biotin label, and can thus be held on the solid-phase carrier in subsequent steps.
  • Linker 2 is a double-stranded nucleic acid molecule containing a sequencing primer binding site. By ligating linker 2 to the first nucleic acid fragment, the present invention allows library molecules containing different SNP sites to be tested to be mixed and then sequenced. During sequencing, the sequencing primers for the various SNP sites to be tested are unified, which reduces the difficulty of sequencing primer design, ensures consistent primer anchoring efficiency at every SNP site, avoids the mutual interference that different primers cause during sequencing, and improves sequencing accuracy.
  • The sequencing primer binding site is located at the end of linker 2 that is joined to the first end. Compared with placing the sequencing primer binding site elsewhere on linker 2, this shortens the distance between the SNP site to be tested and the sequencing primer and reduces the number of sequencing steps, thereby shortening the detection time and improving detection accuracy.
  • The present invention proposes a second embodiment, which constructs a library of CYP2C9 gene fragments containing the rs1799853 locus using the human whole blood genome as a template.
  • The centrifuge tube was placed in a PCR machine and the following program was set: 94 °C for 4 minutes; 30 cycles of 94 °C for 20 seconds, 49 °C for 20 seconds and 72 °C for 1 minute; then 72 °C for 3 minutes. After the PCR reaction was complete, the first-round PCR product was obtained;
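As a rough check on the first-round PCR program, the hold times can be summed. This is a minimal sketch: the `Step` class and function name are illustrative, and instrument ramp time between temperatures is ignored, so a real run takes longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


def program_seconds(initial: Step, cycle: list, n_cycles: int, final: Step) -> int:
    """Total hold time of a PCR program, ignoring ramp time between steps."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# Program as stated in the text for the first-round PCR.
initial = Step(94, 4 * 60)
cycle = [Step(94, 20), Step(49, 20), Step(72, 60)]
final = Step(72, 3 * 60)

total = program_seconds(initial, cycle, 30, final)
print(total, "s =", total / 60, "min")  # 3420 s = 57.0 min
```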
  • The upstream primer (SEQ ID NO: 3) contains U and the Acu I recognition sequence CTGAAG; the 3' end of the upstream primer is adjacent to the SNP site to be tested, and the Acu I cleavage site is located at the SNP site;
  • Acu I was used to digest the ligation product of step B. The reaction system was prepared as follows: 27 µL of the ligation product of step B; 1 µL of Acu I at 2 units/µL; 0.5 µL of 3.2 mM S-adenosylmethionine; 2.5 µL of 10× buffer; and 2 µL of deionized water. The reaction was run at 37 °C for 1 hour and inactivated at 65 °C for 20 minutes, yielding a first nucleic acid fragment containing the SNP site to be tested and linker 1.
  • The digestion forms on the first nucleic acid fragment a first end with two overhanging bases, and the SNP site to be tested is at that end.
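The Acu I digestion mix can be checked with the usual dilution relation C1·V1 = C2·V2. The sketch below uses only the volumes and stock concentrations listed above; the dictionary keys and helper name are illustrative assumptions.

```python
# Volumes in microlitres, as listed for the Acu I digestion.
components_ul = {
    "ligation_product": 27.0,
    "acu_i": 1.0,        # stock: 2 units/uL
    "sam": 0.5,          # S-adenosylmethionine, stock: 3.2 mM
    "buffer_10x": 2.5,
    "water": 2.0,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration after dilution: C2 = C1 * V1 / V_total."""
    return stock * volume_ul / total_ul


total_ul = sum(components_ul.values())
sam_final_mm = final_concentration(3.2, components_ul["sam"], total_ul)
acu_final_u_per_ul = final_concentration(2.0, components_ul["acu_i"], total_ul)

print(f"total volume: {total_ul} uL")
print(f"S-adenosylmethionine: {sam_final_mm * 1000:.1f} uM")
print(f"Acu I: {acu_final_u_per_ul:.3f} units/uL")
```

The listed volumes sum to 33 µL, so the 2.5 µL of 10× buffer alone does not reach 1×. The calculation treats the ligation product as buffer-free, which may not hold in practice.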
  • the centrifuge tube was placed on a magnetic rack, and the supernatant was separated and removed to obtain a first nucleic acid fragment adsorbed on the magnetic beads.
  • 15 µL of 4× BW buffer and 16.5 µL of water were added, and the mixture was incubated at room temperature for 30 min.
  • The magnetic beads were washed once with 50 µL of 1× NXS (containing 1% Triton), once with 50 µL of 1× NXS (containing 0.01% Triton) and once with 50 µL of 1× TE (containing 50 mM KCl), then resuspended in 20 µL of 1× CutSmart buffer, treated with 2 units of shrimp alkaline phosphatase at 37 °C for 1 h, and inactivated at 65 °C for 5 min after the reaction was complete.
  • The following components were added to the reaction system of step C: 1 µL of the linker at 10 pmol/µL; 1 µL of T4 DNA ligase at 2 units/µL; 7.5 µL of ligase buffer (containing 117 mM Tris-HCl, 17.5 mM MgCl2, 35 mM DTT, 5 mM ATP and 23.415% (w/v) PEG 6000); and 0.5 µL of deionized water. The reaction was run at room temperature for 1 hour.
  • Linker 2 comprises two linkers, consisting of SEQ ID NO: 7 and SEQ ID NO: 8, and of SEQ ID NO: 9 and SEQ ID NO: 8, respectively; SEQ ID NO: 8 is the sequencing primer binding sequence.
  • The centrifuge tube was placed on a magnetic rack and the supernatant was separated and removed, leaving a library molecule containing linker 1, linker 2 and the rs1799853 site adsorbed on the magnetic beads.
  • The library molecules were verified by amplifying them in the following reaction system: 1 µL of 100-fold diluted magnetic bead suspension; 10 µL of 2× long Taq Mix; 0.4 µL of 10 µM upstream primer (SEQ ID NO: 10); 0.4 µL of 10 µM downstream primer (SEQ ID NO: 11); and 20 µL of deionized water; the components were mixed and centrifuged.
  • The polyacrylamide gel electrophoresis results of the verification product are shown in Fig. 1, where lane 0 is a molecular size marker and lane 1 is the verification product. The target band appears near the 60 bp position, fully consistent with the theoretically expected library molecule size, indicating that the method of the present invention can achieve library construction for the sample to be sequenced.
  • The upstream primer (SEQ ID NO: 3) was designed to provide both specific amplification of the SNP site to be tested and the Acu I recognition sequence CTGAAG.
  • The 3' end of the upstream primer is adjacent to the SNP site to be tested, which ensures PCR amplification efficiency. Because the Acu I cleavage site lies 16 bp downstream of the Acu I recognition sequence and the upstream primer sequence following the recognition sequence is 15 bp long, the cleavage site falls at the SNP site to be tested. Digesting the second-round PCR product with Acu I yields a first nucleic acid fragment whose first end has two overhanging bases, with the SNP site to be tested at that end.
  • Because the SNP site to be tested in this example is at the end and has two possible variants, linker 2 comprises two double-stranded nucleic acid molecules, each of which ligates to the first end carrying one of the two variants.
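The primer geometry described above can be sketched in a few lines. The top-strand offset of 16 comes from the text; the bottom-strand offset of 14 follows the usual Acu I notation CTGAAG(16/14) and is an assumption here, though it agrees with the two overhanging bases the text reports. The example sequence is hypothetical and is not SEQ ID NO: 3.

```python
ACUI_SITE = "CTGAAG"
TOP_OFFSET = 16     # top-strand cut, bases downstream of the site (from the text)
BOTTOM_OFFSET = 14  # bottom-strand cut (assumed from the CTGAAG(16/14) notation)


def acui_cut(top_strand: str):
    """Return (top_cut, bottom_cut, overhang) for the first Acu I site.

    Cut positions are 0-based indices into top_strand; the strand is cut
    immediately before that index. overhang is the 3' overhang as read on
    the top strand.
    """
    start = top_strand.find(ACUI_SITE)
    if start < 0:
        raise ValueError("no Acu I recognition sequence found")
    site_end = start + len(ACUI_SITE)
    top_cut = site_end + TOP_OFFSET
    bottom_cut = site_end + BOTTOM_OFFSET
    if top_cut > len(top_strand):
        raise ValueError("sequence too short for the cut site")
    return top_cut, bottom_cut, top_strand[bottom_cut:top_cut]


# Hypothetical amplicon: 2 nt, the site, 15 primer nt, the SNP base, then flanking sequence.
primer_tail = "ACGTACGTACGTACG"  # 15 nt between the site and the SNP
snp_base = "C"
amplicon = "AT" + ACUI_SITE + primer_tail + snp_base + "TTGACCA"

top_cut, bottom_cut, overhang = acui_cut(amplicon)
print(top_cut, bottom_cut, overhang)  # 24 22 GC
```

With a 15 nt primer tail, the SNP base is the last base before the top-strand cut, so it forms the terminal base of the two-base overhang that linker 2 must match.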
  • The present invention proposes a third embodiment, which constructs a library of CYP2C9 gene fragments containing the rs1057910 locus using the human whole blood genome as a template.
  • The primer set for the first round of PCR amplification in step A is: upstream primer (SEQ ID NO: 12) and downstream primer (SEQ ID NO: 13); the primer set for the second round of PCR amplification is: upstream primer (SEQ ID NO: 14) and downstream primer (SEQ ID NO: 15).
  • The upstream primer (SEQ ID NO: 14) contains the Nb.BbvCI cleavage sequence GCTGAGG;
  • the downstream primer (SEQ ID NO: 15) contains the BceA I recognition sequence ACGGC;
  • the distance between the BceA I cleavage site and the rs1057910 site to be tested is 4 bp.
  • The above two reaction systems were reacted at 25 °C for one hour and inactivated at 80 °C for 20 minutes. After the reaction was complete, the library molecules were hybridized to the flow cell via the biotin label at the end of the linker.
  • The library molecules were verified with the following verification primer set: upstream primer (SEQ ID NO: 18) and downstream primer (SEQ ID NO: 11). The target band appeared near the 95 bp position, fully consistent with the theoretically expected library molecule size, indicating that the method of the present invention can achieve library construction for the sample to be sequenced.
  • The cleavage-ligation reaction system of this embodiment includes Nb.BbvCI, BceA I, linker 1 and linker 2; both ends of the amplification product are cleaved and ligated to linker 1 and linker 2, respectively, simultaneously in the same system. The reaction achieves multi-enzyme cooperation, simplifies the library construction steps and improves library construction efficiency.
  • the present invention proposes a fourth embodiment, which constructs a VKORC1 gene fragment library containing the rs9923231 locus using the human whole blood genome as a template.
  • This example differs from the second embodiment in that the primer set for the second round of PCR amplification is SEQ ID NO: 19 and SEQ ID NO: 20, wherein SEQ ID NO: 19 contains U.
  • Linker 1 consists of SEQ ID NO: 21 and SEQ ID NO: 22, wherein SEQ ID NO: 21 contains the Acu I recognition sequence CTGAAG; the ligation of linker 1 to the amplification product introduces the Acu I recognition sequence into the ligation product. The 5' end of SEQ ID NO: 20 carries a biotin tag and is pre-immobilized on streptavidin-containing magnetic beads.
  • Linker 2 in step C is an equal mixture of the following linkers: a linker consisting of SEQ ID NO: 7 and SEQ ID NO: 8, a linker consisting of SEQ ID NO: 9 and SEQ ID NO: 8, a linker consisting of SEQ ID NO: 23 and SEQ ID NO: 8, and a linker consisting of SEQ ID NO: 24 and SEQ ID NO: 8.
  • The verification primer set was an upstream primer (SEQ ID NO: 25) and a downstream primer (SEQ ID NO: 11).
  • The verification product of the library molecule of this example was subjected to agarose gel electrophoresis. The target band appeared near the 65 bp position, fully consistent with the theoretically expected library molecule size, indicating that the method of the present invention can achieve library construction for the sample to be sequenced.
  • In this example the Acu I recognition sequence is placed on linker 1 and is introduced into the library molecule through the ligation of linker 1 to the amplification product, which reduces the difficulty of amplification primer design.
  • The SNP site may carry any of four variants, so linker 2 includes four double-stranded nucleic acid molecules, each of which ligates to the first end carrying one of the four variants.
  • The fifth embodiment of the present invention sets up five different reaction systems using the human whole blood genome as a template, and constructs libraries of, respectively, a CYP2C9 gene fragment containing the rs1799853 locus, a CYP2C9 gene fragment containing the rs1057910 locus, and a VKORC1 containing the rs9923231 locus.
  • The primer sets for the first round of PCR amplification in each reaction system are: an upstream primer (SEQ ID NO: 1) and a downstream primer (SEQ ID NO: 2), respectively.
  • The primer sets for the second round of PCR amplification are: upstream primer (SEQ ID NO: 3) and downstream primer (SEQ ID NO: 4), upstream primer (SEQ ID NO: 32) and downstream primer (SEQ ID NO: 33), upstream primer (SEQ ID NO: 34) and downstream primer (SEQ ID NO: 21), upstream primer (SEQ ID NO: 35) and downstream primer (SEQ ID NO:
  • Linker 1 in the respective reaction systems consists of the following sequences: SEQ ID NO: 5 and SEQ ID NO: 6, SEQ ID NO: 39 and SEQ ID NO: 40, SEQ ID NO: 41 and SEQ ID NO: 42, SEQ ID NO: 43 and SEQ ID NO: 44, SEQ ID NO: 45 and SEQ ID NO: 46. The primer set of each linker 1 contains a tag sequence, respectively ACTG, TGCA, GTAC, CATG and AGTC. Linker 2 is a mixture of SEQ ID NO: 7 and SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 8, SEQ ID NO: 24 and SEQ ID NO: 8, SEQ ID NO: 25 and SEQ ID NO: 8, SEQ ID NO: 47 and SEQ ID NO: 8, and SEQ ID NO: 48 and SEQ ID NO: 8, wherein the 3' ends of SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 24 and SEQ ID NO: 25 are amino-modified to prevent self-ligation during the reaction, and the 5' end of SEQ ID NO: 8 is phosphorylated.
  • The polyacrylamide gel electrophoresis results of the verification products are shown in Fig. 2, where lane 0 is a molecular size marker and lanes 1-5 are the verification products of the above five reaction systems. The target bands appear near 63 bp, 74 bp, 121 bp, 127 bp and 112 bp, respectively, fully consistent with the theoretically expected library molecule sizes, indicating that the method of the present invention can achieve library construction for the sample to be sequenced.
  • The present invention also proposes a sixth embodiment, an SNP typing method comprising the step of sequencing a library molecule prepared according to the library construction method of any of the above embodiments.
  • The library molecule of the present invention contains linker 2, to which the sequencing primer binds specifically, so the sequencing primers for multiple different SNP sites in the same system are unified, which avoids mutual interference between sequencing primers.
  • The method further comprises the step of immobilizing the library molecule on a solid support.
  • Addressable immobilization means that the position information is fixed; that is, the library molecules immobilized at each specific position on the solid-phase carrier can be clearly distinguished from those immobilized at other positions.
  • The library molecules can be immobilized on the solid support directly or indirectly.
  • In one embodiment, a library molecule is hybridized directly to a flow cell through a linker to achieve addressable immobilization of the library molecule;
  • in another, a library molecule is immobilized through a linker on a microsphere that has been pre-fixed on a solid support, achieving addressable immobilization of the library molecule.
  • In a further embodiment, a library molecule is first immobilized on a microsphere, and the microsphere is then immobilized on a solid-phase carrier, achieving addressable immobilization of the library molecule.
  • Libraries are constructed separately for the different samples to be sequenced to obtain a plurality of library molecules, which are then mixed and sequenced in the same system.
  • The plurality of library molecules each contain a different tag sequence, so that the sequencing results of different library molecules can be distinguished by their tag sequences.
  • The tag sequence is located on linker 1.
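Distinguishing pooled libraries by tag amounts to a lookup on the tag portion of each read. The sketch below uses the five tag sequences given in the fifth embodiment. It assumes, for illustration only, that the 4-base tag is the first 4 bases of each read and must match exactly; the system names and function name are hypothetical labels.

```python
# Tag sequences from the fifth embodiment; the system_N labels are illustrative.
TAG_TO_SYSTEM = {
    "ACTG": "system_1",
    "TGCA": "system_2",
    "GTAC": "system_3",
    "CATG": "system_4",
    "AGTC": "system_5",
}
TAG_LEN = 4


def demultiplex(reads):
    """Group reads by exact match of their leading tag; strip the tag on match."""
    bins = {name: [] for name in TAG_TO_SYSTEM.values()}
    bins["unassigned"] = []
    for read in reads:
        name = TAG_TO_SYSTEM.get(read[:TAG_LEN])
        if name is None:
            bins["unassigned"].append(read)  # keep the full read for inspection
        else:
            bins[name].append(read[TAG_LEN:])
    return bins


example = demultiplex(["ACTGC", "AGTCG", "NNNNA"])
print(example["system_1"], example["system_5"], example["unassigned"])
```

Exact matching keeps the sketch short; a production pipeline would typically tolerate sequencing errors in the tag, which these five tags were not necessarily designed for.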
  • The sequencing method is a second-generation high-throughput gene sequencing technology, including but not limited to sequencing by ligation or sequencing by synthesis.
  • Sequencing by ligation relies on the fidelity of a ligase in the ligation reaction between nucleic acid fragments.
  • The nucleic acid fragment to be sequenced serves as a template, and the sequencing primer and an oligonucleotide probe (fluorescently labeled according to the base at a specific position of the probe) undergo a ligation reaction. Detecting the fluorescent label on the ligation product gives the sequence information at the template position corresponding to that labeled position of the oligonucleotide probe.
  • Ligation sequencing methods commonly available on the market include but are not limited to: the Pstar ligation sequencing method of Shenzhen Huayinkang Gene Technology Co., Ltd., ABI's ligation sequencing method, and Complete Genomics' ligation sequencing method.
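The base-calling logic of sequencing by ligation can be sketched as a two-step lookup: the probe whose interrogating base is complementary to the template base ligates, and its fluorescent label identifies that base. The dye names and the one-dye-per-base assignment below are hypothetical; the text does not specify a labeling scheme.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical assignment of one dye to each interrogating probe base.
DYE_FOR_PROBE_BASE = {"A": "dye_1", "C": "dye_2", "G": "dye_3", "T": "dye_4"}
PROBE_BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_PROBE_BASE.items()}


def ligated_probe_dye(template_base: str) -> str:
    """Dye observed when the probe complementary to template_base ligates."""
    return DYE_FOR_PROBE_BASE[COMPLEMENT[template_base]]


def call_template_base(dye: str) -> str:
    """Infer the template base from the dye detected on the ligation product."""
    return COMPLEMENT[PROBE_BASE_FOR_DYE[dye]]


for base in "ACGT":
    dye = ligated_probe_dye(base)
    print(base, "->", dye, "->", call_template_base(dye))
```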
  • Sequencing by synthesis relies on the fidelity of a polymerase during nucleic acid strand extension. The nucleic acid fragment to be sequenced serves as a template, and an anchor primer (also called a sequencing primer, complementary to a strand of the nucleic acid fragment to be sequenced) binds to it by complementary pairing; the sequence information at the corresponding position on the nucleic acid fragment to be sequenced is determined by detecting the signal generated during extension.
  • Sequencing-by-synthesis methods commonly available on the market include but are not limited to: Illumina's Solexa method, Roche's 454 method, and Life Technologies' Ion Torrent and Ion Proton methods.
  • the present invention also provides a seventh embodiment, a method for detecting the MTHFR gene rs1799853 site.
  • the present embodiment further includes the following steps on the basis of the second embodiment:
  • sequencing is performed by ligation sequencing; the sequencing primer (SEQ ID NO: 12) is immobilized on the sequencing primer binding sequence of the sequencing linker, and a degenerate nonamer XNNNNNNNN that carries a fluorescent group and is complementary to the detection site is used as the sequencing probe; after one round of ligation sequencing, the rs1801133 site of the MTHFR gene was determined to be C.
  • the present invention also provides an eighth embodiment, a method for detecting the MTHFR gene rs1057910 locus.
  • the present embodiment further includes the following steps on the basis of the third embodiment:
  • the present invention also provides a ninth embodiment, which simultaneously detects the rs1799853 and rs1057910 sites on the CYP2C9 gene fragment, the rs9923231 site on the VKORC1 gene fragment, and the rs4244285 and rs4986893 sites on the CYP2C19 gene fragment.
  • this embodiment differs from the fourth embodiment in that it further includes the following steps:
  • the library molecules adsorbed on the magnetic beads were mixed, and 20 μL of a 0.1 M NaOH solution was added to denature the template into single strands; the supernatant was separated and removed, and the beads were washed twice with 20 μL of 1×TE (containing 0.01% Triton by mass), washed once with 20 μL of 1×TE, and finally resuspended in 10 μL of 1×TE for use as the sequencing template;
  • sequencing was performed by ligation sequencing, with the sequencing primer (SEQ ID NO: 43) immobilized on the sequencing primer binding site of the sequencing linker; after one round of ligation sequencing, the SNP sites to be tested were determined to be C, A, T, G and G, respectively.
  • in this embodiment the same sequencing linker is ligated to every library molecule, so one sequencing primer serves for all the SNP sites detected in the same system, which avoids mutual interference between different primers.
  • the present invention provides a tenth embodiment, a kit for detecting mutations at multiple SNP sites, the kit comprising an amplification primer set and/or linker 1; the amplification primer set is used to specifically amplify at least one of the multiple SNP sites; linker 1 is a double-stranded nucleic acid molecule for ligation to the amplification product of a sample to be sequenced that contains the SNP site to be tested; at least one amplification primer in the primer set, or linker 1, contains a type IIS restriction endonuclease recognition sequence, such that the distance between the resulting type IIS restriction endonuclease cleavage site and the SNP site is 0 to 5 bases.
  • the multiple SNP sites comprise at least one of rs1799853, rs1057910, rs9923231, rs4244285 and rs4986893; the primer set for specifically amplifying the rs1799853 site comprises at least one of the pairs SEQ ID NO: 3 and SEQ ID NO: 4, SEQ ID NO: 1 and SEQ ID NO: 2, SEQ ID NO: 1 and SEQ ID NO: 4, SEQ ID NO: 3 and SEQ ID NO: 2;
  • the primer set for specifically amplifying the rs1057910 site comprises at least one of the pairs SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 15, SEQ ID NO: 32 and SEQ ID NO: 33, SEQ ID NO: 12 and SEQ ID NO: 15, SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 33, SEQ ID NO: 32 and SEQ ID NO: 13, SEQ ID NO: 32 and SEQ ID NO: 15;
  • the primer set for specifically amplifying the rs9923231 site comprises at least one of the pairs SEQ ID NO: 1 and SEQ ID NO: 2, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 1 and SEQ ID NO: 20, SEQ ID NO: 19 and SEQ ID NO: 2; the primer set for specifically amplifying the rs4244285 site comprises at least one of the pairs SEQ ID NO: 28 and SEQ ID NO: 29, SEQ ID NO: 35 and SEQ ID NO: 36, SEQ ID NO: 28 and SEQ ID NO: 36, SEQ ID NO: 35 and SEQ ID NO: 29; the primer set for specifically amplifying the rs4986893 site comprises at least one of the pairs SEQ ID NO: 30 and SEQ ID NO: 31, SEQ ID NO: 37 and SEQ ID NO: 38, SEQ ID NO: 30 and SEQ ID NO: 38, SEQ ID NO: 37 and SEQ ID NO: 31.
  • linker 1 is selected from at least one of: rs1799853 linker 1 consisting of SEQ ID NO: 5 and SEQ ID NO: 6; rs1057910 linker 1 consisting of SEQ ID NO: 16 and SEQ ID NO: 6 or of SEQ ID NO: 39 and SEQ ID NO: 40; rs9923231 linker 1 consisting of SEQ ID NO: 21 and SEQ ID NO: 22; rs4244285 linker 1 consisting of SEQ ID NO: 43 and SEQ ID NO: 44; and rs4986893 linker 1 consisting of SEQ ID NO: 45 and SEQ ID NO: 46; the rs1799853 linker 1 is for ligation to a gene fragment containing the rs1799853 site, the rs1057910 linker 1 to a gene fragment containing the rs1057910 site, the rs9923231 linker 1 to a gene fragment containing the rs9923231 site, the rs4244285 linker 1 to a gene fragment containing the rs4244285 site, and the rs4986893 linker 1 to a gene fragment containing the rs4986893 site.
  • the primer set for specifically amplifying the rs1799853 site comprises the two primer pairs SEQ ID NO: 3 and SEQ ID NO: 4, and SEQ ID NO: 1 and SEQ ID NO: 2; SEQ ID NO: 1 and SEQ ID NO: 2 are used as the outer primer pair for specific amplification and SEQ ID NO: 3 and SEQ ID NO: 4 as the inner primer pair, and the amplification accuracy of this scheme is higher.
  • the kit further comprises rs1799853 linker 2, which is a mixture of a double-stranded nucleic acid molecule consisting of SEQ ID NO: 7 and SEQ ID NO: 8 and a double-stranded nucleic acid molecule consisting of SEQ ID NO: 9 and SEQ ID NO: 8; linker 2 is for ligation to a gene fragment containing the rs1799853 site.
  • in this scheme, because rs1799853 linker 2 is ligated directly to the rs1799853 site, which has two possible variants, rs1799853 linker 2 is designed as a mixture of double-stranded nucleic acid molecules that can ligate to each of the two variants.
  • the primer set for specifically amplifying the rs1057910 site comprises the three primer pairs SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 15, and SEQ ID NO: 32 and SEQ ID NO: 33; SEQ ID NO: 12 and SEQ ID NO: 13 are used as the outer primer pair for specific amplification, and SEQ ID NO: 14 and SEQ ID NO: 15 and/or SEQ ID NO: 32 and SEQ ID NO: 33 as the inner primer pair, and the amplification accuracy of this scheme is higher.
  • the kit further comprises rs1057910 linker 2 consisting of SEQ ID NO: 17 and SEQ ID NO: 11, for ligation to a gene fragment containing the rs1057910 site.
  • the primer set for specifically amplifying the rs9923231 site comprises the two primer pairs SEQ ID NO: 1 and SEQ ID NO: 2, and SEQ ID NO: 19 and SEQ ID NO: 20; SEQ ID NO: 1 and SEQ ID NO: 2 are used as the outer primer pair for specific amplification and SEQ ID NO: 19 and SEQ ID NO: 20 as the inner primer pair, and the amplification accuracy of this scheme is higher.
  • the kit further comprises rs9923231 linker 2, which is a mixture of a double-stranded nucleic acid molecule consisting of SEQ ID NO: 21 and SEQ ID NO: 22, a double-stranded nucleic acid molecule consisting of SEQ ID NO: 9 and SEQ ID NO: 8, a double-stranded nucleic acid molecule consisting of SEQ ID NO: 23 and SEQ ID NO: 8, and a double-stranded nucleic acid molecule consisting of SEQ ID NO: 24 and SEQ ID NO: 8; rs9923231 linker 2 is for ligation to a gene fragment containing the rs9923231 site.
  • in this scheme, because rs9923231 linker 2 is ligated directly to the rs9923231 site, which has four possible variants, rs9923231 linker 2 is designed as a mixture of double-stranded nucleic acid molecules that can ligate to each of the four variants.
  • the primer set for specifically amplifying the rs4244285 site comprises the two primer pairs SEQ ID NO: 28 and SEQ ID NO: 29, and SEQ ID NO: 35 and SEQ ID NO: 36; SEQ ID NO: 28 and SEQ ID NO: 29 are used as the outer primer pair for specific amplification and SEQ ID NO: 35 and SEQ ID NO: 36 as the inner primer pair, and the amplification accuracy of this scheme is higher.
  • the kit further comprises rs4244285 linker 2 consisting of SEQ ID NO: 47 and SEQ ID NO: 8, for ligation to a gene fragment containing the rs4244285 site.
  • the primer set for specifically amplifying the rs4986893 site comprises the two primer pairs SEQ ID NO: 30 and SEQ ID NO: 31, and SEQ ID NO: 37 and SEQ ID NO: 38; SEQ ID NO: 30 and SEQ ID NO: 31 are used as the outer primer pair for specific amplification and SEQ ID NO: 37 and SEQ ID NO: 38 as the inner primer pair, and the amplification accuracy of this scheme is higher.
  • the kit further comprises rs4986893 linker 2 consisting of SEQ ID NO: 48 and SEQ ID NO: 8, for ligation to a gene fragment containing the rs4986893 site.
  • the kit further comprises sequencing primer SEQ ID NO: 12.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biomedical Technology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

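Provided is a library construction method, comprising: PCR-amplifying a sample to be sequenced to obtain an amplification product; ligating linker 1 to the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases; digesting the ligation product with the type IIS restriction endonuclease to obtain a first nucleic acid fragment containing the SNP site to be tested and linker 1, a first end being formed on the first nucleic acid fragment by the digestion; and ligating linker 2 to the first end of the first nucleic acid fragment to obtain a library molecule, wherein linker 2 is a double-stranded nucleic acid molecule containing a sequencing primer binding site. Also provided are an SNP typing method and a kit for detecting SNP site mutations. The method shortens site detection time, improves detection accuracy, and lets a single sequencing primer serve for all sites detected in the same system.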

Description

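Library construction method and SNP typing method
Technical Field
The present invention relates to the field of molecular biology and, more specifically, to a library construction method and an SNP typing method.
Background Art
A single nucleotide polymorphism (SNP) is a variation such as a transition, transversion, insertion or deletion at a single nucleotide position in the genome; SNPs are numerous and highly polymorphic. SNPs are regarded as genetic markers, and many phenotypic differences between individuals, as well as susceptibility to drugs or diseases, may be related to SNPs. SNP typing is therefore valuable for the treatment of, and choice of medication for, many diseases.
For genetic testing, second-generation high-throughput sequencing is accurate and sensitive, so its range of application keeps growing and now covers many aspects of life science and medical research; using it to detect SNP sites is one of the current research hotspots. However, in SNP typing methods based on second-generation high-throughput sequencing, when the distance between the SNP site to be tested and the end of the sequencing primer is long, detection time increases greatly and, because of the limited read length of the sequencing method, detection accuracy drops considerably. In addition, when multiple SNP sites of multiple susceptibility genes are detected simultaneously, several different sequencing primers usually have to be designed against the sequences near the different SNP sites; different sequencing primers readily interfere with one another and are difficult to anchor accurately at their specific positions, which makes sequencing primer design harder and may lower the accuracy of SNP typing.
A new library construction method and SNP typing method are therefore needed, so that the accuracy of detecting the SNP site to be tested is not affected by sequencing read length and so that mutual interference between different sequencing primers is avoided when multiple SNP sites are detected simultaneously in the same system.
Summary of the Invention
The object of the present invention is to provide a library construction method and an SNP typing method, aiming to solve the prior-art problems that SNP typing accuracy is affected by sequencing read length and that different sequencing primers interfere with one another when multiple SNP sites are detected simultaneously in the same system.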
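To achieve this object, the present invention provides a library construction method comprising the following steps:
A. PCR-amplifying a sample to be sequenced that contains the SNP site to be tested, to obtain an amplification product;
B. ligating linker 1 to the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases;
C. digesting the ligation product with the type IIS restriction endonuclease to obtain a first nucleic acid fragment containing the SNP site to be tested and linker 1, a first end being formed on the first nucleic acid fragment by the digestion;
D. ligating linker 2 to the first end of the first nucleic acid fragment to obtain a library molecule, wherein linker 2 is a double-stranded nucleic acid molecule containing a sequencing primer binding site.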
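Preferably, the type IIS restriction endonuclease recognition sequence is located on at least one amplification primer of the PCR amplification primer set and is introduced into the ligation product through PCR amplification.
Preferably, the type IIS restriction endonuclease recognition sequence is located on linker 1 and is introduced into the ligation product through the ligation reaction.
Preferably, at least one amplification primer in the primer set used for the PCR amplification contains a cleavable site or an excisable sequence, and step B comprises the following steps:
B1. cleaving the amplification product with a cleaving agent, the cleaving agent specifically cleaving the cleavable site or excisable sequence in the amplification product to form a second end;
B2. ligating linker 1 to the second end of the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases.
Preferably, steps B1 and B2 are carried out simultaneously in the same reaction system.
Preferably, the sequencing primer binding site is located at the end of linker 2 that is ligated to the first end.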
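The present invention also provides an SNP typing method, comprising the step of sequencing library molecules prepared by any one of the library construction methods described above.
Preferably, the method further comprises the step of addressably immobilizing the library molecules on a solid support.
Preferably, when there are multiple samples to be sequenced, libraries are constructed separately for the different samples to obtain multiple kinds of library molecules, which are then mixed and sequenced.
Preferably, the multiple kinds of library molecules each contain a different tag sequence.
With the library construction method of the present invention, the distance between linker 2 and the SNP site to be tested on the resulting library molecule is 0 to 5 bases. Library molecules containing different SNP sites to be tested can subsequently be mixed and sequenced. During sequencing, the sequencing primer pairs with full complementarity to the sequence on linker 2 of the library molecule, so the same sequencing primer can be used for multiple SNP sites to be tested. This makes sequencing primers easier to design, keeps the anchoring efficiency of the sequencing primer consistent across SNP sites, avoids the mutual interference that arises from using different primers during sequencing, and improves sequencing accuracy. In addition, detection is completed with only a few sequencing steps, which greatly shortens sequencing time and frees detection of the SNP site from the read-length limit of the sequencing instrument, which also improves accuracy. Finally, a biotin label placed at one end of linker 1 makes it easy to purify the library molecules and to immobilize them addressably on a solid support during sequencing, which facilitates the sequencing step.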
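Brief Description of the Drawings
Fig. 1 is a polyacrylamide gel electrophoresis image of the library molecules of the second embodiment of the present invention.
Fig. 2 is a polyacrylamide gel electrophoresis image of the library molecules of the fifth embodiment of the present invention.
Detailed Description of the Embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments.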
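The present invention provides a first embodiment, a library construction method comprising the following steps:
A. PCR-amplifying a sample to be sequenced that contains the SNP site to be tested, to obtain an amplification product;
B. ligating linker 1 to the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases;
C. digesting the ligation product with the type IIS restriction endonuclease to obtain a first nucleic acid fragment containing the SNP site to be tested and linker 1, a first end being formed on the first nucleic acid fragment by the digestion;
D. ligating linker 2 to the first end of the first nucleic acid fragment to obtain a library molecule, wherein linker 2 is a double-stranded nucleic acid molecule containing a sequencing primer binding site.
The present invention yields library molecules containing a type IIS restriction endonuclease recognition sequence, in which the distance between linker 2 and the SNP site to be tested is 0 to 5 bases. When library molecules containing different SNP sites to be tested are later mixed and sequenced, the sequencing primer pairs with full complementarity to the sequence on linker 2 of the library molecule, so the same sequencing primer can be used for multiple SNP sites to be tested. This makes sequencing primers easier to design, keeps the anchoring efficiency of the sequencing primer consistent across SNP sites, avoids the mutual interference that arises from using different primers during sequencing, and improves sequencing accuracy. In addition, detection is completed with only a few sequencing steps, which greatly shortens sequencing time and frees detection of the SNP site from the read-length limit of the sequencing instrument, which also improves accuracy.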
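The type IIS restriction endonuclease recognizes the type IIS restriction endonuclease recognition sequence and cleaves at the type IIS restriction endonuclease cleavage site. A type IIS restriction endonuclease is a restriction endonuclease whose cleavage site lies outside its recognition sequence, including but not limited to: Acu Ⅰ, Alw Ⅰ, Bbs Ⅰ, Bbv Ⅰ, Bcc Ⅰ, BceA Ⅰ, BciV Ⅰ, BfuA Ⅰ, Bmr Ⅰ, Bpm Ⅰ, BpuE Ⅰ, Bsa Ⅰ, BseM Ⅱ, BseR Ⅰ, Bsg Ⅰ, BsmA Ⅰ, BsmB Ⅰ, BsmF Ⅰ, BspCN Ⅰ, BspM Ⅰ, BspQ Ⅰ, BtgZ Ⅰ, Ear Ⅰ, Eci Ⅰ, EcoP15 Ⅰ, Fau Ⅰ, Fok Ⅰ, Hga Ⅰ, Hph Ⅰ, HpyA V, Mbo Ⅱ, Mly Ⅰ, Mme Ⅰ, Mnl Ⅰ, NmeAⅢ, Ple Ⅰ, Sap Ⅰ, SfaN Ⅰ and TspDT Ⅰ.
The sample to be sequenced is a nucleic acid molecule containing the SNP site to be tested, including but not limited to a DNA molecule, a cDNA molecule or an RNA molecule.
The PCR amplification in step A may be single-molecule amplification or non-single-molecule amplification.
Preferably, the single-molecule amplification is emulsion PCR, bridge PCR or emulsion bridge PCR.
Preferably, the non-single-molecule amplification is conventional PCR, real-time fluorescent quantitative PCR, asymmetric PCR, solid-phase PCR, in situ PCR, reverse transcription PCR, nested PCR, degenerate-primer PCR, immuno-PCR, inverse PCR or touchdown PCR.
Preferably, the type IIS restriction endonuclease recognition sequence is located on at least one amplification primer of the PCR amplification primer set and is introduced into the amplification product through complementary pairing between primer and template strand during the amplification reaction of step A; or,
the type IIS restriction endonuclease recognition sequence is located on linker 1 and is introduced into the ligation product by ligating linker 1.
Linker 1 may be ligated directly to the first end of the amplification product; alternatively, the amplification product may first be cleaved to form a second end, to which linker 1 is then ligated.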
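Preferably, at least one amplification primer in the primer set used for the PCR amplification contains a cleavable site or an excisable sequence, and step B comprises the following steps:
B1. cleaving the amplification product with a cleaving agent, the cleaving agent specifically cleaving the cleavable site or excisable sequence in the amplification product to form a second end;
B2. ligating linker 1 to the second end of the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases.
Further, the cleavable site is U and the cleaving agent is USER enzyme;
or, the excisable sequence is an RNA sequence and the cleaving agent is RNase H;
or, the excisable sequence is a restriction endonuclease recognition sequence and the cleaving agent is the corresponding restriction endonuclease.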
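Preferably, the reactions that ligate linker 1 and linker 2 are both carried out by a ligase. In one implementation, the ligase may be selected from E. coli DNA ligase, T4 DNA ligase, thermostable DNA ligase and Tth DNA ligase, and is preferably T4 DNA ligase, which is widely applicable and can join both cohesive ends and blunt ends.
Preferably, a step of inactivating the ligase in the reaction system is included before step C begins; this prevents digestion products from being re-ligated during the digestion of step C.
Preferably, steps B1 and B2 may be carried out separately or simultaneously in the same reaction system.
Preferably, steps B1 and B2 are carried out simultaneously in the same reaction system; compared with carrying them out separately, this simplifies the library construction steps and improves library construction efficiency.
Steps C and D may be carried out separately or simultaneously in the same reaction system.
Preferably, steps C and D are carried out simultaneously in the same reaction system; compared with carrying them out separately, this simplifies the library construction steps and improves library construction efficiency.
When the type IIS restriction endonuclease recognition sequence is located on at least one amplification primer of the PCR amplification primer set and is introduced into the amplification product through the amplification reaction, steps B1 and C may be carried out simultaneously in the same system, and steps B2 and D may likewise be carried out simultaneously in the same system; or steps B1, B2, C and D are all carried out simultaneously in the same reaction system. Compared with carrying them out separately, this simplifies the library construction steps and improves library construction efficiency.
Preferably, linker 1 carries a biotin label on the end opposite the end that is ligated to the first end; with this biotin label, the library molecules can be conveniently isolated after step D.
Linker 1 may be pre-immobilized, via its biotin label, on a solid support containing streptavidin or avidin, or it may be immobilized on the solid support in a subsequent step.
Linker 2 is a double-stranded nucleic acid molecule containing a sequencing primer binding site. In the present invention, linker 2 is ligated to the first nucleic acid fragment, and library molecules containing different SNP sites to be tested are subsequently mixed and sequenced. During sequencing, one sequencing primer serves for all the SNP sites to be tested, which makes sequencing primers easier to design, keeps the anchoring efficiency of the sequencing primer consistent across SNP sites, avoids the mutual interference that arises from using different primers during sequencing, and improves sequencing accuracy.
Preferably, the sequencing primer binding site is located at the end of linker 2 that is ligated to the first end; compared with schemes in which the sequencing primer binding site is located elsewhere on linker 2, this shortens the distance between the SNP site to be tested and the sequencing primer and reduces the number of sequencing steps, which shortens detection time and improves detection accuracy.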
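The present invention provides a second embodiment, in which a library of MTHFR gene fragments containing the rs1799853 site is constructed using the human whole-blood genome as template.
A. Prepare the nested amplification reaction system. First round of PCR amplification: to a 200 μL centrifuge tube add 1.0 μL of human whole-blood DNA at 50 ng/μL; 10 μL of 2×long Taq Mix (produced by Shenzhen Huayinkang Gene Technology Co., Ltd.); 0.4 μL of upstream primer (SEQ ID NO: 1) at 10 μM; 0.4 μL of downstream primer (SEQ ID NO: 2) at 10 μM; and 20 μL of deionized water; mix and centrifuge. Place the tube in a PCR instrument and set the program: 94 °C for 4 minutes; 30 cycles of 94 °C for 20 seconds, 49 °C for 20 seconds and 72 °C for 1 minute; 72 °C for 3 minutes. When the PCR is complete, the first-round PCR product is obtained;
Second round of PCR amplification: to a 200 μL centrifuge tube add 1.0 μL of the first-round PCR product diluted 100-fold; 10 μL of 2×long Taq Mix; 0.4 μL of upstream primer (SEQ ID NO: 3) at 10 μM; 0.4 μL of downstream primer (SEQ ID NO: 4) at 10 μM; and 20 μL of deionized water; mix and centrifuge. Place the tube in a PCR instrument and set the program: 94 °C for 4 minutes; 30 cycles of 94 °C for 20 seconds, 50 °C for 20 seconds and 72 °C for 20 seconds; 72 °C for 3 minutes. When the PCR is complete, the second-round PCR product is obtained. The upstream primer (SEQ ID NO: 3) contains a U and the Acu Ⅰ recognition sequence CTGAAG; the 3' end of the upstream primer is adjacent to the SNP site to be tested, and the Acu Ⅰ cleavage site lies at the SNP site;
B. Prepare the cleavage-ligation reaction system as follows: 20 μL of second-round PCR product, 2 μL of USER enzyme at 2 units/μL, 1 μL of T4 ligase at 2 units/μL, 2.7 μL of T4 ligase buffer, 0.2 μL of linker 1 at 10 pmol/μL and 1.1 μL of deionized water; react at room temperature for half an hour and inactivate at 65 °C for 10 minutes to obtain the ligation product. Linker 1 consists of SEQ ID NO: 5 and SEQ ID NO: 6; the 5' end of SEQ ID NO: 5 carries a biotin label and is pre-immobilized on streptavidin-containing magnetic beads;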
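C. Digest the ligation product of step B with Acu Ⅰ. Prepare the reaction system: 27 μL of the ligation product of step B, 1 μL of Acu Ⅰ at 2 units/μL, 0.5 μL of S-adenosylmethionine at 3.2 mM, 2.5 μL of 10× [Figure PCTCN2017098214-appb-000001] Buffer and 2 μL of deionized water; react at 37 °C for 1 hour and inactivate at 65 °C for 20 minutes to obtain the first nucleic acid fragment containing the SNP site to be tested and linker 1. Digestion forms a first end with a 2-base overhang on the first nucleic acid fragment, with the SNP site to be tested at that end. Place the centrifuge tube on a magnetic rack and separate and discard the supernatant to obtain the first nucleic acid fragment adsorbed on the magnetic beads; add 15 μL of 4×BW buffer and 16.5 μL of water to 0.8 μL of magnetic beads and incubate with rotation at room temperature for 30 min. Wash the beads once with 50 μL of 1×NXS (containing 1% Triton), once with 50 μL of 1×NXS (containing 0.01% Triton) and once with 50 μL of 1×TE (containing 50 mM KCl), then resuspend them in 20 μL of 1×cutsmart buffer and treat them with 2 units of shrimp alkaline phosphatase at 37 °C for 1 h. After the reaction, inactivate at 65 °C for 5 min.
D. Add the following components to the reaction system of step C: 1 μL of linker 2 at 10 pmol/μL; 1 μL of T4 DNA ligase at 2 units/μL; 7.5 μL of ligase buffer (containing 117 mM Tris-HCl, 17.5 mM MgCl2, 35 mM DTT, 5 mM ATP and 23.415% (w/v) PEG6000); and 0.5 μL of deionized water; react at room temperature for 1 hour. Linker 2 is of two kinds, consisting of SEQ ID NO: 7 and SEQ ID NO: 8, and of SEQ ID NO: 9 and SEQ ID NO: 8, respectively; SEQ ID NO: 8 is the sequencing primer binding sequence. When the reaction is complete, place the centrifuge tube on a magnetic rack and separate and discard the supernatant to obtain library molecules adsorbed on the magnetic beads that contain linker 1, linker 2 and the rs1799853 site.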
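The volumes given for steps B and C can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the component volumes are transcribed from the text, while the dictionary labels and function names are hypothetical names chosen for this illustration. It shows that the step B components add up to the 27 μL of ligation product that step C takes as input.

```python
# Component volumes in microlitres, transcribed from steps B and C of the
# second embodiment. Labels and function names are illustrative only.
STEP_B = {
    "second-round PCR product": 20.0,
    "USER enzyme (2 units/uL)": 2.0,
    "T4 ligase (2 units/uL)": 1.0,
    "T4 ligase buffer": 2.7,
    "linker 1 (10 pmol/uL)": 0.2,
    "deionized water": 1.1,
}
STEP_C = {
    "ligation product from step B": 27.0,
    "Acu I (2 units/uL)": 1.0,
    "S-adenosylmethionine (3.2 mM)": 0.5,
    "10x buffer": 2.5,
    "deionized water": 2.0,
}


def total_volume_ul(components):
    """Sum the component volumes, rounded to avoid float noise."""
    return round(sum(components.values()), 2)


def linker1_pmol_per_ul():
    """Final linker 1 concentration in the step B mix (pmol per microlitre)."""
    linker_pmol = STEP_B["linker 1 (10 pmol/uL)"] * 10  # 0.2 uL at 10 pmol/uL
    return linker_pmol / total_volume_ul(STEP_B)


if __name__ == "__main__":
    print("step B total (uL):", total_volume_ul(STEP_B))
    print("step C total (uL):", total_volume_ul(STEP_C))
    print("linker 1 (pmol/uL):", round(linker1_pmol_per_ul(), 4))
```

A check like this is mainly useful when a protocol is scaled up or down, since the step B total has to stay equal to the step C input.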
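To validate the library molecules, prepare the following reaction system to amplify them: 1 μL of magnetic bead suspension diluted 100-fold; 10 μL of 2×long Taq Mix; 0.4 μL of upstream primer (SEQ ID NO: 10) at 10 μM; 0.4 μL of downstream primer (SEQ ID NO: 11) at 10 μM; and 20 μL of deionized water; mix and centrifuge. Place the tube in a PCR instrument and set the program: 94 °C for 2 minutes; 25 cycles of 94 °C for 20 seconds, 54 °C for 20 seconds and 72 °C for 10 seconds; 72 °C for 3 minutes. When the PCR is complete, the validation product is obtained.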
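The two nested-PCR cycling programs of step A can be written down as data, which makes it easy to compare them and to estimate their programmed hold time. This is a minimal sketch: the temperatures, durations and cycle counts come from the text, while the `Step` and `Program` classes and the `hold_seconds` helper are hypothetical names introduced here. Ramp times between temperatures are not given in the source and are ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: Tuple[Step, ...]
    cycles: int
    final: Step


def hold_seconds(program):
    """Total programmed hold time, excluding instrument ramping."""
    per_cycle = sum(step.seconds for step in program.cycle)
    return program.initial.seconds + program.cycles * per_cycle + program.final.seconds


# First round: 94 C 4 min; 30 x (94 C 20 s, 49 C 20 s, 72 C 1 min); 72 C 3 min
ROUND_1 = Program(Step(94, 240), (Step(94, 20), Step(49, 20), Step(72, 60)), 30, Step(72, 180))
# Second round: 94 C 4 min; 30 x (94 C 20 s, 50 C 20 s, 72 C 20 s); 72 C 3 min
ROUND_2 = Program(Step(94, 240), (Step(94, 20), Step(50, 20), Step(72, 20)), 30, Step(72, 180))

if __name__ == "__main__":
    for name, program in (("round 1", ROUND_1), ("round 2", ROUND_2)):
        print(name, hold_seconds(program) / 60, "min of programmed holds")
```

The shorter extension step in the second round reflects the shorter inner amplicon of the nested design.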
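The polyacrylamide gel electrophoresis result for the validation product is shown in Fig. 1, where lane 0 is the molecular size marker and lane 1 is the validation product. The validation product shows the target band near 60 bp, in full agreement with the theoretically expected size of the library molecule, indicating that the method of the present invention can construct a library from the sample to be sequenced.
In this embodiment, the upstream primer (SEQ ID NO: 3) is designed both to amplify the SNP site to be tested specifically and to carry the Acu Ⅰ recognition sequence CTGAAG. Placing the 3' end of the upstream primer adjacent to the SNP site to be tested preserves PCR amplification efficiency. Because Acu Ⅰ cleaves 16 bp downstream of its recognition sequence and the upstream primer extends 15 bp beyond the recognition sequence, the cleavage site falls at the SNP site to be tested; Acu Ⅰ digestion of the second-round PCR product therefore yields a first nucleic acid fragment whose first end has a 2-base overhang, with the SNP site to be tested at that end. Because the SNP site lies at the end and has two possible variants, linker 2 comprises two double-stranded nucleic acid molecules, each of which ligates to the first end carrying one of the two possible variants.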
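The position arithmetic in the paragraph above (recognition sequence, 15 primer bases, then the SNP, with cleavage 16 bases downstream of the recognition sequence) can be illustrated on a toy sequence. In the sketch below only the recognition sequence CTGAAG and the 16-base offset come from the text; the amplicon, the 15-base primer tail and all function names are made up for illustration and do not correspond to SEQ ID NO: 3 or to any real amplicon. Only the top-strand cut is modeled.

```python
ACU_I_SITE = "CTGAAG"          # recognition sequence, as stated in the text
ACU_I_TOP_STRAND_OFFSET = 16   # cut 16 bases downstream of the site (from the text)


def top_strand_cut_index(seq, site=ACU_I_SITE, offset=ACU_I_TOP_STRAND_OFFSET):
    """Return i such that the top strand is cleaved between seq[i-1] and seq[i]."""
    start = seq.find(site)
    if start == -1:
        raise ValueError("recognition sequence not found")
    cut = start + len(site) + offset
    if cut > len(seq):
        raise ValueError("cut position lies beyond the end of the sequence")
    return cut


def snp_distance_from_cut_end(seq, snp_index):
    """Bases between the SNP and the last retained base (0 = SNP is the end base)."""
    return (top_strand_cut_index(seq) - 1) - snp_index


# Hypothetical amplicon: 2 filler bases, the site, a 15-base primer tail,
# the SNP base, then arbitrary downstream sequence.
PRIMER_TAIL = "ACGTACGTACGTACG"            # 15 bases, illustrative only
AMPLICON = "GG" + ACU_I_SITE + PRIMER_TAIL + "C" + "TTGACCATGA"
SNP_INDEX = 2 + len(ACU_I_SITE) + len(PRIMER_TAIL)   # position of the SNP base

if __name__ == "__main__":
    cut = top_strand_cut_index(AMPLICON)
    print("retained fragment:", AMPLICON[:cut])
    print("distance SNP to fragment end:", snp_distance_from_cut_end(AMPLICON, SNP_INDEX))
```

With a 15-base tail the SNP becomes the last base of the retained top strand, which is the 0-base case of the 0 to 5 base distance range; a shorter tail would move the SNP off the fragment and a longer one would leave it further from the end.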
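The present invention provides a third embodiment, in which a library of MTHFR gene fragments containing the rs1057910 site is constructed using the human whole-blood genome as template.
This embodiment differs from the second embodiment in that, in step A, the primer set for the first round of PCR amplification is upstream primer (SEQ ID NO: 12) and downstream primer (SEQ ID NO: 13), and the primer set for the second round of PCR amplification is upstream primer (SEQ ID NO: 14) and downstream primer (SEQ ID NO: 15). The upstream primer (SEQ ID NO: 14) contains the Nb.BbvCI cleavage sequence GCTGAGG, and the downstream primer (SEQ ID NO: 15) contains the BceA Ⅰ recognition sequence ACGGC; in the resulting second-round PCR product, the distance between the BceA Ⅰ cleavage site and the rs1057910 site to be tested is 4 bp.
Prepare the cleavage-ligation reaction system: 20 μL of second-round PCR product, 0.5 μL of Nb.BbvCI at 2 units/μL, 1 μL of T4 ligase at 2 units/μL, 3 μL of 10×NEBuffer 3.1, 1 μL of BceA Ⅰ, 0.2 μL of linker 1 at 10 μM, 1 μL of linker 2 at 10 μM, 0.3 μL of ATP at 100 mM and 3 μL of deionized water. Linker 1 consists of SEQ ID NO: 16 and SEQ ID NO: 6; linker 2 consists of SEQ ID NO: 17 and SEQ ID NO: 11. The above two reaction systems are reacted at 25 °C for one hour and inactivated at 80 °C for 20 minutes; when the reaction is complete, the library molecules are hybridized onto the flow cell via the biotin label at the end of linker 1.
The library molecules are validated with the following validation primer set: upstream primer (SEQ ID NO: 18) and downstream primer (SEQ ID NO: 11). The validation product shows the target band near 95 bp, in full agreement with the theoretically expected size of the library molecule, indicating that the method of the present invention can construct a library from the sample to be sequenced.
The cleavage-ligation reaction system of this embodiment contains Nb.BbvCI, BceA Ⅰ, linker 1 and linker 2 together, so that both ends of the amplification product are cleaved and linker 1 and linker 2 are ligated in the same system at the same time; the enzymes work together, which simplifies the library construction steps and improves library construction efficiency.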
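The present invention provides a fourth embodiment, in which a library of VKORC1 gene fragments containing the rs9923231 site is constructed using the human whole-blood genome as template.
This embodiment differs from the second embodiment in that the primer set for the second round of PCR amplification is SEQ ID NO: 19 and SEQ ID NO: 20, where SEQ ID NO: 19 contains a U.
In the cleavage-ligation reaction system of step B, linker 1 consists of SEQ ID NO: 21 and SEQ ID NO: 22, where SEQ ID NO: 21 contains the Acu Ⅰ recognition sequence CTGAAG; the Acu Ⅰ recognition sequence is introduced into the ligation product through the ligation of linker 1 to the amplification product. The 5' end of SEQ ID NO: 20 carries a biotin label and is pre-immobilized on streptavidin-containing magnetic beads.
Linker 2 in step C is an equal-ratio mixture of the following linkers: the linker consisting of SEQ ID NO: 7 and SEQ ID NO: 8, the linker consisting of SEQ ID NO: 9 and SEQ ID NO: 8, the linker consisting of SEQ ID NO: 23 and SEQ ID NO: 8, and the linker consisting of SEQ ID NO: 24 and SEQ ID NO: 8.
In the step of validating the library molecules, the validation primer set is upstream primer (SEQ ID NO: 25) and downstream primer (SEQ ID NO: 11). On agarose gel electrophoresis, the validation product of the library molecules of this embodiment shows the target band near 65 bp, in full agreement with the theoretically expected size of the library molecule, indicating that the method of the present invention can construct a library from the sample to be sequenced.
In this embodiment the Acu Ⅰ recognition sequence is placed on linker 1 and is introduced into the library molecule through the ligation of linker 1 to the amplification product, which makes the amplification primers easier to design. Because the SNP site to be tested lies at the end and has four possible variants, linker 2 comprises four double-stranded nucleic acid molecules, each of which ligates to the first end carrying one of the four possible variants.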
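The present invention provides a fifth embodiment, in which five different reaction systems are set up using the human whole-blood genome as template, to construct gene libraries of, respectively, the CYP2C9 gene fragment containing the rs1799853 site, the CYP2C9 gene fragment containing the rs1057910 site, the VKORC1 gene fragment containing the rs9923231 site, the CYP2C19 gene fragment containing the rs4244285 site, and the CYP2C19 gene fragment containing the rs4986893 site.
This embodiment differs from the second embodiment in that the amplification primer sets for the first round of PCR amplification in the respective reaction systems are: upstream primer (SEQ ID NO: 1) and downstream primer (SEQ ID NO: 2); upstream primer (SEQ ID NO: 12) and downstream primer (SEQ ID NO: 13); upstream primer (SEQ ID NO: 26) and downstream primer (SEQ ID NO: 27); upstream primer (SEQ ID NO: 28) and downstream primer (SEQ ID NO: 29); and upstream primer (SEQ ID NO: 30) and downstream primer (SEQ ID NO: 31). The primer sets for the second round of PCR amplification are, respectively: upstream primer (SEQ ID NO: 3) and downstream primer (SEQ ID NO: 4); upstream primer (SEQ ID NO: 32) and downstream primer (SEQ ID NO: 33); upstream primer (SEQ ID NO: 34) and downstream primer (SEQ ID NO: 21); upstream primer (SEQ ID NO: 35) and downstream primer (SEQ ID NO: 36); and upstream primer (SEQ ID NO: 37) and downstream primer (SEQ ID NO: 38). SEQ ID NO: 3, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 35 and SEQ ID NO: 37 all contain the Acu Ⅰ recognition sequence CTGAAG.
The linker 1 ligated in each reaction system consists, respectively, of: SEQ ID NO: 5 and SEQ ID NO: 6; SEQ ID NO: 39 and SEQ ID NO: 40; SEQ ID NO: 41 and SEQ ID NO: 42; SEQ ID NO: 43 and SEQ ID NO: 44; and SEQ ID NO: 45 and SEQ ID NO: 46; the primer sets of the respective linker 1 molecules carry the tag sequences ACTG, TGCA, GTAC, CATG and AGTC, respectively. Linker 2 is, respectively, a mixture of SEQ ID NO: 7 and SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 8, SEQ ID NO: 24 and SEQ ID NO: 8, SEQ ID NO: 25 and SEQ ID NO: 8, SEQ ID NO: 47 and SEQ ID NO: 8, and SEQ ID NO: 48 and SEQ ID NO: 8. The 3' ends of SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 24 and SEQ ID NO: 25 are amino-modified to prevent self-ligation during the reaction; the 5' end of SEQ ID NO: 8 is phosphate-modified.
The polyacrylamide gel electrophoresis result for the validation products is shown in Fig. 2, where lane 0 is the molecular size marker and lanes 1 to 5 are the validation products of the five reaction systems above; target bands appear near 63 bp, 74 bp, 121 bp, 127 bp and 112 bp, respectively, in full agreement with the theoretically expected sizes of the library molecules, indicating that the method of the present invention can construct libraries from the samples to be sequenced.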
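The present invention also provides a sixth embodiment, an SNP typing method comprising the step of sequencing library molecules prepared by the library construction method of any of the embodiments above.
Because the library molecules of the present invention contain linker 2, the sequencing primer binds specifically to linker 2, so one sequencing primer serves for all the different SNP sites to be tested in the same system, which avoids mutual interference between sequencing primers.
Preferably, the method further comprises the step of addressably immobilizing the library molecules on a solid support.
Addressable immobilization means immobilization in which positional information can be determined; that is, the library molecule immobilized at each specific position on the solid support can be clearly distinguished from the library molecules immobilized at other positions.
Further, the library molecules may be addressably immobilized on the solid support directly or indirectly.
For addressable immobilization of library molecules containing the sequencing linker by the direct route, the present invention provides one embodiment in which the library molecules are hybridized directly onto the flow cell through linker 1, thereby achieving addressable immobilization of the library molecules; the present invention also provides another embodiment in which the library molecules are immobilized through linker 1 on microspheres that have been immobilized on the solid support in advance, thereby achieving addressable immobilization of the library molecules.
For addressable immobilization of the library molecules by the indirect route, the present invention provides an embodiment in which the library molecules are first immobilized on microspheres and the microspheres are then immobilized on the solid support, thereby achieving addressable immobilization of the library molecules.
When there are multiple samples to be sequenced, libraries are constructed separately for the different samples to obtain multiple kinds of library molecules, which are then mixed and sequenced in the same system.
Preferably, the multiple kinds of library molecules each contain a different tag sequence; in this scheme, the tag sequences on the library molecules make it possible to distinguish the sequencing results of different library molecules.
Further, the tag sequence is located on linker 1.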
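Distinguishing pooled library molecules by tag amounts to a lookup on the tag bases of each read. The following is a minimal sketch under stated assumptions: the five tags are the ones listed in the fifth embodiment, but pairing them with the five SNP sites in the order the reaction systems are listed is an assumption made here, as is the idea that the tag occupies the first four bases of a read. The reads and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Tags from the fifth embodiment; the tag-to-site pairing follows listing order
# and is an assumption of this sketch, not something the source states outright.
TAG_TO_SITE = {
    "ACTG": "rs1799853",
    "TGCA": "rs1057910",
    "GTAC": "rs9923231",
    "CATG": "rs4244285",
    "AGTC": "rs4986893",
}


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def min_tag_distance(tags):
    """Smallest pairwise Hamming distance within a tag set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def demultiplex(reads, tag_start=0, tag_len=4):
    """Bin reads by exact tag match; the tag position in the read is assumed."""
    bins = defaultdict(list)
    for read in reads:
        tag = read[tag_start:tag_start + tag_len]
        bins[TAG_TO_SITE.get(tag, "unassigned")].append(read)
    return dict(bins)


if __name__ == "__main__":
    example_reads = ["ACTGC", "TGCAA", "NNNNA"]   # made-up reads
    print(demultiplex(example_reads))
    print("minimum tag distance:", min_tag_distance(TAG_TO_SITE))
```

Exact matching is used here because some of these tags differ from one another at only two positions, so a read with a sequencing error in its tag is safer left unassigned than guessed.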
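Preferably, the sequencing method is a second-generation high-throughput gene sequencing technology, including but not limited to ligation sequencing or synthesis sequencing.
The ligation sequencing method relies on the fidelity of a ligase in ligation reactions between nucleic acid fragments: with the nucleic acid fragment to be sequenced as template, the sequencing primer and an oligonucleotide probe (carrying a fluorescent label at a specific position) undergo a ligation reaction, and the sequence information corresponding to the labeled position of the oligonucleotide probe is determined by detecting the fluorescent label on the ligation product. Several ligation sequencing methods are currently available on the market, including but not limited to: the Pstar ligation sequencing method of Shenzhen Huayinkang Gene Technology Co., Ltd., ABI's ligation sequencing method, and Complete Genomics' ligation sequencing method.
The synthesis sequencing method relies on the fidelity of a polymerase while it extends a nucleic acid strand: with the nucleic acid fragment to be sequenced as template, the anchor primer (also called the sequencing primer, which is complementary to the strand carrying the fragment to be sequenced) binds to the fragment by complementary pairing, and the sequence information at the corresponding position of the fragment is determined by detecting the signal generated during extension. Several synthesis sequencing methods are currently available on the market, including but not limited to: Illumina's Solexa synthesis sequencing method, Roche's 454 synthesis sequencing method, and Life Technologies' Ion Torrent and Ion Proton synthesis sequencing methods.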
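The present invention also provides a seventh embodiment, a method for detecting the rs1799853 site of the MTHFR gene; on the basis of the second embodiment, this embodiment further includes the following steps:
To the library molecules adsorbed on the magnetic beads obtained in step C, add 20 μL of 0.1 M NaOH solution to denature the template into single strands; separate and discard the supernatant, wash twice with 20 μL of 1×TE (containing 0.01% Triton by mass), wash once with 20 μL of 1×TE, and finally resuspend in 10 μL of 1×TE for use as the sequencing template. Sequencing is performed by ligation sequencing on the Pstar IIA high-throughput gene sequencer of Shenzhen Huayinkang Gene Technology Co., Ltd.; the sequencing primer (SEQ ID NO: 12) is phosphate-modified at its 5' end and immobilized on the sequencing primer binding sequence of the sequencing linker, and a degenerate nonamer XNNNNNNNN that carries a fluorescent group and is complementary to the detection site is used as the sequencing probe. After one round of ligation sequencing, the rs1801133 site of the MTHFR gene was determined to be C.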
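Reading out a single ligation round with a degenerate nonamer XNNNNNNNN can be thought of as a two-step lookup: the detected dye identifies the probe's X base, and the template base is its complement. The sketch below is only a conceptual illustration. The dye names and the dye-to-base assignment are hypothetical, since the source does not describe the labeling scheme, and the function names are introduced here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical labeling scheme: one dye per identity of the probe's X position.
DYE_TO_PROBE_BASE = {
    "dye_1": "A",
    "dye_2": "C",
    "dye_3": "G",
    "dye_4": "T",
}


def probes_per_dye(degenerate_positions=8):
    """Number of distinct sequences sharing one X base in XNNNNNNNN."""
    return 4 ** degenerate_positions


def call_template_base(dye):
    """Template base interrogated by X: the complement of the ligated probe base."""
    return COMPLEMENT[DYE_TO_PROBE_BASE[dye]]


if __name__ == "__main__":
    print("probe species per dye:", probes_per_dye())
    # A signal from the dye on the X = G probes implies a C in the template.
    print("template base for dye_3:", call_template_base("dye_3"))
```

Because the SNP sits next to the sequencing primer binding site in these library molecules, one such lookup per library molecule is enough to type the site, which is why a single ligation round suffices in this embodiment.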
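The present invention also provides an eighth embodiment, a method for detecting the rs1057910 site of the MTHFR gene; on the basis of the third embodiment, this embodiment further includes the following steps:
To the library molecules hybridized onto the flow cell, add 20 μL of 0.1 M NaOH solution to denature the template into single strands; separate and discard the supernatant, wash twice with 20 μL of 1×TE (containing 0.01% Triton by mass), wash once with 20 μL of 1×TE, and finally resuspend in 10 μL of 1×TE for use as the sequencing template. Using an Illumina sequencer, the sequencing primer (SEQ ID NO: 12) is immobilized on the sequencing primer binding sequence of the sequencing linker; after four rounds of synthesis sequencing, the rs1057910 site of the MTHFR gene was determined to be A.
The present invention also provides a ninth embodiment, which simultaneously detects the rs1799853 and rs1057910 sites on the CYP2C9 gene fragment, the rs9923231 site on the VKORC1 gene fragment, and the rs4244285 and rs4986893 sites on the CYP2C19 gene fragment; this embodiment differs from the fourth embodiment in that it further includes the following steps:
Mix the library molecules adsorbed on the magnetic beads and add 20 μL of 0.1 M NaOH solution to denature the template into single strands; separate and discard the supernatant, wash twice with 20 μL of 1×TE (containing 0.01% Triton by mass), wash once with 20 μL of 1×TE, and finally resuspend in 10 μL of 1×TE for use as the sequencing template. Sequencing is performed by ligation sequencing on the Pstar IIA high-throughput gene sequencer of Shenzhen Huayinkang Gene Technology Co., Ltd., with the sequencing primer (SEQ ID NO: 43) immobilized on the sequencing primer binding site of the sequencing linker; after one round of ligation sequencing, the SNP sites to be tested were determined to be C, A, T, G and G, respectively.
In this embodiment the same sequencing linker is ligated to every library molecule, so one sequencing primer serves for all the SNP sites detected in the same system, which avoids mutual interference between different primers.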
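The present invention provides a tenth embodiment, a kit for detecting mutations at multiple SNP sites, the kit comprising an amplification primer set and/or linker 1; the amplification primer set is used to specifically amplify at least one of the multiple SNP sites; linker 1 is a double-stranded nucleic acid molecule for ligation to the amplification product of a sample to be sequenced that contains the SNP site to be tested; at least one amplification primer in the amplification primer set, or linker 1, contains a type IIS restriction endonuclease recognition sequence, such that the distance between the resulting type IIS restriction endonuclease cleavage site and the SNP site is 0 to 5 bases. Preferably, the multiple SNP sites comprise at least one of rs1799853, rs1057910, rs9923231, rs4244285 and rs4986893; the primer set for specifically amplifying the rs1799853 site comprises at least one of the pairs SEQ ID NO: 3 and SEQ ID NO: 4, SEQ ID NO: 1 and SEQ ID NO: 2, SEQ ID NO: 1 and SEQ ID NO: 4, SEQ ID NO: 3 and SEQ ID NO: 2; the primer set for specifically amplifying the rs1057910 site comprises at least one of the pairs SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 15, SEQ ID NO: 32 and SEQ ID NO: 33, SEQ ID NO: 12 and SEQ ID NO: 15, SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 33, SEQ ID NO: 32 and SEQ ID NO: 13, SEQ ID NO: 32 and SEQ ID NO: 15; the primer set for specifically amplifying the rs9923231 site comprises at least one of the pairs SEQ ID NO: 1 and SEQ ID NO: 2, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 1 and SEQ ID NO: 20, SEQ ID NO: 19 and SEQ ID NO: 2; the primer set for specifically amplifying the rs4244285 site comprises at least one of the pairs SEQ ID NO: 28 and SEQ ID NO: 29, SEQ ID NO: 35 and SEQ ID NO: 36, SEQ ID NO: 28 and SEQ ID NO: 36, SEQ ID NO: 35 and SEQ ID NO: 29; the primer set for specifically amplifying the rs4986893 site comprises at least one of the pairs SEQ ID NO: 30 and SEQ ID NO: 31, SEQ ID NO: 37 and SEQ ID NO: 38, SEQ ID NO: 30 and SEQ ID NO: 38, SEQ ID NO: 37 and SEQ ID NO: 31.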
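For the sites with one outer and one inner primer pair, the four pairs listed for each site are simply every combination of an upstream primer with a downstream primer. The sketch below enumerates them by SEQ ID NO. The upstream and downstream roles for SEQ ID NO: 1 to 4, 28 to 31 and 35 to 38 follow the second and fifth embodiments; treating SEQ ID NO: 19 as upstream and SEQ ID NO: 20 as downstream is an assumption, and the dictionary layout and function name are hypothetical. The rs1057910 site is left out because the text enumerates its pairs explicitly.

```python
from itertools import product

# SEQ ID NOs grouped by role. The 19/20 role assignment is assumed.
PRIMERS = {
    "rs1799853": {"upstream": (1, 3), "downstream": (2, 4)},
    "rs9923231": {"upstream": (1, 19), "downstream": (2, 20)},
    "rs4244285": {"upstream": (28, 35), "downstream": (29, 36)},
    "rs4986893": {"upstream": (30, 37), "downstream": (31, 38)},
}


def primer_pairs(site):
    """All (upstream, downstream) SEQ ID NO combinations for one SNP site."""
    roles = PRIMERS[site]
    return set(product(roles["upstream"], roles["downstream"]))


if __name__ == "__main__":
    for site in PRIMERS:
        print(site, sorted(primer_pairs(site)))
```

For rs1799853 this reproduces the pairs named in the text: 3 with 4, 1 with 2, 1 with 4, and 3 with 2.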
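Preferably, linker 1 is selected from at least one of: rs1799853 linker 1 consisting of SEQ ID NO: 5 and SEQ ID NO: 6; rs1057910 linker 1 consisting of SEQ ID NO: 16 and SEQ ID NO: 6 or of SEQ ID NO: 39 and SEQ ID NO: 40; rs9923231 linker 1 consisting of SEQ ID NO: 21 and SEQ ID NO: 22; rs4244285 linker 1 consisting of SEQ ID NO: 43 and SEQ ID NO: 44; and rs4986893 linker 1 consisting of SEQ ID NO: 45 and SEQ ID NO: 46. The rs1799853 linker 1 is for ligation to a gene fragment containing the rs1799853 site, the rs1057910 linker 1 to a gene fragment containing the rs1057910 site, the rs9923231 linker 1 to a gene fragment containing the rs9923231 site, the rs4244285 linker 1 to a gene fragment containing the rs4244285 site, and the rs4986893 linker 1 to a gene fragment containing the rs4986893 site.
Preferably, the primer set for specifically amplifying the rs1799853 site comprises the two primer pairs SEQ ID NO: 3 and SEQ ID NO: 4, and SEQ ID NO: 1 and SEQ ID NO: 2; SEQ ID NO: 1 and SEQ ID NO: 2 are used as the outer primer pair for specific amplification and SEQ ID NO: 3 and SEQ ID NO: 4 as the inner primer pair, and the amplification accuracy of this scheme is higher.
Preferably, the kit further comprises rs1799853 linker 2, which is a mixture of a double-stranded nucleic acid molecule consisting of SEQ ID NO: 7 and SEQ ID NO: 8 and a double-stranded nucleic acid molecule consisting of SEQ ID NO: 9 and SEQ ID NO: 8; linker 2 is for ligation to a gene fragment containing the rs1799853 site. In this scheme, because rs1799853 linker 2 is ligated directly to the rs1799853 site, which has two possible variants, rs1799853 linker 2 is designed as a mixture of double-stranded nucleic acid molecules that can ligate to each of the two variants.
Preferably, the primer set for specifically amplifying the rs1057910 site comprises the three primer pairs SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 15, and SEQ ID NO: 32 and SEQ ID NO: 33; SEQ ID NO: 12 and SEQ ID NO: 13 are used as the outer primer pair for specific amplification, and SEQ ID NO: 14 and SEQ ID NO: 15 and/or SEQ ID NO: 32 and SEQ ID NO: 33 as the inner primer pair, and the amplification accuracy of this scheme is higher.
Preferably, the kit further comprises rs1057910 linker 2 consisting of SEQ ID NO: 17 and SEQ ID NO: 11, for ligation to a gene fragment containing the rs1057910 site.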
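Preferably, the primer set for specifically amplifying the rs9923231 site comprises the two primer pairs SEQ ID NO: 1 and SEQ ID NO: 2, and SEQ ID NO: 19 and SEQ ID NO: 20; SEQ ID NO: 1 and SEQ ID NO: 2 are used as the outer primer pair for specific amplification and SEQ ID NO: 19 and SEQ ID NO: 20 as the inner primer pair, and the amplification accuracy of this scheme is higher.
Preferably, the kit further comprises rs9923231 linker 2, which is a mixture of a double-stranded nucleic acid molecule consisting of SEQ ID NO: 21 and SEQ ID NO: 22, a double-stranded nucleic acid molecule consisting of SEQ ID NO: 9 and SEQ ID NO: 8, a double-stranded nucleic acid molecule consisting of SEQ ID NO: 23 and SEQ ID NO: 8, and a double-stranded nucleic acid molecule consisting of SEQ ID NO: 24 and SEQ ID NO: 8; rs9923231 linker 2 is for ligation to a gene fragment containing the rs9923231 site. In this scheme, because rs9923231 linker 2 is ligated directly to the rs9923231 site, which has four possible variants, rs9923231 linker 2 is designed as a mixture of double-stranded nucleic acid molecules that can ligate to each of the four variants.
Preferably, the primer set for specifically amplifying the rs4244285 site comprises the two primer pairs SEQ ID NO: 28 and SEQ ID NO: 29, and SEQ ID NO: 35 and SEQ ID NO: 36; SEQ ID NO: 28 and SEQ ID NO: 29 are used as the outer primer pair for specific amplification and SEQ ID NO: 35 and SEQ ID NO: 36 as the inner primer pair, and the amplification accuracy of this scheme is higher.
Preferably, the kit further comprises rs4244285 linker 2 consisting of SEQ ID NO: 47 and SEQ ID NO: 8, for ligation to a gene fragment containing the rs4244285 site.
Preferably, the primer set for specifically amplifying the rs4986893 site comprises the two primer pairs SEQ ID NO: 30 and SEQ ID NO: 31, and SEQ ID NO: 37 and SEQ ID NO: 38; SEQ ID NO: 30 and SEQ ID NO: 31 are used as the outer primer pair for specific amplification and SEQ ID NO: 37 and SEQ ID NO: 38 as the inner primer pair, and the amplification accuracy of this scheme is higher.
Preferably, the kit further comprises rs4986893 linker 2 consisting of SEQ ID NO: 48 and SEQ ID NO: 8, for ligation to a gene fragment containing the rs4986893 site.
Preferably, the kit further comprises sequencing primer SEQ ID NO: 12.
The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.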

Claims (10)

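  1. A library construction method, characterized by comprising the following steps:
    A. PCR-amplifying a sample to be sequenced that contains the SNP site to be tested, to obtain an amplification product;
    B. ligating linker 1 to the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases;
    C. digesting the ligation product with the type IIS restriction endonuclease to obtain a first nucleic acid fragment containing the SNP site to be tested and linker 1, a first end being formed on the first nucleic acid fragment by the digestion;
    D. ligating linker 2 to the first end of the first nucleic acid fragment to obtain a library molecule, wherein linker 2 is a double-stranded nucleic acid molecule containing a sequencing primer binding site.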
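  2. The library construction method according to claim 1, characterized in that the type IIS restriction endonuclease recognition sequence is located on at least one amplification primer of the PCR amplification primer set and is introduced into the ligation product through PCR amplification.
  3. The library construction method according to claim 1, characterized in that the type IIS restriction endonuclease recognition sequence is located on linker 1 and is introduced into the ligation product through the ligation reaction.
  4. The library construction method according to claim 1, characterized in that at least one amplification primer in the primer set used for the PCR amplification contains a cleavable site or an excisable sequence, and step B comprises the following steps:
    B1. cleaving the amplification product with a cleaving agent, the cleaving agent specifically cleaving the cleavable site or excisable sequence in the amplification product to form a second end;
    B2. ligating linker 1 to the second end of the amplification product to obtain a ligation product containing a type IIS restriction endonuclease recognition sequence and a type IIS restriction endonuclease cleavage site, wherein linker 1 is a double-stranded nucleic acid molecule and the distance between the type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases.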
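  5. An SNP typing method, characterized by comprising the step of sequencing library molecules prepared by the library construction method according to any one of claims 1 to 6.
  6. The SNP typing method according to claim 5, characterized in that the method further comprises the step of addressably immobilizing the library molecules on a solid support.
  7. The SNP typing method according to claim 5, characterized in that, when there are multiple samples to be sequenced, libraries are constructed separately for the different samples to obtain multiple kinds of library molecules, which are then mixed and sequenced.
  8. A kit for detecting mutations at multiple SNP sites, the kit comprising an amplification primer set and/or linker 1; the amplification primer set is used to specifically amplify at least one of the multiple SNP sites, and linker 1 is a double-stranded nucleic acid molecule for ligation to the amplification product of a sample to be sequenced that contains the SNP site to be tested; characterized in that at least one amplification primer in the amplification primer set, or linker 1, contains a type IIS restriction endonuclease recognition sequence, such that the distance between the resulting type IIS restriction endonuclease cleavage site and the SNP site to be tested is 0 to 5 bases.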
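  9. The kit for detecting mutations at multiple SNP sites according to claim 8, characterized in that the multiple SNP sites comprise at least one of rs1799853, rs1057910, rs9923231, rs4244285 and rs4986893; the primer set for specifically amplifying the rs1799853 site comprises at least one of the pairs SEQ ID NO: 3 and SEQ ID NO: 4, SEQ ID NO: 1 and SEQ ID NO: 2, SEQ ID NO: 1 and SEQ ID NO: 4, SEQ ID NO: 3 and SEQ ID NO: 2; the primer set for specifically amplifying the rs1057910 site comprises at least one of the pairs SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 15, SEQ ID NO: 32 and SEQ ID NO: 33, SEQ ID NO: 12 and SEQ ID NO: 15, SEQ ID NO: 12 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 33, SEQ ID NO: 32 and SEQ ID NO: 13, SEQ ID NO: 32 and SEQ ID NO: 15; the primer set for specifically amplifying the rs9923231 site comprises at least one of the pairs SEQ ID NO: 1 and SEQ ID NO: 2, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 1 and SEQ ID NO: 20, SEQ ID NO: 19 and SEQ ID NO: 2; the primer set for specifically amplifying the rs4244285 site comprises at least one of the pairs SEQ ID NO: 28 and SEQ ID NO: 29, SEQ ID NO: 35 and SEQ ID NO: 36, SEQ ID NO: 28 and SEQ ID NO: 36, SEQ ID NO: 35 and SEQ ID NO: 29; the primer set for specifically amplifying the rs4986893 site comprises at least one of the pairs SEQ ID NO: 30 and SEQ ID NO: 31, SEQ ID NO: 37 and SEQ ID NO: 38, SEQ ID NO: 30 and SEQ ID NO: 38, SEQ ID NO: 37 and SEQ ID NO: 31.
  10. The kit for detecting mutations at multiple SNP sites according to claim 8, characterized in that linker 1 is selected from at least one of: rs1799853 linker 1 consisting of SEQ ID NO: 5 and SEQ ID NO: 6; rs1057910 linker 1 consisting of SEQ ID NO: 16 and SEQ ID NO: 6 or of SEQ ID NO: 39 and SEQ ID NO: 40; rs9923231 linker 1 consisting of SEQ ID NO: 21 and SEQ ID NO: 22; rs4244285 linker 1 consisting of SEQ ID NO: 43 and SEQ ID NO: 44; and rs4986893 linker 1 consisting of SEQ ID NO: 45 and SEQ ID NO: 46; the rs1799853 linker 1 is for ligation to a gene fragment containing the rs1799853 site, the rs1057910 linker 1 to a gene fragment containing the rs1057910 site, the rs9923231 linker 1 to a gene fragment containing the rs9923231 site, the rs4244285 linker 1 to a gene fragment containing the rs4244285 site, and the rs4986893 linker 1 to a gene fragment containing the rs4986893 site.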
PCT/CN2017/098214 2016-08-30 2017-08-21 Library construction method and SNP typing method WO2018040962A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610772852.1 2016-08-30
CN201610772852.1A CN108300773A (zh) 2016-08-30 2016-08-30 Library construction method and SNP typing method

Publications (1)

Publication Number Publication Date
WO2018040962A1 true WO2018040962A1 (zh) 2018-03-08

Family

ID=61301355

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/098214 WO2018040962A1 (zh) 2016-08-30 2017-08-21 Library construction method and SNP typing method

Country Status (2)

Country Link
CN (1) CN108300773A (zh)
WO (1) WO2018040962A1 (zh)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090068659A1 (en) * 2007-09-12 2009-03-12 Taylor Paul D Method for identifying the sequence of one or more variant nucleotides in a nucleic acid molecule
CN101434988A (zh) * 2007-11-16 2009-05-20 深圳华因康基因科技有限公司 High-throughput oligonucleotide sequencing method
CN102061526A (zh) * 2010-11-23 2011-05-18 深圳华大基因科技有限公司 DNA library and preparation method thereof, and method and device for detecting SNPs
US20110160078A1 (en) * 2009-12-15 2011-06-30 Affymetrix, Inc. Digital Counting of Individual Molecules by Stochastic Attachment of Diverse Labels
CN102296065A (zh) * 2011-08-04 2011-12-28 盛司潼 System and method for constructing a sequencing library
CN102373287A (zh) * 2011-11-30 2012-03-14 盛司潼 Method and kit for detecting lung cancer susceptibility genes
CN102373288A (zh) * 2011-11-30 2012-03-14 盛司潼 Method and kit for sequencing a target region
CN102586423A (zh) * 2011-12-27 2012-07-18 盛司潼 Method and kit for detecting colorectal cancer susceptibility genes
CN104313172A (zh) * 2014-11-06 2015-01-28 中国海洋大学 Method for simultaneously typing a large number of samples
CN104450943A (zh) * 2014-12-29 2015-03-25 深圳华因康基因科技有限公司 KRAS gene mutation detection method and kit
CN104480534A (zh) * 2014-12-29 2015-04-01 深圳华因康基因科技有限公司 Rapid library construction method

Also Published As

Publication number Publication date
CN108300773A (zh) 2018-07-20

Similar Documents

Publication Publication Date Title
US10308978B2 (en) Transposon nucleic acids comprising a calibration sequence for DNA sequencing
CN104480534B (zh) Library construction method
US9745614B2 (en) Reduced representation bisulfite sequencing with diversity adaptors
US20070172839A1 (en) Asymmetrical adapters and methods of use thereof
JP7460539B2 (ja) Sensitive in vitro assays for substrate preferences and sites of agents that bind, modify and cleave nucleic acids
JP2019216734A (ja) Nucleic acid probe and method for detecting genomic fragments
JP6430631B2 (ja) Linker element and method of using it to construct a sequencing library
US20130017978A1 (en) Methods and transposon nucleic acids for generating a dna library
US11274333B2 (en) Compositions and methods for preparing sequencing libraries
JP6925424B2 (ja) Method for increasing the throughput of single-molecule sequencing by concatenating short DNA fragments
JP2017526725A (ja) Method and reagents for constructing a single-stranded circular nucleic acid library
CN110886021B (zh) Method for constructing a single-cell DNA library
WO2017181880A1 (zh) Method for constructing a DNA sequencing library of a genome to be tested, and use thereof
WO2018040961A1 (zh) Library construction method and SNP typing method
US20180044668A1 (en) Mate pair library construction
BR112021006038A2 (pt) Surface-bound transposome complexes
CN114096678A (zh) Support co-labeled with multiple nucleic acids, and preparation method and use thereof
WO2018113799A1 (zh) Method and kit for constructing a reduced-representation genome library
US20140336058A1 (en) Method and kit for characterizing rna in a composition
JP2002537774A (ja) Polymorphic DNA fragments and uses thereof
WO2017113655A1 (zh) Primer set, anchor primer, kit, library construction and gene sequencing method
WO2018040962A1 (zh) Library construction method and SNP typing method
US20190078083A1 (en) Method for controlled dna fragmentation
US11268087B2 (en) Isolation and immobilization of nucleic acids and uses thereof
WO2023050968A1 (zh) Double-stranded DNA adapter for preparing DNA nanoballs, preparation method therefor, kit, and uses thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17845264

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24/07/2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17845264

Country of ref document: EP

Kind code of ref document: A1