WO2016109928A1 - Method for constructing a haplotype typing sequencing library, typing method, and reagents

Publication number
WO2016109928A1
Authority
WO
WIPO (PCT)
Prior art keywords
linker
fragment
sequencing library
site
haplotype
Prior art date
Application number
PCT/CN2015/070143
Inventor
芦静
李晟
蒋浩君
康雄斌
陈芳
蒋慧
Original Assignee
深圳华大基因研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因研究院
Priority to CN201580068011.6A (CN107208314B)
Priority to PCT/CN2015/070143 (WO2016109928A1)
Publication of WO2016109928A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof

Definitions

  • the invention relates to a haplotype technology in the technical field of molecular biology, in particular to a method for constructing a haplotype typing sequencing library, a typing method and a reagent.
  • Haplotype is an abbreviation for haploid genotype and genetically refers to a combination of alleles at multiple loci that are commonly inherited on the same chromosome.
  • Put simply, a haplotype is the genotype formed by several closely linked genes that determine the same trait.
  • Depending on the number of recombination events at a given locus, a haplotype may refer to as few as two loci or to an entire chromosome. Adjacent SNPs (single nucleotide polymorphisms) are highly correlated, which often makes it possible to infer the state of one locus from the state of another locus on the same chromosome.
  • Existing haplotyping software includes PHASE (PHylogenetics And Sequence Evolution), which reconstructs individual haplotypes from population genotypes, and Linkage Disequilibrium Analyzer (LDA), which calculates the degree of linkage between two loci.
  • the invention provides a method for constructing a haplotype typing sequencing library, a typing method and a reagent. The construction method enriches the SNP sites on the same chromosome or a partial region thereof into a small-fragment sequencing library in which each fragment carries one SNP site at each end. This greatly reduces the length of the fragments that need to be sequenced while preserving the correlation between different small fragments, so the haplotype of a chromosome or a partial region thereof can be assembled easily.
  • the present invention provides a method of constructing a haplotype typing sequencing library, the method comprising:
  • using primers designed near a plurality of SNP sites in a chromosomal region, subjecting genomic DNA derived from the individual to be typed to random multiplex PCR amplification to obtain amplified fragments having the SNP sites at both ends;
  • ligating a 5'-end linker and a 3'-end linker to the two ends of the amplified fragment, respectively, wherein the 5'-end linker and the 3'-end linker each have a restriction endonuclease recognition site, and the cleavage site of the restriction endonuclease is located on the inner side of the SNP site;
  • circularizing the fragment after ligation of the 5'-end linker and the 3'-end linker to obtain a circular DNA molecule;
  • cleaving the circular DNA molecule at the cleavage site using the restriction endonuclease to obtain a cleavage fragment comprising two SNP sites, two primer sequences and two linker sequences, i.e., the sequencing library.
  • the amplification enzyme used in the multiplex PCR amplification is an amplification enzyme that amplifies a long fragment of 10 kb or more.
  • the restriction endonuclease is an EcoP15 enzyme
  • the recognition site is AGACC or CAGCAG
  • the cleavage site is located 24-26 bp downstream of the recognition site.
  • a nick translation reaction is performed after the 5'-terminal linker and the 3'-terminal linker are respectively connected to both ends of the amplified fragment.
  • after the nick translation reaction, PCR amplification is performed using primers that bind to the 5'-end linker and the 3'-end linker, wherein the primers used for the PCR amplification carry a U base site; the product is digested with USER enzyme to generate sticky ends that facilitate circularization; and the fragments with sticky ends at both ends obtained by USER digestion are circularized to obtain a circular DNA molecule.
  • the present invention provides a haplotype typing method, the method comprising: sequencing a sequencing library obtained by the method of the first aspect; and then, using the tag sequence on the primer corresponding to each SNP site as a sequence identifier, analyzing the linkage relationship of the SNP sites in the reads obtained by the sequencing to obtain a haplotype.
  • the invention provides a construction reagent for a haplotyped sequencing library, the reagent comprising the following components:
  • a primer set comprising a plurality of primers designed near a plurality of SNP sites in a chromosomal region, for random multiplex PCR amplification of genomic DNA derived from the individual to be typed, to obtain amplified fragments having the SNP sites at both ends;
  • a 5'-end linker and a 3'-end linker, each having a restriction endonuclease recognition site, the cleavage site of the restriction endonuclease being located on the inner side of the SNP site, for ligation to the two ends of the amplified fragment;
  • a circularization ligase for circularizing the fragment after ligation of the 5'-end linker and the 3'-end linker to obtain a circular DNA molecule;
  • a restriction endonuclease for cleaving the circular DNA molecule at the cleavage site to obtain a cleavage fragment comprising two SNP sites, two primer sequences, and two linker sequences.
  • the construction reagent further comprises an amplification enzyme that amplifies a long fragment of 10 kb or more for the multiplex PCR amplification.
  • the restriction endonuclease is an EcoP15 enzyme
  • the recognition site is AGACC or CAGCAG
  • the cleavage site is located 24-26 bp downstream of the recognition site.
  • the construction reagent further comprises components of the nick translation reaction, for performing the nick translation reaction after the 5'-end linker and the 3'-end linker are ligated to the two ends of the amplified fragment.
  • the construction reagent further comprises:
  • a primer pair for PCR amplification, the primers binding to the 5'-end linker and the 3'-end linker, respectively, and carrying a U base site, for PCR amplification after the nick translation reaction;
  • USER enzyme, for digesting the PCR amplification product to produce sticky ends that facilitate circularization.
  • the construction reagent further comprises an exonuclease for digesting the uncircularized linear DNA molecule after cyclization of the fragment after ligation of the 5'-terminal linker and the 3'-terminal linker.
  • the method for constructing the haplotype typing sequencing library of the present invention, by combining primers designed near the SNP sites, linker sequences and a restriction endonuclease, enriches the SNP sites on the same chromosome or a partial region thereof into a small-fragment sequencing library in which each fragment carries one SNP site at each end. This greatly reduces the length of the fragments that need to be sequenced while preserving the correlation between different small fragments, so the haplotype of a chromosome or a partial region thereof can be assembled easily.
  • the sequencing library constructed by this method yields a haplotype after sequencing and linkage analysis of the SNP sites. This is valuable for scientific research, the diagnosis of genetic diseases and other applications, and in particular for predicting and diagnosing disease in the offspring of maternal or paternal carriers of disease genes.
  • the present invention can effectively perform individual haplotype typing without complete pedigree information, simply and efficiently; the haplotype typing method of the present invention can be applied to the diagnosis of certain haplotype-related diseases, such as HLA blood typing and noninvasive prenatal β-thalassemia testing.
  • Figure 1 is a schematic diagram of the principle of random multiplex PCR amplification of genomic DNA derived from the individual to be typed, wherein the genomic DNA has multiple SNP sites in a region, and forward and reverse primers are designed near (flanking) each SNP site;
  • Figure 2 is a schematic view of the structure of the linker ligation product obtained by ligating a 5'-end linker (5'-Ad) and a 3'-end linker (3'-Ad) to the two ends of an amplified fragment obtained by multiplex PCR amplification, wherein the 5'-end linker and the 3'-end linker each have a restriction endonuclease recognition site, and the cut site (CS) of the restriction endonuclease is located on the inner side of the SNP site (i.e., the side separated from the primer sequence by the SNP site);
  • Figure 3 is a schematic view of the structure of the circular DNA molecule obtained by circularizing the fragment after ligation of the 5'-end linker (5'-Ad) and the 3'-end linker (3'-Ad);
  • Figure 4 is a schematic view of the structure of the cleavage fragment obtained by cleaving the circular DNA molecule shown in Figure 3 at its cut sites (CS) with the restriction endonuclease, the fragment comprising two SNP sites (SNP1 and SNP2), two primer sequences (P1 and P4) and two linker sequences (5'-Ad and 3'-Ad);
  • Figure 5 is a schematic diagram of the principle of locating SNP sites by using the tag sequence on the primer corresponding to each SNP site as a sequence identifier;
  • Figure 6 is a schematic diagram of the principle of analyzing the linkage relationship of SNP sites in the reads obtained by sequencing and assembling the haplotype of a chromosomal region.
  • the present invention relates to three subjects: a method for constructing a haplotype typing sequencing library, a haplotype typing method, and a construction reagent for a haplotype typing sequencing library.
  • the construction method of the haplotyped sequencing library aims to construct a sequencing library which can be used for haplotype typing, and thus the result of the method is to obtain a sequencing library.
  • the purpose of the haplotype typing method is to perform haplotype analysis on the chromosome or part of the region of the individual, and the typing result can be used as the basis for disease prediction and diagnosis, or just for scientific research or other application purposes.
  • For example, identifying the haplotypes of crops such as rice can help realize agricultural value, such as screening for plants with stable inheritance and higher yield and quality. The haplotype typing method of the present invention may therefore be either diagnostic or non-diagnostic in nature, and may be further defined as a disease-diagnostic haplotype typing method or a non-disease-diagnostic haplotype typing method.
  • the individual to be typed in the present invention may be any living body as long as it has a haplotype which can be classified, such as any animal (including human), plant or microorganism.
  • As shown in Figure 1, the same chromosome or a partial region thereof has multiple SNP sites (SNP1, SNP2, SNP3, SNP4, etc.), and a specific form of multiple linked SNP sites can be called a "haplotype".
  • Forward and reverse primers (P1, P2, P3, P4, P5, P6, P7, P8, etc.) are designed in the vicinity of each SNP site (i.e., on its left and right sides).
  • During multiplex PCR amplification, the forward and reverse primers bind randomly to the left and right sides of the corresponding SNP sites, and under the action of the polymerase, fragments carrying one SNP site at each end are amplified. Because primer binding is random, the amplified fragments may differ in length.
  • a plurality of SNP sites refer to two or more SNP sites, such as two, four, ten, 50, 100 or 1000 SNP sites.
  • the 5'-end linker (5'-Ad) and the 3'-end linker (3'-Ad) are ligated to the two ends of the amplified fragment, respectively, giving the linker ligation product shown in Figure 2.
  • the 5'-end linker and the 3'-end linker each have a restriction endonuclease recognition site, and the cut site (CS) of the restriction endonuclease is located on the inner side of the SNP site (i.e., the side separated from the primer sequence by the SNP site), so that the restriction endonuclease binds to the recognition site and cleaves the DNA at the cut site downstream of it.
  • a typical feature of a restriction endonuclease that can be used in the present invention is that its cut site lies some distance downstream of its recognition site, and that distance is just enough to span the portion of primer sequence between the recognition site and the SNP site.
  • Typical but non-limiting examples of such restriction endonucleases are the EcoP15 enzyme and EcoP1, whose recognition site is AGACC or CAGCAG and whose cut site is located 24-26 bp downstream of the recognition site.
  • One embodiment of the invention uses the EcoP15 enzyme as the restriction endonuclease.
  • after linker ligation, a gap may be left, which can be eliminated by a nick translation reaction.
  • if the linker ligation efficiency is not sufficiently high, a larger amount of linker-ligated amplified fragments can be obtained by PCR amplification after ligation; the primers used for this PCR amplification should contain a linker-binding sequence.
  • the present invention preferably introduces a U base site on the primers used for PCR amplification, because this ensures that U base sites are present only on the primers and that cleavage is specific, i.e., unlike an ordinary restriction endonuclease, it does not cut at undesired sites.
  • Example: individual haplotype typing was performed on a plasma sample from one normal individual.
  • To increase the total amount of extracted DNA by increasing the starting amount of whole blood, the amounts of proteinase K, lysis buffer AL and absolute ethanol should be increased in the same proportion.
  • the adsorption column holds a maximum volume of about 600 μL; step (4) can be repeated.
  • Step 2: PCR, library construction and sequencing
  • 18 μL of reaction mixture was added to the 62 μL product of the previous step, incubated at 12 °C for 20 min, and held at 4 °C.
  • Magnetic bead purification: 1 volume of AmpureXP magnetic beads was mixed with the reaction solution; after 10 min, 75% ethanol was added and then removed; the beads were incubated at 37 °C for 3-5 min and the DNA was dissolved in 42 μL TE for 10 min.
  • 5'-terminal linker short chain 5'-TTGCGGTTCTGAAGT-3'dd (SEQ ID NO: 206).
  • Magnetic bead purification: 0.9 volumes of AmpureXP were added; after 10 min, 640 μL of 75% ethanol was added and then removed; the beads were incubated at 37 °C for 3-5 min and the DNA was dissolved in 42 μL TE for 10 min.
  • the extension primer sequence ON3659 is:
  • Magnetic bead purification: 0.9 volumes of AmpureXP were added; after 10 min, 640 μL of 75% ethanol was added and then removed; the beads were incubated at 37 °C for 3-5 min and the DNA was dissolved in 42 μL TE for 10 min.
  • 3'-terminal linker short chain 5'-CGGGAACGCTGAAGA-3'dd (SEQ ID NO: 209).
  • the 3'-end linker ligation system is as follows:
  • Each sample received 14.8 μL of 3'-end linker and 30.5 μL of reaction mixture.
  • Magnetic bead purification: 0.9 volumes of AmpureXP were added; after 10 min, 640 μL of 75% ethanol was added and then removed; the beads were incubated at 37 °C for 3-5 min and the DNA was dissolved in 42 μL TE for 10 min.
  • Magnetic bead purification: 80 μL of AmpureXP magnetic beads was added; after 10 min, 640 μL of 75% ethanol was added and then removed; the beads were incubated at 37 °C for 3-5 min and the DNA was dissolved in 47 μL TE for 10 min.
  • the PCR amplification primers are as follows:
  • the 5'-end extension primer sequence ON3659 is:
  • the PCR amplification system is as follows:
  • Magnetic bead purification: 480 μL of AmpureXP magnetic beads was added; after 10 min, 2000 μL of 75% ethanol was added and then removed; the beads were incubated at 37 °C for 3-5 min and the DNA was dissolved in 85 μL TE for 10 min.
  • 50 μL of enzyme reaction solution was added to the above reaction mixture (about 60 μL), incubated at 37 °C for 1 h, and held at 4 °C.
  • the USER enzyme reaction product of the previous step was split into 4 aliquots of 27.5 μL, 423 μL of circularization reaction buffer was added to each, and the mixtures were incubated in a 60 °C water bath for 30 min.
  • 8.6 μL of end-blunting mixture was added to 44 μL of the DNA purified in the previous step, incubated at 12 °C for 20 min, and held at 4 °C.
  • the library obtained in this example was sequenced using the Proton sequencing platform.
  • Steps 3 to 5 are a specific example of sequence analysis of the reads obtained by sequencing in the present invention. Those skilled in the art can also perform the sequence analysis by other methods without difficulty, because many well-known software packages are available for typing.
  • the sequencing reads were aligned using BWA (http://bio-bwa.sourceforge.net/bwa.shtml); the read length was about 26 bp.
  • To reduce the chance of erroneous alignments, the reference sequence is built from the sequences of the PCR target products.
  • Input: the primer information file output by primer design with the Primer3 software (http://www.mybiosoftware.com/pcr-primer-design/1470), with the position of the SNP of interest added as the last column.
  • the target product sequence is extracted according to the designed primer information.
  • The main program is aln_bwa.pl (provided by 深圳华大基因研究院 (BGI-Shenzhen)).
  • the main output of this program is a BAM file in standard BAM format.
  • Input: the BAM file of alignment results, and a file listing the tag sequences (anchor sequences) used to locate the SNPs (the 6-10 bp sequence immediately before each SNP site).
  • This process types local haplotypes using the SNP statistics collected in the previous step.
  • the haplotype is spliced together from the pairwise linkage between heterozygous sites.
  • Input: the link and loci files generated by the script Haplotype.stepl.pl in the previous step.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

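Provided are a method for constructing a haplotype typing sequencing library, a typing method, and reagents. The construction method comprises: performing random multiplex PCR amplification using primers designed near a plurality of SNP sites in a chromosomal region; ligating a 5'-end linker and a 3'-end linker to the two ends of the amplified fragments, respectively, wherein each linker has a restriction endonuclease recognition site whose cut site is located on the inner side of the SNP site; circularizing the linker-ligated fragments; and cleaving the circular DNA molecules at the cut sites with the restriction endonuclease to obtain the sequencing library.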

Description

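Method for constructing a haplotype typing sequencing library, typing method, and reagents
Technical Field
The present invention relates to haplotype technology in the field of molecular biology, and in particular to a method for constructing a haplotype typing sequencing library, a typing method, and reagents.
Background Art
A haplotype, short for haploid genotype, refers in genetics to a combination of alleles at multiple loci that are inherited together on the same chromosome. Put simply, it is the genotype formed by several closely linked genes that determine the same trait. Depending on the number of recombination events at a given locus, a haplotype may refer to as few as two loci or to an entire chromosome. Adjacent SNPs (single nucleotide polymorphisms) are highly correlated, which often makes it possible to infer the state of one site from the state of another site on the chromosome.
Existing haplotyping software includes PHASE (PHylogenetics And Sequence Evolution), which reconstructs individual haplotypes from population genotypes, and Linkage Disequilibrium Analyzer (LDA), which calculates the degree of linkage between two sites. These haplotyping techniques depend on gene sequencing results. However, the sequencing libraries currently used for gene sequencing contain long fragments and are limited by sequencing read length and assembly techniques. In addition, haplotyping of a single individual has so far relied on first-generation Sanger sequencing.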
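Summary of the Invention
The present invention provides a method for constructing a haplotype typing sequencing library, a typing method, and reagents. The construction method enriches the SNP sites on the same chromosome or a partial region thereof into a small-fragment sequencing library in which each fragment carries one SNP site at each end. This greatly reduces the length of the fragments that need to be sequenced while preserving the correlation between different small fragments, so that the haplotype of a chromosome or a partial region thereof can be assembled easily.
According to a first aspect, the present invention provides a method for constructing a haplotype typing sequencing library, the method comprising:
using primers designed near a plurality of SNP sites in a chromosomal region, performing random multiplex PCR amplification on genomic DNA derived from the individual to be typed, to obtain amplified fragments having the SNP sites at both ends;
ligating a 5'-end linker and a 3'-end linker to the two ends of the amplified fragments, respectively, wherein the 5'-end linker and the 3'-end linker each have a restriction endonuclease recognition site, and the cut site of the restriction endonuclease is located on the inner side of the SNP site;
circularizing the fragments after ligation of the 5'-end linker and 3'-end linker to obtain circular DNA molecules;
cleaving the circular DNA molecules at the cut sites with the restriction endonuclease to obtain cleavage fragments comprising two SNP sites, two primer sequences and two linker sequences, which constitute the sequencing library.
In a preferred embodiment, the amplification enzyme used in the multiplex PCR amplification is one that amplifies long fragments of 10 kb or more.
In a preferred embodiment, the restriction endonuclease is the EcoP15 enzyme, the recognition site is AGACC or CAGCAG, and the cut site is located 24-26 bp downstream of the recognition site.
In a preferred embodiment, a nick translation reaction is performed after the 5'-end linker and 3'-end linker are ligated to the two ends of the amplified fragments.
In a preferred embodiment, after the nick translation reaction, PCR amplification is performed using primers that bind the 5'-end linker and 3'-end linker, wherein the primers used for the PCR amplification carry a U base site; the product is digested with USER enzyme to produce sticky ends that facilitate circularization; and the fragments with sticky ends at both ends obtained by USER digestion are circularized to obtain circular DNA molecules.
In a preferred embodiment, after the linker-ligated fragments are circularized, uncircularized linear DNA molecules are digested before cleavage with the restriction endonuclease.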
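According to a second aspect, the present invention provides a haplotype typing method, the method comprising: sequencing the sequencing library obtained by the method of the first aspect; and then, using the tag sequence on the primer corresponding to each SNP site as a sequence identifier, analyzing the linkage relationship of the SNP sites in the reads obtained by sequencing to obtain a haplotype.
According to a third aspect, the present invention provides construction reagents for a haplotype typing sequencing library, the reagents comprising the following components:
a primer set comprising a plurality of primers designed near a plurality of SNP sites in a chromosomal region, for random multiplex PCR amplification of genomic DNA derived from the individual to be typed, to obtain amplified fragments having the SNP sites at both ends;
a 5'-end linker and a 3'-end linker, each having a restriction endonuclease recognition site, the cut site of the restriction endonuclease being located on the inner side of the SNP site, for ligation to the two ends of the amplified fragments;
a circularization ligase, for circularizing the fragments after ligation of the 5'-end linker and 3'-end linker to obtain circular DNA molecules;
a restriction endonuclease, for cleaving the circular DNA molecules at the cut sites to obtain cleavage fragments comprising two SNP sites, two primer sequences and two linker sequences.
In a preferred embodiment, the construction reagents further comprise an amplification enzyme that amplifies long fragments of 10 kb or more, for the multiplex PCR amplification.
In a preferred embodiment, the restriction endonuclease is the EcoP15 enzyme, the recognition site is AGACC or CAGCAG, and the cut site is located 24-26 bp downstream of the recognition site.
In a preferred embodiment, the construction reagents further comprise components of a nick translation reaction, for performing the nick translation reaction after the 5'-end linker and 3'-end linker are ligated to the two ends of the amplified fragments.
In a preferred embodiment, the construction reagents further comprise:
a primer pair for PCR amplification, the primers binding the 5'-end linker and 3'-end linker, respectively, and carrying a U base site, for PCR amplification after the nick translation reaction;
USER enzyme, for digesting the PCR amplification product to produce sticky ends that facilitate circularization.
In a preferred embodiment, the construction reagents further comprise an exonuclease, for digesting uncircularized linear DNA molecules after the linker-ligated fragments are circularized.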
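The method for constructing a haplotype typing sequencing library of the present invention, by combining primers designed near the SNP sites, linker sequences and a restriction endonuclease, enriches the SNP sites on the same chromosome or a partial region thereof into a small-fragment sequencing library in which each fragment carries one SNP site at each end. This greatly reduces the length of the fragments that need to be sequenced while preserving the correlation between different small fragments, so that the haplotype of a chromosome or a partial region thereof can be assembled easily. A sequencing library constructed by this method yields a haplotype after sequencing and linkage analysis of the SNP sites. This is valuable for scientific research, the diagnosis of genetic diseases and other applications, and in particular for predicting and diagnosing disease in the offspring of maternal or paternal carriers of disease genes.
In addition, the present invention performs individual haplotype typing effectively, simply and efficiently, without requiring complete pedigree information. The haplotype typing method of the present invention can be applied to the diagnosis of certain haplotype-related diseases, such as HLA blood typing and noninvasive prenatal β-thalassemia testing. Combining individual haplotype typing with second-generation high-throughput sequencing lowers the cost and raises the throughput; in theory, hundreds of samples can be tested in one run.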
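Brief Description of the Drawings
Figure 1 is a schematic diagram of the principle of random multiplex PCR amplification of genomic DNA derived from the individual to be typed, wherein the genomic DNA has multiple SNP sites in a region, and forward and reverse primers are designed near (flanking) each SNP site;
Figure 2 is a schematic diagram of the structure of the linker ligation product obtained by ligating a 5'-end linker (5'-Ad) and a 3'-end linker (3'-Ad) to the two ends of an amplified fragment obtained by multiplex PCR amplification, wherein the 5'-end linker and 3'-end linker each have a restriction endonuclease recognition site, and the cut site (CS) of the restriction endonuclease is located on the inner side of the SNP site (i.e., the side separated from the primer sequence by the SNP site);
Figure 3 is a schematic diagram of the structure of the circular DNA molecule obtained by circularizing the fragment after ligation of the 5'-end linker (5'-Ad) and 3'-end linker (3'-Ad);
Figure 4 is a schematic diagram of the structure of the cleavage fragment obtained by cleaving the circular DNA molecule shown in Figure 3 at its cut sites (CS) with the restriction endonuclease, the fragment comprising two SNP sites (SNP1 and SNP2), two primer sequences (P1 and P4) and two linker sequences (5'-Ad and 3'-Ad);
Figure 5 is a schematic diagram of the principle of locating SNP sites by using the tag sequence on the primer corresponding to each SNP site as a sequence identifier;
Figure 6 is a schematic diagram of the principle of analyzing the linkage relationship of SNP sites in the reads obtained by sequencing and assembling the haplotype of a chromosomal region.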
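Detailed Description
The present invention is described in further detail below through specific examples. Unless otherwise stated, the techniques used in the following examples are conventional techniques known to those skilled in the art, and the instruments, equipment and reagents used are available to those skilled in the art through public channels such as commercial purchase.
The present invention relates to three subjects: a method for constructing a haplotype typing sequencing library, a haplotype typing method, and construction reagents for a haplotype typing sequencing library. The purpose of the construction method is to build a sequencing library that can be used for haplotype typing, so the result of the method is a sequencing library. The purpose of the haplotype typing method is to perform haplotype analysis on a chromosome, or a partial region thereof, of the individual to be typed. The typing result can serve as a basis for disease prediction and diagnosis, or it can serve purely research or other application purposes, such as haplotype typing of animal and plant varieties. For example, identifying the haplotypes of crops such as rice can help realize agricultural value, such as screening for plants with stable inheritance and higher yield and quality. The haplotype typing method of the present invention may therefore be either diagnostic or non-diagnostic in nature, and may be further defined as a disease-diagnostic haplotype typing method or a non-disease-diagnostic haplotype typing method.
Because the method of the present invention is general, the individual to be typed may be any living organism that has a typeable haplotype, such as any animal (including humans), plant or microorganism.
Embodiments of the present invention are described in detail below with reference to the drawings.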
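As shown in Figure 1, the same chromosome or a partial region thereof carries multiple SNP sites (SNP1, SNP2, SNP3, SNP4, etc.), and a specific form of multiple linked SNP sites can be called a "haplotype". Forward and reverse primers (P1, P2, P3, P4, P5, P6, P7, P8, etc.) are designed near each SNP site (i.e., on its left and right sides). During multiplex PCR amplification, these forward and reverse primers bind randomly to the left and right sides of the corresponding SNP sites, and under the action of the polymerase, fragments carrying one SNP site at each end are amplified. Because primer binding is random, the amplified fragments may differ in length. If adjacent SNP sites are assumed to be 1 kb apart, primers P1 and P4, P3 and P6, and P5 and P8 each amplify a 1 kb fragment as shown in Figure 1, primers P1 and P6 and P3 and P8 each amplify a 2 kb fragment, and fragments of 3 kb, 4 kb or longer can be obtained in the same way.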
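The primer-pair arithmetic in the 1 kb example can be enumerated with a short sketch. It assumes a naming scheme that the text implies but does not state outright: SNP k has forward primer P(2k-1) and reverse primer P(2k), and neighboring SNP sites are evenly spaced. The function name and the even spacing are illustrative assumptions only.

```python
from itertools import product

def expected_amplicons(n_snps: int, spacing_kb: float = 1.0):
    """List (forward primer, reverse primer, approximate length in kb).

    Assumption for illustration: SNP k has forward primer P(2k-1) and
    reverse primer P(2k), and adjacent SNP sites are spacing_kb apart.
    """
    amplicons = []
    for i, j in product(range(1, n_snps + 1), repeat=2):
        if j > i:  # the reverse primer must sit at a SNP downstream of the forward primer
            amplicons.append((f"P{2 * i - 1}", f"P{2 * j}", (j - i) * spacing_kb))
    return amplicons

for fwd, rev, kb in expected_amplicons(4):
    print(f"{fwd} + {rev}: ~{kb:g} kb")
```

With four SNP sites this lists P1+P4, P3+P6 and P5+P8 at about 1 kb and P1+P6 and P3+P8 at about 2 kb, in line with the pairs named in the text.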
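In the present invention, "a plurality of SNP sites" means two or more SNP sites, for example 2, 4, 10, 50, 100 or 1000 SNP sites.
In the present invention, a primer located "near" a SNP site lies on the left or right side of the SNP site. The primer base closest to the SNP site may be directly adjacent to the SNP site or separated from it by one or a few bases, but the gap should generally not be large.
In the present invention, the amplification is called "multiplex PCR amplification" because multiple randomly binding primers undergo PCR in the same reaction, giving a wide variety of amplified fragments.
In the present invention, the amplification enzyme used for multiplex PCR amplification is preferably one that amplifies long fragments of 10 kb or more; for example, LA Taq from Takara can amplify long DNA fragments (greater than 10 kb). Using a long-fragment amplification enzyme improves amplification efficiency and ensures that enough multiplex PCR product is obtained for the subsequent steps.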
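After amplified fragments are obtained by multiplex PCR amplification, a 5'-end linker (5'-Ad) and a 3'-end linker (3'-Ad) are ligated to the two ends of each fragment to give the linker ligation product shown in Figure 2. The 5'-end linker and 3'-end linker each carry a restriction endonuclease recognition site, and the cut site (CS) of the restriction endonuclease lies on the inner side of the SNP site (i.e., the side separated from the primer sequence by the SNP site), so that the restriction endonuclease binds the recognition site and cleaves the DNA at the cut site downstream of it. The typical feature of restriction endonucleases usable in the present invention is that the cut site lies some distance downstream of the recognition site, and this distance is just enough to span the portion of primer sequence between the recognition site and the SNP site. Typical but non-limiting examples of such restriction endonucleases are the EcoP15 enzyme and EcoP1, whose recognition site is AGACC or CAGCAG and whose cut site lies 24-26 bp downstream of the recognition site. One embodiment of the present invention uses the EcoP15 enzyme as the restriction endonuclease. When choosing such a restriction endonuclease, the distance between the recognition site and the cut site should be considered, because the primer sequence lies between the two; if the distance is too short, it limits the length of the primer sequence and may impair amplification. In general, a distance of 20-40 bp, preferably 20-30 bp, between the recognition site and the cut site works well.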
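To make the geometry concrete, the following sketch scans a sequence for the recognition sites quoted above and reports the window 24-26 bp downstream of each site where the cut is expected. It is a simplification under stated assumptions: only the given strand is scanned, offsets are counted from the last base of the recognition site, and the function name and coordinate convention are illustrative rather than taken from the patent.

```python
import re

RECOGNITION_SITES = ("AGACC", "CAGCAG")  # sites quoted in the text
CUT_OFFSETS = (24, 26)                   # bp downstream of the site, as quoted in the text

def cut_windows(seq: str):
    """Return (site, site_start, window_start, window_end) for each hit.

    Coordinates are 0-based; the window is measured from the first base
    after the recognition site (an assumed convention for illustration).
    """
    hits = []
    for site in RECOGNITION_SITES:
        # Lookahead so that overlapping occurrences are also reported.
        for m in re.finditer(f"(?={site})", seq):
            site_end = m.start() + len(site)
            hits.append((site, m.start(),
                         site_end + CUT_OFFSETS[0], site_end + CUT_OFFSETS[1]))
    return sorted(hits, key=lambda h: h[1])

# Hypothetical linker-plus-insert sequence, for illustration only.
example = "TT" + "CAGCAG" + "A" * 30
print(cut_windows(example))
```

Because the primer and the SNP must both fit between the recognition site and the cut window, a primer segment after the linker can be at most a little over 20 bases long under this convention, which matches the remark that too short a distance restricts primer length.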
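Linkers can be ligated to the two ends of the amplified fragments in several ways, and different sequencing platforms (or instruments) have their own linkers and ligation methods. For example, ligation can be achieved between the amplified fragment and a blunt-ended linker, or between an amplified fragment with an "A" overhang and a linker with a "T" overhang.
After linker ligation, a gap may remain, which can be eliminated by a nick translation reaction. In addition, if the linker ligation efficiency is not high enough, a larger amount of linker-ligated amplified fragments can be obtained by PCR amplification after ligation; the primers used for this PCR amplification should contain a linker-binding sequence.
In one embodiment of the present invention, the primers used to PCR-amplify the linker ligation product carry a U base site, which is recognized and cleaved by USER enzyme to produce sticky ends that facilitate circularization. The fragments with sticky ends at both ends obtained by USER digestion are then circularized to give circular DNA molecules. Alternatively, a cleavage site for a specific restriction endonuclease can be introduced on the primers used for PCR amplification, and the PCR product can be digested with that restriction endonuclease to produce sticky ends before circularization. However, the present invention preferably introduces a U base site on the primers used for PCR amplification, because this ensures that U base sites are present only on the primers and that cleavage is specific, i.e., unlike an ordinary restriction endonuclease, it does not cut at undesired sites.
Circularizing the fragment after ligation of the 5'-end linker (5'-Ad) and 3'-end linker (3'-Ad) gives the circular DNA molecule shown in Figure 3. In this molecule, the 5'-end linker (5'-Ad) and 3'-end linker (3'-Ad) are joined to each other at one end, while their other ends are connected to the multiplex PCR primer sequences (P1 and P4), followed in turn by the SNP sites (SNP1 and SNP2) and the cut sites (CS) of the restriction endonuclease.
After circularization, an exonuclease can be used to digest uncircularized linear DNA molecules, and the restriction endonuclease is then used to cleave the circular DNA molecules at the cut sites (CS), giving the cleavage fragment shown in Figure 4, which comprises two SNP sites (SNP1 and SNP2), two primer sequences (P1 and P4) and two linker sequences (5'-Ad and 3'-Ad). Many such cleavage fragments make up the haplotype typing sequencing library of the present invention. The cleavage fragments share the same linker sequences and differ in their primer sequences and SNP sites. The fragments are generally a few tens of bases long, which suits second-generation high-throughput sequencing.
During sequencing, the 5'-end linker and 3'-end linker sequences serve as anchor sites for the sequencing primers. Because each SNP site has its own flanking primers, the tag sequence on a primer (as shown in Figure 5, a stretch of the primer that corresponds specifically to a particular SNP site) can be used as a positional identifier to assemble the sequenced reads into a haplotype, i.e., a specific form of multiple SNP sites in series (as shown in Figure 6).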
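The feasibility of the present invention is illustrated below through a specific example. The description of the example is illustrative only and should not be understood as limiting the scope of protection of the present invention.
Example: individual haplotype typing of a plasma sample from one normal individual.
1 Step 1: Genomic sample processing
1.1 Whole-blood DNA extraction
Whole blood stored at -20 °C in an EDTA anticoagulant tube was thawed at room temperature and mixed by gentle inversion. Following the kit manual, whole-blood DNA was extracted from a starting volume of 100 μL of whole blood using the QIAamp DNA Micro Kit. The extraction steps were as follows:
(1) Place 100 μL of whole blood in a clean 1.5 ml centrifuge tube, add 10 μL proteinase K and 100 μL AL buffer, pulse-vortex for 15 s to mix thoroughly, and centrifuge briefly.
(2) Incubate at 56 °C in a thermomixer for 10 min, then centrifuge briefly.
(3) Add 50 μL absolute ethanol, pulse-vortex for 15 s, centrifuge briefly, incubate at room temperature for 3 min, and centrifuge briefly.
(4) Place the spin column in a 2 ml collection tube, transfer the entire mixture from step (3) to the column, centrifuge at 8000 rpm for 1 min, and discard the flow-through.
(5) Add 500 μL AW1 buffer to the column, centrifuge at 8000 rpm for 1 min, and discard the flow-through.
(6) Add 500 μL AW2 buffer to the column, centrifuge at 8000 rpm for 1 min, and discard the flow-through; centrifuge at 14000 rpm for 3 min, and discard the flow-through and the collection tube.
(7) Place the column in a new 1.5 ml centrifuge tube, open the column lid and leave at room temperature for 3 min to dry the membrane.
(8) Add 20 μL TE buffer onto the membrane, incubate at room temperature for 1 min, and centrifuge at 14000 rpm for 1 min. The solution collected in the centrifuge tube is the target DNA solution.
Note: to increase the total amount of extracted DNA by increasing the starting volume of whole blood, the amounts of proteinase K, lysis buffer AL and absolute ethanol must be increased in the same proportion. The column holds a maximum volume of about 600 μL, and step (4) can be repeated.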
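The proportional scaling in the note can be worked through with a small sketch. The base volumes and the roughly 600 μL column capacity come from the protocol above; the function name and the idea of counting column loads as a ceiling division are illustrative assumptions.

```python
import math

# Volumes for the 100 µL protocol described above (µL).
BASE_UL = {"whole blood": 100, "proteinase K": 10, "AL buffer": 100, "absolute ethanol": 50}
COLUMN_CAPACITY_UL = 600  # approximate maximum load per column, from the note

def scale_extraction(blood_ul: float):
    """Scale all lysis reagents with the blood volume and count column loads."""
    factor = blood_ul / BASE_UL["whole blood"]
    volumes = {name: vol * factor for name, vol in BASE_UL.items()}
    total = sum(volumes.values())
    loads = math.ceil(total / COLUMN_CAPACITY_UL)  # repeats of step (4)
    return volumes, total, loads

volumes, total, loads = scale_extraction(300)
print(volumes, total, loads)
```

For 300 μL of blood the lysate comes to 780 μL, so step (4) would be run twice on the same column.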
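1.2 DNA quantification
The extracted tissue DNA and whole-blood DNA were quantified on an Invitrogen
Figure PCTCN2015070143-appb-000001
fluorometer with the Qubit™ dsDNA HS Assay Kit, following the manual. Specifically, the Qubit™ dsDNA HS reagent in the kit was diluted 200-fold with Qubit™ dsDNA HS buffer to make the working solution, and the subsequent steps were as follows:
(1) Prepare the standard reactions:
(a) Add 10 μL dsDNA HS standard #1 (0 ng/μL) to 190 μL working solution, vortex to mix and centrifuge briefly.
(b) Add 10 μL dsDNA HS standard #2 (10 ng/μL) to 190 μL working solution, vortex to mix and centrifuge briefly.
(2) Prepare the sample reaction: add 1 μL of the sample to 199 μL working solution, vortex to mix and centrifuge briefly.
(3) Incubate the standard and sample reactions at room temperature in the dark for 2 min.
(4) Following the on-screen prompts, insert the standard reactions in turn to set the standard curve.
(5) Following the on-screen prompts, insert the sample reactions in turn to measure the sample concentrations.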
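The volumes in this assay follow from simple dilution arithmetic, sketched below. The per-tube volumes (190 μL for standards, 199 μL for samples) and the 200-fold reagent dilution are taken from the steps above; the function names are hypothetical, and no pipetting overage is included.

```python
def working_solution_plan(n_samples: int, n_standards: int = 2):
    """Volumes (µL) of reagent and buffer for a 1:200 working solution."""
    total = n_standards * 190.0 + n_samples * 199.0
    reagent = total / 200.0          # 200-fold dilution of the reagent
    return {"total_ul": total, "reagent_ul": reagent, "buffer_ul": total - reagent}

def dilution_factor(input_ul: float, working_ul: float) -> float:
    """Fold dilution of a standard or sample in its assay tube."""
    return (input_ul + working_ul) / input_ul

print(working_solution_plan(1))
print(dilution_factor(1, 199), dilution_factor(10, 190))
```

A 1 μL sample in 199 μL of working solution is diluted 200-fold, while the standards are diluted 20-fold.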
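2 Step 2: PCR, library construction and sequencing
2.1 Long Range Multiplex PCR
LA Taq enzyme was used; a 50 μL PCR system is given as an example:
Figure PCTCN2015070143-appb-000002
PCR conditions:
94 °C for 1 min; 30 cycles of 98 °C for 10 sec and 68 °C for 15 min; 72 °C for 10 min.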
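As a quick planning aid, the hold times of this program can be summed. This is a minimal sketch that counts only the programmed holds and ignores ramping between temperatures, so the real run takes somewhat longer; the function name is illustrative.

```python
def program_seconds(initial_s, cycle_steps_s, n_cycles, final_s):
    """Total programmed hold time of a thermocycler program, in seconds."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s

# 94 °C 1 min; 30 x (98 °C 10 s, 68 °C 15 min); 72 °C 10 min
total_s = program_seconds(60, (10, 15 * 60), 30, 10 * 60)
hours, remainder = divmod(total_s, 3600)
print(f"{total_s} s = {hours} h {remainder // 60} min")
```

The 15 min extension step dominates: the holds add up to 27960 s, or 7 h 46 min.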
Table 1. Site sequence information (all site sequences are on chromosome 11; positions refer to the hg19 genome)
Figure PCTCN2015070143-appb-000003
Figure PCTCN2015070143-appb-000004
Figure PCTCN2015070143-appb-000005
Figure PCTCN2015070143-appb-000006
Figure PCTCN2015070143-appb-000007
Figure PCTCN2015070143-appb-000008
Figure PCTCN2015070143-appb-000009
Figure PCTCN2015070143-appb-000010
Figure PCTCN2015070143-appb-000011
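2.2 Library construction
(1) Shrimp alkaline phosphatase (rSAP) treatment; the reaction system is as follows:
NEBuffer2 (10×)  6 μL
rSAP (1 U/μL)    6 μL
Total            12 μL
Add 12 μL of the above reaction system to the 50 μL multiplex PCR product of each sample; 37 °C for 45 min, 65 °C for 10 min, hold at 4 °C.
(2) End blunting; the reaction system is as follows:
Figure PCTCN2015070143-appb-000012
Figure PCTCN2015070143-appb-000013
Add 18 μL of reaction mixture to the 62 μL product of the previous step; 12 °C for 20 min, hold at 4 °C.
Magnetic bead purification: mix 1 volume of AmpureXP magnetic beads with the reaction solution; after 10 min, add 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 42 μL TE for 10 min.
(3) 5'-end linker ligation
The 5'-end linker sequences are as follows:
5'-end linker long strand: 5'-pACTTCAGAACCGCAATGCACGATACGC-3'dd (SEQ ID NO: 205; dd indicates that the terminal nucleotide carries a dideoxy modification);
5'-end linker short strand: 5'-TTGCGGTTCTGAAGT-3'dd (SEQ ID NO: 206).
The 5'-end linker ligation system is as follows:
Figure PCTCN2015070143-appb-000014
Note: add the linker first and then the reaction mixture, to prevent self-ligation.
14 °C for 1 h, hold at 4 °C.
Magnetic bead purification: add 0.9 volumes of AmpureXP; after 10 min, add 640 μL of 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 42 μL TE for 10 min.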
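(4) 5' primer extension
The extension primer ON3659 has the sequence:
5'-ACGTATCGTGCAUTGCGGTTCTGAAGT-3' (SEQ ID NO: 207).
The 5' primer extension system is as follows:
Figure PCTCN2015070143-appb-000015
Add 43.6 μL of primer extension reaction solution to each sample.
95 °C for 3 min; 55 °C for 1 min; 72 °C for 10 min; hold at 4 °C.
Magnetic bead purification: add 0.9 volumes of AmpureXP; after 10 min, add 640 μL of 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 42 μL TE for 10 min.
(5) 3'-end linker ligation
The 3'-end linker sequences are as follows:
3'-end linker long strand: 5'-pTCTTCAGCGTTCCCGAGACGTATCGTGCAC-3'dd (SEQ ID NO: 208);
3'-end linker short strand: 5'-CGGGAACGCTGAAGA-3'dd (SEQ ID NO: 209).
The 3'-end linker ligation system is as follows:
HB (3×)              28.4 μL
Ligase (600 U/μL)    2.1 μL
Total                30.5 μL
Add 14.8 μL of 3'-end linker and 30.5 μL of reaction mixture to each sample.
14 °C for 1 h, hold at 4 °C.
Magnetic bead purification: add 0.9 volumes of AmpureXP; after 10 min, add 640 μL of 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 42 μL TE for 10 min.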
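(6) Nick translation
Oligo reaction mixture:
Figure PCTCN2015070143-appb-000016
Nitro reaction solution:
Figure PCTCN2015070143-appb-000017
Add 32 μL of Oligo reaction mixture to the purified product of the previous step; 60 °C for 5 min, ramp at 0.1 °C/sec to 37 °C, add 8 μL of Nitro reaction solution, 37 °C for 20 min, hold at 4 °C.
Magnetic bead purification: add 80 μL of AmpureXP magnetic beads; after 10 min, add 640 μL of 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 47 μL TE for 10 min.
(7) PCR amplification
The PCR amplification primers are as follows:
The 5'-end extension primer ON3659 has the sequence:
5'-ACGTATCGTGCAUTGCGGTTCTGAAGT-3' (SEQ ID NO: 207);
The 3'-end extension primer ON3660 has the sequence:
5'-ATGCACGATACGUCTCGGGAACGCUGAAGA-3' (SEQ ID NO: 210).
The PCR amplification system is as follows:
Figure PCTCN2015070143-appb-000018
Add 450 μL of PCR reaction mixture to 100 μL of DNA and transfer to 4 Bio-Rad PCR plates, 120 μL/well.
95 °C for 3 min; 7 cycles of 95 °C for 30 s, 56 °C for 30 s and 72 °C for 8 min; 68 °C for 10 min; ramp at 0.1 °C/sec to 4 °C.
Magnetic bead purification: add 480 μL of AmpureXP magnetic beads; after 10 min, add 2000 μL of 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 85 μL TE for 10 min.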
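(8) USER enzyme reaction
Figure PCTCN2015070143-appb-000019
Add 50 μL of enzyme reaction solution to the above reaction mixture (about 60 μL); 37 °C for 1 h, hold at 4 °C.
(9) Double-stranded DNA circularization
Circularization reaction buffer:
Water              1520 μL
TA buffer (10×)    180 μL
Total              1700 μL
Split the USER enzyme reaction product of the previous step into 4 aliquots of 27.5 μL, add 423 μL of circularization reaction buffer to each, and incubate in a 60 °C water bath for 30 min.
Circularization enzyme reaction mixture:
Figure PCTCN2015070143-appb-000020
Add 50 μL of circularization enzyme reaction mixture to the above product, leave at room temperature for 1 h for the circularization reaction, then perform two rounds of magnetic bead purification.
(10) Digestion of uncircularized DNA
Plasmid Safe reaction solution:
Figure PCTCN2015070143-appb-000021
After the purification in the previous step, add 20 μL of Plasmid Safe reaction solution to the 60 μL DNA sample; 37 °C for 1 h, hold at 4 °C. Magnetic bead purification: add 80 μL of AmpureXP magnetic beads; after 10 min, add 640 μL of 75% ethanol, remove the ethanol, incubate at 37 °C for 3-5 min, and dissolve in 42 μL TE for 10 min.
(11) EcoP15 digestion
Figure PCTCN2015070143-appb-000022
Add 323 μL of digestion reaction solution to 37 μL of the DNA sample purified in the previous step and incubate at 37 °C for 16 h. Purify with 252 μL of PEG32 magnetic beads, then wash with 2000 μL of 75% ethanol.
(12) End repair
Figure PCTCN2015070143-appb-000023
Figure PCTCN2015070143-appb-000024
Add 8.6 μL of end-blunting mixture to 44 μL of the DNA purified in the previous step; 12 °C for 20 min, hold at 4 °C.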
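2.3 Sequencing
The library obtained in this example was sequenced on the Proton sequencing platform.
3 Step 3: Alignment
Steps 3 to 5 are a specific example of sequence analysis of the reads obtained by sequencing in the present invention. Those skilled in the art can also perform the sequence analysis by other methods without difficulty, because many well-known software packages are available for typing.
3.1 Principle
The sequencing reads are aligned with BWA (http://bio-bwa.sourceforge.net/bwa.shtml); the read length is about 26 bp.
To reduce the chance of erroneous alignments, the reference sequence is built from the sequences of the PCR target products.
3.2 Extracting the reference sequence
Input data:
The primer information file output by primer design with the Primer3 software (http://www.mybiosoftware.com/pcr-primer-design/1470), with the position of the SNP of interest added as the last column.
Format: ID, primer length, chromosome, start position, end position, score 1, score 2, primer sequence, TM, GC, SELF_ANY, SELF_END, END_STABILITY, PAIR_COMPL_ANY, PAIR_COMPL_END, product sequence, SNP position.
The target product sequence is extracted according to the designed primer information. Parameters:
-i primer information file
-o PCR product sequences (fasta)
-r .REF file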
使用BWA软件中的index子程序。
使用方法:
bwa index Primer.fasta
3.4运行比对程序
主程序为:aln_bwa.pl(深圳华大基因研究院提供)。
参数要求:
-il 输入fastq文件列表
-d 输出文件路径
-p bwa参数设置(aln-o3-e10-i1-L-O3-E1-M5-t4)
运行时间与样本个数和测序量有关,一般运行时间不超过1h。
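The wrapper aln_bwa.pl is not public, so the sketch below only shows how equivalent BWA commands might be assembled. The `bwa index` call is the one given above. The `bwa aln` options are the `-p` string with spaces restored, which is an assumption about how the wrapper passes them on. The single-end `bwa samse` step and all file names are also assumptions.

```python
import shlex

# The -p setting quoted in the text, with spaces restored (assumed reading).
BWA_ALN_OPTS = "-o 3 -e 10 -i 1 -L -O 3 -E 1 -M 5 -t 4"

def bwa_commands(reference: str, fastq: str, prefix: str):
    """Return argument lists for indexing, aligning and SAM conversion."""
    index_cmd = ["bwa", "index", reference]
    aln_cmd = ["bwa", "aln", *shlex.split(BWA_ALN_OPTS),
               "-f", f"{prefix}.sai", reference, fastq]
    samse_cmd = ["bwa", "samse", "-f", f"{prefix}.sam",
                 reference, f"{prefix}.sai", fastq]
    return [index_cmd, aln_cmd, samse_cmd]

for cmd in bwa_commands("Primer.fasta", "sample.fastq", "sample"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine with BWA installed; converting the resulting SAM to BAM would need a further tool such as samtools.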
3.5输出结果分析
本程序主要输出结果为bam文件,格式为bam标准输出。
4步骤四:统计信息
4.1操作原理
对比对到参考序列的reads统计SNP位点的碱基信息以及覆盖深度,包括SNP之间的连锁关系。
4.2输入文件
比对结果BAM文件,用于定位SNP的标签序列(anchor序列)文件列表(SNP位点前的6-10bp序列)。
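4.3 Running the script
Haplotype.stepl.pl (provided by BGI Shenzhen).
-i input BAM or SAM file
-o output file
-p anchor file
4.4 Output
(1) link file: tallies the association between pairs of sites;
(2) loci file: tallies the depth of each base at each single site.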
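The two kinds of statistics can be sketched in Python as follows. This is a hedged illustration of per-site depth counting and pairwise co-occurrence counting, not the actual file format or logic of the script above; the function name `tally`, the site IDs and the in-memory dictionary layout are assumptions.

```python
from collections import Counter
from itertools import combinations

def tally(reads_calls):
    """Tally per-site base depths and pairwise base co-occurrences.

    reads_calls: iterable of dicts, one per read (or read pair), mapping
    a site ID to the base observed at that site.
    """
    loci = {}   # site -> Counter of bases (per-site depth)
    link = {}   # (site_a, site_b) -> Counter of (base_a, base_b) pairs
    for calls in reads_calls:
        for site, base in calls.items():
            loci.setdefault(site, Counter())[base] += 1
        # Every pair of sites seen on the same molecule contributes one link.
        for a, b in combinations(sorted(calls), 2):
            link.setdefault((a, b), Counter())[(calls[a], calls[b])] += 1
    return loci, link

# Hypothetical calls from four reads, each covering two SNP sites.
calls = [
    {"snp1": "G", "snp2": "A"},
    {"snp1": "G", "snp2": "A"},
    {"snp1": "T", "snp2": "C"},
    {"snp1": "T", "snp2": "C"},
]
loci, link = tally(calls)
print(loci["snp1"], link[("snp1", "snp2")])
```

Here `loci` plays the role of the loci file and `link` the role of the link file.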
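5 Step 5: Local haplotyping
5.1 Principle of operation
This process phases the local haplotypes using the SNP statistics collected in the previous step.
(1) Determine from the single-site depth statistics whether each site is homozygous;
(2) Assemble the haplotypes from the pairwise associations between heterozygous sites.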
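The two rules above can be sketched in Python under simplifying assumptions. This is not the logic of the phasing script named in section 5.3, which is not shown in this document. The 0.2 minor-allele fraction threshold, the function names, and the choice to link each heterozygous site only to the previous heterozygous site in sorted order are all assumptions made for illustration.

```python
from collections import Counter

def site_alleles(depths, min_minor_fraction=0.2):
    """Return one allele (homozygous) or two alleles (heterozygous) for a site."""
    total = sum(depths.values())
    if total == 0:
        raise ValueError("site has no coverage")
    ranked = Counter(depths).most_common(2)
    if len(ranked) == 2 and ranked[1][1] / total >= min_minor_fraction:
        return (ranked[0][0], ranked[1][0])    # heterozygous
    return (ranked[0][0],)                      # homozygous

def phase(loci, link, min_minor_fraction=0.2):
    """Assemble two haplotypes (dicts of site -> allele) from loci and link counts."""
    hap1, hap2 = {}, {}
    prev = None   # last heterozygous site already placed on the haplotypes
    for site in sorted(loci):
        alleles = site_alleles(loci[site], min_minor_fraction)
        if len(alleles) == 1:
            hap1[site] = hap2[site] = alleles[0]
            continue
        a, b = alleles
        if prev is not None:
            pairs = Counter(link.get((prev, site), {}))
            # Read support for keeping (a on hap1, b on hap2) versus swapping.
            keep = pairs[(hap1[prev], a)] + pairs[(hap2[prev], b)]
            swap = pairs[(hap1[prev], b)] + pairs[(hap2[prev], a)]
            if swap > keep:
                a, b = b, a
        hap1[site], hap2[site] = a, b
        prev = site
    return hap1, hap2

# Hypothetical statistics for three sites; s2 is homozygous.
loci = {"s1": {"G": 10, "T": 9}, "s2": {"A": 20}, "s3": {"C": 8, "T": 11}}
link = {("s1", "s3"): {("G", "T"): 7, ("T", "C"): 6, ("G", "C"): 1}}
hap1, hap2 = phase(loci, link)
print(hap1, hap2)
```

Homozygous sites carry the same allele on both haplotypes, so only the heterozygous sites need link evidence to decide their orientation.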
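5.2 Input files
The link and loci files generated by the script Haplotype.stepl.pl in the previous step.
5.3 Running the program
Haplotype.step2.pl (provided by BGI Shenzhen).
5.4 Output analysis
SNP ID, first allele, second allele. The results are shown in Table 2.
Table 2. Haplotyping results
Site Allele 1 Allele 2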
5207436-rs7116766 G T
5208174-rs7939206 A C
5208594-rs7478828 C G
5209957-rs7938660 G T
5209964-rs12278876 T T
5210727-rs6578572 G T
5211300-rs7482349 A A
5212503-rs7483401 C T
5212574-rs6578576 A G
5214007-rs6578579 A A
5215042-rs12798684 C C
5215307-rs12225185 C C
5216455-rs12804838 T G
5217187-rs10837552 G G
5217565-rs10742571 G G
5218117-rs11036200 G G
5220001-rs6578582 T T
5221645-rs11512276 C G
5221825-rs11036212 A G
5222215-rs7938317 C G
5223750-rs9667878 T T
5224054-rs12293662 A A
5225635-rs11036238 G G
5228708-rs10837582 A A
5229743-rs56385516 A A
5231897-rs10837593 C C
5233697-rs6578584 C C
5233836-rs6421048 C C
5234781-rs12362241 A A
5236417-rs7945118 C C
5236954-rs79681613 T G
5246203-rs78928216 A A
5246512-rs7110263 T A
5247733-rs7480526 A A
5247791-rs10768683 C C
5252251-rs6578588 C C
5253586-rs74049332 G G
5255912-rs3813727 A G
5256647-rs7948668 A C
5257645-rs3759075 G C
5258592-rs4402323 C C
5258827-rs4910543 G T
5260458-rs968857 T T
5261239-rs10768687 C C
5263853-rs16912210 A A
5264146-rs2071348 T T
5264929-rs7934275 G G
5265106-rs12417960 C G
5266728-rs7924684 T T
5268406-rs7480197 C G
5269140-rs6578592 C G
5271671-rs2855039 C C
5272795-rs74049345 G G
5273687-rs2855125 G G
5273922-rs10768707 T A
5275240-rs11036475 G G
5276818-rs2011051 G G
5277236-rs2855122 C G
5279153-rs5010979 T T
5280022-rs11036496 G G
5281018-rs10837697 G G
5284978-rs4348933 A A
5286808-rs10837707 T A
5289306-rs11822578 G G
5290053-rs72872549 C C
5291872-rs10768737 C C
5294401-rs4910548 C C
5296104-rs7130110 G G
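The above is a further detailed description of the present invention in connection with specific embodiments, and the specific implementation of the invention shall not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention.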

Claims (13)

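  1. A method for constructing a haplotyping sequencing library, characterized in that the method comprises:
    performing random multiplex PCR amplification on genomic DNA from the individual to be haplotyped, using primers designed near a plurality of SNP sites in a chromosomal region, to obtain amplified fragments having the SNP sites at both ends;
    ligating a 5’-end adapter and a 3’-end adapter to the two ends of the amplified fragments, respectively, wherein the 5’-end adapter and the 3’-end adapter each have a recognition site for a restriction endonuclease, and the cleavage site of the restriction endonuclease lies on the inner side of the SNP site;
    circularizing the fragments ligated to the 5’-end adapter and the 3’-end adapter to obtain circular DNA molecules;
    cleaving the circular DNA molecules at the cleavage sites with the restriction endonuclease to obtain cleaved fragments comprising two SNP sites, two primer sequences and two adapter sequences, which constitute the sequencing library.
  2. The method for constructing a haplotyping sequencing library according to claim 1, characterized in that the polymerase used for the multiplex PCR amplification is a polymerase that amplifies long fragments of 10 kb or more.
  3. The method for constructing a haplotyping sequencing library according to claim 1, characterized in that the restriction endonuclease is Ecop15, the recognition site is AGACC or CAGCAG, and the cleavage site is located 24-26 bp downstream of the recognition site.
  4. The method for constructing a haplotyping sequencing library according to claim 1, characterized in that a nick translation reaction is performed after the 5’-end adapter and the 3’-end adapter are ligated to the two ends of the amplified fragments.
  5. The method for constructing a haplotyping sequencing library according to claim 4, characterized in that, after the nick translation reaction, PCR amplification is performed using primers that bind the 5’-end adapter and the 3’-end adapter, wherein the primers used for the PCR amplification carry U base sites; USER enzyme digestion is used to generate sticky ends that facilitate circularization; and the fragments with sticky ends at both ends obtained by the USER digestion are circularized to obtain circular DNA molecules.
  6. The method for constructing a haplotyping sequencing library according to claim 1, characterized in that, after the fragments ligated to the 5’-end adapter and the 3’-end adapter are circularized, uncircularized linear DNA molecules are digested before the cleavage with the restriction endonuclease is performed.
  7. A haplotyping method, characterized in that the method comprises: sequencing the sequencing library obtained by the method according to any one of claims 1-6; and then, using the tag sequence on the primer corresponding to each SNP site as a sequence identifier, analyzing the linkage relationships of the SNP sites in the reads obtained by the sequencing to obtain the haplotypes.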
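  8. A reagent for constructing a haplotyping sequencing library, characterized in that the reagent comprises the following components:
    a primer set comprising a plurality of primers designed near a plurality of SNP sites in a chromosomal region, for performing random multiplex PCR amplification on genomic DNA from the individual to be haplotyped, to obtain amplified fragments having the SNP sites at both ends;
    a 5’-end adapter and a 3’-end adapter, each having a recognition site for a restriction endonuclease whose cleavage site lies on the inner side of the SNP site, for ligation to the two ends of the amplified fragments;
    a circularization ligase, for circularizing the fragments ligated to the 5’-end adapter and the 3’-end adapter to obtain circular DNA molecules;
    a restriction endonuclease, for cleaving the circular DNA molecules at the cleavage sites to obtain cleaved fragments comprising two SNP sites, two primer sequences and two adapter sequences.
  9. The reagent for constructing a haplotyping sequencing library according to claim 8, characterized in that it further comprises a polymerase that amplifies long fragments of 10 kb or more, for the multiplex PCR amplification.
  10. The reagent for constructing a haplotyping sequencing library according to claim 8, characterized in that the restriction endonuclease is Ecop15, the recognition site is AGACC or CAGCAG, and the cleavage site is located 24-26 bp downstream of the recognition site.
  11. The reagent for constructing a haplotyping sequencing library according to claim 8, characterized in that it further comprises components for a nick translation reaction, for performing the nick translation reaction after the 5’-end adapter and the 3’-end adapter are ligated to the two ends of the amplified fragments.
  12. The reagent for constructing a haplotyping sequencing library according to claim 11, characterized in that it further comprises:
    a primer pair for PCR amplification, the primers binding the 5’-end adapter and the 3’-end adapter respectively and carrying U base sites, for performing PCR amplification after the nick translation reaction;
    USER enzyme, for digesting the product of the PCR amplification to generate sticky ends that facilitate circularization.
  13. The reagent for constructing a haplotyping sequencing library according to claim 8, characterized in that it further comprises an exonuclease, for digesting uncircularized linear DNA molecules after the fragments ligated to the 5’-end adapter and the 3’-end adapter are circularized.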
PCT/CN2015/070143 2015-01-06 2015-01-06 Method for constructing a haplotyping sequencing library, haplotyping method and reagents WO2016109928A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
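CN201580068011.6A CN107208314B (zh) 2015-01-06 2015-01-06 Method for constructing a haplotyping sequencing library, haplotyping method and reagents
PCT/CN2015/070143 WO2016109928A1 (zh) 2015-01-06 2015-01-06 Method for constructing a haplotyping sequencing library, haplotyping method and reagents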

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/070143 WO2016109928A1 (zh) 2015-01-06 2015-01-06 Method for constructing a haplotyping sequencing library, haplotyping method and reagents

Publications (1)

Publication Number Publication Date
WO2016109928A1 true WO2016109928A1 (zh) 2016-07-14

Family

ID=56355392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/070143 WO2016109928A1 (zh) 2015-01-06 2015-01-06 Method for constructing a haplotyping sequencing library, haplotyping method and reagents

Country Status (2)

Country Link
CN (1) CN107208314B (zh)
WO (1) WO2016109928A1 (zh)

Also Published As

Publication number Publication date
CN107208314B (zh) 2020-06-16
CN107208314A (zh) 2017-09-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15876439

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15876439

Country of ref document: EP

Kind code of ref document: A1