CN109576346B

CN109576346B - Construction method and application of high-throughput sequencing library

Info

Publication number: CN109576346B
Application number: CN201811306555.3A
Authority: CN
Inventors: 张巨永; 卢瀚林
Original assignee: Shenzhen Acegen Technology Co ltd
Current assignee: Shenzhen Acegen Technology Co ltd
Priority date: 2018-11-05
Filing date: 2018-11-05
Publication date: 2022-06-10
Anticipated expiration: 2038-11-05
Also published as: CN109576346A

Abstract

The invention discloses a method for constructing a high-throughput sequencing library and its application. The method comprises the following steps: adding a base A to the 3' end of DNA fragments obtained by fragmenting genomic DNA and repairing the ends, ligating a linker and amplifying, and then mixing the resulting DNA library with a circular blocking oligonucleotide and a specific probe for hybridization capture, wherein the circular blocking oligonucleotide is designed against the linker and tag sequences and pairs complementarily with the linker sequences at the two ends of each library molecule; the specific probe hybridizes to and captures the ligation product so as to obtain the target fragment. By capturing DNA templates whose sequencing linker sequences have been blocked in a circular manner, the invention improves the efficiency with which the probe captures the target DNA sequence.

Description

Construction method and application of high-throughput sequencing library

Technical Field

The invention relates to the field of biotechnology, and in particular to targeted sequencing techniques for determining target DNA fragments of a sample. More specifically, the invention provides a method for constructing a high-throughput sequencing library, a sequencing method for determining target DNA fragments of a sample, a device for determining target DNA fragments of a sample, and a kit for constructing a high-throughput sequencing library of target DNA fragments of a sample.

Background

High-throughput sequencing technologies that have emerged in recent years can sequence billions of DNA fragments simultaneously, and provide a powerful tool for basic biomedical research and clinical testing. Whole genome sequencing is widely used in basic research because of its comprehensive detection capability. However, its cost and analytical complexity remain obstacles for researchers: although the throughput of next-generation sequencing (NGS) keeps rising and its cost keeps falling, whole genome sequencing is still not a viable option for most genetics laboratories and clinical testing centers. This is especially true for studies of complex diseases, which require at least hundreds of samples to achieve sufficient statistical power; whole genome sequencing of so many samples is difficult in terms of both cost and data analysis.

Targeted sequencing offers an alternative. The target DNA of interest is captured by one of several methods to prepare a sequencing library, which is then analyzed by high-throughput sequencing to obtain the sequence of the target DNA. One example is exome capture sequencing, which captures and sequences about 30 Mb of exon sequence across the whole genome at roughly one percent of the cost of whole genome sequencing. For the large genomes of humans and other higher organisms, targeted sequencing can improve sequencing efficiency by hundreds of times and greatly increase sample throughput, which makes it better suited to clinical testing. Various targeted sequencing technologies have been developed, and they fall mainly into enrichment based on probe capture and enrichment based on multiplex PCR.

Targeted sequencing based on multiplex PCR has been adopted in some areas of clinical testing because of its simple experimental procedure, but most such methods can only capture regions smaller than 1 Mb, can only detect known mutations, and have poor detection stability, which limits their clinical application. Probe-based targeted sequencing can capture regions larger than 10 Mb, has good stability, can detect many types of mutation, and allows the detection region to be customized, so it has great potential in clinical application.

However, targeted sequencing based on probe capture has a long library construction process: the probes need 1-2 days or longer of hybridization to bind the target region sufficiently, which greatly limits the timeliness of clinical testing. In addition, hybrid capture efficiency is limited (typically only 50-60%), so a large share of the sequencing data is wasted on non-target regions, which in effect adds to the cost of probe capture.

Disclosure of Invention

The present invention is directed to solving at least one of the problems of the prior art. The first aspect of the invention provides a method for constructing a high-throughput sequencing library, comprising the following steps:

fragmenting genomic DNA so as to obtain DNA fragments;

end-repairing the DNA fragment to obtain an end-repaired DNA fragment;

adding a base A to the 3'-end of the end-repaired DNA fragment so as to obtain a DNA fragment having a cohesive end A;

ligating the DNA fragment having the cohesive end A with a linker to obtain a ligation product;

carrying out PCR amplification on the ligation product using one primer phosphorylated at the 5' end and another primer without 5'-end phosphorylation to obtain a DNA library;

in a preferred embodiment of the invention, the DNA product is digested with an exonuclease to obtain a single-stranded DNA library; in a preferred embodiment of the invention, wherein the exonuclease is lambda exonuclease;

mixing the DNA library with blocking oligonucleotides and specific probes for hybrid capture, wherein the blocking oligonucleotides circularly block the linker and/or tag sequences introduced at the two ends of the DNA library, and the specific probes perform hybrid capture on the ligation products so as to obtain target fragments; the circular blocking oligonucleotide is designed against the linker and/or tag sequences: its two segments pair complementarily with the linker and/or tag sequences at the two ends of the DNA library, respectively, and are connected to form a closed loop, thereby achieving circular blocking;

In a preferred embodiment of the invention, the hybrid capture is 6-8 h;

in a preferred embodiment of the present invention, the hybrid capture is followed by adsorption and washing with magnetic beads with streptavidin;

performing PCR amplification on the obtained target fragment so as to obtain an amplification product;

in a preferred embodiment of the invention, the PCR amplification is performed for 10-12 cycles;

and isolating and purifying the amplification products, the amplification products constituting the high-throughput sequencing library,

in a preferred embodiment of the present invention, the method further comprises the step of extracting genomic DNA from a sample, preferably the sample is derived from at least one of a mammal, a plant, and a microorganism, more preferably the mammal is at least one of a human and a mouse, preferably the genomic DNA is human whole blood genomic DNA, more preferably the genomic DNA is peripheral blood mononuclear cell genomic DNA,

in a preferred embodiment of the invention, the amount of genomic DNA is 2 μg,

in a preferred embodiment of the invention, genomic DNA is fragmented using a Covaris S2 disruptor,

in a preferred embodiment of the invention, the DNA fragment has a length of about 150-300bp, preferably 200-250bp,

In a preferred embodiment of the present invention, before the DNA fragment is subjected to end repair, a step of purifying the DNA fragment is further included,

in a preferred embodiment of the invention, the end repair of the DNA fragment is performed using Klenow fragment, which has 5'→3' polymerase activity and 3'→5' exonuclease activity but lacks 5'→3' exonuclease activity, together with T4 DNA polymerase and T4 polynucleotide kinase,

in a preferred embodiment of the present invention, the addition of the base A to the 3'-end of the end-repaired DNA fragment is carried out using Klenow (3'→5' exo-),

in a preferred embodiment of the invention, the linker comprises a tag sequence,

in a preferred embodiment of the present invention, the ligation of the DNA fragment having cohesive end A to the linker is performed using T4 DNA ligase,

in a preferred embodiment of the present invention, after obtaining the ligation product, further comprising a step of purifying the ligation product,

in a preferred embodiment of the invention, the specific probes are designed using the eArray system,

in a preferred embodiment of the invention, the probe is a 120-mer,

in a preferred embodiment of the invention, 1 μg of ligation product is used for the hybrid capture,

in a preferred embodiment of the invention, the PCR amplification is performed using a hot start DNA polymerase,

in a preferred embodiment of the present invention, the separation and purification of the amplification product is performed by at least one selected from the group consisting of magnetic bead purification, purification column purification, and 2% agarose gel electrophoresis, preferably by 2% agarose gel electrophoresis,

in a preferred embodiment of the present invention, the length of the library fragment of the high throughput sequencing library is 300-450 bp.

In a second aspect, the present invention provides a method for sequencing a target DNA sequence of a sample, comprising the steps of:

constructing a high throughput sequencing library of target DNA fragments of said sample according to the method of the first aspect of the invention;

sequencing a high-throughput sequencing library of target DNA sequences of the sample to obtain a sequencing result.

In a preferred embodiment of the invention, the sequencing is performed using high throughput sequencing techniques.

In a preferred embodiment of the invention, the sequencing is performed using a HiSeq 2000 sequencer.

A third aspect of the present invention provides an apparatus for determining a target DNA sequence of a sample, comprising:

a library preparation unit for preparing a high-throughput sequencing library of target DNA fragments of a sample, wherein a specific probe and a blocking oligonucleotide are provided in the library preparation unit;

a sequencing unit connected with the library preparation unit, which receives the high-throughput sequencing library of target DNA fragments of the sample from the library preparation unit, sequences it, and obtains a sequencing result; and

a data analysis unit connected with the sequencing unit, which receives the sequencing result from the sequencing unit and performs data analysis on it so as to determine information on the target DNA fragments of the sample.

In a preferred embodiment of the invention, the high throughput sequencing library is a single stranded DNA library,

In a preferred embodiment of the present invention, the length of the probe is about 120 mers.

The fourth aspect of the invention provides a blocking oligonucleotide for constructing a high-throughput sequencing library of target DNA sequences of a sample, wherein the circular blocking oligonucleotide is designed against the linker and/or tag sequences; its two segments pair complementarily with the linker and/or tag sequences at the two ends of the DNA library, respectively, and are connected to form a closed loop, thereby circularly blocking the linker and/or tag sequences introduced at the two ends of the DNA library.

In a preferred embodiment of the invention, the blocking oligonucleotide has the sequence shown in SEQ ID NO 1 and SEQ ID NO 2.

In a fifth aspect, the present invention provides a high-throughput sequencing library of target DNA sequences of a sample, the high-throughput sequencing library being constructed according to the method of the first aspect of the present invention.

The sixth aspect of the present invention provides a kit for constructing a high-throughput sequencing library of target DNA sequences of a sample, comprising:

DNA library, specific probe and blocking oligonucleotide,

in a preferred embodiment of the invention, the DNA library is a single-stranded DNA library,

The invention provides a method for blocking the sequencing linker sequences during DNA hybridization capture, which can improve the capture efficiency of the probe, reduce the proportion of non-target DNA, greatly reduce the sequencing cost of capturing target DNA sequences, and promote the clinical application of targeted sequencing based on probe capture (Figure 5). Existing capture technology captures double-stranded DNA fragments with RNA or DNA probes and blocks the introduced adaptor and tag sequences linearly with DNA oligonucleotides. The present method differs from conventional capture in that a circular block binds the introduced linker and tag sequences, blocking the introduced sequences as completely as possible and preventing non-target capture caused by binding between linker sequences and between probes and linkers. The advantages of the technical scheme are detailed below:

1. Increases target capture efficiency

The invention adopts a circular blocking strategy in which the block binds firmly to the introduced linker and tag sequences, avoiding the loss of capture efficiency and the capture of non-specific sequences that occur when probes bind to the linker and tag sequences. The circular block has two binding sites on the template, so it binds more strongly and blocks more effectively than a linear block (Figure 3).

2. Applicable to multiple capture systems

The method provided by the invention is applicable to the NimbleGen chip hybridization system, the Agilent liquid-phase hybridization system and the NimbleGen EZ liquid-phase hybridization system. At the same or similar sequencing depth (the number of times each base is sequenced), target region coverage, used as an index of sequence capture performance and capture specificity, is consistent whether a single sample or multiple samples are hybridized.

3. Suitable for various sequencing platforms

When the method provided by the invention is used to construct a hybridization sequencing library, only the linker and primer sequences need to be replaced with those of the sequencing platform in use, so the method can be adapted to other second-generation sequencing platforms such as Roche 454 and AB SOLiD, and has broad application prospects.
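As a rough illustration of the two metrics named under advantage 2, the sketch below computes mean sequencing depth and target region coverage from aligned read intervals. The function name, the half-open coordinate convention and the toy reads are assumptions made for illustration; they are not part of the described method.

```python
def depth_and_coverage(target, reads, min_depth=1):
    """Return (mean depth, fraction of target bases with depth >= min_depth).

    target and reads are half-open (start, end) intervals on one chromosome.
    """
    t_start, t_end = target
    depth = [0] * (t_end - t_start)
    for r_start, r_end in reads:
        # Clip each read to the target before counting per-base depth.
        lo = max(r_start, t_start)
        hi = min(r_end, t_end)
        for pos in range(lo, hi):
            depth[pos - t_start] += 1
    mean_depth = sum(depth) / len(depth)
    covered = sum(1 for d in depth if d >= min_depth) / len(depth)
    return mean_depth, covered


# Toy example (hypothetical data): a 10 bp target and three reads,
# one of which lies entirely outside the target.
mean_depth, covered = depth_and_coverage(
    (100, 110), [(95, 105), (100, 108), (120, 130)]
)
```

Comparing libraries at the same mean depth, as the text suggests, keeps differences in coverage from simply reflecting differences in the amount of sequencing.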

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1: a schematic diagram showing the principle of linear and circular blocking;

FIG. 2: a schematic showing non-target region capture resulting from unblocked linkers;

FIG. 3: a schematic diagram showing the structure of the blocking oligonucleotide (oligo);

FIG. 4: a schematic showing the improvement of capture efficiency by circular blocking;

FIG. 5: a schematic technical flow diagram of capture library construction;

FIG. 6: a GC bias plot for libraries constructed with circular blocking alone and with circular blocking plus single-stranded library construction.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.

Method for constructing high-throughput sequencing library

According to one aspect of the invention, the invention provides a method of constructing a high throughput sequencing library. According to an embodiment of the invention, the method comprises the steps of:

first, genomic DNA is fragmented to obtain DNA fragments. The term "DNA" as used in the present invention may be any polymer comprising deoxyribonucleotides, including but not limited to modified or unmodified DNA. It will be understood by those skilled in the art that the source of the genomic DNA is not particularly limited, and it can be obtained from any possible route, either directly from the market, from other laboratories, or directly from a sample. According to the embodiment of the present invention, genomic DNA can be extracted from a sample. According to one embodiment of the present invention, the method for constructing a high throughput sequencing library may further comprise the step of extracting genomic DNA from the sample. According to some specific examples of the invention, the sample may be derived from at least one of a mammal, a plant, and a microorganism. According to some embodiments of the invention, the mammal may be at least one of a human and a mouse. According to one embodiment of the present invention, the genomic DNA may be human whole blood genomic DNA, preferably peripheral blood mononuclear cell genomic DNA.

According to an embodiment of the present invention, the amount of genomic DNA is not particularly limited; according to a specific example, the amount of genomic DNA is preferably 2 μg. The inventors surprisingly found that when 2 μg of genomic DNA is used, the high-throughput sequencing library of target DNA fragments of the sample constructed by the method of the embodiment of the invention can be applied very conveniently to high-throughput sequencing technologies such as Solexa sequencing, and the library sequencing results are accurate and reproducible.

Next, the DNA fragments are subjected to end repair to obtain end-repaired DNA fragments. According to an embodiment of the present invention, a step of purifying the DNA fragments may be included before end repair, which facilitates the subsequent end repair. According to an embodiment of the present invention, end repair may be performed using Klenow fragment, which has 5'→3' polymerase activity and 3'→5' exonuclease activity but lacks 5'→3' exonuclease activity, together with T4 DNA polymerase and T4 polynucleotide kinase. The DNA fragments can thus be end-repaired conveniently and accurately. According to an embodiment of the present invention, a step of purifying the end-repaired DNA fragments may also be included, which facilitates subsequent processing.

Next, a base A is added to the 3'-end of the end-repaired DNA fragment to obtain a DNA fragment having a cohesive end A. According to one embodiment of the present invention, base A may be added to the 3' end of the end-repaired DNA fragment using Klenow (3'→5' exo-), i.e., Klenow fragment lacking 3'→5' exonuclease activity. The base A can thus be added to the 3'-end of the end-repaired DNA fragment easily and accurately. According to an embodiment of the present invention, a step of purifying the DNA fragment having the cohesive end A may also be included, which facilitates subsequent processing.

According to one embodiment of the present invention, the DNA fragment having a cohesive end A is ligated to a linker using T4 DNA ligase, whereby a ligation product can be conveniently obtained. According to an embodiment of the present invention, a step of purifying the ligation product may also be included, which facilitates subsequent processing.

Then, the ligation product is subjected to hybrid capture using a specific probe to obtain a fragment of interest. According to an embodiment of the present invention, the term "specific probe" herein refers to a probe that is specific for a known target DNA fragment. According to a specific example of the present invention, a specific probe is designed based on the use of a human genome as a reference sequence and a target DNA fragment known on the genome as a target sequence, and thus, by performing hybrid capture using the specific probe according to an embodiment of the present invention, a sequence complementary to the target sequence in a sample (in the present specification, sometimes referred to as "target DNA fragment for identifying a sample") can be efficiently captured.

According to the principle of complementary pairing of nucleic acids, the capture probe in a single-stranded state can be complementarily bound to the target sequence in a single-stranded state, thereby successfully capturing the target region. According to the embodiments of the present invention, the probe design can be selected from a solid phase capture chip (the probe is fixed on a solid support) or a liquid phase capture probe (the probe is free in liquid), however, the solid phase capture chip is limited by many factors such as the length of the probe, the density of the probe, and the price, and the liquid phase capture is the first choice.

According to the embodiment of the invention, the probes are designed using the eArray (Agilent) probe design system. The probes are 80-120 mers long, and the total covered length is large, ranging from less than 200 kb to 24 Mb or even longer. The eArray system can conveniently apply the bioinformatics tools WindowMasker (window-based sequence masking) and RepeatMasker (repeat sequence masking) to analyze and mask the target regions, so that probes are not designed in the masked regions; this very effectively reduces capture interference in the experiment and alignment interference in subsequent sequence analysis. Shortening the covered length can also reduce cost to some extent.
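The idea of masking first and designing probes only in unmasked sequence can be sketched as follows. This is not the eArray algorithm, whose internals the text does not describe; it is a minimal end-to-end tiling of fixed-length probes across a target that drops any probe overlapping a masked interval. The function name, the 120 nt default and all coordinates are assumptions.

```python
def tile_probes(target, masked, probe_len=120, step=120):
    """Return half-open (start, end) probe intervals inside target,
    dropping probes that overlap any masked (start, end) interval."""
    t_start, t_end = target
    probes = []
    for start in range(t_start, t_end - probe_len + 1, step):
        end = start + probe_len
        overlaps_mask = any(
            start < m_end and m_start < end for m_start, m_end in masked
        )
        if not overlaps_mask:
            probes.append((start, end))
    return probes


# Hypothetical 600 bp target with one masked repeat at 250-300.
probes = tile_probes((0, 600), masked=[(250, 300)])
```

A smaller `step` than `probe_len` would give overlapping (tiled) probes at the price of a larger probe set.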

The invention designs a method for blocking sequencing adaptors in a DNA hybridization capture system, namely a hybridization capture system in which the sequencing adaptors are circularly blocked.

The sample genomic DNA is fragmented by sonication into fragments of 200-250 bp; adaptors are added to the DNA fragments through end repair, addition of an "A" base and ligation; PCR amplification is carried out with one primer phosphorylated at the 5' end and another primer without 5'-end phosphorylation; and the resulting double-stranded DNA product is digested with lambda exonuclease to obtain a single-stranded DNA library.
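The strand selection in this step can be modelled in a few lines. Lambda exonuclease preferentially degrades the 5'-phosphorylated strand of double-stranded DNA in the 5'→3' direction, so the strand extended from the non-phosphorylated primer is the one expected to survive. The data structure and names below are illustrative assumptions and do not represent sequences from this patent.

```python
from dataclasses import dataclass


@dataclass
class Strand:
    name: str
    five_prime_phosphate: bool


def lambda_exo_digest(duplex):
    """Return the strands expected to survive digestion:
    those that do not carry a 5' phosphate."""
    return [s for s in duplex if not s.five_prime_phosphate]


# A PCR product made with one phosphorylated and one unphosphorylated primer.
duplex = [
    Strand("strand from phosphorylated primer", True),
    Strand("strand from unphosphorylated primer", False),
]
survivors = lambda_exo_digest(duplex)
```

The model is idealized: it treats digestion as complete and ignores any residual activity on non-phosphorylated ends.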

Hybridization of samples

In the conventional probe capture process, a linear block oligo is generally used to block the introduced linker and tag sequences (FIG. 2). Because double-stranded capture is used, the added block oligo can only block half of the linker sequences (FIG. 1), which may allow binding between the probe and the linker and tag sequences and result in non-specific capture. In addition, the adaptor sequences at the two ends of double-stranded DNA fragments are complementary and anneal easily: for example, if the adaptor at one end of a non-target DNA fragment anneals to the adaptor at one end of a target DNA fragment, then when the target fragment is captured on the magnetic beads the non-target fragment is carried along with it, causing non-specific capture (FIG. 6).

The invention improves the DNA hybridization process: circular oligos completely block the introduced adaptor and tag sequences, and the probe then binds the fully blocked single-stranded DNA to capture the target DNA sequence, which prevents non-specific capture caused by complementarity between adaptors and between the probe and the adaptors.

a) The method adopts a circular oligo blocking strategy to completely block the introduced adaptor and tag sequences. Circular blocking is more stable than linear blocking and blocks more effectively, minimizing non-target capture caused by complementary binding between adaptors and between the probe and the adaptor DNA sequences.
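To make the pairing rule concrete, the sketch below derives the two arms of a blocking oligonucleotide as reverse complements of the adaptor sequences at the two ends of a single-stranded library molecule, with sample index positions written as N as in Table 1. The adaptor strings are invented placeholders, not SEQ ID NO 1 or SEQ ID NO 2, and the way the two arms are joined into a closed loop is not modelled.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string; N stays N."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def blocking_arms(adaptor_5p, adaptor_3p):
    """Return the two arms that would pair with the adaptors at the
    5' and 3' ends of a library strand, each written 5'->3'."""
    return reverse_complement(adaptor_5p), reverse_complement(adaptor_3p)


# Placeholder adaptor sequences (hypothetical); NNNNNN marks a sample index.
arm_for_5p_end, arm_for_3p_end = blocking_arms("ACACGACGCT", "AGATCGNNNNNNGTAT")
```

Writing the index as N lets one blocking oligo serve libraries carrying different sample indexes, which appears to be the purpose of the N positions in Table 1.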

Embodiments of the present invention will be described in detail below with reference to examples, but those skilled in the art will appreciate that the following examples are only illustrative of the present invention and should not be construed as limiting its scope. Where the examples do not specify particular techniques or conditions, they are carried out according to techniques or conditions described in the literature in the art (for example, Molecular Cloning: A Laboratory Manual, third edition, by J. Sambrook et al., translated by Huang Peitang et al., Science Press) or according to the product instructions. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.

Example 1 Circular blocking

Agilent liquid-phase hybridization system (Agilent) control example: a single sample was captured with the 50 Mb whole-exon kit (SureSelect Human All Exon 50Mb Kit)

The experimental method comprises the following steps:

The hybridization library was constructed with reference to the SureSelectXT Target Enrichment System for Illumina Paired-End Multiplexed Sequencing Library protocol: 3 μg of genomic DNA (extracted from human peripheral blood) was fragmented, end-repaired, A-tailed, and ligated to a linker (from the Illumina Multiplexed Sample Preparation Oligonucleotide Kit). The single-stranded DNA library was prepared by the single-stranded DNA library preparation method; the primer sequences and blocking oligo sequences are shown in Tables 1 and 2.

End repair

The following reagents were placed in a 1.5ml centrifuge tube

Reagent	Volume (μL)
Fragmented DNA	40
End repair buffer	4
End repair enzyme	6
Total	50

25 °C for 30 min, then 65 °C for 15 min;

Adaptor ligation

Adding the following reagents into the reaction system in the last step

23 ℃ for 30 min;

100 μL of XP beads was added to the reaction system and the product was purified according to the Agencourt AMPure protocol (Beckman, USA); the purified product was dissolved in 35 μL of purified water.

Double stranded DNA library preparation

The PCR reaction system and reaction conditions were as follows:

the following reaction system was arranged in a 200ul PCR tube:

Reagent	Volume (μL)
Adaptor-added DNA	33.5
5× Herculase II Reaction Buffer	10
100 mM dNTP Mix	0.5
Herculase II Fusion DNA Polymerase	1
General primer 1 (10 μM)	2.5
General primer 2 (10 μM)	2.5
Total	50
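When several libraries are amplified in parallel, the per-reaction volumes in the table above are typically scaled into a master mix. The sketch below does that arithmetic; the 10% overage and the choice to leave the template DNA out of the master mix are assumptions, not instructions from the protocol.

```python
# Per-reaction volumes in microlitres, copied from the table above.
PCR_MIX_UL = {
    "Adaptor-added DNA": 33.5,
    "5x Herculase II Reaction Buffer": 10.0,
    "100 mM dNTP Mix": 0.5,
    "Herculase II Fusion DNA Polymerase": 1.0,
    "General primer 1 (10 uM)": 2.5,
    "General primer 2 (10 uM)": 2.5,
}


def master_mix(per_reaction, n_samples, template="Adaptor-added DNA", overage=0.10):
    """Scale every component except the template by n_samples plus overage."""
    factor = n_samples * (1 + overage)
    return {
        name: round(vol * factor, 2)
        for name, vol in per_reaction.items()
        if name != template
    }


# Sanity check: the components add up to the stated 50 uL reaction.
total_ul = sum(PCR_MIX_UL.values())
mix = master_mix(PCR_MIX_UL, n_samples=8)
```

Each tube would then receive 16.5 μL of master mix plus 33.5 μL of its own adaptor-added DNA.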

The reaction was carried out under the following reaction conditions:

(a) 98 °C, 30 s

(b) 98 °C, 30 s

(c) 65 °C, 30 s

(d) 72 °C, 1 min

(e) repeat steps (b)-(d) 3-9 times (4-10 cycles in total)

(f) 72 °C, 5 min

(g) hold at 4 °C
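The cycling programme above can be written down as data, which makes the total cycle count and the minimum hold time easy to check. The sketch ignores ramp times and the final 4 °C hold, and the representation is an assumption made for illustration.

```python
def program_hold_seconds(initial, cycle_steps, n_cycles, final_extension):
    """Sum of programmed hold times in seconds, excluding ramping."""
    return initial + n_cycles * sum(cycle_steps) + final_extension


# Steps (a)-(f): 98 C 30 s; [98 C 30 s, 65 C 30 s, 72 C 60 s] x n; 72 C 5 min.
hold_times = {
    n_cycles: program_hold_seconds(30, [30, 30, 60], n_cycles, 300)
    for n_cycles in (4, 10)
}
for n_cycles, seconds in hold_times.items():
    print(f"{n_cycles} cycles: {seconds} s of programmed holds")
```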

The PCR product obtained is used in the next step.

60 μL of XP beads was added to the reaction system, the product was purified according to the Agencourt AMPure protocol (Beckman, USA) and dissolved in 25 μL of pure water, and the concentration of the PCR product was measured using a NanoDrop 1000.

Hybridization

a. A double-stranded DNA library of 3.4 μL or more and 100 ng/μL or less was prepared by concentration or a similar method.

b. Prepare Hybridization Buffer (all reagents from Agilent):

c. Prepare SureSelect Oligo Capture Library Mix (all reagents from Agilent) and place on ice:

d. The DNA library of sample SureSelect-SC was added to a PCR tube together with 2 μL of cotDNA (100 ng) and 2 μL of circular blocking oligo 1 (100 μM) or circular blocking oligo 2 (100 μM); after mixing, the mixture was held at 65 °C.

e. Hybridization Buffer was added to the PCR tube as required and mixed well, and hybridization was carried out at 65 °C (heated lid set to 105 °C) for 8 hours.

f. The hybridized sample was adsorbed onto Dynal magnetic beads (Invitrogen), and the captured sequences were eluted with 35 μL of SureSelect Elution Buffer.

Post-capture PCR amplification:

Reagent	Volume (μL)
Captured DNA	33.5
5× Herculase II Reaction Buffer	10
100 mM dNTP Mix	0.5
Herculase II Fusion DNA Polymerase	1
General primer 1 (10 μM)	2.5
General primer 2 (10 μM)	2.5
Total	50

Reaction conditions are as follows:

(a) 98 °C, 2 min

(b) 98 °C, 20 s

(c) 60 °C, 30 s

(d) 72 °C, 30 s

(e) repeat steps (b)-(d) 9-14 times (10-15 cycles in total)

(f) 72 °C, 5 min

(g) hold at 4 °C

50 μL of XP beads was added to the reaction system, the product was purified according to the Agencourt AMPure protocol (Beckman, USA) and dissolved in 25 μL of pure water, and the concentration of the PCR product was measured using a NanoDrop 1000.

Sequencing and data analysis:

After passing quality control, the library was sequenced on the Illumina NextSeq 500 platform with PE150 reads; the resulting data were aligned to the human reference genome, and parameters such as alignment rate and capture efficiency were calculated.
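The text does not define how the comparison (alignment) rate and the capture efficiency are computed. One common convention, assumed here, is alignment rate = mapped reads / total reads and capture efficiency = mapped reads overlapping a target interval / mapped reads. The sketch applies that convention to toy data; all names and numbers are hypothetical.

```python
def overlaps(read, targets):
    """True if the half-open read interval overlaps any target interval."""
    r_start, r_end = read
    return any(r_start < t_end and t_start < r_end for t_start, t_end in targets)


def capture_stats(total_reads, mapped_reads, targets):
    """Return (alignment rate, capture efficiency) under the assumed definitions."""
    on_target = sum(1 for r in mapped_reads if overlaps(r, targets))
    alignment_rate = len(mapped_reads) / total_reads
    capture_efficiency = on_target / len(mapped_reads)
    return alignment_rate, capture_efficiency


# Toy data: 5 reads sequenced, 4 mapped, 3 of those touch a target region.
targets = [(100, 200), (500, 600)]
mapped = [(90, 110), (150, 180), (300, 350), (590, 640)]
alignment_rate, capture_efficiency = capture_stats(5, mapped, targets)
```

Other conventions count on-target bases rather than reads, or extend each target by a flanking window, so figures are only comparable when the same definition is used for both libraries.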

As a result:

The capture efficiency obtained with the present invention was higher than that obtained with the conventional method (76% vs 59%) (FIG. 4); the circular blocking method of this example achieves unexpected technical results.
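To put the two reported efficiencies in perspective, the arithmetic below estimates how much less raw sequencing would be needed for the same amount of on-target data, under the simplifying assumption that on-target yield scales linearly with capture efficiency. Only the 76% and 59% figures come from the text.

```python
circular = 0.76      # capture efficiency with circular blocking (from the text)
conventional = 0.59  # capture efficiency with the conventional method (from the text)

# Extra on-target data obtained per unit of raw sequencing.
relative_gain = circular / conventional - 1

# Reduction in raw sequencing needed to reach the same on-target yield.
raw_data_saving = 1 - conventional / circular

print(f"relative gain: {relative_gain:.1%}, raw data saving: {raw_data_saving:.1%}")
```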

Example 2 Single-stranded library construction + circular blocking

Agilent liquid-phase hybridization system (Agilent) reference example: a single sample was captured with the 50 Mb whole-exon kit (SureSelect Human All Exon 50Mb Kit)

The experimental method comprises the following steps:

End repair

The following reagents were placed in a 1.5 mL centrifuge tube

25 °C for 30 min, then 65 °C for 15 min;

Adaptor ligation

Adding the following reagents into the reaction system in the last step

Reagent	Volume (μL)
DNA from the previous step	50
Ligation buffer	25
Ligase	5
Adaptor (10 μM)	20
Total	100

23 ℃ for 30 min;

100 μL of XP beads was added to the above reaction system and the product was purified according to the Agencourt AMPure protocol (Beckman, USA); the purified product was dissolved in 35 μL of purified water.

Double stranded DNA library preparation

The PCR reaction system and reaction conditions were as follows:

the following reaction system was arranged in a 200ul PCR tube:

the reaction was carried out under the following reaction conditions:

(a).98℃ 30s

(b).98℃ 30s

(c).65℃ 30s

(d).72℃ 1min

(e) repeating steps (b) - (d) 3-9 times (for 4-10 cycles)

(f).72℃ 5min

(g) Standing at 4 deg.C

The PCR product obtained was used directly in the next reaction.

Double-stranded library digestion

The phosphorylated DNA strand was digested with lambda exonuclease (NEB). The following reagents were added to the PCR product:

Reagent	Volume (μL)
Double-stranded DNA	50
10× lambda buffer	6
Lambda exonuclease	1
Water	3
Total	60

Reaction conditions: 37 ℃ for 30 min.
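The digestion mix can be checked the same way: the listed volumes should sum to the stated total, and the 10× buffer should end up at 1×. The sketch assumes the volumes are in μL, which the table does not state explicitly; the names are illustrative.

```python
# Lambda exonuclease digestion from Example 2 (volumes assumed to be in uL).
digestion_mix = {
    "Double-stranded DNA": 50,
    "10x lambda buffer": 6,
    "Lambda exonuclease": 1,
    "Water": 3,
}
buffer_stock_x = 10  # the buffer is supplied as a 10x concentrate

total_volume = sum(digestion_mix.values())

# Final buffer strength after dilution into the full reaction volume.
buffer_final_x = buffer_stock_x * digestion_mix["10x lambda buffer"] / total_volume

print(f"Total volume: {total_volume} uL")
print(f"Final buffer strength: {buffer_final_x}x")
```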

Hybridization

a. A single-stranded DNA library of 3.4 μL or more and 100 ng/μL or less was prepared by concentration or a similar method.

b. Hybridization Buffer was prepared (all reagents from Agilent corporation):

c. A SureSelect Oligo Capture Library Mix (all reagents from Agilent) was prepared and placed on ice:

d. The DNA library of the sample (SureSelect-SC) was added to the PCR tube together with 2 μL of Cot DNA (100 ng) and 2 μL of circular blocking oligo 2 (100 μM); the mixture was mixed and held at 65 ℃.

e. Hybridization Buffer was added to the PCR tube as required, mixed well, and hybridization was carried out at 65 ℃ (heated lid set to 105 ℃) for 8 hours.

Post-capture PCR amplification:

Reaction conditions are as follows:

(a) 98 ℃, 2 min

(b) 98 ℃, 20 s

(c) 60 ℃, 30 s

(d) 72 ℃, 30 s

(e) repeat steps (b)-(d) 9-14 times (10-15 cycles in total)

(f) 72 ℃, 5 min

(g) hold at 4 ℃

Sequencing and data analysis:

After passing quality control, the library was sequenced on an Illumina NextSeq 500 platform with PE150 reads. The resulting data were aligned to the human reference genome, and parameters such as alignment rate and capture efficiency were calculated.
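The text does not define how the two statistics are computed. The sketch below uses one common read-based convention as an assumption: alignment rate is aligned reads over total reads, and capture efficiency is on-target reads over aligned reads. The class, the function names and the read counts are hypothetical placeholders, not data from this example.

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    """Read counts for one sequenced library (hypothetical container)."""
    total_reads: int
    aligned_reads: int
    on_target_reads: int


def alignment_rate(counts: RunCounts) -> float:
    # Fraction of all reads that align to the reference genome.
    return counts.aligned_reads / counts.total_reads


def capture_efficiency(counts: RunCounts) -> float:
    # Assumed definition: fraction of aligned reads overlapping probe targets.
    return counts.on_target_reads / counts.aligned_reads


# Placeholder numbers chosen only to exercise the functions.
example = RunCounts(total_reads=1_000_000, aligned_reads=980_000, on_target_reads=735_000)
print(f"Alignment rate: {alignment_rate(example):.1%}")
print(f"Capture efficiency: {capture_efficiency(example):.1%}")
```

Base-level definitions (on-target bases over aligned bases) are also in use, so the convention should be fixed before comparing figures between methods.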

Results:

The GC bias obtained with circular blocking combined with single-stranded library construction was lower than that obtained with the circular blocking method alone (FIG. 6).

Table 1: Circular blocking oligos

N: sample index

Table 2: Sequencing adapters and amplification primers

I: sample index

phos #: 5'-terminal phosphorylation

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Sequence listing

<110> Shenzhen Acegen Technology Co., Ltd.

<120> Construction method and application of high-throughput sequencing library

<160> 6

<170> SIPOSequenceListing 1.0

<210> 1

<211> 128

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

tctagccttc tcgcagcaca tccctttctc acacacatct agagccacca gcggcatagt 60

aagttcgtct tctgccgtat gctctannnn nnnncactga cctcaagtct gcacacgaga 120

aggctaga 128

<210> 2

<211> 128

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

tctagccttc tcgtgtgcag acttgaggtc agtgnnnnnn nntagagcat acggcagaag 60

acgaacttac tatgccgctg gtggctctag atgtgtgtga gaaagggatg tgctgcgaga 120

aggctaga 128

<210> 3

<211> 65

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

gatcggaaga gcacacgtct gaactccagt cacnnnnnnn natctcgtat gccgtcttct 60

gcttg 65

<210> 4

<211> 62

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

aatgatacgg cgaccaccga gatctacaca cactctttcc ctacacgacg ctcttccgat 60

ct 62

<210> 5

<211> 27

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

aatgatacgg cgaccaccga gatctac 27

<210> 6

<211> 24

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

caagcagaag acggcatacg agat 24
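Two structural properties of the 128-nt sequences SEQ ID NO: 1 and 2 can be checked directly from the listing: whether one is the reverse complement of the other, and how many bases at the 5' terminus can pair with the 3' terminus of the same oligo. The sketch below is only an observation on the listed sequences, not a statement of how the oligos were designed; the function names are illustrative, and `n` (the sample index positions) is treated as pairing with `n`.

```python
# SEQ ID NO: 1 and 2, copied from the sequence listing (n = sample index).
SEQ1 = "".join("""
tctagccttc tcgcagcaca tccctttctc acacacatct agagccacca gcggcatagt
aagttcgtct tctgccgtat gctctannnn nnnncactga cctcaagtct gcacacgaga
aggctaga
""".split())
SEQ2 = "".join("""
tctagccttc tcgtgtgcag acttgaggtc agtgnnnnnn nntagagcat acggcagaag
acgaacttac tatgccgctg gtggctctag atgtgtgtga gaaagggatg tgctgcgaga
aggctaga
""".split())

# Watson-Crick complement table; n is mapped to n (assumption for this check).
COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def terminal_stem_length(seq: str) -> int:
    """Count 5'-terminal bases that pair, antiparallel, with the 3' terminus."""
    length = 0
    while (length < len(seq) // 2
           and seq[length].translate(COMPLEMENT) == seq[-1 - length]):
        length += 1
    return length


print(len(SEQ1), len(SEQ2))              # declared <211> length is 128
print(reverse_complement(SEQ1) == SEQ2)  # are the two oligos reverse complements?
print(terminal_stem_length(SEQ1), terminal_stem_length(SEQ2))
```

Run on the sequences as listed, the check shows that SEQ ID NO: 2 is the reverse complement of SEQ ID NO: 1 and that each oligo has 13 mutually complementary bases at its two termini.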

Claims

1. A method for constructing a high throughput sequencing library, comprising the steps of:

fragmenting genomic DNA so as to obtain DNA fragments;

end-repairing the DNA fragments to obtain end-repaired DNA fragments;

adding a base A to the 3' end of the end-repaired DNA fragment to obtain a DNA fragment having a cohesive end A;

ligating the DNA fragment having the cohesive end A with a linker so as to obtain a ligation product;

digesting the DNA library by using exonuclease to obtain a single-stranded DNA library;

mixing the single-stranded DNA library with blocking oligonucleotides and specific probes for hybridization capture, wherein the blocking oligonucleotides circularly block the complete linker and tag sequences introduced at the two ends of the DNA library, and the specific probes perform hybridization capture on the ligation products so as to obtain target fragments; the blocking oligonucleotide is designed for the linker and tag sequences, and its two ends are respectively complementarily paired with the complete linker and tag sequences at the two ends of the DNA library and are connected to form a closed loop so as to achieve circular blocking; the length of the specific probe is 80-120 mer;

the hybridization capture is carried out for 6-8 h;

adsorbing and washing the hybridized and captured solution by using magnetic beads with streptavidin;

carrying out PCR amplification on the obtained target fragment to obtain an amplification product;

separating and purifying the amplification products, wherein the amplification products form the high-throughput sequencing library.

2. A method of sequencing a target DNA sequence of a sample, comprising the steps of:

constructing a high-throughput sequencing library of target DNA fragments of the sample according to the method of claim 1;

and sequencing the high-throughput sequencing library of the target DNA sequence of the sample to obtain a sequencing result.

3. The method of claim 2, wherein the sequencing is performed using a high throughput sequencing technique.

4. The method of claim 3, wherein the sequencing is performed using a HiSeq 2000 sequencer.