CN106350590A

CN106350590A - DNA library construction method for high-throughput sequencing

Info

Publication number: CN106350590A
Application number: CN201610807018.1A
Authority: CN
Inventors: 王旭; 钟嘉泳; 张弓; 董鸣; 余卓
Original assignee: Chaintech Medical (shenzhen) Technology Co Ltd
Current assignee: Chaintech Medical (shenzhen) Technology Co Ltd
Priority date: 2016-09-06
Filing date: 2016-09-06
Publication date: 2017-01-25
Anticipated expiration: 2036-09-06
Also published as: CN106350590B

Abstract

The invention discloses a DNA library construction method for high-throughput sequencing. The method comprises performing multiplex PCR (polymerase chain reaction) on a DNA template obtained from a sample to be tested using multiple pairs of fusion primers, then recovering the PCR products to obtain a sequencing library. The multiple pairs of fusion primers target different target fragments of the DNA template, and each fusion primer comprises a specific primer sequence for the target fragment and a sequencing adapter sequence. The multiplex PCR is carried out under the following conditions: 95 °C for 2 minutes; 38 cycles, each cycle consisting of 95 °C for 30 seconds, slow cooling from 76 °C to 55-58 °C at 0.1 °C per second, holding for 20 seconds once the temperature has dropped to 58 °C, then 72 °C for 30 seconds; followed by 72 °C for 2 minutes and a final hold at 4 °C. The invention further discloses the application of the library construction method to the detection of STR loci and to paternity testing based on high-throughput sequencing.

Description

DNA library construction method for high-throughput sequencing

Technical field

The present invention relates to the field of biological detection, and in particular to a DNA library construction method for high-throughput sequencing.

Background art

The object of research and analysis in forensic DNA testing is mainly the polymorphism of DNA in living organisms. In general, the position of a gene or a non-coding DNA marker on a chromosome is called a locus, and the different sequence variants found at the same locus are called alleles. DNA polymorphism means that a locus used as a genetic marker has multiple alleles; analyzing the allelic differences at a locus is the biological basis of individual identification [1].

Short tandem repeats (STRs) [2] in microsatellite DNA are a type of DNA polymorphism that is widely distributed across the 23 pairs of chromosomes of the human genome. An STR generally consists of repeat units of 2-6 bp (also called the core sequence), and the difference between individuals (or alleles) typically appears only as a difference in the number of repeat units. STR loci are important genetic markers: highly polymorphic loci have strong power for individual identification and can be detected simply by PCR, so they are widely used. STR loci have the following characteristics: polymorphic STR loci are numerous; fragment lengths are generally less than 400 bp, so they are easy to amplify and suitable for testing trace samples; allele sizes are relatively close, so preferential amplification of smaller alleles is not pronounced; and fragment lengths differ little between STR loci, which facilitates multiplex amplification and improves detection efficiency.

STR typing plays an important role in forensic DNA testing and paternity testing. Most paternity testing laboratories in China use typing of a limited number of STR loci as their main method: the polymorphisms of different STR loci are analyzed and compared, and the paternity index (PI value, i.e., the ratio of the probability of true paternity to the probability of false paternity) is calculated. The standard adopted by most laboratories worldwide [3] (PI value ≥ 10 000) is used as the threshold for deciding whether a biological relationship exists.
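As a minimal sketch of how the threshold above might be applied: the text defines the PI as a ratio of two probabilities and cites 10 000 as the cut-off. The sketch assumes, as is common practice but not stated in this document, that per-locus PI values for independent loci are multiplied into a combined index. The function names and the example values are hypothetical.

```python
from math import prod

PI_THRESHOLD = 10_000  # cut-off cited in the text (PI value >= 10 000)


def combined_paternity_index(locus_pis):
    """Multiply per-locus PI values (assumption: the loci are independent)."""
    return prod(locus_pis)


def supports_paternity(locus_pis, threshold=PI_THRESHOLD):
    """Return True when the combined PI reaches the threshold."""
    return combined_paternity_index(locus_pis) >= threshold


# Hypothetical per-locus PI values, for illustration only.
example_pis = [2.0, 5.0, 10.0, 10.0, 10.0]
print(combined_paternity_index(example_pis), supports_paternity(example_pis))
```

Real per-locus PI values depend on the genotypes of the tested individuals and on population allele frequencies, neither of which is given here.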

Traditional STR typing is mostly based on PCR and capillary electrophoresis: the sample is amplified by multiplex PCR with fluorescently labeled primers, and the amplified fragments of different sizes are separated by capillary electrophoresis to achieve typing. However, this method has relatively low throughput and, because it analyzes fragment length, it cannot detect fine differences in the primary structure of the nucleic acid, which limits the resolution of detection. In addition, when mixed samples are present, stutter peaks (small peaks appearing before the main peak in fragment analysis) may interfere. Sanger sequencing, for reasons of throughput and cost, is difficult to use for STR typing.

Next-generation high-throughput sequencing (next generation sequencing, NGS) can sequence hundreds of thousands to millions of DNA molecules in parallel in a single run, and has driven the rapid development of sequencing technology. Next-generation sequencing can be used to sequence whole genomes. When researchers are interested only in specific genomic regions, amplicon sequencing can be used to sequence only those regions: primers are designed for the genomic regions of interest, the target regions are enriched by PCR amplification, a library is built from the PCR products of specific length or from captured fragments, and the library is sequenced at high throughput and analyzed.

Current amplicon sequencing uses conventional library construction: after the target regions are enriched by PCR amplification, the samples are pooled, followed by end repair and A-tailing, adapter ligation, sample purification, recovery of the target band, another PCR amplification, and gel recovery of the main band. This conventional procedure involves many steps, is time-consuming and labor-intensive, and is unsuitable for sequencing library construction from large numbers of samples.

Summary of the invention

An object of the present invention is to provide a method for constructing an amplicon sequencing library, and the application of this library construction method to the detection of STR loci by high-throughput sequencing. The method constructs the amplicon sequencing library in a single amplification, which simplifies the library construction workflow and reduces its time and cost without affecting sequencing quality.

According to one aspect of the present invention, a method for constructing an amplicon sequencing library is provided, comprising performing multiplex PCR on a DNA template obtained from a sample to be tested using multiple pairs of fusion primers, and recovering the PCR products to obtain a sequencing library; wherein the multiple pairs of fusion primers target different target fragments in the DNA template, and each fusion primer comprises, in order from the 5' end to the 3' end, a sequencing adapter sequence and a specific primer sequence for the target fragment.

In a preferred embodiment, the multiplex PCR is carried out under the following reaction conditions: 95 °C for 2 min; 38 cycles, each cycle being 95 °C for 30 s, then slow cooling from 76 °C to any temperature between 55 °C and 58 °C at 0.1 °C per second, holding for 20 s after the target temperature is reached, then 72 °C for 30 s; 72 °C for 2 min.

In the present invention, "multiple pairs of fusion primers" means two, three, or more pairs of fusion primers.

In a more preferred embodiment, the multiplex PCR is carried out under the following reaction conditions: 95 °C for 2 min; 38 cycles, each cycle being 95 °C for 30 s, then slow cooling from 76 °C to 58 °C at 0.1 °C per second, holding for 20 s after 58 °C is reached, then 72 °C for 30 s; 72 °C for 2 min.
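To make the timing of the slow-cooling program concrete, the following sketch writes the stated conditions as data and sums the programmed step times. It is an illustration only: the function names are hypothetical, and the time the instrument needs to move between steps (for example from 95 °C to 76 °C) is not given in the text and is not counted.

```python
SECONDS_PER_DEGREE = 10  # 0.1 °C per second equals 10 s per °C


def cycle_steps(anneal_temp=58):
    """One cycle as (label, start °C, end °C, seconds); anneal_temp may be 55-58 °C."""
    if not 55 <= anneal_temp <= 58:
        raise ValueError("target temperature must be between 55 and 58 °C")
    ramp_seconds = (76 - anneal_temp) * SECONDS_PER_DEGREE
    return [
        ("denature", 95, 95, 30),
        ("slow cool", 76, anneal_temp, ramp_seconds),
        ("hold", anneal_temp, anneal_temp, 20),
        ("extend", 72, 72, 30),
    ]


def programmed_seconds(anneal_temp=58, cycles=38):
    """Sum of the stated step times: 2 min at 95 °C, the cycles, 2 min at 72 °C."""
    per_cycle = sum(step[3] for step in cycle_steps(anneal_temp))
    return 120 + cycles * per_cycle + 120


print(programmed_seconds())    # target temperature 58 °C
print(programmed_seconds(55))  # target temperature 55 °C
```

With the 58 °C target, the slow-cooling ramp alone takes 180 s per cycle, so it dominates the run: the programmed time is 10120 s, a little under three hours.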

In some embodiments, the sample to be tested includes, but is not limited to, blood, body fluid, saliva, semen, hair, muscle, or tissues and organs. In a preferred embodiment, the DNA template is genomic DNA extracted from the sample to be tested.

In the present invention, each pair of fusion primers comprises a specific forward primer sequence and a specific reverse primer sequence for the target fragment.

In some embodiments, the sequencing adapter sequence contains an index sequence that identifies different DNA templates, so that different samples can be pooled for sequencing.

The sequencing adapter sequences contained in the fusion primers of the present invention may be: 5'-cctctctatgggcagtcggtgat-3' in the forward primer, and 5'-ccatctcatccctgcgtgtctccgactcag-index sequence-gat-3' in the reverse primer.

The sequencing adapter sequences contained in the fusion primers of the present invention may also be: 5'-aatgatacggcgaccaccgagatctacac-first index sequence-acactctttccctacacgacgctcttccgatct-3' in the forward primer, and 5'-caagcagaagacggcatacgagat-second index sequence-gtgactggagttcagacgtgtgctcttccgatct-3' in the reverse primer.

The first index sequence and the second index sequence are different index sequences.
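The fusion primer layout described above (adapter, optional index, then target-specific primer, read 5' to 3') amounts to string concatenation. The sketch below assembles a pair using the first adapter variant. The function names are hypothetical; the TPOX primers are taken from this document's primer list, and the index ctaaggtaac is the ten-base segment that appears between the adapter prefix and gat in the Example 1 reverse primer.

```python
FORWARD_ADAPTER = "cctctctatgggcagtcggtgat"
REVERSE_ADAPTER_PREFIX = "ccatctcatccctgcgtgtctccgactcag"
REVERSE_ADAPTER_SUFFIX = "gat"


def is_dna(seq):
    """True for a non-empty string made only of a, c, g, t (any case)."""
    return bool(seq) and set(seq.lower()) <= set("acgt")


def build_fusion_pair(forward_specific, reverse_specific, index):
    """Return (forward, reverse) fusion primers, both written 5'->3'."""
    for seq in (forward_specific, reverse_specific, index):
        if not is_dna(seq):
            raise ValueError(f"not a DNA sequence: {seq!r}")
    forward = FORWARD_ADAPTER + forward_specific.lower()
    reverse = (REVERSE_ADAPTER_PREFIX + index.lower()
               + REVERSE_ADAPTER_SUFFIX + reverse_specific.lower())
    return forward, reverse


fwd, rev = build_fusion_pair(
    "aggcacttagggaaccctc",     # TPOX specific forward primer
    "tccttgtcagcgtttatttgcc",  # TPOX specific reverse primer
    "ctaaggtaac",              # index segment seen in the Example 1 reverse primer
)
print(fwd)
print(rev)
```

Giving each sample its own index while keeping the specific primers unchanged is what allows libraries from different samples to be pooled in one run.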

In some embodiments, the multiplex PCR is completed in one or more reaction systems.

In a preferred embodiment, one reaction system contains two, three, four, or five pairs of fusion primers targeting different target fragments.

In a preferred embodiment, the target fragments are STR loci.

In a preferred embodiment, the target fragments include two or more STR loci selected from the following STR loci:

CSF1PO (GenBank X14720), FGA (GenBank M64982), TH01 (GenBank D00269), TPOX (GenBank M68651), D3S1358 (NT_005997, positions 754983-755121), D5S818 (GenBank AC008512), D7S820 (GenBank AC004848), D8S1179 (GenBank AF216671), D13S317 (GenBank AL353628), D16S539 (GenBank AC024591), D18S51 (GenBank AP001534), D21S11 (GenBank AP000433), D2S1338 (GenBank AC010136), CD4 (GenBank M86525), D12S391 (GenBank M08921), PLA2A1 (GenBank M22970), FABP (GenBank M18079), D18S865 (GenBank AC012289.12) and vWA (GenBank M25858).

In a more preferred embodiment, the target fragments include all of the STR loci in one of the following sets:

CSF1PO (GenBank X14720), FGA (GenBank M64982), TH01 (GenBank D00269), TPOX (GenBank M68651), D3S1358 (NT_005997, positions 754983-755121), D5S818 (GenBank AC008512), D7S820 (GenBank AC004848), D8S1179 (GenBank AF216671), D13S317 (GenBank AL353628), D16S539 (GenBank AC024591), D18S51 (GenBank AP001534), D21S11 (GenBank AP000433), D2S1338 (GenBank AC010136), CD4 (GenBank M86525), D12S391 (GenBank M08921), PLA2A1 (GenBank M22970) and FABP (GenBank M18079).

CSF1PO (GenBank X14720), FGA (GenBank M64982), TH01 (GenBank D00269), TPOX (GenBank M68651), D3S1358 (NT_005997, positions 754983-755121), D5S818 (GenBank AC008512), D7S820 (GenBank AC004848), D8S1179 (GenBank AF216671), D13S317 (GenBank AL353628), D16S539 (GenBank AC024591), D18S51 (GenBank AP001534), D21S11 (GenBank AP000433), D2S1338 (GenBank AC010136), CD4 (GenBank M86525), D12S391 (GenBank M08921), D18S865 (GenBank AC012289.12) and vWA (GenBank M25858).

The specific forward and reverse primer sequences for each STR locus are as follows:

CSF1PO: forward 5'-3': tagcaggttgctaaccaccc, reverse 5'-3': tcagaccctgttctaagtacttc;

FGA: forward 5'-3': cccataggttttgaactcacag, reverse 5'-3': gtgatttgtctgtaattgccagc;

TH01: forward 5'-3': gggcaaaattcaaagggtatctg, reverse 5'-3': tgcaggtcacagggaacac;

TPOX: forward 5'-3': aggcacttagggaaccctc, reverse 5'-3': tccttgtcagcgtttatttgcc;

D3S1358: forward 5'-3': caagaccctgtctcatagatag, reverse 5'-3': tcaacagaggcttgcatgtatc;

D5S818: forward 5'-3': gtgacaagggtgattttcctctt, reverse 5'-3': gtgattccaatcatagccacag;

D7S820: forward 5'-3': ggtcaggctgactatggag, reverse 5'-3': tcctcattgacagaattgcacc;

D8S1179: forward 5'-3': tctttttgcccacacggcc, reverse 5'-3': ctgtagattattttcactgtgggg;

D13S317: forward 5'-3': atttctttagtgggcatccgtg, reverse 5'-3': ccttcaacttgggttgagcc;

D16S539: forward 5'-3': cagatcccaagctcttcctc, reverse 5'-3': gcatgtatctatcatccatctctg;

D18S51: forward 5'-3': cacttcactctgagtgacaaattg, reverse 5'-3': gtgtggagatgtcttacaataacag;

D21S11: forward 5'-3': tcaattccccaagtgaattgcc, reverse 5'-3': tgttctccagagacagactaatag;

D2S1338: forward 5'-3': gtggatttggaaacagaaatggc, reverse 5'-3': gtggcccataatcatgagttattc;

CD4: forward 5'-3': aggggtacttgtgttaattgttgg, reverse 5'-3': gcgttttccagtctgaaaaaagtg;

D12S391: forward 5'-3': gaatcaacaggatcaatggatgc, reverse 5'-3': cctccatatcacttgagctaattc;

FABP: forward 5'-3': ttgtaagctccatgaggttagag, reverse 5'-3': agcctccctaggtcagatag;

PLA2A1: forward 5'-3': tagtatcagtttcatagggtcacc, reverse 5'-3': agttcgtttccattgtctgtcc;

D18S865: forward 5'-3': caaatgtagatcttgggacttgtc, reverse 5'-3': attctcaaacatccccattaccttc;

vWA: forward 5'-3': tcagtatgtgacttggattgatc, reverse 5'-3': caggttagatagattagacagacag.

In a preferred embodiment, the multiplex PCR is completed in one or more reaction systems. The reaction systems include one or more reaction systems selected from the following group:

a reaction system comprising the fusion primers for the TPOX locus and the fusion primers for the D18S51 locus;

a reaction system comprising the fusion primers for the D8S1179 locus and the fusion primers for the D21S11 locus;

a reaction system comprising the fusion primers for the TPOX locus, the fusion primers for the D18S51 locus, the fusion primers for the D8S1179 locus and the fusion primers for the D21S11 locus;

a reaction system comprising the fusion primers for the CSF1PO locus, the fusion primers for the D3S1358 locus, the fusion primers for the D13S317 locus and the fusion primers for the D12S391 locus;

a reaction system comprising the fusion primers for the FGA locus, the fusion primers for the TH01 locus, the fusion primers for the D5S818 locus and the fusion primers for the PLA2A1 locus;

a reaction system comprising the fusion primers for the FGA locus, the fusion primers for the TH01 locus, the fusion primers for the D5S818 locus and the fusion primers for the vWA locus;

a reaction system comprising the fusion primers for the D7S820 locus, the fusion primers for the D16S539 locus, the fusion primers for the D2S1338 locus, the fusion primers for the CD4 locus and the FABP primers; and

a reaction system comprising the fusion primers for the D7S820 locus, the fusion primers for the D16S539 locus, the fusion primers for the D2S1338 locus, the fusion primers for the CD4 locus and the fusion primers for the D18S865 locus.

In a more preferred embodiment, the reaction systems include one or more reaction systems selected from the following group:

a reaction system comprising 1.5 μM fusion primers for the TPOX locus and 3.5 μM fusion primers for the D18S51 locus;

a reaction system comprising 1.5 μM fusion primers for the D8S1179 locus and 3.5 μM fusion primers for the D21S11 locus;

a reaction system comprising 1 μM fusion primers for the TPOX locus, 1.5 μM fusion primers for the D18S51 locus, 1 μM fusion primers for the D8S1179 locus and 1.5 μM fusion primers for the D21S11 locus;

a reaction system comprising 1 μM fusion primers for the CSF1PO locus, 2 μM fusion primers for the D3S1358 locus, 1 μM fusion primers for the D13S317 locus and 1 μM fusion primers for the D12S391 locus;

a reaction system comprising 1 μM fusion primers for the CSF1PO locus, 1 μM fusion primers for the D3S1358 locus, 1 μM fusion primers for the D13S317 locus and 2 μM fusion primers for the D12S391 locus;

a reaction system comprising 1 μM fusion primers for the FGA locus, 1.5 μM fusion primers for the TH01 locus, 1.5 μM fusion primers for the D5S818 locus and 1 μM fusion primers for the PLA2A1 locus;

a reaction system comprising 1.5 μM fusion primers for the FGA locus, 1 μM fusion primers for the TH01 locus, 1.5 μM fusion primers for the D5S818 locus and 1 μM fusion primers for the vWA locus;

a reaction system comprising 1 μM fusion primers for the D7S820 locus, 1 μM fusion primers for the D16S539 locus, 1 μM fusion primers for the D2S1338 locus, 1 μM fusion primers for the CD4 locus and 1 μM FABP primers; and

a reaction system comprising 1 μM fusion primers for the D7S820 locus, 1 μM fusion primers for the D16S539 locus, 1 μM fusion primers for the D2S1338 locus, 1 μM fusion primers for the CD4 locus and 1 μM fusion primers for the D18S865 locus.
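When a panel is split across several reaction systems, it helps to confirm that every locus is amplified exactly once. The sketch below encodes one possible selection of five of the listed systems and checks it against the 17-locus panel that ends with PLA2A1 and FABP. The selection, the group labels, and the function name are assumptions made for illustration; the concentrations are in μM as read from the list above.

```python
from collections import Counter

# One possible selection of five listed reaction systems (labels A-E are hypothetical).
GROUPS = {
    "A": {"TPOX": 1.5, "D18S51": 3.5},
    "B": {"D8S1179": 1.5, "D21S11": 3.5},
    "C": {"CSF1PO": 1.0, "D3S1358": 2.0, "D13S317": 1.0, "D12S391": 1.0},
    "D": {"FGA": 1.0, "TH01": 1.5, "D5S818": 1.5, "PLA2A1": 1.0},
    "E": {"D7S820": 1.0, "D16S539": 1.0, "D2S1338": 1.0, "CD4": 1.0, "FABP": 1.0},
}

PANEL = {
    "CSF1PO", "FGA", "TH01", "TPOX", "D3S1358", "D5S818", "D7S820", "D8S1179",
    "D13S317", "D16S539", "D18S51", "D21S11", "D2S1338", "CD4", "D12S391",
    "PLA2A1", "FABP",
}


def coverage_problems(groups, panel):
    """Return (missing, duplicated, unexpected) loci for a set of reaction systems."""
    counts = Counter(locus for primers in groups.values() for locus in primers)
    covered = set(counts)
    missing = sorted(panel - covered)
    duplicated = sorted(locus for locus, n in counts.items() if n > 1)
    unexpected = sorted(covered - panel)
    return missing, duplicated, unexpected


print(coverage_problems(GROUPS, PANEL))
```

Three empty lists mean the selection covers the panel once and only once.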

Other STR loci may also be selected as appropriate. Alternative STR loci can be found at http://www.cstl.nist.gov/biotech/strbase/seq_info.htm, an authoritative United States forensic science website that lists the STR loci used in forensic kinship identification. Reference sequence information for these STR loci can be obtained from http://www.cstl.nist.gov/biotech/strbase/seq_ref.htm.

The method for constructing an amplicon sequencing library is applicable to high-throughput sequencing of any amplicon; for example, it can be used to detect STR loci, SNPs, SNVs, and insertion/deletion mutations within a certain size range by high-throughput sequencing, but is not limited thereto.

Another aspect of the present invention provides a method for amplicon high-throughput sequencing, comprising: preparing a sequencing library by the above amplicon sequencing library construction method, then performing high-throughput sequencing and analyzing the sequencing data.

The target fragments of the amplicon high-throughput sequencing include, but are not limited to, fragments comprising STR loci or SNP sites. In a preferred embodiment, the target fragments of the amplicon high-throughput sequencing are multiple STR loci.

Another aspect of the present invention provides a method for detecting STR loci, comprising: preparing a sequencing library for multiple STR loci by the above amplicon sequencing library construction method, then performing high-throughput sequencing and analyzing the sequencing data.

In a preferred embodiment, the data analysis includes obtaining the repeat counts of the core repeat sequences of the multiple STR loci.
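One simple way to obtain a repeat count from a single read is to find the longest uninterrupted run of the core repeat unit. The sketch below does this with a regular expression. It is not the analysis pipeline of this document, which is not described in detail: the function name, the example read, and the 4-bp unit are hypothetical, and loci with interrupted or compound repeats would need a more careful model.

```python
import re


def longest_repeat_run(read, unit):
    """Largest number of consecutive copies of `unit` found in `read`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+", re.IGNORECASE)
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(read)]
    return max(runs, default=0)


# Hypothetical read: flanking bases around seven copies of a hypothetical unit.
read = "ggtatc" + "aatg" * 7 + "agggaa"
print(longest_repeat_run(read, "aatg"))
```

Applying such a function to every read assigned to a locus yields the per-read repeat counts from which alleles are then called.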

In some embodiments, the data analysis further includes analyzing the stutter bands caused by strand slippage mispairing.

In some embodiments, the data analysis further includes obtaining mutation information for the multiple STR loci.

In some embodiments, the amplicon high-throughput sequencing method and the STR locus detection method of the present invention are for non-diagnostic purposes.

The method for detecting STR loci of the present invention can be used in forensic science and in other fields involving STR locus detection, such as genetic diagnosis, genetic map construction, and population genetics research. In a preferred embodiment, the method for detecting STR loci of the present invention is used for paternity testing.

The present invention also provides a paternity testing method, comprising: determining the repeat counts of the STR loci in the genome by the above STR locus detection method, and then calculating the paternity index (PI value).

The beneficial effects of the present invention are:

(1) The entire library construction process of the present invention is completed with a single multiplex PCR, which greatly reduces the time and cost of building an amplicon sequencing library. When applied to STR genotyping and kinship identification, it simplifies the workflow and reduces cost.

(2) The amplicon library construction method of the present invention uses optimized multiplex PCR reaction conditions and is suitable for high-throughput sequencing of various amplicons. It is not selective with respect to primers: with different primers, a library can still be built in a single amplification by the method of the present invention.

Brief description of the drawings

Fig. 1 is the agarose gel electrophoresis image of the multiplex PCR products in Example 1.

Fig. 2 is the library quality control trace in Example 1.

Fig. 3 shows the distribution of repeat counts in the sequencing data for each STR in Example 1.

Fig. 4 is the polyacrylamide gel electrophoresis (PAGE) image of the multiplex PCR products in Example 2.

Fig. 5 is the library quality control trace in Example 2.

Detailed description

For a clearer understanding of the technical content of the present invention, the following examples are described with reference to the accompanying drawings. It should be understood that these examples only illustrate the invention and are not intended to limit its scope. Experimental methods in the following examples for which specific conditions are not stated were generally carried out under conventional conditions or under the conditions recommended by the manufacturer. The chemical reagents and biological preparations used in the examples are all commercially available products.

Example 1: Forensic kinship identification

The purpose of this example is to detect the repeat counts of the repeat sequences in STRs using an NGS sequencing scheme.

1. Collection and processing of oral epithelial cells

(1) Human oral epithelial cells were collected using a buccal swab collection tube.

(2) DNA was extracted from the oral epithelial cells using a buccal swab genomic DNA extraction kit (DP322).

(3) OD260 and OD280 were measured with a NanoDrop 2000 spectrophotometer to confirm that the purity was high and to determine the concentration.

2. Primer design for multiplex PCR

(1) The primers were designed as fusion primers, comprising the specific primer sequence for the target fragment and a sequencing adapter sequence containing an index sequence.

(2) The fusion primer structure is:

Forward primer: 5'-cctctctatgggcagtcggtgat-3' + target-specific forward primer;

Reverse primer: 5'-ccatctcatccctgcgtgtctccgactcagctaaggtaacgat-3' + target-specific reverse primer;

wherein the index sequence (underlined in the original) is ctaaggtaac, the segment between ccatctcatccctgcgtgtctccgactcag and gat.

(3) The STR loci to be amplified and sequenced and their corresponding specific primers are shown in Table 1.

Table 1. STR loci and corresponding specific primers

STR locus	Forward primer (5'-3')	Reverse primer (5'-3')
CSF1PO	tagcaggttgctaaccaccc	tcagaccctgttctaagtacttc
FGA	cccataggttttgaactcacag	gtgatttgtctgtaattgccagc
TH01	gggcaaaattcaaagggtatctg	tgcaggtcacagggaacac
TPOX	aggcacttagggaaccctc	tccttgtcagcgtttatttgcc
D3S1358	caagaccctgtctcatagatag	tcaacagaggcttgcatgtatc
D5S818	gtgacaagggtgattttcctctt	gtgattccaatcatagccacag
D7S820	ggtcaggctgactatggag	tcctcattgacagaattgcacc
D8S1179	tctttttgcccacacggcc	ctgtagattattttcactgtgggg
D13S317	atttctttagtgggcatccgtg	ccttcaacttgggttgagcc
D16S539	cagatcccaagctcttcctc	gcatgtatctatcatccatctctg
D18S51	cacttcactctgagtgacaaattg	gtgtggagatgtcttacaataacag
D21S11	tcaattccccaagtgaattgcc	tgttctccagagacagactaatag
D2S1338	gtggatttggaaacagaaatggc	gtggcccataatcatgagttattc
CD4	aggggtacttgtgttaattgttgg	gcgttttccagtctgaaaaaagtg
D12S391	gaatcaacaggatcaatggatgc	cctccatatcacttgagctaattc
FABP	ttgtaagctccatgaggttagag	agcctccctaggtcagatag
PLA2A1	tagtatcagtttcatagggtcacc	agttcgtttccattgtctgtcc

3. Library construction by multiplex PCR and agarose gel electrophoresis

The multiplex PCR was performed in five groups; the grouping and primer concentrations are shown in Table 2.

Table 2. Multiplex PCR primer grouping

The reagent used for the multiplex PCR was DreamTaq Green PCR Master Mix (2X) (Thermo Fisher, Cat. No. K1081); the reaction system is shown in Table 3:

Table 3. Multiplex PCR reaction system

The multiplex PCR was carried out using the slow-cooling method; the reaction conditions are shown in Table 4.

Table 4. Multiplex PCR reaction conditions

The PCR products were analyzed by agarose gel electrophoresis under the following conditions: voltage 120 V, current 400 mA, run time 30 min, gel concentration 1.5%. The marker bands, from bottom to top, are 100 bp, 200 bp, 300 bp, 400 bp, 500 bp, 700 bp, 1 kb, 1600 bp, 2 kb, 5 kb, 8 kb and 10 kb. The results are shown in Fig. 1: all 5 multiplex PCR groups amplified the target bands, and the target bands all appeared within the expected range.

The five groups of multiplex PCR products were pooled and purified with the SanPrep column PCR product purification kit (Sangon Biotech, Shanghai, Cat. No. B518141) according to the manufacturer's instructions, yielding the DNA library for sequencing. The concentration of the purified DNA library measured with the NanoDrop spectrophotometer was 128.73 ng/μl, with a 260/280 ratio of 1.944.

4. Library quality control

Library quality control was performed on an Agilent 2100. The results are shown in Fig. 2: the multiplex PCR amplification is relatively uniform, with no strongly biased amplification, so the library can be used directly for high-throughput sequencing.

5. High-throughput sequencing

(1) The DNA sequencing library prepared above was sequenced on the Ion Torrent platform (SE400). After sequencing, the results were output as FASTQ files; the sequencing data volume was 67 M, and the number of reads matching STR loci was 490536.

6. Sequencing data processing

(1) The first and last 20-30 bases of each read were extracted to determine which STR locus the read belongs to; reads that could not be assigned to a target STR locus were discarded. The number of reads matched to each STR is shown in Table 5.
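The read-assignment step can be sketched as follows. The document does not say how the first and last 20-30 bases are matched, so this sketch assumes exact matching against the locus-specific primers, with adapter and index bases already trimmed. The function names are hypothetical, and only two loci from the primer list are included to keep the example short.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# (forward, reverse) specific primers copied from the primer list.
PRIMERS = {
    "TPOX": ("aggcacttagggaaccctc", "tccttgtcagcgtttatttgcc"),
    "TH01": ("gggcaaaattcaaagggtatctg", "tgcaggtcacagggaacac"),
}


def assign_locus(read, primers=PRIMERS, window=30):
    """Return the locus whose primers occur in the first and last `window` bases, else None."""
    read = read.lower()
    head, tail = read[:window], read[-window:]
    for locus, (fwd, rev) in primers.items():
        # Read sequenced from the forward primer side.
        if fwd in head and reverse_complement(rev) in tail:
            return locus
        # Read sequenced from the reverse primer side.
        if rev in head and reverse_complement(fwd) in tail:
            return locus
    return None


# Hypothetical amplicon: TPOX primers flanking an arbitrary insert.
amplicon = PRIMERS["TPOX"][0] + "aatg" * 8 + reverse_complement(PRIMERS["TPOX"][1])
print(assign_locus(amplicon))            # TPOX
print(assign_locus("acgtacgtacgt" * 5))  # None, so the read would be discarded
```

Exact matching is the strictest choice; a real pipeline would probably tolerate a few sequencing errors in the flanking bases.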

Table 5. Number of reads assigned to each STR locus

(2) During primer extension in the multiplex PCR, slippage of the primer strand or the template strand causes one repeat unit to form an unpaired loop (the strand slippage mispairing mechanism), which ultimately produces stutter bands. To facilitate analysis of the effect of stutter, a histogram was plotted for each STR, with the repeat counts of the core repeat unit observed in the reads assigned to that STR on the x-axis and the percentage of reads for each repeat count on the y-axis, as shown in Fig. 3 (a-q).
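The histogram described above is a percentage distribution of reads over repeat counts, and allele calls can be read from its dominant bars. The sketch below computes the distribution and applies one simple calling rule. The rule, its 0.5 cut-off, the function names, and the example counts are all hypothetical; the document does not state how alleles were called.

```python
from collections import Counter


def repeat_distribution(repeat_counts):
    """Percentage of reads observed for each repeat count, keyed in ascending order."""
    tally = Counter(repeat_counts)
    total = sum(tally.values())
    if total == 0:
        return {}
    return {n: 100.0 * c / total for n, c in sorted(tally.items())}


def call_alleles(distribution, min_fraction_of_top=0.5):
    """Keep at most two repeat counts whose share is at least
    `min_fraction_of_top` times the largest share (hypothetical rule)."""
    if not distribution:
        return []
    top = max(distribution.values())
    kept = [n for n, pct in distribution.items() if pct >= min_fraction_of_top * top]
    kept.sort(key=lambda n: distribution[n], reverse=True)
    return sorted(kept[:2])


# Hypothetical per-read repeat counts for one locus: two alleles plus minor stutter.
counts = [7] * 45 + [9] * 40 + [6] * 5 + [8] * 10
dist = repeat_distribution(counts)
print(dist)
print(call_alleles(dist))
```

In this example the minor bars at 6 and 8 repeats fall below the cut-off and are treated as stutter, while 7 and 9 are reported as the two alleles.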

(4) The repeat counts at each STR locus in the test sample are shown in Table 6:

Table 6. Repeat counts at each STR locus

STR locus	Repeat count
CSF1PO	10,11
FGA	21,23
TH01	7,9
TPOX	8
D3S1358	16
D5S818	11
D7S820	10
D8S1179	13,14
D13S317	8,10
D16S539	9,10
D18S51	18
D21S11	31,32
D2S1338	19,23
CD4	5
D12S391	18
FABP	10
PLA2A1	11,14

Example 2: Identifying which cell line an unknown cell line belongs to

This example uses 2 unknown human cell lines and determines which cell line each one belongs to by detecting the repeat counts of the repeat sequences in STRs, using an NGS sequencing scheme.

1. Extraction of human cell line DNA

(1) Human cell line DNA was extracted using a peripheral blood DNA kit (Qiagen, Cat. No. 937236).

(2) OD260 and OD280 were measured with a NanoDrop 2000 spectrophotometer to confirm that the purity was high and to determine the concentration; the final concentration was 51 ng/μl.

2. Primer design for multiplex PCR

(2) The fusion primer structure is:

Sample 1:

Forward primer: 5'-aatgatacggcgaccaccgagatctacac-3' + index sequence 1 (ctctctat) + acactctttccctacacgacgctcttccgatct + target-specific forward primer;

Reverse primer: 5'-caagcagaagacggcatacgagat-3' + index sequence 2 (tcgcctta) + gtgactggagttcagacgtgtgctcttccgatct + target-specific reverse primer;

Sample 2:

Forward primer: 5'-aatgatacggcgaccaccgagatctacac-3' + index sequence 1 (tatcctct) + acactctttccctacacgacgctcttccgatct + target-specific forward primer;

Reverse primer: 5'-caagcagaagacggcatacgagat-3' + index sequence 2 (ctagtacg) + gtgactggagttcagacgtgtgctcttccgatct + target-specific reverse primer;

(3) The STR loci to be amplified and sequenced and their corresponding specific primers are shown in Table 7.

Table 7. STR loci and corresponding specific primers

3. Library construction by multiplex PCR and polyacrylamide gel electrophoresis

(1) The multiplex PCR was performed with DreamTaq Green PCR Master Mix (2X) (Thermo Fisher, Cat. No. K1081). The DNA library for the target STRs was built in 4 multiplex PCR groups; the grouping and primer concentrations are shown in Table 8, and the multiplex PCR reaction conditions are the same as in Example 1.

Table 8. Multiplex PCR primer grouping

The multiplex PCR reaction system and reaction conditions are the same as in Example 1.

(2) All multiplex PCR products from the two samples were pooled and analyzed by polyacrylamide gel electrophoresis (PAGE) with a gel concentration of 8% and a total run time of 4 hours: 16 mA for 20 minutes, then 30 mA for 3 hours 40 minutes. The results are shown in Fig. 4: all 4 multiplex PCR groups amplified the target bands, and the target bands all appeared within the expected range.

4. Purification of the PCR products (STR DNA library)

The pooled multiplex PCR products from the two samples were purified according to the instructions of the SanPrep column PCR product purification kit (Sangon Biotech, Shanghai, Cat. No. B518141). The concentration of the purified DNA library measured with the NanoDrop spectrophotometer was 96.44 ng/μl, with a 260/280 ratio of 1.826.

5. Library quality control

(1) Library quality control was performed on an Agilent 2100. The results are shown in Fig. 5: the multiplex PCR amplification is relatively uniform, with no strongly biased amplification, so the library can be used directly for high-throughput sequencing.

6. High-throughput sequencing

(1) The purified target DNA library was sequenced on the Illumina MiSeq platform (PE250). After sequencing, the results were output as FASTQ files; the sequencing data volume was 63.6 M.

(2) The number of reads matching STR loci was 686347.

7. Sequencing data analysis

(1) From the paired-end reads, the read that covers the STR core repeat sequence was selected for analysis, and the read from the other end was discarded. The first and last 20-30 bases of each target read were extracted to determine which STR locus the read belongs to; reads that could not be assigned to a target STR locus were discarded. The number of reads matched to each STR is shown in Table 9:

Table 9. Number of reads assigned to each STR locus

STR locus	Number of assigned reads
CSF1PO	9646
FGA	58728
TH01	55099
TPOX	39217
D3S1358	15413
D5S818	62128
D7S820	108465
D8S1179	57787
D13S317	76639
D16S539	45218
D18S51	13047
D21S11	28294
D2S1338	15417
CD4	22561
D12S391	36707
D18S865	12472
vWA	29509

(2) The effect of the stutter bands produced during PCR was analyzed as in Example 1. The repeat counts of the core repeat units of the STR alleles at each locus of the cell lines were obtained and compared with the STR allele typing results of the corresponding cell lines in the ATCC cell bank (the target STRs analyzed were CSF1PO, D13S317, D16S539, D5S818, D7S820, TH01, TPOX and vWA). The results are shown in Tables 10 and 11.

Table 10. Repeat counts at each STR locus for one unknown cell line

STR locus	Repeat count
CSF1PO	10
D13S317	11,12
D16S539	9,13
D5S818	13
D7S820	10
TH01	8
TPOX	8,11
vWA	16,17

Table 11. Repeat counts at each STR locus for the other unknown cell line

Thus, based on the allele typing results at the above STR loci, the two unknown cell lines were determined to belong to the NCI-H292 cell line and the A549 cell line, respectively.
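The comparison against reference profiles can be sketched as a locus-by-locus match. The observed profile below is transcribed from Table 10. The two reference profiles are hypothetical placeholders, not cell bank data, and the scoring rule (fraction of loci with identical allele sets) and function names are assumptions chosen for simplicity.

```python
# Observed profile transcribed from Table 10.
OBSERVED = {
    "CSF1PO": {10}, "D13S317": {11, 12}, "D16S539": {9, 13}, "D5S818": {13},
    "D7S820": {10}, "TH01": {8}, "TPOX": {8, 11}, "vWA": {16, 17},
}


def matching_fraction(query, reference):
    """Fraction of shared loci at which the two profiles have identical allele sets."""
    shared = [locus for locus in query if locus in reference]
    if not shared:
        return 0.0
    same = sum(query[locus] == reference[locus] for locus in shared)
    return same / len(shared)


def best_match(query, references):
    """Name and score of the reference profile that agrees with the query at most loci."""
    name = max(references, key=lambda n: matching_fraction(query, references[n]))
    return name, matching_fraction(query, references[name])


# Hypothetical reference profiles; real ones would come from a cell bank database.
REFERENCES = {
    "reference_1": dict(OBSERVED),
    "reference_2": {**OBSERVED, "TH01": {6, 9}, "vWA": {14}},
}

print(best_match(OBSERVED, REFERENCES))
```

A profile that agrees at every compared locus scores 1.0; any lower score points to the loci that should be inspected by hand.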

Claims

1. A method for constructing an amplicon sequencing library, comprising performing multiplex PCR on a DNA template obtained from a sample to be tested using multiple pairs of fusion primers, and recovering the PCR products to obtain a sequencing library;

wherein the multiple pairs of fusion primers target different target fragments in the DNA template, and each fusion primer comprises, in order from the 5' end to the 3' end, a sequencing adapter sequence and a specific primer sequence for the target fragment;

wherein the multiplex PCR is carried out under the following reaction conditions: 95 °C for 2 min; 38 cycles, each cycle being 95 °C for 30 s, then slow cooling from 76 °C to any temperature between 55 °C and 58 °C at 0.1 °C per second, holding for 20 s after the target temperature is reached, then 72 °C for 30 s; 72 °C for 2 min.

2. The method of claim 1, wherein the DNA template is genomic DNA extracted from the sample to be tested.

3. The method of claim 1, wherein the multiplex PCR is completed in one or more reaction systems.

4. A method for amplicon high-throughput sequencing for non-diagnostic purposes, comprising: preparing a sequencing library by the method of claim 1 or 2, and then performing high-throughput sequencing.

5. The method of claim 4, wherein the target fragments of the amplicon high-throughput sequencing are multiple STR loci.

6. A method for detecting STR loci for non-diagnostic purposes, comprising: preparing a sequencing library for multiple STR loci by the above method for constructing an amplicon sequencing library, then performing high-throughput sequencing and analyzing the sequencing data.

7. The detection method of claim 6, wherein the data analysis includes obtaining the repeat counts of the core repeat sequences of the multiple STR loci.

8. The method of claim 7, wherein the data analysis further includes analyzing the stutter bands caused by strand slippage mispairing.

9. A paternity testing method, comprising: determining the repeat counts of the STR loci in the genome by the method of claim 7 or 8, and then calculating the paternity index (PI value).