CN104293940B - Method for constructing a sequencing library and application thereof
- Publication number: CN104293940B
- Application number: CN201410521540.4A
- Authority: CN (China)
- Prior art keywords: sequencing data, sequence, sequencing, chain, label
- Legal status: Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1082—Preparation or screening gene libraries by chromosomal integration of polynucleotide sequences, HR-, site-specific-recombination, transposons, viral vectors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Abstract
A method for constructing a sequencing library and its application are disclosed. The method includes: (a) ligating an adapter to each end of a double-stranded DNA fragment to obtain a ligation product; (b) dissociating the ligation product into single-stranded DNA fragments; (c) screening the single-stranded DNA fragments with a probe; (d) performing a strand extension reaction on the single-stranded DNA fragments with a first primer to obtain a strand extension product; and (e) amplifying the strand extension product to obtain an amplification product, which constitutes the sequencing library. Also disclosed are a sequencing method, a method for determining a nucleic acid sequence, a device for constructing a sequencing library, sequencing equipment, and a system for determining a nucleic acid sequence.
Description
Technical field
The present invention relates to the field of biomedicine. Specifically, it relates to a method for constructing a sequencing library, a sequencing method, a method for determining a nucleic acid sequence, a device for constructing a sequencing library, sequencing equipment, and a system for determining a nucleic acid sequence.
Background
High-throughput sequencing is attracting increasing attention, but its ability to detect low-frequency mutations still needs improvement.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the prior art. To this end, according to embodiments of the invention, a method for constructing a sequencing library and means for detecting low-frequency mutations are proposed.
In a first aspect, the invention provides a method for constructing a sequencing library. According to an embodiment of the invention, the method includes: (a) ligating an adapter to each end of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially complementary so as to define a double-stranded region and two single-stranded tails on the adapter, and the first strand contains a first tag sequence located in the sequence of one of the two single-stranded tails; (b) dissociating the ligation product into single-stranded DNA fragments; (c) screening the single-stranded DNA fragments with a probe, wherein the probe specifically recognizes a predetermined region, and the predetermined region comprises one of the following: (1) at least one of the genes shown in Table 1; (2) the CDS region of (1); and (3) a region extending at least 10 bp upstream and downstream of (2); (d) performing a strand extension reaction on the single-stranded DNA fragments with a first primer to obtain a strand extension product, wherein the first primer contains a second tag sequence and is suitable for forming a duplex with the first strand of the adapter, except that there is a mismatch between the first tag sequence and the second tag sequence; and (e) amplifying the strand extension product to obtain an amplification product, which constitutes the sequencing library, wherein the amplification uses primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously.
Thus, using the method for structure sequencing library according to embodiments of the present invention, sequencing library can be effectively built,
Meanwhile, in constructed sequencing library, for every of identical double chain DNA fragment (also referred herein as " source sequence ")
Chain, obtains the amplified production with the first sequence label and the second sequence label respectively, thus, in point of follow-up sequencing result
In analysis, mutual correction can be carried out according to the sequencing result of two kinds of labels, improve the reliability of analysis result.
According to an embodiment of the invention, the double-stranded DNA fragment is obtained through the following steps: performing end repair on a nucleic acid sample to obtain a repaired nucleic acid sample; and adding a base A at the 5' ends of the nucleic acid sample to obtain a nucleic acid sample with a sticky-end base A at each end, which constitutes the double-stranded DNA fragment. Adapters can then be added conveniently to both ends of the double-stranded DNA fragment in subsequent operations, improving the efficiency of library construction.
According to an embodiment of the invention, the nucleic acid sample is at least part of human genomic DNA or cell-free nucleic acid. According to an embodiment of the invention, the human cell-free nucleic acid is extracted from the peripheral blood of a patient. According to an embodiment of the invention, the patient suffers from colorectal cancer. The method of the embodiments can thus be used to analyze gene mutations of a human patient effectively, and in turn for the early diagnosis, personalized treatment, and postoperative monitoring of colorectal cancer.
According to an embodiment of the invention, the at least part of human genomic DNA is obtained by random fragmentation of human genomic DNA. Adapters can then be added conveniently to both ends of the double-stranded DNA fragment in subsequent operations, improving the efficiency of library construction.
According to an embodiment of the invention, the adapter has a 3' base T sticky end. Adapters can then be added conveniently to both ends of the double-stranded DNA fragment in subsequent operations, improving the efficiency of library construction.
According to an embodiment of the invention, the single-stranded DNA fragments are obtained by denaturing the ligation product, which yields single-stranded DNA fragments quickly and effectively. According to some embodiments of the invention, the denaturation may be thermal denaturation or alkaline denaturation.
According to an embodiment of the invention, the probe is provided in the form of a chip, which improves the efficiency of probe screening.
According to an embodiment of the invention, the strand extension reaction is carried out in the presence of UDG enzyme/FPG enzyme. Damaged DNA can thus be repaired effectively during strand extension, reducing false positives and improving the quality of the constructed sequencing library.
According to an embodiment of the invention, the first tag sequence and the second tag sequence are each independently 4~10 nt in length. According to an embodiment of the invention, the first tag sequence and the second tag sequence are each 8 nt in length. According to an embodiment of the invention, there are at least 2 nt of mismatch between the first tag sequence and the second tag sequence. The inventors unexpectedly found that this arrangement effectively improves the efficiency of correction using the first tag sequence and the second tag sequence in subsequent analysis.
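The length and mismatch constraints on a tag pair can be checked programmatically. The following is a minimal sketch, not part of the disclosed method: the function names and the example tag sequences are hypothetical, and it assumes both tags of a pair have the same length so that mismatches can be counted position by position.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_valid_tag_pair(tag1: str, tag2: str,
                      min_len: int = 4, max_len: int = 10,
                      min_mismatch: int = 2) -> bool:
    """Check a (first tag, second tag) pair against the stated constraints.

    Assumption: tags in a pair are of equal length (hypothetical helper).
    """
    if len(tag1) != len(tag2):
        return False
    if not (min_len <= len(tag1) <= max_len):
        return False
    return hamming(tag1, tag2) >= min_mismatch


# Hypothetical 8 nt tags, for illustration only.
print(is_valid_tag_pair("ACGTACGT", "ACGTACCA"))  # two mismatches
print(is_valid_tag_pair("ACGTACGT", "ACGTACGA"))  # one mismatch
```

Requiring at least two mismatches means that a single sequencing error in a tag is less likely to turn one tag of a pair into the other.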
According to an embodiment of the invention, the first strand of the adapter has the sequence shown in SEQ ID NO: 1, the second strand of the adapter has the sequence shown in SEQ ID NO: 2, the first tag has the sequence shown in any one of SEQ ID NO: 3-6, the second tag has the sequence shown in at least one of SEQ ID NO: 7-10, the first primer has the sequence shown in SEQ ID NO: 11, and the primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously have the sequences shown in SEQ ID NO: 12 and SEQ ID NO: 13.
Here, "XXXXXXXX" in the sequence of the first strand of the adapter represents the first tag sequence, and "XXXXXXXX" in the sequence of the first primer represents the second tag sequence.
According to an embodiment of the invention, the tags include but are not limited to the 4 pairs described above; further tag pairs can be provided as needed for the simultaneous detection of multiple samples.
In a second aspect, the invention provides a sequencing method, which includes: constructing a sequencing library according to the method described above; and sequencing the sequencing library.
According to an embodiment of the invention, the sequencing is carried out on a Hiseq2000 or Hiseq2500, which effectively improves sequencing efficiency. In addition, the features and advantages described above for the method for constructing a sequencing library apply equally to this sequencing method and are not repeated here.
In a third aspect, the invention provides a method for determining a nucleic acid sequence, which includes: sequencing a nucleic acid sample according to the sequencing method described above to obtain a sequencing result consisting of multiple sequencing data; constructing at least one sequencing data subset based on the sequencing result, wherein all sequencing data in each subset correspond to the same source sequence in the nucleic acid sample; for each sequencing data subset, determining that the sequencing data corresponding to the first tag sequence are forward-strand sequencing data and that the sequencing data corresponding to the second tag sequence are reverse-strand sequencing data; for each sequencing data subset, correcting the sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data to determine corrected sequencing data; and determining the sequence of the nucleic acid sample based on the corrected sequencing data. Correction based on both forward-strand and reverse-strand sequencing data improves the reliability of the analysis result.
According to an embodiment of the invention, the sequencing is paired-end sequencing, and the sequencing result consists of multiple pairs of paired sequencing data.
According to an embodiment of the invention, constructing at least one sequencing data subset based on the sequencing result is carried out through the following steps: for each pair of the multiple pairs of paired sequencing data, determining a paired sequencing data index, which consists of the first N bases of each member of the pair, wherein N is an integer between 10 and 20; constructing at least one preliminary sequencing data subset based on the paired sequencing data index, wherein all sequencing data in a preliminary subset share the same paired sequencing data index; and subdividing the at least one preliminary sequencing data subset based on the Hamming distance between the sequencing data in it, to obtain multiple sequencing data subsets.
According to an embodiment of the invention, N is 12.
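The index construction and preliminary grouping described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and read pairs are assumed to be plain (read1, read2) tuples of base strings rather than any particular file format.

```python
from collections import defaultdict


def pair_index(read1: str, read2: str, n: int = 12) -> str:
    """Build the paired index from the first n bases of each mate."""
    if len(read1) < n or len(read2) < n:
        raise ValueError("both reads must be at least n bases long")
    return read1[:n] + read2[:n]


def group_by_index(pairs, n: int = 12):
    """Group read pairs into preliminary subsets sharing the same index."""
    groups = defaultdict(list)
    for read1, read2 in pairs:
        groups[pair_index(read1, read2, n)].append((read1, read2))
    return dict(groups)


# Hypothetical read pairs, for illustration only.
pairs = [
    ("ACGTACGTACGTAAAA", "TTTTGGGGCCCCAAAA"),
    ("ACGTACGTACGTCCCC", "TTTTGGGGCCCCGGGG"),
    ("GGGGACGTACGTAAAA", "TTTTGGGGCCCCAAAA"),
]
for index, members in group_by_index(pairs).items():
    print(index, len(members))
```

With N = 12 the index is a 24-base string; pairs that share it form one preliminary subset, which is then subdivided by Hamming distance.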
According to an embodiment of the invention, in each of the multiple sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data does not exceed 20.
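One way to subdivide a preliminary subset so that this distance bound holds is a simple greedy pass, sketched below. This is an assumption made for illustration, not the clustering procedure of the invention: the function names are hypothetical, the distance between two pairs is taken as the sum of the Hamming distances of their two mates, and mates are assumed to have equal length.

```python
def pair_distance(a, b) -> int:
    """Hamming distance between two read pairs, summed over both mates."""
    total = 0
    for x, y in zip(a, b):
        if len(x) != len(y):
            raise ValueError("mates must have equal length")
        total += sum(c1 != c2 for c1, c2 in zip(x, y))
    return total


def split_by_hamming(pairs, max_dist: int = 20):
    """Greedily split pairs so that every two members of a cluster
    are at most max_dist apart."""
    clusters = []
    for pair in pairs:
        for cluster in clusters:
            # Join the first cluster whose members are all close enough.
            if all(pair_distance(pair, m) <= max_dist for m in cluster):
                cluster.append(pair)
                break
        else:
            clusters.append([pair])
    return clusters


# Hypothetical read pairs sharing one index, for illustration only.
a = ("A" * 30, "C" * 30)
b = ("A" * 29 + "G", "C" * 30)   # 1 difference from a
c = ("T" * 30, "G" * 30)         # 60 differences from a
print([len(cl) for cl in split_by_hamming([a, b, c])])
```

Because a pair joins a cluster only when it is within the threshold of every existing member, the pairwise bound holds by construction, although the result can depend on input order.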
According to an embodiment of the invention, in each of the multiple sequencing data subsets, there are at least two forward-strand sequencing data and at least two reverse-strand sequencing data.
According to an embodiment of the invention, determining the corrected sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data follows this principle: each base in the corrected sequencing data is supported by at least 50% of the forward-strand sequencing data and, at the same time, by at least 50% of the reverse-strand sequencing data.
According to an embodiment of the invention, each base in the corrected sequencing data is supported by at least 80% of the forward-strand sequencing data and, at the same time, by at least 80% of the reverse-strand sequencing data.
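The two-strand support rule can be illustrated with a short sketch. The function name is hypothetical, and the sketch assumes that the reverse-strand reads have already been brought into the same orientation and coordinates as the forward-strand reads; positions without sufficient support on both strands are written as N.

```python
from collections import Counter


def consensus(plus_reads, minus_reads, min_support: float = 0.8,
              min_reads: int = 2):
    """Return a corrected sequence, or None if either strand has too few reads.

    A base is accepted only if it reaches min_support on BOTH strands;
    otherwise the position is reported as 'N'.
    """
    if len(plus_reads) < min_reads or len(minus_reads) < min_reads:
        return None
    length = min(len(r) for r in list(plus_reads) + list(minus_reads))
    out = []
    for i in range(length):
        plus = Counter(r[i] for r in plus_reads)
        minus = Counter(r[i] for r in minus_reads)
        candidates = [
            base for base in plus
            if plus[base] / len(plus_reads) >= min_support
            and minus[base] / len(minus_reads) >= min_support
        ]
        # Exactly one base must satisfy the rule; ties are treated as unresolved.
        out.append(candidates[0] if len(candidates) == 1 else "N")
    return "".join(out)


# Hypothetical reads, for illustration only.
plus = ["ACGT", "ACGT", "ACGA"]
minus = ["ACGT", "ACGT"]
print(consensus(plus, minus, min_support=0.8))
print(consensus(plus, minus, min_support=0.5))
```

In this example the last base is supported by only two of three forward-strand reads, so it passes the 50% rule but not the 80% rule.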
According to an embodiment of the invention, the method further includes: aligning the corrected sequencing data to a reference sequence, and deleting the sequencing data with an alignment quality below 30.
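The alignment-quality filter can be sketched as follows, assuming the alignments are available as SAM-format text lines, in which the fifth tab-separated column is the mapping quality (MAPQ). The function name and the example records are hypothetical.

```python
def filter_by_mapq(sam_lines, min_mapq: int = 30):
    """Keep header lines and alignments whose MAPQ is at least min_mapq."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)          # header lines are passed through
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue                   # skip malformed records
        if int(fields[4]) >= min_mapq:  # column 5 of SAM is MAPQ
            kept.append(line)
    return kept


# Hypothetical SAM records, for illustration only.
sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t12\t4M\t*\t0\t0\tACGT\tIIII",
]
for line in filter_by_mapq(sam):
    print(line.split("\t")[0])
```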
According to an embodiment of the invention, the method further includes: performing SNV analysis or Indel analysis based on the sequence of the nucleic acid sample.
In a fourth aspect, the invention provides a device for constructing a sequencing library. According to an embodiment of the invention, the device includes: a ligation unit for ligating an adapter to each end of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially complementary so as to define a double-stranded region and two single-stranded tails on the adapter, and the first strand contains a first tag sequence located in the sequence of one of the two single-stranded tails; a dissociation unit for dissociating the ligation product into single-stranded DNA fragments; a screening unit for screening the single-stranded DNA fragments with a probe before the strand extension is carried out, wherein the probe specifically recognizes a predetermined region, and the predetermined region comprises one of the following: (1) at least one of the genes shown in Table 1; (2) the CDS region of (1); and (3) a region extending at least 10 bp upstream and downstream of (2); a strand extension unit for performing a strand extension reaction on the single-stranded DNA fragments with a first primer to obtain a strand extension product, wherein the first primer contains a second tag sequence and is suitable for forming a duplex with the first strand of the adapter, except that there is a mismatch between the first tag sequence and the second tag sequence; and an amplification unit for amplifying the strand extension product to obtain an amplification product, which constitutes the sequencing library, wherein the amplification uses primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously.
According to an embodiment of the invention, this device can effectively implement the method for constructing a sequencing library described above and can effectively construct a sequencing library. Moreover, in the constructed library, the two strands of the same double-stranded DNA fragment (also referred to herein as a "source sequence") yield amplification products carrying the first tag sequence and the second tag sequence, respectively. In the subsequent analysis of sequencing results, the sequencing results for the two tags can therefore be used to correct each other, improving the reliability of the analysis.
According to an embodiment of the invention, the device further includes: an end repair unit for performing end repair on a nucleic acid sample to obtain a repaired nucleic acid sample; and an end modification unit for adding a base A at the 5' ends of the nucleic acid sample to obtain a nucleic acid sample with a sticky-end base A at each end, which constitutes the double-stranded DNA fragment.
According to an embodiment of the invention, the probe is provided in the form of a chip.
According to an embodiment of the invention, the strand extension reaction is carried out in the presence of UDG enzyme/FPG enzyme. Damaged DNA can thus be repaired effectively during strand extension, reducing false positives and improving the quality of the constructed sequencing library.
According to an embodiment of the invention, the first tag sequence and the second tag sequence are each independently 4~10 nt in length.
According to an embodiment of the invention, the first tag sequence and the second tag sequence are each 8 nt in length.
According to an embodiment of the invention, there are at least 2 nt of mismatch between the first tag sequence and the second tag sequence.
According to an embodiment of the invention, the first strand of the adapter has the sequence shown in SEQ ID NO: 1, the second strand of the adapter has the sequence shown in SEQ ID NO: 2, the first tag has the sequence shown in any one of SEQ ID NO: 3-6, the second tag has the sequence shown in at least one of SEQ ID NO: 7-10, the first primer has the sequence shown in SEQ ID NO: 11, and the primers suitable for amplifying the first tag sequence and the second tag sequence simultaneously have the sequences shown in SEQ ID NO: 12 and SEQ ID NO: 13.
According to an embodiment of the invention, the tags include but are not limited to the 4 pairs described above; further tag pairs can be provided as needed for the simultaneous detection of multiple samples.
Those skilled in the art will appreciate that the features and advantages described above for the method for constructing a sequencing library apply equally to this device and are not repeated here.
In a fifth aspect, the invention provides sequencing equipment. According to an embodiment of the invention, the sequencing equipment includes: the device for constructing a sequencing library described above; and a sequencing device for sequencing the sequencing library.
Sequencing efficiency can thus be improved effectively. In addition, the features and advantages described above for the method and device for constructing a sequencing library apply equally to this sequencing equipment and are not repeated here.
According to an embodiment of the invention, the sequencing device is a Hiseq2000 or Hiseq2500.
In a sixth aspect, the invention provides a system for determining a nucleic acid sequence. According to an embodiment of the invention, the system includes: the sequencing equipment described above, for sequencing a nucleic acid sample to obtain a sequencing result consisting of multiple sequencing data; a sequencing data subset construction device for constructing at least one sequencing data subset based on the sequencing result, wherein all sequencing data in each subset correspond to the same source sequence in the nucleic acid sample; a sequencing data classification device for determining, for each sequencing data subset, that the sequencing data corresponding to the first tag sequence are forward-strand sequencing data and that the sequencing data corresponding to the second tag sequence are reverse-strand sequencing data; a sequencing data correction device for correcting, for each sequencing data subset, the sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data to determine corrected sequencing data; and a sequence determination device for determining the sequence of the nucleic acid sample based on the corrected sequencing data. The system according to embodiments of the invention can thus effectively implement the method for determining a nucleic acid sequence described above, so that correction based on both forward-strand and reverse-strand sequencing data improves the reliability of the analysis result.
According to an embodiment of the invention, the sequencing is paired-end sequencing, and the sequencing result consists of multiple pairs of paired sequencing data.
According to an embodiment of the invention, the sequencing data subset construction device includes: a sequencing data index determination device for determining, for each pair of the multiple pairs of paired sequencing data, a paired sequencing data index, which consists of the first N bases of each member of the pair, wherein N is an integer between 10 and 20; a preliminary screening device for constructing at least one preliminary sequencing data subset based on the paired sequencing data index, wherein all sequencing data in a preliminary subset share the same paired sequencing data index; and a secondary screening device for subdividing the at least one preliminary sequencing data subset based on the Hamming distance between the sequencing data in it, to obtain multiple sequencing data subsets.
According to an embodiment of the invention, N is 12.
According to an embodiment of the invention, in each of the multiple sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data does not exceed 20.
According to an embodiment of the invention, in each of the multiple sequencing data subsets, there are at least two forward-strand sequencing data and at least two reverse-strand sequencing data.
According to an embodiment of the invention, determining the corrected sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data follows this principle: each base in the corrected sequencing data is supported by at least 50% of the forward-strand sequencing data and, at the same time, by at least 50% of the reverse-strand sequencing data.
According to an embodiment of the invention, each base in the corrected sequencing data is supported by at least 80% of the forward-strand sequencing data and, at the same time, by at least 80% of the reverse-strand sequencing data.
According to an embodiment of the invention, the corrected sequencing data are further aligned to a reference sequence, and the sequencing data with an alignment quality below 30 are deleted.
According to an embodiment of the invention, the system further includes a sequence analysis device for performing SNV analysis or Indel analysis based on the sequence of the nucleic acid sample.
Those skilled in the art will appreciate that the advantages and features described above for the method for determining a nucleic acid sequence apply equally to this system and are not repeated here.
Additional aspects and advantages of the invention are given in part in the following description; in part they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 shows a flow chart of the method for constructing a sequencing library according to one embodiment of the invention;
Fig. 2 shows the analysis result for clusters of reads sharing the same index, according to one embodiment of the invention; and
Fig. 3 shows the mutation spectrum analysis result, according to one embodiment of the invention.
Detailed description
The invention is described below by way of specific examples. These examples are for illustration only and are not to be construed as limiting the invention in any way.
General methods
Unless stated otherwise, the following examples were carried out according to the general methods below:
I. Probe design
The exon sequences of the relevant genes were retrieved from the human genome HG19. In view of the size of the capture region and the cost, the final chip covers only the CDS regions of these genes, each extended by 20 bp upstream and downstream. The chip carries abundant capture probes, with probe coverage of the target region of up to 98%; it can enrich target DNA fragments from a complex genome and capture genomic regions with high specificity and high coverage on a single chip.
II. Sequencing library construction and sequencing
Referring to Fig. 1, the steps of library construction and sequencing are as follows:
1. Draw 5 ml of peripheral blood from the patient and separate plasma and leukocytes by centrifugation. Extract DNA from the plasma sample and the leukocyte sample separately; the DNA extracted from the leukocytes later serves as the control for somatic mutation detection.
2. The cell-free circulating DNA extracted from plasma averages about 170 bp and goes directly through the 3 enzymatic reaction steps of a conventional library construction workflow: end repair, A-tailing, and ligation of a specially treated sequencing adapter (the adapter carries an 8 bp tag, named index1, which not only distinguishes different samples but also later serves as the marker of the forward strand).
3. The ligation product is captured by hybridization on the Colorectalpan chip. The eluted single-stranded template product is then amplified for 1 round of 1 cycle with a primer carrying the index2 tag, so that the reverse strand is tagged. UDG/FPG enzyme incubation is included during this PCR to eliminate DNA damage carried on the template strand and reduce false positives.
4. The product in which the forward and reverse strands carry the two index tags is purified and then subjected to a second round of PCR enrichment, completing library preparation.
5. Sequencing is performed on a Hiseq 2000 or Hiseq2500; a suitable sequencing platform can be chosen flexibly according to the required sequencing depth and the number of samples.
Specific steps include:
1. cfDNA extraction
About 2-3 ml of plasma separated from 5 ml of peripheral blood is used for plasma cfDNA extraction according to the instructions of the QIAamp Circulating Nucleic Acid Kit. The extracted DNA is quantified with Qubit (Invitrogen, Quant-iT™ dsDNA HS Assay Kit); the total amount is about 5~50 ng.
2. Sample library preparation:
The cfDNA extracted from plasma then goes through the 3 enzymatic reaction steps according to the library construction instructions of the KAPA LTP Library Preparation Kit.
1) End repair
Afterwards, 120 μL of Agencourt AMPure XP reagent is added and magnetic bead purification is performed; the product is finally redissolved in 42 μL ddH2O and carried, with the beads, into the next reaction.
2) A-tailing
Afterwards, 90 μL of PEG/NaCl SPRI solution is added and mixed thoroughly, magnetic bead purification is performed, and the product is finally redissolved in (35 - adapter volume) μL ddH2O and carried, with the beads, into the next reaction.
3) Adapter ligation
Afterwards, 50 μL of PEG/NaCl SPRI solution is added on each of 2 occasions for 2 rounds of magnetic bead purification, and the product is finally redissolved in 25 μL ddH2O.
3. Chip hybridization capture
The colorectal cancer early-screening chip Colorectalpan designed by the inventors is used, and hybridization capture is carried out according to the instructions provided by the chip manufacturer. Finally, the product is eluted and redissolved in 21 μL ddH2O together with the hybridization elution beads.
4. Dual-index tagging of the forward and reverse strands, and enrichment:
Two rounds of PCR are performed in total: PCR1 tags the reverse strand and repairs template DNA damage, and PCR2 performs amplification and enrichment, completing library preparation.
1) PCR1
PCR1 program:
The hybridization elution beads are first removed, then 40 μL of Agencourt AMPure XP reagent is added and magnetic bead purification is performed; the product is finally redissolved in 20 μL ddH2O and carried, with the beads, into the next reaction.
2) PCR2
PCR2 program:
The beads from the previous step are first removed, then 50 μL of Agencourt AMPure XP reagent is added and magnetic bead purification is performed; the product is finally redissolved in 25 μL ddH2O, followed by QC and loading onto the sequencer.
III. Analysis of sequencing results
1. For each pair of paired reads (paired sequencing data), the first 12 bp of reads1 and the first 12 bp of reads2 (i.e., the breakpoint sequences) are concatenated into a 24 bp short sequence. This 24 bp sequence serves as the index of the paired reads, and the forward and reverse strands are labeled according to the index.
2. An external sort is performed on the index, so that copies of the same DNA template are brought together.
3. Center clustering is performed on the gathered reads that share the same index: according to the Hamming distance between their sequences, each large cluster with the same index is divided into several small clusters, such that the Hamming distance between any two pairs of paired reads within a small cluster does not exceed 10. The purpose is to distinguish reads that share the same index but come from different DNA templates.
4. The clusters of copies of the same DNA template obtained in step 3 are screened; a cluster proceeds to subsequent analysis only if both the forward-strand reads and the reverse-strand reads number at least 2 pairs.
5. Error correction is performed on the clusters that satisfy the condition in step 4, generating one pair of error-free new reads. For each sequenced base of the DNA template, if the concordance rate of a given base type reaches 80% among the forward-strand reads and also reaches 80% among the reverse-strand reads, that base of the new reads is recorded as this base type; otherwise it is recorded as N. This yields new reads that represent the original DNA template sequence.
6. The new reads are realigned to the genome with the bwa mem algorithm, and reads with an alignment quality below 30 are filtered out.
7. SNV analysis:
1) Based on statistics of the reads obtained in step 6, the base type distribution at each site in the capture region is obtained; a base type inconsistent with the mainstream base type (a base type with a ratio above 15%) is taken as a mutant base type. The target region coverage size, average sequencing depth, forward/reverse strand pairing rate, low-frequency mutation rate, etc. are also counted.
2) SNPs are annotated using information from CCDS, the human genome database (NCBI36.3), and dbSNP (v130) to determine the gene in which the mutation site occurs, its coordinates, the mRNA site, the amino acid change, the SNP function (missense mutation/nonsense mutation/alternative splice site), the SIFT prediction of the SNP's effect on protein function, etc.;
3) Somatic mutations are called by comparing the patient sample with the control sample information. At the same time, SNPs that occur in dbSNP, HapMap, the 1000 Genomes Project, and other exome sequencing projects are removed from the candidate SNVs, leaving the final disease-related candidate SNVs.
8. INDEL analysis:
1) Based on statistics of the indel-containing reads among the reads obtained in step 6, all indels are obtained, and indels supported by 2 or more reads are selected as reliable mutant indels;
2) Indels are annotated using information from CCDS, the human genome database (NCBI36.3), and dbSNP (v130) to determine the gene in which the mutation site occurs, its coordinates, the mRNA site, the change in coding region sequence, the effect on amino acids, and the indel function (amino acid insertion/amino acid deletion/frameshift mutation);
3) Somatic mutations are called by comparing the patient sample with the control sample information. At the same time, indels that occur in dbSNP and other exome sequencing projects are removed from the candidate indels, leaving the final disease-related candidate indels.
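Steps 1 to 5 of the read-processing procedure can be illustrated with a minimal Python sketch. It is not the inventors' implementation. The record layout (dicts with `read1`, `read2`, `strand`), the function names, and the greedy grouping rule are assumptions made for illustration; the sketch also assumes that strand labels have already been assigned, that all reads have equal length, and that reverse-strand reads are already oriented like the forward-strand reads. An in-memory dictionary stands in for the external sort.

```python
from collections import Counter, defaultdict


def pair_index(read1, read2, n=12):
    """Step 1: join the first n bases of each mate into a 2n-base index."""
    return read1[:n] + read2[:n]


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def split_cluster(pairs, max_dist=10):
    """Step 3 (assumed greedy variant): a pair joins the first small cluster
    in which it is within max_dist of every member, else starts a new one."""
    clusters = []
    for pair in pairs:
        seq = pair["read1"] + pair["read2"]
        for cluster in clusters:
            if all(hamming(seq, m["read1"] + m["read2"]) <= max_dist for m in cluster):
                cluster.append(pair)
                break
        else:
            clusters.append([pair])
    return clusters


def strand_consensus(seqs, min_agree):
    """Per position: the majority base, or None if it falls below min_agree."""
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_agree else None)
    return out


def duplex_consensus(cluster, min_pairs=2, min_agree=0.8):
    """Steps 4-5: require min_pairs per strand, then keep a base only if
    both strands reach min_agree for the same base; otherwise write N."""
    fwd = [p["read1"] + p["read2"] for p in cluster if p["strand"] == "+"]
    rev = [p["read1"] + p["read2"] for p in cluster if p["strand"] == "-"]
    if len(fwd) < min_pairs or len(rev) < min_pairs:
        return None
    f = strand_consensus(fwd, min_agree)
    r = strand_consensus(rev, min_agree)
    return "".join(a if a is not None and a == b else "N" for a, b in zip(f, r))


def build_consensus(pairs):
    """Steps 1-5 end to end; grouping by index replaces the external sort."""
    by_index = defaultdict(list)
    for p in pairs:
        by_index[pair_index(p["read1"], p["read2"])].append(p)
    results = []
    for index in sorted(by_index):
        for cluster in split_cluster(by_index[index]):
            consensus = duplex_consensus(cluster)
            if consensus is not None:
                results.append(consensus)
    return results
```

The grouping rule checks a candidate against every member of a small cluster so that the pairwise bound of step 3 holds by construction; a true center-based clustering would compare against a single representative instead.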
Embodiment 1: Early screening of colorectal cancer
I. Chip design
1) Design of the colorectal cancer early screening chip:
Based on databases such as TCGA, ICGC, and COSMIC and on relevant literature, the gene chip Colorectalpan for early screening of colorectal cancer was designed using an iterative algorithm. The Colorectalpan chip includes colorectal-cancer-related driver genes, high-frequency mutated genes, and important genes in the 12 cancer signaling pathways, 60 genes in total, 123 kb.
The chip design process is divided into 4 steps:
1. For each exon region of the colorectal cancer driver genes in the COSMIC database, count the number of variant samples, the variant samples themselves, the number of samples carrying the hottest hotspot variant, and the PI value (used to assess the level of patient recurrence frequency on each exon; PI = cumulative number of patients carrying mutations on the exon / exon length), and sort in descending order of PI value. Then apply the iterative algorithm: taking the variant samples of the first exon region as the sample database, count, for every other interval, the number of samples that differ from the sample database, and take the interval with the most differing samples as the second screened chip interval. Then, with the variant samples of the two screened intervals as the sample database, screen the third interval in the same way, and continue until the sample database contains all samples, which yields the exon region set. For any gene from which no interval has been screened, all of its intervals are added to the chip intervals.
2. Based on databases such as TCGA and ICGC, with the driver gene intervals removed, take the hotspot variant intervals containing 5 or more samples (SNV>=5) as candidate intervals, and repeat the iterative calculation of the previous step.
3. Based on databases such as TCGA and ICGC, with the already screened intervals removed, take the intervals with PI>=30, SNV>=3 and then those with PI>=20, SNV>=3 as candidate intervals; the interval that reduces the sample count of the single-sample database the most is screened as the first chip interval, and the above procedure is repeated iteratively.
4. Add intervals such as gene fusions.
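The iterative selection in step 1 resembles a greedy covering procedure, which the following Python sketch illustrates under that assumption. The function names, the dictionary layout, and the example interval names are hypothetical; the sketch seeds the selection with the highest-PI interval and then repeatedly adds the interval that contributes the most samples not yet in the sample database.

```python
def pi_value(patient_count, exon_length):
    """PI = cumulative number of mutation-carrying patients / exon length."""
    return patient_count / exon_length


def select_intervals(interval_samples, exon_lengths):
    """Greedy selection of intervals until every sample is covered.

    interval_samples maps an interval name to the set of samples that carry
    a variant in it; exon_lengths maps the same names to lengths in bp.
    Both mappings are assumed to be non-empty.
    """
    remaining = dict(interval_samples)
    all_samples = set().union(*remaining.values())

    # Seed with the interval that has the highest PI value.
    first = max(remaining, key=lambda k: pi_value(len(remaining[k]), exon_lengths[k]))
    chosen = [first]
    covered = set(remaining.pop(first))

    # Repeatedly add the interval with the most samples not yet covered.
    while covered != all_samples:
        best = max(remaining, key=lambda k: len(remaining[k] - covered))
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen
```

For example, with made-up intervals `exon_A` to `exon_D`, an interval whose samples are already covered by earlier picks is never selected, which is what keeps the chip small.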
Details of the gene list are shown in Table 1.
Table 1
KRAS | SRC | TLR3 | EP300 | TMPRSS13 | EPHA5 |
BRAF | PTEN | MC4R | CYLD | PHF2 | EPHA3 |
APC | AXIN1 | MLH1 | FBN2 | OPRD1 | PTPRD |
TP53 | FLG | AKT1 | NF1 | LILRB5 | NTRK3 |
PIK3CA | LIG1 | CASD1 | ASXL1 | COL18A1 | NTRK1 |
CTNNB1 | MAP2K1 | PTCH1 | SMAD4 | LARP4B | ALK |
NRAS | PIK3R1 | ADAMTS18 | IRF5 | DMKN | ROS1 |
EGFR | ERBB2 | MSH2 | DOCK3 | ROBO2 | RET |
FBXW7 | STK11 | BAP1 | MYOM1 | KCNN3 | PDGFRA |
ARID1A | IL7R | CTNNA1 | NEFH | INHBA | FGFR1 |
II. Sequencing analysis
Using the present invention, colorectal cancer early screening was performed on 1 intestinal polyp patient following the steps of the above method, with the following results:
The sequencing data statistics are shown in the table below:
Note: Forward/reverse strand pairing rate: the ratio of clusters with 3 or more reads in which both the forward and reverse strands are present to the total number of clusters with 3 or more reads, used to assess forward/reverse strand pairing among the usable data. Valid data utilization rate: the ratio of the number of error-corrected reads from clusters meeting at least 2+/2- to the total number of sequenced reads. Average sequencing depth: the average coverage of target region bases after error correction of the valid data.
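The two ratio definitions in the note above can be made concrete with a short Python sketch. The cluster representation (a dict with forward and reverse read-pair counts) and the function names are assumptions, as is the reading that each qualifying cluster yields one corrected read pair and that the denominator counts sequenced read pairs.

```python
def strand_pairing_rate(clusters, min_reads=3):
    """Clusters with >= min_reads reads that contain both strands, divided
    by all clusters with >= min_reads reads."""
    big = [c for c in clusters if c["fwd"] + c["rev"] >= min_reads]
    if not big:
        return 0.0
    paired = [c for c in big if c["fwd"] > 0 and c["rev"] > 0]
    return len(paired) / len(big)


def data_utilization(clusters, min_per_strand=2):
    """Corrected read pairs (one per cluster meeting 2+/2-) divided by the
    total number of sequenced read pairs."""
    total = sum(c["fwd"] + c["rev"] for c in clusters)
    if total == 0:
        return 0.0
    usable = sum(
        1 for c in clusters
        if c["fwd"] >= min_per_strand and c["rev"] >= min_per_strand
    )
    return usable / total
```

Because every qualifying cluster collapses to a single corrected pair, the utilization rate is necessarily far below 100% even when most clusters satisfy the 2+/2- condition.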
Cluster analysis:
The analysis of reads clusters sharing the same index is shown in Fig. 2, where the abscissa represents the duplication (dup) number of a cluster and the ordinate represents the total number of reads in the clusters with that dup number. The results in Fig. 2 show that the dup number of the vast majority of clusters is around 6 and that most clusters satisfy the 2 forward + 2 reverse condition; the final effective data utilization rate is 5.12%, and the average sequencing depth is 1033X.
Mutation spectrum analysis:
The mutation spectrum analysis results are shown in Fig. 3. For molecules (DNA) derived from a double strand, complementary mutation types should in theory have essentially the same mutation frequency. The abscissa represents the type of base mutation; the ordinate represents the number of mutations. The results in Fig. 3 show that the distribution of mutant base types is essentially balanced, with a mutation frequency (mutations per nucleotide) of 2.2×10^-6.
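The balance check on complementary mutation types can be sketched in Python as follows. The function names and the `"REF>ALT"` string convention are assumptions for illustration; the sketch pairs each substitution type with its complement (for example C>T with G>A) so that the two counts can be compared side by side, and computes a per-nucleotide mutation frequency as mutations divided by sequenced bases.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complementary_type(substitution):
    """Return the substitution seen on the opposite strand, e.g. C>T -> G>A."""
    ref, alt = substitution.split(">")
    return f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"


def paired_spectrum(counts):
    """Group substitution counts into complementary pairs.

    Returns a dict mapping (type, complementary type), sorted alphabetically,
    to the pair of counts in the same order.
    """
    pairs = {}
    for substitution in sorted(counts):
        key = tuple(sorted((substitution, complementary_type(substitution))))
        pairs[key] = (counts.get(key[0], 0), counts.get(key[1], 0))
    return pairs


def mutations_per_nucleotide(counts, total_bases):
    """Total number of observed substitutions divided by sequenced bases."""
    return sum(counts.values()) / total_bases
```

A spectrum in which the two counts of each pair are close to each other is what the text describes as balanced; a strong skew within a pair would point to strand-specific damage or errors rather than true double-stranded mutations.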
Details of the detected variants (counted on the basis of exon regions and nonsynonymous mutations):
Gene | Base mutation | Amino acid mutation | Mutation type | Mutation frequency |
SMAD4 | c.2119G>A | p.Y301F | Missense mutation | 2.8% |
ARID1A | c.817C>T | p.A1872T | Missense mutation | 2.34% |
APC | c.217A>C | p.A426T | Missense mutation | 1.80% |
Interpretation of results: According to relevant databases and literature such as TCGA, COSMIC, ClinVar, and HGMD, the driver mutations SMAD4 p.Y301F and APC p.A426T detected in the patient's plasma indicate that the patient has a relatively high cancer risk. It is recommended that the patient undergo more comprehensive testing at a relevant medical institution and take the appropriate intervention measures.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "illustrative example", "example", "specific example", or "some examples" mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, it should be noted that those skilled in the art will understand that the order of the steps included in the scheme proposed by the present invention can be adjusted by those skilled in the art, and such adjustment is also included within the scope of the present invention.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and purpose of the present invention; the scope of the present invention is defined by the claims and their equivalents.
Claims (46)
1. A method for constructing a sequencing library, characterized by comprising:
(a) ligating an adapter to each of the two ends of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially paired and the first strand comprises a first tag sequence, so that a double-stranded region and two single-stranded tails are defined on the adapter, and the sequence of one of the two single-stranded tails comprises the first tag;
(b) splitting the ligation product into single-stranded DNA fragments;
(c) screening the single-stranded DNA fragments using a probe, wherein the probe specifically recognizes a predetermined region, and the predetermined region comprises one of the following:
(1) at least one of the TLR3, TMPRSS13, MC4R, PHF2, OPRD1, FLG, LILRB5, LIG1, CASD1, COL18A1, LARP4B, ADAMTS18, IRF5, DMKN, DOCK3, MYOM1, KCNN3, and NEFH genes;
(2) the CDS region of (1); and
(3) a region of at least 10 bp upstream and downstream of (2);
(d) carrying out a strand extension reaction on the single-stranded DNA fragments using a first primer to obtain a strand extension product, wherein the first primer comprises a second tag sequence, and the first primer is suitable for forming a double-stranded structure with the first strand of the adapter, except that a mismatch exists between the first tag sequence and the second tag sequence;
(e) amplifying the strand extension product to obtain an amplification product, the amplification product constituting the sequencing library, wherein the amplification uses primers suitable for simultaneously amplifying the first tag sequence and the second tag sequence, the primers being a second primer and a third primer.
2. The method according to claim 1, characterized in that the double-stranded DNA fragment is obtained by the following steps:
subjecting a nucleic acid sample to end repair to obtain an end-repaired nucleic acid sample; and
adding a base A to the 5' end of the nucleic acid sample to obtain a nucleic acid sample having a sticky-end base A at each of its two ends, the nucleic acid sample having a sticky-end base A at each of its two ends constituting the double-stranded DNA fragment.
3. The method according to claim 2, characterized in that the nucleic acid sample is at least a portion of human genomic DNA or cell-free nucleic acid.
4. The method according to claim 3, characterized in that the cell-free nucleic acid is extracted from the peripheral blood of a patient.
5. The method according to claim 4, characterized in that the patient suffers from colorectal cancer.
6. The method according to claim 3, characterized in that the at least a portion of human genomic DNA is obtained by randomly fragmenting human genomic DNA.
7. The method according to claim 1, characterized in that the adapter has a 3' base T sticky end.
8. The method according to claim 1, characterized in that the single-stranded DNA fragments are obtained by subjecting the ligation product to denaturation treatment.
9. The method according to claim 1, characterized in that the probe is provided in the form of a chip.
10. The method according to claim 1, characterized in that the strand extension reaction is carried out in the presence of UDG enzyme/FPG enzyme.
11. The method according to claim 1, characterized in that the first tag sequence and the second tag sequence are each independently 4~10 nt in length.
12. The method according to claim 1, characterized in that the length of the first tag sequence and the second tag sequence is 8 nt.
13. The method according to claim 1, characterized in that a mismatch of at least 2 nt exists between the first tag sequence and the second tag sequence.
14. The method according to claim 1, characterized in that the nucleotide sequence of the first strand of the adapter is the sequence shown in SEQ ID NO: 1, the nucleotide sequence of the second strand of the adapter is the sequence shown in SEQ ID NO: 2, the nucleotide sequence of the first tag is the sequence shown in at least one of SEQ ID NO: 3-6, the nucleotide sequence of the second tag is the sequence shown in at least one of SEQ ID NO: 7-10, the nucleotide sequence of the first primer is the sequence shown in SEQ ID NO: 11, the nucleotide sequence of the second primer is the sequence shown in SEQ ID NO: 12, and the nucleotide sequence of the third primer is the sequence shown in SEQ ID NO: 13.
15. A sequencing method, the method being for non-diagnostic purposes, characterized by comprising:
constructing a sequencing library according to the method of any one of claims 1~14; and
sequencing the sequencing library.
16. The method according to claim 15, characterized in that the sequencing is carried out on a Hiseq2000 or Hiseq2500.
17. A method for determining a nucleic acid sequence, the method being for non-diagnostic purposes, characterized by comprising:
sequencing a nucleic acid sample according to the method of claim 15 or 16 to obtain a sequencing result composed of a plurality of sequencing data;
constructing, based on the sequencing result, at least one sequencing data subset, wherein all sequencing data in each sequencing data subset correspond to the same source sequence on the nucleic acid sample;
for each sequencing data subset, determining the sequencing data corresponding to the first tag sequence to be forward-strand sequencing data and the sequencing data corresponding to the second tag sequence to be reverse-strand sequencing data;
for each sequencing data subset, correcting the sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data respectively, to determine corrected sequencing data; and
determining the sequence of the nucleic acid sample based on the corrected sequencing data.
18. The method according to claim 17, characterized in that the sequencing is paired-end sequencing, and the sequencing result is composed of multiple pairs of paired sequencing data.
19. The method according to claim 17, characterized in that constructing at least one sequencing data subset based on the sequencing result is carried out by the following steps:
for each pair of the multiple pairs of paired sequencing data, determining a paired sequencing data index, the paired sequencing data index being composed of the initial N bases of each member of the pair of sequencing data, wherein N is an integer between 10 and 20;
constructing at least one preliminary sequencing data subset based on the paired sequencing data index, wherein all sequencing data in the preliminary sequencing data subset have the same paired sequencing data index; and
subdividing the at least one preliminary sequencing data subset based on the Hamming distance between the sequencing data in the preliminary sequencing data subset, to obtain a plurality of sequencing data subsets.
20. The method according to claim 19, characterized in that N is 12.
21. The method according to claim 19, characterized in that, in each of the plurality of sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data does not exceed 20.
22. The method according to claim 19, characterized in that, in each of the plurality of sequencing data subsets, there are at least two forward-strand sequencing data and at least two reverse-strand sequencing data.
23. The method according to claim 22, characterized in that determining the corrected sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data is carried out according to the following principle:
each base in the corrected sequencing data is simultaneously supported by at least 50% of the forward-strand sequencing data and at least 50% of the reverse-strand sequencing data.
24. The method according to claim 23, characterized in that each base in the corrected sequencing data is simultaneously supported by at least 80% of the forward-strand sequencing data and at least 80% of the reverse-strand sequencing data.
25. The method according to claim 23, characterized by further comprising:
aligning the corrected sequencing data to a reference sequence, and deleting the sequencing data with an alignment quality below 30.
26. The method according to claim 17, characterized in that SNV analysis or Indel analysis is carried out based on the sequence of the nucleic acid sample.
27. A device for constructing a sequencing library, characterized by comprising:
a ligation unit, for ligating an adapter to each of the two ends of a double-stranded DNA fragment to obtain a ligation product, wherein the adapter comprises a first strand and a second strand, the first strand and the second strand are partially paired and the first strand comprises a first tag sequence, so that a double-stranded region and two single-stranded tails are defined on the adapter, and the sequence of one of the two single-stranded tails comprises the first tag;
a splitting unit, for splitting the ligation product into single-stranded DNA fragments;
a screening unit, for screening the single-stranded DNA fragments using a probe before strand extension is carried out, wherein the probe specifically recognizes a predetermined region, and the predetermined region comprises one of the following:
(1) at least one of the TLR3, TMPRSS13, MC4R, PHF2, OPRD1, FLG, LILRB5, LIG1, CASD1, COL18A1, LARP4B, ADAMTS18, IRF5, DMKN, DOCK3, MYOM1, KCNN3, and NEFH genes;
(2) the CDS region of (1); and
(3) a region of at least 10 bp upstream and downstream of (2);
a strand extension unit, for carrying out a strand extension reaction on the single-stranded DNA fragments using a first primer to obtain a strand extension product, wherein the first primer comprises a second tag sequence, and the first primer is suitable for forming a double-stranded structure with the first strand of the adapter, except that a mismatch exists between the first tag sequence and the second tag sequence;
an amplification unit, for amplifying the strand extension product to obtain an amplification product, the amplification product constituting the sequencing library, wherein the amplification uses a second primer and a third primer, the second primer recognizes the second strand of the adapter, and the third primer is configured to be suitable for simultaneously amplifying the first tag sequence and the second tag sequence.
28. The device according to claim 27, characterized by further comprising:
an end repair unit, for subjecting a nucleic acid sample to end repair to obtain an end-repaired nucleic acid sample; and
an end modification unit, for adding a base A to the 5' end of the nucleic acid sample to obtain a nucleic acid sample having a sticky-end base A at each of its two ends, the nucleic acid sample having a sticky-end base A at each of its two ends constituting the double-stranded DNA fragment.
29. The device according to claim 27, characterized in that the probe is provided in the form of a chip.
30. The device according to claim 27, characterized in that the strand extension reaction is carried out in the presence of UDG enzyme/FPG enzyme.
31. The device according to claim 27, characterized in that the first tag sequence and the second tag sequence are each independently 4~10 nt in length.
32. The device according to claim 27, characterized in that the length of the first tag sequence and the second tag sequence is 8 nt.
33. The device according to claim 27, characterized in that a mismatch of at least 2 nt exists between the first tag sequence and the second tag sequence.
34. The device according to claim 27, characterized in that the nucleotide sequence of the first strand of the adapter is the sequence shown in SEQ ID NO: 1, the nucleotide sequence of the second strand of the adapter is the sequence shown in SEQ ID NO: 2, the nucleotide sequence of the first tag is the sequence shown in at least one of SEQ ID NO: 3-6, the nucleotide sequence of the second tag is the sequence shown in at least one of SEQ ID NO: 7-10, the nucleotide sequence of the first primer is the sequence shown in SEQ ID NO: 11, the nucleotide sequence of the second primer is the sequence shown in SEQ ID NO: 12, and the nucleotide sequence of the third primer is the sequence shown in SEQ ID NO: 13.
35. A sequencing apparatus, characterized by comprising:
the device for constructing a sequencing library according to any one of claims 27~34; and
a sequencing device, for sequencing the sequencing library.
36. The sequencing apparatus according to claim 35, characterized in that the sequencing device is a Hiseq2000 or Hiseq2500.
37. A system for determining a nucleic acid sequence, characterized by comprising:
the sequencing apparatus according to claim 35 or 36, for sequencing a nucleic acid sample to obtain a sequencing result composed of a plurality of sequencing data;
sequencing data subset construction equipment, for constructing at least one sequencing data subset based on the sequencing result, wherein all sequencing data in each sequencing data subset correspond to the same source sequence on the nucleic acid sample;
sequencing data classification equipment, for determining, for each sequencing data subset, the sequencing data corresponding to the first tag sequence to be forward-strand sequencing data and the sequencing data corresponding to the second tag sequence to be reverse-strand sequencing data;
sequencing data correction equipment, for correcting, for each sequencing data subset, the sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data respectively, to determine corrected sequencing data; and
sequence determination equipment, for determining the sequence of the nucleic acid sample based on the corrected sequencing data.
38. The system according to claim 37, characterized in that the sequencing is paired-end sequencing, and the sequencing result is composed of multiple pairs of paired sequencing data.
39. The system according to claim 37, characterized in that the sequencing data subset construction equipment comprises:
sequencing data index determination equipment, for determining, for each pair of the multiple pairs of paired sequencing data, a paired sequencing data index, the paired sequencing data index being composed of the initial N bases of each member of the pair of sequencing data, wherein N is an integer between 10 and 20;
a preliminary screening device, for constructing at least one preliminary sequencing data subset based on the paired sequencing data index, wherein all sequencing data in the preliminary sequencing data subset have the same paired sequencing data index; and
a secondary screening device, for subdividing the at least one preliminary sequencing data subset based on the Hamming distance between the sequencing data in the preliminary sequencing data subset, to obtain a plurality of sequencing data subsets.
40. The system according to claim 39, characterized in that N is 12.
41. The system according to claim 39, characterized in that, in each of the plurality of sequencing data subsets, the Hamming distance between any two pairs of paired sequencing data does not exceed 20.
42. The system according to claim 39, characterized in that, in each of the plurality of sequencing data subsets, there are at least two forward-strand sequencing data and at least two reverse-strand sequencing data.
43. The system according to claim 42, characterized in that determining the corrected sequencing data based on the forward-strand sequencing data and the reverse-strand sequencing data is carried out according to the following principle:
each base in the corrected sequencing data is simultaneously supported by at least 50% of the forward-strand sequencing data and at least 50% of the reverse-strand sequencing data.
44. The system according to claim 43, characterized in that each base in the corrected sequencing data is simultaneously supported by at least 80% of the forward-strand sequencing data and at least 80% of the reverse-strand sequencing data.
45. The system according to claim 43, characterized by further comprising:
aligning the corrected sequencing data to a reference sequence, and deleting the sequencing data with an alignment quality below 30.
46. The system according to claim 37, characterized by further comprising a sequence analysis device, the sequence analysis device being used to carry out SNV analysis or Indel analysis based on the sequence of the nucleic acid sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410521540.4A CN104293940B (en) | 2014-09-30 | 2014-09-30 | Build the method and its application of sequencing library |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104293940A (en) | 2015-01-21 |
CN104293940B (en) | 2017-07-28 |
Family
ID=52313887
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410521540.4A Active CN104293940B (en) | 2014-09-30 | 2014-09-30 | Build the method and its application of sequencing library |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104293940B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106434856B (en) * | 2015-08-06 | 2020-02-07 | 深圳华大智造科技有限公司 | Method for running and testing sequencer |
CN105332063B (en) * | 2015-08-13 | 2017-04-12 | 厦门飞朔生物技术有限公司 | Construction method of single-tube and high-flux sequencing library |
CN105602936A (en) * | 2015-11-18 | 2016-05-25 | 中国人民解放军第四军医大学 | Construction method of dual-barcode next-generation sequencing library |
CN107034267B (en) * | 2016-02-03 | 2021-06-08 | 深圳华大智造科技股份有限公司 | Method and device for preparing candidate sequencing probe set and application of candidate sequencing probe set |
CN107038349B (en) * | 2016-02-03 | 2020-03-31 | 深圳华大生命科学研究院 | Method and apparatus for determining pre-rearrangement V/J gene sequence |
CN105950709A (en) * | 2016-03-30 | 2016-09-21 | 广州精科生物技术有限公司 | Kit, library building method, and method and system for detecting variation of object region |
CN107312822A (en) * | 2016-04-26 | 2017-11-03 | 厦门飞朔生物技术有限公司 | A kind of construction method in oncogene variation library detected for high-flux sequence and its application |
CN109994155B (en) * | 2019-03-29 | 2021-08-20 | 北京市商汤科技开发有限公司 | Gene variation identification method, device and storage medium |
CN114464252B (en) * | 2022-01-26 | 2023-06-27 | 深圳吉因加医学检验实验室 | Method and device for detecting structural variation |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101967476A (en) * | 2010-09-21 | 2011-02-09 | 深圳华大基因科技有限公司 | Joint connection-based deoxyribonucleic acid (DNA) polymerase chain reaction (PCR)-free tag library construction method |
CN102409048A (en) * | 2010-09-21 | 2012-04-11 | 深圳华大基因科技有限公司 | DNA index library building method based on high throughput sequencing |
CN103806111A (en) * | 2012-11-15 | 2014-05-21 | 深圳华大基因科技有限公司 | Construction method and application of high-throughout sequencing library |
WO2014145078A1 (en) * | 2013-03-15 | 2014-09-18 | Verinata Health, Inc. | Generating cell-free dna libraries directly from blood |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104264231B (en) * | 2014-09-30 | 2017-04-19 | 天津华大基因科技有限公司 | Method for constructing sequencing library and application of sequencing library |
CN104293938B (en) * | 2014-09-30 | 2017-11-03 | 天津华大基因科技有限公司 | Build the method and its application of sequencing library |
CN104294371B (en) * | 2014-09-30 | 2017-07-04 | 天津华大基因科技有限公司 | Build method and its application of sequencing library |
Also Published As
Publication number | Publication date |
---|---|
CN104293940A (en) | 2015-01-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104293940B (en) | Build the method and its application of sequencing library | |
CN104264231B (en) | Method for constructing sequencing library and application of sequencing library | |
US11667959B2 (en) | Systems and methods to detect rare mutations and copy number variation | |
JP7119014B2 (en) | Systems and methods for detecting rare mutations and copy number variations | |
CN104293938B (en) | Method for constructing sequencing library and application thereof | |
CN107475375B (en) | DNA probe library, detection method and kit for hybridization with microsatellite loci related to microsatellite instability | |
AU2016293025A1 (en) | System and methodology for the analysis of genomic data obtained from a subject | |
CN104294371B (en) | Method for constructing sequencing library and application thereof | |
CN108949941A (en) | Low-frequency mutation detection method, kit and device | |
CN105518151A (en) | Identification and use of circulating nucleic acid tumor markers | |
CN105132407B (en) | Method for enrichment sequencing of low-frequency mutations in exfoliated-cell DNA | |
CN104293941B (en) | Method for constructing sequencing library and application of sequencing library | |
US12031186B2 (en) | Homologous recombination repair deficiency detection | |
CN110093417B (en) | Method for detecting tumor single cell somatic mutation | |
CN108229103A (en) | Method and device for processing circulating tumor DNA repetitive sequences | |
CN112639983A (en) | Microsatellite instability detection | |
CN105925664A (en) | Method and system for determining nucleic acid sequence | |
CN111511930A (en) | Genetic modulation of immune responses through chromosomal interactions | |
CN106480078A (en) | Gastric cancer peritoneal metastasis markers and application thereof | |
CN107760783A (en) | Gastric cancer peritoneal metastasis prediction model based on 108 genes and application thereof | |
CN108359723B (en) | Method for reducing deep sequencing errors | |
CN105950709A (en) | Kit, library building method, and method and system for detecting variation of object region | |
US20200095641A1 (en) | Means and methods for anti-vegf therapy | |
KR20220074756A (en) | Method for tracking the generation order of the generated strands by linking information of the strands generated during the PCR process to create a cluster | |
CN117965734B (en) | Gene marker for detecting hard fibroid, kit, detection method and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||