CN110111839A - Method for accurately quantifying the number of mutation-supporting reads in a tumor standard, and application thereof - Google Patents
- Publication number
- CN110111839A (application CN201810101797.2A)
- Authority
- CN
- China
- Prior art keywords
- mutation
- reads
- standard items
- reference sequences
- tumour
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
Abstract
The invention discloses a method for accurately quantifying the number of mutation-supporting reads in a tumor standard. The method comprises the following steps: assembling a reference sequence according to the mutation information of the tumor standard; using alignment software to perform indexing of the reference sequence, alignment, filtering, sorting, and deduplication in turn to obtain target reads; comparing the target reads with the reference sequence, where the reads that align are mutation-supporting reads; and counting the mutation-supporting reads. Experiments show that the method can be used to assess the ability of capture probes to capture mutation-supporting reads, and that its count can be compared with the number of mutation-supporting reads identified by variant detection software to assess the performance of that software. The invention has significant application value.
Description
Technical field
The invention belongs to the field of biotechnology, and specifically relates to a method for accurately quantifying the number of mutation-supporting reads in a tumor standard and to its application.
Background art
The advent of TKI drugs has significantly improved the 5-year survival rate of non-small cell lung cancer. TKI drugs are tied to the mutation status of genes such as EGFR and ALK, so EGFR L858R, EX19Del, ALK fusions, and similar mutations need to be tested routinely in the clinic. When high-throughput sequencing is used to detect tumor genes, however, information supporting the mutation and information supporting the normal sequence are mixed together in the sequencing data, which affects mutation detection.
Two main factors affect mutation detection. The first is the capture stage: capture probe sequences are designed on the basis of hg19 and target the normal sequence, whereas mutation-supporting reads carry the mutation and therefore differ from the normal sequence to some degree. The size of the difference depends on the complexity of the mutation (for example SNV, INDEL, or complex INDEL). The more complex the mutation, the worse the capture probe captures the mutated sequence, so the sequencing data contain fewer mutation-supporting reads, which in turn affects detection.
The second is the variant detection software. For SNVs and INDELs, a representative variant caller is GATK, which reads BAM data aligned with bwa, identifies SNVs by counting the A, T, C, and G bases at each position, and identifies INDELs by counting the I and D operations in the CIGAR string. The identification of mutation-supporting reads therefore depends heavily on the performance of the alignment software. SNVs and short INDELs involve only single-base changes and have little effect on alignment, so the recognition rate is close to the true value. For longer INDELs such as EGFR EX19DEL, however, a mutation located at the end of a read is often misresolved, for example split into several spaced SNVs or soft-clipped. This is particularly true for mutations in which a deletion is followed by an insertion, where the inserted bases can shift around the deletion site. The alignment result then does not reflect all mutation-supporting reads, and GATK loses that information in subsequent modeling, which affects detection.
For fusions, a representative detection tool is SEEKSV, which identifies mutation-supporting reads from soft-clipped reads and abnormally aligned paired-end (PE) reads. In liquid biopsy the main peak of DNA fragment length is at 170 bp, so PE information is insufficient and only soft-clip information remains. Soft-clip information depends entirely on the performance of the alignment software: when the breakpoint lies at the end of a read, or when the sequence at the breakpoint has homologous regions in the genome, the read cannot be identified, which affects detection.
To determine which step affects mutation detection, the usual approach is to send the sequencing library for third-party verification. This approach has many drawbacks: it is time-consuming and costly, and its accuracy is limited by the precision of the third-party verification method. As for evaluating variant detection software, one can only judge at a coarse level whether a mutation was detected; this does not show how well the software recognizes mutation-supporting reads for mutations of different structural complexity.
Summary of the invention
The technical problem to be solved by the invention is how to assess the ability of capture probes to capture mutation-supporting reads and the performance of variant detection software.
To solve the above technical problem, the invention first provides a method for accurately quantifying the number of mutation-supporting reads in a tumor standard, which may comprise the following steps:
(1) assembling a reference sequence according to the mutation information of the tumor standard;
(2) after completing step (1), using alignment software to perform indexing of the reference sequence, alignment, filtering, sorting, and deduplication in turn to obtain target reads;
(3) after completing step (2), comparing the target reads with the reference sequence, where the reads that align are mutation-supporting reads, and counting the mutation-supporting reads.
In step (1), the tumor standard may be a1), a2), a3), or a4): a1) a tumor standard with an ALK_EML4 fusion mutation; a2) fusion cell line standards at different mixing frequencies; a3) the H2228 fusion cell line; a4) tumor standards with EGFR EX19 INDEL at different mixing frequencies.
The fusion cell line standards at different mixing frequencies may specifically be the fusion cell line standards (H2228) at different mixing frequencies produced by Nanjing Kebai Biotechnology Co., Ltd.
The tumor standard with an ALK_EML4 fusion mutation, the H2228 fusion cell line, and the tumor standards with EGFR EX19 INDEL at different mixing frequencies may all be products of Nanjing Kebai Biotechnology Co., Ltd.
In step (1), assembling the reference sequence may be assembling a reference sequence for the Fusion mutation type and/or for the SNV mutation type and/or for the INDEL mutation type.
The reference sequence for the Fusion mutation type may be assembled as follows: according to the breakpoint information of the mutation site of the tumor standard and the biologically meaningful breakpoint direction, and based on the human reference genome, extend 180-220 bp on each side along the respective breakpoint directions and assemble.
The reference sequence for the SNV mutation type or for the INDEL mutation type may be assembled as follows: based on the human reference genome, replace the sequence at the mutation site of the tumor standard with the mutated sequence, then extend 180-220 bp on each side based on the human reference genome and assemble.
Above, "extend 180-220 bp" may specifically be extending 200 bp.
In step (2), the alignment may consist of aligning the raw off-machine data, or reads that have already been aligned once to the human reference genome, against the reference sequence to obtain precisely aligned reads.
In step (2), the filtering may consist of removing, from the precisely aligned reads, reads that fail to align or whose mapping quality is less than 30.
In step (2), the sorting may consist of sorting the reads that pass filtering by chromosome number and position on the chromosome.
Specifically, "sorting the reads that pass filtering by chromosome number and position on the chromosome" may mean sorting them in ascending order of chromosome number and position on the chromosome.
The sorting may be carried out with sorting software, which may specifically be Samtools.
In step (2), the deduplication may consist of removing PCR duplicate fragments from the sorted reads.
The deduplication may be carried out with deduplication software.
In any of the above methods, the alignment software may be tmap or bwa.
When tmap is used for alignment, the deduplication software may specifically be BamDuplicates (a product of ThermoFisher).
When bwa is used for alignment, the deduplication software may specifically be picard.
The application of any of the above methods also falls within the scope of protection of the invention. The application may be b1), b2), b3), or b4):
b1) analyzing whether it is the capture probe or the variant detection software that affects detection of a tumor gene mutation;
b2) assessing the ability of the capture probe to capture mutation-supporting reads in tumor gene mutation detection;
b3) assessing the performance of variant detection software;
b4) analyzing whether a tumor standard is positive, weakly positive, or negative.
In the above applications, the capture probe may be the reference sequence.
The invention also protects a method for judging whether a tumor standard under test is positive, weakly positive, or negative, which may comprise the following steps: accurately quantify the number of mutation-supporting reads by any of the above methods, then judge as follows: if the number of mutation-supporting reads is 3 or more, the tumor standard under test is positive; if the number is 1 or 2, it is weakly positive; if the number is 0, it is negative.
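The decision rule above can be written out directly. The following is a minimal Python sketch: the function name `classify_standard` is hypothetical and not part of the patent, and only the thresholds (3 or more, 1 or 2, 0) are taken from the text.

```python
def classify_standard(supporting_reads: int) -> str:
    """Map a mutation-supporting read count to the call described in the text.

    Thresholds follow the text: >= 3 positive, 1 or 2 weakly positive, 0 negative.
    The function name and the returned labels are illustrative assumptions.
    """
    if supporting_reads < 0:
        raise ValueError("read count cannot be negative")
    if supporting_reads >= 3:
        return "positive"
    if supporting_reads >= 1:
        return "weakly positive"
    return "negative"
```

For instance, a count of 7 falls in the first branch and a count of 2 in the second.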
Above, the variant detection software may be TVC, VarScan, GATK, or LOD.
Above, any human reference genome mentioned may specifically be the human reference genome hg19.
Experiments show that by accurately quantifying the number of mutation-supporting reads, converting it to a mutation frequency, and comparing that with the theoretical mutation frequency of the standard, the ability of the experimental stage to capture mutation-supporting reads can be assessed. By comparing the count with the number of mutation-supporting reads identified by variant detection software, the performance of that software can be assessed. When variant detection software fails to detect the corresponding mutation, the method provided by the invention can determine whether the cause is that the experimental stage did not capture the corresponding mutation-supporting reads or that the detection accuracy of the software is insufficient, that is, which step affects mutation detection, and can thus guide optimization of the development workflow. The invention has significant application value.
Brief description of the drawings
Fig. 1 shows the possible combinations of fusion breakpoint directions.
Fig. 2 shows the ref.fa file of item 1 (1) in Part II of Example 1.
Fig. 3 shows the ref.fa file of item 1 (2) in Part II of Example 1.
Fig. 4 shows the experimental results of item 7 in Part II of Example 1.
Fig. 5 shows the ref.fa file of step 2 in Example 4.
Fig. 6 shows the mutation-supporting read count of step 2 in Example 4.
Detailed description of the embodiments
The following examples are provided for a better understanding of the invention and do not limit it. Unless otherwise specified, the experimental methods in the following examples are conventional methods, and the test materials used were purchased from conventional biochemical reagent suppliers. The quantitative tests in the following examples were each performed in triplicate, and the results were averaged.
Fusion breakpoint direction: the biologically meaningful fusion direction is the upstream part of the promoter gene (for example the EML4 gene) joined to the downstream part of the tumor gene (for example the ALK gene).
The tumor standard provides the specific genes involved in the fusion and their respective breakpoints. The biologically meaningful breakpoint direction depends on the strand (positive or negative) of the two fused genes; the possible combinations of fusion breakpoint directions are shown in Fig. 1 (+ denotes a positive-strand gene, - denotes a negative-strand gene).
The programs used in the following examples must all be run in a Linux environment. The alignment software tmap is available at: https://github.com/iontorrent/TS/tree/master/Analysis/TMAP.
Example 1. Method for accurately quantifying the number of mutation-supporting reads in a tumor standard
I. Method for accurately quantifying the number of mutation-supporting reads in a tumor standard
The basic principle of the method is to determine the number of mutation-supporting reads precisely by assembly and alignment, based on the specific mutation information of the tumor standard. The specific steps are as follows:
1. Assemble the reference sequence
Assemble a reference sequence in fasta format according to the mutation site of the tumor standard. This mainly involves reference sequences for three mutation types, SNV, INDEL, and Fusion, of which Fusion is more complex than SNV and INDEL.
(1) Assembly method for Fusion
According to the breakpoint information of the mutation site of the tumor standard and the biologically meaningful breakpoint direction, and based on the human reference genome hg19, extend 200 bp on each side along the respective breakpoint directions and assemble to obtain the ref.fa file.
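The fusion assembly step can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the genome is held as a hypothetical dict of contig name to sequence string, breakpoint positions are 1-based, the breakpoint base itself is counted as part of the flank, and the names `revcomp`, `flank_at`, and `fusion_reference` are invented for this example.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def flank_at(genome, chrom, pos, strand, role, flank=200):
    """Return `flank` bases next to a breakpoint, oriented 5'->3' along the fusion.

    pos is 1-based and the breakpoint base is included (an assumption).
    role is "5p" for the partner that ends at the junction and "3p" for the
    partner that starts at it.
    """
    seq = genome[chrom]
    # On the + strand the 5' partner lies left of its breakpoint and the
    # 3' partner lies right of it; on the - strand this is mirrored.
    take_left = (role == "5p") == (strand == "+")
    if take_left:
        start, end = pos - flank, pos
    else:
        start, end = pos - 1, pos - 1 + flank
    if start < 0 or end > len(seq):
        raise ValueError("flank runs off the contig")
    piece = seq[start:end]
    return piece if strand == "+" else revcomp(piece)


def fusion_reference(genome, five_prime, three_prime, flank=200):
    """Join the two flanks; each partner is a (chrom, pos, strand) tuple."""
    return (flank_at(genome, *five_prime, "5p", flank)
            + flank_at(genome, *three_prime, "3p", flank))


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one record for a ref.fa-style file."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"
```

With `flank=200` the junction falls between positions 200 and 201 of the assembled sequence. To cover a reciprocal orientation as well, the same function could be called a second time with the two partners swapped.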
(2) Assembly method for SNV
Based on the human reference genome hg19, replace the sequence at the mutation site of the tumor standard with the mutated sequence, then extend 200 bp on each side based on the human reference genome hg19 and assemble to obtain the ref.fa file.
(3) Assembly method for INDEL
Same as (2).
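The SNV and INDEL assembly described in (2) and (3) amounts to substituting the mutated allele and adding flanks. A minimal Python sketch follows; the function name `mutated_reference` and its parameters are assumptions made for illustration, and positions are taken to be 1-based.

```python
def mutated_reference(chrom_seq: str, start: int, ref_allele: str,
                      alt_allele: str, flank: int = 200) -> str:
    """Build flank + mutated allele + flank from one chromosome sequence.

    start is the 1-based position of the first base of ref_allele.
    An SNV has len(ref_allele) == len(alt_allele) == 1; a deletion has a
    shorter (possibly empty) alt_allele; an insertion has a longer one.
    """
    i = start - 1
    if chrom_seq[i:i + len(ref_allele)] != ref_allele:
        raise ValueError("ref_allele does not match the reference at start")
    left = chrom_seq[max(0, i - flank):i]
    after = i + len(ref_allele)
    right = chrom_seq[after:after + flank]
    return left + alt_allele + right
```

The check against `ref_allele` guards against off-by-one coordinate errors, which are easy to make when mixing 0-based and 1-based conventions.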
2. Index
After completing step 1, use the alignment software tmap to index the ref.fa file: tmap index -f ref.fa.
3. Align
After completing step 2, use the alignment software tmap to align the raw off-machine data (fastq file or unaligned file), or a bam file that has already been aligned once to the human reference genome hg19, against ref.fa to obtain the precise alignment result tmap.bam.
4. Filter
After completing step 3, filter out of tmap.bam the reads that fail to align, or that align but with a mapping quality less than 30, to obtain tmap.filter.bam.
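The filter in step 4 can be illustrated on SAM-format text, which is the plain-text equivalent of BAM. The sketch below is an assumption about how such a filter might look, not the tool used in the patent: it relies only on the SAM convention that column 2 is the FLAG (bit 0x4 marks an unmapped read) and column 5 is the mapping quality, and the function names are hypothetical.

```python
def keep_read(sam_line: str, min_mapq: int = 30) -> bool:
    """Return True if the read is mapped and its mapping quality is >= min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # SAM column 2: bitwise FLAG
    mapq = int(fields[4])   # SAM column 5: mapping quality
    unmapped = bool(flag & 0x4)
    return (not unmapped) and mapq >= min_mapq


def filter_sam(lines, min_mapq: int = 30):
    """Yield header lines unchanged and only those alignments that pass keep_read."""
    for line in lines:
        if line.startswith("@") or keep_read(line, min_mapq):
            yield line
```

Reads with a mapping quality of exactly 30 are kept, since the text removes only values less than 30.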
5. Sort
After completing step 4, sort tmap.filter.bam with samtools (in ascending order of chromosome number and position on the chromosome).
6. Deduplicate
After completing step 5, use BamDuplicates (a product of ThermoFisher) to remove PCR duplicate fragments from the sorted tmap.filter.bam, obtaining tmap.filter.rmdup.bam.
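Steps 5 and 6 can be pictured with a small Python sketch. It is a deliberately simplified stand-in, not the algorithm of samtools, BamDuplicates, or picard: alignments are sorted by contig order and start position, and reads sharing the same contig, start, and strand are treated as PCR duplicates, keeping the one with the highest mapping quality. The `Aln` record and both function names are hypothetical.

```python
from typing import NamedTuple


class Aln(NamedTuple):
    name: str
    contig: str
    start: int      # 1-based leftmost aligned position
    reverse: bool   # True if aligned to the reverse strand
    mapq: int


def sort_alignments(alns, contig_order):
    """Sort by the order of contigs in the reference, then by ascending position."""
    rank = {contig: i for i, contig in enumerate(contig_order)}
    return sorted(alns, key=lambda a: (rank[a.contig], a.start))


def drop_duplicates(sorted_alns):
    """Keep one alignment per (contig, start, strand): the highest-MAPQ one.

    Simplifying assumption: real duplicate-marking tools use more evidence
    (for example clipping and mate position) than this key does.
    """
    best = {}
    for aln in sorted_alns:
        key = (aln.contig, aln.start, aln.reverse)
        if key not in best or aln.mapq > best[key].mapq:
            best[key] = aln
    # dict preserves first-insertion order, so the output stays sorted
    return list(best.values())
```

Sorting first matters because duplicate candidates then sit next to each other, which is also why the text performs deduplication on the sorted file.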
7. Count mutation-supporting reads
After completing step 6, count the mutation-supporting reads in tmap.filter.rmdup.bam that align to the reference sequence ref.fa. For a fusion mutation, for example, an aligned mutation-supporting read must span the breakpoint (position 200) by 5 bp or more and have a whole-read edit distance of 5 or less.
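The fusion criterion in step 7 can be expressed as a small predicate. The Python sketch below makes two assumptions that the text does not spell out: "spanning the breakpoint by 5 bp or more" is read as at least 5 aligned bases on each side of the junction between positions 200 and 201, and the edit distance is taken from a per-read value such as the NM tag of an alignment record. The function names are hypothetical.

```python
def supports_fusion(ref_start: int, ref_end: int, edit_distance: int,
                    breakpoint: int = 200, min_overhang: int = 5,
                    max_edits: int = 5) -> bool:
    """Decide whether one alignment counts as a mutation-supporting read.

    ref_start and ref_end are 1-based inclusive coordinates on the assembled
    reference; breakpoint is the last base of the left flank.
    """
    left = breakpoint - ref_start + 1   # aligned bases at or before the breakpoint
    right = ref_end - breakpoint        # aligned bases after the breakpoint
    return (left >= min_overhang and right >= min_overhang
            and edit_distance <= max_edits)


def count_supporting(alignments) -> int:
    """Count alignments, given as (ref_start, ref_end, edit_distance) tuples."""
    return sum(1 for start, end, nm in alignments
               if supports_fusion(start, end, nm))
```

A read aligned to positions 197-300 has only 4 bases on the left of the junction, so it would not be counted under this reading of the criterion.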
II. Accurately quantifying, by the method of Part I, the mutation-supporting reads of a tumor standard with an ALK_EML4 fusion mutation
1. Assemble the reference sequence
Assemble a reference sequence in fasta format according to the mutation site of the tumor standard with an ALK_EML4 fusion mutation (a product of Nanjing Kebai Biotechnology Co., Ltd.). This mainly involves reference sequences for three mutation types, SNV, INDEL, and Fusion, of which Fusion is more complex than SNV and INDEL.
(1) Assembly method for Fusion
The breakpoints of the two genes in the tumor standard with an ALK_EML4 fusion mutation are ALK-chr2:29448092 and EML4-chr2:42493956. ALK is a negative-strand gene and EML4 is a positive-strand gene. The biologically meaningful breakpoint direction is obtained from the strands of the two genes. In addition to that direction, however, DNA sequencing data sometimes also contain a signal in which the upstream part of the tumor gene (ALK) is joined to the downstream part of the promoter gene (EML4), so both breakpoint directions must be considered when assembling the reference sequence. Based on the human reference genome hg19, extend 200 bp on each side at the breakpoints of the ALK and EML4 genes along the respective breakpoint directions, and assemble a ref.fa file containing both the sequence in which the promoter gene upstream part is joined to the tumor gene downstream part and the sequence in which the tumor gene upstream part is joined to the promoter gene downstream part. The ref.fa file is shown in Fig. 2.
(2) Assembly method for SNV
Based on the human reference genome hg19, replace the sequence at the mutation site of the tumor standard with an ALK_EML4 fusion mutation with the mutated sequence, then extend 200 bp on each side based on the human reference genome hg19 and assemble to obtain the ref.fa file. The ref.fa file is shown in Fig. 3.
(3) Assembly method for INDEL
Same as (2).
2. Index
Same as item 2 in Part I.
3. Align
Same as item 3 in Part I.
4. Filter
Same as item 4 in Part I.
5. Sort
Same as item 5 in Part I.
6. Deduplicate
Same as item 6 in Part I.
7. Count mutation-supporting reads
Same as item 7 in Part I.
The experimental results are shown in Fig. 4.
Example 2. Traceability of the method for accurately quantifying mutation-supporting reads provided in Example 1
Sample 1: fusion cell line standards (H2228) at different mixing frequencies (products of Nanjing Kebai Biotechnology Co., Ltd.); mutation genotype ALK_EML4.
Sample 2: fusion cell line standards (H2228) at different mixing frequencies (products of Nanjing Kebai Biotechnology Co., Ltd.); mutation genotype ALK_EML4.
1. Take sample 1 or sample 2 and test it by NGS (high-throughput sequencing).
The NGS results for sample 1 are shown in column 4 of Table 1, and those for sample 2 in column 4 of Table 2.
2. Take sample 1 or sample 2 and quantify the mutation-supporting reads by the method provided in Example 1. The specific steps are as follows:
(1) Assemble the reference sequence
Same as item 1 in Part I of Example 1.
(2) Index
Same as item 2 in Part I of Example 1.
(3) Align
Same as item 3 in Part I of Example 1.
(4) Filter
Same as item 4 in Part I of Example 1.
(5) Sort
Same as item 5 in Part I of Example 1.
(6) Deduplicate
Same as item 6 in Part I of Example 1.
(7) Count mutation-supporting reads
Same as item 7 in Part I of Example 1.
The results for sample 1 are shown in column 5 of Table 1, and those for sample 2 in column 5 of Table 2.
Table 1
Table 2
The results in Table 1 are as follows: sample 1 has very few mutation-supporting reads, only 0 or 1. The results in Table 2 are as follows: sample 2 has many mutation-supporting reads, far more than sample 1, and the number increases with the mixing frequency.
The results obtained by NGS are fully consistent with those obtained by the method provided in Example 1. It follows that the reason NGS (high-throughput sequencing) failed to detect the mutation in sample 1 is that sample 1 itself carries no mutation information.
Example 3. Application of the method for accurately quantifying mutation-supporting reads provided in Example 1 in assessing the performance of different variant detection software
Sample: tumor standards with EGFR EX19 INDEL at different mixing frequencies (products of Nanjing Kebai Biotechnology Co., Ltd.).
1. Take the sample, test it with each of the variant detection programs TVC, VarScan, GATK, and LOD, and count the mutation-supporting reads identified by each program.
The results are shown in columns 3 to 6 of Table 3.
2. Take the sample and quantify the mutation-supporting reads by the method provided in Example 1.
The results are shown in column 7 of Table 3.
Table 3
The results show that different variant detection programs differ in how many mutation-supporting reads they identify. Comparison with the accurately quantified number of mutation-supporting reads can therefore be used to assess the performance of different variant detection software, and can also serve as a basis for selecting the best detection software at the research or product development stage.
The above results also show that by accurately quantifying the number of mutation-supporting reads, converting it to a mutation frequency, and comparing that with the theoretical mutation frequency of the standard, the ability of the experimental stage to capture mutation-supporting reads can be assessed; by comparing the count with the number of mutation-supporting reads identified by variant detection software, the performance of that software can be assessed. When variant detection software fails to detect the corresponding mutation, the method provided by the invention can determine whether the cause is that the experimental stage did not capture the corresponding mutation-supporting reads or that the detection accuracy of the software is insufficient, and can thus guide optimization of the development workflow.
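The two comparisons described above can be sketched as simple ratios. The text does not give formulas, so both definitions below are assumptions made for illustration: the mutation frequency is taken as mutation-supporting reads divided by the total reads covering the site, and the recognition rate of a variant detection program as the reads it identified divided by the accurately quantified count. The function names are hypothetical.

```python
def mutation_frequency(supporting_reads: int, total_reads: int) -> float:
    """Fraction of reads at the site that support the mutation (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return supporting_reads / total_reads


def recognition_rate(reads_found_by_caller: int, reads_quantified: int):
    """Share of the quantified mutation-supporting reads that a program identified.

    Returns None when no supporting reads were quantified, because the ratio
    is undefined in that case.
    """
    if reads_quantified == 0:
        return None
    return reads_found_by_caller / reads_quantified
```

Under these assumptions, a measured frequency well below the theoretical frequency of the standard would point toward the capture stage, while a low recognition rate with an adequate measured frequency would point toward the variant detection software.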
Example 4. Verifiability of the method for accurately quantifying mutation-supporting reads provided in Example 1
Sample: H2228 fusion cell line (a product of Nanjing Kebai Biotechnology Co., Ltd.); its breakpoints are ALK-chr2:29448093 and EML4-chr2:42493957.
1. Take the sample, test it with variant detection software (the in-house software that accompanies a Guangzhou lung cancer product), and count the mutation-supporting reads it identifies.
The result shows that the sample is weakly positive.
2. Take the sample, quantify the mutation-supporting reads by the method provided in Example 1, then judge as follows: if the number of mutation-supporting reads is 3 or more, the sample is positive; if the number is 1 or 2, the sample is weakly positive; if the number is 0, the sample is negative.
The ref.fa file is shown in Fig. 5. The mutation-supporting read count is shown in Fig. 6.
The result shows that the sample has 7 mutation-supporting reads, so the sample is judged positive. Different variant detection programs thus recognize fewer mutation-supporting reads than are truly present. When variant detection software reports a weakly positive result, the cause may be that its ability to recognize mutation-supporting reads is limited: it recognizes too few of them to pass the positive threshold of the variant detection algorithm. Detection with traditional variant detection software therefore has certain limitations. The method provided by the invention accurately counts the true mutation-supporting reads and thereby verifies the weakly positive result at the bioinformatics level.
Claims (10)
1. A method for accurately quantifying the number of mutation-supporting reads in a tumor standard, comprising the following steps:
(1) assembling a reference sequence according to the mutation information of the tumor standard;
(2) after completing step (1), using alignment software to perform indexing of the reference sequence, alignment, filtering, sorting, and deduplication in turn to obtain target reads;
(3) after completing step (2), comparing the target reads with the reference sequence, where the reads that align are mutation-supporting reads, and counting the mutation-supporting reads.
2. The method according to claim 1, characterized in that: in step (1), the tumor standard is a1) or a2) or a3)
or a4): a1) a tumor standard carrying an ALK_EML4 fusion mutation; a2) fusion cell line standards at different
mixing frequencies; a3) the H2228 fusion cell line; a4) EGFR EX19INDEL tumor standards at different mixing
frequencies.
3. The method according to claim 1 or 2, characterized in that: in step (1), assembling the reference sequence is
assembling a reference sequence for the Fusion mutation type and/or assembling a reference sequence for the SNV
mutation type and/or assembling a reference sequence for the INDEL mutation type.
4. The method according to claim 1, characterized in that: in step (2), the aligning is aligning the raw
sequencing data, or reads that have already been aligned once against the human reference genome, to the
reference sequence, to obtain accurately aligned reads.
5. The method according to claim 1, characterized in that: in step (2), the filtering is filtering out, from the
accurately aligned reads, the reads that are not aligned or whose alignment quality value is less than 30.
6. The method according to claim 1, characterized in that: in step (2), the sorting is sorting the filtered
reads by chromosome number and by position on the chromosome.
7. The method according to claim 1, characterized in that: in step (2), the de-duplication is removing PCR
duplicate fragments from the sorted reads.
8. The method according to claim 1, characterized in that: the alignment software is the alignment software tmap
or the alignment software bwa.
9. Use of the method according to any one of claims 1 to 8, the use being b1) or b2) or b3) or b4):
b1) analyzing whether it is the capture probe or the variant detection software that affects the detection of tumor gene mutations;
b2) assessing the ability of the capture probe to capture mutation-supporting reads during tumor gene mutation detection;
b3) assessing the performance of variant detection software;
b4) analyzing whether a tumor standard is positive, weakly positive or negative.
10. A method for judging whether a tumor standard to be tested is positive, weakly positive or negative,
comprising the following steps: accurately quantifying the number of mutation-supporting reads by the method
according to any one of claims 1 to 8, and then judging as follows: if the number of mutation-supporting reads is
3 or more, the tumor standard to be tested is positive; if the number is 1 or 2, the tumor standard to be tested
is weakly positive; if the number is 0, the tumor standard to be tested is negative.
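Steps (2) and (3) of claim 1, as narrowed by claims 5 to 7, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the patent: the `Read` record and all function and field names are hypothetical, alignment itself is assumed to have been done already by the alignment software (tmap or bwa), and duplicates are approximated as reads sharing the same reference, start position and strand.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Read:
    """Simplified alignment record; every field name is an assumption of this sketch."""
    name: str
    ref: str              # name of the assembled reference sequence the read aligned to
    pos: int              # leftmost aligned position on that reference
    mapq: int             # alignment (mapping) quality reported by the aligner
    reverse: bool         # True if the read aligned to the reverse strand
    mapped: bool = True   # False for reads that did not align


def filter_reads(reads: Iterable[Read], min_mapq: int = 30) -> List[Read]:
    """Drop unaligned reads and reads whose quality value is below min_mapq (claim 5)."""
    return [r for r in reads if r.mapped and r.mapq >= min_mapq]


def sort_reads(reads: Iterable[Read]) -> List[Read]:
    """Order reads by reference name, then by position (claim 6)."""
    return sorted(reads, key=lambda r: (r.ref, r.pos))


def remove_duplicates(sorted_reads: Iterable[Read]) -> List[Read]:
    """Keep the first read per (ref, pos, strand); a simplified stand-in for PCR duplicate removal (claim 7)."""
    seen = set()
    kept = []
    for r in sorted_reads:
        key = (r.ref, r.pos, r.reverse)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def count_supporting_reads(reads: Iterable[Read], mutant_refs: Set[str],
                           min_mapq: int = 30) -> int:
    """Filter, sort and de-duplicate, then count reads aligned to an assembled mutant reference."""
    target_reads = remove_duplicates(sort_reads(filter_reads(reads, min_mapq)))
    return sum(1 for r in target_reads if r.ref in mutant_refs)
```

In this sketch a read counts as mutation-supporting when it survives filtering and de-duplication and is aligned to one of the assembled mutant reference sequences named in `mutant_refs`. Production duplicate removal is usually more careful about which copy to keep (for example, the highest-quality one); keeping the first copy is a simplification.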
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810101797.2A CN110111839A (en) | 2018-02-01 | 2018-02-01 | Method for accurately quantifying the number of mutation-supporting reads in a tumor standard, and application thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110111839A true CN110111839A (en) | 2019-08-09 |
Family
ID=67483643
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810101797.2A Pending CN110111839A (en) | 2018-02-01 | 2018-02-01 | Method for accurately quantifying the number of mutation-supporting reads in a tumor standard, and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110111839A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104762402A (en) * | 2015-04-21 | 2015-07-08 | 广州定康信息科技有限公司 | Method for rapidly detecting human genome single base mutation and micro-insertion deletion |
CN104928399A (en) * | 2015-04-24 | 2015-09-23 | 深圳华大基因科技有限公司 | Primer group, kit and use thereof in HPV whole-genome detection |
CN106909806A (en) * | 2015-12-22 | 2017-06-30 | 广州华大基因医学检验所有限公司 | The method and apparatus of fixed point detection variation |
CN107345253A (en) * | 2017-07-25 | 2017-11-14 | 臻和(北京)科技有限公司 | Lung cancer clinical medication genetic test standard items and its application |
CN107491666A (en) * | 2017-09-01 | 2017-12-19 | 深圳裕策生物科技有限公司 | Single sample somatic mutation loci detection method, device and storage medium in abnormal structure |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111180010A (en) * | 2019-12-27 | 2020-05-19 | 北京优迅医学检验实验室有限公司 | Tumor somatic mutation site detection method and device thereof |
CN111180010B (en) * | 2019-12-27 | 2023-07-11 | 北京优迅医学检验实验室有限公司 | Tumor somatic mutation site detection method and device |
CN114005489A (en) * | 2021-12-28 | 2022-02-01 | 成都齐碳科技有限公司 | Analysis method and device for detecting point mutation based on third-generation sequencing data |
CN117253546A (en) * | 2023-10-11 | 2023-12-19 | 北京博奥医学检验所有限公司 | Method, system and storable medium for reducing targeted second-generation sequencing background noise |
CN117253546B (en) * | 2023-10-11 | 2024-05-28 | 北京博奥医学检验所有限公司 | Method, system and storable medium for reducing targeted second-generation sequencing background noise |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190809 |