CN108268752A - A kind of chromosome abnormality detection device - Google Patents

A chromosome abnormality detection device

Info

Publication number
CN108268752A
Authority
CN
China
Prior art keywords
coverage
window
sample
cbs
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810047686.8A
Other languages
Chinese (zh)
Other versions
CN108268752B (en)
Inventor
糜庆丰
彭春方
张娟
赵宇
陈样宜
饶兴蔷
罗东红
黄铨飞
刘丽菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CapitalBio Genomics Co Ltd
Original Assignee
CapitalBio Genomics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CapitalBio Genomics Co Ltd filed Critical CapitalBio Genomics Co Ltd
Priority to CN201810047686.8A priority Critical patent/CN108268752B/en
Publication of CN108268752A publication Critical patent/CN108268752A/en
Application granted granted Critical
Publication of CN108268752B publication Critical patent/CN108268752B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioethics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a chromosome abnormality detection device. Existing chromosome abnormality analysis is based on a read-count statistical model, which can remove only duplicate reads that align to the same start position in the genome; it cannot remove reads that start at different positions but overlap one another. By introducing a sequence coverage statistical model, the device of the invention effectively removes the duplicate sequences, and their overlapping regions, that arise from single-cell whole-genome amplification bias. This markedly improves data uniformity, reduces noise, raises the detection rate for positive samples, and lowers the false-positive rate.

Description

A chromosome abnormality detection device
Technical field
The present invention relates to data processing technology, and in particular to a chromosome abnormality detection device.
Background art
In recent years, a growing number of patients have turned to assisted reproduction to achieve pregnancy. Extensive clinical experience shows that, during assisted reproduction, embryos from some high-risk couples are prone to repeated implantation failure or unexplained spontaneous abortion, and the overall live birth rate of in vitro fertilization (IVF) is below 30%. Studies have found that embryonic chromosome abnormality is the main cause of IVF failure. Performing preimplantation chromosome abnormality detection on embryos, and then selecting healthy embryos for implantation, can therefore significantly improve the pregnancy rate and live birth rate of IVF.
Preimplantation chromosome abnormality detection requires single-cell amplification of blastocyst trophectoderm cells or blastomeres so that the DNA reaches the input amount required by high-throughput sequencing platforms, that is, picogram-level DNA is brought up to the microgram level. The mainstream single-cell amplification methods currently fall into three classes by principle: PCR-based single-cell amplification (such as DOP-PCR)[1], multiple displacement amplification (MDA)[2], and multiple annealing and looping-based amplification cycles (MALBAC)[3]. Because all of these methods use tens of rounds of exponential amplification, the amplification bias at certain positions in the genome is strongly magnified, producing large numbers of duplicate reads. The uniformity of sequencing depth drops markedly, and the analysis ultimately yields many outliers and false-positive results. Removing the duplicate sequences caused by amplification bias is therefore essential for preimplantation chromosome abnormality detection based on single-cell amplification.
At present, chromosome abnormality analysis of embryos is based on read counts (reads number). The workflow is: align the sequencing reads to the reference genome; filter out reads that align to the same start position in the genome (duplicate reads); divide the reference genome into N fixed-length statistical windows and count the reads in each window; apply GC correction to the read counts; normalize the read counts and convert them into read ratios; and finally analyze the read ratios across the genome to decide whether the embryo under test carries a chromosome abnormality. In its duplicate-removal step, this workflow can remove only duplicate reads that align to the same start position; reads that start at different positions but overlap one another cannot be effectively removed. A more effective duplicate-removal method is therefore needed to improve the accuracy of chromosome abnormality detection based on single-cell whole-genome amplification.
References
[1] Telenius H, Carter NP, Bebb CE, et al. Degenerate oligonucleotide-primed PCR: general amplification of target DNA by a single degenerate primer [J]. Genomics, 1992, 13(3): 718-725.
[2] Dean FB, Nelson JR, Giesler TL, et al. Rapid amplification of plasmid and phage DNA using Phi 29 DNA polymerase and multiply-primed rolling circle amplification [J]. Genome Research, 2001, 11(6): 1095-1099.
[3] Zong C, Lu S, Chapman AR, et al. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell [J]. Science, 2012, 338(6114): 1622-1626.
[4] Olshen AB, Venkatraman ES, Lucito R, et al. Circular binary segmentation for the analysis of array-based DNA copy number data [J]. Biostatistics, 2004, 5(4): 557-572.
[5] Venkatraman ES, Olshen AB. A faster circular binary segmentation algorithm for the analysis of array CGH data [J]. Bioinformatics, 2007, 23(6): 657-663.
Summary of the invention
To solve the above technical problem, the object of the present invention is to provide a chromosome abnormality detection device.
The technical solution adopted by the present invention is:
A chromosome abnormality detection device, comprising:
Sequencing data acquisition unit: for obtaining the reads produced by high-throughput sequencing;
Alignment unit: for aligning the reads to the human genome reference sequence to obtain the position information and length information of each read;
Coverage calculation unit: for dividing the human genome reference sequence into a number of first windows; calculating the coverage of each first window from the position information and length information of the reads; performing Loess correction according to the coverage and GC content of the first windows; merging several consecutive first windows into a second window; and calculating the Loess-corrected coverage of each second window and its coverage ratio;
Candidate CNV identification unit: for identifying chromosome breakpoint positions with the circular binary segmentation (CBS) algorithm, calculating the CBS ratio between adjacent breakpoints, and identifying candidate CNV regions according to a CBS ratio threshold;
False-positive filtering unit: for calculating the significance level (P-value) of the CBS ratio of each candidate CNV region and filtering out false-positive regions according to the P-value, to obtain the CNV regions and karyotype result of the sample under test.
In particular, coverage = total number of bases covered in an interval / interval length; coverage ratio = coverage of the interval / coverage of all autosomes.
In particular, the CBS ratio is the mean of the coverage ratios of all second windows between adjacent breakpoints identified by the circular binary segmentation algorithm.
In the coverage calculation unit, the first window is a non-overlapping interval of 10-50 kb; preferably, the first window is a non-overlapping 20 kb interval.
In the coverage calculation unit, the second window length is 0.1-2 Mb; preferably, the second window length is selected from 100 kb, 500 kb, and 1 Mb.
Preferably, in the candidate CNV identification unit, the CBS ratio threshold is [1.4, 2.6]; a region whose CBS ratio falls outside this range is judged to be a candidate CNV region.
Preferably, in the false-positive filtering unit, calculating the P-value comprises:
forming a random sampling database from the results of normal reference samples; drawing from it at least 100,000 simulated CBS segments of the same length as the candidate CNV region to obtain the density distribution curve of the simulated CBS ratio values; and calculating the significance level (P-value) of the CBS ratio of the candidate CNV region.
Preferably, in the false-positive filtering unit, a candidate CNV region with P-value < 0.001 is judged to be a CNV region; otherwise it is filtered out as a false-positive region.
Further, the device also comprises a sequencing unit:
connected to the sequencing data acquisition unit, for performing high-throughput sequencing on a library constructed from the sample, where the sample includes a sample that has undergone single-cell amplification, a sample that has undergone PCR pre-amplification, or a sample without PCR pre-amplification.
Further, the device also comprises a filtering unit:
connected to the alignment unit, for rejecting, according to the alignment results, reads located at tandem repeat and transposon repeat positions, as well as reads that are low-quality, multiply mapped, or not fully matched to a chromosome.
The beneficial effects of the invention are as follows:
Existing chromosome abnormality detection and analysis are based on reading long counting statistics model, can only remove comparison to genome In identical initial position repetitive sequence, reads that is different for initial position but having overlapping between each other can not be removed;This Invention device can effectively remove unicellular whole genome amplification preference by calling sequence coverage (coverage) statistical model Property the repetitive sequence that brings and its overlapping region, significantly improve the homogeneity of data, and then reduce noise data, improve positive sample This recall rate and reduction false positive rate.
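The difference between the two statistical models can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: reads are represented as hypothetical half-open (start, end) intervals on one chromosome, and "bases covered" is assumed to mean reference positions covered by at least one read. The function names are not taken from the patent.

```python
from typing import List, Tuple

Read = Tuple[int, int]  # (start, end), half-open, on a single chromosome


def dedup_by_start(reads: List[Read]) -> List[Read]:
    """Keep one read per start position (the conventional duplicate filter)."""
    seen = set()
    kept = []
    for start, end in sorted(reads):
        if start not in seen:
            seen.add(start)
            kept.append((start, end))
    return kept


def covered_bases(reads: List[Read]) -> int:
    """Count reference bases covered by at least one read (overlaps merged)."""
    total = 0
    cur_start, cur_end = None, None
    for start, end in sorted(reads):
        if cur_end is None or start > cur_end:
            # A gap: close the current merged block and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent read: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Toy data: an over-amplified locus (four reads near 100) and one isolated read.
reads = [(100, 150), (100, 150), (110, 160), (120, 170), (500, 550)]
print(len(dedup_by_start(reads)))  # 4: shifted overlapping reads still count
print(covered_bases(reads))        # 120: the overlapping pile collapses to 70 bases
```

In this toy input the start-position filter removes only the exact duplicate, so the over-amplified locus still contributes three reads, whereas the coverage statistic counts each covered base once.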
Description of the drawings
Fig. 1 is a schematic flow chart of chromosome abnormality detection;
Fig. 2 shows the copy number distribution of the 24 chromosomes of sample T1 at 1 Mb resolution; panel A shows the result of the conventional read-count method, and panel B shows the result of the coverage-based method provided by the invention;
Fig. 3 shows the copy number distribution of the 24 chromosomes of sample T8 at 1 Mb resolution; panel A shows the result of the conventional read-count method, and panel B shows the result of the coverage-based method provided by the invention;
Fig. 4 shows the copy number distribution of the 24 chromosomes of sample T19 at 1 Mb resolution; panel A shows the result of the conventional read-count method, and panel B shows the result of the coverage-based method provided by the invention;
Fig. 5 shows the copy number distribution of the 24 chromosomes of sample T2 at 1 Mb resolution; panel A shows the result of the conventional read-count method, and panel B shows the result of the coverage-based method provided by the invention.
Detailed description of the embodiments
The idea of the present invention: for samples with low starting DNA content (such as single-cell samples), when exponential single-cell amplification raises the DNA amount from the picogram level to the microgram level, amplification bias is often strongly magnified and large numbers of duplicate reads are generated, so sample uniformity is poor. In its duplicate-removal step, the conventional read-count-based chromosome abnormality analysis can remove only duplicate reads that align to the same start position in the genome; reads that start at different positions but overlap one another cannot be effectively removed. As a result, the read counts obtained for genomic regions with amplification bias differ from those for regions without it, so that in the analysis result the sequence ratio of some regions is markedly higher (or lower) than normal, producing false positives. To keep single-cell amplification from affecting the detection result, the device of the invention detects chromosome abnormalities with a coverage-based statistical model, which effectively removes the overlapping regions between adjacent sequencing reads, reduces the false positives caused by single-cell amplification bias, and achieves high-accuracy chromosome abnormality detection. It follows from this inventive concept that the device is suitable not only for preimplantation screening of embryos using trace samples that require single-cell amplification, but also for chromosome abnormality detection in samples that require PCR pre-amplification, such as abortion tissue, and in normal-input samples that need no PCR pre-amplification. The device is a general-purpose chromosome abnormality detection device; it is particularly suited to samples with PCR amplification bias, where it shows a clearly better detection performance.
The chromosome abnormality detection device provided by the invention comprises:
Sequencing data acquisition unit: for obtaining the reads produced by high-throughput sequencing;
Alignment unit: for aligning the reads to the human genome reference sequence to obtain the position information and length information of each read;
Coverage calculation unit: for dividing the human genome reference sequence into a number of first windows; calculating the coverage of each first window from the position information and length information of the reads; performing Loess correction according to the coverage and GC content of the first windows; merging several consecutive first windows into a second window; and calculating the Loess-corrected coverage of each second window and its coverage ratio;
Candidate CNV identification unit: for identifying chromosome breakpoint positions with the circular binary segmentation (CBS) algorithm, calculating the CBS ratio between adjacent breakpoints, and identifying candidate CNV regions according to a CBS ratio threshold;
False-positive filtering unit: for calculating the significance level (P-value) of the CBS ratio of each candidate CNV region and filtering out false-positive regions according to the P-value, to obtain the CNV regions and karyotype result of the sample under test.
In particular, coverage = total number of bases covered in an interval / interval length; coverage ratio = coverage of the interval / coverage of all autosomes.
In particular, the CBS ratio is the mean of the coverage ratios of all second windows between adjacent breakpoints identified by the circular binary segmentation algorithm.
In the coverage calculation unit, the first window is a non-overlapping interval of 10-50 kb; preferably, the first window is a non-overlapping 20 kb interval.
In the coverage calculation unit, the second window length is 0.1-2 Mb; preferably, the second window length is selected from 100 kb, 500 kb, and 1 Mb.
Preferably, in the candidate CNV identification unit, the CBS ratio threshold is [1.4, 2.6]; a region whose CBS ratio falls outside this range is judged to be a candidate CNV region.
Preferably, in the false-positive filtering unit, calculating the P-value comprises:
forming a random sampling database from the results of normal reference samples; drawing from it at least 100,000 simulated CBS segments of the same length as the candidate CNV region to obtain the density distribution curve of the simulated CBS ratio values; and calculating the significance level (P-value) of the CBS ratio of the candidate CNV region.
Preferably, in the false-positive filtering unit, a candidate CNV region with P-value < 0.001 is judged to be a CNV region; otherwise it is filtered out as a false-positive region.
Further, the device also comprises a sequencing unit:
connected to the sequencing data acquisition unit, for performing high-throughput sequencing on a library constructed from the sample, where the sample includes a sample that has undergone single-cell amplification, a sample that has undergone PCR pre-amplification, or a sample without PCR pre-amplification.
Further, the device also comprises a filtering unit:
connected to the alignment unit, for rejecting, according to the alignment results, reads located at tandem repeat and transposon repeat positions, as well as reads that are low-quality, multiply mapped, or not fully matched to a chromosome.
The sequencing unit, sequencing data acquisition unit, alignment unit, filtering unit, coverage calculation unit, candidate CNV identification unit, and false-positive filtering unit described above may be program modules or hardware device modules.
The present invention is further explained below with reference to a specific embodiment; the scope of protection of the invention is not limited thereto.
Embodiment 1
The chromosome abnormality detection device provided by the invention is applied to chromosome abnormality detection based on single-cell amplification. The processing steps are as follows; the flow chart is shown in Fig. 1.
1. Obtaining whole-genome sequencing data
A batch of cell lines with known karyotypes was purchased from Coriell. A total of 25 samples, numbered T1 to T25, took part in this test: 2 negative samples; 3 sex chromosome abnormality samples; 7 autosomal aneuploidy samples; 1 sex chromosome microduplication or microdeletion sample; and 12 autosomal microduplication or microdeletion samples. The samples underwent single-cell whole-genome amplification, library construction, and high-throughput sequencing to obtain reads.
2. Alignment
The reads were aligned to the human genome reference sequence hg19, placing each read at its corresponding chromosomal position and yielding the alignment information for each read, including its position information, length information, and quality-control information.
3. Filtering
Based on the quality-control information in the alignment results, reads located at tandem repeat and transposon repeat positions were rejected, as were reads that were low-quality, multiply mapped, or not fully matched to a chromosome.
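The filtering step can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the `AlignedRead` record, its field names, the repeat annotation format, and the mapping-quality cutoff of 30 are all assumptions made for the example, since the text does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Region = Tuple[int, int]  # half-open [start, end)


@dataclass
class AlignedRead:
    # Hypothetical record; field names are illustrative, not from the patent.
    chrom: str
    start: int
    end: int
    mapq: int         # mapping quality reported by the aligner
    n_hits: int       # number of reported alignment locations
    full_match: bool  # True if the whole read aligned to the chromosome


def overlaps_any(read: AlignedRead, regions: List[Region]) -> bool:
    """True if the read overlaps any annotated repeat region."""
    return any(read.start < r_end and r_start < read.end
               for r_start, r_end in regions)


def keep_read(read: AlignedRead, repeats: Dict[str, List[Region]],
              min_mapq: int = 30) -> bool:
    """Apply the four rejection rules; min_mapq=30 is an assumed cutoff."""
    if read.mapq < min_mapq:      # low quality
        return False
    if read.n_hits != 1:          # multiply mapped
        return False
    if not read.full_match:       # not fully matched
        return False
    # Reject reads in tandem repeat or transposon repeat regions.
    return not overlaps_any(read, repeats.get(read.chrom, []))


repeats = {"chr1": [(1000, 2000)]}  # toy repeat annotation
reads = [
    AlignedRead("chr1", 100, 150, 60, 1, True),    # kept
    AlignedRead("chr1", 1990, 2040, 60, 1, True),  # overlaps a repeat
    AlignedRead("chr1", 300, 350, 5, 1, True),     # low mapping quality
    AlignedRead("chr2", 300, 350, 60, 3, True),    # multiply mapped
    AlignedRead("chr2", 500, 550, 60, 1, False),   # partial match
]
kept = [r for r in reads if keep_read(r, repeats)]
print(len(kept))  # 1
```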
4. Coverage calculation
The human genome reference sequence is divided into a number of first windows, each a non-overlapping 20 kb region. The coverage of each first window is calculated from the position information and length information of the filtered reads, and Loess correction for GC bias is performed according to the coverage and GC content of the first windows. Several consecutive first windows are then merged into second windows, each 1 Mb long, and the Loess-corrected coverage of each second window and its coverage ratio (CR) are calculated. Here, coverage = total number of bases covered in an interval / interval length, and coverage ratio = coverage of the interval / coverage of all autosomes.
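The window arithmetic of this step can be sketched as below. The sketch makes several assumptions that are not in the source: reads are half-open (start, end) intervals, "bases covered" means positions covered at least once, windows are of equal length so that the autosomal coverage can be taken as the mean over autosomal windows, and the Loess GC correction is omitted entirely. Window sizes are shrunk to toy values.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent read intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def window_coverage(reads: List[Interval], chrom_length: int,
                    window: int) -> List[float]:
    """coverage = covered bases in a window / window length."""
    n_windows = (chrom_length + window - 1) // window
    covered = [0] * n_windows
    for start, end in merge_intervals(reads):
        end = min(end, chrom_length)
        pos = start
        while pos < end:
            # Split each merged block at window boundaries.
            idx = pos // window
            stop = min(end, (idx + 1) * window)
            covered[idx] += stop - pos
            pos = stop
    return [covered[i] / (min((i + 1) * window, chrom_length) - i * window)
            for i in range(n_windows)]


def merge_windows(values: List[float], k: int) -> List[float]:
    """Average k consecutive first-window values into one second window."""
    return [sum(values[i:i + k]) / len(values[i:i + k])
            for i in range(0, len(values), k)]


def coverage_ratio(second: Dict[str, List[float]],
                   autosomes: List[str]) -> Dict[str, List[float]]:
    """coverage ratio = window coverage / coverage over all autosomes."""
    auto_values = [v for c in autosomes for v in second[c]]
    auto_cov = sum(auto_values) / len(auto_values)
    return {c: [v / auto_cov for v in vals] for c, vals in second.items()}


# Toy example: first window = 10 bp, second window = 2 first windows.
first = {
    "chr1": window_coverage([(0, 5), (3, 10), (20, 30)], 40, 10),
    "chrX": window_coverage([(0, 20)], 20, 10),
}
second = {c: merge_windows(v, 2) for c, v in first.items()}
ratios = coverage_ratio(second, autosomes=["chr1"])
print(first["chr1"])  # [1.0, 0.0, 1.0, 0.0]
print(ratios)         # {'chr1': [1.0, 1.0], 'chrX': [2.0]}
```

With the embodiment's values, `window` would be 20,000 and `k` would be 50 to obtain 1 Mb second windows; the GC correction would be applied to the first-window values before merging.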
5. Candidate CNV identification
Chromosome breakpoint positions are identified with the circular binary segmentation (CBS) algorithm[4][5]. The CBS ratio threshold is set to [1.4, 2.6]: a region whose CBS ratio falls outside this range is judged to be a candidate CNV region; otherwise it is judged to have no chromosome abnormality. Here, the CBS ratio is the mean of the coverage ratios of all second windows between adjacent breakpoints identified by CBS.
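The thresholding part of this step can be sketched as follows. The sketch does not implement circular binary segmentation itself: the breakpoints are taken as given, as if produced by an existing CBS implementation. It also assumes that the per-window values are already scaled so that a normal diploid region is near 2, which the [1.4, 2.6] threshold suggests but the text does not state explicitly.

```python
from typing import List, Tuple

LOW, HIGH = 1.4, 2.6  # CBS ratio threshold range from the text

Segment = Tuple[int, int, float]  # (first window index, end index, CBS ratio)


def segment_ratios(values: List[float], breakpoints: List[int]) -> List[Segment]:
    """Mean value per segment; each breakpoint is the window index at which
    a new segment begins (assumed output of a CBS implementation)."""
    edges = [0] + sorted(breakpoints) + [len(values)]
    segments: List[Segment] = []
    for start, end in zip(edges, edges[1:]):
        if end > start:
            segments.append((start, end, sum(values[start:end]) / (end - start)))
    return segments


def candidate_cnvs(segments: List[Segment]) -> List[Segment]:
    """Segments whose CBS ratio falls outside [LOW, HIGH]."""
    return [s for s in segments if s[2] < LOW or s[2] > HIGH]


# Toy chromosome: eight second windows with a gain in windows 3 to 5.
values = [2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 2.0, 2.0]
segments = segment_ratios(values, breakpoints=[3, 6])
print(candidate_cnvs(segments))  # [(3, 6, 3.0)]
```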
6. False-positive filtering
A random sampling database is formed from the results of normal reference samples. From it, 100,000 simulated CBS segments of the same length as the candidate CNV region are drawn to obtain the density distribution curve of the simulated CBS ratio values, and the significance level (P-value) of the CBS ratio of the candidate CNV region is then calculated. False-positive regions are filtered according to this P-value: a candidate CNV region with P-value < 0.001 is judged to be a CNV region; otherwise it is filtered out as a false-positive region. This finally yields the CNV regions and karyotype result of the sample under test.
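One way to read this step is as an empirical resampling test, sketched below. The details are assumptions, since the text does not give them: the reference database is modeled as a flat list of per-window values from normal samples, each simulated segment is a contiguous run of windows of the same length as the candidate, and the P-value is a two-sided empirical tail fraction (with a +1 correction) rather than a value read off a fitted density curve.

```python
import random
from typing import List


def empirical_p_value(observed: float, reference: List[float], n_windows: int,
                      n_draws: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated segment means at least as far from the reference
    mean as the observed CBS ratio (two-sided, with a +1 correction)."""
    if n_windows < 1 or n_windows > len(reference):
        raise ValueError("n_windows must be between 1 and len(reference)")
    rng = random.Random(seed)
    center = sum(reference) / len(reference)
    observed_dev = abs(observed - center)
    max_start = len(reference) - n_windows
    hits = 0
    for _ in range(n_draws):
        # Draw one simulated segment of the same length as the candidate.
        start = rng.randint(0, max_start)
        simulated = sum(reference[start:start + n_windows]) / n_windows
        if abs(simulated - center) >= observed_dev:
            hits += 1
    return (hits + 1) / (n_draws + 1)


# Toy reference: window values fluctuating slightly around 2.
reference = [2.0 + 0.01 * ((i % 5) - 2) for i in range(200)]
p = empirical_p_value(3.0, reference, n_windows=5, n_draws=1000)
print(p < 0.001)  # True for this toy input: no simulated segment is as extreme
```

With the +1 correction the smallest attainable P-value is 1 / (n_draws + 1), so at least 1,000 draws are needed before a region can pass the 0.001 cutoff; the 100,000 draws in the text leave ample resolution.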
The inventors compared this embodiment (hereinafter the "coverage method") with the conventional read-count-based chromosome abnormality detection method (hereinafter the "read-count method"), and also analyzed the same batch of samples by microarray.
Table 1 gives the chromosome abnormality detection results for the 25 cells of known karyotype. For 24 samples, the read-count method and the coverage method gave identical results that agreed with the microarray karyotype. For 1 sample (T2), the two methods gave different results, and the microarray karyotype agreed with the result of this embodiment. The chromosome abnormality detection device provided by the invention is thus reliable and accurate.
Table 1. Chromosome abnormality detection results for cells of known karyotype
Table 2 gives the CV values of the above samples at 1 Mb resolution under the read-count method and the coverage method. The CV (coefficient of variation) measures the dispersion of the data; it reflects how uniformly the sequencing reads are distributed across the reference genome, and hence the quality of amplification uniformity. The CV values obtained with the coverage method are clearly lower, showing that the device provided by the invention mitigates poor amplification uniformity.
Table 2. CV values of all samples at 1 Mb resolution under the two detection methods
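For reference, a coefficient of variation over per-window values can be computed as in the short sketch below. The text does not say which windows enter the calculation or whether the sample or population standard deviation is used; the population form is assumed here, and the input values are invented for illustration.

```python
from statistics import mean, pstdev
from typing import List


def coefficient_of_variation(values: List[float]) -> float:
    """CV = standard deviation / mean (population standard deviation assumed)."""
    return pstdev(values) / mean(values)


# Invented window values: a perfectly uniform profile and a noisier one.
uniform = [2.0, 2.0, 2.0, 2.0]
noisy = [1.0, 3.0, 1.0, 3.0]
print(coefficient_of_variation(uniform))  # 0.0
print(coefficient_of_variation(noisy))    # 0.5
```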
Samples T1, T2, T8, and T19 are taken as examples to illustrate the results further.
Fig. 2 shows the copy number distribution of the 24 chromosomes of sample T1 at 1 Mb resolution; Fig. 2A is the result of the read-count method and Fig. 2B is the result of the coverage method. T1 is a negative sample (46,XX). The distribution of points in Fig. 2A and Fig. 2B shows directly that the device provided by the invention improves amplification uniformity.
Fig. 3 and Fig. 4 take sample T8 (47,XY,+15) and sample T19 (46,XX,del(8)(pter-p12)) as examples, showing the copy number distribution of the 24 chromosomes at 1 Mb resolution for a chromosome aneuploidy sample and a segmental CNV sample, respectively. Fig. 3A and Fig. 4A are the results of the read-count method; Fig. 3B and Fig. 4B are the results of the coverage method. Fig. 3, Fig. 4, and Table 3 together show that the device improves amplification uniformity without affecting the detected values of positive results.
Table 3. Detected values of CNV regions in samples T2, T8, and T19 under the two detection methods
Fig. 5 shows the copy number distribution of the 24 chromosomes of sample T2 (46,XY) at 1 Mb resolution; panel A is the result of the read-count method and panel B is the result of the coverage method. The distribution of points in Fig. 5A shows that the amplification uniformity of T2 is poor: with the read-count method the CV at 1 Mb resolution is 0.123, higher than for the other samples, and a false-positive CNV was detected (on chromosome 7 at q11.21, about 5 Mb long). With the coverage-based detection device provided by the invention, the uniformity of T2 improves markedly (Fig. 5B), the CV at 1 Mb resolution falls to 0.073, and the final result agrees with the microarray karyotype, with no false-positive CNV region. This shows that when amplification uniformity is poor, the conventional read-count method may introduce false positives, whereas analysis with the device provided by the invention improves amplification uniformity and reduces the probability of false positives.
The above describes a preferred embodiment of the invention, but the invention is not limited to that embodiment. Those skilled in the art can make various equivalent variations or substitutions without departing from the spirit of the invention, and all such equivalent variations or substitutions fall within the scope defined by the claims of this application.

Claims (10)

1. A chromosome abnormality detection device, comprising:
Sequencing data acquisition unit: for obtaining the reads produced by high-throughput sequencing;
Alignment unit: for aligning the reads to the human genome reference sequence to obtain the position information and length information of each read;
Coverage calculation unit: for dividing the human genome reference sequence into a number of first windows; calculating the coverage of each first window from the position information and length information of the reads; performing Loess correction according to the coverage and GC content of the first windows; merging several consecutive first windows into a second window; and calculating the Loess-corrected coverage of each second window and its coverage ratio;
Candidate CNV identification unit: for identifying chromosome breakpoint positions with the circular binary segmentation (CBS) algorithm, calculating the CBS ratio between adjacent breakpoints, and identifying candidate CNV regions according to a CBS ratio threshold;
False-positive filtering unit: for calculating the significance level (P-value) of the CBS ratio of each candidate CNV region and filtering out false-positive regions according to the P-value, to obtain the CNV regions and karyotype result of the sample under test.
2. the apparatus according to claim 1, it is characterised in that:Base sum/the section covered in coverage=section Length;The coverage of coverage accounting=section/all autosomal coverages.
3. the apparatus according to claim 1, it is characterised in that:CBS ratio are the phase of cyclic annular binary segmentation algorithm identification The mean value of all second window coverage accountings between adjacent breakpoint.
4. the apparatus according to claim 1, it is characterised in that:In coverage computing unit, the first window for 10~ The non-duplicate section of 50Kb, it is preferable that the first window is the non-duplicate section of 20Kb.
5. the apparatus according to claim 1, it is characterised in that:In coverage computing unit, second length of window is 0.1~2Mb, it is preferable that second length of window is optionally from 100Kb, 500Kb, 1Mb.
6. the apparatus according to claim 1, it is characterised in that:In candidate CNV recognition units, the CBS ratio threshold values For [1.4,2.6], it is determined as candidate CNV regions beyond threshold range.
7. the apparatus according to claim 1, it is characterised in that:In false positive filter element, calculate P-value and include:
Randomly sampled data library is formed according to the result of nominal reference sample, therefrom extracts at least 100000 times and candidate CNV areas The isometric simulation CBS sections in domain obtain the density profile of simulation CBS ratio values, calculate candidate CNV regions CBS ratio The significance P-value of value.
8. The device according to claim 1, characterized in that: in the false positive filtering unit, a candidate CNV region with P-value < 0.001 is determined to be a CNV region; otherwise, it is filtered out as a false positive region.
9. The device according to any one of claims 1 to 8, characterized in that: the device further comprises a sequencing unit:
connected to the sequencing data acquisition unit, for performing high-throughput sequencing on a library constructed from a sample, the sample including a sample subjected to single-cell amplification, a sample pre-amplified by PCR, or a sample not pre-amplified by PCR.
10. The device according to any one of claims 1 to 8, characterized in that: the device further comprises a filtering unit:
connected to the comparison (alignment) unit, for removing, according to the alignment results, reads located at tandem repeat and transposon repeat positions, as well as low-quality reads, multi-mapped reads, and reads that do not match a chromosome completely.
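
The calculations named in claims 2, 3 and 6 to 8 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the sampling of contiguous window runs from a flat list of reference values, the two-sided empirical P-value with an add-one correction, and the synthetic reference values centred on 2 (suggested only by the [1.4, 2.6] threshold) are all assumptions made for illustration; the claims do not specify these details.

```python
import random
from itertools import accumulate
from statistics import fmean


def coverage(covered_bases: int, interval_length: int) -> float:
    """Claim 2: total bases covered in the interval / interval length."""
    return covered_bases / interval_length


def coverage_proportion(interval_cov: float, autosome_cov: float) -> float:
    """Claim 2: coverage of the interval / coverage of all autosomes."""
    return interval_cov / autosome_cov


def cbs_ratio(window_values: list[float]) -> float:
    """Claim 3: mean of the second-window values between two adjacent breakpoints."""
    return fmean(window_values)


def is_candidate_cnv(ratio: float, low: float = 1.4, high: float = 2.6) -> bool:
    """Claim 6: a segment is a candidate when its CBS ratio lies outside [low, high]."""
    return ratio < low or ratio > high


def empirical_p_value(observed: float, reference_windows: list[float],
                      n_windows: int, n_draws: int = 100_000, seed: int = 0) -> float:
    """Claims 7-8, assumed form: two-sided empirical P-value.

    Each simulated segment is a contiguous run of `n_windows` reference
    values starting at a random position (an assumption of this sketch).
    """
    last_start = len(reference_windows) - n_windows
    if n_windows < 1 or last_start < 0:
        raise ValueError("n_windows must be between 1 and len(reference_windows)")
    rng = random.Random(seed)
    # Prefix sums let each simulated segment mean be computed in O(1).
    prefix = [0.0, *accumulate(reference_windows)]
    simulated = []
    for _ in range(n_draws):
        start = rng.randint(0, last_start)
        simulated.append((prefix[start + n_windows] - prefix[start]) / n_windows)
    centre = fmean(simulated)
    deviation = abs(observed - centre)
    extreme = sum(abs(s - centre) >= deviation for s in simulated)
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_draws + 1)


if __name__ == "__main__":
    demo_rng = random.Random(1)
    # Synthetic "normal reference" window values; real data would come from
    # the reference samples processed by the coverage computing unit.
    reference = [demo_rng.gauss(2.0, 0.1) for _ in range(5000)]
    segment = [3.0] * 10  # hypothetical segment between two breakpoints
    ratio = cbs_ratio(segment)
    if is_candidate_cnv(ratio):
        p = empirical_p_value(ratio, reference, len(segment))
        print("CBS ratio:", ratio, "P-value:", p, "reported as CNV:", p < 0.001)
```

With at least 100,000 draws, as the claim requires, the smallest P-value this estimator can return is about 1e-5, which leaves room below the 0.001 cut-off of claim 8; with far fewer draws the cut-off could never be reached.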
CN201810047686.8A 2018-01-18 2018-01-18 A kind of chromosome abnormality detection device Active CN108268752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810047686.8A CN108268752B (en) 2018-01-18 2018-01-18 A kind of chromosome abnormality detection device

Publications (2)

Publication Number Publication Date
CN108268752A (en) 2018-07-10
CN108268752B (en) 2019-02-01

Family

ID=62775981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810047686.8A Active CN108268752B (en) 2018-01-18 2018-01-18 A kind of chromosome abnormality detection device

Country Status (1)

Country Link
CN (1) CN108268752B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010024894A1 (en) * 2008-08-26 2010-03-04 23Andme, Inc. Processing data from genotyping chips
CN103484560B (en) * 2008-11-07 2014-11-05 财团法人工业技术研究院 Methods for accurate sequence data and modified base position determination
CN104428425A (en) * 2012-05-04 2015-03-18 考利达基因组股份有限公司 Methods for determining absolute genome-wide copy number variations of complex tumors
CN104781421A (en) * 2012-09-04 2015-07-15 夸登特健康公司 Systems and methods to detect rare mutations and copy number variation
CN104968800A (en) * 2012-08-30 2015-10-07 普莱梅沙有限公司 Method of detecting chromosomal abnormalities
US20160098517A1 (en) * 2014-10-01 2016-04-07 Samsung Sds Co., Ltd. Apparatus and method for detecting internal tandem duplication
CN106650312A (en) * 2016-12-29 2017-05-10 安诺优达基因科技(北京)有限公司 Device for detecting DNA copy number variation of circulating tumor

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
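OLSHEN A B, et al.: "Circular binary segmentation for the analysis of array-based DNA copy number data", Biostatistics *
李平, et al.: "Improved algorithm for detecting gene copy number variation", Computer Engineering (《计算机工程》) *
马天骏, et al.: "DNA copy number variation and its research progress", Chinese Journal of Clinicians (《中华临床医师杂志》) *
黄龙生 (chief editor): "Applied Mathematical Statistics, 1st edition, May 2015", 31 May 2015, China Agriculture Press (中国农业出版社) *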

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110268044A (en) * 2017-03-07 2019-09-20 深圳华大生命科学研究院 A kind of detection method and device of chromosomal variation
CN110268044B (en) * 2017-03-07 2022-08-02 深圳华大生命科学研究院 Method and device for detecting chromosome variation
CN109920480A (en) * 2019-03-14 2019-06-21 深圳市海普洛斯生物科技有限公司 A kind of method and apparatus correcting high-flux sequence data
CN113496761A (en) * 2020-04-03 2021-10-12 深圳华大生命科学研究院 Method, device and application for determining CNV in nucleic acid sample
CN113496761B (en) * 2020-04-03 2023-09-19 深圳华大生命科学研究院 Method, device and application for determining CNV in nucleic acid sample
CN115019892A (en) * 2022-06-13 2022-09-06 郑州大学第一附属医院 Confidence determination method for sequence coverage in sequencing of environmental microbiota metagenome

Also Published As

Publication number Publication date
CN108268752B (en) 2019-02-01

Similar Documents

Publication Publication Date Title
CN108268752B (en) A kind of chromosome abnormality detection device
CN107423578B (en) Device for detecting somatic cell mutation
CN106778073B (en) A kind of method and system of assessment tumor load variation
CN105219844B (en) Gene marker combination, kit and the disease risks prediction model of a kind of a kind of disease of screening ten
CN108256292A (en) A kind of copy number variation detection device
CN105483229B (en) A kind of method and system of detection foetal chromosome aneuploidy
CN108604258B (en) Chromosome abnormality determination method
CN104462869A (en) Method and device for detecting somatic cell SNP
CN107949845A (en) The new method of sex of foetus and fetus sex chromosomal abnormality can be distinguished on multiple platforms
WO2018054254A1 (en) Method and system for identifying tumor load in sample
CN105779572A (en) Chip and method for capturing target sequences of tumor susceptibility genes, and mutation detection method
CN106096330B (en) A kind of noninvasive antenatal biological information determination method
CN111091868B (en) Method and system for analyzing chromosome aneuploidy
CN110033829A (en) The fusion detection method of homologous gene based on difference SNP marker object
CN104846089A (en) Quantitative method for free fetal DNA (deoxyribonucleic acid) proportion in maternal peripheral blood
CN104951671A (en) Device for detecting aneuploidy of fetus chromosomes based on single-sample peripheral blood
CN113903398A (en) Intestinal cancer early-screening marker, detection method, detection device, and computer-readable medium
CN115083521A (en) Method and system for identifying tumor cell group in single cell transcriptome sequencing data
CN112786103A (en) Method and device for analyzing feasibility of target sequencing Panel for estimating tumor mutation load
CN109402247A (en) A kind of fetal chromosomal detection system counted based on DNA variation
CN113284558B (en) Method for distinguishing gene expression difference and long copy number variation in RNA sequencing data
CN107557458A (en) A kind of method and device of effective detection genotype
CN109979534B (en) C site extraction method and device
CN106709267A (en) Data acquisition method and device
CN116168761B (en) Method and device for determining characteristic region of nucleic acid sequence, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant