CN107239676A - A sequence data processing device for embryo chromosomes
Publication number: CN107239676A (application CN201710347798.0A)
Authority: CN (China)
Prior art keywords: chromosome, length, interval, DNA fragment, ratio
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Abstract
The invention discloses a sequence data processing device for embryo chromosomes. The device includes a processor adapted to execute instructions, and the instructions are adapted to be loaded by the processor to perform the following steps: obtain the uniquely and completely matched sequences; divide the reads into read-length intervals according to the read-length distribution of those sequences; calculate the DNA fragment ratio of each length interval on each chromosome; and judge whether the chromosome under test is aneuploid from the difference between its DNA fragment ratios in the different length intervals and those of known autosomes in the same intervals. The DNA fragment ratio is calculated from the number of DNA fragments in the length interval, the total number of DNA fragments on all autosomes of the sample in that length interval, and the length of the chromosome. The device improves the accuracy of detecting abnormal embryo chromosome number and can reduce both the false positive rate and the false negative rate. It is widely applicable in high-throughput sequencing.
Description
Technical field
The present invention relates to data processing, and in particular to a sequence data processing device for embryo chromosomes, suitable for embryo chromosome aneuploidy detection.
Background
Chromosomal abnormality is an important clinical cause of spontaneous abortion, birth defects and multiple fetal malformations. Chromosomal abnormalities include numerical abnormalities and microdeletions or microduplications. Most early-pregnancy spontaneous abortions of unknown cause are due to chromosomal aneuploidy; about 10% of fetuses shown by B-mode ultrasound to have multiple malformations carry a chromosomal aneuploidy; and about 20% of birth defects in newborns are also caused by chromosomal abnormalities. Detecting chromosomal abnormalities is therefore useful in two ways. On the one hand, it helps establish whether a miscarriage was caused by a fetal chromosomal abnormality; in particular, for women with recurrent early-pregnancy miscarriage of unknown cause, both partners can be tested for chromosomal abnormalities to reduce the chance of an abnormal birth in a later pregnancy. On the other hand, it helps detect early whether a fetal anomaly is caused by a chromosomal abnormality, giving the doctor supporting information for diagnosis so that the anomaly can be treated early and birth defects reduced.
In addition, the rapid development of human assisted reproductive technology in recent years has brought in vitro fertilization ("test-tube baby") into clinical use, helping more couples who are infertile, older, or carriers of genetic disease to have children. However, many clinical studies have found that about half of the embryos formed by in vitro fertilization carry chromosomal abnormalities, which is often the main cause of repeated implantation failure, spontaneous abortion or stillbirth in many women [1]. The risk of embryonic chromosomal abnormality also rises with maternal age, which severely limits the success rate of assisted reproduction. Accurately screening embryos for chromosomal abnormalities before implantation, and then selecting healthy embryos to implant, can therefore markedly improve the pregnancy rate and live birth rate of in vitro fertilization.
At present, the main methods for detecting chromosomal abnormalities are FISH, microarray-based comparative genomic hybridization (array-CGH) and high-throughput sequencing. Fluorescence in situ hybridization (FISH) was the gold standard of early chromosomal abnormality detection. Although FISH is fast and highly specific, it is limited by the available probe types and fluorescent labels, so it can examine only some chromosomes at a time for numerical abnormalities and cannot work at whole-genome level. The method now more widely used is array-CGH [2]. Compared with FISH, array-CGH can detect copy-number changes in all 23 pairs of chromosomes in a single hybridization experiment, but its resolution depends on probe density (regions not covered by probes cannot be detected). Examining all 23 pairs at whole-genome level requires more probes, which greatly increases cost. As the cost of high-throughput sequencing has fallen, methods that detect embryo chromosome aneuploidy by high-throughput sequencing have become mainstream in recent years.
Detecting embryo chromosome aneuploidy by high-throughput sequencing proceeds as follows: 1) obtain a suitable amount of DNA template (DNA from abortion tissue or embryonic tissue can be fragmented directly by enzymatic digestion or sonication, whereas blastomeres or cleavage-stage cells start with only picogram-level DNA template and must first undergo single-cell amplification); 2) select DNA molecules of a given fragment size (for example 150-250 bp); 3) build the library by ligating sequencing adapters to both ends of these DNA molecules; 4) sequence on the instrument to obtain sequences (reads) of a certain length; 5) align the reads to the human reference genome with alignment software, filter out repetitive and low-quality sequences, and obtain the read count (reads number) and read ratio (reads ratio) at each position of each chromosome; 6) use a statistical model to judge whether the embryo has a chromosomal abnormality. When an embryo has a chromosomal aneuploidy, the read count of the corresponding chromosome rises or falls by a certain proportion, so abnormality can be judged either by comparison with a reference set built from a number of samples or by comparison within the sample itself. Statistical methods for chromosomal abnormality detection thus fall into two broad classes: reference-set comparison and within-sample comparison.
The typical reference-set comparison method is the Z test [3]. The Z-test model builds a reference database from a large number of normal samples, obtains the mean and standard deviation of the reads ratio of each chromosome in the reference data set, then calculates the Z-score of the sample under test for each chromosome and judges from the Z-score whether the sample is aneuploid. The main problem with the Z-test model is that the Z-score of the sample depends strongly on the reference data set: if the data of the sample under test are poorly consistent with the reference samples, sensitivity and specificity fall sharply. In preimplantation genetic screening (PGS) for aneuploidy, the starting DNA content of an embryo sample is only about 6.6 pg to 30 pg, so whole genome amplification (WGA) must be carried out before sequencing. WGA introduces a serious GC bias, which often makes the sample under test poorly consistent with the reference samples. The Z-score model is therefore unsuitable for aneuploidy detection before embryo implantation.
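The Z-test described above can be sketched in a few lines of Python. This is only an illustration of the general idea: the function name, the dictionary layout and the numbers are hypothetical, and the source does not state a decision threshold (a cut-off such as |Z| > 3 would be an assumption).

```python
from statistics import mean, stdev


def chromosome_z_scores(sample_ratios, reference_ratios):
    """Return a Z-score per chromosome.

    sample_ratios: {chromosome: reads ratio} for the sample under test.
    reference_ratios: {chromosome: [reads ratio, ...]} from normal samples.
    """
    scores = {}
    for chrom, ratio in sample_ratios.items():
        ref = reference_ratios[chrom]
        mu = mean(ref)
        sigma = stdev(ref)  # sample standard deviation of the reference set
        scores[chrom] = (ratio - mu) / sigma
    return scores


# Hypothetical numbers, for illustration only.
reference = {"chr16": [0.0300, 0.0302, 0.0298, 0.0301, 0.0299]}
sample = {"chr16": 0.0330}
z = chromosome_z_scores(sample, reference)
```

The sketch also makes the weakness visible: the score is only as meaningful as the reference set, so any systematic shift between the reference samples and the sample under test moves every Z-score.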
Preimplantation screening therefore mainly uses within-sample comparison: the genome is divided into bins (data boxes) of different window sizes, the sequence ratio (copy ratio) of every bin is computed, and the presence of a chromosomal abnormality is inferred from the trend of the ratio across bins [4]. The main problem with this approach is that the result rests on a single statistic of a single sample, the copy ratio. When single-cell amplification is uneven, the copy ratio fluctuates widely, producing many outliers and false positives. To address the low accuracy and reliability of the traditional within-sample comparison method, the present invention improves the data processing of that method.
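As a rough illustration of the bin-based copy ratio used in within-sample comparison, the following Python sketch counts reads per bin and scales each count by the median bin count. The fixed bin size, the median scaling and the function name are assumptions made for the example; the source does not specify how the copy ratio is normalized.

```python
from collections import Counter
from statistics import median


def bin_copy_ratios(read_starts, bin_size):
    """Count reads per fixed-size bin and scale by the median bin count."""
    counts = Counter(start // bin_size for start in read_starts)
    n_bins = max(counts) + 1
    per_bin = [counts.get(b, 0) for b in range(n_bins)]
    med = median(per_bin)  # assumed normalizer; must be non-zero
    return [c / med for c in per_bin]


# Hypothetical read start positions on one chromosome.
starts = [5, 15, 25, 105, 115, 205, 215, 225, 235]
ratios = bin_copy_ratios(starts, bin_size=100)
```

With so few reads per bin the ratios swing widely, which mirrors the fluctuation problem described above for unevenly amplified single-cell samples.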
References
1. Bielanska, M., S.L. Tan, and A. Ao, Chromosomal mosaicism throughout human preimplantation development in vitro: incidence, type, and relevance to embryo outcome. Hum Reprod, 2002. 17(2): p. 413-9.
2. Gutierrez-Mateo, C., et al., Validation of microarray comparative genomic hybridization for comprehensive chromosome analysis of embryos. Fertil Steril, 2011. 95(3): p. 953-8.
3. Chiu, R.W., et al., Noninvasive prenatal diagnosis of fetal chromosomal aneuploidy by massively parallel genomic sequencing of DNA in maternal plasma. Proc Natl Acad Sci U S A, 2008. 105(51): p. 20458-63.
4. Fu, Y., et al., Uniform and accurate single-cell sequencing based on emulsion whole-genome amplification. Proc Natl Acad Sci U S A, 2015. 112(38): p. 11923-8.
Summary of the invention
To solve the above technical problem, the object of the present invention is to provide a sequence data processing device for embryo chromosomes.
The technical solution adopted by the present invention is a sequence data processing device for embryo chromosomes, comprising:
A sequencing data acquisition unit, for obtaining the DNA reads produced by high-throughput sequencing;
A sequencing data processing unit, for aligning the obtained DNA reads to the human genome reference sequence so that each read is placed at its position on a chromosome, yielding the chromosome, start site and sequence length for each read, as well as the uniquely and completely matched sequences;
A data analysis unit, for dividing the reads into read-length intervals according to the read-length distribution of the uniquely and completely matched sequences, calculating the DNA fragment ratio of each length interval on each chromosome, and judging whether the chromosome under test is aneuploid from the difference between its DNA fragment ratios in the different length intervals and those of known autosomes in the same intervals;
Wherein the DNA fragment ratio is calculated from the number of DNA fragments in the length interval, the total number of DNA fragments on all autosomes of the sample in that length interval, and the length of the chromosome.
Further, the DNA fragment ratio of a length interval on a chromosome is calculated by a formula whose terms are defined as follows: i is the chromosome number; j is the length-interval index; ratio_ij is the DNA fragment ratio of the j-th length interval on chromosome i; reads_n_ij is the number of DNA fragments in the j-th length interval on chromosome i; reads_n_j is the total number of DNA fragments on all autosomes of the sample in the j-th length interval; chr_len_i is the length of chromosome i.
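The formula itself does not appear in this text. One form consistent with the symbol definitions above, and with the later description of the ratio as a proportion "per unit chromosome length", is given here as an assumption rather than as the patent's exact expression (any constant scaling factor is unknown):

```latex
\mathrm{ratio}_{ij} = \frac{\mathrm{reads\_n}_{ij}}{\mathrm{reads\_n}_{j} \times \mathrm{chr\_len}_{i}}
```

Under this reading, dividing by reads_n_j removes the dependence on sequencing depth within interval j, and dividing by chr_len_i puts chromosomes of different lengths on the same scale.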
Further, the step of judging whether the chromosome under test is aneuploid from the difference between its DNA fragment ratios in the different length intervals and those of known autosomes in the same intervals specifically comprises:
Judging whether the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and those of known autosomes in the same intervals meets the standard for a statistically significant difference; if so, the chromosome under test is judged to be aneuploid; otherwise it is judged not to be aneuploid.
Further, the length of a chromosome is its length after the centromere, telomeres and satellite regions have been filtered out.
Further, the read-length intervals are divided by a sliding window method.
Another technical solution of the present invention is a sequence data processing device for embryo chromosomes that includes a processor adapted to execute instructions, the instructions being adapted to be loaded by the processor to perform the following steps:
Obtain the DNA reads produced by high-throughput sequencing;
Align the obtained DNA reads to the human genome reference sequence so that each read is placed at its position on a chromosome, yielding the chromosome, start site and sequence length for each read, as well as the uniquely and completely matched sequences;
Divide the reads into read-length intervals according to the read-length distribution of the uniquely and completely matched sequences, calculate the DNA fragment ratio of each length interval on each chromosome, and judge whether the chromosome under test is aneuploid from the difference between its DNA fragment ratios in the different length intervals and those of known autosomes in the same intervals;
Wherein the DNA fragment ratio is calculated from the number of DNA fragments in the length interval, the total number of DNA fragments on all autosomes of the sample in that length interval, and the length of the chromosome.
Further, the DNA fragment ratio of a length interval on a chromosome is calculated by a formula whose terms are defined as follows: i is the chromosome number; j is the length-interval index; ratio_ij is the DNA fragment ratio of the j-th length interval on chromosome i; reads_n_ij is the number of DNA fragments in the j-th length interval on chromosome i; reads_n_j is the total number of DNA fragments on all autosomes of the sample in the j-th length interval; chr_len_i is the length of chromosome i.
Further, the step of judging whether the chromosome under test is aneuploid from the difference between its DNA fragment ratios in the different length intervals and those of known autosomes in the same intervals specifically comprises:
Judging whether the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and those of known autosomes in the same intervals meets the standard for a statistically significant difference; if so, the chromosome under test is judged to be aneuploid; otherwise it is judged not to be aneuploid.
Further, the length of a chromosome is its length after the centromere, telomeres and satellite regions have been filtered out.
Further, the read-length intervals are divided by a sliding window method.
The beneficial effects of the invention are as follows. When the device is applied to the traditional within-sample comparison method to detect abnormal embryo chromosome number, accuracy is high, and the device does not need a reference set built from normal negative samples. This avoids the false positives and false negatives that arise in reference-set comparison when the reference set and the sample under test deviate strongly from each other. In addition, the device brings in the read-length information of each chromosome, so that the judgment of chromosomal abnormality no longer depends only on the numerical change of the sequence ratio (copy ratio) but also examines whether the behaviour of the copy ratio across different read lengths (reads length) is reasonable. The judgment of whether a chromosome is abnormal is therefore more accurate, and the false positive rate and false negative rate can both be reduced.
Brief description of the drawings
Fig. 1 is the analysis flow chart for judging embryo chromosome aneuploidy from high-throughput sequencing data;
Fig. 2 is the distribution of P-value indices for each chromosome after multiple comparison of the chromosomes of amniotic fluid cell sample T2;
Fig. 3 is the table of P values from multiple comparison of the chromosomes of amniotic fluid cell sample T2;
Fig. 4 is the distribution of P-value indices for each chromosome after multiple comparison of the chromosomes of blastomere single-cell amplification product sample T4;
Fig. 5 is the table of P values from multiple comparison of the chromosomes of blastomere single-cell amplification product sample T4.
Detailed description of the embodiments
The idea of the invention is as follows. On the basis of the within-sample comparison method, sequence length information is introduced, and the copy ratio values of each chromosome are grouped by sequence length. When judging whether a chromosome is abnormal, the invention considers not only the change in the sequence ratio (reads ratio) but also whether the sequence ratios at different read lengths (reads length) are reasonable. The detection results obtained with the device are therefore more accurate and reliable, and the false positive rate and false negative rate can both be reduced. The invention is thus applicable not only to chromosomal abnormality detection in abortion tissue and embryonic tissue but also to preimplantation screening based on single-cell amplification; it is a general-purpose detection device.
The device is described in detail below with reference to specific embodiments.
Embodiment 1
A sequence data processing device for embryo chromosomes specifically comprises:
A sequencing data acquisition unit, for obtaining the DNA reads produced by high-throughput sequencing. Here a DNA read means the DNA information obtained by sequencing, including the DNA base sequence and its length;
The DNA reads are obtained by high-throughput sequencing of DNA from a blastomere single-cell amplification product, abortion tissue or amniotic fluid cells;
A sequencing data processing unit, for aligning the obtained DNA reads to the human genome reference sequence hg19 so that each read is placed at its position on a chromosome, yielding the chromosome, exact start site and sequence length for each read. During alignment to hg19, nucleotide sequences located in tandem repeats and transposon repeats are rejected, as are low-quality, multiply matched and incompletely matched sequences; what remains are the unique sequences, that is, the uniquely and completely matched sequences;
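The filtering performed by the sequencing data processing unit can be pictured with the following Python sketch. The record layout (`AlignedRead` and its fields) and the mapping-quality cut-off of 30 are hypothetical choices for illustration; the source names the rejection criteria but not how an aligner reports them.

```python
from dataclasses import dataclass


@dataclass
class AlignedRead:
    chrom: str
    start: int
    length: int
    mapq: int        # mapping quality reported by the aligner (assumed field)
    n_hits: int      # number of genomic locations the read matched
    mismatches: int  # 0 means a complete match
    in_repeat: bool  # overlaps a tandem or transposon repeat


def unique_full_matches(reads, min_mapq=30):
    """Keep reads that are high quality, uniquely and completely matched,
    and outside repeat regions. min_mapq is an assumed threshold."""
    return [
        r for r in reads
        if r.mapq >= min_mapq
        and r.n_hits == 1
        and r.mismatches == 0
        and not r.in_repeat
    ]


reads = [
    AlignedRead("chr1", 1000, 150, 60, 1, 0, False),  # kept
    AlignedRead("chr1", 2000, 150, 60, 3, 0, False),  # multiply matched
    AlignedRead("chr2", 3000, 150, 10, 1, 0, False),  # low quality
    AlignedRead("chr2", 4000, 150, 60, 1, 2, True),   # mismatches, in repeat
]
kept = unique_full_matches(reads)
```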
A data analysis unit, for dividing the reads into read-length intervals according to the read-length distribution of the unique sequences; the read-length intervals are the length intervals referred to below;
The DNA fragment ratio of each length interval on each chromosome is calculated using the sliding window method, and the calculated DNA fragment ratios are then GC-corrected. The chromosome under test is judged to be aneuploid or not by testing whether the difference between its corrected DNA fragment ratios in the different length intervals and those of the other known autosomes in the same intervals is significant;
Preferably, the step of calculating the DNA fragment ratio of each length interval on each chromosome using the sliding window method specifically comprises:
Using the sliding window method, the DNA reads are assigned to length intervals according to a preset length gradient and step. Specifically, with a length gradient (window) of 10 bp and a step of 10 bp, the length intervals are: [100,110), [110,120), [120,130), ..., [210,220), [220,230);
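The interval division just described is easy to reproduce. The sketch below generates the half-open intervals for a 10 bp window and 10 bp step and looks up the interval for a given read length; the function names and the 100 to 230 bp bounds as defaults are illustrative assumptions taken from the listed intervals.

```python
def length_intervals(start=100, stop=230, window=10, step=10):
    """Return half-open [low, high) read-length intervals."""
    return [(low, low + window) for low in range(start, stop - window + 1, step)]


def interval_index(read_length, intervals):
    """Return the index j of the first interval containing read_length, or None."""
    for j, (low, high) in enumerate(intervals):
        if low <= read_length < high:
            return j
    return None


intervals = length_intervals()
```

Because the window equals the step here, the intervals do not overlap and each read length falls in at most one of them; a step smaller than the window would produce overlapping intervals, which this simple lookup does not handle.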
Then, to account for the different lengths of the chromosomes, a chromosome length variable is introduced into the DNA fragment ratio formula so that the reads ratio is measured in the same units across chromosomes. That is, the DNA fragment ratio of a length interval on a chromosome is calculated by the first calculation formula, whose terms are defined as follows: i is the chromosome number; j is the length-interval index; ratio_ij is the DNA fragment ratio of the j-th length interval on chromosome i; reads_n_ij is the number of DNA fragments in the j-th length interval on chromosome i; reads_n_j is the total number of DNA fragments on all autosomes of the sample in the j-th length interval; chr_len_i is the length of chromosome i;
The DNA fragment counts in each length interval used above are counted from the read-length distribution after GC correction;
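A minimal Python sketch of the ratio calculation follows. It assumes the formula ratio_ij = reads_n_ij / (reads_n_j × chr_len_i), which is inferred from the term definitions and is not confirmed by this text; the function name, the input layout and the toy numbers are likewise hypothetical.

```python
from collections import defaultdict


def fragment_ratios(reads, intervals, chr_len, autosomes):
    """Compute ratio_ij under the assumed formula.

    reads: iterable of (chromosome, read_length) pairs.
    intervals: list of half-open (low, high) length intervals.
    chr_len: {chromosome: effective length}.
    autosomes: set of chromosome names counted in reads_n_j.
    """
    n_ij = defaultdict(int)  # fragments per (chromosome, interval)
    n_j = defaultdict(int)   # autosomal fragments per interval
    for chrom, length in reads:
        j = next((k for k, (lo, hi) in enumerate(intervals) if lo <= length < hi), None)
        if j is None:
            continue  # read length outside every interval
        n_ij[(chrom, j)] += 1
        if chrom in autosomes:
            n_j[j] += 1
    return {
        (chrom, j): n / (n_j[j] * chr_len[chrom])
        for (chrom, j), n in n_ij.items()
        if n_j[j] > 0
    }


# Toy data for illustration only.
toy_reads = [("chr1", 105), ("chr1", 108), ("chr2", 103), ("chr1", 115)]
toy_ratios = fragment_ratios(
    toy_reads,
    intervals=[(100, 110), (110, 120)],
    chr_len={"chr1": 200, "chr2": 100},
    autosomes={"chr1", "chr2"},
)
```

In the toy data chr1 has twice the reads of chr2 in the first interval but is also twice as long, so the two ratios coincide, which is the effect the chromosome length variable is meant to achieve.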
Preferably, the step of judging whether the chromosome under test is aneuploid by testing whether the difference between its corrected DNA fragment ratios in the different length intervals and those of the other known autosomes in the same intervals is significant specifically comprises:
Judging whether the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and those of the other known autosomes in the same intervals meets the standard for a statistically significant difference; specifically, judging whether the proportion of DNA reads per unit chromosome length in the different length intervals differs with statistical significance. If so, the chromosome under test is judged to be aneuploid; otherwise it is judged not to be aneuploid.
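The text does not name the statistical test used to decide significance (the figures refer only to P values from a multiple comparison). As a stand-in, the Python sketch below uses an exact two-sided sign-flip permutation test on the per-interval differences between the chromosome under test and the mean of the other autosomes. The choice of test, the function name and the numbers are all assumptions.

```python
from itertools import product
from statistics import mean


def sign_flip_p_value(target, others):
    """Exact two-sided sign-flip test on paired per-interval differences.

    target: ratios of the chromosome under test, one per length interval.
    others: list of ratio lists, one per comparison autosome, same order.
    """
    diffs = [t - mean(col) for t, col in zip(target, zip(*others))]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / total


# Hypothetical ratios over five length intervals.
target = [1.5, 1.6, 1.4, 1.5, 1.55]
others = [[1.0, 1.0, 1.0, 1.0, 1.0], [1.1, 0.9, 1.0, 1.05, 0.95]]
p = sign_flip_p_value(target, others)
```

With n intervals the smallest attainable P value is 2 / 2^n, so the thirteen intervals used in the embodiment allow much finer discrimination than the five used in this toy example.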
The sequencing data acquisition unit, sequencing data processing unit and data analysis unit described above may be program modules or hardware device modules.
Embodiment 2
A sequence data processing device for embryo chromosomes includes a processor adapted to execute instructions; the instructions are adapted to be loaded by the processor to perform the following steps:
S101. Obtain the DNA reads produced by high-throughput sequencing. Here a DNA read means the DNA information obtained by sequencing, including the DNA base sequence and its length;
The DNA reads are obtained by high-throughput sequencing of DNA from a blastomere single-cell amplification product, abortion tissue or amniotic fluid cells;
S102. Align the obtained DNA reads to the human genome reference sequence hg19 so that each read is placed at its position on a chromosome, yielding the chromosome, exact start site and sequence length for each read. During alignment to hg19, nucleotide sequences located in tandem repeats and transposon repeats are rejected, as are low-quality, multiply matched and incompletely matched sequences; what remains are the unique sequences, that is, the uniquely and completely matched sequences;
S103. Divide the reads into read-length intervals according to the read-length distribution of the unique sequences; the read-length intervals are the length intervals referred to below. Count the DNA fragments in each length interval on each chromosome. When the difference between the DNA fragment counts of the chromosome under test in the different length intervals and those of the other known autosomes in the corresponding intervals meets the significance condition, that is, when the counts for the chromosome under test are significantly higher or lower than those of the other autosomes in the corresponding intervals, the chromosome under test is judged to be aneuploid;
Preferably, a correction step precedes the step of counting the DNA fragments in each length interval on each chromosome: GC correction is applied to the read-length distribution of the unique sequences. In other words, the DNA fragment counts in each length interval on each chromosome are obtained from the GC-corrected DNA fragment distribution;
S104. Calculate the DNA fragment ratio of each length interval on each chromosome using the sliding window method, then apply GC correction to the calculated DNA fragment ratios. Judge whether the chromosome under test is aneuploid by testing whether the difference between its corrected DNA fragment ratios in the different length intervals and those of the other known autosomes in the same intervals is significant;
Preferably, the step of calculating the DNA fragment ratio of each length interval on each chromosome using the sliding window method specifically comprises:
Using the sliding window method, the DNA reads are assigned to length intervals according to a preset length gradient and step. Specifically, with a length gradient (window) of 10 bp and a step of 10 bp, the length intervals are: [100,110), [110,120), [120,130), ..., [210,220), [220,230);
Then, to account for the different lengths of the chromosomes, a chromosome length variable is introduced into the DNA fragment ratio formula so that the reads ratio is measured in the same units across chromosomes. That is, the DNA fragment ratio of a length interval on a chromosome is calculated by the first calculation formula, whose terms are defined as follows: i is the chromosome number; j is the length-interval index; ratio_ij is the DNA fragment ratio of the j-th length interval on chromosome i; reads_n_ij is the number of DNA fragments in the j-th length interval on chromosome i; reads_n_j is the total number of DNA fragments on all autosomes of the sample in the j-th length interval; chr_len_i is the length of chromosome i;
Preferably, the step of judging whether the chromosome under test is aneuploid by testing whether the difference between its corrected DNA fragment ratios in the different length intervals and those of the other known autosomes in the same intervals is significant specifically comprises:
Judging whether the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and those of the other known autosomes in the same intervals meets the standard for a statistically significant difference; specifically, judging whether the proportion of DNA reads per unit chromosome length in the different length intervals differs with statistical significance. If so, the chromosome under test is judged to be aneuploid; otherwise it is judged not to be aneuploid.
Embodiment 3
A kind of above-mentioned sequence data processing unit for embryo chromosome is applied in the inspection of embryo chromosome aneuploid
In survey technology, it specifically detects that achievement unit point includes following six part, and it is as shown in Figure 1 to implement process step.
Part I, sample sources: two samples are from amniotic fluid cells, with karyotype analysis results of 46,XN and 47,XN,+16, respectively; two samples are single-cell amplification products of blastomeres from cleavage-stage embryos, with array-CGH microarray results of 46,XN and 47,XN,+9, respectively.
Part II, sequencing data alignment and quality control
The sequencing data are aligned to the human reference genome hg19 to determine the exact position of each sequenced DNA fragment on the chromosomes. To ensure the quality of the sequencing results and avoid interference from repetitive sequences, low-quality sequences are rejected and bases located in tandem-repeat and transposon-repeat regions of the genome are filtered out, finally yielding uniquely matched DNA fragments, i.e. unique sequences.
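The filtering described in Part II could be sketched as follows. This is only an illustration under stated assumptions: the `Read` record, its fields, the mapping-quality threshold of 30 and the `repeat_regions` layout are all hypothetical, since the patent does not specify the aligner output format or any thresholds.

```python
from typing import NamedTuple


class Read(NamedTuple):
    chrom: str
    start: int    # 0-based start position on the chromosome
    length: int   # fragment length in bp
    mapq: int     # mapping quality reported by the aligner
    n_hits: int   # number of alignment locations reported


def overlaps(start, end, regions):
    """True if the half-open span [start, end) overlaps any (r_start, r_end)."""
    return any(start < r_end and r_start < end for r_start, r_end in regions)


def unique_reads(reads, repeat_regions, min_mapq=30):
    """Keep reads that are high quality, uniquely placed and outside repeats."""
    kept = []
    for r in reads:
        if r.mapq < min_mapq or r.n_hits != 1:
            continue  # low-quality or multi-mapped read
        if overlaps(r.start, r.start + r.length, repeat_regions.get(r.chrom, [])):
            continue  # falls in a tandem or transposon repeat region
        kept.append(r)
    return kept


# Hypothetical reads and one hypothetical repeat region on chr1.
reads = [
    Read("chr1", 100, 150, 60, 1),
    Read("chr1", 1000, 150, 10, 1),
    Read("chr1", 5000, 150, 60, 2),
    Read("chr1", 9950, 150, 60, 1),
]
repeats = {"chr1": [(10000, 12000)]}
kept = unique_reads(reads, repeats)
```

Only the first read survives here: the second fails the quality threshold, the third maps to two locations, and the fourth overlaps the repeat region.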
Part III, GC correction
To eliminate the influence of GC content on the number of DNA fragments in the different length intervals on different chromosomes, the number of DNA fragments in each GC-content group is counted and corrected using the median.
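The patent states only that counts are grouped by GC content and corrected with the median. One common way to do this, shown below as an assumption rather than as the patent's exact procedure, is to scale each GC group so that its median count matches the overall median. The function name `gc_correct` and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_correct(bins):
    """Median-based GC correction (one possible interpretation).

    bins : list of (gc_percent, fragment_count) tuples, gc_percent an int.
    Returns the corrected counts in the same order as the input.
    """
    groups = defaultdict(list)
    for gc, count in bins:
        groups[gc].append(count)
    overall = median(count for _, count in bins)
    weights = {}
    for gc, group_counts in groups.items():
        m = median(group_counts)
        # Leave groups with a zero median unchanged to avoid dividing by zero.
        weights[gc] = overall / m if m > 0 else 1.0
    return [count * weights[gc] for gc, count in bins]


# Hypothetical bins: two at 40% GC and two at 60% GC.
bins = [(40, 100), (40, 120), (60, 50), (60, 70)]
corrected = gc_correct(bins)
```

After correction, both GC groups in this toy input have the same median count (85, the overall median), which is the property the correction aims for.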
Part IV, calculating the DNA fragment ratio of each length interval of each chromosome in the sample under test
a. In this embodiment, with 10 bp as the length gradient (window) and 10 bp as the step, the following fragment-length intervals are obtained: [100,110), [110,120), [120,130), ..., [210,220), [220,230);
b. Count the total number of GC-corrected DNA fragments in each length interval of the sample;
c. Count the number of GC-corrected DNA fragments in each length interval of each chromosome of the sample;
d. Calculate the DNA fragment ratio of each length interval of each chromosome in the sample under test according to the first calculation formula above.
The results are shown in Tables 1-4, where i denotes chromosome i and j denotes the j-th length interval.
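Steps a to c can be sketched in Python as below. The window and step of 10 bp and the range 100 to 230 bp are taken from the embodiment; the function names and the sample fragment lengths are assumptions for illustration.

```python
def length_intervals(start=100, stop=230, window=10, step=10):
    """Return half-open intervals [lo, lo + window) covering [start, stop)."""
    return [(lo, lo + window) for lo in range(start, stop - window + 1, step)]


def count_by_interval(fragment_lengths, intervals):
    """Count fragments per interval; lengths outside every interval are ignored."""
    counts = [0] * len(intervals)
    for length in fragment_lengths:
        for k, (lo, hi) in enumerate(intervals):
            if lo <= length < hi:
                counts[k] += 1
    return counts


intervals = length_intervals()
# Hypothetical fragment lengths (bp) for one chromosome.
counts = count_by_interval([100, 109, 110, 229, 230, 99], intervals)
```

With window equal to step, the intervals do not overlap and each fragment is counted at most once. A step smaller than the window would give overlapping sliding windows, in which case one fragment can contribute to several intervals.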
Table 1. DNA fragment ratio for each length interval of each autosome in amniotic fluid cell sample T1
Table 2. DNA fragment ratio for each length interval of each autosome in amniotic fluid cell sample T2
Table 3. DNA fragment ratio for each length interval of each autosome in blastomere single-cell amplification product sample T3
Table 4. DNA fragment ratio for each length interval of each autosome in blastomere single-cell amplification product sample T4
Part V, two-way analysis of variance (two-way classification ANOVA) of the corrected DNA fragment ratios
a. Two factors: factor 1, DNA fragment read-length interval; factor 2, chromosome. Interaction is not considered. Based on the P value and the significance level, determine whether the DNA fragment ratios of the chromosomes in the different length intervals differ;
b. Taking the two factors, DNA fragment length and chromosome, into account, perform a two-way ANOVA on the DNA fragment ratios (hypothesis H0: the population means of the DNA fragment ratios of the 22 autosomes are all equal, i.e. leaving sex chromosomes aside, the sample is negative; H1: the population means of the DNA fragment ratios of the 22 autosomes are not all equal, i.e. the sample is positive and contains an aneuploid chromosome);
c. Interpretation of the ANOVA results. For factor 1 (DNA fragment read-length interval): if the P value (the probability corresponding to the variance test result) is below the significance level of 0.05, the differences in DNA fragment ratio across length intervals and chromosomes are influenced by this factor, so the result for the sample is unreliable (because fragments of different lengths are produced by random enzymatic fragmentation, DNA fragment length and DNA fragment ratio should be unrelated); if the P value is greater than 0.05, the sample result is reasonable and the result for factor 2 can be analysed further. For factor 2 (chromosome): if the P value is greater than 0.05, the DNA fragment ratios of the different chromosomes do not differ significantly and all 22 autosomes are euploid, so the sample can be judged normal (leaving sex chromosomes aside); if the P value is less than 0.05, the DNA fragment ratios differ significantly between chromosomes and the 22 autosomes include an aneuploid chromosome, so multiple comparisons among the chromosomes are needed next to determine which chromosome is aneuploid.
d. Calculate the P values from the ANOVA results. The results are shown in Table 5 (P1: DNA fragment read-length interval factor; P2: chromosome factor).
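The two-way ANOVA without interaction used in Part V can be sketched with the standard library alone. The sketch below returns the F statistics and degrees of freedom for both factors; the P values P1 and P2 would then follow from the F distribution with the corresponding degrees of freedom (for example via `scipy.stats.f.sf`, if SciPy is available). The function name, the table layout and the toy numbers are assumptions, not values from Tables 1-4.

```python
def two_way_anova(table):
    """Two-way ANOVA without interaction, one observation per cell.

    table[j][i] : DNA fragment ratio for length interval j (rows, factor 1)
                  and chromosome i (columns, factor 2).
    Returns F statistics and degrees of freedom for both factors.
    """
    r, c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[j][i] for j in range(r)) / r for i in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_err = ss_total - ss_rows - ss_cols  # residual sum of squares

    df_rows, df_cols = r - 1, c - 1
    df_err = df_rows * df_cols
    ms_err = ss_err / df_err  # assumes a non-zero residual
    return {
        "F_interval": (ss_rows / df_rows) / ms_err,
        "F_chromosome": (ss_cols / df_cols) / ms_err,
        "df_interval": df_rows,
        "df_chromosome": df_cols,
        "df_error": df_err,
    }


# Toy 2 x 2 table (hypothetical numbers).
result = two_way_anova([[1.0, 2.0], [3.0, 5.0]])
```

For the embodiment, the table would have 13 rows (length intervals) and 22 columns (autosomes), giving 12, 21 and 252 degrees of freedom for the interval factor, the chromosome factor and the error term.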
Table 5. P values from the analysis of variance
Note: T1 and T2 are amniotic fluid cell samples; T3 and T4 are single-cell amplification products of blastomeres.
According to Table 5 above, the judgments are as follows:
1) For T1, P1 and P2 are both greater than 0.05, so it can be inferred to be a normal sample; similarly, T3 is inferred to be a normal sample.
2) For T2, P1 is greater than 0.05 and P2 is less than 0.05, so the sample is considered to contain an aneuploid chromosome and is judged to be a positive sample; similarly, T4 is also inferred to be a positive sample.
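The decision rules applied to P1 and P2 can be written as a small function. This is a sketch: the function name and the returned labels are hypothetical, and the handling of a P value exactly equal to 0.05 (treated here as not significant) is an assumption, since the text only covers values above and below the threshold.

```python
def interpret_anova(p1, p2, alpha=0.05):
    """Classify a sample from the two ANOVA P values.

    p1 : P value for the read-length interval factor
    p2 : P value for the chromosome factor
    """
    if p1 < alpha:
        # Fragment length should not influence the ratio; result is unreliable.
        return "unreliable"
    if p2 < alpha:
        # Ratios differ between chromosomes: an aneuploid chromosome is present.
        return "positive"
    return "normal"


# Hypothetical P values, not the values of Table 5.
labels = [interpret_anova(0.40, 0.60), interpret_anova(0.40, 1e-6),
          interpret_anova(0.01, 0.60)]
```

A "positive" result only says that some autosome is aneuploid; identifying which one requires the pairwise comparisons described in the next part.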
Part VI, multiple comparisons of the mean DNA fragment ratios between chromosomes in abnormal samples
Because the ANOVA can only determine whether a sample contains an aneuploid chromosome and cannot determine which chromosome is abnormal, multiple comparisons of the means are carried out, using multiple t tests, on samples that the ANOVA judged abnormal. That is, the population mean of the DNA fragment ratio of each chromosome is compared with those of the other 21 chromosomes, using the t test for the means of two normal populations. Because repeated use of the t test increases the probability of a Type I error (judging two population means that do not actually differ to be different), a conclusion of "significant difference" is not necessarily reliable. The P values are therefore adjusted using the Bonferroni method.
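A minimal sketch of the pairwise comparison and the Bonferroni adjustment follows. The pooled-variance form of the two-sample t statistic is an assumption (the text does not say whether equal variances are assumed), and the function names and sample values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Every unordered pair among the 22 autosomes.
pairs = list(combinations(range(1, 23), 2))

# Hypothetical inputs for illustration only.
t_stat = pooled_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
adjusted = bonferroni([0.01, 0.04, 0.5])
```

Comparing each of the 22 autosomes with the other 21 gives 231 distinct pairs, so an unadjusted threshold of 0.05 would be expected to produce several false positives by chance; the adjustment compensates for this.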
Multiple-comparison analysis was carried out on the two abnormal samples above (T2 and T4); the P value results are shown in Fig. 3 (T2) and Fig. 5 (T4).
For sample T2, the distribution of the ANOVA P-value exponents in Fig. 2 shows that chromosome 16 differs markedly from the other chromosomes. The multiple-comparison P values in Fig. 3 likewise show that chromosome 16 differs significantly from every other chromosome (P value less than 0.05), while the other chromosomes do not differ significantly from one another. The mean DNA fragment ratio of chromosome 16 across the length intervals is 5.627, whereas that of the other chromosomes is between 3.7 and 3.8. It is therefore concluded that there is an extra chromosome 16, and the karyotype of sample T2 is judged to be 47,XN,+16 (consistent with the karyotype analysis result).
Similarly, for sample T4, the distribution of the ANOVA P-value exponents in Fig. 4 shows that chromosome 9 differs markedly from the other chromosomes. The multiple-comparison P values in Fig. 5 likewise show that chromosome 9 differs significantly from every other chromosome (P value less than 0.05), while the other chromosomes do not differ significantly from one another. The mean DNA fragment ratio of chromosome 9 across the length intervals is 5.915, whereas that of the other chromosomes is about 3.75. It is therefore concluded that there is an extra chromosome 9, and the karyotype of sample T4 is judged to be 47,XN,+9 (consistent with the array-CGH analysis result).
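As a plausibility check on these values (an observation added here, not a statement from the patent): a trisomic chromosome is present in three copies rather than two, so its fragment ratio per unit length would be expected to be roughly 3/2 times that of the disomic chromosomes. Taking 3.75 as the approximate ratio of the other chromosomes in both samples (for T2 this is the midpoint of the stated 3.7 to 3.8 range, which is an assumption):

```latex
\frac{5.627}{3.75} \approx 1.50 \quad (\text{T2, chromosome 16}),
\qquad
\frac{5.915}{3.75} \approx 1.58 \quad (\text{T4, chromosome 9}),
\qquad
\text{expected for trisomy: } \frac{3}{2} = 1.5
```

Both observed ratios lie close to the expected value of 1.5, which agrees with the gain of one chromosome reported for each sample.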
The above describes preferred embodiments of the present invention, but the invention is not limited to these embodiments. Those skilled in the art can make various equivalent variations or substitutions without departing from the spirit of the invention, and all such equivalent variations or substitutions fall within the scope defined by the claims of this application.
Claims (10)
1. A sequence data processing device for embryo chromosomes, characterized in that the device comprises:
a sequencing data acquisition unit, for acquiring the DNA reads obtained from high-throughput sequencing;
a sequencing data processing unit, for aligning the acquired DNA reads to the human reference genome sequence and mapping each DNA read to its position on the chromosomes, thereby obtaining the chromosome, start site and sequence length corresponding to each DNA read, as well as the unique perfectly matched sequences;
a data result analysis unit, for dividing different read-length intervals according to the read-length distribution of the unique perfectly matched sequences, calculating the DNA fragment ratio of each length interval on each chromosome, and determining whether the chromosome under test is aneuploid according to the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and the DNA fragment ratios of known autosomes in the different length intervals;
wherein the DNA fragment ratio is calculated from the number of DNA fragments in the length interval, the total number of DNA fragments on all autosomes of the sample in that length interval, and the length of the chromosome.
2. The sequence data processing device for embryo chromosomes according to claim 1, characterized in that the DNA fragment ratio of a length interval on the chromosome is calculated by the following formula:
$$\mathrm{ratio}_{ij} = \frac{\mathrm{reads\_n}_{ij}}{\mathrm{reads\_n}_{j}} \cdot \frac{1}{\mathrm{chr\_len}_{i}}$$
wherein $i$ is the chromosome number; $j$ is the length-interval number; $\mathrm{ratio}_{ij}$ is the DNA fragment ratio in the $j$-th length interval on chromosome $i$; $\mathrm{reads\_n}_{ij}$ is the number of DNA fragments in the $j$-th length interval on chromosome $i$; $\mathrm{reads\_n}_{j}$ is the total number of DNA fragments on all autosomes of the sample in the $j$-th length interval; and $\mathrm{chr\_len}_{i}$ is the length of chromosome $i$.
3. The sequence data processing device for embryo chromosomes according to claim 1 or claim 2, characterized in that the step of determining whether the chromosome under test is aneuploid according to the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and the DNA fragment ratios of known autosomes in the different length intervals specifically comprises:
determining whether the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and the DNA fragment ratios of known autosomes in the different length intervals meets the standard for a statistically significant difference; if so, the chromosome under test is determined to be aneuploid; otherwise, the chromosome under test is determined not to be aneuploid.
4. The sequence data processing device for embryo chromosomes according to claim 1 or claim 2, characterized in that the length of the chromosome refers to the length of the chromosome after the centromere, telomere and satellite regions are filtered out.
5. The sequence data processing device for embryo chromosomes according to claim 1 or claim 2, characterized in that the division into read-length intervals is implemented using a sliding window method.
6. A sequence data processing device for embryo chromosomes, comprising a processor adapted to execute various instructions, characterized in that the instructions are adapted to be loaded by the processor and to perform the following steps:
acquiring the DNA reads obtained from high-throughput sequencing;
aligning the acquired DNA reads to the human reference genome sequence and mapping each DNA read to its position on the chromosomes, thereby obtaining the chromosome, start site and sequence length corresponding to each DNA read, as well as the unique perfectly matched sequences;
dividing different read-length intervals according to the read-length distribution of the unique perfectly matched sequences, calculating the DNA fragment ratio of each length interval on each chromosome, and determining whether the chromosome under test is aneuploid according to the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and the DNA fragment ratios of known autosomes in the different length intervals;
wherein the DNA fragment ratio is calculated from the number of DNA fragments in the length interval, the total number of DNA fragments on all autosomes of the sample in that length interval, and the length of the chromosome.
7. The sequence data processing device for embryo chromosomes according to claim 6, characterized in that the DNA fragment ratio of a length interval on the chromosome is calculated by the following formula:
$$\mathrm{ratio}_{ij} = \frac{\mathrm{reads\_n}_{ij}}{\mathrm{reads\_n}_{j}} \cdot \frac{1}{\mathrm{chr\_len}_{i}}$$
wherein $i$ is the chromosome number; $j$ is the length-interval number; $\mathrm{ratio}_{ij}$ is the DNA fragment ratio in the $j$-th length interval on chromosome $i$; $\mathrm{reads\_n}_{ij}$ is the number of DNA fragments in the $j$-th length interval on chromosome $i$; $\mathrm{reads\_n}_{j}$ is the total number of DNA fragments on all autosomes of the sample in the $j$-th length interval; and $\mathrm{chr\_len}_{i}$ is the length of chromosome $i$.
8. The sequence data processing device for embryo chromosomes according to claim 6 or 7, characterized in that the step of determining whether the chromosome under test is aneuploid according to the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and the DNA fragment ratios of known autosomes in the different length intervals specifically comprises:
determining whether the difference between the DNA fragment ratios of the chromosome under test in the different length intervals and the DNA fragment ratios of known autosomes in the different length intervals meets the standard for a statistically significant difference; if so, the chromosome under test is determined to be aneuploid; otherwise, the chromosome under test is determined not to be aneuploid.
9. The sequence data processing device for embryo chromosomes according to claim 6 or 7, characterized in that the length of the chromosome refers to the length of the chromosome after the centromere, telomere and satellite regions are filtered out.
10. The sequence data processing device for embryo chromosomes according to claim 6 or 7, characterized in that the division into read-length intervals is implemented using a sliding window method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710347798.0A CN107239676B (en) | 2017-05-17 | 2017-05-17 | A kind of sequence data processing unit for embryo chromosome |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107239676A true CN107239676A (en) | 2017-10-10 |
CN107239676B CN107239676B (en) | 2018-04-17 |
Family
ID=59984497
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710347798.0A Active CN107239676B (en) | 2017-05-17 | 2017-05-17 | A kind of sequence data processing unit for embryo chromosome |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107239676B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104951671A (en) * | 2015-06-10 | 2015-09-30 | 东莞博奥木华基因科技有限公司 | Device for detecting aneuploidy of fetus chromosomes based on single-sample peripheral blood |
CN105296606A (en) * | 2014-07-25 | 2016-02-03 | 深圳华大基因股份有限公司 | Method and device for determining proportion of free nucleic acids in biological sample and application of method and device for determining proportion of free nucleic acids in biological sample |
Non-Patent Citations (2)
Title |
---|
STEPHANIE C. Y. YU: "Size-based molecular diagnostics using plasma DNA for noninvasive prenatal testing", 《PROC NATL ACAD SCI USA》 * |
王彦林: "无创产前检测结果与胎儿核型不一致的遗传学分析", 《中国博士学位论文全文数据库》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110970089A (en) * | 2019-11-29 | 2020-04-07 | 北京优迅医疗器械有限公司 | Preprocessing method and preprocessing device for fetal concentration calculation and application of preprocessing method and device |
CN110970089B (en) * | 2019-11-29 | 2023-05-23 | 北京优迅医疗器械有限公司 | Pretreatment method and pretreatment device for fetal concentration calculation and application of pretreatment device |
CN113113081A (en) * | 2020-08-31 | 2021-07-13 | 东莞博奥木华基因科技有限公司 | System for detecting polyploid and genome homozygous region ROH based on CNV-seq sequencing data |
Also Published As
Publication number | Publication date |
---|---|
CN107239676B (en) | 2018-04-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||