CN112652359A - Chromosome abnormality detection device - Google Patents
- Publication number
- CN112652359A (application CN202011624173.2A)
- Authority
- CN
- China
- Prior art keywords
- chromosome
- feature
- dna sample
- value
- chromosomes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
Abstract
The present invention relates to a chromosome abnormality detection device, comprising: a sequencing data acquirer that performs sequencing on a DNA sample to obtain sequencing data of the DNA sample; a reference sequence aligner that aligns the sequencing data with a reference sequence to obtain chromosome data of the DNA sample; an inter-chromosome feature analyzer that performs inter-chromosome feature analysis on the chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome relative to all chromosomes; a sex chromosome feature determinator that characterizes the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to the autosomes and a third feature of the Y chromosome relative to the X chromosome; and an abnormal feature determiner that determines whether the DNA sample has a chromosome abnormality based on the first feature, the second feature, and the third feature, and determines whether the DNA sample has a euploid aberration based on the second feature and/or the third feature.
Description
Technical Field
The present invention relates to a chromosome abnormality detection device. The device is suitable for detecting chromosome abnormalities and, compared with existing chromosome abnormality detection devices, is better suited to detecting triploid syndrome, especially triploid syndrome with a karyotype of 69,XXY.
Background
Triploid syndrome refers to the presence of one extra haploid set of chromosomes compared with a normal diploid, so that there are three sex chromosomes and 69 chromosomes in total. Triploid syndrome is the most common polyploidy encountered in prenatal diagnosis. 99% of triploid fetuses are unable to survive to birth, and most of them are aborted at 10-20 gestational weeks, accounting for about 10% of cases of spontaneous abortion in early pregnancy. Mosaic triploids can survive for longer periods. There are three triploid karyotypes, 69,XXY, 69,XXX, and 69,XYY, in proportions of 60%, 37%, and 3%, respectively. The mechanisms that generate triploids are mainly diandric fertilization (two paternal chromosome sets) and digynic fertilization (two maternal chromosome sets).
For chromosome abnormalities, the Z value detection method commonly used in the prior art determines whether a chromosome under test is abnormal by analyzing the correlation between each chromosome in the sample and the other chromosomes. It can therefore detect whether a chromosome is abnormal relative to the other chromosomes. However, when all chromosomes are abnormal, as in triploid syndrome, no single chromosome can be distinguished from the others, so the abnormality goes undetected.
Disclosure of Invention
In view of the above-described drawbacks of the prior art, it is an object of the present invention to provide a chromosome abnormality detection apparatus, and particularly to provide a chromosome abnormality detection apparatus capable of accurately detecting triploid syndrome.
Specifically, the object of the present invention is achieved by the following means.
The present invention relates to the following:
1. A chromosome abnormality detection device, comprising:
a sequencing data acquirer that performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence aligner for aligning the sequencing data to a reference sequence to obtain chromosomal data of the DNA sample;
an inter-chromosome feature analyzer that performs inter-chromosome feature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome relative to all chromosomes;
a sex chromosome feature determinator that characterizes the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to the autosomes and a third feature of the Y chromosome relative to the X chromosome;
an abnormal feature determiner that determines whether there is a chromosomal abnormality in the DNA sample based on the first feature, the second feature, and the third feature, and determines whether there is a euploid aberration in the DNA sample based on the second feature and/or the third feature.
2. The apparatus according to item 1, wherein,
the reference sequence aligner comprises a window cutter, wherein the window cutter is used for cutting the reference sequence into windows with the same size and correspondingly cutting the chromosome data into a plurality of chromosome windows.
3. The apparatus according to item 1, wherein,
the sequencing data obtainer comprises a first low quality data filtering component for filtering the sequencing data of the DNA sample to remove low quality data in the sequencing data of the DNA sample, and using the filtered sequencing data of the DNA sample for alignment with a reference sequence in the reference sequence aligner.
4. The apparatus according to item 1, wherein the inter-chromosome feature analyzer obtains a UR value for each chromosome of the DNA sample, and obtains a first feature of each chromosome with respect to all chromosomes based on the UR value for each chromosome.
5. The apparatus according to item 4, wherein the inter-chromosome feature analyzer obtains the first feature by a Lowess method and a normalization method based on UR value and GC content.
6. The apparatus according to item 1, wherein the sex chromosome characteristic determinator comprises:
a second feature obtainer for obtaining a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature;
a third feature obtainer for obtaining a ratio of the UR value of the Y chromosome with respect to a sum of the UR values of the X chromosome and the Y chromosome, i.e., a third feature.
7. The apparatus of item 6, wherein the autosome is chromosome 1.
8. The apparatus according to item 1, wherein, in the abnormal feature determiner,
determining the number of each autosome based on the first feature;
determining the number of Y chromosomes based on the third feature;
determining the number of X chromosomes based on the second feature, and
correcting the number of autosomes based on the second feature and/or the third feature,
thereby determining whether the DNA sample has chromosomal abnormality and whether the DNA sample has euploid aberration.
9. The device according to item 1, wherein the DNA sample is derived from an embryo, aborted tissue or amniotic fluid to be detected.
10. The apparatus of item 1, wherein the apparatus further comprises a correction component that adjusts the second and/or third features based on a detection result of the apparatus.
11. The apparatus of item 2, wherein the inter-chromosome feature analyzer comprises:
a UR value acquisition component that acquires a UR value for each chromosome window;
a second low quality data filtering component that filters chromosomal windows below a UR set point;
a rectification component that GC rectifies the remaining chromosome window filtered by the second low-quality data filtering component based on GC content using the LOWESS method to obtain a residual;
a first feature obtainer for normalizing the residuals to obtain a first feature of each chromosome relative to all chromosomes.
12. The apparatus of item 11, wherein the sex chromosome characteristic determinator comprises:
a second feature obtainer for obtaining a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature;
a third feature obtainer for obtaining a third feature, which is a ratio of the UR value of the Y chromosome with respect to a sum of the UR values of the X chromosome and the Y chromosome;
the UR value of the sex chromosome is the sum of the UR values of each window of the sex chromosome after removing the extreme values;
the UR value of an autosome is the sum of the UR values of each window of the autosome after removal of the extremum;
the UR value of the X chromosome is the sum of the UR values of each window of the X chromosome after removing extreme values; and
the UR value of the Y chromosome is the sum of the UR values of each window of the Y chromosome after removing the extremum.
13. The apparatus of item 12, wherein the extreme value removal removes the largest 5% and the smallest 5% of all the values.
14. A method of detecting chromosomal abnormalities, comprising:
a sequencing data acquisition step in which sequencing is performed based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence alignment step in which the sequencing data is aligned with a reference sequence to obtain chromosomal data of the DNA sample;
an interchromosomal feature analysis step of performing interchromosomal feature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome with respect to all chromosomes;
a sex chromosome characterization step in which a sex chromosome of the DNA sample is characterized based on the acquired chromosome data to obtain a second characteristic of the sex chromosome relative to an autosome, and a third characteristic of the Y chromosome relative to the X chromosome;
an abnormal feature determination step of determining whether or not there is a chromosomal abnormality in the DNA sample based on the first feature, the second feature and the third feature, and determining whether or not there is a euploid distortion in the DNA sample based on the second feature and/or the third feature.
15. The method of item 14, wherein,
the reference sequence alignment step further comprises a window cutting step in which the reference sequence is cut into windows of the same size and the chromosome data is correspondingly cut into a plurality of chromosome windows.
16. The method of item 14, wherein,
the sequencing data acquisition step comprises a first low quality data filtering step in which the sequencing data of the DNA sample is filtered to remove low quality data, and the filtered sequencing data of the DNA sample is used for alignment with a reference sequence in the reference sequence alignment step.
17. The method according to item 14, wherein, in the inter-chromosome feature analysis step, a UR value of each chromosome of the DNA sample is acquired, and the first feature of each chromosome with respect to all chromosomes is acquired based on the UR value of each chromosome.
18. The method according to item 17, wherein in the inter-chromosome feature analysis step, the first feature is obtained by the Lowess method and the normalization method based on the UR value and the GC content.
19. The method of item 14, wherein the sex chromosome characterization step further comprises:
a second feature acquisition step of acquiring a ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature,
a third feature acquisition step of acquiring a ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, a third feature.
20. The method of item 19, wherein the autosome is chromosome 1.
21. The method according to item 14, wherein, in the abnormal feature determination step,
determining the number of each autosome based on the first feature;
determining the number of Y chromosomes based on the third feature;
determining the number of X chromosomes based on the second feature, and
correcting the number of autosomes based on the second feature and/or the third feature,
thereby determining whether the DNA sample has chromosomal abnormality and whether the DNA sample has euploid aberration.
22. The method of item 14, wherein the DNA sample is derived from an embryo, aborted tissue, or amniotic fluid to be detected.
23. The method according to item 14, wherein the method further comprises a correction step in which the second and/or third features are adjusted based on the detection result of the method.
24. The method of item 15, wherein the inter-chromosome feature analysis step comprises:
a UR value acquisition step of acquiring a UR value for each chromosome window;
a second low quality data filtering step in which a chromosome window below the UR set point is filtered;
a rectification step in which GC rectification is performed on the chromosome windows remaining after the second low quality data filtering step, based on GC content and using the LOWESS method, to obtain residuals;
a first feature acquisition step in which the residuals are normalized to obtain a first feature of each chromosome relative to all chromosomes.
25. The method of item 24, wherein the sex chromosome characterization step comprises:
a second feature acquisition step of acquiring a ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, that is, a second feature;
a third feature acquisition step of acquiring a ratio of UR values of the Y chromosome to the sum of UR values of the X chromosome and the Y chromosome, that is, a third feature;
the UR value of the sex chromosome is the sum of the UR values of each window of the sex chromosome after removing the extreme values;
the UR value of an autosome is the sum of the UR values of each window of the autosome after removal of the extremum;
the UR value of the X chromosome is the sum of the UR values of each window of the X chromosome after removing extreme values; and
the UR value of the Y chromosome is the sum of the UR values of each window of the Y chromosome after removing the extremum.
26. The method of item 25, wherein the extreme value removal removes the largest 5% and the smallest 5% of all the values.
Effects of the invention
The device provided by the invention solves a problem that the devices used in the prior art cannot: although those devices can detect whether a given chromosome is abnormal relative to the other chromosomes, when all chromosomes are abnormal, as in triploid syndrome, no chromosome can be distinguished from the others.
The invention provides a chromosome abnormality detection device capable of accurately judging when all chromosomes are abnormal, such as when triploid syndrome occurs.
Drawings
Various other advantages and benefits of the present invention will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. Also, like parts are designated by like reference numerals throughout the drawings.
FIG. 1 is a view showing an overall frame configuration of a chromosome abnormality detection apparatus according to the present invention;
FIG. 2 is a block diagram of a chromosome abnormality detection apparatus according to an embodiment of the present invention;
FIG. 3 is a frame configuration diagram of a chromosome abnormality detection apparatus according to another embodiment of the present invention;
FIG. 4 is a frame configuration diagram of a chromosome abnormality detection apparatus according to another embodiment of the present invention;
FIG. 5 is a block diagram showing a structure of a chromosome abnormality detection apparatus according to another embodiment of the present invention.
Reference numerals: 1-sequencing data obtainer, 11-first low-quality data filter component, 2-reference sequence aligner, 21-window cutter, 3-inter-chromosome feature analyzer, 31-UR value obtaining component, 32-second low-quality data filter component, 33-correction component, 34-first feature obtainer, 4-sex chromosome feature determinator, 41-second feature obtainer, 42-third feature obtainer, 5-abnormal feature determinator
Detailed Description
The present invention relates to the following definitions.
Humans typically have 23 pairs of chromosomes, comprising 22 pairs of autosomes and 1 pair of sex chromosomes. The sex chromosomes consist of either two X chromosomes or one X and one Y chromosome.
Triploid: refers to a cell or organism containing three sets of chromosomes. Triploid organisms are often infertile because of the difficulty in meiosis to form gametes.
High-throughput sequencing: high-throughput sequencing, also known as "Next-generation" sequencing technology, is used to sequence hundreds of thousands to millions of DNA molecules in parallel at a time.
Window: generally refers to a fixed length region on the genome.
Reads: the short sequence fragments generated by a high-throughput sequencing platform.
Unique reads: reads that align to exactly one position in the genome. During sequencing, some reads can align to multiple positions of the genome at the same time; Unique reads are what remain after these multiply aligned reads are filtered out of all non-duplicate (non-dup) reads.
UR value: the number of Unique Reads contained in each window.
GC content: the proportion of guanine and cytosine among the 4 bases of DNA.
LOWESS method: within a specified window, the value at each point is obtained by weighted regression on the neighboring data in that window.
Removing extreme values: the extreme values in the data are removed.
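The definitions of UR value and GC content above can be illustrated with a minimal sketch. The function names `gc_content` and `ur_per_window` are hypothetical, and the sketch assumes non-overlapping windows and assigns each uniquely aligned read to a window by its start position; neither simplification is specified by the source.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the A/C/G/T bases of a sequence (N is ignored)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = sum(seq.count(base) for base in "ACGT")
    return gc / total if total else 0.0


def ur_per_window(unique_read_starts, window_size, n_windows):
    """Count uniquely aligned reads per window.

    Assumption: windows do not overlap and a read belongs to the window
    that contains its start coordinate (0-based).
    """
    counts = [0] * n_windows
    for pos in unique_read_starts:
        idx = pos // window_size
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts
```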
In the present invention, the residual is the difference between the actual detected value and the estimated value (e.g., the regression value) obtained by GC correction processing (e.g., LOWESS processing).
In the present invention, the normalization method refers to Z normalization, also called standard deviation normalization, which normalizes data based on the mean and standard deviation of the raw data.
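As an illustration of Z normalization as defined above, the following sketch subtracts the mean and divides by the standard deviation. The name `z_normalize` is hypothetical, and the use of the population standard deviation is an assumption; the source does not say whether the population or sample form is used.

```python
import statistics


def z_normalize(values):
    """Return (v - mean) / standard deviation for each value.

    Assumption: population standard deviation (statistics.pstdev).
    A constant input has zero spread, so all scores are returned as 0.0.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]
```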
The number and structure of chromosomes of each organism are relatively constant, but under the influence of natural conditions or artificial factors, the number and structure of chromosomes may change, thereby causing the variation of organisms. Chromosomal aberrations include variations in chromosome number and chromosome structure.
In this context, numerical aberrations of human chromosomes are divided into two types, euploid aberrations and aneuploid aberrations. Euploid aberrations are divided into haploid and polyploid; among the polyploids, triploid and tetraploid are the more common and are one of the main causes of spontaneous abortion. Euploid aberrations include triploid syndrome, such as triploid syndrome with a karyotype of 69,XXY.
As shown in fig. 1 to 5, a chromosome abnormality detection apparatus of the present invention includes: a sequencing data acquirer 1, a reference sequence aligner 2, an inter-chromosome feature analyzer 3, a sex chromosome feature determinator 4, and an abnormal feature determinator 5.
Wherein the sequencing data acquirer 1 performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample. The DNA sample may be derived from an embryo, aborted tissue or amniotic fluid to be detected. The DNA sample may be obtained and sequenced using any known technique.
In a preferred embodiment, as shown in FIG. 2, the sequencing data acquirer 1 comprises a first low quality data filtering component 11, which filters the sequencing data of the DNA sample to remove low quality data, so that the filtered sequencing data is used for alignment with the reference sequence in the downstream reference sequence aligner 2.
A reference sequence aligner 2 is disposed downstream of the sequencing data obtainer 1 for aligning the sequencing data with a reference sequence to obtain chromosomal data of the DNA sample. Wherein the reference sequence is a human genomic sequence, such as the hg19 whole genome reference sequence.
In a preferred embodiment, as shown in FIG. 3, the reference sequence aligner 2 comprises a window cutter 21, which cuts the reference sequence into windows of the same size and correspondingly cuts the obtained chromosome data into a plurality of chromosome windows. In a specific embodiment, both the reference sequence and the chromosome data are cut into windows of 100 kb each, with adjacent windows overlapping by 50 kb.
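The window cutting described above can be sketched as follows. This is only an illustration: the function name `cut_windows` is hypothetical, the 50 kb overlap is read as a 50 kb step between consecutive 100 kb windows, and keeping a shortened final window at the chromosome end is an assumption that the source does not address.

```python
def cut_windows(chrom_length, size=100_000, step=50_000):
    """Return half-open (start, end) windows of `size` bp advancing by `step` bp.

    Assumption: the last window is truncated at the chromosome end rather
    than dropped, so it may be shorter than `size`.
    """
    windows = []
    start = 0
    while start < chrom_length:
        end = min(start + size, chrom_length)
        windows.append((start, end))
        if end == chrom_length:
            break
        start += step
    return windows
```

For a 250 kb sequence this yields four windows, each sharing 50 kb with its neighbor.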
An interchromosomal signature analyzer 3 is located downstream of the reference sequence aligner 2 and performs interchromosomal signature analysis on the chromosomes of the DNA sample based on the acquired chromosome data to obtain a first signature of each chromosome relative to all chromosomes.
In a specific embodiment, the inter-chromosome feature analyzer 3 obtains a UR value for each chromosome of the DNA sample, and obtains a first feature of each chromosome relative to all chromosomes based on the UR value of each chromosome. Preferably, in the inter-chromosome feature analyzer 3, the first feature is obtained by the LOWESS method and the normalization method based on the UR value and the GC content. In the LOWESS method, the value at each point within a specified window is obtained by weighted regression on the neighboring data in that window.
When the reference sequence aligner 2 includes the window cutter 21, in a preferred embodiment, as shown in FIG. 5, the inter-chromosome feature analyzer 3 includes a UR value acquisition component 31, a second low quality data filtering component 32, a rectification component 33, and a first feature obtainer 34. The UR value acquisition component 31 acquires the UR value of each chromosome window, and the second low quality data filtering component 32 filters out the chromosome windows below the UR set value. The rectification component 33 performs GC rectification on the chromosome windows remaining after filtering by the second low quality data filtering component 32, based on GC content and using the LOWESS method, to obtain residuals. The first feature obtainer 34 normalizes the residuals to obtain a first feature of each chromosome relative to all chromosomes.
In one embodiment, the sample enters the inter-chromosome feature analyzer 3, and the UR value acquisition component 31 first obtains the UR value of each chromosome window by counting, from the BAM file, the number of reads and the number of Unique Reads falling within the window. The ratio of the number of Unique Reads per window to the total reads per chromosome is then calculated. The UR set value is 0.625. The second low quality data filtering component 32 filters out the windows whose ratio is lower than the UR set value; the rectification component 33 then performs GC rectification on the remaining chromosome windows based on GC content using the LOWESS method to obtain residuals; and the first feature obtainer 34 normalizes the residuals to obtain the first feature of each chromosome relative to the other chromosomes.
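The filtering and GC rectification steps described above can be sketched in a few lines. This is a simplified illustration and not the patented implementation: `lowess_fit` is a plain locally weighted linear regression with tricube weights and no robustness iterations, the `frac` parameter and the dictionary keys `ur_ratio`, `ur`, and `gc` are hypothetical, and the ratio compared with the UR set value of 0.625 is taken as a precomputed input.

```python
def lowess_fit(x, y, frac=0.5):
    """Locally weighted linear regression evaluated at every x.

    Assumption: tricube weights over the nearest `frac` share of points,
    without the robustness iterations of full LOWESS.
    """
    n = len(x)
    k = min(n, max(2, int(frac * n)))  # neighbours used per local fit
    fitted = []
    for xi in x:
        dists = sorted(abs(xj - xi) for xj in x)
        h = dists[k - 1] or 1e-12      # bandwidth: distance to k-th neighbour
        w = [(1 - min(abs(xj - xi) / h, 1.0) ** 3) ** 3 for xj in x]
        sw = sum(w)                    # > 0 because the point itself has weight 1
        mx = sum(wi * xj for wi, xj in zip(w, x)) / sw
        my = sum(wi * yj for wi, yj in zip(w, y)) / sw
        sxx = sum(wi * (xj - mx) ** 2 for wi, xj in zip(w, x))
        sxy = sum(wi * (xj - mx) * (yj - my) for wi, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted


def gc_corrected_residuals(windows, ur_set_value=0.625, frac=0.5):
    """Drop windows below the UR set value, then return UR minus the GC trend."""
    kept = [w for w in windows if w["ur_ratio"] >= ur_set_value]
    gc = [w["gc"] for w in kept]
    ur = [w["ur"] for w in kept]
    fitted = lowess_fit(gc, ur, frac)
    return [u - f for u, f in zip(ur, fitted)]
```

The residuals returned here would then be Z-normalized to give the first feature. In practice a library LOWESS routine would likely replace the hand-written fit.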
A sex chromosome feature determinator 4, located downstream of the reference sequence aligner 2 or the inter-chromosome feature analyzer 3, characterizes the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to the autosomes and a third feature of the Y chromosome relative to the X chromosome.
In a particular embodiment, the sex chromosome characteristic determinator 4 includes a second characteristic obtainer 41 and a third characteristic obtainer 42. Wherein the second feature obtainer 41 is configured to obtain a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., the second feature. The third feature obtainer 42 is for obtaining a ratio of the UR value of the Y chromosome with respect to the sum of the UR values of the X chromosome and the Y chromosome, i.e., a third feature. In a preferred embodiment, the autosome is chromosome 1.
When the reference sequence aligner 2 includes the window cutter 21, in a preferred embodiment the sex chromosome feature determinator 4 includes a second feature obtainer 41 and a third feature obtainer 42. The second feature obtainer 41 obtains the ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, i.e., the second feature. The third feature obtainer 42 obtains the ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, i.e., the third feature. The UR value of the sex chromosome is the sum of the UR values of each window of the sex chromosome after removing the extreme values. The UR value of an autosome is the sum of the UR values of each window of the autosome after removing the extreme values. The UR value of the X chromosome is the sum of the UR values of each window of the X chromosome after removing the extreme values. The UR value of the Y chromosome is the sum of the UR values of each window of the Y chromosome after removing the extreme values. Removing the extreme values means removing the largest 5% and the smallest 5% of all the values.
In a specific embodiment, the second feature comprises a ratio feature of the X chromosome and a ratio feature of the Y chromosome. The ratio feature of the X chromosome is the UR value of the X chromosome / (UR value of the autosome + UR value of the X chromosome). In this example, the UR value of the X chromosome is the sum of the UR values of each window of the X chromosome after removing the extreme values. The ratio feature of the Y chromosome is the UR value of the Y chromosome / (UR value of the autosome + UR value of the Y chromosome). The third feature is the UR value of the Y chromosome / (UR value of the X chromosome + UR value of the Y chromosome).
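The extreme value removal and the three ratios described above can be sketched as follows. The names `trimmed_sum`, `safe_ratio`, and `sex_chromosome_features` are hypothetical, and returning 0.0 when a denominator is zero is an assumption added only to keep the sketch robust.

```python
def trimmed_sum(values, trim_percent=5):
    """Sum the values after dropping the largest and smallest `trim_percent` %."""
    ordered = sorted(values)
    k = len(ordered) * trim_percent // 100   # number of values cut from each end
    return sum(ordered[k:len(ordered) - k])


def safe_ratio(numerator, denominator):
    """Assumption: report 0.0 instead of failing when the denominator is 0."""
    return numerator / denominator if denominator else 0.0


def sex_chromosome_features(ur_x, ur_y, ur_autosome):
    """Compute the X and Y ratio features (second feature) and the third feature.

    Each argument is the list of per-window UR values of one chromosome;
    `ur_autosome` would be chromosome 1 in the preferred embodiment.
    """
    x = trimmed_sum(ur_x)
    y = trimmed_sum(ur_y)
    a = trimmed_sum(ur_autosome)
    return {
        "x_ratio": safe_ratio(x, a + x),   # X relative to autosome + X
        "y_ratio": safe_ratio(y, a + y),   # Y relative to autosome + Y
        "third": safe_ratio(y, x + y),     # Y relative to X + Y
    }
```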
The proportional feature of the X chromosome and the proportional feature of the Y chromosome may be used separately, or in combination with the third feature value, to detect whether the sex chromosomes of the DNA sample are abnormal. In particular, when the proportional features fall within a certain threshold range, the autosome status determined from the first feature needs to be adjusted according to them; for example, an autosome judged to be diploid from the first feature is corrected to triploid.
An abnormal feature determiner 5 is located downstream of the inter-chromosome feature analyzer 3 and the sex chromosome characteristic determinator 4. It determines whether there is a chromosome abnormality in the DNA sample based on the first feature, the second feature and the third feature, and determines whether there is a euploid aberration in the DNA sample based on the second feature and/or the third feature. That is, the chromosome abnormality determination uses all three features, whereas the euploid aberration determination relies on the second and third features.
In a specific embodiment, in the abnormal feature determiner 5, the number of each autosome is determined based on the first feature; determining the number of Y chromosomes based on the third feature; determining the number of X chromosomes based on the second characteristic, and correcting the number of autosomes based on the second characteristic and/or the third characteristic, thereby determining whether the DNA sample has chromosome abnormality and whether the DNA sample has euploid aberration.
In one specific embodiment, the decision process is as follows:
(1) For the Y chromosome
If the third feature value is < 0.03 and the first feature value of the Y chromosome is < -3, the Y chromosome is judged to be 0-ploid (absent);
if the third feature value is > 0.125 and the first feature value of the Y chromosome is > 3, the Y chromosome is judged to be diploid;
in all other cases, the Y chromosome is judged to be haploid.
(2) For the X chromosome
If the second feature value (the proportional feature of the X chromosome) is < 0.275, the X chromosome is judged to be haploid, and the judgments of the autosomes based on the first feature value remain unchanged;
if the second feature value is > 0.425, the X chromosome is judged to be triploid, and the judgments of the autosomes based on the first feature value remain unchanged;
if the second feature value is between 0.275 and 0.425, the X chromosome is judged to be diploid. Further, if the second feature value is ≥ 0.275 and < 0.32, the X chromosome is judged to be diploid and the autosomes need to be corrected to triploid.
(3) For autosomes 1-22
An autosome is judged to be haploid if its first feature value is < -1.5;
triploid if its first feature value is > 1.5;
the remainder are diploid.
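The decision process above can be sketched in Python as follows. This is a minimal illustration rather than the device's actual implementation: the function names are hypothetical, and three readings of the translated text are assumptions, namely that a second feature value below 0.275 means a single-copy X chromosome, that the autosome thresholds are < -1.5 and > 1.5, and that the correction to triploid applies only to autosomes first judged diploid. The feature values in the usage lines are those reported for the example sample in Table 2, while the Z values are placeholders.

```python
def call_y(third_feature, z_y):
    """Y copy number from the third feature and the Y first feature (Z value)."""
    if third_feature < 0.03 and z_y < -3:
        return 0
    if third_feature > 0.125 and z_y > 3:
        return 2
    return 1


def call_x(x_proportion):
    """Return (X copy number, whether autosomes are corrected to triploid)."""
    if x_proportion < 0.275:
        return 1, False  # assumption: single-copy X
    if x_proportion > 0.425:
        return 3, False
    return 2, x_proportion < 0.32


def call_autosome(z):
    """Autosome copy number from its first feature (assumed thresholds +/-1.5)."""
    if z < -1.5:
        return 1
    if z > 1.5:
        return 3
    return 2


def call_sample(z_autosomes, z_y, x_proportion, third_feature):
    """Combine the three rule sets into per-chromosome copy numbers."""
    x_copies, correct_to_triploid = call_x(x_proportion)
    calls = {}
    for name, z in z_autosomes.items():
        copies = call_autosome(z)
        if correct_to_triploid and copies == 2:
            copies = 3  # diploid call corrected to triploid
        calls[name] = copies
    calls["chrX"] = x_copies
    calls["chrY"] = call_y(third_feature, z_y)
    return calls


# Placeholder Z values (hypothetical); feature values taken from Table 2.
z_values = {f"chr{i}": 0.0 for i in range(1, 23)}
calls = call_sample(z_values, z_y=0.0, x_proportion=0.293184, third_feature=0.049662)
print(sum(calls.values()), calls["chrX"], calls["chrY"])  # prints: 69 2 1
```

With these inputs every autosome is corrected to three copies, the X chromosome is called diploid and the Y chromosome haploid, i.e. 69 chromosomes in total.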
The present invention also provides a method for detecting chromosomal abnormality, which comprises:
a sequencing data acquisition step in which sequencing is performed based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence alignment step in which the sequencing data is aligned with a reference sequence to obtain chromosomal data of the DNA sample;
an interchromosomal feature analysis step of performing interchromosomal feature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome with respect to all chromosomes;
a sex chromosome characterization step in which a sex chromosome of the DNA sample is characterized based on the acquired chromosome data to obtain a second characteristic of the sex chromosome relative to an autosome, and a third characteristic of the Y chromosome relative to the X chromosome;
an abnormal feature determination step of determining whether or not there is a chromosomal abnormality in the DNA sample based on the first feature, the second feature and the third feature, and determining whether or not there is a euploid aberration in the DNA sample based on the second feature and/or the third feature.
In a specific embodiment, the reference sequence alignment step further comprises a window cutting step, in which the reference sequence is cut into windows with the same size, and the chromosome data is correspondingly cut into a plurality of chromosome windows.
Further, in a specific embodiment, the inter-chromosome feature analysis step comprises: a UR value acquisition step of acquiring a UR value for each chromosome window; a second low-quality data filtering step in which chromosome windows below the UR set value are filtered out; a correction step in which GC correction is performed, based on GC content and using the LOWESS method, on the chromosome windows remaining after the second low-quality data filtering step, to obtain residuals; and a first feature acquisition step in which the residuals are normalized to obtain a first feature of each chromosome relative to all chromosomes.
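The correction and normalization steps can be illustrated with the Python sketch below. It uses a single-pass local linear fit with tricube weights as a simplified stand-in for a full LOWESS implementation (no robustness iterations), followed by a plain Z-score. The function names, the `frac` smoothing parameter and the use of a population standard deviation are assumptions; the text does not specify how residuals are aggregated per chromosome, so that part is left out.

```python
import statistics


def lowess_fit(x, y, frac=0.5):
    """Fit y against x with local linear regression and tricube weights."""
    n = len(x)
    k = max(3, min(n, int(frac * n)))  # neighbours used for each local fit
    fitted = []
    for x0 in x:
        nearest = sorted(range(n), key=lambda i: abs(x[i] - x0))[:k]
        h = max(abs(x[i] - x0) for i in nearest) or 1.0  # bandwidth
        sw = swx = swy = swxx = swxy = 0.0
        for i in nearest:
            w = (1.0 - (abs(x[i] - x0) / h) ** 3) ** 3
            sw += w
            swx += w * x[i]
            swy += w * y[i]
            swxx += w * x[i] * x[i]
            swxy += w * x[i] * y[i]
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # degenerate case: weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted


def gc_residuals(gc_content, ur_values, frac=0.5):
    """Residual of each window's UR value after removing the GC trend."""
    trend = lowess_fit(gc_content, ur_values, frac)
    return [u - t for u, t in zip(ur_values, trend)]


def zscores(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [0.0 if sd == 0 else (v - mean) / sd for v in values]
```

When the UR values depend exactly linearly on GC content, the residuals are zero up to floating-point error, which is a convenient sanity check for the fit.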
Further, in a specific embodiment, the sex chromosome characterization step comprises: a second feature acquisition step of acquiring the ratio of the UR value of a sex chromosome to the sum of the UR values of any one autosome and that sex chromosome, that is, the second feature; and a third feature acquisition step of acquiring the ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, the third feature. Here, the UR value of a sex chromosome is the sum of the UR values of the windows of that sex chromosome after extreme values are removed; the UR value of an autosome is the sum of the UR values of the windows of the autosome after extreme values are removed; the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after extreme values are removed; and the UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after extreme values are removed.
In a specific embodiment, extreme value removal means discarding the largest 5% and the smallest 5% of all values.
In a specific embodiment, the sequencing data acquisition step comprises a first low-quality data filtering step in which the sequencing data of the DNA sample is filtered to remove low-quality data, and the filtered sequencing data of the DNA sample is used for alignment with the reference sequence in the reference sequence alignment step.
In a specific embodiment, in the inter-chromosome feature analysis step, a UR value of each chromosome of the DNA sample is obtained, and the first feature of each chromosome with respect to all chromosomes is obtained based on the UR value of each chromosome.
Further, in the inter-chromosome feature analysis step, the first feature is obtained by the Lowess method and the normalization method based on the UR value and the GC content.
In a specific embodiment, the sex chromosome characterization step further comprises: a second feature acquisition step of acquiring a ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, that is, a second feature; a third feature acquisition step of acquiring a ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, a third feature. Further, the autosome is chromosome 1.
In a specific embodiment, in the abnormal feature determination step, the number of each autosome is determined based on the first feature; determining the number of Y chromosomes based on the third feature; determining the number of X chromosomes based on the second characteristic, and correcting the number of autosomes based on the second characteristic and/or the third characteristic, thereby determining whether the DNA sample has chromosome abnormality and whether the DNA sample has euploid aberration.
In a particular embodiment, the method further comprises a correction step in which the second and/or third characteristic is adjusted based on the detection result of the device.
The device provided by the invention can obtain the first characteristic, the second characteristic and the third characteristic, judge whether the DNA sample has chromosome abnormality or not based on the first characteristic, the second characteristic and the third characteristic, judge whether the DNA sample has sex chromosome abnormality or not based on the second characteristic and/or the third characteristic, and correct the detection result of the autosomal abnormality of the first characteristic according to the second characteristic and/or the third characteristic. If the autosome has euploid aberration, the detection result of the autosome abnormality determined by the first characteristic can be corrected according to the second characteristic and/or the third characteristic, so that whether the euploid aberration exists or not can be accurately judged.
Examples
The present invention will be described more specifically with reference to the following examples, but the present invention is not limited to these examples.
An aborted tissue sample whose karyotype was determined to be 69,XXY by a chromosome banding experiment was tested with the device provided by the invention.
A DNA sample was extracted from the aborted tissue sample and loaded into the chromosome abnormality detection device.
In the device, as a first step, the sequencing data obtainer 1 performs sequencing on the extracted DNA sample to obtain sequencing data of the DNA sample.
In detail, the sequencing data obtainer 1 first obtains the raw sequencer output, i.e., the data stored as fastq files. The sequencing data obtainer 1 comprises a first low-quality data filtering component 11, which filters the raw fastq files to remove low-quality data. Here, low-quality data means data that would negatively affect detection, such as adapter sequences, overly short sequences and sequences with a high proportion of N bases. The result is the filtered sequencing data of the DNA sample.
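A minimal Python sketch of such a read filter is given below. The adapter string, the minimum length, the N-fraction limit and the decision to discard (rather than trim) reads containing the adapter are all illustrative assumptions; the text does not give the actual criteria or tool used.

```python
def keep_read(seq, adapter="AGATCGGAAGAGC", min_len=30, max_n_fraction=0.1):
    """Return True if a read passes the three illustrative filters."""
    if len(seq) < max(min_len, 1):
        return False  # overly short read
    if adapter in seq:
        return False  # read contains adapter sequence
    if seq.upper().count("N") / len(seq) > max_n_fraction:
        return False  # too many undetermined bases
    return True


def filter_fastq_records(records, **thresholds):
    """Keep (name, sequence, quality) records whose sequence passes keep_read."""
    return [rec for rec in records if keep_read(rec[1], **thresholds)]
```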
Then, in the device, the filtered sequencing data (the processed fastq files) was aligned with a reference sequence, namely the hg19 whole genome, using the reference sequence aligner 2 (the bwa aligner) located downstream of the sequencing data obtainer 1, thereby obtaining chromosomal data of the DNA sample. In the reference sequence aligner 2, PCR-induced duplicates in the bam file are removed as part of the alignment processing.
Meanwhile, the window cutter 21 in the reference sequence aligner 2 cuts the reference sequence into windows of the same size and correspondingly cuts the obtained chromosome data into a plurality of chromosome windows. Both the reference sequence and the chromosomal data were cut into windows of 100 kb each, with adjacent windows overlapping by 50 kb.
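The window layout described here can be sketched in Python as follows. The function name and the half-open coordinate convention are assumptions, as is the handling of the chromosome end, where this sketch lets the last window be shorter than 100 kb.

```python
def cut_windows(chrom_length, size=100_000, step=50_000):
    """Return (start, end) windows of `size` bases, each shifted by `step`.

    Coordinates are 0-based and half-open. With size=100 kb and step=50 kb,
    adjacent windows overlap by 50 kb.
    """
    windows = []
    start = 0
    while start < chrom_length:
        end = min(start + size, chrom_length)
        windows.append((start, end))
        if end == chrom_length:
            break  # the chromosome end has been reached
        start += step
    return windows


print(cut_windows(250_000))
# [(0, 100000), (50000, 150000), (100000, 200000), (150000, 250000)]
```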
Then, the data enters the inter-chromosome feature analyzer 3. First, the UR value obtaining component 31 obtains the UR value of each chromosome window; that is, the numbers of reads and unique reads falling into each window are counted from the bam file. The ratio of the number of unique reads in each window to the total reads of the chromosome is then calculated. In this embodiment, the UR set value is 0.625.
Windows whose ratio is lower than the UR set value are filtered out by the second low-quality data filtering component 32 of the inter-chromosome feature analyzer 3. The correction component 33 of the inter-chromosome feature analyzer 3 then performs GC correction on the remaining chromosome windows, based on GC content and using the LOWESS method, to obtain residuals, and the first feature obtainer 34 normalizes the residuals to obtain the first feature of each chromosome relative to the other chromosomes.
Based on the above-described processing, the first characteristics of each chromosome obtained are shown in table 1:
TABLE 1
Subsequently, in the sex chromosome characteristic determinator 4, the sex chromosomes of the DNA sample are characterized based on the acquired chromosome data to obtain the second feature of the sex chromosomes relative to the autosome and the third feature of the Y chromosome relative to the X chromosome.
The second feature obtainer 41 acquires the ratio of the UR value of a sex chromosome to the sum of the UR values of autosome 1 and that sex chromosome.
The second feature comprises a proportional feature of the X chromosome and a proportional feature of the Y chromosome.
The calculation formulas are as follows:
Proportional feature of the X chromosome = UR value of the X chromosome / (UR value of the autosome + UR value of the X chromosome)
In this embodiment, the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after extreme values are removed, and the UR value of the autosome is the sum of the UR values of the windows of the autosome after extreme values are removed, where the autosome is chromosome 1. Extreme value removal means that the largest 5% and the smallest 5% of all values are discarded, after which the UR values of the remaining windows are summed.
Proportional feature of the Y chromosome = UR value of the Y chromosome / (UR value of the autosome + UR value of the Y chromosome)
In this embodiment, the UR value of the autosome (chromosome 1) and the UR value of the Y chromosome are each the sum of the UR values of the windows of the respective chromosome after extreme values are removed in the same way.
Third feature = UR value of the Y chromosome / (UR value of the X chromosome + UR value of the Y chromosome)
In this embodiment, the UR value of the X chromosome and the UR value of the Y chromosome are each the sum of the UR values of the windows of the respective chromosome after extreme values are removed in the same way.
Based on the above-described processing, the proportional feature of the X chromosome, the proportional feature of the Y chromosome and the third feature obtained are shown in Table 2:
TABLE 2
Feature | Value |
---|---|
Third feature | 0.049662 |
Proportional feature of the Y chromosome | 0.021216 |
Proportional feature of the X chromosome | 0.293184 |
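Because all three features are ratios built from the same three UR sums, the third feature can be recovered from the two proportional features, which gives a quick consistency check on Table 2. Writing A, X and Y for the UR values of the autosome, the X chromosome and the Y chromosome, X/A = pX/(1 - pX) and Y/A = pY/(1 - pY), so the third feature equals (Y/A) / (X/A + Y/A). The Python sketch below applies this; the function name is hypothetical, and the check assumes the same autosome and the same trimmed UR sums were used for all three features.

```python
def third_feature_from_proportions(p_x, p_y):
    """Recover Y / (X + Y) from X / (A + X) and Y / (A + Y)."""
    x_over_a = p_x / (1.0 - p_x)
    y_over_a = p_y / (1.0 - p_y)
    return y_over_a / (x_over_a + y_over_a)


derived = third_feature_from_proportions(0.293184, 0.021216)
print(f"{derived:.5f}")  # prints 0.04966
```

The derived value agrees with the reported third feature of 0.049662 to about five decimal places, the small difference being attributable to rounding of the tabulated values.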
Then, the results of the above calculations enter the abnormal feature determiner 5, which determines whether there is a chromosomal abnormality in the DNA sample based on the first feature value, the second feature value and the third feature value, and determines whether there is a euploid aberration in the DNA sample based on the second feature value and/or the third feature value.
The judgment process is as follows:
(1) For the Y chromosome
If the third feature value is < 0.03 and the first feature value of the Y chromosome is < -3, the Y chromosome is judged to be 0-ploid (absent);
if the third feature value is > 0.125 and the first feature value of the Y chromosome is > 3, the Y chromosome is judged to be diploid;
in all other cases, the Y chromosome is judged to be haploid.
(2) For the X chromosome
If the second feature value (the proportional feature of the X chromosome) is < 0.275, the X chromosome is judged to be haploid, and the judgments of the autosomes based on the first feature value remain unchanged;
if the second feature value is > 0.425, the X chromosome is judged to be triploid, and the judgments of the autosomes based on the first feature value remain unchanged;
if the second feature value is between 0.275 and 0.425, the X chromosome is judged to be diploid. Further, if the second feature value is ≥ 0.275 and < 0.32, the X chromosome is judged to be diploid and the autosomes need to be corrected to triploid.
(3) For autosomes 1-22
An autosome is judged to be haploid if its first feature value is < -1.5;
triploid if its first feature value is > 1.5;
the remainder are diploid.
The judgments based on the first feature were corrected using the second feature and the third feature, and the detection results are shown in Table 3:
TABLE 3
Chromosome | Xploid |
---|---|
chr1 | Triploid |
chr2 | Triploid |
chr3 | Triploid |
chr4 | Triploid |
chr5 | Triploid |
chr6 | Triploid |
chr7 | Triploid |
chr8 | Triploid |
chr9 | Triploid |
chr10 | Triploid |
chr11 | Triploid |
chr12 | Triploid |
chr13 | Triploid |
chr14 | Triploid |
chr15 | Triploid |
chr16 | Triploid |
chr17 | Triploid |
chr18 | Triploid |
chr19 | Triploid |
chr20 | Triploid |
chr21 | Triploid |
chr22 | Triploid |
chrX | Diploid |
chrY | Haploid |
In Tables 1-3, Xploid denotes the ploidy of a chromosome; Triploid, Diploid and Haploid denote three copies, two copies and one copy, respectively.
As can be seen from the results in Table 3, the device of this example corrects the karyotype 47,XXY, which would be judged from the Z values alone (Table 1), to 69,XXY (Table 3).
Claims (10)
1. A chromosomal abnormality detection device, comprising:
a sequencing data obtainer that performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence aligner for aligning the sequencing data to a reference sequence to obtain chromosomal data of the DNA sample;
an inter-chromosome feature analyzer that performs inter-chromosome feature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome relative to all chromosomes;
a sex chromosome characteristic determinator that characterizes the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to the autosomes and a third feature of the Y chromosome relative to the X chromosome;
an abnormal feature determiner that determines whether there is a chromosomal abnormality in the DNA sample based on the first feature, the second feature, and the third feature, and determines whether there is a euploid aberration in the DNA sample based on the second feature and/or the third feature.
2. The apparatus of claim 1, wherein,
the reference sequence aligner comprises a window cutter, wherein the window cutter is used for cutting the reference sequence into windows with the same size and correspondingly cutting the chromosome data into a plurality of chromosome windows.
3. The apparatus of claim 1, wherein,
the sequencing data obtainer comprises a first low quality data filtering component for filtering the sequencing data of the DNA sample to remove low quality data in the sequencing data of the DNA sample, and using the filtered sequencing data of the DNA sample for alignment with a reference sequence in the reference sequence aligner.
4. The apparatus of claim 1, wherein the inter-chromosome feature analyzer obtains a UR value for each chromosome of the DNA sample, and obtains a first feature of each chromosome relative to all chromosomes based on the UR value for each chromosome.
5. The apparatus of claim 4, wherein the inter-chromosome feature analyzer obtains the first feature by Lowess method and normalization method based on UR value and GC content.
6. The apparatus of claim 1, wherein the sex chromosome characteristic determinator comprises:
a second feature obtainer for obtaining a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature;
a third feature obtainer for obtaining a ratio of the UR value of the Y chromosome with respect to a sum of the UR values of the X chromosome and the Y chromosome, i.e., a third feature.
7. The apparatus of claim 6, wherein the autosome is chromosome 1.
8. The apparatus according to claim 1, wherein, in the abnormal feature determiner,
determining the number of each autosome based on the first feature;
determining the number of Y chromosomes based on the third feature;
determining the number of X chromosomes based on the second feature, and
correcting the number of autosomes based on the second feature and/or the third feature,
thereby determining whether the DNA sample has chromosomal abnormality and whether the DNA sample has euploid aberration.
9. The device of claim 1, wherein the DNA sample is derived from an embryo, aborted tissue, or amniotic fluid to be tested.
10. The apparatus of claim 1, wherein the apparatus further comprises a correction component that adjusts the second and/or third features based on a detection result of the apparatus.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011624173.2A CN112652359A (en) | 2020-12-30 | 2020-12-30 | Chromosome abnormality detection device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112652359A true CN112652359A (en) | 2021-04-13 |
Family
ID=75366846
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||