CN112652359A - Chromosome abnormality detection device - Google Patents

Publication number
CN112652359A
Authority
CN
China
Prior art keywords
chromosome
feature
dna sample
value
chromosomes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011624173.2A
Other languages
Chinese (zh)
Inventor
杜洋
李申曼
王娟
李志民
孙雪光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anouta Gene Technology Beijing Co ltd
Original Assignee
Anouta Gene Technology Beijing Co ltd

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection

Abstract

The present invention relates to a chromosome abnormality detection device, including: a sequencing data acquirer that performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample; a reference sequence aligner that aligns the sequencing data with a reference sequence to obtain chromosome data of the DNA sample; an inter-chromosome feature analyzer that performs inter-chromosome feature analysis on the chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome relative to all chromosomes; a sex chromosome feature determiner that performs feature determination on the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to an autosome and a third feature of the Y chromosome relative to the X chromosome; and an abnormal feature determiner that determines whether a chromosomal abnormality is present in the DNA sample based on the first feature, the second feature, and the third feature, and determines whether a euploid aberration is present in the DNA sample based on the second feature and/or the third feature.

Description

Chromosome abnormality detection device
Technical Field
The present invention relates to a chromosome abnormality detection device. The device is suitable for detecting chromosome abnormalities and, compared with existing chromosome abnormality detection devices, is better suited to detecting triploid syndrome, especially triploid syndrome with the karyotype 69,XXY.
Background
Triploid syndrome refers to the presence of one additional haploid set of chromosomes beyond the normal diploid complement, so that there are three sex chromosomes and a total of 69 chromosomes. Triploidy is the most common polyploidy encountered in prenatal diagnosis. About 99% of triploid fetuses do not survive to birth; most are aborted at 10-20 weeks of gestation, accounting for about 10% of spontaneous abortions in early pregnancy. Mosaic triploids can survive longer. Triploids have three karyotypes, 69,XXY, 69,XXX, and 69,XYY, in proportions of 60%, 37%, and 3%, respectively. The main mechanisms that produce triploidy are diandric fertilization (two paternal chromosome sets) and digynic fertilization (two maternal chromosome sets).
For chromosome abnormalities, especially triploid syndrome, the Z-value detection method commonly used in the prior art determines whether a chromosome under test is abnormal by analyzing the relationship between each chromosome in the sample and the other chromosomes. It can therefore detect whether one chromosome is abnormal relative to the others. However, when all chromosomes are abnormal, as in triploid syndrome, no chromosome can be distinguished from the others.
Disclosure of Invention
In view of the above-described drawbacks of the prior art, it is an object of the present invention to provide a chromosome abnormality detection apparatus, and particularly to provide a chromosome abnormality detection apparatus capable of accurately detecting triploid syndrome.
Specifically, the object of the present invention is achieved by the following means.
The present invention relates to the following:
1. A chromosome abnormality detection device, comprising:
a sequencing data acquirer that performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence aligner for aligning the sequencing data to a reference sequence to obtain chromosome data of the DNA sample;
an inter-chromosome feature analyzer that performs inter-chromosome feature analysis on the chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome relative to all chromosomes;
a sex chromosome feature determiner that performs feature determination on the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to an autosome and a third feature of the Y chromosome relative to the X chromosome;
an abnormal feature determiner that determines whether a chromosomal abnormality is present in the DNA sample based on the first feature, the second feature, and the third feature, and determines whether a euploid aberration is present in the DNA sample based on the second feature and/or the third feature.
2. The apparatus according to item 1, wherein,
the reference sequence aligner comprises a window cutter, wherein the window cutter is used for cutting the reference sequence into windows with the same size and correspondingly cutting the chromosome data into a plurality of chromosome windows.
3. The apparatus according to item 1, wherein,
the sequencing data acquirer comprises a first low quality data filtering component for filtering the sequencing data of the DNA sample to remove low quality data, the filtered sequencing data of the DNA sample being used for alignment with a reference sequence in the reference sequence aligner.
4. The apparatus according to item 1, wherein the inter-chromosome feature analyzer obtains a UR value for each chromosome of the DNA sample, and obtains a first feature of each chromosome with respect to all chromosomes based on the UR value for each chromosome.
5. The apparatus according to item 4, wherein the inter-chromosome feature analyzer obtains the first feature by a Lowess method and a normalization method based on UR value and GC content.
6. The apparatus according to item 1, wherein the sex chromosome feature determiner comprises:
a second feature obtainer for obtaining a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature;
a third feature obtainer for obtaining a ratio of the UR value of the Y chromosome with respect to a sum of the UR values of the X chromosome and the Y chromosome, i.e., a third feature.
7. The apparatus of item 6, wherein the autosome is chromosome 1.
8. The apparatus according to item 1, wherein, in the abnormal feature determiner,
determining the number of each autosome based on the first feature;
determining the number of Y chromosomes based on the third feature;
determining the number of X chromosomes based on the second feature, and
correcting the number of autosomes based on the second feature and/or the third feature,
thereby determining whether the DNA sample has chromosomal abnormality and whether the DNA sample has euploid aberration.
9. The device according to item 1, wherein the DNA sample is derived from an embryo, aborted tissue or amniotic fluid to be detected.
10. The apparatus of item 1, wherein the apparatus further comprises a correction component that adjusts the second and/or third features based on a detection result of the apparatus.
11. The apparatus of item 2, wherein the inter-chromosome feature analyzer comprises:
a UR value acquisition component that acquires a UR value for each chromosome window;
a second low quality data filtering component that filters out chromosome windows below a UR set value;
a rectification component that performs GC correction, based on GC content and using the LOWESS method, on the chromosome windows remaining after filtering by the second low quality data filtering component to obtain residuals;
a first feature obtainer for normalizing the residuals to obtain a first feature of each chromosome relative to all chromosomes.
12. The apparatus of item 11, wherein the sex chromosome feature determiner comprises:
a second feature obtainer for obtaining a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature;
a third feature obtainer for obtaining a ratio of the UR value of the Y chromosome to a sum of the UR values of the X chromosome and the Y chromosome, i.e., a third feature;
wherein the UR value of the sex chromosome is the sum of the UR values of the windows of the sex chromosome after extreme values are removed;
the UR value of an autosome is the sum of the UR values of the windows of the autosome after extreme values are removed;
the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after extreme values are removed; and
the UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after extreme values are removed.
13. The apparatus of item 12, wherein the extreme-value removal removes the largest 5% and the smallest 5% of all the values.
14. A method of detecting chromosomal abnormalities, comprising:
a sequencing data acquisition step in which sequencing is performed based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence alignment step in which the sequencing data is aligned with a reference sequence to obtain chromosomal data of the DNA sample;
an interchromosomal feature analysis step of performing interchromosomal feature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome with respect to all chromosomes;
a sex chromosome characterization step in which a sex chromosome of the DNA sample is characterized based on the acquired chromosome data to obtain a second characteristic of the sex chromosome relative to an autosome, and a third characteristic of the Y chromosome relative to the X chromosome;
an abnormal feature determination step of determining whether a chromosomal abnormality is present in the DNA sample based on the first feature, the second feature, and the third feature, and determining whether a euploid aberration is present in the DNA sample based on the second feature and/or the third feature.
15. The method of item 14, wherein,
the reference sequence alignment step further comprises a window cutting step in which the reference sequence is cut into windows of the same size and the chromosome data is correspondingly cut into a plurality of chromosome windows.
16. The method of item 14, wherein,
the sequencing data acquisition step comprises a first low quality data filtering step in which the sequencing data of the DNA sample is filtered to remove low quality data, and the filtered sequencing data of the DNA sample is used for alignment with the reference sequence in the reference sequence alignment step.
17. The method according to item 14, wherein, in the inter-chromosome feature analysis step, a UR value of each chromosome of the DNA sample is acquired, and the first feature of each chromosome with respect to all chromosomes is acquired based on the UR value of each chromosome.
18. The method according to item 17, wherein in the inter-chromosome feature analysis step, the first feature is obtained by the Lowess method and the normalization method based on the UR value and the GC content.
19. The method of item 14, wherein the sex chromosome characterization step further comprises:
a second feature acquisition step of acquiring a ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature,
a third feature acquisition step of acquiring a ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, a third feature.
20. The method of item 19, wherein the autosome is chromosome 1.
21. The method according to item 14, wherein, in the abnormal feature determination step,
determining the number of each autosome based on the first feature;
determining the number of Y chromosomes based on the third feature;
determining the number of X chromosomes based on the second feature, and
correcting the number of autosomes based on the second feature and/or the third feature,
thereby determining whether the DNA sample has chromosomal abnormality and whether the DNA sample has euploid aberration.
22. The method of item 14, wherein the DNA sample is derived from an embryo, aborted tissue, or amniotic fluid to be detected.
23. The method according to item 14, wherein the method further comprises a correction step in which the second and/or third features are adjusted based on the detection result of the method.
24. The method of item 15, wherein the inter-chromosome feature analysis step comprises:
a UR value acquisition step of acquiring a UR value for each chromosome window;
a second low quality data filtering step in which chromosome windows below the UR set value are filtered out;
a rectification step in which GC correction is performed, based on GC content and using the LOWESS method, on the chromosome windows remaining after the second low quality data filtering step to obtain residuals;
a first feature acquisition step in which the residuals are normalized to obtain a first feature of each chromosome relative to all chromosomes.
25. The method of item 24, wherein the sex chromosome characterization step comprises:
a second feature acquisition step of acquiring a ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, that is, a second feature;
a third feature acquisition step of acquiring a ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, a third feature;
wherein the UR value of the sex chromosome is the sum of the UR values of the windows of the sex chromosome after extreme values are removed;
the UR value of an autosome is the sum of the UR values of the windows of the autosome after extreme values are removed;
the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after extreme values are removed; and
the UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after extreme values are removed.
26. The method of item 25, wherein the extreme-value removal removes the largest 5% and the smallest 5% of all the values.
Effects of the invention
The device provided by the invention solves a problem that prior-art devices cannot: although they can detect whether a given chromosome is abnormal relative to the other chromosomes, when all chromosomes are abnormal, as in triploid syndrome, that chromosome cannot be distinguished from the others.
The invention provides a chromosome abnormality detection device capable of accurately judging when all chromosomes are abnormal, such as when triploid syndrome occurs.
Drawings
Various other advantages and benefits of the present invention will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. Also, like parts are designated by like reference numerals throughout the drawings.
FIG. 1 is a view showing an overall frame configuration of a chromosome abnormality detection apparatus according to the present invention;
FIG. 2 is a block diagram of a chromosome abnormality detection apparatus according to an embodiment of the present invention;
FIG. 3 is a frame configuration diagram of a chromosome abnormality detection apparatus according to another embodiment of the present invention;
FIG. 4 is a frame configuration diagram of a chromosome abnormality detection apparatus according to another embodiment of the present invention;
FIG. 5 is a block diagram showing a structure of a chromosome abnormality detection apparatus according to another embodiment of the present invention.
Reference numerals: 1-sequencing data acquirer, 11-first low quality data filtering component, 2-reference sequence aligner, 21-window cutter, 3-inter-chromosome feature analyzer, 31-UR value acquisition component, 32-second low quality data filtering component, 33-rectification component, 34-first feature obtainer, 4-sex chromosome feature determiner, 41-second feature obtainer, 42-third feature obtainer, 5-abnormal feature determiner.
Detailed Description
The present invention relates to the following definitions.
Humans typically have 23 pairs of chromosomes, comprising 22 pairs of autosomes and 1 pair of sex chromosomes. The sex chromosomes consist of either two X chromosomes or one X and one Y chromosome.
Triploid: refers to a cell or organism containing three sets of chromosomes. Triploid organisms are often infertile because of the difficulty in meiosis to form gametes.
High-throughput sequencing: high-throughput sequencing, also known as "Next-generation" sequencing technology, is used to sequence hundreds of thousands to millions of DNA molecules in parallel at a time.
Window: generally refers to a fixed length region on the genome.
Reads: short sequence fragments generated by a high-throughput sequencing platform.
Unique reads: reads that align uniquely to the genome. During sequencing, some reads align to multiple positions in the genome; unique reads are what remain after the multiply aligned reads are filtered out of all non-duplicate reads.
UR value: the number of unique reads contained in each window.
GC content: the proportion of guanine and cytosine among the four bases of DNA.
LOWESS method: within a specified window, the value at each point is obtained by weighted regression on the neighboring data in that window.
Removing extreme values: the extreme values in the data are removed.
In the present invention, the residual is the difference between the actual measured value and the estimated value (e.g., the regression value) obtained by GC correction (e.g., LOWESS processing).
In the present invention, the normalization method refers to Z normalization, also called standard deviation normalization, which normalizes data based on the mean and standard deviation of the raw data.
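As a non-limiting illustration, the definitions of GC content and Z normalization given above can be sketched in Python as follows. The function names `gc_content` and `z_normalize` are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the description does not specify which is used.

```python
from statistics import mean, pstdev


def gc_content(sequence: str) -> float:
    """Proportion of G and C among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no callable bases (e.g. a run of N)
    return (seq.count("G") + seq.count("C")) / counted


def z_normalize(values):
    """Z normalization: (value - mean) / standard deviation.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # all values identical
    return [(v - mu) / sigma for v in values]
```

Bases other than A, C, G, and T are ignored in this sketch so that unresolved positions do not dilute the proportion.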
The number and structure of chromosomes of each organism are relatively constant, but under the influence of natural conditions or artificial factors, the number and structure of chromosomes may change, thereby causing the variation of organisms. Chromosomal aberrations include variations in chromosome number and chromosome structure.
Numerical aberrations of human chromosomes are divided into two types: euploid aberrations and aneuploid aberrations. Euploid aberrations are further divided into haploid and polyploid; among these, triploid and tetraploid are the more common and are among the main causes of spontaneous abortion. Euploid aberrations include triploid syndrome, such as triploid syndrome with the karyotype 69,XXY.
As shown in FIGS. 1 to 5, the chromosome abnormality detection apparatus of the present invention includes: a sequencing data acquirer 1, a reference sequence aligner 2, an inter-chromosome feature analyzer 3, a sex chromosome feature determiner 4, and an abnormal feature determiner 5.
Wherein the sequencing data acquirer 1 performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample. The DNA sample may be derived from an embryo, aborted tissue or amniotic fluid to be detected. The DNA sample may be obtained and sequenced using any known technique.
In a preferred embodiment, as shown in FIG. 2, the sequencing data acquirer 1 comprises a first low quality data filtering component 11, which filters the sequencing data of the DNA sample to remove low quality data, so that the filtered sequencing data are used for alignment with the reference sequence in the downstream reference sequence aligner 2.
The reference sequence aligner 2 is disposed downstream of the sequencing data acquirer 1 and aligns the sequencing data with a reference sequence to obtain chromosome data of the DNA sample. The reference sequence is a human genome sequence, such as the hg19 whole-genome reference sequence.
In a preferred embodiment, as shown in FIG. 3, the reference sequence aligner 2 comprises a window cutter 21, which cuts the reference sequence into windows of the same size and correspondingly cuts the obtained chromosome data into a plurality of chromosome windows. In a specific embodiment, both the reference sequence and the chromosome data are cut into 100 kb windows, with adjacent windows overlapping by 50 kb.
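As a non-limiting illustration, the window cutting described above can be sketched in Python as follows. The function name `make_windows`, the half-open coordinate convention, and the handling of a final window shorter than 100 kb are assumptions made for illustration; the description does not state how the end of a chromosome is treated.

```python
def make_windows(chrom_length: int, window_size: int = 100_000,
                 overlap: int = 50_000):
    """Return half-open (start, end) windows tiling one chromosome.

    Adjacent windows share `overlap` bases. Assumption: the last window
    is truncated at the chromosome end rather than discarded.
    """
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be >= 0 and smaller than window_size")
    step = window_size - overlap
    windows = []
    start = 0
    while start < chrom_length:
        end = min(start + window_size, chrom_length)
        windows.append((start, end))
        if end == chrom_length:
            break  # chromosome end reached
        start += step
    return windows
```

With the default values, each position away from the chromosome ends is covered by two windows.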
An inter-chromosome feature analyzer 3 is located downstream of the reference sequence aligner 2 and performs inter-chromosome feature analysis on the chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome relative to all chromosomes.
In a specific embodiment, the inter-chromosome feature analyzer 3 obtains a UR value for each chromosome of the DNA sample and obtains the first feature of each chromosome relative to all chromosomes based on these UR values. Preferably, the inter-chromosome feature analyzer 3 obtains the first feature by the LOWESS method and the normalization method based on the UR value and the GC content. In the LOWESS method, the value at each point within a specified window is obtained by weighted regression on the neighboring data in that window.
When the reference sequence aligner 2 includes the window cutter 21, in a preferred embodiment, as shown in FIG. 5, the inter-chromosome feature analyzer 3 includes a UR value acquisition component 31, a second low quality data filtering component 32, a rectification component 33, and a first feature obtainer 34. The UR value acquisition component 31 acquires the UR value of each chromosome window, and the second low quality data filtering component 32 filters out the chromosome windows below the UR set value. The rectification component 33 performs GC correction, based on GC content and using the LOWESS method, on the chromosome windows remaining after this filtering to obtain residuals. The first feature obtainer 34 normalizes the residuals to obtain the first feature of each chromosome relative to all chromosomes.
In one embodiment, the sample is input to the inter-chromosome feature analyzer 3. The UR value acquisition component 31 first obtains the UR value of each chromosome window by counting, from the BAM file, the number of reads and the number of unique reads falling within each window. The ratio of the number of unique reads in each window to the total reads of each chromosome is then calculated. The UR set value is 0.625. The second low quality data filtering component 32 filters out windows whose ratio is lower than the UR set value; the rectification component 33 then performs GC correction on the remaining chromosome windows based on GC content using the LOWESS method to obtain residuals; and the first feature obtainer 34 normalizes the residuals to obtain the first feature of each chromosome relative to the other chromosomes.
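As a non-limiting illustration, the filtering and GC correction steps of this embodiment can be sketched in Python as follows. The sketch rests on several assumptions that are not part of the original disclosure: each window is a dictionary with hypothetical keys `ur`, `gc`, and `ratio` (the ratio being computed by the caller); the LOWESS fit is a simplified pure-Python local linear regression with tricube weights and no robustness iterations; and the smoothing fraction `frac` is an arbitrary choice.

```python
def filter_windows(windows, ur_set_value=0.625):
    """Keep windows whose unique-read ratio is not below the UR set value."""
    return [w for w in windows if w["ratio"] >= ur_set_value]


def lowess_fit(x, y, frac=0.5):
    """Simplified LOWESS: tricube-weighted local linear fit at each x."""
    n = len(x)
    k = min(n, max(2, round(frac * n)))  # neighbours per local fit
    fitted = []
    for xi in x:
        dists = sorted(abs(xj - xi) for xj in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [(1 - min(abs(xj - xi) / h, 1.0) ** 3) ** 3 for xj in x]
        sw = sum(w)
        mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
        my = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted


def gc_residuals(windows, ur_set_value=0.625, frac=0.5):
    """Residual = observed UR value minus the LOWESS estimate given GC."""
    kept = filter_windows(windows, ur_set_value)
    gc = [w["gc"] for w in kept]
    ur = [w["ur"] for w in kept]
    fitted = lowess_fit(gc, ur, frac)
    return [u - f for u, f in zip(ur, fitted)]
```

The residuals returned here would then be Z-normalized to give the first feature; an established LOWESS implementation could be substituted for `lowess_fit` without changing the surrounding steps.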
A sex chromosome feature determiner 4, located downstream of the reference sequence aligner 2 or the inter-chromosome feature analyzer 3, performs feature determination on the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain a second feature of the sex chromosomes relative to an autosome and a third feature of the Y chromosome relative to the X chromosome.
In a specific embodiment, the sex chromosome feature determiner 4 includes a second feature obtainer 41 and a third feature obtainer 42. The second feature obtainer 41 obtains the ratio of the UR value of the sex chromosome to the sum of the UR values of any one autosome and the sex chromosome, i.e., the second feature. The third feature obtainer 42 obtains the ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, i.e., the third feature. In a preferred embodiment, the autosome is chromosome 1.
When the reference sequence aligner 2 includes the window cutter 21, in a preferred embodiment the sex chromosome feature determiner 4 includes a second feature obtainer 41 and a third feature obtainer 42. The second feature obtainer 41 obtains the ratio of the UR value of the sex chromosome to the sum of the UR values of any one autosome and the sex chromosome, i.e., the second feature. The third feature obtainer 42 obtains the ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, i.e., the third feature. Here, the UR value of the sex chromosome is the sum of the UR values of the windows of the sex chromosome after extreme values are removed. The UR value of an autosome is the sum of the UR values of the windows of the autosome after extreme values are removed. The UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after extreme values are removed. The UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after extreme values are removed. Extreme-value removal means removing the largest 5% and the smallest 5% of all the values.
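As a non-limiting illustration, the extreme-value removal and the two ratios described above can be sketched in Python as follows. The function names are hypothetical, and rounding the number of removed values down to a whole number is an assumption, since the description does not say how a fractional 5% is handled.

```python
def trimmed_sum(values, percent=5):
    """Sum the values after dropping the largest and smallest `percent`.

    Assumption: the count removed from each end is rounded down.
    """
    ordered = sorted(values)
    cut = len(ordered) * percent // 100
    return sum(ordered[cut:len(ordered) - cut])


def second_feature(sex_ur, autosome_ur):
    """UR(sex chromosome) / (UR(autosome) + UR(sex chromosome))."""
    return sex_ur / (autosome_ur + sex_ur)


def third_feature(x_ur, y_ur):
    """UR(Y chromosome) / (UR(X chromosome) + UR(Y chromosome))."""
    return y_ur / (x_ur + y_ur)
```

In use, each chromosome-level UR value would first be obtained with `trimmed_sum` over that chromosome's window UR values, and the results passed to the two ratio functions, with chromosome 1 as the autosome in the preferred embodiment.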
In a specific embodiment, the second feature comprises a ratio feature of the X chromosome and a ratio feature of the Y chromosome. The ratio feature of the X chromosome is UR value of the X chromosome / (UR value of the autosome + UR value of the X chromosome). In this example, the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after extreme values are removed. The ratio feature of the Y chromosome is UR value of the Y chromosome / (UR value of the autosome + UR value of the Y chromosome). The third feature is UR value of the Y chromosome / (UR value of the X chromosome + UR value of the Y chromosome).
The ratio feature of the X chromosome and the ratio feature of the Y chromosome may be used separately or in combination with the third feature to detect whether the sex chromosomes of the DNA sample are abnormal. In particular, when the ratio features fall within a certain threshold range, the autosome status determined from the first feature needs to be adjusted according to the ratio features; for example, an autosomal diploid call made from the first feature is corrected to an autosomal triploid.
An abnormal feature determiner 5 is located downstream of the inter-chromosome feature analyzer 3 and the sex chromosome feature determiner 4. It determines whether a chromosomal abnormality is present in the DNA sample based on the first feature, the second feature, and the third feature, and determines whether a euploid aberration is present based on the second feature and/or the third feature. That is, determining whether a chromosomal abnormality is present requires all three features, whereas determining whether a euploid aberration is present relies on the second and third features.
In a specific embodiment, the abnormal feature determiner 5 determines the number of each autosome based on the first feature, determines the number of Y chromosomes based on the third feature, determines the number of X chromosomes based on the second feature, and corrects the number of autosomes based on the second feature and/or the third feature, thereby determining whether the DNA sample has a chromosomal abnormality and whether it has a euploid aberration.
In one specific embodiment, the decision process is as follows:
(1) For the Y chromosome
If the third feature value is < 0.03 and the first feature value of the Y chromosome is < -3, the Y chromosome is judged to be absent (zero copies);
if the third feature value is > 0.125 and the first feature value of the Y chromosome is > 3, the Y chromosome is judged to be present in two copies;
in all other cases, the Y chromosome is judged to be present in one copy.
(2) For the X chromosome
If the second feature value is < 0.275, the X chromosome is judged to be haploid (one copy), and the calls for the autosomes based on the first feature value remain unchanged;
if the second feature value is > 0.425, the X chromosome is judged to be triploid, and the calls for the autosomes based on the first feature value remain unchanged;
if the second feature value is between 0.275 and 0.425, the X chromosome is judged to be diploid. Further, if the second feature value is >= 0.275 and < 0.32, the X chromosome is judged to be diploid and the autosomes need to be corrected to triploid.
(3) Against autosomes 1-22
(ii) haploid if the first characteristic value is-1.5;
a triploid if the first eigenvalue > -1.5,
the remainder are diploids.
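The decision rules above can be sketched in Python as follows. This is only an illustrative reading of the thresholds, not the applicant's implementation: the function names are hypothetical, the case "second feature value below 0.275" is read as a single X copy, and the autosome cut-offs are read as < -1.5 and > 1.5.

```python
def classify_y(third_feature, z_y):
    """Y copy number from the third feature and the Y first-feature value."""
    if third_feature < 0.03 and z_y < -3:
        return 0
    if third_feature > 0.125 and z_y > 3:
        return 2
    return 1


def classify_x(x_ratio):
    """Return (X copy number, True if autosomes must be corrected to triploid)."""
    if x_ratio < 0.275:
        return 1, False  # assumption: below 0.275 is read as one X copy
    if x_ratio > 0.425:
        return 3, False
    # 0.275 <= ratio <= 0.425: diploid X; below 0.32 triggers the correction
    return 2, x_ratio < 0.32


def classify_autosome(z):
    """Autosome copy number from its first-feature value (assumed +/-1.5 cut-offs)."""
    if z < -1.5:
        return 1
    if z > 1.5:
        return 3
    return 2


def call_karyotype(z_autosomes, z_y, x_ratio, third_feature):
    """Combine the three rule sets; z_autosomes maps chromosome name to value."""
    x_copies, correct_to_triploid = classify_x(x_ratio)
    y_copies = classify_y(third_feature, z_y)
    if correct_to_triploid:
        autosomes = {chrom: 3 for chrom in z_autosomes}
    else:
        autosomes = {c: classify_autosome(z) for c, z in z_autosomes.items()}
    return autosomes, x_copies, y_copies
```

With the X proportional feature (0.293184) and third feature (0.049662) reported in Table 2 of the example, and hypothetical first-feature values of 0 for every chromosome, this sketch returns three copies of each autosome, two X chromosomes and one Y chromosome, i.e. 69 chromosomes in total.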
The present invention also provides a method for detecting chromosomal abnormality, which comprises:
a sequencing data acquisition step in which sequencing is performed based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence alignment step in which the sequencing data is aligned with a reference sequence to obtain chromosomal data of the DNA sample;
an interchromosomal feature analysis step of performing interchromosomal feature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first feature of each chromosome with respect to all chromosomes;
a sex chromosome characterization step in which a sex chromosome of the DNA sample is characterized based on the acquired chromosome data to obtain a second characteristic of the sex chromosome relative to an autosome, and a third characteristic of the Y chromosome relative to the X chromosome;
an abnormal feature determination step of determining whether or not there is a chromosomal abnormality in the DNA sample based on the first feature, the second feature and the third feature, and determining whether or not there is a euploid distortion in the DNA sample based on the second feature and/or the third feature.
In a specific embodiment, the reference sequence alignment step further comprises a window cutting step, in which the reference sequence is cut into windows with the same size, and the chromosome data is correspondingly cut into a plurality of chromosome windows.
Further, in a specific embodiment, the inter-chromosome feature analysis step comprises: a UR value acquisition step of acquiring a UR value for each chromosome window; a second low-quality data filtering step in which chromosome windows below the UR set value are filtered out; a rectification (GC correction) step in which GC correction is performed, based on GC content and using the LOWESS method, on the chromosome windows remaining after the second low-quality data filtering step to obtain residuals; and a first feature acquisition step in which the residuals are normalized to obtain a first feature of each chromosome relative to all chromosomes.
Further, in a specific embodiment, the sex chromosome characterization step comprises: a second feature acquisition step of acquiring the ratio of the UR value of a sex chromosome to the sum of the UR values of any one of the autosomes and that sex chromosome, that is, the second feature; and a third feature acquisition step of acquiring the ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, the third feature. Here, the UR value of a sex chromosome is the sum of the UR values of the windows of that sex chromosome after removal of extreme values; the UR value of an autosome is the sum of the UR values of the windows of that autosome after removal of extreme values; the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after removal of extreme values; and the UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after removal of extreme values.
In a specific embodiment, the removal of extreme values refers to removing the largest 5% and the smallest 5% of all values.
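As a minimal sketch of the second and third features described above, the following Python code trims the largest and smallest 5% of the per-window UR values, sums the rest, and forms the ratios. The function names are hypothetical, and rounding the number of trimmed windows down is an assumption; the text does not say how a fractional 5% is handled.

```python
def trimmed_sum(window_values, trim_fraction=0.05):
    """Sum per-window UR values after dropping the top and bottom 5%."""
    values = sorted(window_values)
    k = int(len(values) * trim_fraction)  # assumption: round down
    return sum(values[k:len(values) - k])


def ratio_feature(ur_a, ur_b):
    """Ratio of one chromosome's UR value to the sum of both UR values."""
    return ur_a / (ur_a + ur_b)


def sex_chromosome_features(autosome_windows, x_windows, y_windows):
    """Second feature (X and Y proportions) and third feature."""
    ur_auto = trimmed_sum(autosome_windows)  # e.g. chromosome 1
    ur_x = trimmed_sum(x_windows)
    ur_y = trimmed_sum(y_windows)
    return {
        "x_ratio": ratio_feature(ur_x, ur_auto),
        "y_ratio": ratio_feature(ur_y, ur_auto),
        "third_feature": ratio_feature(ur_y, ur_x),
    }
```

Trimming before summing keeps a handful of outlier windows (for example, repetitive regions with inflated counts) from dominating the chromosome-level UR value.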
In a specific embodiment, the sequencing data obtaining step comprises a first low quality data filtering step in which the sequencing data of the DNA sample is filtered to remove low quality data in the sequencing data of the DNA sample and the filtered sequencing data of the DNA sample is used for alignment with a reference sequence in the reference sequence aligner.
In a specific embodiment, in the inter-chromosome feature analysis step, a UR value of each chromosome of the DNA sample is obtained, and the first feature of each chromosome with respect to all chromosomes is obtained based on the UR value of each chromosome.
Further, in the inter-chromosome feature analysis step, the first feature is obtained by the Lowess method and the normalization method based on the UR value and the GC content.
In a specific embodiment, the sex chromosome characterization step further comprises: a second feature acquisition step of acquiring a ratio of the UR value of the sex chromosome to the sum of the UR values of any one of the autosomes and the sex chromosome, that is, a second feature; a third feature acquisition step of acquiring a ratio of the UR value of the Y chromosome to the sum of the UR values of the X chromosome and the Y chromosome, that is, a third feature. Further, the autosome is chromosome 1.
In a specific embodiment, the abnormal feature determination step comprises determining the number of each autosome based on the first feature, determining the number of Y chromosomes based on the third feature, determining the number of X chromosomes based on the second feature, and correcting the number of autosomes based on the second feature and/or the third feature, thereby determining whether the DNA sample has a chromosomal abnormality and whether it has a euploid aberration.
In a particular embodiment, the method further comprises a correction step in which the second and/or third characteristic is adjusted based on the detection result of the device.
The device provided by the invention obtains the first feature, the second feature and the third feature; judges whether the DNA sample has a chromosomal abnormality based on all three features; judges whether the DNA sample has a sex chromosome abnormality based on the second feature and/or the third feature; and corrects the autosomal abnormality result obtained from the first feature according to the second feature and/or the third feature. Thus, if the autosomes have a euploid aberration, the autosomal result determined from the first feature can be corrected according to the second feature and/or the third feature, so that the presence of a euploid aberration can be judged accurately.
Examples
The present invention will be described more specifically with reference to the following examples, but the present invention is not limited to these examples.
An aborted tissue sample whose karyotype was determined to be 69,XXY by a chromosome banding experiment was tested with the device provided by the invention.
A DNA sample was extracted from the aborted tissue sample and placed into the chromosome abnormality detection device.
In the device, as a first step, the sequencing data obtainer 1 sequences the extracted DNA sample to obtain the sequencing data of the DNA sample.
In detail, the sequencing data obtainer 1 first obtains the raw off-machine data, i.e., the data stored as fastq files. The sequencing data obtainer 1 comprises a first low-quality data filtering component 11, which filters the raw fastq files to remove low-quality data, that is, data that adversely affect detection, such as adapter (linker) sequences, overly short sequences and sequences with a high proportion of N bases. This yields the filtered sequencing data of the DNA sample.
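The kind of read-level filtering described above might be sketched as follows. The thresholds (minimum length 36, at most 10% N bases) and the function names are purely hypothetical, since the text gives no numeric criteria, and adapter trimming is not covered by this sketch.

```python
def keep_read(sequence, min_length=36, max_n_fraction=0.1):
    """Return True if a read is long enough and has few ambiguous (N) bases.

    Both thresholds are illustrative assumptions, not values from the text.
    """
    if not sequence or len(sequence) < min_length:
        return False
    n_fraction = sequence.upper().count("N") / len(sequence)
    return n_fraction <= max_n_fraction


def filter_fastq_records(records, min_length=36, max_n_fraction=0.1):
    """Filter (header, sequence, quality) tuples parsed from a fastq file."""
    return [
        rec for rec in records
        if keep_read(rec[1], min_length, max_n_fraction)
    ]
```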
Then, in the device, the filtered sequencing data (the processed fastq files) are aligned with a reference sequence, namely the hg19 whole genome, using the reference sequence aligner 2 (bwa aligner) located downstream of the sequencing data obtainer 1, thereby obtaining the chromosome data of the DNA sample. In the reference sequence aligner 2, PCR-induced duplicates in the bam file are removed during the alignment processing.
Meanwhile, the window cutter 21 in the reference sequence aligner 2 cuts the reference sequence into windows of the same size and correspondingly cuts the obtained chromosome data into a plurality of chromosome windows. Both the reference sequence and the chromosome data are cut into windows of 100 kb each, with adjacent windows overlapping by 50 kb.
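A minimal sketch of such a window layout is given below, assuming that "50 kb overlap" means consecutive 100 kb windows start 50 kb apart. The function name, the 0-based half-open coordinates and the decision to drop a trailing partial window are assumptions, not details from the text.

```python
def cut_windows(chrom_length, window_size=100_000, overlap=50_000):
    """Return (start, end) windows of equal size with a fixed overlap.

    Coordinates are 0-based and half-open; a final window that would run
    past the chromosome end is dropped (an assumption of this sketch).
    """
    step = window_size - overlap
    windows = []
    start = 0
    while start + window_size <= chrom_length:
        windows.append((start, start + window_size))
        start += step
    return windows
```

For a 250 kb sequence this yields four windows starting at 0, 50 kb, 100 kb and 150 kb, so every interior position is covered by two windows.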
Then, the data enter the inter-chromosome feature analyzer 3. First, the UR value obtaining component 31 obtains the UR value of each chromosome window; that is, the numbers of reads and unique reads falling into each window are counted from the bam file, and the ratio of the number of unique reads in each window to the total number of reads of each chromosome is then calculated. In this embodiment, the UR set value is 0.625.
If the ratio of a window is lower than the UR set value, the second low-quality data filtering component 32 in the inter-chromosome feature analyzer 3 filters out that window. The rectifying (GC correction) component 33 in the inter-chromosome feature analyzer 3 then performs GC correction, based on GC content and using the LOWESS method, on the chromosome windows remaining after filtering to obtain residuals, and the first feature obtainer 34 normalizes the residuals to obtain the first feature of each chromosome relative to the other chromosomes.
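The filtering, GC correction and normalization steps can be illustrated with the simplified sketch below. It is written under several assumptions: windows are dictionaries with hypothetical keys "chrom", "gc" and "ur"; the LOWESS fit is a plain tricube-weighted local linear regression without robustness iterations; and the normalization is taken to be a z-score of each chromosome's mean residual across chromosomes, which the text does not specify.

```python
import math
from statistics import mean, pstdev


def filter_windows(windows, ur_set_value=0.625):
    """Keep windows whose UR value is not below the UR set value."""
    return [w for w in windows if w["ur"] >= ur_set_value]


def _tricube(u):
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess_fit(x, y, frac=0.5):
    """Simplified LOWESS: tricube-weighted local linear fit at each x."""
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))
    fitted = []
    for xi in x:
        dists = sorted(abs(xj - xi) for xj in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [_tricube((xj - xi) / h) for xj in x]
        sw = sum(w)
        mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
        my = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted


def first_features(windows, frac=0.5):
    """Per-chromosome z-score of the mean GC-corrected residual (assumed)."""
    kept = filter_windows(windows)
    fitted = lowess_fit([w["gc"] for w in kept], [w["ur"] for w in kept], frac)
    residuals = {}
    for w, f in zip(kept, fitted):
        residuals.setdefault(w["chrom"], []).append(w["ur"] - f)
    chrom_means = {c: mean(r) for c, r in residuals.items()}
    values = list(chrom_means.values())
    mu = mean(values)
    sd = pstdev(values) or 1.0  # avoid division by zero
    return {c: (m - mu) / sd for c, m in chrom_means.items()}
```

In practice a library routine with robustness iterations would normally replace the hand-written fit; it is spelled out here only to keep the sketch self-contained.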
Based on the above-described processing, the first characteristics of each chromosome obtained are shown in table 1:
TABLE 1
[Table 1 appears as images in the original publication (BDA0002872846550000141, BDA0002872846550000151) and is not reproduced here.]
Subsequently, in the sex chromosome characterizer 4, the sex chromosome of the DNA sample is characterized based on the acquired chromosome data to obtain a second characteristic of the sex chromosome relative to the autosome, and a third characteristic of the Y chromosome relative to the X chromosome.
The second feature acquirer 41 acquires the ratio of the UR value of a sex chromosome to the sum of the UR values of autosome 1 and that sex chromosome.
The second feature comprises a proportional feature of the X chromosome and a proportional feature of the Y chromosome.
The calculation formulas are as follows:
Proportional feature of the X chromosome = UR value of the X chromosome / (UR value of the autosome + UR value of the X chromosome)
In this embodiment, the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after removal of extreme values. The UR value of the autosome is the sum of the UR values of the windows of the autosome after removal of extreme values, the autosome being chromosome 1. Extreme values are removed by discarding the largest 5% and the smallest 5% of all values; the UR values of the remaining windows are then summed.
Proportional feature of the Y chromosome = UR value of the Y chromosome / (UR value of the autosome + UR value of the Y chromosome)
In this embodiment, the UR value of the autosome is the sum of the UR values of the windows of the autosome after removal of extreme values, the autosome being chromosome 1. The UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after removal of extreme values. Extreme values are removed by discarding the largest 5% and the smallest 5% of all values; the UR values of the remaining windows are then summed.
Third feature = UR value of the Y chromosome / (UR value of the X chromosome + UR value of the Y chromosome)
In this embodiment, the UR value of the X chromosome is the sum of the UR values of the windows of the X chromosome after removal of extreme values, and the UR value of the Y chromosome is the sum of the UR values of the windows of the Y chromosome after removal of extreme values. Extreme values are removed by discarding the largest 5% and the smallest 5% of all values; the UR values of the remaining windows are then summed.
Based on the above-described processing, the proportional feature of the X chromosome, the proportional feature of the Y chromosome and the third feature obtained are shown in Table 2:
TABLE 2
Type                                        Ratio
Third feature                               0.049662
Proportional feature of the Y chromosome    0.021216
Proportional feature of the X chromosome    0.293184
Then, the result of the above calculation enters an abnormal feature determiner 5, which determines whether there is a chromosomal abnormality in the DNA sample based on the first feature value, the second feature value, and the third feature value, and determines whether there is a euploid distortion in the DNA sample based on the second feature value and/or the third feature value.
The judgment process is as follows:
(1) For the Y chromosome
If the third feature value is < 0.03 and the first feature value of the Y chromosome is < -3, the Y chromosome is judged to be 0-ploid (absent);
if the third feature value is > 0.125 and the first feature value of the Y chromosome is > 3, the Y chromosome is judged to be diploid;
in all other cases, the Y chromosome is judged to be haploid.
(2) For the X chromosome
If the second feature value is < 0.275, the X chromosome is judged to be haploid, and the judgment of the autosomes based on the first feature value remains unchanged;
if the second feature value is > 0.425, the X chromosome is judged to be triploid, and the judgment of the autosomes based on the first feature value remains unchanged;
if the second feature value is between 0.275 and 0.425, the X chromosome is judged to be diploid. Further, if the second feature value is ≥ 0.275 and < 0.32, the X chromosome is judged to be diploid and the autosomes need to be corrected to triploid.
(3) For autosomes 1-22
An autosome is judged to be haploid if its first feature value is < -1.5;
triploid if its first feature value is > 1.5;
and diploid otherwise.
The result based on the first feature is corrected using the second feature and the third feature; the detection results are shown in Table 3:
TABLE 3
Chromosome Xploid
chr1 Triploid
chr2 Triploid
chr3 Triploid
chr4 Triploid
chr5 Triploid
chr6 Triploid
chr7 Triploid
chr8 Triploid
chr9 Triploid
chr10 Triploid
chr11 Triploid
chr12 Triploid
chr13 Triploid
chr14 Triploid
chr15 Triploid
chr16 Triploid
chr17 Triploid
chr18 Triploid
chr19 Triploid
chr20 Triploid
chr21 Triploid
chr22 Triploid
chrX Diploid
chrY Haploid
In Tables 1-3, Xploid denotes the ploidy (number of copies) of a chromosome; Triploid, Diploid and Haploid denote three copies, two copies and one copy, respectively.
As can be seen from the detection results in Table 3, whereas the Z values alone (Table 1) would give a karyotype of 47,XXY, the device of this example corrects the result to 69,XXY (Table 3).

Claims (10)

1. A chromosomal abnormality detection device, comprising:
a sequencing data acquirer that performs sequencing based on a DNA sample to obtain sequencing data of the DNA sample;
a reference sequence aligner for aligning the sequencing data to a reference sequence to obtain chromosomal data of the DNA sample;
an inter-chromosome signature analyzer that performs inter-chromosome signature analysis on chromosomes of the DNA sample based on the acquired chromosome data to obtain a first signature of each chromosome relative to all chromosomes;
a sex chromosome characterizer that characterizes the sex chromosomes of the DNA sample based on the acquired chromosome data to obtain second characteristics of the sex chromosomes relative to the autosomes and third characteristics of the Y chromosomes relative to the X chromosomes;
an abnormal feature determiner that determines whether there is a chromosomal abnormality in the DNA sample based on the first feature, the second feature, and the third feature, and determines whether there is a euploid distortion in the DNA sample based on the second feature and/or the third feature.
2. The apparatus of claim 1, wherein,
the reference sequence aligner comprises a window cutter, wherein the window cutter is used for cutting the reference sequence into windows with the same size and correspondingly cutting the chromosome data into a plurality of chromosome windows.
3. The apparatus of claim 1, wherein,
the sequencing data acquirer comprises a first low-quality data filtering component for filtering the sequencing data of the DNA sample to remove low-quality data from the sequencing data of the DNA sample, the filtered sequencing data of the DNA sample being used for alignment with the reference sequence in the reference sequence aligner.
4. The apparatus of claim 1, wherein the inter-chromosome feature analyzer obtains a UR value for each chromosome of the DNA sample, and obtains a first feature of each chromosome relative to all chromosomes based on the UR value for each chromosome.
5. The apparatus of claim 4, wherein the inter-chromosome feature analyzer obtains the first feature by Lowess method and normalization method based on UR value and GC content.
6. The apparatus of claim 1, wherein the sex chromosome characterizer comprises:
a second feature obtainer for obtaining a ratio of the UR value of the sex chromosome to a sum of the UR values of any one of the autosomes and the sex chromosome, i.e., a second feature;
a third feature obtainer for obtaining a ratio of the UR value of the Y chromosome with respect to a sum of the UR values of the X chromosome and the Y chromosome, i.e., a third feature.
7. The apparatus of claim 6, wherein the autosome is chromosome 1.
8. The apparatus according to claim 1, wherein, in the abnormal feature determiner,
determining the number of each autosome based on the first feature;
determining the number of Y chromosomes based on the third feature;
determining the number of X chromosomes based on the second feature, and
correcting the number of autosomes based on the second feature and/or the third feature,
thereby determining whether the DNA sample has chromosomal abnormality and whether the DNA sample has euploid aberration.
9. The device of claim 1, wherein the DNA sample is derived from an embryo, aborted tissue, or amniotic fluid to be tested.
10. The apparatus of claim 1, wherein the apparatus further comprises a correction component that adjusts the second and/or third features based on a detection result of the apparatus.
CN202011624173.2A 2020-12-30 2020-12-30 Chromosome abnormality detection device Pending CN112652359A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011624173.2A CN112652359A (en) 2020-12-30 2020-12-30 Chromosome abnormality detection device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011624173.2A CN112652359A (en) 2020-12-30 2020-12-30 Chromosome abnormality detection device

Publications (1)

Publication Number Publication Date
CN112652359A (en) 2021-04-13

Family

ID=75366846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011624173.2A Pending CN112652359A (en) 2020-12-30 2020-12-30 Chromosome abnormality detection device

Country Status (1)

Country Link
CN (1) CN112652359A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014075228A1 (en) * 2012-11-13 2014-05-22 深圳华大基因医学有限公司 Method, system and computer readable medium for determining whether chromosome number variation exists in biological sample
CN103987856A (en) * 2011-12-17 2014-08-13 深圳华大基因医学有限公司 Method and system for determining whether genome is abnormal
CN107133495A (en) * 2017-05-04 2017-09-05 北京医院 A kind of analysis method and analysis system of aneuploidy biological information
CN110993029A (en) * 2019-12-26 2020-04-10 北京优迅医学检验实验室有限公司 Method and system for detecting chromosome abnormality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination