WO2016176847A1 - Kit, device and method for detecting chromosome aneuploidy - Google Patents

Kit, device and method for detecting chromosome aneuploidy

Info

Publication number
WO2016176847A1
Authority
WO
WIPO (PCT)
Prior art keywords
coverage
chromosome
chromosomes
copy number
value
Prior art date
Application number
PCT/CN2015/078422
Other languages
English (en)
French (fr)
Inventor
陈重建
梁峻彬
玄兆伶
李大为
Original Assignee
安诺优达基因科技(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 安诺优达基因科技(北京)有限公司
Priority to SG11201709141YA
Priority to JP2018509952A (patent JP6623400B2)
Priority to PCT/CN2015/078422 (patent WO2016176847A1)
Priority to EP15891099.2A (patent EP3293270B1)
Priority to US15/571,859 (patent US20180201990A1)
Publication of WO2016176847A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G16B30/10 Sequence alignment; Homology search

Definitions

  • The present invention relates to the field of biomedicine, and in particular to a kit, device and method for detecting chromosome aneuploidy.
  • NIPT: non-invasive prenatal testing.
  • Non-invasive prenatal testing has two advantages. First, NIPT carries no risk of miscarriage, whereas clinical karyotype analysis by invasive methods such as amniocentesis and cordocentesis carries a miscarriage risk of about 1/200, and some studies indicate that cordocentesis performed too early may also affect fetal position. Second, NIPT can be performed as early as 8 weeks of gestational age, so a risk assessment is available earlier, which reduces the risk that induced labor poses to the pregnant woman.
  • the main object of the present application is to provide a kit, apparatus and method for detecting chromosomal aneuploidy to reduce the false positive rate of detection.
  • A method for detecting chromosomal aneuploidy, comprising the steps of: performing high-throughput sequencing of cell-free DNA from the peripheral blood of a pregnant woman to be tested to obtain sequencing data covering all chromosomes;
  • dividing all chromosomes in the sequencing data into windows and calculating coverage to obtain the pre-correction coverage of each chromosome; performing a Z test on the number of unique sequences of the pregnant woman to be tested in each window to obtain Z CNV values, and obtaining the copy number abnormal segments of the pregnant woman according to the Z CNV values, where a copy number abnormal segment is a segment of 300 Kb or more in the sequencing data in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4;
  • correcting the pre-correction coverage of each chromosome for the influence of the copy number abnormal segments of the pregnant woman to be tested to obtain the post-correction coverage of each chromosome; and performing a Z test on each chromosome using its corrected coverage to obtain the Z aneu value of each chromosome.
  • In formulas (1) and (2), m represents the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n represents the length, in Mb, of the copy number abnormal segment of the pregnant woman to be tested; cn represents the number of times the copy number abnormal segment occurs. In formula (2), f represents the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, and f is assumed to be less than 50%. The pre-correction coverage of each chromosome is then corrected with a correction formula [formula not reproduced in the source text], in which one symbol [not reproduced] represents the pre-correction coverage of each chromosome and x' represents the corrected coverage of each chromosome.
  • the coverage is calculated by dividing all the chromosomes in the sequencing data into equal-sized windows to obtain the pre-correction coverage of each chromosome.
  • each window has a size of 100 Kb, and the degree of overlap between adjacent windows is 50%.
  • The step of performing a Z test on the number of unique sequences of the pregnant woman to be tested in each window to obtain the Z CNV values, and obtaining the copy number abnormal segments of the pregnant woman according to the Z CNV values, includes: counting the number of unique sequences in each window according to the sequencing depth of each sequence in the sequencing data; correcting the number of unique sequences in each window according to the GC content and alignment ratio of each chromosome to obtain a corrected value for each window; and normalizing the corrected value of each window to obtain the Z CNV value of that window, then judging from the size of the Z CNV values whether the pregnant woman to be tested has copy number abnormal segments. When the sequencing data contain a segment of 300 Kb or more in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4, that segment is considered a copy number abnormal segment of the pregnant woman to be tested.
  • The Z aneu value is calculated according to a formula [not reproduced in the source text], in which one term is the coverage value obtained from the known negative sample population by the LOESS algorithm and s is the standard deviation of the negative sample population.
  • A device for detecting chromosome aneuploidy, comprising the following modules: a sequencing data detection module, for performing high-throughput sequencing of cell-free DNA from the peripheral blood of a pregnant woman to be tested to obtain sequencing data covering all chromosomes; a first coverage calculation module, for dividing all chromosomes in the sequencing data into windows and calculating coverage to obtain the pre-correction coverage of each chromosome; a Z CNV value calculation module, for calculating the Z CNV value of the number of unique sequences of the pregnant woman to be tested in each window; a copy number abnormal segment query module, for searching the sequencing data for segments of 300 Kb or more in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4; a copy number abnormal segment determination module, for determining the segments found by the query module to be copy number abnormal segments of the pregnant woman to be tested;
  • a ⁇ second calculation module, for calculating the parameter ⁇ according to formula (2) in the case concerning the fetus and the maternal chromosome carrying the abnormal copy number (the inheritance condition is garbled in the source text), where m represents the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n represents the length, in Mb, of the copy number abnormal segment of the pregnant woman; cn represents the number of times the copy number abnormal segment of the pregnant woman occurs; and f represents the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, with f assumed to be less than 50%;
  • a correction module, for correcting the pre-correction coverage of each chromosome with a correction formula [not reproduced in the source text] to obtain the corrected coverage of each chromosome, where x' represents the corrected coverage of each chromosome; a second coverage calculation module, for calculating the Z aneu value of each chromosome from its corrected coverage; a Z aneu value judging module, for judging whether the Z aneu value is greater than or equal to 3; and a chromosome aneuploidy confirmation module, for determining that a chromosome is aneuploid when its Z aneu value is greater than or equal to 3.
  • The first coverage calculation module includes: a chromosome window division sub-module, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation sub-module, for calculating coverage over the equal-sized windows to obtain the pre-correction coverage of each chromosome.
  • each window has a size of 100 Kb, and the degree of overlap between adjacent two windows is 50%.
  • The unique sequence calculation module includes: a unique sequence statistics unit, for counting the number of unique sequences in each window according to the sequencing depth of each sequence in the sequencing data; a unique sequence coverage calculation unit, for correcting the number of unique sequences in each window according to the GC content and alignment ratio of each chromosome to obtain a corrected value for each window; and a unique sequence Z CNV value calculation unit, for normalizing the corrected value of each window to obtain the Z CNV value of that window.
  • The Z aneu value is calculated according to a formula [not reproduced in the source text], in which one term is the pre-correction coverage value obtained from the known negative sample population by the LOESS algorithm and s is the standard deviation of the negative sample population.
  • A kit for detecting chromosomal aneuploidy, comprising: a detection reagent and a detection device, for performing high-throughput sequencing of cell-free DNA from the peripheral blood of a pregnant woman to be tested to obtain sequencing data covering all chromosomes; a first coverage calculation device, for dividing all chromosomes in the sequencing data into windows and calculating coverage to obtain the pre-correction coverage of each chromosome; a unique sequence Z CNV value calculation device, for performing a Z test on the number of unique sequences of the pregnant woman to be tested in each window to obtain Z CNV values; a copy number abnormal segment query device, for searching the sequencing data for segments of 300 Kb or more in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4; a copy number abnormal segment confirmation device, for obtaining the copy number abnormal segments of the pregnant woman to be tested according to the Z CNV values;
  • a ⁇ first calculation device, for calculating the parameter ⁇ according to formula (1), where m represents the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n represents the length, in Mb, of the copy number abnormal segment of the pregnant woman; and cn represents the number of times the copy number abnormal segment of the pregnant woman occurs; a ⁇ second calculation device, for calculating the parameter ⁇ according to formula (2) in the case concerning the fetus and the maternal chromosome carrying the abnormal copy number (the inheritance condition is garbled in the source text), where m, n and cn are as defined above and f represents the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, with f assumed to be less than 50%;
  • a correction device, for correcting the pre-correction coverage of each chromosome with a correction formula [not reproduced in the source text] to obtain the corrected coverage of each chromosome, where x' represents the corrected coverage of each chromosome; a second coverage calculation device, for calculating the Z aneu value of each chromosome from its corrected coverage; a Z aneu value judging device, for judging whether the Z aneu value is greater than or equal to 3; and a chromosomal aneuploidy confirmation device, for determining that a chromosome is aneuploid when its Z aneu value is greater than or equal to 3.
  • The first coverage calculation device includes: a chromosome window division component, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation component, for calculating coverage over the equal-sized windows to obtain the pre-correction coverage of each chromosome.
  • each window has a size of 100 Kb, and the degree of overlap between adjacent windows is 50%.
  • The Z CNV value calculation device comprises: a unique sequence statistics component, for counting the number of unique sequences in each window according to the sequencing depth of each sequence in the sequencing data; a unique sequence coverage calculation component, for correcting the number of unique sequences in each window according to the GC content and alignment ratio of each chromosome to obtain a corrected value for each window; and a unique sequence Z CNV value calculation component, for normalizing the corrected value of each window to obtain the Z CNV value of that window.
  • The Z aneu value is calculated according to a formula [not reproduced in the source text], in which one term is the pre-correction coverage value obtained from the known negative sample population by the LOESS algorithm and s is the standard deviation of the negative sample population.
  • The influence of the maternal copy number abnormal segments on the calculated coverage of each chromosome is removed, giving the corrected coverage of each chromosome; chromosome aneuploidy calculated and judged from the corrected coverage of the present application is more accurate.
  • FIG. 1 is a flow chart showing a method of detecting chromosome aneuploidy in an exemplary embodiment of the present application
  • FIG. 2 is a schematic view showing the structure of an apparatus for detecting chromosome aneuploidy according to an exemplary embodiment of the present application
  • FIGS. 3A, 3B, and 3C respectively show schematic diagrams of correction results of aneuploidy detection of chromosomes 13, 18, and 21 according to Example 1 of the present application;
  • FIG. 4 is a schematic diagram showing the results of correction of aneuploidy on chromosome 21 of samples EK01875 and BD01462 according to Example 2 of the present application;
  • Figure 5 is a diagram showing the results of correction of aneuploidy detection of chromosome 21 of sample EK01875 according to Example 3 of the present application;
  • Figure 6 shows the corrected results of the aneuploidy detection of chromosome 21 of sample BD01462 according to Example 4 of the present application.
  • Z CNV and Z aneu are values calculated by the Z test in statistics, a method for testing the difference between means in large samples (that is, sample sizes greater than 30). It uses the theory of the standard normal distribution to infer the probability that a difference occurs, and thus judges whether the difference between two means is significant.
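  • As a minimal sketch of how a Z value maps to a probability under the standard normal distribution, the snippet below converts a Z value into a two-sided tail probability. The function name `two_sided_p_value` is illustrative and does not appear in the application.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided tail probability P(|Z| >= |z|) for a standard normal variate."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Thresholds named in the text: |Z CNV| >= 4 for windows, Z aneu >= 3 for chromosomes.
for threshold in (3.0, 4.0):
    print(threshold, two_sided_p_value(threshold))
```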
  • The alignment ratio is the proportion of sequencing sequences within a window that align to the genomic reference sequence. Because a sequencing sequence may align to several positions on the genomic reference sequence at once, and so may not be a unique sequence, the alignment ratio within a window is greater than the alignment ratio of unique sequences.
  • First, cff-DNA is derived from the placenta, so if confined placental mosaicism (CPM) is present it is difficult to assess the fetal condition accurately from NIPT results, and the results are prone to error. Second, if the pregnant woman herself carries a CNV, a method that counts coverage from MPS data and converts it to a Z value becomes inaccurate: when the pregnant woman carries a duplicated segment, relatively more unique sequences align to that chromosome, and the higher coverage raises the Z value, increasing the risk of a false positive. Conversely, when the pregnant woman carries a deleted segment, the Z value falls, increasing the risk of a false negative.
  • CPM: confined placental mosaicism
  • CNV: maternal copy number abnormality
  • The present application proposes a method for detecting chromosome aneuploidy. As shown in FIG. 1, the method includes the following steps: performing high-throughput sequencing of cell-free DNA from the peripheral blood of the pregnant woman to be tested to obtain sequencing data covering all chromosomes; dividing all chromosomes in the sequencing data into windows and calculating coverage to obtain the pre-correction coverage of each chromosome; calculating the Z CNV value of the number of unique sequences of the pregnant woman to be tested in each window, and obtaining the copy number abnormal segments of the pregnant woman according to the Z CNV values, where a copy number abnormal segment is a segment of 300 Kb or more in the sequencing data in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4; correcting the pre-correction coverage of each chromosome for the influence of the copy number abnormal segments of the pregnant woman to be tested to obtain the corrected coverage of each chromosome; and calculating the Z aneu value of each chromosome from its corrected coverage.
  • the influence of the copy number abnormal segment of the pregnant woman to be tested on the pre-correction coverage of each chromosome is represented by the parameter ⁇ .
  • the calculation formula of the parameter ⁇ is as shown in the formula (1):
  • In formulas (1) and (2), m represents the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n represents the length, in Mb, of the copy number abnormal segment of the pregnant woman to be tested; cn represents the number of times the copy number abnormal segment occurs. In formula (2), f represents the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, and f is assumed to be less than 50%. The coverage of each chromosome is then corrected with a correction formula [formula not reproduced in the source text], in which one symbol [not reproduced] represents the pre-correction coverage of each chromosome and x' represents the corrected coverage of each chromosome.
  • Unlike the prior art, the above method does not simply delete the maternal copy number abnormal segments from the sequencing data. Instead, it screens for copy number abnormal segments of a specific size on the maternal chromosomes and, when judging whether a chromosome is aneuploid, removes the influence of those segments on the calculated coverage of each chromosome. This yields the corrected coverage of each chromosome, so the chromosome aneuploidy results obtained by the method are more accurate.
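  • Formulas (1) and (2) are not reproduced in this text, so the sketch below does not implement them. It is only a first-principles approximation of how a maternal segment of length n Mb and copy number cn, on a chromosome of effective length m Mb, could shift that chromosome's relative coverage, and of how such a shift could be divided out. The function names, the two inheritance cases, the assumption that an inheriting fetus carries the same copy number as the mother, and the correction form x / (1 + shift) are all assumptions made for illustration.

```python
def cnv_coverage_shift(m_mb: float, n_mb: float, cn: float,
                       fetal_fraction: float, fetus_inherits: bool) -> float:
    """Approximate relative coverage shift caused by a maternal CNV (assumed model)."""
    if not 0.0 <= fetal_fraction < 0.5:
        raise ValueError("fetal fraction f is assumed to be below 50%")
    if not 0.0 < n_mb <= m_mb:
        raise ValueError("segment length must be positive and fit on the chromosome")
    # Extra (or missing) maternal material relative to two normal copies.
    maternal_shift = (n_mb / m_mb) * (cn - 2.0) / 2.0
    if fetus_inherits:
        # Assumption: the fetal DNA carries the same copy number as the mother.
        return maternal_shift
    # Assumption: only the maternal share (1 - f) of the cell-free DNA is affected.
    return (1.0 - fetal_fraction) * maternal_shift


def correct_coverage(pre_correction: float, shift: float) -> float:
    """Divide out the assumed shift to approximate the corrected coverage x'."""
    return pre_correction / (1.0 + shift)


shift = cnv_coverage_shift(m_mb=100.0, n_mb=1.0, cn=3, fetal_fraction=0.1,
                           fetus_inherits=False)
print(shift, correct_coverage(1.0045, shift))
```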
  • The concentration f of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested is calculated by a method conventional in the art.
  • For example, the concentration of fetal cell-free DNA can be determined as follows.
  • The fetal and placenta-derived RASSF1A gene (on chromosome 3) is highly methylated, whereas the maternal RASSF1A gene is unmethylated. When the cell-free DNA is treated with methylation-sensitive enzymes such as HhaI, BstUI (30U) and HpaII, the unmethylated copies are digested and the methylated copies are not, so the fetal cffDNA content can be measured by Q-PCR.
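  • The text does not say how the Q-PCR readout is converted into a concentration. One hedged possibility is to compare the RASSF1A cycle threshold (Ct) of a digested aliquot with that of an undigested aliquot. The sketch assumes perfect doubling per PCR cycle and complete digestion of unmethylated copies; the function name and both assumptions are illustrative, not taken from the application.

```python
def fetal_fraction_from_ct(ct_digested: float, ct_undigested: float) -> float:
    """Estimate the fetal fraction from RASSF1A Q-PCR cycle thresholds.

    Assumes 100% amplification efficiency (template doubles every cycle), so
    the surviving methylated share is 2 ** (ct_undigested - ct_digested).
    """
    if ct_digested < ct_undigested:
        raise ValueError("digestion cannot increase the amount of template")
    return 2.0 ** (ct_undigested - ct_digested)


# A digested aliquot crossing the threshold 3 cycles later retains 1/8 of the template.
print(fetal_fraction_from_ct(ct_digested=31.0, ct_undigested=28.0))
```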
  • Relatively robust chromosome coverage can be obtained by dividing each chromosome into windows before calculating coverage.
  • coverage is calculated for all chromosomes in the sequenced data in a form that is divided into equally sized windows to obtain pre-correction coverage for each chromosome.
  • Coverage is calculated over windows of 100 Kb, with an overlap of 50% between adjacent windows. Setting the window size to 100 Kb gives relatively robust chromosome coverage, and the 50% overlap between adjacent windows improves the accuracy of detecting copy number abnormal segments, and thereby the efficiency of detecting maternal copy number abnormal segments.
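  • The windowing described above can be sketched as follows. The function name `sliding_windows` and the choice to drop a trailing partial window are assumptions; the application only fixes the window size (100 Kb) and the overlap (50%).

```python
from typing import Iterator, Tuple


def sliding_windows(chrom_length: int, size: int = 100_000,
                    overlap: float = 0.5) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows of fixed size with fractional overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = int(size * (1.0 - overlap))
    if step <= 0:
        raise ValueError("window size and overlap leave no forward step")
    start = 0
    # Only full-size windows are emitted; a shorter tail is dropped (assumption).
    while start + size <= chrom_length:
        yield start, start + size
        start += step


for window in sliding_windows(250_000):
    print(window)
```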
  • The step of calculating the Z CNV value of the number of unique sequences of the pregnant woman to be tested in each window, and obtaining the copy number abnormal segments of the pregnant woman according to the Z CNV values, includes: counting the number of unique sequences in each window according to the sequencing depth of each sequence in the sequencing data; correcting the number of unique sequences in each window according to the GC content and alignment ratio of each chromosome to obtain a corrected value for each window; and normalizing the corrected value of each window to obtain the Z CNV value of that window, then judging from the size of the Z CNV values whether the pregnant woman to be tested has copy number abnormal segments. When a segment of 300 Kb or more has more than 80% of its windows with a Z CNV value greater than or equal to 4 or less than or equal to -4, that segment is considered a copy number abnormal segment of the pregnant woman to be tested.
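  • The counting step can be sketched as below. The input layout, a list of (position, number of alignment hits) pairs for one chromosome, and the function name are hypothetical; a read is treated as a unique sequence when it aligns to exactly one position, and it is counted in every overlapping window that contains it.

```python
from typing import List, Sequence, Tuple


def count_unique_reads(reads: Sequence[Tuple[int, int]],
                       windows: Sequence[Tuple[int, int]]) -> List[int]:
    """Count uniquely aligned reads per half-open (start, end) window."""
    counts = [0] * len(windows)
    for position, n_hits in reads:
        if n_hits != 1:
            continue  # multi-mapped reads are not unique sequences
        for index, (start, end) in enumerate(windows):
            if start <= position < end:
                counts[index] += 1
    return counts


windows = [(0, 100), (50, 150)]
reads = [(10, 1), (60, 1), (60, 2), (120, 1)]
print(count_unique_reads(reads, windows))
```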
  • In the present application, normalization means computing (x - u)/sd on the corrected value of the number of unique sequences in each window, where x is the corrected value, u is the mean of x, and sd is the standard deviation.
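  • A minimal sketch of this normalization follows. Whether sd is the sample or the population standard deviation is not stated; the sketch assumes the sample standard deviation, and the function name is illustrative.

```python
from statistics import mean, stdev
from typing import List, Sequence


def window_z_scores(corrected_values: Sequence[float]) -> List[float]:
    """Return (x - u) / sd for each corrected window value."""
    u = mean(corrected_values)
    sd = stdev(corrected_values)  # sample standard deviation (assumption)
    if sd == 0:
        raise ValueError("all window values are identical; Z is undefined")
    return [(x - u) / sd for x in corrected_values]


print(window_z_scores([1.0, 2.0, 3.0]))
```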
  • By screening for segments of 300 Kb or more in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4, the present application detects reliable maternal copy number abnormal segments and uses them to correct the Z test value of the chromosome on which they are located, avoiding false-negative judgments caused by errors in detecting maternal copy number abnormal segments.
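  • One way to apply the 300 Kb / 80% / |Z| >= 4 criterion to a run of windows is sketched below. The greedy search, the requirement that a segment start and end on a flagged window, and the function name are assumptions; the sketch also does not separate gains (Z >= 4) from losses (Z <= -4).

```python
from typing import List, Sequence, Tuple


def call_cnv_segments(windows: Sequence[Tuple[int, int]],
                      z_values: Sequence[float],
                      min_length: int = 300_000,
                      min_fraction: float = 0.8,
                      z_cutoff: float = 4.0) -> List[Tuple[int, int]]:
    """Greedily report the longest qualifying span starting at each flagged window."""
    flagged = [abs(z) >= z_cutoff for z in z_values]
    segments: List[Tuple[int, int]] = []
    i, n = 0, len(windows)
    while i < n:
        if not flagged[i]:
            i += 1
            continue
        best_end, hits = None, 0
        for j in range(i, n):
            hits += flagged[j]
            fraction = hits / (j - i + 1)
            length = windows[j][1] - windows[i][0]
            if flagged[j] and fraction > min_fraction and length >= min_length:
                best_end = j
        if best_end is None:
            i += 1
        else:
            segments.append((windows[i][0], windows[best_end][1]))
            i = best_end + 1
    return segments


windows = [(s, s + 100_000) for s in range(0, 500_000, 50_000)]
print(call_cnv_segments(windows, [0, 5, 5, 5, 5, 5, 5, 0, 0, 0]))
```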
  • The Z aneu value is calculated according to a formula [not reproduced in the source text], in which one term is the coverage value obtained from the known negative sample population by the LOESS algorithm and s is the standard deviation of the negative sample population.
  • the corrected Z aneu value calculated by the above formula can more accurately reflect the aneuploidy of the chromosome, so that the detection result is more accurate.
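  • Because the Z aneu formula itself is not reproduced in this text, the sketch below assumes the usual Z-score form, (x' - reference) / s, where the reference coverage is supplied by the caller (in the application it comes from a LOESS fit on known negative samples, which is not implemented here) and s is the sample standard deviation of the negative population. All names are illustrative.

```python
from statistics import stdev
from typing import Sequence


def z_aneu(corrected_coverage: float, reference_coverage: float,
           negative_population: Sequence[float]) -> float:
    """Assumed form: (x' - reference coverage) / s over the negative samples."""
    s = stdev(negative_population)
    if s == 0:
        raise ValueError("negative population has zero spread")
    return (corrected_coverage - reference_coverage) / s


def is_aneuploid(z: float, cutoff: float = 3.0) -> bool:
    """Apply the Z aneu >= 3 rule stated in the text."""
    return z >= cutoff


negatives = [0.99, 1.00, 1.01]
z = z_aneu(1.05, 1.00, negatives)
print(z, is_aneuploid(z))
```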
  • A device for detecting chromosome aneuploidy, comprising the following modules: a sequencing data detection module, for performing high-throughput sequencing of cell-free DNA from the peripheral blood of a pregnant woman to be tested to obtain sequencing data covering all chromosomes; a first coverage calculation module, for dividing all chromosomes in the sequencing data into windows and calculating coverage to obtain the pre-correction coverage of each chromosome; a Z CNV value calculation module, for calculating the Z CNV value of the number of unique sequences of the pregnant woman to be tested in each window; a copy number abnormal segment query module, for searching the sequencing data for segments of 300 Kb or more in which more than 80% of the windows have a Z CNV value greater than or equal to 4 or less than or equal to -4; a copy number abnormal segment determination module, for determining the segments of 300 Kb or more found in the sequencing data to be copy number abnormal segments of the pregnant woman to be tested;
  • a ⁇ second calculation module, for calculating the parameter ⁇ according to formula (2) in the case concerning the fetus and the maternal chromosome carrying the abnormal copy number (the inheritance condition is garbled in the source text), where m represents the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n represents the length, in Mb, of the copy number abnormal segment of the pregnant woman; cn represents the number of times the copy number abnormal segment of the pregnant woman occurs; and f represents the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, with f assumed to be less than 50%;
  • a correction module, for correcting the pre-correction coverage of each chromosome with a correction formula [not reproduced in the source text] to obtain the corrected coverage of each chromosome, where x' represents the corrected coverage of each chromosome; a second coverage calculation module, for calculating the Z aneu value of each chromosome from its corrected coverage; a Z aneu value judging module, for judging whether the Z aneu value is greater than or equal to 3; and a chromosome aneuploidy confirmation module, for determining that a chromosome is aneuploid when its Z aneu value is greater than or equal to 3.
  • The present application screens the maternal chromosomes for regions of at least 300 Kb in which more than 80% of the windows have a Z CNV value of 4 or more, or -4 or less. The above device can therefore detect reliable maternal copy number abnormal segments and use them to correct the Z test value of the chromosome on which they are located, avoiding false-negative judgments caused by errors in detecting maternal copy number abnormal segments.
  • the chromosome aneuploidy confirmation module of the present application confirms the aneuploidy of the chromosome more accurately.
  • The fetal concentration used in the calculation formula of the parameter ⁇ is obtained by a method conventional in the art, as described above, and is not described again here.
  • The foregoing modules may run in a computing terminal as part of the device, and the processor of the computer terminal may execute the technical solutions implemented by the sequencing data detection module, the first coverage calculation module, the unique sequence calculation module, the copy number abnormal segment query module, the copy number abnormal segment confirmation module, the ⁇ first calculation module, the ⁇ second calculation module, the correction module, the second coverage calculation module, and the chromosome aneuploidy determination module. The computer terminal is a hardware-implemented device, and the processor is likewise a hardware device for executing the program.
  • the above various functional sub-modules provided by the present application may be operated in a mobile terminal, a computer terminal or the like, or may be stored as part of a storage medium.
  • The first coverage calculation module may be a calculation module conventional in the art, adjusted as appropriate for the sequencing data at hand.
  • The first coverage calculation module includes: a chromosome window segmentation sub-module, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation sub-module, for calculating coverage over the equal-sized windows to obtain the pre-correction coverage of each chromosome. Because the first coverage calculation module, comprising these two sub-modules, calculates over equal-sized windows, it obtains a relatively robust coverage.
  • Each window is 100 kb in size, and adjacent windows overlap by 50%.
  • A calculation module that uses 100 kb windows obtains relatively robust coverage on the one hand; on the other hand, the increased overlap between windows improves the accuracy of detecting copy number abnormal segments, and thus the detection efficiency for the pregnant woman's copy number abnormal segments.
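  • The window layout described above (100 kb windows, 50% overlap between neighbours) can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `make_windows`, the half-open coordinate convention, and the choice to drop a trailing partial window are assumptions made here.

```python
def make_windows(chrom_length, window_size=100_000, overlap=0.5):
    """Return half-open (start, end) windows of equal size along one chromosome."""
    step = int(window_size * (1 - overlap))  # 50% overlap -> 50 kb step
    if step <= 0:
        raise ValueError("overlap is too large for this window size")
    windows = []
    start = 0
    # Assumption: a trailing partial window is dropped so all windows stay equal-sized.
    while start + window_size <= chrom_length:
        windows.append((start, start + window_size))
        start += step
    return windows


# Hypothetical 1 Mb chromosome, used only to show the layout.
windows = make_windows(1_000_000)
print(len(windows), windows[:3])
```

  • With a 50% overlap every base away from the chromosome ends is covered by two windows, which is what lets a segment boundary be located at 50 kb rather than 100 kb resolution.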
  • The single sequence calculation module may be a conventional calculation module.
  • The single sequence calculation module further includes: a single sequence statistics unit, for counting the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; a single sequence coverage calculation unit, for correcting the number of single sequences in each window according to the GC content and alignment rate of each chromosome, to obtain a corrected value of the number of single sequences in each window; and a single sequence Z_CNV value calculation unit, for normalizing the corrected value of the number of single sequences in each window to obtain the Z_CNV value of the number of single sequences in each window.
  • The above single sequence calculation module first runs the single sequence statistics unit to count the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; it then runs the single sequence coverage calculation unit, which calculates on the number of single sequences in each window according to the GC content and alignment rate of each chromosome to obtain the pre-correction coverage of the number of single sequences in each window; finally it runs the single sequence Z_CNV value calculation unit, which normalizes that pre-correction coverage to obtain the Z_CNV value of the number of single sequences in each window.
  • The above units are appropriate adjustments of calculation units conventional in the art. They are the basis and precondition for the query performed by the copy number abnormal segment query module and the confirmation performed by the copy number abnormal segment confirmation module, and they provide the basis for accurately determining whether a maternal DNA copy number abnormal segment is present in the sample to be tested.
  • In the second coverage calculation module, Z_aneu is calculated according to a formula (formula image not reproduced in this text) whose reference term is the pre-correction coverage value obtained from a known negative sample population by the LOESS algorithm; s is the standard deviation of that term in the negative sample population.
  • the corrected Z aneu value calculated by the above formula can more accurately reflect the aneuploidy of the chromosome, so that the detection result is more accurate.
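  • The formula image for Z_aneu is not reproduced in this text, so the following is only a hedged sketch of a conventional Z score built from the quantities the text names: a corrected chromosome coverage, a reference coverage from known negative samples, and the standard deviation s in that population. The plain sample mean is used here as a stand-in for the LOESS-derived reference value, and the function name and sample numbers are hypothetical.

```python
from statistics import mean, stdev


def z_aneu(corrected_coverage, negative_coverages):
    """Z score of one chromosome's corrected coverage against negative samples."""
    # Stand-in: the text derives the reference value with LOESS; a plain mean is used here.
    reference = mean(negative_coverages)
    s = stdev(negative_coverages)  # spread of the reference population
    return (corrected_coverage - reference) / s


# Hypothetical relative coverages of one chromosome in six negative samples.
negatives = [1.00, 1.01, 0.99, 1.00, 1.02, 0.98]
z = z_aneu(1.05, negatives)
print(round(z, 2))
```

  • The resulting value would then be compared with the threshold of 3 stated in the text (the method statement applies it to the absolute value).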
  • A kit for detecting chromosomal aneuploidy is provided, comprising: a sequencing data detection instrument, for performing high-throughput sequencing of the peripheral blood cell-free DNA of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes; a first coverage calculation instrument, for calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome; a Z_CNV value calculation instrument, for calculating the Z_CNV value of the number of single sequences of the pregnant woman in each window; a copy number abnormal segment query instrument, for searching the sequencing data for segments of 300 kb or more in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4; a copy number abnormal segment determining instrument, for determining the segments of 300 kb or more found in the sequencing data, in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4, as the copy number abnormal segments of the pregnant woman to be tested;
  • α second calculation instrument: for calculating the parameter α according to formula (2) in the case where the fetus has not inherited the mother's copy number abnormal chromosome, where m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment; cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment; and f denotes the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, with f assumed to be less than 50%;
  • Correction instrument: for correcting the pre-correction coverage of each chromosome with the correction formula (formula image not reproduced in this text) to obtain the corrected coverage of each chromosome, where x' represents the corrected coverage of each chromosome; second coverage calculation instrument: for calculating the Z_aneu value of each chromosome from its corrected coverage; Z_aneu value judging instrument: for judging whether the Z_aneu value is greater than or equal to 3; chromosomal aneuploidy confirmation instrument: for determining that the chromosome is aneuploid when the Z_aneu value is greater than or equal to 3.
  • By screening the maternal chromosomes for regions of at least 300 kb in which 80% of the windows have a Z_CNV value greater than or equal to 4 or less than or equal to -4, the above kit of the present application can detect reliable copy number abnormal segments of the pregnant woman and use them to correct the Z_CNV value of the chromosome on which they are located. This avoids false negative results caused by errors in detecting the pregnant woman's copy number abnormal segments.
  • The correction instrument corrects for the effect of the copy number abnormal segment on the calculated coverage of each chromosome, so that the aneuploidy result confirmed by the chromosome aneuploidy confirmation instrument of the present application is more accurate.
  • The fetal concentration in the formula for the parameter α is calculated by a method conventional in the art, which is described above and is not repeated here.
  • The above instruments and components of the present application may run in a computer terminal as part of the kit, and the processor of the computer terminal may execute the technical solutions implemented by the sequencing data detection instrument, the first coverage calculation instrument, the single sequence calculation instrument, the copy number abnormal segment query instrument, the copy number abnormal segment confirmation instrument, the α first calculation instrument, the α second calculation instrument, the correction instrument, the second coverage calculation instrument, and the chromosomal aneuploidy determination instrument.
  • The computer terminal is a hardware device, and the processor is likewise a hardware device for executing programs.
  • The functional sub-instruments provided by the present application may run in a mobile terminal, a computer terminal or a similar computing device, or may be stored as part of a storage medium.
  • The first coverage calculation instrument may be a conventional calculation instrument, adjusted as appropriate for the sequencing data at hand.
  • The first coverage calculation instrument includes: a chromosome window segmentation component, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation component, for calculating coverage over the equal-sized windows to obtain the pre-correction coverage of each chromosome.
  • Because the first coverage calculation instrument of the present application, comprising the chromosome window segmentation component and the first coverage calculation component, calculates over equal-sized windows, it obtains a relatively robust coverage.
  • Each window is 100 kb in size, and adjacent windows overlap by 50%.
  • A calculation instrument that uses 100 kb windows obtains relatively robust coverage on the one hand; on the other hand, the increased overlap between windows improves the accuracy of detecting copy number abnormal segments, and thus the detection efficiency for the pregnant woman's copy number abnormal segments.
  • The single sequence calculation instrument may be a conventional calculation instrument.
  • The single sequence calculation instrument further includes: a single sequence statistics unit, for counting the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; a single sequence coverage calculation unit, for calculating on the number of single sequences in each window according to the GC content and alignment rate of each chromosome, to obtain the pre-correction coverage of the number of single sequences in each window; and a single sequence Z_CNV value calculation unit, for normalizing that pre-correction coverage to obtain the Z_CNV value of the number of single sequences in each window.
  • The above single sequence calculation instrument first runs the single sequence statistics unit to count the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; it then runs the single sequence coverage calculation unit, which corrects the number of single sequences in each window according to the GC content and alignment rate of each chromosome to obtain a corrected value of the number of single sequences in each window; finally it runs the single sequence Z_CNV value calculation unit, which normalizes the corrected value to obtain the Z_CNV value of the number of single sequences in each window.
  • The above units are appropriate adjustments of calculation and correction units conventional in the art. They are the basis and precondition for the query performed by the copy number abnormal segment query instrument and the confirmation performed by the copy number abnormal segment confirmation instrument, and they provide the basis for accurately determining whether a maternal DNA copy number abnormal segment is present in the sample to be tested.
  • In the second coverage calculation instrument, Z_aneu is calculated according to a formula (formula image not reproduced in this text) whose reference term is the pre-correction coverage value obtained from a known negative sample population by the LOESS algorithm; s is the standard deviation of that term in the negative sample population.
  • the corrected Z aneu value calculated by the above formula can more accurately reflect the aneuploidy of the chromosome, so that the detection result is more accurate.
  • To test the effect of the pregnant woman copy number abnormal segment correction on chromosome aneuploidy detection, this embodiment generated a set of simulated pregnant-woman data based on the Poisson distribution and added a defined copy number abnormal segment to each of chromosomes 18 and 21 in the simulated data, with segment sizes running from 0.5 Mb to 5 Mb in steps of 0.25 Mb.
  • Normal human DNA was then mixed, at three different concentrations, into the simulated data containing the copy number abnormal segments.
  • The whole process simulated the effect of copy number abnormal segments of different sizes on the coverage of chromosomes 13, 18 and 21 at different fetal concentrations, and tested the effect of the pregnant woman copy number abnormal segment correction on chromosome aneuploidy detection. All calculations assumed that the fetus had not inherited the pregnant woman's copy number abnormal segment.
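  • A simulation of this kind might be set up along the following lines. This is a sketch under stated assumptions, not the procedure actually used: the window count, the mean depth of 50 single sequences per window, the maternal copy number of 3, and the mixing rule (maternal DNA at fraction 1 - f carrying the abnormal copy number, fetal DNA at fraction f carrying two copies) are all illustrative choices, and the Poisson sampler is Knuth's textbook method.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_counts(n_windows, lam, cnv_windows, maternal_cn, fetal_fraction, rng):
    """Per-window counts with a maternal CNV that the fetus did not inherit."""
    # Assumed mixing rule: relative depth inside the CNV for a two-copy fetus.
    scale = ((1 - fetal_fraction) * maternal_cn + fetal_fraction * 2) / 2
    return [poisson(lam * (scale if i in cnv_windows else 1.0), rng)
            for i in range(n_windows)]


# Segment sizes from the text: 0.5 Mb to 5 Mb in 0.25 Mb steps.
sizes_mb = [0.5 + 0.25 * i for i in range(19)]
# Fetal concentrations from the text: 5%, 10% and 15%.
fetal_fractions = [0.05, 0.10, 0.15]

rng = random.Random(0)          # fixed seed for repeatability
cnv = set(range(20, 40))        # hypothetical window indices inside the CNV
counts = simulate_counts(200, 50.0, cnv, 3, 0.10, rng)
```

  • Repeating the draw for each entry of `sizes_mb` and `fetal_fractions` would give the grid of conditions the text describes.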
  • The test results are shown in Figures 3A, 3B and 3C. The abscissa represents the size of the pregnant woman's copy number abnormal segment in the sample, and the ordinate represents the chromosome Z value of the sample. The solid line indicates the chromosome Z value before correction, and the broken line indicates the Z value calculated from the chromosome coverage after the pregnant woman copy number abnormal segment correction, i.e. the Z_aneu value. Squares, circles and triangles indicate fetal concentrations of 5%, 10% and 15%, respectively.
  • The broken line in the figures, i.e. the chromosome Z value calculated from the coverage corrected by the method of the present application (the Z_aneu value), stays close to the baseline of 0. This indicates that, under all the conditions tested, the correction for the pregnant woman's copy number abnormal segments is highly effective in the detection of chromosomal aneuploidy.
  • The detection device and the kit provided by the present application were also each applied to the detection of chromosomal aneuploidy in actual case samples. The following case samples were tested; see Example 2 and Example 3 for details.
  • The number of single sequences in each window is counted; the number of single sequences in each window is corrected according to the GC content and alignment rate of each chromosome, to obtain a corrected value of the number of single sequences in each window; the corrected values are normalized to obtain the Z_CNV value of the number of single sequences in each window, and whether the pregnant woman to be tested has a copy number abnormal segment is judged from the magnitude of the Z_CNV values. When the sequencing data contain a segment of 300 kb or more in which the Z_CNV values of the number of single sequences in 80% or more of the windows are greater than or equal to 4 or less than or equal to -4, that segment is considered a copy number abnormal segment of the pregnant woman to be tested.
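  • The segment criterion above (at least 300 kb, with at least 80% of windows at Z_CNV ≥ 4 or ≤ -4) can be checked for a candidate region as sketched below. The function name, the tuple layout and the example Z values are hypothetical, and the sketch reads the criterion literally by counting windows whose absolute Z_CNV reaches 4; how candidate regions are proposed in the first place is not specified in the text and is left out.

```python
def is_abnormal_segment(windows, z_cut=4.0, min_length=300_000, min_fraction=0.8):
    """Check one candidate region given as (start, end, z_cnv) tuples sorted by start."""
    if not windows:
        return False
    length = windows[-1][1] - windows[0][0]          # span of the region in bp
    extreme = sum(1 for _, _, z in windows if abs(z) >= z_cut)
    return length >= min_length and extreme / len(windows) >= min_fraction


# Hypothetical candidate: seven 100 kb windows at 50 kb steps, spanning 0-400 kb.
z_values = [5.1, 4.3, 4.8, 2.0, 6.2, 4.0, 4.9]
region = [(i * 50_000, i * 50_000 + 100_000, z) for i, z in enumerate(z_values)]
print(is_abnormal_segment(region))
```

  • Here six of the seven windows reach the cut-off (about 86%) over a 400 kb span, so the region qualifies; the first three windows alone span only 200 kb and would not.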
  • The left panel of Figure 4 is the chromosome 21 Z-value chart of all samples as detected by the existing detection method. The Z values of the negative samples are almost all below 3 and approximately normally distributed.
  • The circle in the figure is sample EK01875, with a Z value of 4.66.
  • The triangle is sample BD01462, with a Z value of 3.87.
  • The above sample (sample EK01875, maternal age 29 years, gestational age approximately 18 weeks) was tested using the device for detecting chromosomal aneuploidy of the present application, the device comprising:
  • Sequencing data detection module: for performing high-throughput sequencing of the peripheral blood cell-free DNA of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes;
  • First coverage calculation module: for calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome;
  • Z_CNV value calculation module: for calculating the Z_CNV value of the number of single sequences of the pregnant woman to be tested in each window;
  • Copy number abnormal segment query module: for searching the sequencing data for segments of 300 kb or more in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4;
  • Copy number abnormal segment determining module: for determining the segments of 300 kb or more found in the sequencing data, in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4, as the copy number abnormal segments of the pregnant woman to be tested;
  • α first calculation module: for calculating the parameter α according to formula (1) in the case where the fetus has inherited the mother's copy number abnormal segment;
  • α second calculation module: for calculating the parameter α according to formula (2) in the case where the fetus has not inherited the mother's copy number abnormal chromosome;
  • Correction module: for correcting the pre-correction coverage of each chromosome to obtain the corrected coverage of each chromosome;
  • Second coverage calculation module: for calculating the Z_aneu value of each chromosome from its corrected coverage;
  • Z_aneu value judgment module: for judging whether the Z_aneu value is greater than or equal to 3;
  • Chromosomal aneuploidy confirmation module: for determining that the chromosome is aneuploid when the Z_aneu value is greater than or equal to 3.
  • The position given by the chip detection result matches the position detected by the device of the present application almost 100%.
  • According to the device of the present application, the parameter α describing the influence of the pregnant woman's copy number abnormal segment on the calculated chromosome coverage is 1.012, and the Z value indicating whether the chromosome is aneuploid is corrected from the original 4.66 to 2.36, so the call changes to negative.
  • Sample BD01462 (maternal age 24 years, gestational age approximately 24 weeks) was tested using the kit of the present application.
  • The kit includes:
  • Sequencing data detection reagents and instruments: for performing high-throughput sequencing of the peripheral blood cell-free DNA of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes;
  • First coverage calculation instrument: for calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome;
  • Single sequence calculation instrument: for calculating the Z_CNV value of the number of single sequences of the pregnant woman to be tested in each window;
  • Copy number abnormal segment query instrument: for searching the sequencing data for segments of 300 kb or more in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4;
  • Copy number abnormal segment determining instrument: for determining the segments of 300 kb or more found in the sequencing data, in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4, as the copy number abnormal segments of the pregnant woman to be tested;
  • α first calculation instrument: for calculating the parameter α according to formula (1) in the case where the fetus has inherited the mother's copy number abnormal segment;
  • α second calculation instrument: for calculating the parameter α according to formula (2) in the case where the fetus has not inherited the mother's copy number abnormal chromosome;
  • Correction instrument: for correcting the pre-correction coverage of each chromosome to obtain the corrected coverage of each chromosome;
  • Second coverage calculation instrument: for calculating the Z_aneu value of each chromosome from its corrected coverage;
  • Z_aneu value judgment instrument: for judging whether the Z_aneu value is greater than or equal to 3;
  • Chromosomal aneuploidy confirmation instrument: for determining that the chromosome is aneuploid when the Z_aneu value is greater than or equal to 3.
  • Although the detected copy number is 4, which differs slightly from the detection result of the present application, the position of the result matches the position detected by the kit of the present application almost 100%, which also indicates the accuracy of the detection method of the present application.
  • According to the kit of the present application, the parameter α describing the influence of the pregnant woman's copy number abnormal segment on the calculated chromosome coverage is 1.09, and the Z value indicating whether the chromosome is aneuploid is corrected from the original 3.87 to 1.83, so the call changes to negative.
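  • The effect of the correction on the two reported samples can be restated with the decision threshold of 3 given in the text. The sketch below only re-applies that threshold to the Z values quoted above (4.66 and 2.36 for EK01875, 3.87 and 1.83 for BD01462); it does not recompute them, and the function and variable names are hypothetical.

```python
THRESHOLD = 3.0  # decision threshold for the chromosome Z value, as stated in the text

# (Z before correction, Z after correction) as reported for chromosome 21.
reported = {
    "EK01875": (4.66, 2.36),
    "BD01462": (3.87, 1.83),
}


def call(z):
    """Classify one chromosome Z value against the threshold."""
    return "positive" if abs(z) >= THRESHOLD else "negative"


results = {sample: (call(before), call(after))
           for sample, (before, after) in reported.items()}
for sample, (before, after) in results.items():
    print(sample, before, "->", after)
```

  • Both samples cross the threshold before correction and fall below it afterwards, which is the change from positive to negative described above.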
  • The above embodiments of the present application achieve the following technical effects:
  • Unlike the prior art, which simply removes maternal copy number abnormal segments from the sequencing data and disregards them, the present application expresses the effect that copy number abnormal segments of a specific size on the maternal chromosomes have on the calculation of chromosomal aneuploidy through the parameter α. The parameter α is used to correct the coverage of each chromosome, which reduces the influence of the copy number abnormal segments on the determination of chromosomal aneuploidy instead of ignoring their existence, so that the chromosomal aneuploidy results detected by the method of the present application are more accurate.
  • The method, device or kit of the present application provides an NIPT method for detecting fetal chromosomal aneuploidy that is hardly affected by the pregnant woman's copy number abnormal segments, improves detection accuracy, and is suitable for large-scale use.
  • The modules, elements or steps of the present application described above may be implemented by a general-purpose computing device, and may be centralized on a single computing device or distributed across multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; or they may each be fabricated as an individual integrated circuit module, or several of these modules or steps may be fabricated as a single integrated circuit module. Thus, the application is not limited to any particular combination of hardware and software.

Abstract

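A kit, device and method for detecting chromosomal aneuploidy. The method comprises: sequencing the peripheral blood cell-free DNA of a pregnant woman to be tested, to obtain sequencing data covering all chromosomes; calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome; calculating the Z_CNV value of the number of single sequences in each window, and obtaining the copy number abnormal segments of the pregnant woman from the magnitude of the Z_CNV values; correcting the pre-correction coverage for the effect of the copy number abnormal segments on it, to obtain the corrected coverage; and calculating the Z_aneu value of each chromosome from its corrected coverage, the chromosome being aneuploid when the absolute value of Z_aneu is greater than or equal to 3.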

Description

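Kit, device and method for detecting chromosomal aneuploidy
Technical Field
The present invention relates to the field of biomedicine, and in particular to a kit, device and method for detecting chromosomal aneuploidy.
Background Art
Nearly 20 years have passed since cell-free fetal DNA (cff-DNA) was discovered by Lo in 1997, and it is this discovery that made many non-invasive prenatal testing (NIPT) methods possible. Non-invasive prenatal testing has two main advantages. First, NIPT carries no risk of miscarriage, whereas karyotyping performed clinically by invasive means such as amniocentesis and cordocentesis carries a miscarriage risk of about 1/200, and studies have also shown that cordocentesis performed too early may cause the fetal position to become oblique. Second, NIPT can be performed as early as 8 weeks of gestation, giving an earlier risk assessment and reducing the risk that induced labor poses to the pregnant woman.
These advantages have driven rapid progress in non-invasive prenatal research methods and an ever wider range of applications. Existing methods include NIPT fetal chromosomal aneuploidy detection, NIPT fetal monogenic disease detection, NIPT fetal copy number variation (CNV) detection, NIPT fetal whole-genome detection, NIPT fetal paternity testing, and so on.
At present, among all NIPT applications, the most widely used and relatively most mature is fetal chromosomal aneuploidy detection. Among the many algorithms for fetal chromosomal aneuploidy detection, the method based on high-throughput sequencing (MPS) invented by Chui in 2008 is considered suitable for clinical use and has demonstrated its robustness. For Down syndrome, the false positive rate (FPR) can reach 0.443% and the false negative rate (FNR) is as low as 0.004%; for Edwards syndrome, the FPR is 0.22% and the FNR is 0.025%.
Although the above method has achieved an extremely low error rate, a risk of misjudgment remains. The existing methods therefore still need to be improved to reduce the detection error rate as far as possible.
Summary of the Invention
The main purpose of the present application is to provide a kit, device and method for detecting chromosomal aneuploidy, so as to reduce the false positive rate of detection.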
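To achieve the above purpose, according to one aspect of the present application, a method for detecting chromosomal aneuploidy is provided, the method comprising the following steps: performing high-throughput sequencing of the peripheral blood cell-free DNA of a pregnant woman to be tested, to obtain sequencing data covering all chromosomes; calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome; performing a Z test on the number of single sequences of the pregnant woman in each window to obtain Z_CNV values, and obtaining the copy number abnormal segments of the pregnant woman from the magnitude of the Z_CNV values, a copy number abnormal segment of the pregnant woman being a segment of 300 kb or more in the sequencing data in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4; correcting the pre-correction coverage of each chromosome for the effect that the pregnant woman's copy number abnormal segments have on it, to obtain the corrected coverage of each chromosome; and performing a Z test on each chromosome using its corrected coverage to obtain a Z_aneu value, and judging whether the chromosome is aneuploid according to whether the absolute value of Z_aneu is greater than or equal to 3, the chromosome being aneuploid when the absolute value of Z_aneu is greater than or equal to 3. The effect of the pregnant woman's copy number abnormal segments on the pre-correction coverage of each chromosome is expressed by the parameter α. When the fetus has inherited the mother's copy number abnormal segment, α is calculated according to formula (1):
Figure PCTCN2015078422-appb-000001
When the fetus has not inherited the mother's copy number abnormal segment, α is calculated according to formula (2):
Figure PCTCN2015078422-appb-000002
In formulas (1) and (2), m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment; and cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment. In formula (2), f denotes the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, and f is assumed to be less than 50%. The pre-correction coverage of each chromosome is corrected using
Figure PCTCN2015078422-appb-000003
where
Figure PCTCN2015078422-appb-000004
represents the pre-correction coverage of each chromosome and x' represents the corrected coverage of each chromosome.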
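Further, coverage is calculated for all chromosomes in the sequencing data over windows of equal size, to obtain the pre-correction coverage of each chromosome.
Further, each window is 100 kb in size, and adjacent windows overlap by 50%.
Further, the step of performing a Z test on the number of single sequences of the pregnant woman in each window to obtain Z_CNV values, and obtaining the copy number abnormal segments of the pregnant woman from the Z_CNV values, comprises: counting the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; calculating on the number of single sequences in each window according to the GC content and alignment rate of each chromosome, to obtain the pre-correction coverage of the number of single sequences in each window; and normalizing that pre-correction coverage to obtain the Z_CNV value of the number of single sequences in each window, and judging from the magnitude of the Z_CNV values whether the pregnant woman has a copy number abnormal segment. When the sequencing data contain a segment of 300 kb or more in which the Z_CNV values of the number of single sequences in 80% or more of the windows are greater than or equal to 4 or less than or equal to -4, that segment is considered a copy number abnormal segment of the pregnant woman.
Further, in the step of performing a Z test on each chromosome using its corrected coverage to obtain the Z_aneu value, the Z_aneu value is calculated according to
Figure PCTCN2015078422-appb-000005
where
Figure PCTCN2015078422-appb-000006
is the coverage value obtained from a known negative sample population according to the LOESS algorithm, and s denotes the standard deviation, in the negative sample population, of
Figure PCTCN2015078422-appb-000007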
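To achieve the above purpose, according to one aspect of the present application, a device for detecting chromosomal aneuploidy is provided, the device comprising the following modules: a sequencing data detection module, for performing high-throughput sequencing of the peripheral blood cell-free DNA of a pregnant woman to be tested, to obtain sequencing data covering all chromosomes; a first coverage calculation module, for calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome; a Z_CNV value calculation module, for calculating the Z_CNV value of the number of single sequences of the pregnant woman in each window; a copy number abnormal segment query module, for searching the sequencing data for segments of 300 kb or more in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4; a copy number abnormal segment determining module, for determining the segments of 300 kb or more found in the sequencing data, in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4, as the copy number abnormal segments of the pregnant woman; an α first calculation module, for calculating the parameter α according to formula (1) in the case where the fetus has inherited the mother's copy number abnormal segment, where α denotes the effect of the pregnant woman's copy number abnormal segment on the pre-correction coverage of each chromosome, m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located, n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment, and cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment;
Figure PCTCN2015078422-appb-000008
an α second calculation module, for calculating the parameter α according to formula (2) in the case where the fetus has not inherited the mother's copy number abnormal chromosome, where m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located, n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment, cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment, and f denotes the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman, with f assumed to be less than 50%;
Figure PCTCN2015078422-appb-000009
a correction module, for correcting the pre-correction coverage of each chromosome using
Figure PCTCN2015078422-appb-000010
to obtain the corrected coverage of each chromosome, where
Figure PCTCN2015078422-appb-000011
represents the pre-correction coverage of each chromosome and x' represents the corrected coverage of each chromosome; a second coverage calculation module, for calculating the Z_aneu value of each chromosome from its corrected coverage; a Z_aneu value judgment module, for judging whether the Z_aneu value is greater than or equal to 3; and a chromosomal aneuploidy confirmation module, for determining that the chromosome is aneuploid when the Z_aneu value is greater than or equal to 3.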
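Further, the first coverage calculation module includes: a chromosome window segmentation sub-module, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation sub-module, for calculating coverage over the equal-sized windows, to obtain the pre-correction coverage of each chromosome.
Further, in the chromosome window segmentation sub-module, each window is 100 kb in size, and adjacent windows overlap by 50%.
Further, the single sequence calculation module includes: a single sequence statistics unit, for counting the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; a single sequence coverage calculation unit, for calculating on the number of single sequences in each window according to the GC content and alignment rate of each chromosome, to obtain the pre-correction coverage of the number of single sequences in each window; and a single sequence Z_CNV value calculation unit, for normalizing that pre-correction coverage to obtain the Z_CNV value of the number of single sequences in each window.
Further, in the second coverage calculation module, Z_aneu is calculated according to
Figure PCTCN2015078422-appb-000012
where
Figure PCTCN2015078422-appb-000013
is the pre-correction coverage value obtained from a known negative sample population according to the LOESS algorithm, and s denotes the standard deviation, in the negative sample population, of
Figure PCTCN2015078422-appb-000014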
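According to another aspect of the present application, a kit for detecting chromosomal aneuploidy is provided, the kit comprising: detection reagents and detection instruments, for performing high-throughput sequencing of the peripheral blood cell-free DNA of a pregnant woman to be tested, to obtain sequencing data covering all chromosomes; a first coverage calculation instrument, for calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome; a single sequence Z_CNV value calculation instrument, for performing a Z test on the number of single sequences of the pregnant woman in each window to obtain Z_CNV values; a copy number abnormal segment query instrument, for searching the sequencing data for segments of 300 kb or more in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4; a copy number abnormal segment confirmation instrument, for obtaining the copy number abnormal segments of the pregnant woman from the magnitude of the Z_CNV values; an α first calculation instrument, for calculating the parameter α according to formula (1) in the case where the fetus has inherited the mother's copy number abnormal segment, α being the effect of the pregnant woman's copy number abnormal segment on the pre-correction coverage of each chromosome,
Figure PCTCN2015078422-appb-000015
where m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located, n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment, and cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment; an α second calculation instrument, for calculating the parameter α according to formula (2) in the case where the fetus has not inherited the mother's copy number abnormal chromosome:
Figure PCTCN2015078422-appb-000016
where m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located, n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment, cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment, and f denotes the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman, with f assumed to be less than 50%; a correction instrument, for correcting the pre-correction coverage of each chromosome using
Figure PCTCN2015078422-appb-000017
to obtain the corrected coverage of each chromosome, where
Figure PCTCN2015078422-appb-000018
represents the pre-correction coverage of each chromosome and x' represents the corrected coverage of each chromosome; a second coverage calculation instrument, for calculating the Z_aneu value of each chromosome from its corrected coverage; a Z_aneu value judgment instrument, for judging whether the Z_aneu value is greater than or equal to 3; and a chromosomal aneuploidy confirmation instrument, for determining that the chromosome is aneuploid when the Z_aneu value is greater than or equal to 3.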
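Further, the first coverage calculation instrument includes: a chromosome window segmentation component, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation component, for calculating coverage over the equal-sized windows, to obtain the pre-correction coverage of each chromosome.
Further, in the chromosome window segmentation component, each window is 100 kb in size, and adjacent windows overlap by 50%.
Further, the Z_CNV value calculation instrument includes: a single sequence statistics component, for counting the number of single sequences in each window according to the sequencing depth of each sequence in the sequencing data; a single sequence coverage calculation component, for calculating on the number of single sequences in each window according to the GC content and alignment rate of each chromosome, to obtain the pre-correction coverage of the number of single sequences in each window; and a single sequence Z_CNV value calculation component, for normalizing that pre-correction coverage to obtain the Z value of the number of single sequences in each window.
Further, in the second coverage calculation instrument, Z_aneu is calculated according to
Figure PCTCN2015078422-appb-000019
where
Figure PCTCN2015078422-appb-000020
is the pre-correction coverage value obtained from a known negative sample population according to the LOESS algorithm, and s denotes the standard deviation, in the negative sample population, of
Figure PCTCN2015078422-appb-000021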
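By applying the technical solution of the present application, copy number abnormal segments of a specific size on the maternal chromosomes are screened, and when judging whether a chromosome is aneuploid, the effect of the maternal copy number abnormal segments on the calculated coverage of each chromosome is removed to obtain the corrected coverage of each chromosome. The chromosomal aneuploidy results calculated and judged from this corrected coverage are more accurate.
Brief Description of the Drawings
The accompanying drawings, which form part of the present application, are provided for further understanding of the application. The illustrative embodiments of the application and their descriptions serve to explain the application and do not unduly limit it. In the drawings:
Figure 1 is a schematic flow chart of a method for detecting chromosomal aneuploidy according to a typical embodiment of the present application;
Figure 2 is a schematic structural diagram of a device for detecting chromosomal aneuploidy according to a typical embodiment of the present application;
Figures 3A, 3B and 3C are schematic diagrams of the correction results for aneuploidy detection on chromosome 13, chromosome 18 and chromosome 21, respectively, according to Example 1 of the present application;
Figure 4 is a schematic diagram of the correction results for aneuploidy on chromosome 21 of samples EK01875 and BD01462 according to Example 2 of the present application;
Figure 5 is a schematic diagram of the correction result for aneuploidy detection on chromosome 21 of sample EK01875 according to Example 3 of the present application; and
Figure 6 shows the correction result for aneuploidy detection on chromosome 21 of sample BD01462 according to Example 4 of the present application.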
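Detailed Description
It should be noted that, provided they do not conflict, the embodiments of the present application and the features in them may be combined with one another. The application is described in detail below with reference to the embodiments.
In the present application, Z_CNV or Z_aneu refers to the calculated value of the Z test in statistics, a method for testing the difference between means of large samples (i.e. sample size greater than 30). It uses the theory of the standard normal distribution to infer the probability that a difference occurs, and thereby to determine whether the difference between two means is significant.
The alignment rate is the proportion of sequencing reads within a window that align to the genome reference sequence. Because a sequencing read may align to several positions on the reference sequence at once, and so may not be a unique sequence, the alignment rate within a window is greater than the alignment rate of single sequences.
It should be noted that the applicant, through extensive analysis of existing methods, found at least the following three possible causes of NIPT misjudgment:
First, Lo found in 1998 that cff-DNA originates from the placenta, which means that when placental mosaicism (CPM) occurs, it is difficult to estimate the fetal status accurately from NIPT results, and the results are prone to error. Second, if the pregnant woman herself carries CNVs, the method of counting MPS-based coverage and converting it to a Z value becomes inaccurate: when the pregnant woman has a duplicated segment, the relative number of single sequences aligned to the chromosome increases, and the higher coverage raises the Z value, increasing the risk of a false positive. Conversely, if the pregnant woman has a deleted segment, the Z value falls, increasing the risk of a false negative. Earlier studies have also shown that placental mosaicism (CPM) and maternal copy number variation (CNV) are important causes of false positive calls. Finally, data fluctuations may occur when calculating chromosome coverage or when correcting coverage by GC content, producing errors.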
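For this reason, on the basis of a comprehensive analysis of the above causes of misjudgment, the present application proposes a method for detecting chromosomal aneuploidy. As shown in Figure 1, the method comprises the following steps: performing high-throughput sequencing of the peripheral blood cell-free DNA of a pregnant woman to be tested, to obtain sequencing data covering all chromosomes; calculating coverage for all chromosomes in the sequencing data over segmented windows, to obtain the pre-correction coverage of each chromosome; calculating the Z_CNV value of the number of single sequences of the pregnant woman in each window, and obtaining the copy number abnormal segments of the pregnant woman from the magnitude of the Z_CNV values, a copy number abnormal segment of the pregnant woman being a segment of 300 kb or more in the sequencing data in which 80% or more of the windows have Z_CNV values greater than or equal to 4 or less than or equal to -4; correcting the pre-correction coverage of each chromosome for the effect that the pregnant woman's copy number abnormal segments have on it, to obtain the corrected coverage of each chromosome; and calculating the Z_aneu value of each chromosome from its corrected coverage, and judging whether the chromosome is aneuploid according to whether the absolute value of Z_aneu is greater than or equal to 3, the chromosome being aneuploid when the absolute value of Z_aneu is greater than or equal to 3. The effect of the pregnant woman's copy number abnormal segments on the pre-correction coverage of each chromosome is expressed by the parameter α. When the fetus has inherited the mother's copy number abnormal segment, α is calculated according to formula (1):
Figure PCTCN2015078422-appb-000022
When the fetus has not inherited the mother's copy number abnormal segment, α is calculated according to formula (2):
Figure PCTCN2015078422-appb-000023
In formulas (1) and (2), m denotes the effective length, in Mb, of the chromosome on which the copy number abnormal segment is located; n denotes the length, in Mb, of the pregnant woman's copy number abnormal segment; and cn denotes the number of occurrences of the pregnant woman's copy number abnormal segment. In formula (2), f denotes the concentration of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested, and f is assumed to be less than 50%. The coverage of each chromosome is corrected using
Figure PCTCN2015078422-appb-000024
where
Figure PCTCN2015078422-appb-000025
represents the pre-correction coverage of each chromosome and x' represents the corrected coverage of each chromosome.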
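Unlike the prior art, which simply removes maternal copy number abnormal segments from the sequencing data and disregards them, the above method of the present application screens copy number abnormal segments of a specific size on the maternal chromosomes and, when judging whether a chromosome is aneuploid, removes the effect of those segments on the calculated coverage of each chromosome to obtain the corrected coverage of each chromosome. The chromosomal aneuploidy results detected by the method of the present application are therefore more accurate.
In the above method of the present application, the concentration f of fetal cell-free DNA in the peripheral blood cell-free DNA of the pregnant woman to be tested is calculated by a method conventional in the art. For example, when the fetus is male and the copy number abnormal segment is on the X chromosome, the fetal cell-free DNA concentration is calculated according to
Figure PCTCN2015078422-appb-000026
where
Figure PCTCN2015078422-appb-000027
denotes the ratio of the average number of single sequences per window on the X chromosome to the average number of single sequences per window over all windows; and when the copy number abnormal segment is on chromosome 21, 18 or 13, the fetal cell-free DNA concentration is calculated according to
Figure PCTCN2015078422-appb-000028
where
Figure PCTCN2015078422-appb-000029
denotes the ratio of the average number of single sequences per window on chromosome 21, 18 or 13 to the average number of single sequences per window over all windows. When the fetus is female, methylation detection of specific genes must be performed on the pregnant woman's peripheral blood cell-free DNA. The principle is that certain specific genes are methylated differently in maternal DNA and in fetal DNA. For example, the RASSF1A gene (on chromosome 3) of fetal and placental origin is highly methylated, whereas the RASSF1A gene of maternal origin is unmethylated. When cffDNA is treated with methylation-sensitive enzymes such as HhaI, BstUI (30 U) and HpaII, the unmethylated gene is digested while the methylated gene is not, so the fetal cffDNA content can be measured by Q-PCR. For the detailed procedure, see PLOS ONE 9:71-7 (2014), Quantification of Cell-Free DNA in Normal and Complicated Pregnancies: Overcoming Biological and Technical Issues.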
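In the above method, the pre-correction coverage of each chromosome is calculated by dividing the chromosomes into windows, which gives a relatively robust chromosome coverage. Accordingly, in a preferred embodiment of the present application, coverage is calculated for all chromosomes in the sequencing data by dividing them into windows of equal size, to obtain the pre-correction coverage of each chromosome.
In a more preferred embodiment, when coverage is calculated in windows, each window is 100 Kb in size and adjacent windows overlap by 50%. Fixing the window size at 100 Kb and the overlap at 50% gives a relatively robust chromosome coverage, and the overlap between windows also improves the precision with which copy-number-abnormal segments are located, which in turn improves the efficiency of detecting maternal copy-number-abnormal segments.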
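The windowing described above can be sketched as follows. This is an illustration only: the function name `sliding_windows`, the 0-based half-open coordinates and the choice to drop a trailing partial window are assumptions, not details given in the text.

```python
def sliding_windows(chrom_length, size=100_000, overlap=0.5):
    """Return (start, end) windows of equal size with the given fractional overlap.

    Coordinates are 0-based and half-open. A trailing window that would run
    past the chromosome end is dropped (an assumption of this sketch).
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = int(size * (1 - overlap))  # 50 Kb step for 100 Kb windows at 50% overlap
    if step < 1:
        raise ValueError("window size too small for this overlap")
    windows = []
    start = 0
    while start + size <= chrom_length:
        windows.append((start, start + size))
        start += step
    return windows


if __name__ == "__main__":
    for w in sliding_windows(300_000):
        print(w)
```

With a 50% overlap every interior position is covered by two windows, so a segment boundary can be located to within one 50 Kb step rather than one full 100 Kb window.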
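In the above method, the copy-number-abnormal segments can be obtained by starting from conventional procedures for calling such segments and adjusting the conditions that a segment must satisfy according to the quality of the sequencing data or the required detection precision. In a preferred embodiment, the step of calculating the ZCNV value of the number of unique reads in each window for the pregnant woman and obtaining her copy-number-abnormal segments from the ZCNV values comprises: counting the number of unique reads in each window according to the sequencing depth of each read in the sequencing data; processing the number of unique reads in each window according to the GC content and mapping rate of each chromosome, to obtain the pre-correction coverage of the unique-read count of each window; and standardizing that coverage to obtain the ZCNV value of the unique-read count of each window, and judging from the magnitude of the ZCNV values whether the pregnant woman carries a copy-number-abnormal segment. When the sequencing data contain a segment of 300 Kb or more in which at least 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4, that segment is regarded as a copy-number-abnormal segment of the pregnant woman.
In the step above of standardizing the unique-read count of each window to obtain its ZCNV value, standardization means computing (x-u)/sd for the corrected unique-read count of each window, where x is the corrected value, u is the mean of x, and sd is the standard deviation of x (equivalently, of x-u). In the step of detecting the woman's copy-number-abnormal segments, the condition "a region of at least 300 kb in which 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4" allows reliable maternal copy-number-abnormal segments to be detected. These segments are then used to correct the Z value of the chromosome on which they lie, which avoids false-negative calls caused by errors in detecting the maternal copy-number-abnormal segments.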
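The standardization and the "at least 300 Kb, at least 80% of windows" rule can be sketched as below. This is a simplified illustration under stated assumptions: windows are 100 Kb with a 50 Kb step, gains and losses are called separately, a segment is grown greedily from each flagged window, and the names `z_scores` and `call_cnv_segments` are hypothetical. The text does not specify the segmentation procedure itself.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize per-window values as (x - u) / sd."""
    u = mean(values)
    sd = stdev(values)
    return [(v - u) / sd for v in values]


def call_cnv_segments(z, window_size=100_000, step=50_000,
                      min_len=300_000, min_frac=0.8, z_cut=4.0):
    """Return (start_bp, end_bp, kind) for runs of windows meeting the rule.

    A run starts and ends on a flagged window, must span at least min_len,
    and at least min_frac of its windows must be flagged (Z >= z_cut for a
    gain, Z <= -z_cut for a loss).
    """
    segments = []
    tests = (("gain", lambda v: v >= z_cut), ("loss", lambda v: v <= -z_cut))
    for kind, hit in tests:
        flags = [hit(v) for v in z]
        i = 0
        while i < len(flags):
            if not flags[i]:
                i += 1
                continue
            best_end, hits = i, 0
            # Extend the run as far as the flagged fraction stays >= min_frac.
            for j in range(i, len(flags)):
                hits += flags[j]
                if flags[j] and hits / (j - i + 1) >= min_frac:
                    best_end = j
            length = (best_end - i) * step + window_size
            if length >= min_len:
                segments.append((i * step, best_end * step + window_size, kind))
            i = best_end + 1
    return sorted(segments)


if __name__ == "__main__":
    demo = [0.0] * 10 + [5.0] * 6 + [0.0] * 10
    print(call_cnv_segments(demo))
```

Calling gains and losses separately keeps a duplication next to a deletion from being merged into one segment, which is a design choice of this sketch rather than something the text prescribes.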
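In the above method, in the step of performing a Z-test on each chromosome using its post-correction coverage to obtain the Zaneu value, Zaneu is calculated according to
Figure PCTCN2015078422-appb-000030
where
Figure PCTCN2015078422-appb-000031
is the coverage value obtained from a population of known negative samples using the LOESS algorithm, and s denotes the standard deviation of
Figure PCTCN2015078422-appb-000032
across the negative sample population. The corrected Zaneu value calculated with the above formula reflects chromosome aneuploidy more accurately, making the detection result more accurate.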
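The Zaneu formula is present only as an image here. A minimal sketch, assuming it is the usual Z statistic (corrected coverage minus the expected coverage from the negative reference population, divided by the standard deviation s), is shown below. The LOESS fit that produces the expected value is not implemented; its output is simply passed in. The names `z_aneu` and `is_aneuploid` are hypothetical.

```python
def z_aneu(x_corrected, expected, s):
    """Assumed Z statistic: (x' - expected) / s.

    x_corrected: post-correction chromosome coverage (x' in the text)
    expected:    coverage expected from known negative samples (from LOESS)
    s:           standard deviation in the negative sample population
    """
    if s <= 0:
        raise ValueError("s must be positive")
    return (x_corrected - expected) / s


def is_aneuploid(z, cutoff=3.0):
    """Apply the decision rule from the text: |Zaneu| >= 3 means aneuploid."""
    return abs(z) >= cutoff


if __name__ == "__main__":
    z = z_aneu(1.02, 1.0, 0.005)
    print(z, is_aneuploid(z))
```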
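In another typical embodiment of the present application, a device for detecting chromosome aneuploidy is provided. As shown in Figure 2, the device comprises the following modules: a sequencing data detection module, for performing high-throughput sequencing on cell-free DNA from the peripheral blood of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes; a first coverage calculation module, for calculating coverage for all chromosomes in the sequencing data in the form of windows, to obtain the pre-correction coverage of each chromosome; a ZCNV value calculation module, for calculating the ZCNV value of the number of unique reads in each window for the pregnant woman; a copy-number-abnormal segment query module, for searching the sequencing data for segments of 300 Kb or more in which at least 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4; a copy-number-abnormal segment determination module, for designating the segments so found as the copy-number-abnormal segments of the pregnant woman; a first α calculation module, for calculating the parameter α according to formula (1) when the fetus has inherited the maternal copy-number-abnormal segment, where α denotes the effect of the woman's copy-number-abnormal segments on the pre-correction coverage of each chromosome, m denotes the effective length, in Mb, of the chromosome on which the segment lies, n denotes the length, in Mb, of the woman's copy-number-abnormal segment, and cn denotes the number of times the segment occurs in the woman;
Figure PCTCN2015078422-appb-000033
a second α calculation module, for calculating the parameter α according to formula (2) when the fetus has not inherited the maternal copy-number-abnormal segment, where m denotes the effective length, in Mb, of the chromosome on which the segment lies, n denotes the length, in Mb, of the woman's copy-number-abnormal segment, cn denotes the number of times the segment occurs in the woman, and f denotes the concentration of fetal cell-free DNA in the woman's peripheral-blood cell-free DNA, with f assumed to be less than 50%;
Figure PCTCN2015078422-appb-000034
a correction module, for using
Figure PCTCN2015078422-appb-000035
to correct the pre-correction coverage of each chromosome and obtain the post-correction coverage of each chromosome, where
Figure PCTCN2015078422-appb-000036
denotes the pre-correction coverage of each chromosome and x' denotes the post-correction coverage of each chromosome; a second coverage calculation module, for calculating the Zaneu value of each chromosome from its post-correction coverage; a Zaneu value judgment module, for judging whether the Zaneu value is greater than or equal to 3; and a chromosome aneuploidy confirmation module, for determining that the chromosome is aneuploid when the Zaneu value is greater than or equal to 3.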
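By adding the copy-number-abnormal segment query module, the copy-number-abnormal segment determination module and the correction module, the above device screens the maternal chromosomes for regions of at least 300 kb in which 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4. The device can thus detect reliable maternal copy-number-abnormal segments and use them to correct the Z-test value of the chromosome on which they lie, which avoids false-negative calls caused by errors in detecting the maternal segments. The correction module corrects the effect of these segments on the calculated coverage of each chromosome, so that the result produced by the chromosome aneuploidy confirmation module is more accurate. In the correction module, the fetal concentration in the formula for α is calculated by methods conventional in the art, as described above, and is not repeated here.
It should be noted that the above modules, as part of the device, can run on a computer terminal, and the processor provided by that terminal can execute the technical solutions implemented by the sequencing data detection module, the first coverage calculation module, the unique-read calculation module, the copy-number-abnormal segment query module, the copy-number-abnormal segment determination module, the first α calculation module, the second α calculation module, the correction module, the second coverage calculation module and the chromosome aneuploidy confirmation module. The computer terminal is a hardware device, and the processor is likewise a hardware device for executing programs. The functional sub-modules provided by the application can run on a mobile terminal, a computer terminal or a similar computing device, and can also be stored as part of a storage medium.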
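In the above device, the first coverage calculation module can be obtained from a calculation module conventional in the art, suitably adjusted for the sequencing data at hand. In a preferred embodiment, the first coverage calculation module comprises: a chromosome window division sub-module, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation sub-module, for calculating coverage in the equal-size windows, to obtain the pre-correction coverage of each chromosome. Calculating in equal-size windows with these two sub-modules helps to obtain a relatively robust coverage.
In a more preferred embodiment, in the chromosome window division sub-module, each window is 100 Kb in size and adjacent windows overlap by 50%. Calculating in 100 Kb windows helps to obtain a relatively robust coverage, and the overlap between windows improves the precision with which copy-number-abnormal segments are located, which in turn improves the efficiency of detecting maternal copy-number-abnormal segments.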
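In the above device, the unique-read calculation module can be obtained from a conventional calculation module. In a preferred embodiment, the unique-read calculation module further comprises: a unique-read counting unit, for counting the number of unique reads in each window according to the sequencing depth of each read in the sequencing data; a unique-read coverage calculation unit, for correcting the number of unique reads in each window according to the GC content and mapping rate of each chromosome, to obtain the corrected unique-read count of each window; and a unique-read ZCNV value calculation unit, for standardizing the corrected unique-read count of each window to obtain the ZCNV value of the unique-read count of each window.
The unique-read calculation module first runs the unique-read counting unit, which counts the number of unique reads in each window according to the sequencing depth of each read in the sequencing data. It then runs the unique-read coverage calculation unit, which processes the number of unique reads in each window according to the GC content and mapping rate of each chromosome to obtain the pre-correction coverage of the unique-read count of each window. Finally it runs the unique-read ZCNV value calculation unit, which standardizes that coverage to obtain the ZCNV value of the unique-read count of each window. These units are suitable adaptations of calculation units conventional in the art. Their output is the basis and precondition for the query performed by the copy-number-abnormal segment query module and the determination made by the copy-number-abnormal segment determination module, and it provides the basis for accurately establishing the presence of copy-number-abnormal segments of maternal DNA in the sample.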
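In the above device, in the second coverage calculation module, Zaneu is calculated according to
Figure PCTCN2015078422-appb-000037
where
Figure PCTCN2015078422-appb-000038
is the pre-correction coverage value obtained from a population of known negative samples using the LOESS algorithm, and s denotes the standard deviation of
Figure PCTCN2015078422-appb-000039
across the negative sample population. The corrected Zaneu value calculated with the above formula reflects chromosome aneuploidy more accurately, making the detection result more accurate.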
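In yet another typical embodiment of the present application, a kit for detecting chromosome aneuploidy is also provided. The kit comprises: a sequencing data detection instrument, for performing high-throughput sequencing on cell-free DNA from the peripheral blood of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes; a first coverage calculation instrument, for calculating coverage for all chromosomes in the sequencing data in the form of windows, to obtain the pre-correction coverage of each chromosome; a ZCNV value calculation instrument, for calculating the ZCNV value of the number of unique reads in each window for the pregnant woman; a copy-number-abnormal segment query instrument, for searching the sequencing data for segments of 300 Kb or more in which at least 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4; a copy-number-abnormal segment determination instrument, for designating the segments so found as the copy-number-abnormal segments of the pregnant woman; a first α calculation instrument, for calculating the parameter α according to formula (1) when the fetus has inherited the maternal copy-number-abnormal segment, where α denotes the effect of the woman's copy-number-abnormal segments on the pre-correction coverage of each chromosome, m denotes the effective length, in Mb, of the chromosome on which the segment lies, n denotes the length, in Mb, of the woman's copy-number-abnormal segment, and cn denotes the number of times the segment occurs in the woman;
Figure PCTCN2015078422-appb-000040
a second α calculation instrument, for calculating the parameter α according to formula (2) when the fetus has not inherited the maternal copy-number-abnormal segment, where m denotes the effective length, in Mb, of the chromosome on which the segment lies, n denotes the length, in Mb, of the woman's copy-number-abnormal segment, cn denotes the number of times the segment occurs in the woman, and f denotes the concentration of fetal cell-free DNA in the woman's peripheral-blood cell-free DNA, with f assumed to be less than 50%;
Figure PCTCN2015078422-appb-000041
a correction instrument, for using
Figure PCTCN2015078422-appb-000042
to correct the pre-correction coverage of each chromosome and obtain the post-correction coverage of each chromosome, where
Figure PCTCN2015078422-appb-000043
denotes the pre-correction coverage of each chromosome and x' denotes the post-correction coverage of each chromosome; a second coverage calculation instrument, for calculating the Zaneu value of each chromosome from its post-correction coverage; a Zaneu value judgment instrument, for judging whether the Zaneu value is greater than or equal to 3; and a first chromosome aneuploidy confirmation instrument, for determining that the chromosome is aneuploid when the Zaneu value is greater than or equal to 3.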
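By adding the copy-number-abnormal segment query instrument, the copy-number-abnormal segment determination instrument and the correction instrument, the above kit screens the maternal chromosomes for regions of at least 300 kb in which 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4. The kit can thus detect reliable maternal copy-number-abnormal segments and use them to correct the Z value of the chromosome on which they lie, which avoids false-negative calls caused by errors in detecting the maternal segments. The correction instrument corrects the effect of these segments on the calculated coverage of each chromosome, so that the result produced by the chromosome aneuploidy confirmation instrument is more accurate. In the correction instrument, the fetal concentration in the formula for α is calculated by methods conventional in the art, as described above, and is not repeated here.
It should be noted that the above components and elements, as part of the instruments, can run on a computer terminal, and the processor provided by that terminal can execute the technical solutions implemented by the sequencing data detection instrument, the first coverage calculation instrument, the unique-read calculation instrument, the copy-number-abnormal segment query instrument, the copy-number-abnormal segment determination instrument, the first α calculation instrument, the second α calculation instrument, the correction instrument, the second coverage calculation instrument and the chromosome aneuploidy confirmation instrument. The computer terminal is a hardware device, and the processor is likewise a hardware device for executing programs. The functional sub-instruments provided by the application can run on a mobile terminal, a computer terminal or a similar computing device, and can also be stored as part of a storage medium.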
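In the above kit, the first coverage calculation instrument can be obtained from a calculation instrument conventional in the art, suitably adjusted for the sequencing data at hand. In a preferred embodiment, the first coverage calculation instrument comprises: a chromosome window division sub-instrument, for dividing all chromosomes in the sequencing data into windows of equal size; and a first coverage calculation sub-instrument, for calculating coverage in the equal-size windows, to obtain the pre-correction coverage of each chromosome. Calculating in equal-size windows with these two sub-instruments helps to obtain a relatively robust coverage.
In a more preferred embodiment, in the chromosome window division sub-instrument, each window is 100 Kb in size and adjacent windows overlap by 50%. Calculating in 100 Kb windows helps to obtain a relatively robust coverage, and the overlap between windows improves the precision with which copy-number-abnormal segments are located, which in turn improves the efficiency of detecting maternal copy-number-abnormal segments.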
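In the above kit, the unique-read calculation instrument can be obtained from a conventional calculation instrument. In a preferred embodiment, the unique-read calculation instrument further comprises: a unique-read counting unit, for counting the number of unique reads in each window according to the sequencing depth of each read in the sequencing data; a unique-read coverage calculation unit, for processing the number of unique reads in each window according to the GC content and mapping rate of each chromosome, to obtain the pre-correction coverage of the unique-read count of each window; and a unique-read ZCNV value calculation unit, for standardizing that coverage to obtain the ZCNV value of the unique-read count of each window.
The unique-read calculation instrument first runs the unique-read counting unit, which counts the number of unique reads in each window according to the sequencing depth of each read in the sequencing data. It then runs the unique-read coverage calculation unit, which corrects the number of unique reads in each window according to the GC content and mapping rate of each chromosome to obtain the corrected unique-read count of each window. Finally it runs the unique-read ZCNV value calculation unit, which standardizes the corrected count to obtain the ZCNV value of the unique-read count of each window. These units are suitable adaptations of calculation and correction units conventional in the art. Their output is the basis and precondition for the query performed by the copy-number-abnormal segment query instrument and the determination made by the copy-number-abnormal segment determination instrument, and it provides the basis for accurately establishing the presence of copy-number-abnormal segments of maternal DNA in the sample.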
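In the above kit, in the second coverage calculation instrument, Zaneu is calculated according to
Figure PCTCN2015078422-appb-000044
where
Figure PCTCN2015078422-appb-000045
is the pre-correction coverage value obtained from a population of known negative samples using the LOESS algorithm, and s denotes the standard deviation of
Figure PCTCN2015078422-appb-000046
across the negative sample population. The corrected Zaneu value calculated with the above formula reflects chromosome aneuploidy more accurately, making the detection result more accurate.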
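The beneficial effects of the present application are further illustrated below with reference to specific examples.
Example 1
To test how well the maternal copy-number-abnormal segment correction of the present application corrects chromosome aneuploidy detection, this example generated a set of simulated data for pregnant women based on the Poisson distribution. In the simulated data, defined copy-number-abnormal segments were added separately to chromosomes 13, 18 and 21; segment sizes ranged from 0.5 Mb to 5 Mb in steps of 0.25 Mb. Normal human DNA was then mixed into the simulated data containing the segments at three concentrations (5%, 10% and 15%). The whole procedure simulates the effect of copy-number-abnormal segments of different sizes on the coverage of chromosomes 13, 18 and 21 at different fetal concentrations, and on that basis tests the effect of the correction on aneuploidy detection. All calculations assume that the fetus has not inherited the maternal copy-number-abnormal segment.
The results are shown in Figures 3A, 3B and 3C. In these three figures, the horizontal axis is the size of the maternal copy-number-abnormal segment in the sample and the vertical axis is the chromosome Z value of the sample. Solid lines show the chromosome Z value before correction; dashed lines show the Z value calculated from the chromosome coverage after correction for the maternal segment, that is, the Zaneu value. Squares, circles and triangles denote fetal concentrations of 5%, 10% and 15%, respectively.
Figures 3A, 3B and 3C clearly show that when the Z value is calculated directly from chromosome coverage, the Z value of the sample increases as the maternal copy-number-abnormal segment grows. Taking chromosome 21 as an example, at a fetal concentration of 10%, if the pregnant woman carries a 3 Mb duplication on chromosome 21, the Z value calculated from the uncorrected coverage exceeds 3 even if the fetus does not have trisomy 21, and the sample is called positive. By contrast, the dashed lines, that is, the chromosome Z values (Zaneu) calculated from coverage corrected by the method of the application, all remain stable near the zero baseline. This shows that, in all the simulated cases, the aneuploidy detection method of the application, which corrects for maternal copy-number-abnormal segments, is effective.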
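The trend described for the uncorrected Z value can be illustrated with expected values rather than a full Poisson simulation. The sketch below is not the simulation used in this example. It assumes that a non-inherited maternal duplication inflates coverage by 1 + (1 - f)·n·(cn - 2)/(2m), that chromosome 21 has an effective length of 35 Mb, and that the negative-population coefficient of variation is 0.5%. All three are assumptions, as is the function name `expected_z_shift`.

```python
def expected_z_shift(n_mb, fetal_fraction, m_mb=35.0, cn=3, cv=0.005):
    """Expected uncorrected Z for a euploid fetus when the mother carries a CNV.

    n_mb: maternal CNV length in Mb; m_mb: assumed effective chromosome length;
    cv: assumed standard deviation of normalized coverage in negative samples.
    """
    alpha = 1 + (1 - fetal_fraction) * n_mb * (cn - 2) / (2 * m_mb)
    # Normalized expected coverage is 1, so the Z shift is (alpha - 1) / cv.
    return (alpha - 1) / cv


if __name__ == "__main__":
    sizes = [0.5 + 0.25 * i for i in range(19)]  # 0.5 Mb to 5 Mb, 0.25 Mb steps
    for f in (0.05, 0.10, 0.15):
        row = ", ".join(f"{expected_z_shift(n, f):.1f}" for n in sizes[::4])
        print(f"fetal fraction {f:.0%}: Z at {sizes[::4]} Mb -> {row}")
```

Under these assumptions the expected shift grows linearly with segment size and is slightly smaller at higher fetal fractions, because a larger share of the cell-free DNA then comes from the fetus, which does not carry the segment.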
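To further verify how well the detection method, detection device and kit provided according to the concept of the present application detect chromosome aneuploidy in real case samples, they were each used to test the case samples described in Examples 2 to 4.
Example 2
High-throughput sequencing was performed on peripheral-blood cell-free DNA samples from 6615 pregnant women, to obtain sequencing data covering all chromosomes for each sample;
The number of unique reads in each window was counted according to the sequencing depth of each read in the sequencing data of each sample; the number of unique reads in each window was corrected according to the GC content and mapping rate of each chromosome, to obtain the corrected unique-read count of each window; the corrected count was standardized to obtain the ZCNV value of the unique-read count of each window, and whether the pregnant woman carried a copy-number-abnormal segment was judged from the magnitude of the ZCNV values; when the sequencing data contained a segment of 300 Kb or more in which at least 80% of the windows had a ZCNV value greater than or equal to 4 or less than or equal to -4, that segment was regarded as a copy-number-abnormal segment of the pregnant woman;
The effect α of the woman's copy-number-abnormal segments on the pre-correction coverage of each chromosome was determined, and the formula
Figure PCTCN2015078422-appb-000047
was used to correct the pre-correction coverage of each chromosome, giving the post-correction coverage of each chromosome, where
Figure PCTCN2015078422-appb-000048
denotes the pre-correction coverage of each chromosome and x' denotes the post-correction coverage of each chromosome; the parameter α, which expresses the effect of the woman's copy-number-abnormal segments on the pre-correction coverage of each chromosome, was calculated using formula (1) or (2);
The Zaneu value of each chromosome was calculated from its post-correction coverage according to the formula
Figure PCTCN2015078422-appb-000049
and whether the chromosome was aneuploid was determined according to whether the absolute value of Zaneu was greater than or equal to 3: when the absolute value of Zaneu is greater than or equal to 3, the chromosome is aneuploid; when the absolute value of Zaneu is less than 3, the chromosome is not aneuploid;
Using the above detection method, samples EK01875 and BD01462 were found to carry maternal copy-number-abnormal segments on chromosome 21, and both samples were corrected from their previous positive results to negative results; see Figure 4.
The left panel of Figure 4 (see the color figure) is a plot of the chromosome 21 Z values of all samples obtained with the existing detection method. The Z values of the negative samples are almost all below 3 and approximately normally distributed. The circle marks sample EK01875 (Z = 4.66) and the triangle marks sample BD01462 (Z = 3.87).
The right panel of Figure 4 is the corresponding plot of chromosome 21 Z values obtained with the detection method of the present application; here Zaneu = 2.36 for sample EK01875 and Zaneu = 1.83 for sample BD01462.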
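Example 3
Sample EK01875 above (maternal age 29, gestational age about 18 weeks) was tested with the device for detecting chromosome aneuploidy of the present application. The device comprises:
Sequencing data detection module: for performing high-throughput sequencing on cell-free DNA from the peripheral blood of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes;
First coverage calculation module: for calculating coverage for all chromosomes in the sequencing data in the form of windows, to obtain the pre-correction coverage of each chromosome;
ZCNV value calculation module: for calculating the ZCNV value of the number of unique reads in each window for the pregnant woman;
Copy-number-abnormal segment query module: for searching the sequencing data for segments of 300 Kb or more in which at least 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4;
Copy-number-abnormal segment determination module: for designating the segments so found as the copy-number-abnormal segments of the pregnant woman;
First α calculation module: for calculating the parameter α according to formula (1) when the fetus has inherited the maternal copy-number-abnormal segment;
Second α calculation module: for calculating the parameter α according to formula (2) when the fetus has not inherited the maternal copy-number-abnormal segment;
Correction module: for using
Figure PCTCN2015078422-appb-000050
to correct the pre-correction coverage of each chromosome, giving the post-correction coverage of each chromosome;
Second coverage calculation module: for calculating the Zaneu value of each chromosome from its post-correction coverage;
Zaneu value judgment module: for judging whether the Zaneu value is greater than or equal to 3;
Chromosome aneuploidy confirmation module: for determining that the chromosome is aneuploid when the Zaneu value is greater than or equal to 3.
After analysis with the above device, the pregnant woman was found to carry 850 kb of duplication on chromosome 21. As shown in Figure 5, the duplicated regions are a 500 kb region in 21q22.11 (32361194 bp to 32861193 bp) and a 350 kb region in 21q22.12 (37261194 bp to 37611193 bp), each with a copy number of 3.
The maternal copy-number-abnormal segments were then verified with a prior-art Affymetrix CytoScan 750k SNP array, which likewise detected the regions 21q22.11 (32399114 bp to 32811202 bp) and 21q22.12 (37292432 bp to 37602701 bp), with a copy number of 3.
The positions detected by the array thus match those detected by the device of the present application almost exactly. With the device, the parameter α expressing the effect of the maternal copy-number-abnormal segments on the coverage of this chromosome was 1.012, which corrected the Z value characterizing whether the chromosome is aneuploid from 4.66 to 2.36, so the call was changed to negative.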
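Example 4
Sample BD01462 above (maternal age 24, gestational age about 24 weeks) was tested with the kit for detecting chromosome aneuploidy of the present application. The kit comprises:
Sequencing data detection reagents and instrument: for performing high-throughput sequencing on cell-free DNA from the peripheral blood of the pregnant woman to be tested, to obtain sequencing data covering all chromosomes;
First coverage calculation instrument: for calculating coverage for all chromosomes in the sequencing data in the form of windows, to obtain the pre-correction coverage of each chromosome;
Unique-read calculation instrument: for calculating the ZCNV value of the number of unique reads in each window for the pregnant woman;
Copy-number-abnormal segment query instrument: for searching the sequencing data for segments of 300 Kb or more in which at least 80% of the windows have a ZCNV value greater than or equal to 4 or less than or equal to -4;
Copy-number-abnormal segment determination instrument: for designating the segments so found as the copy-number-abnormal segments of the pregnant woman;
First α calculation instrument: for calculating the parameter α according to formula (1) when the fetus has inherited the maternal copy-number-abnormal segment;
Second α calculation instrument: for calculating the parameter α according to formula (2) when the fetus has not inherited the maternal copy-number-abnormal segment;
Correction instrument: for using
Figure PCTCN2015078422-appb-000051
to correct the pre-correction coverage of each chromosome, giving the post-correction coverage of each chromosome;
Second coverage calculation instrument: for calculating the Zaneu value of each chromosome from its post-correction coverage;
Zaneu value judgment instrument: for judging whether the Zaneu value is greater than or equal to 3;
Chromosome aneuploidy confirmation instrument: for determining that the chromosome is aneuploid when the Zaneu value is greater than or equal to 3.
After detection and analysis with the above kit, as shown in Figure 6, the pregnant woman was found to carry a total of 700 kb of duplication on chromosome 21, in region 21q21.3 (28911194 bp to 29611930 bp), with a copy number of 3.
Likewise, verification with the Affymetrix CytoScan 750k SNP array showed a duplication at 21q21.3 (28973792 bp to 29542400 bp).
Although the array reported a copy number of 4, which differs slightly from the result of the present application, the position matches that detected with the kit almost exactly, which again indicates the precision of the detection method of the application. With the kit, the parameter α expressing the effect of the maternal copy-number-abnormal segment on the coverage of this chromosome was 1.009, which corrected the Z value characterizing whether the chromosome is aneuploid from 3.87 to 1.83, so the call was changed to negative.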
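As a rough consistency check on the figures reported for samples EK01875 and BD01462, the sketch below back-calculates the population standard deviation s that the reported values would imply. It assumes that Z = (x - e)/s before correction and Zaneu = (x/α - e)/s after correction, with the expected coverage e normalized to 1. These forms are assumptions, because the formulas themselves appear only as images, and the function name `implied_sd` is hypothetical.

```python
def implied_sd(z_before, z_after, alpha, expected=1.0):
    """Solve for s given Z before and after an assumed x' = x / alpha correction.

    From x = e + z_before * s and x / alpha = e + z_after * s it follows that
    s = e * (alpha - 1) / (z_before - alpha * z_after).
    """
    denom = z_before - alpha * z_after
    if denom <= 0:
        raise ValueError("inputs are inconsistent with the assumed model")
    return expected * (alpha - 1) / denom


if __name__ == "__main__":
    # Reported values: (Z before, Zaneu after, alpha)
    reported = {"EK01875": (4.66, 2.36, 1.012), "BD01462": (3.87, 1.83, 1.009)}
    for name, (zb, za, a) in reported.items():
        print(name, f"implied s = {implied_sd(zb, za, a):.4f}")
```

Under these assumptions both samples imply a standard deviation of roughly half a percent of the expected coverage, so the two reported corrections are mutually consistent to within the rounding of α.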
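As can be seen from the above description, the above embodiments achieve the following technical effect. In accounting for the effect of the pregnant woman's own copy-number-abnormal segments on the calculation of chromosome aneuploidy, the application abandons the prior-art approach of simply removing the maternal copy-number-abnormal segments from the sequencing data. Instead, it expresses the effect of copy-number-abnormal segments of a specified size on the maternal chromosomes through the parameter α and uses α to correct the coverage of each chromosome. This reduces the influence of the segments on the aneuploidy call rather than ignoring their existence, so that the aneuploidy results obtained by the method are more accurate.
The method, device or kit of the application provides an NIPT method for detecting fetal chromosome aneuploidy that is almost unaffected by maternal copy-number-abnormal segments. It improves detection precision and is suitable for large-scale use.
Those skilled in the art should understand that some of the modules, elements or steps of the application described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into individual integrated circuit modules; or several of the modules or steps can be made into a single integrated circuit module. The application is thus not limited to any specific combination of hardware and software.
The above are merely preferred embodiments of the application and are not intended to limit it. For those skilled in the art, the application may be modified and varied in many ways. Any modification, equivalent replacement or improvement made within the spirit and principles of the application shall fall within its scope of protection.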

Claims (15)

  1. 一种检测染色体非整倍性的方法,其特征在于,所述方法包括以下步骤:
    对待测孕妇的外周血游离DNA进行高通量测序,得到包含所有染色体的测序数据;
    对所述测序数据中的所有染色体以切分成窗口的形式计算覆盖度,得到各所述染色体的矫正前覆盖度;
    对所述待测孕妇在各所述窗口中的单一序列的数量进行Z检验,得到ZCNV值,并根据所述ZCNV值大小得到所述待测孕妇的拷贝数异常片段;所述待测孕妇的拷贝数异常片段是指在所述测序数据中300Kb以上的片段,且在所述300Kb以上的片段中,80%以上的窗口中染色体片段的ZCNV值都大于等于4或小于等于-4的片段;
    利用所述待测孕妇的拷贝数异常片段对各所述染色体的矫正前覆盖度的影响,对各所述染色体的矫正前覆盖度进行矫正,得到各所述染色体的矫正后覆盖度;以及
    利用各所述染色体的矫正后覆盖度对各所述染色体进行Z检验,得到Zaneu值,并根据所述Zaneu值的绝对值是否大于等于3来判断所述染色体是否具有非整倍性;当所述Zaneu值的绝对值大于等于3时,则所述染色体具有非整倍性;
    其中,所述待测孕妇的拷贝数异常片段对各所述染色体的矫正前覆盖度的影响用参数α表示,
    当胎儿遗传了母体的所述拷贝数异常片段时,所述参数α的计算公式如式(1):
    Figure PCTCN2015078422-appb-100001
    当胎儿未遗传母体的所述拷贝数异常片段时,所述参数α的计算公式如式(2):
    在所述式(1)和式(2)中,m表示所述拷贝数异常片段所在的染色体的有效长度,单位为Mb;n表示所述待测孕妇的所述拷贝数异常片段的长度,单位为Mb;cn表示所述孕妇的所述拷贝数异常片段出现的次数;
    在所述式(2)中,f表示所述待测孕妇的外周血游离DNA中所含的胎儿游离DNA的浓度且假定所述胎儿游离DNA的浓度f小于50%;
    并利用
    Figure PCTCN2015078422-appb-100003
    对各所述染色体的矫正前覆盖度进行矫正,其中,
    Figure PCTCN2015078422-appb-100004
    代表各所述染色体的矫正前覆盖度,x'代表各所述染色体的矫正后染色体覆盖度。
  2. 根据权利要求1所述的方法,其特征在于,对所述测序数据中的所有染色体以切分成相等大小的窗口的形式计算覆盖度,得到各所述染色体的矫正前覆盖度。
  3. 根据权利要求2所述的方法,其特征在于,每个所述窗口的大小为100Kb,且相邻两个所述窗口之间的重叠度为50%。
  4. 根据权利要求1所述的方法,其特征在于,对所述待测孕妇在各所述窗口中的单一序列的数量进行Z检验,得到ZCNV值,并根据所述ZCNV值得到所述待测孕妇的拷贝数异常片段的步骤包括:
    根据所述测序数据中各序列的测序深度,统计各所述窗口的所述单一序列的数量;
    根据各所述染色体的GC含量和比对率对各所述窗口的所述单一序列的数量进行计算,得到各所述窗口的所述单一序列的数量的矫正前覆盖度;以及
    对各所述窗口的所述单一序列的数量的矫正前覆盖度进行标准化处理,得到各所述窗口的所述单一序列的数量的ZCNV值,并根据所述ZCNV值的大小判断所述待测孕妇是否具有所述拷贝数异常的片段;
    当在所述测序数据中存在300Kb以上的片段,且在所述300Kb以上的片段中80%以上的窗口的所述单一序列的数量的ZCNV值都大于等于4或小于等于-4时,则认为所述300Kb以上的片段是所述待测孕妇的拷贝数异常片段。
  5. 根据权利要求1所述的方法,其特征在于,利用各所述染色体的矫正后覆盖度对各所述染色体的进行Z检验,得到Zaneu值的步骤中,所述Zaneu值按照
    Figure PCTCN2015078422-appb-100005
    来计算,其中,
    Figure PCTCN2015078422-appb-100006
    是根据LOESS算法,通过已知阴性样本群体得到的矫正前覆盖度;s表示阴性样本群体里
    Figure PCTCN2015078422-appb-100007
    的标准差。
  6. 一种检测染色体非整倍性的装置,其特征在于,所述装置包括以下模块:
    测序数据检测模块:用于对待测孕妇的外周血游离DNA进行高通量测序,以得到包含所有染色体的测序数据;
    第一覆盖度计算模块:用于对所述测序数据中的所有染色体以切分成的窗口形式计算覆盖度,以得到各所述染色体的矫正前覆盖度;
    ZCNV值计算模块:用于对所述待测孕妇在各所述窗口中的单一序列的数量的ZCNV值进行计算;
    拷贝数异常片段查询模块:用于在所述测序数据中查询300Kb以上的片段,且在所述300Kb以上的片段中,80%以上的窗口中染色体片段的ZCNV值都大于等于4或小于等于-4的片段;
    拷贝数异常片段确定模块:用于将从所述测序数据中查询得到的所述300Kb以上的片段且在80%以上的窗口中染色体片段的ZCNV值都大于等于4或小于等于-4的片段确定为待测孕妇的拷贝数异常片段;
    α第一计算模块:用于在胎儿遗传了母体的拷贝数异常片段的情况下,按照如式(1)所示的计算公式计算参数α,其中,所述参数α是指孕妇的拷贝数异常片段对各所述染色体的矫正前覆盖度的影响;
    Figure PCTCN2015078422-appb-100008
    在所述式(1)中,m表示所述拷贝数异常片段所在染色体的有效长度,单位为Mb;n表示所述孕妇在所述拷贝数异常片段的长度,单位为Mb;cn表示所述孕妇的所述拷贝数异常片段出现的次数;
    α第二计算模块:用于在胎儿未遗传母体的拷贝数异常的染色体的情况下,按照如式(2)所示的计算公式所述参数α,
    Figure PCTCN2015078422-appb-100009
    在所述式(2)中,m表示所述拷贝数异常片段所在染色体的有效长度,单位为Mb;n表示所述孕妇在所述拷贝数异常片段的长度,单位为Mb;cn表示所述孕妇的所述拷贝数异常片段出现的次数;f表示所述待测孕妇的外周血游离DNA中所含的胎儿游离DNA的浓度且假定所述胎儿游离DNA的浓度f小于50%;
    矫正模块:用于利用
    Figure PCTCN2015078422-appb-100010
    对各所述染色体的矫正前覆盖度进行矫正,得到各所述染色体的矫正后覆盖度;其中,
    Figure PCTCN2015078422-appb-100011
    代表各所述染色体的矫正前覆盖度,x'代表各所述染色体的矫正后染色体覆盖度;
    第二覆盖度计算模块:用于利用各所述染色体的矫正后覆盖度来计算各染色体的Zaneu值;
    Zaneu值判断模块:用于判断所述Zaneu值是否大于等于3;
    染色体非整倍性确认模块:用于在所述Zaneu值大于等于3的情况下,确定所述染色体具有非整倍性。
  7. 根据权利要求6所述的装置,其特征在于,所述第一覆盖度计算模块包括:
    染色体窗口切分子模块:用于对所述测序数据中的所有染色体以切分成相等大小的窗口;
    第一覆盖度计算子模块:用于以所述相等大小的窗口的形式计算覆盖度,以得到各所述染色体的校正前覆盖度。
  8. 根据权利要求7所述的装置,其特征在于,所述染色体窗口切分子模块中,每个所述窗口的大小为100Kb,且相邻两个所述窗口之间的重叠度为50%。
  9. 根据权利要求6所述的装置,其特征在于,所述ZCNV值计算模块包括:
    单一序列统计单元:用于根据所述测序数据中各序列的测序深度,统计各所述窗口的单一序列的数量;
    单一序列的覆盖度计算单元:用于根据各所述染色体的GC含量和比对率对各所述窗口的所述单一序列的数量进行计算,得到各所述窗口的所述单一序列的数量的矫正前覆盖度;
    单一序列ZCNV值计算单元:用于对各所述窗口的所述单一序列的数量的矫正前覆盖度进行标准化处理,得到各所述窗口的所述单一序列的数量的ZCNV值。
  10. 根据权利要求6所述的装置,其特征在于,在所述第二覆盖度计算模块中,所述Zaneu按照
    Figure PCTCN2015078422-appb-100012
    来计算,其中,
    Figure PCTCN2015078422-appb-100013
    是根据LOESS算法,通过已知阴性样本群体得到的覆盖度值,s表示阴性样本群体里
    Figure PCTCN2015078422-appb-100014
    的标准差。
  11. 一种检测染色体非整倍性的试剂盒,其特征在于,所述试剂盒包括:
    检测试剂和检测器械:用于对待测孕妇的外周血游离DNA进行高通量测序,以得到包含所有染色体的测序数据;
    第一覆盖度计算器械:用于对所述测序数据中的所有染色体以切分成的窗口形式计算覆盖度,以得到各所述染色体的矫正前覆盖度;
    ZCNV值计算器械:用于对所述待测孕妇在各所述窗口中的单一序列的数量进行Z检验,得到ZCNV值;
    拷贝数异常片段查询器械:用于在所述测序数据中查询300Kb以上的片段,且在所述300Kb以上的片段中,80%以上的窗口中染色体片段的ZCNV值都大于等于4或小于等于-4的片段;
    拷贝数异常片段确认器械:用于根据所述ZCNV值大小得到所述待测孕妇的拷贝数异常片段;
    α第一计算器械:用于在胎儿遗传了母体的拷贝数异常片段的情况下,按照如式(1)所示的计算公式计算参数α,所述参数α为孕妇的拷贝数异常片段对各所述染色体的矫正前覆盖度的影响,
    Figure PCTCN2015078422-appb-100015
    m表示所述拷贝数异常片段所在染色体的有效长度,单位为Mb;n表示所述孕妇在所述拷贝数异常片段的长度,单位为Mb;cn表示所述孕妇的所述拷贝数异常片段出现的次数;
    α第二计算器械:用于在胎儿未遗传母体的拷贝数异常的染色体的情况下,按照如式(2)所示的计算公式所述参数α:
    Figure PCTCN2015078422-appb-100016
    m表示所述拷贝数异常片段所在染色体的有效长度,单位为Mb;n表示所述孕妇在所述拷贝数异常片段的长度,单位为Mb;cn表示所述孕妇的所述拷贝数异常片段出现的次数;f表示所述待测孕妇的外周血游离DNA中所含的胎儿游离DNA的浓度且假定所述胎儿游离DNA的浓度f小于50%;
    矫正器械:用于利用
    Figure PCTCN2015078422-appb-100017
    对各所述染色体的矫正前覆盖度进行矫正,得到各所述染色体的矫正后覆盖度;其中,
    Figure PCTCN2015078422-appb-100018
    代表各所述染色体的矫正前覆盖度,x'代表各所述染色体的矫正后染色体覆盖度;
    第二覆盖度计算器械:用于利用各所述染色体的矫正后覆盖度来计算各染色体的Zaneu值;
    Zaneu值判断器械:用于判断所述Zaneu值是否大于等于3;
    染色体非整倍性确认器械:用于在所述Zaneu值大于等于3的情况下,确定所述染色体具有非整倍性。
  12. 根据权利要求11所述的试剂盒,其特征在于,所述第一覆盖度计算器械包括:
    染色体窗口切分部件:用于对所述测序数据中的所有染色体以切分成相等大小的窗口;
    第一覆盖度计算部件:用于以所述相等大小的窗口的形式计算覆盖度,以得到各所述染色体的校正前覆盖度。
  13. 根据权利要求12所述的试剂盒,其特征在于,所述染色体窗口切分部件中,每个所述窗口的大小为100Kb,且相邻两个所述窗口之间的重叠度为50%。
  14. 根据权利要求11所述的试剂盒,其特征在于,所述单一序列ZCNV值计算器械包括:
    单一序列统计部件:用于根据所述测序数据中各序列的测序深度,统计各所述窗口的单一序列的数量;
    单一序列的覆盖度计算部件:用于根据各所述染色体的GC含量和比对率对各所述单一序列的数量进行计算,得到各所述窗口的所述单一序列的数量的矫正前覆盖度;
    单一序列ZCNV值计算部件:用于对各所述窗口的所述单一序列的数量的矫正前覆盖度进行标准化处理,得到各所述窗口的所述单一序列的数量的ZCNV值。
  15. 根据权利要求11所述的试剂盒,其特征在于,在所述第二覆盖度计算器械中,所述Zaneu按照
    Figure PCTCN2015078422-appb-100019
    来计算,其中,
    Figure PCTCN2015078422-appb-100020
    是根据LOESS算法,通过已知阴性样本群体得到的矫正前覆盖度值,s表示阴性样本群体里
    Figure PCTCN2015078422-appb-100021
    的标准差。
PCT/CN2015/078422 2015-05-06 2015-05-06 检测染色体非整倍性的试剂盒、装置和方法 WO2016176847A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
SG11201709141YA SG11201709141YA (en) 2015-05-06 2015-05-06 Reagent kit, apparatus, and method for detecting chromosome aneuploidy
JP2018509952A JP6623400B2 (ja) 2015-05-06 2015-05-06 染色体異数性を測定するためのキット、装置及び方法
PCT/CN2015/078422 WO2016176847A1 (zh) 2015-05-06 2015-05-06 检测染色体非整倍性的试剂盒、装置和方法
EP15891099.2A EP3293270B1 (en) 2015-05-06 2015-05-06 Reagent kit, apparatus, and method for detecting chromosome aneuploidy
US15/571,859 US20180201990A1 (en) 2015-05-06 2015-05-06 Kit, apparatus, and method for detecting chromosome aneuploidy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/078422 WO2016176847A1 (zh) 2015-05-06 2015-05-06 检测染色体非整倍性的试剂盒、装置和方法

Publications (1)

Publication Number Publication Date
WO2016176847A1 true WO2016176847A1 (zh) 2016-11-10

Family

ID=57217322

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/078422 WO2016176847A1 (zh) 2015-05-06 2015-05-06 检测染色体非整倍性的试剂盒、装置和方法

Country Status (5)

Country Link
US (1) US20180201990A1 (zh)
EP (1) EP3293270B1 (zh)
JP (1) JP6623400B2 (zh)
SG (1) SG11201709141YA (zh)
WO (1) WO2016176847A1 (zh)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104603284B (zh) * 2012-09-12 2016-08-24 深圳华大基因研究院 利用基因组测序片段检测拷贝数变异的方法
US10482994B2 (en) * 2012-10-04 2019-11-19 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
US10319463B2 (en) * 2015-01-23 2019-06-11 The Chinese University Of Hong Kong Combined size- and count-based analysis of maternal plasma for detection of fetal subchromosomal aberrations

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102985561A (zh) * 2011-04-14 2013-03-20 维里纳塔健康公司 用于确定并且验证常见的和罕见的染色体非整倍性的归一化染色体
CN104789686A (zh) * 2015-05-06 2015-07-22 安诺优达基因科技(北京)有限公司 检测染色体非整倍性的试剂盒和装置
CN104789466A (zh) * 2015-05-06 2015-07-22 安诺优达基因科技(北京)有限公司 检测染色体非整倍性的试剂盒和装置

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHIU, R.W.K. ET AL.: "Noninvasive Prenatal Diagnosis of Fetal Chromosomal Aneuploidy by Massively Parallel Genomic Sequencing of DNA in Maternal Plasma", PNAS, vol. 105, no. 51, 23 December 2008 (2008-12-23), pages 20458 - 20463, XP055284693 *
MATTHEW, W. ET AL.: "Copy-Number Variation and False Positive Prenatal Aneuploidy Screening Results", N ENGL J MED., vol. 372, no. 17, 23 April 2015 (2015-04-23), pages 1639 - 1645, XP055328136 *
See also references of EP3293270A4 *

Also Published As

Publication number Publication date
JP6623400B2 (ja) 2019-12-25
EP3293270A4 (en) 2018-03-28
EP3293270A1 (en) 2018-03-14
EP3293270B1 (en) 2019-09-25
US20180201990A1 (en) 2018-07-19
SG11201709141YA (en) 2017-12-28
JP2018514234A (ja) 2018-06-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15891099

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018509952

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 11201709141Y

Country of ref document: SG

Ref document number: 15571859

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2015891099

Country of ref document: EP