CN116246704A - System for noninvasive prenatal detection of fetuses - Google Patents

Publication number
CN116246704A
Authority
CN
China
Prior art keywords
depth
microdeletion
chromosome
primer
chri
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310518720.6A
Other languages
Chinese (zh)
Other versions
CN116246704B (en)
Inventor
曾晓静
蒋馥蔓
李胜
杜伯乐
夏伟成
郭宇来
秦炳财
王阳
李小坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jingke Dx Co ltd
Shenzhen Jingke Biotechnology Co ltd
Shenzhen Jingke Gene Technology Co ltd
Shenzhen Jingke Medical Laboratory
Original Assignee
Guangzhou Jingke Dx Co ltd
Shenzhen Jingke Gene Technology Co ltd
Shenzhen Jingke Medical Laboratory
Guangzhou Jingke Biotech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jingke Dx Co ltd, Shenzhen Jingke Gene Technology Co ltd, Shenzhen Jingke Medical Laboratory, Guangzhou Jingke Biotech Co ltd
Priority to CN202310518720.6A priority Critical patent/CN116246704B/en
Publication of CN116246704A publication Critical patent/CN116246704A/en
Application granted granted Critical
Publication of CN116246704B publication Critical patent/CN116246704B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10Sequence alignment; Homology search
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6879Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for sex determination
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50Mutagenesis
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00Oligonucleotides characterized by their use
    • C12Q2600/156Polymorphic or mutational markers
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • Medical Informatics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Public Health (AREA)
  • Immunology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Physiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a system for noninvasive prenatal detection of a fetus, comprising a sequencing device, a data analysis device and a result output device. After sequencing data of plasma free DNA are obtained, chromosome copy number analysis, chromosome microdeletion/microduplication analysis and monogenic disease analysis are performed using a proprietary algorithm. Microdeletion/microduplication syndromes are detected with almost no increase in detection cost or detection time, the fetal DNA percentage is estimated using a neural network, and the data show that the detection resolution can be improved to 2 Mb.

Description

System for noninvasive prenatal detection of fetuses
Technical Field
The present invention relates to the field of prenatal diagnosis, and in particular to a system for noninvasive prenatal detection of a fetus.
Background
Genetic diseases are diseases in which a change or abnormality in human genetic material causes the structure or function of the fetus to be abnormal or impaired at or after birth. They mainly comprise chromosomal abnormalities and monogenic genetic diseases, which seriously threaten human health and are a major public health problem.
Chromosomal abnormalities mainly include copy number abnormalities and structural abnormalities. The most common are chromosomal aneuploidies, i.e., changes in chromosome number (a total of more or fewer than 46 chromosomes, or one, three, four or some other number of copies of a given chromosome instead of two). The most common chromosomal aneuploidy diseases seen clinically are trisomy 21 (Down syndrome), trisomy 18 (Edwards syndrome) and trisomy 13 (Patau syndrome). Microdeletion/microduplication syndromes are the other major class of neonatal birth defects besides chromosomal aneuploidies. Data show that 211 chromosomal microdeletion diseases and 79 chromosomal microduplication diseases had been published by November 2012. The incidence of chromosomal microdeletion/microduplication syndromes ranges from 1/4000 to 1/2000000; the smaller deletions and duplications, typically under 5 Mb, are easily missed by prenatal diagnosis. Data show that most chromosomal microdeletion/microduplication diseases arise from new (de novo) mutations and that the risk of onset has no significant correlation with age. Pathogenic or potentially pathogenic chromosomal microdeletions/microduplications occur at a rate of 1.7%, and the risk of recurrence is high. A single-gene genetic disease is a genetic disease caused by mutation of a single gene; each individual disease has a low incidence, but the diseases are numerous and their combined incidence is high. According to World Health Organization (WHO) statistics, the cumulative incidence of all single-gene genetic diseases in the global birth population is as high as 10/1000.
Research on genetic diseases makes it possible to prevent and treat them effectively, to the benefit of patients. Early diagnosis and early treatment are critical. Conventional clinical screening methods include serological screening and imaging; when the result is positive, prenatal diagnosis is performed by invasive testing (chorionic villus sampling, amniocentesis, etc.). However, the screening methods are not highly sensitive and produce a certain number of false positives, and invasive testing carries a certain risk of miscarriage.
In 1997 the Hong Kong scholar Professor Lo (Y. M. Dennis Lo) found that 3-13% of the free nucleic acids in maternal plasma come from the fetus, opening a new chapter in noninvasive prenatal diagnosis using maternal plasma. Initially, noninvasive prenatal DNA analysis was very challenging because of the large amount of maternal free DNA. With the rapid development of high-throughput sequencing, however, tens of millions of DNA fragments can be quantified to detect chromosomal aneuploidies such as trisomy 21, trisomy 18 and trisomy 13; this approach has been widely validated and accepted in clinical practice, with a specificity for chromosomal aneuploidies of about 97-99% and a low false positive rate (<0.1%). At present, the conventional methods for noninvasive prenatal detection are the whole-genome sequencing (WGS) method and the SNP method. Compared with the WGS method, the SNP method (amplification of specific sequences) can work at lower fetal DNA concentrations, but it has a long detection time, is difficult to develop, has a limited detection range, and cannot detect sequences outside SNP loci. Some institutions have attempted chip capture for noninvasive prenatal testing: probes are designed for selected specific and important regions, free DNA from the peripheral blood of pregnant women is captured, and the resulting library is sequenced. Compared with the WGS method, this adds a chip capture step, so detection takes longer and costs more. Although the WGS method requires more data than the SNP method, sequencing costs keep falling, so sequencing accounts for a shrinking share of the total detection cost, and detection coverage will play an increasingly important role in future testing. In recent years, the application has been further extended to the detection of microdeletions and microduplications.
The team of Yin Aihua at the Guangdong women and children's healthcare institute reported in September 2015 that they had performed NIPT on pregnant women's plasma using the WGS method, detected fetal chromosomal microdeletions/microduplications, and confirmed the results by prenatal diagnosis. Because maternal background DNA interferes with the signal, improving the accuracy of fetal chromosomal microdeletion/microduplication detection is important. Common approaches are: increasing the concentration of fetal free DNA in the assay; optimizing the analysis algorithm; and using the SNP method. The limited number of detection sites is an unavoidable drawback of the SNP method, so in practice the first two approaches are more widely used.
Single-gene genetic diseases cause serious harm: most are teratogenic, disabling or even fatal, and effective treatments are lacking. Studies show that about half of monogenic diseases are dominant monogenic diseases, of which about 74% are caused by new variants. A new mutation is not inherited from the parents; there may be no phenotype during pregnancy, or ultrasound abnormalities may appear only in late pregnancy. An effective early screening method is currently lacking in clinical practice, so such mutations are easily missed before birth. With growing expectations for healthy births and advances in detection technology, demand for prenatal detection of single-gene genetic diseases is increasing. However, current clinical detection of fetal monogenic genetic diseases relies mainly on invasive testing, i.e., collecting samples such as amniotic fluid or chorionic villi. Invasive testing depends on skilled operators and is costly; more importantly, it carries a risk of miscarriage (about 0.5-1%).
The prior art lacks a highly accurate method for detecting microdeletions/microduplications, and its accuracy in detecting single-gene dominant genetic diseases is relatively low.
Disclosure of Invention
The present invention aims to overcome at least one of the deficiencies of the prior art and to provide a system for non-invasive prenatal detection of a fetus.
The technical scheme adopted by the invention is as follows:
a system for noninvasive prenatal detection of a fetus comprising a sequencing device, a data analysis device, and a result output device, wherein:
the sequencing device is used for determining sequence information of a free DNA sequencing library in peripheral blood of the pregnant woman to obtain a sequencing result;
the result output device is used for outputting the analysis result of the data analysis device;
the analysis method of the data analysis device comprises the following steps:
aligning the sequencing result to a human reference genome, constructing a set of uniquely aligned sequencing reads, and recording the alignment position of each read;
dividing the reference genome into a plurality of primary windows of unit window length;
counting the uncorrected frequency count of uniquely aligned reads in each window according to the alignment position of each uniquely aligned read;
correcting the uncorrected frequency count according to the GC value to obtain the corrected frequency count;
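The window-counting step can be illustrated with a minimal sketch. The input format (pairs of chromosome name and 0-based alignment start) and the function name `count_reads_per_window` are assumptions made for illustration; the source does not specify a data layout.

```python
from collections import Counter


def count_reads_per_window(positions, window_size):
    """Count uniquely aligned reads per fixed-length window.

    positions: iterable of (chromosome, 0-based start) pairs; this input
    format is hypothetical, chosen only for the example.
    Returns a Counter keyed by (chromosome, window_index).
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    counts = Counter()
    for chrom, start in positions:
        # Integer division maps an alignment position to its window index.
        counts[(chrom, start // window_size)] += 1
    return counts


# Toy alignment positions, 1 Mb windows.
reads = [("chr1", 10), ("chr1", 999_999), ("chr1", 1_000_000), ("chr2", 5)]
window_counts = count_reads_per_window(reads, window_size=1_000_000)
```

These raw counts correspond to the uncorrected frequency counts; GC correction is applied to them afterwards.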
prediction of fetal nucleic acid percentage:
estimating the fetal nucleic acid percentage PY from the Y chromosome:
calculating the average Y chromosome depth DepthY, where DepthY = sum of the corrected frequency counts of all windows of the Y chromosome / base length of the whole Y chromosome; the average DepthY of normal males is recorded as Dmale and the average DepthY of normal females as Dfemale; the average Y chromosome depth of the sample to be detected is recorded as Dtest, and the fetal nucleic acid percentage of the sample to be detected is PY = (Dtest - Dfemale)/(Dmale - Dfemale);
carrying out principal component analysis on known normal samples by an unsupervised learning method to obtain a principal component set PCs;
selecting the PCs as the principal component set, introducing an artificial neural network, and re-weighting the PCs using a set of samples labeled with the known fetal nucleic acid percentage PY derived from the fetal Y chromosome, so as to construct a weight-principal component-PY neural network model;
predicting the fetal nucleic acid percentage PF: converting the corrected frequency counts of the chromosome windows of the sample to be detected into the principal component set PCs by principal component analysis, and then outputting the neuron value PF from the weight-principal component-PY neural network model;
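The Y-chromosome estimate can be sketched as follows. This sketch assumes the formula is intended as PY = (Dtest - Dfemale)/(Dmale - Dfemale), i.e., the sample's Y depth rescaled between the female background and the male reference; the function names and the toy depth values are hypothetical.

```python
def mean_depth(corrected_counts, chrom_length_bp):
    """Average depth: sum of corrected window counts / chromosome length."""
    if chrom_length_bp <= 0:
        raise ValueError("chrom_length_bp must be positive")
    return sum(corrected_counts) / chrom_length_bp


def fetal_fraction_from_y(d_test, d_male, d_female):
    """Assumed form: PY = (Dtest - Dfemale) / (Dmale - Dfemale)."""
    span = d_male - d_female
    if span <= 0:
        raise ValueError("d_male must exceed d_female")
    return (d_test - d_female) / span


# Toy values: female background 0.02, male reference 1.02, test sample 0.12.
py = fetal_fraction_from_y(d_test=0.12, d_male=1.02, d_female=0.02)
```

With these toy values the estimate is about 0.10. The estimate is only meaningful for pregnancies with a male fetus, which is why the text goes on to train a model that predicts PF from autosomal window counts.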
determination of the abnormality percentage PA of microdeletion/microduplication fragments:
randomly setting a breakpoint position and applying a statistical test to the corrected frequency counts of the windows on the left and right sides of the breakpoint to calculate a significance level p-value; if the calculated p-value is smaller than the preset significance level p-set, selecting the window as a candidate microdeletion/microduplication window; if the calculated p-value is larger than p-set, continuing to merge windows for the next test until a candidate microdeletion/microduplication window is obtained;
merging adjacent candidate microdeletion/microduplication windows to obtain the selected microdeletion/microduplication region;
calculating the depth of each microdeletion/microduplication region, depth(abnormal) = corrected frequency count falling within the microdeletion/microduplication region / size of the microdeletion/microduplication region;
calculating likewise the average depth of the other regions of the chromosome that do not contain the microdeletion/microduplication, depth(normal) = corrected frequency count falling on the remaining regions of the chromosome that do not contain the microdeletion/microduplication / size of those remaining regions;
calculating the abnormality percentage of the region containing the microdeletion/microduplication or whole-chromosome copy number variation, PA = 2 × |depth(normal) - depth(abnormal)| / depth(normal); when depth(normal) - depth(abnormal) is positive, the region is identified as a preliminary microdeletion variation; when depth(normal) - depth(abnormal) is negative, it is identified as a preliminary microduplication variation; when depth(normal) - depth(abnormal) is 0, it is judged normal;
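The PA formula and the sign rule can be sketched as below. The function name and return format are hypothetical. The factor of 2 is consistent with a heterozygous fetal event changing the depth by about half the fetal fraction, though the source does not state this rationale.

```python
def abnormality_percentage(depth_normal, depth_abnormal):
    """PA = 2 * |depth(normal) - depth(abnormal)| / depth(normal).

    Returns (PA, preliminary call) following the sign rule in the text.
    """
    if depth_normal <= 0:
        raise ValueError("depth_normal must be positive")
    diff = depth_normal - depth_abnormal
    pa = 2 * abs(diff) / depth_normal
    if diff > 0:
        call = "preliminary microdeletion"
    elif diff < 0:
        call = "preliminary microduplication"
    else:
        call = "normal"
    return pa, call


# Toy depths: the candidate region is 5% shallower than the rest.
pa_del, call_del = abnormality_percentage(1.0, 0.95)
pa_dup, call_dup = abnormality_percentage(1.0, 1.05)
```

In this toy case PA is about 0.10 for both regions; only the sign of the depth difference separates the deletion call from the duplication call.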
chromosome copy number variation determination:
obtaining the chromosome copy number variation percentage PC_chri (i = 1, 2, …, 22, X, Y), comprising the steps of:
calculating the coverage proportion Coverage_chri (i = 1, 2, …, X, Y) of each chromosome, Coverage_chri = sum of the corrected window frequency counts falling on that chromosome / window frequency count falling on that chromosome;
calculating the coverage proportion Coverage-normal_chri (i = 1, 2, …, X, Y) of each chromosome for known normal samples, Coverage-normal_chri = sum of the corrected window frequency counts falling on that chromosome / window frequency count falling on that chromosome, and averaging the coverage proportion of each chromosome over all normal samples to obtain Average(Coverage-normal_chri) (i = 1, 2, …, X, Y);
calculating the copy number variation percentage PC_chri of each chromosome of the sample to be detected, PC_chri = 2 × (Coverage_chri - Average(Coverage-normal_chri)) / Average(Coverage-normal_chri);
calculating the ratio R of the copy number variation percentage PC of each chromosome of the sample to be detected to the fetal nucleic acid percentage PF, R = |PC|/PF;
determining chromosome copy number variation according to the PC_chri and R values;
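The PC_chri and R calculations can be sketched as follows. The coverage proportions are taken as given inputs, so the sketch does not depend on how Coverage_chri itself is computed; the function name and toy numbers are hypothetical.

```python
def copy_number_variation(coverage, normal_coverages, pf):
    """Return (PC, R) for one chromosome.

    PC = 2 * (Coverage - Average(normal)) / Average(normal)
    R  = |PC| / PF
    """
    if not normal_coverages:
        raise ValueError("need at least one normal sample")
    if pf <= 0:
        raise ValueError("pf must be positive")
    avg_normal = sum(normal_coverages) / len(normal_coverages)
    pc = 2 * (coverage - avg_normal) / avg_normal
    return pc, abs(pc) / pf


# Toy example: coverage 5% above the normal average, fetal fraction 10%.
pc, r = copy_number_variation(1.05, [1.0, 1.0, 1.0], pf=0.10)
```

Here PC is about 0.10 and R about 1.0, meaning the excess coverage is roughly what a fetal trisomy at a 10% fetal fraction would produce.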
prediction of monogenic genetic disease:
grouping the reads derived from the same cfDNA molecule and comparing them at each base site in turn, starting from the first base site of the read group: if a base is shared by only 30% or fewer of the reads at that site, the base is considered background noise; if 70% or more of the reads share the same base at the site, the base at that site is confirmed; if between 30% and 70% of the reads share the same base, the base is recorded as an N base;
repeating the same statistics at the second base site of the reads, and so on until the last site, to obtain the base sequence of the reads derived from the same cfDNA molecule;
aligning the resulting base sequences of the sequenced cfDNA molecules to a human reference genome using the bwa aln algorithm;
examining the base at each covered site, and counting the depth and depth proportion of A, T, G, C, N, insertions and deletions at each site;
selecting a single-gene disease site and checking the total coverage depth at that site: if the total depth is less than 1000X, quality control fails; if the total depth is greater than 1000X, quality control passes;
locating the known pathogenic mutation sites to be examined: if the depth percentage of the pathogenic mutation is greater than 3%, the mutation is considered present; if it is 1%-3%, the result falls in the gray zone and re-testing is required; if it is below 1%, the mutation is judged absent.
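The per-site consensus rule and the mutation thresholds can be sketched as below. Two points are assumptions not fixed by the text: a site where no base reaches 70% support is reported as N even if the top base is at or below 30%, and a total depth of exactly 1000X is treated as passing quality control. Function names and return labels are hypothetical.

```python
from collections import Counter


def consensus_base(bases):
    """Consensus call at one site from reads of the same cfDNA molecule.

    A base supported by >= 70% of reads is confirmed; otherwise the site
    is reported as "N" (assumption: this includes the case where no base
    exceeds 30%). Minority bases are thereby treated as noise.
    """
    if not bases:
        raise ValueError("no reads at this site")
    base, n = Counter(bases).most_common(1)[0]
    return base if n / len(bases) >= 0.70 else "N"


def classify_site(total_depth, mutant_depth, min_depth=1000):
    """Apply the depth QC and the 1% / 3% mutation thresholds."""
    if total_depth < min_depth:  # assumption: exactly 1000X passes
        return "QC fail"
    fraction = mutant_depth / total_depth
    if fraction > 0.03:
        return "mutation present"
    if fraction >= 0.01:
        return "gray zone"
    return "mutation absent"


molecule_sequence = "".join(
    consensus_base(site) for site in (list("AAAAAAAT"), list("CCCCGGGG"))
)
```

For the two toy sites, the first is confirmed as A (7 of 8 reads agree) and the second becomes N (an even split), giving the molecule sequence `AN`.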
In some examples of the system, the chromosomal microdeletion/microduplication variation determination comprises the steps of:
calculating the ratio R of the microdeletion/microduplication abnormality percentage to the fetal nucleic acid percentage, R = PA/PF;
judging whether the sample is positive or negative for microdeletion/microduplication variation according to the ratio R:
if depth(normal) - depth(abnormal) of the microdeletion/microduplication region is positive, the region is preliminarily judged to be a microdeletion variation, and negative signals are then filtered out using the R value to confirm the final positive microdeletion variation:
if R > 5, a prompt is given that the positive signal may come from the mother, and whether the fetus carries the variation cannot be predicted;
if 0.8 ≤ R ≤ 5, positive variation is determined;
if R ≤ 0.5, negative variation is determined;
if 0.5 < R < 0.8, the result falls in the gray zone and re-testing is required;
if depth(normal) - depth(abnormal) of the microdeletion/microduplication region is negative, the region is preliminarily judged to be a microduplication variation, and negative signals are then filtered out using the R value to confirm the final positive microduplication variation:
if R > 5, a prompt is given that the positive signal may come from the mother, and whether the fetus carries the variation cannot be predicted;
if 0.8 ≤ R ≤ 5, positive variation is determined;
if R ≤ 0.5, negative variation is determined;
if 0.5 < R < 0.8, the result falls in the gray zone and re-testing is required.
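Because the same R thresholds apply to both deletions and duplications, the decision can be sketched as a single function. The function name and the returned labels are hypothetical.

```python
def classify_by_ratio(pa, pf):
    """Classify a candidate region from R = PA / PF.

    R > 5          -> signal may be maternal; fetal status not predictable
    0.8 <= R <= 5  -> positive
    0.5 < R < 0.8  -> gray zone, re-test
    R <= 0.5       -> negative
    """
    if pf <= 0:
        raise ValueError("pf must be positive")
    r = pa / pf
    if r > 5:
        return "possible maternal origin"
    if r >= 0.8:
        return "positive"
    if r <= 0.5:
        return "negative"
    return "gray zone"


# Toy values with a fetal fraction of 10%.
results = [classify_by_ratio(pa, 0.10) for pa in (0.80, 0.10, 0.065, 0.02)]
```

A fetal heterozygous event should give R close to 1, so a very large R suggests that the depth change is too strong to come from the fetal fraction alone.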
In some system examples, the PCs are selected as the principal component set and an artificial neural network with 2 hidden layers is introduced; resilient backpropagation with weight backtracking is used for training, with the sum of squared residuals as the loss function, and the PCs are re-weighted using a set of samples labeled with the known fetal nucleic acid percentage PY derived from the fetal Y chromosome, so as to construct the weight-principal component-PY neural network model.
In some examples of systems, the criteria for chromosomal copy number variation are as follows:
Trisomy determination:
if PC_chri > 0.03 and R > 1.8, trisomy positive is reported, with a prompt that the signal may reflect a maternal or placental effect;
if PC_chri > 0.03 and 0.2 < R < 1.8, chri trisomy positive is determined;
if PC_chri > 0.03 and 0.1 < R < 0.2, the result falls in the gray zone and re-determination is required;
if PC_chri ≤ 0.03 and R < 0.1, a negative result is reported;
determination of XO:
when PC_chrX ≤ -0.03, -0.01 < PC_chrY < 0.01, and R (i.e., |PC_chrX|/PF) ≥ 0.8, the result is determined as positive;
when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is determined as negative;
determination of XXX:
when PC_chrX > 0.03, -0.01 < PC_chrY < 0.01, and R (i.e., |PC_chrX|/PF) ≥ 0.8, the result is determined as positive;
when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is determined as negative;
determination of XXY:
when -0.01 < PC_chrX < 0.01 and PC_chrY ≥ 0.04, the result is determined as positive;
determination of XYY:
when PC_chrX > 3% and PC_chrY > 3% and the ratio of PC_chrY to PC_chrX is greater than 1.7, XYY positive is determined.
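The autosomal trisomy criteria can be sketched as a decision function. The stated thresholds leave some combinations uncovered (for example R exactly equal to 1.8, or PC_chri ≤ 0.03 with R ≥ 0.1); this sketch returns "undetermined" for those, which is an assumption rather than something the source specifies. The sex-chromosome rules are not included.

```python
def classify_trisomy(pc, r):
    """Apply the trisomy thresholds on PC_chri and R = |PC| / PF."""
    if pc > 0.03:
        if r > 1.8:
            return "trisomy positive, possible maternal or placental effect"
        if 0.2 < r < 1.8:
            return "trisomy positive"
        if 0.1 < r < 0.2:
            return "gray zone"
    elif r < 0.1:
        return "negative"
    # Combinations not covered by the stated criteria (assumption).
    return "undetermined"


calls = [
    classify_trisomy(0.05, 1.0),
    classify_trisomy(0.05, 2.5),
    classify_trisomy(0.05, 0.15),
    classify_trisomy(0.01, 0.05),
]
```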
In some system examples, the unit window is 100 kbp to 5 Mbp in length. The unit window length can be adjusted according to the sequencing depth, quality and the like.
In some system examples, when the calculated fetal nucleic acid percentage PF is less than 3.5%, the result is judged to be of low confidence and a new peripheral blood sample is collected.
In some system examples, the uncorrected frequency count is corrected by a linear regression model based on the GC value and same-batch systematic error.
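A linear-regression GC correction can be sketched as follows. The source does not give the regression form, so this sketch makes two assumptions: counts are regressed on window GC content by ordinary least squares, and the fitted GC trend is subtracted while the mean count is preserved. The batch-level systematic error term mentioned in the text is not modelled, and the function name is hypothetical.

```python
def gc_correct(counts, gc_values):
    """Remove a fitted linear GC trend from per-window counts.

    Fits count = a + b * gc by ordinary least squares and returns
    count - b * (gc - mean_gc), which keeps the overall mean unchanged.
    """
    if len(counts) != len(gc_values) or not counts:
        raise ValueError("counts and gc_values must be non-empty and equal length")
    n = len(counts)
    mean_gc = sum(gc_values) / n
    mean_count = sum(counts) / n
    sxx = sum((g - mean_gc) ** 2 for g in gc_values)
    if sxx == 0:
        # All windows share the same GC value: nothing to correct.
        return [float(c) for c in counts]
    sxy = sum((g - mean_gc) * (c - mean_count) for g, c in zip(gc_values, counts))
    slope = sxy / sxx
    return [c - slope * (g - mean_gc) for c, g in zip(counts, gc_values)]


# Toy windows whose counts rise linearly with GC content.
corrected = gc_correct([90, 100, 110], [0.3, 0.4, 0.5])
```

In this toy case the counts depend entirely on GC, so all three corrected values come out at about 100.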
In some system examples, p-set is less than or equal to 0.05. This is a common significance criterion. The value may be adjusted as desired.
In some examples of systems, the method of constructing a sequencing library comprises the steps of:
extracting free DNA from a pregnant woman peripheral blood sample;
performing end repair, A-tailing, adapter ligation and PCR amplification on the free DNA, wherein PCR amplification is carried out by one of the following 4 methods:
1) A gene-specific primer, a specific adapter and a specific barcode primer are added simultaneously, wherein the Tm of the gene-specific primer binding to the plasma free DNA template and the Tm of the specific adapter and specific barcode primer binding to the product amplified from the plasma free DNA template by the gene-specific primer differ by 2-6 ℃; during PCR, a high annealing temperature is first used so that the gene-specific primer amplifies to a set concentration, and a low annealing temperature is then used for amplification with the specific adapter and specific barcode primer, finally forming a complete library; or
2) A fusion primer is synthesized, i.e., a single primer contains a gene-specific module and a specific adapter or specific barcode primer module; the fusion primers serve as the upstream and downstream primers to amplify and enrich the plasma free DNA template, finally forming a complete library; or
3) An upstream gene-specific primer and the specific adapter form an upstream fusion primer, which is used together with a downstream gene-specific primer and a specific barcode primer; at the start of amplification a high annealing temperature is used, and the upstream fusion primer and the downstream gene-specific primer amplify the plasma free DNA template; a low annealing temperature is then used for amplification with the upstream fusion primer and the specific barcode primer, finally forming a complete library; or
4) A downstream gene-specific primer and the specific adapter form a downstream fusion primer, which is used together with an upstream gene-specific primer and a specific barcode primer; at the start of amplification a high annealing temperature is used, and the downstream fusion primer and the upstream gene-specific primer amplify the plasma free DNA template; a low annealing temperature is then used for amplification with the downstream fusion primer and the specific barcode primer, finally forming a complete library.
In some system examples, downstream gene specific primers and specific barcode primer T m The difference value of the values is 3-7 ℃; upstream Gene-specific primer and T of specific barcode primer m The difference in values is 3-7 ℃. Such amplification is more effective.
Fetal cell-free DNA is typically shorter than 200 bp, and in some examples of the system, fragments longer than 200 bp are removed after the cell-free DNA is obtained. This increases the proportion of fetal cell-free DNA.
The beneficial effects of the invention are as follows:
The system of the invention can detect fetal chromosomal aneuploidy, microdeletion/microduplication syndromes and single-gene dominant genetic diseases simultaneously. Compared with the prior art, detection of microdeletion/microduplication syndromes is achieved with almost no increase in detection cost or detection time.
The system of the invention uses one-step multiplex amplification and library construction to complete noninvasive detection of fetal single-gene dominant genetic diseases. Multiple single-gene genetic diseases can be detected at once, and in theory every single-gene dominant genetic disease with a defined mutation sequence (point mutations, indels, etc.) can be detected.
In the system of the invention, the whole-genome library and the multiplex one-step library are pooled for sequencing, that is, the two libraries of one sample share a single sample barcode. This avoids the situation in which a shortage of sample barcodes limits the number of samples per sequencing run and drives up sequencing cost. The scheme is simple to use and satisfies clinical requirements for turnaround time and practicality.
The system of the invention meets the requirements of fragment- and site-specific analysis once the sequencing depth of the single-gene disease regions reaches 1000x, with almost no increase in sequencing cost.
The system of the invention can markedly increase the fetal cell-free DNA fraction and reduce the probability of re-sampling, thereby reducing resource and cost consumption for all parties; it can even meet the testing needs of some pregnant women for whom conventional NIPT is not feasible.
The system of the invention uses an independently developed analysis algorithm. It markedly improves detection of microdeletions/microduplications, for which detection precision is ordinarily low. It also applies fetal-fraction enrichment together with a neural network to estimate the overall fetal DNA percentage, and the data show that detection resolution can be improved to 2 Mb.
Drawings
FIG. 1 is a gel electrophoresis diagram of amplified products of multiplex PCR using adapter and barcode primers after library construction is completed.
FIG. 2 is an electrophoretogram of the amplified product of the multiplex one-step system prior to sample optimization.
FIG. 3 is an electrophoretogram of the amplified product of the multiplex one-step system after sample optimization.
Detailed Description
The technical scheme of the invention is further described below by combining examples.
The following examples are described with respect to the MGI Tech (MGI) high-throughput sequencing platform. Other high-throughput sequencing platforms may of course be used.
Enrichment of fetal DNA:
Plasma cell-free DNA is extracted with magnetic beads following a conventional procedure, and the extracted cfDNA solution is then treated to remove large fragments. Fragment selection is performed on the product using prepared magnetic beads and buffer: a 2-step magnetic bead selection removes fragments longer than 200 bp from the plasma cfDNA while largely retaining the small fragments.
Comparison of different sequencing library construction methods:
After fetal DNA enrichment is complete, a portion of the resulting sample undergoes end repair, A-tailing, adapter ligation and PCR amplification (this process is designated the first part); the multiplex one-step library construction (designated the second part) may use a sample with or without fetal DNA enrichment. The first part carries out detection of aneuploidy and microdeletions/microduplications. The second part carries out detection of single-gene dominant genetic diseases.
Alternatively, the extracted plasma cfDNA is split into 2 portions. The first portion undergoes cell-free fetal DNA enrichment followed by end repair, A-tailing, adapter ligation and PCR amplification of the target fragments. The other portion enters the second-part multiplex one-step library construction directly, without cell-free fetal DNA enrichment; alternatively, the second part uses a sample enriched for cell-free fetal DNA as its template.
For the first part, enzyme, dNTPs, buffer and so on are added to the plasma cfDNA for end repair; the adapter and ligase are added to the repaired sample to ligate adapters to the target fragments, and the product is purified with magnetic beads to remove the enzyme mix and unligated adapters. The product of the previous step is then PCR-amplified using enzyme, dNTPs, buffer and specific primers. The amplified product is purified with magnetic beads to remove the enzyme mix and excess primers, and the product is quantified.
The second part is the multiplex one-step library construction. In the invention, a unique molecular identifier (UMI) of 6-8 bp is optionally added; a 6-8 bp UMI yields 4096-65536 distinct molecular tags (4^6 = 4096, 4^8 = 65536). Since the detection template is fetal cfDNA in maternal peripheral blood plasma, 4096-65536 distinct tags are enough for each target fragment to carry its own UMI. UMIs are introduced at both ends of the target fragment at the start of fetal cfDNA amplification, and the same UMI is carried through subsequent re-amplification of the target fragment, so that a single molecule is copied into thousands of molecules bearing the same UMI. In subsequent analysis, target sequences can be grouped by their UMI sequences; that is, the UMI helps identify errors introduced during amplification and sequencing.
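The tag counts quoted for 6-8 bp UMIs follow directly from the four-letter DNA alphabet. A short sketch (the function name `umi_diversity` is illustrative, not from the source):

```python
def umi_diversity(length_bp: int) -> int:
    """Number of distinct UMI sequences of a given length over A, C, G, T."""
    return 4 ** length_bp


for n in (6, 7, 8):
    print(f"{n} bp UMI: {umi_diversity(n)} distinct tags")
```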
First, gene sequences associated with the single-gene genetic diseases are downloaded, and primers are then designed against the mutation sites or the deleted or duplicated fragments, so that the designed primers amplify mutant and wild-type fragments simultaneously. That is, one solution provided by the invention is to amplify and enrich fragments containing the mutation sites (both wild-type and mutant) by the multiplex one-step method, subject the amplified products to high-throughput sequencing, and analyze the sequencing results. The second-part protocol is designed to detect single-gene dominant genetic diseases. For example, position c.1138 of the FGFR3 gene is normally GG (allele percentage 100%, homozygous); if the genotype mutates to GT, disease may result. From the sequencing data the inventors calculate the percentage of allele G, denoted A, and the percentage of the other, pathogenic allele, termed the pathogenic variant allele percentage and denoted B. When B is 3% or more, the pathogenic mutation is considered present. When B is between 1% and 3%, the result falls in the gray zone and the test must be repeated. When B is 1% or less, the mutation is judged absent.
In the second part, enzyme, dNTPs, buffer and multiplex primers are used to enrich specific fragments of the plasma cfDNA, going from plasma cfDNA to a finished library in a single experimental step. There are 3 implementations of this part:
First, a gene-specific primer, a specific adapter and a specific barcode primer are added simultaneously. The Tm for binding of the gene-specific primer to the plasma cfDNA template is set 4 ± 2 °C higher than the Tm for binding of the specific adapter and the specific barcode primer to the product amplified from the template by the gene-specific primer. The PCR first uses a high annealing temperature so that the gene-specific primer amplifies preferentially; after a few cycles (generally 6-8), a low annealing temperature is used so that the specific adapter and the specific barcode primer amplify, forming a complete library.
Second, fusion primers are synthesized, that is, each primer contains a gene-specific module and a specific adapter module or specific barcode primer module, forming an upstream primer and a downstream primer, which amplify and enrich the plasma cfDNA template, finally forming a complete library.
Third, the upstream gene-specific primer and the specific adapter form a fusion primer, while the downstream gene-specific primer and the specific barcode primer are 2 separate primers. At the start of amplification a high annealing temperature is used, so that the upstream fusion primer and the downstream gene-specific primer preferentially amplify the plasma cfDNA template; a low annealing temperature is then used, so that the upstream fusion primer and the specific barcode primer amplify. In this scheme the downstream gene-specific primer and the specific adapter may instead form the fusion primer, with the upstream gene-specific primer and the specific barcode primer as 2 separate primers. At the start of amplification a high annealing temperature is used, so that the downstream fusion primer and the upstream gene-specific primer preferentially amplify the plasma cfDNA template; a low annealing temperature is then used, so that the downstream fusion primer and the specific barcode primer amplify, finally forming a complete library.
In all of these implementations, a specific recognition tag of 4-8 bp is additionally added at the 5' end of the gene-specific primer. The tag is screened bioinformatically according to the following principles: 1) the tag does not occur in the amplified target gene; 2) its probability of occurrence in the human genome is low, preferably a sequence absent from the genome, otherwise the sequence with the lowest occurrence. Preferably, the specific recognition tag is 5 bp long. The tag allows first-part and second-part products to be distinguished and separated at the final sequencing stage, which is very convenient in practice.
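The two screening principles for the recognition tag can be sketched as a k-mer filter. This is an illustration under stated assumptions, not the inventors' screening tool: the function names are hypothetical, both strands are checked (an added assumption), and `background` stands in for whatever genome sequence the tags are ranked against.

```python
from itertools import product


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_occurrences(kmer: str, seq: str) -> int:
    """Count overlapping occurrences of kmer in seq."""
    count, start = 0, seq.find(kmer)
    while start != -1:
        count += 1
        start = seq.find(kmer, start + 1)
    return count


def screen_tags(targets, background, length=5):
    """Return candidate tags, rarest in the background first.

    Principle 1: drop any tag (or its reverse complement) found in a target amplicon.
    Principle 2: rank the remaining tags by how often they occur in the background.
    """
    ranked = []
    for letters in product("ACGT", repeat=length):
        tag = "".join(letters)
        rc = revcomp(tag)
        if any(tag in t or rc in t for t in targets):
            continue
        hits = count_occurrences(tag, background)
        if rc != tag:
            hits += count_occurrences(rc, background)
        ranked.append((hits, tag))
    ranked.sort()
    return [tag for _, tag in ranked]
```

With a real genome the background count would normally come from a precomputed k-mer table rather than repeated string scans; the scan is used here only to keep the sketch self-contained.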
Enrichment of the target fragments of the plasma cfDNA and library construction are completed by the multiplex one-step method; the amplified product is purified with magnetic beads to remove the enzyme mix and excess primers, and the product is then quantified.
The inventors also compared the results of the multiplex one-step system (the first implementation above) with those of a conventional PCR system.
Using plasma DNA sample 1, plasma DNA sample 2 and plasma DNA sample 3, the inventors performed multiplex PCR by the conventional method: the samples were first amplified with gene-specific primers, the amplified products were purified with magnetic beads to remove the enzyme mix and excess primers, and the multiplex PCR products were then amplified with adapter primers and barcode primers to complete library construction. Electrophoresis was performed after this step, as shown in FIG. 1. As can be seen from the gel, the amplified product is about 230 bp, and there is a primer dimer larger than 100 bp, at about 130 bp. This dimer is difficult to remove completely with 1.0 volume of magnetic beads during product purification, and primer dimer remaining in the sample affects sequencing, mainly as follows: 1) it affects quantification, so that the sequencing data yield deviates from expectation; 2) it affects other libraries, for example by taking up data output; 3) it affects sequencing quality.
The electrophoresis of the products initially obtained by the inventors with the multiplex one-step system on plasma DNA sample 1, plasma DNA sample 2 and plasma DNA sample 3 is shown in FIG. 2; a very large amount of non-specific sequence and primer dimer is visible. The inventors tested and optimized the system over multiple rounds. The final system was then used to amplify plasma DNA sample 1, plasma DNA sample 2 and plasma DNA sample 3. As shown in FIG. 3, the amplification product of the final system has a single target band of about 230 bp, and its primer dimer is smaller than 100 bp, at about 80 bp, which, as those skilled in the art know, is very easily removed with 1.0 volume of magnetic beads during product purification.
In each of the examples below, the specific procedure for plasma free DNA to linker ligation is as follows:
1. extraction of plasma free DNA
The extraction is carried out strictly according to the instructions of a magnetic-bead cell-free DNA extraction kit (e.g., IVD5432, Guangdong-Guangzhou medical device filing no. 20150062) with an elution volume of 65 µL. If the concentration of the extracted DNA is in the range 0.01-0.2 ng/µL, the sample can proceed to library construction. The DNA sample, dissolved in AE buffer, can be stored at 2-8 °C for up to 7 days or frozen at -20 ± 5 °C for up to 3 years. Transport should be at -20 ± 5 °C for no more than 7 days, and the extracted DNA sample should be frozen and thawed no more than 5 times.
2. Fetal DNA enrichment
Divide the obtained plasma cfDNA sample into 2 portions; take 50 µL into the enrichment procedure below, while the other portion goes directly into the multiplex one-step procedure;
2.1 Add 1.2 times the sample volume (60 µL) of magnetic beads A to the sample (50 µL), mix thoroughly, let stand at room temperature for 5 min, and place on a magnetic rack for 3 min until the solution is clear;
2.2 Transfer the supernatant to the corresponding well containing 1.0 times the original sample volume (50 µL) of magnetic beads B, mix thoroughly, let stand at room temperature for 5 min, and place on a magnetic rack for 3 min until the solution is clear; discard the supernatant;
2.3 Add 200 µL of 80% ethanol, pipette up and down 6 times against the side of the tube away from the bead pellet, leave on the magnetic rack until the solution is clear, and discard the supernatant;
2.4 Repeat step 2.3 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;
2.5 Add 27 µL of Elution Buffer, mix thoroughly, let stand at room temperature for 5 min, and place the centrifuge tube on a magnetic rack for 3 min until the solution is clear;
2.6 Transfer 25 µL of the supernatant to a new 200 µL PCR tube and store at 4 °C until use.
3. First part: end repair
3.1 Transfer an appropriate amount of the plasma cfDNA sample to be tested into a 0.2 mL PCR tube, add Nuclease-free Water to a total volume of 50 µL, mix thoroughly, and centrifuge briefly;
3.2 Add 10 µL of Endprep Mix end-repair reaction mix to the PCR tube from step 3.1, mix thoroughly, centrifuge briefly, and run on a PCR instrument with the following program:
Figure SMS_1
3.3 After the reaction, remove the PCR tube and centrifuge briefly.
4. Adapter ligation
4.1 Prepare the ligation reaction mix in a centrifuge tube, in the amount required for the assay, according to the proportions in the table:
Figure SMS_2
Mix thoroughly, then centrifuge briefly.
4.2 Add the prepared ligation reaction mix (35 µL) to the end-repair product of each sample, add 6 µL of Yeasen Adapter X (10 pmol/µL) to each sample, mix thoroughly, and centrifuge briefly. Run on a PCR instrument with the following program:
Figure SMS_3
Note: one Adapter X is added to each sample.
4.3 After the reaction, remove the PCR tube and centrifuge briefly.
4.4 Adapter ligation product purification
4.4.1 Add 80 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, let stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
4.4.2 Add 200 µL of 75% ethanol, pipette up and down 3 times, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
4.4.3 Repeat step 4.4.2 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;
4.4.4 Add 22 µL of Elution Buffer, mix thoroughly, let stand at room temperature for 5 min, and place the centrifuge tube on a magnetic rack for 3 min until the solution is clear;
4.4.5 Transfer 20 µL of the supernatant to a new 200 µL PCR tube and store at 4 °C until use.
The specific data analysis method is as follows:
The analysis method comprises the following steps:
Align the sequencing results to the human reference genome, build the set of uniquely aligned reads, and record the alignment position of each read;
Partition the reference genome into a number of first-level windows of unit window length, the unit window length being 100 kbp to 5 Mbp;
Count the uncorrected number of uniquely aligned reads in each window from the alignment position of each uniquely aligned read;
Correct the uncorrected read counts for GC content, using a linear regression model, and for batch systematic error;
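The windowing and GC-correction steps above can be sketched as follows. This is a minimal illustration assuming an ordinary least-squares fit of read count against window GC fraction; the function names are hypothetical, positions are taken as 0-based, and the batch systematic-error correction is not modeled because the source does not specify it.

```python
from statistics import mean


def bin_reads(positions, window_size, chrom_length):
    """Count uniquely aligned reads per fixed-size window (0-based positions)."""
    n_windows = (chrom_length + window_size - 1) // window_size
    counts = [0] * n_windows
    for pos in positions:
        counts[pos // window_size] += 1
    return counts


def gc_correct(counts, gc):
    """Remove the linear GC trend from window counts.

    Fits count = a + b * gc by least squares, then subtracts b * (gc - mean_gc)
    so that corrected counts keep the original overall mean.
    """
    mean_gc, mean_count = mean(gc), mean(counts)
    sxx = sum((x - mean_gc) ** 2 for x in gc)
    if sxx == 0:  # all windows share one GC value: nothing to correct
        return [float(c) for c in counts]
    slope = sum((x - mean_gc) * (y - mean_count) for x, y in zip(gc, counts)) / sxx
    return [y - slope * (x - mean_gc) for x, y in zip(gc, counts)]
```

In the example used for the test below, counts that rise linearly with GC are flattened to a constant, which is the intended effect of the correction.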
Prediction of the fetal nucleic acid percentage:
Estimate the fetal nucleic acid percentage PY from the Y chromosome:
Calculate the average depth of the Y chromosome, Depth Y = sum of the corrected read counts of all Y-chromosome windows / base length of the whole Y chromosome; the average Depth Y of normal male samples is recorded as Dmale and the average Depth Y of normal female samples as Dfemale; the average Y-chromosome depth of the sample to be tested is recorded as Dtest, and the fetal nucleic acid percentage of the sample to be tested is PY = (Dtest - Dfemale)/(Dmale - Dfemale);
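The Y-chromosome estimate can be sketched in a few lines. Because the formula is garbled in the source text, this sketch assumes the reading PY = (Dtest - Dfemale)/(Dmale - Dfemale), that is, the sample's Y depth placed on a scale from the female baseline (0) to the male reference (1). The function names and example depths are hypothetical.

```python
def y_depth(corrected_counts, chr_y_length):
    """Average Y depth: sum of corrected window counts over the Y chromosome length."""
    return sum(corrected_counts) / chr_y_length


def fetal_fraction_from_y(d_test, d_male, d_female):
    """PY under the assumed reading (Dtest - Dfemale) / (Dmale - Dfemale)."""
    if d_male == d_female:
        raise ValueError("male and female reference depths must differ")
    return (d_test - d_female) / (d_male - d_female)


# Illustrative depths only: a sample 10% of the way from female to male baseline.
print(fetal_fraction_from_y(d_test=0.1, d_male=1.0, d_female=0.0))
```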
Perform principal component analysis, an unsupervised learning method, on known normal samples to obtain the principal component set PCs;
Take the PCs as the input set and introduce an artificial neural network with 2 hidden layers, trained by resilient backpropagation with weight backtracking and with a loss function based on the sum of squared residuals; re-weight the PCs using a sample set labeled with the known Y-chromosome-derived fetal nucleic acid percentage PY, thereby constructing a weight-principal component-PY neural network model;
Predict the fetal nucleic acid percentage PF: convert the corrected window read counts of the chromosomes of the sample to be tested into the principal component set PCs by principal component analysis, and output the neuron value PF from the weight-principal component-PY neural network model; when the calculated fetal nucleic acid percentage PF is below 3.5%, the result is judged unreliable and a new peripheral blood sample must be collected;
Determination of the abnormality percentage PA of microdeletion/microduplication fragments:
Set a breakpoint position at random and, for the corrected read counts of the windows to the left and right of the breakpoint, calculate a significance level p-value by a statistical test; if the calculated p-value is below the preset significance level p-set of 0.05, select the window as a candidate microdeletion/microduplication window; if the calculated p-value is above the preset p-set of 0.05, continue merging windows and test again until a candidate microdeletion/microduplication window is obtained;
Merge adjacent candidate microdeletion/microduplication windows to obtain the selected microdeletion/microduplication region;
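The source does not name the statistical test used at a breakpoint. As one possible illustration, the sketch below compares the corrected window counts on each side with a two-sample z-test (Welch-style standard error, normal approximation); this choice of test, the function names and the example counts are all assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def breakpoint_p_value(counts, breakpoint):
    """Two-sided p-value for a difference in mean count left vs right of a breakpoint."""
    left, right = counts[:breakpoint], counts[breakpoint:]
    if len(left) < 2 or len(right) < 2:
        raise ValueError("need at least 2 windows on each side")
    se = sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0:  # no spread at all: any difference in means is decisive
        return 1.0 if mean(left) == mean(right) else 0.0
    z = (mean(left) - mean(right)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_candidate(counts, breakpoint, p_set=0.05):
    """Apply the p-set = 0.05 rule from the text to one breakpoint."""
    return breakpoint_p_value(counts, breakpoint) < p_set
```

For small numbers of windows a t-distribution or a permutation test would be more appropriate than the normal approximation used here.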
Calculate the depth, depth(abnormal), of each microdeletion/microduplication region: depth(abnormal) = corrected read count falling within the microdeletion/microduplication region / size of the microdeletion/microduplication region;
Likewise calculate the average depth, depth(normal), of the other regions of the chromosome that do not contain the microdeletion/microduplication: depth(normal) = corrected read count falling on the remaining regions of the chromosome that do not contain the microdeletion/microduplication / size of those remaining regions;
Calculate the ratio R of the microdeletion/microduplication abnormality percentage PA to the fetal nucleic acid percentage PF: R = PA/PF;
Judge from the ratio R whether the sample is positive or negative for microdeletion or microduplication variation:
If depth(normal) - depth(abnormal) of the region is positive, the region is provisionally judged to be a microdeletion, and negative signals are then filtered out using the R value to confirm the final microdeletion-positive variation:
if R > 5, the positive signal may originate from the mother, and whether the fetus carries the variation cannot be predicted;
if 0.8 ≤ R ≤ 5, the variation is judged positive;
if R ≤ 0.5, the variation is judged negative;
if 0.5 < R < 0.8, the result falls in the gray zone and re-testing is required;
If depth(normal) - depth(abnormal) of the region is negative, the region is provisionally judged to be a microduplication, and negative signals are then filtered out using the R value to confirm the final microduplication-positive variation:
if R > 5, the positive signal may originate from the mother, and whether the fetus carries the variation cannot be predicted;
if 0.8 ≤ R ≤ 5, the variation is judged positive;
if R ≤ 0.5, the variation is judged negative;
if 0.5 < R < 0.8, the result falls in the gray zone and re-testing is required;
When depth(normal) - depth(abnormal) is 0, the region is judged normal;
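The decision rules above translate into a small classifier. In this sketch the abnormality percentage PA is taken as an input, since the text does not give its formula; the function names and return labels are illustrative.

```python
def region_depth(corrected_count, region_size):
    """Depth of a region: corrected read count divided by region size."""
    return corrected_count / region_size


def classify_cnv(depth_normal, depth_abnormal, pa, pf):
    """Return (variant type, call) following the R = PA / PF thresholds in the text."""
    diff = depth_normal - depth_abnormal
    if diff == 0:
        return ("normal", None)
    kind = "microdeletion" if diff > 0 else "microduplication"
    r = pa / pf
    if r > 5:
        call = "possibly maternal"   # fetal status cannot be predicted
    elif r >= 0.8:
        call = "positive"
    elif r <= 0.5:
        call = "negative"
    else:                            # 0.5 < R < 0.8
        call = "gray zone"
    return (kind, call)
```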
Chromosome copy number variation determination:
Obtain the chromosome copy number variation percentage PC_chri (i = 1, 2, ..., 22, X, Y), by the following steps:
Calculate the coverage ratio Coverage_chri (i = 1, 2, ..., 22, X, Y) of each chromosome: Coverage_chri = sum of the corrected window read counts falling on the chromosome / the window read count falling on the chromosome;
For known normal samples, calculate the coverage ratio Coverage-normal_chri (i = 1, 2, ..., 22, X, Y) of each chromosome, where Coverage-normal_chri = sum of the corrected window read counts falling on the chromosome / the window read count falling on the chromosome, and average the coverage ratio of each chromosome over all samples to obtain Average(Coverage-normal_chri) (i = 1, 2, ..., 22, X, Y);
Calculate the copy number variation percentage PC_chri of each chromosome of the sample to be tested: PC_chri = 2 × (Coverage_chri - Average(Coverage-normal_chri)) / Average(Coverage-normal_chri);
Calculate the ratio R of the copy number variation percentage PC of each chromosome of the sample to be tested to the fetal nucleic acid percentage PF: R = |PC|/PF;
Trisomy determination:
if PC_chri > 0.03 and R > 1.8, chri trisomy positive is reported, with a note of a possible maternal or placental effect;
if PC_chri > 0.03 and 0.2 < R < 1.8, the result is judged chri trisomy positive;
if PC_chri > 0.03 and 0.1 < R < 0.2, the result falls in the gray zone and re-testing is required;
if PC_chri ≤ 0.03 and R < 0.1, the result is reported as negative;
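The PC_chri formula and the trisomy thresholds can be written out directly. A minimal sketch with hypothetical function names; combinations the text does not cover (for example R exactly equal to 1.8) are returned as "undetermined" rather than guessed.

```python
def copy_number_percentage(coverage, mean_normal_coverage):
    """PC_chri = 2 * (Coverage - Average(Coverage-normal)) / Average(Coverage-normal)."""
    return 2 * (coverage - mean_normal_coverage) / mean_normal_coverage


def classify_trisomy(pc, pf):
    """Apply the trisomy rules to one chromosome, with R = |PC| / PF."""
    r = abs(pc) / pf
    if pc > 0.03:
        if r > 1.8:
            return "positive, possible maternal or placental effect"
        if 0.2 < r < 1.8:
            return "positive"
        if 0.1 < r < 0.2:
            return "gray zone"
    elif r < 0.1:
        return "negative"
    return "undetermined"  # threshold combinations not specified in the text
```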
Determination of XO:
when PC_chrX ≤ -0.03, -0.01 < PC_chrY < 0.01, and R (i.e., |PC_chrX|/PF) ≥ 0.8, the result is judged positive;
when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is judged negative;
Determination of XXX:
when PC_chrX > 0.03, -0.01 < PC_chrY < 0.01, and R (i.e., |PC_chrX|/PF) ≥ 0.8, the result is judged positive;
when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is judged negative;
Determination of XXY:
when -0.01 < PC_chrX < 0.01 and PC_chrY ≥ 0.04, the result is judged positive;
Determination of XYY:
when PC_chrX > 3%, PC_chrY > 3%, and the ratio of PC_chrY to PC_chrX is greater than 1.7, the result is judged XYY positive;
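The sex-chromosome rules can be collected into one function. This is a sketch with illustrative labels; the order in which the rules are tried is an assumption (the stated conditions are mutually exclusive), and the "negative" label corresponds to the shared negative rule given for XO and XXX.

```python
def classify_sex_chromosomes(pc_x, pc_y, pf):
    """Apply the XO / XXX / XXY / XYY rules, with R = |PC_chrX| / PF."""
    r = abs(pc_x) / pf
    y_flat = -0.01 < pc_y < 0.01
    if pc_x <= -0.03 and y_flat and r >= 0.8:
        return "XO positive"
    if pc_x > 0.03 and y_flat and r >= 0.8:
        return "XXX positive"
    if -0.01 < pc_x < 0.01 and pc_y >= 0.04:
        return "XXY positive"
    if pc_x > 0.03 and pc_y > 0.03 and pc_y / pc_x > 1.7:
        return "XYY positive"
    if -0.03 < pc_x < 0.03 and y_flat and r < 0.3:
        return "negative"
    return "undetermined"  # combinations not covered by the stated rules
```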
Prediction of single-gene genetic disease:
Align the reads derived from the same cfDNA molecule base by base, starting the count at the first base position of the read set: if only 30% or fewer of the reads show a given base at that position, that base is regarded as background noise; if 70% or more of the reads show the same base at that position, that base is confirmed as the base at the position; if only 30%-70% of the reads show the same base, the position is designated an N base;
Perform the same statistics at the second base position of the reads, and so on to the last position, to obtain the base sequence of the reads derived from the same cfDNA molecule;
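The per-position consensus rule for a read family can be sketched as below. The sketch assumes the reads of one family are already grouped (for example by UMI) and aligned column by column; positions where no base reaches 70% are emitted as N, which also covers the case where every base is at or below 30% (an assumption, since the text does not say what to emit then). Function names are illustrative.

```python
from collections import Counter


def consensus_base(bases, confirm=0.7):
    """Return the majority base if it reaches the confirm fraction, else 'N'.

    Minority bases are treated as background noise and simply discarded.
    """
    base, count = Counter(bases).most_common(1)[0]
    return base if count / len(bases) >= confirm else "N"


def consensus_sequence(reads):
    """Collapse reads from one cfDNA molecule into a single consensus sequence."""
    length = min(len(read) for read in reads)
    return "".join(consensus_base([read[i] for read in reads]) for i in range(length))
```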
Align the read base sequences of the sequenced cfDNA molecules to the human reference genome using the bwa aln algorithm;
Examine the base at each covered position, and count the depth and depth percentage of A, T, G, C, N, insertions and deletions at each position;
Select a single-gene disease position and check its total coverage depth: if the total depth is below 1000X, quality control fails; if the total depth is above 1000X, quality control passes;
Locate the defined pathogenic mutation sites to be examined: if the depth percentage of the pathogenic mutation is above 3%, the mutation is considered present; if the depth percentage of the pathogenic mutation is 1%-3%, the result falls in the gray zone and re-testing is required; if the depth percentage of the pathogenic mutation is below 1%, the mutation is judged absent.
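The depth quality control and the 1%/3% decision thresholds can be combined in one function. This is a sketch only: the function name and the input layout (a per-allele depth dictionary for one site) are assumptions, a depth of exactly 1000X is treated as passing, and boundary values follow the wording used earlier in the description (3% or more present, 1% or less absent).

```python
def call_pathogenic_variant(depths, pathogenic_allele, min_depth=1000):
    """Classify one site from per-allele depths, e.g. {"G": 1900, "A": 100}."""
    total = sum(depths.values())
    if total < min_depth:
        return "QC fail"
    pct = 100.0 * depths.get(pathogenic_allele, 0) / total
    if pct >= 3.0:
        return "mutation present"
    if pct > 1.0:
        return "gray zone, re-test"
    return "mutation absent"


# Illustrative depths only.
print(call_pathogenic_variant({"G": 1900, "A": 100}, "A"))
```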
Example 1:
When a pregnant woman is known to have a single-gene dominant disease, the genotype of the fetus needs to be determined.
After plasma cfDNA extraction, fetal DNA enrichment, and the first-part end repair and adapter ligation have been carried out according to the steps above, the following operations are performed:
5. Multiplex one-step amplification
5.1 Prepare the reaction mix in a 200 µL PCR tube according to the proportions in the table below.
Figure SMS_4
Mix thoroughly, then centrifuge briefly.
Note: the same barcode is used for the same sample.
5.2 Add 2 µL of plasma cfDNA to the prepared reaction mix, mix thoroughly, and centrifuge briefly. Run on a PCR instrument with the following program:
Figure SMS_5
After the reaction, remove the PCR tube and centrifuge briefly.
5.3 Multiplex one-step product purification
5.3.1 Add 25 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, let stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
5.3.2 Add 200 µL of 75% ethanol, pipette up and down 3 times, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
Figure SMS_6
5.3.3 Repeat step 5.3.2 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;
5.3.4 Add 17 µL of Elution Buffer, mix thoroughly, let stand at room temperature for 5 min, and place the centrifuge tube on a magnetic rack for 3 min until the solution is clear;
5.3.5 Transfer 15 µL of the supernatant to a new 200 µL PCR tube and store at 4 °C until use.
6. YS-PCR
6.1 Prepare the PCR reaction mix in a 200 µL PCR tube, in the amount required for the assay, according to the proportions in the table below;
Figure SMS_7
Mix thoroughly, then centrifuge briefly.
6.2 Add 30 µL of the prepared PCR reaction mix to the adapter ligation product obtained in 4.4.5, mix thoroughly, and centrifuge briefly. Run on a PCR instrument with the following program:
Figure SMS_8
After the reaction, remove the PCR tube and centrifuge briefly.
6.3 YS-PCR product purification
6.3.1 Add 50 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, let stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
6.3.2 Add 200 µL of 75% ethanol, pipette up and down 3 times, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
6.3.3 Repeat step 6.3.2 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;
6.3.4 Add 17 µL of Elution Buffer, mix thoroughly, let stand at room temperature for 5 min, and place the centrifuge tube on a magnetic rack for 3 min until the solution is clear;
6.3.5 Transfer 32 µL of the supernatant to a new 200 µL PCR tube and store at 4 °C until use.
7. Library quantification
The samples from 5.3.5 and 6.3.5 are equilibrated to room temperature and quantified by Qubit. The sample from 5.3.5 and the sample from 6.3.5 are then mixed at a ratio of 1:12000.
8. Single stranded circularized library construction
8.1 denaturation
8.1.1 Based on the fragment length from the previous step, transfer 1 pmol of DNA into a 0.2 mL PCR tube and make up to 34 µL with ddH2O.
8.1.2 Prepare the denaturation reaction mix according to the following table:
Figure SMS_9
Mix thoroughly, then centrifuge briefly.
8.1.3 The PCR tube was placed on a PCR instrument and reacted under the following conditions:
Figure SMS_10
immediately after the reaction was completed, the PCR tube was transferred to ice and left to stand for 2 min.
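Taking 1 pmol of library "according to the fragment length" amounts to a mass conversion. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names, the 230 bp length (the amplicon size reported above, used here only for illustration) and the example concentration are assumptions.

```python
def dsdna_mass_ng(pmol, length_bp, avg_bp_mass=660.0):
    """Mass in ng of a given pmol amount of dsDNA of the given length.

    1 pmol of an N-bp duplex weighs about N * 660 pg, i.e. N * 0.66 ng.
    """
    return pmol * length_bp * avg_bp_mass / 1000.0


def input_volume_ul(pmol, length_bp, conc_ng_per_ul):
    """Volume of library to pipette in order to obtain the requested pmol amount."""
    return dsdna_mass_ng(pmol, length_bp) / conc_ng_per_ul


# Illustration: 1 pmol of a 230 bp library at a hypothetical 10 ng/uL.
print(dsdna_mass_ng(1, 230), input_volume_ul(1, 230, 10.0))
```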
8.2 Single-strand circularization
8.2.1 Prepare the single-strand circularization reaction mix on ice according to the following table:
Figure SMS_11
Mix by gentle shaking, then centrifuge briefly to collect the reaction mix at the bottom of the tube.
8.2.2 Place the PCR tube on a PCR instrument and run under the following conditions:
Figure SMS_12
After the reaction, proceed directly to the next step.
8.3 Enzymatic digestion
8.3.1 Prepare the digestion reaction mix on ice according to the following table:
Figure SMS_13
Mix by gentle shaking, then centrifuge briefly to collect the reaction mix at the bottom of the tube.
8.3.2 Place the PCR tube on a PCR instrument and run under the following conditions:
Figure SMS_14
After the reaction, centrifuge briefly and proceed immediately to purification.
8.4 Digestion product purification
8.4.1 Add 120 μL of Hieff NGS DNA Selection Beads to the digestion product from 8.3.2, mix by vortexing or pipetting, and incubate at room temperature for 10 min;
8.4.2 Centrifuge the PCR tube briefly and place it on a magnetic rack to separate the beads from the liquid; once the solution is clear (about 2 min), carefully remove the supernatant;
8.4.3 Keeping the PCR tube on the magnetic rack, add 200 μL of freshly prepared 80% ethanol to rinse the beads, incubate at room temperature for 30 sec, and carefully remove the supernatant;
8.4.4 Repeat step 8.4.3 for a total of two rinses;
8.4.5 Keeping the PCR tube on the magnetic rack, open the cap and air-dry the beads until cracks just begin to appear;
8.4.6 Remove the PCR tube from the magnetic rack, add 22 μL of TE Buffer, mix thoroughly by vortexing or gentle pipetting, and let stand at room temperature for 10 min;
8.4.7 Centrifuge briefly, place the PCR tube on the magnetic rack, and once the solution is clear (about 2 min), carefully transfer the supernatant to a new PCR tube.
Stopping point: the purified circularized product can be stored at -20 ℃ for one month.
8.5 Digestion product quality control
The digested product was quantified using the Qubit ssDNA Assay Kit fluorescent reagent.
9. On-machine sequencing
Libraries that pass quality control are sequenced on the instrument according to the sequencing protocol.
10. Data analysis
Fetal chromosomal aneuploidy and microdeletions were analyzed, with the main focus on single-gene dominant genetic disease gene detection for the mother.
The analysis results were as follows:
T21 detection example
Figure SMS_15
T18 detection example
Figure SMS_16
T13 detection example
Figure SMS_17
Example 2: realizing aneuploidy+microdeletion microreplication detection
Extracting plasma free DNA, fetal DNA enrichment, first fraction according to the above steps: after the terminal repair and the joint connection, the following operations are further performed:
5.1 preparing PCR reaction mixed solution with the amount required for detection in a 200 mu L PCR tube according to the proportion of the table below;
Figure SMS_18
and (5) after fully and uniformly mixing, carrying out instantaneous centrifugation.
5.2 adding the prepared 30 mu L PCR reaction mixed solution into the joint product obtained in 3.4.5, fully and uniformly mixing, and then carrying out instantaneous centrifugation. The reaction was performed on a PCR instrument according to the following procedure:
Figure SMS_19
after the reaction, the PCR tube was taken out and centrifuged instantaneously.
5.3 YS-PCR product purification
5.3.1 Add 50 μL of Yeasen XP magnetic beads to each sample, mix thoroughly, let stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
5.3.2 Add 200 μL of 75% ethanol, pipette up and down 3 times, place on the magnetic rack for 3 min until the solution is clear, and discard the supernatant;
5.3.3 Repeat step 2.2.4, let stand at room temperature for 3 min to air-dry, and remove the centrifuge tube from the magnetic rack;
5.3.4 Add 17 μL of the Solution Buffer, mix thoroughly, let stand at room temperature for 5 min, then place the centrifuge tube on the magnetic rack for 3 min until the solution is clear;
5.3.5 Transfer 32 μL of the supernatant to a new 200 μL PCR tube and store at 4 ℃ until use.
6. Library quantification
Samples from 5.3.5 were taken out and equilibrated to room temperature for qkit testing.
7. Single-stranded circularized library construction
7.1 Denaturation
7.1.1 According to the fragment length from the previous step, transfer 1 pmol of DNA into a 0.2 mL PCR tube and bring the volume to 34 μL with ddH2O.
7.1.2 Prepare the denaturation reaction mixture according to the following table:
Figure SMS_20
After thorough mixing, centrifuge briefly.
7.1.3 Place the PCR tube on a PCR instrument and run the reaction under the following conditions:
Figure SMS_21
Immediately after the reaction is complete, transfer the PCR tube to ice and let it stand for 2 min.
7.2 Single-strand circularization
7.2.1 Prepare the single-strand circularization reaction solution on ice according to the following table:
Figure SMS_22
Mix by shaking at low speed, then centrifuge briefly to collect the reaction solution at the bottom of the tube.
7.2.2 Place the PCR tube on a PCR instrument and run the reaction under the following conditions:
Figure SMS_23
After the reaction is complete, proceed to the next step.
7.3 Enzymatic digestion
7.3.1 Prepare the digestion reaction system on ice according to the following table:
Figure SMS_24
Mix by shaking at low speed, then centrifuge briefly to collect the reaction solution at the bottom of the tube.
7.3.2 Place the PCR tube on a PCR instrument and run the reaction under the following conditions:
Figure SMS_25
After the reaction, centrifuge briefly and purify immediately.
7.4 Digestion product purification
7.4.1 Add 120 μL of Hieff NGS DNA Selection Beads to the digestion product from 7.3.2, mix by vortexing or pipetting, and incubate at room temperature for 10 min;
7.4.2 Centrifuge the PCR tube briefly and place it on a magnetic rack to separate the beads from the liquid; once the solution is clear (about 2 min), carefully remove the supernatant;
7.4.3 Keeping the PCR tube on the magnetic rack, add 200 μL of freshly prepared 80% ethanol to rinse the beads, incubate at room temperature for 30 sec, and carefully remove the supernatant;
7.4.4 Repeat step 7.4.3 for a total of two rinses;
7.4.5 Keeping the PCR tube on the magnetic rack, open the cap and air-dry the beads until cracks just begin to appear;
7.4.6 Remove the PCR tube from the magnetic rack, add 22 μL of TE Buffer, mix thoroughly by vortexing or gentle pipetting, and let stand at room temperature for 10 min;
7.4.7 Centrifuge briefly, place the PCR tube on the magnetic rack, and once the solution is clear (about 2 min), carefully transfer the supernatant to a new PCR tube.
Stopping point: the purified circularized product can be stored at -20 ℃ for one month.
7.5 Digestion product quality control
The digested product was quantified using the Qubit ssDNA Assay Kit fluorescent reagent.
8. Sequencing on machine
Libraries that pass quality control are sequenced on the instrument according to the sequencing protocol.
9. Data analysis
Two microdeletion detection examples:
Figure SMS_26
Trisomy detection examples: T13, T21, T18
Figure SMS_27
XO, XXX, XXY, XYY detection example
Figure SMS_28
Example 3: detection of aneuploidy + microdeletion + achondroplasia
After plasma cell-free DNA extraction, fetal DNA enrichment, and Part 1 (end repair and adapter ligation) were completed according to the steps above, the following operations were performed:
5. Multiplex one-step amplification
5.1 Prepare the reaction mixture in a 200 μL PCR tube according to the proportions in the table below.
Figure SMS_29
After thorough mixing, centrifuge briefly.
Note: use the same barcode for the same sample.
5.2 Add 2 μL of plasma cell-free DNA to the prepared reaction mixture, mix thoroughly, and centrifuge briefly. Run the reaction on a PCR instrument according to the following program:
Figure SMS_30
After the reaction, remove the PCR tube and centrifuge briefly.
5.3 Purification of the multiplex one-step amplification product
5.3.1 Add 25 μL of Yeasen XP magnetic beads to each sample, mix thoroughly, let stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;
5.3.2 Add 200 μL of 75% ethanol, pipette up and down 3 times, place on the magnetic rack for 3 min until the solution is clear, and discard the supernatant;
5.3.3 Repeat step 2.2.4, let stand at room temperature for 3 min to air-dry, and remove the centrifuge tube from the magnetic rack;
5.3.4 Add 17 μL of the Solution Buffer, mix thoroughly, let stand at room temperature for 5 min, then place the centrifuge tube on the magnetic rack for 3 min until the solution is clear;
5.3.5 Transfer 15 μL of the supernatant to a new 200 μL PCR tube and store at 4 ℃ until use.
6. YS-PCR
See the YS-PCR procedure of Example 1.
7. Library mixing
7.1 Take out the sample from 4.3.5 and equilibrate it to room temperature for qkit detection.
7.2 Take out the sample from 5.3.5 and equilibrate it to room temperature for qkit detection.
7.3 According to the detection results of 7.1 and 7.2, mix the 5.3.5 sample and the 4.3.5 sample at a ratio of 2000:1. Serially dilute the 4.3.5 sample first if necessary.
8. Single-stranded circularized library construction
See the single-stranded circularized library construction procedure of Example 1.
9. Sequencing on machine
Libraries that pass quality control are sequenced on the instrument according to the sequencing protocol.
10. Data analysis
Figure SMS_31
Results: 1T 21 positive was detected in 20 samples, and other abnormalities were not detected.
Example 4: realizing detection of aneuploid, microdeletion microreplication and single gene dominant genetic disease
Extracting plasma free DNA, fetal DNA enrichment, first fraction according to the above steps: after the terminal repair and the joint connection, the following operations are further performed:
5. multiplex one-step amplification
Reference is made to the multiplex one-step amplification procedure of example 1.
6 YS-PCR
Reference is made to the YS-PCR procedure of example 1.
7 library mix
7.1 the sample in 5.3.5 was taken out and equilibrated to room temperature for Qkit detection.
7.2 the sample in 6.3.5 was taken out and equilibrated to room temperature for qkit detection.
7.3 according to the detection results of 5.3.5 and 6.3.5, 5.3.5 samples and 6.3.5 samples were mixed in a ratio of 1:1000. Samples 5.3.5 were subjected to gradient dilution if necessary.
8 Single Strand cyclization library construction
Reference is made to the single stranded circularized library construction procedure of example 1.
9. Sequencing on machine
And (5) performing on-machine sequencing on the library with qualified quality control according to the on-machine sequencing protocol.
10. Data analysis
Monogenic genetic disease analysis procedure
1. After sequencing, different samples are distinguished by index. Within each sample, the sequenced molecules are first corrected using UMIs.
a) Each individual cfDNA molecule is ligated to a specific UMI (unique molecular index). All reads are split by UMI sequence, and reads sharing the same UMI are grouped together, meaning that the group of reads derives from the same cfDNA molecule.
b) Reads from the same cfDNA molecule are compared position by position. Starting from the first base position of the group: if ≤30% of the reads carry a given base at that position, that base is treated as background noise, since the whole group derives from a single cfDNA molecule; if ≥70% of the reads carry the same base at that position, the base type of that position is confirmed; if only 30%-70% of the reads carry the same base, the position is designated an N base (no call).
c) The same statistics are then applied at the second base position, and so on until the last position of the group, yielding the base sequence of the reads derived from the same cfDNA molecule.
2. The corrected read sequences of the cfDNA molecules are aligned to chr4 of hg19 using the bwa aln algorithm.
3. The bases at each covered site on hg19 are examined with samtools, using mpileup. The depth and depth ratio of A, T, G, C, N, insertions, and deletions are counted at each site.
4. The position of the monogenic disease is selected and its total coverage depth is checked. If the total depth is less than 1000X, quality control fails; if the total depth is greater than 1000X, quality control passes.
6. The pathogenic mutation sites that explicitly need to be examined are located. If the depth percentage of the pathogenic mutation is greater than 3%, the mutation is considered present. If the depth percentage is 1%-3%, the result falls in the gray zone and re-testing is required. If the depth percentage is below 1%, the mutation is judged absent.
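The UMI correction in step 1 and the depth-based call in steps 4 and 6 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function names (`group_by_umi`, `consensus`, `classify_site`) and the input layout are hypothetical, reads within a UMI group are assumed to be of equal length, a position whose best-supported base falls below 70% is reported as N, and a total depth of exactly 1000X is assumed to pass quality control because the text does not specify that boundary.

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group (umi, sequence) pairs so each group holds reads from one cfDNA molecule."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)


def consensus(seqs, confirm=0.70):
    """Per-position vote: keep a base backed by >= 70% of reads, otherwise emit 'N'.

    Bases supported by <= 30% of reads never reach the threshold and are
    therefore dropped as background noise.
    """
    called = []
    for column in zip(*seqs):  # assumption: equal-length reads; zip stops at the shortest
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(column) >= confirm else "N")
    return "".join(called)


def classify_site(depths, mutant, min_depth=1000):
    """depths: per-allele depth at one site; mutant: key of the pathogenic allele."""
    total = sum(depths.values())
    if total < min_depth:  # assumption: exactly 1000X passes (boundary not stated in the text)
        return "QC fail"
    fraction = depths.get(mutant, 0) / total
    if fraction > 0.03:
        return "mutation present"
    if fraction >= 0.01:
        return "gray zone"
    return "mutation absent"
```

For example, four reads `ACGT, ACGT, ACGA, ACGT` sharing one UMI collapse to `ACGT` (the last position is supported by 75% of reads), while two reads `TTGA, TTGC` collapse to `TTGN`.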
The analysis results are shown in the following table:
Figure SMS_32
The above is a further detailed description of the present invention and should not be taken as limiting the practice of the present invention. Simple deductions or substitutions made by those skilled in the art without departing from the concept of the present invention fall within the scope of the present invention.

Claims (10)

1. A system for noninvasive prenatal detection of a fetus comprising a sequencing device, a data analysis device, and a result output device, wherein:
the sequencing device is used for determining sequence information of a free DNA sequencing library in peripheral blood of the pregnant woman to obtain a sequencing result;
the result output device is used for outputting the analysis result of the data analysis device;
the analysis method of the data analysis device is characterized by comprising the following steps:
aligning the sequencing result to a human reference genome, constructing a set of uniquely aligned sequencing reads, and recording the alignment position of each sequencing read;
dividing the reference genome into a plurality of primary windows according to a unit window length;
counting the uncorrected frequency number of uniquely aligned reads in each window according to the alignment position of each uniquely aligned read;
correcting the uncorrected frequency number according to the GC value to obtain the corrected frequency number;
prediction of fetal nucleic acid percentages:
estimating fetal nucleic acid percentage PY from the Y chromosome:
calculating the average Y chromosome depth Depth Y, where Depth Y = sum of the corrected frequency numbers of all windows of the Y chromosome / total base length of the Y chromosome; the average Y chromosome depth of normal males is recorded as Dmale, and the average Y chromosome depth of normal females is recorded as Dfemale; the average Y chromosome depth of the sample to be detected is recorded as Dtest, and the fetal nucleic acid percentage of the sample to be detected is PY = [(Dtest - Dfemale)/Dmale] - Dfemale;
performing principal component analysis on known normal samples by an unsupervised learning method to obtain a principal component set PCs;
selecting the principal component set PCs, introducing an artificial neural network, and re-weighting the PCs using a sample set labeled with the known Y-chromosome-based fetal nucleic acid percentage PY to construct a weight-principal component-PY neural network model;
predicting the fetal nucleic acid percentage PF: converting the corrected frequency numbers of the chromosome windows of the sample to be detected into the principal component set PCs by principal component analysis, and then outputting the neuron value PF according to the weight-principal component-PY neural network model;
determination of the abnormality percentage PA of microdeletion/microduplication fragments:
randomly setting a breakpoint position, calculating a significance level p-value by a mathematical test method on the corrected frequency numbers of the windows on the left and right sides of the breakpoint position, and, if the calculated p-value is smaller than a preset significance level p-set, selecting the window as a candidate microdeletion/microduplication window; if the calculated p-value is larger than the preset significance level p-set, continuing to merge windows for the next test until a candidate microdeletion/microduplication window is obtained;
merging adjacent candidate microdeletion/microduplication windows to obtain the selected microdeletion/microduplication region;
calculating the depth, depth(abnormal), of each microdeletion/microduplication region: depth(abnormal) = corrected frequency number falling within the microdeletion/microduplication region / size of the microdeletion/microduplication region;
likewise calculating the average depth, depth(normal), of the other regions of the chromosome that do not contain the microdeletion/microduplication: depth(normal) = corrected frequency number falling on the remaining regions of the chromosome that do not contain the microdeletion/microduplication / size of the remaining regions that do not contain the microdeletion/microduplication;
calculating the abnormality percentage PA of the region containing the microdeletion/microduplication or the whole-chromosome copy number variation: PA = 2 × |depth(normal) - depth(abnormal)| / depth(normal); when depth(normal) - depth(abnormal) is positive, the variation is preliminarily identified as a microdeletion; when depth(normal) - depth(abnormal) is negative, it is preliminarily identified as a microduplication; when depth(normal) - depth(abnormal) is 0, it is judged to be normal;
chromosome copy number variation determination:
obtaining the chromosome copy number variation percentage PC_chri (i = 1, 2, …, 22, X, Y), comprising the following steps:
calculating the proportion Coverage_chri (i = 1, 2, …, X, Y) of each chromosome, where Coverage_chri = sum of the corrected window frequency numbers falling on that chromosome / number of window frequencies falling on that chromosome;
calculating the proportion Coverage-normal_chri (i = 1, 2, …, X, Y) of each chromosome based on known normal samples, where Coverage-normal_chri = sum of the corrected window frequency numbers falling on that chromosome / number of window frequencies falling on that chromosome, and averaging the proportion of each chromosome over all samples to obtain Average(Coverage-normal_chri) (i = 1, 2, …, X, Y);
calculating the copy number variation percentage PC_chri of each chromosome of the sample to be detected, where PC_chri = 2 × (Coverage_chri - Average(Coverage-normal_chri)) / Average(Coverage-normal_chri);
calculating the ratio R of the copy number variation percentage PC of each chromosome of the sample to be detected to the fetal nucleic acid percentage PF, where R = |PC| / PF;
determining chromosome copy number variation according to PC_chri and the R value;
prediction of monogenic genetic disease:
comparing reads from the same cfDNA molecule at each base position in turn, starting from the first base position of the group of reads: if ≤30% of the reads carry a given base at that position, that base is regarded as background noise; if ≥70% of the reads carry the same base at that position, the base type of that position is confirmed; if only 30%-70% of the reads carry the same base, that base is designated an N base;
performing the same statistics at the second base position of the group of reads, and so on until the last position of the group, to obtain the base sequence of the reads derived from the same cfDNA molecule;
aligning the read base sequences of the sequenced cfDNA molecules to a human reference genome using the bwa aln algorithm;
examining the bases at each covered site, and counting the depth and depth ratio of A, T, G, C, N, insertions, and deletions at each site;
selecting the position of the monogenic disease and checking its total coverage depth: if the total depth is less than 1000X, quality control fails; if the total depth is greater than 1000X, quality control passes;
locating the pathogenic mutation sites that explicitly need to be examined: if the depth percentage of the pathogenic mutation is greater than 3%, the mutation is considered present; if the depth percentage is 1%-3%, the result falls in the gray zone and re-testing is required; if the depth percentage is below 1%, the mutation is judged absent.
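As a numerical illustration of the PA, PC_chri and R formulas recited in claim 1, the following Python sketch evaluates them for given depth and coverage values. It is a hedged illustration only: the function names are hypothetical, the inputs are assumed to be already GC-corrected, and nothing here reproduces the window segmentation or the neural-network estimate of PF.

```python
def abnormal_percentage(depth_normal, depth_abnormal):
    """PA = 2 * |depth(normal) - depth(abnormal)| / depth(normal), plus the preliminary type."""
    diff = depth_normal - depth_abnormal
    pa = 2 * abs(diff) / depth_normal
    if diff > 0:
        kind = "microdeletion"
    elif diff < 0:
        kind = "microduplication"
    else:
        kind = "normal"
    return pa, kind


def copy_number_percentage(coverage, coverage_normal_mean):
    """PC_chri = 2 * (Coverage_chri - Average(Coverage-normal_chri)) / Average(Coverage-normal_chri)."""
    return 2 * (coverage - coverage_normal_mean) / coverage_normal_mean


def ratio_to_fetal_fraction(pc, pf):
    """R = |PC| / PF, where PF is the predicted fetal nucleic acid percentage (as a fraction)."""
    return abs(pc) / pf
```

With depth(normal) = 100 and depth(abnormal) = 95, PA is 0.1 and the region is preliminarily a microdeletion; a chromosome whose coverage proportion is 0.0153 against a normal mean of 0.0150 gives PC_chri of about 0.04, and with PF = 0.10 the ratio R is about 0.4.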
2. The system of claim 1, wherein the determination of chromosomal microdeletion/microduplication variation comprises the following steps:
calculating the ratio R of the microdeletion/microduplication abnormality percentage to the fetal nucleic acid percentage, R = PA/PF;
judging whether the sample is positive or negative for microdeletion/microduplication variation according to the ratio R:
if depth(normal) - depth(abnormal) of the microdeletion/microduplication region is positive, the variation is preliminarily judged to be a microdeletion, and negative signals are then filtered out by the R value to confirm the final positive microdeletion variation;
if R > 5, it is indicated that the positive signal may come from the mother, and whether the fetus carries the variation cannot be predicted;
if 0.8 ≤ R ≤ 5, a positive variation is determined;
if R ≤ 0.5, a negative variation is determined;
if 0.5 < R < 0.8, the result falls in the gray zone and re-testing is required;
if depth(normal) - depth(abnormal) of the microdeletion/microduplication region is negative, the variation is preliminarily judged to be a microduplication, and negative signals are then filtered out by the R value to confirm the final positive microduplication variation;
if R > 5, it is indicated that the positive signal may come from the mother, and whether the fetus carries the variation cannot be predicted;
if 0.8 ≤ R ≤ 5, a positive variation is determined;
if R ≤ 0.5, a negative variation is determined;
if 0.5 < R < 0.8, the result falls in the gray zone and re-testing is required.
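The decision rule of claim 2 can be written as a small function. The sketch below is illustrative only; the function name `classify_cnv` and the returned labels are hypothetical, and PF is assumed to be given as a fraction on the same scale as PA.

```python
def classify_cnv(depth_normal, depth_abnormal, pf):
    """Return (preliminary type, call) for one candidate region using R = PA / PF."""
    diff = depth_normal - depth_abnormal
    if diff == 0:
        return "normal", "normal"
    kind = "microdeletion" if diff > 0 else "microduplication"
    pa = 2 * abs(diff) / depth_normal
    r = pa / pf
    if r > 5:
        call = "possibly maternal"  # fetal status cannot be predicted
    elif r >= 0.8:
        call = "positive"
    elif r <= 0.5:
        call = "negative"
    else:
        call = "gray zone"  # 0.5 < R < 0.8: re-test
    return kind, call
```

For instance, depths of 100 and 95 with PF = 0.10 give PA = 0.1 and R = 1.0, which is reported as a positive microdeletion; depths of 100 and 50 with the same PF give R = 10, which is flagged as a possibly maternal signal.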
3. The system of claim 1, wherein the principal component set PCs is selected, an artificial neural network with 2 hidden layers is introduced, and the PCs are re-weighted using a sample set labeled with the known Y-chromosome-based fetal nucleic acid percentage PY, by resilient backpropagation with weight backtracking and a loss function using the sum-of-squared-residuals algorithm, to construct the weight-principal component-PY neural network model.
4. The system of claim 1, wherein the determination criteria for chromosomal copy number variation are as follows:
trisomy determination:
if PC_chri > 0.03 and R > 1.8, chri trisomy positive is reported, with a note of a possible maternal or placental effect;
if PC_chri > 0.03 and 0.2 < R < 1.8, chri trisomy positive is determined;
if PC_chri > 0.03 and 0.1 < R < 0.2, the result falls in the gray zone and must be re-determined;
if PC_chri ≤ 0.03 and R < 0.1, a negative result is reported;
determination of XO:
when PC_chrX ≤ -0.03, -0.01 < PC_chrY < 0.01, and R ≥ 0.8, the result is judged positive;
when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is judged negative;
determination of XXX:
when PC_chrX > 0.03, -0.01 < PC_chrY < 0.01, and R ≥ 0.8, the result is judged positive;
when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is judged negative;
determination of XXY:
when -0.01 < PC_chrX < 0.01 and PC_chrY ≥ 0.04, the result is judged positive;
determination of XYY:
when PC_chrX > 3%, PC_chrY > 3%, and the ratio of PC_chrY to PC_chrX is greater than 1.7, XYY positive is determined.
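The autosomal trisomy criteria of claim 4 can be sketched as follows. This is an assumption-laden illustration rather than the claimed system: the function name `classify_trisomy` and the labels are hypothetical, and any combination of PC_chri and R that the claim does not cover (for example R exactly equal to 1.8, or PC_chri ≤ 0.03 with R ≥ 0.1) is returned as "unspecified" instead of being guessed.

```python
def classify_trisomy(pc, pf):
    """Apply the claim-4 trisomy thresholds to PC_chri and R = |PC_chri| / PF."""
    r = abs(pc) / pf
    if pc > 0.03:
        if r > 1.8:
            return "positive (possible maternal or placental effect)"
        if 0.2 < r < 1.8:
            return "positive"
        if 0.1 < r < 0.2:
            return "gray zone"
    elif r < 0.1:
        return "negative"
    return "unspecified"  # combination not covered by the stated criteria
```

For example, PC_chri = 0.05 with PF = 0.10 gives R = 0.5 and a positive call, whereas the same PC_chri with PF = 0.02 gives R = 2.5 and a positive call carrying the maternal or placental note.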
5. The system of claim 1, wherein, when the calculated fetal nucleic acid percentage PF is less than 3.5%, a peripheral blood sample is collected again to ensure the confidence of the result.
6. The system of claim 1, wherein the uncorrected frequency number is corrected according to the GC value by means of a linear regression model and for batch systematic errors.
7. The system according to any one of claims 1 to 6, wherein p-set is equal to or less than 0.05.
8. The system of any one of claims 1 to 6, wherein the method of constructing a sequencing library comprises the steps of:
extracting free DNA from a pregnant woman peripheral blood sample;
performing end repair, A-tailing, adapter ligation, and PCR amplification on the free DNA, wherein the PCR amplification is carried out by one of the following methods:
adding a gene-specific primer, a specific adapter, and a specific barcode primer simultaneously, wherein the Tm value of the gene-specific primer binding to the plasma free DNA template differs by 2-6 ℃ from the Tm value of the specific adapter and specific barcode primer binding to the product amplified from the plasma free DNA template with the gene-specific primer; in the PCR process, a high annealing temperature is used to amplify with the gene-specific primer to a set concentration, and a low annealing temperature is used to amplify with the specific adapter and specific barcode primer, finally forming a complete library; or
synthesizing fusion primers, each primer containing a gene-specific module and a specific adapter or specific barcode primer module, to form an upstream primer and a downstream primer, and amplifying and enriching the plasma free DNA template, finally forming a complete library; or
forming an upstream fusion primer from the upstream gene-specific primer and the specific adapter, together with a downstream gene-specific primer and a specific barcode primer; at the start of amplification, a high annealing temperature is first used to amplify the plasma free DNA template with the upstream fusion primer and the downstream gene-specific primer, and a low annealing temperature is then used to amplify with the upstream fusion primer and the specific barcode primer, finally forming a complete library; or
forming a downstream fusion primer from the downstream gene-specific primer and the specific adapter, together with an upstream gene-specific primer and a specific barcode primer; at the start of amplification, a high annealing temperature is first used to amplify the plasma free DNA template with the downstream fusion primer and the upstream gene-specific primer, and a low annealing temperature is then used to amplify with the downstream fusion primer and the specific barcode primer, finally forming a complete library.
9. The system of claim 8, wherein the difference between the Tm values of the downstream gene-specific primer and the specific barcode primer is 3-7 ℃; and the difference between the Tm values of the upstream gene-specific primer and the specific barcode primer is 3-7 ℃.
10. The system of claim 8, wherein, after the free DNA is obtained, fragments longer than 200 bp are selectively removed.
CN202310518720.6A 2023-05-10 2023-05-10 System for noninvasive prenatal detection of fetuses Active CN116246704B (en)

Publications (2)

Publication Number Publication Date
CN116246704A true CN116246704A (en) 2023-06-09
CN116246704B CN116246704B (en) 2023-08-15

Family

ID=86631689

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102216456A (en) * 2008-09-16 2011-10-12 塞昆纳姆股份有限公司 Processes and compositions for methylation-based enrichment of fetal nucleic acid from a maternal sample useful for non invasive prenatal diagnoses
US20130029852A1 (en) * 2010-01-19 2013-01-31 Verinata Health, Inc. Detecting and classifying copy number variation
CN103384725A (en) * 2010-12-23 2013-11-06 塞昆纳姆股份有限公司 Fetal genetic variation detection
US20130338933A1 (en) * 2011-10-06 2013-12-19 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
US20140274745A1 (en) * 2011-10-28 2014-09-18 Bgi Diagnosis Co., Ltd. Method for detecting micro-deletion and micro-repetition of chromosome
WO2014055790A2 (en) * 2012-10-04 2014-04-10 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations
CN112575075A (en) * 2013-05-24 2021-03-30 塞昆纳姆股份有限公司 Methods and processes for non-invasive assessment of genetic variation
CN105925666A (en) * 2016-03-30 2016-09-07 广州精科生物技术有限公司 Kit and application thereof, and method and system for detecting area target variation
CN106778069A (en) * 2017-02-17 2017-05-31 广州精科医学检验所有限公司 Determine the method and apparatus of micro-deleted micro- repetition in fetal chromosomal
CN111712582A (en) * 2017-11-02 2020-09-25 香港中文大学 Non-invasive prenatal and cancer detection using nucleic acid size ranges
CN111951890A (en) * 2020-08-13 2020-11-17 北京博昊云天科技有限公司 Method, kit and analysis system for synchronous prenatal screening of chromosome and monogenic diseases

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ANDREAS C. NEOCLEOUS, ET AL.: "First Trimester Noninvasive Prenatal Diagnosis: A Computational Intelligence Approach", 《IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS》, vol. 20, no. 5, XP011621639, DOI: 10.1109/JBHI.2015.2462744 *
SUBEEN HONG, ET AL.: "Simple and rapid detection of common fetal aneuploidies using peptide nucleic acid probe-based real-time polymerase chain reaction", 《SCIENTIFIC REPORTS》, vol. 150, no. 12 *
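万波; 赵丽娟: "Genetic diagnosis of a child with Phelan-McDermid syndrome using chromosomal microarray analysis", 《Obstetrics-Gynecology and Genetics (Electronic Edition)》, no. 3 *
惠淑宁: "Evaluation of the clinical application of noninvasive prenatal testing in Down syndrome", 《China Master's Theses Full-text Database》 *
李淑霞; 柴丽芬; 刘林英; 吴琪瑞; 张慧萍: "Research progress in the clinical application of noninvasive prenatal DNA testing", 《Journal of Ningxia Medical University》, no. 9 *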

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116837090A (en) * 2023-07-06 2023-10-03 东莞博奥木华基因科技有限公司 Primer group, kit and method for detecting fetal bone dysplasia
CN116837090B (en) * 2023-07-06 2024-01-23 东莞博奥木华基因科技有限公司 Primer group, kit and method for detecting fetal bone dysplasia

Also Published As

Publication number Publication date
CN116246704B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN103874767B (en) Method and system for genotyping a predetermined region in a nucleic acid sample
JP6585117B2 (en) Diagnosis of fetal chromosomal aneuploidy
EP2772549B1 (en) Method for detecting genetic variation
CN108604258B (en) Chromosome abnormality determination method
CN106834502A (en) Kit and method for detecting copy number of spinal muscular atrophy-related genes based on gene capture and next-generation sequencing
CN105648045B (en) Method and apparatus for determining the haplotype of a fetal target region
CN114999570B (en) Haplotype construction method independent of the proband
CN111518917B (en) Microhaplotype genetic marker combination and method for noninvasive prenatal paternity testing
EP3564391A1 (en) Method, device and kit for detecting fetal genetic mutation
CN116246704B (en) System for noninvasive prenatal detection of fetuses
EP3797418B1 (en) Method for determining the probability of the risk of chromosomal and genetic disorders from free dna of fetal origin
CN113564266B (en) SNP typing genetic marker combination, detection kit and application
EP3559258A1 (en) Method for determining content of cell-free fetal dna and device for implementing method
CN117230175B (en) Embryo preimplantation genetics detection method based on third generation sequencing
CN116926180A (en) Use of gene marker combinations for the preparation of diagnostic products for Noonan syndrome spectrum disorders
US11869630B2 (en) Screening system and method for determining a presence and an assessment score of cell-free DNA fragments
CN114457143B (en) Method for constructing CNV detection library and CNV detection method
CN114774549A (en) Second-generation sequencing kit for targeted detection of CAH candidate gene
CN113308527A (en) Gene composition, chip and kit for screening refractory hereditary bone diseases
JP2014530629A (en) Method for detecting chromosomal microdeletions and microduplications
EP3202912A1 (en) Noninvasive method and system for determining fetal chromosomal aneuploidy
WO2019205132A1 (en) Method for enriching fetal free nucleic acids and application thereof
RU2777072C1 (en) Method for identifying fetal aneuploidy in a blood sample of the pregnant woman
CN111560424A (en) Detectable target nucleic acid, probe, method for determining fetal F8 gene haplotype and application
Du et al. Unique dual indexing PCR reduces chimeric contamination and improves mutation detection in cell-free DNA of pregnant women

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: No. 203, 2nd Floor, Building 2, Guodian Technology Modern Logistics Center, No. 3 Taohua Road, Fubao Community, Fubao Street, Futian District, Shenzhen City, Guangdong Province, 518017

Patentee after: Shenzhen Jingke Biotechnology Co.,Ltd.

Country or region after: China

Patentee after: GUANGZHOU JINGKE DX Co.,Ltd.

Patentee after: Shenzhen Jingke Gene Technology Co.,Ltd.

Patentee after: Shenzhen Jingke Medical Laboratory

Address before: Unit 602, 6th Floor, No. 7 Spiral Fourth Road, International Biological Island, Guangzhou, Guangdong Province, 510320

Patentee before: GUAGNZHOU JINGKE BIOTECH CO.,LTD.

Country or region before: China

Patentee before: GUANGZHOU JINGKE DX Co.,Ltd.

Patentee before: Shenzhen Jingke Gene Technology Co.,Ltd.

Patentee before: Shenzhen Jingke Medical Laboratory