CN116246704B

CN116246704B - System for noninvasive prenatal detection of fetuses

Info

Publication number: CN116246704B
Application number: CN202310518720.6A
Authority: CN
Inventors: 曾晓静; 蒋馥蔓; 李胜; 杜伯乐; 夏伟成; 郭宇来; 秦炳财; 王阳; 李小坤
Original assignee: Guangzhou Jingke Dx Co ltd; Shenzhen Jingke Gene Technology Co ltd; Shenzhen Jingke Medical Laboratory; Guangzhou Jingke Biotech Co ltd
Current assignee: Guangzhou Jingke Dx Co ltd; Shenzhen Jingke Biotechnology Co ltd; Shenzhen Jingke Gene Technology Co ltd; Shenzhen Jingke Medical Laboratory
Priority date: 2023-05-10
Filing date: 2023-05-10
Publication date: 2023-08-15
Anticipated expiration: 2043-05-10
Also published as: CN116246704A

Abstract

The invention discloses a system for noninvasive prenatal detection of a fetus, comprising a sequencing device, a data analysis device and a result output device. After sequencing data of plasma free DNA are obtained, chromosome copy number analysis, chromosome microdeletion/microduplication analysis and monogenic disease analysis are performed using a proprietary algorithm. Detection of microdeletion/microduplication syndromes is achieved with almost no increase in detection cost or detection time, the total fetal DNA percentage is estimated using a neural network, and the data show that the detection resolution can be improved to 2 Mb.

Description

System for noninvasive prenatal detection of fetuses

Technical Field

The present invention relates to the field of prenatal diagnosis, and in particular to a system for noninvasive prenatal detection of a fetus.

Background

Genetic diseases are diseases in which a change or abnormality in human genetic material causes abnormal or impaired structure and function in the fetus, at or after birth. They mainly comprise chromosomal abnormalities and monogenic genetic diseases, seriously threaten human health, and are a major public health problem.

Chromosomal abnormalities mainly include copy number abnormalities and structural abnormalities. The most common are chromosomal aneuploidies, i.e., changes in chromosome number (a total of more or fewer than 46 chromosomes, or one, three, four or some other number of copies of a given chromosome instead of two). The aneuploidies most often seen clinically are trisomy 21 (Down syndrome), trisomy 18 (Edwards syndrome) and trisomy 13 (Patau syndrome). Microdeletion/microduplication syndromes are another major class of neonatal birth defects besides chromosomal aneuploidies. Data show that 211 chromosomal microdeletion diseases and 79 chromosomal microduplication diseases had been published by November 2012. The incidence of chromosomal microdeletion/microduplication syndromes varies from 1/4000 to 1/2000000; the smaller deletions and duplications, typically less than 5 Mb, are easily missed by prenatal diagnosis. Data show that most chromosomal microdeletion/microduplication diseases arise from de novo mutations and that the risk of onset has no significant correlation with age. Pathogenic or potentially pathogenic chromosomal microdeletions/microduplications occur at a rate of 1.7%, and the risk of recurrence is high. A single-gene genetic disease is a genetic disease caused by mutation of a single gene; each disease has a low incidence individually, but the diseases are numerous and their total incidence is high. According to World Health Organization (WHO) statistics, the cumulative incidence of all single-gene genetic diseases in the global birth population is as high as 10/1000.

Research on genetic diseases makes it possible to prevent and treat them effectively, to the benefit of patients. Early diagnosis and early treatment are critical. Conventional clinical screening methods include serological and imaging examinations; when a result is positive, prenatal diagnosis is performed by invasive testing (chorionic villus sampling, amniocentesis, etc.). However, these screening methods have limited sensitivity and a certain false positive rate, and invasive testing carries a risk of miscarriage.

In 1997, the Hong Kong scholar Professor Lu found that 3-13% of the free nucleic acids in maternal plasma come from the fetus, opening a new chapter in noninvasive prenatal diagnosis using maternal plasma. Initially, noninvasive prenatal DNA analysis was very challenging because of the large amount of maternal free DNA. With the rapid development of high-throughput sequencing, tens of millions of DNA fragments can now be quantified to detect chromosomal aneuploidies such as trisomy 21, trisomy 18 and trisomy 13. This approach has been widely validated and accepted in clinical practice, with specificity for chromosomal aneuploidies of about 97-99% and a low false positive rate (< 0.1%). Currently, the conventional methods for noninvasive prenatal detection are the WGS method and the SNP method. Compared with the WGS method, the SNP method (amplification of specific sequences) can work at lower fetal concentrations, but it has a long detection time, is difficult to develop, covers a limited detection range, and cannot detect sequences outside SNP locus regions. Some institutions have attempted chip capture for noninvasive prenatal testing: probes are designed for selected important regions, free DNA from the peripheral blood of pregnant women is captured, and the library is then sequenced. Compared with the WGS method, this method has the disadvantages of an additional chip capture step, long detection time and high cost. Although the WGS method requires more data than the SNP method, sequencing costs keep falling and will account for a shrinking share of the total detection cost, so detection coverage will play an increasingly important role in future testing. In recent years, the application has been further extended to the detection of microdeletions/microduplications.

In September 2015, the team of Yin Aihua at a maternal and child health care institution in Guangdong reported that they had used the WGS method to perform NIPT on the plasma of pregnant women, detected fetal chromosomal microdeletions/microduplications, and confirmed the results by prenatal diagnosis. Because of interference from background maternal DNA, it is important to improve the accuracy of fetal chromosomal microdeletion/microduplication detection. Common approaches are: increasing the concentration of fetal free DNA in the assay; optimizing the analysis algorithm; and using the SNP method. The limited number of detection sites is an unavoidable disadvantage of the SNP method, so in practice the first 2 approaches are more widely used.

Single-gene genetic diseases are highly harmful: most are teratogenic, disabling or even fatal, and effective treatments are lacking. Studies show that about half of monogenic diseases are dominant, and de novo variants account for about 74% of these. A de novo mutation is not inherited from the parents; the fetus may show no phenotype during pregnancy, or ultrasound abnormalities may appear only in late pregnancy. An effective early screening method is currently lacking in clinical practice, so such mutations are easily missed before birth. As the demand for healthy births grows and detection technologies develop, the demand for prenatal detection of single-gene genetic diseases is increasing. However, current clinical detection of fetal monogenic diseases relies mainly on invasive testing, i.e., on samples such as amniotic fluid or chorionic villi. These invasive tests depend on skilled operators and are costly and, more importantly, carry a risk of miscarriage (about 0.5-1%).

The prior art lacks a highly accurate method for detecting microdeletions/microduplications, and the accuracy of detecting single-gene dominant genetic diseases is also relatively low.

Disclosure of Invention

The present invention aims to overcome at least one of the deficiencies of the prior art and to provide a system for non-invasive prenatal detection of a fetus.

The technical scheme adopted by the invention is as follows:

a system for noninvasive prenatal detection of a fetus comprising a sequencing device, a data analysis device, and a result output device, wherein:

the sequencing device is used for determining sequence information of a free DNA sequencing library in peripheral blood of the pregnant woman to obtain a sequencing result;

the result output device is used for outputting the analysis result of the data analysis device;

the analysis method of the data analysis device comprises the following steps:

aligning the sequencing result to a human reference genome, constructing a set of uniquely aligned sequencing reads, and recording the alignment position of each read;

cutting a reference genome according to a unit window length, and dividing the reference genome into a plurality of primary windows;

counting the uncorrected frequency number (read count) of the uniquely aligned sequences in each window according to the alignment position of each unique sequence;

correcting the number of uncorrected frequencies according to the GC value to obtain the corrected number of frequencies;
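The windowing and counting steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `count_reads_per_window`, the 0-based coordinates and the example positions are all assumptions made here.

```python
from collections import Counter

def count_reads_per_window(positions, window_size):
    """Count uniquely aligned reads per fixed-size window.

    positions: iterable of (chromosome, 0-based start) pairs, one per read
               (hypothetical input format).
    Returns a Counter keyed by (chromosome, window_index).
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    counts = Counter()
    for chrom, pos in positions:
        # Integer division maps a coordinate to the index of its window.
        counts[(chrom, pos // window_size)] += 1
    return counts

# Made-up alignment positions, 100 kbp windows.
reads = [("chr1", 10), ("chr1", 99_999), ("chr1", 100_000), ("chr21", 250_000)]
counts = count_reads_per_window(reads, window_size=100_000)
print(counts[("chr1", 0)], counts[("chr1", 1)], counts[("chr21", 2)])
```

These raw per-window counts are the uncorrected frequency numbers that the GC correction step then adjusts.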

prediction of fetal nucleic acid percentages:

estimating fetal nucleic acid percentage PY from the Y chromosome:

calculating the average depth of the Y chromosome, DepthY, where DepthY = the sum of the corrected frequency numbers of all windows on the Y chromosome / the base length of the whole Y chromosome; the average Y chromosome depth of a normal male is recorded as Dmale, and the average Y chromosome depth of a normal female is recorded as Dfemale; the average Y chromosome depth of the sample to be detected is recorded as Dtest, and the fetal nucleic acid percentage of the sample to be detected is PY = (Dtest - Dfemale) / (Dmale - Dfemale);

Carrying out principal component analysis on a known normal sample by an unsupervised learning method to obtain a principal component set PCs;

selecting PCs as the principal component set, introducing an artificial neural network, and re-weighting the PCs using a sample set labeled with the known fetal Y chromosome fetal nucleic acid percentage PY, to construct a weight-principal component-PY neural network model;

predicting fetal nucleic acid percentage PF: converting the corrected frequency number of the chromosome window of the sample to be detected into a principal component set PCs through a principal component analysis technology, and then outputting a neuron value PF according to a weight-principal component-PY neural network model;
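The Y-chromosome estimate can be sketched as below, reading the formula as PY = (Dtest - Dfemale) / (Dmale - Dfemale). That reading is an interpretation of the text above, and the function names and depth values are hypothetical.

```python
def mean_depth_y(corrected_counts_y, chr_y_length):
    """Average Y depth: sum of corrected window counts / chrY base length."""
    return sum(corrected_counts_y) / chr_y_length

def fetal_fraction_from_y(d_test, d_male, d_female):
    """PY under the assumed reading (Dtest - Dfemale) / (Dmale - Dfemale).

    d_female is the background Y signal seen in normal females,
    d_male the Y depth of a normal male.
    """
    if d_male == d_female:
        raise ValueError("male and female reference depths must differ")
    return (d_test - d_female) / (d_male - d_female)

# Made-up depths: the test sample sits 10% of the way from female to male.
py = fetal_fraction_from_y(d_test=0.11, d_male=1.01, d_female=0.01)
print(round(py, 4))
```

PY is only meaningful for pregnancies with a male fetus, which is presumably why it serves as the training label for the neural network model that outputs PF for all samples.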

determination of the anomaly percentage PA of microdeleted microreplicated fragments:

randomly setting a breakpoint position, applying a statistical test to the corrected frequency numbers of the windows on the left and right sides of the breakpoint to calculate a significance level p-value, and selecting the window as a candidate microdeletion/microduplication window if the calculated p-value is smaller than a preset significance level p-set; if the calculated p-value is larger than p-set, continuing to merge windows and test again until a candidate microdeletion/microduplication window is obtained;

merging adjacent candidate microdeletion/microduplication windows to obtain the selected microdeletion/microduplication regions;

performing a depth calculation for each microdeletion/microduplication region, depth(abnormal) = corrected frequency number falling within the microdeletion/microduplication region / size of the microdeletion/microduplication region;

performing the same average depth calculation for the remaining regions of the chromosome that do not contain the microdeletion/microduplication, depth(normal) = corrected frequency number falling on the remaining regions of the chromosome / size of those remaining regions;

calculating the abnormality percentage of the microdeletion/microduplication or whole-chromosome copy number variation, PA = 2 × |depth(normal) - depth(abnormal)| / depth(normal); when depth(normal) - depth(abnormal) is positive, the region is identified as a preliminary microdeletion variation; when depth(normal) - depth(abnormal) is negative, it is identified as a preliminary microduplication variation; when depth(normal) - depth(abnormal) is 0, it is judged to be normal;
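A minimal sketch of the PA calculation and the preliminary call, using the formula PA = 2 × |depth(normal) - depth(abnormal)| / depth(normal). The function names and the example depths are assumptions for illustration.

```python
def region_depth(corrected_count, region_size):
    """Depth of a region: corrected frequency number / region size."""
    return corrected_count / region_size

def anomaly_percentage(depth_normal, depth_abnormal):
    """Return (PA, preliminary call) for one candidate region."""
    diff = depth_normal - depth_abnormal
    pa = 2 * abs(diff) / depth_normal
    if diff > 0:
        call = "microdeletion"      # candidate region is under-covered
    elif diff < 0:
        call = "microduplication"   # candidate region is over-covered
    else:
        call = "normal"
    return pa, call

# Made-up depths: the candidate region is 5% shallower than the rest.
pa, call = anomaly_percentage(depth_normal=1.0, depth_abnormal=0.95)
print(round(pa, 3), call)
```

The factor 2 presumably reflects that a heterozygous fetal change alters depth by about half the fetal fraction, which puts PA on the same scale as PF.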

chromosome copy number variation determination:

obtaining the chromosome copy number variation percentage PC_chri (i = 1, 2, ..., 22, X, Y), comprising the steps of:

calculating the proportion Coverage_chri (i = 1, 2, ..., X, Y) for each chromosome, Coverage_chri = the sum of the corrected window frequency numbers falling on that chromosome / the number of window frequencies falling on that chromosome;

calculating the proportion Coverage-normal_chri (i = 1, 2, ..., X, Y) of each chromosome in known normal samples, where Coverage-normal_chri = the sum of the corrected window frequency numbers falling on the chromosome / the number of window frequencies falling on the chromosome, and averaging the proportion of each chromosome over all normal samples to obtain Average(Coverage-normal_chri) (i = 1, 2, ..., X, Y);

calculating the copy number variation percentage PC_chri of each chromosome of the sample to be detected, where PC_chri = 2 × (Coverage_chri - Average(Coverage-normal_chri)) / Average(Coverage-normal_chri);

calculating the ratio R of the copy number variation percentage PC of each chromosome of the sample to be detected to the fetal nucleic acid percentage PF, where R = |PC| / PF;

determining chromosome copy number variation according to the PC_chri and R values;
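The PC_chri and R formulas can be sketched as below. The coverage values are taken as given inputs because the text does not fully specify how they are normalized; the function names and numbers are hypothetical.

```python
def copy_number_variation_percentage(coverage, reference_coverages):
    """PC_chri = 2 * (Coverage - Average(normal)) / Average(normal)."""
    avg_normal = sum(reference_coverages) / len(reference_coverages)
    return 2 * (coverage - avg_normal) / avg_normal

def ratio_to_fetal_fraction(pc, pf):
    """R = |PC| / PF, with PF the predicted fetal nucleic acid percentage."""
    if pf <= 0:
        raise ValueError("PF must be positive")
    return abs(pc) / pf

# Made-up values: the test sample's coverage is 5% above the normal mean.
pc = copy_number_variation_percentage(0.021, [0.02, 0.02])
r = ratio_to_fetal_fraction(pc, pf=0.1)
print(round(pc, 3), round(r, 3))
```

An R value near 1 means the deviation is about what a fetal-origin change would produce at the predicted fetal fraction.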

prediction of monogenic genetic disease:

grouping the reads derived from the same cfDNA molecule and comparing them site by site, starting from the first base site of the read set: if no more than 30% of the reads have the same base at the site, that base is regarded as background noise; if at least 70% of the reads have the same base at the site, that base is confirmed as the base type of the site; if between 30% and 70% of the reads have the same base, the site is recorded as an N base;

performing the same statistics on the second base site of the reads, and so on until the last site of the reads, to obtain the base sequence of the reads derived from the same cfDNA molecule;

aligning the resulting base sequence of each sequenced cfDNA molecule to the human reference genome using the bwa aln algorithm;

detecting the base of each covered site, and counting the respective depth and the depth ratio of A, T, G, C, N, insertion, deletion on each site;

selecting a monogenic disease site and checking its total coverage depth: if the total depth is less than 1000X, quality control fails; if the total depth is greater than 1000X, quality control passes;

locating the known pathogenic mutation sites to be examined: if the depth percentage of the pathogenic mutation is more than 3%, the mutation is considered present; if the depth percentage is 1%-3%, the result is in the gray zone and re-detection is needed; if the depth percentage is below 1%, the mutation is judged to be absent.
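The per-site consensus rule for reads from the same cfDNA molecule (at least 70% agreement confirms the base, otherwise N) can be sketched as follows. The function names are hypothetical, and equal-length, pre-grouped reads are assumed. Sites where no base reaches 70% are all written as N, which covers the 30%-70% case and, as an assumption made here, the case where every base is at or below 30%.

```python
from collections import Counter

def consensus_base(bases, high=0.70):
    """Consensus for one site across reads of the same cfDNA molecule."""
    base, n = Counter(bases).most_common(1)[0]
    if n / len(bases) >= high:
        return base   # confirmed base; minority bases are treated as noise
    return "N"        # no base is well enough supported at this site

def consensus_sequence(reads):
    """Walk the sites from first to last and join the per-site calls."""
    return "".join(consensus_base(column) for column in zip(*reads))

# Four made-up reads from one molecule, with two isolated errors.
family = ["ACGT", "ACGT", "ACGA", "ACTT"]
print(consensus_sequence(family))           # sites 3 and 4 have 75% agreement
print(consensus_sequence(["AC", "AT"]))     # second site is split 50/50
```

Collapsing each read family to one sequence before alignment is what lets amplification and sequencing errors be separated from real low-frequency variants.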

In some examples of the system, the chromosomal microdeletion/microduplication variation determination comprises the steps of:

calculating the ratio R of the microdeletion/microduplication abnormality percentage to the fetal nucleic acid percentage, R = PA/PF;

judging whether the sample is positive or negative for microdeletion/microduplication variation according to the ratio R:

if depth(normal) - depth(abnormal) of the microdeletion/microduplication region is positive, the region is preliminarily judged to be a microdeletion variation, and negative signals are then filtered out by the R value to confirm the final microdeletion-positive variation:

if R > 5, the positive signal possibly comes from the mother, and whether the fetus carries the variation cannot be predicted;

if 0.8 ≤ R ≤ 5, a positive variation is determined;

if R ≤ 0.5, a negative variation is determined;

if 0.5 < R < 0.8, the result is in the gray zone and re-detection is needed;

if depth(normal) - depth(abnormal) of the microdeletion/microduplication region is negative, the region is preliminarily judged to be a microduplication variation, and negative signals are then filtered out by the R value to confirm the final microduplication-positive variation:

if R is less than or equal to 0.5, determining negative variation;

if 0.5< R <0.8, it is the gray zone, and re-detection is required.
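The R-based filtering can be sketched as a small decision function. The thresholds are those listed for the microdeletion branch. Applying the same R > 5 and 0.8 ≤ R ≤ 5 rules to microduplications is an assumption made here, since the text only lists the negative and gray-zone rules for that branch. The function name and labels are hypothetical.

```python
def classify_by_ratio(pa, pf):
    """Classify a candidate region from R = PA / PF."""
    if pf <= 0:
        raise ValueError("PF must be positive")
    r = pa / pf
    if r > 5:
        return "possible maternal origin"  # fetal status cannot be predicted
    if r >= 0.8:
        return "positive"                  # 0.8 <= R <= 5
    if r <= 0.5:
        return "negative"
    return "gray zone"                     # 0.5 < R < 0.8, re-detection needed

# Made-up PA values at a predicted fetal fraction of 10%.
for pa in (0.60, 0.10, 0.06, 0.04):
    print(pa, classify_by_ratio(pa, pf=0.10))
```

The four branches are mutually exclusive and cover every positive R, so each candidate region receives exactly one label.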

In some system examples, PCs are selected as the principal component set, and an artificial neural network with 2 hidden layers is introduced and trained by resilient backpropagation with weight backtracking, with the sum of squared errors as the loss function; the PCs are re-weighted using a sample set labeled with the known fetal Y chromosome fetal nucleic acid percentage PY, to construct the weight-principal component-PY neural network model.

In some examples of systems, the criteria for chromosomal copy number variation are as follows:

trisomy determination:

if PC_chri > 0.03 and R > 1.8, trisomy positive is reported, with a note of a possible maternal or placental effect;

if PC_chri > 0.03 and 0.2 < R < 1.8, chri trisomy positive is determined;

if PC_chri > 0.03 and 0.1 < R < 0.2, the result is in the gray zone and re-determination is required;

if PC_chri ≤ 0.03 and R < 0.1, the report is negative;

determination of XO:

when PC_chrX ≤ -0.03, -0.01 < PC_chrY < 0.01, and R (i.e., |PC_chrX|/PF) ≥ 0.8, the result is judged positive;

when -0.03 < PC_chrX < 0.03, -0.01 < PC_chrY < 0.01, and R < 0.3, the result is judged negative;

determination of XXX:

when PC_chrX > 0.03, -0.01 < PC_chrY < 0.01, and R (i.e., |PC_chrX|/PF) ≥ 0.8, the result is judged positive;

determination of XXY:

when -0.01 < PC_chrX < 0.01 and PC_chrY ≥ 0.04, the result is judged positive;

determination of XYY:

when PC_chrX > 3%, PC_chrY > 3%, and the ratio of PC_chrY to PC_chrX is greater than 1.7, XYY positive is determined.
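The autosomal trisomy rules above can be sketched as a decision function. Only the trisomy branch is shown. Combinations of PC and R that the listed rules do not cover (for example R exactly 1.8) are returned as "undetermined", which is a choice made here and is not stated in the text. The function name and labels are hypothetical.

```python
def classify_trisomy(pc, pf):
    """Apply the trisomy criteria to one autosome, with R = |PC| / PF."""
    if pf <= 0:
        raise ValueError("PF must be positive")
    r = abs(pc) / pf
    if pc > 0.03:
        if r > 1.8:
            return "trisomy positive (possible maternal or placental effect)"
        if 0.2 < r < 1.8:
            return "trisomy positive"
        if 0.1 < r < 0.2:
            return "gray zone"
    elif r < 0.1:
        return "negative"
    return "undetermined"  # not covered by the listed rules

# Made-up (PC, PF) pairs.
for pc, pf in [(0.10, 0.04), (0.10, 0.10), (0.04, 0.25), (0.005, 0.10)]:
    print(pc, pf, classify_trisomy(pc, pf))
```

The sex chromosome rules (XO, XXX, XXY, XYY) could be coded the same way, with PC_chrX and PC_chrY as separate inputs.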

In some system examples, the unit window is 100 kbp to 5 Mbp in length. The window length can be adjusted according to the sequencing depth, quality and the like.

In some system examples, when the calculated fetal nucleic acid percentage PF is less than 3.5%, the result is judged to have low confidence and a new peripheral blood sample is collected.

In some system examples, the uncorrected frequency numbers are corrected by a linear regression model based on GC values and same-batch systematic errors.
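One possible form of GC correction by linear regression is sketched below: fit window counts against window GC content by ordinary least squares, then remove the fitted GC trend while keeping the overall mean. The exact model is not given in the text, so this form is an assumption, the batch systematic-error term is not modeled, and all names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def gc_correct(counts, gc_values):
    """Subtract the fitted GC trend from each count and re-centre on the mean."""
    slope, intercept = fit_line(gc_values, counts)
    mean_count = sum(counts) / len(counts)
    return [c - (slope * g + intercept) + mean_count
            for c, g in zip(counts, gc_values)]

# Made-up windows whose counts rise linearly with GC content.
gc = [0.3, 0.4, 0.5, 0.6]
raw = [90, 100, 110, 120]
corrected = gc_correct(raw, gc)
print([round(c, 6) for c in corrected])
```

In this made-up case the counts depend only on GC, so the correction flattens every window to the mean count of 105.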

In some system examples, p-set is less than or equal to 0.05. This is a common significance criterion. The value may be adjusted as desired.

In some examples of systems, the method of constructing a sequencing library comprises the steps of:

extracting free DNA from a pregnant woman peripheral blood sample;

performing end repair, A-tailing, adapter ligation and PCR amplification on the free DNA, wherein the PCR amplification is carried out by one of the following 4 methods:

1) adding a gene-specific primer, a specific adapter and a specific barcode primer at the same time, wherein the Tm value of the gene-specific primer binding to the plasma free DNA template is 2-6 ℃ higher than the Tm value of the specific adapter and the specific barcode primer binding to the product amplified from the gene-specific primer and the plasma free DNA template; the PCR first uses a high annealing temperature to amplify with the gene-specific primer to a set concentration, and then uses a low annealing temperature to amplify with the specific adapter and the specific barcode primer, finally forming a complete library; or

2) synthesizing fusion primers, i.e., each primer contains a gene-specific module together with a specific adapter module or a specific barcode primer module, forming an upstream primer and a downstream primer that amplify and enrich the plasma free DNA template, finally forming a complete library; or

3) forming an upstream fusion primer from the upstream gene-specific primer and the specific adapter, and also providing a downstream gene-specific primer and a specific barcode primer; at the start of amplification, a high annealing temperature is used first so that the upstream fusion primer and the downstream gene-specific primer amplify the plasma free DNA template, and a low annealing temperature is then used for amplification with the upstream fusion primer and the specific barcode primer, finally forming a complete library; or

4) forming a downstream fusion primer from the downstream gene-specific primer and the specific adapter, and also providing an upstream gene-specific primer and a specific barcode primer; at the start of amplification, a high annealing temperature is used first so that the downstream fusion primer and the upstream gene-specific primer amplify the plasma free DNA template, and a low annealing temperature is then used for amplification with the downstream fusion primer and the specific barcode primer, finally forming a complete library.

In some system examples, the difference between the Tm values of the downstream gene-specific primer and the specific barcode primer is 3-7 ℃; the difference between the Tm values of the upstream gene-specific primer and the specific barcode primer is 3-7 ℃. Amplification is more effective under these conditions.

The free DNA of the fetus is typically less than 200 bp long, and in some examples of the system, fragments above 200 bp are removed after the free DNA is obtained. This can increase the concentration of fetal free DNA.

The beneficial effects of the invention are as follows:

The system of the invention can detect fetal chromosomal aneuploidies, microdeletion/microduplication syndromes and single-gene dominant genetic diseases simultaneously. Compared with the prior art, detection of microdeletion/microduplication syndromes is achieved with almost no increase in detection cost or detection time.

The system of the invention uses a one-step method for multiple amplification and library establishment to complete noninvasive detection of fetal single-gene dominant genetic diseases. Multiple monogenic genetic diseases can be detected at one time, and theoretically all monogenic dominant genetic diseases with deterministic mutation sequences (point mutations, indels, etc.) can be detected.

In the system of the invention, the whole genome library and the multiplex one-step library are mixed for sequencing, i.e., the two libraries of one sample share one sample barcode, so that the number of samples that can be sequenced in one run is not limited by a shortage of sample barcodes and the sequencing cost is not driven up. The scheme is simple and easy to use, and meets clinical requirements for timeliness and practicality.

The system of the invention meets the requirements of fragment- and site-specific analysis once the sequencing depth of the monogenic disease specific regions reaches 1000x, with almost no increase in sequencing cost.

The system of the invention can markedly increase the concentration of fetal free DNA, reducing the probability of resampling and thereby the resources and costs consumed by all parties; it can even meet the testing needs of some pregnant women who cannot undergo conventional NIPT.

The system of the invention uses an independently developed analysis algorithm. Detection of microdeletions/microduplications, for which detection accuracy is commonly low, is markedly improved. The fetal concentration enrichment method and a neural network are innovatively applied to estimate the total fetal DNA percentage, and the data show that the detection resolution can be improved to 2 Mb.

Drawings

FIG. 1 is a gel electrophoresis diagram of amplified products of multiplex PCR using adapter and barcode primers after library construction is completed.

FIG. 2 is an electrophoretogram of the amplified product of the multiplex one-step system prior to sample optimization.

FIG. 3 is an electrophoretogram of the amplified product of the multiplex one-step system after sample optimization.

Detailed Description

The technical scheme of the invention is further described below by combining examples.

The following examples are described using the MGI Tech (MGI) high-throughput sequencing platform. Other high-throughput sequencing platforms may of course be used.

Enrichment of fetal DNA:

Plasma free DNA was extracted with magnetic beads according to a conventional procedure, and the extracted plasma free DNA solution was subjected to large-fragment removal. The product was subjected to fragment size selection using prepared magnetic beads and buffer; through 2-step magnetic bead selection, fragments above 200 bp were removed from the plasma free DNA and the small-fragment plasma free DNA was largely retained.

Comparison of different sequencing library construction methods:

After fetal DNA enrichment, part of the sample undergoes end repair, A-tailing, adapter ligation and PCR amplification (this process is designated the first part); multiplex one-step library construction (designated the second part) may use either a sample after fetal DNA enrichment or a sample without fetal DNA enrichment. The first part completes the detection of aneuploidies and microdeletions/microduplications. The second part completes the detection of single-gene dominant genetic diseases.

Alternatively, the extracted plasma free DNA is split into 2 portions. The first portion undergoes free fetal DNA enrichment followed by end repair, A-tailing, adapter ligation and PCR amplification of the target fragments. The other portion enters the second-part multiplex one-step library construction directly, without free fetal DNA enrichment; alternatively, the second part uses a sample enriched in free fetal DNA as the template.

Enzyme, dNTPs, buffer and the like are added to the plasma free DNA for end repair; the adapter and ligase are added to the repaired sample to ligate adapters to the target fragments, and the product is purified with magnetic beads to remove the enzyme mixture and unligated adapters. The product of the previous step is then PCR-amplified using enzyme, dNTPs, buffer and specific primers. The amplified product is purified with magnetic beads to remove the enzyme mixture and excess primers, and then quantified.

The second part is multiplex one-step library construction. In the invention, a specific molecular tag (UMI) of 6-8 bp is optionally added; a 6-8 bp UMI yields 4096-65536 distinct molecular tags. The detection template is fetal free DNA in the peripheral blood plasma of the pregnant woman, and 4096-65536 distinct tags are enough for each target fragment to carry its own UMI. The UMI is introduced at both ends of the target fragment at the start of fetal free DNA amplification and is carried along in subsequent amplification of the target fragment, i.e., a single molecule is replicated into thousands of molecules bearing the same UMI. In subsequent analysis, target sequences can be grouped by their UMI sequences, so the UMI helps identify errors introduced during amplification and sequencing.
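The 4096-65536 range follows from four possible bases at each UMI position, i.e., 4^n distinct tags for a tag of n bases. A short check (the function name is hypothetical):

```python
def umi_diversity(length):
    """Number of distinct UMI sequences of the given length over A, C, G, T."""
    return 4 ** length

diversity = {n: umi_diversity(n) for n in (6, 7, 8)}
print(diversity)   # {6: 4096, 7: 16384, 8: 65536}
```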

First, the gene sequences related to monogenic genetic diseases are downloaded, and primers are then designed around the mutation sites or the deleted or duplicated fragments so that they amplify mutant and wild-type fragments simultaneously. That is, one solution provided by the invention is to amplify and enrich the fragments containing the mutation sites (both wild type and mutant) by the multiplex one-step method, sequence the amplified products at high throughput, and analyze the sequencing results. The second part is designed to detect single-gene dominant genetic diseases. For example, the c.1138 locus of the FGFR3 gene is normally GG (allele ratio 100%, homozygous), and mutation of the genotype to GT may lead to disease. From the sequencing data, the inventors calculate the percentage of allele G, denoted A, and the percentage of the other, pathogenic allele, denoted B (the pathogenic variant allele percentage). When B is 3% or more, the pathogenic mutation is considered present. If B is 1%-3%, the result is in the gray zone and re-detection is needed. If B is 1% or less, the mutation is judged to be absent.
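The B-value decision rule, combined with the 1000X depth quality control described earlier, can be sketched as follows. The function name and labels are hypothetical. Treating a depth of exactly 1000 as passing is an assumption, as is the handling of the exact 1% and 3% boundaries, which follows this paragraph's wording.

```python
def classify_variant(alt_depth, total_depth, min_depth=1000):
    """Classify a known pathogenic site from its variant allele percentage B."""
    if total_depth < min_depth:
        return "QC fail"            # coverage too low to judge the site
    b = alt_depth / total_depth     # pathogenic variant allele percentage B
    if b >= 0.03:
        return "mutation present"
    if b > 0.01:
        return "gray zone"          # re-detection needed
    return "mutation absent"

# Made-up depths at a single site.
for alt, total in [(100, 2000), (50, 2000), (10, 2000), (10, 500)]:
    print(alt, total, classify_variant(alt, total))
```

Because the fetal fraction is small and the fetus is heterozygous, a true fetal variant appears at only a few percent of reads, which is why the thresholds sit at 1% and 3% and why deep coverage is required.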

In the second part, specific fragments of the plasma free DNA are enriched using enzyme, dNTPs, buffer and multiplex primers, so that the process from plasma free DNA to finished library is completed in a single experimental step. There are 3 implementations of this part:

First, the gene-specific primer, the specific adapter and the specific barcode primer are added at the same time. The Tm value of the gene-specific primer binding to the plasma free DNA template is set 4±2 ℃ higher than the Tm value of the specific adapter and the specific barcode primer binding to the product amplified from the gene-specific primer and the plasma free DNA template. The PCR first uses a high annealing temperature so that the gene-specific primer amplifies preferentially and, after a few cycles (generally 6-8), switches to a low annealing temperature to amplify with the specific adapter and the specific barcode primer, forming a complete library.

Secondly, synthesizing a fused primer, namely, a primer containing a gene specific module and a specific joint or a specific barcode primer module, forming an upstream primer and a downstream primer, amplifying and enriching a plasma free DNA template, and finally forming a complete library.

Thirdly, the upstream gene specific primer and the specific joint form a fusion primer, the downstream gene specific primer and the specific barcode primer are respectively 2 primers, when the amplification is started, the high annealing temperature is used first, the amplification of the upstream fusion primer and the downstream gene specific primer to the plasma free DNA template is preferentially carried out, and then the low annealing temperature is used, so that the amplification of the upstream fusion primer and the specific barcode primer is carried out. Meanwhile, in the scheme, a fusion primer can be formed by the downstream gene specific primer and the specific joint, the upstream gene specific primer and the specific primer are respectively 2 primers, when the amplification is started, high annealing temperature is firstly used, the amplification of the downstream fusion primer and the upstream gene specific primer to the plasma free DNA template is preferentially carried out, and then the low annealing temperature is used, so that the amplification of the downstream fusion primer and the specific primer is carried out, and finally the complete library is formed.

Meanwhile, in all the implementation modes, a specific recognition tag of 4-8bp is added at the 5' end of a gene specific primer, and the tag is screened by using an information analysis method, wherein the screening principle is as follows: 1) The tag is not present in the amplified target gene; 2) The probability of occurrence in the human genome is low, preferably no sequences in the genome are present, and the probability of occurrence is selected to be the lowest. Preferably, the specific recognition tag is selected to be 5bp in length. The specific identification tag can distinguish and separate the first part and the second part of products in the final sequencing stage, and the addition of the tag brings great convenience in the actual use process.
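The two screening principles for the recognition tag can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names, the check on both strands, and the use of plain strings for the amplicons and the genome are hypothetical choices, not the inventors' screening tool.

```python
from itertools import product

def revcomp(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_occurrences(seq: str, kmer: str) -> int:
    """Count overlapping occurrences of kmer in seq."""
    k = len(kmer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)

def rank_tags(amplicons, genome, k=5):
    """Return k-bp tags absent from every amplicon, rarest in the genome first.

    Checking both strands is an assumption of this sketch.
    """
    candidates = []
    for bases in product("ACGT", repeat=k):
        tag = "".join(bases)
        rc = revcomp(tag)
        # Principle 1: the tag must not occur in any amplified target.
        if any(tag in a or rc in a for a in amplicons):
            continue
        # Principle 2: prefer tags that are rare (ideally absent) in the genome.
        hits = count_occurrences(genome, tag) + count_occurrences(genome, rc)
        candidates.append((hits, tag))
    candidates.sort()
    return [tag for _, tag in candidates]

if __name__ == "__main__":
    # Toy sequences only; a real screen would scan the reference genome.
    print(rank_tags(["ACGTACGTAC"], "AAAAAAAAAACCCCC", k=2)[:3])
```

For a real genome the counting step would be replaced by an indexed k-mer count; the ranking logic stays the same.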

The enrichment of the target fragments of the plasma free DNA and the construction of the library are completed by the multiplex one-step method. The amplified products are purified with magnetic beads to remove the enzyme mixed solution and excess primers, and the products are then quantified.

The inventors also compared the multiplex one-step method (the first implementation described above) with a conventional PCR system.

The inventors performed multiplex PCR on plasma DNA sample 1, plasma DNA sample 2 and plasma DNA sample 3 by the conventional method: amplification with the gene-specific primers, purification of the amplified products with magnetic beads to remove the enzyme mixed solution and excess primers, then amplification of the multiplex PCR products with the joint primer and the barcode primer to complete library construction. Electrophoresis after this step is shown in FIG. 1. The gel shows an amplified product of about 230 bp and a primer dimer larger than 100 bp (about 130 bp), which is difficult to remove cleanly with 1.0 volume of magnetic beads during purification. Primer dimer remaining in the sample affects sequencing, mainly in that: 1) it distorts quantification, so that the sequencing data output deviates from expectation; 2) it affects other libraries, for example by occupying their share of the data; 3) it lowers sequencing quality.

FIG. 2 shows the electrophoresis of the products obtained when the inventors first amplified plasma DNA sample 1, plasma DNA sample 2 and plasma DNA sample 3 with the initial multiplex one-step system: a very large number of non-specific sequences and primer dimers are present. The inventors tested and optimized the system over multiple rounds, then amplified the same three samples with the final system. As shown in FIG. 3, the final system yields a single target band of about 230 bp and a primer dimer smaller than 100 bp (about 80 bp), which, as known to those skilled in the art, is very easily removed with 1.0 volume of magnetic beads during product purification.

In each of the examples below, the specific procedure from plasma free DNA extraction to joint ligation is as follows:

1. extraction of plasma free DNA

The procedure strictly follows the instructions of a magnetic-bead free DNA extraction kit (such as IVD5432, guangdong ear mechanical equipment 20150062), with an elution volume of 65 µL. If the concentration of the extracted DNA is in the range of 0.01-0.2 ng/µL, the next library construction operation can be carried out. The obtained DNA sample is dissolved in AE and may be stored at 2-8 °C if processed within 7 days, or frozen at -20 ± 5 °C for up to 3 years. Low-temperature transport at -20 ± 5 °C must not exceed 7 days, and the extracted DNA sample must not be frozen and thawed more than 5 times.

2. Fetal DNA enrichment

The obtained plasma free DNA sample is divided into 2 parts: 50 µL enters the enrichment procedure below; the other part goes directly to the multiplex one-step procedure;

2.1 Add 1.2 volumes (60 µL) of magnetic beads A to the sample (50 µL), mix thoroughly, stand at room temperature for 5 min, and place on a magnetic rack for 3 min until the solution is clear;

2.2 Transfer the supernatant to the corresponding well containing magnetic beads B at 1.0 times the original sample volume (50 µL), mix thoroughly, stand at room temperature for 5 min, and place on the magnetic rack for 3 min until the solution is clear; discard the supernatant;

2.3 Add 200 µL of 80% ethanol, pipette 6 times against the side of the tube away from the bead pellet, stand on the magnetic rack until the solution is clear, and discard the supernatant;

2.4 Repeat step 2.3 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;

2.5 Add 27 µL of the absorption Buffer, mix thoroughly, stand at room temperature for 5 min, and place the centrifuge tube on the magnetic rack for 3 min until the solution is clear;

2.6 Transfer 25 µL of supernatant to a new 200 µL PCR tube and store at 4 °C until use.

3. A first part: end repair

3.1 Take an appropriate amount of the plasma free DNA sample to be tested into one 0.2 mL PCR tube, make up to a total volume of 50 µL with Nuclease-free Water, mix thoroughly, and centrifuge briefly;

3.2 Add 10 µL of Endprep Mix end repair reaction mixed solution to the PCR tube from step 3.1, mix thoroughly, centrifuge briefly, and run on a PCR instrument according to the following program:

3.3 After the reaction, the PCR tube was taken out and centrifuged instantaneously.

4. Joint connection

4.1 Prepare the amount of ligation reaction mixed solution required for detection in a centrifuge tube according to the proportions in the table, mix thoroughly, and centrifuge briefly.

4.2 Add the prepared ligation reaction mixed solution (35 µL) to the end repair product of each sample, add 6 µL of Yeasen Adapter X (10 pmol/µL) to each sample, mix thoroughly, and centrifuge briefly. Run on a PCR instrument according to the following program:

Note: one Adapter X is added to each sample.

4.3 After the reaction, the PCR tube was taken out and centrifuged instantaneously.

4.4 Joint ligation product purification

4.4.1 Add 80 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;

4.4.2 Add 200 µL of 75% ethanol, pipette 3 times, place on the magnetic rack for 3 min until the solution is clear, and discard the supernatant;

4.4.3 Repeat step 4.4.2 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;

4.4.4 Add 22 µL of the absorption Buffer, mix thoroughly, stand at room temperature for 5 min, and place the centrifuge tube on the magnetic rack for 3 min until the solution is clear;

4.4.5 Transfer 20 µL of supernatant to a new 200 µL PCR tube and store at 4 °C until use.

The specific data analysis method is as follows:

the analysis method comprises the following steps:

cutting the reference genome according to the unit window length into a plurality of first-level windows, wherein the unit window length is 100 kbp to 5 Mbp;

correcting the uncorrected frequency numbers (the raw per-window counts) for GC content and for batch systematic error through a linear regression model;
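The windowing and GC-correction steps can be sketched as follows. This is a minimal illustration rather than the inventors' algorithm: the function names and the 1 Mbp default are assumptions, the correction simply subtracts a least-squares linear GC trend, and the batch systematic error term is not modeled.

```python
def make_windows(chrom_lengths, window_size=1_000_000):
    """Cut each chromosome into consecutive fixed-length windows.

    The last window of a chromosome may be shorter than window_size.
    """
    windows = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, window_size):
            windows.append((chrom, start, min(start + window_size, length)))
    return windows

def gc_correct(counts, gc_values):
    """Remove the linear GC trend from per-window counts (least squares)."""
    n = len(counts)
    mean_gc = sum(gc_values) / n
    mean_ct = sum(counts) / n
    sxx = sum((g - mean_gc) ** 2 for g in gc_values)
    sxy = sum((g - mean_gc) * (c - mean_ct) for g, c in zip(gc_values, counts))
    slope = sxy / sxx if sxx else 0.0
    # Subtract the fitted GC effect while keeping the overall mean depth.
    return [c - slope * (g - mean_gc) for c, g in zip(counts, gc_values)]

if __name__ == "__main__":
    print(make_windows({"chrT": 2_500_000}))
    print(gc_correct([100, 110, 120], [0.4, 0.5, 0.6]))
```

Any window length within the stated 100 kbp to 5 Mbp range can be passed as `window_size`.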

prediction of fetal nucleic acid percentages:

estimating fetal nucleic acid percentage PY from the Y chromosome:

selecting a principal component set PCs, introducing an artificial neural network with 2 hidden layers, trained by resilient backpropagation with weight backtracking and with a sum-of-squared-residuals loss function, and re-weighting the PCs using a sample set labeled with the known fetal nucleic acid percentage PY estimated from the fetal Y chromosome, to construct a weight-principal component-PY neural network model;

predicting fetal nucleic acid percentage PF: the corrected window frequency numbers of the chromosomes of the sample to be tested are converted into the principal component set PCs by principal component analysis, and the output neuron value PF is obtained from the weight-principal component-PY neural network model; when the calculated fetal nucleic acid percentage PF is lower than 3.5%, the result is judged unreliable and a peripheral blood sample must be collected again;

randomly setting a breakpoint position and calculating a significance level p-value by a statistical test on the corrected frequency numbers of the windows to the left and right of the breakpoint; if the calculated p-value is smaller than the preset significance level p-set (0.05), the window is selected as a candidate microdeletion/microduplication window; if the calculated p-value is greater than p-set (0.05), windows continue to be merged for the next round of testing until a candidate microdeletion/microduplication window is obtained;
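The breakpoint test with window merging can be illustrated as below. The text does not name the statistical test, so this sketch substitutes a Welch-type statistic with a normal approximation; the function names, the starting flank of 3 windows and the `max_span` limit are all assumptions made for the example.

```python
from statistics import NormalDist, mean, variance

def two_sided_p(left, right):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(left) / len(left) + variance(right) / len(right)) ** 0.5
    if se == 0:
        return 0.0 if mean(left) != mean(right) else 1.0
    z = (mean(left) - mean(right)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def scan_breakpoint(counts, bp, p_set=0.05, start=3, max_span=50):
    """Grow the flanks around index bp until both sides differ significantly.

    Returns (span, p_value) for a candidate window, or None if none is found.
    """
    span = start
    while span <= max_span and bp - span >= 0 and bp + span <= len(counts):
        p = two_sided_p(counts[bp - span:bp], counts[bp:bp + span])
        if p < p_set:
            return span, p  # candidate microdeletion/microduplication window
        span += 1           # merge one more window per side and test again
    return None

if __name__ == "__main__":
    depth = [100, 101, 99, 100, 102, 98, 50, 51, 49, 50, 52, 48]
    print(scan_breakpoint(depth, bp=6))
```

In practice the breakpoint index would be drawn repeatedly along each chromosome; the sketch tests a single given position.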

for a candidate microdeletion, if R is less than or equal to 0.5, a negative variation is determined;

if 0.5 < R < 0.8, the result falls in the gray zone and re-detection is needed;

for a candidate microduplication, if R is less than or equal to 0.5, a negative variation is determined;

if 0.5 < R < 0.8, the result falls in the gray zone and re-detection is needed;

when depth (normal) -depth (abnormal) is 0, it is judged to be normal;
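The decision rules for a candidate window can be written as a small classifier. The sketch uses the ratio defined in claim 2 (R2 = PA/PF) and the thresholds listed there; applying the positive and maternal-origin thresholds to microduplications as well as microdeletions is an assumption, and the function and label names are hypothetical.

```python
def classify_cnv(depth_normal, depth_abnormal, pa, pf):
    """Classify a candidate window from its depth difference and R2 = PA / PF.

    pa: abnormal percentage of the candidate fragment; pf: fetal nucleic acid
    percentage. Both must be expressed in the same unit.
    """
    diff = depth_normal - depth_abnormal
    if diff == 0:
        return ("normal", None)
    # Positive difference: preliminary microdeletion; negative: microduplication.
    kind = "microdeletion" if diff > 0 else "microduplication"
    r2 = pa / pf
    if r2 > 5:
        call = "positive, possibly maternal"
    elif r2 >= 0.8:
        call = "positive"
    elif r2 > 0.5:
        call = "gray zone"
    else:
        call = "negative"
    return (kind, call)

if __name__ == "__main__":
    print(classify_cnv(10, 8, 6, 10))
```

A gray-zone call corresponds to the re-detection requirement in the text, and a maternal-origin call means the fetal status cannot be inferred from this signal alone.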

chromosome copy number variation determination:

and (3) three-body judgment:

determination of XO:

determination of XXX:

determination of XXY:

Determination of XYY:

when PC_chrX is more than 3 percent and PC_chrY is more than 3 percent, and the ratio of PC_chrY to PC_chrX is more than 1.7, the result is judged to be XYY positive;
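The copy number determination can be sketched with the quantities defined in claim 1 (PC_chri and R1 = |PC_chri|/PF), the trisomy thresholds of claim 4 and the XYY rule above. Treating PC and PF as fractions (0.03 for 3%) and returning "undetermined" for combinations the thresholds do not cover are assumptions of this sketch; the function names are hypothetical.

```python
def copy_number_percent(coverage, normal_coverages):
    """PC_chri = 2 * (coverage - average normal coverage) / average normal coverage."""
    avg = sum(normal_coverages) / len(normal_coverages)
    return 2 * (coverage - avg) / avg

def classify_trisomy(pc, pf):
    """Apply the autosomal trisomy thresholds to PC_chri and R1 = |PC_chri| / PF."""
    r1 = abs(pc) / pf
    if pc > 0.03 and r1 > 1.8:
        return "positive (possible maternal or placental effect)"
    if pc > 0.03 and 0.2 < r1 < 1.8:
        return "positive"
    if pc > 0.03 and 0.1 < r1 < 0.2:
        return "gray zone"
    if pc <= 0.03 and r1 < 0.1:
        return "negative"
    return "undetermined"  # combination not covered by the stated thresholds

def is_xyy(pc_x, pc_y):
    """XYY rule: PC_chrX > 3%, PC_chrY > 3% and PC_chrY / PC_chrX > 1.7."""
    return pc_x > 0.03 and pc_y > 0.03 and pc_y / pc_x > 1.7

if __name__ == "__main__":
    pc21 = copy_number_percent(0.0156, [0.0150, 0.0150])
    print(pc21, classify_trisomy(pc21, pf=0.10))
```

For a fetal trisomy the affected chromosome's share rises by about half the fetal fraction, so PC is close to PF and R1 is close to 1, which is why the positive band sits around that value.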

prediction of monogenic genetic disease:

Example 1:

when a pregnant woman is known to have a single gene dominant disease, it is necessary to detect the genotype of the fetus.

After plasma free DNA extraction, fetal DNA enrichment, first-part end repair and joint ligation have been carried out according to the steps above, the following operations are performed:

5. multiplex one-step amplification

5.1 The reaction mixtures were prepared in 200 μl PCR tubes according to the ratios in the table below.

Note that: the same sample was used as the barcode.

5.2 Add 2 µL of plasma free DNA to the prepared reaction mixed solution, mix thoroughly, and centrifuge briefly. Run on a PCR instrument according to the following program:

after the reaction, the PCR tube was taken out and centrifuged instantaneously.

5.3 Multiple one-step process product purification

5.3.1 Add 25 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;

5.3.2 Add 200 µL of 75% ethanol, pipette 3 times, place on the magnetic rack for 3 min until the solution is clear, and discard the supernatant;

5.3.3 Repeat step 5.3.2 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;

5.3.4 Add 17 µL of the solution Buffer, mix thoroughly, stand at room temperature for 5 min, and place the centrifuge tube on the magnetic rack for 3 min until the solution is clear;

5.3.5 Transfer 15 µL of supernatant to a new 200 µL PCR tube and store at 4 °C until use.

6.YS-PCR

6.1 preparing PCR reaction mixed solution with the amount required for detection in a 200 mu L PCR tube according to the proportion of the table below;

6.2 Add 30 µL of the prepared PCR reaction mixed solution to the joint ligation product obtained in 4.4.5, mix thoroughly, and centrifuge briefly. Run on a PCR instrument according to the following program:

after the reaction, the PCR tube was taken out and centrifuged instantaneously.

6.3 YS-PCR product purification

6.3.1 Add 50 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;

6.3.2 Add 200 µL of 75% ethanol, pipette 3 times, place on the magnetic rack for 3 min until the solution is clear, and discard the supernatant;

6.3.3 Repeat step 6.3.2 once, air-dry at room temperature for 3 min, and remove the centrifuge tube from the magnetic rack;

6.3.4 Add 17 µL of the solution Buffer, mix thoroughly, stand at room temperature for 5 min, and place the centrifuge tube on the magnetic rack for 3 min until the solution is clear;

6.3.5 Transfer 32 µL of supernatant to a new 200 µL PCR tube and store at 4 °C until use.

7. Library quantification

The samples from 5.3.5 and 6.3.5 are equilibrated to room temperature and measured by Qubit. The sample from 5.3.5 and the sample from 6.3.5 are mixed in a ratio of 1:12000.

8. Single stranded circularized library construction

8.1 denaturation

8.1.1 According to the fragment length from the previous step, take 1 pmol of DNA into a 0.2 mL PCR tube and make up to 34 µL with ddH₂O.

8.1.2 The denaturation reaction mixture was prepared according to the following table

8.1.3 The PCR tube was placed on a PCR instrument and reacted under the following conditions:

immediately after the reaction was completed, the PCR tube was transferred to ice and left to stand for 2 min.

8.2 Single Strand cyclization

8.2.1 Single Strand cyclization reaction solutions were prepared on ice according to the following table:

shaking and mixing at low speed, and centrifuging for a short time to centrifuge the reaction solution to the bottom of the tube.

8.2.2 The PCR tube was placed on a PCR instrument and reacted under the following conditions:

after the reaction is finished, the reaction mixture is transferred to the next step.

8.3 digestion by enzyme digestion

8.3.1 preparation of the digestion reaction System on ice according to the following table:

8.3.2 The PCR tube was placed on a PCR instrument and reacted under the following conditions:

after the reaction, the mixture was centrifuged instantaneously and immediately purified.

8.4 digestion product purification

8.4.1 Add 120 µL of Hieff NGS DNA Selection Beads to the digestion product from 8.3.2, mix by vortexing or pipetting, and incubate at room temperature for 10 min;

8.4.2 Centrifuge the PCR tube briefly and place it on a magnetic rack to separate the beads from the liquid; once the solution is clear (about 2 min), carefully remove the supernatant;

8.4.3 Keeping the PCR tube on the magnetic rack, add 200 µL of freshly prepared 80% ethanol to rinse the beads, incubate at room temperature for 30 sec, and carefully remove the supernatant;

8.4.4 Repeat step 8.4.3 for a total of two rinses;

8.4.5 Keeping the PCR tube on the magnetic rack, open the lid and air-dry the magnetic beads until cracks just appear;

8.4.6 Remove the PCR tube from the magnetic rack, add 22 µL of TE Buffer, mix thoroughly by vortexing or gentle pipetting, and stand at room temperature for 10 min;

8.4.7 Centrifuge briefly, place the PCR tube on the magnetic rack and, once the solution is clear (about 2 min), carefully transfer the supernatant to a new PCR tube.

Stopping point: the purified circularized product can be stored at -20 °C for up to one month.

8.5 digestion product control

The digested products were quantified using the Qubit ssDNA Assay Kit fluorescent reagent.

9 on-machine sequencing

And (5) performing on-machine sequencing on the library with qualified quality control according to the on-machine sequencing protocol.

10 data analysis

Fetal chromosomal aneuploidies and microdeletions are analyzed, with the main focus on detecting the gene of the mother's single-gene dominant genetic disease.

The analysis results were as follows:

T21 detection example

T18 detection example

T13 detection example

Example 2: realizing aneuploidy+microdeletion microreplication detection

5.1 preparing PCR reaction mixed solution with the amount required for detection in a 200 mu L PCR tube according to the proportion of the table below;

5.2 Add 30 µL of the prepared PCR reaction mixed solution to the joint ligation product obtained in 4.4.5, mix thoroughly, and centrifuge briefly. Run on a PCR instrument according to the following program:

after the reaction, the PCR tube was taken out and centrifuged instantaneously.

5.3 YS-PCR product purification

5.3.1 Add 50 µL of Yeasen XP magnetic beads to each sample, mix thoroughly, stand at room temperature for 5 min, place on a magnetic rack for 3 min until the solution is clear, and discard the supernatant;

5.3.5 Transfer 32 µL of supernatant to a new 200 µL PCR tube and store at 4 °C until use.

6. Library quantification

The sample from 5.3.5 is taken out, equilibrated to room temperature and measured by Qubit.

7. Single stranded circularized library construction

7.1 denaturation

7.1.1 According to the fragment length from the previous step, take 1 pmol of DNA into a 0.2 mL PCR tube and make up to 34 µL with ddH₂O.

7.1.2 The denaturation reaction mixture was prepared according to the following table

7.1.3 The PCR tube was placed on a PCR instrument and reacted under the following conditions:

7.2 Single Strand cyclization

7.2.1 Single Strand cyclization reaction solutions were prepared on ice according to the following table:

7.2.2 The PCR tube was placed on a PCR instrument and reacted under the following conditions:

7.3 digestion by enzyme digestion

7.3.1 preparation of the digestion reaction on ice according to the following table:

7.3.2 The PCR tube was placed on a PCR instrument and reacted under the following conditions:

7.4 digestion product purification

7.4.1 Add 120 µL of Hieff NGS DNA Selection Beads to the digestion product from 7.3.2, mix by vortexing or pipetting, and incubate at room temperature for 10 min;

7.4.2 Centrifuge the PCR tube briefly and place it on a magnetic rack to separate the beads from the liquid; once the solution is clear (about 2 min), carefully remove the supernatant;

7.4.3 Keeping the PCR tube on the magnetic rack, add 200 µL of freshly prepared 80% ethanol to rinse the beads, incubate at room temperature for 30 sec, and carefully remove the supernatant;

7.4.4 Repeat step 7.4.3 for a total of two rinses;

7.4.5 Keeping the PCR tube on the magnetic rack, open the lid and air-dry the magnetic beads until cracks just appear;

7.4.6 Remove the PCR tube from the magnetic rack, add 22 µL of TE Buffer, mix thoroughly by vortexing or gentle pipetting, and stand at room temperature for 10 min;

7.4.7 Centrifuge briefly, place the PCR tube on the magnetic rack and, once the solution is clear (about 2 min), carefully transfer the supernatant to a new PCR tube.

7.5 digestion product control

8. Sequencing on machine

9. Data analysis

2 examples of microdeletion detection:

three bodies: examples of detection of T13, T21, T18

XO, XXX, XXY, XYY detection example

Example 3: detection of aneuploidy + microdeletion + achondroplasia

5. multiplex one-step amplification

Note that: the same sample was used as the barcode.

after the reaction, the PCR tube was taken out and centrifuged instantaneously.

5.3 Multiple one-step process product purification

6 YS-PCR

Reference is made to the YS-PCR procedure of example 1.

7. Library mixing

7.1 The sample from 5.3.5 is taken out, equilibrated to room temperature and measured by Qubit.

7.2 The sample from 6.3.5 is taken out, equilibrated to room temperature and measured by Qubit.

7.3 According to the detection results of 7.1 and 7.2, the 6.3.5 sample and the 5.3.5 sample are mixed in a ratio of 2000:1. The 5.3.5 sample is serially diluted if necessary.

8. Single stranded circularized library construction

Reference is made to the single stranded circularized library construction procedure of example 1.

9. Sequencing on machine

10. Data analysis

Results: 1T 21 positive was detected in 20 samples, and other abnormalities were not detected.

Example 4: realizing detection of aneuploid, microdeletion microreplication and single gene dominant genetic disease

5. multiplex one-step amplification

Reference is made to the multiplex one-step amplification procedure of example 1.

6 YS-PCR

Reference is made to the YS-PCR procedure of example 1.

7 library mix

7.1 The sample from 5.3.5 is taken out, equilibrated to room temperature and measured by Qubit.

7.2 The sample from 6.3.5 is taken out, equilibrated to room temperature and measured by Qubit.

7.3 According to the detection results for 5.3.5 and 6.3.5, the 5.3.5 sample and the 6.3.5 sample are mixed in a ratio of 1:1000. The 5.3.5 sample is serially diluted if necessary.

8 Single Strand cyclization library construction

9. Sequencing on machine

10. Data analysis

Monogenic genetic disease analysis procedure

1, After sequencing, the different samples are distinguished according to index number. Within each sample, the measured molecular sequences are first corrected using UMIs.

a) Each cfDNA molecule is uniquely tagged with a specific UMI (unique molecular index). All reads are split according to their UMI sequences, and reads carrying the same UMI are grouped together, meaning that the reads in a group are derived from the same cfDNA molecule.

b) Reads from the same cfDNA molecule are compared sequentially at each base site. Starting from the first base site of the group of reads: if only ≤30% of the reads share a base at that site, the base is regarded as background noise, since the group of reads is derived from the same cfDNA molecule; if ≥70% of the reads share the same base at the site, the base type of the site is confirmed; if only 30%-70% of the reads share the same base, the site is designated an N base (no call).

c) The same statistics are then carried out at the second base site of the group of reads, and so on until the last site of the group. The base sequence of the reads derived from the same cfDNA molecule is thereby obtained.
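Steps a) to c) amount to building one consensus sequence per UMI group. A minimal sketch follows; the function name, the input format of `(umi, sequence)` pairs and the truncation to the shortest read in a group are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def umi_consensus(reads, confirm=0.70):
    """Build one consensus sequence per UMI group.

    reads: iterable of (umi, sequence) pairs. At each site the most common
    base is called if at least `confirm` of the reads agree; otherwise the
    site becomes 'N' (no call).
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        length = min(len(s) for s in seqs)  # assumption: compare the shared length
        called = []
        for i in range(length):
            base, n = Counter(s[i] for s in seqs).most_common(1)[0]
            called.append(base if n / len(seqs) >= confirm else "N")
        consensus[umi] = "".join(called)
    return consensus

if __name__ == "__main__":
    demo = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACGA"), ("AAT", "ACGT"),
            ("GGC", "TTTT"), ("GGC", "TTTA")]
    print(umi_consensus(demo))
```

Bases supported by 30% of the reads or fewer never reach the consensus, which matches their treatment as background noise.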

2, The corrected base sequences of the sequenced cfDNA molecules are aligned to chr4 of hg19 using the bwa aln algorithm.

3, The base at each covered site on hg19 is examined with samtools, using the mpileup command. For each site, the depth and the fraction of total depth of A, T, G, C, N, insertions and deletions are counted.

4, The position of the single-gene disease is selected and the total coverage depth at that position is checked. If the total depth is less than 1000X, quality control is not passed; if the total depth is greater than 1000X, quality control is passed.

6, The site of the pathogenic mutation to be examined is located. If the depth percentage of the pathogenic mutation is greater than 3%, the mutation is considered present. If the depth percentage is 1%-3%, the result falls in the gray zone and detection must be repeated. If the depth percentage is below 1%, the mutation is judged absent.
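The quality-control and calling thresholds of steps 4 and 6 can be expressed as a short function. The per-site counts are assumed to come from a pileup already parsed into a dictionary; the function name and the treatment of a depth of exactly 1000X as passing are assumptions, since the text only specifies "less than" and "greater than".

```python
def call_site(depths, mutant, min_depth=1000):
    """Classify one site from per-allele depths.

    depths: mapping such as {"G": 1900, "A": 100}; mutant: the pathogenic allele.
    """
    total = sum(depths.values())
    if total < min_depth:
        return "QC fail"
    frac = depths.get(mutant, 0) / total
    if frac > 0.03:
        return "mutation present"
    if frac >= 0.01:
        return "gray zone"  # 1%-3%: re-test required
    return "mutation absent"

if __name__ == "__main__":
    print(call_site({"G": 1900, "A": 100}, mutant="A"))
```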

The analysis results are shown in the following table:

the above description of the present invention is further illustrated in detail and should not be taken as limiting the practice of the present invention. It is within the scope of the present invention for those skilled in the art to make simple deductions or substitutions without departing from the concept of the present invention.

Claims

1. A system for noninvasive prenatal detection of a fetus comprising a sequencing device, a data analysis device, and a result output device, wherein:

the analysis method of the data analysis device is characterized by comprising the following steps:

prediction of fetal nucleic acid percentages:

estimating fetal nucleic acid percentage PY from the Y chromosome: calculating the average depth Y of the Y chromosome, wherein the average depth Y = sum of the corrected frequency numbers of each window on the Y chromosome / the whole-chromosome base length of the Y chromosome, wherein the average depth Y of the Y chromosome of a normal male is denoted Dmale, the average depth Y of the Y chromosome of a normal female is denoted Dfemale, the average depth of the Y chromosome of the sample to be detected is denoted Dtest, and the fetal nucleic acid percentage of the sample to be detected is PY= [ (Dtest-Dfemale)/Dlemale ] -Dfemale;

selecting PCs as a main component set, introducing an artificial neural network, and reallocating weights to the PCs by using a sample set of known fetal Y chromosome fetal nucleic acid percentage PY on a label to construct a weight-main component-PY neural network model;

chromosome copy number variation determination:

the percent chromosomal copy number variation was obtained as pc_chri (i=1, 2, … …,22, x, y), comprising the steps of:

performing a duty cycle coverage_chri (i=1, 2, … …, x, y) calculation for each chromosome, coverage_chri=sum of all corrected window frequency numbers falling on the chromosome/all window frequency numbers falling on the chromosome of the sample;

Calculating the occupancy ratio Coverage-normal_chri (i=1, 2, … …, x, y) of each chromosome based on a known normal sample, wherein Coverage-normal_chri=the sum of all corrected window frequency numbers falling on the chromosome/all window frequency numbers falling on the chromosome of the sample, and taking the occupancy ratio of each chromosome of all samples to obtain Average (Coverage-normal_chri) (i=1, 2, … … … …, x, y);

calculating the copy number variation percentage PC_chri, PC_chri (i=1, 2, … …,22, X, Y) =2× (coverage_chri-Average (Coverage-normal_chri))/Average (Coverage-normal_chri) of each chromosome of the sample to be tested;

calculating the ratio R1 of the copy number variation percentage PC_chri of each chromosome of the sample to be detected to the fetal nucleic acid percentage PF, wherein R1= |PC_chri|/PF;

determining chromosomal copy number variation from pc_chri (i=1, 2, … …,22, x, y) and R1 values;

prediction of monogenic genetic disease:

marking reads from the same cfDNA as a group, aligning sequentially at each base site, counting from the first base site of the group of reads, and if only less than or equal to 30% of reads are the same base at the site, the base is considered as background noise; if more than or equal to 70% of reads are the same base at the site, the base type of the site is confirmed; if only 30% -70% of reads contain the same base, then that base is designated as an N base;

detecting the base of each covered site, and counting the respective depth to depth ratio of A, T, G, C, N, insertion and delete at each site;

selecting a single gene disease position to check the total coverage depth of the position, if the total depth coverage is smaller than 1000 x, the quality control is not passed, and if the total depth coverage is larger than 1000 x, the quality control is passed;

2. The system of claim 1, wherein the chromosomal microdeletion microrepeat variation determination comprises the steps of:

calculating the ratio R2 of the abnormal percentage of microdeletion micro-repetitive fragments to the percentage of fetal nucleic acid, r2=pa/PF;

Judging whether the sample is positive or negative of the microdeletion microrepeated variation according to R2:

if the depth (normal) -depth (abnormal) of the microdeletion repeat region is positive, primarily judging that the microdeletion variation is detected, and filtering a negative signal through an R2 value to determine that the microdeletion variation is the final microdeletion positive variation;

if R2 > 5, prompting that the positive signal is possibly of maternal origin, and whether the fetus carries the variation cannot be predicted;

if R2 is more than or equal to 0.8 and less than or equal to 5, positive variation is determined;

if R2 is less than or equal to 0.5, determining negative variation;

if 0.5< R2<0.8, it is the gray area, and re-detection is needed;

if depth (normal) -depth (abnormal) of the microdeletion repeat region is negative, primarily judging that the microdeletion repeat region is micro-repeat variation, and filtering a negative signal through an R2 value to confirm that the microdeletion repeat region is final microdeletion positive variation;

if R2 is less than or equal to 0.5, determining negative variation;

if 0.5< R2<0.8, it is the gray area, and re-detection is required.

3. The system of claim 1, wherein a principal component set PCs is selected, an artificial neural network with 2 hidden layers is introduced, resilient backpropagation with weight backtracking and a sum-of-squared-residuals loss function are adopted, and the PCs are re-weighted using a sample set labeled with the known fetal Y chromosome fetal nucleic acid percentage PY, to construct a weight-principal component-PY neural network model.

4. The system of claim 1, wherein the determination criteria for chromosomal copy number variation are as follows:

and (3) three-body judgment:

if pc_chri (i=1, 2, … …, 22) >0.03 and R1>1.8, a trisomy positive is reported, while indicating maternal or placental effects;

if pc_chri (i=1, 2, … …, 22) >0.03, and 0.2< R1<1.8, then determining the i chromosome as chri trisomy positive;

if pc_chri (i=1, 2, … …, 22) >0.03, and 0.1< R1<0.2, then determining the gray zone, which needs to be re-tested;

if pc_chri (i=1, 2, … …, 22) is less than or equal to 0.03 and R1<0.1, determining that the report is negative;

determination of XO:

when PC_chrX is less than or equal to-0.03, -0.01< PC_chrY <0.01, R1 is more than or equal to 0.8, judging positive;

when -0.03 < PC_chrX <0.03, -0.01< PC_chrY <0.01, and R1 is less than 0.3, determining negative;

determination of XXX:

when PC_chrX >0.03, -0.01< PC_chrY <0.01, R1 is more than or equal to 0.8, judging positive;

when-0.03 < PC_chrX <0.03, -0.01< PC_chrY <0.01, R1 is less than 0.3, then judging as negative;

determination of XXY:

determination of XYY:

5. The system of claim 1, wherein, when the calculated fetal nucleic acid percentage PF is less than 3.5%, the result is judged unreliable and the peripheral blood sample must be collected again.

6. The system of claim 1, wherein the uncorrected frequency numbers are corrected according to GC values and batch systematic errors through a linear regression model.

7. The system according to any one of claims 1 to 6, wherein p-set is equal to or less than 0.05.

8. The system of any one of claims 1 to 6, wherein the method of constructing a sequencing library comprises the steps of:

extracting free DNA from a pregnant woman peripheral blood sample;

performing end repair, A addition and adaptor addition on free DNA, and performing PCR amplification on target fragments, wherein the implementation method of the PCR amplification is selected from one of the following:

meanwhile, adding a gene specific primer, a specific joint and a specific barcode primer, wherein the Tm value of the combination of the gene specific primer and a plasma free DNA template is 2-6 ℃ higher than the Tm value of the combination of a product amplified by the gene specific primer and the plasma free DNA template and the specific joint and the specific barcode primer, the PCR process firstly uses high annealing temperature to amplify the gene specific primer to a set concentration, and then uses low annealing temperature to amplify the specific joint and the specific barcode primer, thus forming a complete library; or (b)

Synthesizing a fusion primer, namely, a primer contains a gene specific module and a specific joint or a specific barcode primer module, forming an upstream primer and a downstream primer, amplifying and enriching a plasma free DNA template, and finally forming a complete library; or (b)

The upstream gene specific primer and the specific joint form an upstream fusion primer, a downstream gene specific primer and a specific barcode primer, when the amplification is started, the high annealing temperature is used firstly, the fusion primer and the downstream gene specific primer are utilized to amplify the plasma free DNA template, and then the low annealing temperature is used to amplify the upstream fusion primer and the specific barcode primer, so that a complete library is finally formed; or (b)

The downstream gene specific primer and the specific joint form a downstream fusion primer, an upstream gene specific primer and a specific barcode primer, when the amplification is started, the high annealing temperature is used firstly, the downstream fusion primer and the upstream gene specific primer are utilized for amplifying the plasma free DNA template, and then the low annealing temperature is used for amplifying the downstream fusion primer and the specific barcode primer, so that a complete library is finally formed.

9. The system of claim 8, wherein the difference in Tm values between the downstream gene-specific primer and the specific barcode primer is 3-7 ℃; the difference between the Tm values of the upstream gene specific primer and the specific barcode primer is 3 to 7 ℃.

10. The system of claim 8, wherein fragments longer than 200 bp are selectively removed after the free DNA is obtained.