CN117230175A - Embryo preimplantation genetics detection method based on third generation sequencing - Google Patents

Info

Publication number
CN117230175A
Authority
CN
China
Prior art keywords
typing
embryo
parting
generation sequencing
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310749101.8A
Other languages
Chinese (zh)
Other versions
CN117230175B (en
Inventor
温蕾
刘燕霞
巫昭祺
黄丽君
李秋娴
王会建
王婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xuyuan Medical Technology Co ltd
Original Assignee
Guangzhou Xuyuan Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Xuyuan Medical Technology Co ltd filed Critical Guangzhou Xuyuan Medical Technology Co ltd
Priority to CN202310749101.8A priority Critical patent/CN117230175B/en
Priority claimed from CN202310749101.8A external-priority patent/CN117230175B/en
Publication of CN117230175A publication Critical patent/CN117230175A/en
Application granted granted Critical
Publication of CN117230175B publication Critical patent/CN117230175B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a preimplantation genetic testing method based on third-generation sequencing. The method comprises third-generation sequencing processing of a parental sample, haplotype typing and structural variation detection based on that processing, and genome-wide SNP parental linkage analysis of the embryo. Haplotype typing first performs variant detection and SNP screening on the third-generation sequencing result of the parental sample, then performs haplotype typing and correction using that result and the screened SNPs to obtain the final haplotype result. The embryo genome-wide SNP parental linkage analysis judges whether a candidate embryo carries a genetic defect from the haplotype result and the linkage analysis of the parental (both partners) and embryo samples. Compared with the prior art, the invention solves the typing problem for families without a proband, with incomplete family samples, or with a de novo mutation. The method finds the pathogenic chain quickly, simply, and efficiently, with higher typing accuracy and wide applicability.

Description

Embryo preimplantation genetics detection method based on third generation sequencing
Technical Field
The invention relates to the field of genetic testing, and in particular to a preimplantation genetic testing method based on third-generation sequencing.
Background
Birth defects have many causes, of which monogenic inherited diseases account for about 22.2%. More than 8,000 monogenic diseases have been identified so far, and the number grows every year. Although each disease is individually rare, their combined incidence reaches about 1% because there are so many of them. Most are teratogenic, disabling, or even fatal, and only about 5% can be treated effectively with drugs, at high cost. Once an affected child is born, the child's health and quality of life are severely impaired, and families and society bear a heavy burden.
Third-generation in vitro fertilization ("test-tube baby") technology can block the transmission of pathogenic sites to the next generation so that a healthy child is born. It mainly comprises in vitro fertilization (IVF) and preimplantation genetic testing (PGT). PGT mainly comprises preimplantation genetic testing for aneuploidy (PGT-A), for monogenic disorders (PGT-M), and for chromosomal structural rearrangements (PGT-SR).
Preimplantation genetic testing for monogenic disorders (PGT-M) is one form of third-generation IVF technology. It is an assisted reproduction technique used to block an inherited disease when a couple who already has an affected child (a proband), or whose genetic test results indicate a risk of having a child with a genetic defect, plans another pregnancy. The conventional route combines first-generation (Sanger) sequencing, multiplex ligation-dependent probe amplification (MLPA), chip hybridization, and similar methods to verify the pathogenic site and perform SNP linkage typing on samples from at least two generations (the proband, or the couple's parents), first obtaining a family linkage map (the pre-experiment) and then screening for healthy embryos. The conventional method therefore depends on samples from at least two generations: either three samples (the proband and the couple) or, when there is no proband, six samples (the couple and both sets of their parents). For families without a proband, with incomplete family samples, or with a de novo mutation, family-based testing cannot be performed, so these families cannot benefit from the technology and lose the chance of having a healthy child.
Preimplantation genetic testing for chromosomal structural rearrangements (PGT-SR) is the third-generation IVF method used when one or both partners carry a balanced chromosomal translocation and wish to have a healthy child. Conventional PGT-SR can distinguish chromosomally balanced embryos from unbalanced ones, but cannot further distinguish among balanced embryos. It reduces miscarriage, but there remains a 50% chance that the child is a balanced translocation carrier who, as an adult, will face the same risks of recurrent miscarriage, stillbirth, and offspring with malformations.
FISH was the earliest and most commonly used method for detecting chromosomal translocations in PGD, but it covers only a few chromosomes, its results are judged from the intensity of fluorescently labeled signals and are therefore inaccurate, and sample preparation is cumbersome. In recent years, combining microdissection with NGS has made it possible to identify the translocation breakpoint within a specific chromosomal region, microdissect the junction region, and type the chromosomes, so that balanced embryos can be classified further. However, this method is relatively complex, requires advanced skills and special equipment, and is not suited to clinical use. A simpler method can also distinguish among balanced embryos, but its accuracy is limited: breakpoints can be located only to within about 2.36 Mb, and at this coarse resolution there are too few SNPs within 5 Mb upstream and downstream of the breakpoint. The probability of recombination on a chromosome is known to be about 1% per 1 Mb and about 10% over the 5 Mb range upstream and downstream, so errors of this magnitude can readily arise when typing a translocation site in practice. Methods such as MaReCs, which use MALBAC-NGS, can also distinguish among balanced embryos; they resolve the breakpoint to about 200 kb, which leaves enough SNPs within 1 Mb upstream and downstream for site-specific typing. However, MaReCs must first establish a usable reference embryo karyotype as the standard for testing the other embryos; without one, balanced embryos cannot be distinguished. In addition, when allele dropout occurs, balanced embryos cannot be further distinguished.
Disclosure of Invention
Based on the above, the invention provides a preimplantation genetic testing method based on third-generation whole-genome sequencing.
The detailed technical scheme of the method is as follows:
A preimplantation genetic testing method based on third-generation sequencing comprises third-generation sequencing processing of a parental sample, haplotype typing and structural variation detection based on that processing, and genome-wide SNP parental linkage analysis of the embryo. Haplotype typing first performs variant detection and SNP screening on the third-generation sequencing result of the parental sample, then performs haplotype typing and correction using that result and the screened SNPs to obtain the final haplotype result. The embryo genome-wide SNP parental linkage analysis judges whether a candidate embryo carries a genetic defect from the haplotype result and the linkage analysis of the parental and embryo samples.
Compared with the prior art, the invention solves the typing problem for families without a proband, with incomplete family samples, or with a de novo mutation. The method finds the pathogenic chain quickly, simply, and efficiently, with higher typing accuracy and a wide range of application.
Further, the haplotype typing comprises the following steps:
1) Performing Margin typing on the sample processing result and the screened SNPs;
2) Performing WhatsHap typing on the sample processing result and the screened SNPs;
3) Combining the Margin and WhatsHap typing routes, comparing their typing results, and correcting and optimizing them into an integrated pathogenic chain haplotype.
Further, the haplotype typing result is corrected by a typing decision tree model.
Further, the typing decision tree model consists of a typing decision tree training module and a typing decision tree prediction module. The training module comprises the following steps:
1) Organizing the historical sample typing results, mainly by classifying supervision labels, labeling results, standardizing and normalizing the data, and randomly shuffling the samples;
2) Constructing an initial deep learning neural network model and provisionally setting hyperparameters such as the network structure (number of layers and number of neurons), the total number of training iterations, the batch size, the learning rate, the dropout ratio, the regularization type and weight, the activation function, and the loss function;
3) Inputting the training data, fine-tuning the hyperparameters, and training on a GPU server the typing decision tree that performs best on the validation set.
Further, the typing decision tree prediction module comprises the following steps:
1) Organizing the Margin and WhatsHap typing results, inputting them into the trained typing decision tree, and predicting the optimal typing decision;
2) Converting and integrating the results to obtain the final pathogenic chain haplotype.
Further, the typing decision tree is a fine-grained typing decision tree model, constructed as follows:
when a heterozygous site is a SNP called only in HP1, the SNP is compared with the final genotype; when the reference allele and the variant allele both agree, there are 4 cases:
1) Genotype GT = 1/1: HP1 = ALT; HP2 = REF; typing site accuracy = DP(HP1)/DP(Final); typing prediction strength: strong; this type is labeled T1;
2) Genotype GT = 0/0: HP1 = REF; HP2 = ALT; typing site accuracy = DP(HP1)/DP(Final); typing prediction strength: strong; this type is labeled T2;
3) Genotype GT = 0/1, when AD(HP1)/DP(HP1) >= AD(Final)/DP(Final): HP1 = ALT; HP2 = REF; typing site accuracy = 2 x AD(HP1)/(DP(HP1) + DP(Final)); typing prediction strength: moderate; this type is labeled T3;
4) Genotype GT = 0/1, when AD(HP1)/DP(HP1) < AD(Final)/DP(Final): typing prediction strength: weak; this type is labeled T4;
By analogy, the cases in which the SNP is called only in HP2 and in which it is called in both HP1 and HP2 are enumerated exhaustively, giving the typing cases T1-T30.
Further, structural variation detection is performed on the sequencing result obtained in the sample processing stage, quality filtering is applied as a quality control step, and the structural variation information is then annotated.
Further, the third-generation sequencing processing of the parental sample comprises third-generation sequencing of the sample and preprocessing of the sequencing result, the preprocessing comprising the following steps:
1) Quality control of the off-machine sequencing data and filtering of low-quality sequences: the total data volume must exceed 90 G bases and the mean read length must exceed 10 kb; only reads with quality of at least 10 and length of at least 1000 bp are retained;
2) Sequence alignment: the off-machine data are aligned to the human reference genome.
For better understanding and implementation, the invention is described in detail below with reference to the drawings.
Drawings
FIG. 1 is a flow chart of the preimplantation genetic testing method of the present invention;
FIG. 2 is a diagram of the fine-grained typing decision tree model of the present invention;
FIG. 3 is an IGV visualization of the breakpoints in example 1 of the present invention;
FIG. 4 shows the third-generation sequencing typing results in example 1 of the present invention;
FIG. 5 shows the verification results of the chip SNP family linkage analysis in example 1 of the present invention;
FIG. 6 is an IGV view of breakpoint chr19:45165579 in example 2 of the present invention;
FIG. 7 is an IGV view of breakpoint chr10:84882445 in example 2 of the present invention;
FIG. 8 shows the third-generation sequencing typing results for chr10 in example 2 of the present invention;
FIG. 9 shows the third-generation sequencing typing results for chr19 in example 2 of the present invention;
FIG. 10 shows the chip SNP family linkage analysis verification results for chr10 in example 2 of the present invention;
FIG. 11 shows the chip SNP family linkage analysis verification results for chr19 in example 2 of the present invention;
FIG. 12 shows the results of the parental linkage analysis of the embryo samples for chr10 in example 2 of the present invention;
FIG. 13 shows the results of the parental linkage analysis of the embryo samples for chr19 in example 2 of the present invention;
FIG. 14 is an IGV view of breakpoint chr2:4516557 in example 3 of the present invention;
FIG. 15 is an IGV view of breakpoint chr16:84882445 in example 3 of the present invention;
FIG. 16 shows the third-generation sequencing typing results for chr2 in example 3 of the present invention;
FIG. 17 shows the third-generation sequencing typing results for chr16 in example 3 of the present invention;
FIG. 18 shows the results of the parental linkage analysis of the embryo samples for chr2 in example 3 of the present invention;
FIG. 19 shows the results of the parental linkage analysis of the embryo samples for chr16 in example 3 of the present invention.
Detailed Description
Referring to FIG. 1, the preimplantation genetic testing method based on third-generation sequencing of the invention comprises third-generation sequencing processing of a parental sample, haplotype typing and structural variation detection based on that processing, and genome-wide SNP parental linkage analysis of the embryo.
The third-generation sequencing processing of the parental sample comprises the following steps:
S101: Performing third-generation sequencing on the DNA of a carrier, the carrier being the male and/or female partner who carries the pathogenic variant;
The detailed steps are as follows:
1) Extracting a high-quality DNA sample and checking its purity, concentration, and integrity:
a) Visual inspection for foreign matter in the sample;
b) Agarose gel electrophoresis to check for degradation and to determine DNA fragment size;
c) NanoDrop measurement of DNA purity;
d) Accurate DNA quantification with a Qubit fluorometer;
2) Once the sample passes quality control, recovering the target fragments with a BluePippin automated nucleic acid recovery system;
3) Purifying the DNA with magnetic beads;
4) Performing damage repair and end repair on the target DNA fragments;
5) Purifying the DNA with magnetic beads;
6) Ligating the sequencing adapter from the SQK-LSK109 kit to the purified product;
7) Accurately quantifying the constructed DNA library with a Qubit fluorometer;
8) Sequencing on the instrument.
These experimental steps represent only one third-generation sequencing method, using the reagents and equipment named above. In the present invention, third-generation sequencing refers to any sequencing method capable of producing single-molecule sequencing results, including but not limited to the method above.
S102: Preprocessing the sequencing result;
The detailed steps are as follows:
1) Quality control of the off-machine data and filtering of low-quality sequences: the total data volume must exceed 90 G bases and the mean read length must exceed 10 kb; only reads with quality of at least 10 and length of at least 1000 bp are retained;
2) Sequence alignment: the off-machine data are aligned to the human reference genome. This step is packaged for automatic execution and allows free switching among several reference genome versions (hg19, hg38, and T2T). The output is a BAM file and its .bam.bai index file.
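The read-level filter and run-level thresholds of S102 can be sketched as follows. This is a minimal illustration only, assuming each read is available as a (mean quality, length) pair; the record layout and the function names `filter_reads` and `run_passes_qc` are hypothetical and not part of the described pipeline.

```python
from typing import List, Sequence, Tuple

Read = Tuple[float, int]  # (mean base quality, read length in bp); hypothetical record

MIN_QUALITY = 10          # keep reads with quality >= 10
MIN_LENGTH = 1_000        # keep reads with length >= 1000 bp
MIN_TOTAL_BASES = 90e9    # run-level yield must exceed 90 G bases
MIN_MEAN_LENGTH = 10_000  # run-level mean read length must exceed 10 kb


def filter_reads(reads: Sequence[Read]) -> List[Read]:
    """Drop reads that fail the per-read quality or length threshold."""
    return [r for r in reads if r[0] >= MIN_QUALITY and r[1] >= MIN_LENGTH]


def run_passes_qc(
    reads: Sequence[Read],
    min_total: float = MIN_TOTAL_BASES,
    min_mean: float = MIN_MEAN_LENGTH,
) -> bool:
    """Check the run-level thresholds on total yield and mean read length."""
    lengths = [length for _, length in reads]
    if not lengths:
        return False
    total = sum(lengths)
    return total > min_total and total / len(lengths) > min_mean
```

The thresholds are exposed as parameters so that the same check can be exercised on a small toy data set.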
The haplotype typing comprises the following steps:
S201: Performing variant detection on the sample processing result and screening SNPs;
The detailed steps are as follows:
1) Variant detection: variants are called with the Docker workflow of PEPPER-Margin-DeepVariant;
2) SNP screening: SNPs are separated from insertions/deletions (indels) to give vcf_total.
S202: Performing haplotype typing and correction on the sample processing result and the screened SNPs to obtain the final haplotype result.
The detailed steps are as follows:
1) Margin typing of the sample processing result and the screened SNPs, as follows:
a. Typing: typing is performed with the Margin software, giving a BAM file that contains the typing result;
b. Splitting the BAM file: the BAM file is split into HP1-BAM and HP2-BAM according to the typing result it contains;
c. Variant detection (HP1, HP2): variants are called separately on the HP1 and HP2 parts;
d. SNP screening: SNPs are separated from indels to give vcf_HP1 and vcf_HP2;
e. vcf_HP1, vcf_HP2, and vcf_total are input and combined with the pre-designed typing decision tree to obtain the final typing result;
2) WhatsHap typing of the sample processing result and the screened SNPs, as follows:
a. Typing: typing is performed with the WhatsHap software, giving a BAM file that contains the typing result;
b. Result processing and integration: the raw WhatsHap result is processed and, in combination with the typing decision tree, converted into a format that is easy to review later;
3) Combining the Margin and WhatsHap typing routes, and comparing, correcting, and optimizing the results to obtain the pathogenic chain haplotype.
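The comparison of the two typing routes in step 3) can be illustrated with a small sketch. It assumes that each route's result has already been reduced to a mapping from SNP position to a phased genotype string such as "0|1", and that all sites lie in a single phase block; these data structures and the function `compare_phasings` are hypothetical and are not the output formats of Margin or WhatsHap. Because the labels HP1 and HP2 are arbitrary in each tool, the sketch first decides by majority vote whether one result has to be flipped.

```python
from typing import Dict, List, Tuple

Phasing = Dict[int, str]  # position -> phased genotype, "0|1" or "1|0"


def flip(gt: str) -> str:
    """Swap the two haplotypes of a phased genotype ("0|1" -> "1|0")."""
    a, b = gt.split("|")
    return f"{b}|{a}"


def compare_phasings(margin: Phasing, whatshap: Phasing) -> Tuple[Phasing, List[int]]:
    """Return (concordant sites in Margin orientation, discordant positions)."""
    shared = sorted(set(margin) & set(whatshap))
    same = sum(margin[p] == whatshap[p] for p in shared)
    flipped = sum(margin[p] == flip(whatshap[p]) for p in shared)
    swap = flipped > same  # haplotype labels are swapped between the two routes
    consensus: Phasing = {}
    discordant: List[int] = []
    for p in shared:
        w = flip(whatshap[p]) if swap else whatshap[p]
        if margin[p] == w:
            consensus[p] = margin[p]
        else:
            discordant.append(p)
    return consensus, discordant
```

In this sketch the discordant positions are only reported; in the method described here they would be the sites passed on to the correction step.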
The typing decision tree consists of a typing decision tree training module and a typing decision tree prediction module;
The typing decision tree training module comprises the following steps:
1) Organizing the historical sample typing results, mainly by classifying supervision labels, labeling results, standardizing and normalizing the data, and randomly shuffling the samples;
2) Constructing an initial deep learning neural network model and provisionally setting hyperparameters such as the network structure (number of layers and number of neurons), the total number of training iterations, the batch size, the learning rate, the dropout ratio, the regularization type and weight, the activation function, and the loss function;
3) Inputting the training data from step 1), fine-tuning the hyperparameters from step 2), and training on a GPU server the typing decision tree that performs best on the validation set.
The typing decision tree prediction module comprises the following steps:
1) Organizing the Margin and WhatsHap typing results, inputting them into the trained typing decision tree, and predicting the optimal typing decision;
2) Converting and integrating the result of step 1) to obtain the final pathogenic chain haplotype.
The construction of the fine-grained typing decision tree model is described below:
when a heterozygous site is a SNP called only in HP1, the SNP is compared with the final genotype; when the reference allele and the variant allele both agree, there are 4 cases:
1) Genotype GT = 1/1: HP1 = ALT; HP2 = REF; typing site accuracy = DP(HP1)/DP(Final); typing prediction strength: strong; this type is labeled T1;
2) Genotype GT = 0/0: HP1 = REF; HP2 = ALT; typing site accuracy = DP(HP1)/DP(Final); typing prediction strength: strong; this type is labeled T2;
3) Genotype GT = 0/1, when AD(HP1)/DP(HP1) >= AD(Final)/DP(Final): HP1 = ALT; HP2 = REF; typing site accuracy = 2 x AD(HP1)/(DP(HP1) + DP(Final)); typing prediction strength: moderate; this type is labeled T3;
4) Genotype GT = 0/1, when AD(HP1)/DP(HP1) < AD(Final)/DP(Final): typing prediction strength: weak; this type is labeled T4;
By analogy, the cases in which the SNP is called only in HP2 and in which it is called in both HP1 and HP2 are enumerated exhaustively, as shown in FIG. 2, giving the typing cases T1-T30; F denotes a situation that cannot occur.
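The four HP1-only cases can be written out as a small function. This is a sketch under stated assumptions: GT is taken to be the genotype reported for the SNP in the HP1 call set, the accuracy formula of case 3 is read as 2 x AD(HP1)/(DP(HP1) + DP(Final)), and the label T3 for case 3 is inferred from the T1, T2, T4 numbering. The `Call` record and the function name are hypothetical, and the remaining cases up to T30 are not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Call:
    label: str                 # decision-tree leaf, e.g. "T1"
    hp1: Optional[str]         # allele assigned to HP1 ("alt"/"ref"), None if unassigned
    hp2: Optional[str]         # allele assigned to HP2
    accuracy: Optional[float]  # typing site accuracy, None if not defined
    strength: str              # "strong", "moderate" or "weak"


def classify_hp1_only(gt: str, ad_hp1: int, dp_hp1: int,
                      ad_final: int, dp_final: int) -> Call:
    """Classify a heterozygous SNP that was called only in HP1.

    ad_* is the alternate-allele depth and dp_* the total depth, for the
    HP1 call set and for the final (unsplit) call set.
    """
    if dp_hp1 <= 0 or dp_final <= 0:
        raise ValueError("depths must be positive")
    if gt == "1/1":
        return Call("T1", "alt", "ref", dp_hp1 / dp_final, "strong")
    if gt == "0/0":
        return Call("T2", "ref", "alt", dp_hp1 / dp_final, "strong")
    if gt == "0/1":
        if ad_hp1 / dp_hp1 >= ad_final / dp_final:
            accuracy = 2 * ad_hp1 / (dp_hp1 + dp_final)
            return Call("T3", "alt", "ref", accuracy, "moderate")
        # The text gives no allele assignment or accuracy for this case.
        return Call("T4", None, None, None, "weak")
    raise ValueError(f"unsupported genotype: {gt}")
```

For example, `classify_hp1_only("0/1", 15, 20, 20, 40)` falls into case 3, because the alternate-allele fraction in HP1 (0.75) is at least that of the final call set (0.5).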
The invention also provides a correction method based on a deep learning model, comprising the following steps:
1) Organizing the historical sample typing results to obtain reference sample data and their typing results;
2) Supervised label classification (example): SNP1-HP1 and SNP1-HP2, SNP2-HP1 and SNP2-HP2, SNP3-HP1 and SNP3-HP2, ..., SNPn-1-HP1 and SNPn-1-HP2, SNPn-HP1 and SNPn-HP2;
3) Data standardization and normalization;
4) Random shuffling of the samples;
5) Constructing the initial deep learning neural network model. The network has four layers: an input layer of 6 neurons, corresponding to previous SNP-HP1, previous SNP-HP2, current SNP-HP1, current SNP-HP2, next SNP-HP1, and next SNP-HP2; a first hidden layer of 8 neurons; a second hidden layer of 4 neurons; and an output layer of 1 neuron. Training converges within 50 iterations; the batch size is 1, the learning rate is 0.03, the dropout ratio is 0.2, the regularization is L2 with a weight of 0.00001, the activation function is sigmoid, and the loss function is binary cross-entropy (BCE).
6) Inputting the training data, fine-tuning the hyperparameters, and training on a GPU server the typing decision model that performs best on the validation set;
7) The F1 score of the optimal model on the test set is 0.953.
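The network of step 5) can be illustrated with a forward pass in plain Python. The sketch assumes six inputs (HP1 and HP2 values for the previous, current, and next SNP), hidden layers of 8 and 4 neurons, and a single sigmoid output scored with binary cross-entropy. The weight initialization, the input encoding, and all function names are assumptions for illustration; training (learning rate 0.03, dropout 0.2, L2 regularization) is omitted.

```python
import math
import random
from typing import List, Sequence, Tuple

LAYER_SIZES = [6, 8, 4, 1]  # input, first hidden, second hidden, output

Layer = Tuple[List[List[float]], List[float]]  # (weights[out][in], biases[out])


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_params(seed: int = 0) -> List[Layer]:
    """Small random weights and zero biases (initialization is an assumption)."""
    rng = random.Random(seed)
    params: List[Layer] = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def forward(params: Sequence[Layer], x: Sequence[float]) -> float:
    """Propagate one 6-value input through the sigmoid layers."""
    activations = list(x)
    for weights, biases in params:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations[0]


def bce(p: float, y: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one prediction p and label y (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
```

A call such as `forward(init_params(), [1, 0, 1, 0, 0, 1])` returns a value between 0 and 1 that could be thresholded into a binary typing decision; with untrained weights the value carries no meaning.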
Prediction after correction:
1) Organizing the typing decision tree results, inputting them into the trained typing decision model, and predicting the optimal typing decision;
2) Converting and integrating the neural network model predictions to obtain the final pathogenic chain haplotype.
The two methods work together: the fine-grained decision tree logic yields the typing site accuracy and typing prediction strength, the deep learning neural network model predicts and corrects on that basis, and the typings obtained from the two typing routes are integrated and optimized into the final pathogenic chain haplotype.
The structural variation detection comprises the following steps:
S301: Performing structural variation detection on the sequencing result obtained in the sample processing stage and applying quality filtering as a quality control step;
S302: Annotating the structural variation information.
When the results are interpreted, the embryo genome-wide SNP parental linkage analysis draws on the haplotype typing result together with a joint analysis of the parental and embryo samples to judge whether a candidate embryo carries the genetic defect. Embryos carrying the defect are screened out and a healthy embryo is selected for implantation, meeting the couple's wish for a healthy child and reducing the conception and birth of children affected by genetic disease.
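The idea behind the parental linkage analysis can be sketched as a vote over informative SNPs: sites where the carrier parent is heterozygous and the partner is homozygous, so that the allele the embryo received from the carrier can be read off directly. The data structures and the function `inherited_haplotype` below are hypothetical, and the sketch ignores recombination, genotyping error, and any windowing around the pathogenic site that a real analysis would need.

```python
from typing import Dict, Tuple

Phase = Dict[int, Tuple[str, str]]     # position -> (HP1 allele, HP2 allele) of the carrier
Genotype = Dict[int, Tuple[str, str]]  # position -> unphased allele pair


def inherited_haplotype(carrier: Phase, partner: Genotype,
                        embryo: Genotype) -> Dict[str, int]:
    """Count informative SNPs supporting inheritance of carrier HP1 or HP2."""
    votes = {"HP1": 0, "HP2": 0}
    for pos, (h1, h2) in carrier.items():
        if h1 == h2:
            continue  # carrier homozygous: not informative
        if pos not in partner or pos not in embryo:
            continue
        p1, p2 = partner[pos]
        if p1 != p2:
            continue  # partner heterozygous: origin of embryo alleles is ambiguous
        alleles = list(embryo[pos])
        if p1 not in alleles:
            continue  # inconsistent with inheritance, e.g. allele dropout
        alleles.remove(p1)  # what remains came from the carrier parent
        if alleles[0] == h1:
            votes["HP1"] += 1
        elif alleles[0] == h2:
            votes["HP2"] += 1
    return votes
```

If the pathogenic chain is HP2, an embryo whose votes fall overwhelmingly on HP2 near the pathogenic site would be flagged as a carrier under this simplified model.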
The practical application of the preimplantation genetic testing method of the invention is illustrated below by 3 specific examples.
Example 1: Preimplantation genetic testing with the present method for a family with a monogenic disease caused by a microduplication
1. Family information:
Male (P10001) and female (P10003): duplication of at least 610 kb in the 10q24.31-10q24.32 region
Array-CGH result: 10q24.31(102,832,650-103,511,083)×3;
Wife (P10002): negative;
2. Family analysis strategy: each sample is typed individually and its pathogenic chain analyzed with the present method, and chip-based genome-wide SNP family linkage analysis is then used to evaluate how accurately the method types the sample and identifies the pathogenic variant.
3. Third-generation sequencing is performed on each of the 3 samples (P10001, P10002, P10003).
4. Off-machine data analysis:
4.1 The off-machine third-generation sequencing results are shown in Table 1; quality control is passed
(Q value > 7, mean_length > 10 kb, total bases > 90 G)
Table 1
4.2 Variant detection:
4.2.1 Depth analysis of the adjacent regions (Table 2) shows a marked increase in depth over the target region in samples P10001 and P10003; the left and right breakpoints are chr10:102829554 and chr10:103513462;
Table 2
4.2.2 Locating the breakpoints with the IGV genome visualization software (FIG. 3) directly shows the duplication chr10:102829554-103513462, with left and right breakpoints chr10:102829554 and chr10:103513462;
4.2.3 Haplotype typing is performed separately on each of the 3 samples; the haplotypes after correction by the typing decision tree and the neural network model are shown in FIG. 4;
4.2.4 Haplotype typing combined with depth analysis (Table 3) shows that HP2 is the pathogenic chain in samples P10001 and P10003;
Table 3
5. Verification: chip-based genome-wide SNP family linkage analysis (FIG. 5) confirms that the third-generation sequencing typing result is consistent with the family linkage analysis result.
It can be seen that third-generation sequencing can accurately and directly type an individual sample.
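The step that combines haplotype typing with depth analysis can be illustrated as a per-haplotype depth ratio: the haplotype whose depth in the target region is clearly higher than in the flanking region is taken to carry the duplication. The function name, the input layout, and the 1.5 ratio threshold are assumptions for illustration, not values from the example.

```python
from typing import Dict, Optional, Tuple


def duplicated_haplotype(
    target_depth: Dict[str, float],
    flank_depth: Dict[str, float],
    min_ratio: float = 1.5,  # assumed threshold; a full duplication gives about 2
) -> Tuple[Optional[str], Dict[str, float]]:
    """Return the haplotype whose target/flank depth ratio indicates a duplication.

    Both inputs map "HP1" and "HP2" to a mean read depth. None is returned
    when neither or both haplotypes exceed the threshold.
    """
    ratios = {hp: target_depth[hp] / flank_depth[hp] for hp in ("HP1", "HP2")}
    candidates = [hp for hp, ratio in ratios.items() if ratio >= min_ratio]
    chosen = candidates[0] if len(candidates) == 1 else None
    return chosen, ratios
```

With a made-up input of equal flanking depth on both haplotypes and doubled target depth on HP2 only, the function returns HP2, matching the kind of conclusion drawn in step 4.2.4.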
Example 2: the detection method performs genetic detection before embryo implantation by identifying pathogenic chains of balanced translocation carriers
1. Family conditions:
male (P20001) karyotype: 46, XY, t (10; 19) (q 23.2; q 13.3);
female (P20002) karyotype: 46, XX
Male father (P20003) karyotype: 46, XY, t (10; 19) (q 23.2; q 13.3);
male mother (P20004) karyotype: 46, XX;
the present couples have a sexual desire to bear healthy babies by means of PGT-SR technology.
2. Family analysis strategy: the male (P20001) sample is subjected to typing and pathogenic chain analysis by the detection method, and then SNP family linkage analysis of the whole genome of the chip is used for evaluating the typing and pathogenic mutation mining accuracy of the detection method. Embryo samples are analyzed through whole genome SNP, balanced translocation carrying conditions are identified and checked, and healthy embryos are selected for implantation.
3. Performing third generation sequencing on the (P20001) sample;
4. and (5) machine-starting data analysis:
4.1 third generation sequencing results are put off, the results are shown in a table four, and the quality control reaches the standard;
(Q value >7, mean_length >10kb, total Base > 90G)
Table four
4.2 mutation detection:
4.2.1 structural variation analysis results are shown in Table five, wherein translocation exists in chromosome 10 and chromosome 19, and breakpoints are chr10:84882445 respectively; chr19:45165579
TABLE five
4.2.2 locating the breakpoint by IGV genome breakpoint visualization software, breakpoint chr19:45165579 as shown in FIG. 6, breakpoint chr10:84882445 as shown in FIG. 7;
4.2.3 separately parting the haplotype of the sample (P20001), correcting by parting decision tree and neural network model to obtain the pathogenic chain haplotype, wherein HP2 is the pathogenic chain haplotype, the result of Chr10 is shown in figure 8, and the result of Chr19 is shown in figure 9;
5. and (3) verification: through verification of whole genome SNP family linkage analysis, the result of the Chu 10 is shown in FIG. 10, the result of the Chu 19 is shown in FIG. 11, and the consistency of the three-generation independent typing result and the family linkage analysis result is verified.
6. The embryos were subjected to whole genome CNV detection and family linkage analysis to identify healthy embryos:
the Chr10 results are shown in fig. 12, and the Chr19 results are shown in fig. 13.
The results are shown in Table six:
Sample | High-throughput sequencing (NGS) result (CNV) | SNP linkage analysis result
Female partner | - | Normal
Male partner | - | Carries balanced translocation
Embryo 1 | No chromosomal copy number abnormality observed | Does not carry balanced translocation
Embryo 2 | No chromosomal copy number abnormality observed | Does not carry balanced translocation
Embryo 3 | No chromosomal copy number abnormality observed | Carries balanced translocation
Table 6
Thus, embryo 1 and embryo 2 have normal chromosomes and do not carry the male's balanced translocation; they are healthy embryos and can be transferred.
Finally, after the couple gave their full consent, embryo 2 was selected for implantation; prenatal diagnosis after implantation was normal and a healthy baby was born, confirming the accuracy of third-generation sequencing in identifying the pathogenic chain of a balanced translocation.
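The selection rule applied to the three embryos of this example can be written out as a short Python sketch. The embryo results are transcribed from the table above; the `EmbryoResult` type and the `transferable` function are hypothetical names used only for illustration, and the sketch does not replace the clinical decision described in the text.

```python
from typing import NamedTuple


class EmbryoResult(NamedTuple):
    name: str
    cnv_normal: bool             # True: no chromosome copy number abnormality observed
    carries_translocation: bool  # True: SNP linkage analysis shows the balanced translocation


# Values transcribed from the embryo rows of the results table in this example.
EMBRYOS = [
    EmbryoResult("Embryo 1", cnv_normal=True, carries_translocation=False),
    EmbryoResult("Embryo 2", cnv_normal=True, carries_translocation=False),
    EmbryoResult("Embryo 3", cnv_normal=True, carries_translocation=True),
]


def transferable(results):
    """Keep embryos with a normal CNV result that do not carry the translocation."""
    return [r.name for r in results
            if r.cnv_normal and not r.carries_translocation]


print(transferable(EMBRYOS))  # ['Embryo 1', 'Embryo 2']
```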
Example 3: Preimplantation genetic detection by identifying the pathogenic chain of a de novo (new) balanced translocation with the present detection method
1. Family situation:
Male (P30001): 46,XY,t(2;16)(q23;q13); the karyotypes of his father and mother are normal, so the translocation in the male is a de novo mutation;
Female (P30002): 46,XX;
The couple wish to have children and want to exclude the male's balanced translocation and have a healthy baby through third-generation in vitro fertilization (IVF) technology.
2. Analysis strategy: the present detection method is used to identify the pathogenic chain haplotype directly from the male (P30001) sample; whole-genome SNP detection is performed on the female sample and the embryo samples; linkage analysis is performed on the haplotypes of the couple and the embryo samples; carriers of the balanced translocation are excluded; and healthy embryos are selected for implantation.
3. Experiment:
1) Extract a high-quality DNA sample from the (P30001) sample and check its purity, concentration and integrity:
a) Inspect the sample for visible foreign matter;
b) Check for degradation and DNA fragment size by agarose gel electrophoresis;
c) Measure DNA purity with NanoDrop;
d) Quantify the DNA accurately with Qubit;
2) After the sample passes quality inspection, recover the target fragments with a BluePippin fully automatic nucleic acid recovery instrument;
3) Purify the DNA with magnetic beads;
4) Perform damage repair and end repair on the target DNA fragments;
5) Purify the DNA with magnetic beads;
6) Ligate the purified product to the sequencing adapter from the SQK-LSK109 kit;
7) Quantify the constructed DNA library accurately with Qubit;
8) Sequence on the machine.
4. Off-machine data analysis:
4.1 The off-machine third-generation sequencing results are shown in Table 7; quality control meets the standard
(Q value > 7, mean_length > 10 kb, total bases > 90 G);
Table 7
4.2 Mutation detection:
4.2.1 Structural variation analysis shows a translocation between chromosome 2 and chromosome 16. The breakpoints are located with the IGV genome visualization software: breakpoint chr2:156208902 is shown in FIG. 14 and breakpoint chr16:58903979 in FIG. 15;
4.2.3 The haplotypes of the (P30001) sample are typed from this sample alone and corrected by the typing decision tree and neural network model to obtain the pathogenic chain haplotypes: the chr2 result is shown in FIG. 16, where HP1 is the pathogenic chain; the chr16 result is shown in FIG. 17, where HP2 is the pathogenic chain;
5. Whole-genome CNV detection is performed on the embryos; for embryos with no observed chromosomal abnormality, linkage analysis is performed against the haplotypes of the couple to identify healthy embryos. The chr2 result is shown in FIG. 18 and the chr16 result in FIG. 19;
The summary results are shown in Table 8:
Table 8
Embryo 5 has normal chromosomes and does not carry the male's balanced translocation; it is a healthy embryo and can be transferred.
Finally, after the couple gave their full consent, embryo 5 was selected for implantation; prenatal diagnosis after implantation was normal and a healthy baby was born, showing that third-generation sequencing can effectively address difficult cases such as de novo (new) mutations and incomplete families.
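The linkage step used in this example, deciding whether an embryo inherited the carrier's pathogenic chain haplotype, can be illustrated with a minimal Python sketch. It is not the patented procedure: it assumes that only SNPs at which the carrier is heterozygous and the partner is homozygous are informative, it decides by simple majority vote, and the alleles in the usage example are invented. A real analysis would also restrict itself to SNPs flanking the breakpoints and account for allele drop-out and recombination. The function names are hypothetical.

```python
from collections import Counter


def transmitted_haplotype(sites):
    """Count, per informative SNP, which carrier haplotype the embryo inherited.

    Each site is (hp1_allele, hp2_allele, partner_genotype, embryo_genotype),
    with genotypes given as two-letter strings such as "AG".
    """
    votes = Counter()
    for hp1, hp2, partner, embryo in sites:
        # Informative only if the carrier is heterozygous and the partner homozygous.
        if hp1 == hp2 or partner[0] != partner[1]:
            continue
        alleles = list(embryo)
        if partner[0] not in alleles:
            continue  # inconsistent with inheritance from the partner; skip the site
        alleles.remove(partner[0])  # the remaining allele came from the carrier
        if alleles[0] == hp1:
            votes["HP1"] += 1
        elif alleles[0] == hp2:
            votes["HP2"] += 1
    return votes


def call_embryo(votes, pathogenic):
    """Return 'carries', 'does not carry' or 'uninformative' by majority vote."""
    hp1, hp2 = votes["HP1"], votes["HP2"]
    if hp1 == hp2:
        return "uninformative"
    inherited = "HP1" if hp1 > hp2 else "HP2"
    return "carries" if inherited == pathogenic else "does not carry"


# Invented alleles for illustration only.
sites = [
    ("A", "G", "AA", "AA"),  # embryo received A from the carrier (HP1)
    ("C", "T", "CC", "CC"),  # HP1
    ("T", "C", "CC", "CT"),  # HP1
    ("G", "G", "AA", "AG"),  # carrier homozygous: not informative
    ("A", "C", "AC", "AC"),  # partner heterozygous: not informative
]
votes = transmitted_haplotype(sites)
print(call_embryo(votes, pathogenic="HP2"))  # does not carry
```

Because the pathogenic chain may be HP1 on one chromosome and HP2 on another, as reported for chr2 and chr16 in this example, the call would be made separately for each chromosome involved in the translocation.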
Compared with the prior art, the present invention takes a sample from the carrier (both members of the couple or one of them), extracts DNA, and exploits the long read length (10 kb-150 kb) of third-generation sequencing to directly determine, in a single sequencing run, the SNP loci linked on the same DNA strand, so that family linkage analysis is no longer required. The pathogenic chain haplotype is assembled directly from the genome, which overcomes the typing difficulty posed by incomplete families or de novo (new) mutations. The method applies to a wide range of mutation types, including point mutations, deletions and duplications, balanced translocations, inversions and pseudogenes, and it resolves the typing problem for samples without a proband, samples from incomplete families, and new mutations. It locates the pathogenic chain quickly, simply and efficiently, distinguishes embryos carrying the pathogenic variant from normal embryos so that normal embryos can be selected for implantation, effectively blocks transmission of the genetic disease to the next generation, and achieves higher typing accuracy.
The above examples describe only a few embodiments of the invention in specific detail and are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make modifications and improvements without departing from the concept of the invention, and all such modifications and improvements fall within the scope of protection of the invention.

Claims (9)

1. A preimplantation genetic detection method for embryos based on third-generation sequencing, characterized by comprising third-generation sequencing processing of a parental sample, haplotype typing and structural variation detection based on the third-generation sequencing processing result of the parental sample, and embryo whole-genome SNP parental linkage analysis; in the haplotype typing, variant detection and screening are first performed on the third-generation sequencing result of the parental sample, and haplotype typing and correction are then performed on the third-generation sequencing result of the parental sample and the screened SNPs to obtain the final haplotype result; in the embryo whole-genome SNP parental linkage analysis, whether a candidate embryo has a genetic defect is determined from the haplotype result and linkage analysis of the parental and embryo samples.
2. The method of claim 1, wherein the haplotype typing comprises the following steps:
1) Performing Margin typing on the sample processing result and the screened SNPs;
2) Performing WhatsHap typing on the sample processing result and the screened SNPs;
3) Combining the Margin typing route and the WhatsHap typing route, comparing the typing results, and correcting and optimizing the integrated haplotypes.
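The comparison of the two typing routes in step 3) can be illustrated with a small Python sketch. It does not call Margin or WhatsHap and does not reproduce their interfaces; it assumes that each tool's phased calls for one phase block have already been parsed into a dictionary mapping SNP position to "0|1" or "1|0". Because the haplotype labels of the two tools are arbitrary, the sketch first checks whether one block is the mirror image of the other. The function name, positions and calls are hypothetical.

```python
def compare_phasings(margin, whatshap):
    """Compare two phasings of the same block at their shared heterozygous SNPs.

    Returns (flipped, concordance, discordant_positions), where `flipped` says
    whether the second tool's haplotype labels are swapped relative to the first.
    """
    shared = sorted(set(margin) & set(whatshap))
    if not shared:
        return False, None, []
    same = [p for p in shared if margin[p] == whatshap[p]]
    differ = [p for p in shared if margin[p] != whatshap[p]]
    # If fewer than half of the sites agree literally, treat the labels as swapped.
    flipped = 2 * len(same) < len(shared)
    discordant = same if flipped else differ
    concordance = 1 - len(discordant) / len(shared)
    return flipped, concordance, discordant


# Hypothetical positions and calls for illustration only.
margin_calls = {100: "0|1", 250: "1|0", 400: "0|1", 900: "0|1"}
whatshap_calls = {100: "1|0", 250: "0|1", 400: "0|1", 900: "1|0", 1200: "0|1"}

print(compare_phasings(margin_calls, whatshap_calls))  # (True, 0.75, [400])
```

Sites reported as discordant are the natural candidates for the correction step, while sites called by only one tool (position 1200 here) are left out of the comparison.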
3. The method of claim 2, wherein the typing results are corrected by a typing decision tree model.
4. The method of claim 3, wherein the typing decision tree model comprises a typing decision tree training module and a typing decision tree prediction module; the typing decision tree training module comprises the following steps:
1) Collating training data from the historical sample typing results;
2) Constructing an initial deep learning neural network model and preliminarily determining hyperparameters such as the number of layers, the number of neurons, the total number of training rounds, the batch size, the learning rate, the dropout ratio, the regularization type and weight, the activation function and the loss function of the neural network;
3) Inputting the training data, fine-tuning the hyperparameters, and training on a GPU server to obtain the typing decision tree that performs best on the validation set.
5. The method of claim 4, wherein the typing decision tree prediction module comprises the following steps:
1) Collating the Margin typing and WhatsHap typing results and inputting them into the trained typing decision tree to predict the optimal typing decision result;
2) Transforming and integrating the results to obtain the final haplotype.
6. The method according to claim 5, wherein the typing decision tree is a fine-grained typing decision tree model, specifically as follows:
When a SNP at a heterozygous site is present in HP1 only, the SNP is compared with the final genotype; when the reference sequence and the mutation site of the SNP are consistent with those of the final genotype, there are four cases:
1) Genotype GT = 1/1: HP1 = ALT; HP2 = REF; typing site accuracy = DP(HP1)/DP(Final); typing prediction strength: strong; this type is denoted T1;
2) Genotype GT = 0/0: HP1 = REF; HP2 = ALT; typing site accuracy = DP(HP1)/DP(Final); typing prediction strength: strong; this type is denoted T2;
3) Genotype GT = 0/1, when AD(HP1)/DP(HP1) >= AD(Final)/DP(Final): HP1 = ALT; HP2 = REF; typing site accuracy = 2 × AD(HP1)/(DP(HP1) + DP(Final)); typing prediction strength: medium;
4) Genotype GT = 0/1, when AD(HP1)/DP(HP1) < AD(Final)/DP(Final): typing prediction strength: weak; this type is denoted T4;
By analogy, the cases in which the SNP is present only in HP2 and the cases in which it is present in both HP1 and HP2 are enumerated exhaustively, giving typing cases T1-T30.
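The four HP1-only cases of claim 6 can be restated as a small Python function. This is a sketch of the listed rules only, not of the trained model: the `TypingCall` record and the function name are hypothetical, the label "T3" for the third case is inferred from the T1, T2, T4 numbering, and the fourth case returns no haplotype assignment or accuracy because the claim gives none. AD and DP are taken to be the allele depth and total depth, as in VCF records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TypingCall:
    label: str                 # T1..T4
    hp1: Optional[str]         # allele assigned to HP1 ("ALT"/"REF"), if any
    hp2: Optional[str]         # allele assigned to HP2, if any
    accuracy: Optional[float]  # typing site accuracy, if defined for the case
    strength: str              # "strong", "medium" or "weak"


def classify_hp1_only(gt, ad_hp1, dp_hp1, ad_final, dp_final):
    """Apply the four listed cases for a SNP that is present in HP1 only."""
    if dp_hp1 <= 0 or dp_final <= 0:
        raise ValueError("depths must be positive")
    if gt == "1/1":
        return TypingCall("T1", "ALT", "REF", dp_hp1 / dp_final, "strong")
    if gt == "0/0":
        return TypingCall("T2", "REF", "ALT", dp_hp1 / dp_final, "strong")
    if gt == "0/1":
        if ad_hp1 / dp_hp1 >= ad_final / dp_final:
            accuracy = 2 * ad_hp1 / (dp_hp1 + dp_final)
            # "T3" is an inferred label; the claim leaves this case unnamed.
            return TypingCall("T3", "ALT", "REF", accuracy, "medium")
        return TypingCall("T4", None, None, None, "weak")
    raise ValueError(f"unsupported genotype: {gt}")


call = classify_hp1_only("0/1", ad_hp1=9, dp_hp1=10, ad_final=12, dp_final=30)
print(call.label, call.strength, call.accuracy)
```

In the usage line, the ALT fraction in HP1 (9/10) exceeds the fraction in the final call (12/30), so the site falls into the third case with accuracy 2 × 9 / (10 + 30).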
7. The method according to claim 1, wherein the structural variation detection is performed on the sequencing result obtained in the sample processing stage, and the structural variation information is annotated after quality-control filtering.
8. The method of claim 1, wherein the third-generation sequencing processing of the parental sample comprises third-generation sequencing of the sample DNA and preprocessing of the sequencing result, the detailed preprocessing steps being as follows:
1) Quality control of the off-machine sequencing data and filtering of low-quality sequences: the total data volume is required to be greater than 90 G and the average read length greater than 10 kb; in addition, only sequences with a quality value of at least 10 and a length of at least 1000 bp are retained;
2) Sequence alignment: the off-machine data are aligned to the human reference genome.
9. Use of the preimplantation genetic detection method for embryos according to any one of claims 1-8 in genetic testing.
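The read-level filter of claim 8 (quality of at least 10, length of at least 1000 bp) can be sketched in Python as follows. The sketch makes assumptions the claims do not state: reads are in four-line FASTQ records with Phred+33 quality encoding, and the quality of a read is taken as the arithmetic mean of its per-base Phred scores. The function names are hypothetical.

```python
MIN_QUALITY = 10   # claim 8: quality of at least 10
MIN_LENGTH = 1000  # claim 8: length of at least 1000 bp


def mean_quality(qual):
    """Arithmetic mean of per-base Phred scores (assumption: Phred+33 encoding)."""
    return sum(ord(c) - 33 for c in qual) / len(qual)


def keep_read(seq, qual):
    """True if the read meets both the length and the quality threshold."""
    return len(seq) >= MIN_LENGTH and mean_quality(qual) >= MIN_QUALITY


def filter_fastq(lines):
    """Yield (header, sequence, quality) for four-line FASTQ records that pass."""
    # zip over one shared iterator groups the lines four at a time.
    for header, seq, _plus, qual in zip(*[iter(lines)] * 4):
        seq, qual = seq.strip(), qual.strip()
        if keep_read(seq, qual):
            yield header.strip(), seq, qual
```

For example, `filter_fastq(open(path))` would stream the retained reads from an uncompressed FASTQ file, where `path` is a placeholder for the off-machine data file.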
CN202310749101.8A 2023-06-21 Embryo preimplantation genetics detection method based on third generation sequencing Active CN117230175B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310749101.8A CN117230175B (en) 2023-06-21 Embryo preimplantation genetics detection method based on third generation sequencing

Publications (2)

Publication Number Publication Date
CN117230175A true CN117230175A (en) 2023-12-15
CN117230175B CN117230175B (en) 2024-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant