CN108315403B - Method and system for determining fetus Duchenne muscular dystrophy gene haplotype - Google Patents

Method and system for determining fetus Duchenne muscular dystrophy gene haplotype

Publication number
CN108315403B
Authority
CN
China
Prior art keywords
sites
haplotype
heterozygous
generation sequencing
father
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810073058.7A
Other languages
Chinese (zh)
Other versions
CN108315403A (en)
Inventor
张春生
郭宇来
李胜
杜伯乐
蒋馥蔓
曾晓静
夏伟成
王阳
朱文涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jingke Dx Co ltd
Original Assignee
Guangzhou Jingke Dx Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jingke Dx Co ltd
Priority to CN201810073058.7A
Priority to PCT/CN2018/074939 (WO2019144424A1)
Publication of CN108315403A
Application granted
Publication of CN108315403B
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • C12Q1/6869 Methods for sequencing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • C12Q2600/172 Haplotypes

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Zoology (AREA)
  • General Health & Medical Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a method for determining the Duchenne muscular dystrophy gene haplotype of a fetus, comprising the following steps: constructing a third-generation sequencing library from blood samples taken from the father and the mother of the fetus; performing third-generation sequencing on the third-generation sequencing library to obtain a third-generation sequencing result; determining the Duchenne muscular dystrophy gene haplotypes of the father and the mother from the third-generation sequencing result; constructing a second-generation sequencing library from a detection sample, wherein the detection sample is taken from the peripheral blood of the pregnant woman; performing second-generation sequencing on the second-generation sequencing library to obtain a second-generation sequencing result; and determining the Duchenne muscular dystrophy gene haplotype of the fetus from the second-generation sequencing result and the Duchenne muscular dystrophy gene haplotypes of the father and the mother.

Description

Method and system for determining fetus Duchenne muscular dystrophy gene haplotype
Technical Field
The invention relates to a method and a system for determining a fetal Duchenne muscular dystrophy gene haplotype.
Background
At present, methods for detecting the fetal Duchenne muscular dystrophy (DMD) gene are divided into invasive and non-invasive methods. Because invasive methods carry a risk of miscarriage, non-invasive methods are increasingly accepted and adopted.
At present, methods for detecting the fetal Duchenne muscular dystrophy gene based on cell-free DNA in the peripheral blood of a pregnant woman mainly include the following: 1. detecting paternal or de novo mutations by droplet digital PCR; 2. detecting paternal or de novo mutations by target-region capture followed by high-throughput sequencing; 3. sequencing the parents together with the paternal and maternal grandparents, or a proband, to construct haplotypes, and sequencing maternal peripheral blood to assess dosage imbalance and thereby determine the mutations inherited from the parents. These methods have drawbacks such as few detectable sites, inability to detect maternal mutations, a large number of required samples, and difficult sampling.
Disclosure of Invention
In view of the above, there is a need for a method and a system for non-invasive detection of the fetal Duchenne muscular dystrophy gene that cover many detection sites, detect both paternally and maternally inherited mutations, require few samples, are simple to operate, and are fast and accurate.
The invention provides a method for determining a Duchenne muscular dystrophy gene haplotype of a fetus, which comprises the following steps:
constructing a third-generation sequencing library based on blood samples, wherein the blood samples are taken from the father and the mother of the fetus;
carrying out third-generation sequencing on the third-generation sequencing library to obtain a third-generation sequencing result;
determining Duchenne muscular dystrophy gene haplotypes of father and mother according to the third generation sequencing result;
constructing a second-generation sequencing library based on a detection sample, wherein the detection sample is taken from a peripheral blood sample of a pregnant woman;
performing second-generation sequencing on the second-generation sequencing library to obtain a second-generation sequencing result;
and determining the Duchenne muscular dystrophy gene haplotype of the fetus according to the second-generation sequencing result and the Duchenne muscular dystrophy gene haplotype of the father and the mother.
The present invention also provides a system for determining the Duchenne muscular dystrophy gene haplotype of a fetus, comprising:
a third-generation sequencing library construction device, used for constructing a third-generation sequencing library based on blood samples, wherein the blood samples are taken from the father and the mother of the fetus;
third-generation sequencing equipment, used for performing third-generation sequencing on the third-generation sequencing library to obtain a third-generation sequencing result;
a device for determining the parental Duchenne muscular dystrophy gene haplotypes, used for determining the Duchenne muscular dystrophy gene haplotypes of the father and the mother according to the third-generation sequencing result;
a second-generation sequencing library construction device, used for constructing a second-generation sequencing library based on a detection sample, wherein the detection sample is taken from a peripheral blood sample of the pregnant woman;
second-generation sequencing equipment, used for performing second-generation sequencing on the second-generation sequencing library to obtain a second-generation sequencing result;
and a device for determining the fetal Duchenne muscular dystrophy gene haplotype, used for determining the Duchenne muscular dystrophy gene haplotype of the fetus according to the second-generation sequencing result and the Duchenne muscular dystrophy gene haplotypes of the father and the mother.
Compared with the prior art, the method for determining the Duchenne muscular dystrophy gene haplotype of the fetus provided by the invention has the following advantages:
The method detects the fetal Duchenne muscular dystrophy gene haplotype non-invasively, avoiding the risks of bleeding, miscarriage, infection, amniotic fluid leakage and fetal injury that invasive methods may cause; it reduces the psychological burden on the pregnant woman and gives accurate results through a novel detection method.
Furthermore, the method can detect all Duchenne muscular dystrophy gene mutations and can accurately determine which mutations are inherited from the father and which from the mother, solving the problems in the prior art that only a small number of sites can be detected at a time and that maternal mutations cannot be determined.
In addition, the invention detects maternal mutations as well as de novo and paternal mutations. This is important for the subsequent assessment of Duchenne muscular dystrophy, a recessive genetic disease: carriers and patients can be distinguished accurately, so the detection result has practical value.
The invention requires only simple samples and is highly practicable: the fetal Duchenne muscular dystrophy gene can be tested non-invasively using only the peripheral blood of the father and the mother. Prior-art methods require samples either from the paternal and maternal grandparents or from a proband, which are difficult to obtain, so those methods are hard to apply in practice.
Drawings
The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a graph showing the results of the third generation sequencing library assay of the father's sample in example 2 of the present invention.
FIG. 2 is a graph showing the results of the detection of the third generation sequencing library of the mother sample in example 2 of the present invention.
FIG. 3 is a diagram showing the results of the detection of the second generation sequencing library of the mother sample in example 2 of the present invention.
FIG. 4 is a graph showing the results of fetal haplotype estimation in example 2 of the present invention.
Detailed Description
The following describes embodiments of the present invention in detail. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
It should be noted that the terms "first", "second", "third" and "fourth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first", "second", "third", "fourth" may explicitly or implicitly include one or more of the features. Further, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Unless otherwise indicated herein or apparent from the context, technical and scientific terms used herein do not exclude meanings that are already in the art. Unless specifically indicated to the contrary, the specific embodiments and examples of the present invention will employ conventional methods within the skill of the art. The scope of the art includes, but is not limited to, molecular biology and gene sequencing techniques.
The invention preferably provides a method for determining the Duchenne muscular dystrophy gene haplotype of a fetus, which comprises the following steps:
S1, constructing a third-generation sequencing library based on blood samples, wherein the blood samples are taken from the father and the mother of the fetus;
S2, performing third-generation sequencing on the third-generation sequencing library to obtain a third-generation sequencing result;
S3, determining the Duchenne muscular dystrophy gene haplotypes of the father and the mother according to the third-generation sequencing result;
S4, constructing a second-generation sequencing library based on a detection sample, wherein the detection sample is taken from a peripheral blood sample of the pregnant woman;
S5, performing second-generation sequencing on the second-generation sequencing library to obtain a second-generation sequencing result;
and S6, determining the Duchenne muscular dystrophy gene haplotype of the fetus according to the second-generation sequencing result and the Duchenne muscular dystrophy gene haplotypes of the father and the mother.
According to an embodiment of the present invention, the S1 further includes the following steps:
S11, separating DNA fragments from the blood samples, wherein the DNA fragments are whole-genome DNA fragments from the peripheral blood leukocytes of the father and the mother of the fetus;
S12, performing a first treatment on the DNA fragments to obtain first-treated DNA fragments;
According to a specific embodiment of the present invention, the first treatment includes the following steps:
S121, fragmenting the DNA fragments to obtain fragmented DNA;
S122, performing a first purification and recovery on the fragmented DNA to obtain a first recovery product, wherein the length of the first recovery product is 5-15 kb;
Preferably, the first purification and recovery comprises recovery by magnetic bead purification, gel excision, the BluePippin method, or the like.
S123, performing a first end repair on the first recovery product to obtain first end-repaired DNA fragments;
S124, adding a base A to the 3' end of the first end-repaired DNA fragments to obtain DNA fragments with a first sticky end A;
S125, ligating the DNA fragments with the first sticky end A to a first adaptor to obtain a first ligation product;
S126, performing a first PCR amplification on the first ligation product to obtain a first amplification product, namely the first-treated DNA fragments.
S13, screening the first-treated DNA fragments with a probe and constructing the third-generation sequencing library, wherein the probe specifically recognizes the Duchenne muscular dystrophy related target gene, the target gene region spans positions 30897180 to 33561648 on chromosome X, the probe is provided in the form of a microchip array, and the microchip array is a liquid-phase chip;
According to a specific embodiment of the present invention, screening the first-treated DNA fragments with a probe further comprises: performing a second treatment on the DNA fragments obtained after probe screening to obtain second-treated DNA fragments, which constitute the third-generation sequencing library.
According to a specific embodiment of the present invention, the second process includes the steps of:
S131, performing a second PCR amplification on the DNA fragments obtained after probe screening to obtain a second amplification product;
S132, performing a second purification and recovery on the second amplification product to obtain a second recovery product;
S133, performing damage repair on the second recovery product to obtain damage-repaired DNA fragments;
S134, performing a second end repair on the damage-repaired DNA fragments to obtain second end-repaired DNA fragments;
S135, performing a third purification and recovery on the second end-repaired DNA fragments to obtain third recovered DNA fragments;
S136, performing blunt-end adaptor ligation on the third recovered DNA fragments to obtain a second ligation product;
and S137, performing a fourth purification and recovery on the second ligation product to obtain a fourth recovery product.
According to an embodiment of the present invention, the second, third and fourth purification and recovery comprise recovery by magnetic bead purification, gel excision, or the BluePippin method. According to a specific embodiment of the present invention, the third-generation sequencing library is sequenced on a third-generation sequencing platform such as the PacBio Sequel.
According to a specific embodiment of the present invention, the third-generation sequencing in S2 is performed on a third-generation sequencing platform, which includes the PacBio Sequel and other sequencing platforms.
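The capture target named in S13 can be written down as an interval for downstream filtering. The following is only a minimal sketch: the `Region` type, the `chrX` contig name, and the 1-based inclusive coordinate convention are assumptions for illustration and are not stated in the text.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A genomic interval (hypothetical helper type, 1-based inclusive assumed)."""
    chrom: str
    start: int
    end: int


# Target region from S13: positions 30897180 to 33561648 on chromosome X.
# The contig name "chrX" is an assumed naming convention.
DMD_TARGET = Region("chrX", 30897180, 33561648)


def region_length(region: Region) -> int:
    """Number of positions covered by an inclusive interval."""
    return region.end - region.start + 1


def overlaps(region: Region, chrom: str, start: int, end: int) -> bool:
    """True if the fragment [start, end] on chrom shares at least one position with region."""
    return chrom == region.chrom and start <= region.end and end >= region.start
```

Under these assumptions the target spans about 2.66 Mb, so a 5-15 kb fragment from S122 covers only a small part of it, and many overlapping fragments are needed to link sites across the region.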
According to an embodiment of the present invention, the step S3 further includes the following steps:
S31, aligning the third-generation sequencing result to a reference human genome sequence to obtain an aligned sequencing data set; the alignment software is BLASR; the reference human genome is hg19;
S32, selecting the sequences with the highest alignment scores from the aligned sequencing data set to obtain a set of uniquely aligned sequences;
S33, calculating, for each site in the target region, the depth of each distinct base across the sequences in the uniquely aligned sequence set; the base depth is calculated with samtools;
S34, screening heterozygous single nucleotide polymorphism sites or small insertion/deletion sites according to the depths of the distinct bases at each site;
According to a specific embodiment of the present invention, the screening criterion is: the depth of the variant base divided by the depth of the site is greater than 0.2 and less than 0.8, and the depth of the site is greater than 20X;
S35, selecting, according to the screened heterozygous single nucleotide polymorphism sites or small insertion/deletion sites, the sequence fragments that contain two adjacent heterozygous single nucleotide polymorphism sites or small insertion/deletion sites;
According to an embodiment of the present invention, S35 further includes: for every two adjacent heterozygous single nucleotide polymorphism sites or small insertion/deletion sites, finding the sequence fragments that cover both sites, and selecting the fragments that correspond to the arrangement types of the two adjacent heterozygous sites supported by the largest number of sequences;
S36, evaluating the two adjacent heterozygous single nucleotide polymorphism sites or small insertion/deletion sites on the sequence fragments to obtain the connection type of the two adjacent heterozygous sites;
According to an embodiment of the present invention, S36 further includes:
S361, analyzing and filtering out sequence fragments with low quality values, wherein a sequence fragment with a low quality value is one that contains low-quality bases and therefore cannot be matched to both of two adjacent single nucleotide polymorphism sites or small insertion/deletion sites; a low-quality base is a base called as N.
S362, calculating probability values for the two adjacent heterozygous mutation sites on the filtered sequence fragments, and assigning a probability value to each connection type, wherein the probability values include Bayesian probabilities or LOD values;
S363, selecting the connection type with the highest probability for the two adjacent heterozygous sites according to the probability values;
S37, determining the connection types of every two adjacent heterozygous single nucleotide polymorphism sites or small insertion/deletion sites on all the sequence fragments, namely the overall haplotype;
According to an embodiment of the present invention, S37 further includes: computing over the adjacent heterozygous single nucleotide polymorphism sites or small insertion/deletion sites on all the sequence fragments by a mathematical statistical method to obtain their connection types, namely the overall haplotype, wherein the mathematical statistical method includes graph-theoretic or optimization methods;
and S38, correcting the overall haplotype to obtain the haplotype of the father and the mother.
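The screening criterion given for S34 (variant-base fraction between 0.2 and 0.8 at a site depth above 20X) can be sketched as below. The function names and the input layout, a mapping from position to a `(variant_depth, site_depth)` pair, are hypothetical; in practice the depths would come from a tool such as samtools, as S33 states.

```python
from typing import Dict, List, Tuple


def is_heterozygous_site(variant_depth: int, site_depth: int,
                         min_fraction: float = 0.2,
                         max_fraction: float = 0.8,
                         min_depth: int = 20) -> bool:
    """Apply the S34 criterion: 0.2 < variant_depth / site_depth < 0.8 and site_depth > 20."""
    if site_depth <= min_depth:
        return False
    fraction = variant_depth / site_depth
    return min_fraction < fraction < max_fraction


def screen_heterozygous_sites(depths: Dict[int, Tuple[int, int]]) -> List[int]:
    """Return the sorted positions whose (variant_depth, site_depth) pass the criterion."""
    return sorted(pos for pos, (variant, total) in depths.items()
                  if is_heterozygous_site(variant, total))
```

For example, a site with 12 variant reads out of 30 passes (fraction 0.4), while a site with 3 out of 30 (fraction 0.1) or a site covered by only 20 reads does not.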
According to an embodiment of the present invention, S38 further includes the steps of:
S381, assessing the strength of the connection between every two adjacent heterozygous single nucleotide polymorphism sites or small insertion/deletion sites on all the sequence fragments, wherein the assessment criteria include the number of supporting sequences, the calculated probability value, or the odds ratio;
S382, re-evaluating the pairs of adjacent heterozygous sites whose connection is weak: selecting sequences that cover the two adjacent heterozygous sites together with several neighbouring sites, evaluating the support given by these sequences, and correcting the weakly connected sites accordingly, wherein the correction criteria include the number of sequences spanning multiple sites or the number of sites supporting the nearby haplotype.
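The pairwise linking in S35 to S36 and the strength check in S381 can be illustrated with a small sketch. The text names Bayesian probabilities and LOD values but does not give a formula, so the symmetric error model below (`pair_error`, equal priors) and the weak-link thresholds are assumptions chosen only for illustration.

```python
import math
from typing import Dict, Iterable, Optional, Tuple


def link_adjacent_sites(observations: Iterable[Tuple[str, str]],
                        alleles1: Tuple[str, str],
                        alleles2: Tuple[str, str],
                        pair_error: float = 0.1) -> Dict[str, object]:
    """Pick the connection type of two adjacent heterozygous sites from long reads.

    observations: one (allele_at_site1, allele_at_site2) pair per read covering both sites.
    pair_error: assumed probability that a read shows the wrong pairing (hypothetical).
    """
    a1, a2 = alleles1
    b1, b2 = alleles2
    cis = {(a1, b1), (a2, b2)}
    trans = {(a1, b2), (a2, b1)}
    n_cis = n_trans = 0
    for first, second in observations:
        if "N" in (first, second):  # S361: skip reads with an uncalled base
            continue
        if (first, second) in cis:
            n_cis += 1
        elif (first, second) in trans:
            n_trans += 1
    # log10 likelihood ratio of cis versus trans under the assumed error model
    lod = (n_cis - n_trans) * math.log10((1 - pair_error) / pair_error)
    bounded = max(-300.0, min(300.0, lod))  # avoid float overflow in the power below
    posterior_cis = 1.0 / (1.0 + 10.0 ** (-bounded))  # equal priors assumed
    best: Optional[str] = None
    if n_cis != n_trans:
        best = "cis" if n_cis > n_trans else "trans"
    return {"type": best, "n_cis": n_cis, "n_trans": n_trans,
            "lod": lod, "posterior_cis": posterior_cis}


def is_weak_link(link: Dict[str, object], min_support: int = 3,
                 min_abs_lod: float = 2.0) -> bool:
    """S381-style check with hypothetical thresholds on read support and |LOD|."""
    support = max(link["n_cis"], link["n_trans"])
    return link["type"] is None or support < min_support or abs(link["lod"]) < min_abs_lod
```

A pair flagged by `is_weak_link` would be a candidate for the re-evaluation described in S382, using reads that span additional neighbouring sites.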
According to an embodiment of the present invention, S3 may further include: if the father or the mother carries a Duchenne muscular dystrophy gene mutation site, determining which haplotype the mutation site is located on, and further verifying the constructed haplotype for consistency by extracting sequences that contain the pathogenic mutation site and its adjacent mutation sites. The pathogenic mutation site is a mutation site associated with Duchenne muscular dystrophy.
According to an embodiment of the present invention, the S4 further includes the following steps:
S41, separating cell-free DNA fragments from the detection sample, wherein the cell-free DNA fragments are cell-free nucleic acids in the peripheral blood of the pregnant woman;
S42, performing a third treatment on the cell-free DNA fragments to obtain third-treated DNA fragments;
According to a specific embodiment of the present invention, the third treatment includes the following steps:
S421, performing a third end repair on the cell-free DNA fragments to obtain third end-repaired DNA fragments;
S422, adding a base A to the 3' end of the third end-repaired DNA fragments to obtain DNA fragments with a third sticky end A;
S423, ligating the DNA fragments with the third sticky end A to a third adaptor to obtain a third ligation product;
S424, performing a third PCR amplification on the third ligation product to obtain a third amplification product, namely the third-treated DNA fragments;
S43, screening the third-treated DNA fragments with a probe and constructing the second-generation sequencing library, wherein the probe specifically recognizes the Duchenne muscular dystrophy related target gene, the target gene region spans positions 30897180 to 33561648 on chromosome X, the probe is provided in the form of a microchip array, and the microchip array is a liquid-phase chip;
According to a specific embodiment of the present invention, screening the third-treated DNA fragments with a probe further comprises: performing a fourth PCR amplification on the screened DNA fragments to obtain a fourth amplification product, and performing a fifth purification and recovery on the fourth amplification product to obtain a fifth recovery product; the fifth recovery product comprises the second-generation sequencing library;
According to a specific embodiment of the invention, the fifth purification and recovery comprises recovery by magnetic bead purification, gel excision, or the BluePippin method; according to a specific embodiment of the present invention, the second-generation sequencing library is sequenced on a second-generation sequencing platform such as NextSeq or NovaSeq.
According to a specific embodiment of the present invention, the second-generation sequencing in S5 is performed on a second-generation sequencing platform, which includes the NextSeq or NovaSeq sequencing platform.
According to an embodiment of the present invention, the S6 includes the following steps:
S61, filtering the second-generation sequencing result to obtain a filtered sequencing result, wherein the filtering conditions comprise the proportion of N bases in a sequence and the proportion of low-quality bases, N denoting a sequenced base that could not be called;
according to a specific embodiment of the present invention, the filtering conditions include: removing sequences in which N bases make up more than 10% of the sequence, and sequences in which bases with a quality value below 15 make up more than 50% of the sequence;
S62, aligning the filtered sequencing result to a reference human genome sequence to obtain an aligned sequencing data set; the alignment software is BWA or SOAP; the reference human genome is hg19;
S63, performing quality value correction and local realignment on the aligned sequencing data set to obtain corrected sequencing data; the quality value correction and the local realignment are carried out using GATK software;
S64, determining the single nucleotide polymorphism sites or small fragment insertion/deletion sites contained in the corrected sequencing data, and screening out the sites corresponding to the linked sites of the paternal and maternal haplotypes; further comprising: calculating the base depth of those sites in the second-generation sequencing data of the pregnant woman's peripheral blood, wherein the base depth is calculated using samtools or GATK;
S65, selecting among the screened sites corresponding to the paternal and maternal haplotypes: when determining which maternal haplotype was inherited, selecting sites at which the mother is heterozygous and the father is homozygous; when determining which paternal haplotype was inherited, selecting sites at which the father is heterozygous and the mother is homozygous;
According to a particular embodiment of the invention, said selection of sites at which the mother is heterozygous and the father is homozygous comprises the following steps:
S651, filtering out sites at which the father is heterozygous;
S652, filtering out sites with low base depth;
S653, filtering out sites that do not appear in the maternal haplotype;
According to a particular embodiment of the invention, said selection of sites at which the father is heterozygous and the mother is homozygous comprises the following steps:
S651, filtering out sites at which the mother is heterozygous;
S652, filtering out sites with low base depth;
S653, filtering out sites that do not appear in the paternal haplotype;
S66, calculating the fetal concentration from sites at which the parents are homozygous for different bases: if, at such sites, the total depth of maternal bases in the peripheral blood of the pregnant woman is A and the total depth of paternal bases is B, then the fetal concentration is f = 2B/(A + B);
and S67, determining the Duchenne muscular dystrophy gene haplotype of the fetus by a hidden Markov model and a site alignment method according to the fetal concentration.
According to an embodiment of the present invention, the S67 further includes the following steps:
S671, calculating the quality values, in the second-generation sequencing data of the pregnant woman's peripheral blood, of the screened sites at which the mother is heterozygous and the father is homozygous;
S672, calculating, according to the fetal concentration, the probability of the observed sequencing distribution in the pregnant woman's peripheral blood at each screened site at which the mother is heterozygous and the father is homozygous;
S673, calculating the optimal inheritance path using a hidden Markov model and the Viterbi algorithm according to the quality values and the probabilities of the sequencing distributions, and determining which maternal haplotype the fetus has inherited;
S674, selecting, from the second-generation sequencing data of the pregnant woman's peripheral blood in S64, the sites obtained from the third-generation data at which the father is heterozygous and the mother is homozygous, and determining which paternal haplotype was inherited by comparing a plurality of sites against the two paternal haplotypes obtained in S38. A site of this kind is one at which the peripheral blood data show a base that differs from the mother's homozygous base.
In another aspect, the present invention provides a method for detecting, monitoring and diagnosing Duchenne muscular dystrophy in a fetus, comprising the steps of:
S7, determining whether the Duchenne muscular dystrophy gene haplotype of the fetus is inherited from a mutant haplotype of the father or the mother;
S8, determining whether the fetus is normal, a carrier or a patient according to whether the fetal Duchenne muscular dystrophy gene haplotype is inherited from a mutant haplotype of the father or the mother.
If the fetus is a girl and has inherited a mutant Duchenne muscular dystrophy gene haplotype from only one of the father and the mother, the fetus is a carrier; if the fetus is a girl and has inherited mutant haplotypes from both the father and the mother, the fetus is a patient; if the fetus is a girl and has inherited a mutant haplotype from neither parent, the fetus does not suffer from Duchenne muscular dystrophy. If the fetus is a boy and has inherited the mutant haplotype of the mother, the fetus is a patient; if the fetus is a boy and has inherited the unmutated haplotype of the mother, the fetus does not suffer from Duchenne muscular dystrophy.
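The decision rules in the preceding paragraph can be sketched as a small lookup. This is only an illustration of the stated rules; the function name `classify_fetus` and its arguments are hypothetical and do not come from the patent.

```python
def classify_fetus(sex, inherits_paternal_mutant, inherits_maternal_mutant):
    """Return 'normal', 'carrier' or 'patient' following the rules stated above.

    sex: 'female' or 'male' (hypothetical encoding).
    The two flags say whether the fetus inherited the mutant haplotype
    of the father and of the mother, respectively.
    """
    if sex == "female":
        # Count how many mutant haplotypes were inherited (0, 1 or 2).
        n_mutant = int(inherits_paternal_mutant) + int(inherits_maternal_mutant)
        return {0: "normal", 1: "carrier", 2: "patient"}[n_mutant]
    if sex == "male":
        # For a boy the text only considers the maternal haplotype.
        return "patient" if inherits_maternal_mutant else "normal"
    raise ValueError("sex must be 'female' or 'male'")


print(classify_fetus("female", True, False))
print(classify_fetus("male", False, True))
```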
In another aspect, the present invention provides a system for determining the Duchenne muscular dystrophy gene haplotype of a fetus, said system comprising:
a third-generation sequencing library construction device, for constructing a third-generation sequencing library based on blood samples taken from the father and the mother of the fetus;
third-generation sequencing equipment, wherein the third-generation sequencing equipment is used for carrying out third-generation sequencing on the third-generation sequencing library to obtain a third-generation sequencing result;
a parental Duchenne muscular dystrophy gene haplotype determining device for determining the parental and maternal Duchenne muscular dystrophy gene haplotypes from the third generation sequencing results;
a second-generation sequencing library construction device, which is used for constructing a second-generation sequencing library based on a detection sample, wherein the detection sample is taken from a peripheral blood sample of a pregnant woman;
the second-generation sequencing equipment is used for carrying out second-generation sequencing on the second-generation sequencing library to obtain a second-generation sequencing result;
and the device for determining the Duchenne muscular dystrophy gene haplotype of the fetus is used for determining the Duchenne muscular dystrophy gene haplotype of the fetus according to the second-generation sequencing result and the Duchenne muscular dystrophy gene haplotype of the father and the mother.
According to a specific embodiment of the present invention, the apparatus for constructing a third generation sequencing library further comprises:
a first separation device for separating DNA fragments from the blood sample, wherein the DNA fragments are whole genome DNA fragments of fetal father and maternal peripheral blood leukocytes;
the first processing device is used for carrying out first processing on the DNA fragments to obtain first processed DNA fragments;
and the first probe screening device is used for screening the first treated DNA fragment by using a probe and constructing a third-generation sequencing library, wherein the probe specifically identifies a target gene related to Duchenne muscular dystrophy.
Preferably, the apparatus for constructing a third generation sequencing library further comprises a second processing device, the second processing device is configured to perform a second processing on the DNA fragments obtained by the probe screening in the first probe screening device to obtain second processed DNA fragments, and the second processed DNA fragments constitute a third generation sequencing library.
According to a specific embodiment of the present invention, the apparatus for determining parental Duchenne muscular dystrophy haplotype further comprises:
the first comparison device is used for comparing a third-generation sequencing result obtained by a third-generation sequencing device with a reference human genome sequence to obtain a compared sequencing data set;
a screening device for screening the sequence with the highest alignment score from the aligned sequencing data set to obtain a unique aligned sequence set;
a base depth calculating device for calculating the base depth of each site in the target region of each sequence in the unique alignment sequence set;
a heterozygous site screening device for screening heterozygous single nucleotide polymorphic sites or sites of small fragment indels according to the different base depths of each site on each sequence;
a means for selecting adjacent heterozygous sites for selecting a sequence fragment containing two adjacent heterozygous single nucleotide polymorphisms or small fragment indels according to the selected heterozygous single nucleotide polymorphism sites or small fragment indels;
a site judging device, which is used for judging the sites of two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion on the sequence fragments to obtain the connection types of the two adjacent heterozygous sites;
a connection type judging device, which is used for judging the connection type of two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments, namely the overall haplotype;
a first correction device for correcting the overall haplotype to obtain the haplotype of the father and the mother.
According to a specific embodiment of the present invention, the apparatus for selecting adjacent heterozygous sites further comprises a sequence fragment selection unit, wherein the sequence fragment selection unit is used for finding the sequence fragments covering each pair of adjacent heterozygous single nucleotide polymorphism sites or small fragment insertion/deletion sites, and for selecting the sequence fragments corresponding to the combination types of the two adjacent heterozygous sites that are supported by the largest number of sequences.
According to an embodiment of the present invention, the apparatus for determining a site further includes:
an analysis filtering unit for analyzing and filtering sequence fragments with low quality values;
a probability calculation unit, configured to calculate the probability values of the possible combinations of the two adjacent heterozygous mutation sites on the filtered sequence fragments and to assign a probability value to each connection type, where the probability value includes a Bayesian probability or a logarithm of odds (LOD) value;
and the connection type selecting unit is used for selecting the connection type with the highest probability of the two adjacent heterozygous sites according to the probability values of the two adjacent heterozygous sites.
According to an embodiment of the present invention, the apparatus for determining a connection type further includes:
and the locus calculation unit is used for calculating two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments by adopting a mathematical statistical method to obtain the connection types of the two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments.
According to an embodiment of the present invention, the first correction apparatus further includes:
a strong and weak connection relation judging unit, configured to judge the strength of the connection relation between the two adjacent heterozygous single nucleotide polymorphisms or the connection order of the small fragment insertion deletion sites on all the sequence fragments, where the judgment criterion includes a supported sequence number, a calculated probability value, or an odds ratio;
and the weak link re-judgment unit is used for re-judging the sites of which the connection relationship between the two adjacent heterozygous sites is weak link, selecting sequences covering the two adjacent heterozygous sites and a plurality of sites adjacent to the two adjacent heterozygous sites, judging the support condition of the sequences, and correcting the sites of the weak link according to the support condition, wherein the correction standard comprises the number of the sequences spanning the plurality of sites or the number of the sites supporting the adjacent haplotypes.
According to a specific embodiment of the present invention, the apparatus for constructing a second generation sequencing library further comprises:
a second separation means for separating free DNA fragments, which are pregnant woman peripheral blood free nucleic acids, from the test sample;
a third processing device, which is used for carrying out third processing on the free DNA fragment to obtain a third processed DNA fragment;
and the second probe screening device is used for screening the third treated DNA fragment by using a probe and constructing a second-generation sequencing library, wherein the probe specifically identifies a Duchenne muscular dystrophy related target gene.
According to a specific embodiment of the present invention, the apparatus for determining the Duchenne muscular dystrophy gene haplotype of a fetus further comprises:
a filtering device, configured to filter the second-generation sequencing result to obtain a filtered sequencing result, where the filtering conditions include the proportion of N bases in a sequence and the proportion of low-quality bases, N denoting a sequenced base that could not be called;
a second comparison device, configured to align the filtered sequencing result to a reference human genome sequence to obtain an aligned sequencing data set;
a second correcting device, configured to perform quality value correction and local realignment on the aligned sequencing data set to obtain corrected sequencing data;
a parental connection site screening device for determining sites containing single nucleotide polymorphism or sites with small fragment insertion deletion in the corrected sequencing data and screening connection sites corresponding to the haplotype of the father and the mother;
a genetic parent haplotype judging device, which is used for selecting the screened connecting sites corresponding to the haplotype of the father and the mother, and selecting the sites which are heterozygous for the mother and homozygous for the father when judging which haplotype of the genetic mother; when judging which haplotype is inherited from father, selecting the site which is heterozygous for father and homozygous for mother;
a fetal concentration calculation means for calculating the fetal concentration from sites at which the parents are homozygous for different bases;
and a device for judging the Duchenne muscular dystrophy gene haplotype of the fetus, used for judging the Duchenne muscular dystrophy gene haplotype of the fetus with a hidden Markov model according to the fetal concentration.
According to a specific embodiment of the present invention, the fetal concentration calculation apparatus further comprises: a fetal concentration calculation unit for calculating a fetal concentration of 2B/(A + B) where A is the total depth of maternal bases and B is the total depth of paternal bases in peripheral blood of the pregnant woman, from the sites that are homozygous and different for the parents.
According to an embodiment of the present invention, the apparatus for determining Duchenne muscular dystrophy gene haplotype in a fetus further comprises:
a quality value calculating unit, used for calculating the quality values, in the peripheral blood of the pregnant woman, of the screened sites at which the mother is heterozygous and the father is homozygous;
a sequencing distribution probability calculation unit, used for calculating, according to the fetal concentration, the probability of the observed sequencing distribution in the peripheral blood of the pregnant woman at each screened site at which the mother is heterozygous and the father is homozygous;
a maternal haplotype inheritance judging unit, used for calculating the optimal inheritance path with a hidden Markov model and the Viterbi algorithm according to the quality values and the probabilities of the sequencing distributions, and for judging which maternal haplotype the fetus has inherited;
a paternal haplotype inheritance judging unit, used for screening, in the peripheral plasma data of the pregnant woman, the sites obtained from the third-generation data at which the father is heterozygous and the mother is homozygous, and for determining which paternal haplotype was inherited by comparing a plurality of sites against the two paternal haplotypes.
The scheme of the invention will be explained with reference to the examples. It will be appreciated by persons skilled in the art that the following examples are illustrative only and are not to be construed as limiting the invention. Reagents, software and equipment not specifically described in the following examples are conventional commercial products or open source, unless otherwise stated.
Example 1
1. Preparation of Duchenne muscular dystrophy gene hybridization capture chip
The region commonly associated with Duchenne muscular dystrophy is identified; the preferred region is the Duchenne muscular dystrophy region on chromosome X from position 30897180 to position 33561648.
Repetitive fragments are removed, preferably those with more than 200 matches, i.e. a fragment is removed if it matches more than 200 regions on the chromosome.
The DMD hybridization capture chip is obtained, preferably as a liquid-phase capture chip.
Long fragments can also be amplified and enriched by PCR, which can serve as a supplementary method to assist the chip in capturing the target region.
2. Construction of the third Generation library
a. Sample preparation: obtaining blood samples of father and mother of fetus, and extracting whole genome DNA fragments of peripheral blood leucocyte.
b. Genomic DNA fragmentation: the genomic DNA is broken into large fragments of about 5-15 kb by centrifugation, sonication, enzymatic digestion or the like, and the desired fragments are recovered using magnetic beads, Blue Pippin HT, agarose gel electrophoresis or the like.
c. Sample preparation before hybridization:
1) performing end repair on the large-fragment DNA to obtain end-repaired large-fragment DNA; the end repair is carried out by reaction with T4 DNA polymerase (T4 DNA Polymerase) and T4 polynucleotide kinase (T4 Polynucleotide Kinase).
2) Adding "A" to the ends of the end-repaired large-fragment DNA to obtain A-tailed large-fragment DNA; the A-tailing is carried out in an A-tailing system containing Klenow (3'-5' exo-).
3) Ligating adapters to the A-tailed large-fragment DNA to obtain adapter-ligated large-fragment DNA; the ligation is carried out by reacting the sequencing adapters with T4 DNA ligase (T4 DNA Ligase).
4) Amplifying the adapter-ligated large-fragment DNA to obtain an enriched adapter-ligated large-fragment DNA product, thereby completing preparation of the hybridization library.
d. Chip hybridization capture: the sample prepared in step c is subjected to a hybridization capture reaction using the chip obtained in step 1, and the resulting capture product is enriched by amplification.
e. Third Generation sequencing library construction
1) performing DNA damage repair on the enriched product obtained in step d to obtain damage-repaired DNA;
2) performing end repair on the damage-repaired DNA to obtain end-repaired DNA;
3) purifying the end-repaired DNA to obtain purified, recovered DNA;
4) ligating blunt-end adapters to the purified, recovered DNA to obtain adapter-ligated DNA with the complete third-generation sequencing library structure;
5) purifying the adapter-ligated DNA three times in succession, by a magnetic bead method, a gel cutting method, a BluePippin HT method or the like, to obtain the third-generation sequencing library.
f. Quality control (QC) of the third-generation library: library quality is checked using Qubit and an Agilent 2100, and libraries that pass proceed to sequencing.
3. Third-generation target region sequencing method
The qualified library is sequenced on a third-generation sequencer according to the operating specification of the sequencer.
4. Obtaining father and mother Single Nucleotide Polymorphism (SNP) and small fragment insertion deletion (InDel) sites according to third generation sequencing data
Obtaining the aligned sequences: the sequencing reads of the sample are aligned to the reference genome to obtain the alignment result.
Selecting the optimal alignment: since a third-generation sequence may align to multiple places in the reference genome, the alignment with the highest score is selected.
Calculating the depth of each base at each position in the target region for the sequences in the unique alignment set. Taking SNPs as an example, if the reference genome base is A and the aligned sequences show A and C, the depths of A and C are calculated, where A is the reference base and C is the mutated base.
Screening heterozygous SNP (InDel) sites by base depth. Preferably, if the depth of the mutated base divided by the depth of the site is greater than 0.2 and less than 0.8, and the depth of the site is greater than 20X, the site is determined to be a candidate heterozygous SNP site, where the depth of a site is the number of sequences covering it.
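As a minimal sketch of the preferred screening rule above (mutant-base fraction strictly between 0.2 and 0.8, site depth above 20X), the check can be written as follows. The function and parameter names are hypothetical; only the three thresholds are taken from the text.

```python
def is_candidate_het_site(alt_depth, site_depth,
                          min_frac=0.2, max_frac=0.8, min_depth=20):
    """Apply the preferred heterozygous-site rule from the text.

    alt_depth:  number of sequences showing the mutated base.
    site_depth: number of sequences covering the site.
    """
    if site_depth <= min_depth:
        # The text requires a site depth greater than 20X.
        return False
    frac = alt_depth / site_depth
    return min_frac < frac < max_frac


print(is_candidate_het_site(15, 40))  # fraction 0.375, depth 40
print(is_candidate_het_site(5, 40))   # fraction 0.125, too low
```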
5. Haplotype determination of father and mother Single Nucleotide Polymorphisms (SNPs) and small fragment insertion deletion (InDel) sites
Linking the detected SNP or InDel sites to obtain fragments containing two heterozygous SNP or InDel positions.
For every two adjacent SNP or InDel sites, the covering sequences are examined. Taking SNPs as an example, if the first site is A/C and the second site is G/T, the two sites on a given sequence may read AG, AT, CG or CT.
Selecting the two types with the most support, for example, AG with 5 sequences, AT with 10 sequences, CG with 8 CG with 19 CT, then selecting AT and CG as two-point connection.
Judging the connection of two adjacent heterozygous SNP or InDel sites to obtain the connection type of the two sites.
Low-quality sequence fragments are analyzed and filtered out; these are fragments that contain low-quality bases and cannot be assigned at both of the two adjacent single nucleotide polymorphism sites or small fragment insertion/deletion sites.
The probability of each possible combination at the two adjacent heterozygous mutation sites is calculated, and a probability is assigned to each connection type, using measures including, but not limited to, Bayesian probability and the logarithm of odds (LOD) value.
The connection type with the highest probability for the two sites is selected according to the calculation result.
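The pairwise linking step can be illustrated with a simple read-counting sketch. It is not the Bayesian or LOD calculation mentioned above: as a simplifying assumption it compares the total read support of the two mutually exclusive configurations (first alleles together versus first alleles apart) and keeps the better supported one. The function name, the input layout and the read counts are all hypothetical.

```python
from collections import Counter


def phase_pair(site1_alleles, site2_alleles, read_pairs):
    """Pick the better supported linkage between two adjacent het sites.

    site1_alleles, site2_alleles: two alleles each, e.g. ('A', 'C'), ('G', 'T').
    read_pairs: iterable of (base_at_site1, base_at_site2), one per read
                covering both sites.
    Returns (linkage, supporting_reads, conflicting_reads).
    """
    a1, b1 = site1_alleles
    a2, b2 = site2_alleles
    # Ignore reads showing a base that is neither allele (likely errors).
    counts = Counter(p for p in read_pairs
                     if p[0] in site1_alleles and p[1] in site2_alleles)
    same = counts[(a1, a2)] + counts[(b1, b2)]   # e.g. AG together with CT
    cross = counts[(a1, b2)] + counts[(b1, a2)]  # e.g. AT together with CG
    if same >= cross:  # ties fall to the first configuration (arbitrary choice)
        return ((a1, a2), (b1, b2)), same, cross
    return ((a1, b2), (b1, a2)), cross, same


# Hypothetical read counts: AG x2, AT x10, CG x8, CT x1.
reads = ([("A", "G")] * 2 + [("A", "T")] * 10
         + [("C", "G")] * 8 + [("C", "T")] * 1)
print(phase_pair(("A", "C"), ("G", "T"), reads))
```

The ratio of supporting to conflicting reads returned here is the kind of quantity that a strong or weak connection call could be based on.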
Calculating the haplotype across all sites.
All sites are considered together to obtain the overall haplotype; e.g., if the connection between site 1 and site 2 is AT and CG, and the connection between site 2 and site 3 is TC and GT, then the haplotypes over sites 1, 2 and 3 are ATC and CGT. The methods applied here include, but are not limited to, graph theory and optimal-solution methods.
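The chaining in this example can be reproduced with a short greedy sketch. It simply extends the two haplotypes link by link, which is a simplification and not the graph-theoretic or optimal-solution methods the text mentions; the function name `chain_links` and the string encoding of links are hypothetical.

```python
def chain_links(links):
    """Chain pairwise links into two haplotype strings.

    links: list of pairwise links in site order; each link is a pair of
           two-character strings, e.g. [("AT", "CG"), ("TC", "GT")], where
           the first character is the base at the left site and the second
           the base at the right site.
    """
    hap0, hap1 = links[0]
    for x, y in links[1:]:
        # Attach each link half to the haplotype ending in its left base.
        if x[0] == hap0[-1] and y[0] == hap1[-1]:
            hap0 += x[1]
            hap1 += y[1]
        elif y[0] == hap0[-1] and x[0] == hap1[-1]:
            hap0 += y[1]
            hap1 += x[1]
        else:
            raise ValueError("link is inconsistent with the haplotype ends")
    return hap0, hap1


print(chain_links([("AT", "CG"), ("TC", "GT")]))  # ('ATC', 'CGT')
```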
Adjusting the resulting haplotypes.
Strong and weak connections in the haplotype result are identified, i.e. whether the connection between two sites is reliable; the criteria include, but are not limited to, the number of supporting reads, the calculated probability value and the LOD value.
Weakly connected sites are re-evaluated: sequences covering the two sites and adjacent sites are selected, their support is considered, and the weakly connected sites are corrected to obtain the final haplotype; the correction criteria include, but are not limited to, the number of sequences spanning multiple sites and the number of sites supporting the nearby haplotype.
6. Construction method of second generation library
a. Sample preparation: obtaining free nucleic acid in a peripheral blood sample of a pregnant woman, wherein the free nucleic acid consists of a plurality of DNA fragments;
b. library preparation:
1) performing end repair on the free DNA to obtain end-repaired free DNA; the end repair is performed by the reaction of T4 DNA Polymerase (T4 DNA Polymerase) and T4Polynucleotide Kinase (T4Polynucleotide Kinase);
2) adding 'A' to the tail end of the free DNA with the tail end repaired to obtain the free DNA with the tail end added with 'A'; the end-plus "A" is carried out in a Klenow (3 '-5' exo-) containing end-plus "A" system;
3) ligating adapters to the A-tailed free DNA to obtain adapter-ligated free DNA; the ligation is carried out by reacting the sequencing adapters with T4 DNA ligase (T4 DNA Ligase);
4) amplifying the adapter-ligated free DNA to obtain an enriched adapter-ligated free DNA product, wherein the amplification is carried out in a system containing Pfx DNA polymerase (Platinum Pfx DNA polymerase), and the amplification product is recovered with magnetic beads to obtain the peripheral blood plasma free DNA library.
c. Chip hybrid capture
The peripheral blood plasma free DNA library obtained in step b is subjected to a hybridization capture reaction using the chip obtained in step 1, and the resulting capture product is enriched by amplification.
d. The enriched product is recovered by a magnetic bead method, a gel cutting method, a Blue Pippin HT method or the like, completing construction of the second-generation capture library;
e. Library quality inspection: the second-generation capture library is checked using an Agilent 2100 and qPCR, and libraries that pass proceed to the next sequencing step.
7. Second generation sequencing method
The qualified library is sequenced on a second-generation sequencer according to the operating specification of the sequencer.
8. Obtaining Single Nucleotide Polymorphism (SNP) and small fragment insertion deletion (InDel) sites in peripheral blood sample of pregnant woman according to second generation sequencing data
Filtering low-quality data. The filtering conditions include, but are not limited to, the proportion of N bases (bases that could not be called during sequencing) in a sequence and the proportion of low-quality bases in a sequence.
Obtaining the aligned sequences. The raw sequencing files are aligned to the reference genome using software including, but not limited to, BWA and SOAP.
Quality value correction and local realignment are performed using software including, but not limited to, GATK.
The sites in plasma corresponding to the variant sites detected in the parents are extracted using software including, but not limited to, GATK and samtools.
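The read-level filter in the first step of this section can be sketched as below. The thresholds (more than 10% N bases, or more than 50% of bases with a quality value below 15) are the ones given in the specific embodiment of step S61 earlier in the description; the function name and the input representation (a base string plus a list of per-base quality values) are assumptions made for illustration.

```python
from typing import Sequence


def passes_read_filter(seq: str, quals: Sequence[int],
                       max_n_frac: float = 0.10,
                       min_qual: int = 15,
                       max_lowq_frac: float = 0.50) -> bool:
    """Return True if a read survives the N-content and low-quality filters."""
    if not seq or len(seq) != len(quals):
        return False  # malformed record, treated as failing (assumption)
    n_frac = seq.upper().count("N") / len(seq)
    lowq_frac = sum(q < min_qual for q in quals) / len(quals)
    # A read is removed when either proportion exceeds its threshold.
    return n_frac <= max_n_frac and lowq_frac <= max_lowq_frac


print(passes_read_filter("ACGTACGTAC", [30] * 10))           # clean read
print(passes_read_filter("NNACGTACGT", [30] * 10))           # 20% N
print(passes_read_filter("ACGTACGTAC", [10] * 6 + [30] * 4))  # 60% low quality
```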
9. Judging fetal haplotype according to Single Nucleotide Polymorphism (SNP) and small fragment insertion deletion (InDel) sites in peripheral blood sample of pregnant woman
Sites are selected from the single nucleotide polymorphism (SNP) and small fragment insertion/deletion (InDel) sites in the pregnant woman's peripheral blood sample: when determining which maternal haplotype was inherited, sites at which the mother is heterozygous and the father is homozygous are selected; when determining which paternal haplotype was inherited, sites at which the father is heterozygous and the mother is homozygous are selected. For example, to determine which maternal haplotype was inherited, the following steps are performed:
sites at which the father is heterozygous are filtered out.
Low-depth sites are filtered out.
Sites that do not appear in the maternal haplotype are filtered out.
To determine which paternal haplotype was inherited, the following steps are performed:
sites at which the mother is heterozygous are filtered out.
Low-depth sites are filtered out.
Sites that do not appear in the paternal haplotype are filtered out.
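The three filtering steps, for either parent, can be sketched as one function. The dictionary keys, the function names and the depth cutoff `min_depth=30` are hypothetical; the text does not state a numeric threshold for "low depth" in plasma.

```python
def is_heterozygous(genotype):
    """A genotype is a pair of bases, e.g. ('A', 'G')."""
    return genotype[0] != genotype[1]


def select_informative(sites, target_parent, phased_positions, min_depth=30):
    """Keep sites usable for deciding which haplotype of target_parent was inherited.

    sites: list of dicts with keys 'pos', 'mother_gt', 'father_gt', 'plasma_depth'.
    target_parent: 'mother' or 'father'.
    phased_positions: positions present in the target parent's haplotype.
    """
    if target_parent not in ("mother", "father"):
        raise ValueError("target_parent must be 'mother' or 'father'")
    other_parent = "father" if target_parent == "mother" else "mother"
    kept = []
    for site in sites:
        if is_heterozygous(site[other_parent + "_gt"]):
            continue  # step 1: the other parent must be homozygous
        if site["plasma_depth"] < min_depth:
            continue  # step 2: drop low-depth sites
        if site["pos"] not in phased_positions:
            continue  # step 3: site must appear in the parent's haplotype
        kept.append(site)
    return kept


sites = [
    {"pos": 1, "mother_gt": ("A", "G"), "father_gt": ("A", "A"), "plasma_depth": 120},
    {"pos": 2, "mother_gt": ("C", "T"), "father_gt": ("C", "T"), "plasma_depth": 150},
    {"pos": 3, "mother_gt": ("G", "T"), "father_gt": ("G", "G"), "plasma_depth": 8},
    {"pos": 4, "mother_gt": ("A", "C"), "father_gt": ("C", "C"), "plasma_depth": 90},
]
print([s["pos"] for s in select_informative(sites, "mother", {1, 2, 3})])
```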
Calculating the fetal concentration by methods including, but not limited to, calculation from sites at which the parents are homozygous for different bases, preferably as follows:
sites at which the parents are homozygous for different bases are selected; if the total depth of the maternal bases counted in peripheral blood is A and the total depth of the paternal bases is B, the fetal concentration is f = 2B/(A + B).
For example: position 5 of the target region is selected because the parents are homozygous with different base types: maternal genotype CC, paternal genotype GG (from the third-generation sequencing data). The second-generation sequencing data show that position 5 consists of C and G, where C is the maternal base type with a depth share (the proportion of C among the bases at the position) of 90%, and G is the paternal base type with a depth share of 10%; the estimated fetal concentration is therefore 2 × 10% / (90% + 10%) = 20%.
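The formula f = 2B/(A + B) and the worked example can be checked with a few lines of code. The function name is hypothetical; the factor of 2 reflects that, at these sites, only one of the fetus's two alleles is the paternal base.

```python
def fetal_concentration(maternal_depth, paternal_depth):
    """f = 2B / (A + B), with A the maternal-base depth and B the paternal-base depth."""
    total = maternal_depth + paternal_depth
    if total <= 0:
        raise ValueError("total depth must be positive")
    return 2 * paternal_depth / total


# Worked example from the text: 90% maternal base C, 10% paternal base G.
print(fetal_concentration(90, 10))  # 0.2
```

In practice A and B would be summed over all selected sites before applying the formula, as the text describes total depths.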
Deducing the haplotype of the fetus using a hidden Markov model.
Calculating the quality values, in the peripheral blood of the pregnant woman, of the sites selected above.
Calculating the probability of observing the sequencing distribution at each site. Assuming the maternal haplotypes are m0 and m1, the probabilities of inheriting m0 and of inheriting m1 are calculated separately. Taking m0 as an example, the sequencing distribution is preferably treated as a multinomial distribution, and the probability is
\[ b = \frac{(n_A + n_G + n_C + n_T)!}{n_A!\, n_G!\, n_C!\, n_T!}\; p_A^{n_A}\, p_G^{n_G}\, p_C^{n_C}\, p_T^{n_T}, \]
where n_A denotes the depth of base A at a given site and p_A denotes the probability of observing A at that site,
\[ p_A = 0.5\,(1-f)\,\Delta(s, m_0) + 0.5\,(1-f)\,\Delta(s, m_1) + 0.5\,f\,\Delta(s, m_0) + 0.5\,f\,\Delta(s, m_1), \]
where Δ(x, y) = 1 - e when x equals y and e/3 when x does not equal y, e is the base error rate, and f is the fetal concentration.
Calculating the optimal inheritance path using the hidden Markov model, wherein the recombination probability, i.e. the state transition probability, is $10^{-6}$ and the two initial state probabilities are each 1/2; the optimal solution is obtained using the Viterbi algorithm.
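A minimal sketch of the two computational pieces named above, assuming a two-state model (state 0 = fetus inherits m0, state 1 = fetus inherits m1): a multinomial log-probability for the base counts at one site, and a Viterbi pass with initial probabilities of 1/2 and a state transition probability of 10^-6 as stated in the text. The function names and the input layout are hypothetical, and the per-site emission log-probabilities are taken as given rather than derived from the expression for pA.

```python
import math


def multinomial_log_prob(counts, probs):
    """log of n!/(nA! nG! nC! nT!) * pA^nA * pG^nG * pC^nC * pT^nT.

    counts: dict base -> depth; probs: dict base -> probability (all > 0).
    """
    n = sum(counts.values())
    logp = math.lgamma(n + 1)
    for base, k in counts.items():
        logp -= math.lgamma(k + 1)
        if k:
            logp += k * math.log(probs[base])
    return logp


def viterbi_two_state(log_emissions, switch_prob=1e-6):
    """Most likely state path for a two-state HMM.

    log_emissions: list of (logP(obs | m0), logP(obs | m1)), one per site,
                   in genomic order.
    """
    if not log_emissions:
        return []
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob)
    # Initial state probabilities are 1/2 each.
    score = [math.log(0.5) + log_emissions[0][s] for s in (0, 1)]
    backpointers = []
    for emis in log_emissions[1:]:
        new_score, ptr = [], []
        for state in (0, 1):
            stay = score[state] + log_stay
            switch = score[1 - state] + log_switch
            if stay >= switch:
                new_score.append(stay + emis[state])
                ptr.append(state)
            else:
                new_score.append(switch + emis[state])
                ptr.append(1 - state)
        score = new_score
        backpointers.append(ptr)
    # Trace back from the better final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical emissions: the third site weakly favours m1, the rest favour m0.
print(viterbi_two_state([(-1.0, -3.0), (-1.0, -3.0), (-2.5, -2.0), (-1.0, -3.0)]))
```

With such a small transition probability, a single weakly discordant site does not flip the inferred haplotype, which is the intended effect of modelling recombination as rare.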
Determining whether the father's mutant haplotype is inherited: sites on the paternal haplotypes are selected, for example sites at which the mother is homozygous and the father's base differs from the mother's, and it is checked whether the paternal base appears in the peripheral blood. If such sites on one haplotype appear in the peripheral blood, the fetus has inherited that haplotype.
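A minimal sketch of this site-comparison idea: for each paternal haplotype, count the sites where its base differs from the mother's homozygous base and that base is seen in plasma. The function name, the input layout and the `min_reads` cutoff (used here to ignore isolated sequencing errors) are hypothetical and are not specified in the text.

```python
def infer_paternal_haplotype(sites, min_reads=2):
    """Return (haplotype_index or None, support counts per haplotype).

    sites: list of dicts with keys
        'p0', 'p1' : father's base on haplotype 0 and haplotype 1,
        'mother'   : mother's homozygous base,
        'plasma'   : dict base -> depth in maternal plasma.
    """
    support = [0, 0]
    for site in sites:
        for hap, key in enumerate(("p0", "p1")):
            base = site[key]
            # Only a base that the mother lacks is informative.
            if base != site["mother"] and site["plasma"].get(base, 0) >= min_reads:
                support[hap] += 1
    if support[0] == support[1]:
        return None, support  # undecided
    return (0 if support[0] > support[1] else 1), support


sites = [
    {"p0": "A", "p1": "G", "mother": "G", "plasma": {"G": 95, "A": 5}},
    {"p0": "C", "p1": "T", "mother": "T", "plasma": {"T": 96, "C": 4}},
    {"p0": "G", "p1": "A", "mother": "G", "plasma": {"G": 100}},
]
print(infer_paternal_haplotype(sites))  # (0, [2, 0])
```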
The fetal haplotype is deduced from the haplotypes inherited from each parent, and the fetus is determined to be normal, a carrier or a patient.
The third-generation sequences are used for verification: whether sites around the mutation site lie on the same sequence as the mutation is checked to determine whether the pathogenic mutation is inherited.
Example 2
Noninvasive DMD detection was performed for one case. In this example, both the father and the mother were carriers of the c.583C>T mutation, and the fetus was homozygous for the c.583C>T mutation.
1. Collection and processing of paternal and maternal samples
5 mL of peripheral blood was collected from each of the father and the mother in Streck blood collection tubes following the standard peripheral blood collection procedure. After collection, plasma was separated promptly from the peripheral blood by the standard two-step centrifugation method.
1.1 Plasma DNA extraction
Free DNA was extracted from maternal peripheral blood plasma using a TIANAmp Micro DNA Kit, with the following specific steps:
1.1.1 Take 600 μL of the pregnant woman's peripheral blood plasma in a 2 mL centrifuge tube, add 20 μL of proteinase K solution, mix thoroughly by shaking, and centrifuge briefly.
1.1.2 Add 600 μL of buffer GB (containing Carrier RNA stock solution at a concentration of 1 μg/μL; see the kit manual for the preparation method), mix thoroughly by shaking, and centrifuge briefly.
1.1.3 Incubate at 56 ℃ for 10 min without shaking the sample. Centrifuge to remove droplets from the inner wall of the tube cap.
1.1.4 Add 300 μL of ice-cold absolute ethanol, mix well, let stand at room temperature for 5 min, and centrifuge.
1.1.5 Transfer the solution obtained in 1.1.4 to an adsorption column (placed in a collection tube), centrifuge at 12,000 rpm for 30 sec, discard the waste liquid, and return the adsorption column to the collection tube.
1.1.6 Add 500 μL of buffer GD (check before use that absolute ethanol has been added), centrifuge at 12,000 rpm for 30 sec, discard the waste liquid, and return the adsorption column to the collection tube.
1.1.7 Add 600 μL of rinsing solution PW (check before use that absolute ethanol has been added), centrifuge at 12,000 rpm for 30 sec, discard the waste liquid, and return the adsorption column to the collection tube.
1.1.8 Repeat step 1.1.7.
1.1.9 Centrifuge at 12,000 rpm for 2 min and discard the waste liquid. Leave the adsorption column at room temperature for 2-5 min so that the residual rinsing solution in the adsorption material dries completely.
1.1.10 Transfer the adsorption column to a clean centrifuge tube, add 90 μL of elution buffer TB to the middle of the adsorption membrane, let stand at room temperature for 2-5 min, centrifuge at 12,000 rpm for 2 min, and collect the solution in a new 1.5 mL centrifuge tube to obtain the pregnant woman's peripheral blood plasma free DNA. Attach the corresponding sample information bar code to the centrifuge tube, take 1 μL of DNA for NanoDrop measurement and record the concentration, and store the plasma DNA at -80 ℃ for later use.
1.2 peripheral blood genomic DNA extraction
Genomic DNA was extracted from the peripheral blood of both parents using the TIANAmp Blood DNA Kit, with the following steps:
1.2.1 Add 20 μL of proteinase K, 200 μL of whole blood, and 200 μL of AL, in that order, to a new 1.5 mL EP tube.
1.2.2 Mix thoroughly by inversion and incubate at 56 ℃ for 10 min.
1.2.3 Centrifuge briefly, add 200 μL of absolute ethanol, mix thoroughly by inversion, and centrifuge briefly.
1.2.4 Transfer the solution obtained in 1.2.3 to adsorption column CB3 (placed in a collection tube), centrifuge at 12000 rpm for 30 s, discard the waste liquid, and return adsorption column CB3 to the collection tube.
1.2.5 Add 500 μL of buffer GD to adsorption column CB3, centrifuge at 12000 rpm for 30 s, discard the waste liquid in the collection tube, and return adsorption column CB3 to the collection tube.
1.2.6 Add rinsing solution PW to adsorption column CB3 and centrifuge at 12000 rpm for 30 s. Discard the waste liquid in the collection tube and return adsorption column CB3 to the collection tube.
1.2.7 Repeat step 1.2.6.
1.2.8 Centrifuge at 12000 rpm for 2 min, discard the waste liquid in the collection tube, and leave adsorption column CB3 at room temperature for several minutes to dry the residual rinsing solution in the adsorption material thoroughly.
1.2.9 Transfer adsorption column CB3 to a 1.5 mL centrifuge tube, add 80 μL of elution buffer TB dropwise to the center of the adsorption membrane without touching it, let stand at room temperature for 2-5 min, centrifuge at 12000 rpm for 2 min, and collect the solution in the centrifuge tube. Transfer the solution to a new 1.5 mL centrifuge tube, attach the corresponding sample information barcode, take 1 μL of DNA for NanoDrop measurement and record the concentration, and store the genomic DNA at -20 ℃ until use.
2. Construction of the third Generation sequencing library
2.1 disruption of genomic DNA
1 μg of genomic DNA was taken from each sample, made up to 150 μL with EB solution, and sheared using g-TUBEs.
2.1.1 Place the g-TUBE on the loader, add the 150 μL sample to the top of the g-TUBE, and screw the cap on tightly.
2.1.2 Place the g-TUBE upright in a high-speed centrifuge and centrifuge at 7000 rpm for 1 min.
2.1.3 Remove the g-TUBE, invert it (cap down), place it back in the high-speed centrifuge, and centrifuge at 7000 rpm for 1 min.
2.1.4 Remove the g-TUBE and transfer the sample to a new 1.5 mL centrifuge tube; shearing is now complete. Purify with 0.8 volumes of AMPure PB magnetic beads, elute the beads with 24 μL of EB solution, and transfer 22 μL of the sample to a new 1.5 mL centrifuge tube. Label the side of the sample tube and proceed to the next step, or store at -20 ℃ until use.
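The input for step 2.1 (1 μg of genomic DNA made up to 150 μL with EB) depends on the concentration recorded by NanoDrop. The short sketch below shows the arithmetic; the function name `input_volumes` and the 50 ng/μL example concentration are hypothetical and not taken from the patent.

```python
from typing import Tuple


def input_volumes(
    conc_ng_per_ul: float,
    mass_ng: float = 1000.0,   # 1 μg of genomic DNA, as stated in step 2.1
    final_ul: float = 150.0,   # final volume, as stated in step 2.1
) -> Tuple[float, float]:
    """Return (DNA volume, EB volume) in μL for the shearing input."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    dna_ul = mass_ng / conc_ng_per_ul
    if dna_ul > final_ul:
        raise ValueError("sample too dilute to fit 1 μg into the final volume")
    return dna_ul, final_ul - dna_ul


# Hypothetical example: a 50 ng/μL extract needs 20 μL of DNA and 130 μL of EB.
print(input_volumes(50.0))
```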
2.2 recovery of fragments of interest
Fragments ≥ 7 kb were recovered using the High Pass mode of the Blue Pippin HT instrument.
After the instrument run was finished, the recovered sample was taken out and 1 μL was used for Qubit HS measurement. The concentration was recorded, the side of the sample tube was labeled, and the sample either went on to the next step or was stored at -80 ℃ until use.
2.3 Pre-hybridization sample preparation
Pre-hybridization sample preparation was performed using the KAPA Hyper Prep Kit.
2.3.1 End repair and "A" addition
Take 200 ng of the DNA obtained in 2.2, make up to 50 μL with EB, and prepare the end repair and "A" addition reaction in a 200 μL PCR tube according to the following table.
Figure BDA0001558673970000231
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and run the reaction under the following conditions.
Figure BDA0001558673970000232
After the reaction is finished, the next reaction is carried out immediately.
2.3.2 Adapter ligation
Prepare the ligation reaction with the solution obtained in the previous step according to the following table.
Figure BDA0001558673970000233
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and incubate at 20 ℃ for 15 min. Immediately after the reaction, purify with 0.8 volumes of AMPure PB magnetic beads, elute the beads with 52 μL of EB solution, and transfer 50 μL of the sample to a new 1.5 mL centrifuge tube. Label the side of the sample tube and proceed to the next step, or store at -20 ℃ until use.
2.3.3 amplification of fragments of interest
Divide the 50 μL of solution obtained in the previous step into two equal 25 μL portions and prepare the PCR reactions in 200 μL PCR tubes according to the following table, i.e., 2 reactions per sample.
Figure BDA0001558673970000241
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and run the reaction under the following conditions.
Figure BDA0001558673970000242
After the reaction, recover the PCR products with 0.8 volumes of AMPure PB magnetic beads, elute the beads with 27 μL of EB solution, and transfer 25 μL of the sample to a new 1.5 mL centrifuge tube. Label the side of the sample tube and proceed immediately to the next step, or store at -80 ℃ until use.
2.4 chip Capture
2.4.1 Capture chip preparation and chip Capture
1) Pool equal amounts of the samples to be captured into one new 1.5 mL centrifuge tube, with a total of 3 μg of DNA;
2) Add the Index blocking reagents (1000 pM) corresponding to the Indexes used in library construction, in a total volume of 4 μL. The Indexes used in library construction in this example were Index1 and Index2. Add the reagents to the pooled sample tube according to the following table.
Figure BDA0001558673970000243
Vortex to mix, centrifuge briefly, pierce 2 holes in the cap of the centrifuge tube with a disposable sterile blood collection needle, and concentrate under vacuum at 60 ℃ until dry;
3) Take the library capture chip out of -80 ℃ storage and thaw it on ice;
4) Take out the dried sample, seal the cap with sealing film, and add reagents to the sample tube according to the following system;
Figure BDA0001558673970000251
Vortex to mix, centrifuge briefly, and incubate on a PCR instrument at 95 ℃ for 10 min (the program must be set in advance);
5) After the reaction, take out the sample, centrifuge at maximum speed at room temperature, transfer 14 μL of the sample to a 200 μL PCR tube containing 6 μL of the capture chip, and vortex vigorously.
6) Place on a PCR instrument and hybridize under the following program: 47 ℃ for 20 hours, with the heated lid of the PCR instrument set to 57 ℃. After the reaction, proceed to hybridization elution.
2.4.2 hybridization elution
1) Equilibrate the streptavidin magnetic beads M270 at room temperature for half an hour in advance. Switch on the thermomixer and set it to 47 ℃. The following table gives the amounts for 1 sample; scale up accordingly for multiple samples. Place the reagents that need preheating in the thermomixer according to the following system.
Figure BDA0001558673970000252
2) Vortex the magnetic beads vigorously to mix, take 50 μL of M270 magnetic beads per capture library, and add them to a new 1.5 mL centrifuge tube.
3) Transfer the centrifuge tube containing the magnetic beads to a magnetic rack and discard the supernatant once it is clear;
4) Keeping the centrifuge tube on the magnetic rack, add 100 μL of 1× magnetic bead elution buffer;
5) Remove the centrifuge tube from the magnetic rack and vortex for 12 s;
6) Transfer the centrifuge tube to the magnetic rack and discard the supernatant once it is clear;
7) Repeat steps 4) to 6), washing 2 times;
8) Remove the centrifuge tube containing the magnetic beads, add 50 μL of 1× magnetic bead elution buffer, transfer the suspension to a 200 μL PCR tube, place the PCR tube on a magnetic rack, and discard the supernatant once it is clear; the magnetic beads are now ready to bind the hybridized chip;
9) Add the post-hybridization solution containing the chip to the M270 magnetic beads from step 8) and vortex to mix;
10) Incubate in a PCR instrument preset to 47 ℃, with the heated lid of the PCR instrument set to 57 ℃;
11) After the reaction, add 100 μL of 1× elution buffer I preheated to 47 ℃ to the PCR tube containing the 20 μL of capture chip, mix by pipetting up and down, and transfer to a 1.5 mL centrifuge tube;
12) Transfer the centrifuge tube containing the capture chip to a magnetic rack and discard the supernatant once it is clear;
13) Remove the centrifuge tube, add 200 μL of 1× Stringent elution buffer preheated to 47 ℃, mix by pipetting up and down, transfer the tube to the thermomixer, and incubate at 47 ℃ for 5 min;
14) Repeat steps 12) to 13), for a total of 2 washes with 1× Stringent elution buffer preheated to 47 ℃;
15) Transfer the centrifuge tube to the magnetic rack and discard the supernatant once it is clear;
16) Remove the centrifuge tube, add 200 μL of room-temperature 1× elution buffer I, mix by pipetting up and down, transfer the tube to the magnetic rack, and discard the supernatant once it is clear;
17) Remove the centrifuge tube, add 200 μL of room-temperature 1× elution buffer II, mix by pipetting up and down, transfer the tube to the magnetic rack, and discard the supernatant once it is clear;
18) Remove the centrifuge tube, add 200 μL of room-temperature 1× elution buffer III, mix by pipetting up and down, transfer the tube to the magnetic rack, and discard the supernatant once it is clear;
19) Remove the centrifuge tube, add 50 μL of PCR-grade water, mix well, and proceed to the next reaction or store at -20 ℃ until use.
2.4.3 hybrid library amplification
1) Take 25 μL per library of the M270 magnetic bead chip suspension obtained in the previous step and prepare the PCR amplification system in a 200 μL PCR tube according to the following table.
Figure BDA0001558673970000261
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and run the reaction under the following conditions.
Figure BDA0001558673970000262
Figure BDA0001558673970000271
After the reaction, purify with 0.8 volumes of AMPure PB magnetic beads, elute the beads with 27 μL of EB solution, and transfer 24 μL of the sample to a new 1.5 mL centrifuge tube. Label the side of the sample tube and proceed to the next step, or store at -80 ℃ until use.
2.5 third Generation library construction
2.5.1 DNA damage repair
Prepare the reaction with the purified sample obtained in the previous step in a PCR tube according to the following table.
Figure BDA0001558673970000272
Mix well, centrifuge briefly, and react according to the following temperature program.
Figure BDA0001558673970000273
2.5.2 end repair
The samples after the reaction in the previous step were prepared in a PCR tube according to the following table.
Figure BDA0001558673970000274
Mix well, centrifuge briefly, and react according to the following temperature program.
Figure BDA0001558673970000275
After the reaction is finished, the next step is quickly carried out.
2.5.3 magnetic bead purification
2.5.3.1 Transfer the sample obtained in the previous step to a 1.5 mL centrifuge tube, add 23.6 μL (0.45 volumes) of AMPure PB magnetic beads, mix well, and centrifuge briefly;
2.5.3.2 Place the centrifuge tube containing the sample and magnetic beads on a mixer at room temperature, at 2000 rpm for 10 min;
2.5.3.3 Centrifuge briefly to collect the solution at the bottom of the tube, transfer the tube to a magnetic rack and, once the solution is clear, carefully aspirate the supernatant into one new, labeled 1.5 mL centrifuge tube (without touching the magnetic beads);
2.5.3.4 Keeping the centrifuge tube containing the magnetic beads on the magnetic rack, add 200 μL of freshly prepared 70% ethanol and discard the supernatant once it is clear;
2.5.3.5 Repeat step 2.5.3.4;
2.5.3.6 Discard the ethanol and air-dry the magnetic beads so that no ethanol remains;
2.5.3.7 Add 32 μL of EB solution, mix well, and place on the mixer at 2000 rpm for 1 min;
2.5.3.8 Transfer the centrifuge tube to the magnetic rack and, once the solution is clear, pipette 30 μL of the EB solution into one new 1.5 mL centrifuge tube; take 1 μL for NanoDrop measurement to confirm that the procedure so far shows no abnormality.
2.5.4 blunt end ligation preparation
1) Prepare the reaction with the sample obtained in 2.5.3 in a PCR tube on ice according to the following table.
Figure BDA0001558673970000281
Mix well, centrifuge briefly, and react according to the following temperature program.
Figure BDA0001558673970000282
After the reaction is finished, the next step is rapidly carried out.
2) Digestion of unsuccessfully ligated products
The following reagents were added to the sample tubes obtained in the previous step according to the following table.
Figure BDA0001558673970000291
Mix well, centrifuge briefly, and react according to the following temperature program.
Figure BDA0001558673970000292
After the reaction of this step was completed, magnetic beads were used for purification.
2.5.5 First magnetic bead purification
2.5.5.1 Add 18.9 μL (0.45 volumes) of AMPure PB magnetic beads to the sample obtained in the previous step, mix well, and centrifuge briefly;
2.5.5.2 Place the centrifuge tube containing the sample and magnetic beads on a mixer at room temperature, at 2000 rpm for 10 min;
2.5.5.3 Centrifuge briefly to collect the solution at the bottom of the tube, transfer the tube to a magnetic rack and, once the solution is clear, carefully aspirate the supernatant into one new, labeled 1.5 mL centrifuge tube (without touching the magnetic beads);
2.5.5.4 Keeping the centrifuge tube containing the magnetic beads on the magnetic rack, add 200 μL of freshly prepared 70% ethanol and discard the supernatant once it is clear;
2.5.5.5 Repeat step 2.5.5.4;
2.5.5.6 Discard the ethanol and air-dry the magnetic beads so that no ethanol remains;
2.5.5.7 Add 50 μL of EB solution, mix well, and place on the mixer at 2000 rpm for 1 min;
2.5.5.8 Transfer the centrifuge tube to the magnetic rack and, once the solution is clear, pipette 50 μL of the EB solution into one new 1.5 mL centrifuge tube for the next magnetic bead purification.
2.5.6 Second magnetic bead purification
2.5.6.1 Add 22.5 μL (0.45 volumes) of AMPure PB magnetic beads to the sample obtained in the previous step, mix well, and centrifuge briefly;
2.5.6.2 Place the centrifuge tube containing the sample and magnetic beads on a mixer at room temperature, at 2000 rpm for 10 min;
2.5.6.3 Centrifuge briefly to collect the solution at the bottom of the tube, transfer the tube to a magnetic rack and, once the solution is clear, carefully aspirate the supernatant into one new, labeled 1.5 mL centrifuge tube (without touching the magnetic beads);
2.5.6.4 Keeping the centrifuge tube containing the magnetic beads on the magnetic rack, add 200 μL of freshly prepared 70% ethanol and discard the supernatant once it is clear;
2.5.6.5 Repeat step 2.5.6.4;
2.5.6.6 Discard the ethanol and air-dry the magnetic beads so that no ethanol remains;
2.5.6.7 Add 100 μL of EB solution, mix well, and place on the mixer at 2000 rpm for 1 min;
2.5.6.8 Transfer the centrifuge tube to the magnetic rack and, once the solution is clear, pipette 100 μL of the EB solution into one new 1.5 mL centrifuge tube for the next magnetic bead purification.
2.5.7 Third magnetic bead purification
2.5.7.1 Add 45 μL (0.45 volumes) of AMPure PB magnetic beads to the sample obtained in the previous step, mix well, and centrifuge briefly;
2.5.7.2 Place the centrifuge tube containing the sample and magnetic beads on a mixer at room temperature, at 2000 rpm for 10 min;
2.5.7.3 Centrifuge briefly to collect the solution at the bottom of the tube, transfer the tube to a magnetic rack and, once the solution is clear, carefully aspirate the supernatant into one new, labeled 1.5 mL centrifuge tube (without touching the magnetic beads);
2.5.7.4 Keeping the centrifuge tube containing the magnetic beads on the magnetic rack, add 200 μL of freshly prepared 70% ethanol and discard the supernatant once it is clear;
2.5.7.5 Repeat step 2.5.7.4;
2.5.7.6 Discard the ethanol and air-dry the magnetic beads so that no ethanol remains;
2.5.7.7 Add 10 μL of EB solution, mix well, and place on the mixer at 2000 rpm for 1 min;
2.5.7.8 Transfer the centrifuge tube to the magnetic rack and, once the solution is clear, pipette 10 μL of the EB solution into one new 1.5 mL centrifuge tube.
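The bead volumes in the three purifications of 2.5.5 to 2.5.7 follow from the stated 0.45 volume ratio. The sketch below reproduces them. The sample volumes of 50 μL and 100 μL match the EB volumes carried over from 2.5.5.8 and 2.5.6.8, while the 42 μL used for the first purification is only inferred from 18.9 μL / 0.45 and is an assumption. The function name `bead_volume_ul` is hypothetical.

```python
def bead_volume_ul(sample_ul: float, ratio: float = 0.45) -> float:
    """Bead volume in μL for a given sample volume and bead-to-sample ratio."""
    return round(sample_ul * ratio, 1)


# Sample volume entering each purification (42 μL is inferred, see lead-in).
for step, sample_ul in (("2.5.5", 42.0), ("2.5.6", 50.0), ("2.5.7", 100.0)):
    print(step, sample_ul, bead_volume_ul(sample_ul))
```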
2.5.8 third generation library QC
The library was assayed with Qubit and the 2100 instrument; the 2100 traces are shown in FIGS. 1 and 2. Libraries that passed the checks were sequenced.
2.5.9 third generation sequencing
This example used the PacBio Sequel for sequencing, performed according to the PacBio Sequel instrument operating specifications.
3. Detection of Single Nucleotide Polymorphism (SNP) and small fragment insertion deletion (InDel) sites in father and mother
The off-machine third-generation sequencing data were analyzed by the following steps:
3.1 Alignment: the long third-generation reads were aligned to the human reference genome (hg19) using bwa.
3.2 Selection of the best alignment: the multiple alignment results of each read were analyzed, and the position with the highest alignment score was kept as the unique alignment of that read.
3.3 Depth calculation at each site: mpileup output was obtained with samtools and parsed to obtain the bases and depth at each site.
3.4 Filtering of heterozygous SNPs and InDels: sites with an allele frequency within (0.2, 0.8) and a depth greater than 20X were retained. The results are shown in Table 1.
TABLE 1 Number of SNPs and InDels after filtering
          SNP     InDel
Father    8124    3584
Mother    9313    3240
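Steps 3.2 and 3.4 reduce to two small rules, sketched below in Python. This is an illustration under stated assumptions rather than the pipeline used in the example: the `Alignment` record, the function names, and the toy values are hypothetical, and the per-site read counts are assumed to have already been obtained from the mpileup output of step 3.3.

```python
from typing import Dict, Iterable, NamedTuple


class Alignment(NamedTuple):
    read_id: str
    position: int
    score: int


def best_alignments(alignments: Iterable[Alignment]) -> Dict[str, Alignment]:
    """Step 3.2: keep, for each read, the alignment with the highest score."""
    best: Dict[str, Alignment] = {}
    for aln in alignments:
        current = best.get(aln.read_id)
        if current is None or aln.score > current.score:
            best[aln.read_id] = aln
    return best


def is_heterozygous(alt_reads: int, depth: int,
                    min_depth: int = 20,
                    low: float = 0.2, high: float = 0.8) -> bool:
    """Step 3.4: depth greater than 20X and allele frequency inside (0.2, 0.8)."""
    if depth <= min_depth:
        return False
    freq = alt_reads / depth
    return low < freq < high


# Toy data (hypothetical).
alns = [Alignment("r1", 100, 30), Alignment("r1", 5000, 55), Alignment("r2", 200, 40)]
print(best_alignments(alns)["r1"].position)
print(is_heterozygous(12, 30), is_heterozygous(12, 20), is_heterozygous(29, 30))
```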
4. Construction of haplotypes for parents with third generation data
The off-machine third-generation data of the parents were analyzed to obtain the parental haplotypes.
4.1 Identify the reads that cover the positions of two adjacent heterozygous SNPs (InDels).
4.2 Determine the linkage between the two heterozygous SNPs (InDels).
4.3 Compute the haplotype across all sites.
4.4 Adjust the haplotypes obtained; the results are shown in Table 2.
TABLE 2 Results of haplotype determination
                           Mother      Father
Number                     11          8
Longest length             374352      365135
Longest starting point     30899230    30898793
Longest end point          31273582    31263928
Shortest length            7164        6981
Shortest starting point    31830473    31357817
Shortest end point         31837637    31364798
In Table 2, Number is the number of haplotypes detected. Longest length, longest starting point, and longest end point are the length, start coordinate, and end coordinate of the longest haplotype; shortest length, shortest starting point, and shortest end point are the corresponding values for the shortest haplotype.
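As a quick consistency check on Table 2, each reported length equals the end coordinate minus the start coordinate. The snippet below recomputes the four lengths from the table values; the helper name `block_length` is hypothetical.

```python
# (start, end, reported length) as listed in Table 2
TABLE_2 = {
    "mother longest": (30899230, 31273582, 374352),
    "father longest": (30898793, 31263928, 365135),
    "mother shortest": (31830473, 31837637, 7164),
    "father shortest": (31357817, 31364798, 6981),
}


def block_length(start: int, end: int) -> int:
    """Length of a haplotype block as end coordinate minus start coordinate."""
    return end - start


for name, (start, end, reported) in TABLE_2.items():
    print(name, block_length(start, end), block_length(start, end) == reported)
```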
4.5 Read-level verification of the haplotype at the mutation site. The mutation site is DMD: c.583C>T, located at chrX:32827676 on the human reference genome. The mother's haplotype without the mutation was designated m0 and her haplotype with the mutation m1; the father's haplotype without the mutation was designated f0 and his haplotype with the mutation f1.
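The idea behind steps 4.1 to 4.3 (linking adjacent heterozygous sites through long reads that cover both) can be sketched as follows. This is a simplified illustration, not the algorithm used in the example: reads are modeled as hypothetical position-to-base dictionaries, linkage is decided by a plain majority vote, and a new block starts wherever no read spans two neighboring sites.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Read = Dict[int, str]  # position -> base observed on one long read


def link(reads: List[Read], pos_a: int, base_a: str, pos_b: int) -> Optional[str]:
    """Most frequent base at pos_b on reads that carry base_a at pos_a."""
    counts = Counter(r[pos_b] for r in reads if r.get(pos_a) == base_a and pos_b in r)
    return counts.most_common(1)[0][0] if counts else None


def most_common_base(reads: List[Read], pos: int) -> str:
    """Most frequent base at pos, or 'N' if no read covers it."""
    counts = Counter(r[pos] for r in reads if pos in r)
    return counts.most_common(1)[0][0] if counts else "N"


def phase_blocks(reads: List[Read], het_sites: List[int]) -> List[List[Tuple[int, str]]]:
    """Chain adjacent heterozygous sites into blocks of one haplotype each."""
    blocks: List[List[Tuple[int, str]]] = []
    current: List[Tuple[int, str]] = []
    for pos in sorted(het_sites):
        base = link(reads, current[-1][0], current[-1][1], pos) if current else None
        if base is None:
            # No read connects this site to the previous one: close the block.
            if current:
                blocks.append(current)
            current = [(pos, most_common_base(reads, pos))]
        else:
            current.append((pos, base))
    if current:
        blocks.append(current)
    return blocks


# Toy reads (hypothetical positions and bases).
toy_reads = [
    {100: "A", 200: "C"}, {100: "A", 200: "C"}, {100: "G", 200: "T"},
    {200: "C", 300: "G"}, {500: "T"},
]
print(phase_blocks(toy_reads, [100, 200, 300, 500]))
```

The block boundaries produced this way correspond to the start and end coordinates of the kind reported in Table 2.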
5. Construction of a second Generation sequencing library
The second-generation library was constructed from the DNA extracted from the plasma separated from the pregnant woman's peripheral blood.
5.1 End repair and "A" addition
Take 200 ng of the DNA obtained in 2.2, make up to 50 μL with EB, and prepare the end repair and "A" addition reaction in a 200 μL PCR tube according to the following table.
Figure BDA0001558673970000321
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and run the reaction under the following conditions.
Figure BDA0001558673970000322
After the reaction is finished, the next reaction is carried out immediately.
5.2 Adapter ligation
Prepare the ligation reaction with the solution obtained in the previous step according to the following table.
Figure BDA0001558673970000323
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and incubate at 20 ℃ for 15 min. Immediately after the reaction, purify with 88 μL of AMPure XP magnetic beads, elute the beads with 52 μL of EB solution, and transfer 50 μL of the sample to a new 1.5 mL centrifuge tube. Label the side of the sample tube and proceed to the next step, or store at -20 ℃ until use.
5.3 amplification of fragments of interest
Prepare the PCR amplification system in a 200 μL PCR tube according to the following table.
Figure BDA0001558673970000331
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and run the reaction under the following conditions.
Figure BDA0001558673970000332
After the reaction, recover the PCR products with 1 volume of AMPure XP magnetic beads, elute the beads with 27 μL of EB solution, and transfer 25 μL of the sample to a new 1.5 mL centrifuge tube. Label the side of the sample tube and proceed immediately to the next step, or store at -20 ℃ until use.
5.4 target area Capture
5.4.1 Capture chip preparation and chip Capture
1) Pool equal amounts of the samples to be captured into one new 1.5 mL centrifuge tube, with a total of 1 μg of DNA;
2) Add the Index blocking reagents (1000 pM) corresponding to the Indexes used in library construction, which were Index3 and Index4, in a total volume of 4 μL, and add the reagents to the pooled sample tube according to the following table;
Figure BDA0001558673970000333
Figure BDA0001558673970000341
Vortex to mix, centrifuge briefly, pierce 2 holes in the cap of the centrifuge tube with a disposable sterile blood collection needle, and concentrate under vacuum at 60 ℃ until dry;
3) Take the library capture chip out of -80 ℃ storage and thaw it on ice;
4) Take out the dried sample, seal the cap with sealing film, and add reagents to the sample tube according to the following system;
Figure BDA0001558673970000342
Vortex to mix, centrifuge briefly, and incubate on a PCR instrument at 95 ℃ for 10 min (the program must be set in advance).
5) After the reaction, take out the sample, centrifuge at maximum speed at room temperature, transfer 14 μL of the sample to a 200 μL PCR tube containing 6 μL of the capture chip, and vortex vigorously.
6) Place on a PCR instrument and hybridize under the following program: 47 ℃ for 16-20 hours, with the heated lid of the PCR instrument set to 57 ℃. After the reaction, proceed to hybridization elution.
5.4.2 hybridization elution
1) Equilibrate the streptavidin magnetic beads M270 at room temperature for half an hour in advance. Switch on the thermomixer and set it to 47 ℃. The following table gives the amounts for 1 sample; scale up accordingly for multiple samples. Place the reagents that need preheating in the thermomixer according to the following system.
Figure BDA0001558673970000343
2) Vortex the magnetic beads vigorously to mix, take 50 μL of M270 magnetic beads per capture library, and add them to a new 1.5 mL centrifuge tube.
3) Transfer the centrifuge tube containing the magnetic beads to a magnetic rack and discard the supernatant once it is clear;
4) Keeping the centrifuge tube on the magnetic rack, add 100 μL of 1× magnetic bead elution buffer;
5) Remove the centrifuge tube from the magnetic rack and vortex for 12 s;
6) Transfer the centrifuge tube to the magnetic rack and discard the supernatant once it is clear;
7) Repeat steps 4) to 6), washing 2 times;
8) Remove the centrifuge tube containing the magnetic beads, add 50 μL of 1× magnetic bead elution buffer, transfer the suspension to a 200 μL PCR tube, place the PCR tube on a magnetic rack, and discard the supernatant once it is clear; the magnetic beads are now ready to bind the hybridized chip;
9) Add the post-hybridization solution containing the chip to the M270 magnetic beads from step 8) and vortex to mix;
10) Incubate in a PCR instrument preset to 47 ℃, with the heated lid of the PCR instrument set to 57 ℃;
11) After the reaction, add 100 μL of 1× elution buffer I preheated to 47 ℃ to the PCR tube containing the 20 μL of capture chip, mix by pipetting up and down, and transfer to a 1.5 mL centrifuge tube;
12) Transfer the centrifuge tube containing the capture chip to a magnetic rack and discard the supernatant once it is clear;
13) Remove the centrifuge tube, add 200 μL of 1× Stringent elution buffer preheated to 47 ℃, mix by pipetting up and down, transfer the tube to the thermomixer, and incubate at 47 ℃ for 5 min;
14) Repeat steps 12) to 13), for a total of 2 washes with 1× Stringent elution buffer preheated to 47 ℃;
15) Transfer the centrifuge tube to the magnetic rack and discard the supernatant once it is clear;
16) Remove the centrifuge tube, add 200 μL of room-temperature 1× elution buffer I, mix by pipetting up and down, transfer the tube to the magnetic rack, and discard the supernatant once it is clear;
17) Remove the centrifuge tube, add 200 μL of room-temperature 1× elution buffer II, mix by pipetting up and down, transfer the tube to the magnetic rack, and discard the supernatant once it is clear;
18) Remove the centrifuge tube, add 200 μL of room-temperature 1× elution buffer III, mix by pipetting up and down, transfer the tube to the magnetic rack, and discard the supernatant once it is clear;
19) Remove the centrifuge tube, add 40 μL of PCR-grade water, mix well, and proceed to the next reaction or store at -20 ℃ until use.
5.4.3 hybrid library amplification
1) Take 20 μL per capture library of the M270 magnetic bead chip suspension obtained in the previous step and prepare the PCR amplification system in a 200 μL PCR tube according to the following table;
Figure BDA0001558673970000351
Figure BDA0001558673970000361
Mix thoroughly, centrifuge briefly, place in a PCR instrument, and run the reaction under the following conditions;
Figure BDA0001558673970000362
After the reaction, purify the PCR amplification product with 60 μL of AMPure XP beads and elute the enriched product from the beads with 27 μL of Elution buffer to obtain the purified hybridization library, which completes construction of the chip capture library.
5.4.4 quality testing of libraries
Library quality was checked with the Agilent 2100 Bioanalyzer and fluorescence quantitative PCR (QPCR). The 2100 trace is shown in FIG. 3, and the results met the requirements for on-machine sequencing.
5.4.5 second Generation sequencing
Sequencing was performed on a Nextseq CN500 instrument, strictly following the standard operating procedure for on-machine sequencing.
6. Detection of second-generation sequencing mutations
The parental mutation sites in the periphery were extracted by sequencing the maternal peripheral blood.
6.1 Low quality of filtered data. Filtering sequences having a proportion of N greater than 10% of a sequence, and filtering sequences having a proportion of mass values less than 15 greater than 50% of a sequence.
6.2 Obtain the aligned sequences. The raw files from the sequencer were aligned to hg19 with bwa to produce the raw bam file.
6.3 Quality value adjustment and local realignment were performed with GATK.
6.4 For the sites carrying parental mutations, extract their depth in the maternal peripheral blood from the adjusted bam file. The results are shown in Table 3.
Figure BDA0001558673970000363
Figure BDA0001558673970000371
Table 3. Sites corresponding to the parental SNPs and InDels, with their depths in the maternal peripheral blood.
7. Inferring the fetal haplotype
7.1 Sites are selected from the SNPs (InDels) in the haplotypes. To determine the inherited maternal haplotype, sites heterozygous in the mother and homozygous in the father are selected; to determine the inherited paternal haplotype, sites heterozygous in the father and homozygous in the mother are selected. For each site, it is then determined whether it lies on the haplotype carrying the pathogenic mutation.
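The selection rule in 7.1 can be illustrated with a short sketch. The data layout (a dictionary mapping a coordinate to a pair of parental genotypes) and the function names are assumptions made for this example only, and the coordinates in the usage note are invented rather than taken from the tables.

```python
def is_heterozygous(genotype):
    """A genotype is a pair of bases; it is heterozygous when the two differ."""
    return genotype[0] != genotype[1]


def select_informative_sites(genotypes):
    """Split sites into the two informative classes described in 7.1.

    genotypes -- dict: coordinate -> (mother_genotype, father_genotype),
                 each genotype being a pair of bases (hypothetical layout)
    Returns (maternal_sites, paternal_sites), both sorted by coordinate.
    """
    maternal_sites = []  # heterozygous in the mother, homozygous in the father
    paternal_sites = []  # heterozygous in the father, homozygous in the mother
    for position in sorted(genotypes):
        mother, father = genotypes[position]
        if is_heterozygous(mother) and not is_heterozygous(father):
            maternal_sites.append(position)
        elif is_heterozygous(father) and not is_heterozygous(mother):
            paternal_sites.append(position)
        # Sites heterozygous or homozygous in both parents are not used.
    return maternal_sites, paternal_sites
```

For example, with `{100: (("C", "T"), ("C", "C")), 200: (("A", "A"), ("A", "G"))}`, coordinate 100 goes to the maternal list and coordinate 200 to the paternal list.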
Sites heterozygous in the mother and homozygous in the father are shown in Table 4.
Coordinate m0 base m1 base
30899531 C T
30937962 G A
31048336 C T
31138926 C A
31393725 C G
31662132 A T
31694129 A G
31752136 A G
31812261 C T
32002485 C A
32052223 T C
32254522 C T
32434433 G A
32656058 C G
32662376 T C
32687216 A G
33337961 T C
33411855 G A
33560377 A T
Table 4. Sites heterozygous in the mother and homozygous in the father; m0 and m1 denote the two maternal haplotypes.
Sites heterozygous in the father and homozygous in the mother are shown in Table 5.
Figure BDA0001558673970000372
Figure BDA0001558673970000381
Figure BDA0001558673970000391
Figure BDA0001558673970000401
Table 5. Sites heterozygous in the father and homozygous in the mother, where f0 and f1 denote the two paternal haplotypes. Only the 114 usable SNP sites are shown for this sample: there were enough SNPs, and SNPs are considered more accurate than InDels.
7.2 Calculate the fetal concentration. Methods include, but are not limited to, calculation from sites at which the parents are homozygous for different bases. The fetal concentration was found to be 8%.
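The claims give the formula f = 2B/(A + B), where A is the total depth of maternal bases and B the total depth of paternal bases at sites where the parents are homozygous for different bases. A minimal sketch of that calculation follows; the function name and the input layout (a list of per-site depth pairs) are assumptions, and the read counts in the usage note are invented for illustration.

```python
def fetal_concentration(site_depths):
    """Estimate the fetal concentration as f = 2B / (A + B).

    site_depths -- list of (maternal_base_depth, paternal_base_depth) pairs,
                   one pair per site at which the parents are homozygous
                   for different bases (hypothetical input layout)
    """
    site_depths = list(site_depths)  # allow any iterable, since it is read twice
    total_maternal = sum(m for m, _ in site_depths)  # A in the formula
    total_paternal = sum(p for _, p in site_depths)  # B in the formula
    if total_maternal + total_paternal == 0:
        raise ValueError("no reads cover the selected sites")
    return 2 * total_paternal / (total_maternal + total_paternal)
```

With the invented depths `[(96, 4), (192, 8)]`, A = 288 and B = 12, so f = 24/300 = 0.08. The factor of 2 reflects that the fetus carries the paternal base on only one of its two copies at such sites.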
7.3 Inferring the fetal haplotype with a hidden Markov model
7.3.1 Calculate the quality values, in the maternal peripheral blood, of the sites selected in 7.1 that are heterozygous in the mother and homozygous in the father.
7.3.2 Calculate the probability of the observed sequencing distribution at each site.
7.3.3 Find the optimal path using the hidden Markov model and the Viterbi algorithm; see Figure 4.
As shown in Figure 4, the first column gives the coordinate, the second column the m0 haplotype, the third column the m1 haplotype, the fourth column the result observed in the peripheral blood sequencing data, and the fifth column the optimal sequence calculated by the hidden Markov model.
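The hidden Markov model step can be sketched as a two-state Viterbi decoder, where the hidden state at each site is the maternal haplotype (m0 or m1) passed to the fetus. The text does not specify the emission or transition models, so everything below is an assumption for illustration: a binomial emission on the m0 read count whose expected fraction depends on the fetal concentration `f`, a constant per-step switch probability `switch_prob`, and no use of base quality values.

```python
import math


def expected_m0_fraction(state, father_is_m0, f):
    """Expected share of reads carrying the m0 base at one site.

    The mother contributes (1 - f) / 2 per base; the fetus adds f / 2 for the
    inherited maternal base and f / 2 for the (homozygous) paternal base.
    """
    fraction = (1 - f) / 2
    if state == 0:        # fetus inherited haplotype m0
        fraction += f / 2
    if father_is_m0:      # paternal base equals the m0 base
        fraction += f / 2
    return fraction


def viterbi_maternal_path(sites, f, switch_prob=1e-3):
    """Most likely sequence of inherited maternal haplotypes (0 = m0, 1 = m1).

    sites -- list of (m0_reads, m1_reads, father_is_m0), ordered by coordinate
    f     -- fetal concentration, strictly between 0 and 1
    """
    if not 0 < f < 1:
        raise ValueError("f must lie strictly between 0 and 1")
    if not sites:
        return []
    log_stay, log_switch = math.log(1 - switch_prob), math.log(switch_prob)

    def log_emission(site, state):
        m0_reads, m1_reads, father_is_m0 = site
        p = expected_m0_fraction(state, father_is_m0, f)
        # Binomial log-likelihood without the constant coefficient.
        return m0_reads * math.log(p) + m1_reads * math.log(1 - p)

    # Uniform prior over the two states at the first site.
    score = [math.log(0.5) + log_emission(sites[0], s) for s in (0, 1)]
    backpointers = []
    for site in sites[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            pointers.append(s if stay >= switch else 1 - s)
            new_score.append(max(stay, switch) + log_emission(site, s))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

The returned list plays the role of the fifth column of Figure 4: one inferred haplotype label per site. A change of label along the path corresponds to a recombination between neighbouring sites under this toy model.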
7.3.4 Determine which paternal haplotype was inherited. The f1 bases of the father are absent from the mother but appear in the peripheral blood, so f1 was determined to have been inherited.
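The reasoning in 7.3.4 can be sketched as a vote across sites that are heterozygous in the father and homozygous in the mother. The voting scheme, the `min_reads` threshold, the rule that an absent paternal-specific base counts for the other haplotype, and the input layout are all assumptions for illustration; the text only states that a paternal base absent from the mother but present in the peripheral blood indicates the inherited haplotype.

```python
def infer_paternal_haplotype(sites, min_reads=2):
    """Vote for the paternal haplotype (f0 or f1) passed to the fetus.

    sites -- list of (mother_base, f0_base, f1_base, plasma_counts), where
             plasma_counts maps a base to its read count in maternal
             peripheral blood (hypothetical input layout)
    Returns (winner, support); winner is None when the vote is tied.
    """
    support = {"f0": 0, "f1": 0}
    for mother_base, f0_base, f1_base, plasma_counts in sites:
        if f0_base == f1_base:
            continue  # father is not heterozygous here
        if f0_base != mother_base and f1_base == mother_base:
            specific_hap, specific_base, other_hap = "f0", f0_base, "f1"
        elif f1_base != mother_base and f0_base == mother_base:
            specific_hap, specific_base, other_hap = "f1", f1_base, "f0"
        else:
            continue  # neither paternal base matches the mother; skipped here
        # The paternal-specific base can only come from the fetus.
        if plasma_counts.get(specific_base, 0) >= min_reads:
            support[specific_hap] += 1
        else:
            support[other_hap] += 1
    if support["f0"] == support["f1"]:
        return None, support
    winner = "f0" if support["f0"] > support["f1"] else "f1"
    return winner, support
```

Requiring at least `min_reads` reads is a crude guard against sequencing errors; a real analysis would also need to check that depth is high enough before treating an absent base as evidence.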
7.3.5 Inference from the parental haplotypes: the fetus inherited m1 and f1, and both m1 and f1 carry the mutation site, so the fetus was inferred to be a DMD patient.
7.3.6 Extract reads from the third-generation sequencing data to verify and confirm that the fetus inherited the pathogenic mutation.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and the above embodiments are only used for explaining the claims. The scope of the invention is not limited by the description. Any changes or substitutions that can be easily made by those skilled in the art within the technical scope of the present disclosure are included in the scope of the present invention.

Claims (1)

1. A system for determining the fetal Duchenne muscular dystrophy gene haplotype, said system comprising:
a third-generation sequencing library construction device, which is used for constructing a third-generation sequencing library based on blood samples taken from the father and the mother of the fetus;
a third-generation sequencing device, which is used for carrying out third-generation sequencing on the third-generation sequencing library to obtain a third-generation sequencing result;
a parental Duchenne muscular dystrophy gene haplotype determining device for determining the paternal and maternal Duchenne muscular dystrophy gene haplotypes from the third-generation sequencing result;
a second-generation sequencing library construction device, which is used for constructing a second-generation sequencing library based on a detection sample, wherein the detection sample is taken from a peripheral blood sample of a pregnant woman;
a second-generation sequencing device, which is used for carrying out second-generation sequencing on the second-generation sequencing library to obtain a second-generation sequencing result;
a device for determining the Duchenne muscular dystrophy gene haplotype of the fetus, which is used for determining the Duchenne muscular dystrophy gene haplotype of the fetus according to the second-generation sequencing result and the Duchenne muscular dystrophy gene haplotype of the father and the mother;
the parental Duchenne muscular dystrophy gene haplotype determining device further comprises:
a first comparison device, which is used for aligning the third-generation sequencing result obtained by the third-generation sequencing device with a reference human genome sequence to obtain an aligned sequencing data set; the alignment software is Blasr; the reference human genome is hg19;
a screening device for screening the sequence with the highest alignment score from the aligned sequencing data set to obtain a unique aligned sequence set;
a base depth calculating device for calculating the base depth of each site in the target region of each sequence in the unique alignment sequence set;
a heterozygous site screening device, which is used for screening heterozygous single nucleotide polymorphism sites or small fragment insertion deletion sites according to the depths of the different bases at each site on each sequence; the screening criterion is that the mutated base accounts for more than 0.2 and less than 0.8 of the depth at the site, and the depth of the site is greater than 20×;
a means for selecting adjacent heterozygous sites for selecting a sequence fragment comprising two adjacent heterozygous single nucleotide polymorphisms or small fragment indels according to the selected heterozygous single nucleotide polymorphism sites or small fragment indels; finding out corresponding sequence fragments for every two adjacent heterozygous single nucleotide polymorphic sites or small fragment insertion deletion sites, and selecting the sequence fragments corresponding to the two adjacent heterozygous site arrangement types with the largest sequence number and containing the two adjacent heterozygous site arrangement types;
a site judging device for judging sites of two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion on the sequence fragment to obtain the connection type of the two adjacent heterozygous sites;
analyzing and filtering out the sequence fragments with low quality values, wherein a sequence fragment with a low quality value is one that contains low-quality bases and cannot be assigned to the two adjacent single nucleotide polymorphism sites or small fragment insertion deletion sites; a low-quality base is a base called as N;
calculating probability values of the two adjacent heterozygous mutation sites on the filtered sequence fragments, and giving a probability value to each connection type, wherein the probability values comprise Bayesian probability or LOD values;
selecting the connection type with the maximum probability of the two adjacent heterozygous sites according to the probability value of the two adjacent heterozygous sites;
a connection type judging device, which is used for judging the connection type of two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments, namely the overall haplotype; the method comprises the following steps:
calculating two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments by adopting a mathematical statistical method to obtain the connection type of the two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments, namely the overall haplotype, wherein the mathematical statistical method comprises a graph theory or an optimal solution method; the first correcting device is used for correcting the overall haplotype to obtain the haplotypes of the father and the mother; the method comprises the following steps:
judging the strength of the connection relation of the connection sequence of two adjacent heterozygous single nucleotide polymorphisms or small fragment insertion deletion sites on all the sequence fragments, wherein the judgment standard comprises the supported sequence number, the calculated probability value or the lod ratio;
judging the sites with weak connection relation between the two adjacent heterozygous sites again, selecting sequences covering the two adjacent heterozygous sites and a plurality of sites adjacent to the two adjacent heterozygous sites, judging the support condition of the sequences, and correcting the weak connection sites according to the support condition, wherein the correction standard comprises the number of the sequences spanning the sites or the number of the sites supporting the adjacent haplotypes;
the apparatus for determining the fetal Duchenne muscular dystrophy gene haplotype further comprises:
a filtering device, which is used for filtering the second-generation sequencing result to obtain a filtered sequencing result, wherein the filtering conditions include the proportion of N in a sequence and the proportion of low-quality bases, N being a base that could not be determined by sequencing; the method comprises the following steps:
filtering out sequences in which N accounts for more than 10% of the sequence, and sequences in which bases with a quality value below 15 account for more than 50% of the sequence;
a second comparison device, which is used for aligning the filtered sequencing result with a reference human genome sequence to obtain an aligned sequencing data set; the alignment software is BWA or SOAP; the reference human genome is hg19;
a second correcting device, which is used for performing quality value correction and local realignment on the aligned sequencing data set to obtain corrected sequencing data; the quality value correction and local realignment are carried out with GATK software;
a parental connection site screening device, which is used for determining the single nucleotide polymorphism sites or small fragment insertion deletion sites in the corrected sequencing data and screening the connection sites corresponding to the haplotypes of the father and the mother; further comprising: calculating the base depth of the connection sites corresponding to the haplotypes of the father and the mother in the second-generation sequencing data of the peripheral blood of the pregnant woman, wherein the base depth is calculated with samtools or GATK;
an inherited parental haplotype judging device, which is used for selecting, from the screened connection sites corresponding to the haplotypes of the father and the mother, the sites which are heterozygous for the mother and homozygous for the father when judging which maternal haplotype is inherited, and the sites which are heterozygous for the father and homozygous for the mother when judging which paternal haplotype is inherited; the selection of the sites which are heterozygous for the mother and homozygous for the father comprises the following steps:
s651, filtering the father heterozygous sites;
s652, filtering sites with low base depth;
s653, filtering sites not present in the maternal haplotype;
the selection of the sites which are heterozygous for the father and homozygous for the mother comprises the following steps:
s651, filtering the mother heterozygous sites;
s652, filtering sites with low base depth;
s653, filtering sites which do not appear in the paternal haplotype;
a fetal concentration calculation device, which is used for calculating the fetal concentration from the sites at which the parents are homozygous for different bases; the calculation method comprises: at the sites at which the parents are homozygous for different bases, taking the total depth of the maternal bases in the peripheral blood of the pregnant woman as A and the total depth of the paternal bases as B, the fetal concentration being f = 2B/(A + B);
a fetal Duchenne muscular dystrophy gene haplotype judging device, which is used for judging the Duchenne muscular dystrophy gene haplotype of the fetus with a hidden Markov model according to the fetal concentration, comprising:
calculating the quality values, in the second-generation sequencing data of the peripheral blood of the pregnant woman, of the screened sites which are heterozygous for the mother and homozygous for the father;
calculating, according to the fetal concentration, the probability of the sequencing distribution of each screened site which is heterozygous for the mother and homozygous for the father in the peripheral blood of the pregnant woman;
calculating an optimal genetic path with the hidden Markov model and the Viterbi algorithm according to the quality values and the probabilities of the sequencing distribution, and judging which maternal haplotype is inherited by the fetus;
and screening, from the second-generation sequencing data of the peripheral blood of the pregnant woman, the sites obtained from the third-generation data which are heterozygous for the father and homozygous for the mother, and determining which paternal haplotype is inherited by comparing a plurality of sites against the two haplotypes of the father, wherein these sites are sites at which the peripheral blood data contain a base inconsistent with the homozygous base of the mother.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810073058.7A CN108315403B (en) 2018-01-25 2018-01-25 Method and system for determining fetus Duchenne muscular dystrophy gene haplotype
PCT/CN2018/074939 WO2019144424A1 (en) 2018-01-25 2018-02-01 Method and system for determining haplotype of fetal duchenne-type muscular dystrophy gene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810073058.7A CN108315403B (en) 2018-01-25 2018-01-25 Method and system for determining fetus Duchenne muscular dystrophy gene haplotype

Publications (2)

Publication Number Publication Date
CN108315403A CN108315403A (en) 2018-07-24
CN108315403B true CN108315403B (en) 2022-05-24

Family

ID=62887133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810073058.7A Active CN108315403B (en) 2018-01-25 2018-01-25 Method and system for determining fetus Duchenne muscular dystrophy gene haplotype

Country Status (2)

Country Link
CN (1) CN108315403B (en)
WO (1) WO2019144424A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110993025B (en) * 2019-12-20 2023-08-22 北京科迅生物技术有限公司 Method and device for quantifying fetal concentration and method and device for genotyping fetus
GB2615061A (en) * 2021-12-03 2023-08-02 Congenica Ltd Next generation prenatal screening

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104232778B (en) * 2014-09-19 2016-08-17 天津华大基因科技有限公司 Determine the method and device of fetus haplotype and chromosomal aneuploidy simultaneously

Also Published As

Publication number Publication date
WO2019144424A1 (en) 2019-08-01
CN108315403A (en) 2018-07-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventor after: Zhang Chunsheng, Guo Yulai, Li Sheng, Du Bole, Jiang Biman, Zeng Xiaojing, Xia Weicheng, Wang Yang, Zhu Wentao
Inventor before: Zhang Chunsheng, Guo Yulai, Du Bole, Jiang Biman, Zeng Xiaojing, Xia Weicheng, Wang Yang, Zhu Wentao, Li Sheng
GR01 Patent grant