CN105648045B - Method and apparatus for determining the haplotype of a fetal target region


Info

Publication number
CN105648045B
Authority
CN
China
Prior art keywords
sequencing data
target area
haplotype
fetus
nucleic acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410639577.7A
Other languages
Chinese (zh)
Other versions
CN105648045A (en)
Inventor
袁媛
王垚燊
朱红梅
易鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TIANJIN BGI TECHNOLOGY Co Ltd
BGI Shenzhen Co Ltd
Original Assignee
TIANJIN BGI TECHNOLOGY Co Ltd
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TIANJIN BGI TECHNOLOGY Co Ltd, BGI Shenzhen Co Ltd
Priority to CN201410639577.7A
Publication of CN105648045A
Application granted
Publication of CN105648045B

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a method and an apparatus for determining the haplotype of a fetal target region. The method comprises: sequencing the target region of cell-free nucleic acid in a body fluid of a pregnant woman to obtain first sequencing data; sequencing the same target region in members of the fetus's family to obtain second, third and fourth sequencing data, where the second sequencing data are those of the mother of the fetus, the third those of the father of the fetus, and the fourth those of the proband; determining the fetal nucleic acid content of the pregnant woman's body fluid based on the first and second sequencing data and, optionally, the third sequencing data; constructing the target region haplotype of the mother and that of the father based on the second, third and fourth sequencing data; and determining the target region haplotype of the fetus based on the target region haplotypes of the mother and father and the fetal nucleic acid content.

Description

Method and apparatus for determining the haplotype of a fetal target region
Technical field
The present invention relates to the field of bioinformatics, and in particular to a method and an apparatus for determining the haplotype of a fetal target region.
Background art
Spinal muscular atrophy (SMA) is a group of common autosomal recessive genetic diseases. It ranks second among lethal autosomal recessive disorders, with an incidence of 1/6000 to 1/10000 among live births. Current research shows that SMA is caused mainly by deletion of the SMN gene: SMN1 is the disease-determining gene and expresses complete and stable functional SMN protein, while SMN2 is a modifier gene of SMA. It has been reported that 98.7% (226/229) of childhood-onset patients carry SMN1 gene deletions, and about 90% of SMA patients show homozygous deletion of SMN1 exon 7 and/or exon 8. The neuromuscular symptoms of SMA are severe, and there is currently no effective clinical treatment. Prenatal diagnosis is therefore an important means of preventing this birth defect.
The discovery of cell-free fetal DNA in maternal plasma made non-invasive prenatal detection of the fetal genotype possible. However, no report has yet appeared on non-invasive fetal SMA detection using cell-free DNA from maternal plasma. Existing SMA detection methods mostly design qPCR primers and probes against SMN1 exon 7 to detect SMN1 deletion mutations, as in "a fluorescence quantitative PCR kit for diagnosing human spinal muscular atrophy" disclosed by Xu Xiangmin et al. (publication number CN103614477A). However, because the fetal DNA content of maternal plasma is low, the sensitivity of qPCR is insufficient to detect the fetal SMN1 mutation status against a high maternal background.
Therefore, developing a method capable of non-invasively detecting the fetal SMN1 genotype would play an important role in the prenatal diagnosis of this disease.
Summary of the invention
According to one aspect of the present invention, a method for determining the haplotype of a fetal target region is provided, comprising the following steps: sequencing the target region of cell-free nucleic acid in a body fluid of a pregnant woman to obtain first sequencing data; sequencing the target region in members of the fetus's family to obtain second, third and fourth sequencing data, where the second sequencing data are those of the mother of the fetus, the third those of the father of the fetus, and the fourth those of the proband; determining the fetal nucleic acid content of the pregnant woman's body fluid based on the first sequencing data, the second sequencing data and, optionally, the third sequencing data; constructing the target region haplotype of the mother and that of the father based on the second, third and fourth sequencing data; and determining the target region haplotype of the fetus based on the mother's target region haplotype, the father's target region haplotype and the fetal nucleic acid content. The first, second, third and fourth sequencing data need not be obtained in any particular order: they may be obtained simultaneously, one at a time, or several at a time. Likewise, the step of determining the fetal nucleic acid content and the step of constructing the parental haplotypes need not be performed in any particular order.
According to another aspect of the present invention, an apparatus for determining the haplotype of a fetal target region is provided, which can perform some or all of the steps of the method of the first aspect. The apparatus comprises: a sequencing unit, for sequencing the target region of cell-free nucleic acid in a body fluid of a pregnant woman to obtain first sequencing data, and for sequencing the target region in members of the fetus's family to obtain second, third and fourth sequencing data, where the second sequencing data are those of the mother of the fetus, the third those of the father of the fetus, and the fourth those of the proband; a fetal nucleic acid content determination unit, connected to the sequencing unit, for determining the fetal nucleic acid content of the pregnant woman's body fluid based on the first sequencing data, the second sequencing data and, optionally, the third sequencing data; a parental haplotype determination unit, connected to the sequencing unit, for constructing the target region haplotype of the mother and that of the father based on the second, third and fourth sequencing data; and a fetal haplotype determination unit, connected to the fetal nucleic acid content determination unit and the parental haplotype determination unit, for determining the target region haplotype of the fetus based on the mother's target region haplotype, the father's target region haplotype and the fetal nucleic acid content.
The method and/or apparatus of this aspect of the invention provide an approach based on target region capture and linkage analysis of the family's target region haplotypes. Through linkage analysis, the fetal target region genotype is inferred from sequencing data of a maternal body fluid sample, such as maternal plasma DNA, and can be used to determine, or assist in determining, whether the fetus has a disease or abnormality related to variation in the target region. By using linked haplotype information, the method or apparatus greatly reduces the occurrence of false positives and false negatives. It largely avoids the false negative and false positive results caused by inaccurate allele ratio measurement or sequencing errors at a single site, making the results more accurate and reliable. Applied to families at high risk of SMN1-related disease, the method can effectively detect affected fetuses and reduce unnecessary invasive sampling procedures such as amniocentesis.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the apparatus for determining the haplotype of a fetal target region in a specific embodiment of the present invention;
Fig. 2 is a schematic diagram of the overall technical route for fetal genotype determination in a specific embodiment of the present invention;
Fig. 3 shows the fetal genotype determination results in a specific embodiment of the present invention. Fig. 3A shows the determination of the haplotype the fetus inherited from the father, and Fig. 3B the determination of the haplotype the fetus inherited from the mother. In the figures, each point represents, for one SNP site, the difference between the probability of inheriting Hap0 and the probability of inheriting Hap1, and the line shows the combined determination.
Detailed description of the embodiments
According to one embodiment of the present invention, a method for determining the haplotype of a fetal target region is provided, comprising the following steps:
Step 1: obtain the first, second, third and fourth sequencing data.
Cell-free nucleic acid is obtained from a body fluid of the pregnant woman, the target region is captured, and the captured target region is sequenced to obtain the first sequencing data. The pregnant woman's body fluid sample is a sample containing fetal nucleic acid. Maternal plasma, for example, contains fetal nucleic acid: the cell-free nucleic acid extracted from peripheral blood is a mixture of maternal and fetal nucleic acid, and this mixture is highly fragmented. Following the requirements of an existing sequencing platform, a sequencing library is constructed from the cell-free nucleic acid extracted from the maternal blood sample, a target region sequencing library is obtained by capture with probes, on a chip or with liquid-phase probes, and the target region library is sequenced to obtain the first sequencing data, which are mixed data from the mixture of maternal and fetal nucleic acid. Sequencing platforms include, but are not limited to, CG (Complete Genomics), Illumina/Solexa, Life Technologies ABI SOLiD and Roche 454. The sequencing library is prepared according to the selected platform, and single-end or paired-end sequencing may be chosen. Each set of sequencing data thus obtained consists of many short sequences, each of which is called a read. The capture chip consists of a solid-phase substrate and multiple probes fixed on it; the probes specifically recognize the target region. The target region may be part of the genomic DNA of the sample under test or the whole genome. In a specific embodiment of the present invention, the target capture region includes the SMN1 exon regions; Table 1 shows the position of each exon region on the reference genome HG19. The target region further includes SNP sites of high heterozygosity within the SMN1 gene and in the 3 Mb regions upstream and downstream of it; Table 2 shows how many of these SNPs fall in each region. The minor allele frequency (MAF) of these SNPs is between 0.3 and 0.5. Information from these regions and sites helps determine the fetal haplotype. Capturing the 3 Mb regions upstream and downstream of the target gene reduces the recombination probability to below one in ten thousand, so that haplotypes can subsequently be reconstructed or determined accurately. Capturing the high-heterozygosity SNP sites makes it easy to obtain sites or sequences specific to the fetus itself, from which the fetal nucleic acid content of the mixed DNA can be estimated. When designing probes that specifically recognize the above regions, every probe containing at least one of the above SNP sites is required to align uniquely to the reference genome; this ensures specific capture and accurate detection and strengthens the specificity with which the probes capture the target sites. In probe design, the GC content of each probe is kept at 40-50%, which helps the whole probe set bind the target region specifically in the same system and be eluted together in the same reaction system.
Table 1. SMN1 gene region capture range
Region Chromosome Start End
1 chr5 70220738 70221835
2 chr5 70222126 70223263
3 chr5 70223351 70223620
4 chr5 70224046 70224569
5 chr5 70224596 70225332
6 chr5 70225421 70227146
7 chr5 70227276 70229560
8 chr5 70229641 70230603
9 chr5 70230671 70231084
10 chr5 70231091 70231402
11 chr5 70231511 70232075
12 chr5 70232161 70232534
13 chr5 70233276 70233724
14 chr5 70234111 70235041
15 chr5 70235136 70235933
16 chr5 70236016 70236631
17 chr5 70236716 70239101
18 chr5 70239196 70239701
19 chr5 70239786 70241034
20 chr5 70241131 70242428
21 chr5 70242496 70242844
22 chr5 70243026 70243331
23 chr5 70243681 70244193
24 chr5 70244286 70244815
25 chr5 70245011 70245717
26 chr5 70247436 70248868
Table 2. Distribution of the SNP sites used for haplotyping of the SMN1 region
Region Number of SNP sites
upstream10M-3M 7
upstream3M-2.5M 1
upstream2.5M-2M 14
upstream2M-1.5M 98
upstream1.5M-1M 52
upstream1M-500K 71
upstream500K-0K 66
Gene±1M 1629
downstream0K-500K 67
downstream500K-1M 26
downstream1M-1.5M 42
downstream1.5M-2M 78
downstream2M-2.5M 87
downstream2.5M-3M 0
downstream3M-10M 7
Samples are obtained from the members of the fetus's family, including nucleic acid samples of the biological mother of the fetus (the pregnant woman), the biological father of the fetus, and the proband. Nucleic acid is extracted from each family member's sample and, in the same way as for the first sequencing data, the same target region is captured from each family member's nucleic acid and sequenced to obtain the family members' sequencing data. These include the second, third and fourth sequencing data, which correspond to the sequencing data of the same target region of the biological mother, the biological father and the proband, respectively. The second sequencing data, i.e. the mother's sequencing data, can be obtained from the same maternal blood sample used for the first sequencing data: separating the maternal blood sample yields maternal plasma and maternal blood cells, and maternal genomic nucleic acid can be obtained from the blood cells, for example the leukocytes, and then sequenced to give the second sequencing data. The proband is the family member determined to carry a variation associated with the target region. Here, the proband is a sibling of the fetus under test with the same biological parents, whether born or unborn, including embryos or fertilized eggs cultured in vitro, and whether living or not. In other specific embodiments, the proband may also be a sibling of one of the parents of the fetus under test, such as an uncle or aunt of the fetus. In that case the sequencing data of the family members should also include those of the paternal and/or maternal grandparents of the fetus, so that the sequencing data of the parent's sibling and of the parent can be used to construct the target region haplotypes of the grandparents and then determine which target region haplotype the parent inherited. The first, second, third and fourth sequencing data need not be obtained in any particular order. They may be obtained simultaneously, for example by labeling multiple samples with tags, pooling the nucleic acid samples for library construction and sequencing the pool in one run, or they may be obtained one sample at a time or several samples at a time.
Step 2: determine the fetal nucleic acid content.
The fetal nucleic acid content of the pregnant woman's body fluid sample is determined based on the first and second sequencing data, or based on the first, second and third sequencing data.
Determining the fetal nucleic acid content of the pregnant woman's body fluid sample based on the first and second sequencing data proceeds as follows. First, sites are screened that show two genotypes in the first sequencing data but only one genotype in the second sequencing data. The screening can be done by alignment, using software such as SOAP (Short Oligonucleotide Analysis Package), BWA or SAMtools; this embodiment places no restriction on the choice, and the alignment can also be used to identify polymorphic sites. The reference sequence used for alignment is a known sequence and may be any previously obtained reference template for the category to which the target individual belongs. For example, if the target individual is human, the reference sequence may be HG19 as provided by the NCBI database. Alternatively, a resource library containing several reference sequences can be prepared in advance and, before alignment, a closer sequence can be selected or assembled as the reference according to factors such as the sex, ethnicity and region of the target individual, which helps obtain more accurate results. During alignment, depending on the alignment parameters, each read or each pair of reads (paired-end reads) in each set of sequencing data is allowed at most n base mismatches, with n preferably 1 or 2; if more than n bases in a read are mismatched, the read or read pair is regarded as failing to align to the reference sequence.
Consider a site whose base in the reference sequence is A. Suppose the alignment of the second sequencing data, i.e. the mother's sequencing data, shows that all bases aligned to this site are A, while the alignment of the first sequencing data, i.e. the mixed sequencing data of mother and fetus, shows both A and another, non-A base (T, C or G) at this site. Because the first sequencing data are mixed data from maternal and fetal nucleic acid, and the second sequencing data show that the mother is AA at this site, the non-A base at this site in the first sequencing data must come from the fetus. All such sites are screened out, and the proportion of fetus-derived reads at these sites in the mixed sequencing data reflects the fetal nucleic acid content of the mixed nucleic acid. Similarly, if the alignment of the second sequencing data shows that the mother is heterozygous at a site, for example AG, and the alignment of the first sequencing data supports both genotype AG and genotype AA at that site, the fetal nucleic acid content of the maternal blood sample can be estimated from the number, content or proportion of A bases in the first sequencing data. In the former case, where the second sequencing data show only a homozygous genotype and the first sequencing data show a heterozygous genotype in addition to the same homozygous genotype, the fetal nucleic acid content is f = 2d/(c+d). In the latter case, where the second sequencing data show only a heterozygous genotype and the first sequencing data show a homozygous genotype in addition to that heterozygous genotype, the fetal nucleic acid content is f = (c-d)/(c+d). In both formulas, c is the number of reads in the first sequencing data that support allele A, and d is the number of reads in the first sequencing data that support the non-A allele.
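The two formulas above can be illustrated with a minimal Python sketch. The function names, the tuple layout of a site and the use of the median to combine sites are assumptions made for illustration; the text only gives the per-site formulas and does not say how sites are combined.

```python
from statistics import median


def site_fetal_fraction(mother_homozygous, c, d):
    """Per-site estimate of the fetal nucleic acid content f.

    c: reads in the first (plasma) sequencing data supporting allele A
    d: reads in the first (plasma) sequencing data supporting the non-A allele
    """
    total = c + d
    if total == 0:
        raise ValueError("site has no reads")
    if mother_homozygous:
        # Mother AA, fetus heterozygous: non-A reads come only from the fetus.
        return 2 * d / total
    # Mother heterozygous, fetus AA: the excess of A over non-A reads is fetal.
    return (c - d) / total


def fetal_fraction(sites):
    """Combine per-site estimates; the median is a hypothetical choice."""
    return median(site_fetal_fraction(*site) for site in sites)
```

For example, a site with 95 A reads and 5 non-A reads where the mother is AA, and a site with 55 A reads and 45 non-A reads where the mother is heterozygous, both correspond to a fetal nucleic acid content of 0.1 under these formulas.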
Determining the fetal nucleic acid content of the pregnant woman's body fluid sample based on the first, second and third sequencing data proceeds as follows. Sites are screened at which the second and third sequencing data show different homozygous genotypes, for example RR in the second sequencing data and rr in the third. By the rules of inheritance, the genotype of the fetal nucleic acid at such a site is Rr. The fetal nucleic acid content of the maternal blood sample is calculated from multiple such sites. Because the fetus carries allele r on only one of its two copies, reads supporting r make up half of the fetal fraction, so the fetal nucleic acid content is f = 2g/(g+h), consistent with the formula f = 2d/(c+d) used above for sites where the mother is homozygous and the fetus is heterozygous; here g is the number of reads in the first sequencing data that support allele r, and h is the number of reads in the first sequencing data that support allele R. The alignment involved in screening the sites, the setting of alignment parameters, the alignment results and so on can follow the description above of estimating the fetal nucleic acid content from the first and second sequencing data.
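A minimal Python sketch of the parent-informed estimate follows. It assumes a simple mixture model in which the fetus is heterozygous at every selected site, so the paternal allele accounts for half of the fetal reads and its read share is f/2; the factor of 2 in the code follows from that model. The function name, the input layout and the pooling of reads across sites are illustrative assumptions.

```python
def fetal_fraction_from_parents(sites):
    """Estimate fetal nucleic acid content from sites where the parents are
    homozygous for different alleles.

    sites: iterable of (mother_genotype, father_genotype, g, h), where
      g = plasma reads supporting the father's allele (r)
      h = plasma reads supporting the mother's allele (R)
    Genotypes are two-character strings such as "AA" or "AG".
    """
    g_total = 0
    h_total = 0
    for mother_gt, father_gt, g, h in sites:
        mother_hom = len(set(mother_gt)) == 1
        father_hom = len(set(father_gt)) == 1
        # Keep only sites of the RR x rr type described in the text.
        if mother_hom and father_hom and mother_gt[0] != father_gt[0]:
            g_total += g
            h_total += h
    if g_total + h_total == 0:
        raise ValueError("no informative sites with reads")
    # Paternal-allele reads are half of the fetal reads under the assumed model.
    return 2 * g_total / (g_total + h_total)
```

Pooling reads before dividing weights deep sites more heavily than shallow ones; averaging per-site estimates would be an equally plausible choice.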
Step 3: construct the target region haplotypes of the parents.
The target region haplotypes of the mother and father are constructed based on the second, third and fourth sequencing data, that is, based on the sequencing data of each parent and the sequencing data of a child of these parents (the proband) known to carry the variation in the target region. The sequencing data of each parent and of the proband are aligned to the reference sequence, and software such as SOAPsnp, GATK or Bowtie is used to identify the SNPs in the target region of the parents and the proband and to obtain the genotype of each SNP. Because the two haplotypes of the proband (two sets of SNPs) consist of one haplotype from the father and one from the mother, the haplotypes of the father and mother are constructed according to the Mendelian laws of inheritance from the genotypes of the parents and the proband at each SNP site, for example using multiple informative SNPs. An informative SNP is one at which the parents have different genotypes, so that the parental haplotype of origin can be distinguished in the next generation. A haplotype tends to be passed to the offspring as a single genetic unit; here, a haplotype is a set of SNPs.
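The Mendelian reasoning in this step can be sketched in Python as follows. The sketch enumerates which allele each parent could have passed to the proband at a SNP and keeps the site only if exactly one combination explains the proband's genotype. The function names and the dictionary layout are hypothetical; genotypes are two-character strings such as "AC".

```python
def phase_site(father, mother, proband):
    """Return ((father_transmitted, father_other), (mother_transmitted, mother_other)),
    or None if the site is ambiguous or violates Mendelian inheritance."""
    target = sorted(proband)
    # All (paternal allele, maternal allele) pairs that reproduce the proband genotype.
    options = {(a, b) for a in father for b in mother if sorted(a + b) == target}
    if len(options) != 1:
        return None
    a, b = options.pop()

    def other(genotype, allele):
        rest = list(genotype)
        rest.remove(allele)  # drop one copy of the transmitted allele
        return rest[0]

    return (a, other(father, a)), (b, other(mother, b))


def transmitted_haplotypes(trio_sites):
    """trio_sites: dict mapping position -> (father, mother, proband) genotypes.
    Returns the paternal and maternal haplotypes transmitted to the proband,
    as dicts of position -> allele, skipping sites that cannot be phased."""
    father_hap, mother_hap = {}, {}
    for pos, (father, mother, proband) in trio_sites.items():
        phased = phase_site(father, mother, proband)
        if phased is None:
            continue
        father_hap[pos] = phased[0][0]
        mother_hap[pos] = phased[1][0]
    return father_hap, mother_hap
```

A site where all three individuals are heterozygous for the same two alleles returns None, because either parent could have contributed either allele.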
It should be noted that embodiments of the present invention do not restrict the order of Step 2 and Step 3. Step 2 may be performed before Step 3, or Step 3 may be performed first to obtain the parental target region haplotypes, followed by Step 2 to determine the fetal nucleic acid content.
Step 4: determine the fetal target region haplotype.
The fetal target region haplotype is determined based on the target region haplotypes of the mother and father and the fetal nucleic acid content. Specifically, multiple sites that are heterozygous on the father's target region haplotypes and homozygous on the mother's target region haplotypes are used to determine which paternal target region haplotype the fetus inherited. If the fetus is heterozygous at such a SNP site, only one type of base can have come from the mother, so the other base at that site must have come from the father. Using multiple such sites, for example once the alleles at more than 10 such sites identify the paternal haplotype of origin, the haplotype that the fetus inherited from the father is determined. The other fetal haplotype can be determined in a similar way, using multiple sites that are heterozygous on the mother's target region haplotypes and homozygous on the father's target region haplotypes. However, because the fetal nucleic acid sample, i.e. the maternal peripheral blood sample, contains a large amount of maternal DNA, the SNP type alone cannot show whether the fetus inherited the maternal haplotype carrying R or the one carrying r, since either allele at such a site may come solely from the mother. The maternal haplotype inherited by the fetus is therefore determined in combination with the fetal nucleic acid content. For polymorphic sites that are homozygous on the father's haplotypes and heterozygous on the mother's haplotypes, each such site in the maternal peripheral blood sample can be written Rr, where R and r denote a pair of alleles and R is taken to be the allele for which the father is homozygous. If multiple such sites all satisfy R/r = (1+x%)/(1-x%), the fetus is determined to have inherited the maternal haplotype carrying allele R; if multiple such sites all satisfy R/r = 1, the fetus is determined to have inherited the maternal haplotype carrying allele r. Here x% denotes the fetal nucleic acid content, and R/r is the number of aligned reads supporting R in the first sequencing data divided by the number of aligned reads supporting r in the first sequencing data. The haplotypes of the fetus are thereby determined.
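The maternal decision rule can be sketched in Python as a comparison of two hypotheses across sites. This is only an illustration: it assumes R is the allele for which the father is homozygous, so that inheriting the maternal R haplotype gives an expected R read share of (1+f)/2 and inheriting the maternal r haplotype gives 1/2, and it assumes a binomial read-count model summed over sites, which the text does not specify. The function name and input layout are hypothetical.

```python
import math


def maternal_haplotype_call(sites, fetal_fraction):
    """Decide which maternal haplotype the fetus inherited.

    sites: iterable of (reads_R, reads_r) from the plasma (first) sequencing data,
           at sites heterozygous in the mother and homozygous (RR) in the father.
    fetal_fraction: fetal nucleic acid content f, with 0 < f < 1.
    Returns ("R" or "r", summed log-likelihood ratio).
    """
    if not 0 < fetal_fraction < 1:
        raise ValueError("fetal_fraction must be between 0 and 1")
    p_r_hap = 0.5                          # expected R share if fetus inherited r: R/r = 1
    p_R_hap = (1 + fetal_fraction) / 2     # expected R share if fetus inherited R
    llr = 0.0
    for n_R, n_r in sites:
        # Binomial log-likelihoods; the binomial coefficient cancels in the ratio.
        ll_R = n_R * math.log(p_R_hap) + n_r * math.log(1 - p_R_hap)
        ll_r = (n_R + n_r) * math.log(p_r_hap)
        llr += ll_R - ll_r
    return ("R" if llr > 0 else "r"), llr
```

Summing evidence over many linked sites is what makes the call robust: a single site with a few hundred reads barely separates a ratio of 1.22 (f = 10%) from a ratio of 1, whereas a dozen such sites do.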
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory, random access memory, a magnetic disk, an optical disc and the like.
According to another embodiment of the present invention, an apparatus for determining the haplotype of a fetal target region is provided, which can carry out some or all of the steps of the method in the above embodiment. As shown in Fig. 1, the apparatus 1000 comprises: a sequencing unit 100, for obtaining the cell-free nucleic acid in the pregnant woman's body fluid, capturing the target region and sequencing the captured target region to obtain the first sequencing data, and for capturing the same target region from the nucleic acid of the fetus's family members and sequencing it to obtain the family members' sequencing data, which include the second, third and fourth sequencing data, corresponding to the sequencing data of the same target region of the mother of the fetus, the father of the fetus and the proband, respectively; a fetal nucleic acid content determination unit 200, connected to the sequencing unit 100, for determining the fetal nucleic acid content of the pregnant woman's body fluid sample based on the first and second sequencing data, or based on the first, second and third sequencing data; a parental haplotype determination unit 300, connected to the sequencing unit 100, for constructing the target region haplotypes of the mother and father based on the second, third and fourth sequencing data; and a fetal haplotype determination unit 400, connected to the fetal nucleic acid content determination unit 200 and the parental haplotype determination unit 300, for determining the fetal target region haplotype based on the target region haplotypes of the mother and father and the fetal nucleic acid content. The description of the technical features and advantages of the method in the above embodiment applies equally to the apparatus of this embodiment and is not repeated here.
The following describes in detail, and shows the results of, determining the target region haplotype and genotype of specific samples by the method of the present invention, and the use of the haplotype or genotype once determined. The examples below serve only to explain the present invention and are not to be construed as limiting it. Terms such as "first", "second" and "third" are used in the present invention only for convenience of description; they are not to be understood as indicating or implying relative importance, nor as implying any order. In the description of the present invention, unless otherwise stated, "multiple" means two or more.
Unless otherwise stated, the reagents, sequences (adapters, tags and primers), software and instruments not specifically described in the following examples are all conventional commercial products or publicly available, for example the library construction kits for the HiSeq 2000 sequencing platform purchased from Illumina, which are used for sequencing library construction.
General methods:
1. Selection of the target capture region and design of the probes
The target capture region includes the SMN1 exon regions, and the high-heterozygosity SNP sites within the SMN1 gene and in the 3 Mb regions upstream and downstream of it are captured and sequenced. The SNPs are selected with reference to the dbSNP database, choosing SNP sites for which the number of reference chromosomes sampled is greater than 100 and the MAF is between 0.3 and 0.5. In addition, to ensure accurate detection, the 63-mer sequence containing each SNP site must align uniquely to the genome and have a GC content of 40%-50%. The captured SMN1 regions are shown in Table 1 and Table 2.
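The selection criteria above can be expressed as a simple filter. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the uniqueness of the 63-mer alignment is passed in as a precomputed flag rather than checked against a genome, and the MAF and chromosome count are assumed to have been read from dbSNP beforehand.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def keep_snp(maf, chrom_count, flank_63mer, aligns_uniquely):
    """Apply the stated criteria to one candidate SNP.

    maf:             minor allele frequency of the SNP
    chrom_count:     number of reference chromosomes sampled for the SNP
    flank_63mer:     the 63-base sequence containing the SNP site
    aligns_uniquely: True if the 63-mer aligns to exactly one place in the
                     genome (assumed to be determined by a separate alignment step)
    """
    return (
        0.3 <= maf <= 0.5
        and chrom_count > 100
        and len(flank_63mer) == 63
        and aligns_uniquely
        and 0.40 <= gc_fraction(flank_63mer) <= 0.50
    )
```

Filtering on MAF near 0.5 favors sites likely to be heterozygous in at least one parent, which are the sites that carry haplotype information in the later steps.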
2. Acquisition of the family's pathogenic haplotype
Through bioinformatic analysis, the SNP genotypes of the pregnant woman, her husband and the proband in the target gene and its upstream and downstream regions are determined. Linkage analysis of the three individuals' SNP genotypes identifies the SNP alleles closely linked to the pathogenic mutation, and from these the haplotype linked to the pathogenic mutation is obtained. The overall technical route is shown in Fig. 2.
(1) Genomic DNA is extracted from the peripheral blood of the pregnant woman, her husband and the proband, and its quality is checked by electrophoresis and OD measurement.
(2) Target region capture libraries are prepared from the genomic DNA that passes quality control. For library preparation, 1 μg of genomic DNA is fragmented into small fragments with a main band of 200-300 bp; the fragments are end-repaired, an "A" base is added at the 3' end, and the fragments are ligated to special adapters carrying a "T" base at the 3' end. The library built by non-captured PCR (PCR before capture) is enriched for the exons of the selected genes and their ±30 bp flanking regions using the SMN1 target region capture probes, the enriched product is amplified by PCR, and finally the sequence capture hybridization efficiency is obtained by qPCR of the PCR products before and after hybridization.
(3) The sample libraries obtained are sequenced on a high-throughput sequencer so that the average sequencing depth of the target region reaches 200× or more.
(4) by analysis of biological information, sequencing information is analyzed and is studied, to obtain the mononucleotide of related gene The hereditary variations information such as variation (SNV), the insertion of a small number of bases and missing (InDel).And clear and target pathogenic mutation to be checked The SNP information of phase linkage inheritance, that is, cause a disease haplotype.Assuming that propositus obtains a pathogenic mutation from parent both sides respectively, if,
1) Suppose the genotype at a site outside the proband's disease-causing gene is AA in the proband, AC in the father and AA in the mother. Then the proband inherited A from the father and A from the mother, and both of these SNP alleles are inherited in linkage with the pathogenic mutation, whereas C in the father is linked to the non-pathogenic allele;
2) Suppose the genotype at a site outside the proband's disease-causing gene is AC in the proband, AC in the father and AA in the mother. Then the proband inherited C from the father and A from the mother, and both of these SNP alleles are inherited in linkage with the pathogenic mutation, whereas A in the father is linked to the non-pathogenic allele;
3) Suppose the genotype at a site outside the proband's disease-causing gene is AC in the proband, AA in the father and AC in the mother. Then the proband inherited A from the father and C from the mother, and both of these SNP alleles are inherited in linkage with the pathogenic mutation, whereas A in the mother is linked to the non-pathogenic allele.
Applying the above inference to the SNP sites of the SMN1 gene and the 3 Mb regions on both sides yields haplotype information over a defined range for each individual, and thus the haplotype linked to the pathogenic mutation in this region. From this, the SNPs closely linked to the non-pathogenic allele can be further inferred.
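The trio inference in cases 1) to 3) can be sketched in code. This is an illustrative sketch under the stated assumption that the proband carries one pathogenic mutation from each parent; the function name `phase_site` and the dictionary keys are hypothetical. A site is treated as uninformative when more than one transmission pattern is consistent with the genotypes, for example when all three individuals are heterozygous.

```python
from typing import Dict, Optional


def phase_site(proband: str, father: str, mother: str) -> Optional[Dict[str, str]]:
    """Resolve which parental alleles the proband inherited at one SNP.

    Genotypes are two-letter strings such as "AC". The transmitted alleles
    are taken to be linked to the pathogenic mutation and the untransmitted
    ones to the non-pathogenic allele. Returns None when the transmission is
    ambiguous or the genotypes are not Mendelian-consistent.
    """
    solutions = set()
    for pat in father:
        for mat in mother:
            if sorted(pat + mat) == sorted(proband):
                solutions.add((pat, mat))
    if len(solutions) != 1:
        return None
    pat, mat = solutions.pop()
    return {
        "paternal_linked": pat,                          # on the father's pathogenic haplotype
        "paternal_unlinked": father.replace(pat, "", 1),  # on the father's other haplotype
        "maternal_linked": mat,                          # on the mother's pathogenic haplotype
        "maternal_unlinked": mother.replace(mat, "", 1),  # on the mother's other haplotype
    }
```

For case 2), `phase_site("AC", "AC", "AA")` reports C as the paternal allele linked to the pathogenic mutation and A as the paternal allele on the other haplotype. Sites where a parent is homozygous are resolved but carry no information for distinguishing that parent's two haplotypes.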
3. Target region capture sequencing of maternal plasma DNA
Target region capture sequencing is performed on maternal plasma DNA, followed by bioinformatic SNP/indel analysis. The correctness of the family relationships and the fetal DNA content serve as quality-control checkpoints, and only samples that pass quality control proceed to subsequent analysis. The maternal plasma DNA sequencing data are genotyped and combined with the family haplotypes for linkage analysis, to judge whether the fetus has inherited the parents' pathogenic haplotypes.
(1) Cell-free DNA is extracted from 1.2 ml of maternal plasma and quantified with Qubit for quality control.
(2) Target region capture libraries are prepared from the DNA that passes quality control. The DNA fragments are first end-repaired, a base "A" is added at the 3' end, and the fragments are ligated to special adapters carrying a "T" base at the 3' end. The library built by Non-Captured PCR is enriched, using the SMN1 target region capture probes, for the exons of the selected gene and their ±30 bp flanking regions, and the enriched product is then amplified by PCR. Finally, the sequence capture hybridization efficiency is obtained by qPCR of the PCR products before and after hybridization.
(3) The sample libraries obtained are sequenced on a high-throughput sequencer so that the average sequencing depth of the target region reaches 500× or more.
4. Inference of the fetal genotype
(1) Bioinformatic analysis of the sequencing data yields the genetic variation of the relevant genes, such as single nucleotide variants (SNVs) and small insertions and deletions (InDels).
(2) The content f of fetal DNA in the plasma DNA is calculated as follows:
a) Assume that the mother's leukocyte DNA genotype is AA and the fetal genomic DNA genotype is AT; the alleles observable in plasma are then A and T. If the number of reads supporting A is c and the number of reads supporting T is d, then f = 2d/(c+d);
b) Assume that the mother's leukocyte DNA genotype is AT and the fetal genomic DNA genotype is AA; the alleles observable in plasma are then A and T. If the number of reads supporting A is c and the number of reads supporting T is d, then f = (c-d)/(c+d).
A sample is considered to pass quality control if its fetal DNA content is > 3%, and it then proceeds to the subsequent experiments.
(3) The genotype that the fetus inherited from the father is determined as follows:
a) Sites at which the mother is homozygous and the father is heterozygous are selected to determine the paternally inherited haplotype. Suppose that at a SNP site the maternal genotype is AA and the paternal genotype is AC. If the SNP calls from the plasma sequencing data are A and C, and the content of C agrees with the estimated fetal concentration, the fetus inherited from the father the allele carrying SNP C;
b) All SNPs in the SMN1 capture region that satisfy condition a) are used to determine the SNPs the fetus obtained from the father, which together constitute the haplotype the fetus inherited from the father. Using the information from 2-(4), it is determined whether this haplotype is linked to the pathogenic mutation, and hence whether the fetus inherited the pathogenic allele from the father.
(4) The genotype that the fetus inherited from the mother is determined as follows:
Sites at which the mother is heterozygous and the father is homozygous are selected to determine the maternally inherited haplotype. Suppose that at a SNP site the maternal genotype is AC and the paternal genotype is AA, and that the SNP calls from the plasma sequencing data are A and C. If the fetus inherited the A allele from the mother, the fetal genotype is AA and the observed A/C ratio is approximately (1+f)/(1-f); if the fetus inherited the C allele, the fetal genotype is AC and the observed A/C ratio is approximately 1 (each allele accounting for about 0.5 of the reads). A binomial distribution model is built on the numbers of reads supporting each allele, the probabilities of inheriting A and C are calculated, and the relative probabilities Pa and Pc (Pa + Pc = 1) are obtained. The probabilities of all SNP sites are then used to build an HMM, and the Viterbi algorithm (Lawrence R. Rabiner, Proceedings of the IEEE, Vol. 77, No. 2, February 1989) is used to determine the haplotype the fetus inherited from the mother. According to whether that haplotype is linked to the pathogenic mutation, it is determined whether the fetus inherited the pathogenic allele from the mother.
(5) The results of (3) and (4) are combined to obtain the genotype of the fetus.
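The fetal DNA content formulas in step (2), the 3% quality-control threshold, and the paternal-allele check in step (3) can be sketched as follows. This is a minimal illustration, not the patented implementation. The text only says that the content of C must agree with the estimated fetal concentration, so the `tolerance` parameter and the comparison against an expected fraction of f/2 are assumptions introduced here.

```python
def fetal_fraction_mother_hom(c: int, d: int) -> float:
    """Case a): mother AA, fetus AT. c reads support A, d reads support T."""
    return 2 * d / (c + d)


def fetal_fraction_mother_het(c: int, d: int) -> float:
    """Case b): mother AT, fetus AA. c reads support A, d reads support T."""
    return (c - d) / (c + d)


def passes_qc(f: float, threshold: float = 0.03) -> bool:
    """A sample passes quality control when the fetal DNA content exceeds 3%."""
    return f > threshold


def paternal_allele_transmitted(alt_reads: int, total_reads: int, f: float,
                                tolerance: float = 0.5) -> bool:
    """Mother AA, father AC: decide whether C was transmitted to the fetus.

    A transmitted paternal-specific allele is expected at a read fraction of
    about f/2. `tolerance` (a hypothetical parameter) is the allowed relative
    deviation from that expectation.
    """
    if total_reads == 0:
        return False
    expected = f / 2
    observed = alt_reads / total_reads
    return abs(observed - expected) <= tolerance * expected
```

For example, 95 reads supporting A and 5 supporting T at a case a) site give f = 0.1, which passes the 3% threshold. In practice the estimate would be aggregated over many sites instead of being taken from a single one.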
Embodiment
Noninvasive prenatal genetic testing was performed on one pregnant woman (Tianjin healthcare hospital for women & children) at high risk of having a second child affected by SMN1-related disease. The pregnant woman and her husband are both heterozygous carriers of the SMN1 exon 7 deletion mutation and have had one child who is homozygous for the SMN1 mutation. During the current second pregnancy, maternal peripheral blood was drawn and the plasma was separated promptly; the plasma DNA and the genomic DNA of the pregnant woman, her husband and the proband were then subjected to capture sequencing to analyse the genetic status of the current fetus.
Sample DNA was extracted by the salting-out method, and large-fragment DNA was fragmented by sonication. The fragmentation method currently used is the Covaris method, which shears the sample DNA into fragments in the 100-700 bp range. (Note: fragmentation is generally considered ideal when the main band of the library insert to be prepared lies at 200-250 bp; if the fragmentation is unsatisfactory, it must be repeated.)
Plasma DNA was extracted by the salting-out method, quantified with Qubit, and used directly for library construction.
1. Library preparation
1.1 End repair and purification
After the prepared mix is vortexed thoroughly, 25 μL of enzyme reaction mix is added to each reaction.
Reaction conditions: 20 °C, 30 min
The product is purified with 180 μL Ampure Beads, and the recovered DNA is dissolved in 30 μL of water (of which 1.9 μL is loss).
1.2 Addition of "A" at the ends (A-Tailing)
After the prepared mix is vortexed thoroughly, 6.9 μL of enzyme reaction mix is added to each tube.
Reaction conditions: 20 °C, 30 min
Note: no purification is performed after A-tailing.
1.3 Adapter ligation and purification
After the prepared mix is vortexed thoroughly, 15 μL of enzyme reaction mix is added to each reaction.
Reaction conditions: 16 °C, 12-16 h (overnight)
The product is purified with 75 μL Ampure Beads, and the recovered DNA is dissolved in 35 μL of water (of which 2 μL is loss).
1.4 Non-Captured sample Pre-LM-PCR
PCR program:
2. Chip hybridization and target region capture enrichment
In this experiment, hybridization and elution were carried out according to the NimbleGen operating instructions to obtain the target genes, which were then enriched by PCR.
3. Sequencing
Sequencing in this experiment was performed on a HiSeq 2000 or HiSeq 2500 with the PE101+8+101 program.
4. Information analysis
The sequencer produces the raw short reads;
Adapters and low-quality data are removed from the sequencing data;
The short reads are mapped to their corresponding positions on the human genome;
Sequencing statistics are compiled, including the number of short reads, the size of the target region covered and the average sequencing depth;
Single nucleotide calls with low quality values or low coverage are filtered out;
Annotation determines the gene, coordinates, amino acid change and so on for each mutation site;
The genotype of each SNP in the SMN1 capture region is determined.
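Two of the steps listed above, compiling coverage statistics and filtering low-quality or low-coverage calls, can be illustrated with a short sketch. The input layout (a list of per-base depths, and calls as dictionaries with `qual` and `depth` keys) and the default thresholds are hypothetical; the source does not state the cut-offs that were used.

```python
def coverage_summary(depths, min_depth=1):
    """Summarise per-base sequencing depths across the target region."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "target_size": len(depths),                 # number of target bases
        "covered_fraction": covered / len(depths),  # share of bases with depth >= min_depth
        "mean_depth": sum(depths) / len(depths),    # average sequencing depth
    }


def filter_calls(calls, min_quality=20, min_depth=10):
    """Keep only single nucleotide calls that meet both assumed thresholds."""
    return [c for c in calls
            if c["qual"] >= min_quality and c["depth"] >= min_depth]
```

The mean depth reported by `coverage_summary` is the quantity compared against the 200× and 500× targets stated for genomic and plasma libraries.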
5. Result analysis
1) Data output
As shown in Table 3, the average sequencing depth of the target region was 100× or more for all samples, and the plasma sequencing depth reached 271×.
Table 3. Data output
2) SNP phasing
The SNP sites of the father, mother and proband within 1 Mb upstream and downstream of the SMN1 gene were used to construct the proband's haplotypes. Table 4 lists the number of SNPs in this region whose haplotype assignment was successfully determined (phased SNPs). These phased SNPs were subsequently used to determine the paternally inherited haplotype (SNP used for Pat-Hap) and the maternally inherited haplotype (SNP used for Mat-Hap).
Table 4. Statistics of phased SNPs in the SMN1 gene region
3) Analysis of fetal DNA content in plasma
Sites at which the father is heterozygous and the mother is homozygous were selected to estimate the fetal DNA content in plasma. Assuming the maternal genotype is AA and the fetal genotype is AT, if the number of reads measured as A is a and the number of reads measured as T is b, the fetal DNA content in plasma is c = 2b/(a+b). The results show that the fetal DNA content of this plasma sample is 0.0930.
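As an illustration of what this estimate implies, the arithmetic below works through the expected read fractions for a fetal DNA content of 0.0930. It is a sketch only: the per-site depth of 271 reads is an assumption borrowed from the average plasma depth reported in the data output section, and actual per-site depths will vary.

```latex
% Illustrative arithmetic; N = 271 is an assumed per-site depth.
\begin{align*}
c = \frac{2b}{a+b} = 0.0930
  \quad &\Rightarrow \quad \frac{b}{a+b} = \frac{c}{2} = 0.0465, \\
b &\approx 0.0465 \times 271 \approx 12.6 \text{ reads per site}, \\
\frac{1+c}{1-c} &= \frac{1.0930}{0.9070} \approx 1.205 .
\end{align*}
```

The last line is the allele ratio expected at a maternally heterozygous site when the fetus is homozygous, compared with a ratio of about 1 when the fetus is heterozygous. The small gap between the two is why evidence is combined across many SNPs.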
4) Determination of the fetal genotype
The maternal plasma data of SMA family 1 were analysed, and the SMN1 genetic status of the fetus in this pregnancy was inferred with an HMM algorithm. Specifically, the fetal haplotypes Hap0 and Hap1 were taken as the hidden states, and the SNPs whose haplotype assignment was successfully determined were taken as the observation sequence (observations). The recombination probability between adjacent SNPs, calculated from the SNP positions and the genetic map, gave the state transition probabilities, and the relative probabilities with which each SNP site supports Hap0 or Hap1 (emission probabilities) were calculated from the supporting read counts. The Viterbi algorithm was then used to infer the haplotype arrangement supported by the SNPs as a whole, giving the most probable fetal haplotype combination. The procedure may follow Chen S, Ge H, Wang X, et al. Haplotype-assisted accurate non-invasive fetal whole genome recovery through maternal plasma sequencing. Genome Med. 2013, 5(2): 18.
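The HMM described above can be sketched for the maternal haplotype as follows. This is a minimal illustration under stated assumptions, not the cited implementation: each site is a hypothetical tuple `(reads_hap0, reads_hap1, shared_hap)`, where the mother is heterozygous, the father is homozygous, and `shared_hap` is the maternal haplotype (0 or 1) carrying the same allele as the father. Emissions use a binomial model, initial state probabilities are assumed equal, and the recombination probabilities between adjacent SNPs are assumed to be supplied and to lie strictly between 0 and 1.

```python
import math


def log_binom(k: int, n: int, p: float) -> float:
    """Log probability of k successes in n binomial trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def log_emission(site, state: int, f: float) -> float:
    """Log likelihood of the read counts if the fetus inherited maternal hap `state`."""
    n0, n1, shared = site
    k = n0 if shared == 0 else n1  # reads of the allele shared with the father
    # Fetus homozygous for the shared allele: fraction (1+f)/2; heterozygous: 0.5
    p = (1 + f) / 2 if state == shared else 0.5
    return log_binom(k, n0 + n1, p)


def relative_probs(site, f: float):
    """Relative probabilities (P0, P1), with P0 + P1 = 1, for one site."""
    l0, l1 = log_emission(site, 0, f), log_emission(site, 1, f)
    m = max(l0, l1)
    e0, e1 = math.exp(l0 - m), math.exp(l1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)


def viterbi(sites, f: float, recomb):
    """Most likely inherited haplotype (0 or 1) at each site.

    `sites` must be non-empty; `recomb[i]` is the recombination probability
    between site i and site i + 1.
    """
    score = [math.log(0.5) + log_emission(sites[0], s, f) for s in (0, 1)]
    back = []
    for i in range(1, len(sites)):
        r = recomb[i - 1]
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[prev] + math.log(1 - r if prev == s else r)
                     for prev in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + log_emission(sites[i], s, f))
        score = new_score
        back.append(pointers)
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):  # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

Because switching haplotypes between adjacent SNPs is penalised by the small recombination probability, a single noisy site is unlikely to flip the inferred haplotype, which is the reason for decoding the sites jointly and not one at a time.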
To avoid the influence of repetitive sequence regions on the results, only unique sequence regions were analysed. The results are shown in Fig. 3. Each point in the figure represents, for one SNP site, the difference between the probability of inheritance from paternal/maternal Hap0 and the probability of inheritance from paternal/maternal Hap1, and each small circle is a combined judgment result. A line of small circles above the central baseline indicates a final judgment of inheritance from Hap0, and a line of small circles below the central baseline indicates a final judgment of inheritance from Hap1. As can be seen from Fig. 3, Pat-Hap0 and Mat-Hap0 are the haplotypes carrying the pathogenic mutation in the father and the mother, respectively, and Pat-Hap1 and Mat-Hap1 are the haplotypes not carrying the pathogenic mutation in the father and the mother, respectively. The inference shows that the fetus inherited Pat-Hap1 and Mat-Hap1 from its parents, i.e. the chromosomes that do not carry the SMN1 pathogenic mutation, indicating that the fetus has no SMN1 deletion.

Claims (8)

1. A method for determining the haplotype of a fetal target region, the method being used for non-disease-diagnostic purposes, characterized in that it comprises:
Sequencing the target region of cell-free nucleic acid in a body fluid of a pregnant woman to obtain first sequencing data;
Sequencing the target region of family members of the fetus to obtain second sequencing data, third sequencing data and fourth sequencing data, wherein the second sequencing data are the sequencing data of the fetus's mother, the third sequencing data are the sequencing data of the fetus's father, and the fourth sequencing data are the sequencing data of the proband;
Determining the fetal nucleic acid content in the pregnant woman's body fluid based on the first sequencing data, the second sequencing data and, optionally, the third sequencing data;
Constructing, based on the second sequencing data, the third sequencing data and the fourth sequencing data, the target region haplotype of the fetus's mother and the target region haplotype of the fetus's father, respectively; and
Determining the target region haplotype of the fetus based on the target region haplotype of the fetus's mother, the target region haplotype of the fetus's father and the fetal nucleic acid content;
Wherein the fetal nucleic acid content is determined by the following steps:
Determining sites that have different homozygous genotypes in the second sequencing data and the third sequencing data, wherein RR and rr denote the different homozygous genotypes, and R and r are a pair of alleles,
Determining the fetal nucleic acid content based on the formula f=g/(g+h),
Wherein,
g is the number of reads supporting allele r in the first sequencing data, and h is the number of reads supporting allele R in the first sequencing data;
Determining the target region haplotype of the fetus comprises:
Using multiple sites that are heterozygous in the father's target region haplotype and homozygous in the mother's target region haplotype to determine the paternal target region haplotype inherited by the fetus, and using multiple sites that are homozygous in the father's target region haplotype and heterozygous in the mother's target region haplotype, together with the fetal nucleic acid content, to determine the maternal target region haplotype inherited by the fetus;
Wherein, for the multiple sites that are homozygous in the father's target region haplotype and heterozygous in the mother's target region haplotype: if multiple such sites satisfy R/r=(1+x%)/(1-x%), it is determined that the fetus inherited the target region haplotype carrying the mother's allele R; if multiple such sites satisfy R/r=1, it is determined that the fetus inherited the target region haplotype carrying the mother's allele r; R and r denote a pair of alleles, x% denotes the fetal nucleic acid content, and R/r = (number of reads supporting R in the first sequencing data)/(number of reads supporting r in the first sequencing data).
2. The method of claim 1, characterized in that sequencing the target region of cell-free nucleic acid in the pregnant woman's body fluid comprises:
Capturing the cell-free nucleic acid with probes that specifically recognize the target region.
3. The method of claim 2, characterized in that the probes are provided in the form of a chip.
4. The method of claim 2, characterized in that the probes include SNP site probes, and the SNP site probes align uniquely to the reference genome.
5. The method of claim 2, characterized in that the GC content of the probes is 40-50%.
6. A device for determining the haplotype of a fetal target region, comprising:
A sequencing unit for sequencing the target region of cell-free nucleic acid in a body fluid of a pregnant woman to obtain first sequencing data, and for sequencing the target region of family members of the fetus to obtain second sequencing data, third sequencing data and fourth sequencing data, wherein the second sequencing data are the sequencing data of the fetus's mother, the third sequencing data are the sequencing data of the fetus's father, and the fourth sequencing data are the sequencing data of the proband;
A fetal nucleic acid content determination unit, connected to the sequencing unit, for determining the fetal nucleic acid content in the pregnant woman's body fluid based on the first sequencing data, the second sequencing data and, optionally, the third sequencing data;
A parental haplotype determination unit, connected to the sequencing unit, for constructing, based on the second sequencing data, the third sequencing data and the fourth sequencing data, the target region haplotype of the fetus's mother and the target region haplotype of the fetus's father, respectively; and
A fetal haplotype determination unit, connected to the fetal nucleic acid content determination unit and the parental haplotype determination unit, for determining the target region haplotype of the fetus based on the target region haplotype of the fetus's mother, the target region haplotype of the fetus's father and the fetal nucleic acid content;
Wherein the fetal nucleic acid content determination unit determines the fetal nucleic acid content by the following steps:
Determining sites that have different homozygous genotypes in the second sequencing data and the third sequencing data, wherein RR and rr denote the different homozygous genotypes, and R and r are a pair of alleles,
Determining the fetal nucleic acid content based on the formula f=g/(g+h),
Wherein,
g is the number of reads supporting allele r in the first sequencing data, and h is the number of reads supporting allele R in the first sequencing data;
Determining the target region haplotype of the fetus comprises:
Using multiple sites that are heterozygous in the father's target region haplotype and homozygous in the mother's target region haplotype to determine the paternal target region haplotype inherited by the fetus, and using multiple sites that are homozygous in the father's target region haplotype and heterozygous in the mother's target region haplotype, together with the fetal nucleic acid content, to determine the maternal target region haplotype inherited by the fetus;
Wherein, for the multiple sites that are homozygous in the father's target region haplotype and heterozygous in the mother's target region haplotype: if multiple such sites satisfy R/r=(1+x%)/(1-x%), it is determined that the fetus inherited the target region haplotype carrying the mother's allele R; if multiple such sites satisfy R/r=1, it is determined that the fetus inherited the target region haplotype carrying the mother's allele r; R and r denote a pair of alleles, x% denotes the fetal nucleic acid content, and R/r = (number of reads supporting R in the first sequencing data)/(number of reads supporting r in the first sequencing data).
7. The device of claim 6, characterized in that the target region includes the exon regions of the SMN1 gene.
8. The device of claim 7, characterized in that the target region further includes SNP sites with a minor allele frequency of 0.3-0.5 located within the SMN1 gene and in the 3 Mb regions upstream and downstream of the SMN1 gene.
CN201410639577.7A 2014-11-13 2014-11-13 The method and apparatus for determining fetus target area haplotype Active CN105648045B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410639577.7A CN105648045B (en) 2014-11-13 2014-11-13 The method and apparatus for determining fetus target area haplotype

Publications (2)

Publication Number Publication Date
CN105648045A CN105648045A (en) 2016-06-08
CN105648045B true CN105648045B (en) 2019-10-11

Family

ID=56478696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410639577.7A Active CN105648045B (en) 2014-11-13 2014-11-13 The method and apparatus for determining fetus target area haplotype

Country Status (1)

Country Link
CN (1) CN105648045B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113345518A (en) * 2021-08-02 2021-09-03 北京嘉宝仁和医疗科技有限公司 Haplotype construction method of monogenic disease independent of proband or referent

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108220403B (en) * 2017-12-26 2021-07-06 北京科迅生物技术有限公司 Method and device for detecting specific mutation site, storage medium and processor
CN110699436B (en) * 2018-07-10 2023-07-21 天津华大医学检验所有限公司 Method and system for determining whether seven-exon deletion exists in SMN1 gene of sample to be tested
CN113056563A (en) * 2018-09-03 2021-06-29 拉莫特特拉维夫大学有限公司 Method and system for identifying gene abnormality in blood
KR102400195B1 (en) * 2019-01-25 2022-05-20 주식회사 지니얼로지 Method of predicting a genotype using snp data
CN111560424A (en) * 2019-02-13 2020-08-21 广州医科大学附属第一医院 Detectable target nucleic acid, probe, method for determining fetal F8 gene haplotype and application
CN110444251B (en) * 2019-07-23 2023-09-22 中国石油大学(华东) Monomer style generating method based on branch delimitation
CN110349631B (en) * 2019-07-30 2021-10-29 苏州亿康医学检验有限公司 Analysis method and device for determining haplotype of offspring object
CN112634986A (en) * 2019-09-24 2021-04-09 厦门希吉亚生物科技有限公司 Noninvasive identification method for twins zygote property based on peripheral blood of pregnant woman

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102770558A (en) * 2009-11-05 2012-11-07 香港中文大学 Fetal genomic analysis from a maternal biological sample

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2473638B1 (en) * 2009-09-30 2017-08-09 Natera, Inc. Methods for non-invasive prenatal ploidy calling

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102770558A (en) * 2009-11-05 2012-11-07 香港中文大学 Fetal genomic analysis from a maternal biological sample

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Application of fetal DNA detection in maternal plasma:A prenatal diagnosis unit experience;Cristina Conzalez-Gonzalez;《Journal of Histochemistry》;20050301;第53卷(第3期);第307-314页 *
检测孕妇血浆中游离胎儿DNA行产前诊断的研究进展;洪萍;《临床检验杂志》;20061231;第24卷(第3期);第225-227页 *

Also Published As

Publication number Publication date
CN105648045A (en) 2016-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant