CN113337600A - Method for detecting triploid and ROH in chromosome based on low-depth sequencing method - Google Patents

Method for detecting triploid and ROH in chromosome based on low-depth sequencing method

Info

Publication number
CN113337600A
Authority
CN
China
Prior art keywords
sample
roh
triploid
snp sites
sequencing data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110878235.0A
Other languages
Chinese (zh)
Other versions
CN113337600B (en)
Inventor
费嘉
刘沙沙
孙蕾
寇帅
金治平
杨群
张倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiabao Renhe Medical Technology Co ltd
Original Assignee
Peking Jabrehoo Technoiogy Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking Jabrehoo Technoiogy Co ltd
Priority to CN202110878235.0A
Publication of CN113337600A
Application granted
Publication of CN113337600B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for detecting triploidy and runs of homozygosity (ROH) in chromosomes based on low-depth sequencing, comprising the following steps: obtaining genome sequencing data of a sample to be tested based on an enzymatic digestion method; extracting the SNP sites in the genome sequencing data and recording their number as m; counting the number n of heterozygous SNP sites among the SNP sites according to the genotype of each SNP site; and judging whether the sample to be tested is triploid or whole-genome ROH: if 0 < n/m ≤ p, the sample has too few heterozygous SNP sites and is judged to be whole-genome ROH; if p < n/m < 1, whether the sample is triploid is judged from the number of SNPs in different allele frequency (AF) intervals, where 0 < p < 0.9. The method offers strong compatibility, wide applicability, and low cost, requires no probe design, and can judge triploidy and ROH simultaneously at ultra-low sequencing depth.

Description

Method for detecting triploid and ROH in chromosome based on low-depth sequencing method
Technical Field
The invention belongs to the technical field of gene detection and relates to a method for detecting whether a chromosome set is triploid or shows ROH, in particular to a method for detecting triploidy and ROH in chromosomes based on low-depth sequencing.
Background
It is well known that normal human cells contain two sets of chromosomes, one from the father and one from the mother. Triploidy, the presence of an extra chromosome set in fetal cells, is a serious chromosomal abnormality and one of the important causes of early miscarriage. A run of homozygosity (ROH) is a continuous loss of heterozygosity across a certain range of a genomic region. The presence of ROH in a chromosome indicates that uniparental disomy (UPD) may exist, and when UPD occurs on specific chromosomes it can cause disease through genetic imprinting effects. Furthermore, the risk of Mendelian recessive genetic disease within an ROH region is significantly increased.
In preimplantation genetic screening, detecting whether chromosomes carry triploidy or ROH can prevent some miscarriages, reduce the birth of affected children, and spare patients' families unnecessary time and expense. In obstetric histological diagnosis, detecting triploidy and ROH helps determine the genetic cause of a miscarriage and improves the diagnostic yield for miscarriage.
At present, the existing triploidy detection methods include karyotyping, fluorescence in situ hybridization (FISH), quantitative fluorescent PCR (QF-PCR), SNP chips, next-generation sequencing, and the like. Among these methods:
karyotyping, FISH, and QF-PCR are low-throughput techniques that cannot examine all chromosomes simultaneously; they can indicate triploidy through specifically designed probes but cannot indicate uniparental disomy (UPD);
the MS-MLPA technology can detect UPD, but only UPD of specific types, and cannot indicate triploidy;
among chip technologies, array CGH can detect copy number abnormalities but cannot detect triploidy or ROH; SNP arrays can detect triploidy and ROH but place higher demands on the sample to be tested and cost more;
among sequencing-based methods for detecting chromosome copy number, CNV-seq has a low sequencing depth and cannot detect triploidy or ROH; STR analysis can distinguish triploids from normal diploids, but STR markers, although highly polymorphic, are not uniformly distributed in the genome.
Therefore, given the low throughput, complex operation, and high cost of the detection products currently on the market, a method capable of simultaneously detecting triploidy and ROH is urgently needed.
Disclosure of Invention
The invention aims to provide a method for detecting triploidy and ROH in chromosomes based on low-depth sequencing. The method has strong applicability and suits various library construction modes and sequencing instruments; it also has low detection cost, high throughput, and strong extensibility, and can detect a large number of SNP sites across the range of the genome sequencing data.
The technical scheme for realizing this purpose is as follows: a method for detecting triploidy and ROH in chromosomes based on low-depth sequencing, comprising the following steps:
obtaining genome sequencing data of a sample to be tested based on an enzymatic digestion method;
extracting the SNP sites in the genome sequencing data and recording their number as m;
counting the number n of heterozygous SNP sites among the SNP sites according to the genotype of each SNP site;
judging whether the sample to be tested is triploid or whole-genome ROH:
if 0 < n/m ≤ p, the sample to be tested is whole-genome ROH;
if p < n/m < 1, judging whether the sample to be tested is triploid, where 0 < p < 0.9.
The principle of the method is as follows: first, low-depth sequencing data (namely the genome sequencing data) of the sample to be tested are obtained from a library built by enzymatic digestion; second, the SNP sites are extracted and counted, and the heterozygous SNP sites are counted according to the genotype of each SNP site; then, whether the sample to be tested is triploid or whole-genome ROH is judged from the proportion of heterozygous SNP sites. The method offers strong compatibility, wide applicability, and low cost, requires no probe design, and can judge whether the sample to be tested is triploid or whole-genome ROH at ultra-low sequencing depth.
Whether the sample to be tested is triploid is judged as follows:
based on the genome sequencing data, counting the numbers of SNP sites among the m SNP sites whose allele frequency (AF) is 0.33, 0.5, and 0.67, recorded as Count1, Count2, and Count3 respectively;
if (Count1 + Count3)/m > 2 × (Count2/m), judging the sample to be tested to be triploid.
Further, p is 0.1 to 0.3.
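The two-stage decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_sample`, the returned labels, and the handling of n/m = 0 or n/m = 1 (which the text does not cover) are hypothetical and not part of the disclosed method.

```python
def classify_sample(m, n, count1, count2, count3, p=0.2):
    """Apply the heterozygosity-ratio rule, then the AF-count rule.

    m: number of SNP sites; n: number of heterozygous SNP sites.
    count1, count2, count3: SNP counts at AF 0.33, 0.5 and 0.67.
    p: threshold with 0 < p < 0.9 (the text prefers 0.1 to 0.3).
    """
    if m <= 0:
        raise ValueError("m must be positive")
    if not 0 < p < 0.9:
        raise ValueError("p must satisfy 0 < p < 0.9")

    ratio = n / m
    if 0 < ratio <= p:
        # Too few heterozygous SNP sites.
        return "whole-genome ROH"
    if p < ratio < 1:
        # Whole-genome ROH excluded; compare the AF counts.
        if (count1 + count3) / m > 2 * (count2 / m):
            return "triploid"
        return "not triploid"
    # Assumption: ratio == 0 or ratio == 1 is not covered by the text.
    return "undetermined"
```

For instance, `classify_sample(1000, 400, 50, 300, 50)` returns `"not triploid"`, because the heterozygous ratio 0.4 exceeds p while the SNPs near AF 0.5 dominate.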
In one embodiment of the invention, the method further comprises: judging, based on the genome sequencing data, whether a regional ROH exists within the genome sequencing data. Specifically, the regional ROH is judged as follows:
dividing the genome sequencing data of the sample to be tested sequentially into a plurality of segments;
counting the number q1 of SNP sites and the number q2 of heterozygous SNP sites in each segment;
calculating the value of q2/q1 for each segment, and if the q2/q1 values of a plurality of consecutive segments are all smaller than the threshold Q, judging those consecutive segments to be a regional ROH.
Further, the threshold Q is 0.1 to 0.3.
Furthermore, the number q1 of SNP sites in each segment is not less than 10.
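The segment scan described above can be sketched as follows, assuming the per-segment counts q1 and q2 are already available as two lists in genomic order. The function name `find_regional_roh`, the default Q = 0.2, the minimum run of two segments, and the choice to let a segment with fewer than 10 SNP sites break a run are illustrative assumptions rather than details fixed by the text.

```python
def find_regional_roh(q1_counts, q2_counts, threshold_q=0.2,
                      min_snps=10, min_run=2):
    """Return (first, last) index pairs of consecutive low-heterozygosity segments."""
    if len(q1_counts) != len(q2_counts):
        raise ValueError("q1_counts and q2_counts must have equal length")

    runs = []
    start = None  # index where the current candidate run began
    for i, (q1, q2) in enumerate(zip(q1_counts, q2_counts)):
        # Assumption: a segment with too few SNP sites is not counted as low.
        is_low = q1 >= min_snps and (q2 / q1) < threshold_q
        if is_low:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last segment.
    if start is not None and len(q1_counts) - start >= min_run:
        runs.append((start, len(q1_counts) - 1))
    return runs
```

With `q1_counts = [20, 15, 30, 25, 12]` and `q2_counts = [8, 1, 2, 1, 6]`, segments 1 to 3 have ratios below 0.2, so the function returns `[(1, 3)]`.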
In one embodiment of the invention, the method further comprises: calculating the chromosome copy number of the sample to be tested based on the genome sequencing data.
The SNP sites in the genome sequencing data are obtained as follows:
detecting all SNP sites or target SNP sites in the genome sequencing data;
removing, from all SNP sites or the target SNP sites, those SNP sites that contain multiple alleles;
filtering the SNP sites that remain after the multi-allelic sites are removed, to obtain the SNP sites in the genome sequencing data.
Compared with the prior art, the invention has the following beneficial effects:
1. The method offers strong compatibility, wide applicability, and low cost, requires no probe design, and can judge triploidy or whole-genome ROH of genomic DNA at ultra-low sequencing depth.
2. While judging triploidy or whole-genome ROH of genomic DNA, the method can also determine the chromosome copy number (CNV) and regional ROH.
3. The invention divides the genome sequencing data into a plurality of segments and, by counting the SNP sites and heterozygous SNP sites in each segment, analyzes consecutive segments to judge whether a regional ROH exists, allowing further analysis of the sample to be tested.
Drawings
In order to more clearly illustrate the technical solution of the embodiment of the present invention, the drawings used in the description of the embodiment will be briefly introduced below. It should be apparent that the drawings in the following description are only for illustrating the embodiments of the present invention or technical solutions in the prior art more clearly, and that other drawings can be obtained by those skilled in the art without any inventive work.
FIG. 1 is a flowchart of the method for detecting triploid chromosome and whole genome ROH based on low depth sequencing method in example 1;
FIG. 2 is a flowchart of a method for detecting triploid chromosome, genome-wide ROH and chromosome copy number based on low-depth sequencing method in example 2;
FIG. 3 is a flowchart of the method for detecting triploid, whole genome ROH, region ROH, chromosome copy number in chromosome based on low depth sequencing method in example 3.
Detailed Description
The invention will be further described with reference to specific embodiments, and the advantages and features of the invention will become apparent as the description proceeds. These examples are illustrative only and do not limit the scope of the present invention in any way. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.
In the description of the present embodiments, it is to be understood that the terms "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the device or element referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first," "second," etc. may explicitly or implicitly include one or more of that feature. In the description of the invention, "a plurality" means two or more unless otherwise specified.
Example 1:
This embodiment provides a method for detecting triploidy and ROH in chromosomes based on low-depth sequencing. As shown in FIG. 1, the method includes:
S1, obtaining genome sequencing data of the sample to be tested based on an enzymatic digestion method.
Specifically, the genome sequencing data are obtained as follows: the genomic DNA of the sample to be tested is extracted and fragmented. In this embodiment, the genomic DNA is preferably fragmented by enzymatic digestion (for example, with a specific endonuclease such as MspI); a sequencing library is then constructed through end repair, adapter ligation, amplification, fragment selection, and the like, and sequencing data are obtained by on-machine sequencing.
Once obtained, the sequencing data are processed as follows: adapters and low-quality bases are removed; the reads are aligned to the hg18 or hg19 reference genome using alignment software such as BWA or Bowtie to generate an alignment file; and the alignment file is sorted and summarized to obtain a sorted BAM file and statistics, including the mapping rate, average depth, and the like. The processed data constitute the genome sequencing data of the sample to be tested.
S2, extracting the SNP sites in the genome sequencing data and recording their number as m.
Specifically, the SNP sites in the genome sequencing data are obtained as follows:
S21, detecting all SNP sites or target SNP sites in the genome sequencing data.
In this step, SNP sites in the genome sequencing data are detected with software such as SAMtools, GATK, or FreeBayes. The target SNP sites are designated SNP sites, obtained for example from an in-house SNP list or from public SNP databases (e.g., dbSNP, 1000G).
S22, removing, from all SNP sites or the target SNP sites, those SNP sites that contain multiple alleles.
S23, filtering the SNP sites that remain after the multi-allelic sites are removed, to obtain the SNP sites in the genome sequencing data.
In this step, SNP sites are filtered according to coverage depth (for example, a base covered more than 10 times), variant quality value, and the like.
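Steps S22 and S23 amount to a simple filter over the called sites. The sketch below assumes each call has already been parsed into a small record; the `SnpCall` class, its field names, and the quality cutoff of 20 are hypothetical, while the depth cutoff of more than 10 follows the example given in the text.

```python
from dataclasses import dataclass


@dataclass
class SnpCall:
    chrom: str
    pos: int
    ref: str
    alts: tuple    # alternate alleles reported at this site
    depth: int     # number of reads covering the site
    qual: float    # variant quality value from the caller


def filter_snps(calls, min_depth=10, min_qual=20.0):
    """Drop multi-allelic sites, then filter on depth and quality."""
    kept = []
    for call in calls:
        if len(call.alts) != 1:
            continue  # S22: site carries more than one alternate allele
        if call.depth <= min_depth:
            continue  # S23: coverage must exceed min_depth
        if call.qual < min_qual:
            continue  # S23: assumed quality cutoff, not given in the text
        kept.append(call)
    return kept
```

The length of the returned list corresponds to m in step S2.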
S3, counting the number n of heterozygous SNP sites among the SNP sites according to the genotype of each SNP site.
S4, judging whether the sample to be tested is triploid or whole-genome ROH:
S41, if 0 < n/m ≤ p, the sample to be tested has too few heterozygous SNP sites and can be judged to be whole-genome ROH, where 0 < p < 0.9.
In this step, p is 0.1 to 0.3, preferably p = 0.2.
S42, if p < n/m < 1, whole-genome ROH is excluded and whether the sample to be tested is triploid is judged, where 0 < p < 0.9.
In this step, p is 0.1 to 0.3, preferably p = 0.2.
Specifically, whether the sample to be tested is triploid is judged as follows:
S421, based on the genome sequencing data, counting the numbers of SNP sites among the m SNP sites whose allele frequency (AF) is 0.33, 0.5, and 0.67, recorded as Count1, Count2, and Count3 respectively;
S422, if (Count1 + Count3)/m > 2 × (Count2/m), judging the sample to be tested to be triploid.
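The text does not say how an observed allele frequency is matched to 0.33, 0.5, or 0.67. One possible reading, shown here purely as an assumption, is to count a site when its AF falls within a small tolerance of one of the three values; the function name `count_af_bins` and the tolerance of 0.05 are hypothetical.

```python
def count_af_bins(allele_fractions, centers=(0.33, 0.5, 0.67), tol=0.05):
    """Count sites whose AF lies within tol of each center value.

    Returns (Count1, Count2, Count3) for centers 0.33, 0.5 and 0.67.
    With tol = 0.05 the three intervals do not overlap.
    """
    counts = [0] * len(centers)
    for af in allele_fractions:
        for k, center in enumerate(centers):
            if abs(af - center) <= tol:
                counts[k] += 1
                break  # a site is counted in at most one interval
    return tuple(counts)
```

For the list `[0.33, 0.35, 0.5, 0.66, 0.7, 0.1, 0.9, 1.0]` this gives `(2, 1, 2)`. At low depth individual AF estimates are noisy, so a practical tolerance would need to be tuned on real data.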
The principle of the method is as follows: first, low-depth sequencing data (namely the genome sequencing data) of the sample to be tested are obtained from a library built by enzymatic digestion; second, the SNP sites are extracted and counted, and the heterozygous SNP sites are counted according to the genotype of each SNP site; then, whether the sample to be tested is triploid or whole-genome ROH is judged from the proportion of heterozygous SNP sites. The method offers strong compatibility, wide applicability, and low cost, requires no probe design, and can judge whether the sample to be tested is triploid or whole-genome ROH at ultra-low sequencing depth.
Example 2:
This embodiment is an improvement on Example 1 and provides a method for detecting triploidy and whole-genome ROH in chromosomes based on low-depth sequencing.
As shown in FIG. 2, the method for simultaneously detecting triploidy, ROH, and chromosome copy number (CNV) comprises:
S1, obtaining genome sequencing data of the sample to be tested based on an enzymatic digestion method.
The genome sequencing data of the sample to be tested are obtained in the same way as in Example 1, which is not repeated here.
S2, extracting the SNP sites in the genome sequencing data and recording their number as m.
The SNP sites in the genome sequencing data are obtained in the same way as in Example 1, which is not repeated here.
S3, counting the number n of heterozygous SNP sites among the SNP sites according to the genotype of each SNP site.
S4, judging whether the sample to be tested is triploid or whole-genome ROH:
S41, if 0 < n/m ≤ p, the sample to be tested has too few heterozygous SNP sites and can be judged to be whole-genome ROH, where 0 < p < 0.9.
In this step, p is 0.1 to 0.3, preferably p = 0.2.
S42, if p < n/m < 1, whole-genome ROH is excluded and whether the sample to be tested is triploid is judged, where 0 < p < 0.9.
In this step, p is 0.1 to 0.3, preferably p = 0.2.
Specifically, whether the sample to be tested is triploid is judged as follows:
S421, based on the genome sequencing data, counting the numbers of SNP sites among the m SNP sites whose allele frequency (AF) is 0.33, 0.5, and 0.67, recorded as Count1, Count2, and Count3 respectively;
S422, if (Count1 + Count3)/m > 2 × (Count2/m), judging the sample to be tested to be triploid.
S43, calculating the chromosome copy number (CNV) of the sample to be tested based on the genome sequencing data.
Specifically, the chromosome copy number (CNV) is preferably calculated as follows: the genome is divided into windows of a certain length; the number of reads (sequenced fragments) aligned within each window is counted; the counts are corrected through within-sample normalization, GC correction, and similar steps; the corrected value is compared with that of a normal diploid sample to obtain a ratio; and a hidden Markov model is used to estimate the copy number of genomic regions. Other conventional methods may also be used to calculate the chromosome copy number (CNV); no particular limitation is imposed here.
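The ratio step of this calculation can be illustrated with a deliberately simplified sketch. It covers only within-sample normalization and comparison with a diploid reference; GC correction and the hidden Markov model are omitted, and the function name `estimate_copy_number` and the convention of returning `None` for empty reference windows are assumptions made for the example.

```python
def estimate_copy_number(sample_counts, reference_counts):
    """Per-window copy number estimate from read counts.

    sample_counts / reference_counts: reads per window for the tested
    sample and for a normal diploid sample, in the same window order.
    """
    if len(sample_counts) != len(reference_counts):
        raise ValueError("both inputs must cover the same windows")
    sample_total = sum(sample_counts)
    reference_total = sum(reference_counts)
    if sample_total == 0 or reference_total == 0:
        raise ValueError("read counts must not all be zero")

    estimates = []
    for s, r in zip(sample_counts, reference_counts):
        if r == 0:
            estimates.append(None)  # window without reference coverage
            continue
        # Normalize each window by its sample's total read count.
        ratio = (s / sample_total) / (r / reference_total)
        estimates.append(2 * ratio)  # the reference is diploid
    return estimates
```

Because the counts are normalized within the sample, a uniform gain across the whole genome leaves every ratio unchanged, which is consistent with the background statement that CNV-seq alone cannot detect triploidy.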
This embodiment can judge whether the sample to be tested is triploid or whole-genome ROH and can also calculate the chromosome copy number, which helps ensure the accuracy of detection. The method offers strong compatibility, wide applicability, and low cost, requires no probe design, and can simultaneously determine triploidy, whole-genome ROH, and chromosome copy number at ultra-low sequencing depth.
Example 3:
This embodiment is an improvement on Example 2 and provides a method for detecting triploidy and ROH in chromosomes based on low-depth sequencing which, in addition to simultaneously detecting triploidy or whole-genome ROH and the chromosome copy number (CNV) of the sample to be tested, detects whether a regional ROH exists in the genome sequencing data of the sample.
As shown in FIG. 3, the method for simultaneously detecting triploidy, whole-genome ROH, chromosome copy number (CNV), and regional ROH comprises:
S1, obtaining genome sequencing data of the sample to be tested based on an enzymatic digestion method.
The genome sequencing data of the sample to be tested are obtained in the same way as in Example 1, which is not repeated here.
S2, extracting the SNP sites in the genome sequencing data and recording their number as m.
The SNP sites in the genome sequencing data are obtained in the same way as in Example 1, which is not repeated here.
S3, counting the number n of heterozygous SNP sites among the SNP sites according to the genotype of each SNP site.
S4, judging whether the sample to be tested is triploid or whole-genome ROH:
S41, if 0 < n/m ≤ p, the sample to be tested has too few heterozygous SNP sites and can be judged to be whole-genome ROH, where 0 < p < 0.9.
In this step, p is 0.1 to 0.3, preferably p = 0.2.
S42, if p < n/m < 1, whole-genome ROH is excluded and whether the sample to be tested is triploid is judged, where 0 < p < 0.9.
In this step, p is 0.1 to 0.3, preferably p = 0.2.
Specifically, whether the sample to be tested is triploid is judged as follows:
S421, based on the genome sequencing data, counting the numbers of SNP sites among the m SNP sites whose allele frequency (AF) is 0.33, 0.5, and 0.67, recorded as Count1, Count2, and Count3 respectively;
S422, if (Count1 + Count3)/m > 2 × (Count2/m), judging the sample to be tested to be triploid.
S43, calculating the chromosome copy number (CNV) of the sample to be tested based on the genome sequencing data.
The chromosome copy number is calculated in the same way as in Example 2, which is not repeated here.
S44, judging, based on the genome sequencing data, whether a regional ROH exists within the genome sequencing data.
Specifically, the regional ROH is judged as follows:
S441, dividing the genome sequencing data of the sample to be tested sequentially into a plurality of segments.
Specifically, because segments may differ in length and individuals differ from one another, the number of SNP sites contained in each segment varies. Moreover, the regional ROH calculation yields accurate results only when each segment contains a sufficient number of SNP sites; too few SNP sites lead to false negatives and false positives. Therefore, the number q1 of SNP sites in each segment is preferably at least 10, and, subject to that requirement, the length of each segment is set to 300 kb to 2 Mb, to be determined case by case.
S442, counting the number q1 of SNP sites and the number q2 of heterozygous SNP sites in each segment;
S443, calculating the value of q2/q1 for each segment, and if the q2/q1 values of a plurality of (2 or more) consecutive segments are all smaller than the threshold Q, judging those consecutive segments to be a regional ROH.
In this step, the threshold Q is 0.1 to 0.3, preferably 0.1 to 0.2.
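Steps S441 and S442 can be sketched as a binning pass over the SNP sites. The example assumes each site is given as a (chromosome, position, is_heterozygous) tuple and uses fixed 1 Mb windows, which lies within the 300 kb to 2 Mb range stated above; the function name `window_counts` and the input layout are hypothetical.

```python
from collections import defaultdict


def window_counts(snps, window_size=1_000_000):
    """Count SNP sites (q1) and heterozygous SNP sites (q2) per window.

    snps: iterable of (chrom, pos, is_het) tuples.
    Returns {(chrom, window_index): [q1, q2]}.
    """
    counts = defaultdict(lambda: [0, 0])
    for chrom, pos, is_het in snps:
        # Integer division assigns each position to a fixed-size window.
        key = (chrom, pos // window_size)
        counts[key][0] += 1
        if is_het:
            counts[key][1] += 1
    return dict(counts)
```

Windows whose q1 falls below 10 could then be merged with a neighbour or enlarged before q2/q1 is compared with the threshold Q, in line with the requirement on q1 stated in S441.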
This embodiment can judge whether the sample to be tested shows triploidy and ROH (including whole-genome ROH and regional ROH) and can also calculate the chromosome copy number, which helps ensure the accuracy of detection. The method offers strong compatibility, wide applicability, and low cost, requires no probe design, and can simultaneously determine triploidy, ROH, and chromosome copy number at ultra-low sequencing depth.
In the following, three examples show how the presence or absence of triploidy, whole-genome ROH, and regional ROH in a sample to be tested is determined.
Example 1: judging whether ROH or triploidy exists in sample 1:
obtaining genome sequencing data of sample 1 by the enzymatic digestion method;
counting the SNP sites within the chromosomal region in the genome sequencing data, m = 14856;
analyzing the genotype of each SNP site and counting the heterozygous SNP sites, n = 4138;
setting p = 0.2 and calculating n/m = 0.28 > 0.2, which excludes whole-genome ROH for the chromosomal region;
counting, among the m SNP sites, the numbers of SNPs at the three allele frequencies AF = 0.33, 0.5, and 0.67, recorded as Count1 = 1204, Count2 = 740, and Count3 = 912 respectively;
calculating that (Count1 + Count3)/m > 2 × (Count2/m), which suggests that the chromosomal region of sample 1 is triploid.
Example 2: judging whether ROH or triploidy exists in sample 2:
obtaining genome sequencing data of sample 2 by the enzymatic digestion method;
counting the SNP sites within the chromosomal region in the genome sequencing data, m = 25887;
analyzing the genotype of each SNP site and counting the heterozygous SNP sites, n = 1809;
setting p = 0.2 and calculating n/m = 0.07 < 0.2, which shows that the genome sequencing data contain few heterozygous SNP sites and suggests that sample 2 is whole-genome ROH.
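The arithmetic behind the conclusions for samples 1 and 2 can be checked directly from the reported counts. The short script below only reproduces the comparisons stated in the text; the variable names are illustrative.

```python
# Sample 1: counts reported in the text.
m1, n1 = 14856, 4138
count1, count2, count3 = 1204, 740, 912
p = 0.2

het_ratio_1 = n1 / m1              # rounds to 0.28, above p
lhs = (count1 + count3) / m1       # 2116 / 14856
rhs = 2 * (count2 / m1)            # 1480 / 14856
sample1_triploid = het_ratio_1 > p and lhs > rhs

# Sample 2: counts reported in the text.
m2, n2 = 25887, 1809
het_ratio_2 = n2 / m2              # rounds to 0.07, below p
sample2_whole_genome_roh = 0 < het_ratio_2 <= p

print(round(het_ratio_1, 2), sample1_triploid)
print(round(het_ratio_2, 2), sample2_whole_genome_roh)
```

Since both sides of the triploidy inequality share the denominator m, the test for sample 1 reduces to comparing 1204 + 912 = 2116 with 2 × 740 = 1480.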
Example 3: judging whether a regional ROH exists in sample 3:
obtaining genome sequencing data of sample 3 by the enzymatic digestion method;
dividing the genome sequencing data into a plurality of windows (namely, segments) of 1 Mb in length, and calculating the number q2 of heterozygous SNP sites and the number q1 of SNP sites in each window;
calculating the q2/q1 ratio of each window. Within a range of about 14 Mb at the end of chromosome 8, the q2/q1 ratios of 12 consecutive windows are all below 0.2, so these 12 consecutive windows at the end of chromosome 8 in the genome sequencing data of sample 3 are judged to be a regional ROH.
At the same time, copy number analysis is performed on the genome sequencing data of sample 3: the sequencing data undergo within-sample normalization, GC correction, baseline correction, and other steps, and a hidden Markov model is used to estimate the copy number of genomic regions. The result indicates a 14 Mb heterozygous deletion at the end of chromosome 8.
The above description only illustrates preferred embodiments of the present invention and is not to be construed as limiting the invention; any modifications, equivalents, improvements, and the like that fall within the spirit and principle of the present invention are intended to be included in its scope of protection.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution. The specification is written this way for clarity only; those skilled in the art should treat it as a whole, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (8)

1. A method for detecting triploid and ROH in chromosomes based on a low-depth sequencing method, characterized in that the method comprises the following steps:
obtaining genome sequencing data of a sample to be tested based on an enzyme digestion method;
extracting SNP sites from the genome sequencing data, and recording the number of the SNP sites as m;
counting the number n of heterozygous SNP sites among the SNP sites according to the genotyping of the SNP sites;
judging whether the sample to be tested is triploid or whole-genome ROH:
if 0 < n/m ≤ p, the sample to be tested is whole-genome ROH;
if p < n/m < 1, judging whether the sample to be tested is triploid, wherein 0 < p < 0.9.
2. The method of claim 1, wherein judging whether the sample to be tested is triploid comprises:
counting, based on the genome sequencing data, the number of SNP sites among the m SNP sites whose allele frequency (AF) lies at 0.33, 0.5 and 0.67, recorded as Count1, Count2 and Count3, respectively;
and if (Count1 + Count3)/m > 2 × (Count2/m), judging the sample to be tested to be triploid.
3. The method of claim 1, wherein p is 0.1 to 0.3.
4. The method of claim 1 or 2, further comprising judging, based on the genome sequencing data, whether a region ROH is present in the genome sequencing data;
the method for judging the region ROH comprises the following steps:
dividing the genome sequencing data of the sample to be tested sequentially into a plurality of sections;
counting the number q1 of SNP sites and the number q2 of heterozygous SNP sites in each section;
calculating the value of q2/q1 for each section, and if the values of q2/q1 of a plurality of consecutive sections are all smaller than a threshold Q, judging the plurality of consecutive sections to be a region ROH.
5. The method of claim 4, wherein the threshold Q is 0.1-0.3.
6. The method according to claim 4, wherein the number of SNP sites q1 in each segment is 10 or more.
7. The method of any one of claims 1, 2, 3, 5 or 6, further comprising: calculating the chromosome copy number of the sample to be tested based on the genome sequencing data.
8. The method of claim 1, wherein obtaining the SNP sites in the genome sequencing data comprises:
detecting all SNP sites in the genome sequencing data;
removing SNP sites containing a plurality of alleles from all of the SNP sites or from target SNP sites;
and filtering the SNP sites remaining after the removal to obtain the SNP sites in the genome sequencing data.
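As an illustration of the triploid criterion in claim 2, the following Python sketch counts SNP sites near allele frequencies 0.33, 0.5 and 0.67 and applies the inequality, read here as (Count1 + Count3)/m > 2 × (Count2/m). The function name `is_triploid` and the tolerance `tol` are hypothetical: the claims do not state how a site is assigned to one of the three AF positions, so the symmetric bin used below is an assumption.

```python
def is_triploid(allele_freqs, tol=0.05):
    """Apply the criterion (Count1 + Count3)/m > 2 * (Count2/m).

    allele_freqs: allele frequency (AF) of each of the m SNP sites.
    tol: half-width of the bin around 0.33, 0.5 and 0.67 (an assumption;
         the source does not specify a binning rule).
    """
    m = len(allele_freqs)
    if m == 0:
        raise ValueError("no SNP sites supplied")
    count1 = sum(abs(af - 0.33) <= tol for af in allele_freqs)
    count2 = sum(abs(af - 0.50) <= tol for af in allele_freqs)
    count3 = sum(abs(af - 0.67) <= tol for af in allele_freqs)
    return (count1 + count3) / m > 2 * (count2 / m)
```

With the default tolerance the three bins do not overlap, so each site contributes to at most one count. A wider tolerance would let a single site fall into two bins and should be avoided.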
CN202110878235.0A 2021-08-02 2021-08-02 Method for detecting triploid and ROH in chromosome based on low-depth sequencing method Active CN113337600B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110878235.0A CN113337600B (en) 2021-08-02 2021-08-02 Method for detecting triploid and ROH in chromosome based on low-depth sequencing method

Publications (2)

Publication Number Publication Date
CN113337600A true CN113337600A (en) 2021-09-03
CN113337600B CN113337600B (en) 2021-11-09

Family

ID=77480528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110878235.0A Active CN113337600B (en) 2021-08-02 2021-08-02 Method for detecting triploid and ROH in chromosome based on low-depth sequencing method

Country Status (1)

Country Link
CN (1) CN113337600B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113113081A (en) * 2020-08-31 2021-07-13 东莞博奥木华基因科技有限公司 System for detecting polyploid and genome homozygous region ROH based on CNV-seq sequencing data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
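QIAN GENG et al.: "Screening of triploid with low-coverage whole-genome sequencing by a single-nucleotide polymorphism-based test in miscarriage tissue", Journal of Assisted Reproduction and Genetics *
李奉瑾 (LI Fengjin) et al.: "Research progress of low-depth whole-genome sequencing technology in prenatal diagnosis", Journal of International Obstetrics and Gynecology *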

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114300051A (en) * 2021-12-22 2022-04-08 北京吉因加医学检验实验室有限公司 Method and device for calculating fusion gene frequency
CN114300051B (en) * 2021-12-22 2022-07-15 北京吉因加医学检验实验室有限公司 Method and device for calculating fusion gene frequency
CN114049914A (en) * 2022-01-14 2022-02-15 苏州贝康医疗器械有限公司 Method and device for integrally detecting CNV, uniparental disomy, triploid and ROH
CN115287369A (en) * 2022-10-08 2022-11-04 北京大学第三医院(北京大学第三临床医学院) Single cell sequencing based non-single sperm determination method
WO2024164854A1 (en) * 2023-02-10 2024-08-15 北京嘉宝仁和医疗科技股份有限公司 Integrated genome analysis method based on genotype imputation and low-coverage sequencing

Also Published As

Publication number Publication date
CN113337600B (en) 2021-11-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 102629 Room 302, floor 3, building 7, courtyard 19, Tianrong street, Daxing biomedical industry base, Zhongguancun Science and Technology Park, Daxing District, Beijing

Patentee after: Beijing Jiabao Renhe Medical Technology Co.,Ltd.

Address before: 102629 Room 203, building 6, courtyard 19, Tianrong street, Daxing biomedical industrial base, Zhongguancun Science and Technology Park, Daxing District, Beijing

Patentee before: PEKING JABREHOO TECHNOIOGY Co.,Ltd.