CN111755072A - Method and device for simultaneously detecting methylation level, genome variation and insertion fragment

Publication number
CN111755072A
CN111755072A CN202010774753.3A CN202010774753A CN111755072A CN 111755072 A CN111755072 A CN 111755072A CN 202010774753 A CN202010774753 A CN 202010774753A CN 111755072 A CN111755072 A CN 111755072A
Authority
CN
China
Legal status
Granted
Application number
CN202010774753.3A
Other languages
Chinese (zh)
Other versions
CN111755072B (en
Inventor
杨玲
张燕艳
管彦芳
姬利延
马梦亚
Current Assignee
Shenzhen Guiinga Medical Laboratory
Original Assignee
Shenzhen Guiinga Medical Laboratory
Application filed by Shenzhen Guiinga Medical Laboratory
Priority to CN202010774753.3A (patent CN111755072B)
Publication of CN111755072A
Application granted
Publication of CN111755072B
Legal status: Active

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50: Mutagenesis
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids

Abstract

The invention provides a method for simultaneously detecting methylation level, genomic variation and insert fragments using methylation non-bisulfite sequencing technology, wherein the genomic variation comprises gene mutation, copy number variation and structural variation. The invention also provides a device and equipment for implementing the method, and a corresponding computer-readable medium. For methylation non-bisulfite sequencing data, the invention enables one-step, multi-dimensional detection and analysis, from raw sequencer output to methylation level, genomic variation and insert fragments. It is suitable for both whole-genome and targeted-capture methylation non-bisulfite data, and can analyze either a single cancer sample or a paired sample (a cancer sample with a matched control sample).

Description

Method and device for simultaneously detecting methylation level, genome variation and insertion fragment
Technical Field
The invention relates to the technical field of bioinformatics, in particular to a method for simultaneously detecting methylation level, genome variation and insertion fragments, a device and equipment for implementing the method and a corresponding computer readable medium.
Background
DNA methylation is a chemical modification of DNA that can alter genetic behavior without altering the DNA sequence. DNA methylation was discovered as early as 1925, and numerous studies have shown that it plays an epigenetic role in gene regulation. The most studied form of DNA methylation is 5-methylcytosine (5mC), a modification generally considered to be a stable repressive regulator of gene expression. Current DNA methylation detection based on next-generation sequencing works as follows: unmethylated cytosine (C) is converted to uracil (U) by bisulfite; during PCR, a U-tolerant polymerase reads U as thymine (T), achieving C-to-T conversion; in the analysis, the sequencing data are aligned separately to C-to-T-converted and G-to-A-converted reference genomes, and the methylation level of the sample DNA is identified. In normal human DNA, about 3% to 6% of C bases are methylated, so more than 90% of C bases are converted to T in bisulfite sequencing data.
Genomic variations mainly include gene mutations, copy number variations and structural variations. A gene mutation is a structural change in the base-pair composition or order of a gene, including sequence changes caused by base substitution, DNA insertion, DNA deletion, or DNA duplication. Copy number variation generally refers to the duplication or deletion of large genomic fragments ranging from 1 kb to several Mb in length. Structural variation generally refers to chromosomal rearrangement, in which two genes located far apart in the genome fuse to form a new coding sequence. Gene mutations, copy number variations and structural variations directly change the DNA base sequence, influence the genetic characteristics of organisms, and play an important role in the early diagnosis, medication guidance and prognosis monitoring of tumors. All three are changes in the nucleotide sequence of the DNA molecule and can be detected by aligning next-generation sequencing data to a reference genome and analyzing the differences. The data generated by this detection method contain only DNA base sequence information, so it is not possible to identify whether a base is methylated.
The insert generally refers to a DNA fragment obtained by fragmenting the DNA molecules in a sample, by sonication or enzymatic digestion, during library construction for next-generation sequencing. Cell-free DNA (cfDNA) fragments in blood are 75-250 bp long, so cfDNA does not need to be fragmented before library construction. Insert analysis reflects the length distribution of the inserts, e.g., of cfDNA fragments. Currently, insert analysis is performed by aligning paired-end genomic sequencing data; such data contain no methylation information and cannot be used for methylation-related analysis.
In summary, the analysis of genomic variations (gene mutations, copy number variations, structural variations) and the analysis of inserts are both based on differences between DNA sequencing data and a reference genome; they do not take sequences carrying methylation marks into account and cannot be applied to data containing methylation signals. Current methylation-level detection is based on differential analysis of bisulfite sequencing (BS) data against reference genomes that have undergone C-to-T and G-to-A conversion, followed by evaluation of the methylation level at all sites. Because more than 90% of the C bases in BS sequencing data are converted and the DNA is damaged during conversion, genomic variation and insert analysis cannot be performed on such data. Thus, simultaneous detection of methylation level, genomic variations (gene mutations, copy number variations, structural variations) and insert analysis from the same DNA sequencing data is currently not available in the art.
Disclosure of Invention
The invention aims to realize the detection of methylation level, genome variation (gene mutation, copy number variation and structural variation) and insert analysis of DNA sequencing data at the same time.
The invention achieves the purpose through the following technical scheme.
In a first aspect, the present invention provides a method for simultaneously detecting methylation levels, genomic variations and inserts, wherein the genomic variations comprise genetic mutations, copy number variations and structural variations, the method comprising the steps of:
S1: a sequencing data providing step of providing sequencing data by performing methylation non-bisulfite sequencing on a mixed sample containing the DNA of a sample to be tested, methylated positive reference DNA and methylated negative reference DNA;
S2: a sequencing data processing step of processing the sequencing data to obtain effective data of the methylated positive reference DNA and the methylated negative reference DNA and effective data of the DNA of the sample to be tested;
S3: a condition judgment step of counting, from the effective data of the methylated positive reference DNA and the methylated negative reference DNA, the methylation level alpha of the methylated positive reference DNA in CpG regions and the methylation level beta of the methylated negative reference DNA over its whole genome, and judging whether alpha and beta satisfy alpha ≥ 95% and beta < 5%; if so, proceeding to S4, and if not, returning to step S1 to perform methylation non-bisulfite sequencing again;
S4: a methylation detection step for the sample to be tested, of performing methylation analysis on the effective data of the DNA of the sample to be tested and counting the methylation level of the DNA of the sample to be tested;
S5: a gene mutation detection step for the sample to be tested, of performing gene mutation analysis on the effective data of the DNA of the sample to be tested, performing gene functional region filtering and database frequency filtering on the gene mutation analysis result to obtain a first mutation set, removing methylation-converted reads according to the methylation statistics, and filtering the first mutation set using upward floating thresholds set separately for CpG, CHG and CHH sites to obtain a final mutation set;
S6: a copy number variation detection step for the sample to be tested, of performing copy number variation analysis on the effective data of the DNA of the sample to be tested to obtain copy number variation data, and filtering and screening the data;
S7: a structural variation detection step for the sample to be tested, of performing structural variation analysis on the effective data of the DNA of the sample to be tested to obtain structural variation data, and filtering and screening the data;
S8: an insert fragment detection step for the sample to be tested, of performing insert analysis on the effective data of the DNA of the sample to be tested, distinguishing reads covering the mutant type from reads covering the wild type, and separately counting the distributions of mutant and wild-type insert fragments.
In a specific embodiment of the present invention, the DNA of the sample to be tested is human somatic DNA, and the methylated positive reference DNA and methylated negative reference DNA are DNAs of species different from human species.
In a specific embodiment of the invention, processing the sequencing data in step S2 comprises: subjecting the sequencing data to adapter removal and low-quality sequence filtering, wherein a sequence is retained only if bases with a quality value <15 make up no more than 50% of the sequence; aligning the filtered sequence data to the human reference genome, the methylated positive reference DNA and the methylated negative reference DNA respectively to obtain alignment files, and building an index; merging the alignment files obtained from multiple lanes and sorting them; and removing PCR duplicates from the merged alignment file to obtain the effective data.
In a specific embodiment of the present invention, in step S4, the sample methylation analysis result contains methylation information for all C bases, including genomic position, methylated coverage depth, unmethylated coverage depth and methylation frequency; the sample methylation statistics comprise the methylation levels of CpG dinucleotide and non-CpG regions, wherein the non-CpG regions comprise CHH sites and CHG sites, where H is any base other than G.
In a specific embodiment of the present invention, in step S5, the sample gene mutation analysis result includes genomic locus information and mutation information; the gene functional region filtering retains only exon regions, missense mutations, nonsense mutations and frameshift mutations; the database frequency filtering removes mutations with a 1000 Genomes frequency of 0.001 or more; the upward floating threshold for CpG is 0.1, and the upward floating threshold for CHG and CHH is 0.05.
In a specific embodiment of the present invention, in step S6, the copy number variation data filtering condition is ratio >2 or ratio <0.5.
In a specific embodiment of the present invention, in step S7, the structural variation data screening condition is a breakpoint coverage of 5 or more.
In a specific embodiment of the present invention, in step S8, mutant and wild-type reads are distinguished by identifying the mutation site from the CIGAR and FLAG information in the BAM file, wherein the identification of the mutation site excludes the influence of the methylation signal.
In a second aspect, the present invention provides an apparatus for performing the method of the first aspect of the present invention for simultaneously detecting methylation levels, genomic variations and inserts, wherein the genomic variations include genetic mutations, copy number variations and structural variations, the apparatus comprising the following modules:
M1: a sequencing data providing module for providing sequencing data by performing methylation non-bisulfite sequencing on a mixed sample containing the DNA of a sample to be tested, methylated positive reference DNA and methylated negative reference DNA;
M2: a sequencing data processing module for processing the sequencing data to obtain effective data of the methylated positive reference DNA and the methylated negative reference DNA and effective data of the DNA of the sample to be tested;
M3: a condition judgment module for counting, from the effective data of the methylated positive reference DNA and the methylated negative reference DNA, the methylation level alpha of the methylated positive reference DNA in CpG regions and the methylation level beta of the methylated negative reference DNA over its whole genome, and judging whether alpha and beta satisfy alpha ≥ 95% and beta < 5%; if so, proceeding to module M4, and if not, returning to module M1 to perform methylation non-bisulfite sequencing again;
M4: a methylation detection module for the sample to be tested, for performing methylation analysis on the effective data of the DNA of the sample to be tested and counting the methylation level of the DNA of the sample to be tested;
M5: a gene mutation detection module for the sample to be tested, for performing gene mutation analysis on the effective data of the DNA of the sample to be tested, performing gene functional region filtering and database frequency filtering on the gene mutation analysis result to obtain a first mutation set, removing methylation-converted reads according to the methylation statistics, and filtering the first mutation set using upward floating thresholds set separately for CpG, CHG and CHH sites to obtain a final mutation set;
M6: a copy number variation detection module for the sample to be tested, for performing copy number variation analysis on the effective data of the DNA of the sample to be tested to obtain copy number variation data, and filtering and screening the data;
M7: a structural variation detection module for the sample to be tested, for performing structural variation analysis on the effective data of the DNA of the sample to be tested to obtain structural variation data, and filtering and screening the data;
M8: an insert fragment detection module for the sample to be tested, for performing insert analysis on the effective data of the DNA of the sample to be tested, distinguishing reads covering the mutant type from reads covering the wild type, and separately counting the distributions of mutant and wild-type insert fragments.
In a third aspect, the invention provides a computer readable medium having stored thereon computer program instructions, wherein the method of the first aspect of the invention for simultaneous detection of methylation levels, genomic variations and inserts is performed when the computer program instructions are executed by a processor.
In a fourth aspect, the present invention provides an apparatus for carrying out the method of the first aspect of the invention for simultaneous detection of methylation levels, genomic variations and inserts, the apparatus comprising:
a memory for storing computer program instructions, and
a processor for executing the computer program instructions,
wherein the computer program instructions, when executed by the processor, cause the apparatus to perform the method of the first aspect of the invention for simultaneous detection of methylation levels, genomic variations and inserts.
The invention has the beneficial effects that:
For methylation non-bisulfite sequencing data, the invention achieves one-step, multi-dimensional detection and analysis, from raw sequencer output to methylation level, genomic variation (gene mutation, copy number variation and structural variation) and insert fragments.
The method is suitable for both whole-genome and targeted-capture methylation non-bisulfite data, and can be used to analyze single cancer samples and paired samples (a cancer sample with a matched control sample).
For a single site, methylation and gene mutation can be detected at the same time, and both the methylation frequency and the mutation frequency are reported. The gene mutation screening removes false positive results caused by methylation signals and insufficient coverage, with a filtering efficiency of 99%.
With the influence of methylation background noise removed, insert analysis can effectively distinguish mutant fragments from wild-type fragments.
The data utilization and the alignment rate to the human reference genome of methylation non-bisulfite sequencing are significantly higher than those of bisulfite sequencing (BS).
Drawings
FIG. 1 shows a flow chart of the steps of the method of the present invention for simultaneous detection of methylation levels, genomic variations and inserts;
FIG. 2 shows a block diagram of an apparatus for simultaneous detection of methylation levels, genomic variants and inserts according to the present invention;
FIG. 3 shows that the gene mutation screening in the method of the present invention can remove the false positive results induced by methylation signal influence and coverage insufficiency, and the filtration efficiency reaches more than 99.66%;
FIG. 4 shows that insert analysis performed to remove the effects of methylation background noise by the method of the present invention can effectively distinguish mutant from wild-type fragments;
FIG. 5 shows a comparison of the methylation non-bisulfite sequencing method used in the methods of the present invention with genome-wide methylation bisulfite sequencing.
Detailed Description
In order to make the technical problems solved by the present invention, the technical solutions adopted and the advantages obtained by the present invention more clearly understood, the present invention is further described in detail below with reference to the accompanying drawings and the specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The simultaneous detection of methylation level, genomic variations (gene mutations, copy number variations, structural variations) and insert analysis from the same DNA sequencing data is currently not available in the art. Briefly, the existing genomic methylation detection method is typically bisulfite sequencing (BS), in which the methylation level at all sites is evaluated by differential analysis of BS data against reference genomes that have undergone C-to-T and G-to-A conversion. Because more than 90% of the C bases in BS sequencing data are converted and the DNA is damaged during conversion, genomic variation and insert analysis cannot be performed. Gene mutations, copy number variations and structural variations can be detected by aligning next-generation sequencing data to a reference genome, but the data generated by that method contain only DNA base sequence information and cannot show whether a base is methylated. Furthermore, current insert analysis is performed by aligning paired-end genomic sequencing data; such data contain no methylation information and cannot be used for methylation-related analysis.
Methylation non-bisulfite sequencing (OMAS) is a novel methylation detection technology in which, with the aid of TET enzyme, methylated C bases in genomic DNA are oxidized to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), which are then reduced with borane so that they are ultimately read as T bases. This technique converts only modified C bases (about 4-5% of total cytosine), such as 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC), to T bases. Sequencing data from this technique allow the methylation of each site to be detected and the methylation level to be evaluated. For example, see the applicant's Chinese patent application No. CN201911159400, entitled "genome-wide methylation non-bisulfite sequencing library and construction", the disclosure of which is incorporated herein by reference in its entirety.
The inventors found through research that methylation non-bisulfite sequencing alters the genome only slightly, so genomic variation and insert fragments can be analyzed from the same data. However, there is currently no report of simultaneous detection of methylation level, genomic variations (gene mutations, copy number variations, structural variations) and insert analysis for methylation non-bisulfite sequencing data.
Therefore, the present inventors have pioneered a method for simultaneously detecting methylation level, genomic variations and inserts using methylation non-bisulfite sequencing technology, wherein the genomic variations include gene mutations, copy number variations and structural variations. The flow chart of the steps of the method is shown in fig. 1, and the steps of the method will be described in detail below.
S1: a sequencing data providing step of providing sequencing data by performing methylation non-bisulfite sequencing on a mixed sample containing the DNA of the sample to be tested, methylated positive reference DNA and methylated negative reference DNA.
Specifically, the DNA of the sample to be tested is human somatic DNA, including but not limited to DNA from fresh human tissue, paraffin-embedded tissue, plasma, pleural effusion and ascites. The methylated positive reference DNA and the methylated negative reference DNA come from species other than human. The methylated positive reference can be, for example, commercially available methylated pUC19 (Zymo Research), or methylated pUC19 prepared from unmethylated pUC19 using M.SssI methyltransferase; lambda DNA (Promega), for example, can be used as the methylated negative reference DNA. The methylation non-bisulfite sequencing data can be whole-genome data or targeted-capture data, where targeted-capture data include whole-exome data. The sequencing data may be those of a single test sample (e.g., a cancer sample) or those of a test sample (e.g., a cancer sample) paired with a control sample, such as leukocytes isolated from the subject's own peripheral blood or a paracancerous tissue sample.
S2: a sequencing data processing step of processing the sequencing data to obtain effective data of the methylated positive reference DNA and the methylated negative reference DNA and effective data of the DNA of the sample to be tested.
Specifically, processing the sequencing data comprises: subjecting the sequencing data to adapter removal and low-quality sequence filtering, wherein a sequence is retained only if bases with a quality value <15 make up no more than 50% of the sequence; aligning the filtered sequence data to the human reference genome, the methylated positive reference DNA and the methylated negative reference DNA respectively to obtain alignment files, and building an index; merging the alignment files obtained from multiple lanes and sorting them; and removing from the merged alignment file the duplicate sequences generated by the PCR performed during methylation non-bisulfite sequencing, to obtain the effective data. Adapter removal and low-quality sequence filtering can be performed with fastp or trimgalore software, and the filtered data are in fastq format. The filtered data can be aligned to the methylated positive reference DNA, the methylated negative reference DNA and the reference genome using bwa software to obtain alignment files in BAM format. The index file, with the suffix bai, can be built with samtools (index). Data from multiple lanes can be merged with samtools (merge). PCR duplicates can be removed by calling the picard package through GATK.
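The low-quality filtering rule in step S2 can be illustrated with a minimal sketch. It assumes Phred+33 quality strings (the usual fastq encoding); the function names and the `(name, sequence, quality)` tuple layout are hypothetical and are not taken from fastp or trimgalore, which apply their own implementations of this kind of rule.

```python
# Minimal sketch of the S2 low-quality rule: a read is discarded when bases
# with quality < 15 make up more than 50% of the read.
# Assumption: qualities are Phred+33 encoded, as in standard fastq files.

def is_low_quality(quality_string, min_qual=15, max_fraction=0.5, phred_offset=33):
    """Return True when low-quality bases exceed max_fraction of the read."""
    if not quality_string:
        return True  # an empty read carries no usable bases
    low = sum(1 for ch in quality_string if ord(ch) - phred_offset < min_qual)
    return low / len(quality_string) > max_fraction


def filter_reads(records):
    """Keep (name, sequence, quality) records that pass the quality rule."""
    return [rec for rec in records if not is_low_quality(rec[2])]
```

A read in which exactly half of the bases fall below the threshold is kept, because the condition only rejects reads where low-quality bases exceed 50%.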
S3: a condition judgment step of counting, from the effective data of the methylated positive reference DNA and the methylated negative reference DNA, the methylation level alpha of the methylated positive reference DNA in CpG regions and the methylation level beta of the methylated negative reference DNA over its whole genome, and judging whether alpha and beta satisfy alpha ≥ 95% and beta < 5%; if so, proceeding to step S4, and if not, returning to step S1 to perform methylation non-bisulfite sequencing again.
Specifically, the methylation level of the methylated positive reference and of the methylated negative reference is the average, over the CpG dinucleotide sites in the alignment file, of the proportion of reads in which base C is converted to T. The methylation level alpha of the positive reference in CpG regions and the methylation level beta of the negative reference over its whole genome can be counted with astair call. If the conditions alpha ≥ 95% and beta < 5% are not met, the methylation non-bisulfite conversion was not successful and must be repeated, i.e., the method returns to step S1 to perform methylation non-bisulfite sequencing again.
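The S3 decision can be written down as a small sketch. The per-site `(converted_reads, total_reads)` pairs are an assumed input layout, not the actual output format of astair call, and the function names are hypothetical.

```python
# Minimal sketch of the S3 quality gate on the spiked-in reference DNAs.
# Assumption: each site is summarized as (converted_reads, total_reads).

def mean_conversion_rate(sites):
    """Average, over covered sites, of the fraction of reads with C read as T."""
    ratios = [converted / total for converted, total in sites if total > 0]
    if not ratios:
        raise ValueError("no covered sites to average")
    return sum(ratios) / len(ratios)


def conversion_qc_passed(alpha, beta, min_alpha=0.95, max_beta=0.05):
    """True when alpha >= 95% and beta < 5%, i.e. the analysis may go on to S4."""
    return alpha >= min_alpha and beta < max_beta
```

When `conversion_qc_passed` returns False, the text above calls for returning to step S1 and repeating the methylation non-bisulfite sequencing.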
S4: a methylation detection step for the sample to be tested, of performing methylation analysis on the effective data of the DNA of the sample to be tested and counting the methylation level of the DNA of the sample to be tested.
Specifically, the sample methylation analysis result contains methylation information for all C bases, including genomic position, methylated coverage depth, unmethylated coverage depth and methylation frequency; the sample methylation statistics comprise the methylation levels of CpG dinucleotide and non-CpG regions, wherein the non-CpG regions comprise CHH sites and CHG sites, where H is any base other than G. The methylation level of the sample is counted in the same way as the methylation level of the methylated reference DNA in step S3.
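The site contexts and per-context levels named in step S4 can be sketched as follows. This is an illustration only: it classifies forward-strand cytosines from a plain sequence string, and the function names and the `(context, methylated_depth, unmethylated_depth)` layout are assumptions rather than the output format of any particular methylation caller.

```python
# Minimal sketch: classify a cytosine as CpG, CHG or CHH (H = A, C or T) and
# average the per-site methylation frequency within each context.

def cytosine_context(seq, pos):
    """Context of the forward-strand C at 0-based pos, or None if undetermined."""
    seq = seq.upper()
    if pos < 0 or pos >= len(seq) or seq[pos] != "C":
        return None
    downstream = seq[pos + 1:pos + 3]
    if downstream[:1] == "G":
        return "CpG"
    if len(downstream) < 2 or "N" in downstream:
        return None  # too close to the sequence end, or ambiguous base
    return "CHG" if downstream[1] == "G" else "CHH"


def methylation_frequency(methylated_depth, unmethylated_depth):
    """Methylated reads divided by total reads at one site (None if uncovered)."""
    depth = methylated_depth + unmethylated_depth
    return methylated_depth / depth if depth > 0 else None


def context_levels(sites):
    """Mean methylation frequency per context over covered sites."""
    sums, counts = {}, {}
    for context, meth, unmeth in sites:
        freq = methylation_frequency(meth, unmeth)
        if freq is None:
            continue
        sums[context] = sums.get(context, 0.0) + freq
        counts[context] = counts.get(context, 0) + 1
    return {context: sums[context] / counts[context] for context in sums}
```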
S5: a gene mutation detection step for the sample to be tested, of performing gene mutation analysis on the effective data of the DNA of the sample to be tested, performing gene functional region filtering and database frequency filtering on the gene mutation analysis result to obtain a first mutation set, removing methylation-converted reads according to the methylation statistics, and filtering the first mutation set using upward floating thresholds set separately for CpG, CHG and CHH sites to obtain a final mutation set.
Specifically, gene mutations in the sample can be detected with the mutation detection software bcftools, Mutect2 or Varscan. The sample gene mutation analysis result comprises genomic locus information and mutation information, wherein the genomic locus information comprises the chromosome number and the mutation start position, and the mutation information comprises the wild type, the mutant type and their read coverage, the mutation frequency, and mutation annotation information (including gene, function and database annotations). The gene functional region filtering retains only exon regions, missense mutations, nonsense mutations and frameshift mutations, and the database frequency filtering removes mutations with a 1000 Genomes frequency of 0.001 or more. The upward floating threshold is 0.1 for CpG and 0.05 for CHG and CHH. That is, at CpG sites, C-to-T or G-to-A mutations whose frequency differs from the methylation frequency of the site by <0.1 are removed, and at other C sites, C-to-T or G-to-A mutations whose frequency differs from the methylation frequency of the site by <0.05 are removed. Mutation annotation can be performed with the software Annovar or VEP.
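The two filtering stages of step S5 can be sketched under stated assumptions. The dictionary keys (`region`, `effect`, `pop_freq`, `vaf`, `meth_freq`, `context`) are hypothetical names for already-annotated variants, not fields produced by Annovar or VEP. The sketch reads the "upward floating threshold" one-sidedly: a C-to-T or G-to-A call is kept only when its frequency exceeds the site's methylation frequency by at least the threshold. An absolute-difference reading of the text is also possible. Removal of methylation-converted reads is not modelled here.

```python
# Minimal sketch of the S5 filters on annotated variant records (dicts).
# All key names are illustrative assumptions.

KEPT_EFFECTS = {"missense", "nonsense", "frameshift"}
FLOAT_THRESHOLD = {"CpG": 0.10, "CHG": 0.05, "CHH": 0.05}
CONVERSION_LIKE = {("C", "T"), ("G", "A")}  # changes that methylation can mimic


def in_first_set(variant, max_pop_freq=0.001):
    """Functional-region filter plus population-frequency filter."""
    return (
        variant["region"] == "exonic"
        and variant["effect"] in KEPT_EFFECTS
        and variant["pop_freq"] < max_pop_freq
    )


def survives_methylation_filter(variant):
    """Drop conversion-like calls that are explained by the methylation signal."""
    if (variant["ref"], variant["alt"]) not in CONVERSION_LIKE:
        return True
    threshold = FLOAT_THRESHOLD.get(variant["context"])
    if threshold is None:
        return True  # site is not a classified cytosine context
    return variant["vaf"] - variant["meth_freq"] >= threshold


def final_mutation_set(variants):
    """Apply both stages and return the surviving variants."""
    return [v for v in variants if in_first_set(v) and survives_methylation_filter(v)]
```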
S6: a copy number variation detection step for the sample to be tested, of performing copy number variation analysis on the effective data of the DNA of the sample to be tested to obtain copy number variation data, and filtering and screening the data.
Specifically, the copy number variation analysis can be performed with Freec or CNVnator software; the copy number variation data include the genomic region and the genes contained in the region, and the copy number variation data filtering condition is ratio >2 or ratio <0.5.
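The S6 filtering condition amounts to a simple predicate. In this sketch the `ratio` value is assumed to be the copy ratio relative to the baseline as reported by the CNV caller; the text does not define it further, and the function name is hypothetical.

```python
# Minimal sketch of the S6 filter: keep a segment when ratio > 2 or ratio < 0.5.
# Assumption: ratio is the caller-reported copy ratio relative to baseline.

def is_reportable_cnv(ratio, gain_cutoff=2.0, loss_cutoff=0.5):
    """True for segments that pass the copy number variation filter."""
    return ratio > gain_cutoff or ratio < loss_cutoff
```

Both cutoffs are strict, so a segment with a ratio of exactly 2 or exactly 0.5 is filtered out.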
S7: a structural variation detection step for the sample to be tested, of performing structural variation analysis on the effective data of the DNA of the sample to be tested to obtain structural variation data, and filtering and screening the data.
Specifically, the structural variation analysis can be performed with Manta software, and the structural variation data screening condition is a breakpoint coverage of 5 or more.
S8: and detecting the insertion sequence of the sample to be detected, namely analyzing the insertion fragment according to the effective data of the DNA of the sample to be detected, distinguishing the reads covering the mutant type and the wild type, and respectively counting the distribution results of the mutant type insertion fragment and the wild type insertion fragment.
In particular, mutant and wild-type reads are distinguished by identifying the mutation site based on the cigar and flag information of the BAM file, wherein the identification of the mutation site excludes the effect of methylation signals.
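One possible way to use the CIGAR and flag fields for this purpose is sketched below in Python. It is an illustration only, not the inventors' code: the function names are hypothetical, only single-base substitutions are handled, and the way methylation signals are excluded (skipping sites whose substitution is C-to-T or G-to-A) is an assumption, since the text does not specify the mechanism.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# SAM flag bits: unmapped, secondary, duplicate, supplementary.
FLAG_SKIP = 0x4 | 0x100 | 0x400 | 0x800
CONVERSION_LIKE = {("C", "T"), ("G", "A")}


def base_at(pos, read_start, cigar, seq):
    """Return the read base aligned to 1-based reference position pos.

    Returns None if the read does not cover pos or pos falls in a deletion.
    """
    ref = read_start  # next reference position to be consumed
    idx = 0           # next index in the read sequence to be consumed
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":              # consumes reference and read
            if ref <= pos < ref + n:
                return seq[idx + (pos - ref)]
            ref += n
            idx += n
        elif op in "IS":             # consumes read only
            idx += n
        elif op in "DN":             # consumes reference only
            if ref <= pos < ref + n:
                return None
            ref += n
        # H and P consume neither reference nor read
    return None


def classify_read(flag, read_start, cigar, seq, pos, ref_base, alt_base):
    """Return 'mut', 'wild', or None for one read at one candidate site."""
    if flag & FLAG_SKIP:
        return None
    # Assumption: substitutions that mimic methylation conversion are skipped.
    if (ref_base, alt_base) in CONVERSION_LIKE:
        return None
    base = base_at(pos, read_start, cigar, seq)
    if base == alt_base:
        return "mut"
    if base == ref_base:
        return "wild"
    return None
```

Reads carrying a base that matches neither allele are left unclassified in this sketch.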
The present inventors have proposed a method for simultaneously detecting methylation level, genomic variation and insert fragments by using methylation non-bisulfite sequencing technology, and accordingly propose an apparatus for implementing the method. The block diagram of the apparatus is shown in FIG. 2, and the apparatus comprises the following modules:
M1: a sequencing data providing module, configured to provide sequencing data obtained by performing methylation non-bisulfite sequencing on a mixed sample containing the DNA of a sample to be tested, methylated positive reference DNA and methylated negative reference DNA;
M2: a sequencing data processing module, configured to process the sequencing data to obtain effective data of the methylated positive reference DNA and the methylated negative reference DNA and effective data of the DNA of the sample to be tested;
M3: a condition judgment module, configured to count, from the effective data of the methylated positive reference DNA and the methylated negative reference DNA, the methylation level α of the methylated positive reference DNA in the CpG region and the methylation level β of the methylated negative reference DNA over the whole genome, and to judge whether α and β satisfy the conditions α ≥ 95% and β < 5%; if so, the M4 module is run, and if not, the process returns to the M1 module to perform methylation non-bisulfite sequencing again;
M4: a methylation detection module for the sample to be tested, configured to perform methylation analysis on the effective data of the DNA of the sample to be tested and to count the methylation level of the DNA of the sample to be tested;
M5: a gene mutation detection module for the sample to be tested, configured to perform gene mutation analysis on the effective data of the DNA of the sample to be tested, perform gene functional region filtering and database frequency filtering on the gene mutation analysis result to obtain a first mutation set, remove methylation-converted reads according to the methylation statistical result, and filter the first mutation set using the upward floating thresholds set for CpG, CHG and CHH to obtain a final mutation set;
M6: a copy number variation detection module for the sample to be tested, configured to perform copy number variation analysis on the effective data of the DNA of the sample to be tested to obtain copy number variation data, and to filter and screen the data;
M7: a structural variation detection module for the sample to be tested, configured to perform structural variation analysis on the effective data of the DNA of the sample to be tested to obtain structural variation data, and to filter and screen the data;
M8: an insert fragment detection module for the sample to be tested, configured to perform insert fragment analysis on the effective data of the DNA of the sample to be tested, distinguish reads covering the mutant type and the wild type, and count the distributions of the mutant insert fragments and the wild-type insert fragments separately.
Further, the inventors propose a computer readable medium having stored thereon computer program instructions, wherein the method of the first aspect of the invention for simultaneous detection of methylation levels, genomic variations and inserts is performed when the computer program instructions are executed by a processor.
Still further, the present inventors propose an apparatus for carrying out the method of the first aspect of the present invention for simultaneously detecting methylation levels, genomic variations and inserts, the apparatus comprising:
a memory for storing computer program instructions, and
a processor for executing the computer program instructions,
wherein the computer program instructions, when executed by the processor, cause the apparatus to perform the method of the first aspect of the invention for simultaneous detection of methylation levels, genomic variations and inserts.
The present invention is further illustrated by the following examples and comparative examples.
Example 1: Application example of the method of the invention
Step S1:
100 ng of human blood cfDNA, 0.2 ng of methylated pUC19 DNA (methylated positive reference DNA), and unmethylated lambda DNA (methylated negative reference DNA) were mixed, fragmented, and subjected to methylation non-bisulfite sequencing. The sequencing library was constructed according to Example 4 of the Chinese patent application entitled "Whole genome methylation non-bisulfite sequencing library and construction", filed by the applicant of the present invention under application number CN201911159400, and sequencing was performed on a Gene+Seq platform. After sequencing, the off-machine data L1_r1.fq.gz, L1_r2.fq.gz, L1_r1.clean.fq.gz, L2_r2.clean.fq.gz were obtained.
Step S2:
The off-machine data were subjected to adapter removal and quality filtering with the command fastp -i R1.fq.gz -I R2.fq.gz -o R1.clean.fq.gz -O R2.clean.fq.gz to obtain filtered sequence data, as shown in the following table:
Figure BDA0002617973990000101
Then, the filtered sequence data were aligned to the positive reference, the negative reference, and the human reference genome with the command bwa mem -o test.bam -M hs.fa L1_r1.clean.fq.gz L1_r2.clean.fq.gz to obtain alignment files (BAM format), and an index was built with the samtools index command. The alignment files from multiple lanes were merged with samtools merge, the merged file was sorted with samtools sort, and the alignment results were counted, as shown in the following table:
Figure BDA0002617973990000102
Then, duplicate sequences generated by PCR were removed with the command java -Xmx20G -Djava.io.tmpdir=./ -jar picard.jar MarkDuplicates I=test.bam O=test.mark.bam M=a.metrics, and the results were counted, as shown in the following table:
Figure BDA0002617973990000103
Step S3:
The methylation level α of the methylated positive reference in the CpG region and the methylation level β of the methylated negative reference over the whole genome were counted with astair call. The results satisfy the conditions α ≥ 95% and β < 5%, as shown in the following table:
Sample α β
2003270172PD 95.31% 0.67%
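The condition judgment of step S3 amounts to a two-sided threshold check, sketched below in Python. The function name passes_conversion_qc is hypothetical; the thresholds are those stated in the text, expressed as fractions.

```python
def passes_conversion_qc(alpha, beta, alpha_min=0.95, beta_max=0.05):
    """Return True if the control methylation levels satisfy step S3.

    alpha: CpG methylation level of the methylated positive reference.
    beta:  genome-wide methylation level of the methylated negative reference.
    Both are fractions between 0 and 1.
    """
    return alpha >= alpha_min and beta < beta_max


# Values from the table above for sample 2003270172PD.
if passes_conversion_qc(0.9531, 0.0067):
    next_step = "S4"  # continue with methylation detection of the sample
else:
    next_step = "S1"  # repeat methylation non-bisulfite sequencing
```

For the sample in the table both conditions hold, so the analysis proceeds to step S4.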
Step S4:
The sample methylation levels were measured with astair call, and part of the statistical results are shown in the following table:
Figure BDA0002617973990000111
Then, based on the methylation detection result of the sample, the methylation levels of the different sequence contexts of the genome were counted, as shown in the following table:
Sample CpG CHG CHH C
2003270172PD 64.32% 0.35% 0.31% 2.58%
Step S5:
Sample gene mutation analysis was performed on the effective data of the DNA of the sample to be tested with samtools + bcftools; gene functional region filtering and database frequency filtering were applied to obtain a first mutation result, and methylation noise filtering was then applied to obtain the final mutation result, as shown in the following table:
Figure BDA0002617973990000112
Step S6:
The command freec -conf config_wgs.txt was called to perform copy number variation analysis, filtering, and screening, giving the final copy number analysis result shown in the following table:
Sample Gene CNV Type
2003270172PD ERBB2 4.054828 Gain
Step S7:
The command configManta was called to perform structural variation detection and screening.
Step S8:
Insert fragment analysis was performed with custom-written code: insert size information was extracted from the BAM file (column 9), mutant and wild-type reads were distinguished according to the CIGAR and flag information, two files were output (one with the insert sizes of mutant reads and one with the insert sizes of wild-type reads), and a density distribution plot of the inserts was then drawn (with the mutant and wild-type distributions in one plot).
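The extraction of insert sizes from column 9 can be sketched in Python as follows. This is not the custom code referred to above; the function name collect_insert_sizes and the read_groups mapping (read name to "mut" or "wild") are hypothetical, and the sketch assumes the alignments are available as SAM text lines, for example from samtools view. Counting each pair once through its first read, and requiring a properly paired alignment, are also assumptions.

```python
from collections import defaultdict

FLAG_PROPER_PAIR = 0x2
FLAG_READ1 = 0x40
# Unmapped, secondary, duplicate, supplementary alignments are skipped.
FLAG_SKIP = 0x4 | 0x100 | 0x400 | 0x800


def collect_insert_sizes(sam_lines, read_groups):
    """Group absolute template lengths (SAM column 9) by read group.

    read_groups maps a read name to a label such as 'mut' or 'wild'.
    """
    sizes = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):        # header line
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[0]
        flag = int(fields[1])
        tlen = int(fields[8])           # column 9: signed template length
        if flag & FLAG_SKIP:
            continue
        if not (flag & FLAG_PROPER_PAIR) or not (flag & FLAG_READ1):
            continue                    # count each proper pair once
        group = read_groups.get(name)
        if group is not None and tlen != 0:
            sizes[group].append(abs(tlen))
    return dict(sizes)
```

The two resulting lists correspond to the two output files described above and can be passed to any plotting tool to draw the two density curves in one plot.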
The analysis results of steps S1 to S8 can be obtained by calling the software Multi_analyse; the specific command is as follows:
perl Multi_analyse \
--ref_pUC19 pUC19.fa \
--ref_lambda lambda.fasta \
--ref hs37d5.fa \
--bed hs37d5.region.bed \
--chrFile chromosome/ \
sample.list
The content of sample.list is as follows:
MachineNo:Geneseq2000
SampleNo IdNo SampleTypecase
case 2003270172PD P 2003270172PD_HUM_C_GC0C_4014_Z_0_A,2020-04-19
Run results:
Methylation detection results:
Type Methylation level Methylation depth All depth
CpG 69.78% 28197854 40407938
CHG 0.44% 1016117 2.3E+08
CHH 0.40% 3294378 8.24E+08
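The table above is consistent with the methylation level being the methylated depth divided by the total depth in each context. The Python sketch below shows that calculation together with a simple context classification. The function names are hypothetical, and the classification only looks at the forward strand, which is a simplification.

```python
def methylation_level(methylated_depth, total_depth):
    """Fraction of covering reads that support methylation, or None if uncovered."""
    if total_depth == 0:
        return None
    return methylated_depth / total_depth


def cytosine_context(seq, i):
    """Classify the cytosine at index i of seq as 'CpG', 'CHG' or 'CHH'.

    H is any base other than G. Returns None if seq[i] is not C, if too few
    downstream bases are available, or if they are not A, C, G or T.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    nxt = seq[i + 1:i + 3]
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CpG"
    if len(nxt) < 2 or any(b not in "ACGT" for b in nxt):
        return None
    return "CHG" if nxt[1] == "G" else "CHH"


# CpG row of the table above: 28197854 methylated out of 40407938 total.
cpg_percent = round(100 * methylation_level(28197854, 40407938), 2)
```

With the CpG depths from the table, cpg_percent evaluates to 69.78, matching the reported methylation level.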
Gene mutation detection results (partial):
Chr Pos Ref Alt Info
1 16893254 A G 0.16|(12,14,3,2)
1 1.21E+08 T C 0.24|(38,51,9,19)
1 1.43E+08 A C 0.28|(19,9,8,3)
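The Info column is not explained in the text. The values shown are numerically consistent with the form frequency|(ref forward, ref reverse, alt forward, alt reverse), that is, with the four counts being ordered like the DP4 tag written by bcftools; this reading is an inference, not something the source states. Under that assumption, the frequency can be recomputed as in the following Python sketch, where parse_info and alt_frequency are hypothetical names.

```python
def parse_info(info):
    """Split an Info string such as '0.16|(12,14,3,2)' into (frequency, counts)."""
    freq_text, counts_text = info.split("|")
    counts = tuple(int(x) for x in counts_text.strip("()").split(","))
    return float(freq_text), counts


def alt_frequency(counts):
    """Mutant read fraction, assuming counts = (ref_fwd, ref_rev, alt_fwd, alt_rev)."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = counts
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    if total == 0:
        return None
    return (alt_fwd + alt_rev) / total


reported, counts = parse_info("0.16|(12,14,3,2)")
recomputed = round(alt_frequency(counts), 2)
```

For the first row, five mutant reads out of 31 give a recomputed value of 0.16, equal to the reported frequency; the other two rows agree in the same way.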
Further, FIG. 3 shows that the gene mutation screening in the method of the present invention removes false positive results introduced by methylation signals and insufficient coverage, with a filtering efficiency of over 99.66%.
Copy number variation detection results (partial):
Chr Start End CNV
22 16245000 16344000 loss
22 16344000 16353000 gain
22 16353000 16851000 loss
22 16851000 17037000 gain
Structural variation detection results (partial):
CHROM_A START_A END_A CHROM_B START_B END_B TYPE
2 79076569 79076572 2 79081205 79081208 DEL
2 179301045 179301046 2 179306335 179306336 DEL
3 20342979 20342980 hs37d5 9106407 9106408 BND
3 80064469 80064475 3 80065106 80065107 BND
3 80065101 80065107 3 80064474 80064475 BND
3 80325189 80325190 6 121481643 121481644 BND
4 115928718 115928723 4 115931872 115931877 DEL
6 121481643 121481644 3 80325189 80325190 BND
Insert fragment analysis results (partial):
Figure BDA0002617973990000131
Figure BDA0002617973990000141
Note: the Mut group is the set of inserts covering the mutation site; the Wild group is the set of inserts covering non-mutated sites.
FIG. 4 further shows that, because the method removes the effect of methylation background noise, insert fragment analysis can effectively distinguish mutant and wild-type fragments.
Comparative example 1: comparison of methylation non-bisulfite sequencing and Whole genome methylation bisulfite sequencing
100 ng of human blood cfDNA was taken and subjected to methylation non-bisulfite sequencing (OMAS) and whole genome methylation bisulfite sequencing (WGBS), respectively, and the data efficiency (effective_rate) and the reference genome alignment rate (mapping rate) were compared.
Methylation non-bisulfite sequencing was performed with reference to step S1 of Example 1. The whole genome methylation bisulfite sequencing (WGBS) library was constructed with reference to the whole genome methylation sequencing library and construction method disclosed in Chinese patent document CN104532360B, in which the "C" to "T" conversion is carried out by bisulfite conversion; the sequencing platform was Illumina (HiSeq). The results are shown in FIG. 5: when the method of the present invention employs OMAS, the data efficiency and the human reference genome alignment rate are about 93% and 95%, respectively, significantly higher than the data efficiency of about 84% and the human reference genome alignment rate of about 85% of WGBS. This is because WGBS converts unmethylated C in the genome to T before library sequencing. The resulting base imbalance lowers the base quality of the sequencing data, so the data efficiency is low. In addition, in the alignment to the human reference genome, the Bismark software aligns the data separately to a human genome converted from C to T and a human genome converted from G to A, so the alignment rate is also low.
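The in-silico conversion mentioned above can be illustrated with a short Python sketch. It only shows what a C-to-T converted and a G-to-A converted reference look like; it is not the Bismark implementation, and the function name convert_reference is hypothetical.

```python
# Translation tables that preserve upper and lower case.
C_TO_T = str.maketrans("Cc", "Tt")
G_TO_A = str.maketrans("Gg", "Aa")


def convert_reference(seq):
    """Return the (C-to-T converted, G-to-A converted) versions of a sequence."""
    return seq.translate(C_TO_T), seq.translate(G_TO_A)


def distinct_bases(seq):
    """Number of different bases used by a sequence."""
    return len(set(seq.upper()))


c2t, g2a = convert_reference("ACGTACGGTC")
```

Each converted reference uses three bases instead of four, so reads are aligned in a reduced alphabet, which is one commonly cited reason why fewer reads map uniquely in bisulfite-based workflows.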
This comparative example demonstrates that, because the principle of the C-to-T conversion is different, the data efficiency and the human reference genome alignment rate of OMAS are significantly higher than those of WGBS. Therefore, by adopting OMAS, the method of the invention greatly reduces the damage to DNA during base conversion, enables simultaneous detection of methylation level, genomic variation and insert fragments, and also improves the data efficiency, the reference genome alignment rate, and the detection accuracy.
The present invention has been described above using specific examples, which are only for the purpose of facilitating understanding of the present invention, and are not intended to limit the present invention. Numerous simple deductions, modifications or substitutions may be made by those skilled in the art in light of the teachings of the present invention. Such deductions, modifications or alternatives also fall within the scope of the claims of the present invention.

Claims (10)

1. A method for simultaneously detecting methylation level, genomic variation and insert fragments, wherein the genomic variation includes gene mutation, copy number variation and structural variation, the method comprising the following steps:
S1: a sequencing data providing step of providing sequencing data obtained by performing methylation non-bisulfite sequencing on a mixed sample containing the DNA of a sample to be tested, methylated positive reference DNA and methylated negative reference DNA;
S2: a sequencing data processing step of processing the sequencing data to obtain effective data of the methylated positive reference DNA and the methylated negative reference DNA and effective data of the DNA of the sample to be tested;
S3: a condition judgment step of counting, from the effective data of the methylated positive reference DNA and the methylated negative reference DNA, the methylation level α of the methylated positive reference DNA in the CpG region and the methylation level β of the methylated negative reference DNA over the whole genome, and judging whether α and β satisfy the conditions α ≥ 95% and β < 5%; if so, performing step S4, and if not, returning to step S1 to perform methylation non-bisulfite sequencing again;
S4: a methylation detection step for the sample to be tested, of performing methylation analysis on the effective data of the DNA of the sample to be tested and counting the methylation level of the DNA of the sample to be tested;
S5: a gene mutation detection step for the sample to be tested, of performing gene mutation analysis on the effective data of the DNA of the sample to be tested, performing gene functional region filtering and database frequency filtering on the gene mutation analysis result to obtain a first mutation set, removing methylation-converted reads according to the methylation statistical result, and filtering the first mutation set using the upward floating thresholds set for CpG, CHG and CHH to obtain a final mutation set;
S6: a copy number variation detection step for the sample to be tested, of performing copy number variation analysis on the effective data of the DNA of the sample to be tested to obtain copy number variation data, and filtering and screening the data;
S7: a structural variation detection step for the sample to be tested, of performing structural variation analysis on the effective data of the DNA of the sample to be tested to obtain structural variation data, and filtering and screening the data;
S8: an insert fragment detection step for the sample to be tested, of performing insert fragment analysis on the effective data of the DNA of the sample to be tested, distinguishing reads covering the mutant type and the wild type, and counting the distributions of the mutant insert fragments and the wild-type insert fragments separately;
preferably, the DNA of the sample to be tested is human somatic DNA, and the methylated positive reference DNA and the methylated negative reference DNA are DNA of species other than human.
2. The method of claim 1, wherein in step S2, processing the sequencing data comprises: subjecting the sequencing data to adapter removal and low-quality sequence filtering, wherein the low-quality sequence filtering condition is that low-quality bases with a quality value < 15 do not exceed 50% of the sequence; aligning the filtered sequence data to the human reference genome, the methylated positive reference DNA and the methylated negative reference DNA, respectively, to obtain alignment files, and building an index; merging and sorting the alignment files obtained from multiple lanes; and removing the duplicate sequences generated by PCR from the merged alignment file to obtain the effective data.
3. The method of claim 1, wherein in step S4, the sample methylation analysis result comprises methylation information for all C bases, including genomic position information, methylated coverage depth, unmethylated coverage depth, and methylation frequency; the sample methylation statistics comprise the methylation levels of CpG dinucleotide and non-CpG dinucleotide regions, wherein the non-CpG dinucleotide regions comprise CHH sites and CHG sites, wherein H is a non-G base.
4. The method of claim 1, wherein in step S5, the sample gene mutation analysis result comprises genomic site information and mutation information; the gene functional region filtering retains only exon regions and missense, nonsense and frameshift mutations; the database frequency filtering removes mutations whose frequency in the 1000 Genomes Project is greater than or equal to 0.001; the upward floating threshold of CpG is 0.1, and the upward floating threshold of CHG and CHH is 0.05.
5. The method of claim 1, wherein in step S6, the copy number variation data filtering condition is ratio > 2 or ratio < 0.5.
6. The method of claim 1, wherein in step S7, the structural variation data screening condition is a breakpoint coverage greater than or equal to 5.
7. The method of claim 1, wherein in step S8, mutant and wild-type reads are distinguished by identifying the mutation site from the CIGAR and flag information of the BAM file, wherein the identification of the mutation site excludes the effect of methylation signals.
8. An apparatus for performing the method for simultaneously detecting methylation level, genomic variation and insert fragments according to any one of claims 1-7, wherein the genomic variation includes gene mutation, copy number variation and structural variation, characterized in that the apparatus comprises the following modules:
M1: a sequencing data providing module, configured to provide sequencing data obtained by performing methylation non-bisulfite sequencing on a mixed sample containing the DNA of a sample to be tested, methylated positive reference DNA and methylated negative reference DNA;
M2: a sequencing data processing module, configured to process the sequencing data to obtain effective data of the methylated positive reference DNA and the methylated negative reference DNA and effective data of the DNA of the sample to be tested;
M3: a condition judgment module, configured to count, from the effective data of the methylated positive reference DNA and the methylated negative reference DNA, the methylation level α of the methylated positive reference DNA in the CpG region and the methylation level β of the methylated negative reference DNA over the whole genome, and to judge whether α and β satisfy the conditions α ≥ 95% and β < 5%; if so, the M4 module is run, and if not, the process returns to the M1 module to perform methylation non-bisulfite sequencing again;
M4: a methylation detection module for the sample to be tested, configured to perform methylation analysis on the effective data of the DNA of the sample to be tested and to count the methylation level of the DNA of the sample to be tested;
M5: a gene mutation detection module for the sample to be tested, configured to perform gene mutation analysis on the effective data of the DNA of the sample to be tested, perform gene functional region filtering and database frequency filtering on the gene mutation analysis result to obtain a first mutation set, remove methylation-converted reads according to the methylation statistical result, and filter the first mutation set using the upward floating thresholds set for CpG, CHG and CHH to obtain a final mutation set;
M6: a copy number variation detection module for the sample to be tested, configured to perform copy number variation analysis on the effective data of the DNA of the sample to be tested to obtain copy number variation data, and to filter and screen the data;
M7: a structural variation detection module for the sample to be tested, configured to perform structural variation analysis on the effective data of the DNA of the sample to be tested to obtain structural variation data, and to filter and screen the data;
M8: an insert fragment detection module for the sample to be tested, configured to perform insert fragment analysis on the effective data of the DNA of the sample to be tested, distinguish reads covering the mutant type and the wild type, and count the distributions of the mutant insert fragments and the wild-type insert fragments separately.
9. A computer readable medium storing computer program instructions, wherein the computer program instructions, when executed by a processor, perform the method of simultaneously detecting methylation levels, genomic variations and inserts of any one of claims 1-7.
10. An apparatus for carrying out the method for simultaneous detection of methylation levels, genomic variations and inserts according to any one of claims 1-7, characterized in that the apparatus comprises:
a memory for storing computer program instructions, and
a processor for executing the computer program instructions,
wherein the computer program instructions, when executed by the processor, cause the apparatus to perform the method of any one of claims 1-7 for simultaneous detection of methylation levels, genomic variations and inserts.
CN202010774753.3A 2020-08-04 2020-08-04 Method and device for simultaneously detecting methylation level, genome variation and insertion fragment Active CN111755072B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010774753.3A CN111755072B (en) 2020-08-04 2020-08-04 Method and device for simultaneously detecting methylation level, genome variation and insertion fragment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010774753.3A CN111755072B (en) 2020-08-04 2020-08-04 Method and device for simultaneously detecting methylation level, genome variation and insertion fragment

Publications (2)

Publication Number Publication Date
CN111755072A true CN111755072A (en) 2020-10-09
CN111755072B CN111755072B (en) 2021-02-02

Family

ID=72713104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010774753.3A Active CN111755072B (en) 2020-08-04 2020-08-04 Method and device for simultaneously detecting methylation level, genome variation and insertion fragment

Country Status (1)

Country Link
CN (1) CN111755072B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112289376A (en) * 2020-10-26 2021-01-29 深圳基因家科技有限公司 Method and device for detecting somatic cell mutation
CN112634984A (en) * 2020-12-29 2021-04-09 北京吉因加医学检验实验室有限公司 Method, device and storage medium for simultaneously detecting DNA methylation and genome variation
CN113817723A (en) * 2021-09-28 2021-12-21 深圳吉因加医学检验实验室 Polynucleotide and standard substance, kit and application thereof
CN115064211A (en) * 2022-08-15 2022-09-16 臻和(北京)生物科技有限公司 ctDNA prediction method based on whole genome methylation sequencing and application thereof
CN115910211A (en) * 2022-12-15 2023-04-04 广州女娲生命科技有限公司 Method and device for analyzing and detecting DNA (deoxyribonucleic acid) before embryo implantation

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1370281A2 (en) * 2001-03-16 2003-12-17 K.U.Leuven Research & Development Human growth hormone for treating children with abnormal short stature and kits and methods for diagnosing gs protein dysfunctions
CN102216456A (en) * 2008-09-16 2011-10-12 塞昆纳姆股份有限公司 Processes and compositions for methylation-based enrichment of fetal nucleic acid from a maternal sample useful for non invasive prenatal diagnoses
CN104532360A (en) * 2014-12-17 2015-04-22 北京诺禾致源生物信息科技有限公司 Whole-genome methylation sequencing library and construction method thereof
CN106103743A (en) * 2014-01-07 2016-11-09 Imppc私人基金会 For producing the method in double-stranded DNA library and for identifying the sequence measurement of methylated cytosine
CN107451419A (en) * 2017-07-14 2017-12-08 浙江大学 It is a kind of that the method for simplifying DNA methylation sequencing data is produced by computer program simulation
US20180119218A1 (en) * 2016-10-06 2018-05-03 The Board Of Trustees Of The University Of Illinois Spatial Molecular Analysis of Tissue
CN108949972A (en) * 2017-05-19 2018-12-07 香港中文大学 Tumor suppressor gene REC8 as biomarker for cancer
CN109416928A (en) * 2016-06-07 2019-03-01 伊路米纳有限公司 For carrying out the bioinformatics system, apparatus and method of second level and/or tertiary treatment
EP2807170B1 (en) * 2012-01-25 2019-03-13 Dicot AB Phragmalin limonoids for the treatment of sexual dysfunction
CN109982710A (en) * 2016-09-13 2019-07-05 杰克逊实验室 Target the DNA demethylation of enhancing
CN110777161A (en) * 2019-11-01 2020-02-11 中国林业科学研究院林业研究所 Method for creating phenotypic variation transgenic plant by using methyltransferase gene
CN110820050A (en) * 2019-11-22 2020-02-21 北京吉因加科技有限公司 Whole genome methylation non-bisulfite sequencing library and construction
CN111052250A (en) * 2017-06-28 2020-04-21 西奈山伊坎医学院 High resolution microbiological analysis method

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1370281A2 (en) * 2001-03-16 2003-12-17 K.U.Leuven Research & Development Human growth hormone for treating children with abnormal short stature and kits and methods for diagnosing gs protein dysfunctions
CN102216456A (en) * 2008-09-16 2011-10-12 塞昆纳姆股份有限公司 Processes and compositions for methylation-based enrichment of fetal nucleic acid from a maternal sample useful for non invasive prenatal diagnoses
EP2807170B1 (en) * 2012-01-25 2019-03-13 Dicot AB Phragmalin limonoids for the treatment of sexual dysfunction
CN106103743A (en) * 2014-01-07 2016-11-09 Imppc私人基金会 For producing the method in double-stranded DNA library and for identifying the sequence measurement of methylated cytosine
CN104532360A (en) * 2014-12-17 2015-04-22 北京诺禾致源生物信息科技有限公司 Whole-genome methylation sequencing library and construction method thereof
CN109416928A (en) * 2016-06-07 2019-03-01 伊路米纳有限公司 For carrying out the bioinformatics system, apparatus and method of second level and/or tertiary treatment
CN109982710A (en) * 2016-09-13 2019-07-05 杰克逊实验室 Target the DNA demethylation of enhancing
US20180119218A1 (en) * 2016-10-06 2018-05-03 The Board Of Trustees Of The University Of Illinois Spatial Molecular Analysis of Tissue
CN108949972A (en) * 2017-05-19 2018-12-07 香港中文大学 Tumor suppressor gene REC8 as biomarker for cancer
CN111052250A (en) * 2017-06-28 2020-04-21 西奈山伊坎医学院 High resolution microbiological analysis method
CN107451419A (en) * 2017-07-14 2017-12-08 浙江大学 It is a kind of that the method for simplifying DNA methylation sequencing data is produced by computer program simulation
CN110777161A (en) * 2019-11-01 2020-02-11 中国林业科学研究院林业研究所 Method for creating phenotypic variation transgenic plant by using methyltransferase gene
CN110820050A (en) * 2019-11-22 2020-02-21 北京吉因加科技有限公司 Whole genome methylation non-bisulfite sequencing library and construction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GEORGY A. ROMANOV等: "Arginine CGA codons as a source of nonsense mutations: a possible role in multivariant gene expression, control of mRNA quality, and aging", 《MOL GENET GENOMICS》 *
STEVEN R. EICHTEN等: "Epigenetic and Genetic Influences on DNA Methylation Variation in Maize Populations", 《THE PLANT CELL》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112289376A (en) * 2020-10-26 2021-01-29 深圳基因家科技有限公司 Method and device for detecting somatic cell mutation
CN112289376B (en) * 2020-10-26 2021-07-06 北京吉因加医学检验实验室有限公司 Method and device for detecting somatic cell mutation
CN112634984A (en) * 2020-12-29 2021-04-09 北京吉因加医学检验实验室有限公司 Method, device and storage medium for simultaneously detecting DNA methylation and genome variation
CN112634984B (en) * 2020-12-29 2021-09-28 北京吉因加医学检验实验室有限公司 Method, device and storage medium for simultaneously detecting DNA methylation and genome variation
CN113817723A (en) * 2021-09-28 2021-12-21 深圳吉因加医学检验实验室 Polynucleotide and standard substance, kit and application thereof
CN115064211A (en) * 2022-08-15 2022-09-16 臻和(北京)生物科技有限公司 ctDNA prediction method based on whole genome methylation sequencing and application thereof
CN115910211A (en) * 2022-12-15 2023-04-04 广州女娲生命科技有限公司 Method and device for analyzing and detecting DNA (deoxyribonucleic acid) before embryo implantation
CN115910211B (en) * 2022-12-15 2024-03-22 广州女娲生命科技有限公司 Method and device for analyzing and detecting DNA before embryo implantation

Also Published As

Publication number Publication date
CN111755072B (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN111755072B (en) Method and device for simultaneously detecting methylation level, genome variation and insertion fragment
US10127351B2 (en) Accurate and fast mapping of reads to genome
CN112397144B (en) Method and device for detecting gene mutation and expression quantity
US20190233883A1 (en) Methods and compositions for analyzing nucleic acid
CN110211633B (en) Detection method for MGMT gene promoter methylation, processing method for sequencing data and processing device
CN110033829B (en) Fusion detection method of homologous genes based on differential SNP markers
WO2013075629A1 (en) Method for detecting hydroxylmethylation modification in nucleic acid and use thereof
CN110343748B (en) Method for analyzing tumor mutation load based on high-throughput targeted sequencing
CN110189796A (en) A kind of sheep full-length genome resurveys sequence analysis method
CN109584957B (en) Detection kit for capturing α thalassemia related gene copy number
CN108304694B (en) Method for analyzing gene mutation based on second-generation sequencing data
CN115803447A (en) Detection of structural variation in chromosome proximity experiments
CN115305290A (en) Chicken liquid chip and application thereof
CN112634984B (en) Method, device and storage medium for simultaneously detecting DNA methylation and genome variation
CN105528532B (en) A kind of characteristic analysis method in rna editing site
CN107885972B (en) Fusion gene detection method based on single-ended sequencing and application thereof
CN113373234A (en) Small cell lung cancer molecular typing determination method based on mutation characteristics and application
CN110373458B (en) Kit and analysis system for thalassemia detection
CN111575349A (en) Linker sequence and application thereof
WO2013097060A1 (en) Method for analyzing dna methylation based on mspji cleavage
WO2013097328A1 (en) Method and device for tagging genomic indel site
CN113564266B (en) SNP typing genetic marker combination, detection kit and application
CN102776270A (en) Method and device for detecting DNA methylation
CN109979534B (en) C site extraction method and device
CN108949945B (en) Sequencing library with single base resolution for detecting DNA methylation and single nucleotide variation and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant