CN117409856B - Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data - Google Patents

Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data

Info

Publication number
CN117409856B
Authority
CN
China
Prior art keywords
copy number
sample
site
sites
sequencing data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311392297.6A
Other languages
Chinese (zh)
Other versions
CN117409856A (en
Inventor
翟兵兵
刘建红
邓涛
孙立超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Capitalbio Medlab Co ltd
Original Assignee
Beijing Capitalbio Medlab Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Capitalbio Medlab Co ltd filed Critical Beijing Capitalbio Medlab Co ltd
Priority to CN202311392297.6A priority Critical patent/CN117409856B/en
Publication of CN117409856A publication Critical patent/CN117409856A/en
Application granted granted Critical
Publication of CN117409856B publication Critical patent/CN117409856B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides a mutation detection method, system, and storage medium based on second-generation sequencing data of a targeted gene region from a single test sample, and relates to the field of bioinformatics analysis. The mutation detection method comprises the following steps: acquiring the sequence of a targeted gene region; designing probes covering the entire exon region, copy number variation detection probes, and minor allele frequency shift detection probes; capturing target sites with these probes and then performing second-generation sequencing; and, based on the differences between the test sample and a reference sample in the sequencing data of the target sites, simultaneously detecting copy number variation, loss of heterozygosity, single nucleotide variation, and insertion-deletion (indel) mutation in the single test sample. The method draws on current research and on comprehensive, reliable data, provides researchers and clinicians in the field of precision medicine with an efficient and reliable research method, and has important scientific and clinical value.

Description

Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data
Technical Field
The invention belongs to the field of biological analysis, and in particular relates to a mutation detection method, system, device, and computer-readable storage medium based on second-generation sequencing data of a targeted gene region from a single test sample, and to applications of the method.
Background
Currently, the main molecular genetic techniques for detecting loss of heterozygosity (LOH) include fluorescence in situ hybridization (FISH), single nucleotide polymorphism microarray (SNP array), array comparative genomic hybridization (aCGH), whole-genome sequencing (WGS), whole-exome sequencing (WES), multiplex ligation-dependent probe amplification (MLPA), and the like. Each technique is suited to particular variant types and has its own advantages and disadvantages. At present, it is difficult to detect loss of heterozygosity (LOH), copy number variation (CNV), single nucleotide variation (SNV), and insertion-deletion (InDel) mutation simultaneously within a single gene region at limited cost.
For example, FISH is the current gold standard for pathological examination of gene CNV, but it detects only copy number variation and cannot detect single nucleotide variation or indel mutation; it also involves many steps and is prone to signal loss, which causes false negative results. SNP array can detect LOH and CNV, but cannot detect SNV or short InDel. Copy number variation sequencing (CNV-seq) can only detect CNV. Quantitative PCR (qPCR) cannot detect LOH. WES and WGS cover many genes, so detection cost is too high, and low-depth sequencing cannot detect low-frequency SNV and InDel. MLPA cannot detect short InDel.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a method for simultaneously detecting loss of heterozygosity, copy number variation, single nucleotide variation, and indel mutation in a specific gene region of a single test sample.
The application discloses a mutation detection method based on second-generation sequencing data of a targeted gene region from a single test sample, comprising the following steps:
S1: acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the entire exon region of the targeted gene and the flanking regions on both sides;
S2: designing, based on the sequence of the targeted gene region, probes covering the entire exon region, copy number variation detection probes, and minor allele frequency shift detection probes; the probes covering the entire exon region are used to detect single nucleotide variation and/or indel mutation, the copy number variation detection probes are used to detect copy number variation, and the minor allele frequency shift detection probes are used to detect loss of heterozygosity;
S3: capturing target sites with the probes covering the entire exon region, the copy number variation detection probes, and the minor allele frequency shift detection probes, and then performing second-generation sequencing to obtain sequencing data of the target sites;
S4: based on the differences between the test sample and a reference sample in the sequencing data of the target sites, simultaneously detecting copy number variation, loss of heterozygosity, single nucleotide variation, and indel mutation in the single test sample;
S5: outputting the variation type of the test sample based on the detection results.
Further, the detection of single nucleotide variation and indel mutation comprises:
performing S1;
S21: designing probes covering the entire exon region based on the sequence of the targeted gene region;
S31: capturing protein-coding sites with the probes covering the entire exon region, and then performing second-generation sequencing to obtain sequencing data of the protein-coding sites;
S41: obtaining the entire exon regions of the test sample and the reference sample from the sequencing data of the protein-coding sites, taking the entire exon region of the reference sample as the baseline, and detecting single nucleotide variation and indel mutation in the test sample with analysis software; preferably, single nucleotide variation and indel mutation in the test sample are detected with Sentieon;
performing S5.
Further, the detection of copy number variation comprises:
performing S1;
S22: designing copy number variation detection probes based on the sequence of the targeted gene region;
S32: capturing copy number variation candidate sites with the copy number variation detection probes, and then performing second-generation sequencing to obtain sequencing data of the copy number variation candidate sites;
S42: detecting copy number variation in the test sample with analysis software based on the sequencing data of the copy number variation candidate sites; preferably, the copy number variation of the targeted gene region of the test sample is calculated with a sliding window;
performing S5.
Further, calculating the copy number variation of the targeted gene region of the test sample with a sliding window comprises:
S421: dividing the targeted gene region into n windows, and assigning a specific weight to each window based on the sequencing data of the copy number variation candidate sites of the reference sample;
S422: calculating the average gene copy ratio of the targeted gene region, denoted log2P, based on the number n of sliding windows of the reference sample and the weight of each window [the formula for the average gene copy ratio is not reproduced in this text];
S423: calculating the normal copy number range based on the average gene copy ratio of the targeted gene region of the reference sample, wherein the copy number, denoted CN, is calculated as:
CN = 2^{log2P + 1}
preferably, the normal copy number range is [1.5, 2.7];
S424: calculating the copy number of the test sample; when the copy number is below the minimum of the normal range, the targeted gene segment is deleted; when the copy number is above the maximum of the normal range, the targeted gene segment is duplicated.
Further, the detection of loss of heterozygosity comprises:
performing S1;
S23: screening candidate heterozygous sites based on the sequence of the targeted gene region, and designing minor allele frequency shift detection probes based on the candidate heterozygous sites;
S33: capturing the candidate heterozygous sites with the minor allele frequency shift detection probes, and then performing second-generation sequencing to obtain sequencing data of the candidate heterozygous sites;
S43: determining, from the sequencing data of the candidate heterozygous sites, whether a minor allele frequency shift has occurred: when the number of candidate heterozygous sites whose minor allele frequency falls within the offset interval meets a threshold, a minor allele frequency shift has occurred at the heterozygous sites; if the minor allele frequency is shifted and the copy number is unchanged, the result is copy-neutral LOH; if the minor allele frequency is shifted and the copy number is lost, the result is loss of heterozygosity (heterozygous deletion); if the minor allele frequency is not shifted and the copy number is lost, the result is a homozygous deletion; preferably, a shift is called when at least 10 candidate heterozygous sites have minor allele frequencies within (0.055, 0.39) or (0.61, 0.945);
performing S5.
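The decision rule in S43 can be illustrated with a minimal Python sketch. It uses the preferred offset intervals and the threshold of 10 sites quoted in the text. The function names, the copy number state labels, and the "no LOH call" fallback are hypothetical, and reading the no-shift-plus-copy-loss case as a homozygous deletion is an interpretation rather than something the text states outright.

```python
from typing import Iterable, Tuple

# Preferred offset intervals and site-count threshold quoted in the text.
OFFSET_INTERVALS: Tuple[Tuple[float, float], ...] = ((0.055, 0.39), (0.61, 0.945))
MIN_SHIFTED_SITES = 10


def count_shifted_sites(bafs: Iterable[float]) -> int:
    """Count sites whose BAF lies strictly inside one of the offset intervals."""
    return sum(
        1
        for baf in bafs
        if any(low < baf < high for low, high in OFFSET_INTERVALS)
    )


def classify_loh(bafs: Iterable[float], copy_number_state: str) -> str:
    """Combine the BAF shift call with a copy number state.

    copy_number_state is one of 'normal', 'deletion', 'duplication'
    (hypothetical labels chosen for this sketch).
    """
    shifted = count_shifted_sites(bafs) >= MIN_SHIFTED_SITES
    if shifted and copy_number_state == "normal":
        return "copy-neutral LOH"
    if shifted and copy_number_state == "deletion":
        return "LOH with copy number loss"
    if not shifted and copy_number_state == "deletion":
        # Interpretation: no shift plus copy loss is read as homozygous deletion.
        return "homozygous deletion"
    return "no LOH call"
```

Combinations the text does not describe, such as a shift together with a duplication, fall through to the neutral label instead of being guessed.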
Further, the candidate heterozygous sites are screened by selecting single nucleotide variant sites whose minor allele frequency in a 1000 Genomes Project subpopulation is greater than or equal to a threshold;
preferably, single nucleotide variant sites whose minor allele frequency in the 1000 Genomes Project subpopulation is greater than or equal to 20% are used as candidate heterozygous sites.
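As a small illustration of this screening rule, the sketch below keeps the sites whose population minor allele frequency reaches the threshold. The function name, the dictionary input, and the placeholder site names are assumptions made for this example. The frequencies would come from a population resource such as the 1000 Genomes Project.

```python
from typing import Dict, List


def screen_candidate_sites(
    maf_by_site: Dict[str, float], threshold: float = 0.20
) -> List[str]:
    """Return site identifiers whose minor allele frequency is >= threshold."""
    return [site for site, maf in maf_by_site.items() if maf >= threshold]


# Placeholder identifiers and frequencies, not real data.
example = {"site_a": 0.35, "site_b": 0.20, "site_c": 0.05}
candidates = screen_candidate_sites(example)
```

Common variants are favoured because a site is only informative for LOH when the individual is heterozygous there, which is more likely when the minor allele is frequent in the population.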
Further, the offset interval is defined by the following steps:
S431: calculating the minor allele frequency of each candidate heterozygous site in the reference sample, denoted BAF;
S432: calculating the absolute value of the difference between BAF and 0.5, denoted AbsBAF;
S433: determining the heterozygous site value interval and the homozygous site value interval in the reference sample; the heterozygous site deviation value is the mean AbsBAF of the selected heterozygous sites plus 3 times the standard deviation, and the heterozygous site value interval is [0.5 - heterozygous site deviation value, 0.5 + heterozygous site deviation value]; the homozygous site deviation value is the difference between the mean AbsBAF of the selected homozygous sites and 3 times the standard deviation, and the homozygous site value intervals are [0, homozygous site deviation value] and [1 - homozygous site deviation value, 1];
S434: the offset interval is the complement, within [0, 1], of the union of the heterozygous site value interval and the homozygous site value intervals, namely (homozygous site deviation value, 0.5 - heterozygous site deviation value) and (0.5 + heterozygous site deviation value, 1 - homozygous site deviation value); preferably, the offset interval is (0.055, 0.39) and (0.61, 0.945).
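The complement construction in S434 reduces to simple arithmetic once the two deviation values are known. The following sketch, with hypothetical function names, builds the two offset intervals from a heterozygous and a homozygous deviation value. The example values 0.11 and 0.055 are back-calculated from the preferred intervals (0.5 - 0.39 = 0.11) and are not stated as deviation values in the text.

```python
from typing import Tuple

Interval = Tuple[float, float]


def offset_intervals(het_dev: float, hom_dev: float) -> Tuple[Interval, Interval]:
    """Return the open intervals left over in [0, 1] after removing the
    heterozygous band around 0.5 and the homozygous bands at 0 and 1."""
    lower = (hom_dev, 0.5 - het_dev)
    upper = (0.5 + het_dev, 1.0 - hom_dev)
    if lower[0] >= lower[1]:
        raise ValueError("deviation values leave no offset interval")
    return lower, upper


def in_offset_interval(baf: float, intervals: Tuple[Interval, Interval]) -> bool:
    """True when the BAF lies strictly inside either offset interval."""
    return any(low < baf < high for low, high in intervals)


intervals = offset_intervals(het_dev=0.11, hom_dev=0.055)
```

With these inputs the result matches the preferred intervals (0.055, 0.39) and (0.61, 0.945) up to floating-point rounding.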
A mutation detection system based on second-generation sequencing data of a targeted gene region from a single test sample, the system comprising:
an acquisition unit, configured to acquire the sequence of a targeted gene region, wherein the targeted gene region comprises the entire exon region of the targeted gene and the flanking regions on both sides;
a probe design unit, configured to design, based on the sequence of the targeted gene region, probes covering the entire exon region, copy number variation detection probes, and minor allele frequency shift detection probes; the probes covering the entire exon region are used to detect single nucleotide variation and/or indel mutation, the copy number variation detection probes are used to detect copy number variation, and the minor allele frequency shift detection probes are used to detect loss of heterozygosity;
a targeted sequencing unit, configured to capture target sites with the probes covering the entire exon region, the copy number variation detection probes, and the minor allele frequency shift detection probes, and then perform second-generation sequencing to obtain sequencing data of the target sites;
a detection unit, configured to simultaneously detect copy number variation, loss of heterozygosity, single nucleotide variation, and indel mutation in the single test sample based on the differences between the test sample and a reference sample in the sequencing data of the target sites;
a result output unit, configured to output the variation type of the test sample based on the detection results.
A mutation detection apparatus based on single test sample targeted gene region second generation sequencing data, the apparatus comprising: a memory and a processor;
the memory is used for storing program instructions;
the processor is used for calling program instructions, and when the program instructions are executed, the variation detection method based on the single-sample targeted gene region second-generation sequencing data is executed.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described mutation detection method based on single sample-to-be-detected targeted gene region second-generation sequencing data.
The invention has the advantages that:
1. The mutation detection method disclosed by the invention designs specific probes from the sequence of the targeted gene region and captures the entire exon region during second-generation sequencing. It thereby detects single nucleotide variation, indel mutation, copy number variation, and loss of heterozygosity simultaneously, improves the sensitivity and specificity of variation detection, reduces missed and false detections, and supports applications in fields such as precision medicine.
2. The mutation detection method disclosed by the invention designs, based on the sequence of the targeted gene region, probes covering the entire exon region and part of the flanking regions of the targeted gene, so that a range of 8000 bp can be covered. This provides comprehensive, reliable, and rich information and helps scientists and doctors make more accurate research findings, assessments, or diagnoses.
3. The mutation detection method disclosed by the invention sequences the target sites by second-generation sequencing, so that the variation types at multiple target sites are detected in one pass for a single test sample. Compared with traditional Sanger sequencing, it has higher throughput and faster detection, and is suitable for large-scale sample testing.
4. The mutation detection method disclosed by the invention detects multiple dimensions of variation, namely single nucleotide variation, indel mutation, copy number variation, and loss of heterozygosity. It gives a comprehensive and accurate assessment of the genetic variation in the test sample and helps uncover potential disease-related pathogenic variants.
5. According to the invention, the copy number change of the targeted gene region of the test sample is calculated with a sliding window, and each window is weighted according to the sequencing data of the copy number variation candidate sites of the reference sample so as to reflect the importance of different windows. The average gene copy ratio of the targeted gene region is calculated from the number of sliding windows of the reference sample and the weight of each window, so that the copy number variation of the targeted gene region in the test sample is reflected accurately and analyzed quickly, providing an efficient, accurate, and reliable tool for studying gene copy number variation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a mutation detection method based on single sample target gene region second generation sequencing data;
FIG. 2 is a schematic diagram of a mutation detection system based on single sample target gene region second generation sequencing data;
FIG. 3 is a schematic diagram of a mutation detection device based on second-generation sequencing data of a target gene region of a single sample to be detected according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a single nucleotide site variation and indel mutation detection method based on single sample target gene region second generation sequencing data;
FIG. 5 is a schematic flow chart of a copy number variation detection method based on single sample target gene region second generation sequencing data;
FIG. 6 is a schematic flow chart of a loss of heterozygosity detection method based on single sample target gene region second generation sequencing data.
Detailed Description
In order to enable those skilled in the art to better understand the present invention, the following description will make clear and complete descriptions of the technical solutions according to the embodiments of the present invention with reference to the accompanying drawings.
In some of the flows described in the specification and claims of the present invention and in the above figures, a plurality of operations appearing in a particular order are included, but it should be clearly understood that the operations may be performed in other than the order in which they appear herein or in parallel, the sequence numbers of the operations such as S101, S102, etc. are merely used to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and the "second" being different types.
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments according to the invention without any creative effort, are within the protection scope of the invention.
FIG. 1 shows a mutation detection method based on second-generation sequencing data of a targeted gene region from a single test sample, which includes:
S1: acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the entire exon region of the targeted gene and the flanking regions on both sides;
S2: designing, based on the sequence of the targeted gene region, probes covering the entire exon region, copy number variation detection probes, and minor allele frequency shift detection probes; the probes covering the entire exon region are used to detect single nucleotide variation and/or indel mutation, the copy number variation detection probes are used to detect copy number variation, and the minor allele frequency shift detection probes are used to detect loss of heterozygosity;
In one example, 69 probes were designed that cover the exon regions of the CDKN2A and CDKN2B genes and part of the flanking regions within a 300 kb range, covering 8000 bp in total; the start positions of the 69 probes on chromosome 9 and their specific sequence information are as follows:
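[Table of the 69 probes, with their start positions on chromosome 9 and their sequences, is not reproduced in this text.]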
S3: capturing target sites with the probes covering the entire exon region, the copy number variation detection probes, and the minor allele frequency shift detection probes, and then performing second-generation sequencing to obtain sequencing data of the target sites;
S4: based on the differences between the test sample and a reference sample in the sequencing data of the target sites, simultaneously detecting copy number variation, loss of heterozygosity, single nucleotide variation, and indel mutation in the single test sample;
S5: outputting the variation type of the test sample based on the detection results.
FIG. 4 shows a method for detecting single nucleotide variation and indel mutation based on second-generation sequencing data of a targeted gene region from a single test sample, which includes:
S1: acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the entire exon region of the targeted gene and the flanking regions on both sides;
S21: designing probes covering the entire exon region based on the sequence of the targeted gene region;
S31: capturing protein-coding sites with the probes covering the entire exon region, and then performing second-generation sequencing to obtain sequencing data of the protein-coding sites;
S41: obtaining the entire exon regions of the test sample and the reference sample from the sequencing data of the protein-coding sites, taking the entire exon region of the reference sample as the baseline, and detecting single nucleotide variation and indel mutation in the test sample with analysis software; preferably, single nucleotide variation and indel mutation in the test sample are detected with Sentieon;
In one embodiment, 40 normal human blood samples are collected as the baseline, and SNV and InDel in the probe coverage area are detected with the tnscope command of the Sentieon software.
S5: outputting the variation type of the test sample based on the detection results.
FIG. 5 shows a copy number variation detection method based on second-generation sequencing data of a targeted gene region from a single test sample, which includes:
S1: acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the entire exon region of the targeted gene and the flanking regions on both sides;
S22: designing copy number variation detection probes based on the sequence of the targeted gene region;
S32: capturing copy number variation candidate sites with the copy number variation detection probes, and then performing second-generation sequencing to obtain sequencing data of the copy number variation candidate sites;
S42: detecting copy number variation in the test sample with analysis software based on the sequencing data of the copy number variation candidate sites; preferably, the copy number variation of the targeted gene region of the test sample is calculated with a sliding window;
In one embodiment, calculating the copy number variation of the targeted gene region of the test sample with a sliding window comprises:
S421: dividing the targeted gene region into n windows, and assigning a specific weight to each window based on the sequencing data of the copy number variation candidate sites of the reference sample;
S422: calculating the average gene copy ratio of the targeted gene region, denoted log2P, based on the number n of sliding windows of the reference sample and the weight of each window [the formula for the average gene copy ratio is not reproduced in this text];
S423: calculating the normal copy number range based on the average gene copy ratio of the targeted gene region of the reference sample, wherein the copy number, denoted CN, is calculated as:
CN = 2^{log2P + 1}
preferably, the normal copy number range is [1.5, 2.7];
S424: calculating the copy number of the test sample; when the copy number is below the minimum of the normal range, the targeted gene segment is deleted; when the copy number is above the maximum of the normal range, the targeted gene segment is duplicated.
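Steps S423 and S424 can be sketched in a few lines of Python, reading the copy number formula as CN = 2^(log2P + 1), so that a copy ratio of log2P = 0 gives the diploid value CN = 2. The function names and the state labels are hypothetical. The default range is the preferred [1.5, 2.7] from the text.

```python
from typing import Tuple

NORMAL_RANGE: Tuple[float, float] = (1.5, 2.7)  # preferred range from the text


def copy_number(log2p: float) -> float:
    """Convert an average log2 copy ratio into a copy number: CN = 2^(log2P + 1)."""
    return 2.0 ** (log2p + 1.0)


def classify_copy_number(
    cn: float, normal_range: Tuple[float, float] = NORMAL_RANGE
) -> str:
    """Label a copy number relative to the normal range (bounds inclusive)."""
    low, high = normal_range
    if cn < low:
        return "deletion"
    if cn > high:
        return "duplication"
    return "normal"
```

For instance, a log2 ratio of -1 corresponds to a single copy and is labelled a deletion, while a log2 ratio of 1 corresponds to four copies and is labelled a duplication.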
In one embodiment, normal tissue is collected as a baseline:
40 tissue samples with no copy number change in CDKN2A or CDKN2B, as validated by FISH or ddPCR, were collected and used to build the baseline with CNVkit.
1. Calculating the copy number:
The copy number variation of the exon regions of the CDKN2A and CDKN2B genes and the 200 kb flanking regions on both sides was calculated with the CNVkit tool, with the sliding window set to 100 bp, yielding a copy ratio (log2) and a weight for each window. The average copy ratio of the gene, log2P, was then calculated over the n sliding windows [formula not reproduced in this text],
and the copy number (CN) was calculated as CN = 2^{log2P + 1}.
2. Delineating copy number loss thresholds:
Nine background genomic DNA samples (human normal B lymphocyte line genomic DNA standard NA24385, purchased from the Coriell Institute) were collected, and gene copy numbers were calculated with the method described above. The lower percentile of the calculated copy number list was 1.5 and the 99th percentile was 2.7, indicating that a copy number within [1.5, 2.7] is normal, with no copy number change. A copy number below 1.5 is a copy number deletion (Deletion), and a copy number above 2.7 is a copy number duplication (Duplication).
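The baseline procedure above can be sketched as follows. Two points are assumptions, not statements from the text. First, the formula for the average copy ratio is not reproduced here, so the sketch uses a weight-normalised mean of the per-window log2 ratios as one plausible form. Second, the lower percentile is not specified, so it is left as a parameter with a placeholder default of 1. All function names are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_mean_log2(log2_ratios: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form of log2P: sum(log2_i * w_i) / sum(w_i) over the windows."""
    if len(log2_ratios) != len(weights) or not log2_ratios:
        raise ValueError("need one weight per window and at least one window")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(r * w for r, w in zip(log2_ratios, weights)) / total


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics (q in [0, 100])."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def normal_range(
    baseline_cns: Sequence[float], low_q: float = 1.0, high_q: float = 99.0
) -> Tuple[float, float]:
    """Normal copy number range from baseline copy numbers.

    low_q = 1.0 is a placeholder assumption; the text only names the 99th percentile.
    """
    return percentile(baseline_cns, low_q), percentile(baseline_cns, high_q)
```

Each baseline sample would contribute one copy number, obtained by converting its average log2 ratio with CN = 2^(log2P + 1), and the resulting list would be passed to `normal_range`.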
S5: outputting the variation type of the test sample based on the detection results.
FIG. 6 shows a method for detecting loss of heterozygosity based on second-generation sequencing data of a targeted gene region from a single test sample, which includes:
S1: acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the entire exon region of the targeted gene and the flanking regions on both sides;
S23: screening candidate heterozygous sites based on the sequence of the targeted gene region, and designing minor allele frequency shift detection probes based on the candidate heterozygous sites;
In one embodiment, the candidate heterozygous sites are screened by selecting single nucleotide variant sites whose minor allele frequency in a 1000 Genomes Project subpopulation is greater than or equal to a threshold;
preferably, single nucleotide variant sites whose minor allele frequency in the 1000 Genomes Project subpopulation is greater than or equal to 20% are used as candidate heterozygous sites.
In one embodiment, 46 SNP sites in the exon regions of the CDKN2A and CDKN2B genes and the 200 kb flanking regions on both sides, each with a minor allele frequency above 20% in the 1000 Genomes Project subpopulation, are selected as candidate heterozygous sites, and probes are designed for these 46 SNP sites. The 46 selected sites are shown below:
rs4595216 rs10811598 rs1414238 rs1561651 rs1561652
rs1414243 rs1335505 rs1414244 rs7856941 rs7037577
rs7037277 rs4326466 rs6475580 rs12380373 rs869329
rs7027989 rs4345650 rs3927737 rs10965153 rs4478653
rs10117507 rs756641 rs4977746 rs10965186 rs10757260
rs7041637 rs2518720 rs7036656 rs3218020 rs643319
rs2151280 rs1360590 rs10120688 rs10757270 rs10738606
rs10738607 rs4977574 rs2891168 rs944796 rs944797
rs1333049 rs10757284 rs12238587 rs10811668 rs9987548
rs10811675
s33: performing second-generation sequencing after capturing candidate heterozygous sites in a targeted manner based on the secondary equivalent frequency shift detection probe to obtain sequencing data of the candidate heterozygous sites;
s43: judging whether the candidate heterozygous site generates minor frequency deviation or not based on sequencing data of the candidate heterozygous site, and when the number of the minor frequency of the candidate heterozygous site in the deviation interval meets a threshold value, generating the minor frequency deviation of the heterozygous site; if the minor allele frequency is shifted and the copy number is unchanged, the copy neutral LOH is obtained, if the minor allele frequency is shifted and the copy number is deleted, the heterozygosity is deleted, and if the minor allele frequency is not shifted and the copy number is deleted, the heterozygosity is deleted; preferably, when the number of minor frequencies of the candidate heterozygous sites is greater than or equal to 10 at (0.055,0.39) and (0.61,0.945), the minor frequencies of the heterozygous sites shift;
In one embodiment, the offset interval is defined as follows:
S431: calculating the minor allele frequency, namely BAF, of each candidate heterozygous site in the reference sample;
S432: calculating the absolute value of the difference between the BAF and 0.5, namely AbsBAF;
S433: determining the heterozygous-site value interval and the homozygous-site value intervals in the reference sample; the heterozygous-site deviation value is the mean AbsBAF of the selected heterozygous sites plus 3 times the standard deviation, and the heterozygous-site value interval is [0.5 - heterozygous-site deviation value, 0.5 + heterozygous-site deviation value]; the homozygous-site deviation value is obtained from the mean AbsBAF of the selected homozygous sites minus 3 times the standard deviation (in the embodiment below, 0.445 in AbsBAF terms, that is, a distance of 0.055 from 0 or 1), and the homozygous-site value intervals are [0, homozygous-site deviation value] and [1 - homozygous-site deviation value, 1];
S434: the offset interval is the complement, within [0, 1], of the union of the heterozygous-site value interval and the homozygous-site value intervals, namely (homozygous-site deviation value, 0.5 - heterozygous-site deviation value) and (0.5 + heterozygous-site deviation value, 1 - homozygous-site deviation value); preferably, the offset intervals are (0.055, 0.39) and (0.61, 0.945).
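Steps S431 to S434 can be illustrated with a short calculation. The sketch below takes the reference statistics reported in the embodiment (heterozygous sites: mean AbsBAF 0.04, standard deviation 0.024; homozygous sites: mean AbsBAF 0.49, standard deviation 0.015). The function name `offset_intervals` is hypothetical, and reading the homozygous boundary as 0.5 minus (mean minus 3 standard deviations) is an interpretation chosen so that it reproduces the stated value 0.055.

```python
def offset_intervals(het_mean, het_std, hom_mean, hom_std, k=3.0):
    """Return the two open BAF intervals that count as 'shifted'.

    het_*: AbsBAF statistics of heterozygous sites in reference samples.
    hom_*: AbsBAF statistics of homozygous sites in reference samples.
    """
    # Heterozygous sites lie within 0.5 +/- het_dev.
    het_dev = het_mean + k * het_std
    # Assumption: homozygous sites lie within hom_edge of 0 or 1, where
    # hom_edge converts the AbsBAF threshold back to a BAF distance.
    hom_edge = 0.5 - (hom_mean - k * hom_std)
    low = (hom_edge, 0.5 - het_dev)
    high = (0.5 + het_dev, 1.0 - hom_edge)
    return low, high


low, high = offset_intervals(0.04, 0.024, 0.49, 0.015)
print([round(x, 3) for x in low], [round(x, 3) for x in high])
```

With these inputs the unrounded inner boundaries are about 0.388 and 0.612; the text rounds the heterozygous deviation to 0.11 first, which gives the quoted 0.39 and 0.61.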
In one particular embodiment, determining whether the BAF is shifted includes:
1. Calculating BAF:
capture and sequencing are performed with the designed probes at a sequencing depth greater than 300X, and the B-allele frequency (BAF) of each of the 46 selected SNP sites is calculated;
2. Rules for filtering candidate heterozygous sites:
in single-sample detection, some of the 46 candidate heterozygous sites may be homozygous in the sample and need to be filtered out. Three FISH-verified LOH-negative FFPE samples were taken (FISH verification used the Anbiping "CDKN2A (9p21) gene probe" product, cat# f.01225-01); the BAF of each site in the 3 LOH-negative samples was calculated, and the absolute value of the difference between the BAF and 0.5 was taken as AbsBAF.
The AbsBAF values of the negative samples fall into 2 classes. One class is heterozygous sites, whose AbsBAF distribution is:
AbsBAF_mean (mean) is 0.04 and AbsBAF_std (standard deviation) is 0.024; taking the mean plus 3 times the standard deviation as the threshold gives AbsBAF_mean + 3 × AbsBAF_std = 0.11. The BAFs of true heterozygous sites are therefore distributed within [0.39, 0.61].
The other class is homozygous sites, whose AbsBAF distribution is:
AbsBAF_mean (mean) is 0.49 and AbsBAF_std (standard deviation) is 0.015; taking the mean minus 3 times the standard deviation as the threshold gives AbsBAF_mean - 3 × AbsBAF_std = 0.445. The BAFs of false-positive heterozygous sites (that is, homozygous sites) are therefore distributed within [0.945, 1] and [0, 0.055].
3. Filtering candidate heterozygous sites:
the BAFs of the 46 candidate sites of the sample are calculated, sites whose BAF falls outside both the (0.055, 0.39) interval and the (0.61, 0.945) interval are filtered out, and the number of remaining sites is counted as BAFN; if BAFN >= 10, the BAF is judged to be shifted.
The threshold of 10 is chosen for the following reason: the BAFN values calculated for the 3 FISH-verified LOH-negative samples were 9, 3 and 6, so requiring BAFN >= 10 is reasonable.
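The three steps above (compute BAF per site, keep only sites inside the offset intervals, compare the count BAFN with 10) can be sketched as follows. This is a minimal illustration under stated assumptions: BAF is taken as alternate-allele reads divided by total reads, the depth requirement is applied per site, and all function names and read counts are hypothetical.

```python
OFFSET_INTERVALS = ((0.055, 0.39), (0.61, 0.945))  # from the text
MIN_DEPTH = 300        # assumption: depth requirement applied per site
BAFN_THRESHOLD = 10    # from the text


def in_offset_interval(baf):
    """True if the BAF lies strictly inside one of the offset intervals."""
    return any(lo < baf < hi for lo, hi in OFFSET_INTERVALS)


def count_shifted_sites(site_counts, min_depth=MIN_DEPTH):
    """site_counts: iterable of (ref_reads, alt_reads) per candidate site.
    Returns BAFN, the number of sites whose BAF is in an offset interval."""
    bafn = 0
    for ref_reads, alt_reads in site_counts:
        depth = ref_reads + alt_reads
        if depth < min_depth:
            continue  # skip under-covered sites (illustrative choice)
        if in_offset_interval(alt_reads / depth):
            bafn += 1
    return bafn


def baf_is_shifted(site_counts):
    return count_shifted_sites(site_counts) >= BAFN_THRESHOLD


# Invented read counts for 46 sites: BAF 0.25 is shifted, 0.5 and 0 are not.
shifted_sample = [(300, 100)] * 12 + [(200, 200)] * 20 + [(400, 0)] * 14
normal_sample = [(300, 100)] * 6 + [(200, 200)] * 26 + [(400, 0)] * 14
print(count_shifted_sites(shifted_sample), baf_is_shifted(shifted_sample))
print(count_shifted_sites(normal_sample), baf_is_shifted(normal_sample))
```

The second toy sample yields a BAFN of 6, in the same range as the LOH-negative samples quoted above, and is therefore not called shifted.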
In one embodiment, the BAF shift determination is combined with the copy number variation result:
BAF shifted and copy number unchanged: copy-neutral LOH;
BAF shifted and copy number deleted: LOH (heterozygous deletion);
BAF not shifted and copy number deleted: homozygous deletion.
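The combination rule can be written as a small decision function. The sketch below borrows the copy number formula CN = 2^(log2P + 1) and the normal range [1.5, 2.7] from claims 4 and 5; the function names, the returned labels and the "no call" fallback for combinations the text does not cover are assumptions made for illustration.

```python
NORMAL_CN_RANGE = (1.5, 2.7)  # normal copy number range from claim 5


def copy_number(log2p):
    """Copy number from the average log2 copy ratio: CN = 2^(log2P + 1)."""
    return 2 ** (log2p + 1)


def classify(baf_shifted, cn):
    """Combine the BAF shift flag with the copy number value."""
    lo, hi = NORMAL_CN_RANGE
    deleted = cn < lo
    unchanged = lo <= cn <= hi
    if baf_shifted and unchanged:
        return "copy-neutral LOH"
    if baf_shifted and deleted:
        return "heterozygous deletion"
    if not baf_shifted and deleted:
        return "homozygous deletion"
    # Combinations not listed in the text (assumption: left uncalled).
    return "no call"


print(classify(True, copy_number(0.0)))    # CN = 2.0
print(classify(True, copy_number(-1.0)))   # CN = 1.0
print(classify(False, copy_number(-3.0)))  # CN = 0.25
```

A log2 ratio of 0 maps to two copies, which sits inside the normal range, so only the BAF shift distinguishes copy-neutral LOH from a normal result.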
S5: outputting the variation type of the sample to be detected based on the detection result.
In one particular embodiment, the method was verified as follows:
28 FISH-verified FFPE samples were collected (FISH verification used the Anbiping "CDKN2A (9p21) gene probe" product, cat. no. f.01225-01), and the FISH results were compared with the results of the present invention:
accuracy: 85.71% (65.32% -97.04%);
sensitivity: 100.00% (51.02% -100.0%);
specificity: 83.33% (71.06% -93.02%).
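The point estimates above can be checked with simple arithmetic. The source does not report the underlying counts; the values used below (4 true positives, 0 false negatives, 20 true negatives, 4 false positives) are inferred as the only integer counts consistent with 28 samples and the three reported percentages. The intervals in parentheses are not reproduced, since the source does not state how they were computed.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Inferred counts (not stated in the source): 4 FISH-positive and
# 24 FISH-negative samples, 28 in total.
metrics = diagnostic_metrics(tp=4, fp=4, tn=20, fn=0)
print({name: f"{value:.2%}" for name, value in metrics.items()})
# {'accuracy': '85.71%', 'sensitivity': '100.00%', 'specificity': '83.33%'}
```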
Fig. 2 shows a mutation detection system based on second-generation sequencing data of a targeted gene region of a single sample to be detected, which includes:
101 acquisition unit: used for acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the whole exon region of the target gene and the flanking regions on both sides;
102 probe design unit: used for designing probes covering the whole exon region, copy number variation detection probes and minor allele frequency shift detection probes based on the sequence of the targeted gene region; the probes covering the whole exon region are used for detecting single nucleotide variants and/or indel mutations, the copy number variation detection probes are used for detecting copy number variation, and the minor allele frequency shift detection probes are used for detecting loss of heterozygosity;
103 targeted sequencing unit: used for capturing target sites with the probes covering the whole exon region, the copy number variation detection probes and the minor allele frequency shift detection probes, and then performing second-generation sequencing to obtain sequencing data of the target sites;
104 detection unit: used for simultaneously detecting copy number variation, loss of heterozygosity, single nucleotide variants and indel mutations in a single sample to be detected, based on the differences between the sequencing data of the target sites in the sample to be detected and in a reference sample;
105 result output unit: used for outputting the variation type of the sample to be detected based on the detection result.
Fig. 3 shows a mutation detection apparatus, provided in an embodiment of the present invention, based on second-generation sequencing data of a targeted gene region of a single sample to be detected, the apparatus including a memory and a processor; the apparatus may further include an input device and an output device.
The memory, processor, input device, and output device may be connected by a bus or other means. FIG. 3 illustrates an example of a bus connection; wherein the memory is used for storing program instructions; the processor is used for calling program instructions, and when the program instructions are executed, the variation detection method based on the single sample target gene region second generation sequencing data is realized.
In some embodiments, the memory may be understood as any device holding a program and the processor may be understood as a device using the program.
A computer readable storage medium having stored thereon a computer program which when executed by a processor implements a variation detection method based on single sample-to-be-detected targeted gene region second generation sequencing data as described above.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working procedures of the above-described systems, apparatuses and units may refer to the corresponding procedures in the foregoing method embodiments, which are not repeated herein. In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and the like.
The computer device provided by the present invention has been described in detail above. Those skilled in the art will appreciate that the foregoing description does not limit the invention, whose scope is defined by the appended claims.

Claims (13)

1. A mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected, the method comprising:
S1: acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the whole exon region of the target gene and the flanking regions on both sides;
S2: designing probes covering the whole exon region, copy number variation detection probes and minor allele frequency shift detection probes based on the sequence of the targeted gene region; the probes covering the whole exon region are used for detecting single nucleotide variants and/or indel mutations, the copy number variation detection probes are used for detecting copy number variation, and the minor allele frequency shift detection probes are used for detecting loss of heterozygosity;
S3: capturing target sites with the probes covering the whole exon region, the copy number variation detection probes and the minor allele frequency shift detection probes, and then performing second-generation sequencing to obtain sequencing data of the target sites;
S4: simultaneously detecting copy number variation, loss of heterozygosity, single nucleotide variants and indel mutations in a single sample to be detected, based on the differences between the sequencing data of the target sites in the sample to be detected and in a reference sample;
S5: outputting the variation type of the sample to be detected based on the detection result;
wherein the step of detecting single nucleotide variants and indel mutations comprises: S31: capturing protein-coding sites with the probes covering the whole exon region, and then performing second-generation sequencing to obtain sequencing data of the protein-coding sites; S41: obtaining the whole exon regions of the sample to be detected and of the reference sample based on the sequencing data of the protein-coding sites, and, taking the whole exon region of the reference sample as a baseline, detecting single nucleotide variants and indel mutations of the sample to be detected with analysis software;
the step of detecting copy number variation comprises: S32: performing second-generation sequencing after targeted capture of the copy number variation candidate sites with the copy number variation detection probes, to obtain sequencing data of the copy number variation candidate sites; S42: detecting copy number variation of the sample to be detected with analysis software based on the sequencing data of the copy number variation candidate sites;
the step of detecting loss of heterozygosity comprises: S23: screening candidate heterozygous sites based on the sequence of the targeted gene region, and designing minor allele frequency shift detection probes based on the candidate heterozygous sites; S33: performing second-generation sequencing after targeted capture of the candidate heterozygous sites with the minor allele frequency shift detection probes, to obtain sequencing data of the candidate heterozygous sites; S43: determining, based on the sequencing data of the candidate heterozygous sites, whether a minor allele frequency shift has occurred; when the number of candidate heterozygous sites whose minor allele frequency falls within the offset interval meets a threshold, the minor allele frequency of the heterozygous sites is considered shifted; if the minor allele frequency is shifted and the copy number is unchanged, the result is copy-neutral LOH; if the minor allele frequency is shifted and the copy number is deleted, the result is a heterozygous deletion; and if the minor allele frequency is not shifted and the copy number is deleted, the result is a homozygous deletion.
2. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 1, wherein the single nucleotide variants and indel mutations of the sample to be detected are detected with Sentieon.
3. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 1, wherein the copy number variation of the targeted gene region of the sample to be detected is calculated with a sliding window.
4. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 3, wherein the step of calculating the copy number variation of the targeted gene region of the sample to be detected with a sliding window comprises:
S421: dividing the targeted gene region into n windows, and assigning a weight to each window based on the sequencing data of the copy number variation candidate sites of the reference sample;
S422: calculating the average gene copy ratio of the targeted gene region, namely log2P, based on the number n of sliding windows of the reference sample and the weight of each window, wherein the average gene copy ratio is calculated by the formula:
S423: calculating the normal copy number range based on the average gene copy ratio of the targeted gene region of the reference sample, wherein the copy number, namely CN, is calculated by the following formula:
$\mathrm{CN} = 2^{\,\mathrm{log2P} + 1}$
S424: calculating the copy number of the sample to be detected; when the copy number is smaller than the minimum of the normal range, the targeted gene fragment is deleted; when the copy number is greater than the maximum of the normal range, the targeted gene fragment is duplicated.
5. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 4, wherein the normal copy number range is [1.5, 2.7].
6. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 1, wherein the minor allele frequency of the heterozygous sites is considered shifted when at least 10 candidate heterozygous sites have a minor allele frequency within (0.055, 0.39) or (0.61, 0.945).
7. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 1, wherein screening candidate heterozygous sites comprises selecting, as candidate heterozygous sites, single nucleotide variant sites whose minor allele frequency in the 1000 Genomes sub-population is greater than or equal to a threshold.
8. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 7, wherein single nucleotide variant sites whose minor allele frequency in the 1000 Genomes sub-population is greater than or equal to 20% are used as candidate heterozygous sites.
9. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 1, wherein the step of defining the offset interval comprises:
S431: calculating the minor allele frequency, namely BAF, of each candidate heterozygous site in the reference sample;
S432: calculating the absolute value of the difference between the BAF and 0.5, namely AbsBAF;
S433: determining the heterozygous-site value interval and the homozygous-site value intervals in the reference sample; the heterozygous-site deviation value is the mean AbsBAF of the selected heterozygous sites plus 3 times the standard deviation, and the heterozygous-site value interval is [0.5 - heterozygous-site deviation value, 0.5 + heterozygous-site deviation value]; the homozygous-site deviation value is the difference between the mean AbsBAF of the selected homozygous sites and 3 times the standard deviation, and the homozygous-site value intervals are [0, homozygous-site deviation value] and [1 - homozygous-site deviation value, 1];
S434: the offset interval is the complement, within [0, 1], of the union of the heterozygous-site value interval and the homozygous-site value intervals, namely (homozygous-site deviation value, 0.5 - heterozygous-site deviation value) and (0.5 + heterozygous-site deviation value, 1 - homozygous-site deviation value).
10. The mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to claim 1, wherein the offset intervals are (0.055, 0.39) and (0.61, 0.945).
11. A mutation detection system based on second-generation sequencing data of a targeted gene region of a single sample to be detected, the system comprising:
an acquisition unit: used for acquiring the sequence of a targeted gene region, wherein the targeted gene region comprises the whole exon region of the target gene and the flanking regions on both sides;
a probe design unit: used for designing probes covering the whole exon region, copy number variation detection probes and minor allele frequency shift detection probes based on the sequence of the targeted gene region; the probes covering the whole exon region are used for detecting single nucleotide variants and/or indel mutations, the copy number variation detection probes are used for detecting copy number variation, and the minor allele frequency shift detection probes are used for detecting loss of heterozygosity;
a targeted sequencing unit: used for capturing target sites with the probes covering the whole exon region, the copy number variation detection probes and the minor allele frequency shift detection probes, and then performing second-generation sequencing to obtain sequencing data of the target sites;
a detection unit: used for simultaneously detecting copy number variation, loss of heterozygosity, single nucleotide variants and indel mutations in a single sample to be detected, based on the differences between the sequencing data of the target sites in the sample to be detected and in a reference sample; wherein the step of detecting single nucleotide variants and indel mutations comprises: S31: capturing protein-coding sites with the probes covering the whole exon region, and then performing second-generation sequencing to obtain sequencing data of the protein-coding sites; S41: obtaining the whole exon regions of the sample to be detected and of the reference sample based on the sequencing data of the protein-coding sites, and, taking the whole exon region of the reference sample as a baseline, detecting single nucleotide variants and indel mutations of the sample to be detected with analysis software;
the step of detecting copy number variation comprises: S32: performing second-generation sequencing after targeted capture of the copy number variation candidate sites with the copy number variation detection probes, to obtain sequencing data of the copy number variation candidate sites; S42: detecting copy number variation of the sample to be detected with analysis software based on the sequencing data of the copy number variation candidate sites;
the step of detecting loss of heterozygosity comprises: S23: screening candidate heterozygous sites based on the sequence of the targeted gene region, and designing minor allele frequency shift detection probes based on the candidate heterozygous sites; S33: performing second-generation sequencing after targeted capture of the candidate heterozygous sites with the minor allele frequency shift detection probes, to obtain sequencing data of the candidate heterozygous sites; S43: determining, based on the sequencing data of the candidate heterozygous sites, whether a minor allele frequency shift has occurred; when the number of candidate heterozygous sites whose minor allele frequency falls within the offset interval meets a threshold, the minor allele frequency of the heterozygous sites is considered shifted; if the minor allele frequency is shifted and the copy number is unchanged, the result is copy-neutral LOH; if the minor allele frequency is shifted and the copy number is deleted, the result is a heterozygous deletion; and if the minor allele frequency is not shifted and the copy number is deleted, the result is a homozygous deletion;
a result output unit: used for outputting the variation type of the sample to be detected based on the detection result.
12. A mutation detection apparatus based on second-generation sequencing data of a targeted gene region of a single sample to be detected, the apparatus comprising: a memory and a processor;
the memory is used for storing program instructions;
the processor is configured to invoke the program instructions, which, when executed, perform the mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to any one of claims 1-10.
13. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the mutation detection method based on second-generation sequencing data of a targeted gene region of a single sample to be detected according to any one of claims 1-10.
CN202311392297.6A 2023-10-25 2023-10-25 Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data Active CN117409856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311392297.6A CN117409856B (en) 2023-10-25 2023-10-25 Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311392297.6A CN117409856B (en) 2023-10-25 2023-10-25 Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data

Publications (2)

Publication Number Publication Date
CN117409856A CN117409856A (en) 2024-01-16
CN117409856B true CN117409856B (en) 2024-03-29

Family

ID=89497585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311392297.6A Active CN117409856B (en) 2023-10-25 2023-10-25 Mutation detection method, system and storable medium based on single sample to be detected targeted gene region second generation sequencing data

Country Status (1)

Country Link
CN (1) CN117409856B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760712A (en) * 2016-03-01 2016-07-13 西安电子科技大学 Copy number variation detection method based on next generation sequencing
KR101721480B1 (en) * 2016-06-02 2017-03-30 주식회사 랩 지노믹스 Method and system for detecting chromosomal abnormality
CN110491441A (en) * 2019-05-06 2019-11-22 西安交通大学 A kind of gene sequencing data simulation system and method for simulation crowd background information
CN112927755A (en) * 2021-02-09 2021-06-08 北京博奥医学检验所有限公司 Method and system for identifying cfDNA (cfDNA) variation source
CN112980961A (en) * 2021-05-11 2021-06-18 上海思路迪医学检验所有限公司 Method and device for jointly detecting SNV (single nucleotide polymorphism), CNV (CNV) and FUSION (FUSION mutation)
WO2021204205A1 (en) * 2020-04-08 2021-10-14 北京智因东方转化医学研究中心有限公司 Method and system for detecting smn1 gene mutation by means of high-throughput sequencing
CN113889187A (en) * 2021-09-24 2022-01-04 上海仁东医学检验所有限公司 Single-sample allele copy number variation detection method, probe set and kit
CN114283889A (en) * 2021-12-27 2022-04-05 深圳吉因加医学检验实验室 Method and device for correcting homologous recombination repair defect score
CN115497557A (en) * 2022-08-30 2022-12-20 杭州瑞普基因科技有限公司 Method and device for detecting gene copy number variation aiming at targeted sequencing
CN116189763A (en) * 2023-02-21 2023-05-30 厦门艾德生物医药科技股份有限公司 Single sample copy number variation detection method based on second generation sequencing
CN116312780A (en) * 2023-05-10 2023-06-23 广州迈景基因医学科技有限公司 Method, terminal and medium for detecting somatic mutation of targeted gene second-generation sequencing data
CN116386718A (en) * 2023-05-30 2023-07-04 北京华宇亿康生物工程技术有限公司 Method, apparatus and medium for detecting copy number variation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220215900A1 (en) * 2021-01-07 2022-07-07 Tempus Labs, Inc. Systems and methods for joint low-coverage whole genome sequencing and whole exome sequencing inference of copy number variation for clinical diagnostics

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Yu Wang等.Identifying Human Genome-Wide CNV,LOH and UPD by Targeted Sequencing of Selected Regions.《PLOS ONE》.2015,第1-18页. *
二代测序技术在NSCLC中的临床应用中国专家共识(2020版);周彩存等;《中国肺癌杂志》;20201231;第23卷(第9期);第741-761页 *
基于全基因组重测序技术探索原发性高血压易感基因;张旭昌等;《中华高血压杂志》;20171011;第25卷(第8期);第737-744页 *
面向下一代测序数据的处理算法优化研究;张浩;《万方数据》;20231011;第1-167页 *

Also Published As

Publication number Publication date
CN117409856A (en) 2024-01-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant