CN114203261A - Method for developing gene detection Panel clinical diagnosis index algorithm - Google Patents

Method for developing gene detection Panel clinical diagnosis index algorithm

Info

Publication number
CN114203261A
Authority
CN
China
Prior art keywords
data
gene detection
detection panel
sequencing
panel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111251878.9A
Other languages
Chinese (zh)
Inventor
汪强虎
李铜舒
吴玲祥
黄斌
夏鹏
葛东伟
吴维
李�杰
王子宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ankai Life Technology Suzhou Co ltd
Original Assignee
Ankai Life Technology Suzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ankai Life Technology Suzhou Co ltd
Priority to CN202111251878.9A
Publication of CN114203261A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioethics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for developing a clinical diagnosis index algorithm for a gene detection Panel, and belongs to the field of cell gene detection. The method comprises the following steps: providing a table of gene locus information and filtering the sequencing data; simulating the filtered sequencing data and taking the simulated data as virtual gene detection Panel data; analyzing the sequencing data with an existing index analysis algorithm; analyzing the virtual Panel data with an existing index analysis algorithm; integrating the analysis results and performing model training; and evaluating the performance of the various calculation models and selecting the optimal scheme. The method starts from whole genome and whole exome sequencing data in public databases, extracts site data from them according to the site information of each gene in the gene detection Panel to construct virtual gene detection Panel data, and develops the algorithm on these virtual Panel data, thereby improving the development quality and efficiency of gene detection Panel products.

Description

Method for developing gene detection Panel clinical diagnosis index algorithm
Technical Field
The invention belongs to the field of cell gene detection and relates to a method for developing a clinical diagnosis index algorithm for a gene detection Panel; clinical diagnosis indexes for samples sequenced with a gene (locus) detection Panel are developed and optimized through a novel data analysis model. Specifically, the method is based on multi-omics sequencing data (including but not limited to whole genome sequencing, whole exome sequencing, whole genome methylation sequencing and whole transcriptome sequencing) and helps developers construct a digital detection Panel by simulating characteristics of a specific detection Panel, such as the distribution pattern of gene sites and the read enrichment bias. On this basis, an artificial intelligence algorithm is used to fit the index values calculated from the detection Panel to the original values calculated from the omics data, so that the detection Panel achieves detection performance consistent with the multi-omics sequencing data. The invention can greatly reduce the development and testing cost of a detection Panel.
Background
In the prior art, a clinical diagnosis index algorithm for a gene detection Panel is mainly developed by collecting a large number of samples and running the gene Panel on them to generate the data needed for algorithm development. This consumes a great deal of money, time and manpower, and if the initial site design of the gene Panel is wrong, it may bring great risk to product development. In addition, whole genome and whole exome high-throughput sequencing is currently expensive and covers many detection sites that are irrelevant to disease. Gene detection Panels are therefore designed to detect the mutation status of sites in a selected set of important disease-related genes, which not only reduces detection cost but also concentrates sequencing depth on those specific sites and so improves the sensitivity and accuracy of the results. However, when clinical diagnosis indexes (such as TMB and MSI) are analyzed from the sequencing data generated by such Panels, factors such as the bias of the selected gene combination mean that the results obtained with existing index calculation methods cannot fully reflect the true state of the sample.
Conventional methods: currently, the following two methods are mainly used to construct and optimize a Panel.
(1) Constructing the Panel from scratch with large-scale sampling. The specific steps are: a: collecting a large number of samples (such as 100 or 500 samples) and performing both a specific omics sequencing (such as whole exome sequencing) and detection Panel sequencing on each sample; b: analyzing the data from the two sequencing methods with similar analysis algorithms to obtain a score for a given index; c: fitting the index score obtained from the detection Panel to the index score obtained from the omics sequencing, so as to obtain a standard score for clinical evaluation and diagnosis. The biggest drawbacks of this method are that the early sample collection period is long, the cost is high, and a large amount of manpower is required; and if the initial site design of the gene Panel is wrong, it may bring great risk to product development.
(2) Optimizing the Panel prediction algorithm based on public data. The specific steps are: a: collecting the relevant omics sequencing data from a public database and capturing, from the collected data, the regions corresponding to the genomic regions covered by the Panel, so as to simulate the sequencing data of the detection Panel; steps b and c are the same as steps b and c of the first method. Although this method greatly reduces the preparation cost in the early stage of Panel optimization, the technical bias of the Panel itself means that the regions actually captured, the detection depth of different regions and so on can differ greatly from the sequencing data in existing public databases. Simply capturing the corresponding regions for subsequent simulation analysis therefore has limited effect, and may even give results opposite to those obtained in actual detection and analysis. The method consequently has a limited range of application and is difficult to popularize on a large scale.
A new index analysis algorithm based on gene detection Panel data therefore urgently needs to be developed. The invention is mainly aimed at the development and application of clinical diagnosis indexes for gene detection Panels.
Disclosure of Invention
The purpose of the invention is as follows: the invention aims to provide a method for developing a gene detection Panel clinical diagnosis index algorithm.
The technical scheme is as follows: the method for developing a gene detection Panel clinical diagnosis index algorithm comprises two processes: constructing a virtual gene detection Panel, and developing a clinical index analysis algorithm for the virtual gene detection Panel data;
first, the process of constructing the virtual gene detection Panel is as follows:
(1) providing information on all detection sites involved in the designed gene detection Panel;
(2) filtering the whole genome or whole exome sequencing data;
(3) simulating, based on a series of sequencing-related parameters, the retained sequencing data that cover the detection sites;
(4) sorting and storing the simulated data output by step (3) as virtual gene detection Panel data;
second, the process of developing the clinical index analysis algorithm for the virtual gene detection Panel data is as follows:
(5) analyzing the whole genome or whole exome sequencing data input in step (2) with an existing index analysis algorithm;
(6) analyzing the virtual gene detection Panel data provided in step (4) with an existing index analysis algorithm;
(7) integrating the results of steps (5) and (6): matching the result of each sample in step (5) to the corresponding sample in step (6) and marking it as the expected result for that sample;
performing model training on the integrated results with a suitable machine learning algorithm;
(8) evaluating the performance of the various calculation models and selecting the optimal scheme.
Further, in step (1), the information provided includes, but is not limited to, the position information of the locus on the genome and the sequence information of the locus.
Further, in step (2), filtering the whole genome or whole exome sequencing data specifically means: extracting sequencing data based on the detection site information provided in step (1) and retaining only the sequencing data that fall within the detection sites.
Further, in step (3), the parameters include, but are not limited to, the sequencing platform, the sequence length, the sequencing depth and the GC content of the sequence;
the simulation process includes, but is not limited to, re-fitting, according to the parameter settings, the read distribution and enrichment degree of the data passed into step (3) (the sequencing data within the detection sites), so that the generated data are consistent in read distribution and enrichment degree with sequencing data obtained from the gene detection Panel under real conditions.
Further, in step (6), the analysis results produced by the index analysis algorithm are divided into two groups, a training set and a test set; the samples analyzed in step (6) are the same as those analyzed in step (5);
the existing data are randomly split into the training set and the test set at a ratio of 7:3: 70% of the samples form the training set used to train the model, and the remaining 30% form the test set used for the final evaluation of the model's predictive performance.
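The 7:3 random grouping described above can be sketched as follows. This is a minimal illustration only; the function name `split_train_test`, the sample identifiers and the fixed seed are assumptions made for the example and do not come from the patent.

```python
import random


def split_train_test(sample_ids, train_fraction=0.7, seed=0):
    """Randomly split sample IDs into a training set and a test set."""
    ids = sorted(sample_ids)      # fixed starting order, so the seed is reproducible
    rng = random.Random(seed)     # private generator; does not touch global state
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical cohort of 100 samples.
samples = [f"S{i:03d}" for i in range(100)]
train_ids, test_ids = split_train_test(samples)
print(len(train_ids), len(test_ids))  # 70 30
```

Splitting by sample identifier, rather than by score, keeps each sample's reference score and virtual Panel score in the same group.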
Appendix (terms):
Gene detection Panel: a test that detects not just one site or one gene but multiple sites and multiple genes at the same time; these sites and genes are selected and combined according to a given standard to form a detection set, and this set of gene loci is called a gene detection Panel.
Whole genome sequencing: all the DNA in the cell nucleus is collectively called the genome; high-throughput sequencing of the genome is whole genome sequencing.
Whole exome sequencing: the portions of DNA in a cell that direct the encoding of proteins are called exons, and the full set of exons is called the exome; high-throughput sequencing of the exome is whole exome sequencing.
Sequencing depth: the ratio of the total number of bases obtained by sequencing to the size of the genome; it is one of the indexes for evaluating the amount of sequencing.
TMB: tumor mutational burden; defined as the total number of somatic mutations (coding errors, base substitutions, and insertion or deletion errors) detected per million bases; TMB is a recent marker for evaluating the therapeutic effect of PD-1 antibodies, and its value has been demonstrated in the treatment of a variety of tumors.
MSI: microsatellite instability. Microsatellites are short tandem repeat DNA sequences in the genome whose repeat unit generally consists of 1-6 nucleotides arranged in tandem; they are polymorphic within a population because the number of repeats of the core unit differs. MSI arises from a functional defect in DNA mismatch repair in tumor tissue; MSI accompanied by deficient DNA mismatch repair is a clinically important tumor marker.
Read: a sequence fragment obtained by sequencing.
Omics: mainly includes genomics, proteomics, metabolomics, transcriptomics, lipidomics, immunomics, glycomics, radiomics, ultrasomics and the like.
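Two of the quantities defined above, sequencing depth and TMB, reduce to simple ratios. The sketch below only restates those definitions in code; the function names and all numbers are hypothetical, and real TMB pipelines apply additional variant filtering that is not shown here.

```python
def sequencing_depth(total_bases, region_size_bp):
    """Total sequenced bases divided by the size of the sequenced region."""
    return total_bases / region_size_bp


def tmb_per_megabase(n_somatic_mutations, covered_bases):
    """Somatic mutations per million covered bases."""
    return n_somatic_mutations / (covered_bases / 1_000_000)


# Hypothetical panel covering 1.5 Mb with 150 Mb of sequenced bases.
depth = sequencing_depth(total_bases=150_000_000, region_size_bp=1_500_000)
tmb = tmb_per_megabase(n_somatic_mutations=18, covered_bases=1_500_000)
print(depth, tmb)  # 100.0 12.0
```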
Advantageous effects: compared with the prior art, the invention starts from whole genome and whole exome sequencing data in public databases (or data accumulated in-house), extracts site data from them according to the site information of each gene in the gene detection Panel to construct virtual gene detection Panel data, and develops the algorithm on these virtual Panel data, thereby improving the development quality and efficiency of gene detection Panel products and greatly reducing development cost and risk.
Drawings
FIG. 1 is a flow chart of the operation of the present invention;
FIG. 2 is a graphical representation of TMB values for two sets of data analyzed using a linear fitting algorithm in accordance with the present invention;
FIG. 3 is a schematic representation of the statistical signal values of the Beta mixture model for a single probe in the present invention;
FIG. 4 is a graphical representation of G-CIMP values for two sets of data analyzed using a linear fitting algorithm in accordance with the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
The invention aims to realize the construction of virtual gene detection Panel and the development of a clinical index calculation method through the following scheme.
As shown in FIG. 1, the invention is divided into two processes and 8 main steps.
The first process is mainly used to construct the virtual gene detection Panel, and comprises the following specific steps:
Step 1: providing information on all detection sites involved in the designed gene detection Panel; that is, step 1 of the invention obtains the information on every detection site in the gene detection Panel;
the information includes, but is not limited to, the position of each site on the genome and the sequence of the site; this information is passed to step 2;
such information can be provided directly by the staff who designed the gene detection Panel; alternatively, the specific site information can be determined by sequence alignment analysis (for example with BWA or another alignment tool) of sequencing data from test samples run on the gene detection Panel;
Step 2: filtering the whole genome or whole exome (or other omics) sequencing data; specifically, sequencing data are extracted based on the detection site information provided in step 1, and only the sequencing data that fall within the detection sites are retained; the retained data are passed to step 3;
step 2 of the invention captures local site data from whole genome, whole exome, whole transcriptome, whole genome methylation or other omics sequencing data (hereinafter referred to as the reference data set);
capturing site data here mainly means capturing, from the reference data set, the read data at the corresponding sites according to the site coordinates obtained in step 1;
the reference data set can be downloaded from platforms such as public databases, or can be sequencing data accumulated by the staff themselves;
the data capture can be performed by extracting the data at the specific sites from the reference data set with tools such as, but not limited to, BWA and samtools;
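The capture logic of step 2 can be illustrated with a small in-memory example. The patent performs this step on real alignment files with tools such as BWA and samtools; the sketch below is only a toy model of the interval-overlap test, assuming reads and sites are given as hypothetical `(chromosome, start, end)` tuples with half-open coordinates.

```python
from collections import defaultdict


def build_site_index(sites):
    """Group panel site intervals by chromosome."""
    index = defaultdict(list)
    for chrom, start, end in sites:
        index[chrom].append((start, end))
    return index


def overlaps_any(read_start, read_end, intervals):
    """True if the half-open read interval overlaps any site interval."""
    return any(read_start < site_end and site_start < read_end
               for site_start, site_end in intervals)


def filter_reads(reads, sites):
    """Keep only reads that overlap at least one panel site."""
    index = build_site_index(sites)
    return [read for read in reads
            if overlaps_any(read[1], read[2], index.get(read[0], []))]


# Hypothetical coordinates, for illustration only.
panel_sites = [("chr7", 55_000_100, 55_000_200), ("chr17", 7_500_000, 7_500_050)]
reference_reads = [
    ("chr7", 55_000_050, 55_000_150),   # overlaps the chr7 site
    ("chr7", 55_000_200, 55_000_300),   # starts exactly where the site ends
    ("chr17", 7_500_040, 7_500_140),    # overlaps the chr17 site
    ("chr1", 100, 200),                 # chromosome not on the panel
]
kept = filter_reads(reference_reads, panel_sites)
print(len(kept))  # 2
```

A linear scan over the sites of each chromosome is enough for a sketch; a production pipeline would rely on indexed region queries instead.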
and step 3: the data retained at step 3 was simulated based on a series of sequencing-related parameters including, but not limited to, the platform used for sequencing, the length of the sequence, the depth of sequencing, the GC content on the sequence, etc. The simulation process includes, but is not limited to, re-fitting the read distribution, enrichment degree and the like in the data delivered in the step 3 according to the parameter setting, so that the generated data and the sequencing data of the gene detection Panel obtained under the real condition are consistent in the read distribution, enrichment degree and the like. The fitted data will be further passed to step 4;
step 3 of the invention is based on the data distribution characteristics in the gene detection Panel, and the data obtained by capturing in the reference data set is subjected to distribution characteristic fitting, and the method mainly comprises the following two methods:
firstly, directly constructing a mathematical statistical model (such as a Poisson distribution model) through parameters such as a sequencing platform, sequence length, sequencing depth, GC content on a sequence and the like provided by a worker, and fitting the number of reads of each site in the captured data to ensure that the distribution characteristics of the number of reads are consistent with the data distribution characteristics generated by real gene detection Panel;
calculating information such as sequence length, sequencing depth, GC content on a sequence and the like in sample sequencing sample data of a test example of the gene detection Panel by means of tools such as BWA, samtools, flagstat and the like, constructing a mathematical statistic model (such as a Poisson distribution model and the like) according to the parameter information, and fitting the number of reads of each site in the captured data to enable the read number distribution characteristics to be consistent with the data distribution characteristics generated by the real gene detection Panel;
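One way to read the Poisson modelling in step 3 is sketched below: a per-site rate is estimated from read counts observed in a few real Panel test samples, and a target read count for each site is then drawn from that model. This is an assumption-laden illustration rather than the patent's implementation; the function names and counts are hypothetical, and effects of platform, sequence length and GC content are not modelled.

```python
import math
import random


def fit_site_rates(panel_counts):
    """Per-site Poisson rate; the maximum-likelihood estimate is the mean count."""
    return {site: sum(counts) / len(counts)
            for site, counts in panel_counts.items()}


def sample_poisson(lam, rng):
    """Knuth's multiplication method; suitable for modest rates only,
    because exp(-lam) underflows to zero for very large lam."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_target_counts(rates, rng):
    """Draw one simulated read count per site from the fitted rates."""
    return {site: sample_poisson(lam, rng) for site, lam in rates.items()}


# Hypothetical read counts per site from three Panel test samples.
observed = {"site_A": [40, 44, 36], "site_B": [10, 12, 8]}
rates = fit_site_rates(observed)
targets = simulate_target_counts(rates, random.Random(0))
print(rates)  # {'site_A': 40.0, 'site_B': 10.0}
```

The simulated counts in `targets` would then tell the next step how many captured reads to keep at each site.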
Step 4: the data output by step 3 are sorted and stored as virtual gene detection Panel data;
step 4 of the invention stores the fitted data in a format such as FASTQ or BAM; the stored result is called the virtual gene detection Panel data and is used for subsequent analysis.
The second process is mainly used to develop a clinical index analysis algorithm for the virtual gene detection Panel data, and comprises the following specific steps:
Step 5: analyzing the whole genome or whole exome (or other omics) sequencing data input in step 2 with an existing index analysis algorithm (including but not limited to TMB, MSI and the like); because the standard calculation methods for most clinical indexes were built on whole genome or whole exome sequencing data, the results of this step are used as the gold standard for the algorithm training in step 7;
step 5 of the invention uses an index analysis algorithm widely used in the industry to calculate the corresponding index score for each sample in the reference data set, and the calculated score serves as the gold standard;
the indexes include, but are not limited to, MSI, HRD, TMB and the like;
Step 6: analyzing the virtual gene detection Panel data provided in step 4 with an existing index analysis algorithm; the analysis results are divided into a training set and a test set; the samples analyzed in this step are the same as those analyzed in step 5; the existing data are randomly split into the training set and the test set at a ratio of 7:3, where 70% of the samples form the training set used to train the model and the remaining 30% form the test set used for the final evaluation of the model's predictive performance;
step 6 of the invention uses an index analysis algorithm widely used in the industry to calculate the corresponding index score for each sample in the virtual gene detection Panel data;
the indexes include, but are not limited to, MSI, HRD, TMB and the like;
Step 7: integrating the results of steps 5 and 6; the result of each sample in step 5 is matched to the corresponding sample in step 6 and marked as the expected result for that sample; model training is performed on the integrated results with a suitable machine learning algorithm (including but not limited to support vector machines and deep learning algorithms);
step 7 of the invention constructs a prediction model from the index scores calculated in steps 5 and 6, specifically as follows:
first, the result of each sample in step 5 is matched to the corresponding sample in step 6 and marked as the expected result for that sample, and the paired sample results are divided into a training set and a test set at a ratio of 1:1 (or 7:3, etc.);
second, on the training set, a model is trained with various machine learning algorithms (such as linear fitting), so that the score the model calculates from the virtual gene detection Panel data approximates the score calculated for the corresponding sample in the reference data set; the predictive performance of the model is then evaluated on the test set;
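The pairing and linear fitting in step 7 can be sketched with ordinary least squares on a single predictor. The function names, sample identifiers and scores below are hypothetical; the patent names linear fitting only as one of several possible algorithms.

```python
def pair_scores(reference_scores, panel_scores):
    """Match each sample's virtual-Panel score (input) with its
    reference score (expected result), by sample ID."""
    shared = sorted(set(reference_scores) & set(panel_scores))
    return [(panel_scores[s], reference_scores[s]) for s in shared]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical index scores for four training samples.
reference = {"S1": 5.0, "S2": 9.0, "S3": 13.0, "S4": 17.0}   # from step 5
panel = {"S1": 2.0, "S2": 4.0, "S3": 6.0, "S4": 8.0}         # from step 6
slope, intercept = fit_line(pair_scores(reference, panel))
print(slope, intercept)  # 2.0 1.0
```

The fitted line maps a score computed on virtual Panel data onto the scale of the reference score, which is what the trained model is expected to do for unseen samples.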
Step 8: evaluating the performance of the various calculation models and selecting the optimal scheme;
step 8 of the invention compares the predictive performance of the models from step 7 and selects the best one as the index calculation method for the virtual gene detection Panel.
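Model comparison in step 8 could, for example, rank candidate models by their error on the test set. The patent does not specify a metric, so the use of root-mean-square error here is an assumption, as are the two candidate models and the test scores.

```python
import math


def rmse(model, test_pairs):
    """Root-mean-square error of a model on (panel score, reference score) pairs."""
    squared = [(model(x) - y) ** 2 for x, y in test_pairs]
    return math.sqrt(sum(squared) / len(squared))


def select_best(models, test_pairs):
    """Return the name of the model with the lowest test-set error."""
    errors = {name: rmse(fn, test_pairs) for name, fn in models.items()}
    return min(errors, key=errors.get), errors


# Hypothetical candidates: use the Panel score unchanged, or a fitted line.
candidates = {
    "identity": lambda x: x,
    "linear": lambda x: 2.0 * x + 1.0,   # hypothetical fitted coefficients
}
test_pairs = [(3.0, 7.2), (5.0, 10.9), (7.0, 15.1)]
best_name, errors = select_best(candidates, test_pairs)
print(best_name)  # linear
```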
Example 1:
Constructing a TMB prediction algorithm for a lung cancer gene detection Panel:
TMB is the tumor mutation burden, representing the density of non-synonymous mutations in protein coding regions; in some cancer types, patients with high TMB may benefit from immunotherapy;
1. downloading whole exome sequencing data for 100 lung cancer cases from the GDC website; also downloading data from the gene detection Panel of the MSK-IMPACT commercial kit as the object to be simulated;
2. extracting the corresponding reads from the exome sequencing data according to the site information of the gene detection Panel;
3. constructing a Poisson distribution model from the number of reads at each site in the gene detection Panel sequencing data, and recording the parameters of the model;
4. adding to or deleting from the reads extracted from the exome sequencing data according to the Poisson model parameters constructed in step 3, so that the relative distribution of read counts across sites in the exome-derived data is consistent with that of the gene detection Panel;
5. applying a conventional TMB calculation method to both the exome sequencing data and the read data extracted from it, to obtain TMB scores for the two sets of data;
6. analyzing the two sets of scores with a linear fitting model to construct a prediction model, so that the TMB score calculated from the extracted read data can, through the model, predict a result close to the TMB score calculated directly from the exome data; the result is shown in FIG. 2.
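Item 4 of this example, adding and deleting extracted reads until their relative distribution across sites matches the Panel, can be sketched as a per-site resampling. Interpreting "addition" as duplicating existing reads is an assumption of this sketch, and the read names, counts and function names are hypothetical.

```python
import random


def target_counts(panel_counts, total_reads):
    """Scale the Panel's per-site read counts to the desired total."""
    panel_total = sum(panel_counts.values())
    return {site: round(total_reads * n / panel_total)
            for site, n in panel_counts.items()}


def resample_site(reads, n_target, rng):
    """Delete reads (sample without replacement) or add reads
    (duplicate randomly chosen ones) to reach n_target."""
    if not reads:
        return []
    if n_target <= len(reads):
        return rng.sample(reads, n_target)
    return list(reads) + rng.choices(reads, k=n_target - len(reads))


# Hypothetical reads extracted from exome data, and Panel read counts per site.
extracted = {"site_A": ["a1", "a2", "a3", "a4", "a5", "a6"],
             "site_B": ["b1", "b2"]}
panel_counts = {"site_A": 30, "site_B": 30}

rng = random.Random(0)
targets = target_counts(panel_counts, total_reads=8)
virtual = {site: resample_site(extracted[site], targets[site], rng)
           for site in extracted}
print({site: len(reads) for site, reads in virtual.items()})
# {'site_A': 4, 'site_B': 4}
```

In this toy case the exome data favour site_A three to one, while the Panel covers both sites equally, so site_A is thinned and site_B is padded.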
Example 2:
Constructing a G-CIMP prediction algorithm for a brain tumor DNA methylation Panel:
G-CIMP is an epigenetic feature of glioma, in which a large number of CpG islands are methylated; patients carrying this feature generally have a better prognosis;
1. downloading Illumina 450K DNA methylation array data for 100 brain cancer cases from the GDC website; also downloading Illumina 27K DNA methylation array data for 10 cases as the object to be simulated;
2. extracting the corresponding data from the Illumina 450K DNA methylation data according to the site information of the Illumina 27K DNA methylation Panel;
3. constructing a Beta mixture model (see FIG. 3) from the signal value at each site in the Illumina 27K DNA methylation data, and recording the parameters of the model;
4. adjusting the data extracted from the Illumina 450K DNA methylation data up or down according to the Beta mixture model parameters constructed in step 3, so that the relative distribution of signal values across sites is consistent with that of the Illumina 27K DNA methylation data;
5. applying a conventional G-CIMP calculation method to both the Illumina 450K DNA methylation data and the data extracted from it, to obtain G-CIMP scores for the two sets of data;
6. analyzing the two sets of scores with a linear fitting model to construct a prediction model, so that the score calculated from the extracted data can, through the model, predict a result close to the G-CIMP score calculated directly from the Illumina 450K DNA methylation data (see FIG. 4).
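As a simplified stand-in for the Beta mixture model of item 3, the sketch below fits a single Beta distribution to the methylation signal values of one probe by the method of moments. A real mixture fit would estimate several components (for example by expectation-maximization); that is not shown. The function name and the signal values are hypothetical.

```python
import statistics


def fit_beta_moments(values):
    """Method-of-moments estimate of Beta(alpha, beta) for values in (0, 1)."""
    mean = statistics.mean(values)
    var = statistics.variance(values)       # sample variance
    if not 0 < var < mean * (1 - mean):
        raise ValueError("moments are not compatible with a Beta distribution")
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


# Hypothetical signal (beta) values of one probe across four samples.
signal = [0.1, 0.2, 0.3, 0.4]
alpha, beta = fit_beta_moments(signal)
print(round(alpha / (alpha + beta), 6))  # 0.25, the sample mean
```

By construction the fitted distribution reproduces the sample mean and variance, which is why the mean of Beta(alpha, beta) printed above equals the mean of the input values.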
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above-mentioned embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may be made by those skilled in the art without departing from the principle of the invention.

Claims (5)

1. A method for developing a gene detection Panel clinical diagnosis index algorithm, characterized by comprising two processes: constructing a virtual gene detection Panel, and developing a clinical index analysis algorithm for the virtual gene detection Panel data;
first, the process of constructing the virtual gene detection Panel is as follows:
(1) providing information on all detection sites involved in the designed gene detection Panel;
(2) filtering the whole genome or whole exome sequencing data;
(3) simulating, based on a series of sequencing-related parameters, the retained sequencing data that cover the detection sites;
(4) sorting and storing the simulated data as virtual gene detection Panel data;
second, the process of developing the clinical index analysis algorithm for the virtual gene detection Panel data is as follows:
(5) analyzing the filtered whole genome or whole exome sequencing data with an existing index analysis algorithm;
(6) analyzing the provided virtual gene detection Panel data with an existing index analysis algorithm;
(7) integrating the analysis results of steps (5) and (6): matching the result of each sample in step (5) to the corresponding sample in step (6) and marking it as the expected result for that sample;
performing model training on the integrated results with a suitable machine learning algorithm;
(8) evaluating the performance of the various calculation models and selecting the optimal scheme.
2. The method for gene detection Panel clinical diagnostic indicator algorithm development as claimed in claim 1, wherein in step (1), the provided information includes, but is not limited to, position information of the locus on the genome and sequence information of the locus.
3. The method for gene detection Panel clinical diagnostic index algorithm development according to claim 1, wherein in step (2), the filtering of whole genome or whole exome sequencing data specifically means: extracting sequencing data based on the detection site information provided in step (1), and only preserving the sequencing data contained in the detection site.
4. The method for gene detection Panel clinical diagnostic indicator algorithm development as claimed in claim 1, wherein in step (3), the simulation is based on a series of sequencing-related parameters including but not limited to the sequencing platform, the sequence length, the sequencing depth and the GC content of the sequence;
the simulation process includes but is not limited to re-fitting, according to the parameter settings, the read distribution and enrichment degree of the sequencing data within the detection sites, so that the generated data are consistent in read distribution and enrichment degree with sequencing data obtained from the gene detection Panel under real conditions.
5. The method for gene detection Panel clinical diagnosis index algorithm development according to claim 1, characterized in that, in step (6), the analysis results produced by the index analysis algorithm are divided into two groups, namely a training set and a test set;
the existing data are randomly split into the training set and the test set at a ratio of 7:3, wherein 70% of the samples form the training set used to train the model and the remaining 30% form the test set used for the final evaluation of the model's predictive performance.
CN202111251878.9A 2021-10-26 2021-10-26 Method for developing gene detection Panel clinical diagnosis index algorithm Pending CN114203261A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111251878.9A CN114203261A (en) 2021-10-26 2021-10-26 Method for developing gene detection Panel clinical diagnosis index algorithm

Publications (1)

Publication Number Publication Date
CN114203261A true CN114203261A (en) 2022-03-18

Family

ID=80646355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111251878.9A Pending CN114203261A (en) 2021-10-26 2021-10-26 Method for developing gene detection Panel clinical diagnosis index algorithm

Country Status (1)

Country Link
CN (1) CN114203261A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451419A (en) * 2017-07-14 2017-12-08 浙江大学 Method for generating reduced-representation DNA methylation sequencing data by computer program simulation
CN109136371A (en) * 2018-07-25 2019-01-04 南京世和基因生物技术有限公司 Gene combination related to radiotherapy efficacy and toxic reactions, detection probe library and detection kit
CN109880910A (en) * 2019-04-25 2019-06-14 南京世和基因生物技术有限公司 Detection site combination, detection method, detection kit and system for tumor mutation burden
CN111826447A (en) * 2020-09-21 2020-10-27 求臻医学科技(北京)有限公司 Method for detecting tumor mutation burden and prediction model
CN112029861A (en) * 2020-09-07 2020-12-04 臻悦生物科技江苏有限公司 Tumor mutation burden detection device and method based on capture sequencing technology
US20210020314A1 (en) * 2018-03-30 2021-01-21 Juno Diagnostics, Inc. Deep learning-based methods, devices, and systems for prenatal testing
CN112786103A (en) * 2020-12-31 2021-05-11 普瑞基准生物医药(苏州)有限公司 Method and device for analyzing the feasibility of a targeted sequencing Panel for estimating tumor mutation burden
CN113517066A (en) * 2020-08-03 2021-10-19 东南大学 Depression assessment method and system based on candidate gene methylation sequencing and deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
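冉冰冰;梁楠;孙辉: "Research progress on the application of omics technologies in precision cancer diagnosis and treatment: from single-omics analysis to multi-omics integration", Chinese Journal of Cancer Biotherapy (中国肿瘤生物治疗杂志), no. 12, 25 December 2019 (2019-12-25) *
徐云碧;杨泉女;郑洪建;许彦芬;桑志勤;郭子锋;彭海;张丛;蓝昊发;王蕴波;吴坤生;陶家军;张嘉楠: "Genotyping by targeted sequencing (GBTS) technology and its applications", Scientia Agricultura Sinica (中国农业科学), no. 15, 1 August 2020 (2020-08-01) *
陈如萍;刘蕊: "Application of next-generation sequencing technology in the diagnosis and treatment of colorectal cancer", Tianjin Medical Journal (天津医药), no. 09, 15 September 2020 (2020-09-15) *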

Similar Documents

Publication Publication Date Title
CN109022553B (en) Gene chip for tumor mutation burden detection, preparation method thereof and device
CN107403074B (en) A kind of detection method and device of mutain
CN112397151B (en) Methylation marker screening and evaluating method and device based on target capture sequencing
CN108319813A (en) Method and device for detecting copy number variation in circulating tumor DNA
CN109706065A (en) Tumor neogenetic antigen load detection device and storage medium
CN106446597B (en) Several species feature selecting and the method for identifying unknown gene
CN113096728B (en) Method, device, storage medium and equipment for detecting tiny residual focus
CN115052994A (en) Method for determining base type of predetermined site in chromosome of embryonic cell and application thereof
CN116825188B (en) Method, device and computer readable storage medium for identifying tumor neoantigen at multiple groups of chemical layers based on high-throughput sequencing technology
CN112746097A (en) Method for detecting sample cross contamination and method for predicting cross contamination source
CN111584006A (en) Circular RNA identification method based on machine learning strategy
CN114898803B (en) Mutation detection analysis method, device, readable medium and apparatus
CN112837748A (en) System and method for distinguishing tumors of different anatomical origins
CN113096737A (en) Method and system for automatically analyzing pathogen types
CN114203261A (en) Method for developing gene detection Panel clinical diagnosis index algorithm
CN114496089B (en) Pathogenic microorganism identification method
CN107885972A (en) Fusion detection method based on single-end sequencing and its application
CN114067908B (en) Method, device and storage medium for evaluating single-sample homologous recombination defects
CN109215736A (en) A kind of high-flux detection method of enterovirus group and application
CN113355426B (en) Evaluation gene set and kit for predicting liver cancer prognosis
CN111411167A (en) DNA fingerprint atlas database of tobacco variety and application thereof
CN110684830A (en) RNA analysis method for paraffin section tissue
CN113793641B (en) Method for rapidly judging sample gender from FASTQ file
CN117577182B (en) System for rapidly identifying drug identification sites and application thereof
CN116312786B (en) Single cell expression pattern difference evaluation method based on multi-group comparison

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination