CN117316289A - Methylation sequencing typing method and system for central nervous system tumor
- Publication number: CN117316289A
- Application number: CN202311144613.8A
- Authority: CN (China)
- Prior art keywords: methylation, sequencing, sample, tumor, sequence
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
Abstract
The invention discloses a methylation sequencing typing method and system for central nervous system (CNS) tumors. The method comprises: extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA; analyzing the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA, and constructing a methylation dataset based on the methylation status and a public dataset; and learning, by a random forest algorithm, classification rules obtained by analyzing a plurality of methylation datasets, and classifying the subtype of the tumor sample according to the learned rules. By standardizing data detection and analysis, the method yields standardized CNS tumor typing results, accurately distinguishes different CNS tumor subtypes, and avoids the problems of similar histological features among subtypes and human interpretation errors. The methylation capture sequencing platform is designed specifically for CNS tumor identification; sites beneficial to CNS tumor diagnosis can be added in a targeted manner, so the platform remains extensible.
Description
Technical Field
The invention belongs to the technical field of medical detection, and in particular relates to a methylation sequencing typing method and system for central nervous system tumors.
Background
The central nervous system (CNS) is composed of a wide variety of cell types, which in turn gives rise to a wide variety of CNS tumor subtypes. The World Health Organization lists over 100 brain tumor subtypes in its CNS tumor classification, many of which exhibit overlapping histological features. Furthermore, even histologically identical tumors may belong to different molecular subgroups, with very different therapeutic requirements and prognoses. Thus, more advanced diagnostic tools are needed to distinguish between different subtypes of CNS tumors.
At present, CNS tumors are diagnosed mainly by staining pathological sections and interpreting them visually. CNS tumors are complex and diverse, subtypes share similar histological features, and manual interpretation leads to inconsistent diagnoses between physicians. Many studies have documented inconsistent diagnoses of glioma among pathologists, which can affect clinical decisions. For example, in a systematic study of the diagnoses of a cohort of 500 brain tumor patients, some degree of diagnostic disagreement was found for 42.8% of the patients, and the disagreement was considered serious for 8.8%. In a study of diagnostic discrepancies in adult glioma, 26% of 457 referred cases had inconsistent diagnoses, and the rate of inconsistency was higher among cases referred from community hospitals. Of these inconsistent diagnoses, 16% were considered clinically significant, altering patient management and prognosis. A prospective study of 244 cases by 4 pathologists showed that diagnostic consistency could be improved by repeated review: in the first round the four experts agreed on only 52% of cases, and after the fourth round this proportion rose to 69%. Thus, for complex CNS tumors, where accuracy is paramount, more than 30% of cases still fail to reach consensus even after repeated review by the most experienced pathologists.
Another current approach to CNS tumor diagnosis is the typing method of the German tumor center, which is based on Illumina 450K methylation chip data. The 450K chip is an older Illumina methylation chip that has been discontinued; it carries about 450,000 probes and detects the corresponding 450,000 CpG sites. In the German method, 2800 CNS tumors were profiled on 450K chips, the 32000 probe sites with the largest variation were selected, and unsupervised clustering of the 2800 samples yielded 91 methylation-based subclasses. A random forest machine learning algorithm was then trained using the 91 subclasses as labels, and the 10000 most important probe sites for distinguishing the 91 subclasses were selected. The subclass of a CNS tumor is predicted by the random forest algorithm from the data of these 10000 methylation sites.
However, the chip technology platform has problems that cannot be ignored:
the chip generates fluorescent signals through probe hybridization and calculates methylation values by scanning the fluorescence intensity, so the values are derived from analog signals rather than being direct methylation measurements; differences in excitation wavelength and intensity between the methylated and unmethylated fluorescence channels also affect the accuracy of the methylation value; nonspecific hybridization occurs during probe hybridization, including hybridization of methylated templates to unmethylated probes, of unmethylated templates to methylated probes, and of homologous genomic sequences to the probes, all of which produce false signals;
background interference during scanning biases the methylation values of the chip; SNPs at the site of interest can affect probe hybridization and further interfere with the determination of methylation values. The data for the initial assay and algorithm development were generated on the Illumina commercial 450K chip platform, but the 450K chip has been discontinued. Data from newer chips show obvious batch differences from the old chip because of format changes and other factors; the 850K chip has also been discontinued, and the currently used 935K chip adds new probes but removes some of the original ones, so that about 5% of the original probes may be lost, which also affects chip-based algorithms.
Disclosure of Invention
This section is intended to outline some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section, in the abstract, and in the title of the application to avoid obscuring their purpose; such simplifications or omissions should not be used to limit the scope of the invention.
The present invention has been made in view of the above problems and/or problems existing in the prior art.
In a first aspect of embodiments of the present invention, there is provided a methylation sequencing typing method for a central nervous system tumor, comprising: extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA;
analyzing the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA and constructing a methylation dataset based on the methylation status and a public dataset;
and learning, by a random forest algorithm, classification rules obtained by analyzing a plurality of methylation datasets, and classifying the subtype of the tumor sample to be classified according to the learned rules.
As a preferred embodiment of the methylation sequencing typing method for central nervous system tumors according to the present invention, wherein: analyzing the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA includes,
designing and customizing a methylation capture probe panel for the target regions;
capturing the target sequences with the methylation capture probe panel and constructing a sequencing library;
hybridizing the sequencing library with the methylation capture probe panel, amplifying the eluted product by PCR, sequencing it on a sequencer, and detecting the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA.
As a preferred embodiment of the methylation sequencing typing method for central nervous system tumors according to the present invention, wherein: the region design of the methylation capture probe panel includes,
the regions of the methylation capture probe panel may cover CpG islands, promoter regions and enhancer regulatory elements across the whole genome, or may target the methylation regions that differ between CNS tumor subtypes;
the methylation regions that differ between CNS tumor subtypes are calculated from GEO datasets: the samples in the GEO datasets are clustered without supervision by the t-SNE method to obtain a plurality of subtypes, the sites that differ between CNS tumor subtypes in the GEO datasets are analyzed with a random forest machine learning algorithm, the 1000-40000 most important probe sites capable of distinguishing the subtypes are selected, and the probe panel is then designed for those sites.
As a preferred embodiment of the methylation sequencing typing method for central nervous system tumors according to the present invention, wherein the method comprises,
in the sequence design of the methylation capture probe panel, the target DNA fragments may be captured first and then converted, where the conversion may be bisulfite conversion, enzymatic conversion or borane conversion; in this case the probe sequence is complementary to the original DNA sequence, and probes complementary to the sense strand, the antisense strand, or both may be designed;
when the target DNA fragments are captured first and then bisulfite-converted, methylation status detection comprises: taking a small amount of sample genomic DNA, fragmenting it and constructing a sequencing library, which mainly involves end-repairing the fragmented DNA, adding an A at the 3' end and ligating sequencing adapters; hybridizing the library with the methylation capture probe panel; bisulfite-converting the captured product; amplifying the converted product by PCR; and sequencing it to detect the methylation status.
As a preferred embodiment of the methylation sequencing typing method for central nervous system tumors according to the present invention, wherein the method comprises,
in the sequence design of the methylation capture probe panel, the DNA may alternatively be converted first and the target DNA fragments captured afterwards, where the conversion may be bisulfite conversion, enzymatic conversion or borane conversion; in this case the probe sequence is complementary to the converted sequence and may be designed as one or more of the following four sequences: the converted sequence of the sense strand, the converted sequence of the antisense strand, the sequence complementary to the converted sense strand, and the sequence complementary to the converted antisense strand;
when conversion is performed first and the target DNA fragments are captured afterwards, methylation status detection comprises double-stranded library construction or single-stranded library construction to obtain a converted library, hybridizing the library with the capture probes, amplifying the eluted product by PCR, and sequencing it to detect the methylation status;
the double-stranded library construction comprises taking a small amount of sample genomic DNA, fragmenting it and constructing a sequencing library, which mainly involves end-repairing the fragmented DNA, adding an A at the 3' end and ligating sequencing adapters, followed by bisulfite, enzymatic or borane conversion, hybridization with the methylation capture probe panel, and PCR amplification of the captured product to obtain a converted methylation library carrying sequencing adapters;
the single-stranded library construction comprises taking a small amount of sample genomic DNA, performing bisulfite, enzymatic or borane conversion to obtain single-stranded DNA fragments, and then constructing a single-stranded DNA library, the product being a converted methylation library carrying sequencing adapters.
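The two probe design options above differ in which sequence the probe must pair with. The following is a minimal sketch, not the patent's implementation, that simulates conversion in silico and lists the four candidate sequences named for the convert-then-capture workflow. It assumes the bisulfite or enzymatic readout, in which unmethylated C is read as T and methylated C stays C; borane-based chemistries invert this readout and are not modeled. The function names and the example fragment are hypothetical.

```python
# Illustrative sketch only. Assumes the bisulfite/enzymatic convention:
# unmethylated C is read as T, methylated C is read as C.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def convert(seq: str, methylated_positions=frozenset()) -> str:
    """Simulate conversion of one strand, written 5'->3'.

    `methylated_positions` holds 0-based indices of methylated cytosines
    on this strand; every other C becomes T.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def converted_probe_targets(sense, meth_sense=frozenset(), meth_antisense=frozenset()):
    """Return the four candidate sequences for convert-then-capture probes."""
    antisense = reverse_complement(sense)
    conv_sense = convert(sense, meth_sense)
    conv_antisense = convert(antisense, meth_antisense)
    return {
        "converted_sense": conv_sense,
        "converted_antisense": conv_antisense,
        "complement_of_converted_sense": reverse_complement(conv_sense),
        "complement_of_converted_antisense": reverse_complement(conv_antisense),
    }


# Hypothetical 8-base fragment with one symmetrically methylated CpG:
# index 1 on the sense strand pairs with index 5 on the antisense strand.
sense = "ACGTCCGA"
targets = converted_probe_targets(sense, frozenset({1}), frozenset({5}))
```

For the capture-then-convert workflow no conversion is simulated: a probe complementary to the sense strand is simply `reverse_complement(sense)`. After conversion the two strands are no longer complementary to each other, which is why four distinct sequences arise.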
As a preferred embodiment of the methylation sequencing typing method for central nervous system tumors according to the present invention, wherein: learning classification rules by a random forest model algorithm includes,
screening candidate methylation markers for CNS tumor identification against the screening criteria by analyzing a plurality of methylation datasets, selecting suitable candidate methylation marker sites, and filtering out low-quality sites;
inputting the filtered data into the random forest algorithm and further screening molecular markers from the remaining sites to obtain 1000-40000 probe methylation sites capable of distinguishing CNS tumor subtypes;
meanwhile, the random forest algorithm combines the methylation sites into a forest of decision trees; the forest comprises 1000-30000 decision trees, each decision tree comprises 200-300 nodes, and each node comprises a methylation marker and a threshold for that marker; at each node, the sample's methylation value for the marker is compared with the threshold to decide whether the sample is assigned to a subtype or passed to the next decision node; each decision tree produces one subtype decision and casts one vote for that subtype;
after the methylation values of all sites of a sample are input, the votes of the decision trees are tallied to give a score for each subtype; the CNS tumor subtype may be determined directly from these scores, or the scores may first undergo probability score correction, and the corrected prediction can serve as a more accurate basis for classifying the tumor sample.
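The node-and-vote mechanism described above can be sketched with a toy forest. This is a hand-built illustration, not a trained model: the marker names (`cg_a`, `cg_b`), thresholds and subtype labels are hypothetical, the forest has three trees instead of thousands, and the raw vote fractions are returned without any probability score correction.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    """A decision node: one methylation marker and its threshold."""
    marker: str
    threshold: float
    low: Union["Node", str]   # followed when value <= threshold
    high: Union["Node", str]  # followed when value > threshold


def classify_with_tree(tree, sample):
    """Walk one tree until a leaf (a subtype label) is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.high if sample[node.marker] > node.threshold else node.low
    return node


def forest_scores(forest, sample):
    """Each tree casts one vote; return the vote fraction per subtype."""
    votes = Counter(classify_with_tree(tree, sample) for tree in forest)
    total = sum(votes.values())
    return {subtype: count / total for subtype, count in votes.items()}


# Hypothetical three-tree forest over two markers.
FOREST = [
    Node("cg_a", 0.5, "subtype_B", "subtype_A"),
    Node("cg_b", 0.4, "subtype_A", "subtype_B"),
    Node("cg_a", 0.95, Node("cg_b", 0.05, "subtype_A", "subtype_B"), "subtype_A"),
]

sample = {"cg_a": 0.9, "cg_b": 0.1}
scores = forest_scores(FOREST, sample)
predicted = max(scores, key=scores.get)
```

Here two of the three trees vote for `subtype_A`, so it receives a raw score of 2/3. In practice the trees would be learned from labeled methylation data, and a calibration step could map raw vote fractions to corrected probability scores.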
As a preferred embodiment of the methylation sequencing typing method for central nervous system tumors according to the present invention, wherein: the screening criteria for candidate methylation markers for CNS tumor identification include,
the methylation values of the site correlate well between the chip platform and the sequencing platform;
the site is a high-quality methylation site usable for classification, that is, it reflects the characteristics of a CNS tumor subtype by being hypermethylated or hypomethylated only in specific subtypes.
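The first criterion, agreement between chip and sequencing values, could be checked per site with a correlation coefficient. The sketch below assumes paired measurements of the same samples on both platforms and uses Pearson correlation with a cutoff of 0.9; both the choice of statistic and the cutoff are assumptions for illustration, since the text does not specify them. Site identifiers and values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant site carries no correlation information
    return cov / sqrt(var_x * var_y)


def concordant_sites(chip, sequencing, min_r=0.9):
    """Keep sites measured on both platforms whose values correlate well.

    `chip` and `sequencing` map site id -> methylation values, with the
    samples in the same order on both platforms.
    """
    return [
        site
        for site in chip
        if site in sequencing and pearson(chip[site], sequencing[site]) >= min_r
    ]


# Hypothetical paired measurements for three samples.
chip = {"cg_1": [0.10, 0.50, 0.90], "cg_2": [0.10, 0.50, 0.90]}
sequencing = {"cg_1": [0.12, 0.48, 0.91], "cg_2": [0.90, 0.50, 0.10]}
kept = concordant_sites(chip, sequencing)
```

In this example `cg_1` tracks closely across platforms and is kept, while `cg_2` is anti-correlated and is dropped. The second criterion, subtype-specific hyper- or hypomethylation, would need subtype labels and is not modeled here.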
In a second aspect of embodiments of the present invention, there is provided a methylation sequencing typing system for a central nervous system tumor, comprising:
a sample acquisition unit for extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA;
a state analysis unit for analyzing the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA and constructing a methylation dataset based on the methylation status and a public dataset;
and a classification learning unit for learning, by a random forest algorithm, classification rules obtained by analyzing a plurality of methylation datasets, and classifying the subtype of the tumor sample to be classified according to the learned rules.
In a third aspect of embodiments of the present invention, there is provided an apparatus, comprising,
a processor;
a memory for storing processor-executable instructions;
the processor is configured to invoke the instructions stored in the memory to perform the method according to any of the embodiments of the present invention.
In a fourth aspect of embodiments of the present invention, there is provided a computer readable storage medium having stored thereon computer program instructions comprising:
the computer program instructions, when executed by a processor, implement a method according to any of the embodiments of the present invention.
Compared with the prior art, the invention has the beneficial effects that:
(1) Standardized CNS tumor typing results are obtained by standardizing data detection and analysis. This avoids the current situation in which clinical diagnoses of CNS tumors differ between hospitals or between doctors; methylation marker analysis yields standardized and highly reproducible typing results.
(2) Compared with visual interpretation, the typing of the invention is more accurate. Tens of thousands of methylation markers carry far more dimensions of information than stained pathological sections, so different CNS tumor subtypes can be accurately distinguished, and the problems of similar histological features among subtypes and human interpretation errors are avoided.
(3) The technical platform is designed specifically for CNS tumor identification and solves the problem of chip iteration. The capture product keeps the probe set stable: sites that benefit CNS typing are only added, and useful sites are never removed. This avoids the loss of identification sites caused by chip upgrades, which would undermine the stability of the database and the value of previously accumulated data.
(4) The technical platform of the invention is based on sequencing data. The main factor affecting data accuracy is sequencing depth; with sufficient depth the accuracy improves and the result approaches the true value. Chip technology cannot be improved in this way, because probe sequence and hybridization background problems keep its values far from the true value.
(5) The methylation capture sequencing platform is designed specifically for CNS tumor identification; sites beneficial to CNS tumor diagnosis can be added in a targeted manner simply by adding new capture probes, so the platform remains extensible.
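The link between depth and accuracy in point (4) can be made concrete with a simple sampling model. The sketch below treats the reads covering one CpG site as independent binomial draws, which is an assumption made here for illustration: it captures only sampling noise and ignores conversion efficiency, amplification bias and alignment error. The function names are hypothetical.

```python
from math import sqrt


def methylation_level(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of reads supporting methylation at one CpG site."""
    depth = methylated_reads + unmethylated_reads
    if depth == 0:
        raise ValueError("site has no coverage")
    return methylated_reads / depth


def binomial_standard_error(level: float, depth: int) -> float:
    """Standard error of the estimated level under binomial sampling."""
    return sqrt(level * (1.0 - level) / depth)


# A site with true level 0.5 observed at increasing depth.
se_100 = binomial_standard_error(0.5, 100)
se_400 = binomial_standard_error(0.5, 400)
```

Under this model the standard error at a half-methylated site is about 0.05 at 100x and about 0.025 at 400x, so quadrupling the depth halves the sampling error.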
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
FIG. 1 is a flow chart showing a method and system for methylation sequencing typing of a central nervous system tumor according to the present invention;
FIG. 2 is a t-SNE cluster map of CNS tumor subtypes for the method and system for methylation sequencing typing of CNS tumors provided by the invention;
FIG. 3 is a schematic diagram of a random forest decision tree of the method and system for methylation sequencing typing of CNS tumors provided by the invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
While the embodiments of the present invention have been illustrated and described in detail in the drawings, the cross-sectional view of the device structure is not to scale in the general sense for ease of illustration, and the drawings are merely exemplary and should not be construed as limiting the scope of the invention. In addition, the three-dimensional dimensions of length, width and depth should be included in actual fabrication.
Also in the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "upper, lower, inner and outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the system or element to be referred to must have a specific direction, be constructed and operated in the specific direction, and thus should not be construed as limiting the present invention. Furthermore, the terms "first, second, or third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected, and coupled" should be construed broadly in this disclosure unless otherwise specifically indicated and defined, such as: can be fixed connection, detachable connection or integral connection; it may also be a mechanical connection, an electrical connection, or a direct connection, or may be indirectly connected through an intermediate medium, or may be a communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Example 1
Referring to FIG. 1 to FIG. 3, one embodiment of the present invention provides a methylation sequencing typing method for central nervous system tumors, which mainly comprises the following steps:
S1: extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA. It should be noted that:
the tumor sample to be classified is obtained from the patient's tumor, and genomic DNA is extracted from it using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA) or a similar kit.
S2: analyzing the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA. It should be noted that:
the analysis of methylation status includes:
designing and customizing a methylation capture probe panel for the target regions; capturing the target sequences with the methylation capture probe panel and constructing a sequencing library; hybridizing the sequencing library with the methylation capture probe panel, amplifying the eluted product by PCR, sequencing it on a sequencer, and detecting the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA.
Furthermore, the design range of the methylation capture probe panel can cover CpG islands, promoter regions and enhancer regulatory elements across the whole genome, and can also target regions that are differentially methylated between CNS tumor subtypes;
specifically, the differentially methylated regions between CNS tumor subtypes are calculated from GEO data sets such as GSE90496, GSE122994, GSE143843, GSE58218, GSE121723 and GSE74193; the samples in the GEO data sets are clustered in a self-organizing manner by the t-SNE clustering method to obtain 91 subtypes; a random forest machine learning algorithm is used to analyze the sites that differ between the CNS tumor subtypes in the GEO data sets; the 10000 most important probe sites capable of distinguishing the 91 subtypes are screened out; and the probe panel is then designed to target them, as shown in fig. 2;
the specific steps for screening the probe sites are as follows:
the batch effect between chips run on FFPE samples and on fresh samples is corrected using a linear mixed model (for the 2801 samples from the German tumor center); probes on the X and Y chromosomes, probes affected by SNP loci, probes that cannot be uniquely aligned to the genome, and the like are filtered out; a random forest classification model is built with the random forest R package, and the 40000 probes with the highest importance are finally selected;
methylation capture sequencing data and chip data in the in-house database are then analyzed, the methylation capture sequencing data are matched to the top 40000 probe sites by position, and 10000 methylation sites that are present in the methylation capture sequencing results with good coverage and high consistency are finally obtained.
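The screening steps above can be illustrated with a minimal Python sketch. It is not the procedure used in the invention: the `Probe` record, the probe identifiers, the `min_depth` cutoff and the precomputed `importance` values (standing in for random forest variable importance) are all assumptions made for illustration, and the consistency check between chip and sequencing values is omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Probe:
    probe_id: str          # hypothetical identifier
    chrom: str
    pos: int
    importance: float      # assumed to come from a fitted random forest
    snp_affected: bool
    uniquely_mapped: bool

def screen_probes(probes, top_n):
    """Drop sex-chromosome, SNP-affected and non-uniquely-mapped probes,
    then keep the top_n remaining probes ranked by importance."""
    kept = [p for p in probes
            if p.chrom not in {"chrX", "chrY"}
            and not p.snp_affected
            and p.uniquely_mapped]
    kept.sort(key=lambda p: p.importance, reverse=True)
    return kept[:top_n]

def match_to_capture(probes, capture_depth, min_depth):
    """Keep probes whose genomic position is covered at >= min_depth
    in the methylation capture sequencing data."""
    return [p for p in probes
            if capture_depth.get((p.chrom, p.pos), 0) >= min_depth]

probes = [
    Probe("cg01", "chr1", 100, 0.9, False, True),
    Probe("cg02", "chrX", 200, 0.8, False, True),   # sex chromosome
    Probe("cg03", "chr2", 300, 0.7, True, True),    # SNP-affected
    Probe("cg04", "chr3", 400, 0.6, False, True),
    Probe("cg05", "chr4", 500, 0.5, False, False),  # not uniquely mapped
    Probe("cg06", "chr5", 600, 0.4, False, True),
]
top = screen_probes(probes, top_n=2)
depth = {("chr1", 100): 50, ("chr3", 400): 5}       # toy coverage table
final = match_to_capture(top, depth, min_depth=30)
```

In the text the corresponding cutoffs are 40000 probes after importance ranking and 10000 sites after matching to the capture sequencing data; the toy values here only show the order of operations.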
Further, two designs can be used for the probe sequences, corresponding to two experimental workflows:
a. Capture of target DNA fragments first, then conversion: the probe sequence is complementary to the original DNA sequence, and probes complementary to the sense strand, the antisense strand, or both can be designed;
b. Conversion first, then capture of target DNA fragments: the probe sequence is complementary to the converted sequence. The conversion may be sulfite conversion, enzymatic conversion (e.g. using TET, APOBEC or the like, as in the NEB Enzymatic Methyl-seq Kit or similar kits and methods) or borane conversion (e.g. TAPS). The probes can be designed as one or a combination of the following four sequences: the converted sense strand sequence, the converted antisense strand sequence, the sequence complementary to the converted sense strand sequence, and the sequence complementary to the converted antisense strand sequence.
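The four candidate sequences named in design b can be derived in silico. The sketch below is an illustration under stated assumptions, not the probe design procedure of the invention: it assumes a fully unmethylated template (every C is converted and read as T), and the function names and example sequence are hypothetical. A real design would likely also need to account for methylated CpG positions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(_COMPLEMENT)[::-1]

def convert(seq, methylated_positions=frozenset()):
    """In-silico conversion: each C becomes T unless its 0-based
    position is listed as methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def converted_probe_candidates(sense):
    """Build the four sequences listed for the conversion-first design."""
    antisense = reverse_complement(sense)
    conv_sense = convert(sense)
    conv_antisense = convert(antisense)
    return {
        "converted_sense": conv_sense,
        "converted_antisense": conv_antisense,
        "complement_of_converted_sense": reverse_complement(conv_sense),
        "complement_of_converted_antisense": reverse_complement(conv_antisense),
    }

candidates = converted_probe_candidates("ACGTCCGA")
for name, seq in candidates.items():
    print(name, seq)
```

Because conversion acts on each strand independently, the converted sense and antisense strands are no longer complementary to each other, which is why four distinct sequences arise instead of two.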
Further, constructing a sequencing library and sequencing to detect the methylation status includes:
(1) Scheme one: target DNA fragments are captured first, and sulfite conversion is performed afterwards;
500 ng to 1 μg of genomic DNA is fragmented and used to construct a sequencing library, mainly by end-repairing the fragmented DNA, adding A at the 3' end and adding sequencing adapters; the library is then hybridized with the capture probes and the capture product is sulfite-converted, for which the EZ DNA Methylation Gold Kit (ZYMO Research, Irvine, CA) or a similar kit can be used;
after sulfite conversion, unmethylated cytosine (C) in the DNA sequence is converted to uracil (U), and U is read as thymine (T) after amplification, whereas methylated cytosine (5mC) is not altered by sulfite treatment and remains cytosine.
It should be noted that sulfite conversion turns the methylation modification state into a base sequence difference that is easy to identify, so that the methylation state can be detected by sequencing and similar means; the conversion product is amplified by PCR and then sequenced on the instrument to detect the methylation state.
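The C/T readout described above leads directly to a per-site methylation estimate. The following Python sketch shows one common way to compute it; the function name and the `min_depth` cutoff of 10 reads are assumptions for illustration and are not specified in the text. It applies to sulfite-style conversion, where a protected (methylated) cytosine is read as C; for TAPS-style borane conversion the signal is inverted.

```python
def methylation_level(observed_bases, min_depth=10):
    """Estimate the methylation level at one reference cytosine.

    observed_bases: bases read at that position on converted reads
    from the same strand. 'C' means the cytosine was protected
    (methylated); 'T' means it was converted (unmethylated).
    Other characters are ignored as uninformative.
    Returns None when fewer than min_depth informative reads exist.
    """
    c_reads = observed_bases.count("C")
    t_reads = observed_bases.count("T")
    depth = c_reads + t_reads
    if depth < min_depth:
        return None
    return c_reads / depth

high = methylation_level("C" * 45 + "T" * 5)   # 45 of 50 reads protected
low_depth = methylation_level("CCTT")          # too few reads to call
```

The depth cutoff reflects the point made elsewhere in the description that sequencing depth is the main factor governing the accuracy of the measured methylation value.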
(2) Scheme two: sulfite conversion is performed first, and target DNA fragments are captured afterwards;
in scheme two, conversion is performed before capture, and the scheme can be divided into two workflows, double-stranded library construction and single-stranded library construction:
a. Double-stranded library construction: 500 ng to 1 μg of genomic DNA is fragmented and used to construct a sequencing library, mainly by end-repairing the fragmented DNA, adding A at the 3' end and adding sequencing adapters, followed by sulfite, enzymatic or borane conversion; the product is a converted methylation library with sequencing adapters.
b. Single-stranded library construction: 500 ng to 1 μg of genomic DNA is subjected to sulfite, enzymatic or borane conversion, the conversion product being single-stranded DNA fragments; a single-stranded DNA library is then constructed using a PBAT scheme or a similar scheme, for example the IDT xGen Methyl-Seq Library Prep Kit (formerly the Accel-NGS Methyl-Seq Library Kit) or a similar kit; the product is a converted methylation library with sequencing adapters.
After the converted library is obtained, it is hybridized with the capture probes, and the eluted product is amplified by PCR and then sequenced on the instrument to detect the methylation state.
S3: Learn, through a random forest model algorithm, the classification rules obtained by analyzing a plurality of methylation data sets, and classify the subtype of the tumor sample to be classified according to the learning result. It should be noted that:
the classification rules are obtained by analyzing multiple methylation data sets, including public data sets (e.g. GEO data sets GSE90496, GSE122994, GSE143843, GSE58218, GSE121723 and GSE74193, which contain 450K chip results), an in-house data set containing 850K chip results for a series of CNS samples, and methylation capture sequencing results;
by analyzing the multiple methylation data sets, candidate methylation markers for CNS tumor identification are screened against defined criteria: suitable candidate methylation marker sites are selected and low-quality sites are filtered out;
the screening criteria for candidate methylation markers for CNS tumor identification include:
(1) the methylation values measured on the chip and on the sequencing platform remain well correlated;
(2) the site captures the characteristics of individual CNS tumor subtypes, i.e. it is hypermethylated or hypomethylated only in specific subtypes, making it a high-quality methylation site usable for classification.
The filtered data are fed into the random forest model algorithm, and molecular markers are further screened from the remaining sites to obtain 10000 probe methylation sites capable of distinguishing CNS tumor subtypes;
meanwhile, the random forest model algorithm combines the methylation sites into a random forest of decision trees built on those sites. The random forest contains 1000-30000 decision trees, each decision tree contains 200-300 nodes, and each node contains one methylation marker and its decision threshold. At each node, the relation between the sample's methylation value and the threshold determines whether the sample is assigned to a subtype or passed on to the next decision node. Each decision tree thus produces one decision on the subtype and casts one vote for that subtype;
after the methylation values of all sites of a sample are input, the votes of the decision trees are collected and tallied to give a score for each subtype, as shown in fig. 3. In short, the subtype with the highest score across the decision trees is taken as the final subtype identification result of the random forest.
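The node-threshold decisions and per-tree voting described above can be sketched in a few lines of Python. This is a toy illustration only: the tree encoding (a nested tuple per node, a string per leaf), the site names and the subtype labels are hypothetical, and the three hand-written trees stand in for the 1000-30000 trained trees mentioned in the text.

```python
from collections import Counter

def predict_tree(node, sample):
    """Walk one decision tree. An internal node is a tuple
    (site, threshold, low_branch, high_branch); a leaf is a subtype name."""
    while not isinstance(node, str):
        site, threshold, low_branch, high_branch = node
        node = low_branch if sample[site] <= threshold else high_branch
    return node

def forest_scores(trees, sample):
    """Each tree casts one vote; the score of a subtype is its vote share."""
    votes = Counter(predict_tree(tree, sample) for tree in trees)
    return {subtype: count / len(trees) for subtype, count in votes.items()}

trees = [
    ("cg01", 0.5, "subtype_A", "subtype_B"),
    ("cg02", 0.3, "subtype_A", ("cg01", 0.8, "subtype_B", "subtype_C")),
    ("cg03", 0.6, "subtype_B", "subtype_A"),
]
sample = {"cg01": 0.9, "cg02": 0.7, "cg03": 0.2}   # methylation values per site

scores = forest_scores(trees, sample)
best = max(scores, key=scores.get)
print(best, scores)
```

Here two of the three trees vote for `subtype_B`, so it receives the highest score and is reported as the raw random forest result.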
After the prediction result of the random forest is obtained, probability score correction is carried out: the model uses an L2-norm regularized multinomial logistic regression algorithm (Multinomial Logistic Regression) to correct the raw scores predicted by the random forest algorithm and obtain corrected probability scores.
It should be noted that the glmnet R package is used to fit the L2-regularized multinomial logistic regression model, with the methylation classification result as the response variable and the random forest model scores as the explanatory variables. The L2 penalty coefficient is determined by 10-fold cross-validation, so as to reduce overfitting, correct the scores of the classification model and improve the classification prediction performance of the random forest model.
The raw probability scores predicted by the random forest model are used as input to the trained probability score correction model to obtain the corrected classification result and probability scores. The classification model can classify the sample under test but cannot predict an accurate probability for each class; the probability score correction model yields corrected probability scores, which are taken as the final prediction result, as shown in Table 1.
Table 1: Corrected random forest model prediction probability scores.
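The probability score correction step can be illustrated with a small, self-contained Python sketch of L2-penalized multinomial logistic regression fitted by plain gradient descent. It is not a reproduction of the glmnet fit described above: the penalty `l2` is fixed by hand instead of being chosen by 10-fold cross-validation, the learning rate and epoch count are arbitrary, and the toy scores and labels are invented for illustration.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def logits(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def fit_calibrator(scores, labels, n_classes, l2=0.01, lr=0.5, epochs=1000):
    """Minimize mean multinomial log-loss + (l2 / 2) * ||W||^2.
    The intercepts b are left unpenalized."""
    n, n_feat = len(scores), len(scores[0])
    W = [[0.0] * n_feat for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[l2 * w for w in row] for row in W]   # penalty gradient
        grad_b = [0.0] * n_classes
        for x, y in zip(scores, labels):
            p = softmax(logits(W, b, x))
            for k in range(n_classes):
                err = (p[k] - (1.0 if k == y else 0.0)) / n
                grad_b[k] += err
                for j in range(n_feat):
                    grad_W[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * grad_b[k]
            for j in range(n_feat):
                W[k][j] -= lr * grad_W[k][j]
    return W, b

def calibrate(W, b, raw_scores):
    """Map raw random forest scores to corrected class probabilities."""
    return softmax(logits(W, b, raw_scores))

# Toy raw random forest scores (vote shares) and true subtype indices.
raw = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2], [0.2, 0.7, 0.1],
       [0.3, 0.5, 0.2], [0.1, 0.2, 0.7], [0.2, 0.3, 0.5]]
truth = [0, 0, 1, 1, 2, 2]

W, b = fit_calibrator(raw, truth, n_classes=3)
corrected = calibrate(W, b, [0.55, 0.30, 0.15])
predicted = max(range(3), key=lambda k: corrected[k])
```

The response is the known subtype and the explanatory variables are the raw forest scores, mirroring the roles assigned in the text; the corrected scores sum to one and can be read as class probabilities.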
Compared with the prior art, the beneficial effects of the present invention are as follows:
(1) Standardizing data detection and analysis yields standardized CNS tumor typing results. This avoids the current situation in which clinical diagnoses of CNS tumors are inconsistent between hospitals or between doctors; methylation marker analysis provides standardized and highly reproducible typing results.
(2) Compared with judgment by eye, typing by the invention is more accurate. Tens of thousands of methylation markers carry far more dimensions of information than the staining results of pathological sections, so different CNS tumor subtypes can be accurately distinguished, and problems caused by similar histological features between subtypes and by human interpretation errors are avoided.
(3) The technical platform is designed specifically for CNS tumor identification and solves the problem of chip update iterations. Because the capture product is designed specifically for CNS tumor identification, the stability of the probe set can be ensured: sites that benefit CNS typing are only added, and useful sites are not removed. This avoids the loss of identification sites caused by chip upgrades, which would affect the stability of the database and the value of previously accumulated data.
(4) The technical platform of the invention is based on sequencing data. The main factor affecting data accuracy is sequencing depth: as long as the depth is sufficient, accuracy improves and the result approaches the true value. Chip technology, by contrast, is limited by probe sequence and hybridization background problems, and where its results deviate too far from the true value they cannot be improved.
(5) The methylation capture sequencing platform is designed specifically for CNS tumor identification. Sites that benefit CNS tumor diagnosis can be added in a targeted manner, and the platform remains extensible: it is only necessary to add new capture probes.
In a second aspect of the present disclosure,
there is provided a methylation sequencing typing system for a central nervous system tumor, comprising:
a sample acquisition unit for extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA;
a state analysis unit for analyzing the methylation states of a plurality of independent genomic CpG positions in the sample genomic DNA and constructing a methylation data set based on the methylation states and a public data set;
a classification learning unit for learning, through a random forest model algorithm, the classification rules obtained by analyzing a plurality of methylation data sets, and for classifying the subtype of the tumor sample to be classified according to the learning result.
In a third aspect of the present disclosure,
there is provided an apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform any of the methods described above.
In a fourth aspect of the present disclosure,
there is provided a computer readable storage medium having computer program instructions stored thereon,
wherein the computer program instructions, when executed by a processor, implement any of the methods described above.
The present invention may be a method, apparatus, system, and/or computer program product, which may include a computer-readable storage medium having computer-readable program instructions embodied thereon for performing various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, and it should be covered in the scope of the present invention.
Claims (10)
1. A methylation sequencing typing method for a central nervous system tumor, characterized by comprising the following steps:
extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA;
analyzing the methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA, and constructing a methylation data set based on the methylation status and a public data set;
learning, through a random forest model algorithm, classification rules obtained by analyzing a plurality of methylation data sets, and classifying the subtype of the tumor sample to be classified according to the learning result.
2. The methylation sequencing typing method of a central nervous system tumor of claim 1, wherein: analysis of methylation status of a plurality of independent genomic CpG positions in the sample genomic DNA includes,
designing and customizing a methylation capture probe panel of a target region;
capturing a target sequence based on the methylation capture probe panel and constructing a sequencing library;
hybridizing the sequencing library with the methylation capture probe panel, amplifying an eluted product by PCR, performing on-machine sequencing, and detecting methylation states of a plurality of independent genome CpG positions in the genomic DNA of the sample.
3. The methylation sequencing typing method of a central nervous system tumor of claim 2, wherein: the regional design of the methylated capture probe panel includes,
the design range of the methylation capture probe panel can cover CpG islands, promoter regions and enhancer regulatory elements across the whole genome, and can also target regions that are differentially methylated between CNS tumor subtypes;
the differentially methylated regions between CNS tumor subtypes are calculated from GEO data sets; the samples in the GEO data sets are clustered in a self-organizing manner by the t-SNE clustering method to obtain multiple subtypes; a random forest machine learning algorithm is used to analyze the sites that differ between the CNS tumor subtypes in the GEO data sets; the 1000-40000 most important probe sites capable of distinguishing the multiple subtypes are screened out; and the probe panel is then designed to target them.
4. The methylation sequencing typing method of a central nervous system tumor of claim 3, characterized in that:
the sequence design of the methylation capture probe panel can follow a workflow in which target DNA fragments are captured first and sulfite, enzymatic or borane conversion is performed afterwards, in which case the probe sequence is complementary to the original DNA sequence, and probes complementary to the sense strand, the antisense strand, or both can be designed;
methylation state detection in the workflow in which target DNA fragments are captured first and sulfite conversion is performed afterwards includes: taking a small amount of sample genomic DNA, fragmenting it and constructing a sequencing library, mainly by end-repairing the fragmented DNA, adding A at the 3' end and adding sequencing adapters; hybridizing the library with the methylation capture probe panel; performing sulfite conversion on the capture product; and amplifying the conversion product by PCR and sequencing it to detect the methylation state.
5. The methylation sequencing typing method of a central nervous system tumor of claim 3, characterized in that:
the sequence design of the methylation capture probe panel can also follow a workflow in which conversion is performed first and target DNA fragments are captured afterwards, wherein the conversion can be sulfite conversion, enzymatic conversion or borane conversion, the probe sequence is complementary to the converted sequence, and the probes can be designed as one or more of the following four sequences: the converted sense strand sequence, the converted antisense strand sequence, the sequence complementary to the converted sense strand sequence, and the sequence complementary to the converted antisense strand sequence;
methylation state detection in the workflow in which sulfite conversion is performed first and target DNA fragments are captured afterwards includes double-stranded library construction and single-stranded library construction; after the converted library is obtained, it is hybridized with the capture probes, and the eluted product is amplified by PCR and sequenced on the instrument to detect the methylation state;
the double-stranded library construction comprises: taking a small amount of sample genomic DNA, fragmenting it and constructing a sequencing library, mainly by end-repairing the fragmented DNA, adding A at the 3' end and adding sequencing adapters; performing sulfite, enzymatic or borane conversion; hybridizing with the methylation capture probe panel; and amplifying the capture product by PCR to obtain a converted methylation library with sequencing adapters;
the single-stranded library construction comprises: taking a small amount of sample genomic DNA and performing sulfite, enzymatic or borane conversion, the conversion product being single-stranded DNA fragments; and then constructing a single-stranded DNA library, the product being a converted methylation library with sequencing adapters.
6. The methylation sequencing typing method of a central nervous system tumor according to any one of claims 1 to 5, wherein: learning classification rules by a random forest model algorithm includes,
by analyzing a plurality of methylation data sets, screening candidate methylation markers for CNS tumor identification against defined criteria, selecting suitable candidate methylation marker sites and filtering out low-quality sites;
feeding the filtered data into a random forest model algorithm, and further screening molecular markers from the remaining sites to obtain 1000-40000 probe methylation sites capable of distinguishing CNS tumor subtypes;
meanwhile, the random forest model algorithm combines the methylation sites into a random forest of decision trees built on those sites, the random forest comprising 1000-30000 decision trees, each decision tree comprising 200-300 nodes, and each node comprising one methylation marker and its decision threshold; at each node, the relation between the sample's methylation value and the threshold determines whether the sample is assigned to a subtype or passed on to the next decision node, and each decision tree produces one decision on the subtype and casts one vote for that subtype;
after the methylation values of all sites of a sample are input, the votes of the decision trees are collected and tallied to give a score for each subtype; the CNS tumor subtype can be determined directly from these scores, or probability score correction can be carried out after the scores are obtained, and the prediction result after probability score correction can be used as the basis for more accurate classification of the tumor sample.
7. The methylation sequencing typing method of a central nervous system tumor of claim 6, wherein: screening criteria for candidate methylation markers identified for CNS tumors include,
the methylation values measured on the chip and on the sequencing platform remain well correlated;
the site captures the characteristics of individual CNS tumor subtypes, i.e. it is hypermethylated or hypomethylated only in specific subtypes, making it a high-quality methylation site usable for classification.
8. A system for performing the methylation sequencing typing method of a central nervous system tumor according to any one of claims 1 to 7, characterized by comprising:
a sample acquisition unit for extracting genomic DNA from a tumor sample to be classified to obtain sample genomic DNA;
a state analysis unit for analyzing the methylation states of a plurality of independent genomic CpG positions in the sample genomic DNA and constructing a methylation data set based on the methylation states and a public data set;
a classification learning unit for learning, through a random forest model algorithm, classification rules obtained by analyzing a plurality of methylation data sets, and for classifying the subtype of the tumor sample to be classified according to the learning result.
9. An apparatus, characterized in that the apparatus comprises,
a processor;
a memory for storing processor-executable instructions;
the processor is configured to invoke the instructions stored in the memory to perform the method of any of claims 1-7.
10. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311144613.8A CN117316289B (en) | 2023-09-06 | 2023-09-06 | Methylation sequencing typing method and system for central nervous system tumor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117316289A (en) | 2023-12-29 |
CN117316289B (en) | 2024-04-26 |
Family
ID=89248933
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311144613.8A Active CN117316289B (en) | 2023-09-06 | 2023-09-06 | Methylation sequencing typing method and system for central nervous system tumor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||