CN113160889A - Cancer noninvasive early screening method based on cfDNA omics characteristics - Google Patents


Info

Publication number
CN113160889A
Authority
CN
China
Prior art keywords
cfdna
cancer
noninvasive
omics
system based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110118814.5A
Other languages
Chinese (zh)
Other versions
CN113160889B (en)
Inventor
蓝勋
季加孚
布召德
李�杰
陈佳辉
孙克用
孙欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renke Beijing Biotechnology Co ltd
Original Assignee
Tsinghua University
Beijing Cancer Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beijing Cancer Hospital filed Critical Tsinghua University
Priority to CN202110118814.5A priority Critical patent/CN113160889B/en
Publication of CN113160889A publication Critical patent/CN113160889A/en
Application granted granted Critical
Publication of CN113160889B publication Critical patent/CN113160889B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Software Systems (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)

Abstract

The invention relates to a noninvasive early cancer screening system based on cfDNA omics characteristics. The system comprises a cfDNA omics characteristic model and a machine learning training model, and the screening method comprises: establishing the cfDNA omics characteristic model; collecting blood and extracting cfDNA; performing library construction and sequencing on the extracted cfDNA; and extracting cfDNA omics characteristics for comparison. By combining cfDNA length distribution characteristics, copy number variation density distribution characteristics and openness characteristics around cfDNA promoters, obtained through low-depth whole-genome sequencing of cfDNA, the method comprehensively characterizes the cfDNA of gastric cancer patients, so that early gastric cancer patients can be accurately identified.

Description

Cancer noninvasive early screening method based on cfDNA omics characteristics
Technical Field
The invention relates to the field of cancer early screening, in particular to a cancer noninvasive early screening method based on cfDNA omics characteristics.
Background
Liquid biopsy is the clinical application of analyzing cancer-derived components in blood for early screening, molecular typing, prognosis, medication guidance and recurrence monitoring of cancer. As a new precision medicine technology, liquid biopsy can qualitatively and quantitatively detect tumor cells and tumor-related DNA; it is noninvasive, convenient to sample and suited to real-time monitoring, and it plays an increasingly important role in tumor diagnosis and treatment.
Currently, the conventional approach to liquid biopsy for early cancer screening is to identify cfDNA released by tumors through mutation detection in oncogenes or tumor suppressor genes. See Razavi, P., Li, B.T., Brown, D.N. et al. High-intensity sequencing reveals the sources of plasma circulating cell-free DNA variants. Nat Med 25, 1928-. This can lead to a significant misjudgment rate when cancer patients are identified through mutations in specific target genes. Moreover, because cancer patients are highly heterogeneous, some patients may be missed once a specific set of target genes is defined. In addition, high-depth targeted sequencing is extremely expensive and cannot be used widely in the clinic. Finally, the above study addressed only patients with advanced, metastatic cancer.
See also Chen Xiaoji, Chang Ching-Wei, Spoerke Jill M, et al. Low-pass Whole-genome Sequencing of Circulating Cell-free DNA Demonstrates Dynamic Changes in Genomic Copy Number in a Squamous Lung Cancer Clinical Cohort. Clin Cancer Res 2019, 25(7): 2254-. This means the approach is useful only for cancer types that carry many copy number variations. Lower-depth whole-genome sequencing is also prone to missing the copy number variation regions of some cancer genes. Moreover, this study likewise addressed only patients with advanced, metastatic cancer.
Matthew et al. mapped nucleosome occupancy across the genome using cfDNA isolated from circulating plasma and found that the distribution pattern of cfDNA is closely related to its tissue of origin. By studying cfDNA to predict the nucleosome distribution pattern, the specific origin of cfDNA can be determined, which can be used for noninvasive detection of clinical conditions. However, that work is limited to the theoretical level and does not address specific applications; it lacks a comprehensive evaluation of the patient's DNA genome and an evaluation of cfDNA abundance.
Therefore, it is necessary for those skilled in the art to design a noninvasive cancer early screening method which can completely predict early gastric cancer patients only by a low-depth whole genome sequencing mode, can greatly reduce the cost of cancer early screening and improve the screening accuracy.
Disclosure of Invention
In view of the above, the present application aims to provide a cancer noninvasive early screening method based on cfDNA omics characteristics, which comprehensively characterizes cfDNA in gastric cancer patients by combining cfDNA length distribution characteristics, copy number variation density distribution characteristics and cfDNA promoter periphery openness characteristics and by a cfDNA low-depth whole genome sequencing manner, thereby accurately identifying early gastric cancer patients.
In order to achieve the above object, the present application provides the following technical solutions.
A cancer noninvasive early screening system based on cfDNA omics characteristics comprises a cfDNA omics characteristic model and a machine learning training model, and is characterized in that the cancer noninvasive screening method comprises the following steps:
S101, establishing a cfDNA omics characteristic model;
S102, blood collection;
S103, extracting cfDNA;
S104, performing library construction and sequencing on the extracted cfDNA;
and S105, extracting cfDNA omics characteristics and comparing them with the cfDNA omics characteristic model.
Preferably, establishing the cfDNA omics characteristic model in step S101 comprises the following steps:
S201, blood collection;
S202, extracting cfDNA;
S203, performing library construction and sequencing on the extracted cfDNA;
S204, extracting cfDNA omics characteristics;
and S205, machine learning and model training.
Preferably, the blood collection in step S102 and step S201 is performed by whole blood extraction using a blood collection tube. The blood collection tube contains a preservative which can stabilize nucleated blood cells, prevent the release of cell genome DNA, inhibit cfDNA nuclease-mediated degradation and contribute to the overall stability of cfDNA.
Preferably, the extracting of cfDNA in step S103 and step S202 includes the steps of:
S301, placing the blood collection tube in a centrifuge, and centrifuging until plasma is separated;
S302, adding proteinase K and ACL buffer into a centrifuge tube containing the plasma, mixing thoroughly and incubating;
S303, performing suction filtration on the incubated collection tube using a vacuum pump and washing off impurities;
S304, placing the mixture in a centrifuge for centrifugation;
S306, placing the collection tube in a metal bath to evaporate the ethanol, adding AVE, and incubating;
S307, placing the collection tube in a centrifuge for centrifugation, determining the DNA concentration of the filtrate, and detecting the fragment distribution of the cfDNA.
Preferably, the cfDNA omics characteristics mainly comprise one or more of fragment pattern, CNV diversity and TSS coverage.
Preferably, the cfDNA omics feature extraction method in steps S101, S105 and S204 comprises:
S401, aligning the cfDNA sequence file to a reference genome to obtain a BAM file;
S402, removing low-quality sequences and duplicate sequences from the BAM file;
S403, excluding regions of the reference genome with low coverage and the Duke blacklist regions;
S404, dividing the chromosomes into adjacent, non-overlapping segments;
S405, counting the numbers of long and short cfDNA fragments;
S406, correcting the counts for GC content;
S407, quantifying the fragment pattern using the ratio of short to long fragments; and performing median normalization against a gold standard and computing the density distribution of copy number variation;
S408, obtaining the coordinates of the transcription start sites of the reference genome, mapping the coordinates onto the BAM file, and obtaining the coverage of sequences near each site;
and S409, obtaining TSS coverage through coverage calculation.
Preferably, the method for establishing the machine learning training model in step S205 includes:
S501, dividing the samples into a training set and a test set;
S502, processing sample data in the training set;
S503, extracting omics characteristics of cfDNA in the training set and validating them in the test set;
and S504, evaluating the performance of the model.
Preferably, the test set in step S501 comprises n gastric cancer samples and m healthy samples, and the training set comprises n + 1 gastric cancer samples and m healthy samples, wherein n and m are positive integers.
Preferably, the sample data processing method in step S502 includes an algorithm using ten-fold cross-validation.
Preferably, the evaluation in step S504 includes calculation and evaluation of sensitivity, specificity, accuracy, recall, ROC and AUC.
The beneficial technical effects obtained by the invention are as follows:
1) the invention adopts low-depth whole-genome sequencing, which greatly reduces sequencing cost compared with ultrahigh-depth or high-depth targeted sequencing;
2) through whole-genome sequencing, the invention reflects the overall profile of cfDNA in gastric cancer patients more comprehensively and avoids missing gastric cancer patients with high heterogeneity;
3) through tri-omics analysis, the invention more comprehensively mines the specific characteristics of cfDNA in gastric cancer patients.
The foregoing description is only an overview of the technical solutions of the present application, so that the technical means of the present application can be more clearly understood and the present application can be implemented according to the content of the description, and in order to make the above and other objects, features and advantages of the present application more clearly understood, the following detailed description is made with reference to the preferred embodiments of the present application and the accompanying drawings.
The above and other objects, advantages and features of the present application will become more apparent to those skilled in the art from the following detailed description of specific embodiments thereof, taken in conjunction with the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts. Throughout the drawings, like elements or portions are generally identified by like reference numerals. In the drawings, elements or portions are not necessarily drawn to scale.
FIG. 1 is a flowchart of noninvasive screening of early gastric cancer patients based on cfDNA omics features;
FIG. 2 is a schematic diagram of feature extraction and model training for cfDNA fragment patterns;
FIG. 3 is a schematic diagram of feature extraction and model training for cfDNA CNV diversity;
FIG. 4 is a schematic diagram of feature extraction and model training for cfDNA TSS coverage;
FIG. 5 is a schematic diagram of MUC2 as a target gene for early gastric cancer.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. In the following description, specific details such as specific configurations and components are provided only to help the embodiments of the present application be fully understood. Accordingly, it will be apparent to those skilled in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present application. In addition, descriptions of well-known functions and constructions are omitted in the embodiments for clarity and conciseness.
It should be appreciated that reference throughout this specification to "one embodiment" or "the embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrase "one embodiment" or "the present embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Further, the present application may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
The term "and/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, B exists alone, or A and B exist at the same time. The term "/and" is used herein to describe another association between objects and means that two relationships may exist; for example, A/and B may mean: A exists alone, or A and B exist at the same time. Further, the character "/" in this document generally means that the former and latter associated objects are in an "or" relationship.
The term "at least one" herein merely describes an association between associated objects and means that three relationships may exist; for example, at least one of A and B may mean: A exists alone, A and B exist simultaneously, or B exists alone.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion.
Example 1
A cancer noninvasive early screening system based on cfDNA omics characteristics comprises a cfDNA omics characteristic model and a machine learning training model, and is characterized in that the cancer noninvasive screening method comprises the following steps:
S101, establishing a cfDNA omics characteristic model;
S102, blood collection;
S103, extracting cfDNA;
S104, performing library construction and sequencing on the extracted cfDNA;
and S105, extracting cfDNA omics characteristics and comparing them with the cfDNA omics characteristic model.
Preferably, establishing the cfDNA omics characteristic model in step S101 comprises the following steps:
S201, blood collection;
S202, extracting cfDNA;
S203, performing library construction and sequencing on the extracted cfDNA;
S204, extracting cfDNA omics characteristics;
and S205, machine learning and model training.
Preferably, the blood collection in step S102 and step S201 is a whole blood extraction using Streck blood collection tubes. The Streck blood collection tube contains a preservative which can stabilize nucleated blood cells, prevent the release of cell genome DNA, inhibit cfDNA nuclease-mediated degradation and contribute to the overall stability of cfDNA.
Preferably, the extracting of cfDNA in step S103 and step S202 includes the steps of:
S301, placing the Streck blood collection tube in a centrifuge, and centrifuging until plasma is separated;
S302, adding proteinase K and ACL buffer into a centrifuge tube containing the plasma, mixing thoroughly and incubating;
S303, performing suction filtration on the incubated collection tube using a vacuum pump and washing off impurities;
S304, placing the mixture in a centrifuge for centrifugation;
S306, placing the collection tube in a metal bath to evaporate the ethanol, adding AVE, and incubating;
S307, placing the collection tube in a centrifuge for centrifugation, collecting the filtrate in an EP tube, determining the DNA concentration, and detecting the fragment distribution of the cfDNA.
Preferably, the cfDNA omics characteristics mainly comprise one or more of fragment pattern, CNV diversity and TSS coverage, alone or in combination.
Preferably, the cfDNA omics feature extraction method in steps S101, S105 and S204 comprises:
S401, aligning the cfDNA sequence file to a reference genome to obtain a BAM file;
S402, removing low-quality sequences and duplicate sequences from the BAM file;
S403, excluding regions of the reference genome with low coverage and the Duke blacklist regions;
S404, dividing the chromosomes into adjacent, non-overlapping segments;
S405, counting the numbers of long and short cfDNA fragments;
S406, correcting the counts for GC content;
S407, quantifying the fragment pattern using the ratio of short to long fragments; and performing median normalization against a gold standard and computing the density distribution of copy number variation;
S408, obtaining the coordinates of the transcription start sites of the reference genome, mapping the coordinates onto the BAM file, and obtaining the coverage of sequences near each site;
and S409, obtaining TSS coverage through coverage calculation.
Preferably, the method for establishing the machine learning training model in step S205 includes:
S501, dividing the samples into a training set and a test set;
S502, processing sample data in the training set;
S503, extracting omics characteristics of cfDNA in the training set and validating them in the test set;
and S504, evaluating the performance of the model.
Preferably, the test set in step S501 includes n gastric cancer samples and m healthy samples, the training set includes n + 1 gastric cancer samples and m healthy samples, and n and m are positive integers.
Preferably, the sample data processing method in step S502 includes an algorithm using ten-fold cross-validation.
Preferably, the evaluation in step S504 includes the calculation of sensitivity, specificity, accuracy, recall, ROC and AUC.
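As an illustration of the evaluation in step S504, the following minimal Python sketch computes sensitivity, specificity, accuracy and AUC from true labels and model scores. The function names, the 0.5 decision threshold and the label coding (1 = gastric cancer, 0 = healthy) are assumptions made for illustration and are not taken from the application.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, 1 = cancer, 0 = healthy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def basic_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity and accuracy at a fixed (assumed) threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise definition of AUC is quadratic in the number of samples, which is acceptable for cohorts of the size described here; a rank-based formula would be preferable for large data sets.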
Example 2
This embodiment is performed based on embodiment 1, and the same points as embodiment 1 are not repeated.
This example describes a cfDNA extraction method, comprising the following steps:
S601, placing the Streck tube in a centrifuge at 4 °C and centrifuging at 2000 rpm for 10 min to separate the plasma;
S602, adding 500 µL proteinase K and 4 mL ACL buffer to a 50 mL centrifuge tube containing the plasma, vortexing for 30 s to mix thoroughly, and incubating in a 60 °C water bath for 30 min;
S603, performing suction filtration on the incubated collection tube for 10 min using a vacuum pump, then adding 600 µL ACW1, 750 µL ACW2 and 750 µL 100% ethanol in sequence to wash away impurities;
S604, centrifuging at 12000 rpm for 3 min;
S605, placing the collection tube in a 50 °C metal bath for 10 min to evaporate the ethanol, adding 110 µL AVE to the collection tube, and incubating at room temperature for 3 min;
S606, centrifuging the collection tube at 12000 rpm for 1 min, collecting the filtrate into a 1.5 mL EP tube, measuring the DNA concentration with a Qubit 3.0, and checking the cfDNA fragment distribution on a 2100 instrument.
Example 3
This embodiment is performed based on embodiment 1, and the same points as embodiment 1 are not repeated.
This example describes feature extraction for the cfDNA tri-omics, comprising the following steps:
Fragment pattern: first, the cfDNA sequence files are aligned to the reference genome hg19, low-quality sequences are discarded from the resulting BAM files and duplicate sequences are filtered out; regions of the hg19 reference genome with low coverage and the Duke blacklist regions are then excluded; next, the hg19 autosomes are divided into 504 adjacent, non-overlapping segments, each 5 Mb in length; the number of cfDNA fragments longer than 150 bp and the number shorter than 150 bp are counted in each segment; the counts are corrected for GC content using LOESS regression, and the GC-corrected counts are processed by mean normalization; finally, the numbers of long and short cfDNA fragments in each 5 Mb interval are obtained, and the fragment pattern is quantified using their ratio.
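The binning and ratio step of the fragment pattern can be sketched in Python as follows. This is only a sketch under stated assumptions: fragments are supplied as hypothetical (chromosome, start, length) tuples already filtered for quality and duplicates, the function name is illustrative, and the LOESS-based GC correction and mean normalization described in the text are omitted.

```python
from collections import defaultdict

BIN_SIZE = 5_000_000   # 5 Mb bins, as described in the text
LENGTH_CUTOFF = 150    # bp threshold separating short and long fragments


def short_long_ratio(fragments, bin_size=BIN_SIZE, cutoff=LENGTH_CUTOFF):
    """Return {(chrom, bin_index): short/long ratio}.

    fragments: iterable of (chrom, start, length) tuples (assumed input format).
    Bins without any long fragment are left out to avoid division by zero.
    """
    counts = defaultdict(lambda: [0, 0])  # per bin: [short, long]
    for chrom, start, length in fragments:
        key = (chrom, start // bin_size)
        if length < cutoff:
            counts[key][0] += 1
        elif length > cutoff:
            counts[key][1] += 1
        # The text does not say how fragments of exactly 150 bp are
        # classified, so they are skipped in this sketch.
    return {key: short / long_
            for key, (short, long_) in counts.items() if long_ > 0}
```

In practice the fragment tuples would be derived from the aligned BAM file, and the raw counts would be GC-corrected before the ratio is taken.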
CNV diversity: low-quality and duplicate sequences are removed from the aligned BAM file, and the chromosomes are divided into 51120 adjacent, non-overlapping segments of 50 kb each; GC content is corrected in the same way as for the fragment pattern; the median GC-corrected cfDNA count of each segment across healthy individuals is taken as the gold standard, and the cfDNA count in each segment of a cancer patient is median-normalized against this gold standard; amplified and deleted segments are called using 0.2 as the threshold, the density distribution of copy number variation is computed, and abnormally amplified and deleted intervals are identified; the pathways and biological processes involving the genes in the amplified intervals are then explored.
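A minimal Python sketch of the normalization and thresholding described above is given below. The text does not state the exact form of the normalized value; this sketch assumes a log2 ratio of the sample count to the healthy median, which is one common choice that centers unchanged segments at 0. Function names and input layout are likewise illustrative assumptions.

```python
import math
from statistics import median


def healthy_reference(healthy_counts):
    """Per-segment median across healthy samples (the 'gold standard').

    healthy_counts: list of per-sample lists, one GC-corrected count per segment.
    """
    return [median(column) for column in zip(*healthy_counts)]


def log2_ratios(sample_counts, reference):
    """Assumed normalization: log2(sample / healthy median) per segment.

    Segments with a zero count in either input are skipped.
    """
    return [math.log2(c / r)
            for c, r in zip(sample_counts, reference) if c > 0 and r > 0]


def abnormal_fractions(ratios, threshold=0.2):
    """Fractions of segments called amplified (> threshold) and deleted (< -threshold)."""
    if not ratios:
        return 0.0, 0.0
    amplified = sum(1 for r in ratios if r > threshold)
    deleted = sum(1 for r in ratios if r < -threshold)
    return amplified / len(ratios), deleted / len(ratios)
```

The two returned fractions correspond to the proportions of amplified and deleted intervals that are compared between groups; the density distribution itself could be obtained from a histogram of the ratios.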
TSS coverage: low-quality sequences are removed from the BAM file, the coordinates of the hg19 reference genome transcription start sites are downloaded from the ENSEMBL database, and the coordinates are mapped onto the BAM file to obtain the coverage of sequences near each site; first, the coverage of the nucleosome-depleted region (NDR) near the start site is calculated, and then the coverage from 1000 bp upstream to 1000 bp downstream of the start site (the 2k region); to normalize these two coverages, the mean coverage of the segment from 3000 bp to 1000 bp upstream and the segment from 1000 bp to 3000 bp downstream of the start site is calculated as the gold standard; the NDR and 2k region coverages are divided by the gold standard to give the final TSS coverage.
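The normalization of coverage around a transcription start site can be sketched in Python as follows. The input is assumed to be a per-base coverage profile spanning 3000 bp upstream to 3000 bp downstream of one TSS. The text does not give the extent of the NDR; the window of -150 bp to +50 bp used as the default here is purely an assumption and is exposed as a parameter. Function names are illustrative.

```python
from statistics import mean

FLANK = 3000  # profile covers TSS-3000 .. TSS+3000; index FLANK is the TSS itself


def window_mean(profile, lo, hi):
    """Mean coverage over relative positions lo..hi (inclusive) around the TSS."""
    return mean(profile[FLANK + lo:FLANK + hi + 1])


def tss_coverage(profile, ndr_window=(-150, 50)):
    """Return (NDR coverage, 2k-region coverage), each divided by the flank baseline.

    ndr_window is an assumed NDR extent, not a value taken from the text.
    """
    if len(profile) != 2 * FLANK + 1:
        raise ValueError("expected one coverage value per base from -3000 to +3000")
    # Baseline: upstream 3000..1000 bp plus downstream 1000..3000 bp.
    baseline = mean(profile[:FLANK - 1000] + profile[FLANK + 1001:])
    if baseline == 0:
        raise ValueError("flanking coverage is zero; cannot normalize")
    ndr = window_mean(profile, *ndr_window) / baseline
    two_k = window_mean(profile, -1000, 1000) / baseline
    return ndr, two_k
```

Values well below 1 indicate reduced coverage at the start site relative to its flanks, which is the signal used as the TSS coverage feature.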
Example 4
This embodiment is performed on the basis of embodiment 1, and the same points as embodiment 1 are not repeated.
This example presents cfDNA combined feature extraction and model training.
In this example, preoperative peripheral blood from 81 gastric cancer patients and peripheral blood from 38 healthy individuals was collected, and cfDNA extraction, library construction and sequencing were performed.
The first step is as follows: the feature extraction and model training of cfDNA fragment patterns, the results are shown in fig. 2.
A. Dividing the BAM file after the comparison into 504 bins which are adjacent and have no intersection, then calculating the number of long fragments of which the cfDNA is more than 150bp and the number of short fragments of which the cfDNA is less than 150bp after GC correction in each bin, and calculating the proportion of the short fragments to the long fragments; the proportion of healthy persons was found to be relatively concentrated and the number of long fragments per bin was more proportional than in patients with gastric cancer; the proportion of gastric cancer patients is relatively diffuse and the proportion of short fragments per bin is greater compared to healthy people.
B. After the average value standardization of the proportion distribution of the gastric cancer patients and the healthy people is finished, the proportion of the healthy people is found to be stable and unchanged, and the proportion variability of the gastric cancer patients is strong.
C. The median of 504 bins from healthy persons was used as the gold standard, and the similarity between each sample and the gold standard was sought. The similarity between healthy people is found to be strong, and a healthy person from a nature article is selected for comparison, so that the healthy person in the nature is found to be similar to the gold standard, and the similarity between the gastric cancer patient and the gold standard is obviously reduced; compared with healthy people, the difference p value is 0.0003313, compared with healthy people in nature, the difference p value is 3.686e-08, and the detection mode is rank sum detection.
D. Training the training set by a random gradient descent algorithm, extracting features by adopting a ten-fold cross validation mode, and finally evaluating the performance of the model in the test set. In the test set, the AUC was 0.96447, the sensitivity was 0.975, the specificity was 0.842, the accuracy was 0.929, and the recall was 0.941.
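The fragment-pattern feature described in items A to C can be illustrated with a minimal sketch. It is not the implementation used in this example: the function names are hypothetical, "mean normalization" is assumed to mean subtracting each sample's own mean, and Pearson correlation is assumed as the similarity measure, since the text does not name one. The toy data use 4 bins rather than 504, and all counts are invented.

```python
import math
from statistics import median


def short_long_ratio(short_counts, long_counts):
    """Per-bin ratio of short to long fragment counts.

    Assumes the counts are already GC-corrected and that every bin
    contains at least one long fragment.
    """
    return [s / l for s, l in zip(short_counts, long_counts)]


def mean_center(profile):
    """Subtract the sample's own mean (assumed form of mean normalization)."""
    mean = sum(profile) / len(profile)
    return [value - mean for value in profile]


def reference_profile(healthy_profiles):
    """Per-bin median across healthy samples, i.e. the 'gold standard'."""
    return [median(column) for column in zip(*healthy_profiles)]


def pearson(x, y):
    """Pearson correlation, used here as an assumed similarity measure."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data: 4 bins instead of 504; every number below is invented.
healthy = [
    mean_center(short_long_ratio([30, 45, 28, 50], [100, 110, 90, 105])),
    mean_center(short_long_ratio([31, 44, 27, 52], [100, 108, 92, 104])),
    mean_center(short_long_ratio([29, 46, 29, 49], [101, 111, 91, 103])),
]
patient = mean_center(short_long_ratio([55, 40, 60, 35], [95, 100, 85, 110]))

gold = reference_profile(healthy)
print("healthy similarity:", [round(pearson(h, gold), 3) for h in healthy])
print("patient similarity:", round(pearson(patient, gold), 3))
```

The per-sample similarity values produced this way are the kind of quantity that could then be compared between groups with a rank-sum test.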
Step 2: feature extraction and model training for cfDNA CNV diversity. The results are shown in Fig. 3.
A. After the copy number variation (CNV) of the cfDNA was calculated, its density was evaluated. The CNV density of healthy persons was concentrated around 0, whereas that of gastric cancer patients was more dispersed, and this dispersion was a common feature of the patients. The figure shows the CNV density of one healthy person and of one gastric cancer patient.
B. Intervals with a value greater than 0.2 were defined as gene fragment amplification intervals and intervals with a value less than -0.2 were defined as gene fragment deletion intervals, and the proportions of amplification and deletion intervals were counted for all samples. The proportion of abnormal CNV intervals in gastric cancer patients was far higher than in healthy persons; the p value of the difference was 6.499e-12 (rank-sum test).
C. The model was trained on the training set with a stochastic gradient descent algorithm, features were extracted using ten-fold cross-validation, and the performance of the model was finally evaluated on the test set. On the test set, the AUC was 0.98947, the sensitivity was 1, the specificity was 0.895, the accuracy was 0.952, and the recall was 1.
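The thresholding in item B can be sketched as follows. This is an illustration only: the function name is hypothetical, the per-interval CNV values are assumed to be on a scale centered at 0 (for example a log ratio), which the text does not state, and the sample values are invented. Only the cut-offs 0.2 and -0.2 come from the text.

```python
def abnormal_fraction(cnv_values, gain=0.2, loss=-0.2):
    """Fractions of intervals called amplified, deleted, or either.

    An interval counts as amplified if its value is strictly greater
    than `gain` and as deleted if strictly less than `loss`.
    """
    total = len(cnv_values)
    gains = sum(1 for value in cnv_values if value > gain)
    losses = sum(1 for value in cnv_values if value < loss)
    return {
        "gain": gains / total,
        "loss": losses / total,
        "abnormal": (gains + losses) / total,
    }


# Invented per-interval values for one healthy and one patient sample.
healthy_sample = [0.01, -0.03, 0.05, 0.00, -0.02, 0.04, -0.01, 0.02]
patient_sample = [0.35, -0.28, 0.05, 0.41, -0.02, -0.33, 0.22, 0.01]

print("healthy:", abnormal_fraction(healthy_sample))
print("patient:", abnormal_fraction(patient_sample))
```

Computing this fraction for every sample gives one number per person, which is the form of data a rank-sum test between patients and healthy persons would take as input.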
Step 3: feature extraction and model training for cfDNA TSS coverage. The results are shown in Fig. 4.
A. The figure shows the cfDNA coverage from 1 Mb upstream to 1 Mb downstream of the transcription start site of a gene in one gastric cancer patient. The red dashed line marks the transcription start site, near which the cfDNA coverage drops sharply; this indicates a promoter region that can be recognized by transcription factors.
B. The figure shows the cfDNA coverage from 1 kb upstream to 1 kb downstream of the transcription start site. The red dashed line marks the transcription start site and, as in panel A, the cfDNA coverage drops sharply around it; this indicates a promoter region that can be recognized by transcription factors.
C. The figure shows the cfDNA coverage of the nucleosome-depleted region (NDR), from 150 bp upstream to 50 bp downstream of the transcription start site. The red dashed line marks the transcription start site and, as in panel A, the cfDNA coverage drops sharply around it; this indicates a promoter region that can be recognized by transcription factors.
D. The figure shows, for twenty thousand protein-coding genes, the mean cfDNA coverage of the 2 kb region across the 81 gastric cancer samples. The lower the mean coverage, the more open the transcription start site of the gene; after the genes are sorted by mean coverage, the higher a gene is ranked, the more open its promoter.
E. The figure shows, for twenty thousand protein-coding genes, the mean cfDNA coverage of the nucleosome-depleted region (NDR) across the 81 gastric cancer samples. The lower the mean coverage, the more open the transcription start site of the gene; after the genes are sorted by mean coverage, the higher a gene is ranked, the more open its promoter.
F. For the 2 kb region and the nucleosome-depleted region (NDR), the promoter of a gene was judged to be an open region if the mean coverage at its transcription start site was less than 1 in more than 80% of the gastric cancer samples. KEGG pathway enrichment analysis of these genes showed that the enriched pathways are related to cell proliferation, autophagy and migration, and are significantly associated with the occurrence and development of cancer.
G. The model was trained on the training set with a stochastic gradient descent algorithm, features were extracted using ten-fold cross-validation, and the performance of the model was finally evaluated on the test set. On the test set, the AUC was 0.98947, the sensitivity was 1, the specificity was 0.895, the accuracy was 0.952, and the recall was 1.
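The open-promoter rule in item F (mean coverage below 1 in more than 80% of samples) can be written as a short sketch. The function names, gene labels and coverage values below are hypothetical; only the threshold of 1 and the 80% fraction are taken from the text, and the coverage values are assumed to be normalized already.

```python
def is_open_promoter(coverages, threshold=1.0, min_fraction=0.8):
    """True if coverage is below `threshold` in more than `min_fraction` of samples."""
    below = sum(1 for value in coverages if value < threshold)
    return below / len(coverages) > min_fraction


def open_genes(coverage_by_gene, threshold=1.0, min_fraction=0.8):
    """Names of genes whose promoter is judged open, in sorted order."""
    return sorted(
        gene
        for gene, coverages in coverage_by_gene.items()
        if is_open_promoter(coverages, threshold, min_fraction)
    )


# Invented normalized TSS coverage for 10 samples and 3 placeholder genes.
coverage_by_gene = {
    "GENE_A": [0.5, 0.6, 0.4, 0.7, 0.5, 0.6, 0.8, 0.9, 0.7, 1.2],  # 9 of 10 below 1
    "GENE_B": [0.5, 0.6, 0.4, 0.7, 0.5, 0.6, 0.8, 0.9, 1.1, 1.2],  # 8 of 10 below 1
    "GENE_C": [1.1, 1.3, 0.9, 1.2, 1.0, 1.4, 1.1, 0.8, 1.2, 1.3],  # 2 of 10 below 1
}

print(open_genes(coverage_by_gene))
```

Because the rule says "more than 80%", a gene that is below the threshold in exactly 80% of the samples is not counted as open in this sketch. The resulting gene list is the kind of input a pathway enrichment analysis would take.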
Step 4: cfDNA TSS coverage combined with single-cell analysis identified MUC2 as a target gene for early-stage gastric cancer. The results are shown in Fig. 5.
Analysis of the cfDNA TSS coverage features, combined with analysis of gastric cancer single-cell transcriptome data, showed that the MUC2 promoter region is more open in early-stage gastric cancer patients than in healthy persons. Lower normalized coverage indicates greater openness; the openness was greater in both the 2 kb region (panel A) and the NDR (panel B), and MUC2 expression was also strong. HE staining and IF staining showed that the tumor region of early-stage gastric cancer patients was stained for MUC2 fluorescent protein (panel C), whereas the tumor region of late-stage gastric cancer patients was only partially stained for MUC2 fluorescent protein or not stained at all (panel D).
The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention; those skilled in the art may make various modifications and changes. Variations, modifications, substitutions, integrations and parameter changes of the embodiments that are made by conventional substitution or that achieve the same function, without departing from the principle and spirit of the invention, fall within the scope of protection of the invention.

Claims (10)

1. A cancer noninvasive early screening system based on cfDNA omics characteristics, comprising a cfDNA omics characteristic model and a machine learning training model, characterized in that the noninvasive cancer screening method comprises the following steps:
S101, establishing a cfDNA omics characteristic model;
S102, collecting blood;
S103, extracting cfDNA;
S104, performing library construction and sequencing on the extracted cfDNA;
and S105, extracting cfDNA omics characteristics and comparing the cfDNA omics characteristics.
2. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 1, wherein establishing the cfDNA omics characteristic model in step S101 of the noninvasive cancer screening method specifically comprises the following steps:
S201, collecting blood;
S202, extracting cfDNA;
S203, performing library construction and sequencing on the extracted cfDNA;
S204, extracting cfDNA omics characteristics;
and S205, training the model by machine learning.
3. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 1 or 2, wherein in steps S102 and S201 of the noninvasive cancer screening method, blood is collected by drawing whole blood into a blood collection tube; the blood collection tube contains a preservative that stabilizes nucleated blood cells, prevents the release of cellular genomic DNA, inhibits nuclease-mediated degradation of cfDNA, and contributes to the overall stability of the cfDNA.
4. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 1 or 2, wherein extracting cfDNA in steps S103 and S202 of the noninvasive cancer screening method comprises the following steps:
S301, placing the blood collection tube in a centrifuge and centrifuging until the plasma is separated;
S302, adding proteinase K and ACL buffer to a centrifuge tube containing the plasma, mixing thoroughly and incubating;
S303, subjecting the incubated collection tube to suction filtration with a vacuum pump and washing off impurities;
S304, centrifuging in a centrifuge;
S306, placing the collection tube in a metal bath to evaporate the ethanol, adding AVE buffer, and incubating;
S307, centrifuging the collection tube in a centrifuge, measuring the DNA concentration of the filtrate, and detecting the fragment distribution of the cfDNA.
5. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 1 or 2, wherein the cfDNA omics characteristics comprise one or more of fragment pattern, CNV density and TSS coverage.
6. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 1 or 2, wherein the cfDNA omics characteristic extraction in steps S101, S105 and S204 of the noninvasive cancer screening method comprises the following steps:
S401, aligning the cfDNA sequence file to a reference genome to obtain a BAM file;
S402, removing low-quality sequences and duplicate sequences from the BAM file;
S403, excluding regions of the reference genome with low coverage and the Duke blacklisted regions;
S404, dividing the chromosomes into adjacent, non-overlapping segments;
S405, counting the numbers of long and short cfDNA fragments;
S406, correcting the counted numbers for GC content;
S407, quantifying the fragment pattern as a ratio, performing median normalization against the gold standard, and computing the density distribution of copy number variation;
S408, obtaining the coordinates of the transcription start sites of the reference genome, mapping the coordinates onto the BAM file, and obtaining the coverage of the sequence near each site;
and S409, obtaining the TSS coverage by coverage calculation.
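Steps S404 and S406 can be illustrated with a minimal sketch. The claim does not specify the GC correction method, so the approach below (scaling each bin by the ratio of the overall median count to the median count of bins with the same rounded GC fraction) is an assumption chosen for simplicity; smoothing-based corrections are also common. The function names, bin size and counts are hypothetical.

```python
from statistics import median


def make_bins(chrom_length, bin_size):
    """Adjacent, non-overlapping half-open [start, end) bins covering a chromosome."""
    return [
        (start, min(start + bin_size, chrom_length))
        for start in range(0, chrom_length, bin_size)
    ]


def gc_correct(counts, gc_fractions):
    """Median-scaling GC correction (an assumed, simplified method).

    Bins are grouped by GC fraction rounded to two decimals; each count is
    multiplied by overall_median / stratum_median so that strata with
    systematically high or low counts are brought to a common level.
    """
    overall = median(counts)
    strata = {}
    for count, gc in zip(counts, gc_fractions):
        strata.setdefault(round(gc, 2), []).append(count)
    stratum_median = {key: median(values) for key, values in strata.items()}

    corrected = []
    for count, gc in zip(counts, gc_fractions):
        reference = stratum_median[round(gc, 2)]
        corrected.append(count * overall / reference if reference > 0 else 0.0)
    return corrected


# Invented example: a 10-base "chromosome" with 4-base bins, and four bins
# whose raw counts differ only because of their GC content.
print(make_bins(10, 4))
print(gc_correct([100, 100, 200, 200], [0.40, 0.40, 0.60, 0.60]))
```

In this toy input the two GC strata differ by a factor of two in raw counts, and the correction brings all four bins to the same value, which is the intended effect of removing GC-driven differences before long and short fragments are compared.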
7. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 2, wherein establishing the machine learning training model in step S205 of the noninvasive cancer screening method comprises the following steps:
S501, dividing the samples into a training set and a test set;
S502, processing the sample data in the training set;
S503, extracting cfDNA omics characteristics in the training set and validating the characteristics in the test set;
and S504, evaluating the performance of the model.
8. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 7, wherein in step S501 of the noninvasive cancer screening method, the test set comprises n gastric cancer samples and m healthy samples, and the training set comprises n + 1 gastric cancer samples and m healthy samples, where n and m are positive integers.
9. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 7, wherein the sample data processing in step S502 of the noninvasive cancer screening method uses a ten-fold cross-validation algorithm.
10. The cancer noninvasive early screening system based on cfDNA omics characteristics according to claim 7, wherein the evaluation in step S504 of the noninvasive cancer screening method comprises the calculation and evaluation of sensitivity, specificity, accuracy, recall, ROC and AUC.
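The evaluation quantities named in claim 10 can be sketched as follows. The claims do not give formulas, so the conventional definitions are assumed here; under those definitions recall is the same quantity as sensitivity. The AUC is computed as the fraction of (cancer, healthy) pairs in which the cancer sample receives the higher score, which equals the area under the ROC curve. Function names, labels and scores are invented for illustration.

```python
def screening_metrics(labels, predictions):
    """Sensitivity, specificity and accuracy from binary labels (1 = cancer)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented test-set labels, model scores and thresholded predictions.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.5]
predictions = [1 if score >= 0.5 else 0 for score in scores]

print(screening_metrics(labels, predictions))
print("AUC:", auc(labels, scores))
```

Sensitivity and specificity depend on the decision threshold applied to the scores (0.5 in this toy example), whereas the AUC summarizes performance over all thresholds.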
CN202110118814.5A 2021-01-28 2021-01-28 Cancer noninvasive early screening method based on cfDNA omics characteristics Active CN113160889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110118814.5A CN113160889B (en) 2021-01-28 2021-01-28 Cancer noninvasive early screening method based on cfDNA omics characteristics

Publications (2)

Publication Number Publication Date
CN113160889A true CN113160889A (en) 2021-07-23
CN113160889B CN113160889B (en) 2022-07-19

Family

ID=76879009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110118814.5A Active CN113160889B (en) 2021-01-28 2021-01-28 Cancer noninvasive early screening method based on cfDNA omics characteristics

Country Status (1)

Country Link
CN (1) CN113160889B (en)

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104560697A (en) * 2015-01-26 2015-04-29 上海美吉生物医药科技有限公司 Detection device for instability of genome copy number
CN106295245A (en) * 2016-07-27 2017-01-04 广州麦仑信息科技有限公司 Caffe-based method for gene information feature extraction with a stacked denoising autoencoder
CN107099577A (en) * 2017-03-06 2017-08-29 华南理工大学 Method for detecting Candida albicans in vaginal secretion wet mounts based on Hough circle detection and a deep convolutional network
CN107133496A (en) * 2017-05-19 2017-09-05 浙江工业大学 Gene expression feature extraction method based on manifold learning and a closed-loop deep convolutional dual-network model
US20180149636A1 (en) * 2016-11-30 2018-05-31 The Chinese University Of Hong Kong Analysis of cell-free dna in urine and other samples
CN108949979A (en) * 2018-07-11 2018-12-07 深圳市海普洛斯生物科技有限公司 Method for judging whether a lung neoplasm is benign or malignant from a blood sample
CN109360604A (en) * 2018-11-21 2019-02-19 南昌大学 Ovarian cancer molecular subtyping prediction system
CN109652513A (en) * 2019-02-25 2019-04-19 元码基因科技(北京)股份有限公司 Method and kit for accurately detecting somatic variation in liquid biopsy based on second-generation sequencing technology
CN109680049A (en) * 2018-12-03 2019-04-26 东南大学 Method for analyzing the physiological state of the individual to which cfDNA belongs based on high-throughput sequencing of cell-free DNA in blood, and application thereof
WO2019147663A1 (en) * 2018-01-24 2019-08-01 Freenome Holdings, Inc. Methods and systems for abnormality detection in the patterns of nucleic acids
CN110100013A (en) * 2016-10-24 2019-08-06 香港中文大学 Method and system for lesion detection
CN110189798A (en) * 2019-06-26 2019-08-30 广州市雄基生物信息技术有限公司 Clustering method based on differences in peripheral blood plasma DNA nucleosome footprints, and application thereof
CN110211632A (en) * 2019-05-06 2019-09-06 西安电子科技大学 Neural-network-based method for detecting single-site nucleotide mutations
CN110739027A (en) * 2019-10-23 2020-01-31 深圳吉因加医学检验实验室 Cancer tissue positioning method and system based on chromatin region coverage depth
CN110760580A (en) * 2018-10-10 2020-02-07 杭州翱锐生物科技有限公司 Early diagnosis equipment for liver cancer
CN111081317A (en) * 2019-12-10 2020-04-28 山东大学 Gene spectrum-based breast cancer lymph node metastasis prediction method and prediction system
CN111243673A (en) * 2019-12-25 2020-06-05 北京橡鑫生物科技有限公司 Tumor screening model, and construction method and device thereof
CN111254211A (en) * 2020-02-28 2020-06-09 广东药科大学 Aquilaria plant identification method based on ITS sequence and machine learning
CN111902529A (en) * 2018-03-29 2020-11-06 索尼公司 Information processing apparatus, information processing method, and program
CN112086129A (en) * 2020-09-23 2020-12-15 深圳吉因加医学检验实验室 Method and system for predicting cfDNA of tumor tissue

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
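JIANG TAO et al.: "Utilization of circulating cell-free DNA profiling to guide first-line chemotherapy in advanced lung squamous cell carcinoma", THERANOSTICS *
QING ZHOU et al.: "Cell-free DNA analysis reveals POLR1D-mediated resistance to bevacizumab in colorectal cancer", GENOME MEDICINE *
STEPHEN CRISTIANO et al.: "Genome-wide cell-free DNA fragmentation in patients with cancer", NATURE *
REN XIAOBIN: "Study on the value of plasma cell-free DNA concentration and integrity in the noninvasive diagnosis of lung cancer", China Master's Theses Full-text Database, Medicine and Health Sciences *
LI LIN et al.: "Current status of the application of circulating nucleic acids in peripheral blood as tumor markers in gastric cancer", Chinese Journal of Gastrointestinal Surgery *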

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113838533A (en) * 2021-08-17 2021-12-24 福建和瑞基因科技有限公司 Cancer detection model and construction method and kit thereof
WO2023019918A1 (en) * 2021-08-17 2023-02-23 福建和瑞基因科技有限公司 Cancer detection model and construction method therefor, and reagent kit
CN113838533B (en) * 2021-08-17 2024-03-12 福建和瑞基因科技有限公司 Cancer detection model, construction method thereof and kit
CN114242164A (en) * 2021-12-21 2022-03-25 苏州吉因加生物医学工程有限公司 Analysis method, device and storage medium for whole genome replication
CN114613436A (en) * 2022-05-11 2022-06-10 北京雅康博生物科技有限公司 Blood sample Motif feature extraction method and cancer early screening model construction method
CN114613436B (en) * 2022-05-11 2022-08-02 北京雅康博生物科技有限公司 Blood sample Motif feature extraction method and cancer early screening model construction method
CN115662519A (en) * 2022-09-29 2023-01-31 昂凯生命科技(苏州)有限公司 cfDNA fragment feature combination and system for predicting cancer based on machine learning
CN115662519B (en) * 2022-09-29 2023-11-03 南京医科大学 cfDNA fragment characteristic combination and system for predicting cancer based on machine learning
CN115691667A (en) * 2022-12-30 2023-02-03 北京橡鑫生物科技有限公司 Method for early screening of urothelial cancer, method, device and equipment for constructing model

Also Published As

Publication number Publication date
CN113160889B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN113160889B (en) Cancer noninvasive early screening method based on cfDNA omics characteristics
CN110689921B (en) Microsatellite instability detection device, computer equipment and computer storage medium
CN113257350B (en) ctDNA mutation degree analysis method and device based on liquid biopsy and ctDNA performance analysis device
CN110910957B (en) Single-tumor-sample-based high-throughput sequencing microsatellite instability detection site screening method
CN113284554B (en) Circulating tumor DNA detection system for screening micro residual focus after colorectal cancer operation and predicting recurrence risk and application
TWI679280B (en) Non-invasive detection of bladder cancer and method for monitoring its recurrence
KR102381252B1 (en) Method for Prognosing Hepatic Cancer Patients Based on Circulating Cell Free DNA
KR20190085667A (en) Circulating Tumor DNA Detection Method Using Sample comprising Cell free DNA and Uses thereof
CN113838533B (en) Cancer detection model, construction method thereof and kit
CN109830264B (en) Method for classifying tumor patients based on methylation sites
CN111968701A (en) Method and device for detecting somatic copy number variation of designated genome region
He et al. Assessing the impact of data preprocessing on analyzing next generation sequencing data
CN105132407A (en) Method for low-frequency mutant-enriched sequencing of DNA of exfoliative cells
CN115410713A (en) Hepatocellular carcinoma prognosis risk prediction model construction based on immune-related gene
WO2019211418A1 (en) Surrogate marker and method for tumor mutation burden measurement
CN112037863B (en) Early NSCLC prognosis prediction system
CN116631508B (en) Detection method for tumor specific mutation state and application thereof
CN115954052B (en) Screening method and system for monitoring sites of tiny residual focus of solid tumor
CN117954097A (en) Lung adenocarcinoma prognosis evaluation system and equipment
Wilmott et al. Tumour procurement, DNA extraction, coverage analysis and optimisation of mutation-detection algorithms for human melanoma genomes
Dan et al. Distal fecal wash host transcriptomics identifies inflammation throughout the colon and terminal ileum
CN111028888A (en) Detection method of genome-wide copy number variation and application thereof
CN104630357A (en) Plasma miRNA biomarker related to colorectal cancer and application thereof
CN105838720A (en) PTPRQ gene mutant and application thereof
CN114724631A (en) Chromosome copy number variation degree evaluation model, method and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Lan Xun

Inventor after: Ji Jiafu

Inventor after: Bu Zhaode

Inventor after: Li Jie

Inventor after: Chen Jiahui

Inventor after: Sun Keyong

Inventor after: Sun Xin

Inventor before: Lan Xun

Inventor before: Ji Jiafu

Inventor before: Bu Zhaode

Inventor before: Li Jie

Inventor before: Chen Jiahui

Inventor before: Sun Keyong

Inventor before: Sun Xin

TA01 Transfer of patent application right

Effective date of registration: 20220218

Address after: B215, floor 2, No. 5, Kaifa Road, Haidian District, Beijing 100089

Applicant after: Renke (Beijing) Biotechnology Co.,Ltd.

Address before: No. 30 Shuangqing Road, Haidian District, Beijing 100084

Applicant before: TSINGHUA University

Applicant before: BEIJING CANCER HOSPITAL (BEIJING CANCER Hospital)

GR01 Patent grant