CN112080565A - Colorectal cancer-related prediction system, electronic device, and storage medium - Google Patents

Publication number
CN112080565A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910519062.6A
Other languages
Chinese (zh)
Inventor
韩书文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Application filed by Individual
Priority to CN201910519062.6A (CN112080565A)
Priority to PCT/CN2020/082950 (WO2020248665A1)
Publication of CN112080565A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12M APPARATUS FOR ENZYMOLOGY OR MICROBIOLOGY; APPARATUS FOR CULTURING MICROORGANISMS FOR PRODUCING BIOMASS, FOR GROWING CELLS OR FOR OBTAINING FERMENTATION OR METABOLIC PRODUCTS, i.e. BIOREACTORS OR FERMENTERS
    • C12M1/00 Apparatus for enzymology or microbiology
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C12Q1/6895 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G16B40/30 Unsupervised data analysis

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Immunology (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Data Mining & Analysis (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Public Health (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Pathology (AREA)
  • Evolutionary Biology (AREA)
  • Bioethics (AREA)
  • General Physics & Mathematics (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Mycology (AREA)
  • Botany (AREA)

Abstract

The invention discloses a colorectal cancer-related prediction system comprising: a collection unit for collecting and storing the feces of relevant persons, the stored feces being referred to as fecal samples; an analysis unit for analyzing the feces to obtain analysis samples, each analysis sample comprising one or more of microbial information and microbial metabolite information; and a processing unit for constructing a prediction model from the analysis samples and a machine learning model, and for obtaining a prediction result from a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of a target person. The invention also discloses an electronic device and a computer storage medium. The invention allows colorectal cancer-related predictions to be obtained through stool analysis.

Description

Colorectal cancer-related prediction system, electronic device, and storage medium
Technical Field
The present invention relates to the field of intestinal cancer prediction technologies, and in particular, to a colorectal cancer-related prediction system, an electronic device, and a storage medium.
Background
Colorectal cancer (CRC) is one of the most common malignant tumors seen in clinical practice and one of the leading causes of cancer-related death, with up to 2.2 million new cases and 1.1 million deaths expected worldwide by 2030. The 5-year survival rate for patients with early colorectal cancer exceeds 40%, and the 5-year PFS (progression-free survival) rate reaches 22%. Despite the great progress made over the last few years in combination therapy and MDT (multidisciplinary diagnosis and treatment) for colorectal cancer, PFS is about 13-15 months in most stage III and IV colorectal cancers, and OS (overall survival) is less than 3 years. Early screening and early prediction are therefore crucial to improving the prognosis of colorectal cancer.
Current clinical methods for early screening and diagnosis of colorectal cancer mainly include the following. (1) Clinical symptoms and signs, such as changes in bowel habits, rectal bleeding, or even black or bloody stools, and abdominal colic or distension. However, patients often do not take these symptoms and signs seriously. (2) Imaging and endoscopic examination: conventional colonoscopy is currently the most effective method for detecting asymptomatic precancerous polyps and colorectal cancer, but it requires bowel cleansing, general anesthesia, and an invasive procedure, which exposes patients to potentially serious complications; CT colonography is a newer non-invasive examination, but it has drawbacks such as radiation exposure, lack of standardization, and a high false negative rate; sigmoidoscopy can prevent a small proportion of proximal colon cancers, but offers little protection against right-sided colon cancer; capsule colonoscopy can complete endoscopic imaging without an invasive procedure and so avoids the risks of colonoscopy, but its bowel preparation requirements are stricter than those of colonoscopy. (3) Serological examination: tests such as tumor markers and the Septin9 test are in clinical use, but their sensitivity is low. (4) Pathological examination: the pathological biopsy report is the gold standard for colorectal cancer diagnosis, but the difficulty of obtaining tissue means it cannot serve as a method for colorectal cancer screening or early diagnosis. In summary, current methods for predicting and diagnosing colorectal cancer early are limited by various factors, and colorectal cancer risk assessment for healthy people has not yet been incorporated into clinical practice, so finding a new approach to early screening and risk assessment for colorectal cancer is an urgent clinical problem.
Disclosure of Invention
To overcome the shortcomings of the prior art, one object of the present invention is to provide a colorectal cancer-related prediction system that obtains colorectal cancer-related predictions through stool analysis.
This object is achieved by the following technical solution:
A colorectal cancer-related prediction system, comprising:
a collection unit for collecting and storing the feces of relevant persons, the stored feces being referred to as fecal samples;
an analysis unit for analyzing the feces to obtain analysis samples, each analysis sample comprising one or more of microbial information and microbial metabolite information;
and a processing unit for constructing a prediction model from the analysis samples and a machine learning model, and for obtaining a prediction result from a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of a target person.
Further, the collection unit includes a centrifuge tube; the feces are placed in the centrifuge tube and centrifuged at a first preset speed for a first preset time, after which the supernatant and the sediment are aliquoted separately and stored in a refrigerator for later use.
Further, the first preset speed is 1800-2500 revolutions per minute; or/and the first preset time is 4-8 minutes.
Further, when the analysis sample includes microbiological information, the microbiological information is one or more of fungal abundance, fungal diversity, fungal community structure, bacterial abundance, bacterial diversity, and bacterial community structure.
Further, for analyzing the microorganisms in the feces, the analysis unit comprises:
a fecal genome extraction kit for extracting microbial genomic DNA from the fecal sample, the extracted microbial genomic DNA being referred to as a DNA sample;
an ultraviolet spectrophotometer for determining the concentration of the DNA sample;
agarose gel electrophoresis for assessing the purity of the DNA sample;
a PCR amplification instrument for performing PCR amplification on the DNA sample to obtain a PCR amplification product;
DGGE bidirectional sequencing analysis is performed on the PCR amplification product to obtain raw sequencing data, and quality control is performed on the raw data, the quality control comprising splicing and filtering to discard low-quality sequences and remove chimeric sequences, finally yielding high-quality sequences for analysis;
and OTU clustering is performed on the high-quality sequences using QIIME to obtain a plurality of OTUs, and species annotation is performed on the representative sequence of each OTU to obtain the microbial information.
Further, the regions subjected to bidirectional sequencing are gene sequences of a bacterial 16S rRNA variable region or/and a fungal ITS region.
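As an illustration of how abundance and diversity values might be derived from OTU clustering output, the following minimal sketch computes relative abundance, observed richness, and the Shannon index for one sample. The patent does not specify which diversity measure is used; the Shannon index, the function names, and the example counts are assumptions made only for this sketch.

```python
import math


def relative_abundance(otu_counts):
    """Convert raw OTU read counts into proportions that sum to 1."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {otu: count / total for otu, count in otu_counts.items()}


def observed_richness(otu_counts):
    """Number of OTUs with at least one read."""
    return sum(1 for count in otu_counts.values() if count > 0)


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with p > 0."""
    proportions = relative_abundance(otu_counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical OTU counts for one fecal sample (not data from the patent).
sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20, "OTU_4": 0}

features = {
    "richness": observed_richness(sample),
    "shannon": shannon_index(sample),
    **relative_abundance(sample),
}
print(features)
```

A feature dictionary of this kind, one per person, is one plausible form for the "analysis sample" that is later passed to the machine learning model.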
Further, when the analysis sample includes microbial metabolite information, the microbial metabolite information is microbial metabolomics analysis information.
Further, for analyzing the microbial metabolites, the analysis unit includes:
a grinder for grinding the fecal sample after a precooled aqueous methanol solution has been added;
a glass derivatization vial for holding the supernatant of the ground fecal sample;
a centrifugal concentrator for evaporating the supernatant;
a shaking incubator, in which the glass derivatization vial, after a solution of methoxyamine hydrochloride in pyridine has been added, is placed for the oximation reaction, finally forming a microbial metabolite analysis sample;
GC/MS analysis for performing omics analysis on the microbial metabolite analysis sample.
Further, the machine learning model is any one of a random forest model, a decision tree model, a logistic regression model and a deep learning model.
Further, the prediction model is an early screening model for high risk population;
the relevant persons are a high-risk population, namely persons meeting one or more of the following: age above 40 years, positive fecal occult blood test, family history of intestinal cancer, and history of intestinal polyps;
a plurality of relevant persons are selected and grouped according to whether intestinal cancer has occurred; each relevant person is labeled accordingly, that is, the analysis sample of each relevant person is given a label indicating whether that person has developed intestinal cancer;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain an early screening model;
the prediction result is whether the target sample has intestinal cancer.
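The train/test procedure described above can be sketched as follows, using logistic regression, one of the model types the text lists. This is only an illustrative sketch in pure Python: the two features, the synthetic data, the learning rate, and the 75/25 split are assumptions, not values from the patent.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Predicted probability of the positive label (intestinal cancer)."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle sample indices and split into training and test sets."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def make_synthetic(n=80, seed=1):
    """Artificial two-feature samples; label 1 stands for 'intestinal cancer'."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = (0.7, 0.3) if label else (0.3, 0.7)
        X.append([c + rng.uniform(-0.1, 0.1) for c in centre])
        y.append(label)
    return X, y


X, y = make_synthetic()
X_tr, y_tr, X_te, y_te = train_test_split(X, y)
model = train_logistic(X_tr, y_tr)
accuracy = sum((predict_proba(model, x) >= 0.5) == bool(t)
               for x, t in zip(X_te, y_te)) / len(y_te)
print("test accuracy:", accuracy)
```

Thresholding the probability at 0.5 gives a yes/no screening result; the same `predict_proba` output could serve the models whose prediction result is stated as a probability.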
Further, the prediction model is a risk prediction model for a healthy population;
the related persons include healthy persons, patients with intestinal polyp history and early intestinal cancer;
a plurality of relevant persons are selected and grouped according to whether intestinal cancer has occurred; each relevant person is labeled accordingly, that is, the analysis sample of each relevant person is given a label indicating whether that person has developed intestinal cancer;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a risk prediction model;
the prediction result is the probability of intestinal cancer occurrence in the target sample.
Further, the prediction model is a survival time prediction model for intestinal cancer patients;
the related personnel are diagnosed intestinal cancer patients;
a plurality of relevant persons are selected and grouped according to their survival time; each relevant person is labeled according to survival time, that is, the analysis sample of each relevant person is given a label that is the survival time of that person, the survival time falling into one of the intervals (0, 3), (3, 5), and (5, 10);
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a survival time prediction model;
and the prediction result is the survival time of the target sample.
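The survival-time labeling above amounts to binning a continuous value into three classes, which can be sketched as follows. The text gives only the intervals (0, 3), (3, 5), and (5, 10); treating the unit as years and each interval as closed on the right are assumptions of this sketch, as is the function name.

```python
# Interval bounds taken from the text; unit (years) is an assumption.
SURVIVAL_BINS = [(0, 3), (3, 5), (5, 10)]


def survival_label(years):
    """Return the interval label for a survival time, or None if out of range.

    Assumption: each interval is open on the left and closed on the right,
    so a survival time of exactly 3 falls in the first interval.
    """
    for low, high in SURVIVAL_BINS:
        if low < years <= high:
            return f"({low}, {high}]"
    return None


# Hypothetical survival times, not data from the patent.
for t in (2.5, 3, 4.2, 9, 12):
    print(t, "->", survival_label(t))
```

Values outside all three intervals return `None` here rather than being forced into a class, since the text does not say how such cases are handled.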
Further, the prediction model is a treatment scheme selection model aiming at patients with intestinal cancer of stages II-III;
the related persons are patients with intestinal cancer of II-III stage;
selecting a plurality of related persons, and grouping the related persons according to whether the related persons receive chemotherapy; labeling each related person according to survival time, namely setting a label for an analysis sample of each related person, wherein the label is the survival time of the related person corresponding to the analysis sample, and the survival time is 5-year survival time and 10-year survival time respectively;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a treatment scheme selection model;
the predicted outcome is whether the target sample needs to receive chemotherapy.
Further, the prediction model is a prediction model aiming at diarrhea related to intestinal cancer chemotherapy;
the related person is an intestinal cancer patient receiving chemotherapy;
a plurality of relevant persons are selected and grouped according to whether they receive the FOLFOX or the FOLFIRI chemotherapy regimen; each relevant person is labeled according to whether diarrhea occurs, that is, the analysis sample of each relevant person is given a label indicating whether that person has developed diarrhea;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a prediction model;
the prediction result is the probability of the target sample generating diarrhea.
Further, the prediction model is a monitoring model aiming at recurrence and metastasis of intestinal cancer patients;
the related person is an early intestinal cancer patient or an intestinal cancer patient who lives for 5 years or more;
a plurality of relevant persons are selected and grouped according to whether recurrence or cancer cell metastasis has occurred; each relevant person is labeled accordingly, that is, the analysis sample of each relevant person is given a label indicating whether that person has had a recurrence or metastasis;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a monitoring model;
the prediction result is the probability of relapse or metastasis of the target sample.
Further, the prediction model is an early screening model aiming at primary drug resistance of intestinal cancer patients receiving chemotherapy;
the related person is an intestinal cancer patient receiving chemotherapy;
a plurality of relevant persons are selected and grouped according to whether they receive the FOLFOX or the FOLFIRI chemotherapy regimen; each relevant person is labeled according to disease response, that is, the analysis sample of each relevant person is given a label that is the disease response of that person; the disease response categories are PR (partial response), SD (stable disease), PD (progressive disease), and CR (complete response);
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain an early screening model;
the prediction result is the probability of the target sample to have primary drug resistance.
Further, the prediction model is an early screening model aiming at primary drug resistance of intestinal cancer patients receiving targeted therapy;
the related person is an intestinal cancer patient receiving targeted therapy;
a plurality of relevant persons are selected and grouped according to whether they receive anti-EGFR drugs or anti-VEGFR drugs; each relevant person is labeled according to disease response, that is, the analysis sample of each relevant person is given a label that is the disease response of that person; the disease response categories are PR (partial response), SD (stable disease), PD (progressive disease), and CR (complete response);
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain an early screening model;
the prediction result is the probability of the target sample to have primary drug resistance.
Another object of the present invention is to provide an electronic device comprising a processor, a storage medium, and a computer program stored in the storage medium, wherein the computer program, when executed by the processor, performs the steps of: receiving analysis samples of the feces of relevant persons; and constructing a prediction model from the analysis samples and a machine learning model, and obtaining a prediction result from a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of a target person.
It is a further object of the present invention to provide a computer readable storage medium having a computer program stored thereon, the computer program being executed by a processor to perform the steps of: receiving an analysis sample of the feces of the relevant person; and constructing a prediction model according to the analysis samples and the machine learning model, and acquiring a prediction result according to a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the excrement of the target person.
Compared with the prior art, the invention has the beneficial effects that:
The invention collects and analyzes the feces of relevant persons directly to obtain a large number of analysis samples, then trains and tests a machine learning model on those samples to obtain a prediction model that predicts colorectal cancer-related outcomes. This opens a new approach to colorectal cancer prediction that is low in cost and does not depend on the experience of doctors, and it achieves the following effects:
1. Non-invasive: feces originate in the intestinal tract, and compared with obtaining intestinal microorganisms directly, fecal sampling is convenient, highly repeatable, and causes no injury. Collecting fecal samples as intestinal microorganism samples to study the influence of the microecological environment on colorectal cancer is an approach that has emerged in human microbiology research in recent years.
2. Advanced: a colorectal cancer prediction model is constructed using machine learning, which supports statistics on large data sets and is intended to give accurate and reliable prediction results. As the sample size increases, the automatic learning capability of machine learning allows better-founded decisions.
3. Early prevention: the invention supports primary prevention of colorectal cancer by enabling early intervention before a tumor occurs, so that the cancer is prevented before it develops.
Drawings
FIG. 1 is a block diagram of the structure of a colorectal cancer-related prediction system according to a first embodiment of the present invention;
FIG. 2 is a diagram of the genus-level community structure of the 30 most abundant bacteria;
FIG. 3 is a system number heat map of the differential analysis of bacteria at different stages of development;
FIG. 4 is a linear discriminant analysis graph based on the differential analysis of bacteria at different stages of development;
FIG. 5 is a composition ratio chart of microbe-related metabolites;
FIG. 6 is a heat map of microbe-related metabolites;
FIG. 7 is a block diagram of an electronic device according to a tenth embodiment of the present invention.
Detailed Description
The present invention will now be described in more detail with reference to the accompanying drawings, in which the description of the invention is given by way of illustration and not of limitation. The various embodiments may be combined with each other to form other embodiments not shown in the following description.
By summarizing the relationship between the intestinal microorganisms and the metabolites thereof and the occurrence and development of colorectal cancer and constructing a network diagram of the relationship between the microorganisms and the metabolites thereof and the colorectal cancer and a food-microorganism-SCFAs network diagram, the interaction relationship such as competition, predation, symbiosis, cooperation and the like among intestinal microorganism microflora is discovered, and the interaction and the mutual connection among the intestinal microorganisms and the metabolites thereof of colorectal cancer patients are discovered.
We compared the fecal bacterial community structures of 5 patients with early-stage colon cancer, 8 patients with advanced colorectal cancer, 6 patients with intestinal polyps, and 8 healthy volunteers; the genus-level community structure of the 30 most abundant bacteria is shown in FIG. 2. Differential analysis was performed on the bacteria at the different stages of development, as shown in FIGS. 3-4. FIG. 3 is a system number heat map based on the Wilcoxon rank sum test, with P less than 0.05 as the criterion for a difference; the differential bacteria are marked in the upper right corner. FIG. 4 is based on linear discriminant analysis, with an LDA value greater than 2 as the criterion for a statistical difference. The statistically different bacteria mainly include: Bifidobacteriaceae, Bifidobacteriales, Bacillaceae, Bacillales, Peptostreptococcus, Ruminococcus, Rhodospirillaceae, Rhodospirillales, Pseudomonadaceae.
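For readers unfamiliar with the Wilcoxon rank sum test used as the P less than 0.05 criterion, the following minimal sketch compares the abundance of one taxon between two groups. It uses the normal approximation without tie or continuity correction, which is an assumption of this sketch (for groups as small as 5 to 8 samples an exact test may be preferable); the abundance values are hypothetical and are not the study data.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank sum test via the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value
    return z, p


# Hypothetical relative abundances of one taxon in two groups.
healthy = [0.01, 0.02, 0.03, 0.04, 0.05]
cancer = [0.06, 0.07, 0.08, 0.09, 0.10]
z, p = rank_sum_test(healthy, cancer)
print(f"z = {z:.3f}, p = {p:.4f}, differential: {p < 0.05}")
```

Running the test once per taxon and keeping those with p below 0.05 would reproduce the selection rule described for FIG. 3, although the text does not say whether any multiple-testing correction was applied.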
The microbial metabolites were analyzed by GC/MS, and 132 microbe-related metabolites were finally determined to be detectable intestinal microbial metabolites. The details are shown in Table 1.
TABLE 1, Absolute quantitative determination List of 132 flora-host co-metabolites
[Table 1 is reproduced as images in the source: Figure BDA0002095287230000091, Figure BDA0002095287230000101]
Stool samples from 40 cases of colorectal cancer (20 cases of rectal cancer and 20 cases of colon cancer) were subjected to metabolomic analysis of their microbe-related metabolites using the GC/MS method. Analysis was performed on 124 metabolites divided into 10 major groups, as shown in FIGS. 5 to 6, in which FIG. 5 is the composition ratio of the metabolites and FIG. 6 is a heat map of the microbe-related metabolites of the patients.
The research and analysis can show that the incidence and development of the colorectal cancer are closely related to microorganisms and metabolites thereof in the excrement, so that the colorectal cancer related aspect can be predicted by analyzing the microorganisms and the metabolites thereof in the excrement.
Example one
Referring to FIG. 1, a colorectal cancer-related prediction system includes a collecting unit 10, an analysis unit 20, and a processing unit 30. The collecting unit 10 collects and stores the feces of the related persons, referred to as a fecal specimen. The analysis unit 20 analyzes the feces to obtain an analysis sample, which comprises one or more of microorganism information and microbial metabolite information. The processing unit 30 constructs a prediction model according to the analysis samples and a machine learning model, and obtains a prediction result according to a target sample and the prediction model, where the target sample is a sample obtained by analyzing the feces of a target person.
Analyzing the microorganisms in the feces (to obtain microorganism information), the microbial metabolites (to obtain microbial metabolite information), or both can yield a corresponding assessment of the development of colorectal cancer. When the microorganisms in the feces are analyzed, the microorganism information is one or more of fungal abundance, fungal diversity, fungal community structure, bacterial abundance, bacterial diversity, and bacterial community structure; any microorganism information may be used, and no limitation is imposed here.
To achieve a better effect, the microorganism information and the microbial metabolite information can be combined, with the microorganism information jointly covering the abundance, community structure, and diversity of both fungi and bacteria, so that the prediction effect is more pronounced. Human protein analysis and genome analysis can also be added to further improve the prediction effect.
The collection unit mainly comprises centrifuge tubes, a centrifuge, and a refrigerator. A small amount of fresh fecal sample from the research subject is placed in 3-5 labeled centrifuge tubes and centrifuged at 1800-2500 rpm for 4-8 minutes; the supernatant and the sediment are then aliquoted and stored in a refrigerator at -80 ℃ until use.
The analysis unit analyzes the fecal samples stored in the refrigerator at -80 ℃. When the microorganisms in the fecal sample are analyzed, the analysis unit mainly comprises a fecal genome extraction kit, an ultraviolet spectrophotometer, agarose gel electrophoresis, and a PCR amplification instrument. The process is as follows:
Bacterial genomic DNA was extracted from the collected fecal specimens according to the instructions of the fecal genome extraction kit (TIANamp Stool DNA Kit). The concentration and purity of the DNA samples were assessed with an ultraviolet spectrophotometer (Thermo Scientific, NanoDrop2000) and agarose gel electrophoresis, respectively. The extracted DNA was subjected to PCR amplification; the target regions were the V3 region of the 16S rRNA gene (that is, the bacterial 16S rRNA variable region; primers 318F: ACTCCTACGGGAGGCAGCAG and 519R: TTACCGCGGCTGCTGGCAC) and the fungal ITS region (if only fungi or only bacteria are analyzed, only the corresponding region is amplified). The amplification system was 25 μL, and the reaction conditions were: pre-denaturation at 95 ℃ for 5 min; denaturation at 95 ℃ for 50 s, annealing at 55 ℃ for 50 s, and extension at 72 ℃ for 50 s; after 32 cycles, a final extension at 72 ℃ for 10 min, for a total of 35 cycles. The resulting PCR product was purified with an agarose gel recovery kit (Tianzhu DP209-02), and the concentration and purity of the product were assessed with an ultraviolet spectrophotometer and agarose gel electrophoresis, respectively.
DGGE bidirectional sequencing analysis was performed on the PCR amplification product. The raw data were first subjected to quality control: reads were spliced and filtered, low-quality sequences were discarded, and chimeric sequences were removed, finally yielding high-quality sequences for analysis. The high-quality sequences were clustered into operational taxonomic units (OTUs) using QIIME (a bacterial 16S rRNA analysis pipeline), and the representative sequence of each OTU was annotated by species to obtain the corresponding species information and the species-based abundance distribution.
Multiple sequence alignment was performed on the OTUs and phylogenetic trees were constructed, from which the community structure differences between samples and groups, and the differences in intestinal flora distribution between the model group and the normal group, were obtained.
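To make the notions of abundance and diversity concrete, the sketch below turns per-sample OTU read counts into relative abundances and a Shannon diversity index. The choice of the Shannon index is an assumption, since the text does not name a specific diversity measure, and the OTU identifiers and counts are placeholders.

```python
import math


def relative_abundance(otu_counts):
    """Convert raw OTU read counts into fractions that sum to 1."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: count / total for otu, count in otu_counts.items()}


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with non-zero counts."""
    fractions = relative_abundance(otu_counts).values()
    return -sum(p * math.log(p) for p in fractions if p > 0)


# Hypothetical OTU read counts for one fecal sample.
sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20, "OTU_4": 0}
abundance = relative_abundance(sample)
diversity = shannon_index(sample)
```

A feature vector for the machine learning step could then be assembled from such per-taxon abundances together with the diversity value.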
When the analysis sample comprises microbial metabolite information, the microbial metabolite information is microbial metabolomics analysis information. When the microbial metabolites are analyzed, the analysis unit mainly comprises a grinder, glass derivatization vials, a centrifugal concentrator, a shaking incubator, and a GC/MS analyzer. The process is as follows:
1. Sample pretreatment: The stool sample was accurately weighed into a 1.5 mL centrifuge tube, and 800 μL of pre-cooled methanol-water solution (V:V = 4:1) and two small steel balls were added. The tube was placed in a -80 ℃ freezer for 2 min and then ground in a grinder (60 Hz, 2 min). Next, 160 μL of chloroform and 20 μL of internal standard (L-2-chlorophenylalanine, 0.3 mg/mL in methanol) were added in sequence, and the mixture was vortexed in the grinder (20 Hz, 1 min), extracted by ultrasound for 10 min, left to stand at 4 ℃ for 10 min, and centrifuged at low temperature for 10 min (15000 rpm, 4 ℃). 200 μL of supernatant was taken each time, placed in a glass derivatization vial, and evaporated to dryness in a rapid centrifugal concentrator, 500 μL in total. Then 80 μL of methoxyamine hydrochloride in pyridine (15 mg/mL) was added to the vial, followed by vortexing for 2 min and an oximation reaction at 37 ℃ in a shaking incubator for 90 min. After removal, 80 μL of BSTFA derivatization reagent (containing 1% TMCS) and 20 μL of n-hexane were added, and the mixture was vortexed for 2 min and reacted at 70 ℃ for 60 min. The sample was then taken out, left at room temperature for 30 min, and subjected to GC/MS metabolomics analysis.
2. GC/MS analysis: The analytical instrument was an Agilent 7890B-5977A gas chromatography-mass spectrometry system (Agilent, USA). 1 μL of the derivatized extract was injected into the GC-MS system in split mode at a split ratio of 2:1, and the sample was separated on a nonpolar DB-5MS capillary column (30 m × 250 μm I.D., J&W Scientific, Folsom, CA) before mass spectrometric detection. High-purity helium was used as the carrier gas at a flow rate of 1.0 mL/min. Temperature program: 80-100 ℃ at 8 ℃/min; 100-170 ℃ at 10 ℃/min; 170-200 ℃ at 5 ℃/min; 200-305 ℃ at 8 ℃/min, then held at 305 ℃ for 4 min. The injection port temperature was 260 ℃, the EI source temperature was 260 ℃, and the voltage was -70 V. Mass scan range: m/z 50-450; acquisition started after a 5 min delay, at an acquisition rate of 36.966 spectra/second.
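As a worked check of the oven program, the sketch below computes the chromatographic run time. It assumes the temperature program reads as four consecutive linear ramps (80 to 100 ℃ at 8 ℃/min, 100 to 170 ℃ at 10 ℃/min, 170 to 200 ℃ at 5 ℃/min, 200 to 305 ℃ at 8 ℃/min) with no initial hold, which is an interpretation of the text rather than a stated fact; the constant and function names are illustrative.

```python
# Oven program as (start_temp_C, end_temp_C, rate_C_per_min), as read from the text.
RAMPS = [
    (80, 100, 8),
    (100, 170, 10),
    (170, 200, 5),
    (200, 305, 8),
]
FINAL_HOLD_MIN = 4     # isothermal hold at 305 C
SOLVENT_DELAY_MIN = 5  # acquisition starts after this delay


def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Time needed to heat from start_c to end_c at a constant rate."""
    return (end_c - start_c) / rate_c_per_min


def total_run_minutes(ramps, final_hold_min):
    """Sum of all ramp times plus the final isothermal hold."""
    return sum(ramp_minutes(*ramp) for ramp in ramps) + final_hold_min


run_time = total_run_minutes(RAMPS, FINAL_HOLD_MIN)  # 2.5 + 7 + 6 + 13.125 + 4
acquisition_time = run_time - SOLVENT_DELAY_MIN
```

Under these assumptions each injection takes 32.625 min, of which 27.625 min are recorded after the 5 min acquisition delay.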
Machine learning models include, but are not limited to, decision trees, random forests, logistic regression, support vector machines, Bayesian methods, K-nearest neighbors, K-means, and deep learning (various artificial neural networks). Machine learning is a method in which a computer derives a corresponding relationship model from sample data through rote learning, learning from instruction, learning by analogy, learning from examples, or the like. In general, the more analysis samples there are, the more accurate the prediction model obtained by machine learning. The prediction model can be verified with some of the analysis samples to judge whether it needs to be optimized or updated; optimization or updating means improving the prediction model by adding analysis samples. Machine learning methods differ slightly in their learning strategies, but any way of obtaining a machine learning model can be applied in the present invention.
In the preferred embodiment of the present invention, a neural network or a random forest is adopted. The prediction models fall into two broad categories: prediction before colorectal cancer occurs, and prediction after colorectal cancer has occurred. See Examples two through nine for details.
Example two
Example two is mainly directed to an early screening model for high-risk populations. The related persons are high-risk persons, that is, persons who meet one or more of the following: aged over 40, positive fecal occult blood, a family history of intestinal cancer, and a history of intestinal polyps. A plurality of related persons are selected (preferably diverse, that is, for each high-risk category some persons with intestinal cancer and some without are selected) and grouped according to whether intestinal cancer has occurred. Each related person is labeled accordingly: a label is set for the analysis sample of each related person, indicating whether the person corresponding to that analysis sample has developed intestinal cancer.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested (training and testing are conventional techniques and are not repeated here) to finally obtain the early screening model. Inputting the target sample into the early screening model yields whether the target sample (that is, the corresponding target person) has intestinal cancer.
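The label, split, train, test, and predict flow described above can be sketched as follows. This is only an illustration under stated assumptions: a K-nearest-neighbors classifier (one of the model families listed in the description) stands in for the preferred neural network or random forest, the two features stand in for, say, two taxon abundances, and all data are synthetic values generated for the example, not measurements.

```python
import math
import random
from collections import Counter


def split_samples(samples, test_fraction=0.25, seed=0):
    """Shuffle labelled samples and split them into training and test lists."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, features, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Synthetic toy data: (feature vector, label); 1 = intestinal cancer, 0 = none.
rng = random.Random(42)
samples = (
    [([rng.gauss(1.0, 0.2), rng.gauss(1.0, 0.2)], 0) for _ in range(20)]
    + [([rng.gauss(3.0, 0.2), rng.gauss(3.0, 0.2)], 1) for _ in range(20)]
)
train, test = split_samples(samples)
test_accuracy = accuracy(train, test)      # evaluation on held-out test samples
prediction = knn_predict(train, [2.9, 3.1])  # label for a hypothetical target sample
```

The same skeleton applies to the later examples; only the related persons, the grouping, and the label change.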
Example three
Example three is mainly directed to a risk prediction model for healthy populations. The related persons include healthy persons, patients with intestinal polyps, and patients with early intestinal cancer.
A plurality of related persons are selected (preferably diverse, that is, healthy persons, patients with a history of intestinal polyps, and patients with early intestinal cancer are all included) and grouped according to whether intestinal cancer has occurred. Each related person is labeled accordingly: a label is set for the analysis sample of each related person, indicating whether the person corresponding to that analysis sample has developed intestinal cancer.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the risk prediction model. Inputting the target sample into the risk prediction model yields the probability that the target sample (that is, the corresponding target person) develops intestinal cancer.
Example four
Example four is mainly directed to a survival time prediction model for patients with intestinal cancer. The related persons are patients with a confirmed diagnosis of intestinal cancer.
A plurality of related persons are selected and grouped according to their survival time. Each related person is labeled by survival time: a label is set for the analysis sample of each related person, the label being the survival time of the corresponding person, expressed as one of the intervals (0, 3), (3, 5), and (5, 10).
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the survival time prediction model. Inputting the target sample into the survival time prediction model yields the survival time of the target sample (that is, the corresponding target person); the predicted survival time is likewise an interval.
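The interval labels can be produced with a small helper such as the sketch below. The interval bounds come from the text; treating each interval as open on the left and closed on the right, so that boundary values such as 3 have a single label, is an assumption, as is the function name.

```python
# Interval bounds from the text; the right-closed convention is an assumption.
SURVIVAL_INTERVALS = [(0, 3), (3, 5), (5, 10)]


def survival_label(survival_time):
    """Return the index of the interval (low, high] that contains survival_time."""
    for index, (low, high) in enumerate(SURVIVAL_INTERVALS):
        if low < survival_time <= high:
            return index
    raise ValueError(f"survival time {survival_time} is outside all intervals")


# Example: four hypothetical survival times mapped to class labels 0, 1, or 2.
labels = [survival_label(t) for t in (1.5, 3, 4.2, 10)]
```

With these labels the task becomes a three-class classification problem, which matches the statement that the predicted survival time is itself an interval.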
Example five
Example five is mainly directed to a treatment regimen selection model for patients with stage II-III intestinal cancer. The related persons are patients with stage II-III intestinal cancer.
A plurality of related persons are selected and grouped according to whether they received chemotherapy. Each related person is labeled by survival time: a label is set for the analysis sample of each related person, the label being the survival time of the corresponding person, namely 5-year survival or 10-year survival.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the treatment regimen selection model. Inputting the target sample into the treatment regimen selection model yields whether the target sample (that is, the corresponding target person) needs to receive chemotherapy. The judgment is based on the survival time of the target person with chemotherapy: the longer the survival time, the more effective the corresponding treatment regimen.
Example six
Example six is mainly directed to a prediction model for chemotherapy-related diarrhea in intestinal cancer. The related persons are intestinal cancer patients receiving chemotherapy.
A plurality of related persons are selected and grouped according to whether they received the FOLFOX or the FOLFIRI chemotherapy regimen. Each related person is labeled according to whether diarrhea occurred: a label is set for the analysis sample of each related person, indicating whether the person corresponding to that analysis sample developed diarrhea.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the prediction model. Inputting the target sample into the prediction model yields the probability that the target sample (that is, the corresponding target person) develops diarrhea, or alternatively whether diarrhea occurs. The probability and the yes/no result are interconvertible: whether diarrhea occurs can be determined by setting a threshold on the probability. For example, a probability of developing diarrhea greater than 80% is judged as diarrhea occurring.
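The conversion from a predicted probability to a yes/no result can be sketched as below, using the 80% threshold given as an example in the text. The function name is hypothetical, and the probabilities are illustrative inputs rather than model outputs.

```python
def diarrhea_decision(probability, threshold=0.8):
    """Convert a predicted probability into a yes/no call using a threshold."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # "Greater than 80%" is read as a strict inequality.
    return probability > threshold


# Three hypothetical predicted probabilities and the resulting decisions.
decisions = [diarrhea_decision(p) for p in (0.35, 0.80, 0.92)]
```

The threshold is a tunable parameter; a lower value flags more patients at the cost of more false alarms.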
Example seven
Example seven is mainly directed to a monitoring model for recurrence and metastasis in patients with intestinal cancer. The related persons are patients with early intestinal cancer or intestinal cancer patients who have survived for 5 years or more.
A plurality of related persons are selected (preferably diverse, that is, some patients with early intestinal cancer and some intestinal cancer patients who have survived for 5 years or more are included) and grouped according to whether recurrence or cancer cell metastasis (to lymph nodes or to other organs) has occurred. Each related person is labeled accordingly: a label is set for the analysis sample of each related person, indicating whether the person corresponding to that analysis sample has experienced recurrence or metastasis.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the monitoring model. Inputting the target sample into the monitoring model yields the probability of recurrence or metastasis for the target sample (that is, the corresponding target person).
Example eight
Example eight is mainly directed to an early screening model for primary drug resistance in intestinal cancer patients receiving chemotherapy. The related persons are intestinal cancer patients receiving chemotherapy.
A plurality of related persons are selected and grouped according to whether they received the FOLFOX or the FOLFIRI chemotherapy regimen. Each related person is labeled according to disease progression: a label is set for the analysis sample of each related person, the label being the disease progression of the corresponding person. The disease progression categories are PR (partial response), SD (stable disease), PD (progressive disease), and CR (complete response), as defined by the RECIST criteria.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the early screening model. Inputting the target sample into the early screening model yields the probability of primary drug resistance for the target sample (that is, the corresponding target person).
Example nine
Example nine is mainly directed to an early screening model for primary drug resistance in intestinal cancer patients receiving targeted therapy. The related persons are intestinal cancer patients receiving targeted therapy.
A plurality of related persons are selected and grouped according to whether they received anti-EGFR drugs or anti-VEGFR drugs. Each related person is labeled according to disease progression: a label is set for the analysis sample of each related person, the label being the disease progression of the corresponding person. The disease progression categories are PR, SD, PD, and CR, as defined by the RECIST criteria.
The analysis samples corresponding to the related persons are divided into training samples and test samples, and the machine learning model is trained and tested to finally obtain the early screening model. Inputting the target sample into the early screening model yields the probability of primary drug resistance for the target sample (that is, the corresponding target person).
Example ten
FIG. 7 is a schematic structural diagram of an electronic device according to Example ten of the present invention. As shown in FIG. 7, the electronic device includes a processor 40, a memory 50, an input device 60, and an output device 70. The electronic device may have one or more processors 40; one processor 40 is taken as an example in FIG. 7. The processor 40, the memory 50, the input device 60, and the output device 70 may be connected by a bus or by other means; connection by a bus is taken as an example in FIG. 7.
The memory 50 serves as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the colorectal cancer-related prediction system in the embodiments of the present invention (for example, implementing the steps of receiving analysis samples of the feces of related persons, constructing a prediction model according to the analysis samples and a machine learning model, and obtaining a prediction result according to a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of the target person). By running the software programs, instructions, and modules stored in the memory 50, the processor 40 executes the various functional applications and data processing of the electronic device, that is, implements the colorectal cancer-related prediction system of the above embodiments.
The memory 50 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 50 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 50 may further include memory located remotely from the processor 40, which may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 60 may be used to input an identity signal of the user, a corresponding preset threshold value, etc. The output device 70 may include a display device such as a display screen.
Example eleven
Example eleven further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, implement the colorectal cancer-related prediction system by performing a method comprising:
receiving an analysis sample of the feces of the relevant person;
and constructing a prediction model according to the analysis samples and the machine learning model, and acquiring a prediction result according to a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the excrement of the target person.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes instructions for enabling an electronic device (which may be a mobile phone, a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.

Claims (19)

1. A colorectal cancer-related prediction system, comprising:
a collecting unit for collecting and storing the feces of related persons, referred to as a fecal sample;
an analysis unit for analyzing the feces to obtain an analysis sample, the analysis sample comprising one or more of microorganism information and microbial metabolite information; and
a processing unit for constructing a prediction model according to the analysis samples and a machine learning model, and obtaining a prediction result according to a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of a target person.
2. The system of claim 1, wherein the collection unit comprises a centrifuge tube, the stool is placed in the centrifuge tube, and the supernatant and the pellet are dispensed after centrifugation at a first preset speed for a first preset time, and stored in a refrigerator for use.
3. The system of claim 2, wherein the first predetermined speed is 1800-2500 rpm; or/and the first preset time is 4-8 minutes.
4. The system of claim 1, wherein when the analysis sample includes microbiological information, the microbiological information is one or more of fungal abundance, fungal diversity, fungal community structure, bacterial abundance, bacterial diversity, and bacterial community structure.
5. The system of claim 4, wherein, when analyzing the microorganisms in the feces, the analysis unit comprises:
a fecal genome extraction kit for extracting microbial genomic DNA from the fecal sample, the extracted microbial genomic DNA being referred to as a DNA sample;
an ultraviolet spectrophotometer for judging the concentration of the DNA sample;
agarose gel electrophoresis for judging the purity of the DNA sample; and
a PCR amplification instrument for performing PCR amplification on the DNA sample to obtain a PCR amplification product;
wherein DGGE bidirectional sequencing analysis is performed on the PCR amplification product to obtain raw sequencing data, and quality control is performed on the raw sequencing data, the quality control comprising splicing and filtering, discarding low-quality sequences, and removing chimeric sequences, to finally obtain high-quality sequences for analysis; and
OTU clustering is performed on the high-quality sequences using QIIME to obtain a plurality of OTUs, and species annotation is performed on the representative sequence of each OTU to obtain the microorganism information.
6. The system of claim 5, wherein the regions sequenced bidirectionally are gene sequences of the bacterial 16S rRNA variable region or/and the fungal ITS region.
7. The system of claim 1, wherein, when the analysis sample includes microbial metabolite information, the microbial metabolite information is microbial metabolomics analysis information.
8. The system of claim 7, wherein the analysis unit, when analyzing the microbial metabolites, comprises:
a grinder for grinding the fecal sample to which pre-cooled methanol-water solution has been added;
a glass derivatization vial for holding the supernatant of the ground fecal sample;
a centrifugal concentrator for evaporating the supernatant to dryness;
a shaking incubator in which the glass derivatization vial, after addition of methoxyamine hydrochloride in pyridine, is placed for an oximation reaction to finally form a microbial metabolite analysis sample; and
a GC/MS analyzer for performing metabolomics analysis on the microbial metabolite analysis sample.
9. The system of claim 1, wherein the machine learning model is any one of a random forest model, a decision tree model, a logistic regression model, and a deep learning model.
10. The system of any one of claims 1 to 9, wherein the predictive model is an early screening model for high risk populations;
the related persons are high-risk persons, a high-risk person being one who meets one or more of the following: aged over 40, positive fecal occult blood, a family history of intestinal cancer, and a history of intestinal polyps;
selecting a plurality of related persons, and grouping the related persons according to whether intestinal cancer occurs or not; labeling each related person according to whether intestinal cancer occurs, namely setting a label for an analysis sample of each related person, wherein the label is whether the related person corresponding to the analysis sample occurs the intestinal cancer;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain an early screening model;
the prediction result is whether the target sample has intestinal cancer.
11. The system of any one of claims 1 to 9, wherein the prediction model is a risk prediction model for healthy populations;
the related persons include healthy persons, patients with a history of intestinal polyps, and patients with early intestinal cancer;
selecting a plurality of related persons, and grouping the related persons according to whether intestinal cancer occurs or not; labeling each related person according to whether intestinal cancer occurs, namely setting a label for an analysis sample of each related person, wherein the label is whether the related person corresponding to the analysis sample occurs the intestinal cancer;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a risk prediction model;
the prediction result is the probability of intestinal cancer occurrence in the target sample.
12. The system of any one of claims 1 to 9, wherein the prediction model is a survival time prediction model for patients with intestinal cancer;
the related personnel are diagnosed intestinal cancer patients;
selecting a plurality of related personnel, and grouping the related personnel according to the survival time of the related personnel; labeling each related person according to the survival time, namely setting a label for the analysis sample of each related person, wherein the label is the survival time of the related person corresponding to the analysis sample, and the survival times are (0, 3), (3, 5), (5, 10) respectively;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a survival time prediction model;
and the prediction result is the survival time of the target sample.
13. The system of any one of claims 1 to 9, wherein the prediction model is a treatment regimen selection model for patients with stage II-III intestinal cancer;
the related persons are patients with intestinal cancer of II-III stage;
selecting a plurality of related persons, and grouping the related persons according to whether the related persons receive chemotherapy; labeling each related person according to survival time, namely setting a label for an analysis sample of each related person, wherein the label is the survival time of the related person corresponding to the analysis sample, and the survival time is 5-year survival time and 10-year survival time respectively;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a treatment scheme selection model;
the predicted outcome is whether the target sample needs to receive chemotherapy.
14. The system of any one of claims 1 to 9, wherein the prediction model is a prediction model for chemotherapy-related diarrhea in intestinal cancer;
the related person is an intestinal cancer patient receiving chemotherapy;
selecting a plurality of related persons, and grouping the related persons according to the FOLFOX chemotherapy scheme and the FOLFIRI chemotherapy scheme; labeling each related person according to whether diarrhea occurs, namely setting a label for the analysis sample of each related person, wherein the label is the label which indicates whether the related person corresponding to the analysis sample has diarrhea;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a prediction model;
the prediction result is the probability of the target sample generating diarrhea.
15. The system according to any one of claims 1 to 9, wherein the prediction model is a monitoring model for recurrence and metastasis of a patient with intestinal cancer;
the related person is an early intestinal cancer patient or an intestinal cancer patient who lives for 5 years or more;
selecting a plurality of related persons, and grouping the related persons according to whether recurrence or cancer cell metastasis has occurred; labeling each related person according to whether recurrence or metastasis has occurred, namely setting a label for the analysis sample of each related person, wherein the label indicates whether the related person corresponding to the analysis sample has experienced recurrence or metastasis;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain a monitoring model;
the prediction result is the probability of relapse or metastasis of the target sample.
16. The system of any one of claims 1 to 9, wherein the prediction model is an early screening model for primary drug resistance in intestinal cancer patients receiving chemotherapy;
the related person is an intestinal cancer patient receiving chemotherapy;
selecting a plurality of related persons and grouping them by chemotherapy regimen (FOLFOX or FOLFIRI); labeling each related person according to disease progression status, that is, setting a label for the analysis sample of each related person, wherein the label is the disease progression status of the related person corresponding to the analysis sample; the disease progression status is one of PR, SD, PD, and CR;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain an early screening model;
the prediction result is the probability that the target person corresponding to the target sample has primary drug resistance.
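The claim labels each sample with one of four progression codes but reports a single probability of primary drug resistance, so some mapping from codes to a binary label is needed. The sketch below shows one possible mapping. It rests on two assumptions not stated in the claim: that the codes are the usual response categories (complete response, partial response, stable disease, progressive disease), and that only PD counts as primary resistance.

```python
from enum import Enum

class Response(Enum):
    CR = "CR"  # presumably complete response
    PR = "PR"  # presumably partial response
    SD = "SD"  # presumably stable disease
    PD = "PD"  # presumably progressive disease

# ASSUMPTION: only PD is treated as primary drug resistance.
# The claim does not define this mapping; adjust the set to the study protocol.
RESISTANT = frozenset({Response.PD})

def resistance_label(response_code):
    """Map a progression code to a binary training label (1 = resistant)."""
    response = Response(response_code.strip().upper())  # ValueError on unknown codes
    return 1 if response in RESISTANT else 0

codes = ["PR", "sd", "PD", "CR"]
labels = [resistance_label(c) for c in codes]
print(dict(zip(codes, labels)))
```

Keeping the resistant set in one named constant makes the labeling convention explicit and easy to change without touching the training code.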
17. The system of any one of claims 1 to 9, wherein the predictive model is an early screening model for primary drug resistance in colon cancer patients receiving targeted therapy;
the related person is an intestinal cancer patient receiving targeted therapy;
selecting a plurality of related persons and grouping them according to whether they receive anti-EGFR drugs or anti-VEGFR drugs; labeling each related person according to disease progression status, that is, setting a label for the analysis sample of each related person, wherein the label is the disease progression status of the related person corresponding to the analysis sample; the disease progression status is one of PR, SD, PD, and CR;
dividing analysis samples corresponding to a plurality of related persons into training samples and testing samples, and training and testing the machine learning model to finally obtain an early screening model;
the prediction result is the probability that the target person corresponding to the target sample has primary drug resistance.
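The "testing" half of "training and testing the machine learning model" is not detailed in the claims. One common way to carry it out, shown here purely as an assumption, is to threshold the predicted probabilities on the test samples and compute accuracy, sensitivity and specificity. The 0.5 threshold and the example numbers are hypothetical.

```python
def evaluate(probabilities, labels, threshold=0.5):
    """Compare thresholded probabilities with true labels on the test samples."""
    tp = fp = tn = fn = 0
    for p, y in zip(probabilities, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        # fraction of truly resistant patients that the model flags
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # fraction of non-resistant patients that the model clears
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }

# Hypothetical model outputs for six test samples (1 = primary resistance).
test_probabilities = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
test_labels = [1, 1, 1, 0, 0, 0]
metrics = evaluate(test_probabilities, test_labels)
print(metrics)
```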
18. An electronic device comprising a processor, a storage medium, and a computer program stored in the storage medium, wherein the computer program, when executed by the processor, performs the following steps:
receiving an analysis sample of the feces of a related person;
and constructing a prediction model according to the analysis samples and a machine learning model, and acquiring a prediction result according to a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of a target person.
19. A computer-readable storage medium having a computer program stored thereon, the computer program being executable by a processor to perform the steps of:
receiving an analysis sample of the feces of a related person;
and constructing a prediction model according to the analysis samples and a machine learning model, and acquiring a prediction result according to a target sample and the prediction model, wherein the target sample is a sample obtained by analyzing the feces of a target person.
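The three program steps shared by claims 18 and 19 (receive analysis samples, construct a prediction model, obtain a prediction for a target sample) can be sketched as a small class. The class and method names are hypothetical, and the k-nearest-neighbour vote is an assumed stand-in for whichever machine learning model is actually used.

```python
import math

class StoolPredictionPipeline:
    """Hypothetical wrapper around the receive / construct / predict steps."""

    def __init__(self, k=3):
        self.k = k
        self._samples = []  # received (feature_vector, label) pairs
        self._model = None

    def receive_analysis_sample(self, features, label):
        """Step 1: store one labelled analysis sample from a related person."""
        self._samples.append((list(features), int(label)))

    def build_model(self):
        """Step 2: construct the prediction model from the stored samples.
        For k-NN, 'training' is just freezing a copy of the labelled data."""
        if len(self._samples) < self.k:
            raise ValueError("need at least k labelled analysis samples")
        self._model = list(self._samples)

    def predict(self, target_features):
        """Step 3: share of positive labels among the k nearest samples."""
        if self._model is None:
            raise RuntimeError("call build_model() first")
        nearest = sorted(
            self._model, key=lambda s: math.dist(s[0], target_features)
        )[: self.k]
        return sum(label for _, label in nearest) / self.k

# Toy usage with invented feature vectors (1 = outcome of interest occurred).
pipeline = StoolPredictionPipeline(k=3)
for features, label in [
    ([1.0, 0.9], 1), ([0.9, 1.0], 1), ([0.8, 0.8], 1),
    ([0.1, 0.0], 0), ([0.0, 0.2], 0), ([0.2, 0.1], 0),
]:
    pipeline.receive_analysis_sample(features, label)
pipeline.build_model()
print(pipeline.predict([0.9, 0.9]))
```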
CN201910519062.6A 2019-06-14 2019-06-14 Colorectal cancer-related prediction system, electronic device, and storage medium Pending CN112080565A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910519062.6A CN112080565A (en) 2019-06-14 2019-06-14 Colorectal cancer-related prediction system, electronic device, and storage medium
PCT/CN2020/082950 WO2020248665A1 (en) 2019-06-14 2020-04-02 Related prediction system for colorectal cancer, and electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910519062.6A CN112080565A (en) 2019-06-14 2019-06-14 Colorectal cancer-related prediction system, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
CN112080565A true CN112080565A (en) 2020-12-15

Family

ID=73734266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910519062.6A Pending CN112080565A (en) 2019-06-14 2019-06-14 Colorectal cancer-related prediction system, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN112080565A (en)
WO (1) WO2020248665A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113488121A (en) * 2021-07-24 2021-10-08 山东省千佛山医院 Accurate detection, evaluation and intervention system and method for colon cancer intestinal microecology
CN113782191A (en) * 2021-09-26 2021-12-10 萱闱(北京)生物科技有限公司 Colorectal lesion type prediction device, model construction method, medium, and apparatus
CN116904586A (en) * 2023-09-12 2023-10-20 上海益诺思生物技术股份有限公司 Application of reagent for detecting plasma-derived exosome lncRNA in preparation of diagnostic reagent for detecting kidney injury

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105567846A (en) * 2016-02-14 2016-05-11 上海交通大学医学院附属仁济医院 Kit for detecting bacteria DNAs in faeces and application thereof in colorectal cancer diagnosis
CN106611094A (en) * 2015-10-15 2017-05-03 北京寻因生物科技有限公司 Method for carrying out prediction and intervention on toxic and side effect of chemotherapy drug on the basis of intestinal tract microbial flora
CN109493926A (en) * 2018-10-30 2019-03-19 中山大学肿瘤防治中心 Processing method, device, medium and the electronic equipment of colorectal cancer medical data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107541544A (en) * 2016-06-27 2018-01-05 卡尤迪生物科技(北京)有限公司 Methods, systems, kits, uses and compositions for determining a microbial profile
AR110378A1 (en) * 2016-12-15 2019-03-20 Univ College Cork National Univ Of Ireland Cork METHODS TO DETERMINE THE STATE OF COLORRECTAL CANCER ON A PERSON
CN109609639B (en) * 2019-01-14 2022-04-26 深圳微健康基因科技有限公司 Colorectal cancer detection method and system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113488121A (en) * 2021-07-24 2021-10-08 山东省千佛山医院 Accurate detection, evaluation and intervention system and method for colon cancer intestinal microecology
CN113488121B (en) * 2021-07-24 2024-03-15 山东省千佛山医院 Intestinal microecology precise detection and evaluation intervention system and method for colon cancer
CN113782191A (en) * 2021-09-26 2021-12-10 萱闱(北京)生物科技有限公司 Colorectal lesion type prediction device, model construction method, medium, and apparatus
CN116904586A (en) * 2023-09-12 2023-10-20 上海益诺思生物技术股份有限公司 Application of reagent for detecting plasma-derived exosome lncRNA in preparation of diagnostic reagent for detecting kidney injury
CN116904586B (en) * 2023-09-12 2023-12-22 上海益诺思生物技术股份有限公司 Application of reagent for detecting plasma-derived exosome lncRNA in preparation of diagnostic reagent for detecting kidney injury

Also Published As

Publication number Publication date
WO2020248665A1 (en) 2020-12-17

Similar Documents

Publication Publication Date Title
Sepehri et al. Microbial diversity of inflamed and noninflamed gut biopsy tissues in inflammatory bowel disease
WO2020248665A1 (en) Related prediction system for colorectal cancer, and electronic device and storage medium
WO2015018307A1 (en) Biomarkers for colorectal cancer
CN109477145A (en) The biomarker of inflammatory bowel disease
Mottawea et al. The mucosal–luminal interface: an ideal sample to study the mucosa-associated microbiota and the intestinal microbial biogeography
WO2016112488A1 (en) Biomarkers for colorectal cancer related diseases
US20150267249A1 (en) Determination of reduced gut bacterial diversity
CN111500705B (en) IgAN intestinal flora marker, igAN metabolite marker and application thereof
CN114317746B (en) Application of exosomes ARPC5, YPEL2 and the like in lung cancer diagnosis
CN111763740B (en) System for predicting treatment effect and prognosis of neoadjuvant radiotherapy and chemotherapy of esophageal squamous carcinoma patient based on lncRNA molecular model
EP2909332A1 (en) Determination of a tendency to gain weight
CN111748626B (en) System for predicting treatment effect and prognosis of neoadjuvant radiotherapy and chemotherapy of esophageal squamous carcinoma patient and application of system
JP7362241B2 (en) How to test for colorectal cancer
Fujiki et al. Hydrogen gas and the gut microbiota are potential biomarkers for the development of experimental colitis in mice
Noor et al. Genetic and Genomic Markers for Prognostication
CN114231638A (en) Kit, device and method for lung cancer diagnosis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201215