CN111334590A - Kit for identifying colorectal cancer and application thereof - Google Patents


Info

Publication number
CN111334590A
Authority
CN
China
Prior art keywords
colorectal cancer
indicator
bacteria
sample
combination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010103575.1A
Other languages
Chinese (zh)
Inventor
孙子奎
宣涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Personal Gene Technology Co ltd
Original Assignee
Nanjing Personal Gene Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Personal Gene Technology Co ltd
Priority to CN202010103575.1A
Publication of CN111334590A
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a kit for identifying colorectal cancer, comprising a specific primer combination for identifying indicator bacteria, the primer combination being shown in SEQ ID NO. 1-12, wherein the indicator bacteria are a combination of Fusobacterium nucleatum (F. nucleatum), Peptostreptococcus anaerobius (P. anaerobius), Clostridium symbiosum (C. symbiosum), Enterococcus faecalis (E. faecalis), Porphyromonas asaccharolytica (P. asaccharolytica) and Streptococcus salivarius (S. salivarius). The invention also discloses the application of the kit. The beneficial effect of the invention is that colorectal cancer risk can be predicted from a stool sample alone: by detecting colorectal cancer-associated pathogenic bacteria in feces, the risk assessment can reach a sensitivity of 92% and a specificity of 95%.

Description

Kit for identifying colorectal cancer and application thereof
Technical Field
The invention belongs to the field of gene detection, and particularly relates to a kit for identifying colorectal cancer and application thereof.
Background
Among common cancers, the statistics reported in Cancer incidence and mortality in China, 2014 put the incidence of colorectal cancer at 17.2 per 100,000, ranking fifth among tumor incidence rates (the first four being lung, breast, gastric and liver cancer). Colorectal cancer is also one of the most common causes of cancer death; according to statistics, more than sixty thousand people die of it every year.
The development of colorectal cancer is a multifactorial, multi-step and multi-pathway process in which progressive changes in the structure of the intestinal flora, molecular genetic changes and the associated histomorphological changes interact with external factors such as the in-body environment. Precancerous lesions include inflammatory bowel disease, aberrant crypt foci, adenomas, juvenile polyposis, serrated lesions and the like, and the disease can take up to 10 years to develop. Patients at an early stage have no obvious symptoms and often do not actively seek medical advice, while patients who already present obvious clinical symptoms have often missed the best opportunity for detection by the time they do. Moreover, traditional detection methods (such as fecal occult blood testing, digital rectal examination and tumor markers) have poor sensitivity and rarely detect the abnormal signals of early intestinal cancer, while enteroscopy requires cumbersome preparation, is highly invasive, and has a detection rate below 1% in asymptomatic people. Auxiliary diagnosis of colorectal cancer is therefore of great importance, and a convenient, noninvasive, at-home auxiliary diagnostic product is more readily accepted by asymptomatic people.
Research has found that the intestinal flora of colorectal cancer patients differs markedly from that of healthy people, and that microorganisms in the human intestinal tract are closely related to the occurrence of colorectal cancer.
The intestinal flora participates in the development of colorectal tumors, possibly through the following mechanisms: products of certain intestinal microorganisms can directly damage the intestinal mucosa and cause its incomplete repair; peroxides produced by certain intestinal bacteria can also contribute to the development of colorectal cancer; and certain intestinal microorganisms induce colorectal cancer by directly or indirectly causing genetic mutations in intestinal epithelial cells.
Therefore, screening suitable indicator bacteria as markers and using them in kit form for the auxiliary diagnosis of colorectal cancer is a new route.
In the prior art, colorectal cancer is often judged from the fecal occult blood test result; however, this judgment is not fully accurate, and further examination by enteroscopy is often required, which places higher demands on the physical condition of the person examined and is painful.
Disclosure of Invention
In order to overcome the above drawbacks of the prior art, one object of the present invention is to provide a kit for identifying colorectal cancer.
To achieve this object, the following technical scheme is adopted:
a kit for identifying colorectal cancer, which comprises the following components:
a specific primer combination for rapidly identifying the indicator bacteria;
the primer combination is shown in SEQ ID NO. 1-12;
wherein the indicator bacteria are a combination of Fusobacterium nucleatum (F. nucleatum), Peptostreptococcus anaerobius (P. anaerobius), Clostridium symbiosum (C. symbiosum), Enterococcus faecalis (E. faecalis), Porphyromonas asaccharolytica (P. asaccharolytica) and Streptococcus salivarius (S. salivarius).
In a preferred embodiment of the present invention, the kit further comprises:
sample collection devices and/or accessories (e.g., collection boxes, collection tubes, masks, gloves, etc.);
PCR reaction premix solution;
water.
Note: the primer names are named by exon numbers corresponding to the genes; f represents an upstream primer, and R represents a downstream primer.
Another object of the invention is to provide an application of the kit, namely its use in subsequently developing a device for the auxiliary diagnosis of colorectal cancer.
The beneficial effect of the invention is that it provides an at-home, noninvasive and highly sensitive product for the auxiliary diagnosis of colorectal cancer. Colorectal cancer risk can be predicted from a stool sample alone: by detecting colorectal cancer-associated pathogenic bacteria in feces, the risk assessment can reach a sensitivity of 92% and a specificity of 95%.
Drawings
Fig. 1 is a logic reference diagram of the prior art.
Fig. 2 is logic reference diagram (1) of the present invention.
Fig. 3 is logic reference diagram (2) of the present invention.
Fig. 4 is logic reference diagram (3) of the present invention.
Fig. 5 is logic reference diagram (4) of the present invention.
Detailed Description
The method of the invention comprises the following steps:
1. Collecting a fecal sample:
The fecal sampling box contains: instructions for use, a health questionnaire/informed consent form, a disposable feces collection box, feces collection tube A, feces collection tube B, a disposable plastic bag, a disposable mask, disposable gloves, self-sealing bags, a sealing sticker, etc.
The collection steps are:
1) fill in the health questionnaire/informed consent form correctly;
2) take out the feces collection box, peel off the adhesive backing, and fit the plastic bag over the box;
3) stick the box to the toilet bowl and excrete feces into the feces collection box;
4) sample with feces collection tube A: wearing gloves and a mask, use the spoon on the tube cap to transfer about 5 spoonfuls of feces into the preservation solution, then screw the cap on tightly;
5) sample with feces collection tube B: use the spoon on the tube cap to transfer one spoonful of feces into tube B, then screw the cap on tightly;
6) place tubes A and B each into a self-sealing bag and seal the bag, then place the tubes and the health questionnaire/informed consent form into the box;
7) tie up the plastic bag and discard it, together with the gloves and the feces collection box, in a trash bin;
8) seal the sampling box with the sealing sticker and send it for testing.
2. Bacterial genome DNA extraction from fecal samples
DNA is extracted according to the instructions of the bacterial genomic DNA extraction kit from Tiangen Biotech (Beijing) Co., Ltd.
2.1 Take 1-2 g of fecal sample, add 200 μl of buffer GA, and vortex or pipette up and down until the sample is completely suspended;
2.2 Add 20 μl of proteinase K solution to the tube and mix well;
2.3 Add 220 μl of buffer GB, vortex for 15 sec, and incubate at 70 °C for 30 min; the solution should become clear. Centrifuge briefly to remove droplets from the inner wall of the tube cap;
2.4 Add 220 μl of absolute ethanol and mix thoroughly by shaking for 15 sec; a flocculent precipitate may appear. Centrifuge briefly to remove droplets from the inner wall of the tube cap;
2.5 Transfer the solution and flocculent precipitate from the previous step to an adsorption column CB3 (placed in a collection tube), centrifuge at 12,000 rpm (~13,400 × g) for 30 sec, discard the waste liquid, and put column CB3 back into the collection tube. If the liquid cannot be centrifuged through completely, remove all the liquid from the CB3 column, add 220 μl of absolute ethanol, pipette up and down 2-3 times, centrifuge, and load the resulting solution and flocculent precipitate onto column CB3 again for adsorption;
2.6 Add 500 μl of buffer GD (check that absolute ethanol has been added before use) to column CB3, centrifuge at 12,000 rpm (~13,400 × g) for 30 sec, discard the waste liquid, and put column CB3 back into the collection tube;
2.7 Add 600 μl of rinsing solution PW (check that absolute ethanol has been added before use) to column CB3, centrifuge at 12,000 rpm (~13,400 × g) for 30 sec, discard the waste liquid, and put column CB3 back into the collection tube;
2.8 Repeat step 2.7;
2.9 Put column CB3 back into the collection tube, centrifuge at 12,000 rpm (~13,400 × g) for 2 min, discard the waste liquid, and leave column CB3 at room temperature for several minutes so that the residual rinsing solution in the adsorption material dries completely;
2.10 Transfer column CB3 to a clean centrifuge tube, add 50-200 μl of elution buffer TE dropwise onto the middle of the adsorption membrane without touching it, let stand at 65 °C for 2-5 min, centrifuge at 12,000 rpm (~13,400 × g) for 2 min, and collect the solution in the centrifuge tube;
2.11 Treat the experimental waste according to the regulations on the management of medical waste of medical and health institutions in Shanghai City and the regulations on the treatment of laboratory medical waste.
3. Detection of indicator bacteria associated with colorectal cancer
3.1 The preferred primer combination sequences of the present application are shown in Table 1 (SEQ ID NO. 1-12);
TABLE 1
Figure BDA0002387702370000051
3.2 detection of indicator bacteria
3.2.1 Dilute the bacterial genomic DNA from fecal samples that passed quality inspection to 30 ng/μl;
3.2.2 The qPCR reaction system is shown in Table 2;
TABLE 2
Figure BDA0002387702370000052
Figure BDA0002387702370000061
3.2.3 The qPCR reaction program is shown in Table 3;
TABLE 3
Figure BDA0002387702370000062
4. Data analysis
The relative abundance of each target intestinal bacterium is normalized to 16S rRNA, i.e., ΔCt = Ct(target) - Ct(16S rRNA).
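The normalization above can be sketched in a few lines of Python. This is a minimal illustration that uses the sign convention of this section; the function names and the Ct values are hypothetical, and the 2^(-ΔCt) transform is the common qPCR convention for turning ΔCt into a relative quantity, which the source does not state explicitly.

```python
def delta_ct(ct_target: float, ct_16s: float) -> float:
    """Return dCt = Ct(target) - Ct(16S rRNA), the convention of this section."""
    return ct_target - ct_16s


def relative_abundance(ct_target: float, ct_16s: float) -> float:
    """Convert dCt to a relative quantity with 2^(-dCt).

    Assumption: this transform is the usual qPCR convention and is not
    taken from the patent text.
    """
    return 2.0 ** (-delta_ct(ct_target, ct_16s))


# Hypothetical Ct values for one indicator bacterium and the 16S reference.
example_dct = delta_ct(ct_target=28.0, ct_16s=15.0)
example_abundance = relative_abundance(ct_target=28.0, ct_16s=15.0)
```

With this convention a larger ΔCt means the target amplifies later than the 16S reference, i.e. it is less abundant.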
5. Stool occult blood detection (colloidal gold method)
5.1 Equilibrate the test strip of the fecal occult blood detection kit (colloidal gold method) and the tube B sample to room temperature in advance;
5.2 Prepare one sample processing tube per sample and add 600 μl of distilled water to it;
5.3 Use the sampling rod to sample randomly from different parts of the fecal sample, taking the amount that adheres to the small ring at the tip of the rod, then stir the sample in the processing tube until it is completely mixed with the distilled water;
5.4 Insert the sample-application end of the test strip vertically into the sample mixture, without immersing it above the MAX line of the strip; after soaking for 15 seconds, take the strip out and lay it flat on the bench;
5.5 Observe, record and photograph the result within 3-5 minutes.
6. Colorectal cancer risk assessment
Based on the detected abundance of the 6 intestinal bacteria in feces and the fecal occult blood test result, a support vector machine algorithm is used to evaluate colorectal cancer risk. The specific steps are as follows:
Step one: standardize the input reference-set data file to csv format;
Step two: read the data and screen the indexes to be used, selecting the ΔCt value as the basic numerical value and the combination of the indicator bacteria and the FIT (fecal immunochemical test) result as the indicator indexes;
Step three: take 70% of all reference samples as the training set and the remaining samples as the validation set;
Step four: select the support vector machine method and initialize a classifier;
Step five: use the training set to optimize the parameters of, and train, the support vector machine; the parameters include the kernel function, the penalty coefficient and the gamma value;
Step six: using the kernel function, penalty coefficient and gamma value obtained in the previous step and the corresponding model, predict on the feature data of the validation set to obtain the validation-set predictions;
Step seven: calculate the sensitivity and specificity of the method by comparing the predictions with the true results of the validation set, then adjust the sensitivity and specificity according to the use scenario;
Step eight: input the data file of a sample to be tested in the standard format, analyze it with the method using the optimized indicator bacteria, kernel function, penalty coefficient and gamma value, and output a colorectal cancer risk result file and a detection report.
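Step seven of the procedure above compares validation-set predictions with the true labels. A minimal sketch of that calculation follows; the function name, the 1 = colorectal cancer / 0 = healthy encoding, and the example labels are illustrative assumptions, not data from the source.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding: 1 = colorectal cancer, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation labels and predictions (not data from the source).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Sensitivity is the fraction of true cancer samples that are flagged, and specificity is the fraction of healthy samples that are correctly cleared.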
Example 1:
The SVM (support vector machine) algorithm judges more accurately than manual interpretation
The method is applied to colorectal cancer risk assessment and diagnosis based on the detected abundance of 9 intestinal bacteria in feces and the fecal occult blood test result.
1 materials and methods
1.1 sample Source
Two stool samples from colorectal cancer patients at the Liaoning Provincial Cancer Hospital, obtained through a cooperative relationship between January 1, 2017 and June 30, 2018, were labeled sample one and sample two.
1.2 specimen extraction
Peel the adhesive backing off the feces collection box and fit the plastic bag over it; stick the box to the toilet and excrete feces into it; use the sampling spoon to transfer about 5 spoonfuls of feces into the feces storage tube, then screw the cap on tightly; place the storage tube in a self-sealing bag and store at -80 °C until use.
Bacterial genomic DNA in the samples was extracted using the Tiangen bacterial genomic DNA extraction kit (DP302) and stored at -20 °C until use.
1.3 Synthesis of PCR amplification primers
The nucleic acid sequences of the 9 fecal intestinal bacteria retrieved from the GenBank database were aligned and analyzed using bioinformatics methods and related software such as DNAstar, specific sequences of the target regions were selected, and PCR primers were designed for specific gene segments of the 9 intestinal bacteria and of the internal reference 16S rRNA (see Table 4).
TABLE 4 primer sequences
Figure BDA0002387702370000081
Figure BDA0002387702370000091
Note: the primer names are named by exon numbers corresponding to the genes; f represents an upstream primer, and R represents a downstream primer.
1.4 preparation of Mixed PCR primer working solution
(1) Prepare a 100 μmol/L stock solution of each synthesized PCR primer in double-distilled water;
(2) Pair the primers and divide them into 10 groups of 1 primer pair each:
Group 1, primer combination I: take 10 μl each of the F.nucleatum-F and F.nucleatum-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution I;
Group 2, primer combination II: take 10 μl each of the P.anaerobius-F and P.anaerobius-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution II;
Group 3, primer combination III: take 10 μl each of the C.symbiosum-F and C.symbiosum-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution III;
Group 4, primer combination IV: take 10 μl each of the E.faecalis-F and E.faecalis-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution IV;
Group 5, primer combination V: take 10 μl each of the P.asaccharolytica-F and P.asaccharolytica-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution V;
Group 6, primer combination VI: take 10 μl each of the S.salivarius-F and S.salivarius-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution VI;
Group 7, primer combination VII: take 10 μl each of the E.faecalis-F and E.faecalis s-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution VII;
Group 8, primer combination VIII: take 10 μl each of the E.rectale-F and E.rectale-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution VIII;
Group 9, primer combination IX: take 10 μl each of the F.prausnitzii-F and F.prausnitzii-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution IX;
Group 10, primer combination X: take 10 μl each of the 16S rRNA-F and 16S rRNA-R primer stock solutions, add them to the same 1.5 ml Eppendorf tube, and add 80 μl of double-distilled water to obtain mixed PCR primer working solution X.
1.5PCR amplification reaction:
(1) The PCR amplification reaction system is 20 μl, comprising 10 μl of 2 × qPCR SYBR Green Master Mix, 1 μl of mixed PCR primer working solution I (or any one of mixed PCR primer working solutions II to X), 30 ng of sample DNA, and ddH2O added to a final volume of 20 μl;
(2) PCR reaction program: 95 °C for 30 sec → 40 cycles of (95 °C for 5 sec, 60 °C for 34 sec) → melting curve (95 °C for 15 sec, 60 °C for 60 sec) → hold at 4 °C.
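The dilution arithmetic implied by sections 1.4 and 1.5 can be checked with a short sketch. The primer volumes and the 100 μmol/L stock concentration come from the text; the 30 ng/μl template concentration is an assumption borrowed from step 3.2.1 of the detailed description, and all names are illustrative.

```python
def diluted_concentration(stock_conc: float, stock_vol: float, total_vol: float) -> float:
    """C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_vol / total_vol


# Working solution: 10 ul forward + 10 ul reverse + 80 ul water = 100 ul.
working_total_ul = 10.0 + 10.0 + 80.0
working_umol_l = diluted_concentration(100.0, 10.0, working_total_ul)  # per primer

# Reaction: 1 ul of working solution in a 20 ul reaction.
reaction_total_ul = 20.0
final_umol_l = diluted_concentration(working_umol_l, 1.0, reaction_total_ul)

# Water needed to reach 20 ul.
# Assumption: template DNA is at 30 ng/ul, so 30 ng corresponds to 1 ul.
dna_ul = 30.0 / 30.0
water_ul = reaction_total_ul - 10.0 - 1.0 - dna_ul
```

Under these assumptions each primer is at 10 μmol/L in the working solution and 0.5 μmol/L in the reaction, with 8 μl of ddH2O making up the volume.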
1.6 data analysis
The abundance of each target intestinal bacterium is calculated from the amplification-curve threshold cycle (Ct) and normalized to 16S rRNA, i.e., ΔCt = Ct(16S rRNA) - Ct(target).
1.7 fecal occult blood test (colloidal gold method)
Equilibrate the test strip of the fecal occult blood detection kit (colloidal gold method) and the tube B sample to room temperature in advance; prepare one sample processing tube per sample and add 600 μl of distilled water to it; use the sampling rod to sample randomly from different parts of the fecal sample, taking the amount that adheres to the small ring at the tip of the rod, then stir the sample in the processing tube until it is completely mixed with the distilled water; insert the sample-application end of the test strip vertically into the sample mixture, without immersing it above the MAX line of the strip, and after soaking for 15 seconds take the strip out and lay it flat on the bench; observe, record and photograph the result within 3-5 minutes.
2. Colorectal cancer risk assessment
2.1 Assessing colorectal cancer risk by manual interpretation: based on the detected abundance of the 9 intestinal bacteria in feces and the fecal occult blood test result, colorectal cancer risk is evaluated manually according to the following criteria and grades:
(1) Low risk: fecal occult blood negative and all 9 indicator bacteria negative
(2) Medium-low risk: fecal occult blood negative and 1-3 indicator bacteria positive
(3) Intermediate risk: fecal occult blood positive and at most 3 indicator bacteria positive, or fecal occult blood negative and at least 4 indicator bacteria positive
(4) High risk: fecal occult blood positive and at least 4 indicator bacteria positive
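The four manual interpretation grades above amount to a small decision rule, sketched below. The function name and the returned labels are illustrative; the thresholds are the ones listed in criteria (1) to (4).

```python
def manual_risk_level(occult_blood_positive: bool, n_positive_bacteria: int) -> str:
    """Grade colorectal cancer risk from the occult blood result and the
    number of positive indicator bacteria (0-9), following criteria (1)-(4)."""
    if not 0 <= n_positive_bacteria <= 9:
        raise ValueError("n_positive_bacteria must be between 0 and 9")
    if occult_blood_positive:
        # Positive occult blood: grade (4) if >= 4 bacteria positive, else (3).
        return "high" if n_positive_bacteria >= 4 else "intermediate"
    if n_positive_bacteria == 0:
        return "low"            # grade (1)
    if n_positive_bacteria <= 3:
        return "medium-low"     # grade (2)
    return "intermediate"       # grade (3), occult blood negative branch


# Hypothetical case: occult blood negative, two indicator bacteria positive.
example_grade = manual_risk_level(False, 2)
```

Because each bacterium is first reduced to a positive or negative call, this rule discards the quantitative ΔCt information.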
2.2 use support vector machine algorithm to assess colorectal cancer risk:
3 results
3.1 sample-detection result
3.1.1 the relative abundance of 9 intestinal bacteria in feces is shown in Table 5;
TABLE 5
Figure BDA0002387702370000111
Figure BDA0002387702370000121
Note: S.salivarius, F.prausnitzii and E.rectale are probiotics, and the other 6 bacteria are pathogenic bacteria
3.1.2 Fecal occult blood test result: negative
3.1.3 colorectal cancer Risk assessment
Colorectal cancer risk by manual interpretation: medium-low risk.
Colorectal cancer risk by the support vector machine algorithm, based on the detected abundance of intestinal bacteria in feces and the fecal occult blood test result: high risk; further comprehensive examination is recommended.
Table 5 shows the relative abundance of the 9 intestinal bacteria in the stool of the sample. The fecal occult blood test result was negative, while the pathological information shows that the sample is a stool sample from a colorectal cancer patient. Manual interpretation nevertheless graded the colorectal cancer risk as medium-low.
Example 2: method for determining preferred combination of indicator bacteria by adopting excrement sample support vector machine
1. Treating the excrement
The specific experimental procedure in this example was the same as that in example 1. Feces of healthy persons and colorectal cancer patients were collected, bacterial genomes were extracted, PCR was performed using 16s rRNA primers, and values were recorded by amplification curve threshold line (Ct) calculation. Additionally, fecal occult blood detection was performed.
2. Normalizing fecal treatment results
The data file format of the standard input reference set is characterized in that a sample number is arranged in a first column, delta Ct values of indicator bacteria 1, indicator bacteria 2, indicator bacteria 3, indicator bacteria 4, indicator bacteria 5, indicator bacteria 6, indicator bacteria 7, indicator bacteria 8 and indicator bacteria 9 are arranged in second to tenth columns, artificial negative and positive judgment results of the indicator bacteria 1, indicator bacteria 2, indicator bacteria 3, indicator bacteria 4, indicator bacteria 5, indicator bacteria 6, indicator bacteria 7, indicator bacteria 8 and indicator bacteria 9 are arranged in eleventh to nineteenth columns, a occult blood result is arranged in twentieth column, and sample information is arranged in twenty-first column. The file format is csv format.
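The 21-column layout described above can be sketched with Python's standard csv module. The column names, the "neg" encoding and the example row are hypothetical placeholders; only the column order and meaning follow the text.

```python
import csv
import io

N_BACTERIA = 9

# Hypothetical column names; the order follows the layout in the text.
HEADER = (
    ["sample_id"]
    + [f"delta_ct_{i}" for i in range(1, N_BACTERIA + 1)]   # columns 2-10
    + [f"call_{i}" for i in range(1, N_BACTERIA + 1)]       # columns 11-19
    + ["occult_blood", "sample_info"]                       # columns 20-21
)


def read_reference_set(handle):
    """Parse a reference-set csv into a list of per-sample dictionaries."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "sample_id": row["sample_id"],
            "delta_ct": [float(row[f"delta_ct_{i}"]) for i in range(1, N_BACTERIA + 1)],
            "calls": [row[f"call_{i}"] for i in range(1, N_BACTERIA + 1)],
            "occult_blood": row["occult_blood"],
            "sample_info": row["sample_info"],
        })
    return records


# Build a one-row example file in memory (values are made up).
example_row = (
    ["S001"]
    + [str(10.0 + i) for i in range(N_BACTERIA)]
    + ["neg"] * N_BACTERIA
    + ["neg", "healthy"]
)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerow(example_row)
buffer.seek(0)
records = read_reference_set(buffer)
```

Keeping both the ΔCt values and the manual calls in one file makes it possible to compare the two data types on the same samples.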
3. Reading csv file, screening indication index
The original abundance judgment method is shown in logic diagram 1.
For convenience of judgment, the original method first calls each of indicator bacteria 1 to 9 positive or negative. This step, however, loses part of the information and may reduce the accuracy of the detection result.
The main differences in this application are:
Each sample is analyzed with both data types, the ΔCt value and the positive/negative call; the data type is then chosen according to the accuracy of the result, and the ΔCt value is selected as the data type for the indicator bacteria.
With the ΔCt value fixed as the data type, some of the 9 indicator bacteria show poor indicating efficacy, so the combination of indicator bacteria needs to be screened.
Indicator bacteria 1, 2, 3, 4 and 5 have high indicating efficacy and serve as core indicator bacteria; they are combined with indicator bacteria 6, 7, 8 and 9, respectively, for analysis.
For example, indicator bacteria 1, 2, 3, 4, 5 and 6 form combination A;
indicator bacteria 3, 4, 5, 7, 8 and 9 form another combination B.
Based on the accuracy of the results and with reference to Table 2, integrating the relative abundance and the relative threshold, the preferred combination of indicator bacteria is determined to be combination A, i.e., indicator bacteria 1, 2, 3, 4, 5 and 6; the judgment logic is shown in logic diagram 2.
4. Dividing training set and verification set
A reference set of 258 known samples is used: 181 samples are drawn at random to train the parameters, and the remaining 77 samples are used to verify the training effect.
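The 181/77 split above corresponds to taking roughly 70% of 258 samples for training (0.7 × 258 = 180.6, rounded to 181). A minimal sketch of such a random split follows; the function name, the fixed seed and the use of integer sample IDs are assumptions for illustration.

```python
import random


def split_reference_set(sample_ids, train_fraction=0.7, seed=0):
    """Shuffle the sample IDs and split them into training and validation sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # fixed seed only for reproducibility
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 258 hypothetical sample IDs, numbered 0..257.
train_ids, validation_ids = split_reference_set(range(258))
```

A fixed seed keeps the split reproducible, so that parameter tuning and validation always see the same samples.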
5. Model optimization and parameter tuning
Linear regression, the support vector machine and extreme gradient boosting were compared with default parameters, and the support vector machine, which was the most accurate, was finally selected. A support vector machine classifier is then initialized.
The support vector machine offers a linear kernel, a polynomial kernel, a Gaussian radial basis function (RBF) kernel and others; the linear and Gaussian kernels, which performed better, were selected, and the important parameters were optimized by grid search.
The first important parameter, C, is the penalty coefficient: the larger C is, the less error is tolerated and the more closely the model fits the training set, so overfitting occurs easily; conversely, if C is too small, underfitting occurs easily. C is set to 7 levels.
The other important parameter is g (the gamma value of the RBF kernel): the larger g is, the more concentrated the Gaussian distribution and the more the model acts only in the neighborhood of the support vectors, i.e., it overfits, so that the training-set accuracy is high but the validation-set accuracy is not; the smaller g is, the more spread out the Gaussian distribution and the smoother the model, but the training-set accuracy is harder to improve. g is set to 8 levels.
The penalty coefficient C is set to 7 levels: 0.001, 0.01, 0.1, 1, 10, 100, 1000. An important parameter of the Gaussian kernel is the gamma value, which is set to 8 levels: 0.00001, 0.0001, 0.001, 0.1, 1, 10, 100, 1000.
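To make the role of gamma concrete, the following sketch evaluates the standard Gaussian (RBF) kernel, K(x, y) = exp(-gamma * ||x - y||^2), for one pair of feature vectors at three of the gamma levels. The two vectors are invented for illustration and are not data from the application.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two hypothetical feature vectors (values invented for illustration).
x = [2.0, 5.5, 1.0]
y = [3.0, 4.5, 2.0]

for gamma in (0.001, 0.1, 10):
    # Larger gamma makes the kernel fall off faster with distance.
    print(gamma, rbf_kernel(x, y, gamma))
```

With a small gamma the two samples still look almost identical to the model, while with a large gamma their similarity drops to practically zero, so each training sample influences only its immediate neighbourhood. This is the overfitting tendency described above.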
A grid search is carried out over the applicable combinations of the linear kernel, the Gaussian kernel, the 7 penalty coefficient levels and the 8 gamma values, giving the optimal parameters C: 100, gamma: 0.001, kernel: Gaussian kernel. The judgment logic is shown in logic diagram 3.
For comparison, comparison parameter combination 1 is set as C: 1000, gamma: 1000, kernel: Gaussian kernel.
Comparison parameter combination 2 is set as C: 1, gamma: 0.1, kernel: Gaussian kernel.
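The grid of candidate settings can be enumerated as in the sketch below. It assumes that gamma is varied only for the Gaussian kernel, which is one reading of "applicable combinations". The helper names `build_grid` and `grid_search` are hypothetical, and the scoring function (for example, accuracy of an SVM trained with the candidate parameters) is left to the caller rather than implemented here.

```python
from itertools import product

C_LEVELS = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
GAMMA_LEVELS = [0.00001, 0.0001, 0.001, 0.1, 1, 10, 100, 1000]

def build_grid():
    """List every candidate; gamma applies only to the Gaussian kernel (assumption)."""
    grid = [{"kernel": "linear", "C": c} for c in C_LEVELS]
    grid += [{"kernel": "rbf", "C": c, "gamma": g}
             for c, g in product(C_LEVELS, GAMMA_LEVELS)]
    return grid

def grid_search(grid, score):
    """Return the candidate with the highest score (first one wins on ties)."""
    return max(grid, key=score)

grid = build_grid()
print(len(grid))  # 7 linear candidates plus 7 * 8 Gaussian candidates
```

Under this reading the search evaluates 63 candidates in total; `score` would be called once per candidate and the best-scoring parameter set kept.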
6. Verification using the verification set
Prediction is performed with the optimal parameters obtained in the previous step, namely the optimal kernel (Gaussian kernel), the optimal penalty coefficient (100) and the optimal gamma value (0.001), together with the corresponding model. The indicator data of the 77 samples in the verification set are used to obtain the prediction results for the verification set.
The sensitivity and specificity of the method are calculated by comparing the prediction results with the true results of the verification set, and can then be adjusted according to the usage scenario. The accuracy statistics of the method are shown in Table 6 below.
TABLE 6
Method | Sensitivity | Specificity
Support vector machine | 90% | 99%
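Sensitivity and specificity follow directly from comparing predicted and true labels. A minimal sketch is shown here; the label lists are invented for illustration and are not the 77 verification samples.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive, 0 = negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical labels: 4 true positives-class samples and 6 negatives-class samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Sensitivity is the fraction of true positive samples that are detected, and specificity is the fraction of true negative samples that are correctly reported as negative.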
7. Detecting the sample to be tested
A data file of the sample to be tested is input in the specified format and analyzed by the method using the preferred indicator bacteria, the preferred kernel function, the preferred penalty coefficient and the preferred gamma value, and a colorectal cancer risk result file and a detection report are output, as shown in logic diagram 4.
The 6-bacterium combination of the present application was used as preferred combination A, and indicator bacteria 3, 4, 5, 7, 8 and 9 were used as comparative combination B. The primer combinations for the corresponding bacterial species were compared, giving the comparative data in Table 7:
TABLE 7
Sample number | Preferred combination A | Comparative combination B | Actual sample information
C209 | Positive | Negative | Feces of tumor patient
C217 | Positive | Negative | Feces of tumor patient
C278 | Positive | Negative | Feces of tumor patient
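As is clear from Table 7, preferred combination A correctly reported all three tumor-patient stool samples as positive, whereas comparative combination B reported them as negative; the preferred 6-bacterium combination of the present invention therefore avoids these false-negative test results.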
Comparison of the accuracy of the strain or primer combination methods: the preferred parameter combination and the comparative parameter combinations of this application were compared, and the data are shown in Table 8:
TABLE 8
Figure BDA0002387702370000151
As can be seen from Table 8, the preferred parameter combination of the present application is effective at avoiding false-positive test results.
TABLE 9
Figure BDA0002387702370000152
Figure BDA0002387702370000161
As can be seen from Table 9, the preferred combination of the present application is more sensitive, meaning that samples that are harder to detect are more readily detected.
The accuracy comparison of the model parameter combinations is shown in Table 10. The preferred parameter combination has the highest specificity and sensitivity, indicating that the method has a higher detection rate and a better detection effect.
TABLE 10
Method | Sensitivity | Specificity
Comparison parameter combination 1 | 68% | 94%
Comparison parameter combination 2 | 81% | 98%
Preferred parameter combination | 90% | 99%
The conventional manual method is prone to misjudgment; comparison data for the present method and the conventional manual method are shown in Table 11.
TABLE 11
Figure BDA0002387702370000162
As can be seen from Table 11, the method of the present invention overcomes the misjudgments of the conventional manual judgment method, thereby avoiding the follow-up examinations requiring physical intervention that such misjudgments would cause, so its impact on the patient is smaller.
As can be seen from Table 12, the SVM algorithm of the present invention has higher accuracy, sensitivity and specificity than the manual method.
TABLE 12
Figure BDA0002387702370000171
In summary, the main innovation points of the present invention are:
1. Compared with the traditional method, colorectal cancer risk screening is based on a non-invasive method, namely intestinal bacteria detection.
2. Based on colorectal cancer data, the combination of intestinal flora is optimized, which effectively improves judgment accuracy and saves detection cost.
3. The SVM algorithm makes fuller use of the detection result information. In the conventional risk judgment method, each indicator bacterium is first judged negative or positive against a threshold determined by literature research and the like, the number of positive bacteria is then counted, and the risk is finally judged. The present algorithm uses the ΔCt values directly, so the information resolution is higher.
4. Based on colorectal cancer data, the data model is optimized, along with the kernel function, penalty coefficient and gamma value.
5. The SVM algorithm gives a more definite conclusion, sparing the client the difficulty of interpreting results for multiple indicator bacteria and fecal occult blood.
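The loss of information in the conventional count-based rule mentioned in point 3 can be illustrated with a small sketch. Everything numeric here is hypothetical: the cut-off of 10.0 per indicator bacterium, the two ΔCt vectors, and the assumption that a ΔCt at or below the cut-off counts as positive are chosen only for the example.

```python
THRESHOLDS = [10.0] * 6   # hypothetical cut-offs, one per indicator bacterium

def binarize(delta_ct, thresholds=THRESHOLDS):
    """Conventional first step: call each indicator positive (1) or negative (0)."""
    # Assumption: a delta-Ct at or below the cut-off counts as positive.
    return [1 if value <= cutoff else 0 for value, cutoff in zip(delta_ct, thresholds)]

def positive_count(delta_ct, thresholds=THRESHOLDS):
    """Conventional second step: count the positive indicators."""
    return sum(binarize(delta_ct, thresholds))

sample_a = [9.9, 9.8, 12.0, 13.0, 14.0, 15.0]   # barely below two cut-offs
sample_b = [3.0, 2.5, 10.1, 10.2, 10.3, 10.4]   # far below two, just above the other four

print(positive_count(sample_a), positive_count(sample_b))
print(sample_a == sample_b)
```

Both samples yield the same positive count and so are indistinguishable to the count-based rule, although their ΔCt profiles differ considerably. A classifier that takes the ΔCt values themselves as input can still separate them.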
Sequence listing
<110> Nanjing Personal Gene Technology Co., Ltd.
<120> kit for identifying colorectal cancer and application thereof
<130>20200217
<160>20
<170>SIPOSequenceListing 1.0
<210>1
<211>29
<212>DNA
<213>homo sapiens
<400>1
caaccattac tttaactcta ccatgttca 29
<210>2
<211>33
<212>DNA
<213>homo sapiens
<400>2
gttgacttta cagaaggaga ttatgtaaaa atc 33
<210>3
<211>25
<212>DNA
<213>homo sapiens
<400>3
agacgaattc aagtcagtaa ataca 25
<210>4
<211>22
<212>DNA
<213>homo sapiens
<400>4
ctcctatcca ccaggatatc aa 22
<210>5
<211>19
<212>DNA
<213>homo sapiens
<400>5
gtgagatgat gtgccaggc 19
<210>6
<211>20
<212>DNA
<213>homo sapiens
<400>6
taccggttgc ttcgtcgatt 20
<210>7
<211>24
<212>DNA
<213>homo sapiens
<400>7
cccttattgt tagttgccat catt 24
<210>8
<211>21
<212>DNA
<213>homo sapiens
<400>8
actcgttgta cttcccattg t 21
<210>9
<211>22
<212>DNA
<213>homo sapiens
<400>9
tcgaccacat agagctaagc ac 22
<210>10
<211>21
<212>DNA
<213>homo sapiens
<400>10
tcctcgactt tcataccgtc t 21
<210>11
<211>20
<212>DNA
<213>homo sapiens
<400>11
ttcgcttccc agagtcaagt 20
<210>12
<211>20
<212>DNA
<213>homo sapiens
<400>12
aaacgaccag ccagcaattc 20
<210>13
<211>17
<212>DNA
<213>homo sapiens
<400>13
ggtgaatacg ttcccgg 17
<210>14
<211>22
<212>DNA
<213>homo sapiens
<400>14
tacggctacc ttgttacgac tt 22
<210>15
<211>24
<212>DNA
<213>homo sapiens
<400>15
cccttattgt tagttgccat catt 24
<210>16
<211>21
<212>DNA
<213>homo sapiens
<400>16
actcgttgta cttcccattg t 21
<210>17
<211>20
<212>DNA
<213>homo sapiens
<400>17
aagggaagca acgctgtgaa 20
<210>18
<211>19
<212>DNA
<213>homo sapiens
<400>18
cggttaggtc actggcttc 19
<210>19
<211>20
<212>DNA
<213>homo sapiens
<400>19
ggaggattga ccccttcagt 20
<210>20
<211>20
<212>DNA
<213>homo sapiens
<400>20
ctggtcccga agaaacacat 20

Claims (3)

1. A kit for identifying colorectal cancer, characterized by comprising the following components:
a specific primer combination for identifying indicator bacteria, wherein the primer combination has the sequences shown in SEQ ID NO. 1-12;
wherein the indicator bacteria are a combination of Fusobacterium nucleatum (F. nucleatum), Peptostreptococcus anaerobius (P. anaerobius), Clostridium symbiosum (C. symbiosum), Enterococcus faecalis (E. faecalis), Porphyromonas asaccharolytica (P. asaccharolytica) and Streptococcus salivarius (S. salivarius).
2. The kit for identifying colorectal cancer according to claim 1, further comprising:
a sample collection and/or collection device;
PCR reaction premix solution;
and water.
3. Use of the kit of claim 1 in the development of a device for aiding the diagnosis of colorectal cancer.
CN202010103575.1A 2020-02-20 2020-02-20 Kit for identifying colorectal cancer and application thereof Pending CN111334590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010103575.1A CN111334590A (en) 2020-02-20 2020-02-20 Kit for identifying colorectal cancer and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010103575.1A CN111334590A (en) 2020-02-20 2020-02-20 Kit for identifying colorectal cancer and application thereof

Publications (1)

Publication Number Publication Date
CN111334590A true CN111334590A (en) 2020-06-26

Family

ID=71180378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010103575.1A Pending CN111334590A (en) 2020-02-20 2020-02-20 Kit for identifying colorectal cancer and application thereof

Country Status (1)

Country Link
CN (1) CN111334590A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109943636A (en) * 2019-04-11 2019-06-28 上海宝藤生物医药科技股份有限公司 A kind of colorectal cancer microbial markers and its application
CN110512015A (en) * 2019-09-11 2019-11-29 苏州普瑞森基因科技有限公司 A kind of intestinal cancer biomarker combinations object and its application
CN110643721A (en) * 2019-09-04 2020-01-03 上海派森诺生物科技股份有限公司 Kit for detecting colorectal cancer indicator bacteria
CN110781915A (en) * 2019-09-17 2020-02-11 上海派森诺生物科技股份有限公司 Method for improving colorectal cancer indicator bacterium detection sensitivity by applying support vector machine algorithm

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022105835A1 (en) * 2020-11-19 2022-05-27 The Chinese University Of Hong Kong Assessing risk for colorectal adenoma recurrence by noninvasive means
CN112695077A (en) * 2020-12-30 2021-04-23 上海派森诺医学检验所有限公司 Method for identifying and analyzing flora in intestinal tract
CN112725457A (en) * 2021-03-08 2021-04-30 温州医科大学 Intestinal flora marker for judging intestinal cancer and detection method thereof
CN112725457B (en) * 2021-03-08 2022-09-02 温州医科大学 Intestinal flora marker for judging intestinal cancer and detection method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination