CN116042827A

CN116042827A - Novel microbial markers for predicting colorectal cancer risk

Info

Publication number: CN116042827A
Application number: CN202211563402.3A
Authority: CN
Inventors: 黄秀娟; 陈家亮; 梁巧仪; 周彤
Original assignee: Zhenao Biotechnology Testing Shenzhen Co ltd; Chinese University of Hong Kong CUHK
Current assignee: Zhenao Biotechnology Testing Shenzhen Co ltd; Chinese University of Hong Kong CUHK
Priority date: 2022-12-07
Filing date: 2022-12-07
Publication date: 2023-05-02
Anticipated expiration: 2042-12-07
Also published as: WO2024119614A1; CN116042827B

Abstract

The present invention relates to novel microbial markers for assessing the risk of colorectal cancer or colorectal adenoma in a subject.

Description

Novel microbial markers for predicting colorectal cancer risk

Technical Field

The present application relates to the identification of bacterial species that contribute to improved colorectal adenoma diagnosis and the development of new bacterial marker combinations for diagnosis of CRC and adenoma. In particular, the present application relates to reagents that specifically and quantitatively identify DNA, RNA or proteins that are characteristic for Cloacibacillus porcorum.

Background

Colorectal cancer (CRC) is one of the most common malignant tumors worldwide. Most CRCs begin as small polyps. Some polyps, particularly adenomas, may develop into cancer. Early detection of tumors may promote successful treatment, and early detection of adenomas may prevent CRC and reduce its incidence. Although the non-invasive tests currently available for CRC screening perform well in detecting CRC, they have limited sensitivity for adenomas. The incidence of CRC is higher in more developed regions than in less developed regions, and the increase in CRC incidence is believed to be attributable to changes in diet ^1,2. Recent evidence suggests that changes in the microbial environment of the gut are associated with colorectal tumor development. Abnormal composition of the intestinal flora is considered a potentially important cause of CRC occurrence and development ³. With the widespread use of metagenomic analysis in intestinal microbiota research, an increasing number of bacteria have been identified as positively correlated with CRC ^4-7. Recent basic studies have established key functions of the intestinal microbiota ⁸ and of specific bacterial species, such as Fusobacterium nucleatum (Fn) ^9-11 and Peptostreptococcus anaerobius ¹², in promoting colorectal tumorigenesis. Bacteria such as Fn ¹³, Clostridium symbiosum ¹⁴, and species within the genera Parvimonas, Porphyromonas and Parabacterium ¹⁵ have proven to be potential markers for the diagnosis of CRC patients. However, current knowledge of biomarkers for colorectal adenoma detection is limited.

Bacterial markers for non-invasive diagnosis of CRC and adenoma have previously been identified and validated by metagenomic sequencing and targeted qPCR. Specifically, a qPCR assay for diagnosis of CRC and adenoma has been developed for four gene markers from four bacteria: Fusobacterium nucleatum (Fn), Hungatella hathewayi (formerly Clostridium hathewayi; Ch), Lachnoclostridium bacterial marker m3, and Bacteroides clarus (Bc). Three of the bacteria (Fn, Ch and m3) were enriched in feces of CRC or adenoma patients, while Bc was enriched in healthy subjects. Although tests involving these four bacterial markers outperform the currently available tests for non-invasive diagnosis of CRC and adenoma, there is still a need to further increase sensitivity for adenoma.

Disclosure of Invention

In a first aspect, the present application provides a kit for detecting colorectal cancer or colorectal adenoma in a subject, comprising: reagents that specifically and quantitatively identify DNA, RNA or proteins that are characteristic for Cloacibacillus porcorum.

In some embodiments, the DNA or RNA characteristic of Cloacibacillus porcorum comprises the nucleic acid sequence set forth in SEQ ID NO. 1 or 21.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID No. 2 and SEQ ID No. 3.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises the polynucleotide probe set forth in SEQ ID No. 4.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises an antibody that specifically binds to the protein.

In some embodiments, the kit further comprises a standard control that provides an average amount of Cloacibacillus porcorum in the fecal sample.

In some embodiments, the kit further comprises one or more reagents selected from the group consisting of:

a reagent that specifically and quantitatively identifies DNA, RNA or protein that is characteristic for the Fusobacterium genus;

a reagent that specifically and quantitatively identifies DNA, RNA or protein that is characteristic for Lachnoclostridium bacterium m3;

a reagent that specifically and quantitatively identifies DNA, RNA or protein that is characteristic for Hungatella hathewayi (formerly Clostridium hathewayi); and

a reagent that specifically and quantitatively identifies DNA, RNA or protein that is characteristic for Bacteroides clarus.

In some embodiments, the DNA or RNA characteristic for the Fusobacterium genus comprises the Fusobacterium nusG gene set forth in SEQ ID NO. 5.

In some embodiments, the DNA or RNA characteristic for Lachnoclostridium bacterium m3 comprises the gene marker m482585.

In some embodiments, the DNA or RNA characteristic for Hungatella hathewayi comprises the gene marker m2736705.

In some embodiments, the DNA or RNA characteristic for Bacteroides clarus comprises the gene marker m370640.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for the Fusobacterium genus comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 6 and SEQ ID NO. 7.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Lachnoclostridium bacterium m3 comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 10 and SEQ ID NO. 11.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Hungatella hathewayi (formerly Clostridium hathewayi) comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 14 and SEQ ID NO. 15.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Bacteroides clarus comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 18 and SEQ ID NO. 19.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for the Fusobacterium genus comprises the polynucleotide probe set forth in SEQ ID NO. 8.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Lachnoclostridium bacterium m3 comprises the polynucleotide probe set forth in SEQ ID NO. 12.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Hungatella hathewayi (formerly Clostridium hathewayi) comprises the polynucleotide probe set forth in SEQ ID NO. 16.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Bacteroides clarus comprises the polynucleotide probe set forth in SEQ ID NO. 20.

In some embodiments, the reagent comprises a detectable moiety.

In some embodiments, the kit further comprises an instruction manual.

In some embodiments, the kit further comprises reagents for a fecal immunochemical test (FIT).

In some embodiments, the kit further comprises a standard control that provides an average amount in the fecal sample of one or more selected from the group consisting of:

the Fusobacterium genus;

Lachnoclostridium bacterium m3;

Hungatella hathewayi; and

Bacteroides clarus.

In a second aspect, the present application provides the use of an agent that specifically and quantitatively identifies DNA, RNA or protein that is characteristic for Cloacibacillus porcorum in the manufacture of a kit for detecting colorectal cancer or colorectal adenoma in a subject.

In some embodiments, the reagent comprises a detectable moiety.

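In some embodiments, the kit further comprises a standard control that provides an average amount in the fecal sample of one or more selected from the group consisting of:

the Fusobacterium genus;

Lachnoclostridium bacterium m3;

Hungatella hathewayi; and

Bacteroides clarus.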

In some embodiments, the agent quantitatively identifies the DNA or RNA by RT-PCR, real-time quantitative PCR, or metagenomic sequencing.

In some embodiments, the reagent quantitatively identifies the protein by western blot, radioimmunoassay, enzyme-linked immunosorbent assay, immunofluorescence, or protein chip.

As non-limiting examples, the present application provides the following embodiments:

Embodiment 1. A kit for detecting colorectal cancer or colorectal adenoma in a subject, comprising:

reagents that specifically and quantitatively identify DNA, RNA or proteins that are characteristic for Cloacibacillus porcorum.

Embodiment 2. The kit according to embodiment 1, wherein,

the DNA or RNA specific for Cloacibacillus porcorum comprises the nucleic acid sequence shown in SEQ ID NO. 1 or 21.

Embodiment 3. The kit of embodiment 1, wherein the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 2 and SEQ ID NO. 3; or

the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises the polynucleotide probe set forth in SEQ ID NO. 4; or

the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises an antibody that specifically binds to the protein.

Embodiment 4. The kit of embodiment 1 further comprising a standard control providing an average amount of Cloacibacillus porcorum in the fecal sample.

Embodiment 5. The kit of embodiment 1, wherein the kit further comprises one or more reagents selected from the group consisting of:

a reagent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Hungatella hathewayi; and

Embodiment 6. The kit of embodiment 5, wherein,

the DNA or RNA characteristic for the Fusobacterium genus comprises the Fusobacterium nusG gene set forth in SEQ ID NO. 5;

the DNA or RNA characteristic for Lachnoclostridium bacterium m3 comprises the gene marker m482585;

the DNA or RNA characteristic for Hungatella hathewayi comprises the gene marker m2736705; or

the DNA or RNA characteristic for Bacteroides clarus comprises the gene marker m370640.

Embodiment 7. The kit of embodiment 5, wherein

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for the Fusobacterium genus comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 6 and SEQ ID NO. 7;

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for Lachnoclostridium bacterium m3 comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 10 and SEQ ID NO. 11;

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for Hungatella hathewayi comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 14 and SEQ ID NO. 15; or

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for Bacteroides clarus comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID NO. 18 and SEQ ID NO. 19;

or,

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for the Fusobacterium genus comprises the polynucleotide probe set forth in SEQ ID NO. 8;

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for Lachnoclostridium bacterium m3 comprises the polynucleotide probe set forth in SEQ ID NO. 12;

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for Hungatella hathewayi comprises the polynucleotide probe set forth in SEQ ID NO. 16; or

the reagent that specifically and quantitatively identifies DNA, RNA or protein characteristic for Bacteroides clarus comprises the polynucleotide probe set forth in SEQ ID NO. 20.

Embodiment 8. The kit of embodiment 1 or 5, wherein the reagents comprise a detectable moiety.

Embodiment 9. The kit of embodiment 1 or 5, further comprising an instruction manual.

Embodiment 10. The kit of embodiment 1 or 5, further comprising reagents for a fecal immunochemical test (FIT).

Embodiment 11. The kit of embodiment 5 further comprising a standard control providing an average amount in the fecal sample of one or more selected from the group consisting of:

the Fusobacterium genus;

Lachnoclostridium bacterium m3;

Hungatella hathewayi; and

Bacteroides clarus.

Embodiment 12. Use of an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum in the manufacture of a kit for detecting colorectal cancer or colorectal adenoma in a subject.

Embodiment 13. The use according to embodiment 12, wherein,

Embodiment 14. The use as in embodiment 12, wherein the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic for Cloacibacillus porcorum comprises an oligonucleotide primer comprising the nucleic acid sequences set forth in SEQ ID No. 2 and SEQ ID No. 3; or (b)

Embodiment 15. The use as in embodiment 12, wherein the kit further comprises a standard control providing an average amount of Cloacibacillus porcorum in the fecal sample.

Embodiment 16. The use according to embodiment 12, the kit further comprises one or more reagents selected from the group consisting of:

Embodiment 17. The use according to embodiment 16, wherein,

Embodiment 18. The use according to embodiment 16, wherein

or,

Embodiment 19. The use of embodiment 12 or 16, wherein the agent comprises a detectable moiety.

Embodiment 20. The use of embodiment 12 or 16, wherein the kit further comprises reagents for a fecal immunochemical test (FIT).

Embodiment 21. The use of embodiment 16, wherein the kit further comprises a standard control providing an average amount in the fecal sample of one or more selected from the group consisting of:

the Fusobacterium genus;

Lachnoclostridium bacterium m3;

Hungatella hathewayi; and

Bacteroides clarus.

Embodiment 22. The use as in embodiment 12, wherein,

the reagents quantitatively identify the DNA or RNA by RT-PCR, real-time quantitative PCR or metagenomic sequencing, or

the reagents quantitatively identify the protein by western blotting, radioimmunoassay, enzyme-linked immunosorbent assay, immunofluorescence or protein chip.

Drawings

Fig. 1: Abundance of five bacteria identified by metagenomic sequencing. N represents normal control; A represents adenoma; CRC represents colorectal cancer.

Fig. 2: Correlation between metagenomic sequencing and qPCR quantification.

Fig. 3: qPCR validation of the identified bacterial markers. (A) The abundance of candidate markers correlates with disease progression from normal to adenoma and further to CRC. (B) Abundance of three bacteria detected by qPCR. N represents normal control; A represents adenoma; CRC represents colorectal cancer.

Fig. 4: Fecal levels and detection rates of C. porcorum assessed by metagenomic sequencing (A) and qPCR (B). N represents normal control; A represents adenoma; CRC represents colorectal cancer; nAA represents non-advanced adenoma; AA represents advanced adenoma.

Fig. 5: Comparison, by ROC curve analysis, of the performance of individual bacterial markers in CRC and adenoma diagnosis. Fn represents Fusobacterium nucleatum; Ch represents Hungatella hathewayi; m3 represents Lachnoclostridium bacterium m3; Cp represents C. porcorum; N represents normal control; A represents adenoma.

Fig. 6: Performance of the bacterial marker panels in CRC and adenoma diagnosis, analyzed by ROC curves. Fn represents Fusobacterium nucleatum; Ch represents Hungatella hathewayi; m3 represents Lachnoclostridium bacterium m3; Cp represents C. porcorum; Bc represents Bacteroides clarus; N represents normal control; A represents adenoma.

Fig. 7: C. porcorum significantly improved the diagnosis of adenoma by the "4Bac" test or by m3. 4Bac: Fusobacterium nucleatum + Hungatella hathewayi + Lachnoclostridium bacterium m3 + Bacteroides clarus; Cp represents C. porcorum; N represents normal control; A represents adenoma; AUC represents the area under the receiver operating characteristic curve.

Fig. 8: Combination with the fecal immunochemical test (FIT) significantly improves diagnostic performance for CRC and advanced adenoma. 4Bac: Fusobacterium nucleatum + Hungatella hathewayi + Lachnoclostridium bacterium m3 + Bacteroides clarus; 5Bac: 4Bac + C. porcorum; AA represents advanced adenoma; nAA represents non-advanced adenoma; N represents normal control; AUC represents the area under the receiver operating characteristic curve.

Fig. 9: Effects of the fecal immunochemical test (FIT) and C. porcorum on CRC and adenoma detection. 4Bac: Fusobacterium nucleatum + Hungatella hathewayi + Lachnoclostridium bacterium m3 + Bacteroides clarus; 5Bac: 4Bac + C. porcorum.

Fig. 10: Comparison of the sensitivity of the fecal immunochemical test (FIT), bacterial markers and combinations thereof in detecting CRC, by TNM stage subgroup. 4Bac: Fusobacterium nucleatum + Hungatella hathewayi + Lachnoclostridium bacterium m3 + Bacteroides clarus; 5Bac: 4Bac + C. porcorum; PPV represents positive predictive value; NPV represents negative predictive value.
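The figure legends refer to sensitivity, PPV, NPV and AUC. As a reading aid, the following is a minimal sketch of how these quantities are conventionally computed. The function names and all numbers are hypothetical illustrations, not data or methods taken from this application.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Conventional definitions from confusion-matrix counts.

    tp/fn: diseased subjects called positive/negative
    tn/fp: healthy subjects called negative/positive
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased subjects detected
        "specificity": tn / (tn + fp),  # fraction of healthy subjects cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a diseased subject
    scores higher than a healthy one (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

The pairwise form of the AUC is quadratic in the number of subjects, which is acceptable for a sketch; a rank-based computation would be the usual choice for large cohorts.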

DESCRIPTION OF THE SEQUENCES

SEQ ID NO. 1 through SEQ ID NO. 4 are respectively a fragment of the gene marker for Cloacibacillus porcorum (SEQ ID NO. 1), a forward primer (Cp-F, SEQ ID NO. 2) and a reverse primer (Cp-R, SEQ ID NO. 3) for detecting the gene marker, and a probe (Cp-probe, SEQ ID NO. 4) for detecting the gene marker.

SEQ ID NO. 5 to SEQ ID NO. 8 are, respectively, the Fusobacterium nusG gene (SEQ ID NO. 5), a forward primer (Fn-F, SEQ ID NO. 6) and a reverse primer (Fn-R, SEQ ID NO. 7) for detecting the gene, and a probe (Fn-probe, SEQ ID NO. 8) for detecting the gene.

SEQ ID NO. 9 to SEQ ID NO. 12 are, respectively, the gene marker m482585 (SEQ ID NO. 9) for Lachnoclostridium bacterium m3, a forward primer (m3-F, SEQ ID NO. 10) and a reverse primer (m3-R, SEQ ID NO. 11) for detecting the gene marker, and a probe (m3-probe, SEQ ID NO. 12) for detecting the gene marker.

SEQ ID NO. 13 to SEQ ID NO. 16 are, respectively, the gene marker m2736705 (SEQ ID NO. 13) for Hungatella hathewayi (formerly Clostridium hathewayi), a forward primer (Ch-F, SEQ ID NO. 14) and a reverse primer (Ch-R, SEQ ID NO. 15) for detecting the gene marker, and a probe (Ch-probe, SEQ ID NO. 16) for detecting the gene marker.

SEQ ID NO. 17 to SEQ ID NO. 20 are, respectively, the gene marker m370640 (SEQ ID NO. 17) for Bacteroides clarus, a forward primer (Bc-F, SEQ ID NO. 18) and a reverse primer (Bc-R, SEQ ID NO. 19) for detecting the gene marker, and a probe (Bc-probe, SEQ ID NO. 20) for detecting the gene marker.

SEQ ID NO. 21 is a gene marker for Cloacibacillus porcorum.

Definitions

In the present application, the terms "colorectal cancer (CRC)", "colorectal cancer" and "intestinal cancer" have the same meaning and refer to cancers of the large intestine (colon), the lower part of the human digestive system. "colorectal cancer cell" is a colorectal epithelial cell that is characteristic of colorectal cancer and includes a precancerous cell, i.e., a cell that is at an early stage of, or is prone to, transformation into a cancer cell. Such cells may exhibit one or more phenotypic trait characteristics of cancerous cells.

In this disclosure, the term "or" is generally employed in its sense including "and/or" unless the content clearly dictates otherwise.

The term "nucleic acid" or "polynucleotide" refers to deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) in single- or double-stranded form, as well as polymers thereof. Unless specifically limited, the term encompasses nucleic acids that include known analogs of natural nucleotides that have similar binding properties to the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, single nucleotide polymorphisms (SNPs) and complementary sequences, as well as the sequence explicitly indicated. In particular, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed bases and/or deoxyinosine residues. The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.

The term "gene" refers to a DNA segment involved in the production of a polypeptide chain; it includes regions (leader and trailer) preceding and following the coding regions involved in transcription/translation and regulation of transcription/translation of the gene product, as well as intervening sequences (introns) between individual coding segments (exons).

In the present application, the terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The term applies to amino acid polymers in which one or more amino acid residues are artificial chemical mimics of the corresponding naturally occurring amino acid, as well as naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the term encompasses amino acid chains of any length, including full length proteins (i.e., antigens), in which the amino acid residues are linked by covalent peptide bonds.

The term "amino acid" refers to naturally occurring amino acids and synthetic amino acids, as well as amino acid analogs and amino acid mimics that function in a similar manner to naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those which are subsequently modified, such as hydroxyproline, gamma-carboxyglutamic acid, and O-phosphoserine. For the purposes of this application, amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., a carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. For the purposes of this application, amino acid mimetics refer to compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.

Amino acids may include those having non-naturally occurring D-chirality, which may improve the stability (e.g., half life), bioavailability, and other characteristics of polypeptides comprising one or more such D-amino acids.

Amino acids may be represented herein by their commonly known three-letter symbols, or by the single-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Likewise, nucleotides may be represented by their commonly accepted single-letter codes.

As used herein, "primer" refers to an oligonucleotide that can be used in an amplification method, such as the Polymerase Chain Reaction (PCR), to amplify a nucleotide sequence based on a polynucleotide sequence corresponding to a gene of interest (e.g., a DNA or RNA sequence of a related bacterial species). Typically, at least one PCR primer for amplifying a polynucleotide sequence is sequence specific for that polynucleotide sequence. The exact length of the primer depends on a variety of factors, including temperature, primer source and method used. For example, for diagnostic and prognostic applications, an oligonucleotide primer will typically contain at least 10, at least 15, at least 20, or at least 25 or more nucleotides, although it may contain fewer nucleotides or more nucleotides, depending on the complexity of the target sequence. Factors involved in determining the appropriate length of the primer are well known to those skilled in the art.
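The length guidance above can be turned into a simple sanity check on a candidate primer. The sketch below is illustrative only: the example sequence is a made-up placeholder (it is not SEQ ID NO. 2, 3 or any other sequence of this application), and the helper names are hypothetical.

```python
# Hypothetical placeholder sequence; NOT a primer from this application.
EXAMPLE_PRIMER = "ATGCGTACGTTAGCCTAGGA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_min_length(seq, min_nt):
    """True if the primer has at least min_nt nucleotides, mirroring the
    'at least 10, 15, 20, or 25 nucleotides' thresholds in the text."""
    return len(seq) >= min_nt


primer_report = {
    "length": len(EXAMPLE_PRIMER),
    "gc_content": gc_content(EXAMPLE_PRIMER),
    "reverse_complement": reverse_complement(EXAMPLE_PRIMER),
    "at_least_20_nt": meets_min_length(EXAMPLE_PRIMER, 20),
}
```

GC content is included only as a commonly reported descriptive property; the text above does not prescribe any GC range.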

A "label", "detectable label" or "detectable moiety" is a moiety that can be detected by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., enzymes commonly used in ELISA), biotin, digoxigenin, or haptens and proteins that can be made detectable (e.g., by incorporating a radioactive component into the peptide) or that can be used to detect antibodies specifically reactive with the peptide. Typically, a detectable label is attached to a probe or molecule having defined binding characteristics (e.g., a polypeptide or polynucleotide with a known binding specificity) to allow easy detection of the presence of the probe (and thus of its binding target).

As used herein, a "standard control" refers to a predetermined amount or concentration of a polynucleotide sequence or polypeptide (e.g., DNA, RNA, or protein of a relevant bacterium) present in an established disease-free fecal sample (e.g., a fecal sample from an average healthy individual who has not been diagnosed with CRC and is not known to have an increased risk of developing CRC). Standard control values are suitable for use in the methods of the invention as a basis for comparing the amount of DNA, RNA or protein of the relevant bacteria present in a test sample. An established sample serving as a standard control provides the average amount of DNA, RNA or protein of the relevant bacteria that is typical of fecal samples from average healthy persons not suffering from any colon disease (especially CRC) as conventionally defined. Standard control values may vary depending on the nature of the sample and on other factors such as the sex, age and race of the individuals on whom the control value is based.

When used to describe healthy humans who do not have any colorectal disease (particularly CRC) as conventionally defined, the term "average" refers to the level of certain characteristics, particularly DNA, RNA or protein, in human fecal samples that can represent a randomly selected healthy population that does not have any colorectal disease (particularly CRC) and that is not known to be at risk of developing the disease. The selected population should include a sufficient number of people such that the average level or amount of DNA, RNA or protein of the relevant bacteria found in the faeces of these individuals reflects the corresponding level or amount of such DNA, RNA or protein in a general healthy population with reasonable accuracy. Furthermore, the selected population is typically of similar age to the subject for whom the fecal sample is used to test for colorectal cancer indications. In addition, other factors such as gender, race, medical history, etc. should be considered and it is preferable to achieve a close match in characteristics between the test subject and the selected group of individuals who establish the "average" value.
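The idea of a standard control as an average over a healthy reference population, used as a basis for comparison, can be sketched as follows. This is a minimal illustration under stated assumptions: the arithmetic mean, the fold-change comparison, the function names and all values are hypothetical choices, and the definitions above do not prescribe a specific statistic or cutoff.

```python
from statistics import mean


def standard_control(healthy_amounts):
    """Average marker amount across fecal samples from a healthy
    reference population (assumption: arithmetic mean)."""
    return mean(healthy_amounts)


def fold_vs_control(test_amount, control):
    """Amount in the test sample relative to the standard control."""
    return test_amount / control


# Hypothetical marker amounts (arbitrary units), for illustration only.
healthy = [1.0, 2.0, 3.0, 2.0]
control = standard_control(healthy)     # average of the healthy cohort
fold = fold_vs_control(5.0, control)    # test sample relative to control
```

In practice the reference cohort would be matched to the test subject on factors such as age, sex and medical history, as the definition of "average" above indicates.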

As used herein, the term "amount" refers to the amount of a polynucleotide of interest or a polypeptide of interest (e.g., DNA, RNA, or protein of a related bacterium) present in a sample. Such amounts may be expressed in absolute terms, i.e., the total amount of polynucleotide or polypeptide in the sample, or in relative terms, i.e., the concentration of polynucleotide or polypeptide in the sample.
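The distinction between absolute and relative amounts can be illustrated with a short sketch. The first helper expresses a total copy count as a concentration per gram of stool. The second shows one common way a qPCR readout is expressed relative to a reference target; it assumes a delta-Cq calculation with perfect amplification efficiency, which is an assumption of this sketch and is not stated in the text above. All names and numbers are hypothetical.

```python
def concentration(total_copies, sample_mass_g):
    """Relative amount: marker copies per gram of fecal sample."""
    return total_copies / sample_mass_g


def relative_abundance_from_cq(cq_target, cq_reference):
    """Assumed delta-Cq model: target abundance relative to a reference
    (for example total bacterial DNA), with a doubling per cycle."""
    return 2.0 ** (-(cq_target - cq_reference))


# Hypothetical values, for illustration only.
copies_per_gram = concentration(1000.0, 0.25)
relative_level = relative_abundance_from_cq(25.0, 20.0)
```

A target that crosses the threshold five cycles later than the reference is, under this assumed model, 2^5 = 32 times less abundant than the reference.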

Detailed Description

In some embodiments, the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the agent specifically and quantitatively identifies DNA or RNA that is characteristic for Cloacibacillus porcorum.

In some embodiments, the reagent comprises two sets of oligonucleotide primers comprising the nucleic acid sequences set forth in SEQ ID NO. 2 and SEQ ID NO. 3.

In some embodiments, the reagent comprises a polynucleotide probe as set forth in SEQ ID NO. 4.

In some embodiments, the Fusobacterium genus is Fusobacterium nucleatum (Fn).

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, and (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of fusarium; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR2' or fn+cp.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, and (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of lachnoclostrichum bacterium m 3; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR2 "or m3+cp.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, and (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of clostridium harbouri (Hungatella hathewayi); optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, and (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of bacteroides clathraustochytrium (Bacteroides clarus); optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3, and (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR3 or Fn+m3+Cp.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), and (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR3" or Fn+Ch+Cp.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus, and (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), and (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR3'" or Ch+m3+Cp.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3, and (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), and (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium, and (4) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR4' or Fn+m3+Cp+Ch.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus, and (4) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR4" or Ch+m3+Cp+Bc.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium, and (4) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR4'" or Fn+Ch+Cp+Bc.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus, (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium, and (4) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR4 or Fn+m3+Cp+Bc.

In some embodiments, the kit comprises (1) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Cloacibacillus porcorum, (2) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi), (3) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Fusobacterium, (4) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Lachnoclostridium sp. m3, and (5) an agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Bacteroides clarus; optionally, the kit is for detecting colorectal cancer and/or colorectal adenoma in a subject, preferably the adenoma is a non-advanced adenoma or an advanced adenoma. In some cases, such a combination may be abbreviated as LR5 or 5Bac or Fn+m3+Ch+Cp+Bc.

In some embodiments, Fn+m3+Ch+Cp+Bc may be used to diagnose CRC.

In some embodiments, Fn+m3+Ch+Cp+Bc may be used to diagnose an adenoma; preferably, the adenoma is a non-advanced adenoma or an advanced adenoma.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi) comprises oligonucleotide primers comprising the nucleic acid sequences set forth in SEQ ID NO. 14 and SEQ ID NO. 15.

In some embodiments, the agent that specifically and quantitatively identifies DNA, RNA, or protein that is characteristic of Clostridium hathewayi (Hungatella hathewayi) comprises the polynucleotide probe set forth in SEQ ID NO. 16.

In some embodiments, two sets of oligonucleotide primers SEQ ID NO. 6 and SEQ ID NO. 7 are used to detect the nusG gene; in some embodiments, polynucleotide probe SEQ ID NO. 8 is used to detect the nusG gene.

In some embodiments, two sets of oligonucleotide primers SEQ ID NO. 10 and SEQ ID NO. 11 are used to detect the gene marker m482585; in some embodiments, polynucleotide probe SEQ ID NO. 12 is used to detect gene marker m482585.

In some embodiments, two sets of oligonucleotide primers SEQ ID NO. 14 and SEQ ID NO. 15 are used to detect the gene marker m2736705; in some embodiments, polynucleotide probe SEQ ID NO. 16 is used to detect gene marker m2736705.

In some embodiments, two sets of oligonucleotide primers SEQ ID NO. 18 and SEQ ID NO. 19 are used to detect the gene marker m370640; in some embodiments, polynucleotide probe SEQ ID NO. 20 is used to detect gene marker m370640.

In some embodiments, the reagent comprises a detectable moiety.

In some embodiments, the reagents for specifically and quantitatively identifying DNA or RNA specific for a target bacterium generally comprise at least one oligonucleotide useful for specifically hybridizing to at least one segment of the target DNA or RNA sequence or its complement. In some embodiments, this oligonucleotide is labeled with a detectable moiety. In some embodiments, the reagents may include at least two oligonucleotide primers that can be used to amplify at least one segment of target bacterial DNA or RNA by PCR (including by RT-PCR). In some embodiments, the reagent may include at least one oligonucleotide probe that is capable of binding to at least one segment of target bacterial DNA or RNA. In some embodiments, the primer or probe is labeled with a detectable moiety.

In some embodiments, the reagents for specifically and quantitatively identifying a protein specific for a target bacterium generally comprise at least one antibody that is useful for specifically binding to the protein specific for the target bacterium. In some embodiments, the antibody is labeled with a detectable moiety. The antibody may be a monoclonal antibody or a polyclonal antibody. In some embodiments, the reagent may include at least two different antibodies, one for specific binding to a protein specific for the target bacteria (i.e., a primary antibody) and the other for detection of the primary antibody (i.e., a secondary antibody), the secondary antibody typically being linked to a detectable moiety.

In some embodiments, autoradiography is used with probes labeled with ³H, ¹²⁵I, ³⁵S, ¹⁴C or ³²P, etc. In some embodiments, the probes or primers are conjugated directly with labels such as fluorophores, chemiluminescent reagents, and enzymes. The choice of detectable moiety depends on the sensitivity desired, convenience of conjugation to the probe or primer, stability requirements and available instrumentation.

In some embodiments, the probe carries the 5' reporter dye FAM (6-carboxyfluorescein) or VIC (4,7,2'-trichloro-7'-phenyl-6-carboxyfluorescein) and the 3' quencher dye TAMRA (6-carboxytetramethylrhodamine).

In some embodiments, the kit further comprises an instruction manual.

In some embodiments, Fn+m3+Ch+Cp+Bc+FIT may be used to diagnose CRC.

In some embodiments, Fn+m3+Ch+Cp+Bc+FIT may be used to diagnose adenomas; preferably, the adenomas are non-advanced adenomas or advanced adenomas.

Fusobacterium;

Lachnoclostridium sp. m3;

Clostridium hathewayi (Hungatella hathewayi); and

Bacteroides clarus.

In some embodiments, the agent may further quantitatively identify the DNA or RNA by at least one method selected from the group consisting of: in situ hybridization, polymerase chain reaction (PCR), RNase protection assay (RPA), Northern blot, microarray and high-throughput sequencing.

In some embodiments, the agent may further quantitatively identify the protein by at least one method selected from the group consisting of: two-dimensional electrophoresis, immunohistochemistry (IHC), fluorescence-activated cell sorting (FACS), radioimmunoassay (RIA), matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF), radial immunodiffusion, immunoprecipitation, flow cytometry, Ouchterlony double immunodiffusion, and complement fixation assays.

In some embodiments, the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21 and the level of one or more DNA or RNA selected from the group consisting of m482585, the nusG gene, m2736705 and m370640 are quantitatively identified by RT-PCR, real-time quantitative PCR or metagenomic sequencing.

In some embodiments, the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21 and the level of one DNA or RNA selected from the group consisting of m482585, the nusG gene, m2736705 and m370640 are quantitatively identified by RT-PCR, real-time quantitative PCR or metagenomic sequencing.

In some embodiments, the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21 and the levels of two DNA or RNA selected from the group consisting of m482585, the nusG gene, m2736705 and m370640 are quantitatively identified by RT-PCR, real-time quantitative PCR or metagenomic sequencing.

In some embodiments, the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21 and the levels of three DNA or RNA selected from the group consisting of m482585, the nusG gene, m2736705 and m370640 are quantitatively identified by RT-PCR, real-time quantitative PCR or metagenomic sequencing.

In some embodiments, the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21 and the levels of DNA or RNA of m482585, the nusG gene, m2736705 and m370640 are quantitatively identified by RT-PCR, real-time quantitative PCR or metagenomic sequencing.

In some embodiments, colorectal cancer or colorectal adenoma is diagnosed by comparing the level of the DNA, RNA, or protein to a standard control.

In a third aspect, the present application provides a method of assessing the risk of colorectal cancer or colorectal adenoma in a subject comprising the steps of:

(a) Quantitatively determining the level of at least one of Cloacibacillus porcorum, Fusobacterium, Lachnoclostridium sp. m3, Clostridium hathewayi (Hungatella hathewayi) and Bacteroides clarus in a fecal sample taken from the subject;

(b) Comparing the level obtained in step (a) with a standard control;

(c) Determining the level obtained in step (a) as increased or decreased relative to a standard control; and

(d) Determining the subject to have an increased risk of colorectal cancer or colorectal adenoma.

In some embodiments, step (a) comprises determining the level of DNA, RNA, or protein that is characteristic of at least one of Cloacibacillus porcorum, Fusobacterium, Lachnoclostridium sp. m3, Clostridium hathewayi (Hungatella hathewayi), and Bacteroides clarus.

In some embodiments, step (a) comprises determining the level of DNA that is characteristic of at least one of Cloacibacillus porcorum, Fusobacterium, Lachnoclostridium sp. m3, Clostridium hathewayi (Hungatella hathewayi), and Bacteroides clarus.

In some embodiments, step (a) comprises determining the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21, the Fusobacterium nusG gene, or the gene marker m482585, m2736705, or m370640.

In some embodiments, when the subject is determined to have an increased risk of colorectal cancer or colorectal adenoma, the method further comprises repeating step (a) at a subsequent time using another stool sample obtained from the subject at that time.

In a fourth aspect, the present application provides a method of detecting an increase or decrease in the level of at least one of Cloacibacillus porcorum, Fusobacterium, Lachnoclostridium sp. m3, Clostridium hathewayi (Hungatella hathewayi) and Bacteroides clarus in a fecal sample, comprising the steps of:

(a) Quantitatively determining the level of at least one of Cloacibacillus porcorum, Fusobacterium, Lachnoclostridium sp. m3, Clostridium hathewayi (Hungatella hathewayi) and Bacteroides clarus in the fecal sample;

(b) Comparing the level obtained in step (a) with a standard control;

(c) Determining the level obtained in step (a) as increased or decreased relative to a standard control.

In some embodiments, the fecal sample is obtained from a human subject.

In some embodiments, step (a) comprises determining the level of DNA specific for at least one of Cloacibacillus porcorum, Fusobacterium, Lachnoclostridium sp. m3, Clostridium hathewayi (Hungatella hathewayi), and Bacteroides clarus.

In some embodiments, step (a) comprises determining the level of the nucleic acid sequence set forth in SEQ ID NO. 1 or 21, the nusG gene, or the gene marker m482585, m2736705 or m370640.

Examples

The following examples are provided by way of illustration only and not by way of limitation. Those skilled in the art will readily recognize that various non-critical parameters may be changed or modified to produce substantially the same or similar results.

Example 1

Metagenomic data

The study analyzed stool metagenomic sequencing data from 589 subjects (184 CRC patients, 185 adenoma patients, and 220 control subjects) ¹⁶. Written informed consent was obtained from all subjects. The relative abundance of species was analyzed with MetaPhlAn3 ¹⁷.

Human fecal sample collection

Fecal samples were collected from 426 subjects (127 CRC, 161 adenomas and 138 normal controls). As in a previous metagenomic study ⁴, subjects included individuals presenting with symptoms such as altered bowel habits, rectal bleeding, abdominal pain or anemia, and asymptomatic individuals aged 50 years or older undergoing colonoscopy. Samples were collected either before colonoscopy or one month after it, by which time the intestinal microbiota should have returned to baseline ¹⁸. The exclusion criteria were: 1) use of antibiotics within the past 3 months; 2) a vegetarian diet; 3) an invasive medical intervention within the past 3 months; 4) a prior history of any cancer or of inflammatory disease of the intestinal tract. Subjects were asked to collect stool samples in standardized containers at home and to store the samples immediately in a -20 °C freezer. The samples were then stored at -80 °C until further analysis. Patients were diagnosed by colonoscopy and histopathological examination of any biopsies. Informed consent was obtained from all subjects. The study was approved by the clinical research ethics committee of The Chinese University of Hong Kong.

DNA extraction, primer and probe design and qPCR experiments

DNA extraction, primer and probe sequence design, and qPCR amplification on the ABI QuantStudio sequence detection system were performed as described previously ¹³. The primer and probe sequences specific for Cloacibacillus porcorum and other markers are listed in table 1. Primer and probe sequences for the other bacterial gene markers and the 16S rDNA internal control, and the methods for using them, were the same as in previous studies ^13,19. Each probe carries the 5' reporter dye FAM (6-carboxyfluorescein) or VIC (4,7,2'-trichloro-7'-phenyl-6-carboxyfluorescein) and the 3' quencher dye TAMRA (6-carboxytetramethylrhodamine). Primers and hydrolysis probes were synthesized by Invitrogen (Carlsbad, CA). The specificity of PCR amplification was confirmed by direct Sanger sequencing of the PCR products or by sequencing of randomly selected TA clones. The relative abundance of each marker gene was calculated using the delta Cq method by comparison with the abundance of the internal control and is shown as the log value of "x10e6+1".
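
The delta Cq calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: it assumes roughly 100% amplification efficiency (base 2), a base-10 logarithm, and reads "x10e6+1" as multiplying the relative abundance by a scale factor (here 10^6, exposed as the `scale` parameter) and adding 1. The function names, the handling of undetected markers and the Cq values are hypothetical.

```python
import math


def relative_abundance(cq_target, cq_control):
    """Delta-Cq relative abundance versus the internal control.

    Assumes ~100% PCR efficiency, i.e. abundance = 2 ** -(Cq_target - Cq_control).
    cq_target is None when the marker was not detected; this sketch then
    treats the abundance as zero (an assumption, not from the source).
    """
    if cq_target is None:
        return 0.0
    return 2.0 ** -(cq_target - cq_control)


def log_abundance(cq_target, cq_control, scale=1e6):
    """log10(relative abundance * scale + 1); the +1 keeps zero abundance finite."""
    return math.log10(relative_abundance(cq_target, cq_control) * scale + 1.0)


# Hypothetical Cq values, for illustration only.
print(log_abundance(30.0, 20.0))  # marker crossing 10 cycles after the 16S control
print(log_abundance(None, 20.0))  # undetected marker
```

Adding 1 before taking the logarithm means that an undetected marker maps to 0 rather than to negative infinity, which is one plausible reason for the "+1" in the reported scale.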

Fecal Immunochemical Test (FIT)

FIT was performed on stool samples using the automated quantitative OC-Sensor test (Eiken Chemical, Japan) as described previously ²⁰. The positive cut-off value corresponds to a concentration of 100 ng hemoglobin per ml.

Scoring algorithm and cut-off value

In a previous study, a combined score for four bacterial markers (4Bac) was determined using a logistic regression model (4Bac score = I₁ + β₁*Fn + β₂*m3 + β₃*Bc + β₄*Ch) ¹⁹. The combined scores for 2 to 5 markers, with or without FIT, obtained using logistic regression models are shown in table 2. In the regression model, I represents the intercept, β represents the regression coefficient, and each marker name represents the corresponding Cq value. The cut-off value was determined by receiver operating characteristic (ROC) analysis as the value that maximizes the Youden index (J = sensitivity + specificity - 1) ²¹.
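
To make the scoring and cut-off steps concrete, the sketch below computes a combined score of the form I + Σ β*marker and picks the cut-off that maximizes the Youden index over a set of labeled scores. It is an illustration only: the coefficients, marker values and labels are hypothetical (the fitted values are in table 2, which is not reproduced here), and calling a sample positive when its score is at or above the cut-off is an assumption of this sketch.

```python
import math


def combined_score(intercept, coefs, values):
    """Linear predictor of the logistic model: I + sum(beta_k * marker_k)."""
    return intercept + sum(coefs[name] * values[name] for name in coefs)


def probability(score):
    """Logistic transform of the combined score."""
    return 1.0 / (1.0 + math.exp(-score))


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 for cases, 0 for controls. A sample is called positive
    when score >= cutoff (assumed convention).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        sens = sum(s >= cut for s in pos) / len(pos)
        spec = sum(s < cut for s in neg) / len(neg)
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical coefficients and marker values, for illustration only.
coefs = {"Fn": 0.4, "m3": 0.3, "Bc": -0.2, "Ch": 0.1}
sample = {"Fn": 2.0, "m3": 1.0, "Bc": 0.5, "Ch": 3.0}
print(probability(combined_score(-1.0, coefs, sample)))
print(youden_cutoff([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

When several cut-offs tie on J, this sketch keeps the lowest one; a real analysis would state its tie-breaking rule explicitly.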

Statistical analysis

Values are expressed as mean ± SD or median (interquartile range (IQR)), as appropriate. Differences in bacterial abundance were determined by the Mann-Whitney U test. One-way ANOVA multiple comparison with a test for linear trend was used to assess the change in marker levels during disease progression (from control to adenoma to cancer). Simple and multiple regression analyses were used to estimate the correlation between marker levels and the factors of interest. The chi-square test was used to compare the incidence between the different groups and the sensitivity of the different markers. Combinations of multiple biomarkers were analyzed by applying logistic regression models to obtain a value for estimating the incidence of CRC compared with the control. ROC curves were used to evaluate the diagnostic value of bacterial markers/models in distinguishing CRC/adenoma from controls. The pairwise comparison of ROC curves was performed using a non-parametric approach ²². All tests were performed with GraphPad Prism 5.0 (GraphPad Software Inc., San Diego, CA) or MedCalc statistical software version 18.5 (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.org; 2018). P < 0.05 was considered statistically significant.
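
The link between the Mann-Whitney U statistic and the area under the ROC curve can be illustrated with a short sketch. It is not the GraphPad or MedCalc procedure used in the study; it only shows the standard identity AUC = U / (n_cases * n_controls), with ties counted as one half. The function name and the example scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a random case scores higher than a
    random control, counting ties as 0.5 (Mann-Whitney U / (n1 * n2))."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical marker levels, for illustration only.
print(auc([2.1, 3.4, 0.0, 4.2], [0.0, 0.0, 1.5, 2.0]))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls.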

Results

Identification of bacterial species as potential biomarkers for colorectal tumors

The abundance of all bacterial species in fecal samples from 220 control subjects, 185 adenoma patients and 184 CRC patients was analyzed by metagenomic sequencing. By Spearman rank correlation with diagnosis type (normal, adenoma and CRC) and by comparison between every two groups, Intestinimonas (I.) butyriciproducens, Cloacibacillus (C.) porcorum, Clostridium (C.) asparagiforme, Bacteroides (B.) dorei and Streptococcus (S.) oralis, which had not previously been reported as markers for colorectal tumor diagnosis, were found to differ significantly between adenoma or CRC patients and control subjects as the disease progressed from normal to adenoma and further to cancer (table 3; fig. 1). Fecal I. butyriciproducens, Cloacibacillus porcorum and S. oralis were further shown to exhibit a significant increasing trend from normal to adenoma to CRC (P <0.0001, one-way ANOVA; fig. 1).

Validation of the identified bacterial markers by qPCR

The five newly identified bacterial candidate markers were validated by targeted quantification using qPCR. Although the relationship between the metagenomic sequencing data and the qPCR quantification data appeared non-linear for three species, there was a significant positive correlation between the two sets of data for all five species (all P <0.0001; fig. 2). qPCR results showed a significant difference in abundance among the three groups for three candidate markers, C. asparagiforme, Cloacibacillus porcorum and I. butyriciproducens (all P <0.05; FIG. 3A). These three species showed a significant increasing trend from normal to adenoma to CRC (all P <0.05, one-way ANOVA; B of fig. 3). However, only Cloacibacillus porcorum was significantly increased in both adenoma and CRC samples compared to the control (B of fig. 3).

Fecal level and incidence of Cloacibacillus porcorum

The incidence of Cloacibacillus porcorum assessed by metagenomic sequencing and by qPCR was similar. The incidence of Cloacibacillus porcorum in adenoma and CRC patients was significantly higher than in normal subjects, as assessed by metagenomic sequencing (A1 of fig. 4) and qPCR quantification (B1 of fig. 4). As assessed by metagenomic sequencing (A2 of fig. 4) and qPCR quantification (B2 of fig. 4), Cloacibacillus porcorum levels in adenoma and CRC patients were significantly higher than in the control group, with no differences between non-advanced and advanced adenomas or between different TNM stages of cancer.

Diagnostic properties of Cloacibacillus porcorum in feces for adenoma and/or CRC

Fecal Cloacibacillus porcorum alone showed an AUC of 0.657 for CRC (95% CI: 0.583 to 0.725; P < 0.0001) and an AUC of 0.618 for adenoma (95% CI: 0.550 to 0.682; P < 0.0001) (FIG. 5). Compared with the markers Fn, m3 and Ch, which are enriched in CRC and adenomas, Cloacibacillus porcorum did not differ from Ch in distinguishing CRC patients from control subjects. In distinguishing adenoma patients from control subjects, Cloacibacillus porcorum was not significantly different from Fn and was significantly better than Ch (P < 0.05) (fig. 5). At the threshold that maximizes the Youden index, the sensitivity of Cloacibacillus porcorum for CRC was 37.1% with a specificity of 93.6%; its sensitivity for adenomas was 30.8% with a specificity of 92.3%, and its sensitivities for advanced adenomas (n=83) and non-advanced adenomas (n=60) were 33.7% and 26.7%, respectively (P=0.46, Fisher's exact test).
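
The comparison of sensitivities between advanced and non-advanced adenomas can be sketched with a two-sided Fisher's exact test on a 2x2 table. The counts used here (28 of 83 and 16 of 60 positive) are back-calculated from the reported percentages and are therefore an assumption of this sketch, as is the particular two-sided definition used (summing all tables no more probable than the observed one); the statistical software used in the study may differ in detail.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Counts back-calculated from 33.7% of 83 and 26.7% of 60 (assumed).
advanced_pos, advanced_neg = 28, 83 - 28
non_advanced_pos, non_advanced_neg = 16, 60 - 16
print(fisher_exact_two_sided(advanced_pos, advanced_neg,
                             non_advanced_pos, non_advanced_neg))
```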

Stool Cloacibacillus porcorum is significantly associated with diagnosis of CRC and adenoma and is not affected by age

By univariate analysis, the abundance of all four markers enriched in CRC and adenomas correlated significantly with CRC and adenoma diagnosis and was independent of sex, CRC stage, lesion location, and body mass index; Fn, m3 and Ch increased significantly with age, and only Cloacibacillus porcorum was independent of age (table 4). Multivariate analysis showed that Fn, m3, Ch and Cloacibacillus porcorum were significantly correlated with CRC and adenoma diagnosis, and that Fn was significantly correlated with age (table 5).

Cloacibacillus porcorum in combination with other bacterial markers improves the diagnostic performance on CRC and/or adenoma

The bacterial markers Fn, Ch, Bc and m3 have previously been reported for the diagnosis of CRC and adenoma, and the present application further analyzes the performance of these markers in combination with Cloacibacillus porcorum for the diagnosis of CRC and/or adenoma (table 6). Logistic regression models were built to distinguish patients with CRC/adenomas from control subjects using all five markers or fewer, with the least significant marker removed one at a time according to its importance in the model. The results showed that the 5-marker model (Fn, Ch, Bc, m3 and Cloacibacillus porcorum) performed best in diagnosing CRC, with an AUROC of 0.923 (by ROC curve comparison, all P values were <0.05 versus the combinations with fewer markers; FIG. 6A and FIG. 6B). In distinguishing adenomas from control subjects, the 5-marker model showed no significant difference in performance from the 4-marker and 3-marker combinations with Ch and/or Bc removed (A of fig. 6 and B of fig. 6). Comparison between the 2-marker model (Fn, m3) and the 3-marker model (Fn, m3, Cloacibacillus porcorum) showed no difference in diagnostic performance for CRC, whereas Cloacibacillus porcorum significantly improved diagnostic performance for adenoma (C of fig. 6).

The addition of Cloacibacillus porcorum significantly improves the diagnostic properties of "4Bac" and m3 for adenomas

Since 4Bac (Fn, Ch, Bc and m3) was previously designed for diagnosing CRC and m3 for diagnosing adenoma, the present application further compares the performance of the 5-marker model including Cloacibacillus porcorum with that of 4Bac and m3 for CRC and adenoma. By ROC curve comparison, the 5-marker model (4Bac + Cloacibacillus porcorum) was not significantly different from 4Bac in diagnosing CRC (P > 0.05), but at a specificity of 85% the 5-marker model showed slightly higher sensitivity for CRC than 4Bac (88.6% vs 85.7%) (A of fig. 7 and C of fig. 7). For adenoma, ROC curve comparison showed that combining 4Bac with Cloacibacillus porcorum significantly improved diagnostic performance (P=0.002). The 5-marker model also performed significantly better than m3 alone for adenomas (P=0.048). At 85% specificity, the 5-marker model was more sensitive for adenomas (58.7%) than 4Bac (44.8%) and m3 (41.6%) (fig. 7B and fig. 7C).
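
Reading off sensitivity at a fixed specificity, as done above at 85%, can be sketched as follows. This is an illustrative procedure under assumptions: a sample is called positive when its model score is strictly above the cut-off, and the cut-off is the lowest control score at which the specificity reaches the target. The function name and the scores are hypothetical and are not the study data.

```python
def sensitivity_at_specificity(case_scores, control_scores, target_spec=0.85):
    """Return (cutoff, sensitivity, specificity) at the lowest cut-off whose
    specificity is at least target_spec. Positive means score > cutoff."""
    for cut in sorted(set(control_scores)):
        spec = sum(k <= cut for k in control_scores) / len(control_scores)
        if spec >= target_spec:
            sens = sum(c > cut for c in case_scores) / len(case_scores)
            return cut, sens, spec
    raise ValueError("control_scores must be non-empty and target_spec <= 1")


# Hypothetical model scores, for illustration only.
controls = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.55, 0.70]
cases = [0.25, 0.45, 0.60, 0.80, 0.90]
print(sensitivity_at_specificity(cases, controls))
```

Because sensitivity can only fall as the cut-off rises, taking the lowest cut-off that meets the specificity target gives the highest sensitivity available at that specificity, which makes models directly comparable at a common operating point.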

The combination of FIT and Cloacibacillus porcorum improves the diagnostic ability of 4Bac for CRC and advanced adenomas

In the tested cohort, FIT detected only 11.2% of advanced adenomas and no non-advanced adenomas, with a specificity of 98.7%. The study further trained logistic regression models combining bacterial markers with FIT to distinguish patients with CRC/adenomas from normal controls, and then assessed their diagnostic performance for CRC, advanced adenomas, and non-advanced adenomas, respectively (FIG. 8; table 6). By ROC curve comparison, 5Bac & FIT and 4Bac & FIT showed no difference in the diagnosis of CRC, while both performed significantly better than the corresponding model without FIT (both P < 0.05). For advanced adenomas, 5Bac combined with FIT performed significantly better than the other models (all P < 0.05). Adding either FIT (4Bac & FIT) or Cloacibacillus porcorum (5Bac) to 4Bac significantly improved the diagnosis of advanced adenomas (both P < 0.05). For non-advanced adenomas, no significant differences were observed between these models, but models including Cloacibacillus porcorum were more sensitive at given specificities, e.g., > 80%.

Comparison of 5Bac & FIT with 4Bac showed that Cloacibacillus porcorum and FIT together significantly improved the detection rate of CRC and advanced adenoma (AA) (both P < 0.05). Comparison between bacterial models with and without FIT showed that FIT significantly increased the detection of CRC. Comparison of models with and without Cloacibacillus porcorum showed that Cloacibacillus porcorum improved the detection of non-advanced and advanced adenomas, although only the increase for advanced adenomas was significant (P=0.019 for 5Bac & FIT versus 4Bac & FIT) (fig. 9).

The diagnostic sensitivity of FIT, bacterial markers and combinations thereof in detecting CRC was further compared by TNM stage (fig. 10). The bacterial marker models were all more sensitive than FIT for stage I-III cancers. The combination of 4Bac/5Bac and FIT had significantly higher sensitivity than FIT for stage I-III cancers, and also increased the detection rate of stage IV cancers. Cloacibacillus porcorum increased the sensitivity of 4Bac in detecting stage II-IV cancer, although not significantly. These results indicate that bacterial marker combinations are superior to FIT in detecting stage I-III CRC, and that combining them with FIT further improves the noninvasive diagnosis of CRC.

Discussion

In this study, new bacterial species markers for the diagnosis of CRC and adenomas were identified by metagenomic analysis and further validated by targeted quantification using qPCR. Metagenomics identified five new candidate markers for CRC diagnosis that have not previously been reported: I. butyriciproducens, Cloacibacillus porcorum, C. asparagiforme, B. dorei, and S. oralis. qPCR further validated a significant increasing trend from normal to adenoma to CRC for three species: C. asparagiforme, Cloacibacillus porcorum, and I. butyriciproducens. Of these, Cloacibacillus porcorum is the most promising new marker for the diagnosis of CRC and adenomas, being significantly increased in both adenoma and CRC samples compared with healthy controls. The present application further compares the diagnostic performance of Cloacibacillus porcorum with that of previously identified bacterial markers (m3, Fn, Ch and Bc). The present application also evaluates novel bacterial marker combinations, with or without FIT, for the diagnosis of CRC and adenomas. The results demonstrate that adding Cloacibacillus porcorum significantly improves the diagnostic performance of the previously identified bacterial markers for CRC and adenomas, including non-advanced and advanced adenomas, whereas combination with FIT further improves the diagnosis of CRC and advanced adenomas.

Targeted detection of bacterial markers identified by shotgun metagenomics is a more promising strategy for clinical application. In this study, quantification of bacterial markers by qPCR performed well in diagnosing CRC and adenomas. Specifically, the combination of Fn, m3 and Cloacibacillus porcorum showed AUCs for CRC and adenoma of 0.897 (95% CI: 0.844 to 0.937) and 0.770 (95% CI: 0.709 to 0.824), respectively. Further addition of Bc and/or Ch increased the AUCs for CRC and adenoma to over 0.92 and 0.77, respectively. Combination with FIT further increased the diagnostic sensitivity of the bacterial markers for CRC and advanced adenomas. The best combination, containing all five bacterial markers and FIT, showed a sensitivity of 64.3% for adenomas and 96.2% for CRC at a specificity of 84.6%.
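
The sensitivity and specificity figures reported for a model depend on the cutoff applied to its score. The following minimal sketch shows how the two quantities are computed at a given cutoff and how a cutoff could be selected by maximizing Youden's index, J = sensitivity + specificity - 1. Selecting the cutoff in this way is an assumption made for illustration and is not stated here to be the procedure behind the figures above; the function names and scores are hypothetical.

```python
def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    true_pos = sum(1 for s in case_scores if s >= cutoff)
    true_neg = sum(1 for s in control_scores if s < cutoff)
    return true_pos / len(case_scores), true_neg / len(control_scores)


def best_youden_cutoff(case_scores, control_scores):
    """Return (cutoff, J) maximizing Youden's J over all observed score values."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        sens, spec = sensitivity_specificity(case_scores, control_scores, cutoff)
        j = sens + spec - 1.0
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy model scores; values are made up for illustration.
toy_case_scores = [0.9, 0.8, 0.6, 0.55]
toy_control_scores = [0.7, 0.3, 0.2, 0.1]

cutoff, j = best_youden_cutoff(toy_case_scores, toy_control_scores)
sens, spec = sensitivity_specificity(toy_case_scores, toy_control_scores, cutoff)
print(f"cutoff={cutoff}, J={j}, sensitivity={sens}, specificity={spec}")
```

Fixing the cutoff on controls and cases together, as here, yields one specificity and one sensitivity per disease group, which matches the way a single specificity is reported alongside separate sensitivities for adenomas and CRC.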

References

1. Allemani C, Matsuda T, Di Carlo V, et al. Global surveillance of trends in cancer survival 2000-14 (CONCORD-3): analysis of individual records for 37 513 025 patients diagnosed with one of 18 cancers from 322 population-based registries in 71 countries. Lancet 2018; 391(10125): 1023-75.

2. The Lancet. GLOBOCAN 2018: counting the toll of cancer. Lancet 2018; 392(10152): 985.

3. Irrazabal T, Belcheva A, Girardin SE, Martin A, Philpott DJ. The multifaceted role of the intestinal microbiota in colon cancer. Molecular Cell 2014; 54(2): 309-20.

4. Yu J, Feng Q, Wong SH, et al. Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer. Gut 2015: Sep 25. pii: gutjnl-2015-309800.

5. Nakatsu G, Li X, Zhou H, et al. Gut mucosal microbiome across stages of colorectal carcinogenesis. Nat Commun 2015; 6: 8727.

6. Dai Z, Coker OO, Nakatsu G, et al. Multi-cohort analysis of colorectal cancer metagenome identified altered bacteria across populations and universal bacterial markers. Microbiome 2018; 6(1): 70.

7. Tilg H, Adolph TE, Gerner RR, Moschen AR. The Intestinal Microbiota in Colorectal Cancer. Cancer Cell 2018; 33(6): 954-64.

8. Wong SH, Zhao L, Zhang X, et al. Gavage of Fecal Samples From Patients With Colorectal Cancer Promotes Intestinal Carcinogenesis in Germ-Free and Conventional Mice. Gastroenterology 2017; 153(6): 1621-33e6.

9. Kostic AD, Chun E, Robertson L, et al. Fusobacterium nucleatum potentiates intestinal tumorigenesis and modulates the tumor-immune microenvironment. Cell Host Microbe 2013; 14(2): 207-15.

10. Rubinstein MR, Wang X, Liu W, Hao Y, Cai G, Han YW. Fusobacterium nucleatum promotes colorectal carcinogenesis by modulating E-cadherin/beta-catenin signaling via its FadA adhesin. Cell Host Microbe 2013; 14(2): 195-206.

11. Yu T, Guo F, Yu Y, et al. Fusobacterium nucleatum Promotes Chemoresistance to Colorectal Cancer by Modulating Autophagy. Cell 2017; 170(3): 548-63e16.

12. Tsoi H, Chu ESH, Zhang X, et al. Peptostreptococcus anaerobius Induces Intracellular Cholesterol Biosynthesis in Colon Cells to Induce Proliferation and Causes Dysplasia in Mice. Gastroenterology 2017; 152(6): 1419-33e5.

13. Liang Q, Chiu J, Chen Y, et al. Fecal Bacteria Act as Novel Biomarkers for Noninvasive Diagnosis of Colorectal Cancer. Clin Cancer Res 2017; 23(8): 2061-70.

14. Xie YH, Gao QY, Cai GX, et al. Fecal Clostridium symbiosum for Noninvasive Detection of Early and Advanced Colorectal Cancer: Test and Validation Studies. EBioMedicine 2017; 25: 32-40.

15. Shah MS, DeSantis TZ, Weinmaier T, et al. Leveraging sequence-based faecal microbial community survey data to identify a composite biomarker for colorectal cancer. Gut 2018; 67(5): 882-91.

16. Nakatsu G, Zhou H, Wu WKK, et al. Alterations in Enteric Virome Are Associated With Colorectal Cancer and Survival Outcomes. Gastroenterology 2018; 155(2): 529-41e5.

17. Beghini F, McIver LJ, Blanco-Miguez A, et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. Elife 2021; 10.

18. Jalanka J, Salonen A, Salojarvi J, et al. Effects of bowel cleansing on the intestinal microbiota. Gut 2015; 64(10): 1562-8.

19. Liang JQ, Li T, Nakatsu G, et al. A novel faecal Lachnoclostridium marker for the non-invasive diagnosis of colorectal adenoma and cancer. Gut 2020; 69(7): 1248-57.

20. Wong SH, Kwong TNY, Chow TC, et al. Quantitation of faecal Fusobacterium improves faecal immunochemical test in detecting advanced colorectal neoplasia. Gut 2017; 66(8): 1441-8.

21. Youden WJ. Index for rating diagnostic tests. Cancer 1950; 3(1): 32-5.

22. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44(3): 837-45.

Claims

1. Use of an agent that specifically and quantitatively identifies DNA or RNA that is characteristic for Cloacibacillus porcorum in the manufacture of a kit for detecting colorectal adenoma in a subject;

wherein the DNA or RNA specific for Cloacibacillus porcorum comprises the nucleic acid sequence shown in SEQ ID NO. 21;

the agent that specifically and quantitatively identifies DNA or RNA specific for Cloacibacillus porcorum comprises oligonucleotide primers comprising the nucleic acid sequences set forth in SEQ ID NO. 2 and SEQ ID NO. 3; or

the agent that specifically and quantitatively identifies DNA or RNA specific for Cloacibacillus porcorum comprises the polynucleotide probe shown in SEQ ID NO. 4.

2. The use of claim 1, wherein the kit further comprises a standard control providing an average amount of Cloacibacillus porcorum in a fecal sample.

3. The use of claim 1, wherein the kit further comprises one or more reagents selected from the group consisting of:

an agent for specifically and quantitatively identifying DNA or RNA specific for the genus Fusobacterium;

an agent for specifically and quantitatively identifying DNA or RNA specific for bacterium m3 of the genus Lachnoclostridium;

an agent for specifically and quantitatively identifying DNA or RNA specific for Clostridium hathewayi (Hungatella hathewayi); and

an agent for specifically and quantitatively identifying DNA or RNA specific for Bacteroides clarus.

4. The method according to claim 3, wherein,

5. The use according to claim 3, wherein

the agent for specifically and quantitatively identifying DNA or RNA specific for the genus Fusobacterium comprises oligonucleotide primers comprising the nucleic acid sequences shown in SEQ ID NO. 6 and SEQ ID NO. 7;

the agent for specifically and quantitatively identifying DNA or RNA specific for bacterium m3 of the genus Lachnoclostridium comprises oligonucleotide primers comprising the nucleic acid sequences shown in SEQ ID NO. 10 and SEQ ID NO. 11;

the agent for specifically and quantitatively identifying DNA or RNA specific for Clostridium hathewayi (Hungatella hathewayi) comprises oligonucleotide primers comprising the nucleic acid sequences shown in SEQ ID NO. 14 and SEQ ID NO. 15; or

the agent for specifically and quantitatively identifying DNA or RNA specific for Bacteroides clarus comprises oligonucleotide primers comprising the nucleic acid sequences shown in SEQ ID NO. 18 and SEQ ID NO. 19;

or,

the agent for specifically and quantitatively identifying DNA or RNA specific for the genus Fusobacterium comprises the polynucleotide probe shown in SEQ ID NO. 8;

the agent for specifically and quantitatively identifying DNA or RNA specific for bacterium m3 of the genus Lachnoclostridium comprises the polynucleotide probe shown in SEQ ID NO. 12;

the agent for specifically and quantitatively identifying DNA or RNA specific for Clostridium hathewayi (Hungatella hathewayi) comprises the polynucleotide probe shown in SEQ ID NO. 16; or

the agent for specifically and quantitatively identifying DNA or RNA specific for Bacteroides clarus comprises the polynucleotide probe shown in SEQ ID NO. 20.

6. The use of claim 1 or 3, wherein the agent comprises a detectable moiety.

7. The use of claim 1 or 3, wherein the kit further comprises reagents for a fecal immunochemical test (FIT).

8. The use of claim 3, wherein the kit further comprises a standard control providing an average amount, in a fecal sample, of one or more selected from the group consisting of:

the genus Fusobacterium;

bacterium m3 of the genus Lachnoclostridium;

Clostridium hathewayi (Hungatella hathewayi); and

Bacteroides clarus.

9. The use according to claim 1, wherein,

the agent quantitatively identifies the DNA or RNA by RT-PCR, real-time quantitative PCR, or metagenomic sequencing.