WO2019204985A1 - Osteoporosis biomarkers and uses thereof - Google Patents

Osteoporosis biomarkers and uses thereof

Info

Publication number
WO2019204985A1
Authority
WO
WIPO (PCT)
Prior art keywords
osteoporosis
biomarker
bacteroides
analogue
relative abundance
Application number
PCT/CN2018/084276
Inventor
王奇
郭锐进
鞠艳梅
贾慧珏
Original Assignee
深圳华大生命科学研究院
Application filed by 深圳华大生命科学研究院
Priority to CN201880092711.2A (CN112384634B)
Priority to PCT/CN2018/084276 (WO2019204985A1)
Publication of WO2019204985A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00: Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Definitions

  • The present invention relates to the field of biomedicine, and in particular to osteoporosis biomarkers and uses thereof.
  • Specifically, the invention relates to biomarkers of osteoporosis or related diseases, methods of diagnosing or predicting the risk of osteoporosis or related diseases, kits, and uses of osteoporosis biomarkers in the preparation of kits.
  • Osteoporosis (from the Greek for "porous bone") is a disease in which the risk of fracture increases because of decreased bone density. It is caused by a large loss of minerals, in which calcium is lost from the bones into the blood. Osteoporosis is also the most common cause of fractures in middle-aged and elderly people. Bones that are prone to osteoporotic fracture include the spine, the forearm bones, and the hip. There are usually no symptoms before a fracture; the bone becomes so fragile that it can break under slight pressure. After a fracture, patients may suffer chronic pain and functional decline, and even daily activities can lead to further fractures.
  • The prevalence of osteoporosis increases with age: about 15% of white people are affected in their 50s, rising to 70% in those over 80. Osteoporosis is more common in women than in men. Screening in developed countries found that 2%-8% of men and 9%-38% of women were diagnosed with osteoporosis; the incidence in developing countries is still unclear. In 2010 there were nearly 22 million female and 5.5 million male patients in Europe, and in the same year 8 million female and 1 million to 2 million male patients in the United States. Risk factors for osteoporosis include sex (especially female), premature menopause, ethnicity (especially white and Asian), a thin bone structure, low body mass index, smoking, alcoholism, insufficient physical activity, and family medical history.
  • Although osteoporosis does not directly lead to death in most cases, it increases the chance of fracture, which affects the patient's health and ability to live independently and greatly increases the social medical burden.
  • Existing bone mineral density screening reveals only superficial changes in bone density and cannot assess a patient's osteoporosis as a whole.
  • The present application is based on the inventors' discovery and recognition of the following facts and problems. Gut microbes are the microbial communities present in the human gut and are regarded as the "second genome" of the human body.
  • The human intestinal flora and the host form an interrelated whole.
  • Intestinal microbes not only degrade nutrients in food and provide the host with vitamins and other nutrients, but also promote the differentiation and maturation of intestinal epithelial cells, thereby activating the intestinal immune system and regulating host energy storage and metabolism. They therefore play an important role in the body's digestion and absorption, immune response, and metabolic activity.
  • The inventors compared the intestinal flora and gene sequences of osteoporosis patients and healthy people, and thereby screened out biomarkers highly correlated with osteoporosis. The markers can be used to accurately diagnose osteoporosis or related diseases or predict the risk of disease, and to monitor treatment outcomes.
  • This addresses shortcomings such as the lack of early warning and the inability to predict the onset and progression of osteoporosis. The markers can therefore be applied to predict the onset and development of osteoporosis and to the pathological typing of the disease.
  • Osteoporosis-associated biomarkers are believed to be valuable for early diagnosis for the following reasons.
  • The markers of the invention are specific and sensitive.
  • The analysis of feces ensures accuracy, safety, affordability, and patient compliance.
  • Stool samples are transportable.
  • Polymerase chain reaction (PCR)-based assays are comfortable and non-invasive, so people are more likely to participate in a given screening procedure.
  • The markers of the invention can also be used as a tool for the therapeutic monitoring of osteoporosis patients, to detect their response to treatment.
  • In a first aspect, the invention provides a biomarker.
  • The biomarker comprises at least one selected from the group consisting of: Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, Ruminococcus torques or an analogue thereof, and Dialister invisus or an analogue thereof.
  • Each analogue has an alignment similarity of more than 85% to the genomic sequence of the corresponding species. That is, the Bacteroides thetaiotaomicron analogue has an alignment similarity of more than 85% to the genomic sequence of Bacteroides thetaiotaomicron, and the same threshold applies to the analogues of Bacteroides uniformis, Bacteroides intestinalis, Bacteroides dorei, Ruminococcus sp., Akkermansia muciniphila, Parabacteroides merdae, Ruminococcus torques, and Dialister invisus with respect to the genomic sequences of those species.
  • These biomarkers can be used for the detection of osteoporosis. By determining whether one, two, or more of these markers are present in the intestinal flora of a subject, it is possible to determine effectively whether the subject has or is susceptible to osteoporosis (that is, to predict the risk of osteoporosis). The biomarkers can further be used to monitor the therapeutic effect in patients with osteoporosis. In addition, when the number of healthy samples is sufficient, a person skilled in the art can obtain the normal value or normal range of each biomarker in the intestine by testing and calculation, which indicates the content of each marker in healthy samples. Whether a subject has or is susceptible to osteoporosis can then be determined by detecting the amount of at least one of these biomarkers in the intestinal flora, and the same measurement can be used to monitor the effectiveness of treatment in patients with osteoporosis.
  • When the alignment similarity is more than 85%, the microorganism can be considered to belong to the same genus as the strain, or the gene sequence can be classified into the same genus as the strain. Microorganisms of the same genus usually have the same or similar functions, so these analogues can also be used as markers of osteoporosis.
  • Alignment similarity in the present invention refers to the proportion of identical bases or amino acid residues between the target sequence (the sequence to be determined) and the reference sequence (the known sequence) in a sequence alignment.
  • The Bacteroides intestinalis is Bacteroides intestinalis DSM 17393, the Ruminococcus sp. is Ruminococcus sp. 5_1_39BFAA, the Akkermansia muciniphila is Akkermansia muciniphila ATCC BAA-835, the Parabacteroides merdae is Parabacteroides merdae ATCC 43184, the Ruminococcus torques is Ruminococcus torques L2-14, and the Dialister invisus is Dialister invisus DSM 15470.
  • These strains can serve as representatives of the corresponding species to indicate the disease state or risk of osteoporosis or osteoporosis-related diseases.
  • In some embodiments, the alignment similarity is more than 95%: the Bacteroides thetaiotaomicron analogue has an alignment similarity of more than 95% to the genomic sequence of Bacteroides thetaiotaomicron, and the same threshold applies to the analogues of Bacteroides uniformis, Bacteroides intestinalis, Bacteroides dorei, Ruminococcus sp., Akkermansia muciniphila, Parabacteroides merdae, Ruminococcus torques, and Dialister invisus with respect to the genomic sequences of those species.
  • When an unknown microorganism, or a gene sequence derived from its nucleic acid, has a similarity of more than 95% to a known strain, the microorganism can be considered to be the same species as the strain, or the gene sequence can be classified into the same species as the strain. Those skilled in the art can therefore obtain the nucleic acid sequence information of the test subject directly and compare it with the genome sequences of these strains. A sequence similarity of more than 95% can be taken as a sign that the subject has or is susceptible to osteoporosis.
  • When an analogue is aligned with the genomic sequence of the corresponding bacterium with an alignment coverage above 80% and an alignment similarity above 85%, the analogue can be considered to belong to the same genus as the corresponding bacterium and can be used as a marker for osteoporosis.
  • When the alignment coverage is above 80% and the alignment similarity is more than 95%, the analogue can be considered to be the same species as the corresponding bacterium and can be used as a marker for osteoporosis.
  • Alignment coverage refers to the ratio of the length of the part of the target sequence that aligns to the reference sequence to the total length of the target sequence.
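The similarity and coverage thresholds described above can be illustrated with a minimal Python sketch. It assumes the alignment has already been produced by an external aligner and is supplied as two gapped strings of equal length. The function names, the `-` gap symbol, and the choice to compute similarity over aligned (non-gap) columns are assumptions made for illustration, not details given in the source.

```python
def alignment_metrics(aligned_target, aligned_reference, target_length):
    """Return (similarity, coverage) for one pairwise alignment.

    similarity: share of aligned columns with identical residues
                (the denominator is an assumption of this sketch).
    coverage:   aligned part of the target divided by the full target length.
    """
    if len(aligned_target) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    # Keep only columns where both sequences carry a residue (no gap).
    columns = [(t, r) for t, r in zip(aligned_target, aligned_reference)
               if t != "-" and r != "-"]
    if not columns:
        return 0.0, 0.0
    identical = sum(1 for t, r in columns if t == r)
    return identical / len(columns), len(columns) / target_length


def classify_analogue(similarity, coverage):
    """Apply the thresholds from the text: coverage > 80%, similarity > 85% or > 95%."""
    if coverage <= 0.80 or similarity <= 0.85:
        return "no match"
    return "same species" if similarity > 0.95 else "same genus"
```

For example, a fully aligned 10-base target with one mismatch has a similarity of 0.9 and a coverage of 1.0, so `classify_analogue` would place it in the same genus but not the same species.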
  • In a second aspect, the invention proposes a method of diagnosing whether a subject has osteoporosis or a related disease, or of predicting whether the subject is at risk of osteoporosis or a related disease.
  • The method comprises the steps of: (1) collecting a sample from the subject; (2) determining relative abundance information of the biomarker according to the first aspect of the invention in the sample obtained in step (1); and (3) comparing the relative abundance information from step (2) with a reference data set or reference value.
  • The method is not limited to the diagnosis of disease in the sense of patent law; it can also be used for non-diagnostic purposes, such as scientific research or enriching personal genetic information and genetic information databases.
  • The relative abundance information of each biomarker in the test subject is compared with a reference data set or a reference value to determine whether the subject has osteoporosis or a related disease, or to predict the risk of osteoporosis or a related disease.
  • The reference data set in the present invention refers to the relative abundance information of each biomarker obtained from samples of individuals already diagnosed as diseased or healthy, and serves as the reference for the relative abundance of each biomarker.
  • In some embodiments, the reference data set refers to a training data set.
  • The terms training set and validation set have the meanings known in the art.
  • The training set refers to a data set, comprising a certain number of samples, that contains the content of each biomarker in samples from osteoporosis subjects and non-osteoporosis subjects.
  • The validation set is an independent data set used to test the performance of the classifier built from the training set.
  • The reference value in the present invention refers to the reference value or normal value of healthy controls. It is known to those skilled in the art that when the sample size is sufficiently large, the range of normal values (absolute values) of each biomarker in the sample can be obtained using detection and calculation methods well known in the art. When the level of a biomarker is measured by an assay, the absolute value of the biomarker level in the sample can be compared directly with the reference value to assess the risk of disease and to diagnose osteoporosis or related diseases, including at an early stage. Optionally, statistical methods can be included.
  • An osteoporosis-related disease means a disease associated with osteoporosis, including a preceding symptom or disease that can cause osteoporosis and a subsequent or concurrent symptom or disease caused by osteoporosis.
  • The method may further include the following technical features:
  • The reference data set comprises relative abundance information of biomarkers in samples from a plurality of osteoporosis patients and a plurality of healthy controls, the biomarkers being the biomarkers according to the first aspect of the invention.
  • The method further comprises applying a multivariate statistical model to obtain a disease probability. Multivariate statistical models allow fast and efficient detection.
  • The multivariate statistical model is a random forest model.
  • A disease probability above a threshold indicates that the subject has osteoporosis or a related disease or is at risk of having osteoporosis or a related disease.
  • The threshold is 0.5.
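The idea of a random forest that outputs a disease probability, compared against a threshold of 0.5, can be sketched in plain Python. This is a toy stand-in under stated assumptions, not the model used in the source: each tree is a single-split tree (a decision stump) trained on a bootstrap sample with a random subset of features, and the forest probability is the average of the leaf disease fractions. All function names, hyperparameters, and the abundance values below are made up for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(X, y, features):
    """Pick the (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, sum(left) / len(left), sum(right) / len(right))
    if best is None:  # no usable split: fall back to the class prior
        prior = sum(y) / len(y)
        return (0, float("inf"), prior, prior)
    return best[1:]


def fit_forest(X, y, n_trees=200, n_features=3, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]            # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)     # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def disease_probability(forest, sample):
    """Average the leaf disease fractions over all trees."""
    votes = [p_left if sample[f] <= t else p_right for f, t, p_left, p_right in forest]
    return sum(votes) / len(votes)


def classify(forest, sample, threshold=0.5):
    """True when the disease probability is above the threshold."""
    return disease_probability(forest, sample) > threshold


# Made-up relative abundances for 9 markers: the first six are higher in
# disease, the last three are lower in disease (toy data, not from the source).
healthy = [[0.010 + 0.001 * i] * 6 + [0.050 + 0.001 * i] * 3 for i in range(6)]
disease = [[0.050 + 0.001 * i] * 6 + [0.010 + 0.001 * i] * 3 for i in range(6)]
X = healthy + disease
y = [0] * len(healthy) + [1] * len(disease)
```

Calling `classify(fit_forest(X, y), sample)` then returns whether a new 9-value abundance profile exceeds the 0.5 threshold. A production analysis would more likely rely on an established random forest implementation with full-depth trees.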
  • A decrease in Akkermansia muciniphila or its analogue, Parabacteroides merdae or its analogue, or Dialister invisus or its analogue, compared with a reference value, indicates that the subject has osteoporosis or a related disease or is at risk of having osteoporosis or a related disease. An increase in Bacteroides thetaiotaomicron or its analogue, Bacteroides uniformis or its analogue, Bacteroides intestinalis or its analogue, Bacteroides dorei or its analogue, Ruminococcus sp. or its analogue, or Ruminococcus torques or its analogue, compared with a reference value, likewise indicates that the subject has osteoporosis or a related disease or is at risk of having osteoporosis or a related disease.
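The direction-of-change rule above can be written as a small lookup. The sketch below is illustrative only: the function name, the dictionary layout, and the treatment of a missing species as zero abundance are assumptions, and the reference values would have to come from a healthy control cohort.

```python
# Species whose decrease relative to the reference points toward osteoporosis.
DECREASED_IN_DISEASE = (
    "Akkermansia muciniphila",
    "Parabacteroides merdae",
    "Dialister invisus",
)
# Species whose increase relative to the reference points toward osteoporosis.
INCREASED_IN_DISEASE = (
    "Bacteroides thetaiotaomicron",
    "Bacteroides uniformis",
    "Bacteroides intestinalis",
    "Bacteroides dorei",
    "Ruminococcus sp.",
    "Ruminococcus torques",
)


def risk_indicators(abundance, reference):
    """Return the markers whose deviation from the reference indicates risk.

    abundance: species -> relative abundance in the subject (missing = 0.0, an assumption).
    reference: species -> reference value from healthy controls.
    """
    flagged = []
    for species in DECREASED_IN_DISEASE:
        if abundance.get(species, 0.0) < reference[species]:
            flagged.append(species)
    for species in INCREASED_IN_DISEASE:
        if abundance.get(species, 0.0) > reference[species]:
            flagged.append(species)
    return flagged
```

An empty result means that none of the nine markers deviates from its reference value in the direction associated with disease.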
  • The relative abundance information of the biomarker in step (2) is obtained by a sequencing method, which further comprises: isolating a nucleic acid sample from the sample of the subject; constructing a DNA library from the nucleic acid sample; sequencing the DNA library to obtain a sequencing result; and aligning the sequencing result with a reference gene set to determine the relative abundance information of the biomarker.
  • The sequencing result can be aligned with the reference gene set using at least one of SOAP2 and MAQ, which improves the efficiency of the alignment and hence of osteoporosis detection.
  • A plurality of (at least two) biomarkers can be detected simultaneously, which improves the efficiency of osteoporosis detection.
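Once reads have been aligned to the reference gene set, relative abundance can be derived from per-gene read counts. The sketch below uses a length-normalized formula that is common in metagenomic analysis, a_i = (x_i / L_i) / Σ_j (x_j / L_j), where x_i is the number of reads assigned to gene i and L_i is the gene length. The formula and the function name are assumptions made for illustration; the passage above does not state which normalization is used.

```python
def gene_relative_abundance(read_counts, gene_lengths):
    """Length-normalized relative abundance per gene (values sum to 1).

    read_counts:  gene id -> number of reads aligned to that gene.
    gene_lengths: gene id -> gene length in bases.
    """
    copy_number = {}
    for gene, count in read_counts.items():
        length = gene_lengths[gene]
        if length <= 0:
            raise ValueError(f"gene {gene!r} has a non-positive length")
        copy_number[gene] = count / length  # reads per base
    total = sum(copy_number.values())
    if total == 0:
        return {gene: 0.0 for gene in copy_number}
    return {gene: value / total for gene, value in copy_number.items()}
```

Dividing by gene length keeps long genes from appearing more abundant simply because they attract more reads.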
  • The reference gene set is obtained by performing metagenomic sequencing on samples from a plurality of osteoporosis patients and a plurality of healthy controls to obtain a non-redundant gene set, and then merging that non-redundant gene set with a gut microbial gene set.
  • The reference gene set in the present invention may be an existing gene set, such as a published intestinal microbial reference gene set. Alternatively, samples from a plurality of osteoporosis patients and a plurality of healthy controls may be subjected to metagenomic sequencing.
  • The term non-redundant gene set is to be interpreted as commonly understood by those skilled in the art: put simply, it is the set of genes remaining after redundant genes are removed.
  • A redundant gene usually refers to multiple copies of a gene that appear on a chromosome.
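The notion of removing redundant genes and merging the result with an existing gene set can be shown with a deliberately simplified sketch. It treats two genes as redundant only when their sequences are identical (ignoring case), which is an assumption made for brevity; real pipelines typically cluster near-identical sequences with a dedicated tool. The function names and the tuple layout are hypothetical.

```python
def non_redundant(genes):
    """Keep one entry per distinct sequence; the first identifier seen wins.

    genes: iterable of (gene_id, sequence) pairs.
    """
    seen = {}
    for gene_id, sequence in genes:
        key = sequence.upper()
        if key not in seen:
            seen[key] = gene_id
    # Dictionaries preserve insertion order, so the output order is stable.
    return [(gene_id, sequence) for sequence, gene_id in seen.items()]


def build_reference_gene_set(cohort_genes, published_genes):
    """Deduplicate the cohort genes, then merge them with a published gene set."""
    merged = non_redundant(cohort_genes) + list(published_genes)
    return non_redundant(merged)
```

Because cohort genes are placed first, a gene found in both sources keeps its cohort identifier in the merged set.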
  • The sample is a stool sample.
  • The sequencing is performed by a second-generation or third-generation sequencing method.
  • The means of sequencing is not particularly limited; second- or third-generation sequencing methods enable rapid and efficient sequencing.
  • The sequencing is performed by at least one selected from the group consisting of Hiseq2000, SOLiD, 454, and a single-molecule sequencing device.
  • The invention provides a kit comprising a reagent for detecting a biomarker, the biomarker comprising a biomarker according to the first aspect of the invention.
  • Using the kit, the relative abundance of these markers in the intestinal flora can be determined, and the relative abundance values obtained can be used to determine whether the subject has or is susceptible to osteoporosis and to monitor the treatment effect in patients with osteoporosis.
  • The kit includes a set of reference data sets or reference values for use as a reference for the relative abundance of each biomarker.
  • The reference data set or reference value can be stored on a physical carrier, such as an optical disc (for example, a CD-ROM).
  • The kit further comprises a first computer program product for obtaining the reference data set or reference value, that is, the reference data set or reference value used for diagnosing whether the subject has osteoporosis or a related disease or predicting whether the subject is at risk of osteoporosis or a related disease.
  • The kit further comprises a second computer program product for performing the method according to the second aspect of the invention of diagnosing whether a subject has osteoporosis or a related disease or predicting whether a subject is at risk of osteoporosis or a related disease.
  • The invention provides the use of a biomarker in the preparation of a kit for diagnosing whether a subject has osteoporosis or a related disease or for predicting whether the subject is at risk of osteoporosis or a related disease.
  • The diagnosis or prediction comprises the steps of: 1) collecting a sample from the subject; 2) determining the relative abundance information of the biomarker in the sample obtained in step 1), the biomarker being a biomarker according to the first aspect of the invention; and 3) comparing the relative abundance information from step 2) with a reference data set or reference value.
  • Using the kit, the relative abundance of these markers in the intestinal flora can be determined, and the relative abundance values obtained can be used to determine whether the subject has or is susceptible to osteoporosis and to monitor the therapeutic effect in patients with osteoporosis.
  • The use of the above biomarker in the preparation of the kit may further include the following technical features:
  • The reference data set comprises relative abundance information of biomarkers in samples from a plurality of osteoporosis patients and a plurality of healthy controls, the biomarkers being the biomarkers according to the first aspect of the invention.
  • The step of comparing the relative abundance information from step 2) with the reference data set further comprises applying a multivariate statistical model to obtain a disease probability; preferably, the multivariate statistical model is a random forest model.
  • A disease probability above a threshold indicates that the subject has osteoporosis or a related disease or is at risk of having osteoporosis or a related disease; preferably, the threshold is 0.5.
  • A decrease in Akkermansia muciniphila or its analogue, Parabacteroides merdae or its analogue, or Dialister invisus or its analogue, compared with a reference value, indicates that the subject has osteoporosis or a related disease or is at risk of having osteoporosis or a related disease. An increase in Bacteroides thetaiotaomicron or its analogue, Bacteroides uniformis or its analogue, Bacteroides intestinalis or its analogue, Bacteroides dorei or its analogue, Ruminococcus sp. or its analogue, or Ruminococcus torques or its analogue, compared with a reference value, likewise indicates that the subject has osteoporosis or a related disease or is at risk of having osteoporosis or a related disease.
  • The relative abundance information of the biomarker in step 2) is obtained by a sequencing method, which further comprises: isolating a nucleic acid sample from the sample of the subject; constructing a DNA library from the nucleic acid sample; sequencing the DNA library to obtain a sequencing result; and aligning the sequencing result with a reference gene set to determine the relative abundance information of the biomarker.
  • The reference gene set is obtained by performing metagenomic sequencing on samples from a plurality of osteoporosis patients and a plurality of healthy controls to obtain a non-redundant gene set, and then merging that non-redundant gene set with a gut microbial gene set.
  • The sample is a stool sample.
  • The sequencing is performed by a second-generation or third-generation sequencing method.
  • The sequencing is performed by at least one selected from the group consisting of Hiseq2000, SOLiD, 454, and a single-molecule sequencing device.
  • The present invention provides a use of a biomarker as a target for screening medicaments for treating or preventing osteoporosis or a related disease.
  • The biomarker comprises a biomarker according to the first aspect of the invention.
  • The effect of a candidate drug on these biomarkers, measured before and after use, can be used to determine whether the candidate drug can be used to treat or prevent osteoporosis.
  • The invention provides the use of a biomarker for diagnosing whether a subject has osteoporosis or a related disease or for predicting whether the subject is at risk of osteoporosis or a related disease.
  • The biomarker comprises a biomarker according to the first aspect of the invention.
  • The present invention provides a medicament for preventing or treating osteoporosis or a related disease.
  • The drug is capable of increasing the relative abundance values of Akkermansia muciniphila or its analogue, Parabacteroides merdae or its analogue, and Dialister invisus or its analogue in a subject, and of reducing the relative abundance values of Bacteroides thetaiotaomicron or its analogue, Bacteroides uniformis or its analogue, Bacteroides intestinalis or its analogue, Bacteroides dorei or its analogue, Ruminococcus sp. or its analogue, and Ruminococcus torques or its analogue.
  • Feces are a metabolic product of the human body. They contain not only human metabolites but also intestinal microbes, which are closely related to changes in the body's metabolism, immunity, and other functions.
  • A study of feces found that there are significant differences in the composition of the intestinal flora between osteoporosis patients and healthy people, which makes it possible to assess risk accurately and to diagnose osteoporosis patients early.
  • The invention compares and analyzes the intestinal flora of osteoporosis patients and healthy people to obtain a variety of related intestinal strains, and uses the MLGs of high-quality osteoporosis and non-osteoporosis populations as the training set, so that the risk of osteoporosis can be assessed accurately and patients diagnosed early. Compared with currently used diagnostic methods, the method is convenient and fast.
  • Figure 1 is a graph showing the difference in number at the gene level between osteoporosis patients and healthy controls according to an embodiment of the present invention; it shows that the composition of the intestinal flora differs between the osteoporosis group and the healthy group.
  • Figure 2 is a diagram showing the error rate distribution of five 10-fold cross-validations of a random forest classifier according to an embodiment of the present invention.
  • Figure 3 shows the receiver operating characteristic (ROC) curve and the area under the curve (AUC) for the training set, based on a random forest model (9 intestinal markers), in accordance with one embodiment of the present invention.
  • Figure 4 shows the receiver operating characteristic (ROC) curve for a validation set consisting of healthy controls and osteoporosis patients (healthy: 7, disease: 7), based on a random forest model (9 intestinal markers), in accordance with one embodiment of the present invention.
  • Figure 5 is a schematic view of the structure of an apparatus for determining whether a subject has osteoporosis or a related disease, or predicting whether a subject is at risk of osteoporosis or a related disease, according to an embodiment of the present invention, wherein panel a is a schematic diagram of the apparatus and panel b is a schematic diagram of the biomarker relative abundance determining device within the apparatus.
  • The present invention proposes biomarker substances for assessing the risk of osteoporosis or for the early diagnosis of osteoporosis, as well as methods for the diagnosis and risk assessment of osteoporosis, which can predict the incidence and development of osteoporosis and be applied to the pathological classification of the disease.
  • The invention proposes a biomarker for osteoporosis.
  • WHO World Health Organization
  • The level of the biomarker substance is indicated by relative abundance.
  • The term "biomarker", also referred to as “biological marker”, refers to a measurable indicator of the biological state of an individual.
  • Such a biomarker may be any substance in an individual, as long as it is related to a specific biological state (for example, a disease) of the individual being tested. Examples include nucleic acid markers (which may also be referred to as genetic markers, such as DNA), protein markers, cytokine markers, chemokine markers, carbohydrate markers, antigen markers, antibody markers, species markers (species/genus markers), and functional markers (KO/OG markers).
  • A nucleic acid marker is not limited to an existing gene that can be expressed as a biologically active protein; it includes any nucleic acid fragment, which may be DNA or RNA, modified or unmodified, or a combination thereof. Nucleic acid markers are sometimes also referred to herein as feature fragments.
  • The term biomarker can also be replaced with "intestinal marker", because the biomarkers found in the present invention to be closely related to osteoporosis are all present in the intestinal tract of a subject. Biomarkers are measured and evaluated, are often used to examine normal biological processes, pathogenic processes, or therapeutic interventions, and are useful in many scientific fields.
  • High-throughput sequencing can be used to analyze stool samples from healthy people and osteoporosis patients in batches. Based on the high-throughput sequencing data, the healthy population is compared with the osteoporosis population to determine specific nucleic acid sequences associated with osteoporosis patients.
  • The steps are as follows:
  • Collection and processing of samples: stool samples are collected from healthy people and osteoporosis patients, and DNA is extracted with a kit to obtain nucleic acid samples;
  • DNA library construction and sequencing: libraries are constructed and sequenced by high-throughput sequencing to obtain the nucleic acid sequences of the gut microbes contained in the stool samples;
  • Bioinformatic analysis: specific gut microbial nucleic acid sequences associated with osteoporosis patients are determined by bioinformatics analysis methods.
  • The sequencing reads are aligned with a reference gene set (which may be a newly constructed gene set or a database of any known sequences, for example a known non-redundant gene set of the human intestinal microbial community), and the relative abundance of each gene in the nucleic acid samples from the stool samples of the healthy population and the osteoporosis patient population is determined from the alignment results.
  • By aligning the sequencing reads with the reference gene set, each read can be assigned to a gene in the reference gene set, so that the number of reads assigned to a specific gene effectively reflects the relative abundance of that gene in the nucleic acid sample.
  • The relative abundance of genes in the nucleic acid sample can thus be determined from the alignment results using conventional statistical analysis.
  • The relative abundances of each gene in the nucleic acid samples from the healthy population and the osteoporosis patient population are then tested statistically to judge whether any gene differs significantly in relative abundance between the two populations. A gene with a significant difference is regarded as a biomarker of the abnormal state, that is, a nucleic acid marker.
  • the species information and functional annotations of the genes can be further classified. Thereby determining the relative abundance and relative abundance of the species of each microorganism in the intestinal flora, it is possible to further determine the species markers and functional markers of the abnormal state.
  • the method for determining a species marker and a functional marker further comprises: comparing a sequencing sequence of a healthy population and a population group of osteoporosis with a reference gene set; and determining a healthy population and a bone based on the comparison result, respectively.
  • Species relative abundance and relative abundance of each gene in a nucleic acid sample from a population of patients with osteoporosis were performed; and species markers and functional markers with significant differences in relative abundance between nucleic acid samples from healthy populations and osteoporosis patient populations were determined, respectively.
  • statistical tests such as summation, averaging, median value, etc., can be performed to determine the relative abundance of genes from the same species and the relative abundance of genes having the same function annotation. Relative abundance and relative abundance of species.
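The aggregation of gene-level abundances into species-level or function-level abundances can be sketched as follows. This is a minimal illustration only, not the inventors' code: the function name `aggregate_abundance`, the dictionary-based inputs, and the toy values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_abundance(gene_abundance, gene_annotation, how="sum"):
    """Collapse per-gene relative abundances into per-group values.

    gene_abundance:  dict gene_id -> relative abundance in one sample
    gene_annotation: dict gene_id -> species name or function label
    how:             "sum", "mean", or "median"
    """
    reducers = {"sum": sum, "mean": mean, "median": median}
    if how not in reducers:
        raise ValueError(f"unsupported aggregation: {how}")
    grouped = defaultdict(list)
    for gene, abundance in gene_abundance.items():
        label = gene_annotation.get(gene)
        if label is not None:  # genes without an annotation are skipped
            grouped[label].append(abundance)
    return {label: reducers[how](values) for label, values in grouped.items()}


# Hypothetical toy data: four genes, one of them unannotated.
genes = {"g1": 0.25, "g2": 0.5, "g3": 0.125, "g4": 0.125}
species = {
    "g1": "Bacteroides uniformis",
    "g2": "Bacteroides uniformis",
    "g3": "Dialister invisus",
}
by_species = aggregate_abundance(genes, species, how="sum")
```

Passing a function-annotation dictionary instead of a species dictionary gives the functional relative abundance with the same code.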
  • Finally, biomarkers whose relative abundance differs significantly between stool samples from the healthy population and the osteoporosis patient population were identified, comprising the microbial species: Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, Ruminococcus torques or an analogue thereof, and Dialister invisus or an analogue thereof.
  • Thus, whether a subject has or is susceptible to osteoporosis can be determined effectively by detecting the presence or absence of at least one of the above microorganisms, which can also be used to monitor the therapeutic effect in osteoporosis patients.
  • The term "presence" as used herein should be understood broadly: it may refer to a qualitative analysis of whether a sample contains the corresponding target, or to a quantitative analysis of the target in the sample, optionally followed by statistical analysis, or any known mathematical operation, comparing the quantitative result with a reference (for example, a quantitative result obtained by testing a sample of known state in parallel). Those skilled in the art can readily choose among these according to their needs and test conditions.
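  • According to an embodiment of the invention, whether a subject has or is susceptible to osteoporosis can also be determined from the relative abundance of these microorganisms in the intestinal flora, which can likewise be used to monitor the therapeutic effect in osteoporosis patients.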
  • The term "biomarker combination" refers to a combination of two or more biomarkers.
  • For example, species identification can be performed by 16S rRNA analysis.
  • According to one embodiment, the present invention provides the use of a reagent in the preparation of a kit for diagnosing osteoporosis or a related disease, or for predicting the risk of osteoporosis or a related disease, the reagent being used to detect the biomarkers of the invention.
  • The present invention provides an apparatus for determining whether a subject has osteoporosis or a related disease, or for predicting the subject's risk of osteoporosis or a related disease, as shown in Fig. 5.
  • The apparatus comprises a sample collection device 100, a biomarker relative abundance determining device 200, and a disease probability determining device 300 (panel a of Fig. 5).
  • The sample collection device is adapted to collect a sample from the subject;
  • the biomarker relative abundance determining device is coupled to the sample collection device and is adapted to determine relative abundance information of the biomarker in the collected sample,
  • the biomarker being a biomarker according to the first aspect of the present invention;
  • the disease probability determining device is connected to the biomarker relative abundance determining device and compares the relative abundance information of the biomarkers obtained by that device with a reference data set or a reference value.
  • the reference data set comprises relative abundance information of the biomarkers according to the first aspect of the invention in a sample from a plurality of osteoporosis patients and a plurality of healthy controls.
  • The disease probability determining device is further configured to execute a multivariate statistical model to obtain a disease probability; preferably, the multivariate statistical model is a random forest model.
  • A disease probability greater than a threshold indicates that the subject has osteoporosis or a related disease, or is at risk of it; preferably, the threshold is 0.5.
  • Compared with a reference value, a decrease in Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof indicates that the subject has, or is at risk of, osteoporosis or a related disease; an increase in Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques L2-14 or an analogue thereof likewise indicates that the subject has, or is at risk of, osteoporosis or a related disease.
  • The biomarker relative abundance determining device further comprises a nucleic acid sample separation unit 210, a sequencing unit 220, and an alignment unit 230 (panel b of Fig. 5).
  • The nucleic acid sample separation unit is adapted to separate a nucleic acid sample from the sample of the subject;
  • the sequencing unit is connected to the nucleic acid sample separation unit; it constructs a DNA library from the obtained nucleic acid sample and sequences the DNA library to obtain sequencing results;
  • the alignment unit is connected to the sequencing unit and aligns the sequencing results against a reference gene set to determine the relative abundance information of the biomarker.
  • The reference gene set is obtained by performing metagenomic sequencing on samples from a plurality of osteoporosis patients and a plurality of healthy controls to obtain a non-redundant gene set, and then merging that non-redundant gene set with a gut microbial gene set.
  • the sequencing unit is not particularly limited.
  • The sequencing unit performs sequencing by a second-generation or third-generation sequencing method.
  • The sequencing unit is at least one selected from the group consisting of Hiseq2000, SOLiD, 454, and single-molecule sequencing devices.
  • The alignment unit performs the alignment using at least one of SOAP2 and MAQ.
  • This improves the efficiency of the alignment and, in turn, the efficiency of osteoporosis detection.
  • the present invention also proposes a drug screening method.
  • A marker closely related to osteoporosis is used as a drug design target for drug screening, promoting the discovery of new drugs for treating osteoporosis.
  • Whether a candidate drug can be used to treat or prevent osteoporosis can be determined by detecting the change in biomarker levels before and after contact with the candidate. For example, one can test whether the level of a harmful biomarker decreases, and whether the level of a beneficial biomarker increases, after contact with the candidate drug.
  • the present invention also provides the use of a biomarker for osteoporosis in screening for a medicament for treating or preventing osteoporosis.
  • the technical means employed in the examples are conventional means well known to those skilled in the art, and the reagents and products employed are also commercially available.
  • Processes and methods not described in detail are conventional methods well known in the art. The source, trade name, and, where necessary, composition of each reagent are given at its first occurrence; unless otherwise stated, the same reagent used thereafter is as first described.
  • The invention uses metagenome-wide association study (MWAS) analysis: the bacterial composition and functional differences of fecal samples are analyzed by sequencing, and a random forest discriminant model distinguishes the osteoporosis group from the non-osteoporosis group to obtain a disease probability, which is used for risk assessment, diagnosis, and early diagnosis of osteoporosis, or for finding potential drug targets.
  • MLG refers to the metagenomic linkage group (Qin J, Li Y, Cai Z, et al. A metagenome-wide association study of gut microbiota in type 2 diabetes [J]. Nature, 2012, 490(7418): 55-60): in phylogenetic or population genetic studies, a uniform label assigned artificially to a taxonomic unit (strain, species, genus, group, etc.) to facilitate analysis. Sequences are usually divided into different MLGs according to similarity thresholds, and each MLG is usually regarded as one microbial species.
  • the MLG is considered to be a known species; if more than 50% of the sequences in an MLG match a known microbial genus with 85% base similarity, the MLG is annotated at the level of that genus.
  • the term "individual” refers to an animal, in particular a mammal, such as a primate, preferably a human.
  • the sequencing (second generation sequencing) and MWAS are well known in the art, and those skilled in the art can make adjustments according to specific conditions.
  • For example, the method described in Wang, Jun, and Huijue Jia, "Metagenome-wide association studies: fine-mining the microbiome," Nature Reviews Microbiology 14.8 (2016): 508-522, can be followed.
  • the methods of using the random forest model and the ROC curve are well known in the art, and those skilled in the art can perform parameter setting and adjustment according to specific conditions. According to an embodiment of the invention, it can be based on the literature (Drogan D, Dunn WB, Lin W, Buijsse B, Schulze MB, Langenberg C, Brown M, Floegel a., Dietrich S, Rolandsson O, Wedge DC, Goodacre R, Forouhi NG , Sharp SJ, Spranger J, Wareham NJ, Boeing H: Untargeted Metabolic Profiling Identifies Altered Serum Metabolites of Type 2-Diabetes Mellitus in a Prospective, Nested Case Control Study.
  • a training set of biomarkers for osteoporotic subjects and non-osteoporotic subjects is constructed, and based on this, the biomarker content values of the samples to be tested are evaluated.
  • a non-osteoporotic subject is a subject in a good mental state.
  • the subject may be a human or a model animal.
  • the normal content range (absolute value) of each biomarker in the sample can be derived using sample detection and calculation methods well known in the art.
  • the absolute value of the detected biomarker content can be compared with the normal content value, and optionally, statistical methods can also be combined to determine the risk assessment, diagnosis, and the like of osteoporosis.
  • biomarkers are intestinal flora present in the human body. Correlation analysis was performed on the intestinal flora of the subject by the method of the present invention, and the biomarker of the osteoporosis population was found to exhibit a certain content range value in the detection of the flora.
  • Fecal samples were collected, transported frozen, and rapidly transferred to -80 °C for storage; DNA was then extracted to obtain DNA samples.
  • A sequencing library was constructed from the extracted DNA samples, and single-end metagenomic sequencing (read length 100 bp) was performed on an Illumina HiSeq2000 sequencing platform.
  • The sequencing data are filtered for quality control: adapter contamination, low-quality reads, and host-genome contamination are removed.
  • the method described in the reference A metagenome-wide association study of gut microbiota in type 2 diabetes (Qin, J. et al. Nature 490, 55–60 (2012))
  • the predicted genes are classified by species.
  • an alignment similarity above 65% together with a coverage above 70% is the cutoff for classification at the phylum level;
  • an alignment similarity above 85% is the cutoff for classification at the genus level;
  • an alignment similarity above 95% is the cutoff for classification at the species and strain level.
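The three cutoffs above can be read as a simple decision rule. The sketch below is one possible reading, not the inventors' implementation: the function name `taxonomic_level` is hypothetical, treating "above" as "greater than or equal to" is an assumption, and so is applying the 70% coverage floor to every level (the text states it only for the phylum level).

```python
def taxonomic_level(similarity, coverage):
    """Return the finest taxonomic level supported by one gene alignment.

    similarity and coverage are fractions in [0, 1].
    Assumption: the 70% coverage floor applies to all levels.
    """
    if coverage < 0.70 or similarity < 0.65:
        return None          # not classifiable under these cutoffs
    if similarity >= 0.95:
        return "species"     # species and strain level
    if similarity >= 0.85:
        return "genus"
    return "phylum"
```

For example, an alignment with 90% similarity and 90% coverage would be assigned at the genus level under this reading.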
  • The relative abundance of each species is then calculated from the relative abundance of its genes, as described in A metagenome-wide association study of gut microbiota in type 2 diabetes (Qin, J. et al. Nature 490, 55–60 (2012)), and tested with the Wilcoxon rank-sum test (p < 0.05) to determine the species whose relative abundance differs significantly between cases and controls.
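For readers unfamiliar with the Wilcoxon rank-sum test, the following is a minimal standard-library sketch of the two-sided test using the normal approximation. It is an illustration under stated assumptions, not the analysis code used in the patent: the function name `rank_sum_test` is hypothetical, ties receive average ranks with a tie-corrected variance, and no continuity correction or exact small-sample distribution is used, so p-values may differ slightly from those of a statistics package.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation.

    Returns (U statistic of x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool the values, remembering which group (0 = x, 1 = y) each came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i                              # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                             # all values identical
        return u, 1.0
    z = (u - mean_u) / sqrt(var_u)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, min(1.0, p)


# Hypothetical relative abundances of one species in cases and controls.
cases = [0.9, 0.8, 0.7, 0.85, 0.95, 0.75, 0.88, 0.92]
controls = [0.1, 0.2, 0.3, 0.15, 0.25, 0.05, 0.12, 0.22]
u_stat, p_value = rank_sum_test(cases, controls)
significant = p_value < 0.05
```

In practice the test would be run once per species, and some correction for multiple testing is usually considered; the text above mentions only the p < 0.05 cutoff.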
  • This example constructs a training set of biomarkers from osteoporosis subjects and non-osteoporosis subjects and uses it as the basis for evaluating the biomarker content values of the samples to be tested.
  • The training set and the validation set have the meanings well known in the art.
  • A training set is a data set comprising the content of each biomarker in samples from osteoporosis subjects and non-osteoporosis subjects.
  • a validation set is a collection of independent data used to test the performance of a training set.
  • the non-osteoporosis subject is a subject with good mental state, and the subject can be a human or a model animal, and in this embodiment, the experiment is performed on a human subject.
  • The RF classifier obtained in the present invention contains 9 MLG markers (i.e., 9 biomarkers); their relative abundances are shown in Table 1 and their detailed information in Table 2.
  • Figure 2 shows the distribution of error rates for five 10-fold cross-validations in a random forest classifier.
  • In Figure 2, the thick black curve is the average over the 5 trials of 10-fold cross-validation, and the vertical line marks the number of MLGs in the best selected combination.
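The "five 10-fold cross-validations" design can be sketched as an index generator. This is a generic illustration of repeated k-fold splitting, not the procedure built into any particular random forest package; the function names, the seed, and the sample count of 23 are assumptions for the example.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)                        # fresh partition per repeat
        folds = [order[f::k] for f in range(k)]   # k folds of near-equal size
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, test_idx


def mean_error(per_fold_errors):
    """Average error rate over all folds of all repeats."""
    return sum(per_fold_errors) / len(per_fold_errors)


# 5 repeats x 10 folds = 50 train/test splits for a hypothetical 23 samples.
splits = list(repeated_kfold(n_samples=23))
```

A model would be fitted on each `train_idx`, scored on the matching `test_idx`, and the per-fold error rates averaged, once for each candidate number of markers.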
  • Figure 3 shows the receiver operating characteristic (ROC) curve and the area under the curve (AUC) on the training set of osteoporosis patients and healthy controls, based on the random forest model (9 biomarkers). Sensitivity is the probability that a diseased subject is judged to be diseased.
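The AUC and the sensitivity mentioned for Figure 3 can be computed from predicted probabilities and true labels as in the sketch below. It assumes labels of 1 for osteoporosis and 0 for healthy; the function names and the toy scores are hypothetical, and the pairwise formulation is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when score >= threshold is called diseased."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: two cases and two controls.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores)
```

The AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, which is why it can be computed without drawing the curve.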
  • The size of each marker gene set is the number of nucleic acid sequences included in that marker; the marker gene set annotation number is the number of genes annotated to the marker.
  • The model is verified on an independent population; a probability of disease (RP) ≥ 0.5 predicts that the individual is at risk of, or suffers from, osteoporosis.
  • the relative abundance of each biomarker in each sample in the validation set was calculated according to the method described in 1.4-1.5.
  • the verification set data is verified by the random forest model according to the method of 1.6.1.
  • Random forest classification and regression were performed with the randomForest package (version 4.6-12) in R version 3.2.5.
  • Inputs include the training set data (i.e., the relative abundance of the selected MLG markers in the training samples, see Table 1), the sample disease status (a vector over the training samples in which '1' stands for osteoporosis and '0' for healthy), and a validation set (the relative abundance of the selected MLG markers in the validation samples, see Tables 4-1 and 4-2).
  • The inventors used the randomForest function of the randomForest package in R to build the classifier and applied its prediction function to the validation set data; the output is the prediction result (probability of disease). The threshold is 0.5: a subject with a probability of disease ≥ 0.5 is considered to be at risk of, or suffering from, osteoporosis.
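How a disease probability and the 0.5 threshold fit together can be sketched in a few lines. A random forest classifier typically reports a class probability as the fraction of its trees that vote for that class; the sketch below assumes that convention and assumes the comparison against the threshold is "greater than or equal to". The function names and the vote list are hypothetical, and the code does not train a forest.

```python
def disease_probability(tree_votes):
    """Fraction of trees voting 1 (osteoporosis); each vote is 0 or 1."""
    if not tree_votes:
        raise ValueError("at least one tree vote is required")
    return sum(tree_votes) / len(tree_votes)


def is_at_risk(probability, threshold=0.5):
    """True when the probability of disease reaches the threshold."""
    return probability >= threshold


# Hypothetical forest of four trees, three of which vote "osteoporosis".
prob = disease_probability([1, 1, 0, 1])
flagged = is_at_risk(prob)
```

Raising the threshold trades sensitivity for specificity, which is what the ROC curve summarizes across all possible thresholds.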
  • The biomarkers disclosed by the present invention have high accuracy and specificity and good prospects for development into a diagnostic method, thereby providing a basis for risk assessment, diagnosis, and early diagnosis of osteoporosis and for the search for potential drug targets.
  • The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to.
  • Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature.
  • "A plurality" means at least two, for example two or three, unless specifically defined otherwise.
  • Unless explicitly stated and defined otherwise, the terms "mounted", "connected", "coupled", "fixed" and the like are to be understood broadly: a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or communicative; it may be direct or indirect through an intermediate medium; and it may be an internal connection between two elements or an interaction between two elements. Those skilled in the art can understand the specific meanings of these terms in the present invention on a case-by-case basis.
  • Unless explicitly stated and defined otherwise, a first feature being "on" or "under" a second feature may mean that the two features are in direct contact, or in indirect contact through an intermediate medium.
  • A first feature being "on", "above" or "over" a second feature may mean that the first feature is directly or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature.
  • A first feature being "under", "below" or "beneath" the second feature may mean that the first feature is directly or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Virology (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)

Abstract

The present invention relates to osteoporosis biomarkers and uses thereof. The biomarkers include Bacteroides thetaiotaomicron, Bacteroides uniformis, Bacteroides intestinalis, Bacteroides dorei, Ruminococcus sp., Akkermansia muciniphila, Parabacteroides merdae, Ruminococcus torques, Dialister invisus, and analogues thereof.

Description

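Osteoporosis Biomarkers and Uses Thereof
Technical Field
The present invention relates to the field of biomedicine, and specifically to osteoporosis biomarkers and uses thereof. In particular, it relates to biomarkers for osteoporosis or related diseases, methods for diagnosing or predicting the risk of osteoporosis or related diseases, kits, and the use of osteoporosis biomarkers in the preparation of kits.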
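Background
Osteoporosis (from the Greek for "porous bones") is a disease in which reduced bone density raises the risk of fracture. It arises from a large loss of minerals, with calcium in the bone continually lost to the blood; osteoporosis is also the most common cause of fracture in middle-aged and older people. The bones most prone to osteoporotic fracture are the vertebrae, the forearm bones, and the hip. There are usually no symptoms before a fracture, until the bones become so fragile that they break under slight pressure; once chronic pain and functional decline set in, even daily activities can cause further fractures.
Osteoporosis worsens with age: about 15% of white people show symptoms from their 50s, rising to 70% over the age of 80, and it is more common in women than in men. In developed countries, screening finds 2%-8% of men and 9%-38% of women to have osteoporosis; the incidence in developing countries remains unclear. In 2010 there were nearly 22 million female and about 5.5 million male patients in Europe, and in the same year about 8 million female and 1-2 million male patients in the United States. Risk factors for osteoporosis include sex (especially female), early menopause, ethnicity (especially white and Asian), a slender bone structure, low body mass index, smoking, heavy drinking, insufficient physical activity, and a family history.
Although osteoporosis in most cases does not directly cause death, it increases the chance of fracture, thereby affecting patients' health and ability to live independently and greatly increasing the burden on social healthcare. Existing bone density screening gives only a superficial picture of changes in bone density and cannot assess a patient's osteoporosis as a whole.
The diagnosis and screening of osteoporosis therefore need further improvement, and further research on osteoporosis biomarkers is urgently needed in the art.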
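Summary of the Invention
This application is based on the inventors' discovery and recognition of the following facts and problems. Gut microbes are the microbial community present in the human intestine, the "second genome" of the human body. The human gut flora and the host form an interrelated whole. Gut microbes not only break down nutrients digested from food, host vitamins, and other nutrients, but also promote the differentiation and maturation of intestinal epithelial cells, thereby activating the intestinal immune system and regulating host energy storage and metabolism; these play an important role in digestion and absorption, immune response, and metabolic activity. The inventors therefore analyzed the gut flora and gene sequences of osteoporosis patients and healthy people to screen for biomarkers highly correlated with osteoporosis. These markers can be used to accurately diagnose osteoporosis or related diseases or to predict disease risk, and to monitor treatment effect.
Accordingly, the object of the present invention is to provide biomarkers for assessing the risk of osteoporosis or for its early diagnosis, together with methods for diagnosing osteoporosis and assessing disease risk, which overcome the shortcomings of existing diagnostic methods: their inability to give early warning or to predict the onset and progression of osteoporosis. The invention can thus be applied to predicting the onset and progression of osteoporosis and to pathological typing of the disease.
Osteoporosis-related biomarkers are believed to be valuable for early diagnosis for the following reasons. First, the markers of the invention are specific and sensitive. Second, stool analysis ensures accuracy, safety, affordability, and patient compliance, and stool samples are transportable. Tests based on the polymerase chain reaction (PCR) are comfortable and non-invasive, so people are more likely to take part in a given screening program. Third, the markers of the invention can also serve as a tool for treatment monitoring in osteoporosis patients, to detect the response to treatment.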
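According to a first aspect, the present invention provides a biomarker. According to an embodiment of the invention, the biomarker comprises at least one selected from the following:
Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, Ruminococcus torques or an analogue thereof, and Dialister invisus or an analogue thereof, wherein each analogue has an alignment similarity of 85% or more to the genome sequence of the corresponding species (Bacteroides thetaiotaomicron, Bacteroides uniformis, Bacteroides intestinalis, Bacteroides dorei, Ruminococcus sp., Akkermansia muciniphila, Parabacteroides merdae, Ruminococcus torques, and Dialister invisus, respectively). All of these biomarkers can serve as biological markers for osteoporosis detection: by determining whether one, two, or more of them are present in a subject's gut flora, one can effectively determine whether the subject has or is susceptible to osteoporosis (that is, predict the risk of osteoporosis), and the biomarkers can further be used to monitor the treatment effect in osteoporosis patients. In addition, when the number of healthy samples is sufficiently large, those skilled in the art can derive, by testing and calculation, the normal value or normal range of each biomarker in the intestine, indicating the content of each marker in healthy samples; detecting the content of at least one of these biomarkers in the gut flora of a sample can then determine whether the subject has or is susceptible to osteoporosis, and can be used to monitor the treatment effect in osteoporosis patients. Those skilled in the art will also appreciate that when certain gene sequences of an unknown microorganism, or of some nucleic acid source, have an alignment similarity of 85% or more to the gene sequences of a known strain, the microorganism can be considered to belong to the same genus as that strain, or the gene sequences can be assigned to that genus; since microorganisms of the same genus usually have the same or similar functions, these analogues can also be used as markers of osteoporosis.
In the present invention, alignment similarity refers to the proportion of identical bases or amino acid residues between the target sequence (the sequence to be determined) and the reference sequence (the known sequence) in a sequence alignment.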
According to an embodiment of the invention, the Bacteroides intestinalis is Bacteroides intestinalis DSM 17393, the Ruminococcus sp. is Ruminococcus sp. 5_1_39BFAA, the Akkermansia muciniphila is Akkermansia muciniphila ATCC BAA-835, the Parabacteroides merdae is Parabacteroides merdae ATCC 43184, the Ruminococcus torques is Ruminococcus torques L2-14, and the Dialister invisus is Dialister invisus DSM 15470. As representative strains of their species, these biomarkers can all be used to indicate the disease state or disease risk of osteoporosis or osteoporosis-related diseases.
According to an embodiment of the invention, each of the analogues of Bacteroides thetaiotaomicron, Bacteroides uniformis, Bacteroides intestinalis, Bacteroides dorei, Ruminococcus sp., Akkermansia muciniphila, Parabacteroides merdae, Ruminococcus torques, and Dialister invisus has an alignment similarity of 95% or more to the genome sequence of the corresponding species. Those skilled in the art will appreciate that when the gene sequences of an unknown microorganism, or of some nucleic acid source, have an alignment similarity of 95% or more to a known strain, the microorganism can be considered the same species as that strain, or the gene sequences can be assigned to that species. Thus, those skilled in the art can obtain nucleic acid sequence information directly from a subject and align it against the genome sequences of these strains; a sequence similarity of 95% or more can serve as an indication of whether the subject has or is susceptible to osteoporosis.
According to an embodiment of the invention, when an analogue has an alignment coverage of 80% or more and an alignment similarity of 85% or more to the genome sequence of the corresponding bacterium, it can be considered to belong to the same genus and can serve as a marker of osteoporosis. Preferably, when the alignment coverage is 80% or more and the alignment similarity is 95% or more, the analogue can be considered the same species as the corresponding bacterium and can serve as a marker of osteoporosis.
In the present invention, alignment coverage refers to the proportion of the total length of the sequence under test that is accounted for by the portion of the target sequence actually aligned to the reference sequence.
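The two definitions above, together with the 80% coverage and 85%/95% similarity cutoffs, can be illustrated with a short sketch. It is a simplified reading rather than the behavior of any particular aligner: the function names are hypothetical, the alignment is assumed to be given as two equal-length strings with "-" for gaps, and counting gap columns as mismatches is an assumption.

```python
def alignment_similarity(aligned_target, aligned_reference):
    """Fraction of aligned columns with identical residues.

    Assumption: gap columns ('-') count as mismatches.
    """
    if not aligned_target or len(aligned_target) != len(aligned_reference):
        raise ValueError("aligned sequences must be non-empty and equal in length")
    same = sum(
        1 for a, b in zip(aligned_target, aligned_reference) if a == b and a != "-"
    )
    return same / len(aligned_target)


def alignment_coverage(aligned_length, target_length):
    """Fraction of the target sequence that takes part in the alignment."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return aligned_length / target_length


def analogue_rank(similarity, coverage):
    """Apply the 80% coverage and 85% / 95% similarity cutoffs from the text."""
    if coverage < 0.80:
        return None
    if similarity >= 0.95:
        return "same species"
    if similarity >= 0.85:
        return "same genus"
    return None


# Hypothetical 8-column alignment with one mismatch.
sim = alignment_similarity("ACGTACGT", "ACGTACGA")
cov = alignment_coverage(aligned_length=80, target_length=100)
rank = analogue_rank(sim, cov)
```

Real genome comparisons aggregate many local alignments, so similarity and coverage would normally be taken from the aligner's report rather than recomputed this way.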
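According to a second aspect, the present invention proposes a method for diagnosing whether a subject has osteoporosis or a related disease, or for predicting the subject's risk of osteoporosis or a related disease. According to an embodiment of the invention, the method comprises the steps of: (1) collecting a sample from the subject; (2) determining relative abundance information of the biomarker according to the first aspect of the invention in the sample obtained in step (1); and (3) comparing the relative abundance information from step (2) with a reference data set or a reference value. The method is not limited to disease diagnosis in the sense of patent law; it can also serve non-diagnostic purposes such as scientific research or enriching personal genetic information and genetic information databases. The relative abundance information of each biomarker in the subject is compared with the reference data set or reference value to determine whether the subject has osteoporosis or a related disease, or to predict the risk thereof.
In the present invention, the reference data set refers to the relative abundance information of each biomarker obtained from samples of individuals already diagnosed as diseased and of healthy individuals, used as a reference for the relative abundance of each biomarker. In one embodiment of the invention, the reference data set is the training data set. According to the invention, the training set and the validation set have the meanings well known in the art. In one embodiment, the training set is a data set containing the content of each biomarker in samples from a given number of osteoporosis subjects and non-osteoporosis subjects. The validation set is an independent data set used to test the performance of the training set.
In the present invention, the reference value refers to the reference value or normal value of healthy controls. Those skilled in the art know that, when the sample size is large enough, the range of normal values (absolute values) of each biomarker in a sample can be obtained by detection and calculation methods well known in the art. When an assay is used to measure biomarker levels, the absolute biomarker level in the sample can be compared directly with the reference value to assess disease risk and to diagnose, or diagnose early, osteoporosis or a related disease; optionally, statistical methods may be included.
In the present invention, an osteoporosis-related disease means a disease associated with osteoporosis, including earlier symptoms or diseases that can lead to osteoporosis, and subsequent or concurrent symptoms or diseases caused by osteoporosis.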
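According to embodiments of the invention, the method may further have the following additional technical features:
According to an embodiment of the invention, the reference data set comprises relative abundance information of the biomarkers in samples from a plurality of osteoporosis patients and a plurality of healthy controls, the biomarkers being biomarkers according to the first aspect of the invention.
According to an embodiment of the invention, the step of comparing the relative abundance information from step (2) with the reference data set further comprises executing a multivariate statistical model to obtain a disease probability. A multivariate statistical model allows fast and efficient detection.
According to an embodiment of the invention, the multivariate statistical model is a random forest model.
According to an embodiment of the invention, a disease probability greater than a threshold indicates that the subject has osteoporosis or a related disease, or is at risk of osteoporosis or a related disease.
According to an embodiment of the invention, the threshold is 0.5.
According to an embodiment of the invention, when compared with a reference value, a decrease in Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof indicates that the subject has, or is at risk of, osteoporosis or a related disease; an increase in Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques or an analogue thereof likewise indicates that the subject has, or is at risk of, osteoporosis or a related disease.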
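The direction-of-change rule in the preceding paragraph can be written down as a small lookup. The sketch below is illustrative only: the function name, the dictionary-based inputs, and all abundance and reference values are hypothetical, and the code simply lists the markers whose change matches the stated direction without combining them into a diagnosis.

```python
# Markers whose decrease, per the text, points toward osteoporosis.
DECREASED_IN_OSTEOPOROSIS = {
    "Akkermansia muciniphila",
    "Parabacteroides merdae",
    "Dialister invisus",
}
# Markers whose increase, per the text, points toward osteoporosis.
INCREASED_IN_OSTEOPOROSIS = {
    "Bacteroides thetaiotaomicron",
    "Bacteroides uniformis",
    "Bacteroides intestinalis",
    "Bacteroides dorei",
    "Ruminococcus sp.",
    "Ruminococcus torques",
}


def markers_consistent_with_osteoporosis(sample, reference):
    """Return the markers whose change versus the reference matches the rule.

    sample, reference: dict species name -> relative abundance.
    """
    flagged = []
    for name, value in sample.items():
        ref = reference.get(name)
        if ref is None:
            continue  # no reference value available for this marker
        if name in DECREASED_IN_OSTEOPOROSIS and value < ref:
            flagged.append(name)
        elif name in INCREASED_IN_OSTEOPOROSIS and value > ref:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical sample and reference values.
sample = {
    "Akkermansia muciniphila": 0.001,
    "Bacteroides dorei": 0.05,
    "Dialister invisus": 0.02,
}
reference = {
    "Akkermansia muciniphila": 0.01,
    "Bacteroides dorei": 0.02,
    "Dialister invisus": 0.01,
}
flagged = markers_consistent_with_osteoporosis(sample, reference)
```

In this toy case Dialister invisus is higher than its reference, which is the opposite of the direction associated with osteoporosis, so it is not flagged.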
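According to an embodiment of the invention, the relative abundance information of the biomarker in step (2) is obtained by a sequencing method, which further comprises: separating a nucleic acid sample from the sample of the subject; constructing a DNA library from the obtained nucleic acid sample; sequencing the DNA library to obtain sequencing results; and aligning the sequencing results against a reference gene set to determine the relative abundance information of the biomarker. According to one embodiment, at least one of SOAP2 and MAQ can be used to align the sequencing results against the reference gene set, which improves the efficiency of the alignment and, in turn, of osteoporosis detection. According to an embodiment of the invention, several (at least two) biomarkers can be detected at the same time, which improves the efficiency of osteoporosis detection.
According to an embodiment of the invention, the reference gene set is obtained by performing metagenomic sequencing on samples from a plurality of osteoporosis patients and a plurality of healthy controls to obtain a non-redundant gene set, and then merging that non-redundant gene set with a gut microbial gene set. The reference gene set in the present invention may be an existing gene set, such as a published gut microbial reference gene set; it may also be obtained by metagenomic sequencing of samples from a plurality of osteoporosis patients and healthy controls followed by merging as just described, in which case the reference gene set is more comprehensive and the detection results are more reliable.
In the present invention, a non-redundant gene set is to be understood as commonly understood by those skilled in the art; put simply, it is the set of genes remaining after redundant genes are removed. Redundant genes usually refer to multiple copies of one gene occurring on a chromosome.
According to an embodiment of the invention, the sample is a stool sample.
According to an embodiment of the invention, the sequencing is performed by a second-generation or third-generation sequencing method. The means of sequencing is not particularly limited; second- or third-generation sequencing allows fast and efficient sequencing.
According to an embodiment of the invention, the sequencing is performed with at least one selected from Hiseq2000, SOLiD, 454, and single-molecule sequencing devices. The high throughput and sequencing depth of these devices benefit the subsequent analysis of the sequencing data, in particular the precision and accuracy of statistical testing.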
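According to a third aspect, the present invention proposes a kit comprising reagents for detecting biomarkers, the biomarkers comprising the biomarkers according to the first aspect of the invention. The kit can be used to determine the relative abundance of these markers in the gut flora; the resulting relative abundance values can then be used to determine whether a subject has or is susceptible to osteoporosis, and to monitor the treatment effect in osteoporosis patients.
According to an embodiment of the invention, the kit comprises a reference data set or reference values, used as a reference for the relative abundance of each biomarker. Preferably, the reference data set or reference values are supplied on a physical carrier, for example an optical disc such as a CD-ROM.
According to an embodiment of the invention, the kit further comprises a first computer program product for obtaining the reference data set or reference values, that is, a set of reference data or reference values for diagnosing whether a subject has osteoporosis or a related disease or predicting the subject's risk thereof.
According to an embodiment of the invention, the kit further comprises a second computer program product, which can be used to execute the method according to the second aspect of the invention for diagnosing whether a subject has osteoporosis or a related disease or predicting the subject's risk thereof.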
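According to a fourth aspect, the present invention proposes the use of a biomarker in the preparation of a kit, the kit being used to diagnose whether a subject has osteoporosis or a related disease or to predict the subject's risk of osteoporosis or a related disease.
According to an embodiment of the invention, the diagnosis or prediction comprises the following steps: 1) collecting a sample from the subject; 2) determining relative abundance information of the biomarker in the sample obtained in step 1), the biomarker being a biomarker according to the first aspect of the invention; and 3) comparing the relative abundance information from step 2) with a reference data set or a reference value. The kit can be used to determine the relative abundance of these markers in the gut flora; the resulting relative abundance values can then be used to determine whether a subject has or is susceptible to osteoporosis, and to monitor the treatment effect in osteoporosis patients.
According to embodiments of the invention, the above use of the biomarker in the preparation of a kit may further have the following additional technical features:
According to an embodiment of the invention, the reference data set comprises relative abundance information of the biomarkers in samples from a plurality of osteoporosis patients and a plurality of healthy controls, the biomarkers being biomarkers according to the first aspect of the invention.
According to an embodiment of the invention, the step of comparing the relative abundance information from step 2) with the reference data set further comprises executing a multivariate statistical model to obtain a disease probability; preferably, the multivariate statistical model is a random forest model.
According to an embodiment of the invention, a disease probability greater than a threshold indicates that the subject has osteoporosis or a related disease, or is at risk of osteoporosis or a related disease; preferably, the threshold is 0.5.
According to an embodiment of the invention, when compared with a reference value, a decrease in Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof indicates that the subject has, or is at risk of, osteoporosis or a related disease; an increase in Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques or an analogue thereof likewise indicates that the subject has, or is at risk of, osteoporosis or a related disease.
According to an embodiment of the invention, the relative abundance information of the biomarker in step 2) is obtained by a sequencing method, which further comprises: separating a nucleic acid sample from the sample of the subject; constructing a DNA library from the obtained nucleic acid sample; sequencing the DNA library to obtain sequencing results; and aligning the sequencing results against a reference gene set to determine the relative abundance information of the biomarker.
According to an embodiment of the invention, the reference gene set is obtained by performing metagenomic sequencing on samples from a plurality of osteoporosis patients and a plurality of healthy controls to obtain a non-redundant gene set, and then merging that non-redundant gene set with a gut microbial gene set.
According to an embodiment of the invention, the sample is a stool sample.
According to an embodiment of the invention, the sequencing is performed by a second-generation or third-generation sequencing method.
According to an embodiment of the invention, the sequencing is performed with at least one selected from Hiseq2000, SOLiD, 454, and single-molecule sequencing devices.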
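According to a fifth aspect, the present invention proposes the use of a biomarker as a target for screening drugs for treating or preventing osteoporosis or a related disease. According to an embodiment of the invention, the biomarker comprises the biomarker according to the first aspect of the invention. According to an embodiment of the invention, the effect of a candidate drug on these biomarkers before and after use can be used to determine whether the candidate can be used to treat or prevent osteoporosis.
According to a sixth aspect, the present invention proposes the use of a biomarker in diagnosing whether a subject has osteoporosis or a related disease or in predicting the subject's risk of osteoporosis or a related disease. According to an embodiment of the invention, the biomarker comprises the biomarker according to the first aspect of the invention.
According to a seventh aspect, the present invention proposes a drug for preventing or treating osteoporosis or a related disease. According to an embodiment of the invention, the drug increases the relative abundance in the subject of Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof; or the drug decreases the relative abundance of Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques or an analogue thereof.
The beneficial effects of the invention are as follows. Feces are a product of human metabolism; they contain not only human metabolites but also gut microbes closely related to changes in the body's metabolism, immunity, and other functions. Studying feces reveals clear differences in the composition of the gut flora between osteoporosis patients and healthy people, allowing accurate risk assessment and early diagnosis of osteoporosis. By comparing and analyzing the gut flora of osteoporosis patients and healthy people, the invention obtains a number of associated gut strains; combined with high-quality MLGs from osteoporosis and non-osteoporosis populations as a training set, these allow accurate risk assessment and early diagnosis. Compared with the diagnostic methods in common use, the approach is convenient and fast.
Additional aspects and advantages of the invention will be set forth in part in the description below, and in part will become apparent from that description or be learned through practice of the invention.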
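Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken with the accompanying drawings, in which:
Figure 1 shows, according to one embodiment of the invention, the difference in gene counts between osteoporosis patients and healthy controls at the gene level; the gut flora composition of the osteoporosis group differs from that of the healthy group.
Figure 2 shows, according to one embodiment of the invention, the distribution of error rates over 5 rounds of 10-fold cross-validation in the random forest classifier.
Figure 3 shows, according to one embodiment of the invention, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) for the training set of osteoporosis patients and healthy controls, based on the random forest model (9 intestinal markers).
Figure 4 shows, according to one embodiment of the invention, the ROC curve and the AUC for the validation set of healthy controls and osteoporosis patients (healthy: 7; diseased: 7), based on the random forest model (9 intestinal markers).
Figure 5 is a structural diagram, according to one embodiment of the invention, of the apparatus for determining whether a subject has osteoporosis or a related disease or predicting the subject's risk thereof; panel a shows the apparatus and panel b shows the biomarker relative abundance determining device within it.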
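Detailed Description
Embodiments of the invention are described in detail below, with examples shown in the accompanying drawings. The embodiments described with reference to the drawings are exemplary; they are intended to explain the invention and are not to be construed as limiting it.
Note that the explanations of terms provided herein are intended only to help those skilled in the art better understand the invention and do not limit it.
To address the shortcomings of existing diagnostic methods for osteoporosis, which cannot give early warning or predict the onset and progression of the disease, the present invention proposes biomarkers for assessing the risk of osteoporosis or for its early diagnosis, together with methods for diagnosing osteoporosis and assessing disease risk, which can predict the onset and progression of osteoporosis and be applied to pathological typing of the disease.
Biomarkers
According to a first aspect, the present invention proposes biomarkers for osteoporosis.
The terms used herein have the meanings commonly understood by those of ordinary skill in the relevant art. For a better understanding of the invention, however, some definitions and related terms are explained below:
According to the invention, the term "osteoporosis" refers to a disease in which reduced bone density raises the risk of fracture. Osteoporosis can be diagnosed clinically, or by reference to the diagnostic criteria recommended by the World Health Organization (WHO), in which bone density measured by DXA is usually expressed as a T-score: T-score = (measured value - peak bone mass of normal adults of the same sex and ethnicity) / standard deviation of bone density in normal adults. For example, a T-score ≤ -2.5 can be taken to indicate osteoporosis.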
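The T-score definition above translates directly into code. The sketch below is a worked illustration only: the function names are hypothetical, and the bone density values in the example are made-up numbers, not reference data for any population.

```python
def t_score(measured_bmd, peak_mean_bmd, peak_sd):
    """T-score = (measured value - reference peak bone mass) / reference SD."""
    if peak_sd <= 0:
        raise ValueError("the reference standard deviation must be positive")
    return (measured_bmd - peak_mean_bmd) / peak_sd


def meets_t_score_criterion(t, cutoff=-2.5):
    """True when the T-score is at or below the cutoff given in the text."""
    return t <= cutoff


# Hypothetical values in g/cm^2: measured 0.5, reference peak 1.0, SD 0.125.
t = t_score(measured_bmd=0.5, peak_mean_bmd=1.0, peak_sd=0.125)
osteoporotic = meets_t_score_criterion(t)
```

The reference peak and standard deviation must come from a population matched for sex and ethnicity, as the definition states.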
根据本发明,生物标志物质的水平通过相对丰度指示。
根据本发明,术语“生物标志物”,也称为“生物学标志物”,是指个体的生物状态的可测量指标。这样的生物标记物可以是在个体中的任何物质,只要它们与被检个体的特定生物状态(例如,疾病)有关系,例如,核酸标志物(也可以称为基因标志物,例如DNA), 蛋白质标志物,细胞因子标记物,趋化因子标记物,碳水化合物标志物,抗原标志物,抗体标志物,物种标志物(种/属的标记)和功能标志物(KO/OG标记)等。其中,核酸标志物的含义并不局限于现有可以表达为具有生物活性的蛋白质的基因,还包括任何核酸片段,可以为DNA,也可以为RNA,可以是经过修饰的DNA或者RNA,也可以是未经修饰的DNA或者RNA或其组合。在本文中核酸标志物有时也可以称为特征片段。在本发明中,生物标志物也可以用“肠道标志物”来替代,因为本发明所发现的与骨质疏松症密切相关的几种生物标志物均存在于受试者的肠道内。生物标记物经过测量和评估,经常用以检查正常生物过程,致病过程,或治疗干预药理响应,而且在许多科学领域都是有用的。
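According to embodiments of the invention, high-throughput sequencing can be used to analyze fecal samples from healthy people and osteoporosis patients in batches. Based on the high-throughput sequencing data, the healthy group and the osteoporosis group are compared to identify specific nucleic acid sequences associated with the osteoporosis group. Briefly, the steps are as follows:
Sample collection and processing: fecal samples are collected from healthy people and osteoporosis patients, and DNA is extracted with a kit to obtain nucleic acid samples;
Library construction and sequencing: DNA libraries are constructed and sequenced by high-throughput sequencing to obtain the nucleic acid sequences of the gut microorganisms contained in the fecal samples;
Bioinformatic analysis is used to identify specific gut microbial nucleic acid sequences associated with osteoporosis patients. First, the sequencing reads are aligned to a reference gene set (which may be a newly constructed gene set or any database of known sequences, for example a known non-redundant gene set of the human gut microbial community). Next, based on the alignment results, the relative abundance of each gene is determined in the nucleic acid samples from the fecal samples of the healthy group and of the osteoporosis group. Aligning the reads to the reference gene set maps each read to a gene in the set, so that, for a given gene in a nucleic acid sample, the number of reads mapped to it effectively reflects that gene's relative abundance. The relative abundance of genes in a nucleic acid sample can therefore be determined from the alignment results by conventional statistical analysis. Finally, once the relative abundance of each gene has been determined, the relative abundances in the samples from healthy people and from osteoporosis patients are tested statistically to determine whether any gene differs significantly in relative abundance between the two groups; a gene that differs significantly is regarded as a biomarker of the abnormal state, i.e. a nucleic acid marker.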
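As an illustration of how read counts can be turned into gene relative abundances, here is a minimal sketch. It is not taken from the original text: dividing each count by gene length before normalizing is an assumption (the text only states that the number of mapped reads reflects relative abundance), and the function and variable names are hypothetical.

```python
def gene_relative_abundance(read_counts, gene_lengths):
    """Convert per-gene read counts into relative abundances that sum to 1.

    Assumption: each count is divided by gene length so that long genes
    are not favoured; the source text does not spell out this step.
    """
    per_base = {g: read_counts[g] / gene_lengths[g] for g in read_counts}
    total = sum(per_base.values())
    if total == 0:
        raise ValueError("no reads mapped to any gene")
    return {g: value / total for g, value in per_base.items()}


# Toy input: two genes with equal read counts but different lengths.
counts = {"gene_a": 100, "gene_b": 100}
lengths = {"gene_a": 1000, "gene_b": 500}
print(gene_relative_abundance(counts, lengths))
```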
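In addition, a known or newly constructed reference gene set usually contains taxonomic information and functional annotation for its genes. Once gene relative abundances have been determined, the genes can therefore be grouped by taxonomic information and functional annotation to determine the species relative abundance and functional relative abundance of the microorganisms in the gut microbiota, and thus to identify species markers and functional markers of the abnormal state. Briefly, the method for determining species markers and functional markers further comprises: aligning the sequencing reads of healthy people and osteoporosis patients to the reference gene set; determining, from the alignment results, the species relative abundance and functional relative abundance of the genes in the nucleic acid samples of the two groups; statistically testing the species relative abundance and functional relative abundance between the nucleic acid samples of the two groups; and identifying the species markers and functional markers whose relative abundance differs significantly between the nucleic acid samples of healthy people and osteoporosis patients. According to embodiments of the invention, species relative abundance and functional relative abundance can be computed by summarizing the relative abundances of genes from the same species, or of genes with the same functional annotation, for example by summing, averaging, or taking the median.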
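The aggregation step described above (summing, averaging, or taking the median of the gene abundances that share a species or function label) can be sketched as follows. This is an illustrative sketch only; the dictionary layout and the names `summarize_by_label` and `AGGREGATORS` are assumptions, not part of the original disclosure.

```python
from collections import defaultdict
from statistics import mean, median

# The three summaries named in the text.
AGGREGATORS = {"sum": sum, "mean": mean, "median": median}


def summarize_by_label(gene_abundance, gene_to_label, how="sum"):
    """Group gene abundances by species (or function) label and summarize."""
    grouped = defaultdict(list)
    for gene, value in gene_abundance.items():
        label = gene_to_label.get(gene)
        if label is not None:  # genes without annotation are skipped
            grouped[label].append(value)
    aggregate = AGGREGATORS[how]
    return {label: aggregate(values) for label, values in grouped.items()}


# Toy data with hypothetical gene identifiers.
abundance = {"g1": 0.25, "g2": 0.5, "g3": 0.125, "g4": 0.125}
labels = {"g1": "Bacteroides dorei", "g2": "Bacteroides dorei", "g3": "Dialister invisus"}
print(summarize_by_label(abundance, labels, how="sum"))
```

The same function serves for functional abundance if the label map points genes to KO or OG identifiers instead of species names.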
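Finally, the biomarkers whose relative abundance differs significantly between the fecal samples of healthy people and osteoporosis patients were determined, namely the microbial species: Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, Ruminococcus torques or an analogue thereof, and Dialister invisus or an analogue thereof. Detecting whether at least one of these microorganisms is present therefore makes it possible to determine effectively whether a subject has or is susceptible to osteoporosis, and can be used to monitor the treatment response of osteoporosis patients. The term “presence” as used herein is to be understood broadly: it may mean qualitatively analyzing whether a sample contains the target, quantitatively analyzing the target in the sample, or further the result of statistically analyzing, or applying any known mathematical operation to, the quantitative result together with a reference (for example, a quantitative result obtained by parallel testing of samples of known status). Those skilled in the art can readily choose among these according to need and experimental conditions. According to embodiments of the invention, determining the relative abundance of these microorganisms in the gut microbiota also makes it possible to determine whether a subject has or is susceptible to osteoporosis and to monitor the treatment response of osteoporosis patients.
Detection may address whether at least one of the above microbial species is present in the subject's gut microbiota, or whether two or more of them are present, i.e. whether a combination of the above biomarkers is present, so as to determine effectively whether the subject has or is susceptible to osteoporosis; it can also be used to monitor the treatment response of osteoporosis patients. Herein, the term “biomarker combination” refers to a combination of two or more biomarkers.
For species markers and functional markers, those skilled in the art can also use conventional strain identification and biological activity assays to determine whether the species and functions are present in the gut microbiota. For example, strains can be identified by 16S rRNA analysis.
According to one specific embodiment of the invention, the use of a reagent in the preparation of a kit is provided, the kit being for diagnosing whether a subject has osteoporosis or a related disease or for predicting the risk that a subject has osteoporosis or a related disease, and the reagent being for detecting the biomarkers of the invention.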
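Device for detecting whether a subject has, or predicting whether a subject has, osteoporosis or a related disease
According to yet another aspect of the invention, a device is provided for detecting whether a subject has, or predicting whether a subject has, osteoporosis or a related disease, as shown in Figure 5. According to embodiments of the invention, the device comprises a sample collection apparatus 100, a biomarker relative abundance determination apparatus 200, and a disease probability determination apparatus 300 (Figure 5a). The sample collection apparatus is adapted to collect a sample from the subject. The biomarker relative abundance determination apparatus is connected to the sample collection apparatus and is adapted to determine relative abundance information for the biomarkers in the sample obtained, the biomarkers being those of the first aspect of the invention. The disease probability determination apparatus is connected to the biomarker relative abundance determination apparatus and compares the relative abundance information obtained by that apparatus with a reference dataset or a reference value.
According to one specific embodiment of the invention, the reference dataset comprises relative abundance information for the biomarkers of the first aspect of the invention in samples from multiple osteoporosis patients and multiple healthy controls.
According to one specific embodiment of the invention, the disease probability determination apparatus further executes a multivariate statistical model to obtain a disease probability; preferably, the multivariate statistical model is a random forest model. According to a preferred embodiment of the invention, a disease probability greater than a threshold indicates that the subject has, or is at risk of having, osteoporosis or a related disease; preferably, the threshold is 0.5. According to a preferred embodiment of the invention, when compared with a reference value, a decrease in Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof indicates that the subject has, or is at risk of having, osteoporosis or a related disease; and an increase in Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques L2-14 or an analogue thereof indicates that the subject has, or is at risk of having, osteoporosis or a related disease.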
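The comparison against reference values described above can be pictured with a small sketch. It is illustrative only: the text does not say how reference values are derived or how per-marker results are combined, so this code merely flags each marker whose abundance moves in the direction the text associates with osteoporosis. The function name and the numeric values are hypothetical.

```python
# Markers the text reports as decreased in osteoporosis.
DEPLETED = {"Akkermansia muciniphila", "Parabacteroides merdae", "Dialister invisus"}
# Markers the text reports as increased in osteoporosis.
ENRICHED = {
    "Bacteroides thetaiotaomicron", "Bacteroides uniformis",
    "Bacteroides intestinalis", "Bacteroides dorei",
    "Ruminococcus sp.", "Ruminococcus torques",
}


def direction_flags(sample, reference):
    """Return {marker: True/False}; True means the shift matches the reported direction."""
    flags = {}
    for marker, value in sample.items():
        if marker in DEPLETED:
            flags[marker] = value < reference[marker]
        elif marker in ENRICHED:
            flags[marker] = value > reference[marker]
    return flags


# Hypothetical relative abundances and reference values.
sample = {"Akkermansia muciniphila": 0.001, "Bacteroides dorei": 0.05}
reference = {"Akkermansia muciniphila": 0.01, "Bacteroides dorei": 0.02}
print(direction_flags(sample, reference))
```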
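According to one specific embodiment of the invention, the biomarker relative abundance determination apparatus further comprises a nucleic acid sample isolation unit 210, a sequencing unit 220, and an alignment unit 230 (Figure 5b). According to embodiments of the invention, the nucleic acid sample isolation unit is adapted to isolate a nucleic acid sample from the subject's sample; the sequencing unit is connected to the nucleic acid sample isolation unit and, from the nucleic acid sample obtained, constructs a DNA library and sequences it to obtain a sequencing result; the alignment unit is connected to the sequencing unit and aligns the sequencing result to a reference gene set to determine the relative abundance information of the biomarkers.
According to one specific embodiment of the invention, the reference gene set is obtained by performing metagenomic sequencing on samples from multiple osteoporosis patients and multiple healthy controls to obtain a non-redundant gene set, and then merging that non-redundant gene set with a gut microbial gene set.
According to embodiments of the invention, the sequencing unit is not particularly limited. Preferably, the sequencing unit uses a second-generation or third-generation sequencing method. Preferably, the sequencing unit is at least one selected from HiSeq 2000, SOLiD, 454, and single-molecule sequencing devices. The high throughput and sequencing depth of these devices benefit the subsequent analysis of the sequencing data, in particular the precision and accuracy of statistical testing.
According to one embodiment of the invention, the alignment unit performs the alignment with at least one selected from SOAP2 and MAQ. This improves the efficiency of alignment and, in turn, the efficiency of detecting osteoporosis.
In addition, according to embodiments of the invention, a drug screening method is provided. Markers closely associated with osteoporosis serve as drug design targets for screening, which promotes the discovery of new drugs for treating osteoporosis. For example, the change in biomarker level before and after contact with a candidate drug can be measured to determine whether the candidate can serve as a drug for treating or preventing osteoporosis: one checks whether the level of a harmful biomarker falls, and whether the level of a beneficial biomarker rises, after contact with the candidate. Candidate compounds can also be screened by determining the direct or indirect effect of the drug on the biological activity of at least one of the biomarkers. Accordingly, according to embodiments of the invention, the use of the osteoporosis biomarkers in screening drugs for treating or preventing osteoporosis is also provided.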
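It should be understood that, within the scope of the invention, the technical features described above and those described specifically below (e.g. in the examples) can be combined with one another to form new or preferred technical solutions. For brevity, they are not enumerated one by one here.
The invention is described below with reference to specific examples. Note that these examples are merely illustrative and are not to be construed as limiting the invention.
Unless otherwise specified, the technical means used in the examples are conventional means well known to those skilled in the art, and the reagents and products used are commercially available. Processes and methods not described in detail are conventional methods known in the art. The source and trade name of each reagent, and its composition where this needs to be listed, are indicated at first occurrence; unless otherwise stated, the same reagent used thereafter is as first indicated.
The invention uses metagenome-wide association study (MWAS) analysis: fecal samples are sequenced to analyze microbiota composition and functional differences, and a random forest discriminant model is used to distinguish the osteoporosis group from the non-osteoporosis group and to obtain a disease probability, which is used for osteoporosis risk assessment, diagnosis, and early diagnosis, or for finding potential drug targets.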
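According to the invention, the term “MLG” refers to a metagenomic linkage group (Qin J, Li Y, Cai Z, et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature, 2012, 490(7418): 55-60), which serves here as an operational taxonomic unit: in phylogenetic or population genetic studies, it is a uniform label assigned to a taxonomic unit (strain, species, genus, group, etc.) for ease of analysis. Sequences are usually divided into different MLGs according to a similarity threshold, and each MLG is usually regarded as one microbial species. If more than 50% of the sequences in an MLG align to a known microbial species at 95% nucleotide similarity, the MLG is assigned to that species; if more than 50% of the sequences in an MLG align to a known microbial genus at 85% nucleotide similarity, the MLG is annotated at the level of that genus.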
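The two-tier annotation rule for an MLG (more than 50% of its sequences at 95% similarity for a species call, otherwise more than 50% at 85% for a genus call) can be sketched as below. This is a hedged illustration: the input layout (one best hit per gene as a `(species, genus, identity)` tuple) and the function name are assumptions, and "at 95% similarity" is read here as "identity of at least 95%".

```python
from collections import Counter


def annotate_mlg(gene_hits, n_genes):
    """Annotate one MLG from per-gene best hits.

    gene_hits: list of (species, genus, identity_percent) tuples.
    n_genes:   total number of genes in the MLG (the denominator of the 50% rule).
    """
    def majority(index, min_identity):
        votes = Counter(hit[index] for hit in gene_hits if hit[2] >= min_identity)
        if not votes:
            return None
        taxon, count = votes.most_common(1)[0]
        return taxon if count / n_genes > 0.5 else None

    species = majority(0, 95.0)
    if species is not None:
        return ("species", species)
    genus = majority(1, 85.0)
    if genus is not None:
        return ("genus", genus)
    return ("unclassified", None)


# Toy MLG of 10 genes: 6 hit one species at 98% identity.
hits = [("Bacteroides dorei", "Bacteroides", 98.0)] * 6 + \
       [("Bacteroides uniformis", "Bacteroides", 90.0)] * 4
print(annotate_mlg(hits, n_genes=10))
```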
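According to the invention, the term “individual” refers to an animal, in particular a mammal such as a primate, preferably a human.
According to the invention, terms such as “a”, “an”, and “the” refer not only to a singular entity but include the general class that may be used to illustrate a particular embodiment.
In the invention, the sequencing (second-generation sequencing) and MWAS are well known in the art, and those skilled in the art can adjust them to the circumstances. According to embodiments of the invention, they can be carried out by the method described in (Wang, Jun, and Huijue Jia. "Metagenome-wide association studies: fine-mining the microbiome." Nature Reviews Microbiology 14.8 (2016): 508-522).
In the invention, the use of random forest models and ROC curves is well known in the art, and those skilled in the art can set and adjust the parameters to the circumstances. According to embodiments of the invention, the methods described in the following references, incorporated herein by reference in their entirety, can be used: Drogan D, Dunn WB, Lin W, Buijsse B, Schulze MB, Langenberg C, Brown M, Floegel A, Dietrich S, Rolandsson O, Wedge DC, Goodacre R, Forouhi NG, Sharp SJ, Spranger J, Wareham NJ, Boeing H: Untargeted Metabolic Profiling Identifies Altered Serum Metabolites of Type 2 Diabetes Mellitus in a Prospective, Nested Case Control Study. Clin Chem 2015, 61:487-497; Mihalik SJ, Michaliszyn SF, de las Heras J, Bacha F, Lee S, Chace DH, DeJesus VR, Vockley J, Arslanian SA: Metabolomic profiling of fatty acid and amino acid metabolism in youth with obesity and type 2 diabetes: evidence for enhanced mitochondrial oxidation. Diabetes Care 2012, 35:605-611.
In the invention, a training set of biomarkers from osteoporosis subjects and non-osteoporosis subjects is constructed and used as the baseline for evaluating the biomarker content values of a test sample.
In the invention, non-osteoporosis subjects are subjects in good mental condition.
In the invention, the subject may be a human or a model animal.
Those skilled in the art know that, as the sample size is further enlarged, sample testing and calculation methods known in the art can yield the normal range (in absolute values) of each biomarker in a sample. The absolute value of a measured biomarker content can be compared with the normal content value, optionally in combination with statistical methods, to arrive at an osteoporosis risk assessment, diagnosis, and the like.
Without wishing to be bound by any theory, the inventors note that these biomarkers are gut microbiota present in the human body. Association analysis of subjects' gut microbiota by the method of the invention shows that, in the osteoporosis group, these biomarkers fall within certain content ranges in microbiota testing.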
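Example 1
1.1 Sample collection:
Following the method described in A metagenome-wide association study of gut microbiota in type 2 diabetes (Qin, J. et al. Nature 490, 55–60 (2012)), fecal samples were collected, transported frozen, and quickly transferred to -80°C for storage; DNA was then extracted to obtain DNA samples.
1.2 Metagenomic sequencing and filtering
Sequencing libraries were constructed from the extracted DNA samples, and single-end metagenomic sequencing (read length 100 bp) was performed on the Illumina HiSeq 2000 platform. The sequencing data were quality-controlled by removing adapter-contaminated sequences, low-quality sequences, and host genome contamination.
1.3 Gene set alignment
The high-quality reads from “1.2 Metagenomic sequencing and filtering” were aligned to the gut reference gene set to obtain gene relative abundances (see Qin, J. et al. 2012).
1.4 Taxonomic annotation and abundance calculation
The predicted genes were assigned taxonomically by alignment against the IMG (v400) database, following the method described in A metagenome-wide association study of gut microbiota in type 2 diabetes (Qin, J. et al. Nature 490, 55–60 (2012)). For assignment at the phylum level, the cutoff was an alignment similarity of at least 65% and an alignment coverage of at least 70%. For assignment at the genus level, the cutoff was an alignment similarity of at least 85%. For assignment at the species or strain level, the cutoff was an alignment similarity of at least 95%.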
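The cutoffs in section 1.4 amount to a small decision rule. The sketch below is one reading of them, not the authors' code: the text states the coverage requirement only for the phylum level, so coverage is applied only there, which is an assumption; the function name is hypothetical.

```python
def finest_rank(identity, coverage):
    """Return the finest taxonomic rank an alignment supports, or None.

    identity and coverage are percentages. Thresholds follow section 1.4:
    species/strain >= 95% identity, genus >= 85% identity,
    phylum >= 65% identity with >= 70% coverage.
    """
    if identity >= 95.0:
        return "species"
    if identity >= 85.0:
        return "genus"
    if identity >= 65.0 and coverage >= 70.0:
        return "phylum"
    return None


for ident, cov in [(97.2, 90.0), (88.0, 40.0), (70.0, 75.0), (70.0, 50.0)]:
    print(ident, cov, finest_rank(ident, cov))
```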
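Then, following the method described in A metagenome-wide association study of gut microbiota in type 2 diabetes (Qin, J. et al. Nature 490, 55–60 (2012)), the relative abundance of each species was calculated from the gene relative abundances, and the Wilcoxon rank-sum test (p<0.05) was used to identify species whose relative abundance differed significantly between cases and controls.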
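For readers unfamiliar with the rank-sum test, the following is a minimal, self-contained sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney U) test. It uses the normal approximation with a tie correction and no continuity correction; that choice is an assumption made for brevity, and the original analysis may have used an exact or differently corrected variant. The function name and toy data are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Return (U statistic for x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the values, remembering each value's original position.
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1      # mean of the 1-based ranks i+1 .. j+1
        t = j - i + 1                   # size of this tie group
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = avg_rank
        i = j + 1
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:                        # all values identical
        return u1, 1.0
    z = (u1 - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p


# Toy abundances: cases are uniformly lower than controls.
u, p = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u, p < 0.05)
```

With many species tested at once, a multiple-testing adjustment would normally be considered; the text states only the p<0.05 criterion.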
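1.5 Biomarker abundance calculation
Genes were clustered by abundance (see Qin, J. et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55–60 (2012)), and MLGs with more than 50 clustered genes were selected for taxonomic annotation. The abundance of each MLG was taken as the median abundance of its genes, and the MLGs whose relative abundance differed significantly between cases and controls were identified.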
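The MLG abundance rule in section 1.5 (keep MLGs with more than 50 genes, then take the median gene abundance) can be sketched as follows. The clustering itself is not shown; the MLG membership is assumed to be given. Treating a gene with no recorded abundance as zero is an assumption, and the names are hypothetical.

```python
from statistics import median


def mlg_abundance(mlg_genes, gene_abundance, min_genes=50):
    """Median gene abundance per MLG, for MLGs with more than min_genes genes."""
    result = {}
    for mlg, genes in mlg_genes.items():
        if len(genes) <= min_genes:
            continue  # the text keeps only MLGs with more than 50 genes
        # Assumption: genes absent from the sample count as abundance 0.
        result[mlg] = median(gene_abundance.get(g, 0.0) for g in genes)
    return result


# Toy example: one MLG with 51 genes, one with 50 (excluded).
big = [f"g{i}" for i in range(51)]
small = [f"h{i}" for i in range(50)]
abundance = {g: float(i) for i, g in enumerate(big)}
print(mlg_abundance({"MLG_big": big, "MLG_small": small}, abundance))
```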
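1.6 Screening potential biomarkers of osteoporosis onset and progression with random forest (ROC/AUC)
To further screen for potential gut biomarkers of the disease, this example constructed a training set of biomarkers from osteoporosis subjects and non-osteoporosis subjects and used it as the baseline for evaluating the biomarker content values of test samples. In the invention, “training set” and “validation set” have the meanings well known in the art. In embodiments of the invention, the training set is the data set containing the content of each biomarker in the samples of a given number of osteoporosis subjects and non-osteoporosis subjects. The validation set is an independent data set used to test the performance of the model built on the training set. Non-osteoporosis subjects are subjects in good mental condition; subjects may be humans or model animals, and in this example the experiments were carried out on human subjects.
The specific steps are as follows:
From 91 samples (47 healthy people and 44 osteoporosis patients), 37 samples were randomly selected from the 44 disease samples and 40 from the 47 normal samples, giving 77 samples (37 osteoporosis and 40 normal) as the training set (Table 1); the remaining samples (7 osteoporosis patients and 7 normal people) formed the validation set.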
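The random split described above (37 of 44 cases and 40 of 47 controls for training, the remaining 7 and 7 for validation) can be reproduced in outline as follows. This is a sketch with hypothetical sample identifiers and an arbitrary seed; the original sampling procedure and seed are not given in the text.

```python
import random


def split_cohort(case_ids, control_ids, n_train_cases=37, n_train_controls=40, seed=0):
    """Randomly split cases (label 1) and controls (label 0) into training and validation sets."""
    rng = random.Random(seed)
    train_cases = set(rng.sample(list(case_ids), n_train_cases))
    train_controls = set(rng.sample(list(control_ids), n_train_controls))
    train, valid = [], []
    for sample_id in case_ids:
        (train if sample_id in train_cases else valid).append((sample_id, 1))
    for sample_id in control_ids:
        (train if sample_id in train_controls else valid).append((sample_id, 0))
    return train, valid


# Hypothetical identifiers standing in for the 44 cases and 47 controls.
cases = [f"OS{i:02d}" for i in range(44)]
controls = [f"H{i:02d}" for i in range(47)]
train, valid = split_cohort(cases, controls)
print(len(train), len(valid))
```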
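1.6.1 Validating the selected biomarkers with the training set data
First, the relative abundances for each training sample were calculated as described in 1.4-1.5. The MLGs with more than 50 genes in the training set were then input to a random forest (randomForest 4.6-12 in R 3.2.5, RF) classifier. The classifier was subjected to five rounds of 10-fold cross-validation, with 10 repeats. The MLG relative abundances selected by the RF model were used to compute each individual's risk of osteoporosis (Figure 2), a receiver operating characteristic (ROC) curve was plotted, and the area under the curve (AUC) was calculated as the measure of the discriminant model's performance. The combination with fewer than 30 markers and the best discriminatory performance was chosen as the combination of the invention. The model outputs the selection frequency of each MLG; the higher the frequency, the more important that marker is for distinguishing osteoporosis from non-osteoporosis.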
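To make the resampling scheme concrete, the sketch below generates the index splits for five rounds of 10-fold cross-validation over 77 training samples. It covers only the partitioning, not the random forest itself (the original analysis used the R randomForest package); the function name and seed are hypothetical, and the folds here are not stratified by disease status, which is a simplification.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal shuffled indices into folds of near-equal size.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for k, test_idx in enumerate(folds):
            train_idx = [i for f, fold in enumerate(folds) if f != k for i in fold]
            yield repeat, k, train_idx, test_idx


splits = list(repeated_kfold(77))
print(len(splits))  # 5 repeats x 10 folds
```

Each split would be used to fit the classifier on `train_idx` and record the error on `test_idx`; the error distribution over all splits corresponds to what Figure 2 is described as showing.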
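The results show that the RF classifier obtained contains 9 biomarkers; their relative abundances are given in Table 1 and their details in Table 2. Figure 2 shows the distribution of error rates over five rounds of 10-fold cross-validation of the random forest classifier. In Figure 2, the thick black curve is the mean over the 5 trials with 10 repeats, and the vertical line marks the number of MLGs in the best combination selected. Figure 3 shows the receiver operating characteristic (ROC) curve and area under the curve (AUC) for the training set when the random forest model (9 biomarkers) is used to distinguish osteoporosis patients from healthy controls; specificity is the probability of correctly calling a non-diseased subject, and sensitivity is the probability of correctly calling a diseased subject. The discriminatory performance on the training set samples was AUC = 95.92%, 95% confidence interval (CI) = 86.4%-100% (Figure 3). The results indicate that the marker combination obtained from the model can serve as potential biomarkers for distinguishing osteoporosis from non-osteoporosis.
In Table 3: the marker gene set size is the number of nucleic acid sequences included in each marker; the number of annotated genes is how many of those genes are annotated to the marker; the best annotation is the taxonomic assignment obtained by aligning all genes of each marker against the IMG (v400) database; the best-annotation gene proportion is the proportion of genes in the gene cluster that are annotated to that species; the best-annotation similarity is the mean annotation accuracy of all genes in the cluster that are annotated to that species; the enrichment direction is the change in each biomarker's relative abundance between osteoporosis patients and healthy controls, where OS<=H means that the relative abundance is lower in osteoporosis patients than in healthy controls and H<=OS means that it is higher in osteoporosis patients than in healthy controls; the selection frequency is the frequency with which the biomarker was selected across the five rounds of 10-fold cross-validation; the validation set AUC is how well the model obtained from the training set data discriminates the validation set data; and a 95% confidence interval (95% CI) from a to b means that, for each biomarker, the value lies between a and b at a 95% confidence level, with a 5% probability of error.
As the enrichment direction column of Table 3 shows, compared with healthy controls, osteoporosis patients had lower relative abundances of Akkermansia muciniphila ATCC BAA-835, Parabacteroides merdae ATCC 43184, and Dialister invisus DSM 15470, and higher relative abundances of Bacteroides thetaiotaomicron, Bacteroides uniformis, Bacteroides intestinalis DSM 17393, Bacteroides dorei, Ruminococcus sp. 5_1_39BFAA, and Ruminococcus torques L2-14.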
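1.6.2 Validating the selected biomarkers with the validation set data
The model was then validated on an independent population; a disease probability (RP) ≥0.5 predicts that an individual is at risk of or has osteoporosis. First, the relative abundance of each biomarker in each validation sample was calculated as described in 1.4-1.5. The random forest model was then applied to the validation set data as described in 1.6.1.
Based on this model:
Figure 4 shows the receiver operating characteristic (ROC) curve and area under the curve (AUC) for the validation set when the random forest model (9 biomarkers) is used to distinguish osteoporosis patients from healthy controls. Based on the 9 markers, the model's AUC on the independent validation set (7 osteoporosis, 7 healthy controls) was 94.33% (95% CI = 87.13%-99.4%) (Figure 4, Table 3, Table 4-1, Table 4-2, Table 5); based on Ruminococcus sp. 5_1_39BFAA alone, the validation set AUC was 0.9082, with high specificity.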
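The AUC reported above has a direct probabilistic reading: it is the chance that a randomly chosen case receives a higher predicted probability than a randomly chosen control. A minimal sketch of that calculation follows; the labels and scores are invented for illustration and are not the values behind Figure 4.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = osteoporosis, 0 = healthy control.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))
```

With a validation set of 7 cases and 7 controls there are only 49 case-control pairs, so the AUC can change in steps of about 0.02, which is worth keeping in mind when reading the confidence interval.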
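Random forest classification and regression were performed with the “randomForest 4.6-12 package” in R version 3.2.5. The inputs were the training set data (the relative abundances of the selected MLG markers in the training samples, see Table 1), the sample disease status (a vector for the training samples, with ‘1’ for osteoporosis and ‘0’ for healthy), and a validation set (the relative abundances of the selected MLG markers in the validation set, see Tables 4-1 and 4-2). The inventors then used the random forest function of the random forest package in R to build the classification and prediction functions and to predict the validation set data; the output is the prediction (disease probability). The threshold is 0.5: if the disease probability is ≥0.5, the individual is considered to be at risk of or to have osteoporosis.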
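The final decision step (a disease probability of 0.5 or more is called positive) and the sensitivity and specificity mentioned with Figure 3 can be sketched as follows. The probabilities are invented; in the original workflow they would come from the R random forest model, which is not reproduced here. Function names are hypothetical.

```python
def predict_status(probability, threshold=0.5):
    """1 = at risk of or having osteoporosis, 0 = not, using the >= threshold rule."""
    return 1 if probability >= threshold else 0


def sensitivity_specificity(labels, probabilities, threshold=0.5):
    """Sensitivity = correct calls among cases; specificity = correct calls among controls."""
    calls = [predict_status(p, threshold) for p in probabilities]
    n_pos = labels.count(1)
    n_neg = labels.count(0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    tp = sum(1 for l, c in zip(labels, calls) if l == 1 and c == 1)
    tn = sum(1 for l, c in zip(labels, calls) if l == 0 and c == 0)
    return tp / n_pos, tn / n_neg


# Invented probabilities for two cases and two controls.
labels = [1, 1, 0, 0]
probs = [0.9, 0.5, 0.49, 0.7]
print(sensitivity_specificity(labels, probs))
```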
[Tables 1 through 4-2 appear as images in the original publication: Figure PCTCN2018084276-appb-000001 through Figure PCTCN2018084276-appb-000009.]
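Table 5. Probability of being at risk of or having osteoporosis, as predicted by the random forest model (based on the 9 gut markers and on Ruminococcus sp. 5_1_39BFAA, respectively) for osteoporosis and healthy control samples (a disease probability >=0.5 identifies an individual as being at risk of or having osteoporosis).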
Figure PCTCN2018084276-appb-000010
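The above results show that the biomarkers disclosed herein have high accuracy and specificity and good prospects for development into a diagnostic method, thereby providing a basis for osteoporosis risk assessment, diagnosis, early diagnosis, and the search for potential drug targets.
In the invention, the terms “first” and “second” are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. A feature qualified by “first” or “second” may thus explicitly or implicitly include at least one such feature. In the description of the invention, “multiple” means at least two, for example two or three, unless specifically defined otherwise.
In the invention, unless expressly specified and defined otherwise, terms such as “mounted”, “joined”, “connected”, and “fixed” are to be understood broadly: a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or communicative; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements, unless expressly defined otherwise. Those of ordinary skill in the art can understand the specific meaning of these terms in the invention according to the circumstances.
In the invention, unless expressly specified and defined otherwise, a first feature being “on” or “under” a second feature may mean that the two features are in direct contact or in indirect contact through an intermediate medium. Moreover, a first feature being “over”, “above”, or “on top of” a second feature may mean that the first feature is directly or obliquely above the second feature, or merely that the first feature is at a greater height than the second. A first feature being “below”, “beneath”, or “underneath” a second feature may mean that the first feature is directly or obliquely below the second feature, or merely that the first feature is at a lower height than the second.
In this specification, a description referring to the terms “one embodiment”, “some embodiments”, “example”, “specific example”, or “some examples” means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the invention have been shown and described above, it will be understood that they are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, substitute, and vary the above embodiments within the scope of the invention.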

Claims (34)

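  1. An osteoporosis biomarker, characterized by comprising at least one selected from the following:
    Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, Ruminococcus torques or an analogue thereof, Dialister invisus or an analogue thereof,
    wherein the Bacteroides thetaiotaomicron analogue has an alignment similarity of 85% or more with the genome sequence of Bacteroides thetaiotaomicron,
    the Bacteroides uniformis analogue has an alignment similarity of 85% or more with the genome sequence of Bacteroides uniformis,
    the Bacteroides intestinalis analogue has an alignment similarity of 85% or more with the genome sequence of Bacteroides intestinalis,
    the Bacteroides dorei analogue has an alignment similarity of 85% or more with the genome sequence of Bacteroides dorei,
    the Ruminococcus sp. analogue has an alignment similarity of 85% or more with the genome sequence of Ruminococcus sp.,
    the Akkermansia muciniphila analogue has an alignment similarity of 85% or more with the genome sequence of Akkermansia muciniphila,
    the Parabacteroides merdae analogue has an alignment similarity of 85% or more with the genome sequence of Parabacteroides merdae,
    the Ruminococcus torques analogue has an alignment similarity of 85% or more with the genome sequence of Ruminococcus torques,
    and the Dialister invisus analogue has an alignment similarity of 85% or more with the genome sequence of Dialister invisus.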
  2. The biomarker according to claim 1, characterized in that the Bacteroides intestinalis is Bacteroides intestinalis DSM 17393, the Ruminococcus sp. is Ruminococcus sp. 5_1_39BFAA, the Akkermansia muciniphila is Akkermansia muciniphila ATCC BAA-835, the Parabacteroides merdae is Parabacteroides merdae ATCC 43184, the Ruminococcus torques is Ruminococcus torques L2-14, and the Dialister invisus is Dialister invisus DSM 15470.
  3. The biomarker according to claim 1, characterized in that the Bacteroides thetaiotaomicron analogue has an alignment similarity of 95% or more with the genome sequence of Bacteroides thetaiotaomicron,
    the Bacteroides uniformis analogue has an alignment similarity of 95% or more with the genome sequence of Bacteroides uniformis,
    the Bacteroides intestinalis analogue has an alignment similarity of 95% or more with the genome sequence of Bacteroides intestinalis,
    the Bacteroides dorei analogue has an alignment similarity of 95% or more with the genome sequence of Bacteroides dorei,
    the Ruminococcus sp. analogue has an alignment similarity of 95% or more with the genome sequence of Ruminococcus sp.,
    the Akkermansia muciniphila analogue has an alignment similarity of 95% or more with the genome sequence of Akkermansia muciniphila,
    the Parabacteroides merdae analogue has an alignment similarity of 95% or more with the genome sequence of Parabacteroides merdae,
    the Ruminococcus torques analogue has an alignment similarity of 95% or more with the genome sequence of Ruminococcus torques,
    and the Dialister invisus analogue has an alignment similarity of 95% or more with the genome sequence of Dialister invisus.
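  4. A method for diagnosing whether a subject has osteoporosis or a related disease, or for predicting the risk that a subject has osteoporosis or a related disease, characterized by comprising:
    (1) collecting a sample from the subject;
    (2) determining relative abundance information for the biomarker according to any one of claims 1 to 3 in the sample obtained in step (1);
    (3) comparing the relative abundance information of step (2) with a reference dataset or a reference value;
    preferably, the reference dataset comprises relative abundance information for the biomarker according to any one of claims 1 to 3 in samples from multiple osteoporosis patients and multiple healthy controls.
  5. The method according to claim 4, characterized in that the step of comparing the relative abundance information of step (2) with the reference dataset further comprises executing a multivariate statistical model to obtain a disease probability.
  6. The method according to claim 5, characterized in that the multivariate statistical model is a random forest model.
  7. The method according to claim 5, characterized in that a disease probability greater than a threshold indicates that the subject has, or is at risk of having, osteoporosis or a related disease.
  8. The method according to claim 7, characterized in that the threshold is 0.5.
  9. The method according to claim 4, characterized in that, when compared with a reference value, a decrease in Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof indicates that the subject has, or is at risk of having, osteoporosis or a related disease; and an increase in Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques or an analogue thereof indicates that the subject has, or is at risk of having, osteoporosis or a related disease.
  10. The method according to claim 4, characterized in that the relative abundance information of the biomarker in step (2) is obtained by a sequencing method, further comprising:
    isolating a nucleic acid sample from the sample of the subject,
    constructing a DNA library from the nucleic acid sample obtained and sequencing the DNA library to obtain a sequencing result,
    and aligning the sequencing result to a reference gene set to determine the relative abundance information of the biomarker,
    wherein the reference gene set is obtained by performing metagenomic sequencing on samples from multiple osteoporosis patients and multiple healthy controls to obtain a non-redundant gene set, and then merging the non-redundant gene set with a gut microbial gene set.
  11. The method according to claim 10, characterized in that the sample is a fecal sample.
  12. The method according to claim 10, characterized in that the sequencing method is a second-generation sequencing method or a third-generation sequencing method.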
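  13. A kit, characterized by comprising a reagent for detecting the biomarker according to any one of claims 1 to 3.
  14. The kit according to claim 13, characterized in that the kit comprises at least one of the following:
    a reference dataset or reference value, the reference dataset or reference value serving as a reference for the relative abundance of each biomarker.
  15. The kit according to claim 14, characterized in that the kit further comprises a first computer program product for obtaining the reference dataset or reference value.
  16. The kit according to claim 13, characterized in that the kit further comprises a second computer program product for executing the method according to claim 4 for diagnosing whether a subject has osteoporosis or a related disease or for predicting the risk that a subject has osteoporosis or a related disease.
  17. Use of the biomarker according to any one of claims 1 to 3 in the preparation of a kit, the kit being for diagnosing whether a subject has osteoporosis or a related disease or for predicting the risk that a subject has osteoporosis or a related disease.
  18. The use according to claim 17, characterized in that the diagnosis or prediction comprises the following steps:
    1) collecting a sample from the subject;
    2) determining relative abundance information for the biomarker according to any one of claims 1 to 3 in the sample obtained in step 1);
    3) comparing the relative abundance information of step 2) with a reference dataset or a reference value.
  19. The use according to claim 18, characterized in that the reference dataset comprises relative abundance information for biomarkers in samples from multiple osteoporosis patients and multiple healthy controls, the biomarkers being the biomarker according to any one of claims 1 to 3.
  20. The use according to claim 17, characterized in that the step of comparing the relative abundance information of step 2) with the reference dataset further comprises executing a multivariate statistical model to obtain a disease probability.
  21. The use according to claim 20, characterized in that the multivariate statistical model is a random forest model.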
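  22. The use according to claim 20, characterized in that a disease probability greater than a threshold indicates that the subject has, or is at risk of having, osteoporosis or a related disease.
  23. The use according to claim 22, characterized in that the threshold is 0.5.
  24. The use according to claim 18, characterized in that, when compared with a reference value, a decrease in Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof indicates that the subject has, or is at risk of having, osteoporosis or a related disease; and an increase in Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques or an analogue thereof indicates that the subject has, or is at risk of having, osteoporosis or a related disease.
  25. The use according to claim 18, characterized in that the relative abundance information of the biomarker in step 2) is obtained by a sequencing method, further comprising:
    isolating a nucleic acid sample from the sample of the subject,
    constructing a DNA library from the nucleic acid sample obtained and sequencing the DNA library to obtain a sequencing result,
    and aligning the sequencing result to a reference gene set to determine the relative abundance information of the biomarker.
  26. The use according to claim 25, characterized in that the reference gene set is obtained by performing metagenomic sequencing on samples from multiple osteoporosis patients and multiple healthy controls to obtain a non-redundant gene set, and then merging the non-redundant gene set with a gut microbial gene set.
  27. Use of a biomarker as a target for screening drugs for treating or preventing osteoporosis or a related disease, wherein the biomarker comprises the biomarker according to any one of claims 1 to 3.
  28. Use of a biomarker in diagnosing whether a subject has osteoporosis or a related disease or in predicting the risk that a subject has osteoporosis or a related disease, wherein the biomarker comprises the biomarker according to any one of claims 1 to 3.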
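  29. A device for detecting whether a subject has, or predicting whether a subject has, osteoporosis or a related disease, characterized by comprising:
    a sample collection apparatus adapted to collect a sample from the subject;
    a biomarker relative abundance determination apparatus, connected to the sample collection apparatus and adapted to determine relative abundance information for biomarkers in the sample obtained, the biomarkers comprising the biomarker according to any one of claims 1 to 3;
    a disease probability determination apparatus, connected to the biomarker relative abundance determination apparatus and configured to compare the relative abundance information of the biomarkers obtained by the relative abundance determination apparatus with a reference dataset or a reference value.
  30. The device according to claim 29, characterized in that the reference dataset comprises relative abundance information for the biomarker according to any one of claims 1 to 3 in samples from multiple osteoporosis patients and multiple healthy controls.
  31. The device according to claim 29, characterized in that the disease probability determination apparatus further executes a multivariate statistical model to obtain a disease probability.
  32. The device according to claim 29, characterized in that the biomarker relative abundance determination apparatus further comprises:
    a nucleic acid sample isolation unit adapted to isolate a nucleic acid sample from the sample of the subject;
    a sequencing unit, connected to the nucleic acid sample isolation unit, which constructs a DNA library from the nucleic acid sample obtained and sequences the DNA library to obtain a sequencing result;
    an alignment unit, connected to the sequencing unit, which aligns the sequencing result to a reference gene set to determine the relative abundance information of the biomarker.
  33. The device according to claim 32, characterized in that the reference gene set is obtained by performing metagenomic sequencing on samples from multiple osteoporosis patients and multiple healthy controls to obtain a non-redundant gene set, and then merging the non-redundant gene set with a gut microbial gene set.
  34. A medicament, characterized in that the medicament is for preventing or treating osteoporosis or a related disease, and the medicament increases the relative abundance, in the subject, of Akkermansia muciniphila or an analogue thereof, Parabacteroides merdae or an analogue thereof, or Dialister invisus or an analogue thereof; or the medicament decreases the relative abundance of Bacteroides thetaiotaomicron or an analogue thereof, Bacteroides uniformis or an analogue thereof, Bacteroides intestinalis or an analogue thereof, Bacteroides dorei or an analogue thereof, Ruminococcus sp. or an analogue thereof, or Ruminococcus torques or an analogue thereof.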
PCT/CN2018/084276 2018-04-24 2018-04-24 骨质疏松生物标志物及其用途 WO2019204985A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880092711.2A CN112384634B (zh) 2018-04-24 2018-04-24 骨质疏松生物标志物及其用途
PCT/CN2018/084276 WO2019204985A1 (zh) 2018-04-24 2018-04-24 骨质疏松生物标志物及其用途

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/084276 WO2019204985A1 (zh) 2018-04-24 2018-04-24 骨质疏松生物标志物及其用途

Publications (1)

Publication Number Publication Date
WO2019204985A1 true WO2019204985A1 (zh) 2019-10-31

Family

ID=68294361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/084276 WO2019204985A1 (zh) 2018-04-24 2018-04-24 骨质疏松生物标志物及其用途

Country Status (2)

Country Link
CN (1) CN112384634B (zh)
WO (1) WO2019204985A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112957376A (zh) * 2019-12-14 2021-06-15 山东大学 一种调节高原人群血糖的微生物组合物及其应用
US11457471B2 (en) 2018-08-09 2022-09-27 Sony Corporation Methods, communications device and infrastructure equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107075446A (zh) * 2014-09-30 2017-08-18 深圳华大基因科技有限公司 用于肥胖症相关疾病的生物标记物

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
HN1996000101A (es) * 1996-02-28 1997-06-26 Inc Pfizer Terapia combinada para la osteoporosis
CN102154450B (zh) * 2010-12-23 2014-07-16 深圳华大基因科技有限公司 一种检测肠炎致病菌的方法
CA2948134C (en) * 2014-05-06 2023-03-14 Is-Diagnostics Ltd. Microbial population analysis

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107075446A (zh) * 2014-09-30 2017-08-18 深圳华大基因科技有限公司 用于肥胖症相关疾病的生物标记物

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUANG, YONGQUAN ET AL.: "Gut Microbiota and Osteoporosis", JOURNAL OF SOUTHERN MEDICAL UNIVERSITY, vol. 37, 31 December 2017 (2017-12-31), pages 278 - 280 *

Also Published As

Publication number Publication date
CN112384634A (zh) 2021-02-19
CN112384634B (zh) 2024-04-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18916426

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18916426

Country of ref document: EP

Kind code of ref document: A1