CN107075562A - Biomarker for obesity-related disorder - Google Patents

Publication number
CN107075562A
Authority
CN
China
Legal status
Granted
Application number
CN201480082372.1A
Other languages
Chinese (zh)
Other versions
CN107075562B (en
Inventor
冯强
张东亚
唐龙清
王俊
Current Assignee
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material

Abstract

Provided are biomarkers and methods for predicting the risk of a microbiota-related disease, in particular obesity or a related disease.

Description

Biomarker for obesity-related disorder
Cross-reference to related applications
None
Technical field
The present invention relates to biomarkers and methods for predicting the risk of a microbiota-related disease, in particular obesity or a related disease.
Background
Obesity is very common in developed countries and has increased dramatically worldwide (de Carvalho Pereira et al., 2013). Between 1980 and 2013, the worldwide combined prevalence of overweight and obesity reportedly rose by 27.5% in adults and by 47.1% in children. The overweight population grew from 857 million in 1980 to 2.1 billion in 2013, of whom 671 million were affected by obesity. More than 50% of these obese individuals live in ten countries; the United States has the largest number of obese people, followed by China (Ng et al., 2014).
Increasing evidence shows that patients who have been diagnosed as overweight are more likely to lose weight than patients who have not. However, low rates of diagnosis by physicians are associated with how often patients are advised on the health risk factors of obesity-related behaviors (Bleich et al., 2011).
In children, obesity is diagnosed using age- and sex-specific body mass index (BMI) cut-off points. This contrasts with adults, in whom the diagnosis is based on BMI without regard to age or sex. Whereas the diagnostic criteria for adults are simple, the minority of obese children are diagnosed under more complex criteria, and the terminology for childhood obesity has itself changed (Walsh et al., 2013). In addition, BMI is considered to have limitations regarding homogeneity across different populations (Nevill et al., 2006). Waist circumference (WC) is therefore regarded as a reliable and useful tool for assessing abdominal obesity in epidemiological studies, but this measurement appears harder to carry out (Miguel-Etayo et al., 2014). Furthermore, a regional study of childhood obesity diagnosis using the International Classification of Diseases, Ninth Revision (ICD-9), the National Ambulatory Medical Care Survey (NAMCS) and the National Hospital Ambulatory Medical Care Survey (NHAMCS) showed low sensitivity of clinical diagnosis (Walsh et al., 2013).
Recent observations suggest that the human gut microbiome may play an important role in obesity. Early reports based on sequencing of amplified 16S rRNA genes showed that the ratio of Firmicutes to Bacteroidetes in fecal samples from 12 obese humans was far higher than in two lean controls (Ley et al., 2006). More recent observational studies using metagenomic sequencing in human obesity have demonstrated reduced bacterial phylogenetic diversity, a relative depletion of Bacteroidetes, and an enrichment of genes involved in carbohydrate and lipid metabolism (Allin and Pedersen, 2014). These correlative findings suggest that changes in the gut microbiota are a causal factor in the pathogenesis of obesity, and hence that features of the gut microbiota might serve as criteria for diagnosing obesity.
In summary, the diagnosis of obesity suffers from many missed opportunities and low sensitivity. A more effective (less biased) assessment of overweight and/or obesity is needed.
Summary of the invention
Embodiments of the present disclosure seek to solve, at least to some extent, at least one of the problems existing in the prior art.
The present invention is based on the following findings of the inventors:
Assessment and characterization of the gut microbiota has become a major research area in human diseases, including obesity. To analyze the gut microbial composition of obese patients, the inventors carried out a metagenome-wide association study (MGWAS) protocol (Qin, J. et al., A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55-60 (2012), incorporated herein by reference) based on deep shotgun sequencing of gut microbial DNA from 158 individuals. The inventors identified and validated 396,100 obesity-associated gene markers. To exploit the potential of the gut microbiota for classifying obesity, the inventors developed a disease classifier system based on 9 gene markers, which were defined as the optimal gene set by the minimum redundancy maximum relevance (mRMR) feature selection method. To evaluate obesity risk intuitively from these 9 gut microbial gene markers, the inventors calculated a health index. The inventors' data provide insight into the characteristics of the gut metagenome associated with obesity risk, an example for future studies of the pathophysiological role of the gut metagenome in other related diseases, and a potential application of a gut-microbiota-based approach to assessing individuals at risk of such diseases.
The gene markers of the gut microbiota are believed to be valuable for improving the early detection of obesity, for the following reasons. First, the markers of the invention are more specific and more sensitive than conventional markers. Second, stool analysis ensures accuracy, safety, affordability and patient compliance, and fecal samples are transportable. The invention therefore relates to a comfortable, non-invasive in vitro method that makes it easier for people to take part in a given screening program. Third, the markers of the invention can also serve as a treatment-monitoring tool for cancer patients, to detect their response to treatment.
In one aspect, the disclosure provides a biomarker set for predicting a microbiota-related disease in a subject, consisting of:
gut biomarkers comprising at least partial sequences of SEQ ID NO: 1 to 9.
According to embodiments of the disclosure, the disease is obesity or a related disease.
Using these biomarkers, certain microbiota-related diseases of a subject can be analyzed; for example, obesity or a related disease can be determined from samples taken from the subject, such as fecal samples.
In another aspect, the disclosure provides a kit for determining the above gene marker set, comprising primers for PCR amplification designed according to DNA sequences as set forth in at least partial sequences of SEQ ID NO: 1 to 9.
In another aspect, the disclosure provides a kit for determining the above gene marker set, comprising one or more probes designed according to the genes set forth in SEQ ID NO: 1 to 9.
In another aspect, the disclosure provides use of the above gene marker set for predicting the risk of obesity or a related disease in a subject, comprising:
(1) collecting a sample j from the subject;
(2) determining relative abundance information for each of SEQ ID NO: 1 to 9 in the DNA of the sample; and
(3) calculating the index of sample j, denoted I_j, according to the following formula:
A_ij is the relative abundance of marker i in sample j, where i refers to each gene marker in the gene marker set;
N is a first subset of the selected biomarkers related to the abnormal condition, consisting of all markers enriched in patients,
M is a second subset of the selected biomarkers related to the abnormal condition, consisting of all markers enriched in controls,
|N| and |M| are the numbers of biomarkers in the first subset and the second subset, respectively,
wherein an index greater than the cutoff value indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.
According to some embodiments of the disclosure, |N| is 5 and |M| is 4.
According to some embodiments of the disclosure, the cutoff value is 0.03519 to 0.1337.
In another aspect, the disclosure provides use of the above gene marker set in the preparation of a kit for predicting the risk of obesity or a related disease in a subject, comprising:
(1) collecting a sample j from the subject;
(2) determining relative abundance information for each of SEQ ID NO: 1 to 9 in the DNA of the sample; and
(3) calculating the index of sample j, denoted I_j, according to the following formula:
A_ij is the relative abundance of marker i in sample j, where i refers to each gene marker in the gene marker set;
N is a first subset of the selected biomarkers related to the abnormal condition, consisting of all markers enriched in patients,
M is a second subset of the selected biomarkers related to the abnormal condition, consisting of all markers enriched in controls,
|N| and |M| are the numbers of biomarkers in the first subset and the second subset, respectively,
wherein an index greater than the cutoff value indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.
According to some embodiments of the disclosure, |N| is 5 and |M| is 4.
According to some embodiments of the disclosure, the cutoff value is 0.03519 to 0.1337.
In another aspect, the disclosure provides a method of diagnosing whether a subject has a microbiota-related abnormal condition or is at risk of developing one, comprising:
determining the relative abundance of the above biomarkers in a sample from the subject, and
determining, based on the relative abundance, whether the subject has a microbiota-related abnormal condition or is at risk of developing one.
According to embodiments of the disclosure, the method comprises:
(1) collecting a sample j from the subject;
(2) determining relative abundance information for each of SEQ ID NO: 1 to 9 in the DNA of the sample; and
(3) calculating the index of sample j, denoted I_j, according to the following formula:
A_ij is the relative abundance of marker i in sample j, where i refers to each gene marker in the gene marker set;
N is a first subset of the selected biomarkers related to the abnormal condition, consisting of all markers enriched in patients,
M is a second subset of the selected biomarkers related to the abnormal condition, consisting of all markers enriched in controls,
|N| and |M| are the numbers of biomarkers in the first subset and the second subset, respectively,
wherein an index greater than the cutoff value indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.
According to some embodiments of the disclosure, |N| is 5 and |M| is 4.
According to some embodiments of the disclosure, the cutoff value is 0.03519 to 0.1337.
According to embodiments of the disclosure, the microbiota-related abnormal condition is obesity or a related disease.
Brief description of the drawings
These and other aspects and advantages of the disclosure will become apparent and more readily understood from the following description taken with reference to the accompanying drawings, in which:
Fig. 1: Distribution of P values from the obesity association analysis, showing a disproportionate over-representation of strongly associated markers at low P values.
Fig. 2: The inventors performed an incremental search among the obesity-associated gene markers using the minimum redundancy maximum relevance (mRMR) method, producing a series of subsets of consecutive sizes. The error rate of each subset was then estimated by leave-one-out cross-validation (LOOCV) of a linear discriminant classifier. The optimal subset (lowest error rate) contained 9 gene markers.
Fig. 3: ROC curve plotted from the obesity index in the training set; AUC = 0.9763.
Fig. 4: ROC curve of the test set (42 samples), plotted from the obesity index of the test set; AUC = 0.9024.
Fig. 5: ROC curve of the test set (22 samples), plotted from the obesity index of the test set; AUC = 0.8462.
Detailed description
Terms used herein have the meanings commonly understood by a person of ordinary skill in the field relevant to the invention. Terms such as "a", "an" and "the" are not intended to refer only to a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but its usage does not limit the invention, except as outlined in the claims.
The invention is further illustrated in the following non-limiting examples. Unless otherwise stated, parts and percentages are by weight and degrees are Celsius. As will be apparent to those skilled in the art, these examples, while representing preferred embodiments of the invention, are given by way of illustration only, and all reagents are commercially available.
Examples
Example 1. Identifying biomarkers for assessing obesity risk
1.1 Sample collection
Fecal samples from 158 Chinese subjects (the training set), comprising 78 obese patients and 80 control subjects, were collected in 2012 by Ruijin Hospital, Shanghai Jiao Tong University School of Medicine. The obese patients were aged 18 to 30 years and had a BMI above 25. Subjects were asked to collect fresh fecal samples at the hospital. Collected samples were placed in sterile tubes and immediately stored at -80 °C until further analysis.
Full ethical approval was obtained, and all patients gave written informed consent. The study was approved by the Institutional Review Board of Ruijin Hospital, Shanghai Jiao Tong University School of Medicine.
1.2 DNA extraction
Fecal samples were thawed on ice, and DNA was extracted using the Qiagen QIAamp DNA Stool Mini Kit (Qiagen) according to the manufacturer's instructions. Extracts were treated with DNase-free RNase to eliminate RNA contamination. DNA quantity was determined using a NanoDrop spectrophotometer, a Qubit fluorometer (with the Quant-iT™ dsDNA BR Assay Kit) and gel electrophoresis.
1.3 DNA library construction and sequencing of fecal samples
DNA libraries were constructed according to the manufacturer's instructions (Illumina; insert size 350 bp, read length 100 bp). The inventors used the same workflow as described previously to perform cluster generation, template hybridization, isothermal amplification, linearization, blocking and denaturation, and hybridization of the sequencing primers. For each sample, the inventors constructed one paired-end (PE) library with an insert size of 350 bp, followed by high-throughput sequencing to obtain about 30 million PE reads of length 2x100 bp. High-quality reads were obtained from the raw Illumina reads by filtering out low-quality reads with ambiguous "N" bases, adapter contamination and human DNA contamination, and by trimming low-quality terminal bases from the reads.
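As an illustration only, the per-read part of the quality filtering described above might look like the following sketch. The function name and every threshold (`min_q`, `max_n`, `min_len`) are hypothetical and are not taken from the text; adapter removal and human-read removal are omitted because they need reference sequences.

```python
def clean_read(seq, quals, min_q=20, max_n=0, min_len=30):
    """Return the trimmed (seq, quals) pair, or None if the read is rejected.

    All thresholds are illustrative assumptions, not values from the source.
    """
    # Trim low-quality bases from the 3' end of the read.
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    seq, quals = seq[:end], quals[:end]
    # Reject reads that are too short after trimming or that still
    # contain more ambiguous "N" bases than allowed.
    if len(seq) < min_len or seq.upper().count("N") > max_n:
        return None
    return seq, quals


# Usage: the last two bases are low quality and are trimmed away.
print(clean_read("ACGTACGT", [30] * 6 + [5, 5], min_len=4))
```

Trimming before counting "N" bases means an ambiguous base in a discarded tail does not cause the whole read to be dropped.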
On the Illumina HiSeq 2000 platform, the inventors generated about 5.9 Gb of high-quality, clean fecal microbiome sequencing data per sample from a total of 158 samples (78 cases and 80 controls) (Table 1).
Table 1. Summary of metagenomic data. The fourth column reports results from the Wilcoxon rank-sum test.
1.4 Metagenomic data processing and analysis
1.4.1 Read alignment
The inventors used the updated human gut gene catalog established by Li, J. et al., An integrated catalog of reference genes in the human gut microbiome. Nat. Biotechnol. (2014) (incorporated herein by reference), and aligned the high-quality reads to this updated catalog with an alignment criterion of identity >= 90%. The average read alignment rate is shown in Table 1. This rate is close to that of the samples in Li, J. et al., 2014, supra, indicating that it is sufficient for further study. After read alignment, the inventors derived the gene profile (9.9 million genes) from the alignment results using the same method as Li, J. et al., 2014, supra.
Taxonomic assignment of genes. The taxonomic assignment of genes was predicted using the in-house pipeline described in the published paper (Li, J. et al., 2014, supra).
1.4.2 Data profile construction
Gene profile. Based on the read alignment results, the inventors calculated relative gene abundances using the same method as described in the published T2D paper (Qin et al., 2012, supra).
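The relative abundance calculation is only cited here, not spelled out. A minimal sketch, assuming the common two-step definition (divide each gene's aligned read count by its length, then scale so that the profile sums to 1), is given below; whether this matches the cited method in every detail is an assumption, and the function and variable names are hypothetical.

```python
def relative_abundance(read_counts, gene_lengths):
    """Length-normalize read counts, then scale the profile to sum to 1.

    read_counts and gene_lengths are dicts keyed by gene id.
    """
    # Step 1: copy number of each gene = aligned reads / gene length.
    copies = {g: read_counts[g] / gene_lengths[g] for g in read_counts}
    total = sum(copies.values())
    if total == 0:
        return {g: 0.0 for g in copies}
    # Step 2: relative abundance = copy number / sum of all copy numbers.
    return {g: c / total for g, c in copies.items()}


# Usage with two hypothetical genes.
profile = relative_abundance({"a": 100, "b": 300}, {"a": 1000, "b": 1500})
print(profile)
```

Length normalization keeps long genes from appearing more abundant simply because they attract more reads.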
1.4.3 Analysis of factors influencing the gut microbiota gene profile
Based on the gene profile, the inventors used permutational multivariate analysis of variance (PERMANOVA) to assess the effects of 6 clinical parameters (age, sex, height, body weight, BMI and obesity status). The analysis was performed with the method implemented in the "vegan" package in R, and permuted p values were obtained from 10,000 permutations. The inventors also corrected for multiple testing using the Benjamini-Hochberg method via "p.adjust" in R to obtain a q value for each test. PERMANOVA identified three significant factors associated with the gut microbiota (based on the gene profile) (q < 0.05, Table 2). The analysis showed that body weight, BMI and obesity status were strongly associated markers, demonstrating that disease (obesity) state is a major determinant of gut microbiota composition.
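The text performs the multiple-testing correction in R. For readers who want to see what the Benjamini-Hochberg adjustment does, the following pure-Python sketch reproduces the standard procedure; the function name and the example p values are illustrative and are not data from this study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(pvalues)
    # Walk from the largest p value to the smallest so that a running
    # minimum enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = m - offset  # 1-based rank of pvalues[i] in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Usage with four hypothetical p values.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```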
Table 2. PERMANOVA analysis based on Euclidean distances of the gene profile. The analysis tested, at q < 0.05, whether clinical parameters and obesity status have a significant effect on the gut microbiota.
1.4.4 Identification of obesity-associated markers
Identification of obesity-associated genes. To determine the association between the metagenomic profile and obesity, a two-tailed Wilcoxon rank-sum test was applied to the profile of 9,879,897 high-frequency genes (after removing genes present in fewer than 10 of the 158 samples). This yielded 396,100 gene markers enriched in either cases or controls, with p value < 0.01 and FDR = 3.8% (Fig. 1).
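The two steps in this paragraph, a prevalence filter followed by a per-gene two-sided rank-sum test, can be sketched as follows. This is an illustration under stated assumptions: the p value uses the tie-corrected normal approximation without continuity correction, so it will not match R's output exactly, and the function names and data layout are hypothetical.

```python
import math


def prevalent_genes(profiles, min_samples=10):
    """Keep genes with non-zero abundance in at least min_samples samples.

    profiles maps gene id -> list of abundances, one per sample.
    """
    return {g: v for g, v in profiles.items()
            if sum(1 for a in v if a > 0) >= min_samples}


def rank_sum_p(x, y):
    """Two-sided rank-sum p value via the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    n = n1 + n2
    values = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and values[j][0] == values[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if values[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values tied: no evidence of a difference
    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))
```

In use, one would call `rank_sum_p` once per gene kept by `prevalent_genes`, passing the case abundances and the control abundances, and then keep the genes whose p value falls below the chosen threshold.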
False discovery rate (FDR) estimation. Rather than a sequential p-value rejection method, the inventors applied the "q value" method proposed in a previous study to estimate the FDR (Storey, J. D. A direct approach to false discovery rates. Journal of the Royal Statistical Society 64, 479-498 (2002), incorporated herein by reference).
Receiver operating characteristic (ROC) analysis. The inventors used ROC analysis to assess the performance of the obesity classification based on metagenomic markers, and drew the ROC curves using the "pROC" package in R.
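The area under the ROC curve (AUC) reported later in this document can be understood as the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below computes it directly from that definition; it is an illustration in Python, not the R code used in the study, and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 = case, 0 = control. Tied pairs count one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Usage: one of the four case/control pairs is ranked the wrong way round.
print(roc_auc([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0]))
```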
1.5 Selecting the 9 optimal markers from the biomarkers (minimum redundancy maximum relevance (mRMR) feature selection framework)
To determine the optimal gene set, the minimum redundancy maximum relevance (mRMR) feature selection method (for details see Peng, H., Long, F. & Ding, C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27, 1226-1238, doi:10.1109/TPAMI.2005.159 (2005), incorporated herein by reference) was used to select from all obesity-associated gene markers. The inventors performed the incremental search using the "sideChannelAttack" package in R and found 158 sequential marker sets. For each sequential set, the inventors estimated the error rate by leave-one-out cross-validation (LOOCV) of a linear discriminant classifier. The optimal marker set was the one with the lowest error rate. In this study, feature selection was performed on a set of 396,100 obesity-associated gene markers. Because it was computationally infeasible to run mRMR on all genes, the inventors first derived a statistically non-redundant gene set by selecting 8,010 genes (q < 0.0005). The inventors then applied the mRMR feature selection method and identified an optimal set (lowest error rate, Fig. 2) of 9 gene biomarkers strongly associated with obesity for obesity classification, shown in Tables 3 and 4. The gene ids are from the published reference gene catalog of Li, J. et al., 2014, supra.
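The selection loop described above (rank the markers, evaluate nested subsets of increasing size by LOOCV, keep the subset with the lowest error rate) can be sketched as below. Two simplifications are assumptions of this sketch and not part of the source: the feature ranking is taken as given rather than computed by mRMR, and a nearest-centroid classifier stands in for the linear discriminant classifier.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_error(x, y):
    """Leave-one-out error rate of the stand-in classifier."""
    errors = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        errors += nearest_centroid_predict(train_x, train_y, x[i]) != y[i]
    return errors / len(x)


def best_prefix(x, y, ranked_features):
    """Evaluate nested subsets (top 1, top 2, ...) of a feature ranking
    and return (features, error) for the smallest subset with the
    lowest LOOCV error."""
    best = None
    for k in range(1, len(ranked_features) + 1):
        cols = ranked_features[:k]
        sub = [[row[c] for c in cols] for row in x]
        err = loocv_error(sub, y)
        if best is None or err < best[1]:
            best = (cols, err)
    return best
```

Because the comparison is strict, ties in error rate are resolved in favor of the smaller subset, which is one reasonable convention when a compact marker panel is the goal.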
Table 3. Enrichment information of the 9 optimal gene markers
Gene id    Enrichment (1 = obesity, 0 = control)
64552      0
1208989    0
2285506    0
3104115    1
3581202    0
5042942    1
5243950    1
6793200    1
7860042    1
Table 4. SEQ ID NO of the 9 optimal gene markers
Gene id    SEQ ID NO
7860042    1
1208989    2
5243950    3
5042942    4
3104115    5
2285506    6
3581202    7
64552      8
6793200    9
1.6 Gut health index (obesity index)
To exploit the potential of the gut microbiota for disease classification, the inventors developed a disease classifier system based on the 9 gene markers defined above. To evaluate disease risk intuitively from these gut microbial gene markers, the inventors calculated a gut health index (obesity index).
To evaluate the effect of the gut metagenome on obesity, the inventors defined and calculated a gut health index for each individual based on the 9 gene markers selected as described above. For each individual sample, the gut health index of sample j, denoted I_j, was calculated according to the following formula:
A_ij is the relative abundance of marker i in sample j;
N is the subset of all patient-enriched markers among the selected biomarkers related to the abnormal condition (that is, the subset of all obesity-enriched markers among the 9 selected gene markers),
M is the subset of all control-enriched markers among the selected biomarkers related to the abnormal condition (that is, the subset of all control-enriched markers among the 9 selected gene markers),
|N| and |M| are the numbers (sizes) of biomarkers in these two subsets, where |N| is 5 and |M| is 4,
and an index greater than the cutoff value indicates that the subject has obesity or is at risk of developing obesity.
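The formula for I_j is not reproduced in this text, so the following sketch only illustrates how the defined quantities (A_ij, N, M, |N|, |M|) could fit together. It assumes one plausible form, the mean abundance over N minus the mean abundance over M. The actual formula may apply a different scaling or transformation, so the cutoff values quoted in this document should not be applied to the output of this sketch. The marker subsets follow Table 3; the function name and the example abundances are hypothetical.

```python
# Marker subsets from Table 3 (1 = obesity-enriched, 0 = control-enriched).
N = [3104115, 5042942, 5243950, 6793200, 7860042]
M = [64552, 1208989, 2285506, 3581202]


def gut_index(abundance, case_markers=N, control_markers=M):
    """ASSUMED form of the index: mean abundance of case-enriched markers
    minus mean abundance of control-enriched markers.

    abundance maps gene id -> relative abundance in one sample;
    markers missing from the dict are treated as abundance 0.
    """
    n_mean = sum(abundance.get(g, 0.0) for g in case_markers) / len(case_markers)
    m_mean = sum(abundance.get(g, 0.0) for g in control_markers) / len(control_markers)
    return n_mean - m_mean


# Usage with hypothetical abundances: a sample dominated by
# obesity-enriched markers gets a positive index.
print(gut_index({3104115: 5e-6, 2285506: 1e-6}) > 0)
```

Dividing each sum by the subset size keeps the two terms comparable even though the subsets have different sizes (5 and 4).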
1.7 Gut-microbiota-based obesity classification
The inventors calculated the obesity index from the relative abundances of the 9 gene markers; it clearly separated the microbiomes of obese patients from those of controls (Table 5). Using the obesity index to classify the 78 obese patient microbiomes against the 80 control microbiomes gave an area under the receiver operating characteristic (ROC) curve of 0.9763 (Fig. 3). At the optimal index cutoff of 0.03519, the true positive rate (TPR) was 0.9487, the false positive rate (FPR) was 0.1 and the error rate was 8.23% (13/158), indicating that the 9 gene markers can be used to classify obese individuals accurately.
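The three performance figures quoted at a given cutoff (TPR, FPR and error rate) all follow from counting how samples fall on either side of the cutoff. A minimal sketch of that bookkeeping is shown below; the function name and the example index values are hypothetical and are not the study's data.

```python
def rates_at_cutoff(indices, labels, cutoff):
    """Return (TPR, FPR, error rate) when index > cutoff is called obese.

    labels: 1 = obese patient, 0 = control.
    """
    tp = fp = fn = tn = 0
    for value, label in zip(indices, labels):
        predicted = value > cutoff
        if label == 1:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    tpr = tp / (tp + fn)              # fraction of patients called obese
    fpr = fp / (fp + tn)              # fraction of controls called obese
    error = (fp + fn) / len(labels)   # fraction of all samples misclassified
    return tpr, fpr, error


# Usage with six hypothetical samples and the cutoff quoted in the text.
print(rates_at_cutoff([0.5, 0.2, 0.01, 0.3, 0.02, 0.0],
                      [1, 1, 1, 0, 0, 0], cutoff=0.03519))
```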
Table 5. Calculated gut health index of the 158 samples (obese patients and non-obese controls)
Example 2. Validating the 9 gene biomarkers in 42 samples (test set)
The inventors validated the discriminatory power of the obesity classifier using another new independent study group, comprising 17 obese patients and 25 non-obese controls collected at Ruijin Hospital, Shanghai Jiao Tong University School of Medicine.
For each sample, DNA was extracted and a DNA library was constructed, followed by high-throughput sequencing as described in Example 1. The inventors calculated the gene abundance profiles of these samples using the same method as described in Qin et al., 2012, supra. The relative abundance of each marker shown in SEQ ID NO: 1-9 was then determined, and the index of each sample was calculated by the following formula:
A_ij is the relative abundance of marker i in sample j;
N is the subset of all patient-enriched markers among the selected biomarkers related to the abnormal condition (that is, the subset of all obesity-enriched markers among the 9 selected gene markers),
M is the subset of all control-enriched markers among the selected biomarkers related to the abnormal condition (that is, the subset of all control-enriched markers among the 9 selected gene markers),
|N| and |M| are the numbers of biomarkers in these two subsets, where |N| is 5 and |M| is 4,
and an index greater than the cutoff value indicates that the subject has obesity or is at risk of developing obesity.
Table 6 shows the calculated index of each sample, and Table 7 shows the relative abundances of the relevant genes in a representative sample, DB78A. In this assessment, at a cutoff of 0.03519 (the optimal index cutoff from the 158 samples above), the error rate was 21.42% (9/42), validating that the 9 gene markers can classify obese individuals. Most obese patients (16/17) were correctly diagnosed as obese. In addition, the ROC curve of the test set was plotted from its obesity index, giving AUC = 0.9024 (Fig. 4). At the optimal cutoff of 0.1337, the true positive rate (TPR) was 0.9412 and the false positive rate (FPR) was 0.24.
Table 6. Calculated gut health index of the 42 samples
Table 7. Gene relative abundances of sample DB78A
Gene id    Relative abundance in DB78A    Enrichment (1 = obesity, 0 = control)
64552      0                              0
1208989    0                              0
2285506    1.46332E-06                    0
3104115    3.47323E-06                    1
3581202    0                              0
5042942    0                              1
5243950    5.26732E-06                    1
6793200    1.06787E-06                    1
7860042    0                              1
Example 3. Validating the 9 gene biomarkers in 22 samples (test set)
The inventors further validated the discriminatory power of the obesity classifier using another 22 samples (Table 8), comprising 9 case samples and 13 control samples (5 samples taken 1 month after surgery and 8 samples taken 3 months after surgery), also collected at Ruijin Hospital, Shanghai Jiao Tong University School of Medicine. Cases are pre-operative samples; controls are samples taken 1 month and 3 months after surgery.
Table 8. Information on the 22 samples
* before: before surgery; 1-M: one month after surgery; 3-M: three months after surgery.
For each sample, DNA was extracted and a DNA library was constructed, followed by high-throughput sequencing as described in Example 1. The inventors calculated the gene abundance profiles of these samples using the same method as described in Qin et al., 2012, supra. The relative abundance of each marker shown in SEQ ID NO: 1-9 was then determined, and the index of each sample was calculated by the following formula:
A_ij is the relative abundance of marker i in sample j;
N is the subset of all patient-enriched markers among the selected biomarkers related to the abnormal condition (that is, the subset of all obesity-enriched markers among the 9 selected gene markers),
M is the subset of all control-enriched markers among the selected biomarkers related to the abnormal condition (that is, the subset of all control-enriched markers among the 9 selected gene markers),
|N| and |M| are the numbers of biomarkers in these two subsets, where |N| is 5 and |M| is 4,
and an index greater than the cutoff value indicates that the subject has obesity or is at risk of developing obesity.
Table 9 shows the index calculated of each sample, and table 10 shows representative sample DB126 related gene Relative abundance.It is (the optimum index critical value in 158 samples above) at 0.03519 in critical value in the analysis and assessment, Error rate is 22.72% (5/22), and checking illustrates that 54 gene markers can sort out obesity individual.And mostly Number obesity patient (8/9) is correctly diagnosed as obesity.In addition, the ROC of test set is painted by the obesity index of test set System, AUC=0.8462 (Fig. 5).At optimal critical value 0.9695, True Positive Rate (TPR) is 0.6667, false positive rate (FPR) For 0.07692.
Table 9. Calculated gut health index of the 22 samples
Table 10. Gene relative abundance of sample DB126
Gene ID DB12 (calculated gene relative abundance) Enrichment (1 = obesity, 0 = control)
64552 0 0
1208989 0 0
2285506 7.99701E-08 0
3104115 6.25943E-05 1
3581202 0 0
5042942 7.19308E-08 1
5243950 5.97579E-07 1
6793200 0 1
7860042 1.52752E-07 1
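As a worked example, the index formula can be applied to the Table 10 values, taking the five markers with enrichment 1 as subset N and the four markers with enrichment 0 as subset M. The arithmetic below is a reader's check with logarithms rounded to three decimals, not a value quoted from Table 9.

$$
\begin{aligned}
\frac{1}{|N|}\sum_{i \in N}\log_{10}\left(A_{ij}+10^{-10}\right) &\approx \frac{-4.203 - 7.142 - 6.224 - 10 - 6.816}{5} \approx -6.877 \\
\frac{1}{|M|}\sum_{i \in M}\log_{10}\left(A_{ij}+10^{-10}\right) &\approx \frac{-10 - 10 - 7.097 - 10}{4} \approx -9.274 \\
I_j &\approx -6.877 - (-9.274) \approx 2.397
\end{aligned}
$$

With these inputs the index is well above both cutoffs quoted in the text (0.03519 and 0.9695). Markers with zero abundance each contribute $\log_{10}(10^{-10}) = -10$.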
Therefore, starting from 396,100 obesity-associated markers, the inventors identified and validated a set of 9 markers using the minimum redundancy-maximum relevance (mRMR) feature selection method. The inventors also established a gut health index that assesses the risk of obesity based on these 9 gut microbial gene markers.
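The idea behind mRMR selection can be illustrated with a small greedy sketch. It is not the inventors' pipeline: mRMR is usually formulated with mutual information, whereas this sketch swaps in absolute Pearson correlation as a simple stand-in for both relevance and redundancy, and the function names, marker names, and data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def mrmr_select(features, labels, k):
    """Greedily pick k features maximising relevance minus mean redundancy.

    relevance  = |correlation(feature, labels)|
    redundancy = mean |correlation(feature, already selected feature)|
    """
    relevance = {name: abs(pearson(vals, labels)) for name, vals in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        best_name, best_score = None, -math.inf
        for name, vals in features.items():
            if name in selected:
                continue
            if selected:
                redundancy = sum(abs(pearson(vals, features[s])) for s in selected) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Hypothetical abundances: gene_b is an exact multiple of gene_a, so it is fully redundant.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "gene_a": [5, 6, 5, 6, 1, 2, 1, 2],
    "gene_b": [10, 12, 10, 12, 2, 4, 2, 4],
    "gene_c": [3, 1, 3, 1, 2, 0, 2, 0],
}
selected = mrmr_select(features, labels, k=2)
print(selected)
```

In this toy data the second pick is the weaker but non-redundant `gene_c` rather than the duplicate of the first pick, which is the behaviour that distinguishes mRMR from ranking markers by relevance alone.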
Although illustrative embodiments have been shown and described, those skilled in the art will understand that the above embodiments are not to be construed as limiting the disclosure, and that changes, substitutions and modifications may be made to the embodiments without departing from the spirit, principles and scope of the invention.

Claims (16)

1. A biomarker set for predicting a microbiota-related disease in a subject, consisting of:
gut biomarkers comprising at least a partial sequence of SEQ ID NO: 1 to 9.
2. The biomarker set according to claim 1, wherein the gut biomarkers consist of SEQ ID NO: 1 to 9.
3. The biomarker set for predicting a microbiota-related disease in a subject according to claim 1 or 2, wherein the disease is obesity or a related disease.
4. A kit for determining the gene marker set according to any one of claims 1 to 3, comprising primers for PCR amplification designed according to DNA sequences comprising at least a partial sequence of SEQ ID NO: 1 to 9.
5. A kit for determining the gene marker set according to any one of claims 1 to 3, comprising one or more probes designed according to genes comprising at least a partial sequence of SEQ ID NO: 1 to 9.
6. Use of the gene marker set according to any one of claims 1 to 3 for predicting the risk of obesity or a related disease in a subject, comprising:
(1) collecting a sample j from the subject;
(2) determining relative abundance information of each of SEQ ID NO: 1 to 9 in the DNA of the sample; and
(3) calculating the index of sample j, denoted $I_j$, according to the following formula:
$$
I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]
$$
$A_{ij}$ is the relative abundance of marker i in sample j, wherein i refers to each gene marker in the gene marker set;
N is a first subset consisting of all patient-enriched markers among the selected biomarkers related to the abnormal condition,
M is a second subset consisting of all control-enriched markers among the selected biomarkers related to the abnormal condition,
|N| and |M| are the numbers of biomarkers in the first subset and the second subset respectively,
wherein
an index greater than a cutoff indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.
7. The use according to claim 6, wherein |N| is 5 and |M| is 4.
8. The use according to claim 7, wherein the cutoff is 0.03519 to 0.1337.
9. Use of the gene marker set according to any one of claims 1 to 3 in the preparation of a kit for predicting the risk of obesity or a related disease in a subject, comprising:
(1) collecting a sample j from the subject;
(2) determining relative abundance information of each of SEQ ID NO: 1 to 9 in the DNA of the sample; and
(3) calculating the index of sample j, denoted $I_j$, according to the following formula:
$$
I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]
$$
$A_{ij}$ is the relative abundance of marker i in sample j, wherein i refers to each gene marker in the gene marker set;
N is a first subset consisting of all patient-enriched markers among the selected biomarkers related to the abnormal condition,
M is a second subset consisting of all control-enriched markers among the selected biomarkers related to the abnormal condition,
|N| and |M| are the numbers of biomarkers in the first subset and the second subset respectively,
wherein
an index greater than a cutoff indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.
10. The use according to claim 9, wherein |N| is 5 and |M| is 4.
11. The use according to claim 10, wherein the cutoff is 0.03519 to 0.1337.
12. A method of diagnosing whether a subject has a microbiota-related abnormal condition or is at risk of developing a microbiota-related abnormal condition, comprising:
determining the relative abundance of the biomarkers according to any one of claims 1 to 3 in a sample from the subject, and
determining, based on the relative abundance, whether the subject has the microbiota-related abnormal condition or is at risk of developing the microbiota-related abnormal condition.
13. The method according to claim 12, comprising:
(1) collecting a sample j from the subject;
(2) determining relative abundance information of each of SEQ ID NO: 1 to 9 in the DNA of the sample; and
(3) calculating the index of sample j, denoted $I_j$, according to the following formula:
$$
I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]
$$
$A_{ij}$ is the relative abundance of marker i in sample j, wherein i refers to each gene marker in the gene marker set;
N is a first subset consisting of all patient-enriched markers among the selected biomarkers related to the abnormal condition,
M is a second subset consisting of all control-enriched markers among the selected biomarkers related to the abnormal condition,
|N| and |M| are the numbers of biomarkers in the first subset and the second subset respectively,
wherein
an index greater than a cutoff indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.
14. The use according to claim 13, wherein |N| is 5 and |M| is 4.
15. The use according to claim 14, wherein the cutoff is 0.03519 to 0.1337.
16. The method according to claim 12, wherein the microbiota-related abnormal condition is obesity or a related disease.
CN201480082372.1A 2014-09-30 2014-09-30 Biomarkers for obesity related diseases Active CN107075562B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/088043 WO2016049917A1 (en) 2014-09-30 2014-09-30 Biomarkers for obesity related diseases

Publications (2)

Publication Number Publication Date
CN107075562A true CN107075562A (en) 2017-08-18
CN107075562B CN107075562B (en) 2021-09-24

Family

ID=55629342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480082372.1A Active CN107075562B (en) 2014-09-30 2014-09-30 Biomarkers for obesity related diseases

Country Status (2)

Country Link
CN (1) CN107075562B (en)
WO (1) WO2016049917A1 (en)

Also Published As

Publication number Publication date
WO2016049917A1 (en) 2016-04-07
CN107075562B (en) 2021-09-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1240980

Country of ref document: HK

CB02 Change of applicant information

Address after: 518083 11F-3, Beishan industrial complex, 146 Beishan Road, Yantian District, Shenzhen, Guangdong

Applicant after: BGI SHENZHEN Co.,Ltd.

Applicant after: BGI SHENZHEN

Address before: 518083 11F-3, Beishan industrial complex, 146 Beishan Road, Yantian District, Shenzhen, Guangdong

Applicant before: BGI SHENZHEN Co.,Ltd.

Applicant before: BGI SHENZHEN

GR01 Patent grant