CN107075562A

CN107075562A - Biomarker for obesity-related disorder

Info

Publication number: CN107075562A
Application number: CN201480082372.1A
Authority: CN
Inventors: 冯强; 张东亚; 唐龙清; 王俊
Original assignee: BGI Shenzhen Co Ltd
Current assignee: BGI Shenzhen Co Ltd
Priority date: 2014-09-30
Filing date: 2014-09-30
Publication date: 2017-08-18
Anticipated expiration: 2034-09-30
Also published as: WO2016049917A1; CN107075562B

Abstract

Provided are biomarkers and methods for predicting the risk of a disease related to microbiota, in particular obesity or a related disease.

Description

Biomarker for obesity-related disorder

Cross-reference to related applications

None

Technical field

The present invention relates to biomarkers and methods for predicting the risk of a disease related to microbiota, in particular obesity or a related disease.

Background art

Obesity is highly prevalent in developed countries and is increasing dramatically worldwide (de Carvalho Pereira et al., 2013). It has been reported that between 1980 and 2013 the worldwide combined prevalence of overweight and obesity rose by 27.5% in adults and by 47.1% in children. The overweight population increased from 857 million in 1980 to 2.1 billion in 2013, of whom 671 million were affected by obesity. More than 50% of these obese individuals live in ten countries; the United States has the largest number of obese people, followed by China (Ng et al., 2014).

A growing body of evidence shows that patients who are diagnosed as overweight are more likely to lose weight than those who are not. However, physicians underdiagnose obesity and give little advice on the behavioral health risk factors associated with it (Bleich et al., 2011).

In children, obesity is diagnosed on the basis of age- and sex-specific body mass index (BMI) cutoff points. This contrasts with adults, in whom the diagnosis is based on a BMI that does not take age or sex into account. The diagnostic criteria are therefore simpler for adults, whereas few obese children are diagnosed accurately, because the criteria are more complex and the terminology for childhood obesity has changed (Walsh et al., 2013). In addition, BMI is considered to have limitations regarding its homogeneity across different populations (Nevill et al., 2006). Waist circumference (WC) is therefore considered a reliable and useful tool for epidemiological studies assessing abdominal adiposity, but this measurement appears to be more difficult to perform (Miguel-Etayo et al., 2014). Furthermore, a regional study of childhood obesity diagnosis using the International Classification of Diseases, 9th Revision (ICD-9), the National Ambulatory Medical Care Survey (NAMCS) and the National Hospital Ambulatory Medical Care Survey (NHAMCS) showed a low sensitivity of clinical diagnosis (Walsh et al., 2013).

Recent observations suggest that the human gut microbiome may play an important role in obesity. An early report based on sequencing of amplified 16S rRNA genes showed that the ratio of Firmicutes to Bacteroidetes in fecal samples from 12 obese humans was far higher than in two lean controls (Ley et al., 2006). Recent observational studies using metagenomic sequencing in human obesity have demonstrated reduced bacterial phylogenetic diversity, a relative depletion of Bacteroidetes, and an enrichment of genes involved in carbohydrate and lipid metabolism (Allin and Pedersen, 2014). These findings suggest that changes in the gut microbiota are a causal factor in the pathogenesis of obesity, and that features of the gut microbiota might serve as a criterion for diagnosing obesity.

In summary, the diagnosis of obesity suffers from many missed opportunities and from low sensitivity. More effective (less biased) assessments of overweight and/or obesity need to be developed.

Summary of the invention

Embodiments of the disclosure seek to solve, at least to some extent, at least one of the problems existing in the prior art.

The present invention is based on the following findings of the inventors:

Assessment and characterization of the gut microbiota has become a major research area in human disease, including obesity. To analyze the gut microbial composition of obese patients, the inventors carried out a metagenome-wide association study (MGWAS) protocol based on deep shotgun sequencing of gut microbial DNA from 158 individuals (Qin, J. et al., A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55-60 (2012), incorporated herein by reference). The inventors identified and validated 396,100 obesity-associated gene markers. To exploit the potential of the gut microbiota for obesity classification, the inventors developed a disease classifier system based on 9 gene markers, which were defined as the optimal gene set by the minimum redundancy maximum relevance (mRMR) feature selection method. To assess the risk of obesity intuitively from these 9 gut microbial gene markers, the inventors calculated a health index. The inventors' data provide insight into the features of the gut metagenome associated with obesity risk, an example for future studies of the pathophysiological role of the gut metagenome in other related diseases, and a potential application in assessing individuals at risk of such diseases on the basis of the gut microbiota.

Gene markers of the gut microbiota are believed to be valuable for improving the early detection of obesity, for the following reasons. First, the markers of the invention are more specific and more sensitive than conventional markers. Second, stool analysis ensures accuracy, safety, affordability and patient compliance, and fecal samples are transportable. The invention therefore relates to a comfortable, non-invasive in vitro method that makes it easier for people to take part in a given screening program. Third, the markers of the invention can also serve as a therapy monitoring tool for cancer patients, to detect their response to treatment.

In one aspect, the disclosure provides a biomarker set for predicting a microbiota-related disease in a subject, consisting of:

gut biomarkers comprising at least a partial sequence of SEQ ID NOs: 1 to 9.

According to embodiments of the disclosure, the disease is obesity or a related disease.

Using these biomarkers, certain microbiota-related diseases of a subject can be analyzed; for example, obesity or a related disease can be determined from samples from the subject, such as fecal samples.

In another aspect, the disclosure provides a kit for determining the above gene marker set, comprising primers for PCR amplification designed according to the DNA sequences set forth in at least a partial sequence of SEQ ID NOs: 1 to 9.

In another aspect, the disclosure provides a kit for determining the above gene marker set, comprising one or more probes designed according to the genes set forth in SEQ ID NOs: 1 to 9.

In another aspect, the disclosure provides use of the above gene marker set for predicting the risk of obesity or a related disease in a subject, comprising:

(1) collecting a sample j from the subject;

(2) determining relative abundance information for each of SEQ ID NOs: 1 to 9 in the DNA of the sample; and

(3) calculating an index of sample j, denoted I_j, according to the following formula:

$$I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]$$

A_ij is the relative abundance of marker i in sample j, where i refers to each gene marker in the gene marker set;

N is a first subset of the selected biomarkers associated with the abnormal condition, consisting of the markers enriched in patients,

M is a second subset of the selected biomarkers associated with the abnormal condition, consisting of the markers enriched in controls,

|N| and |M| are the numbers of biomarkers in the first subset and the second subset, respectively,

wherein

an index greater than a cutoff value indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.

According to some embodiments of the disclosure, |N| is 5 and |M| is 4.

According to some embodiments of the disclosure, the cutoff value is 0.03519 to 0.1337.

In another aspect, the disclosure provides use of the above gene marker set in the preparation of a kit for predicting the risk of obesity or a related disease in a subject, comprising:

(1) collecting a sample j from the subject;

N is a first subset of the selected biomarkers associated with the abnormal condition, consisting of the markers enriched in patients,

M is a second subset of the selected biomarkers associated with the abnormal condition, consisting of the markers enriched in controls,

wherein

According to some embodiments of the disclosure, |N| is 5 and |M| is 4.

In another aspect, the disclosure provides a method of diagnosing whether a subject has a microbiota-related abnormal condition or is at risk of developing a microbiota-related abnormal condition, comprising:

determining the relative abundance of the above biomarkers in a sample from the subject, and

determining, based on the relative abundance, whether the subject has a microbiota-related abnormal condition or is at risk of developing a microbiota-related abnormal condition.

According to embodiments of the disclosure, the method comprises:

(1) collecting a sample j from the subject;

(2) determining relative abundance information for each of SEQ ID NOs: 1 to 9 in the DNA of the sample; and

wherein

According to some embodiments of the disclosure, |N| is 5 and |M| is 4.

According to embodiments of the disclosure, the microbiota-related abnormal condition is obesity or a related disease.

Brief description of the drawings

These and other aspects and advantages of the disclosure will become apparent and more readily understood from the following description taken with reference to the accompanying drawings, in which:

Fig. 1. P-value distribution from the obesity association analysis, showing a disproportionate over-representation of strongly associated markers at low P values.

Fig. 2. The inventors performed an incremental search among the obesity-associated gene markers using the minimum redundancy maximum relevance (mRMR) method, producing sequential marker subsets. The error rate of each subset was then estimated by leave-one-out cross-validation (LOOCV) of a linear discriminant classifier. The optimal (lowest error rate) subset contained 9 gene markers.

Fig. 3. ROC curve plotted from the obesity index in the training set, AUC = 0.9763.

Fig. 4. ROC curve of the test set (42 samples), plotted from the obesity index of the test set, AUC = 0.9024.

Fig. 5. ROC curve of the test set (22 samples), plotted from the obesity index of the test set, AUC = 0.8462.

Detailed description

Example

Terms used herein have the meanings commonly understood by a person of ordinary skill in the art to which the invention pertains. Terms such as "a", "an" and "the" are not intended to refer only to a singular entity, but include the general class of which a specific example may be used for illustration. The terminology herein is used to describe specific embodiments of the invention, but its usage does not limit the invention, except as outlined in the claims.

The invention is further illustrated in the following non-limiting examples. Unless otherwise stated, parts and percentages are by weight and degrees are Celsius. It will be apparent to those skilled in the art that these examples, while indicating preferred embodiments of the invention, are given by way of illustration only, and that all reagents are commercially available.

Examples

Example 1. Identifying biomarkers for assessing obesity risk

1.1 Sample collection

Fecal samples from 158 Chinese subjects (the training set), comprising 78 obese patients and 80 control subjects, were collected in 2012 at Ruijin Hospital, Shanghai Jiao Tong University School of Medicine. The obese patients were aged 18 to 30 years and had a BMI above 25. Subjects were asked to provide fresh fecal samples at the hospital. The collected samples were placed in sterile tubes and stored immediately at -80 °C until further analysis.

Full ethical approval was obtained and all patients gave written informed consent. The study was approved by the Institutional Review Board of Ruijin Hospital, Shanghai Jiao Tong University School of Medicine.

1.2 DNA extraction

Fecal samples were thawed on ice, and DNA was extracted using the Qiagen QIAamp DNA Stool Mini Kit (Qiagen) according to the manufacturer's instructions. The extracts were treated with DNase-free RNase to remove RNA contamination. DNA quantity was determined using a NanoDrop spectrophotometer, a Qubit fluorometer (with the Quant-iT™ dsDNA BR Assay Kit) and gel electrophoresis.

1.3 DNA library construction and sequencing of fecal samples

DNA libraries were constructed according to the manufacturer's instructions (Illumina; insert size 350 bp, read length 100 bp). The inventors used the same workflow as described previously for cluster generation, template hybridization, isothermal amplification, linearization, blocking and denaturation, and hybridization of the sequencing primers. For each sample, the inventors constructed one paired-end (PE) library with an insert size of 350 bp, followed by high-throughput sequencing to obtain about 30 million PE reads of length 2 × 100 bp. High-quality reads were obtained by filtering out, from the raw Illumina reads, low-quality reads with ambiguous "N" bases, adapter contamination and human DNA contamination, and by trimming low-quality terminal bases from the reads at the same time.
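The read-cleaning step described above can be sketched as follows. This is a minimal illustration in Python and not the inventors' pipeline: the function name and all three thresholds are hypothetical, since the document does not state the values used, and the adapter and human-DNA filters are omitted because they require reference sequences.

```python
from typing import List, Optional, Tuple

# Hypothetical thresholds; the document does not state the values it used.
MAX_N_BASES = 3        # discard reads with more ambiguous "N" bases than this
MIN_BASE_QUALITY = 20  # Phred score below which a 3' terminal base is trimmed
MIN_LENGTH = 30        # discard reads shorter than this after trimming


def clean_read(seq: str, quals: List[int]) -> Optional[Tuple[str, List[int]]]:
    """Return the trimmed read, or None if the read should be discarded."""
    # Filter reads that contain too many ambiguous bases.
    if seq.upper().count("N") > MAX_N_BASES:
        return None
    # Trim low-quality bases from the 3' end of the read.
    end = len(seq)
    while end > 0 and quals[end - 1] < MIN_BASE_QUALITY:
        end -= 1
    # Discard reads that became too short to align reliably.
    if end < MIN_LENGTH:
        return None
    return seq[:end], quals[:end]
```

In a paired-end setting one would typically drop both mates when either fails, but that policy is also an assumption here.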

On the Illumina HiSeq 2000 platform, the inventors generated about 5.9 Gb of high-quality (clean) fecal microbiome sequencing data per sample from the 158 samples (78 cases and 80 controls) (Table 1).

Table 1. Summary of metagenomic data. The fourth column reports the results of Wilcoxon rank-sum tests.

1.4 Metagenomic data processing and analysis

1.4.1 Read alignment

The inventors used the updated human gut gene catalog established by Li, J. et al., An integrated catalog of reference genes in the human gut microbiome. Nat. Biotechnol. (2014) (incorporated herein by reference), and aligned the high-quality reads to this updated catalog with an identity threshold of ≥ 90%. The average read alignment rate is shown in Table 1. This rate is close to that of the samples in Li, J. et al., 2014, supra, which indicates that it is sufficient for further study. After read alignment, the inventors derived the gene profile (9.9 million genes) from the alignment results using the same method as Li, J. et al., 2014, supra.

Taxonomic assignment of genes. The taxonomic assignment of the genes was predicted using the in-house pipeline described in the published paper (Li, J. et al., 2014, supra).

1.4.2 Data profile construction

Gene profile. Based on the read alignment results, the inventors calculated relative gene abundance using the same method as described in the published T2D paper (Qin et al., 2012, supra).
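As a rough illustration of what such a relative-abundance calculation looks like, the sketch below normalizes each gene's mapped read count by gene length and then scales the values to sum to one. This is an assumption about the general shape of the cited method rather than a reproduction of it; the function name and inputs are hypothetical.

```python
from typing import List


def relative_gene_abundance(read_counts: List[int],
                            gene_lengths: List[int]) -> List[float]:
    """Length-normalized relative abundance for each gene in one sample."""
    if len(read_counts) != len(gene_lengths):
        raise ValueError("read_counts and gene_lengths must have equal length")
    # Step 1: per-gene copy number, i.e. reads mapped per base of gene length.
    copy_numbers = [c / l for c, l in zip(read_counts, gene_lengths)]
    total = sum(copy_numbers)
    if total == 0:
        return [0.0] * len(copy_numbers)
    # Step 2: scale so that the abundances of all genes in the sample sum to 1.
    return [b / total for b in copy_numbers]
```

Dividing by gene length keeps long genes from appearing more abundant merely because they attract more reads.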

1.4.3 Analysis of factors influencing the gut microbiota gene profile. Based on the gene profile, the inventors used permutational multivariate analysis of variance (PERMANOVA) to assess the effect of 6 clinical parameters (age, sex, height, weight, BMI and obesity status). The analysis was performed using the method implemented in the R package "vegan", and permuted p values were obtained from 10,000 permutations. The inventors also corrected for multiple testing with the Benjamini-Hochberg method, using "p.adjust" in R, to obtain a q value for each test. PERMANOVA identified three key factors associated with the gut microbiota (based on the gene profile) (q < 0.05, Table 2). The analysis showed that weight, BMI and obesity status were strongly associated, indicating that disease (obesity) status is a major determinant of gut microbiota composition.
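The Benjamini-Hochberg adjustment mentioned above is simple enough to show directly. The following Python sketch is a standard-library illustration of the procedure (the document itself used "p.adjust" in R); the function name is hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    n = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q
```

For the four p values 0.01, 0.04, 0.03 and 0.005, the adjusted values are 0.02, 0.04, 0.04 and 0.02, so all four tests would pass a q < 0.05 threshold.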

Table 2. PERMANOVA analysis based on Euclidean distances of the gene profile. The analysis tested, at q < 0.05, whether the clinical parameters and obesity status have a significant effect on the gut microbiota.

1.4.4 Identification of obesity-associated markers

Identification of obesity-associated genes. To identify associations between the metagenomic profile and obesity, a two-tailed Wilcoxon rank-sum test was applied to the profile of 9,879,897 high-frequency genes (genes present in fewer than 10 of the 158 samples were removed). This yielded 396,100 gene markers enriched in either cases or controls, with p < 0.01 and FDR = 3.8% (Fig. 1).
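For a single gene, the two-tailed rank-sum test compares the abundances in cases with those in controls. The sketch below is a standard-library Python illustration using the normal approximation with a tie correction; it is not the implementation used in the document, the function name is hypothetical, and exact p values for very small groups would differ.

```python
import math
from typing import Sequence


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided Wilcoxon rank-sum p value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    n = n1 + n2
    # Pool the values, remembering the group (0 = x, 1 = y) of each one.
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        midrank = (i + 1 + j) / 2.0    # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Because metagenomic abundance profiles contain many zeros, the tie correction matters in practice; running the test once per gene then calls for the multiple-testing control described in the text.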

Estimation of the false discovery rate (FDR). Instead of a sequential p-value rejection method, the inventors applied the "q value" method proposed in a previous study to estimate the FDR (Storey, J. D. A direct approach to false discovery rates. Journal of the Royal Statistical Society 64, 479-498 (2002), incorporated herein by reference).

Receiver operating characteristic (ROC) analysis. The inventors used ROC analysis to assess the performance of obesity classification based on the metagenomic markers. ROC curves were drawn using the "pROC" package in R.
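The area under the ROC curve has a simple interpretation: it is the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way; it is an illustration with a hypothetical function name, not a substitute for the "pROC" package used in the document.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve; labels are 1 for cases and 0 for controls."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Count case/control pairs in which the case has the higher score.
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of the size reported here.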

1.5 Method for selecting the 9 optimal markers from the biomarkers (minimum redundancy maximum relevance (mRMR) feature selection framework)

To determine the optimal gene set, the minimum redundancy maximum relevance (mRMR) feature selection method (for details see Peng, H., Long, F. & Ding, C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27, 1226-1238, doi:10.1109/TPAMI.2005.159 (2005), incorporated herein by reference) was used to select from all obesity-associated gene markers. The inventors performed the incremental search using the "sideChannelAttack" package for R and obtained 158 sequential marker sets. The error rate of each sequential set was estimated by leave-one-out cross-validation (LOOCV) of a linear discriminant classifier. The optimal marker set was the one with the lowest error rate. In this study, feature selection was performed on the set of 396,100 obesity-associated gene markers. Because it was computationally infeasible to run mRMR on all genes, the inventors derived a statistically non-redundant gene set. First, 8010 genes were selected (q < 0.0005). The inventors then applied the mRMR feature selection method and identified an optimal set of 9 gene biomarkers strongly associated with obesity for obesity classification (lowest error rate, Fig. 2), shown in Table 3 and Table 4. The gene ids are from the published reference gene catalog of Li, J. et al., 2014, supra.
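The incremental mRMR search can be illustrated with a small greedy implementation. The Python below is a sketch under stated assumptions, not the "sideChannelAttack" code: it assumes features have already been discretized (for example, gene present or absent), it uses the difference form of the criterion (relevance minus mean redundancy), and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def mutual_information(a: Sequence[int], b: Sequence[int]) -> float:
    """Mutual information (in nats) between two discrete variables."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())


def mrmr_select(features: List[Sequence[int]], labels: Sequence[int],
                k: int) -> List[int]:
    """Greedily pick k feature indices by relevance minus mean redundancy."""
    relevance = [mutual_information(f, labels) for f in features]
    selected: List[int] = []
    remaining = list(range(len(features)))

    def score(i: int) -> float:
        if not selected:
            return relevance[i]
        redundancy = sum(mutual_information(features[i], features[s])
                         for s in selected) / len(selected)
        return relevance[i] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each prefix of the returned list corresponds to one of the sequential marker sets described above; evaluating every prefix with a cross-validated classifier and keeping the prefix with the lowest error rate mirrors the selection of the 9-marker set.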

Table 3. Enrichment information for the 9 optimal gene markers

Gene id	Enrichment (1 = obesity, 0 = control)
64552	0
1208989	0
2285506	0
3104115	1
3581202	0
5042942	1
5243950	1
6793200	1
7860042	1

Table 4. SEQ ID NOs of the 9 optimal gene markers

Gene id	SEQ ID NO:
Gene_id:7860042	1
Gene_id:1208989	2
Gene_id:5243950	3
Gene_id:5042942	4
Gene_id:3104115	5
Gene_id:2285506	6
Gene_id:3581202	7
Gene_id:64552	8
Gene_id:6793200	9

1.6 Gut health index (obesity index)

To exploit the potential of the gut microbiota for disease classification, the inventors developed a disease classification system based on the 9 gene markers they had defined. To assess disease risk intuitively from these gut microbial gene markers, the inventors calculated a gut health index (obesity index).

To evaluate the effect of the gut metagenome on obesity, the inventors defined and calculated a gut health index for each individual based on the 9 gene markers selected as described above. For each individual sample, the gut health index of sample j, denoted I_j, was calculated by the following formula:

$$I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]$$

A_ij is the relative abundance of marker i in sample j;

N is the subset of the selected biomarkers associated with the abnormal condition that are enriched in patients (i.e. the subset of the 9 selected gene markers that are enriched in obesity),

M is the subset of the selected biomarkers associated with the abnormal condition that are enriched in controls (i.e. the subset of the 9 selected gene markers that are enriched in controls),

|N| and |M| are the numbers (sizes) of biomarkers in these two subsets, respectively, where |N| is 5 and |M| is 4,

wherein an index greater than a cutoff value indicates that the subject has obesity or is at risk of developing obesity.

1.7 Obesity classification based on gut microbiota

Based on the relative abundance of these 9 gene markers, the inventors calculated an obesity index that clearly separated the microbiomes of obese patients from those of controls (Table 5). Using the obesity index to classify the 78 obese-patient microbiomes against the 80 control microbiomes gave an area under the receiver operating characteristic (ROC) curve of 0.9763 (Fig. 3). At the optimal index cutoff of 0.03519, the true positive rate (TPR) was 0.9487, the false positive rate (FPR) was 0.1 and the error rate was 8.23% (13/158), indicating that the 9 gene markers can be used to classify obese individuals accurately.
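The TPR, FPR and error rate quoted above all follow from applying one cutoff to the per-sample index values. The Python sketch below shows the bookkeeping; the function name and the toy inputs in the usage note are hypothetical, and it assumes both classes are present in the input.

```python
from typing import Dict, Sequence


def classify_at_cutoff(indices: Sequence[float], is_obese: Sequence[bool],
                       cutoff: float) -> Dict[str, float]:
    """TPR, FPR and error rate when index > cutoff is called obese."""
    tp = fp = tn = fn = 0
    for value, truth in zip(indices, is_obese):
        predicted = value > cutoff  # an index above the cutoff flags obesity
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return {
        "tpr": tp / (tp + fn),                 # sensitivity among obese samples
        "fpr": fp / (fp + tn),                 # false alarms among controls
        "error_rate": (fp + fn) / len(indices),
    }
```

For example, `classify_at_cutoff([1.2, 0.5, -0.3, 0.02, 0.2], [True, True, False, False, False], 0.03519)` treats the first, second and fifth samples as obese, so one of the three controls is a false positive.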

Table 5. Calculated gut health index of the 158 samples (obese patients and non-obese controls)

Example 2. Validation of the 9 gene biomarkers in 42 samples (test set)

The inventors validated the discriminatory power of the obesity classifier in another new, independent study group (comprising 17 obese patients and 25 non-obese controls collected at Ruijin Hospital, Shanghai Jiao Tong University School of Medicine).

DNA was extracted from each sample and a DNA library was constructed, followed by high-throughput sequencing as described in Example 1. The inventors calculated the gene abundance profiles of these samples using the same method as described in Qin et al., 2012, supra. The relative gene abundance of each of the markers set forth in SEQ ID NOs: 1 to 9 was then determined. The index of each sample was then calculated by the following formula:

$$I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]$$

A_ij is the relative abundance of marker i in sample j;

M is the subset of the selected biomarkers associated with the abnormal condition that are enriched in controls (i.e. the subset of the 9 selected gene markers that are enriched in controls),

|N| and |M| are the numbers of biomarkers in the two subsets, respectively, where |N| is 5 and |M| is 4,

wherein an index greater than a cutoff value indicates that the subject has obesity or is at risk of developing obesity.

Table 6 shows the calculated index of each sample, and Table 7 shows the relative abundance of the relevant genes for a representative sample, DB78A. In this evaluation, at a cutoff of 0.03519 (the optimal index cutoff from the 158 samples above), the error rate was 21.42% (9/42), validating that the 9 gene markers can classify obese individuals. Most obese patients (16/17) were correctly diagnosed as obese. In addition, the ROC curve of the test set, plotted from the obesity index of the test set, gave AUC = 0.9024 (Fig. 4). At the optimal cutoff of 0.1337, the true positive rate (TPR) was 0.9412 and the false positive rate (FPR) was 0.24.

Table 6. Calculated gut health index of the 42 samples

Table 7. Relative gene abundance of sample DB78A

Gene id	DB78A (calculated relative gene abundance)	Enrichment (1 = obesity, 0 = control)
64552	0	0
1208989	0	0
2285506	1.46332E-06	0
3104115	3.47323E-06	1
3581202	0	0
5042942	0	1
5243950	5.26732E-06	1
6793200	1.06787E-06	1
7860042	0	1
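As a worked illustration, the index can be computed for DB78A from the abundances in Table 7. The Python below is a minimal sketch and not the inventors' code: the function and variable names are hypothetical, the assignment of markers to the two subsets follows the enrichment column of Table 7, and the 10^-10 pseudocount is taken from the index formula in the claims.

```python
import math
from typing import Dict

PSEUDOCOUNT = 1e-10  # keeps log10 defined when a marker is absent

# Subset N: markers enriched in obesity; subset M: markers enriched in controls.
OBESITY_ENRICHED = ["3104115", "5042942", "5243950", "6793200", "7860042"]
CONTROL_ENRICHED = ["64552", "1208989", "2285506", "3581202"]


def gut_health_index(abundance: Dict[str, float]) -> float:
    """Mean log10 abundance of subset N minus mean log10 abundance of subset M."""
    def mean_log(genes):
        return sum(math.log10(abundance.get(g, 0.0) + PSEUDOCOUNT)
                   for g in genes) / len(genes)
    return mean_log(OBESITY_ENRICHED) - mean_log(CONTROL_ENRICHED)


# Relative abundances of sample DB78A as listed in Table 7.
db78a = {
    "64552": 0.0, "1208989": 0.0, "2285506": 1.46332e-06,
    "3104115": 3.47323e-06, "3581202": 0.0, "5042942": 0.0,
    "5243950": 5.26732e-06, "6793200": 1.06787e-06, "7860042": 0.0,
}
index = gut_health_index(db78a)
print(round(index, 3), index > 0.03519)
```

With these inputs the index comes out at roughly 1.6, well above the training-set cutoff of 0.03519, so the rule would flag this sample as obese or at risk. The value cannot be checked against Table 6, whose contents are not reproduced in this text.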

Example 3. Validation of the 9 gene biomarkers in 22 samples (test set)

The inventors validated the discriminatory power of the obesity classifier using a further 22 samples (Table 8), comprising 9 case samples and 13 control samples (5 samples taken 1 month after surgery and 8 samples taken 3 months after surgery), also collected at Ruijin Hospital, Shanghai Jiao Tong University School of Medicine. Cases denote preoperative samples; controls denote samples taken 1 month and 3 months after surgery.

Table 8. Information on the 22 samples

* Before: before surgery; 1-M: one month after surgery; 3-M: three months after surgery.

DNA was extracted from each sample and a DNA library was constructed, followed by high-throughput sequencing as described in Example 1. The inventors calculated the gene abundance profiles of these samples using the same method as described in Qin et al., 2012, supra. The relative gene abundance of each of the markers set forth in SEQ ID NOs: 1 to 9 was then determined. The index of each sample was then calculated by the following formula:

$$I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]$$

A_ij is the relative abundance of marker i in sample j.

M is the subset of the selected biomarkers associated with the abnormal condition that are enriched in controls (i.e. the subset of the 9 selected gene markers that are enriched in controls),

Table 9 shows the calculated index of each sample, and Table 10 shows the relative abundance of the relevant genes for a representative sample, DB126. In this evaluation, at a cutoff of 0.03519 (the optimal index cutoff from the 158 samples above), the error rate was 22.72% (5/22), validating that the 9 gene markers can classify obese individuals. Most obese patients (8/9) were correctly diagnosed as obese. In addition, the ROC curve of the test set, plotted from the obesity index of the test set, gave AUC = 0.8462 (Fig. 5). At the optimal cutoff of 0.9695, the true positive rate (TPR) was 0.6667 and the false positive rate (FPR) was 0.07692.

Table 9. Calculated gut health index of the 22 samples

Table 10. Relative gene abundance of sample DB126

Gene id	DB126 (calculated relative gene abundance)	Enrichment (1 = obesity, 0 = control)
64552	0	0
1208989	0	0
2285506	7.99701E-08	0
3104115	6.25943E-05	1
3581202	0	0
5042942	7.19308E-08	1
5243950	5.97579E-07	1
6793200	0	1
7860042	1.52752E-07	1

Thus, starting from 396,100 obesity-associated markers, the inventors identified and validated a set of 9 markers by the minimum redundancy maximum relevance (mRMR) feature selection method. The inventors also established a gut health index, which assesses the risk of obesity on the basis of these 9 gut microbial gene markers.

Although illustrative embodiments have been shown and described, those skilled in the art will understand that the above embodiments are not to be construed as limiting the disclosure, and that changes, substitutions and modifications may be made to the embodiments without departing from the spirit, principles and scope of the invention.

Claims

1. A biomarker set for predicting a microbiota-related disease in a subject, consisting of:

2. The biomarker set according to claim 1, wherein the gut biomarkers consist of SEQ ID NOs: 1 to 9.

3. The biomarker set for predicting a microbiota-related disease in a subject according to claim 1 or 2, wherein the disease is obesity or a related disease.

4. A kit for determining the gene marker set according to any one of claims 1 to 3, comprising primers for PCR amplification designed according to DNA sequences comprising at least a partial sequence of SEQ ID NOs: 1 to 9.

5. A kit for determining the gene marker set according to any one of claims 1 to 3, comprising one or more probes designed according to genes comprising at least a partial sequence of SEQ ID NOs: 1 to 9.

6. Use of the gene marker set according to any one of claims 1 to 3 for predicting the risk of obesity or a related disease in a subject, comprising:

(1) collecting a sample j from the subject;

(2) determining relative abundance information for each of SEQ ID NOs: 1 to 9 in the DNA of the sample; and

(3) calculating an index of sample j, denoted I_j, according to the following formula:

$$I_j = \left[ \frac{\sum_{i \in N} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|N|} - \frac{\sum_{i \in M} \log_{10}\left(A_{ij} + 10^{-10}\right)}{|M|} \right]$$

A_ij is the relative abundance of marker i in sample j, where i refers to each gene marker in the gene marker set;

N is a first subset of the selected biomarkers associated with the abnormal condition, consisting of the markers enriched in patients,

M is a second subset of the selected biomarkers associated with the abnormal condition, consisting of the markers enriched in controls,

|N| and |M| are the numbers of biomarkers in the first subset and the second subset, respectively,

wherein

7. The use according to claim 6, wherein |N| is 5 and |M| is 4.

8. The use according to claim 7, wherein the cutoff value is 0.03519 to 0.1337.

9. Use of the gene marker set according to any one of claims 1 to 3 in the preparation of a kit for predicting the risk of obesity or a related disease in a subject, comprising:

(1) collecting a sample j from the subject;

wherein

10. The use according to claim 9, wherein |N| is 5 and |M| is 4.

11. The use according to claim 10, wherein the cutoff value is 0.03519 to 0.1337.

12. A method of diagnosing whether a subject has a microbiota-related abnormal condition or is at risk of developing a microbiota-related abnormal condition, comprising:

determining the relative abundance of the biomarkers according to any one of claims 1 to 3 in a sample from the subject, and

determining, based on the relative abundance, whether the subject has a microbiota-related abnormal condition or is at risk of developing a microbiota-related abnormal condition.

13. The method according to claim 12, comprising:

(1) collecting a sample j from the subject;

wherein

an index greater than a cutoff value indicates that the subject has the abnormal condition or is at risk of developing the abnormal condition.

14. The use according to claim 13, wherein |N| is 5 and |M| is 4.

15. The use according to claim 14, wherein the cutoff value is 0.03519 to 0.1337.

16. The method according to claim 12, wherein the microbiota-related abnormal condition is obesity or a related disease.