CN108038350A

CN108038350A - Method for determining the microbial community structure of stacked fermented grains using physicochemical indices

Info

Publication number: CN108038350A
Application number: CN201711184733.5A
Authority: CN
Inventors: 赵亮; 王莉; 闫松显; 汪地强; 王和玉
Original assignee: Kweichow Moutai Co Ltd
Current assignee: Kweichow Moutai Co Ltd
Priority date: 2017-11-23
Filing date: 2017-11-23
Publication date: 2018-05-15
Anticipated expiration: 2037-11-23
Also published as: CN108038350B

Abstract

The present invention relates to a method for determining the microbial community structure of stacked fermented grains using physicochemical indices. The steps are as follows: S1, establish the correspondence between the different fermentation periods and the relative abundance intervals of the microbial community; S2, obtain the physicochemical indices of the sample to be tested; S3, substitute the physicochemical indices of the sample to be tested into the discriminant model and identify the actual fermentation period of the sample; S4, from the correspondence established in step S1 between fermentation periods and relative abundance intervals, obtain the microbial community structure of the sample to be tested. The invention uses sequencing technology and thereby avoids the severely incomplete microbial information obtained with traditional pure-culture or conventional culture-independent methods. It uses the physicochemical indices of the fermented grains directly and, through the constructed model, quickly and accurately identifies the fermentation period of the sample to be tested and gives a reasonable interval for the microbial community composition of such samples, so that production can be guided in a timely and effective manner.

Description

Method for determining the microbial community structure of stacked fermented grains using physicochemical indices

Technical field

The present invention relates to the technical field of microbial community structure analysis of fermented grains, and in particular to a method that uses the physicochemical indices of fermented grains, combined with multivariate statistics, to identify the stacking fermentation period of the fermented grains and thereby determine their microbial community structure.

Background technology

Stacking fermentation is a distinctive process step in the production of Maotai-flavor (sauce-flavor) Baijiu and a foundational stage in the formation of its flavor. Because stacking fermentation takes place under relatively open conditions, microorganisms from the surrounding air, the site and the tools are captured and enriched in the fermented grains, while the microorganisms in the Daqu starter multiply, providing an abundant microbial inoculum for the subsequent pit fermentation. In Maotai-flavor Baijiu production a consensus has gradually formed. If the production protocol is followed and high-temperature Daqu is added but stacking fermentation is omitted, the next round yields no liquor, let alone the sauce-flavor style. If the stacking time is short and the temperature is low (43 to 45 °C), the liquor yield is high but the sauce flavor is not pronounced. If the stacking time is longer and the pile temperature is higher (about 50 °C), the liquor yield is high, the sauce flavor is pronounced and the style is typical. Stacking fermentation therefore plays an important role in both alcoholic fermentation and the formation of sauce-flavor compounds.

On the other hand, the stacking fermentation stage is relatively open to the environment, and fermentation quality is affected by many physicochemical and climatic factors. Once the environmental conditions required for microbial growth and metabolism change, the downstream liquor-yielding step is directly affected, for example through low distillation yield or poor liquor quality. The microbial community in the fermented grains is shaped by the microenvironment and undergoes succession, enrichment and selection throughout the fermentation stage, during which it metabolizes nutrients in the fermented grains to produce a certain amount of aroma compounds or aroma precursors. A thorough and comprehensive understanding of how the microbial community changes over the whole stacking fermentation stage can provide important clues for choosing the best time for pit entry and for analyzing the causes of poor fermentation quality.

At present, microbial community information during fermented grain fermentation is obtained mainly by pure culture, denaturing gradient gel electrophoresis (PCR-DGGE), real-time quantitative PCR (Q-PCR), and the second-generation sequencing technologies that have emerged over the last ten years. Pure culture can occasionally yield some characteristic strains, but from the community perspective it recovers at most about 1% of the microbial information in a sample, and it is time-consuming and labor-intensive. PCR-DGGE and Q-PCR take less time and labor than pure culture and are not limited to culturable organisms, but they suffer from limited resolution and detection limits and can usually detect only dominant microbial groups above 1% [Gao Huiwen, Lv Xin, Dong Mingsheng. Application of PCR-DGGE fingerprinting in food microbiology research. Food Science, 2005, 26(8): 465-468.]. In addition, Q-PCR can quantify only specified microbial groups, such as Bacillus, and cannot resolve the community across taxonomic levels. Second-generation sequencing largely makes up for the shortcomings of the earlier techniques and raises the recovery of total microbial information to more than 90%, so it reflects the microbial community structure of a sample more comprehensively. Because of its high cost, however, second-generation sequencing is difficult to apply on a large scale, whereas predicting the microbial community structure is a routine need in liquor production. In actual production, the course of fermentation is generally followed by reference to physicochemical parameters of the fermented grains such as temperature, moisture and lactic acid, and microbial growth and metabolism are in turn governed by these factors. Physicochemical parameters have a short detection cycle and can be measured quickly, so they guide production almost without lag. An analysis method built from physicochemical parameters and second-generation sequencing data can therefore, on the one hand, show the advantage of big data in production technology and, on the other hand, allow changes in microbial community composition during fermentation to be grasped rapidly, which supports the judgment of fermentation quality and the timely optimization of the production process.

In terms of modeling, Wu et al. [Qun Wu, Jie Ling, Yan Xu. Starter culture selection for making Chinese sesame-flavored liquor based on microbial metabolic activity in mixed-culture fermentation. AEM, 2014, 80(14): 4450-4459.] and Kong et al. [Yu Kong, Qun Wu, Yan Zhang, Yan Xu. In situ analysis of metabolic characteristics reveals the key yeast in the spontaneous and solid-state fermentation process of Chinese light-style liquor. AEM, 2014, 80(12): 3667-3676.] used partial least squares regression (PLS) to relate yeasts to flavor substances. Their models fit well, and the content of the related flavor substances can be predicted from yeast abundance data. This modeling approach, however, has three limitations: (1) it can reflect only the relationship between particular variables (such as flavor substances) and a few dominant microbial groups, not their relationship with as many of the microorganisms in the sample as possible; (2) it regresses on principal components extracted by dimensionality reduction, so once the explanatory variables greatly outnumber the samples, the model's account of the original data becomes biased; (3) when microbial composition is estimated, the accuracy depends on the sample size, and the result is a single specific value, which is too absolute an inference about objective reality. In addition, from the standpoint of guiding production, no method has yet been reported that directly uses physicochemical parameters to predict microbial community structure, in particular one built on raw second-generation sequencing data. As the cost of second-generation sequencing falls, the technology has spread widely across research fields, and the amount of microbial information obtained is very large. The method provided by the present invention not only determines the fermentation period of the sample to be tested but also reflects more objectively the relative abundance range of the different microbial groups in such samples, thereby offering a new approach to microbial analysis.

Summary of the invention

The present invention addresses the following technical shortcomings: the lag caused by the experimental time needed when microbial information on the fermented grains must be obtained immediately during stacking fermentation; insufficient coverage of microbial information when a method is built; requirements on the overall data distribution (such as a Gaussian distribution) during model construction; and an excessive influence of sample size on model prediction accuracy. It provides an analysis method that uses the physicochemical indices of the fermented grains as predictor variables and the raw OTU table from second-generation sequencing of the microorganisms as the response data set and, by means of quadratic discriminant analysis, the Bootstrap method and the bias-corrected and accelerated percentile method (BC_a), determines the actual fermentation period of a sample and the interval of the proportions of the different microbial groups in such samples. The method is simple to operate, places no requirement on the distribution of the measured data, and accurately reflects the microbial composition of the sample, thereby providing technical support for guiding the production process.

The technical solution of the invention is as follows:

S1: Establish the correspondence between the different fermentation periods and the relative abundance intervals of the microbial community, specifically including the following steps:

(1) obtain the physicochemical indices and microbial sequencing information of a sufficient number of fermented grain samples from the different fermentation periods;

(2) carry out quadratic discriminant analysis with the physicochemical indices of the fermented grain samples as explanatory variables, and construct a discriminant model;

(3) use the discriminant model to back-judge the modeling samples taken in each fermentation period;

(4) using the microbial sequencing information of the successfully back-judged fermented grain samples, calculate the relative abundance interval of the microbial community for each fermentation period by the Bootstrap method;

S2: Obtain the physicochemical indices of the sample to be tested;

S3: Substitute the physicochemical indices of the sample to be tested into the discriminant model and identify the actual fermentation period of the sample to be tested;

S4: From the fermentation period identified by the discriminant model and the correspondence established in step S1 between fermentation periods and relative abundance intervals of the microbial community, obtain the microbial community structure of the sample to be tested.

Step S2, obtaining the physicochemical indices of the sample to be tested, may be performed before step S1 or at the same time as step S1.

In step S1, the different fermentation periods are periods selected consecutively over the whole stage from pile formation to pit entry. Based on the inventors' extensive experience, as a preferred embodiment, when the pre-accumulation time exceeds 4 days the interval between periods is not less than 24 hours, and when the pre-accumulation time is 2 to 4 days the interval between periods is not less than 12 hours. Under this condition, the more periods are defined, the more clearly the actual fermentation stage of a sample can be resolved. If the interval is shorter than the set value, the microbial community structure does not change significantly between two adjacent periods, and the workload increases.
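As a small illustration of the interval rule just described, the sketch below encodes the two stated thresholds in Python. The function name is hypothetical, and because the text gives no rule for pre-accumulation times shorter than 2 days, that case is rejected rather than guessed.

```python
def min_period_interval_hours(pre_accumulation_days):
    """Minimum spacing between fermentation periods, in hours (hypothetical helper)."""
    if pre_accumulation_days > 4:
        return 24
    if 2 <= pre_accumulation_days <= 4:
        return 12
    # The text states no value below 2 days, so none is invented here.
    raise ValueError("no interval is specified for pre-accumulation times under 2 days")
```

For a round with a pre-accumulation time of 7 days, for example, the helper returns 24, which matches the spacing of at least 24 hours that the text prescribes when the pre-accumulation time exceeds 4 days.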

In step S1, "a sufficient number" means that the number of samples selected for each fermentation period is enough to construct the quadratic discriminant model; naturally, the more modeling samples taken in each fermentation period, the better. In view of the practicability of the method and the inventors' practical experience, a model constructed with more than 10 samples per fermentation period meets the accuracy requirement; more preferably, each fermentation period has more than 15 samples; still more preferably, each fermentation period has more than 20 samples. As a preferred embodiment, the modeling samples are taken from different parts of the pile so that the sampling is representative and the error introduced by sampling is reduced.

In steps S1, S2 and S3, the physicochemical indices are at least three selected from moisture, lactic acid content, acetic acid content, alcohol content, temperature, total sugar content and pH value; preferably at least four. In discriminant analysis with the quadratic discriminant model, the posterior probability that each sample belongs to each prior group is inferred from the variation of the measured values of the explanatory variables among the prior groups, and the prior group with the highest probability is the identified group. Here the explanatory variables are the physicochemical indices, the prior groups are the fermentation periods, and the samples are the fermented grain samples. Mathematically, at least two explanatory variables are enough to compute the model, but model construction is quite sensitive to the variability of the measured values: once a few measurements of one explanatory variable deviate, the prediction of the whole model is strongly affected. Empirically, with a minimum of three explanatory variables, a deviation in the measurements of a single variable is usually buffered by the other two, so at least three physicochemical indices are selected for modeling. Accordingly, if experimental conditions allow, at least four are preferred, and the model predicts better. As a preferred embodiment, in the method of the invention the physicochemical indices are temperature, moisture, lactic acid content, acetic acid content and alcohol content, which are the five most representative indices in fermented grain fermentation.
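For readers who want the decision rule in symbols, the textbook form of quadratic discriminant analysis can be written as below. The patent does not spell the formula out, so the notation is an assumption introduced here: x is the vector of physicochemical indices of one sample, \mu_k and \Sigma_k are the mean vector and covariance matrix estimated from the modeling samples of fermentation period k, and \pi_k is the prior probability of period k (for example, its share of the modeling samples).

```latex
\[
\delta_k(x) = \ln \pi_k
  - \tfrac{1}{2}\ln\lvert\Sigma_k\rvert
  - \tfrac{1}{2}\,(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k),
\qquad
P(k \mid x) = \frac{\exp\delta_k(x)}{\sum_{j}\exp\delta_j(x)},
\qquad
\hat{k} = \arg\max_{k} P(k \mid x)
\]
```

Because each \Sigma_k is a p x p matrix for p indices, every period needs more than p modeling samples for \Sigma_k to be invertible, which is one practical reason to keep the number of samples per period well above the number of indices.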

In step S1, the microbial sequencing information is obtained by extracting the DNA of the fermented grain samples and sequencing it; preferably, the sequencing technology is a second-generation sequencing technology; more preferably, the second-generation sequencing technology is selected from 454 GS FLX pyrosequencing and Illumina paired-end sequencing. The raw OTU table returned by the sequencing company is usually stored in a known file format such as "OTU_table.txt" or "OTU_table.xlsx", and the file contains all the microbial information in the samples.

As a preferred embodiment, the quadratic discriminant analysis and the construction of the discriminant model in step S1 can be carried out with any statistical software; in an exemplary embodiment of the invention, the discriminant model is constructed with the R software.

Preferably, the discriminant model is validated. As an optional embodiment, it is validated by 1~K-fold cross-validation (the user may choose K, but K must not exceed the number of samples); preferably, the discriminant model is validated by 1~10-fold cross-validation and reaches an accuracy of more than 95%.

The 1~K-fold cross-validation method means starting from 1 fold and proceeding up to K folds (K ≤ sample number n), performing t repetitions at each fold number (t ≥ sample number n; the larger t is, the more stable the result), obtaining the validation accuracy at each fold number, and taking the mean of these validation accuracies as the accuracy of the discriminant model. When K = 1 or K = n, the validation procedure is equivalent to leave-one-out validation. The 1~K-fold cross-validation of the discriminant model can be carried out with existing software, or with scripts written in statistical software such as SAS, R or MATLAB. In one embodiment of the invention using 1~10-fold cross-validation, at 10 folds the data set is divided into ten parts, and in turn 9 parts are used as training data and 1 part as test data. Each trial yields an accuracy (or error rate), and the mean of the accuracies (or error rates) of the 10 trials is taken as the estimate of the accuracy of the algorithm.
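The validation scheme described above (every fold number from 1 to K, t repetitions per fold number, then the mean over fold numbers) can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the patent's R procedure: a nearest-centroid classifier stands in for the quadratic discriminant model so that the example needs no external packages, K = 1 is treated as leave-one-out as the text states, and all function names are hypothetical.

```python
import random
from statistics import mean

def nearest_centroid_fit(rows, labels):
    """Stand-in classifier (NOT quadratic discriminant analysis): per-class mean vectors."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*members)]
            for label, members in groups.items()}

def nearest_centroid_predict(model, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(row, model[label]))
    return min(model, key=squared_distance)

def kfold_accuracy(rows, labels, k, rng):
    """One shuffled pass of k-fold cross-validation; k == 1 is run as leave-one-out."""
    n = len(rows)
    folds = n if k == 1 else k
    order = list(range(n))
    rng.shuffle(order)
    correct = 0
    for f in range(folds):
        test_idx = set(order[f::folds])            # every folds-th sample forms one fold
        train_idx = [i for i in order if i not in test_idx]
        model = nearest_centroid_fit([rows[i] for i in train_idx],
                                     [labels[i] for i in train_idx])
        correct += sum(nearest_centroid_predict(model, rows[i]) == labels[i]
                       for i in test_idx)
    return correct / n

def one_to_k_fold_accuracy(rows, labels, max_k, repeats, seed=0):
    """Mean accuracy per fold number (1..max_k), and the mean over all fold numbers."""
    rng = random.Random(seed)
    per_k = {k: mean(kfold_accuracy(rows, labels, k, rng) for _ in range(repeats))
             for k in range(1, max_k + 1)}
    return per_k, mean(per_k.values())
```

Calling `one_to_k_fold_accuracy(rows, labels, max_k=10, repeats=1000)` mirrors the 1~10-fold setting with t = 1000; swapping the two stand-in functions for a real quadratic discriminant fit and predict step leaves the validation loop unchanged.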

In step S1, back-judging the modeling samples taken in the different fermentation periods specifically means substituting the physicochemical indices of the fermented grain samples used for modeling into the discriminant model and identifying the actual fermentation period of each sample. The discriminant model calculates the discriminant probability that a sample belongs to each fermentation period, and the period with the largest discriminant probability is the actual fermentation period. Back-judgment is successful when the fermentation period at which the modeling sample was taken agrees with the actual fermentation period identified by the model. In a more preferred embodiment, successful back-judgment requires not only this agreement but also that the maximum discriminant probability calculated by the discriminant model be greater than 0.6.

As a preferred embodiment, the back-judgment is carried out with the R software.

In step S3, substituting the physicochemical indices of the sample to be tested into the discriminant model to obtain its actual fermentation period means that, after the model has calculated the discriminant probability that the sample belongs to each fermentation period, the period with the largest discriminant probability is chosen as the identified period, that is, the actual fermentation period; the procedure is the same as the back-judgment described above. According to the inventors' experience from many experiments, if the maximum discriminant probability is less than 0.6 and the actual fermentation period identified at this probability (the identified period) does not match the fermentation period at sampling (the sampling period), the sample is either in a transition between fermentation periods (belonging to no specific period) or fermenting abnormally. The identification result of such a sample is not adopted; that is, the microbial community structure of this sample cannot be determined by the method of the invention.
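The two decision rules above (the stricter back-judgment criterion for modeling samples and the rejection rule for samples to be tested) can be made concrete with a short Python sketch. The function names are hypothetical; the 0.6 threshold and both conditions are taken from the text, and the input is assumed to be a mapping from period label to discriminant probability.

```python
def identify_period(probabilities):
    """Return (period, probability) for the largest discriminant probability."""
    period = max(probabilities, key=probabilities.get)
    return period, probabilities[period]

def back_judgment_succeeds(sampling_period, probabilities, threshold=0.6):
    """Preferred criterion: periods agree AND the maximum probability exceeds the threshold."""
    period, prob = identify_period(probabilities)
    return period == sampling_period and prob > threshold

def result_is_rejected(sampling_period, probabilities, threshold=0.6):
    """Rejection rule for a test sample: low probability AND mismatch with the sampling period."""
    period, prob = identify_period(probabilities)
    return prob < threshold and period != sampling_period
```

Note that a sample identified as P1 with a high probability is accepted even if it was taken during the P2 sampling period; under the rule as stated, a mismatch alone does not reject the result.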

As a preferred embodiment, substituting the physicochemical indices of the sample to be tested into the discriminant model to obtain its actual fermentation period can be carried out with the R software.

In step S1 (4), the relative abundance interval of the microbial community for each fermentation period is calculated by the Bootstrap method. This step is based on the R software and calls the boot and RAM packages. Functions in the RAM package are used to aggregate the OTU data into relative abundance data at the different taxonomic ranks; functions in the boot package are used to resample the OTU data and, after m resamplings, to obtain the relative abundance interval of each microbial group. Parameter setting: the number of Bootstrap resamplings is greater than the number of samples, and the more resamplings, the more reliable the result. Whether the OTU data come from bacteria or fungi, the abundance data can be aggregated as needed at any of the seven taxonomic ranks: kingdom, phylum, class, order, family, genus and species. This step can of course also be carried out with other software.
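To make the resampling step concrete, the following Python sketch converts read counts to relative abundances and computes a Bootstrap interval for the mean relative abundance of one microbial group across the samples of one period. It is a sketch under stated assumptions: the patent names the R packages boot and RAM and the BC_a percentile method, whereas this example uses the simpler plain percentile interval as a stand-in, assumes the resampled statistic is the mean, and uses hypothetical function names.

```python
import random
from statistics import mean

def relative_abundance(counts):
    """Convert one sample's read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: count / total for taxon, count in counts.items()}

def bootstrap_percentile_interval(values, resamples=5000, level=0.95, seed=0):
    """Percentile Bootstrap interval (not BC_a) for the mean of `values`.

    `values` holds the relative abundance of one taxon in each sample of a period.
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample the samples with replacement and record the mean each time.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(resamples))
    lower_index = int((1 - level) / 2 * resamples)
    upper_index = int((1 + level) / 2 * resamples) - 1
    return means[lower_index], means[upper_index]
```

A typical call would pass the relative abundances of one genus in the successfully back-judged samples of a period, for example `bootstrap_percentile_interval(abundances_in_P1, resamples=5000)`, where `abundances_in_P1` is an illustrative list name.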

Production processes differ between Baijiu types and between distilleries. Maotai-flavor Baijiu undergoes eight rounds of stacking fermentation; the fermentation time of each round is not necessarily the same, and the microbial community structures also differ. The method of the invention can therefore be applied separately to each fermentation round to determine the microbial community structure of the stacking fermentation from the physicochemical indices.

As an exemplary embodiment, the invention determines the microbial community structure of the second liquor round. The fermentation periods are selected from the whole stage of the second round from pile formation to pit entry. The pre-accumulation time of this round is more than 4 days, so an interval of more than 24 hours is chosen between fermentation periods, and the stage is divided in chronological order into five fermentation periods: P1, P2, P3, P4 and P5.

The modeling data comprise 15 samples for each fermentation period.

The quadratic discriminant model is constructed with the R software.

In 1~10-fold cross-validation, the accuracy is more than 95%.

The back-judgment of the modeling samples by the quadratic discriminant model is carried out with the R software.

Using the successfully back-judged modeling samples, the Bootstrap method gives the correspondence between the different fermentation periods and the relative abundance intervals of the dominant microbial community, as shown in the following table:

Here, dominant microorganisms are those whose relative abundance in the microbial community structure is greater than 0.5%. It is generally accepted in the art that these dominant microorganisms fully represent the microbial community structure of a given period.
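The 0.5% cut-off for dominant microorganisms translates directly into a filter on a relative abundance table. A minimal Python sketch, with a hypothetical function name and abundances expressed as fractions (so 0.5% is 0.005):

```python
def dominant_taxa(abundance, threshold=0.005):
    """Keep taxa whose relative abundance is strictly greater than the threshold."""
    return {taxon: share for taxon, share in abundance.items() if share > threshold}
```

A taxon at exactly 0.5% is excluded here because the text says "greater than 0.5%".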

The beneficial effects of the invention are:

(1) compared with culture-based techniques and conventional culture-independent techniques for extracting microbial information, such as PCR-DGGE, second-generation sequencing reflects the microbial community structure of the fermented grains more comprehensively;

(2) reasonable ranges are given for the sample size and the number of explanatory variables used in quadratic discriminant modeling, together with an effective method for validating the model and a criterion for the reliability of the identification result of a sample to be tested, so that the constructed discriminant model can effectively identify the fermentation period of the sample to be tested;

(3) in extracting microbial community information, "reprocessed" data derived from the OTU information are avoided and the raw OTU data generated by sequencing are used directly, which on the one hand makes the analysis more efficient and on the other hand avoids the loss or distortion of microbial community information that "reprocessed" data can cause;

(4) the Bootstrap method combined with the quadratic discriminant model gives, for the fermentation period to which the sample to be tested belongs, a composition interval for each microbial group, so that the inferred community composition is more general and objective, which strengthens its practical value for guiding operation.

Brief description of the drawings

Fig. 1 is a flow diagram of the method of the invention;

Fig. 2 shows the identification results of the quadratic discriminant model for the samples to be tested in Embodiment 2. Each bar in the bar chart represents one sample to be tested. The discriminant probabilities of a sample are represented by the sizes of the pattern fills that denote each fermentation period within its bar; samples taken in different sampling periods are separated by black vertical lines;

Fig. 3 shows, for Embodiment 1, the density distributions of the relative abundance of 6 representative genera in each fermentation stage at the genus level, obtained after 5000 Bootstrap resamplings of the microbial information.

Detailed description of the embodiments

The technical solution of the invention is further illustrated below by specific embodiments, which do not limit the scope of protection of the invention. Non-essential modifications and adjustments made by others in accordance with the concept of the invention still fall within the scope of protection of the invention.

It should be noted that the fermentation period at sampling (the sampling period) is the period in which the sample is nominally located when it is taken. Because samples come from different positions in the pile, however, their physicochemical environment and microbial composition may differ, so samples taken in the same period may have reached different degrees of fermentation. For example, a sample taken at 48 h belongs by sampling time to period P2, but it may be fermenting more slowly and actually still be in P1. The invention therefore uses the constructed model and the physicochemical indices to identify the true fermentation period of a fermented grain sample and then, from the correspondence between fermentation periods and relative abundance intervals of the microbial community, obtains the microbial community characteristics of the sample to be tested.

The standard Maotai-flavor Baijiu process has a production cycle of one year comprising nine steaming rounds, which in chronological order are the Xiasha ("lower sand") round, the Caosha ("rough sand") round, and liquor rounds one to seven. In the first eight rounds, which require stacking fermentation, the stacking time of each round has an approximate range established by the distillery through years of production experience; this is the pre-accumulation time. Because solar terms and seasons differ from year to year, and because the rounds within one production year fall in different months and seasons, the stacking fermentation time differs between rounds, and for the same round it also differs slightly between production years. Naturally, the microbial communities are not identical either.

Construction of the quadratic discriminant model, back-judgment of the modeling samples and identification of the samples to be tested are preferably carried out with the R software. R is flexible to operate, places few constraints on how the raw data are organized, reads data simply and quickly, analyzes data efficiently and thoroughly, and allows the current analysis result to be used as the input data for the next analysis. Other software can of course also be used.

The R software used in the embodiments of the invention was developed by Robert Gentleman and Ross Ihaka of the University of Auckland, New Zealand, together with other volunteer developers. It is a suite that integrates data manipulation, computation and graphical display. It includes: effective data storage and handling; a complete set of operators for calculations on arrays (in particular matrices); a well-developed, simple and effective programming language (with conditionals, loops, user-defined functions, and input and output facilities); a complete and coherent set of data analysis tools; and powerful graphical facilities for data analysis and display. R is designed as a complete, unified system rather than, like other data analysis software, a collection of specialized and inflexible tools. For a detailed introduction to R, see Baidu Baike (https://baike.baidu.com/item/R%E8%AF%AD%E8%A8%80/4090790?fr=aladdin), or visit the official website (https://www.r-project.org/) to learn more and download the software free of charge.

Embodiment 1: Establishing the correspondence between the different fermentation periods and the relative abundance intervals of the microbial community

As shown in Fig. 1, a method for determining the microbial community structure of stacked fermented grains using physicochemical indices includes the following steps:

1. Information extraction

The stacked fermented grain samples in this embodiment are taken from the second liquor round. Sampling starts at pile formation (the pile has been fully gathered), which is the starting time point and is denoted P1 (0 hours); one further set of samples is taken at each of 48 hours, 120 hours, 144 hours and 168 hours, denoted P2, P3, P4 and P5 respectively. In each of the periods P1 to P5, 15 modeling samples are taken, for a total of 75 samples. The sample temperature is measured at the time of sampling. In the laboratory, genomic DNA is extracted from all samples, and the moisture (g/10 g fermented grains), lactic acid content (g/100 g fermented grains), acetic acid content (g/100 g fermented grains) and alcohol content (g/100 g fermented grains) of the samples are also measured. The sampling periods and measured physicochemical indices of the 75 samples are given in Table 1.

Table 1. Sampling periods and measured physicochemical indices of the 75 modeling samples

2. Obtaining the sequencing information

The 75 modeling samples are sequenced by Illumina paired-end sequencing to obtain the raw OTU data of the samples.

3. Establishing the discriminant model

The samples are classified into 5 groups by sampling period. The physicochemical parameters temperature, moisture, lactic acid content, acetic acid content and alcohol content are used as explanatory variables, and a discriminant model is constructed for these 5 groups of samples by quadratic discriminant analysis. This step is carried out with the R software.

The operating method is as follows. (1) In an Excel sheet, organize the physicochemical data of the modeling samples into a variable data frame of n rows × p columns. The row names are the names of the modeling samples, the column names are the names of the physicochemical indices, and the cells hold the measured value of each index for each sample. In addition, arrange a grouping data frame of n rows × 1 column according to the sampling period of each sample. Its row names are the sample names, its column name can be chosen freely (for example "packet", "period" or "group"), and each cell holds the sampling period of the sample (P1, P2, P3, P4 or P5). Note that the sample names must occupy corresponding positions in the two data frames: if sample M1 is in the third row of the variable data frame, it must also be in the third row of the grouping data frame. (2) Start R and copy the variable data frame in the Excel sheet, then type the following command exactly in the R console: data = read.table("clipboard", header = T, row.names = 1). The command must be typed without enclosing quotation marks, and it cannot be copied and pasted, because the clipboard must still hold the copied table. Next, copy the grouping data frame in the Excel sheet and type the following commands exactly in the R console, again by hand and without enclosing quotation marks: library(MASS); grp = read.table("clipboard", header = T, row.names = 1). (3) Enter the following command exactly in the R console: model = qda(data, grp[, 1]). This command is entered without enclosing quotation marks and may be copied and pasted. The object "model" is the constructed quadratic discriminant model.
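The R command model = qda(data, grp[, 1]) hides the arithmetic of the fit. For readers who want to see what such a model computes, the following dependency-free Python sketch estimates a mean vector, a covariance matrix and a prior for each period and returns the posterior (discriminant) probabilities for a new sample. It is a sketch under stated assumptions, not the patent's implementation: the function names are hypothetical, the class proportions are used as priors (which is also the documented default of qda in the MASS package), and no numerical safeguards beyond a singularity check are included.

```python
import math

def _mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def _covariance(rows, mu):
    """Unbiased sample covariance matrix of one group."""
    n, p = len(rows), len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov

def _solve_and_logdet(a, b):
    """Gaussian elimination with partial pivoting: returns (a^-1 b, log|det a|)."""
    p = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    logdet = 0.0
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("singular covariance matrix; more samples per group are needed")
        m[c], m[pivot] = m[pivot], m[c]
        logdet += math.log(abs(m[c][c]))
        for r in range(c + 1, p):
            factor = m[r][c] / m[c][c]
            for k in range(c, p + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * p
    for i in reversed(range(p)):
        x[i] = (m[i][p] - sum(m[i][k] * x[k] for k in range(i + 1, p))) / m[i][i]
    return x, logdet

def qda_fit(rows, labels):
    """Per-group mean, covariance and prior (group share of the modeling samples)."""
    model = {}
    for label in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == label]
        mu = _mean_vector(members)
        model[label] = (mu, _covariance(members, mu), len(members) / len(rows))
    return model

def qda_posterior(model, x):
    """Posterior probability of each group for one sample x."""
    scores = {}
    for label, (mu, cov, prior) in model.items():
        d = [a - b for a, b in zip(x, mu)]
        solved, logdet = _solve_and_logdet(cov, d)
        mahalanobis = sum(di * si for di, si in zip(d, solved))
        scores[label] = math.log(prior) - 0.5 * logdet - 0.5 * mahalanobis
    top = max(scores.values())
    weights = {k: math.exp(v - top) for k, v in scores.items()}  # shifted for stability
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

In this sketch `rows` plays the role of the variable data frame (one list of index values per sample) and `labels` the role of the grouping data frame, so the two must be in the same sample order, exactly as the text requires for the Excel sheets.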

4. Model validation

Model accuracy was verified by 1- to 10-fold cross-validation, with t = 1000 runs at each fold number; the results are shown in Table 2. Taking K = 1 to 10, the mean verification accuracy over the 10 fold numbers, 95.671%, is taken as the model accuracy, which shows that the constructed model discriminates well among samples from the 5 fermentation periods.
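A repeated K-fold cross-validation of a quadratic discriminant model can be sketched as follows. The sketch makes several assumptions that are not in the text: the data are simulated, the folds are stratified by period, only 20 runs per fold number are used instead of 1000 to keep it fast, and the fold numbers start at k = 2 because a standard split with k = 1 leaves no training data. The helper cv_once is a hypothetical name.

```r
library(MASS)  # provides qda()

set.seed(2)
# Simulated modeling data (hypothetical values), 20 samples per period.
grp <- factor(rep(paste0("P", 1:5), each = 20))
s <- as.integer(grp)
n <- length(grp)
data <- data.frame(
  temperature = 25 + 4.0 * s + rnorm(n),
  moisture    = 45 + 0.5 * s + rnorm(n, sd = 0.5),
  lactic      = 0.2 + 0.10 * s + rnorm(n, sd = 0.05),
  acetic      = 0.1 + 0.05 * s + rnorm(n, sd = 0.03),
  ethanol     = 0.5 + 0.30 * s + rnorm(n, sd = 0.10)
)

# One run of stratified k-fold cross-validation; returns the share of correct calls.
cv_once <- function(data, grp, k) {
  fold <- integer(length(grp))
  for (g in levels(grp)) {
    idx <- which(grp == g)
    fold[idx] <- sample(rep_len(seq_len(k), length(idx)))  # random fold labels within each period
  }
  hits <- 0
  for (f in seq_len(k)) {
    held <- fold == f
    fit <- qda(data[!held, ], grp[!held])          # train on the other folds
    pred <- predict(fit, data[held, ])$class       # classify the held-out fold
    hits <- hits + sum(as.character(pred) == as.character(grp[held]))
  }
  hits / length(grp)
}

t_runs <- 20  # the text uses t = 1000; reduced here for speed
acc_by_k <- sapply(2:10, function(k) mean(replicate(t_runs, cv_once(data, grp, k))))
names(acc_by_k) <- paste0("k=", 2:10)
model_accuracy <- mean(acc_by_k)  # mean accuracy over all fold numbers
print(round(acc_by_k, 3))
print(round(model_accuracy, 3))
```

Averaging over many random fold assignments reduces the dependence of the accuracy estimate on any single split.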

Table 2. 1- to 10-fold cross-validation results of the quadratic discriminant model

5. Back-judgment

The modeling samples are back-judged, i.e. re-classified by the model that was built from them. The procedure is to substitute the physicochemical indices of the modeling samples into the model and, with the physicochemical indices as predictor variables, discriminate the modeling samples. This step is performed in R: the discrimination model calculates the discrimination probability that a sample belongs to each fermentation period, and the period with the largest discrimination probability is the actual fermentation period. Back-judgment is successful when the fermentation period in which the sample was taken agrees with the period identified by the model; to be more conservative, the maximum discrimination probability calculated by the model must also be greater than 0.6. If the maximum discrimination probability is less than 0.6 and the corresponding identified period does not agree with the sampling period, the sample is either in a transition between fermentation periods (not belonging to any specific period) or fermenting abnormally.

The operating method for back-judgment is as follows. In R, after the quadratic discriminant model "model" has been built, enter "predmodel=predict(model, data); predmodel" (omit the enclosing quotation marks; this command may be copied and pasted) to obtain the identified period of each sample and its discrimination probability for each period.
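The back-judgment rule can be sketched as follows, assuming simulated data with hypothetical column names and values. The predict() call and the 0.6 threshold come from the text; the way the two conditions are combined into a single "success" flag follows the stricter reading described above.

```r
library(MASS)  # provides qda() and its predict() method

set.seed(1)
# Simulated modeling data (hypothetical values), 15 samples per period.
grp <- factor(rep(paste0("P", 1:5), each = 15))
s <- as.integer(grp)
n <- length(grp)
data <- data.frame(
  temperature = 25 + 4.0 * s + rnorm(n),
  moisture    = 45 + 0.5 * s + rnorm(n, sd = 0.5),
  lactic      = 0.2 + 0.10 * s + rnorm(n, sd = 0.05),
  acetic      = 0.1 + 0.05 * s + rnorm(n, sd = 0.03),
  ethanol     = 0.5 + 0.30 * s + rnorm(n, sd = 0.10),
  row.names   = paste0("M", seq_len(n))
)
model <- qda(data, grp)

# Back-judgment: classify the modeling samples with the model built from them.
predmodel <- predict(model, data)
max_prob <- apply(predmodel$posterior, 1, max)  # largest discrimination probability per sample
same_period <- as.character(predmodel$class) == as.character(grp)
success <- same_period & max_prob > 0.6         # agreement plus the 0.6 threshold

result <- data.frame(
  sampled  = grp,
  judged   = predmodel$class,
  max_prob = round(max_prob, 3),
  success  = success
)
print(table(result$sampled, result$success))
success_rate <- mean(success)
print(success_rate)
```

Only the rows flagged as successful would be carried forward to the interval estimation step.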

The back-judgment results are shown in Table 3. The numbers of successfully back-judged samples in each fermentation period are P1 (14), P2 (13), P3 (14), P4 (13), and P5 (15); the overall back-judgment success rate is 0.92, indicating that the samples are back-judged well.

Table 3

Note: The sampling period is the period in which the fermented grain sample was taken; the back-judged period is the actual fermentation period identified for the sample by back-judgment. Discrimination probabilities below 0.001 are reported as 0.

6. Interval estimation of community relative abundance

Using the microbial sequencing information of the successfully back-judged samples, the relative abundance interval of each bacterial community (at genus level) in the 5 fermentation periods is calculated by the Bootstrap method. Samples to be tested are then assigned to the fermentation stage identified for them, which yields a table like Table 4 giving the microbial community composition intervals of samples to be tested in the different fermentation stages. For example, if the fermentation degree of samples M4I1, M4I2, and M4I3 is at the P1 period, the proportion of Bacillus in them is estimated to lie between 45.31% and 50.47%. Next, the Bootstrap density curve of each microbial group is plotted, and the stability of the resulting interval is assessed from the normality of the curve. The more evident the normality, the more stable the interval, i.e. the better the result represents the facts. As shown in Figure 3, the relative abundance density curves of the 6 representative genera in each fermentation stage show good normality, indicating that the relative abundance intervals in Table 4 reflect fairly accurately the actual microbial composition of the fermentation period to which a sample to be tested belongs.

This step is performed in R using the boot and RAM packages. Parameter settings: 5000 Bootstrap resamples; bacterial OTU data are aggregated at the genus level.

The operating procedure is as follows. The samples in the OTU table are resampled with replacement, with the number of resamples m greater than the number of samples n (the larger m, the better). After each resample, the relative abundance of each community is calculated at the required taxonomic rank. After all resampling is complete, the relative abundance interval of each community at the 95% confidence level is estimated by the bias-corrected and accelerated percentile method (BCa). At the same time, the probability density curve of each group is plotted to assess the robustness of the interval estimate. This step uses the R packages boot and RAM: functions in the RAM package aggregate the OTU data into relative abundance data at different taxonomic ranks, and functions in the boot package resample the OTU data; after m resamples (5000 in this embodiment), the relative abundance interval of each community is obtained.
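The resampling and BCa interval estimation can be sketched with the boot package as follows. The count table is simulated, the genus names other than Bacillus are placeholders, and the statistic (mean of the per-sample relative abundance) is an assumption about how the per-period abundance is summarised. The genus-level aggregation that the text assigns to the RAM package is replaced here by plain base R arithmetic on a table that is already at genus level, and m is set to 2000 instead of 5000 for speed.

```r
library(boot)  # provides boot() and boot.ci()

set.seed(3)
# Hypothetical genus-level count table for one fermentation period:
# rows are successfully back-judged samples, columns are genera.
n <- 14
counts <- cbind(
  Bacillus      = rpois(n, 480),
  Lactobacillus = rpois(n, 300),
  Other         = rpois(n, 220)
)
rownames(counts) <- paste0("S", seq_len(n))
rel <- counts / rowSums(counts)  # per-sample relative abundance

# Statistic recomputed on every resample: mean relative abundance of one genus.
mean_abundance <- function(d, i) mean(d[i, "Bacillus"])

m <- 2000  # number of resamples; must exceed n (the text uses 5000)
b <- boot(rel, statistic = mean_abundance, R = m)

ci <- boot.ci(b, conf = 0.95, type = "bca")  # bias-corrected and accelerated interval
interval <- ci$bca[4:5]                      # lower and upper endpoints
print(round(100 * interval, 2))              # interval in percent

dens <- density(b$t[, 1])  # Bootstrap density curve; plot(dens) draws it for a visual normality check
```

Repeating the same call for each genus and each period would fill a table of the kind described as Table 4.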

Table 4. Correspondence between the fermentation periods of successfully back-judged samples and the microbial community relative abundance intervals

Embodiment 2: Identification of the microbial community structure of samples to be tested

1. Obtaining the physicochemical indices of the samples to be tested

In this embodiment, the stacked fermented grain samples were taken from the second round of liquor production. Sampling started at heap formation (the heap has been fully gathered), denoted P1 (0 hours); one sampling was then carried out at each of 48 hours, 120 hours, 144 hours, and 168 hours, denoted P2, P3, P4, and P5, respectively. Ten samples to be tested were taken in each of the periods P1 to P5, 50 samples in total. The sample temperature was measured at the time of sampling. The moisture content (g/10g fermented grains), lactic acid content (g/100g fermented grains), acetic acid content (g/100g fermented grains), and alcohol content (g/100g fermented grains) of the samples to be tested were also determined. The sampling periods and measured physicochemical properties of the 50 samples are shown in Table 5.

Table 5. Sampling periods and measured physicochemical properties of the 50 samples to be tested

2. Discriminating the actual fermentation period of the samples to be tested

The physicochemical indices of the samples to be tested are substituted into the discrimination model to discriminate their actual fermentation periods.

As shown in Figure 2, some samples taken in the P1 period have the highest discrimination probability for P1, so the model infers that their fermentation level is at period P1. Other samples taken in fermentation period P4 have the highest discrimination probability for P2, so the model infers that their fermentation level is at period P2; that is, fermentation proceeds more slowly at the heap positions where such samples were located, and although these samples were taken in the P4 period, their fermentation level is at the P2 period. After discrimination, the 50 samples to be tested were assigned to the 5 fermentation stages. Only one sample, M16IV5, had a maximum discrimination probability below 0.6 while its identified period (P3) also differed from its sampling period (P2), which suggests that this sample is most likely in the P2-P3 transition stage or that its fermentation is abnormal. Accordingly, the discrimination result for sample M16IV5 was not adopted. The remaining 49 samples met the requirement for adopting the discrimination result, showing that the model discriminates the samples to be tested well.

This step is performed in R. The operating method is as follows. ① In an Excel sheet, arrange the physicochemical data of the samples to be tested into a variable data frame of n rows × p columns. The row names are the sample names, the column names are the names of the physicochemical indices, and each cell holds the measured value of the corresponding index for the corresponding sample. Note that the physicochemical indices used for the samples to be tested must be the same as those used for the modeling samples, and each index must occupy the same column position in both variable data frames. For example, if lactic acid is in the second column of the variable data frame used for modeling, it must also be in the second column of the variable data frame of the samples to be tested. ② Copy the variable data frame from the Excel sheet and enter the command "test=read.table("clipboard", header=T, row.names=1)" in the R console (omit the enclosing quotation marks; the command must be typed by hand, because copying it for pasting would replace the data on the clipboard). ③ Substitute the variable data frame of the samples to be tested into the model "model" by entering the command "discrim=predict(model, test); discrim" (omit the enclosing quotation marks; this command may be copied and pasted) to obtain the identified period of each sample to be tested and its discrimination probability for each period.
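The discrimination of samples to be tested and the rule for not adopting a result can be sketched as follows. All data are simulated, the helper simulate_samples and the column names are hypothetical, and the columns of the test data are deliberately shuffled to show one way of enforcing the column-position requirement by name before calling predict().

```r
library(MASS)  # provides qda() and its predict() method

set.seed(4)
# Hypothetical simulator: n_per samples for each of the periods P1..P5.
simulate_samples <- function(n_per) {
  period <- factor(rep(paste0("P", 1:5), each = n_per))
  s <- as.integer(period)
  n <- length(period)
  x <- data.frame(
    temperature = 25 + 4.0 * s + rnorm(n),
    moisture    = 45 + 0.5 * s + rnorm(n, sd = 0.5),
    lactic      = 0.2 + 0.10 * s + rnorm(n, sd = 0.05),
    acetic      = 0.1 + 0.05 * s + rnorm(n, sd = 0.03),
    ethanol     = 0.5 + 0.30 * s + rnorm(n, sd = 0.10)
  )
  list(x = x, period = period)
}

train <- simulate_samples(15)  # modeling samples
model <- qda(train$x, train$period)

new <- simulate_samples(10)    # 50 samples to be tested
test <- new$x[, c("lactic", "temperature", "ethanol", "moisture", "acetic")]  # shuffled on purpose
test <- test[, colnames(train$x)]  # restore the column order used for modeling

discrim <- predict(model, test)
max_prob <- apply(discrim$posterior, 1, max)
mismatch <- as.character(discrim$class) != as.character(new$period)
rejected <- max_prob < 0.6 & mismatch  # result not adopted: low probability and period mismatch

result <- data.frame(
  sampled    = new$period,
  identified = discrim$class,
  max_prob   = round(max_prob, 3),
  rejected   = rejected
)
print(table(result$sampled, result$identified))
print(sum(rejected))
```

Reordering the columns by name makes the position requirement independent of how the spreadsheet happened to be laid out.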

Table 6. Fermentation period discrimination results of the samples to be tested

Note: The sampling period is the fermentation period in which the fermented grain sample was taken; the identified period is the actual fermentation period identified for the sample by the model. Discrimination probabilities below 0.001 are reported as 0. Samples whose discrimination results were not adopted are shown in bold.

3. Microbial community structure of the samples to be tested

From the actual fermentation period predicted by the model, and according to the correspondence obtained in Embodiment 1 between the fermentation periods and the relative abundance intervals of the dominant microbial communities, the microbial community structure of the samples to be tested is obtained; the results are shown in Table 7.

Table 7. Microbial community structure of the samples to be tested

Claims

1. A method for judging the microbial community structure of stacked fermented grains using physicochemical indices, comprising the following steps:

S1: Building the correspondence between the different fermentation periods and the microbial community relative abundance intervals;

(4) using the microbial sequencing information of the successfully back-judged fermented grain samples, calculating the microbial community relative abundance interval of each fermentation period by the Bootstrap method;

S2: Obtaining the physicochemical indices of the sample to be tested;

S4: From the actual fermentation period identified by the discrimination model and the correspondence in step S1 between the fermentation periods and the microbial community relative abundance intervals, obtaining the microbial community structure of the sample to be tested.

2. The method according to claim 1, characterized in that the fermentation periods are different periods selected consecutively within the whole stage from heap formation to cellar entry; preferably, when the expected stacking time is more than 4 days, the interval between periods is more than 24 hours, and when the expected stacking time is 2-4 days, the interval between periods is more than 12 hours.

3. The method according to claim 1, characterized in that the physicochemical indices of the fermented grain sample are at least three selected from moisture content, lactic acid content, acetic acid content, alcohol content, total sugar content, pH value, and temperature; more preferably, at least four physicochemical indices are selected; more preferably, the physicochemical indices of the fermented grain sample are temperature, moisture content, lactic acid content, acetic acid content, and alcohol content.

4. The method according to claim 1, characterized in that the quadratic discriminant model is built with statistical software; preferably, the quadratic discriminant model is built with R.

5. The method according to claim 4, characterized in that the discrimination model is verified by 1- to K-fold cross-validation; preferably, the quadratic discriminant model is verified by 1- to 10-fold cross-validation, with an accuracy of more than 95%.

6. The method according to claim 1, characterized in that a sufficient number means that the number of samples in each fermentation period is more than 10; preferably, the number of samples in each fermentation period is more than 15; more preferably, the number of samples in each fermentation period is more than 20; more preferably, the fermented grain samples are taken from different parts of the heap.

7. The method according to claim 1, characterized in that the calculation in step (4) of the microbial community relative abundance interval of each fermentation period by the Bootstrap method is performed with the R packages boot and RAM.

8. The method according to claim 1, characterized in that, in step (3), back-judgment means substituting the physicochemical indices of the modeling samples into the discrimination model to discriminate the actual fermentation stage of the modeling samples;

Preferably, successful back-judgment means that the fermentation period at sampling is consistent with the actual fermentation period identified by the model's back-judgment;

Preferably, in the back-judgment of step (3) and in step S3, the actual fermentation period identified by the discrimination model is obtained by having the discrimination model calculate the discrimination probability that a sample belongs to each fermentation period, the period with the largest discrimination probability being the actual fermentation period;

More preferably, successful back-judgment means that the fermentation period at sampling is consistent with the actual fermentation period identified by the model's back-judgment, and the maximum discrimination probability is greater than 0.6;

More preferably, in step S3, the physicochemical indices of the sample to be tested are substituted into the discrimination model to obtain the actual fermentation period of the sample to be tested; if the maximum discrimination probability calculated by the model is less than 0.6 and the identified actual fermentation period is inconsistent with the fermentation period at sampling, the microbial community structure of that sample to be tested is not judged.

9. The method according to claim 1, characterized in that the microbial sequencing information is obtained by extracting DNA from the fermented grain samples and sequencing it; preferably, the sequencing technology is second-generation sequencing; more preferably, the second-generation sequencing technology is selected from 454 GS FLX pyrosequencing or Illumina paired-end sequencing.

10. The method according to claim 2, characterized in that the fermentation of the fermented grains is divided into multiple rounds; the fermentation periods are different periods selected consecutively within the whole stage from heap formation to cellar entry in the second round;

Preferably, the interval between fermentation periods in the second round is not less than 24 hours, and the periods are divided in chronological order into five fermentation periods P1, P2, P3, P4, and P5;

Preferably, the discrimination model of the second round is built with R;

Preferably, the back-judgment of the second-round modeling samples is performed with R;

Preferably, the discrimination of the second-round samples to be tested is performed with R;

Preferably, the correspondence between the fermentation periods of the second round and the relative abundance intervals of the dominant microbial communities is as shown in the following table;

Preferably, the dominant microorganisms are those with a relative abundance of more than 0.5%.