WO2022080411A1 - Method for predicting soybean yield - Google Patents

Method for predicting soybean yield

Info

Publication number
WO2022080411A1
WO2022080411A1 PCT/JP2021/037900 JP2021037900W WO2022080411A1 WO 2022080411 A1 WO2022080411 A1 WO 2022080411A1 JP 2021037900 W JP2021037900 W JP 2021037900W WO 2022080411 A1 WO2022080411 A1 WO 2022080411A1
Authority
WO
WIPO (PCT)
Prior art keywords
yield
components
data
component
prediction model
Prior art date
Application number
PCT/JP2021/037900
Other languages
French (fr)
Japanese (ja)
Inventor
春香 前田
輝久 藤松
圭二 遠藤
Original Assignee
花王株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 花王株式会社 (Kao Corporation)
Publication of WO2022080411A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N27/00 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
    • G01N27/62 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating the ionisation of gases, e.g. aerosols; by investigating electric discharges, e.g. emission of cathode
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N5/00 Analysing materials by weighing, e.g. weighing small particles separated from a gas or liquid

Definitions

  • the present invention relates to a method for predicting the yield of rice at an early stage.
  • Rice is an important grain and is widely eaten in Japan and around the world. In addition, the production volume in Japan is higher than that of other typical grains such as corn, wheat and soybean. It is widely cultivated as one of such important grains, and techniques for increasing the yield are being developed.
  • the growing period of rice varies slightly depending on the variety and cultivation conditions, but it usually takes a long period of 3 to 6 months from sowing to harvesting. Therefore, in the development of a technique for increasing the yield of rice, it takes a lot of time for cultivation to evaluate the yield.
  • Rice, which takes 3 to 6 months to harvest, is generally cultivated once a year.
  • In Non-Patent Document 1, a model for predicting production at the national level is constructed using meteorological data such as cumulative precipitation and cumulative average temperature before sowing. In Patent Document 1 and Patent Document 2, an attempt is made to measure the amount of nitrogen in the leaf blade, the leaf color, or the amount of chlorophyll, evaluate growth and yield, and determine the amount of fertilizer to apply. Furthermore, Non-Patent Document 2 reports that metabolites extracted from rice seeds or from above-ground parts about 15 days after sowing were comprehensively measured by GC-MS and that a hybrid rice yield prediction model was created using these data.
  • The model of Non-Patent Document 1 predicts rice yield using meteorological data predicted before sowing, but its unit of prediction is the nation, so it is not suitable for evaluation when the predictors and the yield of each individual are to be matched.
  • The methods of Patent Documents 1 and 2 can be said to be non-destructive and simple measurements, but the prediction is made after the panicle formation stage, that is, after half of the growth period has elapsed. Furthermore, since the prediction is made on a field-by-field basis, they are not techniques for predicting yield at the individual level. Further, Non-Patent Document 2 does not perform cross-validation, the evaluation of model predictability that is normally carried out when constructing a predictive model.
  • The model of Non-Patent Document 2 is an overfitted model, in which the error for the model construction data is small but the prediction error for unknown data is large. Further, the method of Non-Patent Document 2 is invasive and is not suitable for evaluation when the predictors of each individual are to be matched with its yield.
  • Patent Document 1 Japanese Patent Application Laid-Open No. 2000-300077
  • Patent Document 2 Japanese Patent Application Laid-Open No. 2018-82648
  • Non-Patent Document 1 Iizumi, T. et al., Climate Services, 2018, vol. 11, pp. 13-23
  • Non-Patent Document 2 Dan, Z. et al., Scientific Reports, 2016, 6, 21732
  • The present invention provides a rice yield prediction method that obtains analysis data of one or more components from a leaf sample collected from rice and predicts the rice yield by using the correlation between the data and the rice yield.
  • R2 indicated as R2Y in the figure
  • Q2 indicated as Q2 in the figure
  • the present invention relates to providing a method for accurately predicting the yield of rice at an early stage.
  • The present inventors found that the metabolites contained in rice leaves include components whose abundance correlates with the yield, and that the final yield can be evaluated at the individual level by collecting a portion of the developed leaves at an early stage after sowing and analyzing the components contained in those leaves.
  • the yield of rice can be predicted at an early stage. As a result, for example, it becomes easy to determine the introduction of additional techniques for ensuring the yield, and it is possible to greatly improve the efficiency of the development of the techniques for increasing the yield.
  • rice means all plants belonging to the genus Oryza (scientific name Oryza) in the family Gramineae, and preferably means Asian rice (scientific name Oryza sativa) belonging to the genus Oryza in the family Gramineae.
  • Asian rice includes Japonica and Indica.
  • Examples of the varieties included in the Japonica species include Nihonbare, Milky Queen, Koshihikari, Hitomebore, Hinohikari, Akitakomachi, Nanatsuboshi, Haenuki, Masigura, Kinuhikari, Asahi no Yume, Yumepirika, Kinumusume, Koshihikari, Tsuyahime, Yume Tsuyahime, Fusakogane, and the like.
  • Examples of the varieties included in the indica species include Mudgo, Peta, Yodanya, Sigadis, IR8, IR36, IR72 and the like. As described above, there are a wide variety of varieties, but the present invention is not limited thereto.
  • The growth stages from the emergence of rice to heading are divided into the germination stage (around 5 days after sowing), the two-leaf stage (around 25 days after sowing), the tillering stage (around 60 days after sowing), the panicle formation stage (around 60 days after sowing), the flag leaf stage (around 100 days after sowing), and the heading stage (around 120 days after sowing).
  • The rice leaf sample may be collected at any time from the germination stage to the heading stage when leaves can be collected. The collection time is preferably from the two-leaf stage to the heading stage, more preferably from the two-leaf stage to the flag leaf stage, still more preferably from the two-leaf stage to the panicle formation stage, and even more preferably from the two-leaf stage to the tillering stage.
  • the number of days before and after each of the above growth stages is within 5 days.
  • The time for collecting rice leaves is 5 days or more after sowing, preferably 16 days or more, more preferably 25 days or more, and still more preferably 40 days or more, and is preferably 90 days or less after sowing, more preferably 70 days or less, and still more preferably 60 days or less. That is, it may be 5 to 90 days after sowing, preferably 16 to 70 days, more preferably 25 to 60 days, and even more preferably 40 to 60 days.
  • The numbers of days after sowing given above are those for seedlings transplanted and cultivated outdoors.
  • Under direct-sowing cultivation in an indoor greenhouse, the number of days after sowing at each growth stage differs from the above, but those skilled in the art can determine the corresponding number of days for each growth stage in view of the cultivation conditions.
  • For example, the tillering stage is about 60 days after sowing when seedlings are transplanted outdoors, but about 30 days after sowing when seeds are sown directly in an indoor greenhouse.
  • leaf samples may be collected at a time corresponding to the above-mentioned collection time.
  • The site for collecting the leaf sample is not particularly limited; for example, about 1 to 5 leaves, preferably about 2 to 6, and more preferably about 4 to 7 leaves are cut at the base of the plant and collected.
  • The yield means the grain mass per individual or the number of grains per individual.
  • The mass is not particularly limited, but dry mass is preferable. Of these, the dry grain mass per individual is particularly preferable.
  • The dry grain mass in the present invention means the mass measured after the grain has been dried at 90 °C for 72 hours so that its water content is reduced to 7% or less. The dry grain mass is preferably measured with a balance that has a calibration function, such as an electronic balance.
  • The analysis data of the components to be obtained include data analyzed or measured using instrumental analysis means such as high performance liquid chromatography (HPLC), gas chromatography (GC), ion chromatography, mass spectrometry (MS), near-infrared spectroscopy (NIR), Fourier transform infrared spectroscopy (FT-IR), nuclear magnetic resonance analysis (NMR), Fourier transform nuclear magnetic resonance analysis (FT-NMR), inductively coupled plasma mass spectrometry (ICP-MS), and LC/MS, which combines a liquid chromatograph with a mass spectrometer. Mass spectrometric data are preferred, and mass spectrometric data obtained by LC/MS are more preferred.
  • The mass spectrometric data include precise mass (m/z value), ion intensity, retention time, and the like, and precise mass information is preferable.
  • To apply the leaf sample to the above instrumental analysis means, it is pretreated as appropriate for the analysis means. Usually, the collected leaves are wrapped in aluminum foil and immediately frozen in liquid nitrogen to stop metabolic reactions, then freeze-dried and subjected to an extraction operation. Extraction is performed by pulverizing the freeze-dried leaf sample using a bead crusher or the like, adding an extraction solvent, and stirring the mixture.
  • Examples of the extraction solvent used here include methanol, ethanol, butanol, acetonitrile, chloroform, ethyl acetate, hexane, acetone, isopropanol, water, and mixtures thereof.
  • An 80 v/v% aqueous methanol solution to which an internal standard substance has been added is preferably used.
  • examples of the components in the leaves to be analyzed include rice metabolites separated and detected by LC / MS.
  • the components provided by mass spectrometry have a precise mass (m / z) of 101-1215. More preferably, the 1,324 components listed in Tables 1a to 1i below, which are defined by the precise mass (m / z value) provided by mass spectrometry, can be mentioned.
  • Partial decomposition products that were detected were treated as components distinct from the original metabolites.
  • The 1,324 components are selectively extracted from the metabolites of rice, and the selection method is described in detail in the Examples. In outline: 1) rice is directly sown and cultivated; 2) 4 to 7 leaves are collected about 1 month after sowing to obtain leaf samples; 3) components are extracted using 80 v/v% methanol; 4) LC/MS analysis is performed to obtain molecular ion information (precise mass, m/z) and structural information derived from fragments; and 5) peaks derived from components are extracted, after which the peaks are aligned between samples, isotope peaks are removed, peak intensities are corrected between samples, and noise is removed, yielding the analysis data of the 1,324 components.
  • The method of correcting peak intensity between samples is not particularly limited; one example is correction using the pooled QC method.
  • In this method, a sample called pooled QC is prepared by mixing a fixed amount from all the samples in the same batch, and the pooled QC is analyzed at a constant frequency (about once every 5 to 9 injections) among the samples.
  • From the pooled QC results, an estimate is calculated of what each peak intensity would have been if the QC sample had been analyzed at the time each sample was analyzed, and each sample is corrected with that value. This corrects differences in sensitivity between samples.
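  • The pooled QC correction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the source: the QC intensity at each sample's injection position is estimated by linear interpolation between neighboring QC injections (the Examples use a LOWESS-based tool instead), and the corrected value is rescaled by the median QC intensity. The function names and the numbers are hypothetical.

```python
from bisect import bisect_left
from statistics import median


def estimate_qc_intensity(order, qc_orders, qc_values):
    """Estimate the QC peak intensity expected at injection position `order`.

    Assumption: simple linear interpolation between the surrounding QC
    injections (a stand-in for the LOWESS smoothing mentioned in the source).
    `qc_orders` must be sorted in ascending order.
    """
    if order <= qc_orders[0]:
        return qc_values[0]
    if order >= qc_orders[-1]:
        return qc_values[-1]
    i = bisect_left(qc_orders, order)
    x0, x1 = qc_orders[i - 1], qc_orders[i]
    y0, y1 = qc_values[i - 1], qc_values[i]
    return y0 + (y1 - y0) * (order - x0) / (x1 - x0)


def correct_peak(sample_order, sample_intensity, qc_orders, qc_values):
    """Divide out the estimated drift, then rescale to the median QC level."""
    drift = estimate_qc_intensity(sample_order, qc_orders, qc_values)
    return sample_intensity * median(qc_values) / drift


# Hypothetical single peak: QC injected at positions 0, 6 and 12,
# with sensitivity drifting downward over the batch.
qc_orders = [0, 6, 12]
qc_values = [100.0, 80.0, 60.0]
corrected = correct_peak(3, 45.0, qc_orders, qc_values)
```

  • In practice the same correction would be applied peak by peak across all samples in a batch.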
  • the data correction method does not significantly affect the correlation with yield and the performance of the prediction model.
  • The component to be analyzed in the present invention is a component having a significant correlation with the yield (p < 0.05) and an absolute value of the correlation coefficient
  • The 1,324 components are defined by the precise mass obtained by mass spectrometry, and the composition formula of each compound can be estimated from these precise mass data. Partial structure information of the compound can also be obtained from the MS/MS data acquired in the same analysis. The target component can therefore be estimated from the composition formula and the partial structure information, and components that can be compared against a reagent standard can be identified.
  • No. 10 was presumed to have the composition formula C6H9N3, No. 178 the composition formula C13H18O3, No. 245 the composition formula C13H20O4, No. 272 the composition formula C13H28O4, No. 347 the composition formula C18H26O2, No. 416 and No. 417 the composition formula C18H28O3, No. 539 the composition formula C15H22O8, No. 729 the composition formula C19H30O7, No. 1050 the composition formula C24H28O11, and No. 1182 the composition formula C34H40O10.
  • Of the above 1,324 components, preferred are components having a significant correlation with the yield (p < 0.05) and an absolute value of the correlation coefficient
  • > 0.66 eg, a correlation coefficient of ⁇ 0.825.
  • Peak area can also be measured for the rice leaf sample to be predicted, and the yield can be predicted from the correlation between the known yield and the measured peak area.
  • The yield can be predicted by using the analysis data of a plurality of the above 1,324 components and collating them with a yield prediction model constructed by a multivariate analysis method. That is, a rice leaf sample is collected after a predetermined period from sowing, an analysis sample is prepared from it, the analysis sample is subjected to instrumental analysis to obtain instrumental analysis data, and the instrumental analysis data are collated with the yield prediction model, whereby the yield of the rice can be predicted.
  • the yield prediction model can be constructed by performing regression analysis using the peak area value of the corrected component analysis data with each precise mass as the explanatory variable and the yield value as the objective variable.
  • Regression analysis methods include, for example, multivariate regression analysis methods such as principal component regression analysis, PLS (partial least squares projection to latent structures) regression analysis, OPLS (orthogonal projections to latent structures) regression analysis, and generalized linear regression analysis, as well as machine learning regression methods such as bagging, support vector machines, random forests, and neural network regression analysis. Of these, it is preferable to use the PLS method, the OPLS method, which is an improved version of the PLS method, or a machine learning regression method.
  • The OPLS method has the same predictability as the PLS method, but is superior for the present purpose in that the results are easier to visualize and interpret.
  • Both the PLS method and the OPLS method are methods in which information is aggregated from high-dimensional data, replaced with a small number of latent variables, and the objective variable is expressed using the latent variables. It is important to properly select the number of latent variables, and cross-validation is often used to determine the number of latent variables. That is, the data for model construction is divided into several groups, one group is used for model verification, and the other group is used for model construction to estimate the prediction error, and this work is repeated while exchanging the groups to obtain the prediction error. The number of latent variables with the smallest total is chosen.
  • The prediction model is evaluated mainly by two indicators:
  • R2, which represents prediction accuracy
  • Q2, which represents predictability
  • R2 is the square of the correlation coefficient between the measured values of the data used for constructing the prediction model and the predicted values calculated by the model, and the closer it is to 1, the higher the prediction accuracy.
  • Q2 is the result of the above cross-validation, and represents the square of the correlation coefficient between the measured values and the predicted values obtained through repeated model validation.
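  • The two indicators can be illustrated with a small sketch. As a loudly stated assumption, the model here is an ordinary least-squares line on a single component with leave-one-out cross-validation, standing in for the multivariate PLS/OPLS model of the source; only the definitions of R2 and Q2 as squared correlation coefficients follow the text. The data values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def squared_correlation(observed, predicted):
    """Square of the Pearson correlation between measured and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def r2_and_q2(x, y):
    """R2 from the model fitted on all data; Q2 from leave-one-out predictions."""
    slope, intercept = fit_line(x, y)
    fitted = [slope * a + intercept for a in x]
    cv_predicted = []
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        s, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        cv_predicted.append(s * x[i] + b)
    return squared_correlation(y, fitted), squared_correlation(y, cv_predicted)


# Hypothetical peak areas (explanatory variable) and yields (objective variable).
peak_area = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
yield_g = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
r2, q2 = r2_and_q2(peak_area, yield_g)
```

  • A large gap between R2 and Q2 in such a calculation is the usual sign of the overfitting discussed for Non-Patent Document 2.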
  • a list of the top 800 VIP values is shown in Tables 5a to 5j below.
  • Model construction using the VIP value as an index (2-1) Model using analysis data of the components up to the top 800 VIP values Select all the components up to the top 800 VIP values and select the 800 per data.
  • For simple prediction, the number of components used is preferably as small as possible, for example 10 or less, preferably 5 or less, more preferably 3 or less, and most preferably one. When higher accuracy is desired, the number of components is preferably large, for example 11 or more, preferably 20 or more, more preferably 50 or more, still more preferably 90 or more, and most preferably 150 or more. When predicting with a small number of components, it is preferable to use components having higher VIP values or higher correlation coefficients.
  • A component having a higher VIP value is, for example, at least one component selected from the top 800 VIP values, preferably at least 5 components selected from the top 800, more preferably at least 10 components selected from the top 800, still more preferably at least 1 component selected from the top 10, still more preferably at least 5 components selected from the top 10, still more preferably at least 9 components selected from the top 10, and still more preferably the components with the top 800 VIP values.
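  • Selecting components by VIP ranking can be sketched as below. The VIP values are hypothetical placeholders rather than values from Tables 5a to 5j, and the helper name is an assumption made for illustration.

```python
def top_components_by_vip(vip_by_component, n):
    """Return the n component numbers with the largest VIP values, highest first."""
    ranked = sorted(vip_by_component.items(), key=lambda kv: kv[1], reverse=True)
    return [component for component, _ in ranked[:n]]


# Hypothetical VIP values keyed by component No.
vip = {10: 2.1, 178: 1.7, 245: 1.9, 272: 0.8, 347: 1.2}
top3 = top_components_by_vip(vip, 3)
```

  • The selected component numbers would then define which peak areas are passed to the regression model.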
  • <1> A rice yield prediction method comprising acquiring, from a leaf sample collected from rice, analysis data of one or more components selected from the components having a precise mass (m/z) of 101 to 1215 provided by mass spectrometry, and predicting the rice yield using the correlation between the data and the rice yield.
  • <2> The method according to <1>, wherein the analysis data of the one or more components is corrected by the pooled QC method.
  • The method according to <1> or <2>, wherein the component is at least one selected from the components shown in Tables 1a to 1i defined by the precise mass (m/z) provided by mass spectrometry.
  • the components are the following component Nos. shown in Tables 1a to 1i: 1, 4, 6, 9, 10, 11, 12, 19, 20, 21, 23, 26, 27, 29, 30, 33, 34, 35, 38, 39, 45, 46, 47, 48, 49, 50, 51, 52, 54, 55, 56, 61, 62, 63, 64, 65, 66, 67, 69, 71, 75, 76, 77, 78, 81, 83, 84, 85, 88, 89, 90, 91, 92, 96, 100, 102, 105, 106, 107, 108, 109, 113, 116, 118, 119, 120, 121, 123, 124, 126, 127, 129, 130, 131, 133, 134, 137, 139, 142, 145, 147, 148, 149, 151, 152, 153, 154, 155, 156, 159, 162, 163, 164, 165, 166,
  • The method according to <3>, wherein the components are one or more selected from component Nos. 10, 177, 178, 245, 254, 272, 294, 337, 366, 435, 462, 259, 359, 708, 729, 832, 842, 869, 901, 912, 1050, 1050, 1173 and 1306 shown in Tables 1a to 1i.
  • the components are the component Nos. No. 1 shown in Tables 1a to 1i. 10.
  • the components are the component Nos. No. 1 shown in Tables 1a to 1i.
  • Component No. 10 is a component of the composition formula C6H9N3, component No. 178 is a component of the composition formula C13H18O3, component No. 245 is a component of the composition formula C13H20O4, component No. 272 is a component of the composition formula C13H28O4, component No. 347 is a component of the composition formula C18H26O2, component No. 416 is a component of the composition formula C18H28O3, and component No. 417 is a component of the composition formula C18H28O3.
  • <9> The method according to any one of <1> to <7>, wherein the leaf sample is collected from rice in the two-leaf stage to the panicle formation stage.
  • the analysis data is mass spectrometry data.
  • <11> The method according to any one of <3> to <10>, comprising a step of collating the component analysis data obtained from the leaf sample with a yield prediction model constructed using the analysis data of at least one of the components with the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
  • The method described, wherein the yield prediction model uses at least 5 of the components with the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
  • <13> The method described, wherein the yield prediction model uses at least 10 of the components with the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
  • <14> The method described, wherein the yield prediction model uses at least one of the components with the top 10 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
  • The method described, wherein the yield prediction model uses at least 9 of the components with the top 10 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
  • <17> The method according to <11>, wherein the yield prediction model uses the components with the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
  • ⁇ 18> The method according to any one of ⁇ 11> to ⁇ 17>, wherein the yield prediction model is a model constructed by using the OPLS method.
  • the precision mass is measured with an accuracy of four or more digits after the decimal point.
  • Comparative Example 1. Acquisition of data for analysis: The data (https://www.nature.com/articles/srep21732#Sec8) published together with the paper of Dan et al. (Non-Patent Document 2) were obtained. The dry grain mass per individual was used as the yield data. For the leaf extract data, all published data were used for analysis.
  • The prediction model is evaluated mainly by two indices: R2, which represents prediction accuracy, and Q2, which represents predictability.
  • R2 is the square of the correlation coefficient between the measured values of the data used for constructing the prediction model and the predicted values calculated by the model, and the closer it is to 1, the higher the prediction accuracy.
  • Q2 is the result of the cross-validation, and represents the square of the correlation coefficient between the measured values and the predicted values obtained through repeated model validation.
  • Q2 > 0.50 was used as the criterion for model evaluation. Since R2 > Q2 is always satisfied, Q2 > 0.50 implies R2 > 0.50.
  • Example 1 Cultivation test The data of the pot cultivation test in the greenhouse conducted in 2019 will be described in detail. Pot cultivation was carried out in a greenhouse in Hiratsuka City, Kanagawa Prefecture. The soil used was the field soil in the Tochigi Plant of Kao Corporation. The basic fertilizer application amount was set to 0.8 g per pot, and a chemical fertilizer containing nitrogen, phosphorus and potassium as fertilizer components (trade name "Hyakukatsu Ichiki” Kanryo Chemical Co., Ltd.) was mixed with 4 L of soil. By setting the conditions of 1/4 times, 1/2 times, 2 times and 4 times the basic fertilizer application amount, cultivation was carried out under a total of 5 types of fertilizer application conditions.
  • a 1 / 5000a Wagner pot was used as a pot, and about 4 L of the above soil was packed per pot to prepare 30 pots.
  • 3 seeds were sown at 2 places in each pot (6 seeds were used per pot).
  • Plants derived from three seeds were collectively treated as one individual.
  • the cultivar used was Japonica rice "Nihonbare”. When two true leaves were unfolded, they were thinned out so that there would be one plant per pot. From August 22nd to October 28th, the plants were cultivated under flooded conditions, and from sowing to the start of flooding, watering was carried out once a week to the extent that the soil was moistened. No watering was done after October 28th. Sampling was done on August 28th.
  • Leaf sampling was performed 30 days after sowing, during the daytime (generally from 13:00 to 15:00). The growth stage of the rice at this time differed slightly between individuals, but the number of leaves per individual was about 15 to 20, corresponding to the tillering stage. For sampling, 4 to 7 leaves were cut at the base of the plant, taken evenly from across the whole plant. The collected leaves were wrapped in aluminum foil and immediately frozen in liquid nitrogen to stop metabolic reactions. The frozen samples were taken back to the laboratory while kept frozen, and were dried by freeze-drying. The dried samples were subjected to the extraction operation described later.
  • LC/MS analysis was performed using an Agilent HPLC system (Infinity 1260 series) as the front end and an AB SCIEX Q-TOFMS instrument (TripleTOF4600) as the detector. The separation column was a core-shell column, Capcell core C18 (2.1 mm ID × 100 mm, particle size 2.7 μm) manufactured by Shiseido Co., Ltd., used with a guard column (2.1 mm ID × 5 mm, particle size 2.7 μm), and the column temperature was set to 40 °C. The autosampler was kept at 5 °C during the analysis. 5 μL of the analytical sample was injected.
  • A: 0.1 v/v% aqueous formic acid solution and B: 0.1 v/v% formic acid in acetonitrile were used as eluents.
  • The gradient elution conditions were as follows: 1 v/v% B (99 v/v% A) was maintained from 0 to 0.1 minutes, the ratio of eluent B was increased from 1 v/v% B to 99.5 v/v% B from 0.1 to 13 minutes, and 99.5 v/v% B was maintained from 13.01 to 16 minutes.
  • The flow rate was 0.5 mL/min.
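  • As a worked illustration, the gradient program above can be written as a piecewise function of time. This is a sketch under the assumption that the ramp between 0.1 and 13 minutes is linear and that the short interval from 13 to 13.01 minutes is treated as part of the hold; the function name is hypothetical.

```python
def percent_b(t_min):
    """Eluent B fraction (v/v%) at time t_min, following the stated gradient.

    Assumptions: linear ramp from 0.1 to 13 min; hold at 99.5% from 13 min
    to the end of the 16 min run.
    """
    if t_min < 0 or t_min > 16:
        raise ValueError("time outside the 0 to 16 minute run")
    if t_min <= 0.1:
        return 1.0
    if t_min <= 13.0:
        return 1.0 + (99.5 - 1.0) * (t_min - 0.1) / (13.0 - 0.1)
    return 99.5


# Composition at the midpoint of the ramp (6.55 min).
midpoint_b = percent_b(6.55)
```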
  • the ionization mode was set to the positive mode, and ESI was used as the ionization method.
  • Eluted ions were scanned by TOFMS for 0.1 seconds, the 10 highest-intensity ions were selected, and each was subjected to MS/MS for 0.05 seconds; this cycle was repeated.
  • Molecular ion information (precision mass, m / z) obtained by TOFMS scanning and structural information derived from fragments generated by MS / MS scanning were acquired.
  • the mass measurement range was set to m / z 100-1,250 for TOFMS and m / z 50-1,250 for MS / MS.
  • Peak finding option is a peak corresponding to a retention time of 0.5 to 16 minutes, 20 scans of Subtraction offset in the item of “Enhance Peak Finding”, 5 ppm of Minimum spectral peak width, and 5 ppm of Minimum spectral peak. .. Factor was set to 1.2, Minimum RT peak width was set to 10 scans, Noise threshold was set to 5, and the Assign charge state in the "More" item was checked. As a result, peak information of 31,649 was obtained.
  • An alignment process was performed to align the detected peaks between the analyzed samples.
  • As the alignment processing conditions ("Alignment & Filtering"), the Retention time tolerance in the "Alignment" item was set to 0.20 minutes and the Mass tolerance was set to 10.0 ppm. In the "Filtering" item, the Intensity threshold was set to 10, Retention time filtering was checked, Remove peaks in < 3 samples was set, and the Maximum number of peaks was set to 50,000.
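  • The role of the two alignment tolerances can be illustrated with a small matching test. This is only a sketch: the source does not describe how the software aligns peaks internally, so the rule below (both tolerances must hold, with the ppm difference taken relative to the reference peak) is an assumption, and the peak values are hypothetical.

```python
def ppm_difference(mz, mz_ref):
    """Mass difference in parts per million, relative to the reference m/z."""
    return abs(mz - mz_ref) / mz_ref * 1e6


def peaks_match(peak, ref, rt_tol_min=0.20, mass_tol_ppm=10.0):
    """True if two (m/z, retention time) peaks fall within both tolerances."""
    mz, rt = peak
    mz_ref, rt_ref = ref
    return (abs(rt - rt_ref) <= rt_tol_min
            and ppm_difference(mz, mz_ref) <= mass_tol_ppm)


# Hypothetical reference peak and three candidates from another sample.
reference = (235.1805, 5.00)
same_peak = peaks_match((235.1820, 5.10), reference)   # about 6.4 ppm, 0.1 min apart
wrong_mass = peaks_match((235.1900, 5.10), reference)  # about 40 ppm apart
wrong_rt = peaks_match((235.1805, 5.30), reference)    # 0.3 min apart
```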
  • the retention time was corrected using the lidocaine peak in the item of "Internal standard”.
  • the isotope peak was removed. Since the isotope peak is automatically recognized by the software at the time of peak extraction and labeled as "isotopic" on the peak list, the corresponding peak was deleted by sorting by "isotopic". As a result, the peak decreased to 25,895 peaks.
  • A sample called pooled QC was prepared by mixing a fixed amount from all the samples, and the pooled QC was analyzed once every 6 injections. From all of these QC analysis results, an estimate was calculated of what each peak intensity would have been if the QC sample had been analyzed at the time each sample was analyzed, and each sample was corrected with that value, thereby correcting the sensitivity between samples in the same batch. Free software provided by RIKEN (LOWESS-Normalization-Tool) was used for this processing.
  • Correlation analysis was performed using the analysis data of the 1,324 components in the leaves of the 26 individuals and the corresponding yield data (dry grain mass), that is, 26 × 1,324 matrix data.
  • For each component, the simple correlation coefficient r between its analysis data and the yield data was calculated, together with the p-value from a test of no correlation.
  • the results are shown in Tables 4a-4q.
  • the "component No.” in the table is for convenience, in which 1,324 components are numbered from the smallest mass number when arranged in order of mass.
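  • The correlation analysis described above can be sketched for a single component as follows. The sketch computes Pearson's r and the t statistic of the test of no correlation, and compares it with the two-sided 5% critical value of Student's t for 24 degrees of freedom (about 2.064), which corresponds to the 26 individuals in this example. The function names are hypothetical, and the source does not state which software performed the test.

```python
import math

# Two-sided 5% critical value of Student's t with 24 degrees of freedom (n = 26).
T_CRIT_DF24 = 2.064


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of no correlation (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(r, n=26, t_crit=T_CRIT_DF24):
    """True if |t| exceeds the critical value; the default t_crit is only valid for n = 26."""
    return abs(t_statistic(r, n)) > t_crit


strong = is_significant(0.66)  # t is about 4.3
weak = is_significant(0.30)    # t is about 1.5
```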
  • The analysis results include retention time information as well as mass information. Since it has been shown that mass spectrometric data can be compared and analyzed between samples, the retention time information was removed and only the precise mass information is listed.
  • Model construction / evaluation: A multivariate analysis method was used to construct a yield prediction model from the analysis data of two or more components, using SIMCA ver. 14 (Umetrics).
  • Regression analysis was performed using the peak area values of the corrected component analysis data at each precise mass as the explanatory variables and the yield value as the objective variable.
  • Regression analysis was performed by the OPLS method, which is an improved version of the PLS method.
  • The prediction model is evaluated mainly by two indices.
  • R², which represents prediction accuracy
  • Q², which represents predictability
  • R² is the square of the correlation coefficient between the measured values of the data used to construct the prediction model and the predicted values calculated by the model; the closer it is to 1, the higher the prediction accuracy.
  • Q² is the result of the cross-validation and represents the square of the correlation coefficient between the measured values and the predicted values obtained from the repeated model validation. From a prediction point of view, a model is considered to have good predictability if at least Q² > 0.50 (Triba, M.N. et al., Mol. BioSyst. 2015, 11, 13-19), so Q² > 0.50 was used as the criterion for model evaluation. Since R² > Q² always holds, Q² > 0.50 implies R² > 0.50.
  • VIP value calculation: In the model constructed in 8-1, the VIP (Variable Importance in the Projection) value, which is the contribution to model performance assigned to each component, is calculated. The larger the VIP value, the greater the contribution to the model; the VIP value also correlates with the absolute value of the correlation coefficient. The components ranked in the top 800 by VIP value are listed in Tables 5a to 5j.
  • Model construction using the VIP value as an index: Models were constructed from a plurality of components chosen on the basis of the VIP value ranking (Tables 5a to 5j), which reflects the contribution of each component to the model constructed in 8-1.
  • The model performance criterion is set to Q² > 0.50 for convenience.
  • Model using the analysis data of the components ranked in the top 800 by VIP value: All components ranked in the top 800 by VIP value are selected, and each data record has the peak area values of the analysis data of the 800 components and the yield value.

Abstract

Provided is a method for predicting soybean yield at an early stage with high accuracy. A method for predicting soybean yield, comprising acquiring analysis data of one or more components from a leaf sample collected from soybean, and predicting the soybean yield by using a correlation between the data and the soybean yield.

Description

Rice yield prediction method
The present invention relates to a method for predicting the yield of rice at an early stage.
Rice is an important grain and is widely eaten in Japan and around the world. Its production volume in Japan is also higher than that of other major grains such as corn, wheat and soybean. As one of these important grains it is widely cultivated, and techniques for increasing its yield are being developed.
The growing period of rice varies slightly with variety and cultivation conditions, but it usually takes a long period of 3 to 6 months from sowing to harvest. Consequently, in developing techniques for increasing rice yield, a great deal of cultivation time is needed to evaluate yield. Furthermore, under seasonal and climatic conditions such as those of Japan, rice, which takes 3 to 6 months to harvest, is generally grown once a year. Because yield in outdoor cultivation can be evaluated only once a year, which hinders the development of yield-increasing techniques, a method for predicting yield at an early stage has been sought. In addition, if yield could be predicted early in actual production, producers could easily decide whether to introduce costly additional techniques to secure a stable yield.
Various methods for evaluating yield at an early stage using meteorological data, field images and growth information have been studied. For example, Non-Patent Document 1 constructs a model that predicts production at the national level before sowing, using meteorological data such as cumulative precipitation and cumulative mean temperature. Patent Document 1 and Patent Document 2 attempt to measure leaf blade nitrogen content, leaf color or chlorophyll content, evaluate growth and yield, and determine the amount of fertilizer to apply. Furthermore, Non-Patent Document 2 comprehensively measures, by GC-MS, the metabolites extracted from rice seeds or from above-ground parts about 15 days after sowing, and reports that a hybrid rice yield prediction model was created from these data.
However, although the model of Non-Patent Document 1 predicts rice yield using meteorological data forecast before sowing, its unit of prediction is the country, and it is not suited to evaluations in which predictors are to be matched with yield for each individual plant. The methods of Patent Documents 1 and 2 are non-destructive and simple measurements, but prediction is possible only from the panicle formation stage onward, that is, after half of the growing period has elapsed. Moreover, because they predict on a field-by-field basis, they are not techniques for predicting yield at the individual level. In Non-Patent Document 2, the predictability of the model was not evaluated by cross-validation, which is normally performed when constructing a prediction model. Specifically, although R², which indicates the prediction accuracy of the model, is reported as 0.82, there is no mention of Q², which indicates the predictability of the model and is as important as R². There is therefore concern that the model of Non-Patent Document 2 is overfitted, with a small error on the model construction data but a large prediction error on unknown data. In addition, the method of Non-Patent Document 2 is invasive and is not suited to evaluations in which predictors are to be matched with yield for each individual plant.
[Patent Document 1] Japanese Patent Application Laid-Open No. 2000-300077
[Patent Document 2] Japanese Patent Application Laid-Open No. 2018-82648
[Non-Patent Document 1] Iizumi, T. et al., Climate Services, 2018, vol. 11, p. 13-23
[Non-Patent Document 2] Dan, Z. et al., Scientific Reports, 2016, 6, 21732
The present invention provides a rice yield prediction method comprising obtaining analysis data of one or more components from a leaf sample collected from rice, and predicting the rice yield using the correlation between the data and the rice yield.
Fig. 1: Relationship between the predicted and measured yield values for the OPLS model constructed using the data of Non-Patent Document 2.
Fig. 2: Relationship between the predicted and measured yield values for the OPLS model constructed using all 26 data.
Fig. 3: R² (shown as R2Y in the figure) and Q² (shown as Q2 in the figure) values of the models constructed by the OPLS method using, respectively, the analysis data of all components ranked 11th to 800th by VIP value in the model of Fig. 2, all components ranked 21st to 800th, all components ranked 31st to 800th, and so on up to all components ranked 111th to 800th.
Fig. 4: R² (shown as R2Y in the figure) and Q² (shown as Q2 in the figure) values of the models constructed by the OPLS method for each combination of nine components (10 combinations) taken from the analysis data of the components ranked 1st to 10th by VIP value in the model of Fig. 2.
Detailed description of the invention
The present invention relates to providing a method for predicting the yield of rice at an early stage and with high accuracy.
As a result of various studies on evaluating the yield of rice, the present inventors found that the metabolites contained in the leaves include components whose abundance correlates with yield, and that the final yield can be evaluated at the individual level by collecting part of the expanded leaves of rice at an early stage after sowing and analyzing the components contained in the leaves.
According to the method of the present invention, the yield of rice can be predicted at an early stage. This makes it easier, for example, to decide whether to introduce additional techniques to secure yield, and can greatly improve the efficiency of developing yield-increasing techniques.
In the present invention, rice means any plant belonging to the genus Oryza of the family Gramineae, and preferably means Asian rice (Oryza sativa). Asian rice includes Japonica and Indica types. Japonica varieties include Nihonbare, Milky Queen, Koshihikari, Hitomebore, Hinohikari, Akitakomachi, Nanatsuboshi, Haenuki, Masshigura, Kinuhikari, Asahi no Yume, Yumepirika, Kinumusume, Koshiibuki, Tsuyahime, Yumetsukushi and Fusakogane. Indica varieties include Mudgo, Peta, Yodanya, Sigadis, IR8, IR36 and IR72. Although the varieties are thus diverse, the present invention is not limited to them.
The growth stages of rice from emergence to heading are divided into the emergence stage (around 5 days after sowing), the two-leaf stage (around 25 days after sowing), the tillering stage (around 60 days after sowing), the panicle formation stage (around 90 days after sowing), the flag leaf stage (around 100 days after sowing) and the heading stage (around 120 days after sowing). In the present invention, the rice leaf sample may be collected at any time from the emergence stage, when leaves can first be collected, to the heading stage. The collection time is preferably from the two-leaf stage to the heading stage, more preferably from the two-leaf stage to the flag leaf stage, still more preferably from the two-leaf stage to the panicle formation stage, and still more preferably from the two-leaf stage to the tillering stage. The margin of days before and after each of the above growth stages is preferably within 5 days.
Alternatively, rice leaves may be collected 5 days or more after sowing, preferably 16 days or more, more preferably 25 days or more, still more preferably 40 days or more, and preferably before 90 days after sowing, more preferably before 70 days, still more preferably before 60 days. The collection time may also be 5 to 90 days after sowing, preferably 16 to 70 days, more preferably 25 to 60 days, and still more preferably 40 to 60 days.
Here, the number of days after sowing refers to the case where seedlings are transplanted and grown outdoors. In direct-sown cultivation in an indoor greenhouse, the number of days after sowing for each growth stage differs from the above, but a person skilled in the art can determine the number of days for each growth stage taking the indoor direct-sowing conditions into account. For example, the tillering stage is around 60 days after sowing for outdoor transplant cultivation but around 30 days after sowing for direct-sown cultivation in an indoor greenhouse. In direct-sown cultivation in an indoor greenhouse, leaf samples may be collected at the time corresponding to the collection time described above.
The site from which the leaf sample is collected is not particularly limited; for example, about 1 to 5 leaves, preferably about 2 to 6 leaves, more preferably about 4 to 7 leaves, may be cut from the base of the plant.
In the present invention, yield means the grain mass of rice per individual or the number of grains per individual. The mass is not particularly limited, but dry mass is preferable. Of these, the dry grain mass of rice per individual is particularly preferable. The dry grain mass in the present invention means the mass measured after drying at 90°C for 72 hours, in a state where the moisture content of the grain has decreased to 7% or less. The dry grain mass is preferably measured with a balance having a calibration function, such as an electronic balance.
In the present invention, the analysis data of the components include data analyzed and measured by instrumental analysis means such as high performance liquid chromatography (HPLC), gas chromatography (GC), ion chromatography, mass spectrometry (MS), near-infrared spectroscopy (NIR), Fourier transform infrared spectroscopy (FT-IR), nuclear magnetic resonance (NMR), Fourier transform nuclear magnetic resonance (FT-NMR), inductively coupled plasma mass spectrometry (ICP-MS), and LC/MS, which combines liquid chromatography with mass spectrometry. Mass spectrometric data are preferable, and mass spectrometric data obtained by LC/MS are more preferable.
The mass spectrometric data include precise mass ("m/z value"), ion intensity, retention time and the like; precise mass information is preferable.
To apply the leaf sample to the above instrumental analysis means, it is pretreated as appropriate for the analysis means. Usually, the collected leaves are wrapped in aluminum foil and immediately frozen in liquid nitrogen to stop metabolic reactions, freeze-dried, and then subjected to extraction.
Extraction is performed by pulverizing the freeze-dried leaf sample with a bead crusher or the like, adding an extraction solvent, and stirring. Extraction solvents that can be used include methanol, ethanol, butanol, acetonitrile, chloroform, ethyl acetate, hexane, acetone, isopropanol, water and mixtures thereof. When LC/MS is used as the analysis means, an 80 v/v% aqueous methanol solution containing an internal standard is suitably used.
In the present invention, the components in the leaves to be analyzed include rice metabolites separated and detected by LC/MS. Preferred are components whose precise mass (m/z) given by mass spectrometry is 101 to 1215. More preferred are the 1,324 components listed in Tables 1a to 1i below, which are defined by the precise mass (m/z value) given by mass spectrometry. When, in the course of separation and detection by LC/MS, a metabolite gives rise to partial degradation products and to molecular ion peaks with different adducts (M+H, M+Na, etc.), each detected partial degradation product was treated as a component separate from the original metabolite.
[Tables 1a to 1i: images JPOXMLDOC01-appb-T000010 to JPOXMLDOC01-appb-T000018, not reproduced here.]
The 1,324 components were selected and extracted from the metabolites of rice. The selection method is described in detail in the Examples; in outline: 1) rice was grown by direct sowing in an indoor greenhouse under varied fertilization conditions; 2) four to seven leaves were collected from each plant about one month after sowing to obtain leaf samples; 3) components were extracted with 80 v/v% methanol; 4) LC/MS analysis was performed to obtain molecular ion information (precise mass, m/z) and structural information derived from fragments; and 5) component-derived peaks were extracted, followed by alignment processing that aligns each peak across samples, removal of isotope peaks, correction of peak intensity between samples, and removal of noise, giving the analysis data of the 1,324 components. The method of correcting peak intensity between samples is not particularly limited; one example is correction by the pooled QC method. In the pooled QC method, a sample called pooled QC is prepared by mixing a fixed amount from all samples in the same batch, and the pooled QC is analyzed at a fixed frequency (about once every 5 to 9 analyses) between samples. From these results, an estimate of "what each peak intensity would have been if the QC sample had been analyzed at the time each sample was analyzed" is calculated, and each sample is corrected with that value, thereby correcting for sensitivity differences between samples. The data correction method does not significantly affect the correlation with yield or the performance of the prediction model.
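The pooled QC correction described above can be illustrated with a minimal sketch. It is not the RIKEN tool and does not use LOWESS smoothing: as a simplifying assumption, the QC intensity at each sample's injection position is estimated by linear interpolation between the neighbouring QC injections. The function names and the data are hypothetical.

```python
from bisect import bisect_left

def interpolate_qc(order, qc_orders, qc_values):
    """Estimate the QC intensity at injection position `order` by linear
    interpolation between the neighbouring pooled QC injections."""
    if order <= qc_orders[0]:
        return qc_values[0]
    if order >= qc_orders[-1]:
        return qc_values[-1]
    i = bisect_left(qc_orders, order)
    x0, x1 = qc_orders[i - 1], qc_orders[i]
    y0, y1 = qc_values[i - 1], qc_values[i]
    return y0 + (y1 - y0) * (order - x0) / (x1 - x0)

def correct_peak(sample_orders, sample_values, qc_orders, qc_values):
    """Divide each sample intensity by the estimated QC intensity at the same
    injection position, then rescale by the mean QC intensity."""
    qc_mean = sum(qc_values) / len(qc_values)
    return [
        value / interpolate_qc(order, qc_orders, qc_values) * qc_mean
        for order, value in zip(sample_orders, sample_values)
    ]

# Made-up data for one peak: sensitivity drifts downward across the batch.
qc_orders = [0, 6, 12]            # injection positions of the pooled QC
qc_values = [100.0, 90.0, 80.0]   # QC peak intensity at those positions
sample_orders = [3, 9]
sample_values = [95.0, 85.0]      # same true abundance, measured under drift
corrected = correct_peak(sample_orders, sample_values, qc_orders, qc_values)
```

In this made-up case both samples are corrected to the same intensity, which is the intended effect of removing within-batch drift. A smoother such as LOWESS would replace `interpolate_qc` when QC measurements are noisy.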
In addition, correlation analysis was performed between the acquired analysis data of the 1,324 components and the corresponding yield data (dry grain mass): the simple correlation coefficient r between the peak area of each component's analysis data and the yield was calculated, and the p-value was obtained by a test of no correlation. The results showed that certain components correlate significantly with yield (see Tables 4a to 4q below).
Therefore, among the 1,324 components, the components to be analyzed in the present invention preferably include one or more selected from the components whose correlation with yield is significant (p < 0.05) and whose correlation coefficient has an absolute value |r| > 0.51, namely component Nos. 10, 177, 178, 245, 254, 272, 294, 337, 366, 435, 462, 529, 539, 708, 729, 832, 842, 869, 901, 912, 1050, 1060, 1173 and 1306. All of these components had VIP values, described later, of 1.08 or more.
Further, among the 1,324 components, the components to be analyzed in the present invention preferably include one or more selected from the components whose correlation with yield is significant (p < 0.05) and whose correlation coefficient has an absolute value |r| > 0.66, namely component Nos. 10, 178 and 1173. All of these components had VIP values, described later, of 2.17 or more.
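The correlation screening described above can be sketched as follows. The text does not say which software computed r and p; this sketch assumes the usual test of no correlation based on Student's t with n - 2 degrees of freedom, and the function names are hypothetical. The constant 2.064 is the two-sided 5% critical value of t for 24 degrees of freedom (n = 26 individuals).

```python
import math

def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic of the test of no correlation (n - 2 degrees of freedom).
    Not defined for |r| = 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def critical_r(t_crit, n):
    """Smallest |r| that is significant for a given critical t value."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

T_CRIT_DF24 = 2.064  # two-sided 5% point of Student's t, 24 degrees of freedom
r_min = critical_r(T_CRIT_DF24, 26)  # roughly 0.39
```

Under these assumptions any component with |r| above roughly 0.39 is significant at p < 0.05 for 26 individuals, so the |r| > 0.51 and |r| > 0.66 thresholds in the text are stricter than significance alone.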
In Tables 1a to 1i, the 1,324 components are defined by the precise mass obtained by mass spectrometry, and the composition formula of each compound can be estimated from these precise mass data. Partial structure information on the compound is also obtained from the MS/MS data acquired at the same time as the analysis. The target component can therefore be inferred from the composition formula and the partial structure information, and components that can be compared with a reference reagent can be identified.
For example, the analysis gave the following estimated composition formulas: No. 10, C6H9N3; No. 178, C13H18O3; No. 245, C13H20O4; No. 272, C13H28O4; No. 347, C18H26O2; No. 416 and No. 417, C18H28O3; No. 539, C15H22O8; No. 729, C19H30O7; No. 1050, C24H28O11; and No. 1182, C34H40O10.
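The link between a composition formula and a precise mass can be checked with a short calculation. The sketch below assumes the ion is the protonated molecule [M+H]+ (the text does not state the adduct for each component) and uses standard monoisotopic masses; the function names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the elements that occur in the formulas above.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646  # mass of a proton (hydrogen atom minus one electron)

def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula such as 'C6H9N3'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def mz_protonated(formula):
    """m/z of the singly charged [M+H]+ ion (assumed adduct)."""
    return monoisotopic_mass(formula) + PROTON

mz_no10 = mz_protonated("C6H9N3")  # composition estimated for component No. 10
```

Under the [M+H]+ assumption, `mz_no10` rounds to 124.0869, which agrees with the m/z 124.0869 quoted as an example in the text. That the quoted peak is component No. 10 is an inference from this agreement, not something the text states.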
One means of predicting rice yield is to measure, in the rice leaf sample whose yield is to be predicted, the abundance of one of the above 1,324 components, preferably a component whose correlation with yield is significant (p < 0.05) with |r| > 0.51, more preferably a component that is significant (p < 0.05) with |r| > 0.66 (for example, the peak area at precise mass m/z 124.0869, whose correlation coefficient is -0.825), and to predict the yield from the correlation between known yields and measured peak areas.
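A single-component prediction of this kind can be sketched as an ordinary least-squares line fitted to known pairs of peak area and yield. The text does not specify the form of the relationship, so the straight line is an assumption, and the function names and data are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_yield(peak_area, slope, intercept):
    """Predict yield for a new leaf sample from its measured peak area."""
    return slope * peak_area + intercept

# Made-up calibration data: a negatively correlated component.
areas = [1.0, 2.0, 3.0, 4.0]    # peak areas (arbitrary units)
yields = [9.0, 7.0, 5.0, 3.0]   # known dry grain mass (arbitrary units)
slope, intercept = fit_line(areas, yields)
predicted = predict_yield(2.5, slope, intercept)
```

With real data the points scatter around the line, and the size of |r| indicates how much of the yield variation a single component can account for.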
Yield can also be predicted by using the analysis data of a plurality of the above 1,324 components and applying them to a yield prediction model constructed by a multivariate analysis method.
That is, a leaf sample is collected from rice a predetermined period after sowing, an analysis sample is prepared from it, the analysis sample is subjected to instrumental analysis to obtain instrumental analysis data, and the instrumental analysis data are applied to the yield prediction model, whereby the yield of that rice can be predicted.
The yield prediction model can be constructed by regression analysis using the peak area values of the corrected component analysis data at each precise mass as the explanatory variables and the yield value as the objective variable. Regression methods include, for example, principal component regression, PLS (partial least squares projection to latent structures) regression, OPLS (orthogonal projections to latent structures) regression and generalized linear regression, as well as multivariate regression methods based on machine learning such as bagging, support vector machines, random forests and neural network regression. Of these, the PLS method, the OPLS method (an improved version of the PLS method), or a machine learning regression method is preferable. The OPLS method has the same predictability as the PLS method but makes visualization for interpretation easier, which is an advantage for the present purpose. Both PLS and OPLS condense the information in high-dimensional data into a small number of latent variables and express the objective variable using those latent variables.
It is important to choose the number of latent variables appropriately, and cross-validation is often used for this. The model construction data are divided into several groups; one group is used for model validation and the others for model construction, and the prediction error is estimated. This is repeated while rotating the groups, and the number of latent variables that minimizes the total prediction error is chosen.
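The cross-validation procedure for choosing the number of latent variables can be sketched in plain Python. This is an illustration under stated assumptions, not the SIMCA procedure: it uses ordinary PLS with a single response (PLS1, NIPALS with deflation) rather than OPLS, assigns samples to groups by index, and centres the data using the training groups only. All function names and the data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """PLS1 by NIPALS with deflation; X and y are assumed to be centred."""
    X = [row[:] for row in X]
    y = y[:]
    n_vars = len(X[0])
    model = []
    for _ in range(n_components):
        w = [dot([row[j] for row in X], y) for j in range(n_vars)]   # X'y
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]                               # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in X], t) / tt for j in range(n_vars)]
        q = dot(y, t) / tt
        X = [[row[j] - ti * p[j] for j in range(n_vars)] for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        model.append((w, p, q))
    return model

def pls1_predict(model, x):
    """Predict the centred response for one centred sample."""
    x = x[:]
    y_hat = 0.0
    for w, p, q in model:
        t = dot(x, w)
        y_hat += q * t
        x = [xj - t * pj for xj, pj in zip(x, p)]
    return y_hat

def cv_press(X, y, n_components, n_folds):
    """Total squared prediction error over the cross-validation groups."""
    n, n_vars = len(X), len(X[0])
    press = 0.0
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        x_mean = [sum(X[i][j] for i in train) / len(train) for j in range(n_vars)]
        y_mean = sum(y[i] for i in train) / len(train)
        Xc = [[X[i][j] - x_mean[j] for j in range(n_vars)] for i in train]
        yc = [y[i] - y_mean for i in train]
        model = pls1_fit(Xc, yc, n_components)
        for i in test:
            xi = [X[i][j] - x_mean[j] for j in range(n_vars)]
            press += (y[i] - (y_mean + pls1_predict(model, xi))) ** 2
    return press

def choose_n_components(X, y, max_components, n_folds):
    """Return the number of latent variables with the smallest total error."""
    scores = {a: cv_press(X, y, a, n_folds) for a in range(1, max_components + 1)}
    return min(scores, key=scores.get), scores

# Made-up data: y depends exactly on both variables (y = 2*x1 - x2).
X = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 3.0], [3.0, 2.0], [2.0, 4.0]]
y = [2 * a - b for a, b in X]
best, scores = choose_n_components(X, y, max_components=2, n_folds=3)
```

In this noise-free example two latent variables reproduce the response exactly, so the two-component model has the smaller total error. With real, noisy data the error typically falls and then rises again as components are added, and the minimum marks the point beyond which the model starts to overfit.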
The prediction model is evaluated mainly by two indices: R², which represents prediction accuracy, and Q², which represents predictability. R² is the square of the correlation coefficient between the measured values of the data used to construct the prediction model and the predicted values calculated by the model; the closer it is to 1, the higher the prediction accuracy. Q², on the other hand, is the result of the cross-validation described above and represents the square of the correlation coefficient between the measured values and the predicted values obtained from the repeated model validation. In the rice yield prediction model of the present invention, Q² > 0.50 is preferably used as the criterion for model evaluation. Since R² > Q² always holds, Q² > 0.50 implies R² > 0.50.
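The two indices can be written out directly from the definitions given above. In the sketch below, `squared_correlation` follows the wording of the text (squared correlation coefficient between measured and predicted values). `one_minus_press_over_ss` is a different formulation often used in chemometrics for Q² and is included only for comparison; which of the two the software actually reports is not stated in the text. Function names and data are hypothetical.

```python
def squared_correlation(measured, predicted):
    """Square of the correlation coefficient between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    sxy = sum((a - mean_m) * (b - mean_p) for a, b in zip(measured, predicted))
    sxx = sum((a - mean_m) ** 2 for a in measured)
    syy = sum((b - mean_p) ** 2 for b in predicted)
    return sxy * sxy / (sxx * syy)

def one_minus_press_over_ss(measured, predicted):
    """Alternative index: 1 - (sum of squared errors) / (total sum of squares)."""
    mean_m = sum(measured) / len(measured)
    press = sum((a - b) ** 2 for a, b in zip(measured, predicted))
    ss = sum((a - mean_m) ** 2 for a in measured)
    return 1.0 - press / ss

# Made-up values for four samples.
measured = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]         # predictions for the training data
cv_predicted = [2.0, 1.0, 4.0, 3.0]   # predictions from cross-validation

r2 = squared_correlation(measured, fitted)         # 0.8
q2 = squared_correlation(measured, cv_predicted)   # 0.36
q2_alt = one_minus_press_over_ss(measured, cv_predicted)
meets_criterion = q2 > 0.50
```

In this made-up case the model fits its own training data well (R² = 0.8) but fails the Q² > 0.50 criterion, which is the pattern the text associates with overfitting.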
The following shows the results of creating various rice yield prediction models using the peak area values of the analysis data of the above 1,324 components and the grain yield and verifying their accuracy.
(1) Construction of a yield prediction model using all component information
An OPLS model was constructed from a data matrix of all 26 data records, each having the peak area values of the analysis data of the 1,324 components and the yield value. For model construction, the peak area values of each component and the yield data were converted to mean 0 and variance 1 by autoscaling.
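Autoscaling to mean 0 and variance 1 can be sketched as below. The sketch assumes the sample variance (divisor n - 1); the text does not say which divisor was used. The function names are hypothetical.

```python
def autoscale(values):
    """Centre a list of values to mean 0 and scale it to unit sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = variance ** 0.5
    return [(v - mean) / sd for v in values]

def autoscale_columns(matrix):
    """Autoscale each column (component) of a samples-by-components matrix."""
    columns = [autoscale(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]

scaled = autoscale([2.0, 4.0, 6.0])
scaled_matrix = autoscale_columns([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]])
```

Scaling each component this way puts peaks with very different absolute areas on the same footing before the regression model is built.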
In the above model, the VIP (Variable Importance in the Projection) value, which is the contribution to model performance assigned to each component, is calculated.
The VIP value is obtained by the following Formula 1.
[Formula 1: image JPOXMLDOC01-appb-M000019, not reproduced here.]
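Since only an image placeholder for Formula 1 survives in this text, a commonly used definition of the VIP value for a PLS model is sketched here for orientation. Whether Formula 1 in the original has exactly this form, and how the OPLS implementation treats predictive and orthogonal components, is not confirmed by the text. The symbols are notation introduced here: p is the number of components (variables), A the number of latent variables, w_{ja} the weight of variable j in latent variable a, and SS_a the sum of squares of the yield explained by latent variable a.

```latex
\mathrm{VIP}_j = \sqrt{\frac{p \sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^2}{\sum_{a=1}^{A} SS_a}}
```

With this definition the mean of the squared VIP values over all p variables is 1, which is why values above 1 are often read as above-average importance.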
The larger the VIP value, the greater the contribution to the model; the VIP value also correlates with the absolute value of the correlation coefficient. The components ranked in the top 800 by VIP value are listed in Tables 5a to 5j below.
(2) Model construction using the VIP value as an index
(2-1) Model using the analysis data of the components with the top 800 VIP values
All components ranked in the top 800 by VIP value were selected, and an OPLS model (Fig. 2) was constructed from a data matrix of 26 samples in total, each having the peak area values of the analysis data for these 800 components and a yield value. At the time of construction, the peak area values of the analysis data of each component and the yield data were converted to a mean of 0 and a variance of 1 by autoscaling. The result was R² = 0.78 and Q² = 0.51, indicating a model with high predictability.
(2-2) Models using the analysis data of the lower-ranked components among the top 800 VIP values
Models (Fig. 3) were constructed by the OPLS method using the analysis data of all components ranked 11th to 800th by VIP value, all components ranked 21st to 800th, all components ranked 31st to 800th, and so on, up to all components ranked 111th to 800th.
Q² > 0.50 is satisfied by the models using all components ranked 11th to 800th and all components ranked 21st to 800th. The model using all components ranked 31st to 800th does not reach Q² > 0.50.
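The series of rank windows described above can be enumerated as in the following sketch. The function names and the placeholder component list are hypothetical; only the window boundaries (11th to 800th, 21st to 800th, up to 111th to 800th) come from the text.

```python
def rank_windows(first_starts=range(11, 112, 10), last_rank=800):
    """Return (start, end) VIP-rank windows: 11-800, 21-800, ..., 111-800."""
    return [(start, last_rank) for start in first_starts]


def select_by_rank(components_by_vip, start, end):
    """Pick components ranked start..end (1-based, inclusive) from a VIP-sorted list."""
    return components_by_vip[start - 1:end]


windows = rank_windows()
print(windows[0], windows[-1], len(windows))  # (11, 800) (111, 800) 11

# Placeholder list standing in for component numbers sorted by descending VIP value.
ranked = list(range(1, 801))
subset = select_by_rank(ranked, *windows[1])
print(len(subset))  # 780
```

Each window drops ten more of the highest-ranked components than the previous one, so comparing Q² across windows shows how much the model depends on the top-ranked components.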
(2-3) Models using nine of the components with the top 10 VIP values
Models (Fig. 4) were constructed by the OPLS method for every combination of nine components chosen from the components ranked 1st to 10th by VIP value (10 combinations).
All of these models satisfy Q² > 0.50.
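The count of 10 combinations follows from choosing nine components out of ten, which is the same as leaving exactly one out. A short Python sketch with placeholder labels (the real component numbers are not used here):

```python
from itertools import combinations

# Placeholder labels for the ten components with the highest VIP values.
top10 = [f"rank_{i}" for i in range(1, 11)]

# Every way of choosing nine of the ten, i.e. leaving exactly one out.
nine_subsets = list(combinations(top10, 9))
print(len(nine_subsets))  # 10

for subset in nine_subsets:
    left_out = (set(top10) - set(subset)).pop()
    print("model built without", left_out)
```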
When a simple prediction is desired, a smaller number of components is preferable for prediction: for example, 10 or fewer, preferably 5 or fewer, more preferably 3 or fewer, and most preferably 1. When higher accuracy is desired, a larger number of components is preferable: for example, 11 or more, preferably 20 or more, more preferably 50 or more, still more preferably 90 or more, and most preferably 150 or more. When predicting with a small number of components, it is preferable to use components with higher VIP values or higher correlation coefficients.
The components with higher VIP values are, for example, at least 1 component selected from the top 800 by VIP value, preferably at least 5 components selected from the top 800, more preferably at least 10 components selected from the top 800, still more preferably at least 1 component selected from the top 10, still more preferably at least 5 components selected from the top 10, still more preferably at least 9 components selected from the top 10, and still more preferably all of the top 800 components.
Aspects and preferred embodiments of the present invention are shown below.
<1> A rice yield prediction method comprising acquiring, from a leaf sample collected from rice, analysis data of one or more components selected from components having a precise mass (m/z) of 101 to 1215 as provided by mass spectrometry, and predicting the yield of the rice using the correlation between the data and rice yield.
<2> The method according to <1>, wherein the analysis data of the one or more components are corrected by the pooled QC method.
<3> The method according to <1> or <2>, wherein the component is one or more selected from the components shown in Tables 1a to 1i, which are defined by the precise mass (m/z) provided by mass spectrometry.
<4> The method according to <3>, wherein the component is one or more selected from component Nos. 1, 4, 6, 9, 10, 11, 12, 19, 20, 21, 23, 26, 27, 29, 30, 33, 34, 35, 38, 39, 45, 46, 47, 48, 49, 50, 51, 52, 54, 55, 56, 61, 62, 63, 64, 65, 66, 67, 69, 71, 75, 76, 77, 78, 81, 83, 84, 85, 88, 89, 90, 91, 92, 96, 100, 102, 105, 106, 107, 108, 109, 113, 116, 118, 119, 120, 121, 123, 124, 126, 127, 129, 130, 131, 133, 134, 137, 139, 142, 145, 147, 148, 149, 151, 152, 153, 154, 155, 156, 159, 162, 163, 164, 165, 166, 167, 168, 169, 170, 172, 174, 175, 177, 178, 181, 182, 183, 184, 186, 187, 188, 191, 193, 194, 196, 198, 200, 202, 203, 206, 208, 209, 210, 212, 213, 214, 215, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 235, 237, 238, 239, 240, 243, 245, 246, 247, 248, 249, 250, 251, 252, 254, 255, 258, 259, 260, 261, 262, 263, 264, 265, 266, 268, 270, 271, 272, 273, 274, 275, 277, 278, 280, 281, 283, 284, 285, 286, 288, 289, 290, 291, 293, 294, 298, 300, 302, 303, 305, 310, 312, 314, 316, 317, 318, 320, 321, 323, 325, 327, 329, 331, 332, 333, 334, 335, 337, 338, 339, 342, 343, 344, 345, 346, 347, 348, 350, 351, 352, 355, 358, 359, 360, 361, 362, 363, 365, 366, 368, 369, 370, 371, 373, 374, 375, 378, 379, 381, 382, 389, 390, 391, 392, 395, 397, 398, 399, 401, 404, 407, 408, 409, 410, 411, 413, 414, 415, 416, 417, 418, 419, 423, 424, 425, 428, 431, 433, 434, 435, 436, 437, 438, 439, 441, 444, 445, 446, 447, 449, 450, 451, 454, 455, 457, 458, 459, 460, 461, 462, 464, 465, 469, 471, 472, 473, 474, 475, 478, 480, 481, 482, 483, 487, 489, 490, 491, 492, 494, 502, 503, 504, 507, 509, 510, 511, 512, 513, 514, 516, 517, 522, 523, 525, 526, 529, 532, 534, 539, 540, 542, 543, 547, 548, 549, 551, 552, 554, 555, 557, 561, 565, 566, 567, 573, 582, 583, 585, 586, 588, 589, 590, 591, 593, 594, 595, 596, 597, 599, 600, 602, 603, 604, 606, 609, 611, 612, 613, 615, 616, 617, 619, 620, 621, 624, 628, 630, 631, 632, 633, 635, 639, 643, 644, 647, 649, 650, 651, 653, 654, 655, 656, 658, 660, 661, 662, 665, 666, 671, 672, 673, 674, 675, 681, 682, 683, 684, 685, 688, 689, 691, 692, 693, 694, 695, 696, 699, 700, 701, 702, 703, 704, 706, 707, 708, 713, 714, 715, 717, 719, 721, 722, 723, 724, 725, 726, 727, 728, 729, 731, 732, 734, 735, 737, 738, 740, 745, 746, 748, 749, 750, 754, 756, 757, 762, 765, 766, 767, 768, 770, 774, 776, 777, 780, 781, 782, 785, 787, 789, 792, 793, 794, 795, 796, 797, 798, 799, 801, 802, 803, 804, 810, 811, 813, 815, 816, 817, 818, 820, 822, 823, 824, 827, 828, 829, 830, 832, 834, 841, 842, 843, 844, 845, 846, 848, 849, 850, 852, 854, 858, 863, 864, 867, 868, 869, 870, 871, 872, 874, 877, 878, 879, 882, 883, 884, 885, 886, 888, 889, 893, 894, 895, 896, 898, 899, 900, 901, 902, 903, 910, 911, 912, 914, 917, 919, 922, 923, 924, 925, 926, 928, 930, 932, 938, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 952, 953, 955, 956, 958, 959, 960, 962, 965, 966, 968, 969, 973, 976, 979, 980, 981, 983, 985, 986, 989, 992, 993, 994, 995, 996, 997, 999, 1001, 1002, 1003, 1005, 1006, 1007, 1009, 1012, 1013, 1015, 1017, 1019, 1020, 1021, 1022, 1024, 1025, 1026, 1027, 1031, 1032, 1034, 1036, 1039, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1053, 1054, 1057, 1058, 1059, 1060, 1062, 1066, 1067, 1068, 1069, 1070, 1072, 1074, 1075, 1077, 1078, 1079, 1081, 1082, 1087, 1088, 1089, 1092, 1094, 1098, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1108, 1110, 1112, 1113, 1114, 1117, 1118, 1119, 1120, 1121, 1123, 1126, 1127, 1128, 1129, 1133, 1134, 1135, 1139, 1140, 1141, 1142, 1143, 1144, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1158, 1160, 1163, 1166, 1167, 1168, 1170, 1171, 1172, 1173, 1174, 1177, 1178, 1179, 1180, 1181, 1182, 1184, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, 1202, 1204, 1208, 1211, 1212, 1214, 1217, 1218, 1221, 1222, 1224, 1225, 1226, 1229, 1231, 1233, 1234, 1235, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1246, 1247, 1248, 1249, 1250, 1252, 1254, 1255, 1256, 1257, 1258, 1261, 1263, 1265, 1267, 1268, 1269, 1271, 1272, 1276, 1277, 1278, 1280, 1283, 1291, 1292, 1295, 1296, 1297, 1299, 1300, 1301, 1304, 1305, 1306, 1309, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1321 and 1322 shown in Tables 1a to 1i.
<5> The method according to <3>, wherein the component is one or more selected from component Nos. 10, 177, 178, 245, 254, 272, 294, 337, 366, 435, 462, 529, 539, 708, 729, 832, 842, 869, 901, 912, 1050, 1060, 1173 and 1306 shown in Tables 1a to 1i.
<6> The method according to <3>, wherein the component is one or more selected from component Nos. 10, 178 and 1173 shown in Tables 1a to 1i.
<7> The method according to <3>, wherein the component is one or more selected from component Nos. 10, 178, 245, 272, 347, 416, 417, 539, 729, 1050 and 1182 shown in Tables 1a to 1i, and wherein component No. 10 is a component of composition formula C6H9N3, component No. 178 is a component of composition formula C13H18O3, component No. 245 is a component of composition formula C13H20O4, component No. 272 is a component of composition formula C13H28O, component No. 347 is a component of composition formula C18H26O2, component No. 416 is a component of composition formula C18H28O3, component No. 417 is a component of composition formula C18H28O3, component No. 539 is a component of composition formula C15H22O8, component No. 729 is a component of composition formula C19H30O7, component No. 1050 is a component of composition formula C24H28O11, and component No. 1182 is a component of composition formula C34H40O10.
<8> The method according to any one of <1> to <7>, wherein the leaf sample is collected from rice between the emergence stage and the heading stage.
<9> The method according to any one of <1> to <7>, wherein the leaf sample is collected from rice between the two-leaf stage and the panicle formation stage.
<10> The method according to any one of <1> to <9>, wherein the analysis data are mass spectrometry data.
<11> The method according to any one of <3> to <10>, comprising a step of collating the analysis data of the components obtained from the leaf sample with a yield prediction model constructed using the analysis data of at least 1 component from among the top 800 VIP values calculated from a yield prediction model constructed using the component information shown in Tables 1a to 1i.
<12> The method according to <11>, wherein the yield prediction model uses at least 5 of the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
<13> The method according to <11>, wherein the yield prediction model uses at least 10 of the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
<14> The method according to <11>, wherein the yield prediction model uses at least 1 of the top 10 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
<15> The method according to <11>, wherein the yield prediction model uses at least 5 of the top 10 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
<16> The method according to <11>, wherein the yield prediction model uses at least 9 of the top 10 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
<17> The method according to <11>, wherein the yield prediction model uses the top 800 VIP values calculated from the yield prediction model constructed using the component information shown in Tables 1a to 1i.
<18> The method according to any one of <11> to <17>, wherein the yield prediction model is a model constructed using the OPLS method.
<19> The method according to any one of <1> to <18>, wherein the precise mass is measured with an accuracy of four or more decimal places.
Comparative Example
1. Acquisition of data for analysis
The data published together with the paper by Dan et al. cited as Non-Patent Document 2 (https://www.nature.com/articles/srep21732#Sec8) were obtained. The dry grain mass per individual was used as the yield data. For the leaf extract data, all of the published data were used in the analysis.
2. Model construction and evaluation
Multivariate analysis was used to construct yield prediction models from the analysis data of two or more components, with SIMCA ver. 14 (Umetrics) as the analysis tool. For the prediction model, regression analysis was performed with the peak area values of the corrected analysis data of the components, each identified by its precise mass, as the explanatory variables and the yield value as the objective variable. The regression analysis was performed by the OPLS method, an improved version of the PLS method.
The prediction model is evaluated mainly by two indicators: R², which represents prediction accuracy, and Q², which represents predictability. R² is the square of the correlation coefficient between the measured values of the data used to construct the prediction model and the values predicted by the model; the closer it is to 1, the higher the prediction accuracy. Q² is the result of the cross-validation described above and represents the square of the correlation coefficient between the measured values and the predicted values obtained from repeated model validation. As in the Example, Q² > 0.50 was used as the criterion for model evaluation. Since R² > Q² always holds, a model with Q² > 0.50 also satisfies R² > 0.50.
3. Construction and evaluation of a model using all data
An OPLS model for predicting yield was constructed from a data matrix of 295 samples in total, each having the peak area values of the analysis data for 525 components and a yield value. At the time of construction, the peak area values of the analysis data of each component and the yield data were converted to a mean of 0 and a variance of 1 by autoscaling. The resulting model had R² = 0.07 for prediction accuracy and Q² = 0.008 for predictability, and so did not meet the criterion of Q² > 0.50. The results are shown in Fig. 1. The yield prediction model based on the data of Non-Patent Document 2 was therefore found to have very low prediction accuracy.
Example
1. Cultivation test
The data from a greenhouse pot cultivation test conducted in 2019 are described in detail below.
Pot cultivation was carried out in a greenhouse in Hiratsuka City, Kanagawa Prefecture. Field soil from the Tochigi Plant of Kao Corporation was used. The basic fertilizer application amount was set at 0.8 g per pot, and a compound fertilizer containing nitrogen, phosphorus and potassium as fertilizer components (trade name "Hyakukatsu Ichiki", Kanryo Chemical Co., Ltd.) was mixed into 4 L of soil. Conditions of 1/4, 1/2, 2 and 4 times the basic application amount were also set, so that cultivation was carried out under a total of five fertilizer conditions. 1/5000 a Wagner pots were used; each was filled with about 4 L of the soil, and 30 pots were prepared. On July 29, 2019, seeds were sown at two spots in each pot, three seeds per spot (six seeds per pot). The plants derived from three seeds were treated together as one individual. The cultivar was the japonica variety "Nipponbare". When two true leaves had unfolded, the plants were thinned to one hill per pot. From August 22 to October 28, the plants were grown under flooded conditions; from sowing until the start of flooding, they were watered once a week, enough to moisten the soil. No watering was done after October 28. Sampling was done on August 28. After sampling, 1 g per pot of a compound fertilizer containing 14% each of nitrogen, phosphoric acid and potassium (trade name "Chemical Fertilizer No. 14", Sun and Hope Co., Ltd.) was applied as topdressing. Harvesting was carried out on November 14 (80 days after sowing). Because four individuals were lost, a total of 26 individuals were used for yield prediction. The temperature in the greenhouse was adjusted as needed by opening and closing the doors according to the air temperature.
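The five fertilizer levels and the number of individuals available for modelling follow directly from the figures above. A small arithmetic sketch in Python (variable names are illustrative only):

```python
from fractions import Fraction

BASE_RATE_G_PER_POT = Fraction(8, 10)  # basic application amount, 0.8 g per pot
MULTIPLIERS = [Fraction(1, 4), Fraction(1, 2), Fraction(1), Fraction(2), Fraction(4)]

# Fertilizer applied per pot under each of the five conditions, in grams.
rates = [float(BASE_RATE_G_PER_POT * m) for m in MULTIPLIERS]
print(rates)  # [0.2, 0.4, 0.8, 1.6, 3.2]

pots_prepared = 30
individuals_lost = 4
individuals_used = pots_prepared - individuals_lost
print(individuals_used)  # 26
```

The 16-fold spread between the lowest and highest level is what produces the range of yields on which the prediction models are trained.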
2. Leaf sampling
Leaves were sampled during the daytime (roughly 13:00 to 15:00) on the day 30 days after sowing. The growth stage of the rice at this time differed slightly between individuals, but each individual generally had about 15 to 20 leaves, corresponding to the tillering stage. Leaves were sampled by cutting 4 to 7 leaves at the base of the plant, taking them evenly from the whole plant. The collected leaves were wrapped in aluminum foil and immediately frozen in liquid nitrogen to stop metabolic reactions. The frozen samples were brought back to the laboratory while kept frozen and were dried by freeze-drying. The dried samples were used in the extraction procedure described later.
3. Measurement of final grain yield
Harvesting was carried out on November 14, 80 days after sowing. All grains were collected from each individual after the cultivation test and dried for 3 days in a dryer set at 90 °C (forced-air constant-temperature oven DKN602, Yamato Scientific Co., Ltd.). As yield data, the dry grain mass (mg DW/individual) and the number of grains (grains/individual) were measured. The dry grain mass (mg DW/individual) was used for the simple correlation analysis between the analysis data of each component and yield, described later, and for the construction of the prediction models. As shown in Table 2, the dry grain mass ranged from a minimum of 5507.97 mg DW/individual to a maximum of 10507.17 mg DW/individual.
Figure JPOXMLDOC01-appb-T000020
4. Extraction of components from the collected leaves
The freeze-dried leaf samples were crushed by hand as finely as possible with a spatula. After crushing, 10 mg was weighed into a 2 mL tube (Safe-Lock tube, Eppendorf), one zirconia ball 5 mm in diameter was added to the tube, and the sample was pulverized in a bead mill (MM400, Retsch) at 25 Hz for 1 minute. The extraction solvent was an 80 v/v% aqueous methanol solution to which lidocaine (Wako Pure Chemical Industries, #120-02671) had been added as an internal standard at 500 ng/mL. 1 mL of the prepared extraction solvent was added to the tube after pulverization, and homogenization extraction was performed in the same bead mill at 20 Hz for 5 minutes. After extraction, the tube was centrifuged for about 30 seconds in a benchtop centrifuge of about 2,000 × g (Chibitan) and the extract was filtered through a 0.45 μm hydrophilic PTFE filter (DISMIC-13HP 0.45 μm syringe filter, ADVANTEC) to obtain the analysis sample.
5. Analysis of leaf samples by LC/MS
The leaf extract samples were analyzed by LC/MS using an Agilent HPLC system (Infinity 1260 series) as the front end and an AB SCIEX Q-TOFMS instrument (TripleTOF 4600) as the detector. For separation by HPLC, a core-shell column Capcell core C18 (2.1 mm I.D. × 100 mm, particle size 2.7 μm) manufactured by Shiseido Co., Ltd. and a guard column (2.1 mm I.D. × 5 mm, particle size 2.7 μm) were used, and the column temperature was set to 40 °C. The autosampler was kept at 5 °C during the analysis. The injection volume was 5 μL. The eluents were A: 0.1 v/v% formic acid in water and B: 0.1 v/v% formic acid in acetonitrile. The gradient elution conditions were as follows: 1 v/v% B (99 v/v% A) was held from 0 to 0.1 minutes, the proportion of eluent B was raised from 1 v/v% B to 99.5 v/v% B between 0.1 and 13 minutes, and 99.5 v/v% B was held from 13.01 to 16 minutes. The flow rate was 0.5 mL/min.
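The gradient program can be written as a piecewise function of time, as in the sketch below. The text states only the start and end compositions of the ramp, so a linear ramp is assumed here, and the function name is illustrative.

```python
def percent_b(t_min):
    """Eluent B fraction (v/v %) at time t_min for the gradient described above."""
    if not 0.0 <= t_min <= 16.0:
        raise ValueError("time outside the 0-16 min run")
    if t_min <= 0.1:
        return 1.0  # initial hold at 1 % B
    if t_min <= 13.0:
        # Assumed linear ramp from 1 % B at 0.1 min to 99.5 % B at 13 min.
        return 1.0 + (99.5 - 1.0) * (t_min - 0.1) / (13.0 - 0.1)
    return 99.5  # final hold until 16 min


for t in (0.0, 0.1, 6.55, 13.0, 16.0):
    print(f"{t:5.2f} min -> {percent_b(t):5.2f} % B")
```

At the midpoint of the ramp (6.55 min) the assumed composition is 50.25 v/v% B.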
The mass spectrometer was operated in positive ionization mode with ESI as the ionization method. In this analysis system, the eluting ions were scanned by TOFMS for 0.1 seconds, 10 ions of high intensity among them were selected, and each was subjected to MS/MS for 0.05 seconds. By repeating this cycle, molecular ion information (precise mass, m/z) from the TOFMS scan and structural information derived from the fragments generated in the MS/MS scans were acquired. The mass measurement range was set to m/z 100-1,250 for TOFMS and m/z 50-1,250 for MS/MS. The scan parameters were set to GS1 = 50, GS2 = 50, CUR = 25, TEM = 450, ISVF = 5500, DP = 80 and CE = 10 for the TOFMS scan, and to GS1 = 50, GS2 = 50, CUR = 25, TEM = 450, ISVF = 5500, DP = 80, CE = 30, CES = 15, IRD = 30 and IRW = 15 for the MS/MS scan.
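From the scan times above, the approximate duration of one acquisition cycle can be estimated as follows. This is a rough estimate that assumes all 10 MS/MS scans are triggered in every cycle and ignores any instrument overhead between scans; the symbols are introduced here for illustration.

```latex
T_{\text{cycle}} \approx t_{\text{TOF}} + n \, t_{\text{MS/MS}} = 0.1\ \text{s} + 10 \times 0.05\ \text{s} = 0.6\ \text{s}
```

On this estimate, each chromatographic peak is sampled by a TOFMS scan roughly every 0.6 seconds.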
6. Creation of the data matrix
Data processing was performed as follows. First, peaks were extracted using MarkerView™ Software (AB SCIEX). As the peak extraction conditions ("peak finding option"), peaks with retention times from 0.5 to 16 minutes were targeted; under "Enhance Peak Finding", Subtraction offset was set to 20 scans, Minimum spectral peak width to 5 ppm, Subtraction multi. Factor to 1.2, Minimum RT peak width to 10 scans, and Noise threshold to 5; and under "More", Assign charge state was checked. As a result, information on 31,649 peaks was obtained.
Next, alignment was performed to align the detected peaks across the analyzed samples. As the alignment conditions ("Alignment & Filtering"), Retention time tolerance was set to 0.20 minutes and Mass tolerance to 10.0 ppm under "Alignment". Under "Filtering", Intensity threshold was set to 10, Retention time filtering was checked, Remove peaks in was set to < 3 samples, and Maximum number of peaks was set to 50,000. Under "Internal standards", the retention time was corrected using the lidocaine peak.
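The effect of the two alignment tolerances can be illustrated with a simple matching rule. The Python below is a hypothetical sketch of what "within 0.20 minutes and within 10.0 ppm" means for a pair of peaks; it is not the algorithm used by MarkerView, and the peak values are invented.

```python
def same_feature(peak_a, peak_b, rt_tol_min=0.20, mass_tol_ppm=10.0):
    """True if two (m/z, retention time) peaks fall within both tolerances."""
    mz_a, rt_a = peak_a
    mz_b, rt_b = peak_b
    ppm_diff = abs(mz_a - mz_b) / mz_a * 1e6  # relative mass difference in ppm
    return abs(rt_a - rt_b) <= rt_tol_min and ppm_diff <= mass_tol_ppm


# Hypothetical peaks as (m/z, retention time in minutes).
print(same_feature((235.1805, 5.42), (235.1815, 5.50)))  # True: about 4.3 ppm, 0.08 min
print(same_feature((235.1805, 5.42), (235.1860, 5.50)))  # False: about 23 ppm
```

A ppm tolerance scales with m/z, so the allowed absolute mass difference is larger for heavier ions than for lighter ones.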
 Next, isotope peaks were removed. The software recognizes isotope peaks automatically at the time of peak extraction and labels them "isotopic" in the peak list, so the list was sorted by "isotopic" and the corresponding peaks were deleted. As a result, the number of peaks decreased to 25,895.
 Next, peak intensities were corrected between samples. In this analysis, in addition to the individual samples, a sample called pooled QC was prepared by mixing a fixed amount of every sample, and the pooled QC was analyzed once every six injections. From all of the QC results, an estimate was calculated for each peak of the intensity the QC sample would have shown had it been analyzed at the time each sample was analyzed, and each sample was corrected by that value, thereby correcting for sensitivity differences between samples within the same batch. This processing used free software provided by RIKEN (LOWESS-Normalization-Tool). Finally, the relative standard deviation (RSD) of the 11,408 peaks was calculated from the nine QC measurements, highly variable peaks with RSD > 30% were removed, and 1,324 peaks, that is, analytical data for 1,324 components, were finally obtained. The resulting analytical data are shown in Tables 3a to 3q and were used in the subsequent analyses.
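The final RSD filter can be sketched as follows. This is an illustration only: the peak names and intensities are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which form was used. The LOWESS-based correction itself is not reproduced here.

```python
import statistics


def relative_sd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def keep_stable_peaks(qc_intensities, max_rsd=30.0):
    """Keep peak IDs whose RSD across the pooled QC injections is at most max_rsd percent."""
    return [peak_id for peak_id, values in qc_intensities.items()
            if relative_sd_percent(values) <= max_rsd]


# Hypothetical intensities of two peaks across nine pooled QC injections
qc = {
    "peak_A": [100, 104, 98, 101, 97, 103, 99, 102, 96],  # stable
    "peak_B": [100, 40, 180, 90, 20, 160, 70, 150, 30],   # highly variable
}
print(keep_stable_peaks(qc))  # ['peak_A']
```

Because the pooled QC is the same material in every injection, any spread in its intensities reflects measurement variability rather than biology, which is why peaks with large QC RSD are discarded.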
Figure JPOXMLDOC01-appb-T000021
Figure JPOXMLDOC01-appb-T000022
Figure JPOXMLDOC01-appb-T000023
Figure JPOXMLDOC01-appb-T000024
Figure JPOXMLDOC01-appb-T000025
Figure JPOXMLDOC01-appb-T000026
Figure JPOXMLDOC01-appb-T000027
Figure JPOXMLDOC01-appb-T000028
Figure JPOXMLDOC01-appb-T000029
Figure JPOXMLDOC01-appb-T000030
Figure JPOXMLDOC01-appb-T000031
Figure JPOXMLDOC01-appb-T000032
Figure JPOXMLDOC01-appb-T000033
Figure JPOXMLDOC01-appb-T000034
Figure JPOXMLDOC01-appb-T000035
Figure JPOXMLDOC01-appb-T000036
Figure JPOXMLDOC01-appb-T000037
7. Correlation analysis
 Correlation analysis was performed using the analytical data for 1,324 leaf components from the 26 individuals and the corresponding yield data (dry grain mass), that is, a 26 × 1,324 data matrix. For each component, the simple correlation coefficient r between its analytical data and the yield data was calculated, together with the p-value from a test of no correlation. The results are shown in Tables 4a to 4q. The "Component No." in the tables is a number assigned for convenience, counting from the smallest mass when the 1,324 components are arranged in order of mass. The analysis results include retention time information in addition to mass information. However, JP 2016-57219 A shows that when accurate masses with four or more decimal places are used, mass spectrometry data can be compared and analyzed across multiple mass spectrometry samples regardless of retention time. The retention time information was therefore omitted and only the accurate mass is listed.
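As a minimal sketch of the per-component calculation, the simple (Pearson) correlation coefficient and the t statistic of the usual test of no correlation can be written as below. The text does not name the software used for this step, so the function names are hypothetical. The p-value would be read from a t distribution with n - 2 degrees of freedom; it is left out here because the Python standard library has no t distribution.

```python
import math


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of no correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# With n = 26 individuals, a larger |r| gives a larger t and hence a smaller p-value.
print(f"{t_statistic(0.51, 26):.2f}")  # 2.90
print(f"{t_statistic(0.66, 26):.2f}")  # 4.30
```

In practice `pearson_r` would be applied once per component, pairing that component's 26 peak areas with the 26 yield values.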
Figure JPOXMLDOC01-appb-T000038
Figure JPOXMLDOC01-appb-T000039
Figure JPOXMLDOC01-appb-T000040
Figure JPOXMLDOC01-appb-T000041
Figure JPOXMLDOC01-appb-T000042
Figure JPOXMLDOC01-appb-T000043
Figure JPOXMLDOC01-appb-T000044
Figure JPOXMLDOC01-appb-T000045
Figure JPOXMLDOC01-appb-T000046
Figure JPOXMLDOC01-appb-T000047
Figure JPOXMLDOC01-appb-T000048
Figure JPOXMLDOC01-appb-T000049
Figure JPOXMLDOC01-appb-T000050
Figure JPOXMLDOC01-appb-T000051
Figure JPOXMLDOC01-appb-T000052
Figure JPOXMLDOC01-appb-T000053
Figure JPOXMLDOC01-appb-T000054
 The correlation analysis showed that components whose correlation coefficients reached a certain magnitude were significantly correlated with yield. There were 24 components with an absolute correlation coefficient |r| > 0.51 and 3 components with |r| > 0.66.
8. Model construction and evaluation
 Multivariate analysis was used to construct yield prediction models from the analytical data of two or more components, with SIMCA ver. 14 (Umetrics) as the analysis tool. Each prediction model was built by regression analysis, using the corrected peak area values of the components, each identified by its accurate mass, as explanatory variables and the yield value as the objective variable. The regression was performed by the OPLS method, an improved version of the PLS method.
 Prediction models are evaluated mainly by two indices: R², which represents prediction accuracy, and Q², which represents predictability. R² is the square of the correlation coefficient between the measured values of the data used to build the model and the values predicted by the model; the closer it is to 1, the higher the prediction accuracy. Q² is the result of the cross-validation described above and is the square of the correlation coefficient between the measured values and the predicted values obtained from repeated model validation. From a prediction standpoint, a model is considered to have good predictability if at least Q² > 0.50 (Triba, M. N. et al., Mol. BioSyst. 2015, 11, 13-19), so Q² > 0.50 was used as the criterion for model evaluation. Because R² > Q² always holds, Q² > 0.50 also implies R² > 0.50.
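The two indices, as defined in the paragraph above, can be sketched with a toy model. Everything here is an assumption made for illustration: a one-variable least-squares line stands in for the OPLS model, leave-one-out is used as the cross-validation scheme, and the data are hypothetical. SIMCA's own computation of R² and Q² may differ in detail.

```python
import math


def squared_correlation(observed, predicted):
    """Square of the correlation coefficient between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop * sop / (soo * spp)


def fit_line(x, y):
    """Least-squares slope and intercept (a stand-in for the OPLS model)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def r2_and_q2(x, y):
    """R2 from the model fitted to all data; Q2 from leave-one-out predictions."""
    slope, intercept = fit_line(x, y)
    fitted = [slope * a + intercept for a in x]
    cv_predicted = []
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        s, c = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        cv_predicted.append(s * x[i] + c)
    return squared_correlation(y, fitted), squared_correlation(y, cv_predicted)


# Hypothetical data: one explanatory variable and a yield-like response
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
r2, q2 = r2_and_q2(x, y)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

The key difference is that each cross-validated prediction comes from a model that never saw the sample being predicted, so Q² reflects how the model performs on new data rather than how well it fits the training data.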
8-1. Construction and evaluation of a model using all data
 An OPLS model for predicting yield was constructed from a data matrix of 26 samples, each with the peak area values of 1,324 components and a yield value. For model construction, the peak area values of each component and the yield data were autoscaled to mean 0 and variance 1. The resulting model had R² = 0.931 (prediction accuracy) and Q² = 0.344 (predictability), and thus did not meet the criterion of Q² > 0.50.
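Autoscaling to mean 0 and variance 1 can be shown with a short sketch. The function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the text states only the target mean and variance.

```python
import statistics


def autoscale(column):
    """Centre a column to mean 0 and scale it to unit variance (sample SD)."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(value - mean) / sd for value in column]


# Hypothetical peak areas of one component across three samples
print(autoscale([2.0, 4.0, 6.0]))  # [-1.0, 0.0, 1.0]
```

Applied column by column, this puts every component on the same scale, so that components with large peak areas do not dominate the model simply because of their magnitude.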
8-2. Calculation of VIP values
 The model constructed in 8-1 yields a VIP (Variable Importance in the Projection) value for each component, which expresses that component's contribution to model performance. The larger the VIP value, the greater the contribution to the model; VIP values also correlate with the absolute value of the correlation coefficient. The components ranked in the top 800 by VIP value are listed in Tables 5a to 5j.
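For orientation, the commonly cited definition of VIP for a PLS model is given below. This is a general textbook form stated as an assumption; the text does not give the formula, and SIMCA's VIP for OPLS models may be computed differently.

```latex
\mathrm{VIP}_j \;=\; \sqrt{\, p \;
  \frac{\displaystyle\sum_{a=1}^{A} \mathrm{SS}_a
        \left( \frac{w_{ja}}{\lVert \mathbf{w}_a \rVert} \right)^{2}}
       {\displaystyle\sum_{a=1}^{A} \mathrm{SS}_a} \,}
```

Here p is the number of explanatory variables (1,324 components in the model of 8-1), A is the number of model components, w_{ja} is the weight of variable j in model component a, and SS_a is the sum of squares of the objective variable explained by model component a. Under this definition the mean of VIP_j² over all variables equals 1, so a VIP value above 1 indicates an above-average contribution.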
Figure JPOXMLDOC01-appb-T000055
Figure JPOXMLDOC01-appb-T000056
Figure JPOXMLDOC01-appb-T000057
Figure JPOXMLDOC01-appb-T000058
Figure JPOXMLDOC01-appb-T000059
Figure JPOXMLDOC01-appb-T000060
Figure JPOXMLDOC01-appb-T000061
Figure JPOXMLDOC01-appb-T000062
Figure JPOXMLDOC01-appb-T000063
Figure JPOXMLDOC01-appb-T000064
8-3. Model construction using VIP values as an index
 Models were constructed from multiple components on the basis of the VIP ranking (Tables 5a to 5j), which reflects each component's contribution to the model constructed in 8-1. Although not a strict requirement, Q² > 0.50 was adopted for convenience as the criterion for model performance.
8-3-1. Model using the analytical data of the components ranked in the top 800 by VIP value
 All components ranked 1 to 800 by VIP value were selected, and an OPLS model for predicting yield was constructed from a data matrix of 26 samples, each with the peak area values of these 800 components and a yield value. For model construction, the peak area values of each component and the yield data were autoscaled to mean 0 and variance 1. The resulting model had R² = 0.78 (prediction accuracy) and Q² = 0.51 (predictability). The results are shown in FIG. 2. This model showed that the component composition of leaves taken about one month into cultivation can be used to build a model with high predictability, making early yield prediction possible.
8-3-2. Models using the analytical data of lower-ranked components among the top 800 by VIP value
 OPLS models were constructed using the analytical data of all components ranked 11 to 800 by VIP value, all components ranked 21 to 800, all components ranked 31 to 800, and so on up to all components ranked 111 to 800. Only the models using the components ranked 11 to 800 and those ranked 21 to 800 satisfied Q² > 0.50; the model using all components ranked 31 to 800 did not reach Q² > 0.50 (FIG. 3).
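The series of rank windows used in 8-3-2 can be enumerated with a short sketch. The component labels are placeholders, not the actual components of Tables 5a to 5j, and the helper name is hypothetical.

```python
def select_by_rank(ranked_components, start, end):
    """Return components ranked start..end (1-based, inclusive) from a list sorted by descending VIP."""
    return ranked_components[start - 1:end]


# Placeholder labels for the 800 top-ranked components, in descending VIP order
ranked = [f"component_rank_{i}" for i in range(1, 801)]

# Windows 11-800, 21-800, ..., 111-800
windows = [(start, 800) for start in range(11, 112, 10)]
print(len(windows))                             # 11
print(windows[0], windows[-1])                  # (11, 800) (111, 800)
print(len(select_by_rank(ranked, 11, 800)))     # 790
print(len(select_by_rank(ranked, 21, 800)))     # 780
```

Each window drops ten more of the highest-ranked components than the previous one, which is how the experiment probes how much predictability depends on the top of the VIP ranking.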
8-3-3. Models using nine of the components ranked in the top 10 by VIP value
 OPLS models were constructed for every combination of nine components (10 combinations) chosen from the components ranked 1 to 10 by VIP value. Every model satisfied Q² > 0.50. This showed that a model with a certain level of predictability can be constructed as long as it includes nine of the top 10 metabolites by VIP value (FIG. 4).
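The count of 10 combinations follows from choosing 9 items out of 10, which is the same as choosing the single item to leave out. A minimal sketch, using VIP ranks 1 to 10 as placeholders for the actual components:

```python
from itertools import combinations

top10 = list(range(1, 11))                # VIP ranks 1 to 10 (placeholders)
subsets = list(combinations(top10, 9))    # every way to pick 9 of the 10

print(len(subsets))  # 10

# Each subset omits exactly one rank, and every rank is omitted exactly once.
left_out = sorted((set(top10) - set(s)).pop() for s in subsets)
print(left_out)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Since every one of the ten leave-one-out subsets gave Q² > 0.50, no single top-10 component is indispensable for reaching the criterion.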

Claims (17)

  1.  A method for predicting rice yield, comprising acquiring, from a leaf sample collected from rice, analytical data for one or more components selected from components having an accurate mass (m/z) of 101 to 1215 as provided by mass spectrometry, and predicting the rice yield by using the correlation between the data and rice yield.
  2.  The method according to claim 1, wherein the analytical data for the one or more components are corrected by the pooled QC method.
  3.  The method according to claim 1 or 2, wherein the component is one or more selected from the components listed in Tables 1a to 1i below, which are defined by the accurate mass (m/z) provided by mass spectrometry.
    Figure JPOXMLDOC01-appb-T000001

    Figure JPOXMLDOC01-appb-T000002

    Figure JPOXMLDOC01-appb-T000003

    Figure JPOXMLDOC01-appb-T000004

    Figure JPOXMLDOC01-appb-T000005

    Figure JPOXMLDOC01-appb-T000006

    Figure JPOXMLDOC01-appb-T000007

    Figure JPOXMLDOC01-appb-T000008

    Figure JPOXMLDOC01-appb-T000009
  4.  The method according to claim 3, wherein the component is one or more selected from component Nos. 1, 4, 6, 9, 10, 11, 12, 19, 20, 21, 23, 26, 27, 29, 30, 33, 34, 35, 38, 39, 45, 46, 47, 48, 49, 50, 51, 52, 54, 55, 56, 61, 62, 63, 64, 65, 66, 67, 69, 71, 75, 76, 77, 78, 81, 83, 84, 85, 88, 89, 90, 91, 92, 96, 100, 102, 105, 106, 107, 108, 109, 113, 116, 118, 119, 120, 121, 123, 124, 126, 127, 129, 130, 131, 133, 134, 137, 139, 142, 145, 147, 148, 149, 151, 152, 153, 154, 155, 156, 159, 162, 163, 164, 165, 166, 167, 168, 169, 170, 172, 174, 175, 177, 178, 181, 182, 183, 184, 186, 187, 188, 191, 193, 194, 196, 198, 200, 202, 203, 206, 208, 209, 210, 212, 213, 214, 215, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 235, 237, 238, 239, 240, 243, 245, 246, 247, 248, 249, 250, 251, 252, 254, 255, 258, 259, 260, 261, 262, 263, 264, 265, 266, 268, 270, 271, 272, 273, 274, 275, 277, 278, 280, 281, 283, 284, 285, 286, 288, 289, 290, 291, 293, 294, 298, 300, 302, 303, 305, 310, 312, 314, 316, 317, 318, 320, 321, 323, 325, 327, 329, 331, 332, 333, 334, 335, 337, 338, 339, 342, 343, 344, 345, 346, 347, 348, 350, 351, 352, 355, 358, 359, 360, 361, 362, 363, 365, 366, 368, 369, 370, 371, 373, 374, 375, 378, 379, 381, 382, 389, 390, 391, 392, 395, 397, 398, 399, 401, 404, 407, 408, 409, 410, 411, 413, 414, 415, 416, 417, 418, 419, 423, 424, 425, 428, 431, 433, 434, 435, 436, 437, 438, 439, 441, 444, 445, 446, 447, 449, 450, 451, 454, 455, 457, 458, 459, 460, 461, 462, 464, 465, 469, 471, 472, 473, 474, 475, 478, 480, 481, 482, 483, 487, 489, 490, 491, 492, 494, 502, 503, 504, 507, 509, 510, 511, 512, 513, 514, 516, 517, 522, 523, 525, 526, 529, 532, 534, 539, 540, 542, 543, 547, 548, 549, 551, 552, 554, 555, 557, 561, 565, 566, 567, 573, 582, 583, 585, 586, 588, 589, 590, 591, 593, 594, 595, 596, 597, 599, 600, 602, 603, 604, 606, 609, 611, 612, 613, 615, 616, 617, 619, 620, 621, 624, 628, 630, 631, 632, 633, 635, 639, 643, 644, 647, 649, 650, 651, 653, 654, 655, 656, 658, 660, 661, 662, 665, 666, 671, 672, 673, 674, 675, 681, 682, 683, 684, 685, 688, 689, 691, 692, 693, 694, 695, 696, 699, 700, 701, 702, 703, 704, 706, 707, 708, 713, 714, 715, 717, 719, 721, 722, 723, 724, 725, 726, 727, 728, 729, 731, 732, 734, 735, 737, 738, 740, 745, 746, 748, 749, 750, 754, 756, 757, 762, 765, 766, 767, 768, 770, 774, 776, 777, 780, 781, 782, 785, 787, 789, 792, 793, 794, 795, 796, 797, 798, 799, 801, 802, 803, 804, 810, 811, 813, 815, 816, 817, 818, 820, 822, 823, 824, 827, 828, 829, 830, 832, 834, 841, 842, 843, 844, 845, 846, 848, 849, 850, 852, 854, 858, 863, 864, 867, 868, 869, 870, 871, 872, 874, 877, 878, 879, 882, 883, 884, 885, 886, 888, 889, 893, 894, 895, 896, 898, 899, 900, 901, 902, 903, 910, 911, 912, 914, 917, 919, 922, 923, 924, 925, 926, 928, 930, 932, 938, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 952, 953, 955, 956, 958, 959, 960, 962, 965, 966, 968, 969, 973, 976, 979, 980, 981, 983, 985, 986, 989, 992, 993, 994, 995, 996, 997, 999, 1001, 1002, 1003, 1005, 1006, 1007, 1009, 1012, 1013, 1015, 1017, 1019, 1020, 1021, 1022, 1024, 1025, 1026, 1027, 1031, 1032, 1034, 1036, 1039, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1053, 1054, 1057, 1058, 1059, 1060, 1062, 1066, 1067, 1068, 1069, 1070, 1072, 1074, 1075, 1077, 1078, 1079, 1081, 1082, 1087, 1088, 1089, 1092, 1094, 1098, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1108, 1110, 1112, 1113, 1114, 1117, 1118, 1119, 1120, 1121, 1123, 1126, 1127, 1128, 1129, 1133, 1134, 1135, 1139, 1140, 1141, 1142, 1143, 1144, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1158, 1160, 1163, 1166, 1167, 1168, 1170, 1171, 1172, 1173, 1174, 1177, 1178, 1179, 1180, 1181, 1182, 1184, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, 1202, 1204, 1208, 1211, 1212, 1214, 1217, 1218, 1221, 1222, 1224, 1225, 1226, 1229, 1231, 1233, 1234, 1235, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1246, 1247, 1248, 1249, 1250, 1252, 1254, 1255, 1256, 1257, 1258, 1261, 1263, 1265, 1267, 1268, 1269, 1271, 1272, 1276, 1277, 1278, 1280, 1283, 1291, 1292, 1295, 1296, 1297, 1299, 1300, 1301, 1304, 1305, 1306, 1309, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1321 and 1322 listed in Tables 1a to 1i.
  5.  The method according to claim 3, wherein the component is one or more selected from component Nos. 10, 177, 178, 245, 254, 272, 294, 337, 366, 435, 462, 529, 539, 708, 729, 832, 842, 869, 901, 912, 1050, 1060, 1173 and 1306 listed in Tables 1a to 1i.
  6.  The method according to claim 3, wherein the component is one or more selected from component Nos. 10, 178 and 1173 listed in Tables 1a to 1i.
  7.  The method according to any one of claims 1 to 6, wherein the leaf sample is collected from rice between the emergence stage and the heading stage.
  8.  The method according to any one of claims 1 to 6, wherein the leaf sample is collected from rice between the two-leaf stage and the panicle formation stage.
  9.  The method according to any one of claims 1 to 8, wherein the analytical data are mass spectrometry data.
  10.  The method according to any one of claims 3 to 9, comprising a step of comparing the analytical data of the components obtained from the leaf sample against a yield prediction model constructed using the analytical data of at least one component from among the top 800 by VIP value, the VIP values being calculated from a yield prediction model constructed using the component information listed in Tables 1a to 1i.
  11.  The method according to claim 10, wherein the yield prediction model uses at least 5 of the top 800 components by VIP value calculated from the yield prediction model constructed using the component information listed in Tables 1a to 1i.
  12.  The method according to claim 10, wherein the yield prediction model uses at least 10 of the top 800 components by VIP value calculated from the yield prediction model constructed using the component information listed in Tables 1a to 1i.
  13.  The method according to claim 10, wherein the yield prediction model uses at least 1 of the top 10 components by VIP value calculated from the yield prediction model constructed using the component information listed in Tables 1a to 1i.
  14.  The method according to claim 10, wherein the yield prediction model uses at least 5 of the top 10 components by VIP value calculated from the yield prediction model constructed using the component information listed in Tables 1a to 1i.
  15.  The method according to claim 10, wherein the yield prediction model uses at least 9 of the top 10 components by VIP value calculated from the yield prediction model constructed using the component information listed in Tables 1a to 1i.
  16.  The method according to claim 10, wherein the yield prediction model uses the top 800 components by VIP value calculated from the yield prediction model constructed using the component information listed in Tables 1a to 1i.
  17.  The method according to any one of claims 10 to 16, wherein the yield prediction model is a model constructed using the OPLS method.
PCT/JP2021/037900 2020-10-13 2021-10-13 Method for predicting soybean yield WO2022080411A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020172820 2020-10-13
JP2020-172820 2020-10-13

Publications (1)

Publication Number Publication Date
WO2022080411A1 true WO2022080411A1 (en) 2022-04-21

Family

ID=81209172

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/037900 WO2022080411A1 (en) 2020-10-13 2021-10-13 Method for predicting soybean yield

Country Status (2)

Country Link
JP (1) JP2022064328A (en)
WO (1) WO2022080411A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011033533A (en) * 2009-08-04 2011-02-17 National Institute For Environmental Studies Method for evaluating influence of ozone on rice using sakuranetin
JP2012215482A (en) * 2011-03-31 2012-11-08 Central Research Institute Of Electric Power Industry Method for evaluating ozone influence of rice yield
US20150168419A1 (en) * 2010-11-16 2015-06-18 University College Cork - National University Of Ireland, Cork Prediction of a small-for-gestational age (sga) infant

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115953694A (en) * 2022-12-29 2023-04-11 北大荒信息有限公司 Leaf age spatial distribution detection method based on satellite remote sensing image
CN115953391A (en) * 2023-03-08 2023-04-11 吉林大学 Rice phenological index monitoring system and method
CN115953391B (en) * 2023-03-08 2023-08-15 吉林高分遥感应用研究院有限公司 Rice weather index monitoring system and method

Also Published As

Publication number Publication date
JP2022064328A (en) 2022-04-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21880149

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21880149

Country of ref document: EP

Kind code of ref document: A1