CN110705182A - Crop breeding adaptive time prediction method coupling crop model and machine learning - Google Patents


Info

Publication number
CN110705182A
CN110705182A CN201911076188.7A CN201911076188A CN110705182A CN 110705182 A CN110705182 A CN 110705182A CN 201911076188 A CN201911076188 A CN 201911076188A CN 110705182 A CN110705182 A CN 110705182A
Authority
CN
China
Prior art keywords
crop
model
time
breeding
yield
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911076188.7A
Other languages
Chinese (zh)
Other versions
CN110705182B (en)
Inventor
张朝
张亮亮
陶福禄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Normal University
Publication of CN110705182A
Application granted
Publication of CN110705182B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mining & Mineral Resources (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting crop breeding adaptation time by coupling a crop model and machine learning, which comprises the following steps: S1: calibrating the crop model and simulating management scenarios to obtain the growth period (DOY) and yield (Y) of the crop; S2: selecting key feature variables; S3: constructing a hybrid evaluation model, including selecting the hybrid evaluation model with the highest accuracy by means of machine learning; S4: assessing the effect of climate change, including calculating the yield change (Yc) for each variety; and S5: identifying the breeding adaptation time, by determining whether the median yield change exceeds an adaptation threshold in at least half of the years in each time window and, if so, determining that the grid point needs breeding intervention, the breeding adaptation time being the middle time t of the time window. This ultimately yields the times and locations in the study area that require breeding adaptation under a specific future climate scenario.

Description

Crop breeding adaptive time prediction method coupling crop model and machine learning
Technical Field
The invention relates to the technical field of agricultural information, and in particular to a method for predicting crop breeding adaptation time by coupling a crop model and machine learning.
Background
Climate change causes a significant increase in the frequency and intensity of extreme climatic events (such as extreme high temperature, drought and heat waves) and poses a serious threat to global food security. Variety renewal is a key measure by which agricultural production systems cope with climate change. It involves three stages, namely breeding, delivery and adoption, which together generally require 15-30 years and consume a large amount of funds. The breeding adaptation time should therefore be predicted scientifically in advance so that funds are not wasted. A systematic assessment of the impact of climate change on existing varieties at the regional scale is a prerequisite for determining when and where breeding adaptation is required.
At present, there are two main methods for evaluating the influence of climate change on crop varieties. (1) Statistical models establish a regression relationship between meteorological factors and yield in a reference period, and then substitute the trend of the meteorological elements under a future climate scenario into the statistical model to estimate the influence of climate change. However, this method can only evaluate the influence of climate change on a single variety, and cannot systematically study the response of the existing varieties to climate change. (2) Crop models simulate the continuous process of crop growth from sowing to maturity on a daily or even hourly scale, and reflect how crop growth responds to different environmental and management factors. To evaluate the influence of climate change, the meteorological, soil and management data for the reference period and the future scenario are input into a crop model to simulate the corresponding yields, and the yields of the two periods are then compared. Existing crop models fall into two categories: site models and grid-point models. A site model is designed for a specific field experiment; although it successfully describes the influence of management measures on yield formation, it can only simulate a single point. Regional application of a site model can be achieved through regionalized variety parameters and spatial interpolation of the meteorological elements, but this inevitably introduces new errors. A grid-point crop model can represent regional spatial differences, but it requires a large amount of driving data to build and run, and its parameters are very difficult to determine because of the large spatial heterogeneity of surface parameters, varieties and management practices, so large-area studies remain difficult.
In addition, grid-point models mainly consider the influence of changes in meteorological elements on crop yield, and the contribution of agricultural management measures is usually ignored.
Most existing studies of breeding adaptation use field warming experiments or crop models, based on the simulation of hypothetical varieties, to examine whether bred heat-tolerant and drought-tolerant varieties can compensate for the yield loss caused by climate change. No framework for predicting breeding adaptation time exists at present.
It is therefore necessary to construct a flexible and efficient method that quantifies the influence of climate change at the regional scale while taking into account the contribution of management measures such as variety choice, as the basis for predicting breeding adaptation time.
Disclosure of Invention
In the course of research, the inventors recognized that machine learning is a direct extension of statistical methods. The difference is that machine learning makes predictions using weights without making any assumptions about the input information, so it is more robust to noisy data and can better describe the complex nonlinear relationships of an agricultural system. Furthermore, machine learning is entirely data dependent: the spatial scale of its predictions depends on the input data, which allows flexible multi-scale application. The mechanistic processes of the site model are therefore combined with the data-driven advantages of machine learning. A hybrid evaluation model is constructed by training a machine learning algorithm on the output of the crop model, so as to capture the complex relationship among climate, soil, management, variety and yield in a specific environment. This relationship is then applied to homogeneous regions, so that the influence of climate change on different varieties can be evaluated at the regional scale, laying a foundation for the subsequent prediction of breeding adaptation time.
Cross-threshold analysis, as used in climate change prediction, can determine the time and place at which a certain event occurs (for example, estimating when and where temperature will exceed pre-industrial levels by 2 ℃). Applied to the study of climate change adaptation measures, it can predict the time and place of breeding adaptation and give decision makers an early signal of when and where existing varieties can no longer be grown, thereby promoting investment in breeding, which is of great importance for guaranteeing national and regional food security.
Based on these findings, the invention constructs a hybrid evaluation model by coupling a site crop model with machine learning, evaluates the influence of climate change on different varieties at the regional scale, and then predicts the time and place of breeding adaptation using cross-threshold analysis.
According to an aspect of the present invention, there is provided a method for predicting adaptation time of crop breeding by coupling crop models and machine learning, comprising the steps of:
S1: calibrating the crop model and simulating management scenarios, comprising: acquiring soil data (S), meteorological data (W) and agricultural production data (A) for experimental sites in the study area, calibrating a crop model with these data, and simulating various management scenarios with the calibrated crop model to obtain the growth period (DOY) and yield (Y) of the crop under each simulated scenario;
S2: selecting key feature variables, comprising: for each simulated scenario, extracting the daily meteorological data from sowing to maturity according to the simulated growth period (DOY), and calculating agro-climatic indices for the growth period; integrating the feature variables that influence crop growth and development for each simulated scenario into a feature variable table; calculating the correlation between feature variables by Pearson correlation analysis, ranking the importance of the feature variables with respect to yield using a machine learning model, removing feature variables whose correlation exceeds a preset value (for example, 0.75) and feature variables that do not contribute significantly to yield, and retaining the remaining feature variables as key feature variables;
S3: constructing a hybrid evaluation model, comprising: inserting the yield (Y) of each simulation into the corresponding feature variable table, dividing the simulated scenarios into a training set and a test set in a certain proportion, optimizing the hyperparameters of a machine learning model on the training set with the key feature variables and the yield (Y) using the grid search method (GridSearchCV) in Python, and selecting the hybrid evaluation model with the highest accuracy using 10-fold cross-validation on the test set;
S4: assessing the effect of climate change, comprising: inputting the grid-scale key feature variables of each variety under the climate conditions of the reference period and the future period into the hybrid evaluation model with the highest accuracy to obtain the gridded yield for the reference period and the future period, and calculating the yield change (Yc) of each variety by comparison, using the following formula:
$$Y_c = \frac{Y_f - Y_b}{Y_b} \times 100\%$$
where Y_f and Y_b are the annual yield in the future period and the annual average yield in the reference period, respectively;
S5: identifying the breeding adaptation time, comprising: calculating, for each crop planting grid point, the median of the yield changes of a plurality of varieties for each year of the future period; setting a time window in the future period, with the start of the future period as the start of the time window, and determining whether the median yield change exceeds an adaptation threshold in at least half of the years in the time window; if this condition is met, determining that the grid point needs breeding intervention, the breeding adaptation time being the middle time t of the time window; then setting the next time window with t+1 as its midpoint and performing the same analysis, and repeating until the end of the time window reaches the end of the future period, thereby finally obtaining the times and places in the study area that require breeding adaptation under a specific future climate scenario, wherein the adaptation threshold is defined as a negative yield change of a specified magnitude.
Preferably, the crop model is a DSSAT model.
Preferably, in steps S2 and S3, the machine learning model is selected from RF and XGBoost.
Preferably, the variables influencing the growth and development of the crop comprise the agro-climatic indices, soil properties, geographical location and variety.
Preferably, step S3 includes using the mean absolute error (MAE), the root mean square error (RMSE) and the coefficient of determination (R²) to evaluate the accuracy; the hybrid evaluation model with the lowest MAE and RMSE and the highest R² is the hybrid evaluation model with the highest accuracy.
Preferably, the ratio of the training set to the test set is between about 7:3 and 9:1, preferably 8:2.
Preferably, the time window is 20 years.
Preferably, a yield change of -10% is used as the adaptation threshold.
Preferably, the method further comprises replacing the median with the maximum loss value or the minimum loss value of the yield changes of the plurality of varieties for each year of the future period, and repeating step S5 to identify the earliest and latest breeding adaptation times and locations.
Compared with the prior art, the invention achieves the following beneficial technical effects:
1. Regional application of the site crop model is realized. A site model is used to simulate the yield under various local production conditions, and the simulation results are then used to train a machine learning model. The aim is to use machine learning to describe the complex relationship among climate, management, variety and yield in a specific area, and then to apply this relationship to homogeneous areas, thereby extrapolating from points to the region. Compared with the traditional regionalized variety parameter method, this approach is more scientific and reasonable.
2. The efficiency of climate change impact assessment is improved. Compared with simulation based on a grid-point model, the method needs only a small amount of experimental data to calibrate the site model, and avoids the complex data preparation and parameter determination of the grid-point model. The spatial scale of the hybrid evaluation model depends only on the input data, so multi-scale climate change impact assessment can be completed flexibly.
3. A framework for predicting climate change adaptation measures is presented. The technology applies cross-threshold analysis, as used in climate change prediction, to the determination of crop breeding time, and predicts the time and place of breeding adaptation for the first time, which is important for agricultural production systems coping with climate change and for guaranteeing food security. The method is not limited to determining breeding adaptation time, and can also be applied to the study of other adaptation measures such as transformational adaptation.
Drawings
The same reference numbers in the drawings identify the same or similar elements or components. The objects and features of the present invention will become more apparent in view of the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flow diagram of a method for predicting crop breeding adaptation time by coupling a crop model and machine learning according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of the prediction results of a crop breeding adaptation time prediction method coupling crop models and machine learning according to an embodiment of the present invention.
Detailed Description
For a clear description of the solution according to the invention, preferred embodiments are given below and are described in detail with reference to the accompanying drawings. The following description is merely exemplary in nature and is not intended to limit the present disclosure, application, or uses.
It should be understood that the crop models and machine learning models referred to in the present invention, including their sub-modules, parameters and operating mechanisms, are known per se; the present invention therefore focuses on the process of coupling the crop model with machine learning.
FIG. 1 is a schematic flow diagram of a method for predicting adaptation time for crop breeding by coupling crop models and machine learning, according to an embodiment of the present invention, which is further described below with reference to the accompanying drawings.
Referring to fig. 1, the method for predicting adaptation time for crop breeding by coupling crop model and machine learning according to the present invention may include the following steps:
First, a suitable crop model is selected, such as a DSSAT-series model or an MCWLA-series model, and the crop model is calibrated (i.e., localized) and used to simulate various management scenarios with the soil data (S), meteorological data (W) and agricultural production data (A) of the experimental sites in the study area. For example, the soil parameters may include soil type, color, grade, permeability, reflectivity, soil layer thickness, soil moisture evaporation limit, runoff curve number and soil drainage rate, photosynthesis factor, lower limit of soil water or wilting-point water content, field capacity, saturated water content, soil bulk density, soil organic carbon, nitrogen, soil pH, clay content and particle content, and the like; the meteorological parameters may include daily solar radiation, daily maximum temperature, daily minimum temperature, daily rainfall, daily relative humidity, daily average wind speed and the like; the agricultural production data include site-scale irrigation, variety, fertilization, planting density, sowing method and the like. The model is calibrated with these data to achieve localization. The calibrated model is then used to simulate various management scenarios to obtain the growth period (DOY) and yield (Y) of the crop under each simulated scenario.
The figure shows the case in which the DSSAT model is adopted. Specifically, the data (S), (W) and (A) may be input into the DSSAT model to generate the files S′, W′ and A′, respectively, which the model can execute; the files S′, W′ and A′ are then called by the GLUE parameter estimation tool to compute a file C containing the crop variety parameters of the study area. Various agricultural management scenarios are then set up based on the agricultural production data (A); that is, the agricultural production data A are expanded to simulate various management scenarios, the modified data are input into the crop model again to generate a file A″, and the files S′, W′, A″ and C are called by the cropping system model embedded in the crop model to simulate the growth period (DOY) and yield (Y) of the crop.
If necessary, the calibrated model may be verified with the data (S), (W) and (A); for example, the data (S), (W) and (A) may be divided into a calibration part and a verification part, the model being calibrated with the calibration part and verified with the verification part.
Next, key feature variables are selected. For each simulated scenario, the daily meteorological data of the crop from sowing to maturity, such as maximum temperature, minimum temperature, average temperature and rainfall, are extracted according to the simulated growth period (DOY), and agro-climatic indices for the growth period are calculated, such as growing degree days (GDD), accumulated rainfall (Pgs) and the standardized precipitation index (SPI). The feature variables that influence crop growth and development for each simulated scenario are then integrated into a feature variable table. For example, the feature variables may include the above agro-climatic indices, soil properties, geographical location, variety and the like. The soil properties may include the lower limit (SLLL), drained upper limit (SDUL), saturated water content (SSAT), bulk density (SBDM), pH value (SLHM), cation exchange capacity (SCEC) and the like, and the geographical location may include latitude, longitude, elevation and the like.
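The accumulation of growth-period indices described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the record layout and the base temperature of 10 ℃ are assumptions introduced here, and GDD is computed with the common daily-mean formula, which the source does not specify.

```python
from typing import Dict, List


def growing_season_indices(daily: List[Dict[str, float]],
                           sowing_doy: int,
                           maturity_doy: int,
                           t_base: float = 10.0) -> Dict[str, float]:
    """Accumulate GDD and rainfall from sowing to maturity (inclusive).

    daily: one record per day with keys "doy", "tmax", "tmin", "rain".
    t_base: base temperature in deg C (an assumed value, not from the source).
    """
    gdd = 0.0
    pgs = 0.0
    for rec in daily:
        if sowing_doy <= rec["doy"] <= maturity_doy:
            t_mean = (rec["tmax"] + rec["tmin"]) / 2.0
            # Days colder than the base temperature contribute nothing.
            gdd += max(t_mean - t_base, 0.0)
            pgs += rec["rain"]
    return {"GDD": gdd, "Pgs": pgs}


# Hypothetical four-day weather record; only days 160-162 are in season.
weather = [
    {"doy": 160, "tmax": 30.0, "tmin": 20.0, "rain": 0.0},
    {"doy": 161, "tmax": 32.0, "tmin": 22.0, "rain": 5.0},
    {"doy": 162, "tmax": 18.0, "tmin": 0.0, "rain": 2.5},
    {"doy": 163, "tmax": 28.0, "tmin": 18.0, "rain": 1.0},
]
indices = growing_season_indices(weather, sowing_doy=160, maturity_doy=162)
```

In practice the sowing and maturity days would come from the crop model output for each simulated scenario, and one such row of indices would be written to the feature variable table per scenario.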
The correlation among the feature variables is then calculated by Pearson correlation analysis, and the importance of the feature variables with respect to yield is ranked using a machine learning model. Feature variables whose correlation exceeds a preset value and feature variables that do not contribute significantly to yield are removed, and the remaining feature variables are retained as key feature variables.
Pearson correlation analysis calculates the correlation between feature variables; variables whose correlation coefficient exceeds a certain value, which may be set to 0.75 for example, are excluded. The machine learning models, which may be for example RF and XGBoost, are known per se. The above variables are input into these models, which automatically calculate and output an importance ranking of the feature variables, i.e., a ranking according to the influence (contribution) of each feature variable on yield. Feature variables that do not contribute significantly to yield are removed; for example, the last few variables in the ranking may be deleted. One or more machine learning models may be used, such as RF and XGBoost, in which case the results of both models are considered together.
After this elimination, the remaining feature variables are the key feature variables, which have a large influence on crop yield.
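The two filtering rules above can be sketched in plain Python. This is a hedged illustration only: the importance scores are taken as given (in the source they come from RF or XGBoost), the function names and sample values are hypothetical, and the rule of keeping the more important variable of a highly correlated pair is an assumption, since the source does not say which of the two is removed.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_key_features(table: Dict[str, List[float]],
                        importance: Dict[str, float],
                        r_max: float = 0.75,
                        n_drop_least: int = 0) -> List[str]:
    """Drop the least important variables, then drop any variable that is
    highly correlated with a more important variable already kept."""
    ranked = sorted(table, key=lambda name: importance[name], reverse=True)
    if n_drop_least:
        ranked = ranked[:-n_drop_least]
    kept: List[str] = []
    for name in ranked:
        if all(abs(pearson(table[name], table[k])) <= r_max for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: one list per variable, one entry per scenario.
features = {
    "GDD": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Tmean": [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated with GDD
    "Pgs": [2.0, 1.0, 3.0, 1.0, 2.0],      # uncorrelated with GDD
    "elev": [10.0, 30.0, 20.0, 50.0, 40.0],
}
# Hypothetical importance scores, standing in for RF/XGBoost output.
scores = {"GDD": 0.5, "Tmean": 0.3, "Pgs": 0.15, "elev": 0.05}
key_features = select_key_features(features, scores, r_max=0.75, n_drop_least=1)
```

Here "elev" is removed as the least important variable and "Tmean" is removed because it duplicates "GDD", leaving "GDD" and "Pgs" as the key feature variables.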
Next, a hybrid evaluation model is constructed using the key feature variables and the yield (Y). This may include inserting the yield (Y) of each simulation into the corresponding feature variable table, and then dividing the simulated scenarios (i.e., samples) into a training set and a test set in a certain ratio; for example, the ratio of the training set to the test set may be between about 7:3 and 9:1, preferably 8:2. The hyperparameters of a machine learning model, such as coefficients of a multiple linear regression, are then optimized on the training set with the key feature variables and the yield (Y) using the grid search method (GridSearchCV) in Python. The machine learning model may be, for example, RF and/or XGBoost, thereby constructing the hybrid evaluation model.
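The sample partitioning described above can be illustrated with a short standard-library sketch. It covers only the 8:2 split and the construction of 10 cross-validation folds; the hyperparameter search itself (GridSearchCV in the source) is not reproduced. The function names, the fixed random seed and the sample count of 100 are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split(samples: Sequence, train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[list, list]:
    """Shuffle the samples reproducibly and split them, 8:2 by default."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(idx)))
    train = [samples[i] for i in idx[:cut]]
    test = [samples[i] for i in idx[cut:]]
    return train, test


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (fit_indices, validation_indices) pairs covering range(n)."""
    folds = [list(range(i, n, k)) for i in range(k)]
    pairs = []
    for i in range(k):
        validation = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        pairs.append((fit, validation))
    return pairs


# Hypothetical: 100 simulated scenarios identified by their index.
scenarios = list(range(100))
train_set, test_set = train_test_split(scenarios, train_fraction=0.8)
folds = k_fold_indices(len(train_set), k=10)
```

Each fold pair holds index positions into the set being cross-validated, so every sample is used for validation exactly once across the 10 folds.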
Then, on the test set, 10-fold cross-validation is used to evaluate the accuracy of the hybrid evaluation model, and the hybrid evaluation model with the highest accuracy is selected. The mean absolute error (MAE), the root mean square error (RMSE) and the coefficient of determination (R²) can be used to evaluate the accuracy; the hybrid evaluation model with the lowest MAE and RMSE and the highest R² is the hybrid evaluation model with the highest accuracy.
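The three accuracy metrics named above have standard definitions, sketched here in plain Python. The predictions for "RF" and "XGBoost" are invented numbers for illustration only, and the tie-breaking order used when the metrics disagree (RMSE first, then MAE, then R²) is an assumption that the source does not state.

```python
import math
from typing import Dict, List, Tuple


def mae(obs: List[float], pred: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: List[float], pred: List[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs: List[float], pred: List[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def select_best(obs: List[float],
                predictions: Dict[str, List[float]]
                ) -> Tuple[str, Dict[str, Dict[str, float]]]:
    """Score every candidate model and return the most accurate one."""
    scores = {name: {"MAE": mae(obs, p), "RMSE": rmse(obs, p), "R2": r2(obs, p)}
              for name, p in predictions.items()}
    # Assumed ordering: lowest RMSE, then lowest MAE, then highest R2.
    best = min(scores, key=lambda n: (scores[n]["RMSE"],
                                      scores[n]["MAE"],
                                      -scores[n]["R2"]))
    return best, scores


# Hypothetical simulated yields and model predictions (t/ha).
simulated = [8.0, 9.0, 10.0, 11.0]
candidates = {
    "RF": [8.5, 9.0, 9.5, 11.0],
    "XGBoost": [7.0, 9.0, 11.0, 12.0],
}
best_model, model_scores = select_best(simulated, candidates)
```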
The influence of climate change is then evaluated using the hybrid evaluation model with the highest accuracy. This may include inputting the grid-scale key feature variables of each variety under the climate scenarios of the reference period and the future period into the hybrid evaluation model with the highest accuracy to obtain the gridded yield for the reference period and the future period, and calculating the yield change (Yc) of each variety by comparison, using the following formula:
$$Y_c = \frac{Y_f - Y_b}{Y_b} \times 100\%$$
where Y_f and Y_b are the annual yield in the future period and the annual average yield in the reference period, respectively.
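As a worked illustration, the yield change can be computed per future year as follows. The sketch assumes that Yc is the percentage change of each future year's yield relative to the mean yield of the reference period, which is consistent with the definitions of Yf and Yb given above; the function name and the sample yields are hypothetical.

```python
from typing import Dict, List


def yield_change_percent(future_yields: Dict[int, float],
                         baseline_yields: List[float]) -> Dict[int, float]:
    """Percentage yield change of each future year against the baseline mean."""
    y_b = sum(baseline_yields) / len(baseline_yields)
    return {year: (y_f - y_b) / y_b * 100.0
            for year, y_f in future_yields.items()}


# Hypothetical gridded yields (t/ha) for one variety at one grid point.
baseline = [9.0, 10.0, 11.0]                  # reference period, mean 10.0
future = {2030: 9.5, 2031: 8.5, 2032: 10.5}   # future period
changes = yield_change_percent(future, baseline)
```

With these numbers the yield falls by 5% in 2030 and by 15% in 2031, and rises by 5% in 2032.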
Next, the breeding adaptation time is identified. To determine the time and place of breeding adaptation, an adaptation threshold must first be defined for a variety. A yield change below this threshold means that planting the existing varieties at the site under the future climate scenario would suffer a large yield loss, so that new varieties need to be bred. That is, a negative yield change of a specified magnitude is defined as the adaptation threshold; for example, Yc = -10%, or some other suitable value, may be used, where negative values mean yield loss (yield reduction).
For each crop planting grid point, the median of the yield changes of the multiple varieties is calculated for each year of the future period; that is, the yield change of the different varieties of a crop is simulated for each year, and the median across varieties is calculated at each planting grid point. A time window is then set in the future period, with the start of the future period as the start of the time window, and it is determined whether the median yield change (yield loss) exceeds the adaptation threshold in at least half of the years in the time window. If this condition is met, the grid point (site) is determined to need breeding intervention, and the breeding adaptation time is the middle time t of the time window. The next time window is then set with t+1 as its midpoint and the same analysis is performed, and this is repeated until the end of the time window reaches the end of the future period. This finally yields the times and places in the study area that require breeding adaptation under the specific future climate scenario.
Referring to Fig. 1, with a 20-year time window, it is judged whether the yield loss exceeds the adaptation threshold in at least 10 of the 20 years. If so, the site is considered to need breeding intervention, and the breeding adaptation time is the middle time of the 20-year window; if not, the next time window is analyzed, and so on. Ultimately, the times and places in the study area that require breeding adaptation under the specific future climate scenario are obtained.
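The moving-window test described above can be sketched as follows for a single grid point. Several details are assumptions made for illustration: the window advances one year at a time, the middle time is taken as the year at index window // 2 of the window, "exceeds the threshold" is read as a yield change strictly below -10%, and the first window that satisfies the condition determines the result. The function name and the synthetic series are hypothetical.

```python
from typing import Dict, Optional


def breeding_adaptation_time(yearly_change: Dict[int, float],
                             window: int = 20,
                             threshold: float = -10.0) -> Optional[int]:
    """Return the middle year of the first window in which the yield change
    is below the threshold in at least half of the years, else None.

    yearly_change: yield change (%) per year, for consecutive future years.
    """
    years = sorted(yearly_change)
    for start in range(0, len(years) - window + 1):
        win_years = years[start:start + window]
        n_exceed = sum(1 for y in win_years if yearly_change[y] < threshold)
        if 2 * n_exceed >= window:
            # Assumed convention for the "middle time t" of the window.
            return win_years[window // 2]
    return None


# Hypothetical grid point: 5% loss until 2035, 12% loss from 2036 onward.
series = {year: (-5.0 if year < 2036 else -12.0) for year in range(2021, 2061)}
t_adapt = breeding_adaptation_time(series, window=20, threshold=-10.0)

# A grid point whose losses never reach the threshold needs no intervention.
t_none = breeding_adaptation_time({y: -5.0 for y in range(2021, 2061)})
```

In this example the first window with at least 10 years of loss beyond the threshold is 2026-2045, so the sketch reports 2036 as the breeding adaptation time; applying the function to every grid point would give the map of times and places.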
The method according to the present invention may further comprise replacing the median with the maximum loss value or the minimum loss value of the yield changes of the plurality of varieties for each year of the future period, and repeating step S5 to identify the earliest and latest breeding adaptation times and locations.
The identification of the earliest and latest breeding adaptation times follows the process described above, but is based on the maximum and minimum loss, respectively, of the yield changes of the plurality of varieties for each year of the future period. That is, for each crop planting grid point, the maximum loss or minimum loss value of the yield changes of the plurality of varieties is calculated for each year of the future period. A time window is set in the future period, with the start of the future period as the start of the time window, and it is determined whether the maximum loss or minimum loss value of the yield change exceeds the adaptation threshold in at least half of the years in the time window. If this condition is met, the grid point (site) is determined to need breeding intervention, and the earliest or latest breeding adaptation time is the middle time t of the time window. The next time window is then set with t+1 as its midpoint and the same analysis is performed, and this is repeated until the end of the time window reaches the end of the future period, thereby finally obtaining the earliest and latest times and places at which the study area needs breeding adaptation under a specific future climate scenario.
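The per-year aggregation across varieties that feeds the window analysis can be sketched as follows. Because losses are negative yield changes, the maximum loss is the most negative value and the minimum loss is the least negative one. The function name, the variety labels V1 to V3 and the values are hypothetical.

```python
import statistics
from typing import Dict


def aggregate_across_varieties(change_by_variety: Dict[str, Dict[int, float]],
                               how: str = "median") -> Dict[int, float]:
    """Collapse per-variety yield changes (%) into one value per year.

    how: "median" for the central estimate, "max_loss" for the most negative
    change (earliest adaptation time), "min_loss" for the least negative
    change (latest adaptation time).
    """
    reducers = {"median": statistics.median, "max_loss": min, "min_loss": max}
    reduce_fn = reducers[how]
    years = sorted(next(iter(change_by_variety.values())))
    return {year: reduce_fn([series[year]
                             for series in change_by_variety.values()])
            for year in years}


# Hypothetical yield changes (%) of three varieties at one grid point.
by_variety = {
    "V1": {2030: -4.0, 2031: -12.0},
    "V2": {2030: -8.0, 2031: -20.0},
    "V3": {2030: -6.0, 2031: -9.0},
}
median_series = aggregate_across_varieties(by_variety, "median")
worst_series = aggregate_across_varieties(by_variety, "max_loss")
least_series = aggregate_across_varieties(by_variety, "min_loss")
```

Each of the three resulting series can then be passed through the same time-window analysis to obtain the central, earliest and latest breeding adaptation times.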
Examples
This example further illustrates the technical solution of the present invention by taking yield estimation of summer maize on the North China Plain as an example. The example is intended only to illustrate the invention, not to limit its scope; for instance, the invention may also be applied to other crops such as wheat.
In this example, the breeding adaptation time of summer maize in the Huang-Huai-Hai region of China is predicted under RCP8.5 (a hypothetical future carbon emission scenario in which, by 2100, the atmospheric carbon dioxide concentration is 3-4 times higher than before the Industrial Revolution). The flow of the method for predicting crop breeding adaptation time by coupling a crop model and machine learning is further illustrated and specifically includes:
S1: The CERES-Maize model of the DSSAT model family is used for calibration and for simulating local management scenarios. Six maize varieties were tested at 13 sites in the Huang-Huai-Hai maize growing area, as shown in Table 1. The soil data S, meteorological data W and agricultural production data A of all experiments for each variety are input into the CERES-Maize model, and the data are divided into a calibration part and a verification part in a certain proportion. The three files for the calibration years are called through the GLUE parameter estimation tool to compute a file C containing the crop variety parameters of the region; the calibrated genetic parameters are then verified with the data of the verification years, finally yielding 6 sets of variety genetic parameters.
Table 1. Maize varieties tested on the Huang-Huai-Hai Plain
Figure BDA0002262522140000141
Based on the agricultural production data A, the data of each experimental site can, for example, be modified (augmented) according to local planting experience or multi-year records, so as to simulate the yield under different management scenarios combining 6 varieties, 5 planting dates and 6 planting densities in the reference years (1986-.
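The management scenarios are the full combination of varieties, planting dates and planting densities. A minimal sketch of enumerating them follows; all labels are hypothetical placeholders, since the actual varieties, dates and densities are not listed in this text.

```python
from itertools import product

# All labels below are hypothetical placeholders, not the values used in the example.
varieties = [f"variety_{i}" for i in range(1, 7)]            # 6 calibrated varieties
planting_dates = [f"date_{i}" for i in range(1, 6)]          # 5 planting dates
planting_densities = [f"density_{i}" for i in range(1, 7)]   # 6 planting densities

scenarios = [
    {"variety": v, "planting_date": d, "planting_density": p}
    for v, d, p in product(varieties, planting_dates, planting_densities)
]
print(len(scenarios))  # 6 * 5 * 6 = 180 scenarios per site and year
```

Each scenario would then be run through the calibrated crop model for every site and reference year to produce a simulated growth period and yield.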
S2: Select key features. According to the growth period (DOY) simulated in S1, the daily maximum temperature, minimum temperature, mean temperature and rainfall of the maize from sowing to maturity are extracted for each management scenario, and 5 agro-climatic indices are calculated with the formulas in Table 2 (the calculation of these indices is well known in the art and is not repeated here). Then the simulated growth period length (DOY) of each simulation, 10 surface soil attributes of the corresponding site (such as soil physicochemical properties, hydrological properties and pH at 100 cm) and 3 geographic location variables (longitude, latitude and elevation) are extracted, and a feature variable table is built. Next, the correlation among the features is calculated by Pearson correlation analysis, and the importance of the factors is ranked with RF and XGBoost respectively. Finally, the factors with high correlation and a low combined importance score are removed, and the selected features are saved.
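The correlation-based filtering in this step can be sketched as follows. This is an illustration under assumptions: the greedy keep-the-more-important rule, the function names, the feature values and the importance scores are all made up for the example (in the method, the importance scores come from RF and XGBoost), and the 0.75 cutoff is the example value given in claim 1.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; both inputs must be non-constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_features(
    table: Dict[str, List[float]],
    importance: Dict[str, float],
    max_corr: float = 0.75,
) -> List[str]:
    """Visit features from most to least important and keep a feature only
    if its |r| with every already-kept feature is at most max_corr."""
    kept: List[str] = []
    for name in sorted(table, key=lambda k: importance[k], reverse=True):
        if all(abs(pearson(table[name], table[k])) <= max_corr for k in kept):
            kept.append(name)
    return kept


# Made-up feature columns and importance scores, for illustration only.
table = {
    "GDD": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mean_temp": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly collinear with GDD
    "Pgs": [5.0, 1.0, 4.0, 2.0, 3.0],
}
importance = {"GDD": 0.5, "mean_temp": 0.3, "Pgs": 0.2}
print(select_features(table, importance))  # ['GDD', 'Pgs']
```

Here `mean_temp` is dropped because it is almost perfectly correlated with the more important `GDD`, while `Pgs` (r = -0.3 with `GDD`) is kept.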
Table 2. Calculation of the agro-climatic indices for the maize growth period
Figure RE-GDA0002302169590000151
s is the planting date and m is the maturity date.
*Droughts in a warming climate: a global assessment of standardized precipitation index (SPI) and reconnaissance drought index, Asadi Zarch et al. 2015
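Because the formulas of Table 2 are not reproduced in this text, the sketch below only illustrates how two indices of this kind are commonly accumulated from sowing (s) to maturity (m). It assumes that GDD denotes growing degree days and Pgs denotes total growing-season precipitation, which the text does not spell out, and it uses a base temperature of 10 °C as a placeholder. None of this is taken from the patent's own table.

```python
from typing import Sequence


def growing_degree_days(
    tmax: Sequence[float], tmin: Sequence[float], t_base: float = 10.0
) -> float:
    """Sum of daily mean temperature above t_base (assumed value) from s to m.

    tmax and tmin hold one value per day, already sliced to the growth period.
    """
    return sum(max(0.0, (hi + lo) / 2.0 - t_base) for hi, lo in zip(tmax, tmin))


def growing_season_precipitation(rain: Sequence[float]) -> float:
    """Total rainfall (mm) over the days from sowing to maturity."""
    return sum(rain)


# Made-up three-day series, for illustration only.
tmax = [30.0, 20.0, 12.0]
tmin = [20.0, 10.0, 4.0]
rain = [0.0, 5.5, 2.5]
print(growing_degree_days(tmax, tmin))     # 15 + 5 + 0 = 20.0
print(growing_season_precipitation(rain))  # 8.0
```

In the method, the daily series would be cut to the simulated growth period of each scenario before indices like these are computed.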
S3: Construct the mixed evaluation model. The yield of each simulation is inserted into the feature variable table, and the samples are divided into a training set and a test set at a ratio of 8:2. Based on the 80% of samples, the parameters of RF and XGBoost are each optimized by grid search in Python 3.7. The accuracy of the models is then evaluated by 10-fold cross-validation on the remaining 20% of samples, and the model with the lowest MAE (mean absolute error) and RMSE (root mean square error) and the highest R² (coefficient of determination) is selected. The results show that the XGBoost-based hybrid assessment model has the highest accuracy (Table 3), and it is used in the next step to assess the impact of climate change at the grid point scale.
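The split, grid search and cross-validation sequence can be sketched with scikit-learn as below. This is a hedged illustration on synthetic data: only a random forest is shown (XGBoost would be tuned the same way), and the parameter grid, the 5-fold setting inside the grid search, the MAE scoring and the random seeds are assumptions chosen for the example rather than the settings of the patent.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic stand-in for the feature variable table and the simulated yield.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# 8:2 split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Grid search over an illustrative (assumed) hyper-parameter grid.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=5,
)
search.fit(X_train, y_train)

# 10-fold cross-validation on the held-out 20 %: cross_val_score refits a
# clone of the tuned estimator on each fold of that subset.
scores = cross_val_score(
    search.best_estimator_, X_test, y_test,
    cv=10, scoring="neg_mean_absolute_error",
)
print(search.best_params_, -scores.mean())
```

Repeating the same procedure for each candidate model and comparing MAE, RMSE and R² gives the basis for choosing the most accurate one.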
Figure BDA0002262522140000161
Figure BDA0002262522140000162
Figure BDA0002262522140000163
Oi and Si are the observed and simulated values, and Oavg and Savg are the corresponding averages. Ypred is the value predicted by the mixed evaluation model, Ysimu is the value simulated by CERES-Maize, and n is the sample size.
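Since the three formula images are not reproduced in this text, the sketch below uses the standard definitions of the three metrics. It assumes that MAE and RMSE compare Ypred with Ysimu, and that R² is the squared Pearson correlation between the two series (the form suggested by the presence of both Oavg and Savg). These choices are assumptions rather than a transcription of the original formulas.

```python
from math import sqrt
from typing import Sequence


def mae(pred: Sequence[float], simu: Sequence[float]) -> float:
    """Mean absolute error between predicted and simulated values."""
    return sum(abs(p - s) for p, s in zip(pred, simu)) / len(simu)


def rmse(pred: Sequence[float], simu: Sequence[float]) -> float:
    """Root mean square error between predicted and simulated values."""
    return sqrt(sum((p - s) ** 2 for p, s in zip(pred, simu)) / len(simu))


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Squared Pearson correlation (assumed form of R^2)."""
    o_avg = sum(obs) / len(obs)
    s_avg = sum(sim) / len(sim)
    cov = sum((o - o_avg) * (s - s_avg) for o, s in zip(obs, sim))
    var_o = sum((o - o_avg) ** 2 for o in obs)
    var_s = sum((s - s_avg) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


# Made-up values, for illustration only.
print(mae([1, 2, 3], [1, 2, 5]))        # 2/3
print(rmse([1, 2, 3], [1, 2, 5]))       # sqrt(4/3)
print(r_squared([1, 2, 3], [2, 4, 6]))  # 1.0
```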
Table 3 accuracy of corn yield predicted by RF and XGBoost on test set
Figure BDA0002262522140000164
S4: Evaluate the impact of climate change. To evaluate the impact of climate change at the grid point scale, gridded feature variables must be input into the evaluation model. The soil and geographic location data are 0.5° × 0.5° grid data, so only the 10 surface soil attributes and 3 spatial location variables of the maize planting grid points need to be extracted, while the 5 agro-climatic indices and the growth period length (DOY) need to be further calculated.
S4.1, DOY of the lattice point scale is obtained. The meteorological data of each experimental site in S1 is replaced by the meteorological data of a reference time period (1986-.
S4.2: Calculate the agro-climatic indices. According to the yearly grid-scale DOY of the 6 varieties obtained in the previous step, the agro-climatic indices (GDD, TCD, OCA, Pgs and SPI) of each variety during the maize growth period are calculated for 1986-2005, giving the agro-climatic indices of each variety at 0.5° × 0.5° resolution for the 20 years.
S4.3: Calculate the grid-scale yield. For each variety, the 5 agro-climatic indices, 10 surface soil attributes, 1 DOY and 3 spatial location features at the grid point scale for 1986-2005 are input into the mixed evaluation model to obtain the grid-scale yield of each variety for 1986-2005.
S4.4: The meteorological data of the reference period are replaced with the 2020-2060 data under the RCP8.5 scenario, and steps S4.1, S4.2 and S4.3 are repeated to obtain the grid-scale yield of each variety for 2020-2060.
S4.5: The yield in the future period is compared with the average yield in the reference years to obtain the yield change (Yc) for 2020-2060 under RCP8.5. The calculation formula is as follows:
Yc is the predicted change in yield, and Yf and Yb are the yearly production of the future period and the average production of the baseline years, respectively.
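The formula image itself is not reproduced in this text. A relative change consistent with the definitions above and with a threshold expressed in percent would take the following form; this is offered as a reconstruction under that assumption, not as the original formula.

```latex
Y_c = \frac{Y_f - Y_b}{Y_b} \times 100\%
```

As a made-up numerical illustration, a baseline average yield of Y_b = 8.0 t/ha and a future-year yield of Y_f = 7.0 t/ha would give Y_c = (7.0 - 8.0) / 8.0 × 100% = -12.5%, a loss larger than 10%.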
S5: Identify the breeding adaptation time. A yield loss of 10% is first defined as the adaptability threshold; a yield change below this threshold means that maize production will suffer a large yield loss and requires breeding intervention. For each maize planting grid point, the median of the yield change of the 6 varieties is calculated for each year in 2020-. If the probability is less than 0.5, the time window spanning the 10 years before and after the next year's midpoint is selected for the calculation, and the steps are repeated until the midpoint moves to 2050. Finally, the breeding adaptation time of the Huang-Huai-Hai maize planting area under the RCP8.5 scenario is obtained; the result is shown in Fig. 2.
The crop breeding time prediction method based on the coupling of the crop model and the machine learning integrates the advantages of the crop model and the machine learning, realizes the regional application of the site crop model by training the machine learning model through the output of the site crop model, applies the cross threshold analysis to the prediction of the crop breeding time, and provides a new framework for the research of climate change adaptability measures.
The principles and embodiments of the present invention have been described herein using specific examples, which are presented solely to aid in the understanding of the apparatus and its core concepts; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A crop breeding adaptive time prediction method coupling a crop model and machine learning comprises the following steps:
S1: calibrating a crop model and simulating management scenarios, comprising: acquiring soil data (S), meteorological data (W) and agricultural production data (A) of experimental sites in a research area, calibrating a crop model with these data, and simulating various management scenarios with the calibrated crop model to obtain the growth period (DOY) and the yield (Y) of the crop under each simulated scenario;
S2: selecting key feature variables, comprising: for each simulated scenario, extracting the daily meteorological data of the crop from sowing to maturity according to the simulated growth period (DOY), and calculating the agro-climatic indices of the growth period; integrating the feature variables that influence crop growth and development for each simulated scenario to establish a feature variable table; calculating the correlation between the feature variables by Pearson correlation analysis, analyzing and ranking the importance of the feature variables with respect to yield by means of a machine learning model, removing the feature variables whose correlation is greater than a preset value (for example, 0.75) and the feature variables whose contribution to yield is insignificant, and keeping the remaining feature variables as key feature variables;
S3: constructing a hybrid assessment model, comprising: inserting the yield (Y) of each simulation into the corresponding feature variable table, dividing the simulated scenarios into a training set and a test set in a certain proportion, optimizing the hyper-parameters of a machine learning model on the training set with the key feature variables and the yield (Y) by the grid search (GridSearchCV) method in Python, and selecting the mixed evaluation model with the highest precision by 10-fold cross-validation on the test set;
S4: assessing the effect of climate change, comprising: inputting the grid-scale key feature variables of each variety under the climate conditions of the reference period and of the future period into the mixed evaluation model with the highest precision to obtain the gridded yield of the reference period and of the future period, and comparing them to calculate the yield change (Yc) of each variety, wherein the yield change is calculated as follows:
Figure FDA0002262522130000021
wherein Yf and Yb are the annual average production for the future and benchmark periods, respectively;
S5: identifying the breeding adaptation time, comprising: calculating, for each crop planting grid point, the median of the yield change of a plurality of varieties in each year of the future period; setting a time window within the future period, with the start of the future period as the start of the time window, and determining whether the median of the yield change exceeds an adaptability threshold in at least half of the years in the time window; if so, determining that the grid point needs breeding intervention, the breeding adaptation time being the midpoint t of the time window; then setting the next time window with t+1 as its midpoint and performing the same analysis, and repeating this cycle until the end of the time window reaches the end of the future period, thereby obtaining the times and places at which the research area needs breeding adaptation under a specific future climate scenario, wherein the adaptability threshold is defined as the yield change being negative and less than a certain value.
2. The method of claim 1, wherein the crop model is a DSSAT model.
3. A method of crop breeding adaptation time prediction coupling crop model and machine learning as claimed in claim 1 wherein in steps S2 and S3, the machine learning model is selected from RF and XGBoost.
4. The method of claim 1, wherein the variables affecting the growth and development of the crop comprise agro-climatic indices, soil properties, geographic location and variety.
5. The method of predicting crop breeding adaptation time coupling crop model and machine learning of claim 1, wherein step S3 includes using the mean absolute error (MAE), the root mean square error (RMSE) and the coefficient of determination (R²) to evaluate the accuracy, the mixed evaluation model with the lowest MAE and RMSE and the highest R² being the mixed evaluation model with the highest precision.
6. The method of claim 1, wherein the ratio of the training set to the test set is about 7-9:3-1, preferably 8:2.
7. The method of claim 1, wherein the time window is 20 years.
8. The method of claim 1, wherein the fitness threshold is -10%.
9. The method of predicting crop breeding adaptation time coupled with crop model and machine learning of claim 1, further comprising replacing the median with a maximum loss or a minimum loss value for yield variation of a plurality of varieties per year over a future period, and repeating step S5 to identify earliest and latest breeding adaptation times and locations.
10. The method of predicting adaptation time for crop breeding by coupling crop model and machine learning of claim 1, wherein said crop is selected from the group consisting of corn and wheat.
CN201911076188.7A 2019-09-06 2019-11-06 Crop breeding adaptive time prediction method coupling crop model and machine learning Active CN110705182B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019108420333 2019-09-06
CN201910842033 2019-09-06

Publications (2)

Publication Number Publication Date
CN110705182A true CN110705182A (en) 2020-01-17
CN110705182B CN110705182B (en) 2020-07-10

Family

ID=69205290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911076188.7A Active CN110705182B (en) 2019-09-06 2019-11-06 Crop breeding adaptive time prediction method coupling crop model and machine learning

Country Status (1)

Country Link
CN (1) CN110705182B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105468854A (en) * 2015-11-27 2016-04-06 河北省科学院地理科学研究所 Key factor yield contribution calculation method based on crop growth mechanism
KR101811640B1 (en) * 2016-08-03 2017-12-26 한국과학기술연구원 Prediction apparatus and method for production of crop using machine learning
CN109711102A (en) * 2019-01-27 2019-05-03 北京师范大学 A kind of crop casualty loss fast evaluation method
CN109754125A (en) * 2019-01-18 2019-05-14 中国农业大学 Crop yield forecast method based on crop modeling, history and meteorological forecast data
CN109829234A (en) * 2019-01-30 2019-05-31 北京师范大学 A kind of across scale Dynamic High-accuracy crop condition monitoring and yield estimation method based on high-definition remote sensing data and crop modeling

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
C. FOLBERTH 等: "Spatio-temporal downscaling of gridded crop model yield estimates based on machine learning", 《AGRICULTURAL AND FOREST METEOROLOGY》 *
FULU TAO 等: "Modelling the impacts of weather and climate variability on crop productivity over a large area: A new super-ensemble-based probabilistic projection", 《AGRICULTURAL AND FOREST METEOROLOGY》 *
秦鹏程 等: "利用作物模型研究气候变化对农业影响的发展过程", 《中国农业气象》 *
胡亚南: "气候变化对中国玉米生产的影响及适应性研究", 《中国优秀硕士学位论文全文数据库 农业科技辑》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308289A (en) * 2020-09-29 2021-02-02 北京农业信息技术研究中心 Rice yield prediction method and device
CN112992271A (en) * 2020-11-06 2021-06-18 厦门大学 Method for rapidly predicting crop latitude adaptability indoors
CN112992271B (en) * 2020-11-06 2024-03-05 厦门大学 Method for rapidly predicting latitude adaptability of crops indoors
CN113011683A (en) * 2021-04-26 2021-06-22 中国科学院地理科学与资源研究所 Crop yield estimation method and system based on corrected crop model
CN114648214A (en) * 2022-03-14 2022-06-21 江西省农业科学院园艺研究所 Proportion allocation method and system for physiological and biochemical indexes of facility crops
CN114648214B (en) * 2022-03-14 2023-09-05 江西省农业科学院园艺研究所 Proportion allocation method and system for physiological and biochemical indexes of facility crops

Also Published As

Publication number Publication date
CN110705182B (en) 2020-07-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant