CN115952421A - High-precision time-space simulation method for coupling ecological process model and machine learning algorithm - Google Patents


Info

Publication number
CN115952421A
Authority
CN
China
Prior art keywords
model
machine learning
process model
data
ecosystem
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310049924.XA
Other languages
Chinese (zh)
Inventor
肖浏骏
罗忠奎
张帅
史舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a high-precision spatio-temporal simulation method coupling an ecological process model with machine learning algorithms, and belongs to the research fields of ecology, agronomy and global change. The method comprises: calibrating the relevant parameters of an ecosystem process model using long-term fixed-site experiment data from the study area; randomly selecting a number of grid points, randomly generating a combination of management measures and climate change scenario for each grid point, and using the calibrated process model to simulate the changes in ecosystem state variables for each grid point and scenario; training the machine learning models and evaluating their performance, and obtaining an optimal ensemble model by a weighted evaluation method with model-specific weights; and using the optimal machine learning ensemble model as a surrogate for the process model to simulate changes in crop yield, soil organic carbon and other variables at high-resolution spatio-temporal scales. The invention makes it possible to predict, more quickly and efficiently, the spatio-temporal variation of an ecosystem state variable at high spatio-temporal resolution and its response to different management measures and climate change.

Figure 202310049924

Description

A High-Precision Spatio-Temporal Simulation Method Coupling an Ecological Process Model with Machine Learning Algorithms

Technical Field

The invention belongs to the research fields of ecology, agronomy and global change, and specifically relates to a high-precision spatio-temporal simulation method coupling an ecological process model with machine learning algorithms.

Background

Process-based ecosystem models comprehensively account for the interactions between ecosystem processes, the environment (e.g. climate and soil) and human activities (e.g. land use and agricultural management). They are therefore a primary tool for addressing global sustainable-development and eco-environmental challenges such as climate change, food security, biodiversity conservation and natural resource management. However, applying process models at large scales suffers from high computational cost, large model uncertainty and poor data availability. It is difficult to complete fast simulations at high spatio-temporal resolution within an acceptable time, which limits the generality and operability of such models at large scales.

Process models simplify ecosystem processes with mathematical equations and require a large number of input parameters to close the computation. These parameters are estimated from experimental data at field observation stations or field trial sites, but at large scales, where environmental conditions are more complex and diverse, some ecological process parameters cannot be estimated well and model uncertainty is large. Machine learning can achieve higher accuracy and simulation efficiency than process models at large scales, but it may produce results that contradict ecosystem processes, and it requires sufficient high-quality training data. Against this background, the approach of coupling process models with machine learning ensemble simulation emerged. The process model is first calibrated at the site scale; the calibrated model can then generate a large amount of simulated data under sufficiently many environment-management scenarios, providing the machine learning models with abundant high-quality data. Based on these simulated data, machine learning models can automatically learn the intrinsic relationships between complex ecosystem processes and environment-management conditions and can generalize quickly. Combining the prior knowledge implicit in the process model with the learning capacity of machine learning makes it possible to achieve fast simulation at high-resolution spatio-temporal scales flexibly, using only a small amount of site observation data. However, because their algorithms differ, machine learning methods differ in their ability to capture ecological processes and to generalize. To reduce the uncertainty of the machine learning ensemble as far as possible, the invention uses adaptive screening to eliminate models with poor learning ability, constructs weights from the root mean square error of the different models, and takes a weighted average of the machine learning models with strong learning ability, thereby improving simulation accuracy and reducing the uncertainty of the ensemble surrogate model.

The invention combines a process-based ecosystem model with multiple machine/deep learning models and builds an optimal ensemble surrogate model through adaptive screening and weighted averaging. This achieves fast and accurate simulation at high spatio-temporal resolution and provides a technical method for ecosystem simulation and management decision-making at fine scales.

Summary of the Invention

The purpose of the invention is to overcome the deficiencies of the prior art and to provide a high-precision spatio-temporal simulation method coupling an ecological process model with machine learning algorithms. The invention couples a process-based ecosystem model with multiple machine/deep learning models and builds an optimal ensemble surrogate model through adaptive screening and weighted averaging, so that ecosystem state variables can be simulated accurately and quickly at high spatio-temporal resolution.

The specific technical solution adopted by the invention is as follows:

The invention provides a high-precision spatio-temporal simulation method coupling an ecological process model with machine learning algorithms, comprising the following steps:

S1: Calibrate and validate the relevant parameters of the ecosystem process model using long-term fixed-site experiment data from the target study area;

S2: Within the study area, randomly select a number of grid points that cover the spatial variation of climate types and soil properties in the study area, and randomly generate one combination of management measures and climate change scenario for each grid point; based on the soil data and historical climate data of each grid point under its scenario, use the ecosystem process model calibrated and validated in step S1 to simulate the changes in ecosystem state variables for each grid point and scenario;

S3: Combine the input and output data of the process-model scenario simulations from step S2 and use them to train different types of machine learning models, so that the machine learning models fully learn and emulate the state-variable changes simulated by the ecosystem process model; evaluate the performance of the machine learning models, adaptively eliminate the poorly performing ones, and obtain the optimal ensemble model by a weighted evaluation method with model-specific weights;

S4: Use the optimal ensemble model as a surrogate for the ecosystem process model and, driven by spatial data, simulate and generate the high-precision digital mapping products required for the research purpose at high-resolution spatio-temporal scales.

Preferably, step S1 is specifically as follows:

For the specific research purpose, determine the study area to be simulated and a suitable ecosystem process model; according to the requirements of the ecosystem process model, obtain in-situ site observation data for the ecosystems in the study area, and use a differential evolution algorithm to calibrate and validate the key parameters of the ecosystem process model.

Further, the in-situ site observation data include one or more of meteorological data, soil property data, farmland ecosystem data and ecosystem state variable data.

Still further, the meteorological data include daily temperature, precipitation and solar radiation; the soil property data include soil organic carbon, pH and soil bulk density; the farmland ecosystem data include management data on sowing date, fertilization and irrigation as well as yield data; and the ecosystem state variable data include observations of soil organic carbon and greenhouse gas emissions.

Further, the key parameters include crop cultivar parameters and the soil organic carbon and N2O emission process parameters of the farmland ecosystem.

Preferably, step S2 is specifically as follows:

Within the study area, generate a spatial grid at the chosen spatial resolution and randomly select from all grid cells a number of grid points that cover the spatial variation of climate types and soil properties in the study area; randomly generate one combination of management measures and climate change scenario for each grid point; collect and prepare the soil data and historical climate data corresponding to the selected grid points and use them to drive the ecosystem process model calibrated and validated in step S1, simulating the state variables of interest under historical and future scenarios.

Further, the management scenarios refer to the possible forms of human management; in farmland ecosystems they include cropping system, sowing date, fertilization, irrigation and straw return scenarios. The climate change scenarios refer to future climate change scenarios, selected from global climate models and Shared Socioeconomic Pathway (SSP) emission scenarios.
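The random selection of grid points in step S2 could be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the retry loop, the function name `sample_grid_cells` and the `climate`/`soil` class labels are hypothetical, since the text only requires that the selected points cover the climate types and soil variation of the study area.

```python
import random


def sample_grid_cells(cells, n, seed=0, max_tries=100):
    """Randomly pick n cells, redrawing until every (climate, soil) class
    that occurs in `cells` is represented in the sample."""
    rng = random.Random(seed)
    required = {(c["climate"], c["soil"]) for c in cells}
    for _ in range(max_tries):
        picked = rng.sample(cells, n)  # sampling without replacement
        if {(c["climate"], c["soil"]) for c in picked} == required:
            return picked
    raise RuntimeError("coverage not reached; increase n or max_tries")


# Hypothetical toy grid: 1000 cells, 3 climate classes x 2 soil classes.
demo_cells = [
    {"id": i, "climate": f"climate_{i % 3}", "soil": f"soil_{(i // 3) % 2}"}
    for i in range(1000)
]
selected = sample_grid_cells(demo_cells, n=60)
```

Redrawing the whole sample keeps each accepted draw a plain random sample; a stratified draw per class would be an alternative design if some classes are rare.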

Preferably, in step S3, the machine learning models include a LASSO model, a support vector machine, a random forest, XGBoost, a convolutional neural network and a long short-term memory network.

Preferably, in step S3, the root mean square error (RMSE) and the coefficient of determination (R²) are calculated to evaluate how well each machine learning model predicts the process-model output. Under the screening criterion, models whose R² is below the average across all models are discarded, leaving the models that perform above average. Each retained model is then assigned a weight of 1/RMSE according to its performance, and a weighted-average ensemble is constructed to obtain the optimal ensemble model.
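Written out, the screening and weighting rule reads as follows. The symbols are introduced here for illustration only: $M$ candidate models, $R^2_i$ and $\mathrm{RMSE}_i$ the validation scores of model $i$, $\hat{y}_i(x)$ its prediction for input $x$, and $K$ the set of retained models. Dividing by the sum of the weights is an assumption implied by the term "weighted average"; the text itself states only the weight 1/RMSE.

```latex
\overline{R^2} = \frac{1}{M}\sum_{i=1}^{M} R^2_i, \qquad
K = \bigl\{\, i : R^2_i \ge \overline{R^2} \,\bigr\}, \qquad
w_i = \frac{1}{\mathrm{RMSE}_i}, \qquad
\hat{y}(x) = \frac{\sum_{i \in K} w_i \, \hat{y}_i(x)}{\sum_{i \in K} w_i}
```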

Preferably, step S4 is specifically as follows:

Collect and prepare high-precision spatial data, use them to drive the constructed surrogate model, achieve fast simulation through parallel computing, and generate the high-precision digital mapping products required for the research purpose.

Compared with the prior art, the invention has the following beneficial effects:

1) Site-scale ecosystem observations are used to calibrate the ecosystem process model, which is then used to run a large number of management-environment scenario simulations. Multiple machine learning models emulate the scenario simulation results of the process model and fully learn its internal relationships; adaptive screening and weighted averaging yield the optimal ensemble model, which is finally used for fast simulation of ecosystem state variables at higher-resolution spatio-temporal scales.

2) The hybrid modeling of a process model and machine learning preserves the mechanistic basis of the model while improving accuracy and simulation efficiency at fine scales. Weight-based model ensembling further reduces systematic error and improves the accuracy and reliability of the data, which helps in understanding and predicting the effects of future climate change and different management measures on ecosystems.

3) The invention is applicable to simulating ecosystem state attributes for any region, any period and any ecosystem. For example, indicators such as ecosystem productivity, crop yield, leaf area index, soil organic carbon and N2O emissions can be simulated at global, national or regional scales under historical or future scenarios, so that ecosystem state attributes and their responses to different human management activities and climate change can be predicted more efficiently and accurately at high spatio-temporal resolution. Wide application of the technique will provide scientific and technical support for ecosystem simulation at high-precision spatio-temporal scales.

Description of the Drawings

Fig. 1 is a schematic flow chart of the method of the invention.

Detailed Description

The invention is further elaborated and illustrated below with reference to the drawing and specific embodiments. The technical features of the embodiments of the invention can be combined with one another provided they do not conflict.

The invention provides a high-precision spatio-temporal simulation method coupling an ecological process model with machine learning algorithms. The steps are: 1) calibrate the relevant parameters of the ecosystem process model using long-term fixed-site experiment data from the study area; 2) within the study area, randomly select a number of grid points covering the spatial variation of climate types and soil properties, randomly generate one combination of management measures and climate change scenario for each grid point, and use the calibrated process model to simulate the changes in ecosystem state variables (e.g. crop yield and soil organic carbon in farmland ecosystems) for each grid point and scenario; 3) combine the input and output data of the process-model scenario simulations and use them to train different types of machine learning models, so that the machine learning models fully learn and emulate the state-variable changes simulated by the process model; evaluate the performance of the machine learning models, adaptively eliminate the poorly performing ones, and obtain the optimal ensemble model by a weighted evaluation method with model-specific weights; 4) use the optimal machine learning ensemble model as a surrogate for the process model to simulate changes in crop yield, soil organic carbon and other variables at high-resolution spatio-temporal scales.

Each step is described in detail below.

Step 1: Calibration and validation of the ecosystem process model

First, for the specific research purpose, determine the study area to be simulated and a suitable ecosystem process model. According to the requirements of the model, obtain in-situ site observation data for the ecosystems in the study area. These generally include meteorological data such as daily temperature, precipitation and solar radiation, and soil property data such as soil organic carbon, pH and soil bulk density; for farmland ecosystems they may also include management data such as sowing date, fertilization and irrigation, yield data, and data on other ecosystem state variables of interest (e.g. observations of soil organic carbon and greenhouse gas emissions). A differential evolution algorithm is used to calibrate and validate the key parameters of the model (e.g. crop cultivar parameters and the soil organic carbon and N2O emission process parameters of farmland ecosystems).
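The calibration idea in Step 1 can be illustrated with a small, self-contained differential evolution loop (the common DE/rand/1/bin variant). Everything specific here is an assumption made for illustration: the two-parameter `toy_process_model` is a hypothetical stand-in for a real process model, the observations are synthetic, and the settings `F`, `CR`, population size and bounds are arbitrary. In practice a library routine such as SciPy's `scipy.optimize.differential_evolution` could play the same role.

```python
import math
import random


def differential_evolution(loss, bounds, pop_size=20, generations=150,
                           F=0.7, CR=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer; returns (best_params, best_loss)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [loss(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one mutated dimension
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clip to the parameter bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            s = loss(trial)
            if s <= scores[i]:  # greedy selection, updated in place
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_process_model(params, n_rate):
    """Hypothetical yield response to nitrogen rate (not a real process model)."""
    y_max, k = params
    return y_max * (1.0 - math.exp(-k * n_rate))


# Synthetic "site observations" generated from known parameters (8.0, 0.01).
N_RATES = [0, 50, 100, 200, 300, 400]
OBSERVED = [toy_process_model((8.0, 0.01), n) for n in N_RATES]


def rmse_loss(params):
    """RMSE between simulated and observed yields over all N rates."""
    sq = [(toy_process_model(params, n) - y) ** 2
          for n, y in zip(N_RATES, OBSERVED)]
    return math.sqrt(sum(sq) / len(sq))


best_params, best_loss = differential_evolution(
    rmse_loss, bounds=[(1.0, 15.0), (0.001, 0.05)])
```

With a real process model, `rmse_loss` would run the model for each calibration site and compare against the site observations; validation would repeat the comparison on sites or years held out from calibration.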

Step 2: Scenario generation and simulation with the ecosystem process model

For the machine learning models to fully learn and emulate the internal relationships of the process model, the site-calibrated process model must be used to simulate a sufficiently large amount of data. This step has three sub-steps: scenario generation, input data preparation, and process-model simulation and output.

The first sub-step is scenario generation. A spatial grid is generated at the chosen spatial resolution, a number of grid points are randomly selected from all grid cells (covering, as far as possible, all climate types and the spatial variation of soil properties in the study area), and management-climate change scenarios are randomly generated at these grid points. Management scenarios refer to the possible forms of human management; in farmland ecosystems they include cropping system scenarios, sowing date scenarios, fertilization scenarios, irrigation scenarios (e.g. irrigation method, irrigation depth and irrigation timing) and straw return scenarios. Climate scenarios refer to future climate change scenarios and can be selected from global climate models (GCMs) and Shared Socioeconomic Pathway (SSP) emission scenarios. Next, the soil data and historical climate data corresponding to the selected grid points are collected and prepared and used to drive the ecosystem process model calibrated and validated in Step 1, simulating the state variables of interest under historical and future scenarios, such as crop yield and soil organic carbon content in farmland ecosystems.

Step 3: Machine/deep learning surrogate model

This step screens and ensembles the machine/deep learning models that can act as surrogates for the process model. The inputs and outputs of the grid-management-climate scenario combinations from Step 2 are combined and used to train the models (machine learning models such as LASSO, support vector machine, random forest and XGBoost, and deep learning models such as a convolutional neural network or a long short-term memory network), so that they learn and emulate the internal relationships of the process model. Before training, the data are split: 80% of the data (i.e. the process-model outputs from Step 2) are used to train the models and 20% to validate them. The root mean square error (RMSE) and the coefficient of determination (R²) are calculated to evaluate how well each machine learning model predicts the process-model output. Under the screening criterion, models whose R² is below the average of the six models are discarded, leaving the models that perform above average. Each retained model is assigned a weight of 1/RMSE according to its performance, and a weighted-average ensemble is constructed; this ensemble is the surrogate for the process model.
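The screening and weighting in Step 3 could be sketched as follows, independent of which learners produce the predictions. The function names and the toy validation data are hypothetical; normalizing the 1/RMSE weights so they sum to one is an assumption implied by "weighted average", and the sketch assumes every retained model has RMSE > 0.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def build_ensemble(y_val, val_preds):
    """Drop models with below-average R2, then weight the rest by 1/RMSE.

    val_preds maps model name -> predictions on the validation set.
    Returns model name -> normalized weight for the retained models.
    """
    scores = {m: (r2(y_val, p), rmse(y_val, p)) for m, p in val_preds.items()}
    mean_r2 = sum(s[0] for s in scores.values()) / len(scores)
    raw = {m: 1.0 / s[1] for m, s in scores.items() if s[0] >= mean_r2}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}


def ensemble_predict(weights, preds):
    """Weighted average of the retained models' predictions."""
    n = len(next(iter(preds.values())))
    return [sum(w * preds[m][i] for m, w in weights.items()) for i in range(n)]


# Hypothetical validation targets and predictions from three toy models.
y_val = [1.0, 2.0, 3.0, 4.0]
val_preds = {
    "good": [1.1, 1.9, 3.1, 3.9],
    "ok": [1.2, 1.8, 3.2, 3.8],
    "bad": [4.0, 3.0, 2.0, 1.0],
}
weights = build_ensemble(y_val, val_preds)
blended = ensemble_predict(weights, val_preds)
```

In this toy case the "bad" model pulls the mean R² down and is discarded; the two retained models receive weights in the ratio of their inverse RMSE values.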

Step 4: Fast, high-precision spatio-temporal simulation with the surrogate model

Collect and prepare high-precision spatial data, use them to drive the constructed surrogate model, achieve fast simulation through parallel computing, and generate the high-precision digital mapping products required for the research purpose.
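One way to organize the parallel computation in Step 4 is to split the gridded inputs into chunks and map the surrogate over them. The sketch below uses Python's standard `concurrent.futures`; `toy_surrogate`, the input field names and the chunk size are hypothetical. A thread pool is used so the example stays self-contained; for a CPU-bound pure-Python surrogate, passing `ProcessPoolExecutor` as `executor_cls` would be the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def predict_in_parallel(predict_chunk, rows, chunk_size=1000, max_workers=4,
                        executor_cls=ThreadPoolExecutor):
    """Apply `predict_chunk` to chunks of `rows` in parallel, keeping row order."""
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map returns results in the order the chunks were submitted.
        results = pool.map(predict_chunk, chunked(rows, chunk_size))
        return [y for chunk in results for y in chunk]


def toy_surrogate(chunk):
    """Hypothetical stand-in for the ensemble surrogate model."""
    return [0.01 * r["n_rate"] + 0.5 * r["soc"] for r in chunk]


# Hypothetical gridded inputs: one record per grid cell.
demo_rows = [{"n_rate": float(i % 400), "soc": 1.0 + (i % 7)} for i in range(2500)]
demo_pred = predict_in_parallel(toy_surrogate, demo_rows)
```

Because each grid cell is predicted independently, the chunks need no communication, which is what makes the surrogate step easy to parallelize.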

Example 1

This example simulates the changes in crop yield and soil organic carbon of the winter wheat/summer maize rotation system on the North China Plain at 1 km spatial resolution under different management regimes. As shown in Fig. 1, it comprises the following steps:

1) Taking the winter wheat-summer maize rotation system of the North China Plain as the study object, obtain site-scale observations of soil, climate, management measures, yield, soil organic carbon and N2O emissions at sites including Xinji, Luancheng, Hengshui, Laiyang, Huantai and Xuzhou, and calibrate the parameters of an agro-ecosystem model (the APSIM model is used as an example) with the differential evolution algorithm.

2) Generate 450,000 cropland grid cells for the North China Plain at 1 km spatial resolution and randomly select 6,000 of them; for each selected cell, randomly generate a management scenario and a climate scenario. The management scenarios include: sowing date scenarios (at 1-day intervals from 30 days before to 30 days after the baseline sowing date); nitrogen fertilizer scenarios (N application rate in steps of 1 kg ha⁻¹ yr⁻¹ from 0 to 400 kg ha⁻¹ yr⁻¹; because fertilization practice varies little across the region, the topdressing timing and the basal-to-topdressing ratio follow conventional practice); organic fertilizer scenarios (represented by farmyard manure, in steps of 10 kg ha⁻¹ from 0 to 4,000 kg ha⁻¹); straw return scenarios (the proportion of straw returned in steps of 10% from 0% to 100%, 11 scenarios in total); and irrigation scenarios. For irrigation, the irrigation depth is 50 cm; wheat-season irrigation is set to 0 events, 1 event (jointing), 2 events (jointing, grain filling), 3 events (regreening, jointing, grain filling) or 4 events (regreening, jointing, booting and grain filling); maize-season irrigation is set to 0 events, 1 event (jointing), 2 events (jointing, tasseling), 3 events (jointing, tasseling, grain filling), 4 events (jointing, large bell-mouth stage, tasseling and grain filling) or automatic irrigation (irrigation starts when soil water falls below 30%, 40%, 50%, 60%, 70% or 80% of field capacity). Climate scenarios are randomly selected from the more than 30 global climate models (GCMs) and the Shared Socioeconomic Pathway (SSP) emission scenarios in CMIP6, giving a total of 6,000 grid-management-climate scenario combinations.
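The scenario ranges listed in item 2) could be turned into a random generator along the following lines. This is a sketch under stated assumptions: uniform sampling of each factor is assumed (the text says only that scenarios are generated randomly), the dictionary keys are hypothetical, the GCM names are placeholders rather than real model names, and the four SSP labels are an assumed subset.

```python
import random

GCMS = [f"GCM_{i:02d}" for i in range(1, 31)]       # placeholder names, not real models
SSPS = ["ssp126", "ssp245", "ssp370", "ssp585"]     # assumed subset of SSP scenarios
AUTO_THRESHOLDS = [30, 40, 50, 60, 70, 80]          # % of field capacity


def random_scenario(rng):
    """Draw one management-climate scenario from the ranges given in item 2)."""
    maize = rng.choice([0, 1, 2, 3, 4, "auto"])     # fixed event count or automatic
    return {
        "sowing_offset_days": rng.randint(-30, 30),  # relative to baseline sowing date
        "n_rate_kg_ha_yr": rng.randint(0, 400),      # steps of 1 kg ha-1 yr-1
        "manure_kg_ha": 10 * rng.randint(0, 400),    # 0..4000 in steps of 10
        "straw_return_pct": 10 * rng.randint(0, 10), # 11 levels, 0..100 %
        "wheat_irrigation_events": rng.randint(0, 4),
        "maize_irrigation": maize,
        "maize_auto_threshold_pct":
            rng.choice(AUTO_THRESHOLDS) if maize == "auto" else None,
        "gcm": rng.choice(GCMS),
        "ssp": rng.choice(SSPS),
    }


def generate_scenarios(cell_ids, seed=0):
    """Attach one random scenario to each selected grid cell."""
    rng = random.Random(seed)
    return [{"cell_id": cid, **random_scenario(rng)} for cid in cell_ids]


scenarios = generate_scenarios(range(6000))  # one scenario per selected cell
```

Each record would then be combined with the soil and climate data of its grid cell to form one process-model run.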

3) Use the calibrated agro-ecosystem model to simulate the 6,000 grid-management-climate scenario combinations, producing wheat and maize yields and soil organic carbon.

4) Combine the simulation results of the 6,000 grid-management-climate scenario combinations, using 80% as the training set and 20% as the validation set. Train six different types of machine learning models on the training set and obtain the best-trained model of each type through parameter optimization and hyperparameter tuning. Apply the trained models to the validation set to evaluate their performance.
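The 80/20 split in item 4) might look like the following. A shuffled random split is an assumption (the text gives only the proportions), and `train_val_split` and the placeholder records are hypothetical.

```python
import random


def train_val_split(records, val_fraction=0.2, seed=0):
    """Shuffle indices and split records into (train, validation) lists."""
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)
    n_val = int(round(len(records) * val_fraction))
    val = [records[i] for i in idx[:n_val]]
    train = [records[i] for i in idx[n_val:]]
    return train, val


# Placeholder records standing in for the 6,000 simulated combinations.
records = list(range(6000))
train_set, val_set = train_val_split(records)
```

Fixing the seed keeps the split reproducible, so all six model types are trained and validated on the same partitions.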

5) Based on the process-model and machine-learning simulated values on the validation set, calculate the statistics (R² and RMSE) that measure how well the machine learning models emulate the process model. Models with below-average R² are eliminated, weights are then calculated from the RMSE, and the weighted average gives the optimal ensemble model. In this example, the R² values of the LASSO and support vector machine models for wheat yield, maize yield and soil organic carbon change were 0.76, 0.59-0.62 and 0.68, respectively, which are below average, so these models were discarded. For random forest, XGBoost, the convolutional neural network and the long short-term memory network, R² was above 0.93 for all four indicators, so these models were retained. The RMSE of each of the four models on the validation set is calculated and 1/RMSE is used as its weight to build the optimal ensemble model for the subsequent fine-scale simulation. For example, the RMSE values for wheat yield simulated by random forest, XGBoost, the convolutional neural network and the long short-term memory network were 0.43, 0.29, 0.35 and 0.35 Mg ha⁻¹, respectively, giving weights of 2.33, 3.45, 2.86 and 2.86; the final result is the weighted average under these weights.
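The wheat-yield weights quoted in item 5) can be checked with a few lines of arithmetic. The RMSE values are those given in the text; the normalization step and the function `ensemble_yield` are added here as an assumption about how the weighted average is formed.

```python
# Validation RMSE for wheat yield (Mg ha-1), as reported in item 5).
rmse_wheat = {"random_forest": 0.43, "xgboost": 0.29, "cnn": 0.35, "lstm": 0.35}

# Raw weights 1/RMSE; rounded to two decimals these are 2.33, 3.45, 2.86, 2.86.
raw_weights = {m: 1.0 / v for m, v in rmse_wheat.items()}

# Normalized weights (assumed step): roughly 0.20, 0.30, 0.25, 0.25.
total = sum(raw_weights.values())
normalized = {m: w / total for m, w in raw_weights.items()}


def ensemble_yield(predictions):
    """Weighted-average wheat yield from the four retained models' predictions."""
    return sum(normalized[m] * predictions[m] for m in normalized)
```

XGBoost, with the lowest RMSE, contributes about 30% of the ensemble prediction, while the random forest contributes about 20%.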

6) Prepare 1 km gridded input data for the North China Plain for the historical period (1995-2014) and for the future (2030s and 2060s), in the same format as the data used by the machine learning models. Drive the optimal machine learning integrated model with the prepared data under different management levels to simulate the spatial distribution of wheat and maize yield, soil organic carbon and N2O emissions in the North China Plain under historical and future scenarios. In this example, under the historically optimal nitrogen fertilizer, irrigation and straw-return management, the wheat yield in the North China Plain in the historical period is [Figure BDA0004057425690000071], the maize yield is [Figure BDA0004057425690000072], and SOC is [Figure BDA0004057425690000073]. In the future, wheat yield remains essentially unchanged, while maize yield and the change in soil organic carbon show a downward trend.
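
The screening and weighting described in step 5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the inventors' implementation: the function and variable names are hypothetical, the R² of 0.93 for the four retained models is a placeholder for "above 0.93" (exact values are not given in the text), and both discarded models are assumed to have the wheat-yield R² of 0.76. The RMSE values are the wheat-yield values quoted above.

```python
from statistics import mean

# Wheat-yield R2 per model. 0.93 is a placeholder for "above 0.93" (assumption).
R2_WHEAT = {
    "lasso": 0.76, "svm": 0.76,
    "random_forest": 0.93, "xgboost": 0.93, "cnn": 0.93, "lstm": 0.93,
}
# Validation-set RMSE (Mg ha-1) for the four retained models, from the text.
RMSE_WHEAT = {"random_forest": 0.43, "xgboost": 0.29, "cnn": 0.35, "lstm": 0.35}


def build_ensemble(r2_by_model, rmse_by_model):
    """Drop models with below-average R2, then weight the rest by 1/RMSE."""
    mean_r2 = mean(r2_by_model.values())
    kept = [name for name, r2 in r2_by_model.items() if r2 >= mean_r2]
    raw = {name: 1.0 / rmse_by_model[name] for name in kept}
    total = sum(raw.values())
    normalized = {name: w / total for name, w in raw.items()}
    return raw, normalized


def ensemble_predict(normalized_weights, predictions):
    """Weighted average of per-model predictions for one sample."""
    return sum(w * predictions[name] for name, w in normalized_weights.items())


raw_weights, norm_weights = build_ensemble(R2_WHEAT, RMSE_WHEAT)
```

Rounded to two decimals, the raw weights reproduce the 2.33, 3.45, 2.86 and 2.86 quoted in the text; dividing by their sum makes the weighted average directly usable in `ensemble_predict`.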

It can be seen that the method of the present invention can predict, more quickly and efficiently, the spatiotemporal variation of an ecosystem state variable at high spatiotemporal resolution (such as crop yield or soil organic carbon in farmland ecosystems) and its response to different management measures and to climate change.

The above-mentioned embodiment is only a preferred solution of the present invention and is not intended to limit the present invention. Those of ordinary skill in the relevant technical fields can make various changes and modifications without departing from the spirit and scope of the present invention. Therefore, all technical solutions obtained by means of equivalent replacement or equivalent transformation fall within the protection scope of the present invention.

Claims (10)

1. A high-precision space-time simulation method for coupling an ecological process model and a machine learning algorithm is characterized by comprising the following steps:
S1: calibrating and validating the relevant parameters of the ecosystem process model using long-term field experiment data from the target research area;
S2: randomly selecting, within the research area, a plurality of grid points that cover the spatial variation of climate types and soil characteristics of the research area, and randomly generating a combination of management measure and climate change scenarios for each grid point; based on the soil data and the historical climate data under the scenario corresponding to each grid point, simulating the changes in ecosystem state variables at each grid point and under each scenario using the ecosystem process model validated in step S1;
S3: combining the input data and output data of the ecosystem process model scenario simulations of step S2, and training different types of machine learning models so that they learn to emulate the state variable changes simulated by the ecosystem process model; evaluating the performance of the machine learning models, adaptively eliminating the poorly performing ones, and obtaining an optimal integrated model by weighted averaging with model-specific weights;
S4: taking the optimal integrated model as a proxy model of the ecosystem process model and performing proxy simulation driven by spatial data, so as to generate the high-precision digital mapping product required by the research purpose at high spatiotemporal resolution.
2. The method for high-precision spatiotemporal simulation coupled with an ecological process model and a machine learning algorithm according to claim 1, wherein the step S1 is specifically as follows:
for a specific research purpose, determining the research area to be simulated and a suitable ecosystem process model; acquiring ecosystem in-situ site observation data in the research area according to the requirements of the ecosystem process model, and calibrating and validating the key parameters of the ecosystem process model using a differential evolution algorithm.
3. The method of high-precision spatiotemporal simulation coupling an ecological process model and a machine learning algorithm of claim 2, wherein the ecosystem in-situ site observation data comprises one or more of meteorological data, soil property data, farmland ecosystem data, ecosystem state variable data.
4. The method for high-precision spatiotemporal simulation of a coupled ecological process model and machine learning algorithm of claim 3, wherein the meteorological data includes daily temperature, precipitation and solar radiation, the soil property data includes soil organic carbon, pH and soil bulk density, the farmland ecosystem data includes management data on sowing date, fertilization and irrigation together with yield data, and the ecosystem state variable data includes soil organic carbon and greenhouse gas emission observations.
5. The method of high-precision spatiotemporal simulation of a coupled ecological process model and machine learning algorithm of claim 2, in which the key parameters include crop variety parameters and the soil organic carbon and N2O emission process parameters in a farmland ecosystem.
6. The method for high-precision spatio-temporal simulation of a coupled ecological process model and a machine learning algorithm according to claim 1, wherein the step S2 is specifically as follows:
in the research area, generating spatial grids at a set spatial resolution, randomly selecting from all the grids a plurality of grid points that cover the spatial variation of climate types and soil characteristics of the research area, and randomly generating a combination of management measure and climate change scenarios at each grid point; collecting and preparing the soil data and historical climate data corresponding to the selected grid points, and driving the ecosystem process model validated in step S1 to simulate the state variables of interest under historical and future scenarios.
7. The method for high-precision spatiotemporal simulation of a coupled ecological process model and machine learning algorithm of claim 6, wherein the management measure scenarios refer to the various possible forms of human management; in a farmland ecosystem, the management measure scenarios include a cropping system scenario, a sowing date scenario, a fertilization scenario, an irrigation scenario, and a straw returning scenario; the climate change scenarios refer to future climate change scenarios, selected from global climate models and Shared Socioeconomic Pathway (SSP) emission scenarios.
8. The high-precision space-time simulation method coupling an ecological process model and a machine learning algorithm according to claim 1, wherein in the step S3, the machine learning models comprise a LASSO model, a support vector machine, a random forest, XGBoost, a convolutional neural network, and a long short-term memory network.
9. The method for high-precision spatio-temporal simulation of coupled ecological process model and machine learning algorithm as claimed in claim 1, wherein in step S3, the root mean square error (RMSE) and the coefficient of determination (R²) are calculated to evaluate how well each machine learning model predicts the process model output; a screening criterion is set under which models whose R² is below the average across the models are discarded and the models performing at or above the average are retained; a weight equal to 1/RMSE is then calculated for each retained model according to its performance, and a weighted-average integrated model is constructed to obtain the optimal integrated model.
10. The method for high-precision spatio-temporal simulation of a coupled ecological process model and a machine learning algorithm according to claim 1, wherein the step S4 is specifically as follows:
collecting and preparing high-precision spatial data, driving the constructed proxy model, and achieving rapid simulation through parallel computing to generate the high-precision digital mapping product required by the research purpose.
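
The parallel proxy simulation of claim 10 could look roughly like the following Python sketch. Everything in it is a hypothetical illustration rather than the patented implementation: `proxy_model` is a toy linear stand-in for the trained integrated model, the input field names and values are invented, and a thread pool from the standard library is used only to keep the example self-contained (a process pool or a cluster scheduler would be the more likely choice for CPU-bound models).

```python
from concurrent.futures import ThreadPoolExecutor


def proxy_model(cell):
    """Hypothetical stand-in for the trained proxy model (toy linear response)."""
    return 2.0 + 0.01 * cell["n_fertilizer_kg_ha"] + 0.002 * cell["irrigation_mm"]


def simulate_grid(cells, max_workers=4):
    """Apply the proxy model to every grid cell concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(proxy_model, cells))


# Invented inputs for four grid cells; real inputs would be the 1 km spatial data.
cells = [
    {"cell_id": i, "n_fertilizer_kg_ha": 100 + 50 * i, "irrigation_mm": 100 * i}
    for i in range(4)
]
simulated = simulate_grid(cells)
```

Because `pool.map` returns results in input order, `simulated[i]` belongs to `cells[i]` and can be written straight back to the corresponding grid cell when producing the map.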
CN202310049924.XA 2023-02-01 2023-02-01 High-precision time-space simulation method for coupling ecological process model and machine learning algorithm Pending CN115952421A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310049924.XA CN115952421A (en) 2023-02-01 2023-02-01 High-precision time-space simulation method for coupling ecological process model and machine learning algorithm

Publications (1)

Publication Number Publication Date
CN115952421A true CN115952421A (en) 2023-04-11

Family

ID=87290752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310049924.XA Pending CN115952421A (en) 2023-02-01 2023-02-01 High-precision time-space simulation method for coupling ecological process model and machine learning algorithm

Country Status (1)

Country Link
CN (1) CN115952421A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333321A (en) * 2023-09-27 2024-01-02 中山大学 Agricultural irrigation water consumption estimation method, system and medium based on machine learning
CN117542443A (en) * 2023-09-27 2024-02-09 中国农业大学 Method and device for balancing yield and relieving nitrogen pollution and electronic equipment
CN118014287A (en) * 2024-02-20 2024-05-10 中国农业大学 A comprehensive evaluation method and system for staple food crop planting
CN118674118A (en) * 2024-06-27 2024-09-20 中国科学院西北生态环境资源研究院 Method for realizing intelligent climate crop production by promoting loess plateau agriculture based on APSIM
CN118710431A (en) * 2024-05-30 2024-09-27 江苏希望田野生态科技有限公司 A crop growth prediction system and method based on soil and environmental data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705182A (en) * 2019-09-06 2020-01-17 北京师范大学 Prediction method of crop breeding adaptation time by coupling crop model and machine learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIUJUN XIAO ET AL.: "Coupling agricultural system models with machine learning to facilitate regional predictions of management practices and crop production", 《ENVIRONMENTAL RESEARCH LETTERS》, 1 November 2022 (2022-11-01), pages 1 - 11 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333321A (en) * 2023-09-27 2024-01-02 中山大学 Agricultural irrigation water consumption estimation method, system and medium based on machine learning
CN117542443A (en) * 2023-09-27 2024-02-09 中国农业大学 Method and device for balancing yield and relieving nitrogen pollution and electronic equipment
CN117542443B (en) * 2023-09-27 2024-05-24 中国农业大学 Method and device for balancing yield and relieving nitrogen pollution and electronic equipment
CN118014287A (en) * 2024-02-20 2024-05-10 中国农业大学 A comprehensive evaluation method and system for staple food crop planting
CN118710431A (en) * 2024-05-30 2024-09-27 江苏希望田野生态科技有限公司 A crop growth prediction system and method based on soil and environmental data
CN118674118A (en) * 2024-06-27 2024-09-20 中国科学院西北生态环境资源研究院 Method for realizing intelligent climate crop production by promoting loess plateau agriculture based on APSIM

Similar Documents

Publication Publication Date Title
CN115952421A (en) High-precision time-space simulation method for coupling ecological process model and machine learning algorithm
Chen et al. Improving regional winter wheat yield estimation through assimilation of phenology and leaf area index from remote sensing data
Huang et al. The improved winter wheat yield estimation by assimilating GLASS LAI into a crop growth model with the proposed Bayesian posterior-based ensemble Kalman filter
Shiri et al. Comparison of heuristic and empirical approaches for estimating reference evapotranspiration from limited inputs in Iran
WO2023179167A1 (en) Crop irrigation water demand prediction method based on aquacrop model and svr
Farina et al. Soil carbon dynamics and crop productivity as influenced by climate change in a rainfed cereal system under contrasting tillage using EPIC
US12112105B1 (en) Soil-climate intelligent type determining method for rice target yield and nitrogen fertilizer amount
CN110633841B (en) Provincial range plot scale data assimilation yield prediction method based on set sampling
CN105321120A (en) Assimilation evapotranspiration and LAI (leaf area index) region soil moisture monitoring method
CN110705182B (en) Crop breeding adaptive time prediction method coupling crop model and machine learning
Ruiz-Ramos et al. Comparing correction methods of RCM outputs for improving crop impact projections in the Iberian Peninsula for 21st century
Nendel et al. Simulating regional winter wheat yields using input data of different spatial resolution
Chen et al. Improving the practicability of remote sensing data-assimilation-based crop yield estimations over a large area using a spatial assimilation algorithm and ensemble assimilation strategies
CN111915096B (en) Crop yield early-stage forecasting technology based on crop model, remote sensing data and climate forecasting information
CN112598277A (en) Method for evaluating trans-regional winter wheat yield difference reduction and nitrogen fertilizer efficiency improvement
CN104732299A (en) Maize yield combined prediction system and method
Lokupitiya et al. Carbon and energy fluxes in cropland ecosystems: a model-data comparison
Tatsumi Effects of automatic multi-objective optimization of crop models on corn yield reproducibility in the USA
CN109766871A (en) A regional crop yield estimation method based on spatial difference assimilation of remote sensing data and crop model
CN119417639A (en) Crop precision management method and system based on crop growth model
CN118674118A (en) Method for realizing intelligent climate crop production by promoting loess plateau agriculture based on APSIM
CN116629453B (en) A remote sensing yield estimation method suitable for the entire crop growth period
Nandan et al. Evaluating the utility of weather generators in crop simulation models for in-season yield forecasting
CN109359862B (en) A method and system for real-time yield estimation of food crops
CN118014287A (en) A comprehensive evaluation method and system for staple food crop planting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination