CN110728446A - County scale crop yield estimation method based on CNN-LSTM - Google Patents

County scale crop yield estimation method based on CNN-LSTM Download PDF

Info

Publication number
CN110728446A
CN110728446A CN201910954014.XA CN201910954014A CN110728446A CN 110728446 A CN110728446 A CN 110728446A CN 201910954014 A CN201910954014 A CN 201910954014A CN 110728446 A CN110728446 A CN 110728446A
Authority
CN
China
Prior art keywords
data
county
remote sensing
crop
env
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910954014.XA
Other languages
Chinese (zh)
Other versions
CN110728446B (en
Inventor
孙杰
赖祖龙
陈性义
余俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN201910954014.XA priority Critical patent/CN110728446B/en
Publication of CN110728446A publication Critical patent/CN110728446A/en
Application granted granted Critical
Publication of CN110728446B publication Critical patent/CN110728446B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02Agriculture; Fishing; Forestry; Mining
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N2021/1765Method using an image detector and processing of image signal
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N2021/178Methods for obtaining spatial resolution of the property being measured
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N2021/1793Remote sensing
    • G01N2021/1797Remote sensing in landscape, e.g. crops

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Analytical Chemistry (AREA)
  • Primary Health Care (AREA)
  • Mining & Mineral Resources (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Biochemistry (AREA)

Abstract

本发明涉及遥感影像信息提取技术领域,特别是涉及一种基于CNN‑LSTM的县级尺度农作物估产方法,其包括以下步骤:S1数据的获取和处理、S2数据的叠加和滤除、S3获取县级尺度的特征张量数据、S4构建和训练CNN‑LSTM模型、S5应用S4中训练的CNN‑LSTM模型对目标作物进行估产。本发明方法基于反映作物生长状态的遥感数据和影响作物生长的环境数据,通过直方图统计的方法提取县域作物的更多特征并转化为张量,以此对提取的CNN‑LSTM模型进行训练,有效地提高了小尺度作物估产精度。

Figure 201910954014

The invention relates to the technical field of remote sensing image information extraction, in particular to a county-level crop yield estimation method based on CNN-LSTM, which comprises the following steps: acquiring and processing S1 data, superimposing and filtering S2 data, and S3 acquiring county Level-scale feature tensor data, S4 builds and trains a CNN-LSTM model, and S5 applies the CNN-LSTM model trained in S4 to estimate the yield of target crops. The method of the invention is based on the remote sensing data reflecting the growth state of the crops and the environmental data affecting the growth of the crops, and extracts more features of the crops in the county area through the method of histogram statistics and converts them into tensors, so as to train the extracted CNN-LSTM model, Effectively improve the small-scale crop yield estimation accuracy.

Figure 201910954014

Description

一种基于CNN-LSTM的县级尺度农作物估产方法A county-level crop yield estimation method based on CNN-LSTM

技术领域technical field

本发明涉及遥感影像信息提取技术领域,特别是涉及一种基于CNN-LSTM的县级尺度农作物估产方法。The invention relates to the technical field of remote sensing image information extraction, in particular to a county-level crop yield estimation method based on CNN-LSTM.

背景技术Background technique

作物产量是农业最重要的指标,与人类社会有着许多联系。产量预测是精密农业中最具挑战性的任务之一,对产量图,作物市场计划,作物保险和收获管理具有重要意义。Crop yield is the most important indicator of agriculture and has many connections with human society. Yield forecasting is one of the most challenging tasks in precision agriculture and has important implications for yield mapping, crop market planning, crop insurance and harvest management.

遥感技术已被广泛用于作物估产中。从遥感数据中可以提取各种相关信息辅助估产。特别是各类植被指数(VI),例如归一化植被指数(NDVI)已被广泛使用。其他指数,例如绿叶面积指数(GLAI),作物水分胁迫指数(CWSI),归一化水分指数(NDWI),绿色植被指数(GVI),土壤调整植被指数(SAVI)等也已用于作物产量预测。此外,气象变量例如降水,气温和一些土壤状况数据,包括土壤湿度,温度在产量预测中通常采用和质量作为作物生长环境指标。此类特征提取方法通常依靠计算区域均值,但是却忽略了细节特征。Remote sensing technology has been widely used in crop yield estimation. Various relevant information can be extracted from remote sensing data to assist in yield estimation. In particular, various vegetation indices (VI), such as the normalized vegetation index (NDVI), have been widely used. Other indices such as Green Leaf Area Index (GLAI), Crop Water Stress Index (CWSI), Normalized Water Index (NDWI), Green Vegetation Index (GVI), Soil Adjusted Vegetation Index (SAVI) etc. have also been used for crop yield prediction . In addition, meteorological variables such as precipitation, air temperature and some soil condition data, including soil moisture, temperature and quality are often used in yield prediction as indicators of crop growth environment. Such feature extraction methods usually rely on computing the regional mean, but ignore detailed features.

基于遥感数据,主要有两种作物产量预测方法:作物模拟和经验统计模型。尽管作物模拟模型精确地模拟了作物生长的物理过程,但是由于数据不足,这些模型几乎不能在较大的时空范围内应用。相反,经验统计模型很简单,需要较少的输入数据,因此已被广泛用作基于过程的模型的通用替代方案。包括支持向量机(SVM),决策树(DT),多层感知(MLP)和受限玻尔兹曼机(RBM)在内的机器学习算法可以为传统回归方法提供替代方案,并克服了许多限制。此外,人工神经网络(ANN)也被视为替代模型。传统的人工神经网络,即多层感知器模型,已成功应用于各种农作物的农作物产量估算。Based on remote sensing data, there are two main methods of crop yield prediction: crop simulation and empirical statistical models. Although crop simulation models accurately simulate the physical process of crop growth, these models can hardly be applied on a large spatiotemporal scale due to insufficient data. In contrast, empirical statistical models are simple and require less input data, and thus have been widely used as a general-purpose alternative to process-based models. Machine learning algorithms including Support Vector Machines (SVMs), Decision Trees (DTs), Multilayer Perceptions (MLPs), and Restricted Boltzmann Machines (RBMs) can provide alternatives to traditional regression methods and overcome many limit. Furthermore, Artificial Neural Networks (ANNs) are also considered as alternative models. The traditional artificial neural network, the multi-layer perceptron model, has been successfully applied to crop yield estimation of various crops.

近年来,深度学习(DL)被认为是机器学习和数据挖掘农业遥感中的突破性技术。包括堆叠稀疏自动编码器(SSAE),卷积神经网络(CNN)和递归神经网络(RNN)在内的大多数DL算法已用于产量预测。研究人员普遍认为CNN可以探索更多的空间特征,而LSTM作为一种特殊的RNN具有揭示物候特征的能力,但是,人们很少关注将CNN和LSTM的优势综合用于产量预测。In recent years, deep learning (DL) has been regarded as a breakthrough technology in machine learning and data mining in agricultural remote sensing. Most DL algorithms including Stacked Sparse Autoencoder (SSAE), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) have been used for yield prediction. Researchers generally believe that CNN can explore more spatial features, and LSTM, as a special RNN, has the ability to reveal phenological features, however, little attention has been paid to combining the advantages of CNN and LSTM for yield prediction.

发明内容SUMMARY OF THE INVENTION

为了克服上述现有技术的不足,本发明提供了一种基于CNN-LSTM的县级尺度农作物估产方法。该方法提出了一个CNN-LSTM模型,用于县级的季末作物产量预测。In order to overcome the above-mentioned shortcomings of the prior art, the present invention provides a county-level crop yield estimation method based on CNN-LSTM. This method proposes a CNN-LSTM model for end-of-season crop yield prediction at the county level.

本发明提供一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:包括如下步骤:The present invention provides a county-level crop yield estimation method based on CNN-LSTM, which is characterized by comprising the following steps:

S1、数据的获取和处理:分别获取待估产区目标年份的农作物遥感数据RS、环境遥感数据ENV、农作物分类数据D1和县级大豆产量数据D3多幅,以及县域边界矢量数据D2,并分别对同一年份的多幅所述农作物遥感数据RS和多幅所述环境遥感数据ENV进行预处理,以分别得到每年的多时相农作物遥感数据集IRS和环境遥感数据集IENVS1. Data acquisition and processing: respectively obtain the crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1, county-level soybean yield data D3, and county boundary vector data D2 of the target year of the production area to be estimated, and analyze them respectively. A plurality of described crop remote sensing data RS and a plurality of described environmental remote sensing data ENV of the same year are preprocessed to obtain the annual multi-temporal crop remote sensing data set I RS and environmental remote sensing data set I ENV respectively;

S2、数据的叠加和滤除:将S1中的所述多时相农作物遥感数据集IRS和对应年份的所述IENV进行叠加合成处理,以得到每年的数据总集IRS&ENV,结合S1中对应年份的所述D1,对所述数据总集IRS&ENV进行非目标作物像素滤除,以得到每年的仅包含目标作物像素信息的数据总集DIRS&ENVS2. Superposition and filtering of data: The multi-temporal crop remote sensing data set I RS in S1 and the I ENV of the corresponding year are superimposed and synthesized to obtain the annual data set I RS & ENV , combined with S1 In the described D1 of the corresponding year, the non-target crop pixels are filtered out to the data collection I RS & ENV , to obtain the annual data collection DI RS & ENV that only comprises the target crop pixel information;

S3、获取县级尺度的特征张量数据:利用S1中的所述县域边界矢量数据D2对每幅所述数据总集DIRS&ENV进行裁剪,以得到每个县域每年的所述数据总集DIRS&ENV,统计每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素直方图分布,并将每个县域每年的所述数据总集DIRS&ENV对应的波段像素直方图转换为深度学习张量,即可得到每个县域每年的特征张量数据;S3. Obtain county-level feature tensor data: Use the county boundary vector data D2 in S1 to cut each of the data sets DI RS & ENV to obtain the annual data set for each county DI RS & ENV , count the distribution of the band pixel histogram of each band in the data set DI RS & ENV for each county each year, and calculate the band pixels corresponding to the data set DI RS & ENV for each county each year Convert the histogram into a deep learning tensor to obtain the annual feature tensor data of each county;

S4、构建和训练CNN-LSTM模型:构建CNN-LSTM模型,利用S3中的多个所述特征张量数据和S1中对应县域对应年份所述县级大豆产量数据D3对CNN-LSTM模型进行训练,并评价模型精度;S4. Build and train a CNN-LSTM model: build a CNN-LSTM model, and train the CNN-LSTM model by using a plurality of the feature tensor data in S3 and the county-level soybean yield data D3 in the corresponding year of the corresponding county in S1 , and evaluate the model accuracy;

S5:应用S4中训练的CNN-LSTM模型对目标作物进行估产。S5: Use the CNN-LSTM model trained in S4 to estimate the yield of the target crop.

进一步地,所述农作物遥感数据RS包含t个时相数据,每个时相数据具有m个波段,所述环境遥感数据ENV包含u个时相数据,每个时相数据具有n个波段。Further, the crop remote sensing data RS includes t time-phase data, each time-phase data has m bands, and the environmental remote sensing data ENV includes u time-phase data, each time-phase data has n bands.

进一步地,所述S1还包括:Further, the S1 also includes:

S11、对每幅所述农作物遥感数据RS进行去云处理,将经过去云处理的所述农作物遥感数据RS按年份进行归类,并对同一年份的多幅所述农作物遥感数据RS进行叠加合成处理,即可得到每年的所述多时相农作物遥感数据集IRSS11, carry out cloud removal processing to each described crop remote sensing data RS, classify the crop remote sensing data RS through the cloud removal processing by year, and superimpose and synthesize multiple described crop remote sensing data RS of the same year processing, the annual multi-temporal crop remote sensing dataset IRS can be obtained;

对同一年份的多幅所述环境遥感数据ENV进行叠加合成处理,即可得到每年的所述环境遥感数据集IENVThe multiple pieces of the environmental remote sensing data ENV in the same year are superimposed and synthesized to obtain the annual environmental remote sensing data set I ENV .

进一步地,所述环境遥感数据ENV包括地表温度数据和天气参数数据,得到所述环境遥感数据集IENV还包括以下步骤:Further, the environmental remote sensing data ENV includes surface temperature data and weather parameter data, and obtaining the environmental remote sensing data set I ENV also includes the following steps:

S11a、将所述地表温度数据和所述天气参数数据分别与所述农作物遥感数据RS进行时相对齐,以分别得到对齐后的所述地表温度数据和所述天气参数数据;S11a, aligning the surface temperature data and the weather parameter data with the crop remote sensing data RS, respectively, to obtain the aligned surface temperature data and the weather parameter data;

S11b、以年份单位,分别对S12中对齐后的所述地表温度数据和所述天气参数数据进行处理,以分别得到每年的多时相地表温度数据集和每年的多时相天气参数数据集;S11b, in units of years, respectively process the aligned surface temperature data and the weather parameter data in S12 to obtain an annual multi-temporal surface temperature data set and an annual multi-temporal weather parameter data set;

S11c、将S13中的每年的多时相地表温度数据集与对应年份的多时相天气参数数据集进行处理,即可得到对应年份所述环境遥感数据集IENVS11c: Process the annual multi-temporal surface temperature data set in S13 and the multi-temporal weather parameter data set of the corresponding year to obtain the environmental remote sensing data set I ENV of the corresponding year.

进一步地,S2中所述数据总集IRS&ENV具有t个时相,每个时相数据具有m+n个波段。Further, the data set I RS & ENV in S2 has t time phases, and each time phase data has m+n frequency bands.

进一步地,所述S3还包括:Further, the S3 also includes:

S31、确认各波段中目标作物像素数据的真实分布界限:分别对待估产区内每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据统计形成对应的波段像素直方图,得到每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据的最大值和最小值,以各波段目标作物像素数据的最大值和最小值为界限,将每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据范围分为w个区间;S31, confirm the true distribution limit of the target crop pixel data in each band: the target crop pixel data of each band in each of the crop remote sensing data RS and the environmental remote sensing data ENV in the production area to be estimated respectively form a corresponding band pixel histogram Figure, obtain the maximum value and the minimum value of the target crop pixel data of each waveband in each described crop remote sensing data RS and described environmental remote sensing data ENV, take the maximum value and minimum value of each waveband target crop pixel data as the limit, the The target crop pixel data range of each band in each of the crop remote sensing data RS and the environmental remote sensing data ENV is divided into w intervals;

S32、利用S1中的所述县域边界矢量数据D2分别对S2中的所述数据总集DIRS&ENV进行裁剪,以得到每个县域每年的所述数据总集DIRS&ENV,统计每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素直方图,以S31中得到各波段的区间,将每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素转换至对应波段的区间内,即可得到每个县域每年对应的特征张量数据,所述特征张量数据的形状为t*(m+n)*1*w。S32, using the county boundary vector data D2 in S1 to cut the data collection DI RS & ENV in S2 respectively, to obtain the annual data collection DI RS & ENV of each county, count each The band pixel histogram of each band in the annual data set DI RS & ENV of the county area, the interval of each band is obtained in S31, and the bands of each band in the annual data set DI RS & ENV of each county area are obtained. The pixels are converted into the interval of the corresponding band, and the feature tensor data corresponding to each county can be obtained each year, and the shape of the feature tensor data is t*(m+n)*1*w.

进一步地,所述S4还包括以下步骤:Further, the S4 also includes the following steps:

S41、设置训练参数:将所述CNN-LSTM模型的验证数据分割比例设置为0.2,训练周期设置为100次,数据块大小设置为16;S41, set training parameters: set the verification data split ratio of the CNN-LSTM model to 0.2, the training period to 100 times, and the data block size to 16;

S42、向设置好的CNN-LSTM模型中输入多个特征张量数据,并利用通过keras中的fit函数对模型进行训练,以S1中对应县域对应年份的所述县级大豆产量数据D3为训练数据,对CNN-LSTM模型进行训练。S42. Input multiple feature tensor data into the set CNN-LSTM model, and use the fit function in keras to train the model, using the county-level soybean yield data D3 in the corresponding year of the county in S1 as training data to train the CNN-LSTM model.

本发明提供的技术方案带来的有益效果是:本发明方法基于反映作物生长状态的遥感数据和影响作物生长的环境数据,通过直方图统计的方法提取县域作物的更多特征并转化为张量,以此对提取的CNN-LSTM模型进行训练,有效地提高了小尺度作物估产精度,对作物产量估算,作物市场计划,作物保险和收获管理具有重要意义。The beneficial effects brought by the technical solution provided by the present invention are as follows: the method of the present invention extracts more features of county crops by means of histogram statistics and converts them into tensors based on the remote sensing data reflecting the growth state of crops and the environmental data affecting the growth of crops , to train the extracted CNN-LSTM model, which effectively improves the small-scale crop yield estimation accuracy, which is of great significance to crop yield estimation, crop market planning, crop insurance and harvest management.

附图说明Description of drawings

图1为本发明一种基于CNN-LSTM的县级尺度农作物估产方法的流程图;1 is a flowchart of a county-level crop yield estimation method based on CNN-LSTM of the present invention;

图2为本发明提出的CNN-LSTM框架图;Fig. 2 is the framework diagram of CNN-LSTM proposed by the present invention;

图3为某县一时相数据特征的张量示意图;Figure 3 is a schematic diagram of a tensor of a temporal data feature in a county;

图4为采用本发明方法测量大豆产量的结果图与官方公布产量图的比较图;Fig. 4 is the result graph that adopts the method of the present invention to measure soybean yield and the comparison graph of the official announcement yield graph;

图5为采用本发明方法预测2011-2015年大豆产量的精度结果图;Fig. 5 is a graph of the precision result of using the method of the present invention to predict soybean yield in 2011-2015;

图6为5年预测值与观测值关系图。Figure 6 shows the relationship between the five-year predicted value and the observed value.

具体实施方式Detailed ways

为使本发明的目的、技术方案和优点更加清楚,下面将结合附图对本发明实施方式作进一步地描述。In order to make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention will be further described below with reference to the accompanying drawings.

在发明中,以美国中部15个州为案例区,基于2003-2015年遥感数据和历史产量数据为数据源,对该区域2011-2015年的大豆产量进行预估,并将预估结果与真实官方数据进行比较。In the invention, taking 15 states in the central United States as the case area, based on the remote sensing data and historical yield data from 2003 to 2015 as the data source, the soybean yield in the region from 2011 to 2015 was estimated, and the estimated results were compared with the real ones. Compare with official data.

待估产区的选择:根据美国农业部(USDA)公布的大豆种植分布,大豆在31个州种植,在案例中,选择了15个州作为示例,包括北达科他州,南达科他州,内布拉斯加州,明尼苏达州,爱荷华州,堪萨斯州,密苏里州,阿肯色州,密西西比州,田纳西州,伊利诺伊州,印第安纳州,俄亥俄州,密歇根州和威斯康星州。上述15个州的大豆种植面积占全国大豆种植面积的88.75%。根据大豆生长期,遥感数据的时相选择从四月至十二月。Selection of production areas to be estimated: According to the soybean planting distribution published by the United States Department of Agriculture (USDA), soybeans are grown in 31 states, in this case, 15 states were selected as examples, including North Dakota, South Dakota, Nebraska California, Minnesota, Iowa, Kansas, Missouri, Arkansas, Mississippi, Tennessee, Illinois, Indiana, Ohio, Michigan and Wisconsin. The soybean planting area in the above 15 states accounts for 88.75% of the national soybean planting area. According to the soybean growing period, the time period of remote sensing data is selected from April to December.

在此说明,下文出现的目标作物均为大豆。Here, the target crops appearing below are all soybeans.

获取待估产区的数据,包括:Get data on the region to be estimated, including:

1、县级大豆产量数据:从美国农业部收集了2003年至2015年县级大豆产量数据,其产量单位为kg/ha,获取来源:https://www.nass.usda.gov/Quick_Stats/Lite/index.php;1. County-level soybean yield data: County-level soybean yield data from 2003 to 2015 were collected from the USDA, and the yield unit is kg/ha. Source: https://www.nass.usda.gov/Quick_Stats/ Lite/index.php;

2、农作物分类遥感数据:从美国农业部发放的crop data layer(CDL)收集2003年至2015年的农作物分类遥感数据,每年一幅,其为一种特定于作物的土地覆盖数据层,分辨率30m,获取来源:https://nassgeodata.gmu.edu/CropScape/;2. Crop classification remote sensing data: Collect the crop classification remote sensing data from 2003 to 2015 from the crop data layer (CDL) issued by the USDA, one per year, which is a crop-specific land cover data layer with a high resolution 30m, obtained from: https://nassgeodata.gmu.edu/CropScape/;

3、待估产区的县域边界矢量遥感数据:其格式为shp,获取来源:https://catalog.data.gov/dataset/tiger-line-shapefile-2016-nation-u-s-current-county-and-equivalent-national-shapefile;3. The county boundary vector remote sensing data of the production area to be estimated: its format is shp, obtained from: https://catalog.data.gov/dataset/tiger-line-shapefile-2016-nation-u-s-current-county-and- equivalent-national-shapefile;

4、农作物遥感数据:从指定网站(获取地址为:https://doi.org/10.5067/MODIS/MOD09A1.006)中下载2003年至2015年处于大豆生长期的农作物遥感数据(RS),其中,所述农作物遥感数据为MOD09A1 V6产品,且每一年的RS中提供了500m分辨率下Terra MODIS波段1-7的表面光谱反射率的估计值,并对气体、气溶胶和瑞利散射等大气条件进行了校正,这可保证每个所述RS中,对于每个像素,基于高观测覆盖率,低视角,不存在云或云影以及气溶胶载量。4. Crop remote sensing data: Download the crop remote sensing data (RS) in soybean growing period from 2003 to 2015 from the designated website (access address: https://doi.org/10.5067/MODIS/MOD09A1.006), among which , the crop remote sensing data is MOD09A1 V6 product, and the estimated value of the surface spectral reflectance of Terra MODIS bands 1-7 at 500m resolution is provided in the RS each year, and the gas, aerosol and Rayleigh scattering etc. Corrected for atmospheric conditions, this guarantees the absence of clouds or cloud shadows and aerosol loadings in each of the RSs, for each pixel, based on high observational coverage, low viewing angles.

5、环境遥感数据:环境数据包括地表温度数据和天气参数数据。从指定网站中下载2003年至2015年处于大豆生长期的地表温度数据和天气参数数据。5. Environmental remote sensing data: Environmental data includes surface temperature data and weather parameter data. Download the surface temperature data and weather parameter data during the soybean growing period from 2003 to 2015 from the designated website.

其中,地表温度数据采用MOD11A2 V6产品,获取地址为https://doi.org/10.5067/MODIS/MOD11A2.006。该产品在1km*1km的网格中提供平均8天的地表温度(LST)。MOD11A2中的每个像素值都是在这8天时间内收集的所有相应MOD11A1 LST像素的简单平均值。两个数据波段白天和晚上的地表温度带被用作土壤长期因素。Among them, the surface temperature data adopts MOD11A2 V6 product, and the address is https://doi.org/10.5067/MODIS/MOD11A2.006. The product provides average 8-day surface temperature (LST) in a 1km*1km grid. Each pixel value in MOD11A2 is a simple average of all corresponding MOD11A1 LST pixels collected over this 8-day period. The daytime and nighttime surface temperature bands of the two data bands were used as soil long-term factors.

天气参数数据采用Daymet产品,获取地址为https://daymet.ornl.gov,该产品是通过每日气象观测值的内插法和外推法生成的每日天气参数的网格估计值的集合。本发明中选择了Daymet中两个重要的天气参数:降水和气压两个特征波段,分辨率为1km。The weather parameter data adopts the Daymet product, available at https://daymet.ornl.gov, which is a collection of grid estimates of daily weather parameters generated by interpolation and extrapolation of daily meteorological observations. . In the present invention, two important weather parameters in Daymet are selected: two characteristic bands of precipitation and air pressure, and the resolution is 1km.

请参考图1,一种基于CNN-LSTM的县级尺度农作物估产方法,包括以下步骤:Please refer to Figure 1, a county-level crop yield estimation method based on CNN-LSTM, including the following steps:

S1、数据的获取和处理:按照上述方法分别连接获取2003年至2015年的农作物遥感数据RS多幅、环境遥感数据ENV多幅、农作物分类数据D1多幅和县级大豆产量数据D3多幅,以及县域边界矢量数据D2一幅,并分别对同一年份的多幅所述RS和多幅所述ENV进行预处理,以分别得到每年的多时相农作物遥感数据集IRS和环境遥感数据集IENVS1. Data acquisition and processing: According to the above method, multiple pieces of crop remote sensing data RS, multiple pieces of environmental remote sensing data ENV, multiple pieces of crop classification data and multiple pieces of county-level soybean yield data D3 were obtained from 2003 to 2015, respectively. And one piece of county boundary vector data D2, and preprocesses multiple pieces of the RS and multiple pieces of the ENV in the same year, respectively, to obtain the annual multi-temporal crop remote sensing data set I RS and environmental remote sensing data set I ENV .

所述S1还包括:S11、在google earth engine中,对每幅所述农作物遥感数据RS进行去云处理,将经过去云处理的所述农作物遥感数据RS按年份进行归类,并对同一年份的多幅所述农作物遥感数据RS进行叠加合成处理,即可得到每年的所述多时相农作物遥感数据集IRSThe S1 further includes: S11, in the google earth engine, perform cloud removal processing on each piece of the crop remote sensing data RS, classify the crop remote sensing data RS after the cloud removal processing by year, and classify the crop remote sensing data RS by year, and the same year A plurality of described crop remote sensing data RSs are superimposed and synthesized, and the annual described multi-temporal crop remote sensing data sets IRS can be obtained;

对同一年份的多幅所述环境遥感数据ENV进行叠加合成处理,即可得到每年的所述IENVA plurality of described environmental remote sensing data ENVs in the same year are carried out superimposed synthesis processing, so that the annual described I ENVs can be obtained;

其中,在大豆的生长期内,所述多时相农作物遥感数据集IRS包括t3个时相数据,每个时相数据具有m个波段,所述IENV包括t4个时相数据,每个时相数据具有n个波段,t3=t4=34,m=7,n=4。Wherein, during the growth period of soybean, the multi-phase crop remote sensing data set IRS includes t3 time-phase data, each time-phase data has m bands, and the I ENV includes t4 time - phase data, each time-phase data The time-phase data has n bands, t 3 =t 4 =34, m=7, n=4.

在此,需要说明的是,每一年的多时相农作物遥感数据集IRS和环境遥感数据集IENV中应该涵盖目标作物整个生长季所有农作物遥感数据RS、地表温度数据、天气参数数据。由于不同遥感数据的周期不同,在本发明中,以回访周期最长的遥感数据为准,其他遥感数据在时间上通过多时相数据取均值的方式向回访周期最长的遥感数据进行对齐,对齐后的所述农作物遥感数据RS和所述环境遥感数据ENV两大类数据时相数均为t2,t2=8。Here, it should be noted that each year's multi-temporal crop remote sensing dataset I RS and environmental remote sensing dataset I ENV should cover all crop remote sensing data RS, surface temperature data, and weather parameter data throughout the growing season of the target crop. Since the periods of different remote sensing data are different, in the present invention, the remote sensing data with the longest return visit period shall prevail, and other remote sensing data are aligned with the remote sensing data with the longest return visit period in time by taking the mean value of multi-temporal data. The two types of data of the last crop remote sensing data RS and the environmental remote sensing data ENV are both t 2 , and t 2 =8.

具体操作为:农作物遥感数据RS的产品周期为8天,在生长期内,每一年需获取34个农作物遥感数据,将同一年份获取的农作物遥感数据进行归类和叠加合成处理(在google earth engine中进行),得到每一年的多时相农作物遥感数据集IRS;环境数据包括地表温度数据和天气参数数据,其中,地表温度数据的产品提供平均8天的地表温度,对应的,在生长期内,每一年需获取34个地表温度数据,并将同一年份获取的地表温度数据进行归类和叠加合成处理(在google earth engine中进行),得到每一年的多时相地表温度数据集,而天气参数数据的产品,其每一天对应一个时相数据,由于农作物遥感数据RS和地表温度数据的产品周期为8天,而天气参数数据产品周期是1天,所以要先将天气参数数据产品进行取均值重采样的方式将其周期统一至8天,以将天气参数数据的产品周期与农作物遥感数据RS和地表温度数据的产品周期对齐,因此,在生长期内,对齐后的天气参数数据,其每一年也均包含34个天气参数数据,将同一年份对齐后的天气参数数据进行归类,即可得到每一年的多时相天气参数数据集。每年的多时相地表温度数据集和同年的多时相天气参数数据集归类为同年的环境遥感数据集IENVThe specific operation is: the product cycle of crop remote sensing data RS is 8 days. During the growth period, 34 crop remote sensing data need to be acquired every year, and the crop remote sensing data acquired in the same year are classified and superimposed. engine) to obtain a multi-temporal crop remote sensing dataset IRS every year; the environmental data includes surface temperature data and weather parameter data, wherein the product of the surface temperature data provides an average 8-day surface temperature, correspondingly, in the growth During the period, 34 pieces of surface temperature data need to be obtained every year, and the surface temperature data obtained in the same year are classified and superimposed (in Google Earth Engine) to obtain a multi-temporal surface temperature data set for each year. , and the product of weather parameter data, each day corresponds to one phase data, because the product cycle of crop remote sensing data RS and surface temperature data is 8 days, and the product cycle of weather parameter data is 1 day, so it is necessary to first convert the weather parameter data The product is averaged and resampled to unify its cycle to 8 days to align the product cycle of weather parameter data with the product cycle of crop remote sensing data RS and surface temperature data. Therefore, during the growth period, the aligned weather parameters Each year also contains 34 weather parameter data. By classifying the aligned weather parameter data in the same year, the multi-temporal weather parameter data set for each year can be obtained. The annual multi-temporal surface temperature dataset and the same-year multi-temporal weather parameter dataset are classified as the same-year environmental remote sensing dataset I ENV .

S2、数据的叠加和滤除:将S1中的所述多时相农作物遥感数据集IRS和对应年份的所述IENV进行叠加合成处理,以得到每年的数据总集IRS&ENV,结合S1中对应年份的所述D1,对所述数据总集IRS&ENV进行非目标作物像素滤除,以得到每年仅包含目标作物像素信息的数据总集DIRS&ENVS2. Superposition and filtering of data: The multi-temporal crop remote sensing data set I RS in S1 and the I ENV of the corresponding year are superimposed and synthesized to obtain the annual data set I RS & ENV , combined with S1 In the D1 of the corresponding year in the data set I RS & ENV , the non-target crop pixels are filtered out, so as to obtain the data set DI RS & ENV which only contains the target crop pixel information every year.

其中,在生长期内,所述数据总集IRS&ENV和数据总集DIRS&ENV均包括t1个时相数据,每个时相数据具有m+n个波段,t1=34,m+n=11。Wherein, in the growth period, the data set I RS & ENV and the data set DI RS & ENV both include t 1 time-phase data, each time-phase data has m+n bands, t 1 =34, m+n=11.

S3、利用S1中的所述县域边界矢量数据D2对每幅所述数据总集DIRS&ENV进行裁剪,以得到每个县域每年的所述数据总集DIRS&ENV,统计每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素直方图分布,并将每个县域每年的所述DIRS&ENV对应的波段像素直方图转换为深度学习张量,即可得到每个县域每年的特征张量数据;S3, using the county boundary vector data D2 in S1 to cut each piece of the data set DI RS & ENV to obtain the annual data set DI RS & ENV of each county, and count each county every year The band pixel histogram distribution of each band in the data collection DI RS & ENV , and converting the band pixel histogram corresponding to the DI RS & ENV in each county each year into a deep learning tensor, you can obtain each Yearly feature tensor data for each county;

所述S3还包括:S31、确认各波段中目标作物像素数据的真实分布界限:利用google earth engine中的ui.Chart.image.histogram函数,分别对待估产区内每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据统计形成对应的波段像素直方图,得到每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据的最大值和最小值,进而得到每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据的真实分布界限,以每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据的最大值和最小值为界限,将每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据范围分为w个区间;The S3 further includes: S31. Confirm the real distribution limits of the target crop pixel data in each band: use the ui.Chart.image.histogram function in the google earth engine to separately treat each piece of the crop remote sensing data RS and the crop in the estimated production area. The target crop pixel data statistics of each band in the environmental remote sensing data ENV form a corresponding band pixel histogram, and obtain the maximum value of the target crop pixel data of each band in each of the described crop remote sensing data RS and the environmental remote sensing data ENV and the minimum value, and then obtain the real distribution limit of the target crop pixel data of each band in each described crop remote sensing data RS and described environmental remote sensing data ENV, with each described crop remote sensing data RS and described environmental remote sensing data ENV The maximum value and the minimum value of the target crop pixel data in each band are the limits, and the range of each band target crop pixel data in each of the crop remote sensing data RS and the environmental remote sensing data ENV is divided into w intervals;

S32、利用S1中的所述县域边界矢量数据D2分别对S2中的所述数据总集DIRS&ENV进行裁剪,以得到每个县域每年的所述数据总集DIRS&ENV,利用google earth engine中的ui.Chart.image.histogram函数,统计每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素直方图,利用google earth engine中的ee.Reducer.fixed Histogram函数,以S31中得到各波段的区间,将每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素转换至对应波段的区间内,即可得到每个县域每年对应的特征张量数据,所述特征张量数据的形状为t*(m+n)*1*w。S32, using the county boundary vector data D2 in S1 to cut the data set DI RS & ENV in S2 respectively to obtain the data set DI RS & ENV for each county each year, using google earth The ui.Chart.image.histogram function in the engine counts the band pixel histogram of each band in the data set DI RS & E NV in each county every year, using the ee.Reducer.fixed Histogram function in the google earth engine, Using the interval of each band obtained in S31, convert the band pixels of each band in the data collection DI RS & ENV of each county each year into the interval of the corresponding band, and then the feature tensor corresponding to each county year can be obtained. data, the shape of the feature tensor data is t*(m+n)*1*w.

在此,需要说明的是,本方法假设通过作物像素信息对产量预估时,与像素的地理位置无关,所以通过统计各波段的波段像素直方图就可以重建影像,即可以用直方图代表原始图像。在给定间隔数量的情况系啊,各波段的目标作物像素数据范围越大,得到的直方图间隔越大,但是通常得到的遥感数据中各波段目标作物像素数据的理论界限值会过大,因此,为了生成规则化的特征张量数据,在对各波段像素直方图转换之前需要探明各波段中目标作物像素数据的真实分布界限。因此,S31的目的是获取各波段中目标作物像素范围的真实分布,进而获取每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据的最小值和最大值,以生成规则化的张量。在本发明中,通过采用直方图转换的方法统计各波段像素直方图,其具体操作为:以将每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据的最大值和最小值为界限,将每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据范围分为w个区间,然后将每个波段中的目标作物像素值分配到对应的w个区间,并统计每个区间内目标作物像素量,于是,每个波段及对应一个1*w的特征张量数据。将得到每个县域每年的数据总集DIRS&ENV中各波段的波段像素直方图转换至对应波段的区间内后,对应每个县域每年数据DIRS&ENV将对应一个张量TRS&ENV,特征张量数据的形状为t*(m+n)*1*w,在本发明中,w为32,t=t3=t4=34,即每个县域每年的特征张量数据形状为34*11*1*32。在此,需要说明的是,w=32只是本发明中一个具体的实施例,在实际应用中,w还可以为64、128等。Here, it should be noted that this method assumes that when crop pixel information is used to estimate yield, it has nothing to do with the geographic location of the pixel, so the image can be reconstructed by counting the band pixel histogram of each band, that is, the histogram can be used to represent the original image. image. In the case of a given number of intervals, the larger the range of target crop pixel data in each band, the larger the obtained histogram interval, but usually the theoretical limit value of each band target crop pixel data in the obtained remote sensing data will be too large. Therefore, in order to generate regularized feature tensor data, it is necessary to find out the real distribution limit of the target crop pixel data in each band before transforming the pixel histogram of each band. Therefore, the purpose of S31 is to obtain the real distribution of the target crop pixel range in each band, and then obtain the minimum and maximum values of each band target crop pixel data in each of the crop remote sensing data RS and the environmental remote sensing data ENV, to Generate regularized tensors. In the present invention, the pixel histogram of each band is counted by adopting the method of histogram conversion, and the specific operation is as follows: to convert the maximum value of the target crop pixel data of each band in each of the crop remote sensing data RS and the environmental remote sensing data ENV The value and the minimum value are the limits, and the target crop pixel data range of each band in each of the crop remote sensing data RS and the environmental remote sensing data ENV is divided into w intervals, and then the target crop pixel value in each band is assigned to Corresponding w intervals, and count the number of target crop pixels in each interval, so each band corresponds to a 1*w feature tensor data. After converting the band pixel histogram of each band in the annual data set DI RS & ENV of each county into the interval of the corresponding band, the corresponding annual data DI RS & ENV of each county will correspond to a tensor T RS & ENV , the shape of the feature tensor data is t*(m+n)*1*w, in the present invention, w is 32, t=t 3 =t 4 =34, that is, the shape of the feature tensor data in each county per year is 34*11*1*32. Here, it should be noted that w=32 is only a specific embodiment of the present invention, and in practical applications, w may also be 64, 128, or the like.

图3为某Kansas州的Marion县在某一个时刻各波段被转换为32箱直方图后的示意图。Figure 3 is a schematic diagram of each band being converted into a 32-box histogram at a certain moment in Marion County in a Kansas state.

S4、构建CNN-LSTM模型,并利用S3中的多个所述特征张量数据和S1中对应县域对应年份所述县级大豆产量数据D3对CNN-LSTM模型进行训练,并评价模型精度,当训练后模型的精度满足要求时,即得到最终的CNN-LSTM模型,若不满足,增加训练次数,直至训练后模型的精度满足要求;其中,所述CNN-LSTM模型包括CNN和LSTM两个部分,CNN用于对特征张量数据进行特征提取,LSTM用于学习CNN提取的特征。训练模型时采用early stopping方法对val_loss参数进行监测,如果该参数连续15个epoch没有优化即停止训练。S4. Construct a CNN-LSTM model, and use the plurality of the feature tensor data in S3 and the county-level soybean yield data D3 in the corresponding county in S1 to train the CNN-LSTM model, and evaluate the model accuracy. When the accuracy of the model after training meets the requirements, the final CNN-LSTM model is obtained. If not, the number of training times is increased until the accuracy of the model after training meets the requirements; wherein, the CNN-LSTM model includes two parts: CNN and LSTM , CNN is used for feature extraction on feature tensor data, and LSTM is used to learn the features extracted by CNN. When training the model, the early stopping method is used to monitor the val_loss parameter. If the parameter is not optimized for 15 consecutive epochs, the training will be stopped.

其中,CNN包含两个Conv2D,第一个Conv2D具有32个过滤器,第二个具有64个过滤器,每个过滤器的卷积核大小均为1*2。每个卷积层之后都紧跟一个核大小为1*2的池化层,并使用批处理规范化处理。通过时间分发包装器,将两个堆叠的Conv2D层应用于输入的每个时间切片,以进行特征提取。然后,再将经过Conv2D处理后的时序张量馈入LSTM层之前,先对其进行平坦化和批量标准化。LSTM部分中只有一层LSTM层。LSTM的神经元数设置为256,然后是具有64个神经元的密集层。之后,将所有时间输出展平为一个长向量,该向量以0.5的丢失概率发送到一个Dropout层,该Dropout层可以在训练过程中随机关闭一定百分比的神经元,这可以帮助防止神经元组过度拟合。最后,一个神经元密集层用于输出预测的产量。模型的激活函数采用修正线性单元ReLU。该模型通过keras进行构建。本发明构建的CNN-LSTM模型结构如图2所示。Among them, CNN contains two Conv2Ds, the first Conv2D has 32 filters, the second has 64 filters, and the convolution kernel size of each filter is 1*2. Each convolutional layer is immediately followed by a pooling layer with a kernel size of 1*2 and is processed using batch normalization. Via a temporal distribution wrapper, two stacked Conv2D layers are applied to each temporal slice of the input for feature extraction. The Conv2D-processed timing tensors are then flattened and batch-normalized before being fed into the LSTM layers. There is only one LSTM layer in the LSTM part. The number of neurons of the LSTM is set to 256, followed by a dense layer with 64 neurons. Afterwards, all time outputs are flattened into a long vector, which is sent with a dropout probability of 0.5 to a Dropout layer that can randomly turn off a certain percentage of neurons during training, which can help prevent groups of neurons from becoming overwhelmed fit. Finally, a dense layer of neurons is used to output the predicted output. The activation function of the model adopts the modified linear unit ReLU. The model is built with keras. The structure of the CNN-LSTM model constructed by the present invention is shown in FIG. 2 .

训练所述CNN-LSTM模型包括以下步骤:Training the CNN-LSTM model includes the following steps:

S41、设置训练参数:将所述CNN-LSTM模型的验证数据分割比例(validation_split)设置为0.2,训练周期设置为100次,数据块大小设置为16;S41. Set training parameters: set the validation data split ratio (validation_split) of the CNN-LSTM model to 0.2, the training period to 100 times, and the data block size to 16;

S42、向设置好的CNN-LSTM模型中输入多个特征张量数据,并利用通过keras中的fit函数对模型进行训练,以S1中对应县域对应年份的所述县级大豆产量数据D3为训练数据,对CNN-LSTM模型进行训练。S42. Input multiple feature tensor data into the set CNN-LSTM model, and use the fit function in keras to train the model, using the county-level soybean yield data D3 in the corresponding year of the county in S1 as training data to train the CNN-LSTM model.

在此,需要说明的是,CNN-LSTM模型的输入为每个县域每年的特征张量数据,标签数据为S1中对应县域对应年份的历史产量数据,输出为对应县域对应年份的预测产量。向CNN-LSTM模型输入数据,利用keras中的fit函数对模型进行训练,以S1中对应县域对应年份的历史产量数据为训练数据,CNN-LSTM模型的验证数据分割比例(validation_split)设置为0.2,目的是将训练数据随机的分离出20%作为验证数据,训练过程总共设置100个epoch训练周期,patch size(数据块大小)设置为16,使用均方误差作物训练度量。为了提高实际性能,采用了提前停止的方法来减少深度学习系统的泛化误差。当监视的“val_loss”指标在连续10个周期后停止改善时,训练将结束并保存模型。优化器使用自适应动量(ADAM)。Here, it should be noted that the input of the CNN-LSTM model is the annual feature tensor data of each county, the label data is the historical output data of the corresponding year in the corresponding county in S1, and the output is the predicted output of the corresponding year in the corresponding county. Input data to the CNN-LSTM model, use the fit function in keras to train the model, use the historical output data of the corresponding year in the corresponding county in S1 as the training data, and set the validation data split ratio (validation_split) of the CNN-LSTM model to 0.2, The purpose is to randomly separate 20% of the training data as validation data. The training process is set for a total of 100 epoch training cycles, the patch size (data block size) is set to 16, and the mean square error crop training metric is used. To improve practical performance, an early stopping method is adopted to reduce the generalization error of deep learning systems. When the monitored "val_loss" metric stops improving after 10 consecutive epochs, the training will end and the model will be saved. The optimizer uses Adaptive Momentum (ADAM).

S5:应用S4中训练的CNN-LSTM模型对目标作物进行估产,以得到待估产区具有县级尺度的目标作物产量。S5: Use the CNN-LSTM model trained in S4 to estimate the yield of the target crop, so as to obtain the target crop yield at the county level in the production area to be estimated.

此外,为了验证本方法的有效性,首先采用本发明所述方法分别对上述15个州2011-2015年的大豆产量进行预估,并将预估值与USDA公布的对应县域对应年份的产值进行对比,并采用RMSE和R2作为评价指标,其中,精度评定标准为决定系数R2和均方根误差RMSE,R2的取值范围为[0,1],R2越大,表示模型拟合效果越好,反之则模型拟合效果越差;RMSE越小,模型拟合效果越好。In addition, in order to verify the effectiveness of this method, the method of the present invention was used to estimate the soybean yields of the above-mentioned 15 states from 2011 to 2015, respectively, and the estimated values were compared with the output values of the corresponding counties published by the USDA in the corresponding years. For comparison, RMSE and R 2 are used as evaluation indicators. The accuracy evaluation standard is the determination coefficient R 2 and the root mean square error RMSE. The value range of R 2 is [0, 1]. The larger the R 2 is, the better the model is. The better the fitting effect, the worse the model fitting effect; the smaller the RMSE, the better the model fitting effect.

采用本发明所述方法得到的估产结果与USDA公布的结果对比结果如图4所示。Figure 4 shows the comparison results between the yield estimation results obtained by the method of the present invention and the results published by the USDA.

通过对每一年的估产精度进行评估,RMSE如图5所示。The RMSE is shown in Figure 5 by evaluating the production accuracy of each year.

综合5年预测数据与USDA公布的数据进行比较,得到R2=0.78(R2>0.6时,可代表预测结果的精准度高),如图6所示,证明了本方法的有效性。Comparing the 5-year forecast data with the data published by USDA, we get R 2 =0.78 (when R 2 >0.6, it can represent the high accuracy of the forecast result), as shown in Figure 6, which proves the effectiveness of this method.

在本文中,所涉及的前、后、上、下等方位词是以附图中零部件位于图中以及零部件相互之间的位置来定义的,只是为了表达技术方案的清楚及方便。应当理解,所述方位词的使用不应限制本申请请求保护的范围。In this document, the related terms such as front, rear, upper and lower are defined by the positions of the components in the drawings and the positions between the components, which are only for the clarity and convenience of expressing the technical solution. It should be understood that the use of the locative words should not limit the scope of protection claimed in this application.

在不冲突的情况下,本文中上述实施例及实施例中的特征可以相互结合。The above-described embodiments and features of the embodiments herein may be combined with each other without conflict.

以上所述仅为本发明的较佳实施例,并不用以限制本发明,凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included in the protection of the present invention. within the range.

Claims (6)

1.一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:包括如下步骤:1. a county-level crop yield estimation method based on CNN-LSTM, is characterized in that: comprise the steps: S1、数据的获取和处理:分别获取待估产区目标年份的农作物遥感数据RS、环境遥感数据ENV、农作物分类数据D1和县级大豆产量数据D3多幅,以及县域边界矢量数据D2,并分别对同一年份的多幅所述农作物遥感数据RS和多幅所述环境遥感数据ENV进行预处理,以分别得到每年的多时相农作物遥感数据集IRS和环境遥感数据集IENVS1. Data acquisition and processing: respectively obtain the crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1, county-level soybean yield data D3, and county boundary vector data D2 of the target year of the production area to be estimated, and analyze them respectively. A plurality of described crop remote sensing data RS and a plurality of described environmental remote sensing data ENV of the same year are preprocessed to obtain the annual multi-temporal crop remote sensing data set I RS and environmental remote sensing data set I ENV respectively; S2、数据的叠加和滤除:将S1中的所述多时相农作物遥感数据集IRS和对应年份的所述环境遥感数据集IENV进行叠加合成处理,以得到每年的数据总集IRS&ENV,结合S1中所述农作物分类数据D1,对对应年份的所述数据总集IRS&ENV进行非目标作物像素滤除,以得到每年的仅包含目标作物像素信息的数据总集DIRS&ENVS2. Superposition and filtering of data: The multi-temporal crop remote sensing data set I RS in S1 and the environmental remote sensing data set I ENV of the corresponding year are superimposed and synthesized to obtain an annual data set I RS & ENV , in conjunction with the crop classification data D1 described in S1, filter out the non-target crop pixels on the data aggregate I RS & ENV of the corresponding year, to obtain an annual data aggregate DI RS &ENV; S3、获取县级尺度的特征张量数据:利用S1中的所述县域边界矢量数据D2对每幅所述数据总集DIRS&ENV进行裁剪,以得到每个县域每年的所述数据总集DIRS&ENV,统计每个县域每年所述数据总集DIRS&ENV中各波段的波段像素直方图分布,并将每个县域每年所述数据总集DIRS&ENV对应的波段像素直方图转换为深度学习张量,即可得到每个县域每年的特征张量数据;S3. Obtain county-level feature tensor data: Use the county boundary vector data D2 in S1 to cut each of the data sets DI RS & ENV to obtain the annual data set for each county DI RS & ENV , calculate the distribution of the band pixel histogram of each band in the data set DI RS & ENV for each county each year, and calculate the band pixel histogram corresponding to the data set DI RS & ENV for each county each year Convert to deep learning tensor to get the annual feature tensor data of each county; S4、构建和训练CNN-LSTM模型:构建CNN-LSTM模型,利用S3中的多个所述特征张量数据和S1中对应县域对应年份所述县级大豆产量数据D3对CNN-LSTM模型进行训练;S4. Build and train a CNN-LSTM model: build a CNN-LSTM model, and train the CNN-LSTM model by using a plurality of the feature tensor data in S3 and the county-level soybean yield data D3 in the corresponding year of the corresponding county in S1 ; S5:应用S4中训练的CNN-LSTM模型对目标作物进行估产,以得到待估产区具有县级尺度的目标作物产量。S5: Use the CNN-LSTM model trained in S4 to estimate the yield of the target crop, so as to obtain the target crop yield at the county level in the production area to be estimated. 2.根据权利要求1所述的一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:所述S1还包括:2. a kind of county-level crop yield estimation method based on CNN-LSTM according to claim 1, is characterized in that: described S1 also comprises: S11、对每幅所述农作物遥感数据RS进行去云处理,将经过去云处理的所述农作物遥感数据RS按年份进行归类,并对同一年份的多幅所述农作物遥感数据RS进行叠加合成处理,即可得到每年的所述多时相农作物遥感数据集IRSS11, carry out cloud removal processing to each described crop remote sensing data RS, classify the crop remote sensing data RS through the cloud removal processing by year, and superimpose and synthesize multiple described crop remote sensing data RS of the same year processing, the annual multi-temporal crop remote sensing dataset IRS can be obtained; 对同一年份的多幅所述环境遥感数据ENV进行叠加合成处理,即可得到每年的所述环境遥感数据集IENVThe multiple pieces of the environmental remote sensing data ENV in the same year are superimposed and synthesized to obtain the annual environmental remote sensing data set I ENV . 3.根据权利要求2所述的一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:所述ENV包括地表温度数据和天气参数数据,得到所述环境遥感数据集IENV还包括以下步骤:3. a kind of county-level crop yield estimation method based on CNN-LSTM according to claim 2, is characterized in that: described ENV comprises surface temperature data and weather parameter data, obtains described environmental remote sensing data set I ENV also comprises The following steps: S11a、将所述地表温度数据和所述天气参数数据分别与所述农作物遥感数据RS进行时相对齐,以分别得到对齐后的所述地表温度数据和所述天气参数数据;S11a, aligning the surface temperature data and the weather parameter data with the crop remote sensing data RS, respectively, to obtain the aligned surface temperature data and the weather parameter data; S11b、以年份单位,分别对S11a中对齐后的所述地表温度数据和所述天气参数数据进行处理,以分别得到每年的多时相地表温度数据集和每年的多时相天气参数数据集;S11b, in units of years, respectively process the aligned surface temperature data and the weather parameter data in S11a to obtain an annual multi-temporal surface temperature data set and an annual multi-temporal weather parameter data set; S11c、将S13中的每年的多时相地表温度数据集与对应年份的多时相天气参数数据集进行处理,即可得到对应年份所述环境遥感数据集IENVS11c: Process the annual multi-temporal surface temperature data set in S13 and the multi-temporal weather parameter data set of the corresponding year to obtain the environmental remote sensing data set I ENV of the corresponding year. 4.根据权利要求1所述的一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:所述S3还包括:4. a kind of county-level crop yield estimation method based on CNN-LSTM according to claim 1, is characterized in that: described S3 also comprises: S31、确认各波段中目标作物像素数据的真实分布界限:分别对待估产区内每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据统计形成对应的波段像素直方图,得到每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段的目标作物像素数据的最大值和最小值,以各波段目标作物像素数据的最大值和最小值为界限,将每幅所述农作物遥感数据RS和所述环境遥感数据ENV中各波段目标作物像素数据范围分为多个区间;S31, confirm the true distribution limit of the target crop pixel data in each band: the target crop pixel data of each band in each of the crop remote sensing data RS and the environmental remote sensing data ENV in the production area to be estimated respectively form a corresponding band pixel histogram Figure, obtain the maximum value and the minimum value of the target crop pixel data of each waveband in each described crop remote sensing data RS and described environmental remote sensing data ENV, take the maximum value and minimum value of each waveband target crop pixel data as the limit, the The target crop pixel data range of each band in each of the crop remote sensing data RS and the environmental remote sensing data ENV is divided into multiple intervals; S32、利用S1中的所述县域边界矢量数据D2分别对S2中的所述数据总集DIRS&ENV进行裁剪,以得到每个县域每年的所述数据总集DIRS&ENV,统计每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素直方图,以S31中得到各波段的区间,将每个县域每年的所述数据总集DIRS&ENV中各波段的波段像素转换至对应波段的区间内,即可得到每个县域每年对应的特征张量数据。S32, using the county boundary vector data D2 in S1 to cut the data collection DI RS & ENV in S2 respectively, to obtain the annual data collection DI RS & ENV of each county, count each The band pixel histogram of each band in the annual data set DI RS & ENV of the county area, the interval of each band is obtained in S31, and the bands of each band in the annual data set DI RS & ENV of each county area are obtained. Convert the pixels to the corresponding band range, and you can get the feature tensor data corresponding to each county each year. 5.根据权利要求1所述的一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:所述CNN-LSTM模型包括CNN和LSTM,其中,CNN用于对特征张量数据进行特征提取,LSTM用于学习CNN提取的特征。5. a county-level crop yield estimation method based on CNN-LSTM according to claim 1, is characterized in that: described CNN-LSTM model comprises CNN and LSTM, wherein, CNN is used for characteristic tensor data to be characterized Extraction, LSTM is used to learn the features extracted by CNN. 6.根据权利要求1所述的一种基于CNN-LSTM的县级尺度农作物估产方法,其特征在于:所述S4还包括以下步骤:6. a kind of county-level crop yield estimation method based on CNN-LSTM according to claim 1, is characterized in that: described S4 also comprises the following steps: S41、设置训练参数:将所述CNN-LSTM模型的验证数据分割比例设置为0.2,训练周期设置为100次,数据块大小设置为16;S41, set training parameters: set the verification data split ratio of the CNN-LSTM model to 0.2, the training period to 100 times, and the data block size to 16; S42、向S41中设置后的CNN-LSTM模型输入多个特征张量数据,并利用通过keras中的fit函数对模型进行训练,以S1中对应县域对应年份的所述县级大豆产量数据D3为训练数据,对CNN-LSTM模型进行训练。S42. Input multiple feature tensor data into the CNN-LSTM model set in S41, and use the fit function in keras to train the model. The county-level soybean yield data D3 in the corresponding year of the corresponding county in S1 is Training data to train the CNN-LSTM model.
CN201910954014.XA 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM Expired - Fee Related CN110728446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910954014.XA CN110728446B (en) 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910954014.XA CN110728446B (en) 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM

Publications (2)

Publication Number Publication Date
CN110728446A true CN110728446A (en) 2020-01-24
CN110728446B CN110728446B (en) 2022-04-01

Family

ID=69220854

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910954014.XA Expired - Fee Related CN110728446B (en) 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM

Country Status (1)

Country Link
CN (1) CN110728446B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950994A (en) * 2020-09-03 2020-11-17 深圳市不动产评估中心(深圳市地质环境监测中心) Geological environment and monitoring information management method, system, terminal and storage medium
CN112258523A (en) * 2020-10-20 2021-01-22 中国石油大学(华东) Method for finely extracting enteromorpha coverage information of medium-low resolution remote sensing image
CN112946187A (en) * 2021-01-22 2021-06-11 西安科技大学 Refuge chamber real-time state monitoring method based on neural network
CN113378476A (en) * 2021-06-28 2021-09-10 武汉大学 Global 250-meter resolution space-time continuous leaf area index satellite product generation method
CN114004389A (en) * 2021-09-18 2022-02-01 苏州憨云智能科技有限公司 Hybrid method for crop yield prediction
CN114529097A (en) * 2022-02-26 2022-05-24 黑龙江八一农垦大学 Multi-scale crop phenological period remote sensing dimensionality reduction prediction method
CN117592619A (en) * 2023-12-18 2024-02-23 中国科学院空天信息创新研究院 A county-level winter wheat yield estimation analysis method and system based on GNN-LSTM
CN118735140A (en) * 2024-09-03 2024-10-01 武汉大学 A crop yield inversion method and system based on fusion network model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916337A (en) * 2010-08-23 2010-12-15 湖南大学 A Dynamic Prediction Method of Rice Production Potential Based on Geographic Information System
CN108287926A (en) * 2018-03-02 2018-07-17 宿州学院 A kind of multi-source heterogeneous big data acquisition of Agro-ecology, processing and analysis framework
US20180211156A1 (en) * 2017-01-26 2018-07-26 The Climate Corporation Crop yield estimation using agronomic neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916337A (en) * 2010-08-23 2010-12-15 湖南大学 A Dynamic Prediction Method of Rice Production Potential Based on Geographic Information System
US20180211156A1 (en) * 2017-01-26 2018-07-26 The Climate Corporation Crop yield estimation using agronomic neural network
CN108287926A (en) * 2018-03-02 2018-07-17 宿州学院 A kind of multi-source heterogeneous big data acquisition of Agro-ecology, processing and analysis framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孙杰等: "Multilevel Deep Learning Network for County-Level Corn Yield Estimation in the U.S. Corn Belt", 《IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950994A (en) * 2020-09-03 2020-11-17 深圳市不动产评估中心(深圳市地质环境监测中心) Geological environment and monitoring information management method, system, terminal and storage medium
CN112258523A (en) * 2020-10-20 2021-01-22 中国石油大学(华东) Method for finely extracting enteromorpha coverage information of medium-low resolution remote sensing image
CN112258523B (en) * 2020-10-20 2022-03-08 中国石油大学(华东) Method for finely extracting enteromorpha coverage information of medium-low resolution remote sensing image
CN112946187A (en) * 2021-01-22 2021-06-11 西安科技大学 Refuge chamber real-time state monitoring method based on neural network
CN113378476A (en) * 2021-06-28 2021-09-10 武汉大学 Global 250-meter resolution space-time continuous leaf area index satellite product generation method
CN114004389A (en) * 2021-09-18 2022-02-01 苏州憨云智能科技有限公司 Hybrid method for crop yield prediction
CN114529097A (en) * 2022-02-26 2022-05-24 黑龙江八一农垦大学 Multi-scale crop phenological period remote sensing dimensionality reduction prediction method
CN117592619A (en) * 2023-12-18 2024-02-23 中国科学院空天信息创新研究院 A county-level winter wheat yield estimation analysis method and system based on GNN-LSTM
CN117592619B (en) * 2023-12-18 2024-08-30 中国科学院空天信息创新研究院 GNN-LSTM-based county winter wheat estimated yield analysis method and system
CN118735140A (en) * 2024-09-03 2024-10-01 武汉大学 A crop yield inversion method and system based on fusion network model
CN118735140B (en) * 2024-09-03 2024-11-22 武汉大学 Crop yield inversion method and system based on converged network model

Also Published As

Publication number Publication date
CN110728446B (en) 2022-04-01

Similar Documents

Publication Publication Date Title
CN110728446A (en) County scale crop yield estimation method based on CNN-LSTM
US10755129B2 (en) Disease recognition from images having a large field of view
US11615276B2 (en) Detection of plant diseases with multi-stage, multi-scale deep learning
JP7362751B2 (en) Detecting plant disease infections by classifying plant photos
CA3013215C (en) Modeling trends in crop yields
AU2017308526B2 (en) Automatically detecting outlier values in harvested data
WO2023108213A1 (en) Methods and systems for classifying and benchmarking irrigation performance
AU2019365219B2 (en) Detecting infection of plant diseases with improved machine learning
AU2017310240A1 (en) Delineating management zones based on historical yield maps
Shah et al. Crop yield prediction using remote sensing and meteorological data
CN118521969B (en) Monitoring method for rice seed withdrawal risk
CN202798987U (en) Device automatically collecting crop image and calculating crop growth
Jaslam et al. EBLUP estimate of crop yield at sub-district level in Hisar district, Haryana, India using MODIS/Terra data
Miao et al. Native grassland conversion: the roles of risk intervention and switching costs
Lobell et al. Landsat-Based Reconstruction of Corn and Soybean Yield Histories in the United States Since 1999
Vire Characterizing Spatiotemporal Patterns of Soybean Phenology in the Arkansas Delta Region Using Multi-Source Remotely Sensed Data from 2002 to 2020
Lee Effects of Wet Spring on Prevented Planting

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220401