CN110728446B - County scale crop yield estimation method based on CNN-LSTM - Google Patents

County scale crop yield estimation method based on CNN-LSTM

Info

Publication number
CN110728446B
CN110728446B (application number CN201910954014.XA)
Authority
CN
China
Prior art keywords
data
remote sensing
crop
county
cnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910954014.XA
Other languages
Chinese (zh)
Other versions
CN110728446A (en)
Inventor
孙杰
赖祖龙
陈性义
余俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN201910954014.XA priority Critical patent/CN110728446B/en
Publication of CN110728446A publication Critical patent/CN110728446A/en
Application granted granted Critical
Publication of CN110728446B publication Critical patent/CN110728446B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
        • G01 MEASURING; TESTING
            • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
                • G01N 21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
                    • G01N 21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
                        • G01N 2021/1765 Method using an image detector and processing of image signal
                        • G01N 2021/178 Methods for obtaining spatial resolution of the property being measured
                        • G01N 2021/1793 Remote sensing
                            • G01N 2021/1797 Remote sensing in landscape, e.g. crops
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/044 Recurrent networks, e.g. Hopfield networks
                            • G06N 3/045 Combinations of networks
                        • G06N 3/08 Learning methods
            • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
                • G06Q 10/00 Administration; Management
                    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
                    • G06Q 10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
                        • G06Q 10/063 Operations research, analysis or management
                            • G06Q 10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
                                • G06Q 10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
                • G06Q 50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
                    • G06Q 50/02 Agriculture; Fishing; Forestry; Mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Analytical Chemistry (AREA)
  • Primary Health Care (AREA)
  • Mining & Mineral Resources (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Biochemistry (AREA)

Abstract

The invention relates to the technical field of remote sensing image information extraction, and in particular to a county-scale crop yield estimation method based on CNN-LSTM, comprising the following steps: S1, data acquisition and processing; S2, data stacking and filtering; S3, acquisition of county-scale feature tensor data; S4, construction and training of a CNN-LSTM model; and S5, application of the CNN-LSTM model trained in S4 to estimate the yield of the target crop. Based on remote sensing data reflecting crop growth status and environmental data affecting crop growth, the method extracts richer county-level crop features by histogram statistics and converts them into tensors, with which the constructed CNN-LSTM model is trained, effectively improving the accuracy of small-scale (county-level) crop yield estimation.

Description

County scale crop yield estimation method based on CNN-LSTM
Technical Field
The invention relates to the technical field of remote sensing image information extraction, in particular to a county scale crop yield estimation method based on CNN-LSTM.
Background
Crop yield is the most important indicator in agriculture and is closely tied to human society. Yield prediction is one of the most challenging tasks in precision agriculture and is of great importance for yield mapping, crop market planning, crop insurance and harvest management.
Remote sensing technology has been widely used in crop assessment, since various kinds of relevant information can be extracted from remote sensing data to assist yield estimation. In particular, vegetation indices (VIs) such as the normalized difference vegetation index (NDVI) have been used extensively. Other indices, such as the green leaf area index (GLAI), crop water stress index (CWSI), normalized difference water index (NDWI), green vegetation index (GVI) and soil-adjusted vegetation index (SAVI), have also been applied to crop yield prediction. In addition, meteorological variables such as precipitation and air temperature, together with soil condition data such as soil moisture and temperature, are commonly used in yield prediction as indicators of the crop growing environment. Such feature extraction methods typically rely on computing regional means and therefore discard detailed features.
Based on remote sensing data, there are two main approaches to crop yield prediction: crop simulation models and empirical statistical models. Although crop simulation models accurately represent the physical processes of crop growth, they are difficult to apply over large spatio-temporal ranges because of insufficient data. In contrast, empirical statistical models are simple and require less input data, and have therefore been widely used as a general alternative to process-based models. Machine learning algorithms, including support vector machines (SVMs), decision trees (DTs), multilayer perceptrons (MLPs) and restricted Boltzmann machines (RBMs), can provide alternatives to traditional regression methods and overcome many of their limitations. Artificial neural networks (ANNs) have also been used as surrogate models; the traditional artificial neural network, i.e. the multilayer perceptron, has been successfully applied to yield estimation for various crops.
In recent years, deep learning (DL) has been regarded as a breakthrough technique in machine learning and data mining, and has been applied to agricultural remote sensing. Several DL algorithms, including stacked sparse auto-encoders (SSAE), convolutional neural networks (CNN) and recurrent neural networks (RNN), have been used for yield prediction. It is generally accepted that CNNs can exploit richer spatial features, while the LSTM, as a special RNN, can capture phenological (temporal) features; however, little attention has been paid to combining the advantages of CNNs and LSTMs for yield prediction.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a county-scale crop yield estimation method based on CNN-LSTM. The method provides a CNN-LSTM model for county-level end-of-season crop yield prediction.
The CNN-LSTM-based county-scale crop yield estimation method provided by the invention comprises the following steps:
S1, acquiring and processing data: acquiring, for the target years and the area whose yield is to be estimated, multiple pieces of crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1, county-level soybean yield data D3 and county boundary vector data D2, and preprocessing the crop remote sensing data RS and environmental remote sensing data ENV of each year to obtain the annual multi-temporal crop remote sensing data set I_RS and environmental remote sensing data set I_ENV;
S2, data stacking and filtering: stacking and compositing the multi-temporal crop remote sensing data set I_RS from S1 with the I_ENV of the corresponding year to obtain the annual combined data set I_RS&ENV; then, using the crop classification data D1 of the corresponding year from S1, filtering out non-target-crop pixels from I_RS&ENV to obtain the annual data set DI_RS&ENV containing only target-crop pixel information;
S3, acquiring county-scale feature tensor data: clipping each data set DI_RS&ENV with the county boundary vector data D2 from S1 to obtain the data set DI_RS&ENV of each county for each year; computing the per-band pixel histogram distribution of each county's annual DI_RS&ENV; and converting the corresponding per-band pixel histograms into a deep-learning tensor, thereby obtaining the annual feature tensor data of each county;
S4, constructing and training a CNN-LSTM model: constructing a CNN-LSTM model, training it with the feature tensor data from S3 and the county-level soybean yield data D3 of the corresponding counties and years from S1, and evaluating the model accuracy;
S5, applying the CNN-LSTM model trained in S4 to estimate the yield of the target crop.
Further, the crop remote sensing data RS comprise t temporal phases, each with m bands, and the environmental remote sensing data ENV comprise u temporal phases, each with n bands.
Further, the S1 further includes:
S11, performing cloud removal on each piece of crop remote sensing data RS, grouping the cloud-removed crop remote sensing data RS by year, and stacking and compositing the pieces of crop remote sensing data RS of the same year to obtain the multi-temporal crop remote sensing data set I_RS for each year;
stacking and compositing the pieces of environmental remote sensing data ENV of the same year to obtain the environmental remote sensing data set I_ENV for each year.
Further, the environmental remote sensing data ENV comprise surface temperature data and weather parameter data, and obtaining the environmental remote sensing data set I_ENV further comprises the following steps:
S11a, temporally aligning the surface temperature data and the weather parameter data with the crop remote sensing data RS to obtain aligned surface temperature data and aligned weather parameter data;
S11b, processing the surface temperature data and weather parameter data aligned in step S11a on a per-year basis to obtain the multi-temporal surface temperature data set and the multi-temporal weather parameter data set for each year;
S11c, processing the annual multi-temporal surface temperature data set from S11b together with the multi-temporal weather parameter data set of the corresponding year to obtain the environmental remote sensing data set I_ENV of the corresponding year.
Further, the combined data set I_RS&ENV in S2 has t temporal phases, each with m + n bands.
Further, the S3 further includes:
S31, determining the true distribution bounds of the target-crop pixel data in each band: for each band of each piece of crop remote sensing data RS and environmental remote sensing data ENV within the area to be estimated, computing the histogram of the target-crop pixel data to form the corresponding band pixel histogram, obtaining the maximum and minimum values of the target-crop pixel data of that band, and dividing the target-crop pixel-value range of each band into w intervals bounded by its maximum and minimum values;
S32, clipping the data set DI_RS&ENV from S2 with the county boundary vector data D2 from S1 to obtain the data set DI_RS&ENV of each county for each year; computing the per-band pixel histograms of each county's annual DI_RS&ENV by assigning the band pixels of each band to the intervals of that band obtained in S31, thereby obtaining the feature tensor data of each county for each year, with shape t × (m + n) × 1 × w.
Further, the S4 further includes the following steps:
S41, setting training parameters: setting the validation split of the CNN-LSTM model to 0.2, the number of training epochs to 100, and the batch size to 16;
S42, feeding the feature tensor data into the configured CNN-LSTM model, training the model with the fit function in Keras, and using the county-level soybean yield data D3 of the corresponding counties and years from S1 as the training labels.
The technical solution provided by the invention has the following beneficial effects: based on remote sensing data reflecting crop growth status and environmental data affecting crop growth, the method extracts richer county-level crop features by histogram statistics and converts them into tensors, with which the constructed CNN-LSTM model is trained; this effectively improves the accuracy of small-scale (county-level) crop yield estimation and is of great significance for crop yield estimation, crop market planning, crop insurance and harvest management.
Drawings
FIG. 1 is a flow chart of a CNN-LSTM-based county scale crop production assessment method according to the present invention;
FIG. 2 is a diagram of the CNN-LSTM framework proposed by the present invention;
FIG. 3 is a schematic diagram of tensors of a temporal phase data feature in a certain county;
FIG. 4 is a comparison of the soybean yield map estimated by the method of the present invention with the officially published yield map;
FIG. 5 shows the accuracy of the 2011-2015 soybean yield predictions obtained by the method of the present invention;
FIG. 6 is a graph of 5-year prediction values versus observed values.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
In the invention, 15 states in the central United States are taken as the case study area; soybean yield for 2011-2015 is estimated using remote sensing data and historical yield data from 2003-2015 as data sources, and the estimates are compared with official statistics.
Selection of the area to be estimated: according to the soybean planting distribution published by the United States Department of Agriculture (USDA), soybeans are planted in 31 states; in this case study, 15 states were selected as examples, namely North Dakota, South Dakota, Nebraska, Minnesota, Iowa, Kansas, Missouri, Arkansas, Mississippi, Tennessee, Illinois, Indiana, Ohio, Michigan and Wisconsin. The soybean planting area of these 15 states accounts for 88.75 percent of the national soybean planting area. The temporal phases of the remote sensing data are selected from April to December according to the soybean growing period.
In this specification, the target crop referred to hereinafter is soybean.
Acquiring the data of the area to be estimated includes:
1. County-level soybean yield data: county-level soybean yield data from 2003 to 2015, in kg/ha, were collected from the U.S. Department of Agriculture, obtained from: https://www.nass.usda.gov/Quick_Stats/Lite/index.php;
2. Crop classification remote sensing data: crop classification remote sensing data from 2003 to 2015, one layer per year, were collected from the Cropland Data Layer (CDL) issued by the U.S. Department of Agriculture, a crop-specific land cover data layer with a resolution of 30 m, obtained from: https://nassgeodata.gmu.edu/CropScape/;
3. County boundary vector data of the area to be estimated: in shapefile (shp) format, obtained from: https://catalog.data.gov/dataset/tiger-line-shape-2016-nation-u-s-current-count-and-equivalent-national-shape;
4. Crop remote sensing data: the crop remote sensing data (RS) for the soybean growing seasons from 2003 to 2015 were downloaded from https://doi.org/10.5067/MODIS/MOD09A1.006. The crop remote sensing data are the MOD09A1 V6 product; the RS of each year provides estimates of the surface spectral reflectance of Terra MODIS bands 1-7 at 500 m resolution, corrected for atmospheric conditions such as gases, aerosols and Rayleigh scattering, so that in each RS each pixel contains the best observation of its period, selected on the basis of high observation coverage, low view angle, absence of cloud or cloud shadow, and low aerosol loading.
5. Environmental remote sensing data: the environmental data includes surface temperature data and weather parameter data. Surface temperature data and weather parameter data for the soybean growth period from 2003 to 2015 were downloaded from a designated website.
The surface temperature data use the MOD11A2 V6 product, obtained from https://doi.org/10.5067/MODIS/MOD11A2.006. The product provides an 8-day average land surface temperature (LST) on a 1 km by 1 km grid; each pixel value in MOD11A2 is a simple average of all corresponding MOD11A1 LST pixels collected over the 8-day period. Both the daytime and nighttime LST bands were used as factors characterizing the soil thermal environment.
The weather parameter data were obtained from the Daymet product, a collection of gridded estimates of daily weather parameters generated by interpolation and extrapolation of daily weather observations, at the address https://daymet. Two important weather parameters in Daymet were selected in the present invention: precipitation and air pressure, at a resolution of 1 km.
Referring to fig. 1, a CNN-LSTM-based county scale crop production assessment method includes the following steps:
S1, acquiring and processing data: acquiring, as described above, the multiple pieces of crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1 and county-level soybean yield data D3 from 2003 to 2015, together with one piece of county boundary vector data D2, and preprocessing the RS and ENV data of each year to obtain the annual multi-temporal crop remote sensing data set I_RS and environmental remote sensing data set I_ENV.
The S1 further includes: S11, performing cloud removal on each piece of crop remote sensing data RS in Google Earth Engine, grouping the cloud-removed crop remote sensing data RS by year, and stacking and compositing the pieces of crop remote sensing data RS of the same year to obtain the multi-temporal crop remote sensing data set I_RS for each year;
stacking and compositing the pieces of environmental remote sensing data ENV of the same year to obtain the annual I_ENV.
During the soybean growing season, the multi-temporal crop remote sensing data set I_RS includes t3 temporal phases, each with m bands, and I_ENV includes t4 temporal phases, each with n bands, where t3 = t4 = 34, m = 7 and n = 4.
It should be noted that the annual multi-temporal crop remote sensing data set I_RS and environmental remote sensing data set I_ENV should cover all the crop remote sensing data RS, surface temperature data and weather parameter data of the target crop over the whole growing season. Because the revisit periods of the different remote sensing data differ, in the invention the remote sensing data with the longest revisit period are taken as the reference, and the other remote sensing data are aligned to them by temporally averaging their multi-temporal data; after alignment, the two data types, the crop remote sensing data RS and the environmental remote sensing data ENV, share a common temporal period of t2, where t2 = 8.
The specific operation is as follows: the product cycle of the crop remote sensing data RS is 8 days, so 34 images must be acquired per year within the growing season; the images acquired in the same year are grouped, stacked and composited (in Google Earth Engine) to obtain the annual multi-temporal crop remote sensing data set I_RS. The environmental data comprise surface temperature data and weather parameter data. The surface temperature data provide an 8-day average surface temperature, so, correspondingly, 34 surface temperature images are acquired per year within the growing season; the surface temperature images of the same year are grouped, stacked and composited (in Google Earth Engine) to obtain the annual multi-temporal surface temperature data set. Each day of weather parameter data corresponds to one temporal phase; since the product cycle of the crop remote sensing data RS and the surface temperature data is 8 days while that of the weather parameter data is 1 day, the weather parameter data are averaged and resampled to a unified 8-day cycle so that they are aligned with the crop remote sensing data RS and the surface temperature data. Therefore, within the growing season, the aligned weather parameter data also contain 34 temporal phases per year, and grouping the aligned weather parameter data of the same year yields the annual multi-temporal weather parameter data set. Combining the annual multi-temporal surface temperature data set with the annual multi-temporal weather parameter data set gives the annual environmental remote sensing data set I_ENV.
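As an illustration of this 8-day alignment, a minimal sketch using the Earth Engine Python API is given below; the Daymet collection ID, the choice of the precipitation band and the 1 April period start are assumptions made for the example, not values quoted from the patent.

```python
import ee

ee.Initialize()

# Sketch only: average daily Daymet weather images into the 34 eight-day
# periods of one growing season so they line up with the MOD09A1 / MOD11A2
# composites (collection ID, band and start date are assumptions).
year = 2015
start = ee.Date(f"{year}-04-01")
daymet = ee.ImageCollection("NASA/ORNL/DAYMET_V4").select(["prcp"])  # precipitation; other bands analogous

def eight_day_mean(i):
    """Mean of the daily weather images falling in the i-th 8-day period."""
    begin = start.advance(ee.Number(i).multiply(8), "day")
    end = begin.advance(8, "day")
    return (daymet.filterDate(begin, end)
            .mean()
            .set("system:time_start", begin.millis()))

# 34 eight-day composites per growing season, matching the MODIS phases
weather_8day = ee.ImageCollection.fromImages(ee.List.sequence(0, 33).map(eight_day_mean))
```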
S2, data stacking and filtering: stacking and compositing the multi-temporal crop remote sensing data set I_RS from S1 with the I_ENV of the corresponding year to obtain the annual combined data set I_RS&ENV; then, using the crop classification data D1 of the corresponding year from S1, filtering out non-target-crop pixels from I_RS&ENV to obtain the annual data set DI_RS&ENV containing only target-crop pixel information.
During the growing season, the combined data set I_RS&ENV and the filtered data set DI_RS&ENV each comprise t1 temporal phases, each with m + n bands, where t1 = 34 and m + n = 11.
S3, clipping each data set DI_RS&ENV with the county boundary vector data D2 from S1 to obtain the data set DI_RS&ENV of each county for each year; computing the per-band pixel histogram distribution of each county's annual DI_RS&ENV; and converting the corresponding per-band pixel histograms of each county's annual DI_RS&ENV into a deep-learning tensor, thereby obtaining the annual feature tensor data of each county.
the S3 further includes: s31, confirming the real distribution limit of the target crop pixel data in each wave band: respectively counting target crop pixel data of each wave band in each crop remote sensing data RS and the environment remote sensing data ENV in an estimated production area by using a ui, Chart, image and histogram function in a google earth engine, forming a corresponding wave band pixel histogram, obtaining the maximum value and the minimum value of the target crop pixel data of each wave band in each crop remote sensing data RS and the environment remote sensing data ENV, further obtaining the real distribution boundary of the target crop pixel data of each wave band in each crop remote sensing data RS and the environment remote sensing data ENV, and dividing the target crop pixel data range of each wave band in each crop remote sensing data RS and the environment remote sensing data ENV into w intervals by taking the maximum value and the minimum value of the target crop pixel data of each wave band in each crop remote sensing data RS and the environment remote sensing data ENV as boundaries;
S32, clipping the data set DI_RS&ENV from S2 with the county boundary vector data D2 from S1 to obtain the data set DI_RS&ENV of each county for each year; counting, with the ui.Chart.image.histogram function in Google Earth Engine, the per-band pixel histograms of each county's annual DI_RS&ENV; and, with the ee.Reducer.fixedHistogram function in Google Earth Engine, assigning the band pixels of each band to the intervals of that band obtained in S31, thereby obtaining the feature tensor data of each county for each year, with shape t × (m + n) × 1 × w.
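For illustration only, the same fixed-bin histogram step can be sketched with the Earth Engine Python API (the patent works with the code-editor functions named above); the function and variable names, the 500 m scale and the maxPixels value are assumptions made for the example.

```python
import ee

ee.Initialize()

W = 32  # number of histogram intervals per band (w in the text)

def county_band_histogram(composite, county_geom, band, band_min, band_max):
    """Count, for one band of one 8-day composite already masked to the target
    crop (DI_RS&ENV), how many pixels fall into each of the W fixed intervals
    inside one county polygon."""
    hist = composite.select(band).reduceRegion(
        reducer=ee.Reducer.fixedHistogram(band_min, band_max, W),
        geometry=county_geom,
        scale=500,        # MOD09A1 resolution in metres
        maxPixels=1e13,
    ).get(band)
    # fixedHistogram returns rows of [interval_start, pixel_count]; keep the counts
    return [row[1] for row in ee.Array(hist).getInfo()]
```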
It should be noted that the method assumes the yield can be estimated from the crop pixel information alone, regardless of the geographic location of each pixel, so the image can be reconstructed by counting the per-band pixel histograms; that is, the histogram can represent the original image. For a given number of intervals, the wider the value range of the target-crop pixel data of a band, the wider the resulting histogram intervals; and the theoretical value range of each band in the remote sensing data is usually far larger than the actual one, so, in order to generate regularized feature tensor data, the true distribution bounds of the target-crop pixel data in each band must be determined before the histogram conversion of the band pixels. The purpose of S31 is therefore to obtain the true distribution of the target-crop pixel values in each band, and hence the minimum and maximum values of the target-crop pixel data of each band of the crop remote sensing data RS and environmental remote sensing data ENV, so that regularized tensors can be generated. In the invention, the per-band pixel histograms are computed by histogram conversion, as follows: the target-crop pixel-value range of each band of the crop remote sensing data RS and environmental remote sensing data ENV is divided into w intervals bounded by that band's maximum and minimum target-crop pixel values; the target-crop pixel values of each band are then assigned to the corresponding w intervals, and the number of target-crop pixels in each interval is counted, so that each band corresponds to a 1 × w feature vector. After the per-band pixel histograms of each county's annual data set DI_RS&ENV have been converted into the intervals of the corresponding bands, each county's annual DI_RS&ENV corresponds to a tensor T_RS&ENV whose feature tensor shape is t × (m + n) × 1 × w; in the invention w = 32 and t = t3 = t4 = 34, i.e. the feature tensor data of each county for each year has shape 34 × 11 × 1 × 32. It should be noted that w = 32 is only one specific example of the invention; in practical applications w may also be 64, 128, and so on.
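To make the tensor construction concrete, a minimal NumPy sketch is given below; the array and function names are illustrative, and the normalisation of each histogram to fractions is an assumption added for the example rather than a step stated in the text.

```python
import numpy as np

T, B, W = 34, 11, 32   # time phases, bands (7 RS + 4 ENV), histogram intervals

def build_feature_tensor(histograms: np.ndarray) -> np.ndarray:
    """histograms: raw per-band pixel counts of one county-year, shape (T, B, W)."""
    counts = histograms.astype(np.float32)
    totals = counts.sum(axis=-1, keepdims=True)
    totals[totals == 0] = 1.0                # guard against bands with no target-crop pixels
    fractions = counts / totals              # normalisation (assumption, see lead-in)
    return fractions.reshape(T, B, 1, W)     # -> t x (m+n) x 1 x w = (34, 11, 1, 32)

county_hists = np.random.randint(0, 500, size=(T, B, W))   # placeholder counts
tensor = build_feature_tensor(county_hists)
print(tensor.shape)   # (34, 11, 1, 32)
```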
FIG. 3 is a schematic diagram of the bands of one temporal phase for Marion County, Kansas, converted into 32-interval histograms.
S4, constructing a CNN-LSTM model, training the CNN-LSTM model with the feature tensor data from S3 and the county-level soybean yield data D3 of the corresponding counties and years from S1, and evaluating the model accuracy; when the accuracy of the trained model meets the requirement, the final CNN-LSTM model is obtained, and if it does not, the number of training iterations is increased. The CNN-LSTM model comprises a CNN part and an LSTM part: the CNN extracts features from the feature tensor data, and the LSTM learns the features extracted by the CNN. During training, early stopping is used to monitor the val_loss metric, and training stops if this metric has not improved for 15 consecutive epochs.
The CNN contains two Conv2D layers, the first with 32 filters and the second with 64 filters, each with a convolution kernel of size 1 × 2. Each convolutional layer is followed by a pooling layer with kernel size 1 × 2 and by batch normalization. The two stacked Conv2D layers are applied to each time slice of the input through a TimeDistributed wrapper for feature extraction. The time-sequence tensor produced by the Conv2D layers is then flattened and batch-normalized before being fed into the LSTM layer. The LSTM part contains a single LSTM layer whose number of neurons is set to 256, followed by a dense layer of 64 neurons. All time-step outputs are then flattened into one long vector, which is passed to a Dropout layer with a dropout rate of 0.5; the Dropout layer randomly deactivates a fraction of the neurons during training, which helps prevent overfitting. Finally, a dense layer with a single neuron outputs the predicted yield. The activation function of the model is the rectified linear unit (ReLU). The model is built with Keras. The structure of the CNN-LSTM model constructed by the invention is shown in FIG. 2.
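A rough Keras (tf.keras) sketch of the architecture described above is given below. It is a reconstruction, not code from the patent: in particular, each time slice is laid out as (bands, bins, 1) so that the 1 × 2 kernels and pools slide along the histogram-interval axis, which is my reading of the stated t × (m + n) × 1 × w tensor shape.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

T, BANDS, BINS = 34, 11, 32   # time phases, bands, histogram intervals

def build_cnn_lstm() -> tf.keras.Model:
    inputs = layers.Input(shape=(T, BANDS, BINS, 1))

    # CNN part: two stacked Conv2D blocks applied to every 8-day time slice
    cnn = models.Sequential([
        layers.Conv2D(32, (1, 2), activation="relu", input_shape=(BANDS, BINS, 1)),
        layers.MaxPooling2D(pool_size=(1, 2)),
        layers.BatchNormalization(),
        layers.Conv2D(64, (1, 2), activation="relu"),
        layers.MaxPooling2D(pool_size=(1, 2)),
        layers.BatchNormalization(),
        layers.Flatten(),
    ])

    x = layers.TimeDistributed(cnn)(inputs)          # per-phase feature vectors
    x = layers.BatchNormalization()(x)
    x = layers.LSTM(256, return_sequences=True)(x)   # single LSTM layer, 256 units
    x = layers.Dense(64, activation="relu")(x)       # dense layer after the LSTM
    x = layers.Flatten()(x)                          # concatenate all time-step outputs
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(1)(x)                     # predicted county-level yield
    return models.Model(inputs, outputs)

model = build_cnn_lstm()
model.summary()
```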
Training the CNN-LSTM model comprises the following steps:
S41, setting training parameters: setting the validation split (validation_split) of the CNN-LSTM model to 0.2, the number of training epochs to 100, and the batch size to 16;
S42, feeding the feature tensor data into the configured CNN-LSTM model, training the model with the fit function in Keras, and using the county-level soybean yield data D3 of the corresponding counties and years from S1 as the training labels.
Here, the input of the CNN-LSTM model is the feature tensor data of each county for each year, the label data are the historical yield data of the corresponding county and year from S1, and the output of the model is the predicted yield of the corresponding county and year. The data are fed into the CNN-LSTM model and the model is trained with the fit function in Keras, using the historical yield data of the corresponding county and year from S1 as training data. The validation split (validation_split) of the CNN-LSTM model is set to 0.2, so that 20% of the training data are randomly separated as validation data; the training process runs for 100 epochs in total, the batch size is set to 16, and mean squared error is used as the training metric. To improve the practical performance, early stopping is used to reduce the generalization error of the deep learning system: when the monitored val_loss metric has stopped improving for 10 consecutive epochs, training ends and the model is saved. The optimizer is ADAM (adaptive moment estimation).
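For illustration, the training settings quoted above can be sketched as follows, continuing from the build_cnn_lstm() sketch given earlier; the arrays X and y and the sample count are placeholders, and the early-stopping patience of 10 epochs follows this paragraph (the model description above mentions 15).

```python
import numpy as np
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

T, BANDS, BINS = 34, 11, 32
X = np.random.rand(800, T, BANDS, BINS, 1).astype("float32")   # placeholder county-year tensors
y = (np.random.rand(800, 1) * 4000).astype("float32")          # placeholder yields in kg/ha

model = build_cnn_lstm()
model.compile(optimizer=Adam(), loss="mse")          # mean squared error, ADAM optimizer

early_stop = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)

model.fit(
    X, y,
    validation_split=0.2,   # 20% of the training data held out as validation data
    epochs=100,
    batch_size=16,
    callbacks=[early_stop],
)
```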
S5: and (4) estimating the yield of the target crop by applying the CNN-LSTM model trained in the S4 to obtain the yield of the target crop with county scale in the to-be-estimated production area.
In addition, in order to verify the effectiveness of the method, the method of the invention is first used to estimate the soybean yield of the 15 states for each year from 2011 to 2015, and the estimates are compared with the county-level yields published by the USDA for the corresponding years, with RMSE and R² as evaluation indices. The accuracy criteria are the coefficient of determination R² and the root mean square error RMSE: R² ranges over [0, 1], and the larger R², the better the model fit, while a smaller R² indicates a worse fit; the smaller the RMSE, the better the model fit.
The comparison of the yield estimates obtained by the method of the present invention with the results published by the USDA is shown in FIG. 4.
The RMSE obtained by evaluating the accuracy of each year's estimates is shown in FIG. 5.
Comparing the five years of predictions with the data published by the USDA gives R² = 0.78 (R² > 0.6 indicates a highly accurate prediction), as shown in FIG. 6, which demonstrates the effectiveness of the method.
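For reference, the two accuracy criteria can be computed as in the following sketch; the arrays hold placeholder values, not the patent's results.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# y_true: USDA-published county yields; y_pred: model estimates (placeholders here)
y_true = np.array([2850.0, 3100.0, 2400.0, 2975.0, 3320.0])
y_pred = np.array([2790.0, 3180.0, 2510.0, 2900.0, 3250.0])

rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))   # kg/ha, smaller is better
r2 = r2_score(y_true, y_pred)                               # closer to 1 is better
print(f"RMSE = {rmse:.1f} kg/ha, R2 = {r2:.3f}")
```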
In this document, the terms front, back, upper and lower are used to define the components in the drawings and the positions of the components relative to each other, and are used for clarity and convenience of the technical solution. It is to be understood that the use of the directional terms should not be taken to limit the scope of the claims.
The features of the embodiments described hereinabove may be combined with one another provided there is no conflict.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. A county-scale crop yield estimation method based on CNN-LSTM, characterized in that the method comprises the following steps:
S1, acquiring and processing data: acquiring, for the target years and the area whose yield is to be estimated, multiple pieces of crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1, county-level soybean yield data D3 and county boundary vector data D2, and preprocessing the crop remote sensing data RS and environmental remote sensing data ENV of each year to obtain the annual multi-temporal crop remote sensing data set I_RS and environmental remote sensing data set I_ENV;
S2, data stacking and filtering: stacking and compositing the multi-temporal crop remote sensing data set I_RS from S1 with the environmental remote sensing data set I_ENV of the corresponding year to obtain the annual combined data set I_RS&ENV; then, using the crop classification data D1 of the corresponding year from S1, filtering out non-target-crop pixels from I_RS&ENV to obtain the annual data set DI_RS&ENV containing only target-crop pixel information;
S3, acquiring county-scale feature tensor data: clipping each data set DI_RS&ENV with the county boundary vector data D2 from S1 to obtain the data set DI_RS&ENV of each county for each year; computing the per-band pixel histogram distribution of each county's annual DI_RS&ENV; and converting the corresponding per-band pixel histograms into a deep-learning tensor, thereby obtaining the annual feature tensor data of each county;
S4, constructing and training a CNN-LSTM model: constructing a CNN-LSTM model and training it with the feature tensor data from S3 and the county-level soybean yield data D3 of the corresponding counties and years from S1; specifically,
constructing the CNN-LSTM model: the CNN-LSTM model comprises two parts, a CNN and an LSTM; the CNN comprises two Conv2D layers, which are stacked and applied to each input time slice for feature extraction; the time-sequence tensor processed by the Conv2D layers is then flattened and batch-normalized before being fed into the LSTM layer; all time-step outputs are flattened into one long vector, which is passed to a Dropout layer with a dropout rate of 0.5; and finally a dense layer with a single neuron outputs the predicted yield;
training the CNN-LSTM model comprises the following steps:
S41, setting training parameters: setting the validation split of the CNN-LSTM model to 0.2, the number of training epochs to 100, and the batch size to 16;
S42, feeding the feature tensor data into the configured CNN-LSTM model, training the model with the fit function in Keras, and using the county-level soybean yield data D3 of the corresponding counties and years from S1 as the training labels;
S5: applying the CNN-LSTM model trained in S4 to estimate the yield of the target crop, obtaining the county-scale yield of the target crop in the area to be estimated.
2. The CNN-LSTM-based county-scale crop yield assessment method of claim 1, wherein: the S1 further includes:
S11, performing cloud removal on each piece of crop remote sensing data RS, grouping the cloud-removed crop remote sensing data RS by year, and stacking and compositing the pieces of crop remote sensing data RS of the same year to obtain the multi-temporal crop remote sensing data set I_RS for each year;
stacking and compositing the pieces of environmental remote sensing data ENV of the same year to obtain the environmental remote sensing data set I_ENV for each year.
3. The CNN-LSTM-based county-scale crop yield estimation method of claim 2, characterized in that: the environmental remote sensing data ENV comprise surface temperature data and weather parameter data, and obtaining the environmental remote sensing data set I_ENV further comprises the following steps:
S11a, temporally aligning the surface temperature data and the weather parameter data with the crop remote sensing data RS to obtain aligned surface temperature data and aligned weather parameter data;
S11b, processing the surface temperature data and weather parameter data aligned in S11a on a per-year basis to obtain the multi-temporal surface temperature data set and the multi-temporal weather parameter data set for each year;
S11c, processing the annual multi-temporal surface temperature data set from S11b together with the multi-temporal weather parameter data set of the corresponding year to obtain the environmental remote sensing data set I_ENV of the corresponding year.
4. The CNN-LSTM-based county-scale crop yield assessment method of claim 1, wherein: the S3 further includes:
S31, determining the true distribution bounds of the target-crop pixel data in each band: for each band of each piece of crop remote sensing data RS and environmental remote sensing data ENV within the area to be estimated, computing the histogram of the target-crop pixel data to form the corresponding band pixel histogram, obtaining the maximum and minimum values of the target-crop pixel data of that band, and dividing the target-crop pixel-value range of each band into a plurality of intervals bounded by its maximum and minimum values;
S32, clipping the data set DI_RS&ENV from S2 with the county boundary vector data D2 from S1 to obtain the data set DI_RS&ENV of each county for each year; computing the per-band pixel histograms of each county's annual DI_RS&ENV by assigning the band pixels of each band to the intervals of that band obtained in S31, thereby obtaining the feature tensor data corresponding to each county for each year.
5. The CNN-LSTM-based county-scale crop yield assessment method of claim 1, wherein: the CNN-LSTM model comprises CNN and LSTM, wherein the CNN is used for extracting features of the feature tensor data, and the LSTM is used for learning the features extracted by the CNN.
CN201910954014.XA 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM Active CN110728446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910954014.XA CN110728446B (en) 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910954014.XA CN110728446B (en) 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM

Publications (2)

Publication Number Publication Date
CN110728446A CN110728446A (en) 2020-01-24
CN110728446B true CN110728446B (en) 2022-04-01

Family

ID=69220854

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910954014.XA Active CN110728446B (en) 2019-10-09 2019-10-09 County scale crop yield estimation method based on CNN-LSTM

Country Status (1)

Country Link
CN (1) CN110728446B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950994B (en) * 2020-09-03 2021-03-23 深圳市不动产评估中心(深圳市地质环境监测中心) Geological environment and monitoring information management method, system, terminal and storage medium
CN112258523B (en) * 2020-10-20 2022-03-08 中国石油大学(华东) Method for finely extracting enteromorpha coverage information of medium-low resolution remote sensing image
CN112946187B (en) * 2021-01-22 2023-04-07 西安科技大学 Refuge chamber real-time state monitoring method based on neural network
CN113378476B (en) * 2021-06-28 2022-07-19 武汉大学 Global 250-meter resolution space-time continuous leaf area index satellite product generation method
CN114529097B (en) * 2022-02-26 2023-01-24 黑龙江八一农垦大学 Multi-scale crop phenological period remote sensing dimensionality reduction prediction method
CN117592619A (en) * 2023-12-18 2024-02-23 中国科学院空天信息创新研究院 GNN-LSTM-based county winter wheat estimated yield analysis method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916337A (en) * 2010-08-23 2010-12-15 湖南大学 Method for dynamically predicting potential productivity of paddy rice based on geographical information system
CN108287926A (en) * 2018-03-02 2018-07-17 宿州学院 A kind of multi-source heterogeneous big data acquisition of Agro-ecology, processing and analysis framework

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10699185B2 (en) * 2017-01-26 2020-06-30 The Climate Corporation Crop yield estimation using agronomic neural network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101916337A (en) * 2010-08-23 2010-12-15 湖南大学 Method for dynamically predicting potential productivity of paddy rice based on geographical information system
CN108287926A (en) * 2018-03-02 2018-07-17 宿州学院 A kind of multi-source heterogeneous big data acquisition of Agro-ecology, processing and analysis framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multilevel Deep Learning Network for County-Level Corn Yield Estimation in the U.S. Corn Belt; 孙杰 (Sun Jie) et al.; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; 2020-08-25; pp. 5048-5060 *

Also Published As

Publication number Publication date
CN110728446A (en) 2020-01-24

Similar Documents

Publication Publication Date Title
CN110728446B (en) County scale crop yield estimation method based on CNN-LSTM
AU2018346351B2 (en) Disease recognition from images having a large field of view
US10824861B2 (en) System and method using image based machine learning process for earth observation and analysis
AU2021258100A1 (en) Modeling trends in crop yields
WO2023108213A1 (en) Methods and systems for classifying and benchmarking irrigation performance
Chen et al. Investigating rice cropping practices and growing areas from MODIS data using empirical mode decomposition and support vector machines
Chen et al. Delineating rice cropping activities from MODIS data using wavelet transform and artificial neural networks in the Lower Mekong countries
Shah et al. Crop yield prediction using remote sensing and meteorological data
CN112001809A (en) Method for acquiring farmland returning information of agriculture and forestry area
Son et al. Rice yield estimation through assimilating satellite data into a crop simulation model
Yoon et al. Detecting abandoned farmland using harmonic analysis and machine learning
Narra et al. A data driven approach to decision support in farming
Setiawan et al. Dynamics pattern analysis of paddy fields in Indonesia for developing a near real-time monitoring system using modis satellite images
Goel et al. Machine learning-based remote monitoring and predictive analytics system for crop and livestock
Son et al. Mapping major cropping patterns in Southeast Asia from Modis data using wavelet transform and artificial neural networks
Jahan et al. Machine learning for global food security: a concise overview
Madhuri et al. Role of big data in agriculture
WO2021225528A1 (en) System and method for ai-based improvement of harvesting operations
Jepson et al. Agricultural intensification on Brazil’s Amazonian soybean frontier
Lakshmi et al. A Review on Developing Tech-Agriculture using Deep Learning Methods by Applying UAVs
CN115410053B (en) Crop identification method based on random forest model and transfer learning CDL knowledge
LU102497B1 (en) An UAV thermal imaging system
Terliksiz et al. A Simple and Efficient Deep Learning Architecture for Corn Yield Prediction
Lobell et al. Landsat-Based Reconstruction of Corn and Soybean Yield Histories in the United States Since 1999
Singh et al. Trends, challenges and opportunities of artificial intelligence in agriculture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant