CN110728446B - County scale crop yield estimation method based on CNN-LSTM - Google Patents
- Publication number
- CN110728446B (application CN201910954014.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- remote sensing
- crop
- county
- cnn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06Q10/06393 — Score-carding, benchmarking or key performance indicator [KPI] analysis
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes
- G06Q50/02 — Agriculture; Fishing; Forestry; Mining
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G01N21/17 — Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N2021/1765 — Method using an image detector and processing of image signal
- G01N2021/178 — Methods for obtaining spatial resolution of the property being measured
- G01N2021/1793 — Remote sensing
- G01N2021/1797 — Remote sensing in landscape, e.g. crops
Abstract
The invention relates to the technical field of remote sensing image information extraction, and in particular to a county scale crop yield estimation method based on CNN-LSTM, comprising the following steps: S1, data acquisition and processing; S2, data superposition and filtering; S3, acquisition of county-scale feature tensor data; S4, construction and training of a CNN-LSTM model; and S5, application of the CNN-LSTM model trained in S4 to estimate the yield of the target crop. Based on remote sensing data reflecting the growth state of crops and environmental data influencing their growth, the method extracts rich county-level crop features, converts them into tensors by a histogram statistical method, and trains the constructed CNN-LSTM model on these tensors, thereby effectively improving the estimation accuracy for crops at small scales.
Description
Technical Field
The invention relates to the technical field of remote sensing image information extraction, in particular to a county scale crop yield estimation method based on CNN-LSTM.
Background
Crop yield is one of the most important agricultural indicators and is closely connected with human society. Yield prediction is among the most challenging tasks in precision agriculture, and is of great importance to yield mapping, crop market planning, crop insurance and harvest management.
Remote sensing technology has been widely used in crop assessment, since various relevant information can be extracted from remote sensing data to assist in estimating yield. In particular, vegetation indices (VIs) such as the normalized difference vegetation index (NDVI) have been widely used. Other indices, such as the green leaf area index (GLAI), crop water stress index (CWSI), normalized difference water index (NDWI), green vegetation index (GVI) and soil-adjusted vegetation index (SAVI), have also been used for crop yield prediction. In addition, meteorological variables such as precipitation and air temperature, together with soil condition data including soil moisture and temperature, are commonly used as indicators of the crop growing environment in yield and quality prediction. Such feature extraction methods typically rely on computing regional means, and therefore discard detailed features.
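For concreteness, the NDVI mentioned above is a simple per-pixel ratio of the near-infrared and red reflectance bands; a minimal NumPy sketch (the toy reflectance values are illustrative):

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + 1e-12)  # epsilon guards against division by zero

# Toy 2x2 reflectance patches with values in [0, 1]
nir = np.array([[0.5, 0.6], [0.4, 0.3]])
red = np.array([[0.1, 0.2], [0.1, 0.3]])
print(ndvi(nir, red))
```

The other indices listed (NDWI, SAVI, etc.) follow the same per-pixel band-arithmetic pattern with different bands or correction terms.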
Based on remote sensing data, there are two main crop yield prediction approaches: crop simulation models and empirical statistical models. Although crop simulation models accurately simulate the physical processes of crop growth, they are hardly applicable over large spatio-temporal ranges because of insufficient data. In contrast, empirical statistical models are simple and require less input data, and have therefore been widely used as a general alternative to process-based models. Machine learning algorithms, including support vector machines (SVMs), decision trees (DTs), multi-layer perceptrons (MLPs) and restricted Boltzmann machines (RBMs), provide alternatives to traditional regression methods and overcome many of their limitations. Furthermore, artificial neural networks (ANNs) are also considered as surrogate models; traditional ANNs, i.e. multi-layer perceptron models, have been successfully applied to yield estimation for various crops.
In recent years, deep learning (DL) has been regarded as a breakthrough technique in machine learning and data mining for agricultural remote sensing. Several DL algorithms, including stacked sparse auto-encoders (SSAE), convolutional neural networks (CNN) and recurrent neural networks (RNN), have been used for yield prediction. Researchers generally believe that CNNs can capture richer spatial features, while the LSTM, as a special RNN, can reveal phenological features; however, little attention has been paid to integrating the advantages of CNN and LSTM for yield prediction.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a county scale crop yield estimation method based on CNN-LSTM, in which a CNN-LSTM model is used for county-level end-of-season crop yield prediction.
The CNN-LSTM-based county scale crop yield estimation method provided by the invention comprises the following steps:
S1, data acquisition and processing: respectively acquire a plurality of crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1, county-level soybean yield data D3 and county boundary vector data D2 for the target years of the area whose yield is to be estimated, and preprocess the RS and ENV data of each year to obtain a multi-temporal crop remote sensing data set I_RS and an environmental remote sensing data set I_ENV for each year;
S2, data superposition and filtering: superpose and composite the multi-temporal crop remote sensing data set I_RS from S1 with the I_ENV of the corresponding year to obtain an annual overall data set I_RS&ENV; then, combined with the D1 of the corresponding year from S1, filter non-target-crop pixels from I_RS&ENV to obtain an annual data set DI_RS&ENV containing only target crop pixel information;
S3, acquisition of county-scale feature tensor data: clip each data set DI_RS&ENV with the county boundary vector data D2 from S1 to obtain a DI_RS&ENV for each county area per year; count the band pixel histogram distribution of each band of each county's annual DI_RS&ENV, and convert the corresponding band pixel histograms into a deep learning tensor, thereby obtaining annual feature tensor data for each county;
S4, construction and training of the CNN-LSTM model: construct a CNN-LSTM model, train it with the feature tensor data from S3 and the county-level soybean yield data D3 of the corresponding counties and years from S1, and evaluate the model accuracy;
S5: apply the CNN-LSTM model trained in S4 to estimate the yield of the target crop.
Further, the crop remote sensing data RS comprise t time phases, each with m bands, and the environmental remote sensing data ENV comprise u time phases, each with n bands.
Further, the S1 further includes:
S11, perform cloud removal on each crop remote sensing data RS, group the cloud-removed RS by year, and superpose and composite the RS of the same year to obtain the multi-temporal crop remote sensing data set I_RS for each year;
Superpose and composite the environmental remote sensing data ENV of the same year to obtain the environmental remote sensing data set I_ENV for each year.
Further, the environmental remote sensing data ENV comprise surface temperature data and weather parameter data, and obtaining the environmental remote sensing data set I_ENV further comprises the steps of:
S11a, temporally align the surface temperature data and the weather parameter data with the crop remote sensing data RS, obtaining aligned surface temperature data and aligned weather parameter data;
S11b, process the aligned surface temperature data and weather parameter data from S11a year by year to obtain a multi-temporal surface temperature data set and a multi-temporal weather parameter data set for each year;
S11c, combine the annual multi-temporal surface temperature data set from S11b with the multi-temporal weather parameter data set of the corresponding year to obtain the environmental remote sensing data set I_ENV of that year.
Further, the overall data set I_RS&ENV in S2 has t time phases, each with m + n bands.
Further, the S3 further includes:
S31, determine the true distribution limits of the target crop pixel data in each band: for each band of each crop remote sensing data RS and each environmental remote sensing data ENV, count the target crop pixel data over the area to be estimated to form the corresponding band pixel histogram, obtain the maximum and minimum of the target crop pixel data of that band, and, taking these extrema as boundaries, divide the target crop pixel data range of each band into w intervals;
S32, clip the data set DI_RS&ENV from S2 with the county boundary vector data D2 from S1 to obtain a DI_RS&ENV for each county area per year; count the band pixel histograms of each band of each county's annual DI_RS&ENV by allocating the band pixels of each band into the corresponding intervals obtained in S31, thereby obtaining the feature tensor data for each county area per year, whose shape is t × (m + n) × 1 × w.
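A minimal NumPy sketch of the histogram-to-tensor conversion in S31–S32, under illustrative dimensions (t = 3 time phases, m + n = 2 bands and w = 8 intervals here, versus t = 34 and m + n = 11 in the embodiment; the pixel values are random stand-ins for a county's target-crop pixels):

```python
import numpy as np

rng = np.random.default_rng(0)
t, bands, w = 3, 2, 8          # time phases, spectral bands, histogram intervals

# Toy county data: for each phase and band, the target-crop pixel values
county_pixels = [[rng.uniform(0.0, 1.0, size=500) for _ in range(bands)]
                 for _ in range(t)]

# S31: fix the interval edges once per band from the true distribution limits
edges = [np.linspace(0.0, 1.0, w + 1) for _ in range(bands)]

# S32: allocate the pixels of each phase/band into the intervals (a histogram)
tensor = np.zeros((t, bands, 1, w))
for i in range(t):
    for j in range(bands):
        counts, _ = np.histogram(county_pixels[i][j], bins=edges[j])
        tensor[i, j, 0, :] = counts / counts.sum()   # normalize each histogram

print(tensor.shape)   # (t, m+n, 1, w), matching the claimed feature shape
```

The per-histogram normalization is an assumption for readability; the claim itself only specifies the t × (m + n) × 1 × w layout, not a normalization.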
Further, the S4 further includes the following steps:
S41, set the training parameters: set the validation split of the CNN-LSTM model to 0.2, the number of training epochs to 100, and the batch size to 16;
S42, input the feature tensor data into the configured CNN-LSTM model and train it with the fit function in Keras, using the county-level soybean yield data D3 of the corresponding counties and years from S1 as the training labels.
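The patent fixes only the training parameters of S41 (validation split 0.2, 100 epochs, batch size 16) and the use of Keras `fit`; the layer architecture below is therefore an assumption — one plausible CNN-LSTM sketch in which a TimeDistributed 1-D convolution extracts features from each time phase's band histograms and an LSTM aggregates them across the growing season:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

t, bands, w = 34, 11, 32     # time phases and bands per the embodiment; w assumed

model = keras.Sequential([
    keras.Input(shape=(t, w, bands)),            # per phase: w intervals x bands
    layers.TimeDistributed(layers.Conv1D(16, 3, activation="relu")),
    layers.TimeDistributed(layers.MaxPooling1D(2)),
    layers.TimeDistributed(layers.Flatten()),    # one feature vector per phase
    layers.LSTM(64),                             # temporal aggregation over phases
    layers.Dense(1),                             # county-level yield (kg/ha)
])
model.compile(optimizer="adam", loss="mse")

# Toy call mirroring S41/S42 (random stand-ins for feature tensors and D3 labels;
# S41 specifies epochs=100, reduced to 1 here only to keep the toy run short)
X = np.random.rand(8, t, w, bands).astype("float32")
y = np.random.rand(8, 1).astype("float32")
model.fit(X, y, validation_split=0.2, epochs=1, batch_size=16, verbose=0)
```

The layer widths, the Conv1D/pooling choices and w = 32 are illustrative; only the input/output semantics and the fit parameters follow the document.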
The beneficial effects of the technical scheme provided by the invention are as follows: based on remote sensing data reflecting the growth state of crops and environmental data influencing their growth, the method extracts rich county-level crop features, converts them into tensors by a histogram statistical method, and trains the constructed CNN-LSTM model on them, effectively improving the estimation accuracy for small-scale crops, which is of great significance for crop yield estimation, crop market planning, crop insurance and harvest management.
Drawings
FIG. 1 is a flow chart of the CNN-LSTM-based county scale crop yield estimation method according to the present invention;
FIG. 2 is a diagram of the CNN-LSTM framework proposed by the present invention;
FIG. 3 is a schematic diagram of tensors of a temporal phase data feature in a certain county;
FIG. 4 compares the soybean yields estimated by the method of the present invention with officially published yields;
FIG. 5 shows the accuracy of soybean yield prediction for 2011–2015 by the method of the present invention;
FIG. 6 plots the 5-year predicted values against the observed values.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
In the invention, 15 states in the central United States are taken as the case area; the soybean yield in 2011–2015 is estimated using remote sensing data and historical yield data from 2003–2015 as data sources, and the estimates are compared with official statistics.
Selection of the area to be estimated: according to the soybean planting distribution published by the United States Department of Agriculture (USDA), soybeans are planted in 31 states; in this case, 15 states were selected as examples: North Dakota, South Dakota, Nebraska, Minnesota, Iowa, Kansas, Missouri, Arkansas, Mississippi, Tennessee, Illinois, Indiana, Ohio, Michigan and Wisconsin. The soybean planting area of these 15 states accounts for 88.75% of the national total. The time phases of the remote sensing data are selected from April to December according to the soybean growth period.
In this specification, the target crop referred to hereinafter is soybean.
Acquiring the data of the area to be estimated comprises:
1. County-level soybean yield data: county-level soybean yield data from 2003 to 2015, in kg/ha, were collected from the U.S. Department of Agriculture, obtained from: https://www.nass.usda.gov/Quick_Stats/Lite/index.php;
2. Crop classification remote sensing data: crop classification data from 2003 to 2015, one layer per year, were collected from the Cropland Data Layer (CDL) issued by the U.S. Department of Agriculture, a crop-specific land cover data layer with a resolution of 30 m, obtained from: https://nassgeodata.gmu.edu/CropScape/;
3. County boundary vector data of the area to be estimated: in shp format, obtained from: https://catalog.data.gov/dataset/tiger-line-shape-2016-nation-u-s-current-count-and-equivalent-national-shape;
4. Crop remote sensing data: the crop remote sensing data (RS) for the soybean growth period from 2003 to 2015 were downloaded from https://doi.org/10.5067/MODIS/MOD09A1.006. The RS data are the MOD09A1 V6 product, which provides each year an estimate of the surface spectral reflectance of Terra MODIS bands 1–7 at 500 m resolution, corrected for atmospheric conditions such as gases, aerosols and Rayleigh scattering, so that each pixel of each RS image is selected on the basis of high observation coverage, low view angle, absence of clouds or cloud shadow, and low aerosol loading.
5. Environmental remote sensing data: the environmental data includes surface temperature data and weather parameter data. Surface temperature data and weather parameter data for the soybean growth period from 2003 to 2015 were downloaded from a designated website.
The surface temperature data are the MOD11A2 V6 product, obtained from https://doi.org/10.5067/MODIS/MOD11A2.006. The product provides an average 8-day land surface temperature (LST) on a 1 km × 1 km grid; each pixel value in MOD11A2 is a simple average of all the corresponding MOD11A1 LST pixels collected over the 8-day period. The daytime and nighttime land surface temperature bands were both used as long-term soil factors.
The weather parameter data are the Daymet product, a collection of gridded estimates of daily weather parameters generated by interpolation and extrapolation of daily weather observations, downloaded from the Daymet website. Two important weather parameters in Daymet were chosen in the present invention: precipitation and air pressure, at a resolution of 1 km.
Referring to FIG. 1, the CNN-LSTM-based county scale crop yield estimation method comprises the following steps:
S1, data acquisition and processing: acquire, as described above, the crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1 and county-level soybean yield data D3 from 2003 to 2015, together with one piece of county boundary vector data D2, and preprocess the RS and ENV data of each year to obtain the annual multi-temporal crop remote sensing data set I_RS and environmental remote sensing data set I_ENV.
S1 further includes: S11, in Google Earth Engine, perform cloud removal on each crop remote sensing data RS, group the cloud-removed RS by year, and superpose and composite the RS of the same year to obtain the multi-temporal crop remote sensing data set I_RS for each year;
Superpose and composite the environmental remote sensing data ENV of the same year to obtain the annual I_ENV;
Wherein, during the soybean growth period, the multi-temporal crop remote sensing data set I_RS includes t3 time phases, each with m bands, and I_ENV includes t4 time phases, each with n bands, where t3 = t4 = 34, m = 7, n = 4.
Here, it should be noted that the annual multi-temporal crop remote sensing data set I_RS and environmental remote sensing data set I_ENV should cover all crop remote sensing data RS, surface temperature data and weather parameter data over the whole growing season of the target crop. Because the revisit periods of the different remote sensing data differ, in the present invention the remote sensing data with the longest revisit period is taken as the reference, and the other remote sensing data are aligned to it by temporally averaging multi-temporal data; the number of time phases of the aligned crop remote sensing data RS and environmental remote sensing data ENV is t2, t2 = 8.
The specific operations are as follows. The product cycle of the crop remote sensing data RS is 8 days, so 34 RS images must be acquired per year over the growing period; the RS images acquired in the same year are grouped, then superposed and composited (in Google Earth Engine), to obtain the multi-temporal crop remote sensing data set I_RS for each year. The environmental data comprise surface temperature data and weather parameter data. The surface temperature data provide an average surface temperature over 8 days, so correspondingly 34 surface temperature images are acquired per year over the growing period; the surface temperature images acquired in the same year are grouped, superposed and composited (in Google Earth Engine) to obtain a multi-temporal surface temperature data set for each year. Each day of the weather parameter data corresponds to one time phase; since the product cycle of the crop remote sensing data RS and the surface temperature data is 8 days while that of the weather parameter data is 1 day, the weather parameter data are averaged and resampled to a unified 8-day cycle so as to align with the product cycles of the RS and surface temperature data. After alignment, the weather parameter data therefore also contain 34 time phases per year over the growing period, and grouping the aligned weather parameter data of the same year yields a multi-temporal weather parameter data set for each year. The annual multi-temporal surface temperature data set and the annual multi-temporal weather parameter data set are combined into the annual environmental remote sensing data set I_ENV.
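The averaging-and-resampling alignment described above — collapsing daily weather data into the 8-day product cycle of the RS and surface temperature data — can be sketched in NumPy (the 272-day series covering 34 eight-day composites follows the text; the 2 weather bands are illustrative):

```python
import numpy as np

period = 8                               # product cycle of the RS/LST data, days
daily = np.random.rand(34 * period, 2)   # 272 daily values x 2 weather bands

# Reshape to (composites, days-per-composite, bands), then average each window
aligned = daily.reshape(-1, period, daily.shape[1]).mean(axis=1)
print(aligned.shape)    # (34, 2): one aligned time phase per 8-day composite
```

This assumes the daily series starts exactly on a composite boundary and its length is a multiple of 8; real MODIS/Daymet calendars would need trimming first.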
S2, data superposition and filtering: superpose and composite the multi-temporal crop remote sensing data set I_RS from S1 with the I_ENV of the corresponding year to obtain the annual overall data set I_RS&ENV; combined with the D1 of the corresponding year from S1, filter non-target-crop pixels from I_RS&ENV to obtain the annual data set DI_RS&ENV containing only target crop pixel information.
Wherein, during the growth period, the data sets I_RS&ENV and DI_RS&ENV each comprise t1 time phases, each with m + n bands, where t1 = 34 and m + n = 11.
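The superposition and filtering of S2 amount to stacking the 7 RS bands with the 4 ENV bands per time phase and masking out pixels that the crop classification layer D1 does not label as the target crop. A NumPy sketch with toy arrays of the stated dimensions (the 10 × 10 raster and the random classification grid are illustrative; class value 5 is, to our knowledge, the CDL code for soybeans):

```python
import numpy as np

t, m, n, H, W = 34, 7, 4, 10, 10
rs  = np.random.rand(t, m, H, W)          # crop remote sensing bands (I_RS)
env = np.random.rand(t, n, H, W)          # surface temperature + weather (I_ENV)

stacked = np.concatenate([rs, env], axis=1)   # I_RS&ENV: shape (t, m+n, H, W)

# D1: crop classification layer; value 5 stands for soybeans here
cdl = np.random.default_rng(1).integers(0, 10, size=(H, W))
soy_mask = (cdl == 5)

# DI_RS&ENV: keep only target-crop pixels, mark the rest as invalid (NaN)
filtered = np.where(soy_mask[None, None, :, :], stacked, np.nan)
print(stacked.shape, int(soy_mask.sum()))
```

Using NaN for the filtered-out pixels is an implementation choice for the sketch; any sentinel that later histogram steps can ignore would serve.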
S3, clip each data set DI_RS&ENV with the county boundary vector data D2 from S1 to obtain a DI_RS&ENV for each county area per year; count the band pixel histogram distribution of each band of each county's annual DI_RS&ENV, and convert the corresponding band pixel histograms into a deep learning tensor, thereby obtaining annual feature tensor data for each county.
S3 further includes: S31, determine the true distribution limits of the target crop pixel data in each band: using the ui.Chart.image.histogram function in Google Earth Engine, count the target crop pixel data of each band of each crop remote sensing data RS and environmental remote sensing data ENV over the area to be estimated to form the corresponding band pixel histogram; obtain the maximum and minimum of the target crop pixel data of each band, and hence the true distribution boundary of the target crop pixel data of each band; taking these extrema as boundaries, divide the target crop pixel data range of each band into w intervals;
S32, using the county boundary vector data D2 in S1, the data collection DI_RS&ENV in S2 is cropped to obtain the data collection DI_RS&ENV of each county for each year; the band pixel histogram of each band of each county's annual DI_RS&ENV is counted with the ui.Chart.image.histogram function in Google Earth Engine; then, using the ee.Reducer.fixedHistogram function in Google Earth Engine and the intervals of each band obtained in S31, the band pixels of each band in each county's annual DI_RS&ENV are converted into the intervals of the corresponding band, thereby obtaining the feature tensor data corresponding to each county for each year, with shape t × (m+n) × 1 × w.
It should be noted that the method assumes yield can be estimated from crop pixel information alone, regardless of the geographic location of the pixels, so an image can be represented by the band pixel histograms of its bands; that is, the histograms can stand in for the original image. For a given number of intervals, the larger the range of the target crop pixel data of a band, the wider each histogram interval becomes; however, the theoretical limit values of the target crop pixel data in the obtained remote sensing data are generally far too wide, so in order to generate regularized feature tensor data, the real distribution limits of the target crop pixel data in each band must be ascertained before converting the band pixels into histograms. The purpose of S31 is therefore to obtain the true distribution of the target crop pixel range in each band, and hence the minimum and maximum of each band's target crop pixel data in each of the crop remote sensing data RS and the environment remote sensing data ENV, so that a regularized tensor can be generated. In the invention, histogram conversion is used to count the pixel histogram of each band; the specific operation is as follows: taking the maximum and minimum of each band's target crop pixel data in the RS and ENV data as boundaries, the pixel data range of each band is divided into w intervals, the target crop pixel values of each band are assigned to the corresponding w intervals, and the number of target crop pixels in each interval is counted, so that each band corresponds to a 1 × w feature tensor.
After the band pixel histograms of each band of each county's annual data collection DI_RS&ENV are converted into the intervals of the corresponding bands, each county's annual DI_RS&ENV corresponds to a tensor T_RS&ENV of shape t × (m+n) × 1 × w. In the present invention, w = 32 and t = t_1 = 34, i.e., the feature tensor data of each county for each year has shape 34 × 11 × 1 × 32. It should be noted that w = 32 is only one specific example of the invention; in practical applications, w may be 64, 128, and so on.
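The fixed-width histogram conversion of S31/S32 can be sketched in NumPy as follows; this is a simplified, testable stand-in for ee.Reducer.fixedHistogram, and the per-band pixel values and min/max of [0, 1] are made-up assumptions:

```python
import numpy as np

def band_histogram(pixels, vmin, vmax, w=32):
    """Count target-crop pixels of one band into w equal intervals on [vmin, vmax]."""
    hist, _ = np.histogram(pixels, bins=w, range=(vmin, vmax))
    return hist  # a 1 x w feature vector for this band

t, bands, w = 34, 11, 32           # t_1 time phases, m+n bands, w intervals
rng = np.random.default_rng(0)
tensor = np.empty((t, bands, 1, w))
for ti in range(t):
    for b in range(bands):
        pixels = rng.random(500)   # stand-in soybean pixels for one band/phase
        tensor[ti, b, 0] = band_histogram(pixels, 0.0, 1.0, w)

assert tensor.shape == (34, 11, 1, 32)   # t x (m+n) x 1 x w, as in the patent
```

Every pixel falls into exactly one interval, so each band's histogram counts sum to the number of target-crop pixels in that band.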
Fig. 3 is a schematic diagram of the bands of Marion County, Kansas, converted into 32-interval histograms at one time phase.
S4, constructing a CNN-LSTM model: the CNN-LSTM model is trained with the plurality of feature tensor data in S3 and the county-level soybean yield data D3 of the corresponding years in S1, and the model accuracy is evaluated; when the accuracy of the trained model meets the requirement, the final CNN-LSTM model is obtained, and if it does not, the number of training iterations is increased. The CNN-LSTM model comprises a CNN part and an LSTM part; the CNN extracts features from the feature tensor data, and the LSTM learns the features extracted by the CNN. During model training, the val_loss parameter is monitored with early stopping, and training stops if the parameter does not improve for 15 consecutive epochs.
The CNN contains two Conv2D layers, the first with 32 filters and the second with 64, each with a convolution kernel of size 1 × 2. Each convolutional layer is followed by a pooling layer with kernel size 1 × 2 and batch normalization. The two stacked Conv2D layers are applied to each time slice of the input through a TimeDistributed wrapper for feature extraction. The time-series tensor processed by the Conv2D layers is then flattened and batch-normalized before being fed into the LSTM layer. The LSTM part has a single LSTM layer, whose number of neurons is set to 256, followed by a dense layer of 64 neurons. All time outputs are then flattened into a long vector and sent, with a dropout rate of 0.5, to a Dropout layer, which randomly switches off a fraction of neurons during training and helps prevent overfitting. Finally, a dense layer with one neuron outputs the predicted yield. The activation function of the model is the rectified linear unit (ReLU). The model is built with Keras. The structure of the CNN-LSTM model constructed by the invention is shown in figure 2.
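As a sanity check on the architecture just described, the following pure-Python sketch traces tensor shapes through the layers. It assumes 'same' padding for the Conv2D layers and that each time slice is laid out as (m+n, w, channels); neither assumption is stated in the patent, so the exact numbers are illustrative:

```python
def conv2d_same(shape, filters):
    # Conv2D with 'same' padding keeps spatial dims, changes channel count.
    h, w, _ = shape
    return (h, w, filters)

def max_pool(shape, ph, pw):
    # Pooling divides spatial dims by the pool size (here 1 x 2).
    h, w, c = shape
    return (h // ph, w // pw, c)

t = 34
s = (11, 32, 1)               # one time slice: (m+n bands, w intervals, 1 channel)
s = conv2d_same(s, 32)        # first Conv2D, 32 filters, kernel 1 x 2
s = max_pool(s, 1, 2)         # pooling 1 x 2 -> (11, 16, 32)
s = conv2d_same(s, 64)        # second Conv2D, 64 filters
s = max_pool(s, 1, 2)         # -> (11, 8, 64)
flat = s[0] * s[1] * s[2]     # flatten each time slice before the LSTM
# LSTM(256) over t steps -> (34, 256); Dense(64) per step -> (34, 64)
long_vec = t * 64             # all time outputs flattened into one long vector
# Dropout(0.5) keeps the shape; a final Dense(1) outputs the predicted yield.

assert s == (11, 8, 64) and flat == 5632 and long_vec == 2176
```

Under these assumptions, each 1 × 2 pooling halves the interval axis (32 → 16 → 8) while the band axis stays at 11.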
Training the CNN-LSTM model comprises the following steps:
S41, setting training parameters: the validation data split ratio (validation_split) of the CNN-LSTM model is set to 0.2, the number of training epochs to 100, and the batch size to 16;
S42, the plurality of feature tensor data are input into the configured CNN-LSTM model, the model is trained with the fit function in Keras, and the county-level soybean yield data D3 of the corresponding counties and years in S1 are used as training data.
Here, the input of the CNN-LSTM model is the feature tensor data of each county for each year, the label data are the historical yield data of the corresponding county and year in S1, and the model's output is the predicted yield of the corresponding county and year. The data are input into the CNN-LSTM model and the model is trained with the fit function in Keras, taking the historical yield data of the corresponding county and year in S1 as training data. The validation data split ratio (validation_split) of the CNN-LSTM model is set to 0.2, meaning that 20% of the training data are randomly separated as validation data; 100 training epochs are set in total, the batch size is set to 16, and mean squared error is used as the training metric. To improve practical performance, early stopping is used to reduce the generalization error of the deep learning system: when the monitored val_loss indicator stops improving for 10 consecutive epochs, training ends and the model is saved. The optimizer is adaptive momentum (Adam).
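The early-stopping rule just described (monitor val_loss, stop after a fixed number of epochs without improvement) can be sketched without Keras as follows; this is a hypothetical stand-in for the Keras EarlyStopping callback, with a simulated loss curve:

```python
def train_with_early_stopping(val_losses, patience=10):
    """Return (stop_epoch, best_val_loss) for a simulated training run.

    val_losses simulates per-epoch validation loss; training stops once
    the loss fails to improve for `patience` consecutive epochs.
    """
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0       # improvement: save model, reset counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best     # no improvement for `patience` epochs
    return len(val_losses) - 1, best   # ran through all epochs

# Loss improves for 3 epochs, then plateaus; stopping fires 10 epochs later.
stop_epoch, best = train_with_early_stopping([1.0, 0.9, 0.8] + [0.8] * 20)
assert stop_epoch == 12 and best == 0.8
```

In the actual method, the model weights at the best val_loss would be the ones saved.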
S5: the trained CNN-LSTM model from S4 is applied to estimate the yield of the target crop, obtaining the county-scale yield of the target crop in the area to be estimated.
In addition, to verify the effectiveness of the method, the method of the invention was first used to estimate the soybean yields of the 15 states for 2011-2015; the estimates were compared with the yield values published by the USDA for the corresponding counties and years, using RMSE and R² as evaluation indices. The accuracy criteria are the coefficient of determination R² and the root mean square error RMSE: R² takes values in [0, 1], and the larger R² is, the better the model fit, and vice versa; the smaller the RMSE, the better the model fit.
The comparison of the yield estimates obtained by the method of the present invention with the results published by the USDA is shown in fig. 4.
The RMSE obtained by evaluating the accuracy of each year's estimates is shown in fig. 5.
Comparing the 5-year predictions with the data published by the USDA gives R² = 0.78 (R² > 0.6 indicates high prediction accuracy), as shown in fig. 6, demonstrating the effectiveness of the method.
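The two evaluation metrics can be computed as follows (NumPy sketch; the toy yield arrays are made up for illustration and do not reproduce the patent's results):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted yields."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination R^2 (1 - SS_res / SS_tot)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_true = np.array([3.0, 2.5, 4.0, 3.5])   # e.g. USDA-published county yields
y_pred = np.array([2.8, 2.7, 3.9, 3.6])   # e.g. model estimates

assert 0.0 <= r_squared(y_true, y_pred) <= 1.0
assert rmse(y_true, y_pred) > 0.0
```

In the patent's evaluation, y_true would be the USDA county yields for 2011-2015 and y_pred the CNN-LSTM estimates.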
In this document, the terms front, back, upper and lower are used to define the components in the drawings and the positions of the components relative to each other, and are used for clarity and convenience of the technical solution. It is to be understood that the use of the directional terms should not be taken to limit the scope of the claims.
The features of the embodiments described hereinabove may be combined with each other in the absence of conflict.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (5)
1. A county-scale crop yield estimation method based on CNN-LSTM, characterized in that the method comprises the following steps:
S1, acquiring and processing data: a plurality of crop remote sensing data RS, environmental remote sensing data ENV, crop classification data D1, county-level soybean yield data D3 and county boundary vector data D2 of the target years of the area to be estimated are respectively acquired, and the plurality of crop remote sensing data RS and environmental remote sensing data ENV of the same year are respectively preprocessed to obtain, for each year, a multi-temporal crop remote sensing data set I_RS and an environment remote sensing data set I_ENV;
S2, data superposition and filtering: the multi-temporal crop remote sensing data set I_RS in S1 and the environment remote sensing data set I_ENV of the corresponding year are superposed and synthesized to obtain an annual data collection I_RS&ENV; in combination with the crop classification data D1 of the corresponding year in S1, non-target crop pixels are filtered from the data collection I_RS&ENV to obtain an annual data collection DI_RS&ENV containing only target crop pixel information;
S3, acquiring feature tensor data of county scale: using the county boundary vector data D2 in S1, each data collection DI_RS&ENV is cropped to obtain the data collection DI_RS&ENV of each county for each year; the band pixel histogram distribution of each band of each county's annual DI_RS&ENV is counted, the band pixel histograms corresponding to each county's annual DI_RS&ENV are converted into deep learning tensors, and annual feature tensor data of each county are obtained;
S4, constructing and training a CNN-LSTM model: a CNN-LSTM model is constructed and trained with the plurality of feature tensor data in S3 and the county-level soybean yield data D3 of the corresponding years in S1; specifically,
constructing the CNN-LSTM model: the CNN-LSTM model comprises two parts, a CNN and an LSTM; the CNN comprises two Conv2D layers, which are stacked and applied to each input time slice to perform feature extraction; the time-series tensor processed by the Conv2D layers is then flattened and batch-normalized before being fed into the LSTM layer; all time outputs are flattened into a long vector, which is sent to a Dropout layer with a dropout rate of 0.5; finally, a dense layer with one neuron outputs the predicted yield;
training the CNN-LSTM model comprises the following steps:
S41, setting training parameters: the validation data split ratio of the CNN-LSTM model is set to 0.2, the number of training epochs to 100, and the batch size to 16;
S42, the plurality of feature tensor data are input into the configured CNN-LSTM model, the model is trained with the fit function in Keras, and the county-level soybean yield data D3 of the corresponding counties and years in S1 are used as training data;
S5: the trained CNN-LSTM model from S4 is applied to estimate the yield of the target crop, obtaining the county-scale yield of the target crop in the area to be estimated.
2. The CNN-LSTM-based county-scale crop yield assessment method of claim 1, wherein: the S1 further includes:
S11, cloud removal processing is performed on each crop remote sensing data RS, the de-clouded crop remote sensing data RS are classified by year, and the plurality of crop remote sensing data RS of the same year are superposed and synthesized to obtain the multi-temporal crop remote sensing data set I_RS of each year;
the plurality of environmental remote sensing data ENV of the same year are superposed and synthesized to obtain the environment remote sensing data set I_ENV of each year.
3. The CNN-LSTM-based county-scale crop yield estimation method of claim 2, characterized in that: the ENV comprises surface temperature data and weather parameter data, and obtaining the environment remote sensing data set I_ENV further comprises the steps of:
S11a, the surface temperature data and the weather parameter data are respectively aligned in time phase with the crop remote sensing data RS, obtaining aligned surface temperature data and weather parameter data;
S11b, the aligned surface temperature data and weather parameter data from S11a are respectively processed by year to obtain a multi-temporal surface temperature data set and a multi-temporal weather parameter data set for each year;
S11c, the annual multi-temporal surface temperature data set from S11b and the multi-temporal weather parameter data set of the corresponding year are processed to obtain the environment remote sensing data set I_ENV of the corresponding year.
4. The CNN-LSTM-based county-scale crop yield assessment method of claim 1, wherein: the S3 further includes:
S31, confirming the real distribution limits of the target crop pixel data in each band: the target crop pixel data of each band in each of the crop remote sensing data RS and the environment remote sensing data ENV of the area to be estimated are respectively counted to form the corresponding band pixel histograms, the maximum and minimum values of the target crop pixel data of each band are obtained, and, taking these maximum and minimum values as boundaries, the target crop pixel data range of each band in each of the crop remote sensing data RS and the environment remote sensing data ENV is divided into a plurality of intervals;
S32, using the county boundary vector data D2 in S1, the data collection DI_RS&ENV in S2 is cropped to obtain the data collection DI_RS&ENV of each county for each year; the band pixel histogram of each band in each county's annual DI_RS&ENV is counted, and, using the intervals of each band obtained in S31, the band pixels of each band in each county's annual DI_RS&ENV are converted into the intervals of the corresponding band, thereby obtaining the feature tensor data corresponding to each county for each year.
5. The CNN-LSTM-based county-scale crop yield assessment method of claim 1, wherein: the CNN-LSTM model comprises a CNN and an LSTM, the CNN being used to extract features from the feature tensor data, and the LSTM being used to learn the features extracted by the CNN.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910954014.XA CN110728446B (en) | 2019-10-09 | 2019-10-09 | County scale crop yield estimation method based on CNN-LSTM |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110728446A CN110728446A (en) | 2020-01-24 |
CN110728446B true CN110728446B (en) | 2022-04-01 |
Family
ID=69220854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910954014.XA Active CN110728446B (en) | 2019-10-09 | 2019-10-09 | County scale crop yield estimation method based on CNN-LSTM |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110728446B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111950994B (en) * | 2020-09-03 | 2021-03-23 | 深圳市不动产评估中心(深圳市地质环境监测中心) | Geological environment and monitoring information management method, system, terminal and storage medium |
CN112258523B (en) * | 2020-10-20 | 2022-03-08 | 中国石油大学(华东) | Method for finely extracting enteromorpha coverage information of medium-low resolution remote sensing image |
CN112946187B (en) * | 2021-01-22 | 2023-04-07 | 西安科技大学 | Refuge chamber real-time state monitoring method based on neural network |
CN113378476B (en) * | 2021-06-28 | 2022-07-19 | 武汉大学 | Global 250-meter resolution space-time continuous leaf area index satellite product generation method |
CN114529097B (en) * | 2022-02-26 | 2023-01-24 | 黑龙江八一农垦大学 | Multi-scale crop phenological period remote sensing dimensionality reduction prediction method |
CN117592619A (en) * | 2023-12-18 | 2024-02-23 | 中国科学院空天信息创新研究院 | GNN-LSTM-based county winter wheat estimated yield analysis method and system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101916337A (en) * | 2010-08-23 | 2010-12-15 | 湖南大学 | Method for dynamically predicting potential productivity of paddy rice based on geographical information system |
CN108287926A (en) * | 2018-03-02 | 2018-07-17 | 宿州学院 | A kind of multi-source heterogeneous big data acquisition of Agro-ecology, processing and analysis framework |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10699185B2 (en) * | 2017-01-26 | 2020-06-30 | The Climate Corporation | Crop yield estimation using agronomic neural network |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101916337A (en) * | 2010-08-23 | 2010-12-15 | 湖南大学 | Method for dynamically predicting potential productivity of paddy rice based on geographical information system |
CN108287926A (en) * | 2018-03-02 | 2018-07-17 | 宿州学院 | A kind of multi-source heterogeneous big data acquisition of Agro-ecology, processing and analysis framework |
Non-Patent Citations (1)
Title |
---|
Multilevel Deep Learning Network for County-Level Corn Yield Estimation in the U.S. Corn Belt; Sun Jie et al.; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; 2020-08-25; 5048-5060 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110728446B (en) | County scale crop yield estimation method based on CNN-LSTM | |
AU2018346351B2 (en) | Disease recognition from images having a large field of view | |
US10824861B2 (en) | System and method using image based machine learning process for earth observation and analysis | |
AU2021258100A1 (en) | Modeling trends in crop yields | |
WO2023108213A1 (en) | Methods and systems for classifying and benchmarking irrigation performance | |
Chen et al. | Investigating rice cropping practices and growing areas from MODIS data using empirical mode decomposition and support vector machines | |
Chen et al. | Delineating rice cropping activities from MODIS data using wavelet transform and artificial neural networks in the Lower Mekong countries | |
Shah et al. | Crop yield prediction using remote sensing and meteorological data | |
CN112001809A (en) | Method for acquiring farmland returning information of agriculture and forestry area | |
Son et al. | Rice yield estimation through assimilating satellite data into a crop simumlation model | |
Yoon et al. | Detecting abandoned farmland using harmonic analysis and machine learning | |
Narra et al. | A data driven approach to decision support in farming | |
Setiawan et al. | Dynamics pattern analysis of paddy fields in Indonesia for developing a near real-time monitoring system using modis satellite images | |
Goel et al. | Machine learning-based remote monitoring and predictive analytics system for crop and livestock | |
Son et al. | Mapping major cropping patterns in Southeast Asia from Modis data using wavelet transform and artificial neural networks | |
Jahan et al. | Machine learning for global food security: a concise overview | |
Madhuri et al. | Role of big data in agriculture | |
WO2021225528A1 (en) | System and method for ai-based improvement of harvesting operations | |
Jepson et al. | Agricultural intensification on Brazil’s Amazonian soybean frontier | |
Lakshmi et al. | A Review on Developing Tech-Agriculture using Deep Learning Methods by Applying UAVs | |
CN115410053B (en) | Crop identification method based on random forest model and transfer learning CDL knowledge | |
LU102497B1 (en) | An UAV thermal imaging system | |
Terliksiz et al. | A Simple and Efficient Deep Learning Architecture for Corn Yield Prediction | |
Lobell et al. | Landsat-Based Reconstruction of Corn and Soybean Yield Histories in the United States Since 1999 | |
Singh et al. | Trends, challenges and opportunities of artificial intelligence in agriculture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||