CN108596104B - Wheat powdery mildew remote sensing monitoring method with disease characteristic preprocessing function - Google Patents

Publication number
CN108596104B
Authority
CN
China
Prior art keywords
powdery mildew
characteristic
sample
remote sensing
wheat
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810383173.4A
Other languages
Chinese (zh)
Other versions
CN108596104A (en)
Inventor
黄林生
阮超
黄文江
张东彦
赵晋陵
翁士状
曾玮
丁文娟
丁串龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN201810383173.4A
Publication of CN108596104A
Application granted
Publication of CN108596104B
Current legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/188: Vegetation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133: Distances to prototypes
    • G06F18/24143: Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/29: Graphical models, e.g. Bayesian networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a remote sensing monitoring method for wheat powdery mildew with a disease characteristic preprocessing function, which overcomes the high feature redundancy and poor monitoring accuracy of wheat disease monitoring in the prior art. The method comprises the following steps: acquiring and preprocessing remote sensing data; extracting feature variables; processing the feature variables; constructing and optimizing a powdery mildew monitoring model; and obtaining a remote sensing monitoring result for wheat powdery mildew. The invention combines two feature selection techniques, Relief and mRMR, with a support vector machine optimized by a genetic algorithm to achieve effective remote sensing monitoring of powdery mildew at the regional scale.

Description

Wheat powdery mildew remote sensing monitoring method with disease characteristic preprocessing function
Technical Field
The invention relates to the technical field of remote sensing image processing, in particular to a wheat powdery mildew remote sensing monitoring method with a disease characteristic preprocessing function.
Background
Wheat powdery mildew is one of the main diseases in wheat production. It can occur throughout the wheat growth period and causes serious losses in yield and quality: yield is generally reduced by 5% to 10% after infection, and by more than 20% in severe cases. Timely and effective monitoring of wheat powdery mildew is therefore of great significance for improving wheat yield and quality.
Although traditional ground surveys give good results, they require a large amount of manpower and material resources and are not suitable for large-area studies. Many scholars have used meteorological data to monitor and predict crop diseases and pests. Wang et al. established a prediction model for wheat powdery mildew using air temperature, illumination and rainfall. Stansbury et al. used a humid-thermal index model to predict Karnal bunt of wheat (Tilletia indica). Dutta et al. established an aphid prediction model using meteorological parameters such as air temperature and humidity. However, meteorological data are affected by terrain and human activities, so spatially continuous information cannot be acquired accurately; moreover, the growth status of wheat is itself an important indicator of wheat diseases and pests. Monitoring and prediction of crop diseases and pests based on meteorology alone therefore has certain limitations.
Because remote sensing can obtain spatially continuous information and reflects crop growth, some scholars have carried out a series of studies on crop diseases and pests using remote sensing. Prior work includes: establishing a wheat powdery mildew monitoring model by combining environmental satellite data with wavelet analysis and a support vector machine; monitoring wheat powdery mildew using the PRI; establishing a wheat powdery mildew monitoring model based on AdaBoost and the mRMR algorithm; predicting the occurrence of wheat powdery mildew by logistic regression on several vegetation indices derived from environmental satellite data; and extracting growth and habitat factors of winter wheat from environmental satellite data to establish a wheat aphid prediction model with a relevance vector machine.
At present, most scholars focus on the remote sensing feature indices related to crop diseases, while the feature variables themselves are usually chosen only by simple correlation analysis, t-tests and similar methods. Although features selected in this way are strongly correlated with wheat disease, the redundancy among them can reduce model accuracy.
Therefore, how to develop a remote sensing monitoring method based on features with high disease relevance and minimum redundancy has become an urgent technical problem.
Disclosure of Invention
The invention aims to overcome the high redundancy of wheat disease features and the poor monitoring accuracy in the prior art, and provides a wheat powdery mildew remote sensing monitoring method with a disease characteristic preprocessing function.
To achieve this purpose, the technical scheme of the invention is as follows:
A wheat powdery mildew remote sensing monitoring method with a disease characteristic preprocessing function comprises the following steps:
11) acquiring and preprocessing remote sensing data: acquiring remote sensing data of the study area, preprocessing the data, and extracting the wheat planting area by maximum likelihood classification;
12) extracting feature variables: acquiring field survey data samples of wheat, and extracting the features required for wheat powdery mildew monitoring from the preprocessed remote sensing image;
13) processing the feature variables: calculating the weights of the wheat powdery mildew features with the Relief technique and, after threshold screening, using the mRMR technique to select the feature subset that has the maximum correlation with the target class and the minimum redundancy among its features, as the input variables of the powdery mildew monitoring model;
14) constructing and optimizing a powdery mildew monitoring model: constructing the powdery mildew monitoring model based on a support vector machine, and optimizing it;
15) obtaining the wheat powdery mildew remote sensing monitoring result: extracting the feature variables pixel by pixel from the remote sensing image and storing them in a matrix A as the input variables of the powdery mildew monitoring model; extracting the geographic coordinates of each pixel and storing them in a matrix B; inputting matrix A into the optimized powdery mildew monitoring model to obtain a result matrix C describing the wheat powdery mildew status of the study area; and combining the result matrix C with the coordinate matrix B to draw the spatial distribution map of the wheat powdery mildew monitoring result for the study area.
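Step 15) can be illustrated with a minimal Python sketch. The function name `monitor_pixels`, the stand-in model `toy_model`, and all sample values are hypothetical and serve only to show how matrices A, B and C relate to one another; the real model is the optimized support vector machine.

```python
from typing import Callable, List, Sequence, Tuple

def monitor_pixels(
    features: List[Sequence[float]],            # matrix A: selected features, one row per pixel
    coords: List[Tuple[float, float]],          # matrix B: geographic coordinate of each pixel
    predict: Callable[[Sequence[float]], int],  # trained model: 0 = healthy, 1 = diseased
) -> List[Tuple[float, float, int]]:
    """Apply the model to every pixel and attach coordinates to each result."""
    if len(features) != len(coords):
        raise ValueError("A and B must contain one row per pixel")
    result_c = [predict(row) for row in features]  # matrix C
    return [(x, y, label) for (x, y), label in zip(coords, result_c)]

# Hypothetical stand-in for the optimized model: flags low values of feature 0.
def toy_model(row: Sequence[float]) -> int:
    return 1 if row[0] < 0.4 else 0

A = [[0.62, 0.30], [0.35, 0.20], [0.51, 0.25]]           # illustrative feature rows
B = [(117.10, 33.20), (117.11, 33.20), (117.12, 33.20)]  # illustrative coordinates
disease_map = monitor_pixels(A, B, toy_model)
print(disease_map)
```

Each output triple can then be rasterized or plotted to form the spatial distribution map.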
The acquisition and preprocessing of the remote sensing data comprise the following steps:
21) acquiring remote sensing data of the area affected by wheat powdery mildew, and performing radiometric calibration and atmospheric correction of the image, wherein the radiometric calibration formula is:
$$L(\lambda) = \mathrm{Gain} \cdot \mathrm{DN} + \mathrm{Bias},$$
wherein $L(\lambda)$ is the radiance, Gain is the gain coefficient, Bias is the bias coefficient, and DN is the observed gray value (digital number);
22) converting the image radiance into reflectance with the FLAASH atmospheric correction module of the ENVI 5.1 remote sensing processing software;
23) cropping the image to obtain the image of the study area;
24) extracting the wheat planting area of the study area with the maximum likelihood classification method in ENVI 5.1, based on the normalized vegetation index and the near-infrared reflectance data of the study area.
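The radiometric calibration of step 21) and the normalized vegetation index used in step 24) can be sketched in a few lines of Python. The gain, bias and reflectance values below are invented for illustration; atmospheric correction (FLAASH) and maximum likelihood classification are performed in ENVI and are not reproduced here.

```python
def dn_to_radiance(dn: float, gain: float, bias: float) -> float:
    """Radiometric calibration: L = Gain * DN + Bias."""
    return gain * dn + bias

def ndvi(nir: float, red: float) -> float:
    """Normalized vegetation index from near-infrared and red reflectance."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom

# Illustrative values only; real gain and bias come from the sensor metadata.
radiance = dn_to_radiance(dn=120, gain=0.5, bias=2.0)
index = ndvi(nir=0.5, red=0.1)
print(radiance, round(index, 4))
```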
The extraction of the feature variables comprises the following steps:
31) acquiring wheat field survey data samples comprising n sample points, each manually labeled as a healthy sample or a diseased sample;
32) selecting the blue, green, red and near-infrared reflectance of the remote sensing data, together with broadband vegetation indices, as the initially selected feature factors of the powdery mildew monitoring model; these factors form the initial feature set of the remote sensing image, and the broadband vegetation indices comprise the ratio vegetation index, triangular vegetation index, green-band normalized vegetation index, enhanced vegetation index, normalized vegetation index, optimized soil-adjusted vegetation index, soil-adjusted vegetation index, modified triangular vegetation index, modified simple ratio index and renormalized vegetation index.
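As a rough illustration of step 32), the sketch below builds the 14-element initial feature vector for one pixel. The patent text does not give the index formulas, so the expressions used here are the commonly cited broadband forms and should be read as assumptions; the function names and the sample reflectance values are likewise hypothetical.

```python
import math
from typing import Dict, List

def vegetation_indices(blue: float, green: float, red: float, nir: float) -> Dict[str, float]:
    """Commonly cited broadband forms (assumed; not taken from the patent text)."""
    sr = nir / red
    return {
        "SR": sr,                                                      # ratio vegetation index
        "TVI": 0.5 * (120.0 * (nir - green) - 200.0 * (red - green)),  # triangular VI
        "GNDVI": (nir - green) / (nir + green),                        # green-band normalized VI
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "NDVI": (nir - red) / (nir + red),
        "OSAVI": (nir - red) / (nir + red + 0.16),                     # optimized soil-adjusted VI
        "SAVI": 1.5 * (nir - red) / (nir + red + 0.5),                 # soil factor L = 0.5
        "MTVI": 1.2 * (1.2 * (nir - green) - 2.5 * (red - green)),     # modified triangular VI
        "MSR": (sr - 1.0) / (math.sqrt(sr) + 1.0),                     # modified simple ratio
        "RDVI": (nir - red) / math.sqrt(nir + red),                    # renormalized VI
    }

def feature_vector(blue: float, green: float, red: float, nir: float) -> List[float]:
    """Four band reflectances followed by the ten indices: 14 features in total."""
    return [blue, green, red, nir] + list(vegetation_indices(blue, green, red, nir).values())

# Illustrative reflectance values for a single vegetated pixel.
features = feature_vector(blue=0.04, green=0.08, red=0.06, nir=0.45)
print(len(features))
```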
The processing of the feature variables comprises the following steps:
41) calculating the feature weights of the initially selected feature factors of the powdery mildew monitoring model with the Relief method:
selecting an initially selected feature factor $\sigma$ from the initial feature set of the remote sensing image and calculating its feature weight, and repeating until the feature weights of all initially selected feature factors in the initial feature set have been calculated;
42) setting a weight threshold, and comparing the mean weight of the initially selected feature factor $\sigma$ with the weight threshold; if the mean weight is larger than the threshold, $\sigma$ is placed in the screened feature set; if it is smaller than the threshold, $\sigma$ is discarded;
43) reducing the dimension of the screened feature set with the mRMR method: within the feature set screened by the Relief method, mutual information is used to measure the correlation between the features and the disease category and the redundancy among the features in the feature subset, and n feature variables are selected.
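The threshold screening of step 42) amounts to a simple filter, sketched below. The feature names and weights are invented for illustration; only the threshold value of 2500 echoes the embodiment described later. The text does not say how a weight exactly equal to the threshold is handled, so discarding it here is an assumption.

```python
from typing import Dict, List

def screen_by_weight(mean_weights: Dict[str, float], threshold: float) -> List[str]:
    """Keep the features whose mean Relief weight is strictly above the threshold."""
    return [name for name, weight in mean_weights.items() if weight > threshold]

# Hypothetical mean weights, for illustration only.
mean_weights = {"NDVI": 3100.0, "SR": 2650.0, "Blue": 900.0, "EVI": 2500.0}
screened = screen_by_weight(mean_weights, threshold=2500.0)
print(screened)  # ['NDVI', 'SR']
```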
The construction and optimization of the powdery mildew monitoring model comprise the following steps:
51) constructing the powdery mildew monitoring model on the basis of a support vector machine, taking the n feature variables as the sample set;
52) minimizing the regression function: the support vector machine searches for $\omega$ and $b$ such that the structural risk of the regression function $f(x) = \omega^{T}\varphi(x) + b$ is minimized;
introducing non-negative slack variables $\xi_i$, $\xi_i^{*}$ and a penalty factor $C$, the problem to be optimized is expressed as:
$$\min_{\omega,\, b,\, \xi,\, \xi^{*}} \; \frac{1}{2}\lVert\omega\rVert^{2} + C\sum_{i}\left(\xi_i + \xi_i^{*}\right) \quad \text{s.t.} \quad \begin{cases} y_i - \omega^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\ \omega^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\ \xi_i \ge 0,\ \xi_i^{*} \ge 0, \end{cases}$$
wherein $y_i$ is the label of sample $x_i$, $\varepsilon$ is the width of the insensitive band, $C$ is a constant, and $\xi_i$ and $\xi_i^{*}$ control the upper bound and the lower bound of the output constraint;
53) optimizing the penalty factor and the kernel parameter with a genetic algorithm, thereby optimizing the powdery mildew monitoring model.
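To make the structural risk of step 52) concrete, the sketch below evaluates it for a given $\omega$ and $b$. It assumes the standard $\varepsilon$-insensitive support vector regression form, in which the two slack variables of a sample sum to $\max(0, |y_i - f(x_i)| - \varepsilon)$ at the optimum, and it takes $\varphi$ as the identity map (a linear kernel) for brevity. The function name and the sample numbers are hypothetical.

```python
from typing import List, Sequence

def structural_risk(
    w: Sequence[float],
    b: float,
    samples: List[Sequence[float]],
    targets: Sequence[float],
    C: float,
    epsilon: float,
) -> float:
    """0.5 * ||w||^2 + C * sum of slack, with phi taken as the identity map."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    slack_total = 0.0
    for x, y in zip(samples, targets):
        f = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Slack is zero while the residual stays inside the epsilon band.
        slack_total += max(0.0, abs(y - f) - epsilon)
    return regularizer + C * slack_total

# Illustrative values: the first sample lies inside the band, the second does not.
risk = structural_risk(
    w=[1.0, 0.0], b=0.0,
    samples=[[1.0, 2.0], [3.0, 1.0]], targets=[1.05, 2.0],
    C=10.0, epsilon=0.1,
)
print(round(risk, 6))  # 9.5
```

A larger penalty factor $C$ makes residuals outside the band more costly relative to the regularizer, which is why $C$ is one of the parameters tuned in step 53).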
The specific steps of calculating the feature weight are as follows:
61) randomly selecting a sample point A from the n actual sample points;
62) assuming that sample point A is a healthy sample, searching for its two nearest-neighbor samples: the nearest-neighbor healthy sample H and the nearest-neighbor diseased sample M;
63) calculating the difference $\sigma_{AH}$ of the initially selected feature factor between sample point A and the nearest-neighbor healthy sample H;
64) calculating the difference $\sigma_{AM}$ of the initially selected feature factor between sample point A and the nearest-neighbor diseased sample M;
65) if $\sigma_{AH} < \sigma_{AM}$, the initially selected feature factor $\sigma$ helps to distinguish nearest neighbors of the same class from those of a different class, and the weight of $\sigma$ is increased;
66) if $\sigma_{AH} > \sigma_{AM}$, the initially selected feature factor $\sigma$ does not help to distinguish nearest neighbors of the same class from those of a different class, and the weight of $\sigma$ is decreased;
67) repeating from step 61) m times, and calculating the mean of the m weights of the initially selected feature factor $\sigma$.
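Steps 61) to 67) follow the classic Relief scheme, which the sketch below implements in plain Python. Several details are assumptions not stated in the text: per-feature differences are normalized by the feature range, nearest neighbors are found with the summed normalized difference, and each round adds (miss difference minus hit difference) divided by m. The toy data are invented, so the resulting weights are not on the scale reported in the figures.

```python
import random
from typing import List, Sequence

def relief_weights(X: List[Sequence[float]], y: Sequence[int], m: int, seed: int = 0) -> List[float]:
    """Mean Relief weight per feature after m random draws (needs >= 2 samples per class)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Feature ranges, used to normalize differences into [0, 1].
    spans = []
    for f in range(d):
        column = [row[f] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(f: int, a: int, b: int) -> float:
        return abs(X[a][f] - X[b][f]) / spans[f]

    def dist(a: int, b: int) -> float:
        return sum(diff(f, a, b) for f in range(d))

    weights = [0.0] * d
    for _ in range(m):
        a = rng.randrange(n)                                   # sample point A
        hits = [i for i in range(n) if i != a and y[i] == y[a]]
        misses = [i for i in range(n) if y[i] != y[a]]
        h = min(hits, key=lambda i: dist(a, i))                # nearest same-class sample H
        mi = min(misses, key=lambda i: dist(a, i))             # nearest other-class sample M
        for f in range(d):
            # Weight rises when the miss difference exceeds the hit difference.
            weights[f] += (diff(f, a, mi) - diff(f, a, h)) / m
    return weights

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.10, 0.5], [0.20, 0.9], [0.15, 0.1], [0.90, 0.4], [0.80, 0.8], [0.85, 0.2]]
y = [0, 0, 0, 1, 1, 1]
w = relief_weights(X, y, m=20)
print(w)
```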
The dimension reduction of the screened feature set with the mRMR method comprises the following steps:
71) given two random variables X and Y with probability density functions $p(x)$, $p(y)$ and joint density $p(x, y)$, the mutual information between X and Y is expressed as:
$$I(X; Y) = \iint p(x, y)\,\log\frac{p(x, y)}{p(x)\,p(y)}\,dx\,dy$$
72) using mutual information to calculate the maximum correlation between the screened feature set S and the disease category c, expressed as:
$$\max D(S, c), \quad D = \frac{1}{|S|}\sum_{x_i \in S} I(x_i; c)$$
73) using mutual information to calculate the redundancy between the features $x_i$ and $x_j$ in the screened feature set S, expressed as:
$$\min R(S), \quad R = \frac{1}{|S|^{2}}\sum_{x_i,\, x_j \in S} I(x_i; x_j)$$
74) using the mutual information difference criterion, selecting the feature with the largest difference between correlation and redundancy, which gives one feature with minimum redundancy and maximum correlation, and placing it in the preferred feature set required for modeling;
75) repeating from step 72) n times on the remaining feature set, finally obtaining the n preferred features required for modeling.
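Steps 71) to 75) can be sketched as a greedy selection loop. This version assumes the features have already been discretized, so mutual information is estimated from counts rather than from continuous densities, and it scores each candidate as relevance minus mean redundancy with the features already chosen. The feature names and values are invented.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

def mutual_information(a: Sequence[int], b: Sequence[int]) -> float:
    """Plug-in estimate of I(a; b) in nats for two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log((c / n) / ((pa[u] / n) * (pb[v] / n)))
        for (u, v), c in pab.items()
    )

def mrmr(features: Dict[str, Sequence[int]], labels: Sequence[int], k: int) -> List[str]:
    """Greedily pick k features by the mutual information difference criterion."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected: List[str] = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name: str) -> float:
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy discretized data: "ndvi_copy" duplicates "ndvi" and is therefore redundant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "ndvi": [0, 0, 0, 1, 1, 1, 1, 1],
    "ndvi_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "green": [0, 1, 0, 0, 1, 1, 0, 1],
}
chosen = mrmr(features, labels, k=2)
print(chosen)  # ['ndvi', 'green']
```

The duplicate feature is as relevant as the original but is skipped in the second round because its redundancy with the already selected feature outweighs that relevance.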
The optimization of the penalty factor and the kernel parameter with the genetic algorithm comprises the following steps:
81) taking the n preferred features required for modeling as sample data to form the training samples;
82) calculating the average relative error between the output values of the training samples and the expected values;
83) performing selection, crossover and mutation operations on the training samples;
84) judging whether the initially set maximum number of generations has been reached, and obtaining the optimal penalty factor and kernel parameter when the condition is met.
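The loop of steps 82) to 84) can be sketched as a small real-coded genetic algorithm. Everything specific here is an assumption: the tournament selection, arithmetic crossover and uniform mutation operators, the rates, and above all `toy_error`, which stands in for training the support vector machine and measuring its average relative error. The two genes are taken to be log10 of the penalty factor and log10 of the kernel parameter.

```python
import random
from typing import Callable, List, Sequence, Tuple

def genetic_search(
    error_fn: Callable[[Sequence[float]], float],
    bounds: List[Tuple[float, float]],
    pop_size: int = 20,
    generations: int = 30,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> List[float]:
    """Return the lowest-error individual seen within the generation limit."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(population, key=error_fn)
    for _ in range(generations):              # stop at the maximum number of generations
        def pick() -> List[float]:
            a, b = rng.sample(population, 2)  # tournament selection
            return a if error_fn(a) < error_fn(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < crossover_rate:  # arithmetic crossover
                alpha = rng.random()
                child = [alpha * u + (1 - alpha) * v for u, v in zip(p1, p2)]
            else:
                child = list(p1)
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:  # uniform mutation within bounds
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = children
        best = min(population + [best], key=error_fn)  # keep the best seen so far
    return best

# Hypothetical error surface with its minimum at log10(C) = 1, log10(gamma) = -1.
def toy_error(params: Sequence[float]) -> float:
    return (params[0] - 1.0) ** 2 + (params[1] + 1.0) ** 2

bounds = [(-2.0, 3.0), (-4.0, 1.0)]
best = genetic_search(toy_error, bounds)
print(best, toy_error(best))
```

In a real pipeline `error_fn` would train the model with the candidate parameters and return its error on the training samples, which is far more expensive than this stand-in.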
Advantageous effects
Compared with the prior art, the wheat powdery mildew remote sensing monitoring method with a disease characteristic preprocessing function combines two feature selection techniques, Relief and mRMR, with a support vector machine optimized by a genetic algorithm, achieving effective remote sensing monitoring of powdery mildew at the regional scale.
The invention combines the Relief algorithm and the mRMR algorithm to reduce the feature dimension, and sets a threshold that favors features with good discriminating power, so as to obtain the optimal feature subset. The monitoring model is optimized with a genetic algorithm, which is good at finding global optima and is robust, so the monitoring accuracy of the model is improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2a shows the weight of each of the 14 initially selected features as calculated by the Relief method;
FIG. 2b shows the number of the 14 initially selected features falling in each weight interval, with an interval width of 500;
FIG. 3a is a spatial distribution map of the wheat powdery mildew monitoring result obtained with features screened by the Relief algorithm and the GSSVM;
FIG. 3b is a spatial distribution map of the monitoring result obtained with features screened by the Relief algorithm and the GASVM;
FIG. 3c is a spatial distribution map of the monitoring result obtained with features screened by the Relief + mRMR algorithm and the GSSVM;
FIG. 3d is a spatial distribution map of the monitoring result obtained with features screened by the Relief + mRMR algorithm and the GASVM according to the present invention;
FIG. 4a is a local distribution map of the monitoring result obtained with features screened by the Relief + mRMR algorithm and the GSSVM;
FIG. 4b is a local distribution map of the monitoring result obtained with features screened by the Relief + mRMR algorithm and the GASVM according to the present invention.
Detailed Description
To make the technical features of the present invention easier to understand, the invention is described in more detail below with reference to the embodiments and the accompanying drawings:
As shown in FIG. 1, the wheat powdery mildew remote sensing monitoring method with a disease characteristic preprocessing function uses the Relief method, which runs efficiently and, by calculating the weights of image features, gives higher weights to features with strong classification ability. Because the Relief algorithm does not take the correlation among image features into account and cannot remove redundancy among them, the mRMR method is then used to remove that redundancy, yielding a feature subset with minimum redundancy among features and maximum correlation between the features and the target. Combining the two feature selection methods, Relief and mRMR, screens out features that are highly relevant to the disease and minimally redundant. The method comprises the following steps:
Step 1: acquiring and preprocessing remote sensing data. Remote sensing data of the study area are acquired and preprocessed, and the wheat planting area is extracted by maximum likelihood classification. The specific steps are as follows:
(1) Acquire remote sensing data of the area affected by wheat powdery mildew, and perform radiometric calibration and atmospheric correction of the image, wherein the radiometric calibration formula is:
$$L(\lambda) = \mathrm{Gain} \cdot \mathrm{DN} + \mathrm{Bias},$$
wherein $L(\lambda)$ is the radiance, Gain is the gain coefficient, Bias is the bias coefficient, and DN is the observed gray value (digital number).
(2) Convert the image radiance into reflectance with the FLAASH atmospheric correction module of the ENVI 5.1 remote sensing processing software.
(3) Crop the image to obtain the image of the study area.
(4) Extract the wheat planting area of the study area with the maximum likelihood classification method in ENVI 5.1, based on the normalized vegetation index and the near-infrared reflectance data of the study area.
Step 2: extracting feature variables. Field survey data samples of wheat are acquired, and the features required for wheat powdery mildew monitoring are extracted from the preprocessed remote sensing image. The specific steps are as follows:
(1) Acquire wheat field survey data samples comprising n sample points, each manually labeled as a healthy sample or a diseased sample.
In practical application, the field survey data can be collected during the wheat grain-filling stage using a five-point survey method: at each survey point a 1 m × 1 m quadrat is laid out and its center is located by Global Positioning System (GPS); 5 symmetric points are selected evenly within the quadrat, 20 wheat plants are selected for survey, and the occurrence of powdery mildew within the quadrat is recorded. Disease severity is recorded with the 0-9 grading method for wheat powdery mildew, adapted from the agricultural industry standard (NY/T 613-2002): during the survey each wheat plant is divided evenly into 9 sections from top to bottom and graded according to the grading standard, and the disease index (DI) is then calculated. In the invention, considering the difficulty of monitoring too many grades, the degree of disease occurrence is divided into only two classes: healthy and diseased.
(2) Select the blue, green, red and near-infrared reflectance of the remote sensing data, together with broadband vegetation indices, as the initially selected feature factors of the powdery mildew monitoring model; these factors form the initial feature set of the remote sensing image. The broadband vegetation indices comprise the ratio vegetation index, triangular vegetation index, green-band normalized vegetation index, enhanced vegetation index, normalized vegetation index, optimized soil-adjusted vegetation index, soil-adjusted vegetation index, modified triangular vegetation index, modified simple ratio index and renormalized vegetation index.
Step 3: processing the feature variables. Because the initially selected features are high-dimensional, using them directly to build a model increases the amount of computation; moreover, unscreened features may be only weakly correlated, or even uncorrelated, with the degree of disease occurrence, and redundancy exists among the features, all of which directly affects model accuracy. The Relief and mRMR techniques are therefore combined to reduce the dimension of the wheat powdery mildew features.
The weights of the wheat powdery mildew features are calculated with the Relief technique and a threshold is set; after threshold screening, the mRMR technique is used to select the feature subset with the maximum correlation with the target class and the minimum redundancy among features, as the input variables of the powdery mildew monitoring model. The specific steps are as follows:
(1) Calculate the feature weights of the initially selected feature factors of the powdery mildew monitoring model with the Relief method.
Select an initially selected feature factor $\sigma$ from the initial feature set of the remote sensing image and calculate its feature weight, repeating until the feature weights of all initially selected feature factors in the initial feature set have been calculated.
The specific steps of calculating the feature weight are as follows:
A. Randomly select a sample point A from the n actual sample points.
B. Assuming that the randomly selected sample point A is a healthy sample, search for its two nearest-neighbor samples: the nearest-neighbor healthy sample H and the nearest-neighbor diseased sample M.
If the randomly selected sample point A is a diseased sample, the same procedure applies with the classes reversed.
For example, assuming that the randomly selected sample point A is a diseased sample, the nearest-neighbor diseased sample H is searched for around A, and the difference of the initially selected feature factor between A and H is $\sigma_{AH}$; the nearest-neighbor healthy sample M is searched for around A, and the difference of the initially selected feature factor between A and M is $\sigma_{AM}$.
C. Calculate the difference $\sigma_{AH}$ of the initially selected feature factor between sample point A and the nearest-neighbor healthy sample H.
D. Calculate the difference $\sigma_{AM}$ of the initially selected feature factor between sample point A and the nearest-neighbor diseased sample M.
E. If $\sigma_{AH} < \sigma_{AM}$, the initially selected feature factor $\sigma$ helps to distinguish nearest neighbors of the same class from those of a different class, and the weight of $\sigma$ is increased.
F. If $\sigma_{AH} > \sigma_{AM}$, the initially selected feature factor $\sigma$ does not help to distinguish nearest neighbors of the same class from those of a different class, and the weight of $\sigma$ is decreased.
This comparison evaluates how well a feature discriminates between nearby samples, and thereby determines its relevance.
If the distance $\sigma_{AH}$ between sample A and sample H on the feature is less than the distance $\sigma_{AM}$ between sample A and sample M on the feature, the feature helps to distinguish nearest neighbors of the same class from those of a different class, and its weight is increased; conversely, if $\sigma_{AH}$ is greater than $\sigma_{AM}$, the feature has a negative effect on distinguishing nearest neighbors of the same class from those of a different class, and its weight is decreased. The larger the weight of a feature factor, the stronger the classification ability of the feature; the smaller the weight, the weaker the classification ability.
G. Repeat from step A (randomly selecting a sample point A from the n actual sample points) m times, and calculate the mean of the m weights of the initially selected feature factor $\sigma$, giving the mean weight of each initially selected feature factor (the blue, green, red and near-infrared reflectance of the remote sensing data and the broadband vegetation indices).
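The text describes the weight change only qualitatively (increase or decrease). One common way to write steps E to G, offered here as an assumption rather than as the formula used in the invention, is the classic Relief update, applied once per random draw of A with differences normalized to $[0, 1]$:

$$W_{\sigma} \leftarrow W_{\sigma} + \frac{1}{m}\left(\sigma_{AM} - \sigma_{AH}\right)$$

With this form the weight rises exactly when $\sigma_{AH} < \sigma_{AM}$ and falls when $\sigma_{AH} > \sigma_{AM}$, and after m draws $W_{\sigma}$ is already the mean of the m individual contributions.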
(2) Set a weight threshold, and compare the mean weight of the initially selected feature factor $\sigma$ with the weight threshold; if the mean weight is larger than the threshold, $\sigma$ is placed in the screened feature set; if it is smaller than the threshold, $\sigma$ is discarded. The screened feature set is the feature set obtained after screening all initially selected feature factors, so that features with large influence and high correlation are retained.
(3) Reduce the dimension of the screened feature set with the mRMR method: within the feature set screened by the Relief method, mutual information is used to measure the correlation between the features and the disease category and the redundancy among the features in the feature subset, and the n required feature variables are selected.
The mRMR method is a typical information-theoretic feature dimension reduction algorithm. It is used to find, among the features screened by the Relief technique, the n features that have the maximum correlation with the disease category and the minimum redundancy with one another. The specific steps are as follows:
A. Given two random variables X and Y with probability density functions $p(x)$, $p(y)$ and joint density $p(x, y)$, the mutual information between X and Y is expressed as:
$$I(X; Y) = \iint p(x, y)\,\log\frac{p(x, y)}{p(x)\,p(y)}\,dx\,dy$$
B. Use mutual information to calculate the maximum correlation between the screened feature set S and the disease category c, expressed as:
$$\max D(S, c), \quad D = \frac{1}{|S|}\sum_{x_i \in S} I(x_i; c)$$
C. Use mutual information to calculate the redundancy between the features $x_i$ and $x_j$ in the screened feature set S, expressed as:
$$\min R(S), \quad R = \frac{1}{|S|^{2}}\sum_{x_i,\, x_j \in S} I(x_i; x_j)$$
D. Using the mutual information difference criterion, select the feature with the largest difference between correlation and redundancy, which gives one feature with minimum redundancy and maximum correlation, and place it in the preferred feature set required for modeling.
E. Repeat from step B (calculating the maximum correlation between the screened feature set S and the disease category c using mutual information) n times on the remaining feature set, finally obtaining the n preferred features required for modeling.
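Steps D and E are usually written as a single incremental rule. Assuming $S_{k-1}$ denotes the $k-1$ features already placed in the preferred set (a symbol introduced here for illustration), the $k$-th feature chosen under the mutual information difference criterion is:

$$x_k = \arg\max_{x_j \in S \setminus S_{k-1}} \left[ I(x_j; c) - \frac{1}{k-1}\sum_{x_i \in S_{k-1}} I(x_j; x_i) \right]$$

The first term rewards correlation with the disease category and the second penalizes the mean redundancy with the features selected so far; for $k = 1$ only the first term applies.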
Here, the relief algorithm is highly efficient in operation, and features having a high classification capability are given a high weight by calculating the weight of the features, but the relief algorithm does not consider the correlation existing between the features, and therefore, the redundancy between the features cannot be removed. The mRMR algorithm is able to obtain a subset of features with minimal redundancy between features and maximal correlation between features and the target, but cannot compute the weight magnitude. Therefore, the relief algorithm and the mRMR algorithm are combined to reduce the dimension of the features, and the weight of the features with good discrimination is improved by setting a certain threshold value to obtain the optimal feature subset.
As shown in fig. 2a, the abscissa is the serial number of each of the 14 initially selected features (the four reflectance bands of the remote sensing data, namely blue, green, red and near infrared, together with the ratio vegetation index, triangular vegetation index, green-band normalized vegetation index, enhanced vegetation index, normalized vegetation index, optimized soil-adjusted vegetation index, soil-adjusted vegetation index, improved triangular vegetation index, improved simple ratio index and renormalized vegetation index), and the ordinate is the weight of the feature. Fig. 2b shows the number of features in each weight range, counted in steps of 500.
Fig. 2a shows that the 14 features carry large weights. With reference to fig. 2b, when the weight threshold is set to 2500, 8 features meet the condition and are used as the initially selected features for the mRMR algorithm. The optimal three feature variables obtained by the mRMR algorithm are then used as the first group of feature variables, and the three feature variables with the largest weights obtained by the relief algorithm are used as the second group of feature variables.
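The two-stage screening described here (weight threshold first, mRMR second) can be sketched as follows. This is a minimal illustration under stated assumptions: the relief weights are taken as already computed, mutual information is estimated by equal-width binning and counting, and the names `mutual_information`, `discretize` and `select_features` are hypothetical helpers introduced for this sketch.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def discretize(values, bins=4):
    """Equal-width binning so that probabilities can be estimated by counting."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature: avoid division by zero
    return [min(int((v - lo) / width), bins - 1) for v in values]


def select_features(columns, labels, weights, threshold, k):
    """Stage 1: keep features whose relief weight exceeds the threshold.
    Stage 2: greedy mRMR with the difference criterion picks k survivors."""
    kept = {name: discretize(values) for name, values in columns.items()
            if weights[name] > threshold}
    relevance = {name: mutual_information(col, labels)
                 for name, col in kept.items()}
    selected = []
    while len(selected) < min(k, len(kept)):
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(kept[name], kept[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy

        candidates = [name for name in kept if name not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

With a threshold of 2500 and k = 3, a call such as `select_features(columns, labels, weights, 2500, 3)` would mirror the configuration described above, where `columns` maps each feature name to its per-sample values. The greedy loop can prefer a less relevant feature over a more relevant one when the latter largely duplicates a feature already selected.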
Fourthly, construct and optimize the powdery mildew monitoring model. A powdery mildew monitoring model is constructed based on a support vector machine and then optimized, forming the genetic-algorithm-optimized support vector machine (GASVM).
The construction and optimization of the powdery mildew monitoring model comprise the following steps:
(1) Construct the powdery mildew monitoring model on the basis of a support vector machine, taking the n feature variables as the sample set.
(2) Regression function minimization: the support vector machine seeks φ(x) and b such that the structural risk of the regression function f(x) = ω^T φ(x) + b is minimized;
non-negative slack variables ξ_i and a penalty factor C are introduced, and the problem to be optimized is expressed as:
Figure GDA0002768017210000111
where C is a constant, and ξ_i and
Figure GDA0002768017210000112
control the upper and lower bounds of the output constraint.
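The formula image for the optimization problem is not reproduced in this text. Assuming the standard ε-insensitive support vector regression formulation, which is consistent with the regression function, slack variables and penalty factor named above, the problem can be written as follows. The tolerance ε, the sample count l, the targets y_i and the second slack variable ξ_i^* are conventional symbols assumed for this sketch.

```latex
\begin{align}
\min_{\omega,\,b,\,\xi,\,\xi^{*}}\quad
  & \frac{1}{2}\,\lVert\omega\rVert^{2} + C\sum_{i=1}^{l}\bigl(\xi_i + \xi_i^{*}\bigr) \\
\text{s.t.}\quad
  & y_i - \omega^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\
  & \omega^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0,\quad \xi_i^{*} \ge 0,\quad i = 1,\dots,l
\end{align}
```

In this form the first term limits model complexity, while C sets how strongly deviations beyond the tolerance are penalized, which is why C and the kernel parameter are the two quantities tuned in the next step.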
(3) Optimize the penalty factor and the kernel parameter with a genetic algorithm, thereby optimizing the powdery mildew monitoring model. The optimization follows the conventional genetic algorithm procedure, with the following specific steps:
A. Take the n optimal features required for modeling as sample data to form the training samples, and set aside validation samples to verify the reliability of the model;
B. Calculate the average relative error between the output values of the training samples and the expected values;
C. Carry out selection, crossover and mutation operations on the training samples;
D. Judge whether the preset maximum number of generations has been reached; when the condition is met, the optimal penalty factor and kernel parameter are obtained.
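The search loop in steps B to D can be sketched as a small real-coded genetic algorithm. This is a minimal illustration under stated assumptions: the sketch applies selection, crossover and mutation to a population of candidate parameter pairs (the usual reading of step C), the `fitness` callable stands in for the average relative error of a model trained with those parameters, and the function name `genetic_search`, tournament selection, arithmetic crossover and elitism are choices made here rather than details given in the patent.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimise fitness(params) inside box bounds with a simple genetic algorithm.

    fitness: callable mapping a parameter list to an error value (lower is better).
    bounds:  list of (low, high) pairs, e.g. for log2(C) and log2(kernel parameter).
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=fitness)

    def tournament():
        # Selection: the better of two randomly drawn individuals survives.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    for _ in range(generations):  # stop at the preset maximum generation count
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                # Crossover: random convex combination of the two parents.
                t = rng.random()
                child = [t * u + (1 - t) * v for u, v in zip(p1, p2)]
            else:
                child = list(p1)
            for i, (lo, hi) in enumerate(bounds):
                # Mutation: occasionally resample one gene inside its bounds.
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        pop = children
        best = min(pop + [best], key=fitness)  # elitism: never lose the best
    return best
```

In practice the fitness function would train a support vector machine with the decoded penalty factor and kernel parameter on the training samples and return its error on held-out data; a quadratic test function is enough to check that the loop behaves as intended.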
Fifthly, obtain the remote sensing monitoring result for wheat powdery mildew. The feature variables extracted pixel by pixel from the remote sensing image are stored in a matrix A as the input variables of the powdery mildew monitoring model, and the geographic coordinates of each pixel are extracted and stored in a matrix B. Matrix A is input into the optimized powdery mildew monitoring model to obtain the result matrix C of the wheat powdery mildew condition in the study area. The result matrix C is then combined with the geographic coordinate matrix B and plotted to obtain the spatial distribution map of the wheat powdery mildew monitoring result for the study area.
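The data flow of this step (matrix A of features, matrix B of coordinates, result matrix C) can be sketched as follows. This is a minimal illustration, assuming each feature is held as a 2-D grid, a boolean wheat mask marks the pixels to classify, and `predict` is a hypothetical callable standing in for the optimized monitoring model. None of these names come from the patent.

```python
def classify_image(feature_bands, geo_coords, wheat_mask, predict):
    """Return (x, y, label) triples for every wheat pixel.

    feature_bands: list of 2-D grids, one per selected feature variable.
    geo_coords:    2-D grid of (x, y) geographic coordinates per pixel.
    wheat_mask:    2-D grid of booleans, True where wheat is planted.
    predict:       callable mapping one feature vector to a class label.
    """
    rows, cols = len(wheat_mask), len(wheat_mask[0])
    matrix_a, matrix_b = [], []
    for r in range(rows):
        for c in range(cols):
            if wheat_mask[r][c]:
                # Matrix A: one row of feature values per pixel.
                matrix_a.append([band[r][c] for band in feature_bands])
                # Matrix B: the matching geographic coordinates.
                matrix_b.append(geo_coords[r][c])
    # Matrix C: model output for each row of A.
    matrix_c = [predict(features) for features in matrix_a]
    # Combining B and C gives the points to draw on the distribution map.
    return [(x, y, label) for (x, y), label in zip(matrix_b, matrix_c)]
```

Keeping B in the same row order as A is the design point: each result in C can be placed back at its geographic position without recomputing pixel indices.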
In practical application, validating the model with independent sample data better reflects its actual accuracy. The invention evaluates the 6 models using the field survey data of May 24, 2014 from the study area in Shijiazhuang City, Hebei Province.
Table 1 lists the user accuracy, mapping accuracy (producer's accuracy), overall accuracy and Kappa coefficient of the powdery mildew monitoring models built by combining the 2 feature selection schemes (relief alone and relief + mRMR) with the 3 modeling methods (SVM, the grid-search-optimized support vector machine GSSVM, and the genetic-algorithm-optimized support vector machine GASVM).
Table 1. Comparison of validation results of the SVM, GSSVM and GASVM modeling methods
Figure GDA0002768017210000121
Figure GDA0002768017210000131
As can be seen from Table 1, the overall accuracy and Kappa coefficient of the SVM monitoring models are lower than those of the GSSVM and GASVM monitoring models. The relief + mRMR-SVM monitoring model improves on the relief-SVM monitoring model, but its overall accuracy and Kappa coefficient are still only 64% and 0.286. Of the 2 GSSVM monitoring models, the one built with the relief + mRMR algorithm has an overall accuracy of 78.5% and a Kappa coefficient of 0.571, higher than the one built with the relief algorithm alone. Of the 2 GASVM monitoring models, the one built with the relief + mRMR algorithm has an overall accuracy of 85.7% and a Kappa coefficient of 0.714, higher than the model built with the features screened by the relief algorithm alone. This shows that screening out the features most relevant to the disease and removing the redundancy among features can effectively improve model accuracy.
Comparing the overall accuracy, user accuracy and mapping accuracy of the models built by the 3 modeling methods, the monitoring model that combines relief and mRMR feature screening with the GASVM (relief + mRMR-GASVM) has the highest accuracy: its overall accuracy is 21.7 and 7.2 percentage points higher than those of the corresponding SVM and GSSVM models, respectively, and its user accuracy and mapping accuracy both reach 85.7%. The results show that the models built with the relief + mRMR algorithm are superior to those built with the relief algorithm alone, that the monitoring model based on the genetic-algorithm-optimized SVM (GASVM) is superior to the non-optimized SVM and to the grid-search-optimized SVM (GSSVM), and that combining the relief + mRMR algorithm with the GASVM (the relief + mRMR-GASVM model of the invention) improves the accuracy of the powdery mildew monitoring model.
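The four accuracy measures compared here are all derived from a confusion matrix. The following is a minimal sketch of that calculation; the function name `accuracy_report` and the matrix layout (rows are field-surveyed classes, columns are model outputs) are assumptions for illustration, and the example matrix is hypothetical, since Table 1 is an image that is not reproduced in this text.

```python
def accuracy_report(confusion):
    """Accuracy measures from a square confusion matrix.

    confusion[i][j] = number of samples whose reference (field) class is i
    and whose class assigned by the model is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    diagonal = [confusion[i][i] for i in range(n)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return {
        "overall": overall,
        "kappa": (overall - expected) / (1 - expected),
        # Mapping (producer's) accuracy: correct / reference total per class.
        "producer": [d / r for d, r in zip(diagonal, row_sums)],
        # User accuracy: correct / assigned total per class.
        "user": [d / c for d, c in zip(diagonal, col_sums)],
    }


# Hypothetical balanced two-class example (healthy, diseased).
report = accuracy_report([[12, 2], [2, 12]])
```

For this illustrative matrix the overall accuracy is 24/28 and the chance agreement is 0.5, so the Kappa coefficient is (24/28 - 0.5) / 0.5. For two balanced classes, Kappa therefore equals twice the overall accuracy minus one.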
Similarly, take the remote sensing image of May 26, 2014 of the study area in Shijiazhuang City, Hebei Province as an example. Fig. 3a is the spatial distribution map of the wheat powdery mildew monitoring result obtained with the features screened by the relief algorithm and the GSSVM; fig. 3b is the monitoring result obtained with the features screened by the relief algorithm and the GASVM; fig. 3c is the monitoring result obtained with the features screened by the relief + mRMR algorithm and the GSSVM; fig. 3d is the monitoring result obtained with the features screened by the relief + mRMR algorithm and the GASVM. In figs. 3a to 3d, white areas are non-wheat-growing areas and black areas are diseased areas.
The diseased area accounts for 44.5% of the total area in fig. 3a, 62.1% in fig. 3b, 62.4% in fig. 3c and 60.4% in fig. 3d. The diseased area in fig. 3a is smaller than in the other three maps and deviates more from the wheat powdery mildew condition found in the field survey. Comparing figs. 3a, 3b, 3c and 3d, the diseased areas monitored in figs. 3a and 3b are scattered. Powdery mildew is caused by the powdery mildew fungus and is characterized by fast spread, a wide spread area and non-scattered occurrence, so the monitoring results in figs. 3a and 3b contradict the characteristics of the disease, whereas the monitoring results in figs. 3c and 3d are more consistent with how powdery mildew occurs and are more credible. Figs. 3c and 3d are broadly consistent but differ in detail.
As shown in figs. 4a and 4b, fig. 4a is a local distribution map of the monitoring result obtained with the features screened by the relief + mRMR algorithm and the GSSVM, and fig. 4b is a local distribution map of the monitoring result obtained with the features screened by the relief + mRMR algorithm and the GASVM.
In figs. 4a and 4b, white areas are non-wheat-growing areas and black areas are diseased areas. Comparing the two, areas classified as healthy in fig. 4b are classified as diseased in fig. 4a. In fig. 4b, healthy and diseased wheat plots are evenly distributed, whereas in fig. 4a most plots are entirely diseased and only a few show an even distribution. The comparison of the two local distribution maps shows that the monitoring model combining the features screened by the relief + mRMR algorithm with the GSSVM does not fit the actual situation well: although its overall monitoring trend is consistent with reality, its ability to discriminate detail is weaker than that of the monitoring model combining the features screened by the relief + mRMR algorithm with the GASVM. The relief + mRMR-GASVM model thus remains applicable to local monitoring.
In conclusion, the relief + mRMR-GASVM model of the invention is more consistent with the actual situation in terms of overall trend, disease characteristics and local detail, and can truly reflect the occurrence of wheat powdery mildew. It can meet the need for real-time monitoring of wheat powdery mildew in daily production, and by accurately capturing the occurrence and spatial distribution of powdery mildew it provides a basis for planned prevention and control and helps improve wheat yield.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are merely illustrative of the principles of the invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (3)

1. A wheat powdery mildew remote sensing monitoring method with a disease characteristic preprocessing function is characterized by comprising the following steps:
11) obtaining and preprocessing remote sensing data, obtaining the remote sensing data of a research area, preprocessing the remote sensing data, and extracting the planting area of wheat by a maximum likelihood classification method;
12) extracting characteristic variables, namely acquiring a field investigation data sample of the wheat, and extracting the characteristics required by wheat powdery mildew monitoring by utilizing the preprocessed remote sensing image;
13) processing characteristic variables, namely calculating the weight of wheat powdery mildew characteristics by using a relief technology, and after threshold screening, selecting a characteristic subset with the maximum correlation with a target class and the minimum redundancy among the characteristic subsets by using an mRMR technology as an input variable of a powdery mildew monitoring model; the processing of the characteristic variables comprises the following steps:
131) calculating the characteristic weight by using a relief method aiming at the initially selected characteristic factors of the powdery mildew monitoring model;
selecting a primary characteristic factor of a powdery mildew monitoring model from a primary characteristic set of the remote sensing image, and calculating characteristic weight by using the primary characteristic factor sigma until all primary characteristic factors in the primary characteristic set of the remote sensing image are calculated by the characteristic weight;
the specific steps of the calculation of the feature weight are as follows:
51) randomly selecting a sample point A from the n actual sample points;
52) assuming that the sample point A is a healthy sample, searching for the two nearest neighbor samples around the sample point A, one being the nearest neighbor healthy sample H and the other being the nearest neighbor disease sample M;
53) calculating the difference value σ_AH of the primary selection characteristic factor between the sample point A and the nearest neighbor healthy sample H;
54) calculating the difference value σ_AM of the primary selection characteristic factor between the sample point A and the nearest neighbor disease sample M;
55) if σ_AH < σ_AM, the primary selection characteristic factor σ is favorable for distinguishing the nearest neighbors of the same class and of different classes, and the weight of the primary selection characteristic factor σ is increased;
56) if σ_AH > σ_AM, the primary selection characteristic factor σ is not favorable for distinguishing the nearest neighbors of the same class and of different classes, and the weight of the primary selection characteristic factor σ is reduced;
57) repeating the steps from 51) m times, and calculating the average value of the m weights of the primary selection characteristic factor σ;
132) setting a weight threshold, and comparing the average value of the primary selection characteristic factor sigma weight with the weight threshold; if the weight is larger than the weight threshold, the primarily selected feature factor sigma is classified into a screening feature set; if the weight is smaller than the weight threshold, discarding the primary selection characteristic factor sigma;
133) performing dimensionality reduction treatment on the features of the screened feature set by an mRMR method, measuring the correlation between the features and disease categories and between the features in the feature subset by utilizing mutual information in the screened feature set screened by a relief method, and screening n feature variables;
the dimension reduction processing of the features of the screening feature set by the mRMR method comprises the following steps:
61) given two random variables x and y, whose probability density functions corresponding to continuous variables are p (x), p (y), p (x, y), the mutual information between x and y is represented as follows:
Figure FDA0002732620880000023
62) calculating the maximum correlation between the screened feature set S and the disease category c by using the mutual information, wherein the expression is as follows:
Figure FDA0002732620880000021
63) calculating the redundancy between the features x_i and x_j in the screened feature set S by using the mutual information, wherein the expression is as follows:
Figure FDA0002732620880000022
64) selecting the feature with the largest difference value by using the mutual information difference criterion to obtain one feature with the smallest redundancy and the largest correlation, and classifying the feature into the optimal feature set required by modeling;
65) repeating the steps from 62) n times in the remaining sample set to finally obtain the n optimal features required by modeling;
14) constructing and optimizing a powdery mildew monitoring model, constructing the powdery mildew monitoring model based on a support vector machine, and optimizing the powdery mildew monitoring model;
the construction and optimization of the powdery mildew monitoring model comprise the following steps:
141) constructing a powdery mildew monitoring model on the basis of a support vector machine, and taking n characteristic variables as a sample set;
142) regression function minimization, wherein the support vector machine seeks φ(x) and b such that the structural risk of the regression function f(x) = ω^T φ(x) + b is minimized;
introducing non-negative slack variables ξ_i and a penalty factor C, wherein the problem to be optimized is expressed as:
Figure FDA0002732620880000031
wherein C is a constant, and ξ_i and
Figure FDA0002732620880000032
control the upper bound and the lower bound of the output constraint;
143) optimizing the penalty factor and the kernel parameter by using a genetic algorithm so as to realize the optimization of the powdery mildew monitoring model; the optimization of the penalty factor and the kernel parameter by using the genetic algorithm comprises the following steps:
71) taking the n optimal features required by modeling as sample data to form training samples;
72) calculating the average relative error between the output values of the training samples and the expected values;
73) carrying out selection, crossover and mutation operations on the training samples;
74) judging whether the preset maximum number of generations has been reached, and obtaining the optimal penalty factor and kernel parameter when the condition is met;
15) obtaining a wheat powdery mildew remote sensing monitoring result, storing extracted characteristic variables pixel by pixel from a remote sensing image into a matrix A as input variables of a powdery mildew monitoring model, extracting geographical coordinates of each pixel and storing the geographical coordinates into a matrix B, inputting the matrix A into the optimized powdery mildew monitoring model to obtain a matrix result C of a wheat powdery mildew monitoring condition in a research area, and drawing a monitoring result into a picture by combining the matrix result C and the geographical coordinate matrix B to obtain a wheat powdery mildew monitoring result spatial distribution map of the research area.
2. The remote sensing monitoring method for wheat powdery mildew with the disease characteristic preprocessing function of claim 1, wherein the acquisition and preprocessing of the remote sensing data comprises the following steps:
21) obtaining remote sensing data of wheat powdery mildew areas, and carrying out image radiometric calibration and atmospheric correction processing, wherein the image radiometric calibration formula is as follows:
L(λ)=Gain·DN+Bias,
wherein L(λ) is the radiance value, Gain is the gain coefficient, Bias is the bias coefficient, and DN is the observed gray value;
22) converting the radiance of the image into reflectivity by adopting a FLAASH atmospheric correction module in ENVI5.1 remote sensing processing software;
23) cutting the image to obtain an image of a region to be researched;
24) and extracting the wheat planting area of the research area by combining a maximum likelihood classification method in ENVI5.1 according to the normalized vegetation index and the near-infrared reflectivity data of the research area.
3. The remote sensing monitoring method for wheat powdery mildew with the disease characteristic preprocessing function of claim 1, wherein the extraction of the characteristic variables comprises the following steps:
31) obtaining a wheat field investigation data sample which comprises n sample points, wherein each sample point is marked as a health sample or a disease sample by an artificial label;
32) selecting four reflectivity data of blue, green, red and near infrared of remote sensing data and a broadband vegetation index as an initial characteristic factor of a powdery mildew monitoring model, wherein the initial characteristic factor forms an initial characteristic set of a remote sensing image, and the broadband vegetation index comprises a ratio vegetation index, a triangular vegetation index, a green-wave-band normalized vegetation index, an enhanced vegetation index, a normalized vegetation index, an optimized soil-adjusted vegetation index, a soil-adjusted vegetation index, an improved triangular vegetation index, an improved simple ratio index and a renormalized vegetation index.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810383173.4A CN108596104B (en) 2018-04-26 2018-04-26 Wheat powdery mildew remote sensing monitoring method with disease characteristic preprocessing function

Publications (2)

Publication Number Publication Date
CN108596104A CN108596104A (en) 2018-09-28
CN108596104B true CN108596104B (en) 2021-01-05

Family

ID=63609419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810383173.4A Active CN108596104B (en) 2018-04-26 2018-04-26 Wheat powdery mildew remote sensing monitoring method with disease characteristic preprocessing function

Country Status (1)

Country Link
CN (1) CN108596104B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711319B (en) * 2018-12-24 2023-04-07 安徽高哲信息技术有限公司 Method and system for establishing imperfect grain image recognition sample library
CN109948237B (en) * 2019-03-15 2023-06-02 中国汽车技术研究中心有限公司 Method for predicting emission of bicycle
CN110008905B (en) * 2019-04-09 2021-02-12 安徽大学 Regional scale wheat stripe rust monitoring method based on red-edge wave band of remote sensing image
CN110188727B (en) * 2019-06-05 2021-11-05 中煤航测遥感集团有限公司 Ocean oil spill quantity estimation method and device
CN110363229B (en) * 2019-06-27 2021-07-27 岭南师范学院 Human body characteristic parameter selection method based on combination of improved RReliefF and mRMR
CN110309780A (en) * 2019-07-01 2019-10-08 中国科学院遥感与数字地球研究所 High resolution image houseclearing based on BFD-IGA-SVM model quickly supervises identification
CN111639649B (en) * 2020-05-26 2024-03-01 中国地质大学(武汉) Method and system for identifying and encoding numbered musical notation image based on real-time image stream
CN114548249A (en) * 2022-02-15 2022-05-27 中国农业科学院植物保护研究所 Wheat powdery mildew occurrence degree prediction method based on machine learning
CN115201210A (en) * 2022-08-12 2022-10-18 河南省农业科学院农业经济与信息研究所 Satellite data-based wheat yellow mosaic disease remote sensing monitoring method
CN116680548B (en) * 2023-08-03 2023-10-13 南京信息工程大学 Time sequence drought causal analysis method for multi-source observation data
CN117636185B (en) * 2024-01-26 2024-04-09 安徽大学 Pine wood nematode disease detecting system based on image processing

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629305B (en) * 2012-03-06 2015-02-25 上海大学 Feature selection method facing to SNP (Single Nucleotide Polymorphism) data
US9538126B2 (en) * 2014-12-03 2017-01-03 King Abdulaziz City For Science And Technology Super-resolution of dynamic scenes using sampling rate diversity
CN104794496A (en) * 2015-05-05 2015-07-22 中国科学院遥感与数字地球研究所 Remote sensing character optimization algorithm for improving mRMR (min-redundancy max-relevance) algorithm
CN106897703A (en) * 2017-02-27 2017-06-27 辽宁工程技术大学 Remote Image Classification based on AGA PKF SVM
CN107103306B (en) * 2017-05-22 2019-06-18 安徽大学 Winter wheat powdery mildew remote-sensing monitoring method based on wavelet analysis and support vector machines

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant