CN112528559B - Chlorophyll a concentration inversion method combining pre-classification and machine learning - Google Patents
- Publication number
- CN112528559B, CN202011403257.3A
- Authority
- CN
- China
- Prior art keywords
- chlorophyll
- classification
- concentration
- class
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/55—Specular reflectivity
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/62—Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
- G01N21/63—Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
- G01N21/64—Fluorescence; Phosphorescence
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N30/00—Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
- G01N30/02—Column chromatography
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/18—Water
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Abstract
The invention discloses a chlorophyll a concentration inversion method combining pre-classification and machine learning. The method first collects, in the field, reflectivity data of the surface of Case II water bodies together with the chlorophyll a concentration values of those water bodies; pre-classifies the reflectivity data, calculates the sum of squared errors for different classification numbers, and determines the true cluster number K from the sum of squared errors; divides the collected reflectivity data into K classes and performs continuous wavelet transformation at different scales to obtain the wavelet coefficients of each class; performs correlation analysis between the wavelet coefficients of each class and the measured chlorophyll a concentration values, and screens out the wavelet coefficients whose correlation coefficient is larger than a preset threshold; and performs support vector regression modeling with hyperparameter optimization on the screened wavelet coefficients of each class, finally obtaining chlorophyll a concentration inversion models that can invert the chlorophyll a concentration of a Case II water body from the wavelet coefficients of its reflectivity. The method can achieve high-precision inversion of the chlorophyll a concentration of Case II water bodies.
Description
Technical Field
The invention relates to the technical field of chlorophyll a concentration inversion for Case II water bodies, and in particular to a chlorophyll a concentration inversion method combining pre-classification and machine learning.
Background
China has extensive water areas, and its estuaries, coastal waters and inland lakes are typical Case II water bodies. Evaluating the water quality of a Case II water body requires measuring the concentration of chlorophyll a in the water, which is an important indicator of eutrophication.
The composition of Case II water bodies is very complex, the causes of changes in their optical characteristics differ considerably, and those optical characteristics also vary markedly with region and season. If a single unified model is used to invert the chlorophyll a concentration of Case II water bodies, it cannot adapt to different times and study regions, and the inversion result is inaccurate. High-precision chlorophyll a concentration inversion for Case II water bodies has therefore remained a major challenge.
Disclosure of Invention
The first object of the invention is to overcome the defects and shortcomings of the prior art and provide a chlorophyll a concentration inversion method combining pre-classification and machine learning, which can achieve high-precision inversion of the chlorophyll a concentration of Case II water bodies.
A second object of the present invention is to provide a storage medium.
It is a third object of the present invention to provide a computing device.
The first object of the invention is achieved by the following technical scheme: a chlorophyll a concentration inversion method combining pre-classification and machine learning comprises the following steps:
S1, collecting, in the field, reflectivity data of the surface of Case II water bodies, and synchronously collecting the chlorophyll a concentration values of those water bodies;
S2, pre-classifying the reflectivity data;
S3, calculating the sum of squared errors for different classification numbers, finding the classification number with the best classification effect according to the sum of squared errors, and taking it as the true cluster number K;
S4, dividing all collected reflectivity data into K classes;
S5, performing continuous wavelet transformation at different scales on the reflectivity data of each of the K classes to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is larger than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to obtain for each class a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting reflectivity data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transformation according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}$$
wherein $X$ is the reflectivity data of the Case II water body to be measured, $X_j$ is its reflectivity at the j-th wavelength, $C_i$ is the i-th class, $m_i$ is the centroid of the i-th class with component $m_{i,j}$ at the j-th wavelength, and dist denotes the Euclidean distance.
Preferably, the reflectivity data of the water surface are collected with a spectrometer, and the chlorophyll a concentration of the Case II water body is measured with a high performance liquid chromatograph, a fluorometer or a spectrophotometer.
Further, the spectral measurement wavelength range of the spectrometer is 400-900 nm.
Preferably, in step S2, the reflectivity data are classified in an unsupervised manner using the K-means method.
Preferably, in step S2, the reflectivity data are divided into N classes, and the value range of N is set to [1, 10];
In step S3, calculating the sum of squared errors for different classification numbers, finding the classification number with the best classification effect according to the sum of squared errors, and taking it as the true cluster number K specifically comprises:
calculating the sum of squared errors SSE for the classes obtained with each value of N:
$$\mathrm{SSE} = \sum_{i=1}^{N} \sum_{p \in C_i} \lVert p - m_i \rVert^2$$
wherein $C_i$ is the i-th class, $p$ is a sample point in $C_i$, and $m_i$ is the centroid of $C_i$; the SSE indicates the quality of the clustering;
then, with N as the abscissa and SSE as the ordinate, plotting an elbow graph from the calculated SSE values, finding the number of classes corresponding to the elbow point of the graph, and taking it as the true cluster number K.
Preferably, the continuous wavelet transformation uses a wavelet basis function that includes, but is not limited to, the Mexican hat (MEXH) wavelet.
Preferably, screening out the wavelet coefficients whose correlation coefficient is larger than the preset threshold means screening out the wavelet coefficients whose correlation coefficients rank in the top 1%.
Preferably, the hyperparameter optimization method includes, but is not limited to, grid search, random search and Bayesian optimization.
The second object of the invention is achieved by the following technical scheme: a computer-readable storage medium storing a program which, when executed by a processor, implements a chlorophyll-a concentration inversion method combining pre-classification and machine learning according to a first object of the present invention.
The third object of the invention is achieved by the following technical scheme: a computing device includes a processor and a memory for storing a program executable by the processor, wherein the processor implements the chlorophyll-a concentration inversion method combining pre-classification and machine learning according to the first object of the present invention when executing the program stored in the memory.
Compared with the prior art, the invention has the following advantages and effects:
(1) The chlorophyll a concentration inversion method of the invention combines pre-classification and machine learning: by performing unsupervised classification on the measured hyperspectral signals, it divides water-body reflection signals with complex optical characteristics into K classes. To further improve inversion accuracy, continuous wavelet transformation is introduced and the model is built on the spectral wavelet features of the hyperspectral signals. Finally, a support vector regression model is built for each class; these models are robust and can accurately invert chlorophyll a concentration.
(2) The invention pre-classifies the collected spectral signals and applies continuous wavelet transformation to the classified signals, converting the ground hyperspectral signals into wavelet coefficients of each class. Continuous wavelet transformation is a signal processing technique that can effectively detect weak, hidden information in water-body spectral signals, so the correlation between the reflectivity and the measured chlorophyll a concentration improves markedly after the transformation. This facilitates correlation screening, suppresses noise interference, and benefits the inversion accuracy of the model.
(3) The invention screens out the effective wavelet coefficient with high correlation by using the correlation coefficient, and can further improve the inversion precision and modeling efficiency of the model.
Drawings
FIG. 1 is a flow chart of the invention for constructing a chlorophyll a concentration inversion model.
Fig. 2 is an elbow graph with N values on the abscissa and SSE on the ordinate.
FIG. 3 is a graph showing the correlation between wavelet coefficients and measured chlorophyll a concentrations.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
The embodiment discloses a chlorophyll a concentration inversion method combining pre-classification and machine learning, as shown in fig. 1, comprising the following steps:
S1, collecting, in the field, reflectivity data $R_{rs}(\lambda)$ of the surface of Case II water bodies, where $\lambda$ is the collected wavelength, and synchronously collecting the chlorophyll a concentration values $C_{Chl\text{-}a}$ of those water bodies.
Specifically, a spectrometer is used to collect the reflectivity data of the water surface in the field, and a high performance liquid chromatograph, a fluorometer or a spectrophotometer is used to measure the chlorophyll a concentration of the Case II water body. The spectral measurement wavelength range of the spectrometer is 400-900 nm, and the spectrometer may be mounted on a ground-based, airborne or satellite platform.
S2, pre-classifying the reflectivity data.
In this embodiment, the K-means method is used to classify the reflectivity data in an unsupervised manner. The reflectivity data are divided into N classes, and the value range of N may be set to [1, 10].
S3, calculating the sum of squared errors for different classification numbers, that is, the sum of squared errors over the classes obtained with each value of N, finding the classification number with the best classification effect according to the sum of squared errors, and taking it as the true cluster number K.
In this embodiment, as shown in FIG. 2, the data are divided into 2, 3, ..., 10 classes in turn, and the sum of squared errors is calculated for each case: over the 2 classes when divided into 2 classes, over the 3 classes when divided into 3 classes, and so on up to the 10 classes when divided into 10 classes. The sum of squared errors is calculated as:
$$\mathrm{SSE} = \sum_{i=1}^{N} \sum_{p \in C_i} \lVert p - m_i \rVert^2$$
wherein $C_i$ is the i-th class; $p$ is a sample point in $C_i$; $m_i$ is the centroid of $C_i$. The SSE indicates the quality of the clustering: the smaller the SSE, the more tightly aggregated the classes.
Then, with N as the abscissa and SSE as the ordinate, an elbow graph is plotted from the calculated SSE values, the number of classes at which the decline in SSE abruptly slows is found from the graph, and that number is taken as the true cluster number K.
As shown in FIG. 2, the classes become more tightly aggregated as the number of classes increases; that is, the SSE decreases and the whole curve falls. The classification number at which the downward trend slows, i.e. the elbow point, is the true cluster number K. As can be seen from FIG. 2, K = 4.
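Steps S2 to S4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `kmeans` is a plain Lloyd's iteration written here for self-containment, and `elbow_n` picks the elbow as the N with the largest second difference of the SSE curve, which is only one possible heuristic (the patent reads the elbow from the plotted graph). All function names are hypothetical.

```python
import numpy as np

def kmeans(X, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's K-means on rows of X; returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids with distinct randomly chosen spectra.
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = centroids.copy()
        for i in range(n_clusters):
            members = X[labels == i]
            if len(members):  # keep the old centroid if a class is empty
                new[i] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # Recompute labels so they match the returned centroids.
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dist.argmin(axis=1)

def sse(X, centroids, labels):
    """Sum over classes of squared distances from each sample to its centroid."""
    X = np.asarray(X, dtype=float)
    return float(sum(np.sum((X[labels == i] - m) ** 2)
                     for i, m in enumerate(centroids)))

def elbow_n(sse_values):
    """Assumed heuristic: N (starting at 1) with the largest second difference."""
    s = np.asarray(sse_values, dtype=float)
    second_diff = s[:-2] - 2.0 * s[1:-1] + s[2:]
    return int(np.argmax(second_diff)) + 2  # s[0] corresponds to N = 1
```

A typical use would compute `sse` for N = 1 to 10, pass the list to `elbow_n` (or inspect the plotted curve) to obtain K, and then call `kmeans(X, K)` to divide the spectra into K classes.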
S4, dividing all the collected reflectivity data into K categories.
S5, performing continuous wavelet transformation at different scales on the reflectivity data of each of the K classes to obtain the wavelet coefficients of each class. The wavelet coefficients are denoted CWT(K, scale, λ), and the wavelet basis function used includes, but is not limited to, the Mexican hat (MEXH) wavelet.
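As an illustration of step S5, the sketch below computes a continuous wavelet transform of one reflectivity spectrum with a Mexican hat wavelet using only NumPy. It is a simplified sketch under assumptions: the wavelet is sampled on the spectrum's index grid, truncated at five scale units, and applied by zero-padded convolution, so coefficients near the spectrum edges are affected by the padding. The function names `mexh` and `cwt_mexh` are hypothetical.

```python
import numpy as np

def mexh(t):
    """Mexican hat wavelet (negative normalised second derivative of a Gaussian)."""
    return (2.0 / (np.sqrt(3.0) * np.pi ** 0.25)) * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)

def cwt_mexh(signal, scales):
    """Return an array of shape (len(scales), len(signal)) of wavelet coefficients."""
    signal = np.asarray(signal, dtype=float)
    out = np.empty((len(scales), signal.size))
    for k, s in enumerate(scales):
        # Truncate the kernel at 5 scale units, and never longer than the signal.
        half = min(int(np.ceil(5 * s)), (signal.size - 1) // 2)
        t = np.arange(-half, half + 1) / s
        kernel = mexh(t) / np.sqrt(s)
        # The wavelet is symmetric, so convolution equals correlation here.
        out[k] = np.convolve(signal, kernel, mode="same")
    return out
```

Applying `cwt_mexh` to every spectrum of a class and flattening the (scale, wavelength) grid gives one feature vector per sample, which is the form assumed by the later screening and regression steps.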
S6, performing correlation analysis between the wavelet coefficients CWT(K, scale, λ) of each class and the chlorophyll a concentration values $C_{Chl\text{-}a}$ measured in step S1 to obtain the correlation coefficient r between each wavelet coefficient and $C_{Chl\text{-}a}$, as shown in FIG. 3, and then screening out the wavelet coefficients whose correlation coefficient is larger than a preset threshold.
In this embodiment, the preset threshold is the value that bounds the top 1% of the correlation coefficients when sorted from largest to smallest; that is, the wavelet coefficients whose correlation coefficients rank in the top 1% are retained. The higher the correlation coefficient, the more strongly the wavelet coefficient is related to the chlorophyll a concentration.
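The top-1% screening of step S6 might look like the following sketch. It assumes the wavelet coefficients of one class are arranged as a matrix with one row per sample and one column per (scale, wavelength) pair, and it ranks by the absolute value of the Pearson correlation coefficient; ranking by absolute value is an assumption made here, since the text only speaks of the largest correlation coefficients. The name `top_correlated` is hypothetical.

```python
import numpy as np

def top_correlated(features, target, fraction=0.01):
    """Return (indices of the top `fraction` of columns by |r|, all r values)."""
    F = np.asarray(features, dtype=float)   # shape (n_samples, n_features)
    y = np.asarray(target, dtype=float)     # shape (n_samples,)
    Fc = F - F.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Fc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(invalid="ignore", divide="ignore"):
        r = (Fc * yc[:, None]).sum(axis=0) / denom
    r = np.nan_to_num(r)                    # constant columns get r = 0
    n_keep = max(1, int(np.ceil(fraction * F.shape[1])))
    idx = np.argsort(-np.abs(r))[:n_keep]
    return idx, r
```

The returned indices select the columns that are passed on to the regression model of that class.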
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to obtain for each class a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable.
The hyperparameter optimization method includes, but is not limited to, grid search, random search and Bayesian optimization. For example, in this embodiment a grid search is used to optimize the hyperparameters (including the box constraint, kernel scale, epsilon, kernel function type, and the like) so as to minimize the mean square error MSE, finally yielding the inversion model. The MSE is calculated as:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2$$
where n is the number of samples, i is the sample index, $\hat{y}_i$ is the chlorophyll a concentration predicted by the model, and $y_i$ is the measured chlorophyll a concentration.
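A grid search of this kind could be sketched with scikit-learn as below. This is an assumption-laden illustration, not the embodiment's code: the patent does not name a library, the mapping of box constraint, kernel scale, epsilon and kernel function type onto scikit-learn's `C`, `gamma`, `epsilon` and `kernel` is an assumption, and the grid values and 3-fold cross-validation are arbitrary choices. The helper names `mse` and `fit_svr` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

def mse(y_true, y_pred):
    """Mean square error between measured and predicted concentrations."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))

def fit_svr(features, chl_a):
    """Grid-search an SVR for one class; returns (best model, cross-validated MSE)."""
    param_grid = {
        "kernel": ["rbf", "linear"],   # kernel function type
        "C": [1.0, 10.0, 100.0],       # assumed counterpart of the box constraint
        "gamma": ["scale", 0.1],       # assumed counterpart of the kernel scale
        "epsilon": [0.01, 0.1],
    }
    # scikit-learn maximises scores, so MSE is minimised via its negative.
    search = GridSearchCV(SVR(), param_grid, scoring="neg_mean_squared_error", cv=3)
    search.fit(features, chl_a)
    return search.best_estimator_, -search.best_score_
```

One such model would be fitted per class, using only the screened wavelet coefficients of that class as input features.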
S8, for a Case II water body to be measured, collecting reflectivity data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transformation according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}$$
wherein $X$ is the reflectivity data of the Case II water body to be measured; $X_j$ is its reflectivity at the j-th wavelength; $C_i$ is the i-th class; $m_i$ is the centroid of that class, with component $m_{i,j}$ at the j-th wavelength; the class index i runs from 1 to n, with n = K; dist denotes the Euclidean distance, the similarity measure commonly used in the K-means classification method.
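The prediction step S8 can be sketched as a nearest-centroid lookup followed by a per-class model call. In this minimal sketch, `transform` and `models` are hypothetical placeholders: `transform` stands for the continuous wavelet transformation plus coefficient selection of steps S5 and S6, and `models[i]` for the trained inversion model of class i, both supplied by the caller as plain callables.

```python
import numpy as np

def assign_class(x, centroids):
    """Return (index of the nearest centroid, Euclidean distances to all centroids)."""
    x = np.asarray(x, dtype=float)                  # shape (n_wavelengths,)
    centroids = np.asarray(centroids, dtype=float)  # shape (K, n_wavelengths)
    dist = np.sqrt(((centroids - x) ** 2).sum(axis=1))
    return int(np.argmin(dist)), dist

def predict_chl_a(x, centroids, transform, models):
    """Assign the spectrum to a class, then apply that class's inversion model."""
    i, _ = assign_class(x, centroids)
    features = transform(x)   # hypothetical: CWT + screened coefficients
    return i, models[i](features)
```

Keeping the class assignment separate from the model call makes it easy to inspect the distances when a new spectrum lies far from every centroid.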
Example 2
The embodiment discloses a computer readable storage medium storing a program which, when executed by a processor, implements the chlorophyll a concentration inversion method combining pre-classification and machine learning described in embodiment 1, and specifically includes the following steps:
S1, collecting, in the field, reflectivity data of the surface of Case II water bodies, and synchronously collecting the chlorophyll a concentration values of those water bodies;
S2, pre-classifying the reflectivity data;
S3, calculating the sum of squared errors for different classification numbers, finding the classification number with the best classification effect according to the sum of squared errors, and taking it as the true cluster number K;
S4, dividing all collected reflectivity data into K classes;
S5, performing continuous wavelet transformation at different scales on the reflectivity data of each of the K classes to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is larger than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to obtain for each class a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting reflectivity data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transformation according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}$$
wherein $X$ is the reflectivity data of the Case II water body to be measured, $X_j$ is its reflectivity at the j-th wavelength, $C_i$ is the i-th class, $m_i$ is the centroid of the i-th class with component $m_{i,j}$ at the j-th wavelength, and dist denotes the Euclidean distance.
The computer-readable storage medium in this embodiment may be a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), a USB flash drive, a removable hard disk, or the like.
Example 3
The embodiment discloses a computing device, which comprises a processor and a memory for storing a program executable by the processor, wherein when the processor executes the program stored by the memory, the chlorophyll a concentration inversion method combining pre-classification and machine learning is realized, and specifically comprises the following steps:
S1, collecting, in the field, reflectivity data of the surface of Case II water bodies, and synchronously collecting the chlorophyll a concentration values of those water bodies;
S2, pre-classifying the reflectivity data;
S3, calculating the sum of squared errors for different classification numbers, finding the classification number with the best classification effect according to the sum of squared errors, and taking it as the true cluster number K;
S4, dividing all collected reflectivity data into K classes;
S5, performing continuous wavelet transformation at different scales on the reflectivity data of each of the K classes to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is larger than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to obtain for each class a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting reflectivity data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transformation according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}$$
wherein $X$ is the reflectivity data of the Case II water body to be measured, $X_j$ is its reflectivity at the j-th wavelength, $C_i$ is the i-th class, $m_i$ is the centroid of the i-th class with component $m_{i,j}$ at the j-th wavelength, and dist denotes the Euclidean distance.
The computing device described in this embodiment may be a desktop computer, a notebook computer, a tablet computer, or other terminal devices with processor functions.
The above examples are preferred embodiments of the invention, but the embodiments of the invention are not limited to them. Any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the invention is an equivalent replacement and falls within the protection scope of the invention.
Claims (10)
1. A chlorophyll a concentration inversion method combining pre-classification and machine learning, characterized by comprising the following steps:
S1, collecting surface reflectivity data of Case II water bodies in the field, and synchronously measuring the chlorophyll a concentration of the same water bodies;
S2, pre-classifying the reflectivity data;
S3, calculating the sum of squared errors (SSE) for different numbers of classes, identifying the number of classes that gives the best classification effect according to the SSE, and taking that number as the actual cluster number K;
S4, dividing all collected reflectivity data into K classes;
S5, performing continuous wavelet transformation on the K classes of reflectivity data at different scales to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to obtain for each class a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting its surface reflectivity data in the field, calculating the Euclidean distances between the data and the centroids of the K classes from step S4, assigning the data to the class with the smallest Euclidean distance, performing continuous wavelet transformation according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j}\left(X_j - m_{i,j}\right)^2}$$
where X is the reflectivity data of the Case II water body to be measured, X_j is its reflectivity at the j-th wavelength, C_i is the i-th class, m_i is the centroid of the i-th class with component m_{i,j} at the j-th wavelength, and dist denotes the Euclidean distance.
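The class assignment in step S8 can be sketched as a nearest-centroid lookup. This is only an illustrative sketch: the function name `assign_class`, the three-band spectra, and the centroid values are hypothetical and are not taken from the patent.

```python
import numpy as np

def assign_class(spectrum, centroids):
    """Return the index of the nearest centroid and the distances to all centroids."""
    spectrum = np.asarray(spectrum, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Euclidean distance between the spectrum and each class centroid (one row per class)
    distances = np.sqrt(((centroids - spectrum) ** 2).sum(axis=1))
    return int(np.argmin(distances)), distances

# Hypothetical centroids of K = 2 classes, each with reflectivity at 3 wavelengths
centroids = np.array([[0.02, 0.04, 0.03],
                      [0.05, 0.09, 0.12]])
new_spectrum = [0.045, 0.085, 0.11]  # hypothetical spectrum of the water body to be measured

label, distances = assign_class(new_spectrum, centroids)
print(label, distances)
```

The returned label would then select which per-class inversion model receives the wavelet coefficients of the new spectrum.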
2. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein the surface reflectivity data of the water body is collected using a spectrometer, and the chlorophyll a concentration of the Case II water body is measured using a high performance liquid chromatograph, a fluorescence photometer, or a spectrophotometer.
3. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 2, wherein the spectral measurement wavelength range of the spectrometer is 400-900 nm.
4. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein in step S2 the reflectivity data is classified without supervision using the K-means method.
5. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein in step S2 the reflectivity data is classified into N classes, with N taking values in the range [1, 10];
in step S3, calculating the sum of squared errors for different numbers of classes, identifying the number of classes with the best classification effect according to the sum of squared errors, and taking that number as the actual cluster number K specifically comprises:
calculating the sum of squared errors SSE for the classes obtained with each value of N:
$$\mathrm{SSE} = \sum_{i=1}^{N} \sum_{p \in C_i} \lVert p - m_i \rVert^2$$
where C_i is the i-th class, p is a sample point in C_i, and m_i is the centroid of C_i; the SSE indicates the quality of the clustering;
then plotting an elbow graph with N on the abscissa and the SSE on the ordinate, finding the number of classes at the elbow point of the graph, and taking it as the actual cluster number K.
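The SSE computation and elbow selection in this claim can be sketched as follows. The claim reads the elbow from a graph. The automatic rule used here (largest second difference of the SSE curve) is an assumption, as are the plain Lloyd's K-means with farthest-point initialisation and the synthetic two-band data.

```python
import numpy as np

def kmeans(X, n_clusters, n_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation (assumed)."""
    centroids = [X[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(d)])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old centroid if a cluster happens to become empty
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
                        for k in range(n_clusters)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Final assignment against the final centroids
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

def sse(X, labels, centroids):
    """Sum of squared distances from each sample to the centroid of its class."""
    return float(((X - centroids[labels]) ** 2).sum())

# Synthetic data: three well-separated groups of 30 hypothetical two-band spectra
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X = np.vstack([c + 0.1 * rng.standard_normal((30, 2)) for c in centers])

n_values = list(range(1, 11))  # N in [1, 10]
sse_values = [sse(X, *kmeans(X, n)) for n in n_values]

# Assumed elbow rule: the N at which the SSE curve bends most sharply
second_diff = np.diff(sse_values, n=2)  # entry i is centred on n_values[i + 1]
K = n_values[int(np.argmax(second_diff)) + 1]
print(K)
```

On real spectra the elbow is often less pronounced than in this toy data, so plotting `sse_values` against `n_values` and inspecting the curve remains the more cautious choice.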
6. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein the wavelet basis functions used for the continuous wavelet transformation include, but are not limited to, the MEXH (Mexican hat) wavelet.
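A continuous wavelet transform with the Mexican hat basis can be sketched with NumPy alone, as below. The dyadic scales, the truncation of the wavelet to plus or minus five scale widths, and the synthetic spectrum are assumptions made for illustration. The claim does not fix any of them.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet."""
    return (2.0 / (np.sqrt(3.0) * np.pi ** 0.25)) * (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt_mexh(signal, scales):
    """Return an array of shape (len(scales), len(signal)) of wavelet coefficients."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    coeffs = np.empty((len(scales), n))
    for row, a in enumerate(scales):
        # Sample the scaled wavelet on +/- 5 scale widths, never longer than the signal
        half = int(min(5 * a, (n - 1) // 2))
        t = np.arange(-half, half + 1) / a
        kernel = mexican_hat(t) / np.sqrt(a)
        # The wavelet is symmetric, so convolution and correlation coincide
        coeffs[row] = np.convolve(signal, kernel, mode="same")
    return coeffs

# Synthetic "spectrum": a single Gaussian feature centred on band index 100
bands = np.arange(201)
spectrum = np.exp(-((bands - 100) ** 2) / (2 * 8.0 ** 2))

scales = [2, 4, 8, 16]  # assumed dyadic scales
coeffs = cwt_mexh(spectrum, scales)
print(coeffs.shape, int(np.argmax(coeffs[2])))
```

Each row of `coeffs` holds one coefficient per band at one scale, so every (scale, band) pair becomes a candidate feature for the correlation screening in step S6.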
7. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold means screening out the wavelet coefficients whose correlation coefficients rank in the top 1%.
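The top 1% screening can be sketched as below. Two points are assumptions rather than statements of the claim: the ranking uses the absolute value of the Pearson correlation coefficient, and at least one feature is always kept. The feature matrix and the target are synthetic.

```python
import numpy as np

def top_correlated(features, target, fraction=0.01):
    """Return the column indices whose |Pearson r| with the target ranks in the top fraction."""
    f = features - features.mean(axis=0)
    t = target - target.mean()
    num = (f * t[:, None]).sum(axis=0)
    denom = np.sqrt((f ** 2).sum(axis=0) * (t ** 2).sum())
    # Constant columns get r = 0 instead of a division by zero
    r = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)
    n_keep = max(1, int(np.ceil(fraction * r.size)))
    keep = np.argsort(-np.abs(r))[:n_keep]
    return np.sort(keep), r

# Synthetic example: 50 samples, 300 wavelet-coefficient features,
# with the hypothetical concentration driven almost entirely by feature 7
rng = np.random.default_rng(0)
features = rng.standard_normal((50, 300))
chl_a = 2.0 * features[:, 7] + 0.01 * rng.standard_normal(50)

selected, r = top_correlated(features, chl_a)
print(selected, r[7])
```

With 300 candidate features, a 1% cut keeps 3 of them. In the claimed method the screening would be repeated separately for each of the K classes.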
8. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein the hyperparameter optimization method includes, but is not limited to, grid search, random search, and Bayesian optimization.
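Support vector regression with grid-search hyperparameter optimization can be sketched with scikit-learn as below. The choice of scikit-learn, the RBF kernel, the feature standardisation, the parameter grid, the five-fold cross-validation, and the synthetic data are all assumptions. The claims name only the regression method and the family of search strategies.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for one class: 60 samples with 3 screened wavelet coefficients each
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 3))
y = 10.0 + X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.standard_normal(60)

# Standardise the features, then fit an RBF-kernel support vector regressor
pipeline = make_pipeline(StandardScaler(), SVR(kernel="rbf"))

# Hypothetical search grid; step names follow make_pipeline's lowercase class names
param_grid = {
    "svr__C": [1, 10, 100],
    "svr__gamma": ["scale", 0.01, 0.1],
    "svr__epsilon": [0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

predictions = search.predict(X[:5])
print(search.best_params_)
```

In the claimed method one such model would be fitted per class, so the search would run K times, once on each class's screened coefficients.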
9. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the chlorophyll a concentration inversion method combining pre-classification and machine learning according to any one of claims 1 to 8.
10. A computing device comprising a processor and a memory for storing a processor-executable program, wherein the processor, when executing the program stored in the memory, implements the chlorophyll a concentration inversion method combining pre-classification and machine learning according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011403257.3A CN112528559B (en) | 2020-12-04 | 2020-12-04 | Chlorophyll a concentration inversion method combining pre-classification and machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112528559A (en) | 2021-03-19 |
CN112528559B (en) | 2024-04-23 |
Family
ID=74998357
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202011403257.3A | Active | CN112528559B (en) | 2020-12-04 | 2020-12-04 | Chlorophyll a concentration inversion method combining pre-classification and machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112528559B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113159167B (en) * | 2021-04-19 | 2023-03-03 | 福州大学 | Inland-based chlorophyll a inversion method for different types of water bodies |
CN117313017B (en) * | 2023-11-28 | 2024-02-06 | 山东艺林市政园林建设集团有限公司 | Color leaf research and development data processing method and system |
CN117992801B (en) * | 2024-04-03 | 2024-06-14 | 南京信息工程大学 | Sea area monitoring method and system through satellite remote sensing technology |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103983584A (en) * | 2014-05-30 | 2014-08-13 | 中国科学院遥感与数字地球研究所 | Retrieval method and retrieval device of chlorophyll a concentration of inland case II water |
CN104359847A (en) * | 2014-12-08 | 2015-02-18 | 中国科学院遥感与数字地球研究所 | Method and device for acquiring centroid set used for representing typical water category |
CN107025467A (en) * | 2017-05-09 | 2017-08-08 | 环境保护部卫星环境应用中心 | A kind of method for building up and device of water body disaggregated model |
CN111783826A (en) * | 2020-05-27 | 2020-10-16 | 西华大学 | Driving style classification method based on pre-classification and ensemble learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9165189B2 (en) * | 2011-07-19 | 2015-10-20 | Ball Horticultural Company | Seed holding device and seed classification system with seed holding device |
Non-Patent Citations (1)
Title |
---|
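Inversion of aboveground grassland biomass in Hulunbuir based on remote sensing data from the environment and disaster reduction satellites; Chen Pengfei; Wang Juanle; Liao Xiuying; Yin Fang; Chen Baorui; Liu Rui; Journal of Natural Resources; 2010-07-15 (07); full text *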
Also Published As
Publication number | Publication date |
---|---|
CN112528559A (en) | 2021-03-19 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN112528559B (en) | Chlorophyll a concentration inversion method combining pre-classification and machine learning | |
CN111126575B (en) | Gas sensor array mixed gas detection method and device based on machine learning | |
CN110346312B (en) | Winter wheat head gibberellic disease identification method based on Fisher linear discrimination and support vector machine technology | |
CN113049500B (en) | Water quality detection model training and water quality detection method, electronic equipment and storage medium | |
CN104252625A (en) | Sample adaptive multi-feature weighted remote sensing image method | |
CN110779875B (en) | Method for detecting moisture content of winter wheat ear based on hyperspectral technology | |
CN104063710A (en) | Method for removing abnormal spectrum in actual measurement spectrum curve based on support vector machine model | |
CN111723876A (en) | Load curve integrated spectrum clustering algorithm considering double-scale similarity | |
CN111582387A (en) | Rock spectral feature fusion classification method and system | |
CN108827909B (en) | Rapid soil classification method based on visible near infrared spectrum and multi-target fusion | |
Jafarzadeh et al. | Examination of various feature selection approaches for daily precipitation downscaling in different climates | |
CN114399674A (en) | Hyperspectral image technology-based shellfish toxin nondestructive rapid detection method and system | |
CN115810403A (en) | Method for evaluating water pollution based on environmental characteristic information | |
CN108108758A (en) | Towards the multilayer increment feature extracting method of industrial big data | |
Lin et al. | Hyperspectral estimation of soil composition contents based on kernel principal component analysis and machine learning model | |
CN115728290A (en) | Method, system, equipment and storage medium for detecting chromium element in soil | |
CN116297281A (en) | System and method for predicting sample characteristics based on spectral measurements | |
AU2021102567A4 (en) | Rapid diagnosis method of soil fertility grade based on hyperspectral data | |
CN113640244A (en) | Fruit tree variety identification method based on visible near infrared spectrum | |
CN107451603B (en) | Locust age identification method | |
CN112526098A (en) | Continuous wavelet coefficient-based chlorophyll a concentration inversion method for class II water body | |
CN117312973B (en) | Inland water body optical classification method and system | |
Lukoshkin | Forest stand parameter estimation by using neural networks | |
Gopal et al. | Artificial neural networks for detecting forest change | |
Kim | The estimation of the variogram in geostatistical data with outliers |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||