CN112528559B - Chlorophyll a concentration inversion method combining pre-classification and machine learning - Google Patents

Publication number
CN112528559B
Authority
CN
China
Prior art keywords
chlorophyll
classification
concentration
class
machine learning
Prior art date
Legal status
Active
Application number
CN202011403257.3A
Other languages
Chinese (zh)
Other versions
CN112528559A (en)
Inventor
陈水森
彭咏石
王重洋
陈金月
李丹
贾凯
姜浩
王力
郑琼
官云兰
Current Assignee
Guangzhou Institute of Geography of GDAS
Original Assignee
Guangzhou Institute of Geography of GDAS
Application filed by Guangzhou Institute of Geography of GDAS
Priority to CN202011403257.3A
Publication of CN112528559A
Application granted
Publication of CN112528559B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/55Specular reflectivity
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/64Fluorescence; Phosphorescence
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02Column chromatography
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18Water
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a chlorophyll a concentration inversion method combining pre-classification and machine learning. The method first collects, in the field, surface reflectance data of Case II water bodies together with their chlorophyll a concentration values; pre-classifies the reflectance data, computes the sum of squared errors for different numbers of classes, and determines the true cluster number K from the sum of squared errors; divides the collected reflectance data into K classes and applies continuous wavelet transform at different scales to obtain the wavelet coefficients of each class; performs correlation analysis between the wavelet coefficients of each class and the measured chlorophyll a concentration values and screens out the wavelet coefficients whose correlation coefficient exceeds a preset threshold; and performs support vector regression modeling on the screened wavelet coefficients of each class, with hyperparameter optimization, to obtain chlorophyll a concentration inversion models that can invert the chlorophyll a concentration of a Case II water body from the wavelet coefficients of its reflectance. The method enables high-precision inversion of the chlorophyll a concentration of Case II water bodies.

Description

Chlorophyll a concentration inversion method combining pre-classification and machine learning
Technical Field
The invention relates to the technical field of chlorophyll a concentration inversion for Case II water bodies, and in particular to a chlorophyll a concentration inversion method combining pre-classification and machine learning.
Background
China has extensive waters, and estuaries, coastal waters and inland lakes are typical Case II water bodies. When the water quality of a Case II water body is evaluated, the concentration of chlorophyll a in the water must be measured, because chlorophyll a is an important indicator of eutrophication.
At present, Case II water bodies have very complex compositions, and the causes of changes in their optical properties differ widely; their optical properties also show marked regional and seasonal variation. A single unified model for inverting the chlorophyll a concentration of Case II water bodies therefore cannot adapt to different times and study regions and gives inaccurate results, so high-precision chlorophyll a concentration inversion for Case II water bodies has remained a major challenge.
Disclosure of Invention
The first object of the invention is to overcome the shortcomings of the prior art and to provide a chlorophyll a concentration inversion method combining pre-classification and machine learning, which can achieve high-precision inversion of the chlorophyll a concentration of Case II water bodies.
A second object of the present invention is to provide a storage medium.
It is a third object of the present invention to provide a computing device.
The first object of the invention is achieved by the following technical scheme: a chlorophyll a concentration inversion method combining pre-classification and machine learning comprises the following steps:
S1, collecting reflectance data of the surface of Case II water bodies in the field, and synchronously collecting chlorophyll a concentration values of the Case II water bodies;
S2, pre-classifying the reflectance data;
S3, computing the sum of squared errors for different numbers of classes, finding the number of classes that gives the best classification according to the sum of squared errors, and taking that number as the true cluster number K;
S4, dividing all collected reflectance data into K classes;
S5, performing continuous wavelet transform on the K classes of reflectance data at different scales to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to finally obtain, for each class, a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting reflectance data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transform according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}, \quad i = 1, \dots, K$$
where $X$ is the reflectance data of the Case II water body to be measured, $X_j$ is its reflectance at the $j$-th wavelength, $C_i$ is the $i$-th class, $m_i$ is the centroid of the $i$-th class (with component $m_{i,j}$ at the $j$-th wavelength), and dist denotes the Euclidean distance.
Preferably, the reflectance data of the water surface are collected with a spectrometer, and the chlorophyll a concentration of the Case II water body is measured with a high performance liquid chromatograph, a fluorometer or a spectrophotometer.
Further, the spectrum measuring wavelength range of the spectrometer is 400-900 nm.
Preferably, in step S2, the reflectance data are classified in an unsupervised manner using the K-means method.
Preferably, in step S2, the reflectance data are classified into N classes, with the value range of N set to [1, 10];
in step S3, computing the sum of squared errors for different numbers of classes, finding the number of classes that gives the best classification according to the sum of squared errors, and taking that number as the true cluster number K specifically comprises:
calculating the sum of squared errors SSE for the classes obtained with each value of N:
$$\mathrm{SSE} = \sum_{i=1}^{N} \sum_{p \in C_i} \lvert p - m_i \rvert^2$$
where $C_i$ is the $i$-th class; $p$ is a sample point in $C_i$; $m_i$ is the centroid of $C_i$; and SSE indicates the quality of the clustering;
then plotting an elbow graph of the calculated SSE with N on the abscissa and SSE on the ordinate, finding the number of classes at the elbow point of the graph, and taking that number as the true cluster number K.
Preferably, the wavelet basis used for the continuous wavelet transform includes, but is not limited to, the Mexican hat (MEXH) wavelet.
Preferably, screening out the wavelet coefficients whose correlation coefficient is greater than the preset threshold means screening out the wavelet coefficients whose correlation coefficient ranks in the top 1%.
Preferably, the hyperparameter optimization method includes, but is not limited to, grid search, random search and Bayesian optimization.
The second object of the invention is achieved by the following technical scheme: a computer-readable storage medium storing a program which, when executed by a processor, implements a chlorophyll-a concentration inversion method combining pre-classification and machine learning according to a first object of the present invention.
The third object of the invention is achieved by the following technical scheme: a computing device includes a processor and a memory for storing a program executable by the processor, wherein the processor implements the chlorophyll-a concentration inversion method combining pre-classification and machine learning according to the first object of the present invention when executing the program stored in the memory.
Compared with the prior art, the invention has the following advantages and effects:
(1) The invention combines pre-classification with machine learning for chlorophyll a concentration inversion: unsupervised classification of the measured hyperspectral signals divides water-body reflectance signals with complex optical characteristics into K classes. To further improve inversion accuracy, continuous wavelet transform is introduced and the spectral wavelet features of the hyperspectral signals are used for modeling. Finally, a support vector regression model is built for each class; these models are more robust and can accurately invert chlorophyll a concentration.
(2) The collected spectral signals are pre-classified and the classified signals are subjected to continuous wavelet transform, which converts the field hyperspectral signals into wavelet coefficients of each class. Continuous wavelet transform is a signal processing technique that can effectively detect weak, hidden information in water spectral signals, so after the transform the correlation between reflectance and the measured chlorophyll a concentration is markedly improved. This facilitates correlation-based screening, suppresses noise interference, and benefits the inversion accuracy of the model.
(3) The invention uses the correlation coefficient to screen out effective, highly correlated wavelet coefficients, which further improves the inversion accuracy and modeling efficiency of the model.
Drawings
FIG. 1 is a flow chart of the invention for constructing a chlorophyll a concentration inversion model.
Fig. 2 is an elbow graph with N values on the abscissa and SSE on the ordinate.
FIG. 3 is a graph showing the correlation between wavelet coefficients and measured chlorophyll a concentrations.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
The embodiment discloses a chlorophyll a concentration inversion method combining pre-classification and machine learning, as shown in fig. 1, comprising the following steps:
S1, collecting reflectance data $R_{rs}(\lambda)$ of the surface of Case II water bodies in the field, where $\lambda$ denotes the measured wavelength, and synchronously collecting the chlorophyll a concentration values $C_{Chl\text{-}a}$ of the Case II water bodies.
Specifically, a spectrometer is used to collect reflectance data of the water surface in the field, and a high performance liquid chromatograph, a fluorometer or a spectrophotometer is used to measure the chlorophyll a concentration of the Case II water body. The spectral measurement range of the spectrometer is 400-900 nm, and the spectrometer may be ground-based, airborne or spaceborne.
S2, pre-classifying the reflectivity data.
In this embodiment, the K-means method is used to classify the reflectance data in an unsupervised manner; the reflectance data are classified into N classes, and the value range of N may be set to [1, 10].
S3, computing the sum of squared errors for different numbers of classes, that is, computing the sum of squared errors within the classes obtained with each value of N, finding the number of classes that gives the best classification according to the sum of squared errors, and taking that number as the true cluster number K.
In this embodiment, as shown in fig. 2, the data are classified into 2, 3, ..., 10 classes in turn; the sum of squared errors is computed over the 2 classes when N = 2, over the 3 classes when N = 3, and so on up to the 10 classes when N = 10. The sum of squared errors SSE is calculated as:
$$\mathrm{SSE} = \sum_{i=1}^{N} \sum_{p \in C_i} \lvert p - m_i \rvert^2$$
where $C_i$ is the $i$-th class; $p$ is a sample point in $C_i$; $m_i$ is the centroid of $C_i$. SSE indicates the quality of the clustering: the smaller the SSE, the more compact the clusters.
Then, with N on the abscissa and SSE on the ordinate, an elbow graph is drawn from the calculated SSE values, the number of classes at which the decrease in SSE abruptly slows is found from the graph, and that number is taken as the true cluster number K.
As shown in fig. 2, the compactness of the clusters increases as N increases: the SSE value decreases and the whole curve falls. The number of classes at which the downward trend slows, that is, the elbow point, is the true cluster number K. As can be seen from fig. 2, K = 4.
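The pre-classification and SSE steps (S2 and S3) can be sketched as follows. This is a minimal illustration, not the patented implementation: it uses a plain Lloyd-style K-means written with NumPy, the function names `kmeans` and `elbow_curve` are hypothetical, and the spectra are synthetic stand-ins for measured reflectance.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels, sse)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct samples.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every spectrum to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each centroid as the mean of its members (keep it if empty).
        new_centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    # SSE: sum over classes of squared distances of members to their centroid.
    sse = float(d2[np.arange(len(X)), labels].sum())
    return centroids, labels, sse

def elbow_curve(X, n_values=range(1, 11)):
    """SSE for every candidate number of classes N."""
    return {n: kmeans(X, n)[2] for n in n_values}

# Synthetic example: 4 groups of 30 "spectra" with 50 bands each (made-up data).
rng = np.random.default_rng(42)
centers = rng.uniform(0.0, 0.05, size=(4, 50))
X = np.vstack([c + rng.normal(0.0, 0.001, size=(30, 50)) for c in centers])

curve = elbow_curve(X)
for n, sse in curve.items():
    print(n, sse)
```

Plotting `curve` with N on the abscissa and SSE on the ordinate gives the elbow graph; as in the embodiment, K is then read off at the point where the decrease flattens. A single random initialisation can settle in a local minimum, so several restarts per N may be worthwhile in practice.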
S4, dividing all the collected reflectivity data into K categories.
S5, performing continuous wavelet transform on the K classes of reflectance data at different scales to obtain the wavelet coefficients of each class. The wavelet coefficients are denoted CWT(K, scale, λ), and the wavelet basis used for the transform includes, but is not limited to, the Mexican hat (MEXH) wavelet.
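As an illustration of step S5, the sketch below computes a continuous wavelet transform of one spectrum with a Mexican hat mother wavelet using only NumPy. The function names, the choice of scales and the ±5-scale support are assumptions made for the example; the patent does not specify them, and normalisation conventions differ between wavelet libraries.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet with unit L2 norm."""
    return (2.0 / (np.sqrt(3.0) * np.pi ** 0.25)) * (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt_mexh(signal, scales):
    """Return wavelet coefficients with shape (len(scales), len(signal))."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    out = np.empty((len(scales), n))
    for row, s in enumerate(scales):
        # Sample the scaled wavelet on a support of about +/- 5 scales.
        half = int(np.ceil(5 * s))
        if 2 * half + 1 > n:
            raise ValueError("scale too large for the signal length")
        t = np.arange(-half, half + 1) / s
        psi = mexican_hat(t) / np.sqrt(s)
        # The wavelet is symmetric, so convolution equals correlation here.
        out[row] = np.convolve(signal, psi, mode="same")
    return out

# Synthetic spectrum on a 1 nm grid over 400-900 nm with one feature at 650 nm.
wavelengths = np.arange(400, 901)
spectrum = np.exp(-((wavelengths - 650) ** 2) / (2 * 10.0 ** 2))
scales = [2, 4, 8, 16, 32]  # hypothetical dyadic scales
coeffs = cwt_mexh(spectrum, scales)
print(coeffs.shape)
```

Each row of `coeffs` corresponds to one scale and each column to one wavelength, which matches the (scale, λ) indexing of CWT(K, scale, λ) for a single spectrum of class K.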
S6, performing correlation analysis between the wavelet coefficients CWT(K, scale, λ) of each class and the chlorophyll a concentration values $C_{Chl\text{-}a}$ measured in step S1 to obtain the correlation coefficient r between each wavelet coefficient and $C_{Chl\text{-}a}$ (see fig. 3), and then screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold.
In this embodiment, the preset threshold is the value bounding the top 1% of the correlation coefficients sorted from largest to smallest; that is, the wavelet coefficients whose correlation coefficient ranks in the top 1% are retained. The higher the correlation coefficient, the more strongly the wavelet coefficient is related to the chlorophyll a concentration.
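The screening in step S6 can be sketched as below, assuming the wavelet coefficients of one class have been flattened into a matrix with one row per sample and one column per (scale, wavelength) pair. Ranking by the absolute value of Pearson's r is an assumption of this example; the source only states that the top 1% of correlation coefficients are kept. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def screen_top_fraction(coeffs, chl, fraction=0.01):
    """Return (indices of the retained features, Pearson r of every feature)."""
    xc = coeffs - coeffs.mean(axis=0)
    yc = chl - chl.mean()
    num = xc.T @ yc
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Features with zero variance get r = 0 instead of a division error.
    r = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)
    n_keep = max(1, int(np.ceil(fraction * r.size)))
    # Assumption: rank by |r| so strong negative correlations are also kept.
    keep = np.argsort(-np.abs(r))[:n_keep]
    return keep, r

# Synthetic example: 60 samples, 500 flattened wavelet coefficients.
rng = np.random.default_rng(1)
coeffs = rng.normal(size=(60, 500))
chl = 3.0 * coeffs[:, 7] + rng.normal(0.0, 0.1, size=60)  # feature 7 drives Chl-a

keep, r = screen_top_fraction(coeffs, chl)
print(keep, r[keep])
```

The retained columns `coeffs[:, keep]` would then serve as the independent variables of the regression model for that class.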
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to finally obtain, for each class, a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable.
The hyperparameter optimization method includes, but is not limited to, grid search, random search and Bayesian optimization. For example, in this embodiment a grid search is used to find the hyperparameters (including the box constraint, kernel scale, epsilon, kernel function type, and the like) that minimize the mean square error MSE, which finally yields the inversion model. The MSE is calculated as:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2$$
where $n$ is the number of samples; $i$ is the sample index; $\hat{y}_i$ is the chlorophyll a concentration predicted by the model; and $y_i$ is the measured chlorophyll a concentration.
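A hedged sketch of step S7 using scikit-learn is given below: one support vector regression model per class, tuned by grid search with cross-validated MSE as the criterion. The grid values, the 5-fold cross-validation and the helper names are assumptions made for illustration; the source names the kinds of hyperparameters but not their ranges. In scikit-learn, `C` corresponds to the box constraint and `gamma` plays a role similar to a kernel scale.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Hypothetical search grid; the values are not taken from the patent.
PARAM_GRID = {
    "kernel": ["rbf", "linear"],  # kernel function type
    "C": [1.0, 10.0, 100.0],      # box constraint
    "gamma": ["scale", 0.1],      # kernel width (ignored by the linear kernel)
    "epsilon": [0.01, 0.1],       # width of the epsilon-insensitive tube
}

def fit_class_model(features, chl, cv=5):
    """Grid search that selects the SVR with the lowest cross-validated MSE."""
    search = GridSearchCV(SVR(), PARAM_GRID, scoring="neg_mean_squared_error", cv=cv)
    search.fit(features, chl)
    return search.best_estimator_

def fit_per_class(features, chl, labels):
    """One inversion model per pre-classified class."""
    return {int(k): fit_class_model(features[labels == k], chl[labels == k])
            for k in np.unique(labels)}

def mse(y_true, y_pred):
    return float(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2))

# Synthetic example: two classes whose Chl-a depends on different features.
rng = np.random.default_rng(0)
features = rng.normal(size=(80, 3))
labels = np.repeat([0, 1], 40)
chl = np.where(labels == 0, 5 + 2 * features[:, 0], 20 - 3 * features[:, 1])
chl = chl + rng.normal(0.0, 0.05, size=80)

models = fit_per_class(features, chl, labels)
train_mse = {k: mse(chl[labels == k], m.predict(features[labels == k]))
             for k, m in models.items()}
print(train_mse)
```

Fitting a separate model per class is what lets each regression specialise on one optical water type; the training MSE printed here is only a sanity check, and a held-out set would be needed to judge inversion accuracy.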
S8, for a Case II water body to be measured, collecting reflectance data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transform according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}, \quad i = 1, \dots, K$$
where $X$ is the reflectance data of the Case II water body to be measured; $X_j$ is its reflectance at the $j$-th wavelength; $C_i$ is the $i$-th class; $m_i$ is the centroid of that class (with component $m_{i,j}$ at the $j$-th wavelength); the index $i$ runs over the $n = K$ classes; and dist denotes the Euclidean distance, a similarity measure commonly used in the K-means classification method.
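The class assignment in step S8 amounts to a nearest-centroid lookup, sketched below with NumPy. The function name `assign_class` and the two-band toy centroids are hypothetical; real centroids would have one component per measured wavelength.

```python
import numpy as np

def assign_class(x, centroids):
    """Return (index of the nearest centroid, Euclidean distance to every centroid)."""
    x = np.asarray(x, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # dist(X, C_i) = sqrt(sum_j (X_j - m_ij)^2) for each class i.
    d = np.sqrt(((centroids - x) ** 2).sum(axis=1))
    return int(d.argmin()), d

# Toy example with two classes and two "bands".
centroids = [[0.0, 0.0], [3.0, 4.0]]
cls, dist = assign_class([3.0, 3.0], centroids)
print(cls, dist)
```

The new spectrum would then be transformed as in step S5 and passed to the inversion model stored for class `cls`.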
Example 2
The embodiment discloses a computer readable storage medium storing a program which, when executed by a processor, implements the chlorophyll a concentration inversion method combining pre-classification and machine learning described in embodiment 1, and specifically includes the following steps:
S1, collecting reflectance data of the surface of Case II water bodies in the field, and synchronously collecting chlorophyll a concentration values of the Case II water bodies;
S2, pre-classifying the reflectance data;
S3, computing the sum of squared errors for different numbers of classes, finding the number of classes that gives the best classification according to the sum of squared errors, and taking that number as the true cluster number K;
S4, dividing all collected reflectance data into K classes;
S5, performing continuous wavelet transform on the K classes of reflectance data at different scales to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to finally obtain, for each class, a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting reflectance data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transform according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}, \quad i = 1, \dots, K$$
where $X$ is the reflectance data of the Case II water body to be measured, $X_j$ is its reflectance at the $j$-th wavelength, $C_i$ is the $i$-th class, $m_i$ is the centroid of the $i$-th class (with component $m_{i,j}$ at the $j$-th wavelength), and dist denotes the Euclidean distance.
The computer-readable storage medium in this embodiment may be a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), a USB flash drive, a removable hard disk, or the like.
Example 3
The embodiment discloses a computing device, which comprises a processor and a memory for storing a program executable by the processor, wherein when the processor executes the program stored by the memory, the chlorophyll a concentration inversion method combining pre-classification and machine learning is realized, and specifically comprises the following steps:
S1, collecting reflectance data of the surface of Case II water bodies in the field, and synchronously collecting chlorophyll a concentration values of the Case II water bodies;
S2, pre-classifying the reflectance data;
S3, computing the sum of squared errors for different numbers of classes, finding the number of classes that gives the best classification according to the sum of squared errors, and taking that number as the true cluster number K;
S4, dividing all collected reflectance data into K classes;
S5, performing continuous wavelet transform on the K classes of reflectance data at different scales to obtain the wavelet coefficients of each class;
S6, performing correlation analysis between the wavelet coefficients of each class and the chlorophyll a concentration values measured in step S1, and screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold;
S7, performing support vector regression modeling separately on the screened wavelet coefficients of each class, with hyperparameter optimization, to finally obtain, for each class, a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as the dependent variable;
S8, for a Case II water body to be measured, collecting reflectance data of its surface in the field, calculating the Euclidean distances between the data and the centroids of the K classes of step S4, selecting the class with the smallest Euclidean distance as the class to which the data belong, performing continuous wavelet transform according to step S5, and inputting the resulting wavelet coefficients into the chlorophyll a concentration inversion model of that class to predict the chlorophyll a concentration of the water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}, \quad i = 1, \dots, K$$
where $X$ is the reflectance data of the Case II water body to be measured, $X_j$ is its reflectance at the $j$-th wavelength, $C_i$ is the $i$-th class, $m_i$ is the centroid of the $i$-th class (with component $m_{i,j}$ at the $j$-th wavelength), and dist denotes the Euclidean distance.
The computing device described in this embodiment may be a desktop computer, a notebook computer, a tablet computer, or other terminal devices with processor functions.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them. Any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included within the protection scope of the present invention.

Claims (10)

1. The chlorophyll a concentration inversion method combining pre-classification and machine learning is characterized by comprising the following steps of:
s1, collecting reflectivity data of the surfaces of two kinds of water bodies in the field, and synchronously collecting chlorophyll a concentration values of the two kinds of water bodies;
S2, pre-classifying the reflectivity data;
S3, solving the error square sum under different classification numbers, finding out the classification number with the best classification effect according to the error square sum, and taking the classification number as a real cluster number K;
s4, dividing all collected reflectivity data into K categories;
S5, carrying out continuous wavelet transformation on the K-class reflectivity data under different scales to obtain wavelet coefficients of each class;
S6, carrying out correlation analysis on each type of wavelet coefficient and the chlorophyll a concentration value actually measured in the step S1, and screening wavelet coefficients with the correlation coefficient larger than a preset threshold value;
s7, respectively carrying out support vector regression modeling on the various wavelet coefficients after screening, carrying out super-parameter optimization, and finally obtaining a chlorophyll a concentration inversion model with the wavelet coefficients as independent variables and the chlorophyll a concentration as dependent variables correspondingly;
s8, for the second-class water body to be detected, collecting reflectivity data of the surface of the second-class water body in the field, calculating Euclidean distances between the data and the mass centers of K different classes in the step S4, selecting the class with the smallest Euclidean distance as the class to which the data belongs, performing continuous wavelet transformation according to the step S5, and inputting the obtained wavelet coefficient into a chlorophyll a concentration inversion model corresponding to the class to which the wavelet coefficient belongs to predict to obtain chlorophyll a concentration of the second-class water body;
the Euclidean distance being calculated as:
$$\mathrm{dist}(X, C_i) = \sqrt{\sum_{j} \left(X_j - m_{i,j}\right)^2}$$
wherein X is the reflectivity data of the class II water body to be detected, X_j is its reflectivity at the j-th wavelength, C_i is the i-th class, m_i is the centroid of the i-th class with components m_{i,j}, and dist denotes the Euclidean distance.
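As an illustration of the class assignment in step S8, the following minimal Python sketch computes the Euclidean distance from a measured spectrum to each class centroid and selects the nearest class. The function names and all numeric values are hypothetical and are not taken from the patent; in practice the centroids would come from the clustering in step S4.

```python
import math

def euclidean_distance(x, centroid):
    """Euclidean distance between a spectrum and a class centroid of equal length."""
    if len(x) != len(centroid):
        raise ValueError("spectrum and centroid must have the same number of bands")
    return math.sqrt(sum((xj - mj) ** 2 for xj, mj in zip(x, centroid)))

def assign_class(x, centroids):
    """Return (class index, distance) for the centroid nearest to spectrum x."""
    distances = [euclidean_distance(x, m) for m in centroids]
    best = min(range(len(centroids)), key=distances.__getitem__)
    return best, distances[best]

# Hypothetical centroids for K = 2 classes over three wavelengths (made-up values).
centroids = [[0.02, 0.05, 0.03], [0.06, 0.10, 0.08]]
spectrum = [0.05, 0.09, 0.08]

label, dist = assign_class(spectrum, centroids)
print(label, round(dist, 6))
```

The returned class index would then select which per-class inversion model receives the wavelet coefficients of the spectrum.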
2. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein the reflectivity data of the water body surface are collected with a spectrometer, and the chlorophyll a concentration of the class II water body is measured with a high-performance liquid chromatograph, a fluorometer or a spectrophotometer.
3. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 2, wherein the spectral measurement wavelength range of the spectrometer is 400-900 nm.
4. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein in step S2 the reflectivity data are classified without supervision using the K-means method.
5. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein in step S2, the reflectance data is classified into N classes, and the value range of N is set to [1, 10];
in step S3, computing the sum of squared errors under different numbers of classes, identifying the number of classes with the best classification result according to the sum of squared errors, and taking that number as the true cluster number K is specifically:
the sum of squared errors (SSE) is calculated for the classification obtained with each value of N:
$$\mathrm{SSE} = \sum_{i=1}^{N} \sum_{p \in C_i} \lVert p - m_i \rVert^2$$
wherein C_i is the i-th class; p is a sample point in C_i; m_i is the centroid of C_i; the SSE indicates the quality of the clustering;
then, with the N value as the abscissa and the SSE as the ordinate, an elbow plot is drawn from the calculated SSE values, and the number of classes at the elbow point of the plot is taken as the true cluster number K.
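The SSE computation behind the elbow plot of claim 5 can be sketched as follows. This is a minimal illustration and not the patent's implementation: it assumes a plain Lloyd-style K-means with a deterministic start, the helper names are hypothetical, and the eight two-band "spectra" are made-up toy values chosen so that two groups are obvious.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; initial centres are every (len // k)-th point."""
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the previous centre if a cluster ends up empty.
        new_centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def sse(centers, clusters):
    """Sum over classes of squared distances from each sample to its class centroid."""
    return sum(sq_dist(p, m) for m, c in zip(centers, clusters) for p in c)

# Toy data: two well-separated groups of four points each (made-up values).
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]

sse_by_n = {}
for n in range(1, 5):
    centers, clusters = kmeans(points, n)
    sse_by_n[n] = sse(centers, clusters)
    print(n, sse_by_n[n])
```

On this toy data the SSE drops sharply from N = 1 to N = 2 and only slightly afterwards, which is the bend that the elbow plot is meant to reveal.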
6. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein the wavelet basis functions used for the continuous wavelet transform include, but are not limited to, the Mexican hat (MEXH) wavelet.
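A continuous wavelet transform with the Mexican hat basis can be sketched in a few lines. The version below is a direct, unoptimised discretisation written for illustration only; the function names, the scale list and the synthetic spectrum are assumptions, and samples beyond the ends of the spectrum are simply ignored (equivalent to zero padding).

```python
import math

def mexh(t):
    """Mexican hat (Ricker) mother wavelet, normalised to unit L2 norm."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_mexh(signal, scales):
    """Return coeffs[s][b]: wavelet coefficient at scales[s] and band index b."""
    n = len(signal)
    coeffs = []
    for a in scales:
        row = []
        for b in range(n):
            # Correlate the signal with the wavelet dilated by a and shifted to b.
            total = sum(signal[x] * mexh((x - b) / a) for x in range(n))
            row.append(total / math.sqrt(a))
        coeffs.append(row)
    return coeffs

# Synthetic spectrum: a single Gaussian reflectance peak centred on band 20.
spectrum = [math.exp(-((x - 20) ** 2) / 8.0) for x in range(41)]
scales = [1, 2, 4]
coeffs = cwt_mexh(spectrum, scales)

peak_band = max(range(len(spectrum)), key=lambda b: coeffs[1][b])
print(peak_band)
```

Because the Mexican hat wavelet has zero mean, a flat spectrum yields coefficients close to zero away from the edges, so the transform responds to the shape of spectral features rather than to the overall reflectivity level.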
7. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein screening out the wavelet coefficients whose correlation coefficient is greater than a preset threshold means screening out the wavelet coefficients whose correlation coefficients rank in the top 1%.
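The screening in claim 7 can be illustrated with a small Python sketch. It assumes that the ranking uses the absolute value of the Pearson correlation coefficient, which the claim does not specify; the function names and the toy numbers are hypothetical. Each feature stands for one wavelet coefficient (one scale and wavelength pair) observed across all samples.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(vx * vy)

def screen_top_fraction(features, target, fraction=0.01):
    """Return (indices of the top `fraction` of features by |r|, all |r| scores)."""
    scores = [abs(pearson(col, target)) for col in features]
    keep = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:keep]), scores

# Toy data: 4 wavelet-coefficient features over 5 samples (made-up values).
chl_a = [2.0, 4.0, 6.0, 8.0, 10.0]
features = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [3.0, 3.0, 3.0, 3.0, 3.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
]

# A fraction of 0.25 is used only because the toy set has four features.
selected, scores = screen_top_fraction(features, chl_a, fraction=0.25)
print(selected, [round(s, 3) for s in scores])
```

With the default `fraction=0.01`, the same function would keep the top 1% of coefficients, and at least one coefficient is always retained.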
8. The chlorophyll a concentration inversion method combining pre-classification and machine learning according to claim 1, wherein the hyperparameter optimization method includes, but is not limited to, grid search, random search and Bayesian optimization.
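Of the optimization methods listed in claim 8, grid search is the simplest to sketch. The example below is a generic illustration under stated assumptions: the hyperparameter names `C`, `epsilon` and `gamma` are the usual ones for support vector regression with an RBF kernel but are not named in the patent, and `toy_validation_error` is a hypothetical stand-in for the cross-validated error of a trained regression model.

```python
import itertools
import math

def grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid; lower score is better."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_validation_error(C, epsilon, gamma):
    """Hypothetical error surface; a real run would train and validate an SVR here."""
    return ((math.log10(C) - 1.0) ** 2
            + (epsilon - 0.1) ** 2
            + (math.log10(gamma) + 2.0) ** 2)

# Candidate values are assumptions chosen for illustration.
param_grid = {
    "C": [0.1, 1.0, 10.0, 100.0],
    "epsilon": [0.01, 0.1, 0.5],
    "gamma": [0.001, 0.01, 0.1],
}

best_params, best_score = grid_search(param_grid, toy_validation_error)
print(best_params, best_score)
```

Grid search evaluates every combination, so its cost grows with the product of the list lengths; random search and Bayesian optimization trade that exhaustiveness for fewer evaluations.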
9. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the chlorophyll a concentration inversion method combining pre-classification and machine learning according to any one of claims 1 to 8.
10. A computing device comprising a processor and a memory for storing a processor-executable program, wherein the processor, when executing the program stored in the memory, implements the chlorophyll a concentration inversion method combining pre-classification and machine learning according to any one of claims 1 to 8.
CN202011403257.3A 2020-12-04 2020-12-04 Chlorophyll a concentration inversion method combining pre-classification and machine learning Active CN112528559B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011403257.3A CN112528559B (en) 2020-12-04 2020-12-04 Chlorophyll a concentration inversion method combining pre-classification and machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011403257.3A CN112528559B (en) 2020-12-04 2020-12-04 Chlorophyll a concentration inversion method combining pre-classification and machine learning

Publications (2)

Publication Number Publication Date
CN112528559A CN112528559A (en) 2021-03-19
CN112528559B CN112528559B (en) 2024-04-23

Family

ID=74998357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011403257.3A Active CN112528559B (en) 2020-12-04 2020-12-04 Chlorophyll a concentration inversion method combining pre-classification and machine learning

Country Status (1)

Country Link
CN (1) CN112528559B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159167B (en) * 2021-04-19 2023-03-03 福州大学 Inland-based chlorophyll a inversion method for different types of water bodies
CN117313017B (en) * 2023-11-28 2024-02-06 山东艺林市政园林建设集团有限公司 Color leaf research and development data processing method and system
CN117992801B (en) * 2024-04-03 2024-06-14 南京信息工程大学 Sea area monitoring method and system through satellite remote sensing technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103983584A (en) * 2014-05-30 2014-08-13 中国科学院遥感与数字地球研究所 Retrieval method and retrieval device of chlorophyll a concentration of inland case II water
CN104359847A (en) * 2014-12-08 2015-02-18 中国科学院遥感与数字地球研究所 Method and device for acquiring centroid set used for representing typical water category
CN107025467A (en) * 2017-05-09 2017-08-08 环境保护部卫星环境应用中心 A kind of method for building up and device of water body disaggregated model
CN111783826A (en) * 2020-05-27 2020-10-16 西华大学 Driving style classification method based on pre-classification and ensemble learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9165189B2 (en) * 2011-07-19 2015-10-20 Ball Horticultural Company Seed holding device and seed classification system with seed holding device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103983584A (en) * 2014-05-30 2014-08-13 中国科学院遥感与数字地球研究所 Retrieval method and retrieval device of chlorophyll a concentration of inland case II water
CN104359847A (en) * 2014-12-08 2015-02-18 中国科学院遥感与数字地球研究所 Method and device for acquiring centroid set used for representing typical water category
CN107025467A (en) * 2017-05-09 2017-08-08 环境保护部卫星环境应用中心 A kind of method for building up and device of water body disaggregated model
CN111783826A (en) * 2020-05-27 2020-10-16 西华大学 Driving style classification method based on pre-classification and ensemble learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
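Inversion of aboveground biomass of Hulunbuir grassland based on environment and disaster reduction satellite remote sensing data; 陈鹏飞; 王卷乐; 廖秀英; 尹芳; 陈宝瑞; 刘睿; Journal of Natural Resources (自然资源学报); 20100715(07); full text *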

Also Published As

Publication number Publication date
CN112528559A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
CN112528559B (en) Chlorophyll a concentration inversion method combining pre-classification and machine learning
CN111126575B (en) Gas sensor array mixed gas detection method and device based on machine learning
CN110346312B (en) Winter wheat head gibberellic disease identification method based on Fisher linear discrimination and support vector machine technology
CN113049500B (en) Water quality detection model training and water quality detection method, electronic equipment and storage medium
CN104252625A (en) Sample adaptive multi-feature weighted remote sensing image method
CN110779875B (en) Method for detecting moisture content of winter wheat ear based on hyperspectral technology
CN104063710A (en) Method for removing abnormal spectrum in actual measurement spectrum curve based on support vector machine model
CN111723876A (en) Load curve integrated spectrum clustering algorithm considering double-scale similarity
CN111582387A (en) Rock spectral feature fusion classification method and system
CN108827909B (en) Rapid soil classification method based on visible near infrared spectrum and multi-target fusion
Jafarzadeh et al. Examination of various feature selection approaches for daily precipitation downscaling in different climates
CN114399674A (en) Hyperspectral image technology-based shellfish toxin nondestructive rapid detection method and system
CN115810403A (en) Method for evaluating water pollution based on environmental characteristic information
CN108108758A (en) Towards the multilayer increment feature extracting method of industrial big data
Lin et al. Hyperspectral estimation of soil composition contents based on kernel principal component analysis and machine learning model
CN115728290A (en) Method, system, equipment and storage medium for detecting chromium element in soil
CN116297281A (en) System and method for predicting sample characteristics based on spectral measurements
AU2021102567A4 (en) Rapid diagnosis method of soil fertility grade based on hyperspectral data
CN113640244A (en) Fruit tree variety identification method based on visible near infrared spectrum
CN107451603B (en) Locust age identification method
CN112526098A (en) Continuous wavelet coefficient-based chlorophyll a concentration inversion method for class II water body
CN117312973B (en) Inland water body optical classification method and system
Lukoshkin Forest stand parameter estimation by using neural networks
Gopal et al. Artificial neural networks for detecting forest change
Kim The estimation of the variogram in geostatistical data with outliers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant