CN115221927A - Ultraviolet-visible spectrum dissolved organic carbon detection method - Google Patents
- Publication number: CN115221927A
- Application number: CN202210888398.1A
- Authority: CN (China)
- Prior art keywords: model, organic carbon, node, dissolved organic, sample
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/33—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using ultraviolet light
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/20—Controlling water pollution; Waste water treatment
Abstract
The invention discloses a method for detecting dissolved organic carbon by ultraviolet-visible spectrum, belonging to the technical field of water quality detection, and the method comprises the following steps: dividing a to-be-detected dissolved organic carbon detection sample into a training sample and a test sample, and performing conventional dissolved organic carbon detection on the training sample to obtain a concentration value of the training sample; preprocessing a training sample, wherein the preprocessing comprises smoothing denoising, scattering correction and baseline drift influence elimination; carrying out global feature extraction on the preprocessed training samples by adopting a self-organizing mapping network; performing cluster analysis on the global features to obtain feature information which can represent the spectral difference between different samples; constructing a regression tree integration model; solving the regression tree integration model; introducing a regularization term, and constructing a nonlinear analysis model of the regularized greedy decision tree; and substituting the test sample into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model. The accuracy of concentration detection of the dissolved organic carbon can be improved.
Description
Technical Field
The invention belongs to the technical field of water quality detection, and particularly relates to a method for detecting dissolved organic carbon by using an ultraviolet-visible spectrum.
Background
Dissolved organic carbon (DOC) is an important index that directly reflects the degree to which human activities pollute water, and it is significant for evaluating organic pollution of water bodies in real time and for carrying out water environment protection work. Studies have shown that ultraviolet absorbance at a wavelength of 254 nm correlates well with the dissolved organic carbon concentration measured by the high-temperature oxidation method. In actual measurement, however, some wavelengths are susceptible to interference from inorganic ions, so measurement methods based on single-wavelength absorbance have poor applicability.
At present, researchers at home and abroad focus on establishing multi-wavelength analysis models for dissolved organic carbon in water bodies. For example, Sandford et al. used an ultraviolet spectrophotometric sensor for real-time, in-situ detection of DOC in fresh water, performed quantitative analysis in the 230-300 nm wavelength range with a mixed linear analysis curve-fitting algorithm, and showed by comparison with the high-temperature catalytic oxidation method that the approach has a good linear response. Fichot et al. established a correlation between the spectral slope coefficient of the ultraviolet spectrum in the 275-295 nm range and the DOC concentration, applied it to on-site, online quantitative analysis of DOC in surface water, and found a strong correlation between the ultraviolet absorption characteristic at 280 nm and the dissolved organic carbon content. Li et al. used a homemade miniature LED ultraviolet sensor to study the relationship between dissolved organic carbon and the spectral slopes at 254 nm and 280 nm in surface water and wastewater treatment; the results showed that the spectrum at 280 nm can supplement the traditional 254 nm measurement. However, when studying multi-wavelength analysis models, these researchers did not consider that extracting an effective characteristic wavelength subset from the full spectrum of dissolved organic carbon is easily disturbed by environmental factors, which degrades the detection performance of the model.
In summary, owing to instrument effects, environmental noise, scattering interference and the like, the ultraviolet-visible absorbance of dissolved organic carbon and the concentration of the sample to be measured do not strictly obey the Lambert-Beer law, and the problems of how to select wavelengths reasonably and how to extract spectral features effectively remain to be solved. Because there is no scheme for extracting an effective characteristic wavelength subset from the full spectrum, data complexity is high and the proportion of effective information is low, so existing models struggle to detect the dissolved organic carbon concentration accurately and efficiently.
Disclosure of Invention
The embodiment of the invention aims to provide a method for detecting dissolved organic carbon by ultraviolet-visible spectrum that solves the following technical problems: existing dissolved organic carbon detection methods are easily disturbed by instruments, environmental noise and scattering; they lack a universal method for detection over the full spectrum; and their high data complexity and low proportion of effective information make it difficult to detect the dissolved organic carbon concentration accurately and efficiently.
In order to solve the above technical problems, the invention is realized as follows:
The embodiment of the invention provides a method for detecting dissolved organic carbon by ultraviolet-visible spectrum, which comprises the following steps:
S101: dividing a to-be-detected dissolved organic carbon detection sample into a training sample and a test sample, and performing conventional dissolved organic carbon detection on the training sample to obtain a concentration value of the training sample;
S102: preprocessing the training sample, wherein the preprocessing comprises smoothing denoising, scattering correction and elimination of baseline drift influence;
S103: carrying out global feature extraction on the preprocessed training samples by adopting a self-organizing mapping network;
S104: performing cluster analysis on the global features to obtain feature information which can represent the spectral difference between different samples;
S105: constructing a regression tree integration model;
S106: solving the regression tree integration model;
S107: introducing a regularization term, and constructing a nonlinear analysis model of the regularized greedy decision tree;
S108: substituting the test sample into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model.
In the embodiment of the invention, preprocessing the data eliminates the interference of instruments, environmental noise and scattering, and the self-organizing mapping network together with the nonlinear analysis model of the regularized greedy decision tree realizes concentration detection of the organic carbon over the full spectrum. The method is therefore highly universal, the data complexity is low, the proportion of effective information is high, and the accuracy of dissolved organic carbon concentration detection is improved.
Drawings
Fig. 1 is a schematic flow chart of a method for detecting dissolved organic carbon in ultraviolet-visible spectrum according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings in combination with embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art based on the embodiments of the present invention without any inventive step, are within the scope of the present invention.
The method for detecting dissolved organic carbon in ultraviolet-visible spectrum provided by the embodiment of the invention is described in detail by specific embodiments and application scenarios thereof with reference to the attached drawings.
Referring to fig. 1, a schematic flow chart of a method for detecting dissolved organic carbon in ultraviolet-visible spectrum according to an embodiment of the present invention is shown.
The method for detecting the dissolved organic carbon in the ultraviolet-visible spectrum, provided by the embodiment of the invention, comprises the following steps:
S101: dividing a to-be-detected dissolved organic carbon detection sample into a training sample and a test sample, and performing conventional dissolved organic carbon detection on the training sample to obtain a concentration value of the training sample.
In the practical application process, a person skilled in the art can select the relative proportion between the training sample and the testing sample according to practical needs, and the embodiment does not limit this.
In a possible implementation, S101 specifically includes:
S1011: 80% of the dissolved organic carbon detection samples to be detected are assigned to the training samples and 20% to the test samples, and conventional dissolved organic carbon detection is performed on the training samples to obtain the concentration values of the training samples.
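The 80/20 division can be sketched as follows. This is a minimal illustration, not the patent's procedure: the shuffling, the fixed seed, the function name `split_samples` and the sample identifiers are all assumptions, since the source only states the proportions.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle the sample order (assumed), then cut it into training and test subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test

# Hypothetical sample identifiers; real entries would be measured spectra.
samples = [f"sample_{i:02d}" for i in range(20)]
train, test = split_samples(samples)
print(len(train), len(test))  # 16 4
```

Only the training subset would then be sent for conventional dissolved organic carbon measurement at this stage; the test subset is measured later, during verification.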
S102: preprocessing the training sample, wherein the preprocessing comprises smoothing denoising, scattering correction and elimination of baseline drift influence.
It should be noted that, in the preprocessing process, the characterization capability of the spectral feature on the sample information can be improved by adding feature correlation analysis on the basis of local feature extraction.
In a possible implementation, S102 specifically includes sub-steps S1021 to S1024:
s1021: a Savitzky-Golay convolution smoothing method is adopted to remove noise signals with large frequency span in training samples, and the specific implementation mode of the Savitzky-Golay convolution smoothing method is as follows:
wherein x is n+1 At the n +1 th wavelength of the spectrum x,representing the nth wavelength of the average spectrum after mean centering treatment, hi is a smoothing coefficient, H is a normalization factor, and w is the window size;
s1022: using multivariate scattering correction method (Multiplicative Scatterer Cor)recovery, MSC) will average the spectrumPerforming linear regression with the spectrum x, and performing scattering correction by using linear regression parameters:
wherein x is MSC For the set of spectra after the multivariate scatter correction, b 0 And b is a correction parameter obtained by comparing the spectrum x and the average spectrumPerforming linear regression by using a least square method;
s1023: eliminating the influence of baseline drift by a background modeling method, selecting a partial spectrum at a non-characteristic wavelength in a target analyte spectrum, and fitting the partial spectrum in a polynomial form by using a least square method:
A=cλ Z
log (a) = log (c) + zlog (λ) formula 3
Wherein A represents absorbance, the absorbance is expressed in logarithmic form, λ represents wavelength, z represents the order relationship between absorption rate and wavelength, and c represents a constant;
s1024: and subtracting the absorbance of part of the spectrum from the total absorbance of the training sample to obtain the absorbance of the training sample after scattering is removed.
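The three preprocessing operations of S1021 to S1024 can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions: the 5-point quadratic Savitzky-Golay window (coefficients -3, 12, 17, 12, -3 with normalizer 35) is one standard choice and is not specified by the source, the function names are hypothetical, and all numbers are synthetic.

```python
import math

# Standard 5-point quadratic Savitzky-Golay coefficients h_i with normalizer H = 35 (assumed window).
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35

def savgol_smooth(x):
    """Smooth one spectrum; the two points at each edge are left unchanged."""
    half = len(SG_COEFFS) // 2
    out = list(x)
    for n in range(half, len(x) - half):
        out[n] = sum(h * x[n + i - half] for i, h in enumerate(SG_COEFFS)) / SG_NORM
    return out

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line ys ~ intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def msc(spectra):
    """Regress each spectrum on the mean spectrum, then remove offset b0 and scale b."""
    mean = [sum(col) / len(spectra) for col in zip(*spectra)]
    corrected = []
    for x in spectra:
        b0, b = least_squares_line(mean, x)
        corrected.append([(v - b0) / b for v in x])
    return corrected

def remove_baseline(wavelengths, absorbance, baseline_idx):
    """Fit A = c * lambda**z on baseline_idx in log-log space and subtract the fitted background."""
    log_l = [math.log(wavelengths[i]) for i in baseline_idx]
    log_a = [math.log(absorbance[i]) for i in baseline_idx]
    log_c, z = least_squares_line(log_l, log_a)
    return [a - math.exp(log_c) * lam ** z for lam, a in zip(wavelengths, absorbance)]

# Synthetic checks (all numbers are made up for illustration).
quadratic = [float(n * n) for n in range(7)]
smoothed = savgol_smooth(quadratic)            # a quadratic passes through unchanged

base = [0.2, 0.5, 0.9, 0.4]
scattered = [2.0 * v + 1.0 for v in base]      # same shape with an offset and a scale
corrected = msc([base, scattered])             # both rows collapse onto the mean spectrum

wavelengths = [300.0, 400.0, 500.0, 600.0, 700.0]
background = [1.0e4 * lam ** -2 for lam in wavelengths]
measured = [background[0] + 0.5] + background[1:]   # one absorption peak at 300 nm
peak_only = remove_baseline(wavelengths, measured, [1, 2, 3, 4])
```

In the last check the background is fitted only on the four wavelengths treated as non-characteristic, so the subtraction leaves the 0.5 peak at 300 nm and values close to zero elsewhere.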
S103: carrying out global feature extraction on the preprocessed training samples by adopting a self-organizing mapping network.
It should be noted that the features of the ultraviolet-visible spectrum of dissolved organic carbon are not concentrated and are easily affected by nonlinearity; performing global feature extraction with the self-organizing mapping network can achieve a smaller detection error and improve the detection performance under these conditions.
In a possible implementation, S103 specifically includes sub-steps S1031 to S1032:
S1031: Assume that the two-dimensional neuron array contains $m$ neurons and that the external input vector $X$ is an $N$-dimensional vector, i.e. $X = [x_1, x_2, \ldots, x_N]^T$. The weight vector $W_i$ between the input vector and the $i$-th hidden layer unit is $W_i = [w_{i1}, w_{i2}, \ldots, w_{iN}]^T$, where $w_{iN}$ is the $N$-th weight of the $i$-th hidden layer unit;
S1032: In the competitive learning network, the neurons compete with one another to determine a winning neuron, and the winning neuron and its neighboring neurons are adjusted in the learning network. The competition result $q$ is defined as the neuron whose weight vector is closest to the input vector, namely:
$$q = \arg\min_{i}\,\lVert X - W_i \rVert, \quad i = 1, 2, \ldots, m \qquad \text{(Equation 4)}$$
The topological neighborhood function $\eta_{qi}(k)$ of the winning neuron $q$ is:
$$\eta_{qi}(k) = \eta(k)\exp\!\left(-\frac{\lVert r_i - r_q \rVert^{2}}{2R^{2}(k)}\right) \qquad \text{(Equation 5)}$$
where $r_i$ and $r_q$ are the coordinates of neurons $i$ and $q$, respectively, $\eta(k)$ and $R(k)$ are decreasing functions, and $\eta_{qi}(k)$ decreases monotonically with the iteration number $k$. The neuron weights are given by:
$$W_i(k+1) = W_i(k) + \mu(k)\,\eta_{qi}(k)\,[X - W_i(k)] \qquad \text{(Equation 6)}$$
where $W_i(k)$ is the weight at the $k$-th iteration, and $\mu(k)$ is the learning rate parameter at the $k$-th iteration, which decreases with the iteration number $k$.
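A compact sketch of the competitive learning loop of S1031 and S1032 follows. It is illustrative only: the Gaussian neighborhood, the linear decay schedules for the learning rate and the radius, the 3x3 map size, the function names and the input data are assumptions that the source does not fix.

```python
import math
import random

def nearest_neuron(weights, x):
    """Index q of the neuron whose weight vector is closest to the input x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))

def train_som(data, rows=3, cols=3, iterations=300, seed=0):
    """Train a small two-dimensional map; returns neuron coordinates and weight vectors."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in data[0]] for _ in coords]
    for k in range(iterations):
        decay = 1.0 - k / iterations
        mu = 0.5 * decay                             # learning rate, decreasing with k
        radius = 0.5 + max(rows, cols) / 2 * decay   # neighborhood radius, decreasing with k
        x = data[rng.randrange(len(data))]
        q = nearest_neuron(weights, x)               # winning neuron
        for i, (r, c) in enumerate(coords):
            d2 = (r - coords[q][0]) ** 2 + (c - coords[q][1]) ** 2
            eta = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighborhood (assumed form)
            weights[i] = [w + mu * eta * (v - w) for w, v in zip(weights[i], x)]
    return coords, weights

# Made-up two-dimensional inputs scaled to [0, 1]; real inputs would be preprocessed spectra.
data = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95], [0.5, 0.5]]
coords, weights = train_som(data)
features = [coords[nearest_neuron(weights, x)] for x in data]  # map position of each sample
```

Using the winning neuron's map coordinates as the extracted feature is one possible reading of "global feature extraction"; the trained weight vectors themselves could equally serve as features.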
S104: performing cluster analysis on the global features to obtain feature information which can represent the spectral difference between different samples.
In a possible implementation, S104 specifically includes:
S1041: performing cluster analysis on the global features by adopting a k-means clustering algorithm to obtain feature information which can represent the spectral difference between different samples.
It should be noted that the amount of spectral information can be further compressed by using a k-means clustering algorithm.
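The k-means step can be sketched with Lloyd's algorithm as below. This is a minimal sketch: seeding the centroids with the first k points, the function names and the feature values are assumptions made for illustration.

```python
def dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids (assumed initialization)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

# Hypothetical global features for six samples forming two obvious groups.
points = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (5.1, 4.9), (0.2, 0.1), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Replacing each sample's features by its cluster label or centroid is what compresses the amount of spectral information.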
S105: constructing a regression tree integration model.
In a possible implementation, S105 specifically includes sub-steps S1051 to S1052:
S1051: For the spectral feature data, the vector $u$ denotes the feature vector of the sample set, and the regression tree integration model $F_M(u)$ is expressed as:
$$F_M(u) = \sum_{m=1}^{M} T(u;\Theta_m) \qquad \text{(Equation 7)}$$
where $T(u;\Theta_m)$ represents a decision tree, $\Theta_m$ represents the optimization parameters of the decision tree, and $M$ is the number of decision trees;
S1052: The recursive relationship between two successive models follows from Equation 7:
$$F_m(u) = F_{m-1}(u) + T(u;\Theta_m) \qquad \text{(Equation 8)}$$
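The additive structure of the model and its recursion can be illustrated with depth-1 trees. This is a sketch under assumptions: the stump representation, the starting value F_0(u) = 0 and all numeric parameters are hypothetical and are not given in the source.

```python
def make_stump(feature, threshold, left_value, right_value):
    """A depth-1 regression tree T(u; Theta) with Theta = (feature, threshold, leaf values)."""
    def tree(u):
        return left_value if u[feature] <= threshold else right_value
    return tree

def ensemble_predict(trees, u):
    """F_M(u), built by F_m(u) = F_{m-1}(u) + T(u; Theta_m) from an assumed F_0(u) = 0."""
    f = 0.0
    for tree in trees:
        f = f + tree(u)
    return f

# Two hypothetical trees acting on a two-component feature vector.
trees = [make_stump(0, 0.5, 1.0, 3.0), make_stump(1, 0.2, -0.5, 0.5)]
prediction = ensemble_predict(trees, [0.4, 0.9])
print(prediction)  # 1.0 + 0.5 = 1.5
```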
S106: solving the regression tree integration model.
In a possible implementation, S106 specifically includes sub-steps S1061 to S1062:
S1061: The optimization parameters $\Theta_m$ of the decision tree are solved according to the empirical risk minimization criterion:
$$\hat{\Theta}_m = \arg\min_{\Theta_m}\sum_{i=1}^{N} L\big(y_i,\; F_{m-1}(u_i) + T(u_i;\Theta_m)\big) \qquad \text{(Equation 9)}$$
where $N$ is the number of samples contained in the sample subset used for training, $u_i$ is the feature vector of the $i$-th sample, and $y_i$ is the corresponding measured value;
S1062: A quantitative analysis model of the spectrum is constructed by using a squared-error loss function, and the gradient solution is then performed through Equation 10:
$$L(y_i, F_m(u_i)) = (y_i - F_m(u_i))^2 = (y_i - F_{m-1}(u_i) - T(u_i;\Theta_m))^2 \qquad \text{(Equation 10)}$$
where $y_i - F_{m-1}(u_i)$ is the residual of the current model's fit to the data.
For the weak learner $T(u;\Theta_m)$, the regression tree is therefore trained on the residual of the data fit at each step.
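Training each tree on the residual of the current fit can be sketched as follows. This is a minimal sketch, assuming depth-1 trees, an exhaustive threshold search and a zero initial model; none of these choices, nor the function names or data values, come from the source.

```python
def fit_stump(us, residuals):
    """Pick the single-feature threshold split that minimizes squared error on the residuals."""
    best = None
    for f in range(len(us[0])):
        for t in sorted({u[f] for u in us})[:-1]:      # every split leaves both sides non-empty
            left = [r for u, r in zip(us, residuals) if u[f] <= t]
            right = [r for u, r in zip(us, residuals) if u[f] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lv, rv)
    return best[1:]

def boost(us, ys, n_trees=20):
    """Add one stump per round, each fitted to y_i - F_{m-1}(u_i)."""
    preds = [0.0] * len(ys)                            # assumed initial model F_0 = 0
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        f, t, lv, rv = fit_stump(us, residuals)
        stumps.append((f, t, lv, rv))
        preds = [p + (lv if u[f] <= t else rv) for u, p in zip(us, preds)]
    return stumps, preds

# Made-up one-feature training data (feature value, measured concentration).
us = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
ys = [0.5, 0.9, 1.6, 2.4, 3.1, 4.2]
stumps, preds = boost(us, ys)
initial_sse = sum(y ** 2 for y in ys)
final_sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
```

Because each stump uses the mean residual on either side of its split, the training squared error cannot increase from one round to the next.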
S107: introducing a regularization term, and constructing a nonlinear analysis model of the regularized greedy decision tree.
In one possible implementation, S107 specifically includes substeps S1071 to S1075:
S1071: The path from the root node to each leaf node constitutes a rule:
$$b_v(u) = \prod_{j} I\big(u_{[i_j]} \le t_{i_j}\big)\,\prod_{k} I\big(u_{[i_k]} > t_{i_k}\big) \qquad \text{(Equation 11)}$$
where $v$ denotes a node of the tree, $L(u)$ denotes a loss function, and the base learner $b_v(u)$ indicates whether $u$ can reach the node $v$ after the decisions made at the decision tree nodes; $I(\cdot)$ equals 1 when its condition holds and 0 otherwise; $\prod_{j} I(u_{[i_j]} \le t_{i_j})$ checks, over all the non-leaf nodes passed, whether the $j$ tested values are less than (or equal to) the node thresholds, and $\prod_{k} I(u_{[i_k]} > t_{i_k})$ checks, over all the non-leaf nodes passed, whether the $k$ tested values are greater than the node thresholds;
S1072: Let $v_1$ and $v_2$ be the two child nodes of the node $v$; according to the additive method, the base learner $b_v(u)$ at the node $v$ is the combination of its child nodes:
$$b_v(u) = b_{v_1}(u) + b_{v_2}(u) \qquad \text{(Equation 12)}$$
S1073: For the model $F$, each node $v$ in $F$ is associated with a pair $(b_v, a_v)$, where $b_v$ represents the base learner at node $v$ and $a_v$ represents the weight of the node in the global learning process. The additive model of $F$ can be expressed as $h_F(U) = \sum_{v \in F} a_v b_v(u)$, and the loss function of the additive model after adding the regularization term $R(h_F)$ is expressed as:
$$Q(F) = L(h_F(U), Y) + R(h_F) \qquad \text{(Equation 13)}$$
S1074: With all node weights fixed, a locally optimal search is performed over the leaf nodes of the whole decision forest to find the structural change that makes the loss function fall fastest, which determines the locally optimal decision forest structure;
S1075: With the decision forest structure fixed, the node weights $a_v$ are updated to minimize the loss function and increase the prediction accuracy of the model. The two steps are carried out alternately to obtain the optimal detection model.
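The alternation between a structure search (S1074) and a weight update (S1075) can be illustrated on a deliberately simplified case. This is a sketch under loud assumptions, not the method of the source: it grows a single tree rather than a forest, uses a squared-error loss, and takes the regularization term to be an L2 penalty on the leaf weights with a hypothetical strength `LAMBDA`. With disjoint leaves, the regularized optimal weight of each leaf has the closed form sum(y) / (n + LAMBDA).

```python
LAMBDA = 0.1  # hypothetical L2 penalty strength

def leaf_weight(ys):
    """Minimizer over a of sum((y - a)**2) + LAMBDA * a**2."""
    return sum(ys) / (len(ys) + LAMBDA)

def leaf_loss(ys):
    """Regularized loss contributed by one leaf at its optimal weight."""
    a = leaf_weight(ys)
    return sum((y - a) ** 2 for y in ys) + LAMBDA * a ** 2

def best_split(us, ys, leaves):
    """Structure step: find the leaf split that lowers the regularized loss the most."""
    best_gain, best = 0.0, None
    for li, idx in enumerate(leaves):
        base = leaf_loss([ys[i] for i in idx])
        for f in range(len(us[0])):
            for t in sorted({us[i][f] for i in idx})[:-1]:
                left = [i for i in idx if us[i][f] <= t]
                right = [i for i in idx if us[i][f] > t]
                gain = base - leaf_loss([ys[i] for i in left]) - leaf_loss([ys[i] for i in right])
                if gain > best_gain:
                    best_gain, best = gain, (li, left, right)
    return best

def fit_regularized_tree(us, ys, max_leaves=4):
    """Alternate between changing the structure and refitting the leaf weights."""
    leaves = [list(range(len(ys)))]
    weights = [leaf_weight(ys)]
    while len(leaves) < max_leaves:
        split = best_split(us, ys, leaves)
        if split is None:
            break                                  # no split reduces the loss any further
        li, left, right = split
        leaves[li:li + 1] = [left, right]          # structure change
        weights = [leaf_weight([ys[i] for i in idx]) for idx in leaves]  # weight step
    return leaves, weights

# Made-up data: two low and two high concentration values.
us = [[0.1], [0.2], [0.8], [0.9]]
ys = [1.0, 1.2, 3.0, 3.2]
leaves, weights = fit_regularized_tree(us, ys, max_leaves=2)
print(leaves)  # [[0, 1], [2, 3]]
```

The penalty shrinks each leaf weight slightly towards zero compared with the plain leaf mean, which is the effect a regularization term is meant to have on the fitted model.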
S108: substituting the test sample into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model.
In a possible implementation, S108 specifically includes sub-steps S1081 to S1083:
S1081: substituting the test sample into the nonlinear analysis model to obtain a model prediction value of the test sample;
S1082: carrying out conventional dissolved organic carbon detection on the test sample to obtain a real concentration value of the test sample;
S1083: comparing the model prediction value with the real concentration value to verify the effectiveness of the nonlinear analysis model.
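The comparison in S1083 can be made quantitative with standard regression metrics. The source does not say which metrics are used, so the root-mean-square error and the coefficient of determination below are assumptions, as are the concentration values.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted concentrations."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination of the predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical test-set values: conventional measurements and model predictions.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(rmse(measured, predicted))       # 0.5
print(r_squared(measured, predicted))  # 0.95
```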
In the embodiment of the invention, preprocessing the data eliminates the interference of instruments, environmental noise and scattering, and the self-organizing mapping network together with the nonlinear analysis model of the regularized greedy decision tree realizes concentration detection of the organic carbon over the full spectrum. The method is therefore highly universal, the data complexity is low, the proportion of effective information is high, and the accuracy of dissolved organic carbon concentration detection is improved.
The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (9)
1. A method for detecting dissolved organic carbon in an ultraviolet-visible spectrum is characterized by comprising the following steps:
S101: dividing a to-be-detected dissolved organic carbon detection sample into a training sample and a test sample, and performing conventional dissolved organic carbon detection on the training sample to obtain a concentration value of the training sample;
S102: preprocessing the training sample, wherein the preprocessing comprises smoothing denoising, scattering correction and baseline drift influence elimination;
S103: carrying out global feature extraction on the preprocessed training samples by adopting a self-organizing mapping network;
S104: performing cluster analysis on the global features to obtain feature information which can represent spectral differences among different samples;
S105: constructing a regression tree integration model;
S106: solving the regression tree integration model;
S107: introducing a regularization term, and constructing a nonlinear analysis model of the regularized greedy decision tree;
S108: substituting the test sample into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model.
2. The detection method according to claim 1, wherein the S101 specifically includes:
S1011: dividing 80% of the dissolved organic carbon detection samples to be detected into training samples and 20% into test samples, and carrying out conventional dissolved organic carbon detection on the training samples to obtain the concentration values of the training samples.
3. The detection method according to claim 1, wherein the S102 specifically includes:
S1021: removing noise signals with a large frequency span from the training sample by adopting a Savitzky-Golay convolution smoothing method, wherein the Savitzky-Golay convolution smoothing method is implemented as:
$$\bar{x}_n = \frac{1}{H}\sum_{i=-w}^{w} h_i\,x_{n+i} \qquad \text{(Equation 1)}$$
where $x_{n+i}$ is the value of the spectrum $x$ at the $(n+i)$-th wavelength, $\bar{x}_n$ is the value at the $n$-th wavelength of the average spectrum after mean centering, $h_i$ is a smoothing coefficient, $H$ is a normalization factor, and $w$ is the window size;
S1022: performing linear regression of the spectrum $x$ against the average spectrum $\bar{x}$ by using a multivariate scattering correction method, and performing scattering correction by using the linear regression parameters:
$$x = b_0 + b\,\bar{x}, \qquad x_{MSC} = \frac{x - b_0}{b} \qquad \text{(Equation 2)}$$
where $x_{MSC}$ is the set of spectra after the multivariate scattering correction, and $b_0$ and $b$ are correction parameters obtained by least-squares linear regression of the spectrum $x$ on the average spectrum $\bar{x}$;
S1023: eliminating the influence of baseline drift by a background modeling method, selecting a partial spectrum at non-characteristic wavelengths of the target analyte spectrum, and fitting the partial spectrum in polynomial form by the least-squares method:
$$A = c\lambda^{z}, \qquad \log(A) = \log(c) + z\log(\lambda) \qquad \text{(Equation 3)}$$
where $A$ is the absorbance, expressed in logarithmic form for the fit, $\lambda$ is the wavelength, $z$ is the order relating absorbance to wavelength, and $c$ is a constant;
S1024: subtracting the absorbance of the partial spectrum from the total absorbance of the training sample to obtain the absorbance of the training sample after scattering is removed.
4. The detection method according to claim 1, wherein the S103 specifically includes:
S1031: assuming that the two-dimensional neuron array contains $m$ neurons and that the external input vector $X$ is an $N$-dimensional vector, i.e. $X = [x_1, x_2, \ldots, x_N]^T$, the weight vector $W_i$ between the input vector and the $i$-th hidden layer unit is $W_i = [w_{i1}, w_{i2}, \ldots, w_{iN}]^T$, where $w_{iN}$ is the $N$-th weight of the $i$-th hidden layer unit;
S1032: in a competitive learning network, the neurons compete with one another to determine a winning neuron, and the winning neuron and its neighboring neurons are adjusted in the learning network, wherein the competition result $q$ is defined as the neuron whose weight vector is closest to the input vector, that is:
$$q = \arg\min_{i}\,\lVert X - W_i \rVert, \quad i = 1, 2, \ldots, m \qquad \text{(Equation 4)}$$
wherein the topological neighborhood function $\eta_{qi}(k)$ of the winning neuron $q$ is:
$$\eta_{qi}(k) = \eta(k)\exp\!\left(-\frac{\lVert r_i - r_q \rVert^{2}}{2R^{2}(k)}\right) \qquad \text{(Equation 5)}$$
where $r_i$ and $r_q$ are the coordinates of neurons $i$ and $q$, respectively, $\eta(k)$ and $R(k)$ are decreasing functions, and $\eta_{qi}(k)$ decreases monotonically with the iteration number $k$; the neuron weights are given by:
$$W_i(k+1) = W_i(k) + \mu(k)\,\eta_{qi}(k)\,[X - W_i(k)] \qquad \text{(Equation 6)}$$
where $W_i(k)$ is the weight at the $k$-th iteration, and $\mu(k)$ is the learning rate parameter at the $k$-th iteration, which decreases with the iteration number $k$.
5. The detection method according to claim 1, wherein the S104 specifically includes:
S1041: performing cluster analysis on the global features by adopting a k-means clustering algorithm to obtain feature information which can represent the spectral difference between different samples.
6. The detection method according to claim 1, wherein the S105 specifically includes:
S1051: for the spectral feature data, let the vector u denote the feature vector of the sample set; the regression tree ensemble model $F_M(u)$ is then expressed as:
$$F_M(u) = \sum_{m=1}^{M} T(u;\Theta_m) \qquad \text{(Equation 7)}$$
where $T(u;\Theta_m)$ denotes a decision tree, $\Theta_m$ denotes the optimization parameters of the decision tree, and M is the number of decision trees;
S1052: the recursive relationship between two successive models follows from Equation 7:
$$F_m(u) = F_{m-1}(u) + T(u;\Theta_m) \qquad \text{(Equation 8)}$$
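The recursion in Equation 8 can be made concrete with a few lines of Python that build the ensemble prediction one tree at a time. The trees below are hypothetical one-split functions on a scalar feature, and the starting value F_0(u) = 0 is an assumption; neither comes from the patent.

```python
def staged_predictions(trees, u):
    """Return [F_1(u), ..., F_M(u)] using F_m(u) = F_{m-1}(u) + T(u; Theta_m)."""
    stages = []
    F = 0.0  # assumed starting point F_0(u) = 0
    for tree in trees:
        F = F + tree(u)      # add the m-th tree's output to the running model
        stages.append(F)
    return stages

# Hypothetical trees, purely for illustration
trees = [
    lambda u: 1.0 if u < 0.5 else 2.0,
    lambda u: -0.25 if u < 0.2 else 0.25,
    lambda u: 0.1,
]
stages = staged_predictions(trees, 0.3)
```

The last entry of `stages` equals the plain sum of all tree outputs, which is the additive form of the ensemble model.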
7. The detection method according to claim 6, wherein the S106 specifically includes:
S1061: the optimization parameters $\Theta_m$ of the decision tree are solved according to the empirical risk minimization criterion:
$$\hat{\Theta}_m = \arg\min_{\Theta_m} \sum_{i=1}^{N} L\big(y_i,\, F_{m-1}(u_i) + T(u_i;\Theta_m)\big) \qquad \text{(Equation 9)}$$
where N denotes the number of samples in the sample subset used for training, $u_i$ denotes the feature vector of the i-th sample, and $y_i$ is the corresponding measured value;
S1062: a quantitative analysis model of the spectrum is constructed with a squared-error loss function, and the gradient is then solved through Equation 10:
$$L(y_i, F_m(u_i)) = (y_i - F_m(u_i))^2 = \big(y_i - F_{m-1}(u_i) - T(u_i;\Theta_m)\big)^2 \qquad \text{(Equation 10)}$$
where $y_i - F_{m-1}(u_i)$ is the residual of the current model's fit to the data.
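With a squared-error loss, minimizing Equation 10 over the new tree amounts to fitting that tree to the residual y_i − F_{m-1}(u_i). The sketch below shows this with depth-one regression trees (stumps) in NumPy. The stump learner, the function names, the starting model F_0 = 0, and the number of trees are assumptions chosen to keep the example short.

```python
import numpy as np

def fit_stump(U, residual):
    """Find the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(U.shape[1]):
        # Skip the largest value so the right branch is never empty
        for t in np.unique(U[:, j])[:-1]:
            left = U[:, j] <= t
            c_left, c_right = residual[left].mean(), residual[~left].mean()
            err = np.sum((residual - np.where(left, c_left, c_right)) ** 2)
            if best is None or err < best[0]:
                best = (err, j, t, c_left, c_right)
    _, j, t, c_left, c_right = best
    return j, t, c_left, c_right

def stump_predict(stump, U):
    """Evaluate a fitted stump on the rows of U."""
    j, t, c_left, c_right = stump
    return np.where(U[:, j] <= t, c_left, c_right)

def boost(U, y, n_trees=20):
    """Fit an additive model by repeatedly fitting stumps to the residual."""
    F = np.zeros(len(y))               # assumed starting model F_0 = 0
    stumps = []
    for _ in range(n_trees):
        residual = y - F               # y_i - F_{m-1}(u_i)
        stump = fit_stump(U, residual) # tree that minimizes the squared error
        stumps.append(stump)
        F = F + stump_predict(stump, U)  # F_m = F_{m-1} + T(u; Theta_m)
    return stumps, F
```

Each round can only lower or keep the training squared error, because the stump's leaf values are the residual means within each branch.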
8. The detection method according to claim 1, wherein the S107 specifically includes:
S1071: the path from the root node to each leaf node constitutes a rule:
where v denotes a node of the tree, L(u) denotes the loss function, and the base learner $b_v(u)$ indicates whether u reaches node v after the decisions made at the decision tree nodes; the two remaining terms indicate, among all non-leaf nodes passed, whether j nodes are less than the leaf node threshold and whether k nodes are greater than the leaf node threshold, respectively;
S1072: let $v_1$ and $v_2$ be the two child nodes of node v; according to the additive method, the base learner $b_v(u)$ at node v is expressed as a combination of its child nodes:
S1073: for a model F, each node v in F is associated with a pair $(b_v, a_v)$, where $b_v$ denotes the base learner at node v and $a_v$ denotes the weight of the node in the global learning process. The additive model of F can be expressed as $h_F(u) = \sum_{v \in F} a_v b_v(u)$, and the loss function of the additive model after adding a regularization term $R(h_F)$ is expressed as:
$$Q(F) = L(h_F(U), Y) + R(h_F) \qquad \text{(Equation 13)}$$
S1074: with all node weights fixed, performing a local optimal search over the leaf nodes of the entire decision forest, finding the structural change that makes the loss function decrease fastest, and determining the locally optimal decision forest structure;
S1075: with the decision forest structure fixed, updating the node weights $a_v$ to minimize the loss function and thereby improve the model's prediction accuracy, the two steps being performed alternately to obtain the optimal detection model.
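The weight update with a fixed forest structure can be sketched for one concrete choice of loss and regularizer. Assuming a squared-error loss and an L2 penalty λ‖a‖² (both assumptions, since the claim does not fix them), the loss Q becomes ‖Y − Ba‖² + λ‖a‖², where column v of the matrix B holds the indicator b_v(u_i) for every training sample, and the minimizer has a closed form. The function names and the value of `lam` are hypothetical, and the structure search step is not shown.

```python
import numpy as np

def update_node_weights(B, y, lam=1.0):
    """Minimize ||y - B a||^2 + lam * ||a||^2 over a for a fixed structure.

    B[i, v] is 1 if sample i reaches node v and 0 otherwise (lam must be > 0).
    """
    n_nodes = B.shape[1]
    # Setting the gradient to zero gives (B^T B + lam I) a = B^T y
    return np.linalg.solve(B.T @ B + lam * np.eye(n_nodes), B.T @ y)

def regularized_loss(B, y, a, lam=1.0):
    """Value of the assumed regularized loss for node weights a."""
    return float(np.sum((y - B @ a) ** 2) + lam * np.sum(a ** 2))
```

In a full alternating scheme, a structure search that adds or splits nodes would extend B with new columns, after which this update would be rerun on the enlarged matrix.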
9. The detection method according to claim 1, wherein the S108 specifically includes:
S1081: substituting the test sample into the nonlinear analysis model to obtain a model predicted value for the test sample;
S1082: performing conventional dissolved organic carbon detection on the test sample to obtain the true concentration value of the test sample;
S1083: comparing the model predicted value with the true concentration value to verify the effectiveness of the nonlinear analysis model.
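The comparison in S1083 is commonly summarized with error metrics. The sketch below computes the root-mean-square error and the coefficient of determination between predicted and measured concentrations; the choice of these two metrics and the function name `validate` are assumptions, since the claim does not say how the comparison is scored.

```python
import numpy as np

def validate(predicted, measured):
    """Compare model predictions with conventionally measured concentrations."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    err = predicted - measured
    rmse = float(np.sqrt(np.mean(err ** 2)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((measured - measured.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot  # undefined if all measured values are equal
    return {"rmse": rmse, "r2": r2}
```

A low RMSE together with an R² close to 1 on the test samples would support the effectiveness of the model.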
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210888398.1A CN115221927A (en) | 2022-07-26 | 2022-07-26 | Ultraviolet-visible spectrum dissolved organic carbon detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115221927A (en) | 2022-10-21 |
Family
ID=83614080
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210888398.1A Pending CN115221927A (en) | 2022-07-26 | 2022-07-26 | Ultraviolet-visible spectrum dissolved organic carbon detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115221927A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115950846A (en) * | 2023-03-10 | 2023-04-11 | 灌南县北陈集动物防疫检疫所 | Pig drinking water detection method and system based on optical means |
CN117194902A (en) * | 2023-11-08 | 2023-12-08 | 昆山尚瑞智能科技有限公司 | Noise data filtering method in spectrum measurement process |
CN117194902B (en) * | 2023-11-08 | 2024-02-06 | 昆山尚瑞智能科技有限公司 | Noise data filtering method in spectrum measurement process |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115221927A (en) | Ultraviolet-visible spectrum dissolved organic carbon detection method | |
US8731839B2 (en) | Method and system for robust classification strategy for cancer detection from mass spectrometry data | |
CN109187392B (en) | Zinc liquid trace metal ion concentration prediction method based on partition modeling | |
CN102072767A (en) | Wavelength similarity consensus regression-based infrared spectrum quantitative analysis method and device | |
CN115656074B (en) | Adaptive selection and estimation method for sea water COD (chemical oxygen demand) spectral variable characteristics | |
CN114216877B (en) | Automatic detection and reconstruction method and system for spectral peak in tea near infrared spectral analysis | |
CN108827909B (en) | Rapid soil classification method based on visible near infrared spectrum and multi-target fusion | |
CN113076692B (en) | Method for inverting nitrogen content of leaf | |
CN112750507A (en) | Method for simultaneously detecting content of nitrate and nitrite in water based on hybrid machine learning model | |
CN115810403B (en) | Method for evaluating water pollution based on environmental characteristic information | |
Huang et al. | Optimal wavelength selection for hyperspectral scattering prediction of apple firmness and soluble solids content | |
CN114062306B (en) | Near infrared spectrum data segmentation preprocessing method | |
CN112858208A (en) | Biomass potassium content measurement and modeling method based on infrared spectrum principal component and neural network | |
CN110887798A (en) | Nonlinear full-spectrum water turbidity quantitative analysis method based on extreme random tree | |
CN115236044A (en) | Method and device for calculating concentration of soluble organic carbon in water environment by fluorescence spectrometry | |
CN112630180B (en) | Ultraviolet/visible light absorption spectrum model for detecting concentration of organophosphorus pesticide in water body | |
CN112881333B (en) | Near infrared spectrum wavelength screening method based on improved immune genetic algorithm | |
CN115420707A (en) | Sewage near infrared spectrum chemical oxygen demand assessment method and system | |
CN111220565B (en) | CPLS-based infrared spectrum measuring instrument calibration migration method | |
CN113607683A (en) | Automatic modeling method for near infrared spectrum quantitative analysis | |
CN113686823B (en) | Water nitrite content estimation method based on transmission spectrum and PLS-Elman neural network | |
CN117556245B (en) | Method for detecting filtered impurities in tetramethylammonium hydroxide production | |
CN112651428A (en) | Deep learning model multi-classification method for remote Raman mineral identification | |
CN113376107B (en) | Water quality monitoring system and method based on cloud platform | |
CN111562226B (en) | Method and system for analyzing total nitrogen and total phosphorus in seawater based on characteristic peak area of absorption spectrum |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |