CN115221927A - Ultraviolet-visible spectrum dissolved organic carbon detection method - Google Patents

Publication number
CN115221927A
Authority
CN
China
Prior art keywords
model
organic carbon
node
dissolved organic
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210888398.1A
Other languages
Chinese (zh)
Inventor
王柯
刘半藤
陈友荣
孟佳洋
吕晓雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Shuren University
Original Assignee
Zhejiang Shuren University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Shuren University
Priority to CN202210888398.1A
Publication of CN115221927A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/33 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using ultraviolet light
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/20 Controlling water pollution; Waste water treatment

Abstract

The invention discloses an ultraviolet-visible spectrum method for detecting dissolved organic carbon, belonging to the technical field of water quality detection. The method comprises: dividing the dissolved organic carbon samples to be detected into training samples and test samples, and performing conventional dissolved organic carbon detection on the training samples to obtain their concentration values; preprocessing the training samples, the preprocessing comprising smoothing denoising, scattering correction and elimination of baseline drift; performing global feature extraction on the preprocessed training samples with a self-organizing mapping network; performing cluster analysis on the global features to obtain feature information that characterizes the spectral differences between samples; constructing a regression tree integration model; solving the regression tree integration model; introducing a regularization term and constructing a nonlinear analysis model of the regularized greedy decision tree; and substituting the test samples into the nonlinear analysis model to verify its effectiveness. The method can improve the accuracy of dissolved organic carbon concentration detection.

Description

Ultraviolet-visible spectrum dissolved organic carbon detection method
Technical Field
The invention belongs to the technical field of water quality detection, and particularly relates to a method for detecting dissolved organic carbon by using an ultraviolet-visible spectrum.
Background
Dissolved organic carbon (DOC) is an important indicator that directly reflects the degree to which human activities pollute water bodies, and it is significant for evaluating organic pollution of water in real time and for water environment protection work. Studies have shown that ultraviolet absorbance at 254 nm correlates well with the DOC concentration measured by the high-temperature oxidation method. In actual measurement, however, some wavelengths are susceptible to interference from inorganic ions, so measurement methods based on single-wavelength absorbance have poor applicability.
At present, researchers in China and abroad focus on establishing multi-wavelength analysis models for dissolved organic carbon in water. For example, Sandford et al. used an ultraviolet spectrophotometric sensor for real-time, in-situ detection of DOC in fresh water, performing quantitative analysis in the 230-300 nm range with a mixed linear analysis curve-fitting algorithm; comparison with the high-temperature catalytic oxidation method showed a good linear relationship. Fichot et al. established a correlation between the spectral slope coefficient of the ultraviolet spectrum in the 275-295 nm range and the DOC concentration for on-site, online quantitative analysis of DOC in surface water, and found a strong correlation between the ultraviolet absorption at 280 nm and the dissolved organic carbon content. Li et al. used a self-made miniature LED ultraviolet sensor to study the relationship between dissolved organic carbon and the spectral slopes at 254 nm and 280 nm in surface water and wastewater treatment, and the results showed that the spectrum at 280 nm can supplement the traditional 254 nm measurement. However, in studying multi-wavelength analysis models, these researchers did not consider extracting a highly effective subset of characteristic wavelengths from the full spectrum of dissolved organic carbon, so the models are easily disturbed by environmental factors and their detection performance is reduced.
In summary, owing to instruments, environmental noise, scattering interference and the like, the ultraviolet-visible absorbance of dissolved organic carbon and the concentration of the sample to be detected do not strictly obey the Lambert-Beer law, so how to select wavelengths reasonably and how to extract spectral features effectively remain problems to be solved. Because no scheme exists for extracting a highly effective subset of characteristic wavelengths from the full spectrum, data complexity is high and the proportion of useful information is low, which makes it difficult for existing models to detect the concentration of dissolved organic carbon accurately and efficiently.
Disclosure of Invention
The embodiment of the invention aims to provide an ultraviolet-visible spectrum method for detecting dissolved organic carbon. It addresses the technical problems that existing dissolved organic carbon detection methods are easily disturbed by instruments, environmental noise and scattering, lack a general method for detection over the full spectrum, have high data complexity and a low proportion of useful information, and therefore have difficulty detecting the concentration of dissolved organic carbon accurately and efficiently.
In order to solve this technical problem, the invention is realized as follows:
The embodiment of the invention provides a method for detecting dissolved organic carbon by ultraviolet-visible spectrum, which comprises the following steps:
S101: Dividing the dissolved organic carbon samples to be detected into training samples and test samples, and performing conventional dissolved organic carbon detection on the training samples to obtain the concentration values of the training samples;
S102: Preprocessing the training samples, wherein the preprocessing comprises smoothing denoising, scattering correction and elimination of baseline drift;
S103: Performing global feature extraction on the preprocessed training samples with a self-organizing mapping network;
S104: Performing cluster analysis on the global features to obtain feature information that characterizes the spectral differences between samples;
S105: Constructing a regression tree integration model;
S106: Solving the regression tree integration model;
S107: Introducing a regularization term and constructing a nonlinear analysis model of the regularized greedy decision tree;
S108: Substituting the test samples into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model.
In the embodiment of the invention, preprocessing the data eliminates interference from instruments, environmental noise and scattering, and the self-organizing mapping network together with the nonlinear analysis model of the regularized greedy decision tree enables organic carbon concentration detection over the full spectrum. The method is therefore widely applicable, the data complexity is low, the proportion of useful information is high, and the accuracy of dissolved organic carbon concentration detection is improved.
Drawings
Fig. 1 is a schematic flow chart of a method for detecting dissolved organic carbon in ultraviolet-visible spectrum according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings in combination with embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art based on the embodiments of the present invention without any inventive step, are within the scope of the present invention.
The method for detecting dissolved organic carbon in ultraviolet-visible spectrum provided by the embodiment of the invention is described in detail by specific embodiments and application scenarios thereof with reference to the attached drawings.
Referring to fig. 1, a schematic flow chart of a method for detecting dissolved organic carbon in ultraviolet-visible spectrum according to an embodiment of the present invention is shown.
The method for detecting the dissolved organic carbon in the ultraviolet-visible spectrum, provided by the embodiment of the invention, comprises the following steps:
S101: Dividing the dissolved organic carbon samples to be detected into training samples and test samples, and performing conventional dissolved organic carbon detection on the training samples to obtain the concentration values of the training samples.
In the practical application process, a person skilled in the art can select the relative proportion between the training sample and the testing sample according to practical needs, and the embodiment does not limit this.
In a possible implementation, S101 specifically includes:
S1011: Dividing 80% of the dissolved organic carbon samples to be detected into training samples and 20% into test samples, and performing conventional dissolved organic carbon detection on the training samples to obtain the concentration values of the training samples.
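The 80/20 division of S1011 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not say whether the division is random, so the shuffling, the fixed seed, the function name `split_samples` and the sample identifiers are all assumptions.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and test subsets.

    Random shuffling with a fixed seed is an assumption; the source only
    states the 80% / 20% proportions.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test

# Hypothetical sample identifiers standing in for measured water samples.
samples = [f"sample_{i:02d}" for i in range(50)]
train, test = split_samples(samples)
print(len(train), len(test))
```

Only the training subset would then be sent for conventional DOC measurement at this stage; the test subset is measured later, in S1082.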
S102: Preprocessing the training samples, wherein the preprocessing comprises smoothing denoising, scattering correction and elimination of baseline drift.
It should be noted that, in the preprocessing process, the characterization capability of the spectral feature on the sample information can be improved by adding feature correlation analysis on the basis of local feature extraction.
In a possible implementation, S102 specifically includes sub-steps S1021 to S1024:
S1021: A Savitzky-Golay convolution smoothing method is used to remove noise signals spanning a wide frequency range from the training samples. The Savitzky-Golay convolution smoothing is implemented as:
$$\bar{x}_n = \frac{1}{H}\sum_{i=-w}^{w} h_i\, x_{n+i} \qquad \text{(Formula 1)}$$
where $x_{n+i}$ is the value at the $(n+i)$-th wavelength of the spectrum $x$, $\bar{x}_n$ is the value at the $n$-th wavelength of the average spectrum after mean centering, $h_i$ is the smoothing coefficient, $H$ is the normalization factor, and $w$ is the window size;
S1022: Multiplicative scatter correction (MSC) is applied: the spectrum $x$ is linearly regressed on the average spectrum $\bar{x}$, and the linear regression parameters are used for scattering correction:
$$x = b_0 + b\,\bar{x}, \qquad x_{MSC} = \frac{x - b_0}{b} \qquad \text{(Formula 2)}$$
where $x_{MSC}$ is the set of spectra after multiplicative scatter correction, and $b_0$ and $b$ are correction parameters obtained by least-squares linear regression of the spectrum $x$ on the average spectrum $\bar{x}$;
S1023: The influence of baseline drift is eliminated by a background modeling method. A partial spectrum at non-characteristic wavelengths of the target analyte spectrum is selected and fitted by the least squares method in the following form, which is linear after taking logarithms:
$$A = c\lambda^{z}$$
$$\log(A) = \log(c) + z\log(\lambda) \qquad \text{(Formula 3)}$$
where $A$ is the absorbance (expressed in logarithmic form for the fit), $\lambda$ is the wavelength, $z$ is the order relating absorbance to wavelength, and $c$ is a constant;
S1024: The absorbance of the fitted partial spectrum is subtracted from the total absorbance of the training sample to obtain the absorbance of the training sample after scattering is removed.
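The three preprocessing operations of S1021 to S1024 can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the patent's implementation: the function names, the edge padding used to keep the spectrum length, the 5-point quadratic Savitzky-Golay coefficients used when calling it, and the synthetic spectrum (a Gaussian peak at 254 nm on a power-law background) are all hypothetical.

```python
import numpy as np

def sg_smooth(x, h):
    """Weighted moving window: x_bar[n] = (1/H) * sum_i h[i] * x[n+i]."""
    h = np.asarray(h, dtype=float)
    w = len(h) // 2                       # half-window size
    padded = np.pad(x, w, mode="edge")    # assumption: repeat edge values at the ends
    H = h.sum()                           # normalization factor
    return np.array([padded[n:n + 2 * w + 1] @ h for n in range(len(x))]) / H

def msc(spectra):
    """Regress each spectrum on the mean spectrum, then remove offset b0 and slope b."""
    mean = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for k, x in enumerate(spectra):
        b, b0 = np.polyfit(mean, x, 1)    # least-squares fit x ~ b0 + b * mean
        corrected[k] = (x - b0) / b
    return corrected

def remove_baseline(wavelengths, absorbance, background_mask):
    """Fit log(A) = log(c) + z*log(lambda) on background points, then subtract c*lambda^z."""
    z, log_c = np.polyfit(np.log(wavelengths[background_mask]),
                          np.log(absorbance[background_mask]), 1)
    baseline = np.exp(log_c) * wavelengths ** z
    return absorbance - baseline, (np.exp(log_c), z)

# Synthetic example (hypothetical data): absorption peak plus scattering background.
wl = np.linspace(220.0, 700.0, 241)
peak = 0.8 * np.exp(-((wl - 254.0) ** 2) / (2 * 10.0 ** 2))
scatter = 5.0e3 * wl ** -2.0
corrected, (c, z) = remove_baseline(wl, peak + scatter, wl > 450.0)
print(round(z, 3), round(corrected.max(), 3))
```

The log-log fit requires strictly positive absorbance in the background region, so in practice the background mask would have to exclude zero or negative readings.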
S103: Performing global feature extraction on the preprocessed training samples with a self-organizing mapping network.
It should be noted that the ultraviolet-visible spectral features of dissolved organic carbon are not concentrated and are easily affected by nonlinearity; under these conditions, global feature extraction with the self-organizing mapping network achieves a smaller detection error and improves detection performance.
In a possible implementation, S103 specifically includes sub-steps S1031 to S1032:
S1031: Assume that the two-dimensional array contains $m$ neurons and that the external input vector $X$ is an $N$-dimensional vector, i.e., $X = [x_1, x_2, \ldots, x_N]^T$. The weight vector $W_i$ between the input vector and the $i$-th hidden layer unit is $W_i = [w_{i1}, w_{i2}, \ldots, w_{iN}]^T$, where $w_{iN}$ is the $N$-th weight of the $i$-th hidden layer unit;
S1032: In the competitive learning network, the neurons compete with one another to determine a winning neuron, and the winning neuron and its neighboring neurons are adjusted in the learning network. The competition result $q$ is defined as the neuron whose weight vector is closest to the input vector, i.e.:
$$q = \arg\min_{i}\,\lVert X - W_i \rVert, \quad i = 1, 2, \ldots, m \qquad \text{(Formula 4)}$$
The topological neighborhood function $\eta_{qi}(k)$ of the winning neuron $q$ is:
$$\eta_{qi}(k) = \eta(k)\exp\left(-\frac{\lVert r_i - r_q \rVert^2}{2R^2(k)}\right) \qquad \text{(Formula 5)}$$
where $r_i$ and $r_q$ are the coordinates of neurons $i$ and $q$ respectively, and $\eta(k)$ and $R(k)$ are decreasing functions, so that $\eta_{qi}(k)$ decreases monotonically with the number of iterations $k$. The neuron weights are updated by:
$$W_i(k+1) = W_i(k) + \mu(k)\,\eta_{qi}(k)\,\bigl(X - W_i(k)\bigr) \qquad \text{(Formula 6)}$$
where $W_i(k)$ is the weight at the $k$-th iteration, and $\mu(k)$ is the learning rate parameter at the $k$-th iteration, which decreases with the number of iterations $k$.
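The competitive learning loop of S1031 and S1032 can be sketched as follows. This is a minimal illustration with assumed details: a Gaussian neighborhood, linearly decreasing schedules for the learning rate and the neighborhood radius, a 4 x 4 grid, and synthetic two-group data. None of these values come from the source, and the amplitude factor of the neighborhood function is folded into the learning rate for brevity.

```python
import numpy as np

def train_som(data, grid_shape=(4, 4), n_iter=1000, seed=0):
    """Train a small self-organizing map; returns neuron weights and grid coordinates."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Grid coordinates r_i of the m = rows * cols neurons.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    weights = rng.random((rows * cols, data.shape[1]))       # W_i, random start
    for k in range(n_iter):
        x = data[rng.integers(len(data))]                    # one input vector X
        frac = k / n_iter
        mu = 0.5 * (1.0 - frac)                              # learning rate, decreasing (assumed schedule)
        radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5    # neighborhood radius, decreasing (assumed)
        q = np.argmin(np.linalg.norm(weights - x, axis=1))   # winner: closest weight vector
        d2 = np.sum((coords - coords[q]) ** 2, axis=1)       # squared grid distance to winner
        eta = np.exp(-d2 / (2.0 * radius ** 2))              # Gaussian neighborhood (assumed form)
        weights += mu * eta[:, None] * (x - weights)         # move winner and neighbors toward X
    return weights, coords

def som_features(data, weights):
    """Map each sample to the index of its best matching neuron."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

# Hypothetical data: two groups of 5-point "spectra" with different absorbance levels.
rng = np.random.default_rng(1)
data = np.vstack([0.1 + 0.02 * rng.standard_normal((20, 5)),
                  0.9 + 0.02 * rng.standard_normal((20, 5))])
weights, coords = train_som(data)
bmu = som_features(data, weights)
print(sorted(set(bmu[:20])), sorted(set(bmu[20:])))
```

Samples from the two groups end up on different neurons, which is the sense in which the map summarizes global structure of the spectra.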
S104: Performing cluster analysis on the global features to obtain feature information that characterizes the spectral differences between samples.
In a possible implementation, S104 specifically includes:
S1041: Performing cluster analysis on the global features with a k-means clustering algorithm to obtain feature information that characterizes the spectral differences between samples.
It should be noted that the amount of spectral information can be further compressed by using a k-means clustering algorithm.
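A minimal k-means sketch for S1041 is given below. It is an illustration only: the function name, the initialization from randomly chosen points, the fixed iteration count and the two-dimensional synthetic features are assumptions, and the source does not specify the number of clusters.

```python
import numpy as np

def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd iteration: assign points to the nearest center, then recompute centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(n_clusters):
            members = points[labels == j]
            if len(members):                 # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Final assignment against the converged centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1), centers

# Hypothetical global features: two well-separated groups.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 0.1, (15, 2)), rng.normal(5.0, 0.1, (15, 2))])
labels, centers = kmeans(features, n_clusters=2)
print(labels)
```

Replacing each group of similar features by its cluster center is one way to read the statement that clustering further compresses the amount of spectral information.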
S105: Constructing a regression tree integration model.
In a possible implementation, S105 specifically includes sub-steps S1051 to S1052:
S1051: For the spectral feature data, let the vector $u$ denote the feature vector of a sample in the sample set. The regression tree integration model $F_M(u)$ is then expressed as:
$$F_M(u) = \sum_{m=1}^{M} T(u;\Theta_m) \qquad \text{(Formula 7)}$$
where $T(u;\Theta_m)$ denotes a decision tree, $\Theta_m$ denotes the optimization parameters of the decision tree, and $M$ is the number of decision trees;
S1052: From Formula 7, the recursive relationship between two successive models is:
$$F_m(u) = F_{m-1}(u) + T(u;\Theta_m) \qquad \text{(Formula 8)}$$
S106: Solving the regression tree integration model.
In a possible implementation, S106 specifically includes sub-steps S1061 to S1062:
S1061: The optimization parameters $\Theta_m$ of the decision tree are solved according to the empirical risk minimization criterion:
$$\hat{\Theta}_m = \arg\min_{\Theta_m}\sum_{i=1}^{N} L\bigl(y_i,\; F_{m-1}(u_i) + T(u_i;\Theta_m)\bigr) \qquad \text{(Formula 9)}$$
where $N$ is the number of samples in the sample subset used for training, $u_i$ is the feature vector of the $i$-th sample, and $y_i$ is the corresponding measured value;
S1062: A quantitative analysis model of the spectrum is constructed with a squared error loss function, and the gradient solution is then performed through Formula 10:
$$L\bigl(y_i, F_m(u_i)\bigr) = \bigl(y_i - F_m(u_i)\bigr)^2 = \bigl(y_i - F_{m-1}(u_i) - T(u_i;\Theta_m)\bigr)^2 \qquad \text{(Formula 10)}$$
where $y_i - F_{m-1}(u_i)$ is the residual of the current model's fit to the data.
For the weak learner $T(u;\Theta_m)$, the regression tree is therefore trained on the residual of the data fit at each step.
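The additive recursion of S105 and the residual fitting of S106 can be sketched as follows. To keep the example short, each tree is a one-split regression stump, which is an assumption (the source does not fix the tree depth); the function names, the number of trees and the synthetic data are also hypothetical.

```python
import numpy as np

def fit_stump(u, residual):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    for j in range(u.shape[1]):
        values = np.unique(u[:, j])
        for t in (values[:-1] + values[1:]) / 2:      # thresholds between sorted distinct values
            left = u[:, j] <= t
            c_left, c_right = residual[left].mean(), residual[~left].mean()
            sse = (((residual[left] - c_left) ** 2).sum()
                   + ((residual[~left] - c_right) ** 2).sum())
            if best is None or sse < best[0]:
                best = (sse, j, t, c_left, c_right)
    return best[1:]                                   # tree parameters (feature, threshold, two constants)

def predict_stump(theta, u):
    j, t, c_left, c_right = theta
    return np.where(u[:, j] <= t, c_left, c_right)

def fit_boosted_trees(u, y, n_trees=30):
    """Start from F_0 = 0 and add one tree per step, each fitted to the current residual."""
    trees, f = [], np.zeros(len(y))
    for _ in range(n_trees):
        theta = fit_stump(u, y - f)                   # residual y_i - F_{m-1}(u_i)
        trees.append(theta)
        f = f + predict_stump(theta, u)               # F_m = F_{m-1} + T(u; Theta_m)
    return trees

def predict(trees, u):
    return sum(predict_stump(theta, u) for theta in trees)

# Hypothetical feature vectors u and concentrations y.
rng = np.random.default_rng(0)
u = rng.random((80, 3))
y = 2.0 * u[:, 0] + np.sin(6.0 * u[:, 1])
trees = fit_boosted_trees(u, y)
mse_1 = np.mean((y - predict(trees[:1], u)) ** 2)
mse_30 = np.mean((y - predict(trees, u)) ** 2)
print(mse_1, mse_30)
```

With the squared error loss, minimizing the empirical risk at step m is the same as fitting the new tree to the residuals, which is why each step never increases the training error.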
S107: Introducing a regularization term and constructing a nonlinear analysis model of the regularized greedy decision tree.
In one possible implementation, S107 specifically includes substeps S1071 to S1075:
S1071: The path from the root node to each leaf node constitutes a rule:
$$b_v(u) = \prod_{j} I\bigl(u[i_j] \le t_{i_j}\bigr)\,\prod_{k} I\bigl(u[i_k] > t_{i_k}\bigr) \qquad \text{(Formula 11)}$$
where $v$ denotes a node of the tree, $L(u)$ denotes the loss function, $I(\cdot)$ is the indicator function, and the base learner $b_v(u)$ indicates whether $u$ reaches node $v$ after the decisions at the decision tree nodes. The factor $\prod_{j} I(u[i_j] \le t_{i_j})$ tests, over all non-leaf nodes passed, whether the values at the $j$ nodes are no greater than the node thresholds $t_{i_j}$, and the factor $\prod_{k} I(u[i_k] > t_{i_k})$ tests, over all non-leaf nodes passed, whether the values at the $k$ nodes are greater than the node thresholds $t_{i_k}$;
S1072: Let $v_1$ and $v_2$ be the two child nodes of node $v$. By the additive method, the base learner $b_v(u)$ at node $v$ is expressed as the combination of its child nodes:
$$b_v(u) = b_{v_1}(u) + b_{v_2}(u) \qquad \text{(Formula 12)}$$
S1073: For a model $F$, each node $v$ in $F$ is associated with a pair $(b_v, a_v)$, where $b_v$ is the base learner at node $v$ and $a_v$ is the weight of the node in the global learning process. The additive model of $F$ can be expressed as $h_F(u) = \sum_{v \in F} a_v b_v(u)$. After the regularization term $R(h_F)$ is added to the additive model, the loss function is expressed as:
$$Q(F) = L\bigl(h_F(U), Y\bigr) + R(h_F) \qquad \text{(Formula 13)}$$
S1074: With all node weights fixed, a locally optimal search is performed over the leaf nodes of the whole decision forest to find the structural change that makes the loss function decrease fastest, thereby determining the locally optimal decision forest structure;
S1075: With the decision forest structure fixed, the node weights $a_v$ are updated to minimize the loss function and increase the prediction accuracy of the model. These two steps are performed alternately to obtain the optimal detection model.
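The rule indicator of S1071 and S1072 and the fixed-structure weight update of S1075 can be sketched as below. This is a sketch under explicit assumptions: the source does not give the form of the regularization term, so a squared L2 penalty on the leaf weights is assumed here, which yields a closed-form ridge solution. The function names, the thresholds and the grid data are hypothetical, and the structure search of S1074 is not shown.

```python
import numpy as np

def rule_indicator(u, conditions):
    """Return 1.0 where every (feature, threshold, go_left) test on the path holds, else 0.0."""
    inside = np.ones(len(u), dtype=bool)
    for feature, threshold, go_left in conditions:
        test = u[:, feature] <= threshold
        inside &= test if go_left else ~test
    return inside.astype(float)

def optimize_weights(B, y, lam=1.0):
    """Fixed structure: minimize ||y - B a||^2 + lam * ||a||^2 over the node weights a.

    The L2 penalty is an assumed form of the regularization term.
    """
    return np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ y)

# Hypothetical two-feature inputs on a regular grid, so that every leaf is populated.
u = np.array([[i / 10 + 0.05, j / 10 + 0.05] for i in range(10) for j in range(10)])
path_v = [(0, 0.5, True)]                                  # node v: feature 0 <= 0.5
b_v = rule_indicator(u, path_v)
b_v1 = rule_indicator(u, path_v + [(1, 0.3, True)])        # left child of v
b_v2 = rule_indicator(u, path_v + [(1, 0.3, False)])       # right child of v
b_v3 = rule_indicator(u, [(0, 0.5, False)])                # sibling leaf of v

B = np.column_stack([b_v1, b_v2, b_v3])                    # leaf indicators as columns
y = 1.0 * b_v1 + 3.0 * b_v2 - 2.0 * b_v3                   # piecewise-constant target
a_weak = optimize_weights(B, y, lam=0.1)
a_strong = optimize_weights(B, y, lam=10.0)
print(a_weak, a_strong)
```

A stronger penalty shrinks the leaf weights toward zero, which is the usual trade-off between fitting the training data and controlling model complexity.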
S108: Substituting the test samples into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model.
In a possible implementation, S108 specifically includes sub-steps S1081 to S1083:
S1081: Substituting the test samples into the nonlinear analysis model to obtain the model prediction values of the test samples;
S1082: Performing conventional dissolved organic carbon detection on the test samples to obtain the true concentration values of the test samples;
S1083: Comparing the model prediction values with the true concentration values to verify the effectiveness of the nonlinear analysis model.
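The comparison in S1083 can be quantified, for example, with the root mean square error and the coefficient of determination. The source does not name specific metrics, so this choice is an assumption, and the concentration values below are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted concentrations."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted DOC concentrations (arbitrary units).
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(rmse(measured, predicted), r_squared(measured, predicted))
```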
In the embodiment of the invention, preprocessing the data eliminates interference from instruments, environmental noise and scattering, and the self-organizing mapping network together with the nonlinear analysis model of the regularized greedy decision tree enables organic carbon concentration detection over the full spectrum. The method is therefore widely applicable, the data complexity is low, the proportion of useful information is high, and the accuracy of dissolved organic carbon concentration detection is improved.
The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (9)

1. A method for detecting dissolved organic carbon in an ultraviolet-visible spectrum is characterized by comprising the following steps:
S101: dividing the dissolved organic carbon samples to be detected into training samples and test samples, and performing conventional dissolved organic carbon detection on the training samples to obtain the concentration values of the training samples;
S102: preprocessing the training samples, wherein the preprocessing comprises smoothing denoising, scattering correction and elimination of baseline drift;
S103: performing global feature extraction on the preprocessed training samples with a self-organizing mapping network;
S104: performing cluster analysis on the global features to obtain feature information that characterizes the spectral differences between samples;
S105: constructing a regression tree integration model;
S106: solving the regression tree integration model;
S107: introducing a regularization term and constructing a nonlinear analysis model of the regularized greedy decision tree;
S108: substituting the test samples into the nonlinear analysis model to verify the effectiveness of the nonlinear analysis model.
2. The detection method according to claim 1, wherein the S101 specifically includes:
S1011: dividing 80% of the dissolved organic carbon samples to be detected into training samples and 20% into test samples, and performing conventional dissolved organic carbon detection on the training samples to obtain the concentration values of the training samples.
3. The detection method according to claim 1, wherein the S102 specifically includes:
S1021: removing noise signals spanning a wide frequency range from the training samples by a Savitzky-Golay convolution smoothing method, wherein the Savitzky-Golay convolution smoothing is implemented as:
$$\bar{x}_n = \frac{1}{H}\sum_{i=-w}^{w} h_i\, x_{n+i} \qquad \text{(Formula 1)}$$
where $x_{n+i}$ is the value at the $(n+i)$-th wavelength of the spectrum $x$, $\bar{x}_n$ is the value at the $n$-th wavelength of the average spectrum after mean centering, $h_i$ is the smoothing coefficient, $H$ is the normalization factor, and $w$ is the window size;
S1022: applying multiplicative scatter correction, wherein the spectrum $x$ is linearly regressed on the average spectrum $\bar{x}$, and the linear regression parameters are used for scattering correction:
$$x = b_0 + b\,\bar{x}$$
$$x_{MSC} = \frac{x - b_0}{b} \qquad \text{(Formula 2)}$$
where $x_{MSC}$ is the set of spectra after multiplicative scatter correction, and $b_0$ and $b$ are correction parameters obtained by least-squares linear regression of the spectrum $x$ on the average spectrum $\bar{x}$;
S1023: eliminating the influence of baseline drift by a background modeling method, wherein a partial spectrum at non-characteristic wavelengths of the target analyte spectrum is selected and fitted by the least squares method in the following form, which is linear after taking logarithms:
$$A = c\lambda^{z}$$
$$\log(A) = \log(c) + z\log(\lambda) \qquad \text{(Formula 3)}$$
where $A$ is the absorbance (expressed in logarithmic form for the fit), $\lambda$ is the wavelength, $z$ is the order relating absorbance to wavelength, and $c$ is a constant;
S1024: subtracting the absorbance of the fitted partial spectrum from the total absorbance of the training sample to obtain the absorbance of the training sample after scattering is removed.
4. The detection method according to claim 1, wherein the S103 specifically includes:
S1031: assuming that the two-dimensional array contains $m$ neurons and that the external input vector $X$ is an $N$-dimensional vector, i.e., $X = [x_1, x_2, \ldots, x_N]^T$, the weight vector $W_i$ between the input vector and the $i$-th hidden layer unit is $W_i = [w_{i1}, w_{i2}, \ldots, w_{iN}]^T$, where $w_{iN}$ is the $N$-th weight of the $i$-th hidden layer unit;
S1032: in the competitive learning network, the neurons compete with one another to determine a winning neuron, and the winning neuron and its neighboring neurons are adjusted in the learning network, wherein the competition result $q$ is defined as the neuron whose weight vector is closest to the input vector, i.e.:
$$q = \arg\min_{i}\,\lVert X - W_i \rVert, \quad i = 1, 2, \ldots, m \qquad \text{(Formula 4)}$$
wherein the topological neighborhood function $\eta_{qi}(k)$ of the winning neuron $q$ is:
$$\eta_{qi}(k) = \eta(k)\exp\left(-\frac{\lVert r_i - r_q \rVert^2}{2R^2(k)}\right) \qquad \text{(Formula 5)}$$
where $r_i$ and $r_q$ are the coordinates of neurons $i$ and $q$ respectively, and $\eta(k)$ and $R(k)$ are decreasing functions, so that $\eta_{qi}(k)$ decreases monotonically with the number of iterations $k$; the neuron weights are updated by:
$$W_i(k+1) = W_i(k) + \mu(k)\,\eta_{qi}(k)\,\bigl(X - W_i(k)\bigr) \qquad \text{(Formula 6)}$$
where $W_i(k)$ is the weight at the $k$-th iteration, and $\mu(k)$ is the learning rate parameter at the $k$-th iteration, which decreases with the number of iterations $k$.
5. The detection method according to claim 1, wherein the S104 specifically includes:
S1041: performing cluster analysis on the global features with a k-means clustering algorithm to obtain feature information that characterizes the spectral differences between samples.
6. The detection method according to claim 1, wherein the S105 specifically includes:
S1051: for the spectral feature data, letting the vector $u$ denote the feature vector of a sample in the sample set, the regression tree integration model $F_M(u)$ is expressed as:
$$F_M(u) = \sum_{m=1}^{M} T(u;\Theta_m) \qquad \text{(Formula 7)}$$
where $T(u;\Theta_m)$ denotes a decision tree, $\Theta_m$ denotes the optimization parameters of the decision tree, and $M$ is the number of decision trees;
S1052: from Formula 7, the recursive relationship between two successive models is:
$$F_m(u) = F_{m-1}(u) + T(u;\Theta_m) \qquad \text{(Formula 8)}$$
7. The detection method according to claim 6, wherein the S106 specifically includes:
S1061: solving the optimization parameters $\Theta_m$ of the decision tree according to the empirical risk minimization criterion:
$$\hat{\Theta}_m = \arg\min_{\Theta_m}\sum_{i=1}^{N} L\bigl(y_i,\; F_{m-1}(u_i) + T(u_i;\Theta_m)\bigr) \qquad \text{(Formula 9)}$$
where $N$ is the number of samples in the sample subset used for training, $u_i$ is the feature vector of the $i$-th sample, and $y_i$ is the corresponding measured value;
S1062: constructing a quantitative analysis model of the spectrum with a squared error loss function, and then performing the gradient solution through Formula 10:
$$L\bigl(y_i, F_m(u_i)\bigr) = \bigl(y_i - F_m(u_i)\bigr)^2 = \bigl(y_i - F_{m-1}(u_i) - T(u_i;\Theta_m)\bigr)^2 \qquad \text{(Formula 10)}$$
where $y_i - F_{m-1}(u_i)$ is the residual of the current model's fit to the data.
8. The detection method according to claim 1, wherein the S107 specifically includes:
S1071: the path from the root node to each leaf node constitutes a rule:
$$b_v(u) = \prod_{j} I\bigl(u[i_j] \le t_{i_j}\bigr)\,\prod_{k} I\bigl(u[i_k] > t_{i_k}\bigr) \qquad \text{(Formula 11)}$$
where $v$ denotes a node of the tree, $L(u)$ denotes the loss function, $I(\cdot)$ is the indicator function, and the base learner $b_v(u)$ indicates whether $u$ reaches node $v$ after the decisions at the decision tree nodes; the factor $\prod_{j} I(u[i_j] \le t_{i_j})$ tests, over all non-leaf nodes passed, whether the values at the $j$ nodes are no greater than the node thresholds $t_{i_j}$, and the factor $\prod_{k} I(u[i_k] > t_{i_k})$ tests, over all non-leaf nodes passed, whether the values at the $k$ nodes are greater than the node thresholds $t_{i_k}$;
S1072: letting $v_1$ and $v_2$ be the two child nodes of node $v$, the base learner $b_v(u)$ at node $v$ is expressed, by the additive method, as the combination of its child nodes:
$$b_v(u) = b_{v_1}(u) + b_{v_2}(u) \qquad \text{(Formula 12)}$$
S1073: for a model $F$, each node $v$ in $F$ is associated with a pair $(b_v, a_v)$, where $b_v$ is the base learner at node $v$ and $a_v$ is the weight of the node in the global learning process; the additive model of $F$ can be expressed as $h_F(u) = \sum_{v \in F} a_v b_v(u)$, and after the regularization term $R(h_F)$ is added to the additive model, the loss function is expressed as:
$$Q(F) = L\bigl(h_F(U), Y\bigr) + R(h_F) \qquad \text{(Formula 13)}$$
S1074: fixing all node weights, performing a locally optimal search over the leaf nodes of the whole decision forest, finding the structural change that makes the loss function decrease fastest, and determining the locally optimal decision forest structure;
S1075: fixing the decision forest structure and updating the node weights $a_v$ to minimize the loss function and improve the prediction accuracy of the model; steps S1074 and S1075 are performed alternately to obtain the optimal detection model.
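The alternation between S1074 and S1075 can be illustrated with a heavily simplified sketch. The assumptions here are not taken from the claims: features are scalars, each node is an interval whose base learner $b_v(u)$ is an indicator function, the regularizer is taken to be $R(h_F) = \lambda \sum_v a_v^2$, a split parent keeps its weight in the additive sum, and weights are updated by coordinate descent. All names (`LAMBDA`, `grow_structure`, `update_weights`, `train`) and the data are hypothetical.

```python
LAMBDA = 0.1  # hypothetical L2 regularization strength


def member(node, u):
    """Base learner b_v(u): True if u reaches node v (here, an interval test)."""
    return node["lo"] < u <= node["hi"]


def predict(forest, u):
    """Additive model h_F(u) = sum over nodes of a_v * b_v(u)."""
    return sum(n["a"] for n in forest if member(n, u))


def objective(forest, us, ys):
    """Q(F) = squared-error loss + assumed L2 regularizer on node weights."""
    loss = sum((y - predict(forest, u)) ** 2 for u, y in zip(us, ys))
    return loss + LAMBDA * sum(n["a"] ** 2 for n in forest)


def grow_structure(forest, us, ys):
    """S1074 sketch: with weights fixed, apply the leaf split that lowers Q most."""
    best = None
    for leaf in [n for n in forest if n["leaf"]]:
        inside = [(u, y - predict(forest, u)) for u, y in zip(us, ys) if member(leaf, u)]
        for t in sorted({u for u, _ in inside})[:-1]:
            gain, children = 0.0, []
            for lo, hi in ((leaf["lo"], t), (t, leaf["hi"])):
                rs = [r for u, r in inside if lo < u <= hi]
                gain += sum(rs) ** 2 / (len(rs) + LAMBDA)  # drop in Q from this child
                children.append({"lo": lo, "hi": hi, "leaf": True,
                                 "a": sum(rs) / (len(rs) + LAMBDA)})
            if best is None or gain > best[0]:
                best = (gain, leaf, children)
    if best is not None:
        best[1]["leaf"] = False
        forest.extend(best[2])


def update_weights(forest, us, ys, sweeps=5):
    """S1075 sketch: with structure fixed, minimize Q over each a_v in turn."""
    for _ in range(sweeps):
        for node in forest:
            targets = [y - predict(forest, u) + node["a"]
                       for u, y in zip(us, ys) if member(node, u)]
            node["a"] = sum(targets) / (len(targets) + LAMBDA)


def train(us, ys, rounds=3):
    forest = [{"lo": float("-inf"), "hi": float("inf"), "a": 0.0, "leaf": True}]
    history = [objective(forest, us, ys)]
    for _ in range(rounds):
        grow_structure(forest, us, ys)   # structure step
        update_weights(forest, us, ys)   # weight step
        history.append(objective(forest, us, ys))
    return forest, history


# Made-up illustrative data (not measurements from the patent)
us = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 2.9, 3.1, 5.0, 5.2]
forest, history = train(us, ys)
print(history)
```

In this sketch each structure step and each coordinate update can only lower or preserve $Q(F)$, so the recorded objective values do not increase across rounds. A practical implementation would add a stopping criterion and a richer search over structural changes.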
9. The detection method according to claim 1, wherein S108 specifically includes:
S1081: substituting the test sample into the nonlinear analysis model to obtain a model predicted value for the test sample;
S1082: carrying out conventional dissolved organic carbon detection on the test sample to obtain the actual concentration value of the test sample;
S1083: comparing the model predicted value with the actual concentration value to verify the effectiveness of the nonlinear analysis model.
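The comparison in S1083 could be carried out with standard regression metrics. The claim does not name a metric, so the choice of root-mean-square error and the coefficient of determination below is an assumption, as are the function names and the sample values.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between model predictions and reference values."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up numbers for illustration only (not results from the patent)
predicted = [1.0, 2.0, 3.0]
actual = [1.0, 2.0, 4.0]
print(rmse(predicted, actual), r_squared(predicted, actual))
```

A lower RMSE and an $R^2$ closer to 1 on held-out test samples would support the effectiveness of the model; any acceptance thresholds are application-specific and are not given in the source.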

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210888398.1A CN115221927A (en) 2022-07-26 2022-07-26 Ultraviolet-visible spectrum dissolved organic carbon detection method

Publications (1)

Publication Number Publication Date
CN115221927A true CN115221927A (en) 2022-10-21

Family

ID=83614080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210888398.1A Pending CN115221927A (en) 2022-07-26 2022-07-26 Ultraviolet-visible spectrum dissolved organic carbon detection method

Country Status (1)

Country Link
CN (1) CN115221927A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115950846A (en) * 2023-03-10 2023-04-11 灌南县北陈集动物防疫检疫所 Pig drinking water detection method and system based on optical means
CN117194902A (en) * 2023-11-08 2023-12-08 昆山尚瑞智能科技有限公司 Noise data filtering method in spectrum measurement process
CN117194902B (en) * 2023-11-08 2024-02-06 昆山尚瑞智能科技有限公司 Noise data filtering method in spectrum measurement process

Similar Documents

Publication Publication Date Title
CN115221927A (en) Ultraviolet-visible spectrum dissolved organic carbon detection method
US8731839B2 (en) Method and system for robust classification strategy for cancer detection from mass spectrometry data
CN109187392B (en) Zinc liquid trace metal ion concentration prediction method based on partition modeling
CN102072767A (en) Wavelength similarity consensus regression-based infrared spectrum quantitative analysis method and device
CN115656074B (en) Adaptive selection and estimation method for sea water COD (chemical oxygen demand) spectral variable characteristics
CN114216877B (en) Automatic detection and reconstruction method and system for spectral peak in tea near infrared spectral analysis
CN108827909B (en) Rapid soil classification method based on visible near infrared spectrum and multi-target fusion
CN113076692B (en) Method for inverting nitrogen content of leaf
CN112750507A (en) Method for simultaneously detecting content of nitrate and nitrite in water based on hybrid machine learning model
CN115810403B (en) Method for evaluating water pollution based on environmental characteristic information
Huang et al. Optimal wavelength selection for hyperspectral scattering prediction of apple firmness and soluble solids content
CN114062306B (en) Near infrared spectrum data segmentation preprocessing method
CN112858208A (en) Biomass potassium content measurement and modeling method based on infrared spectrum principal component and neural network
CN110887798A (en) Nonlinear full-spectrum water turbidity quantitative analysis method based on extreme random tree
CN115236044A (en) Method and device for calculating concentration of soluble organic carbon in water environment by fluorescence spectrometry
CN112630180B (en) Ultraviolet/visible light absorption spectrum model for detecting concentration of organophosphorus pesticide in water body
CN112881333B (en) Near infrared spectrum wavelength screening method based on improved immune genetic algorithm
CN115420707A (en) Sewage near infrared spectrum chemical oxygen demand assessment method and system
CN111220565B (en) CPLS-based infrared spectrum measuring instrument calibration migration method
CN113607683A (en) Automatic modeling method for near infrared spectrum quantitative analysis
CN113686823B (en) Water nitrite content estimation method based on transmission spectrum and PLS-Elman neural network
CN117556245B (en) Method for detecting filtered impurities in tetramethylammonium hydroxide production
CN112651428A (en) Deep learning model multi-classification method for remote Raman mineral identification
CN113376107B (en) Water quality monitoring system and method based on cloud platform
CN111562226B (en) Method and system for analyzing total nitrogen and total phosphorus in seawater based on characteristic peak area of absorption spectrum

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination