CN109187392B - Zinc liquid trace metal ion concentration prediction method based on partition modeling - Google Patents

Info

Publication number
CN109187392B
Authority
CN
China
Prior art keywords
concentration
model
mixed solution
wavelength
wavelength variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811124589.0A
Other languages
Chinese (zh)
Other versions
CN109187392A (en)
Inventor
朱红求
吴书君
李勇刚
阳春华
程菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN201811124589.0A
Publication of CN109187392A
Application granted
Publication of CN109187392B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17: Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25: Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry

Landscapes

  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a zinc liquid trace metal ion concentration prediction method based on partition modeling, which comprises the following steps: acquiring a spectrum signal diagram of the mixed solution sample, and calculating a correlation coefficient-stability value of each wavelength variable based on the spectrum signal of the mixed solution sample; acquiring an optimal wavelength variable based on the correlation coefficient-stability value of the wavelength variable; carrying out model training to obtain a concentration interval classification model of the mixed solution; respectively carrying out model training to obtain a concentration prediction model in a high concentration interval and a concentration prediction model in a low concentration interval; and obtaining a concentration interval to which the mixed solution to be detected belongs based on the concentration interval classification model, and obtaining the predicted concentration of the trace metal ions to be detected based on a concentration prediction model corresponding to a high concentration interval or a concentration prediction model corresponding to a low concentration interval. The method can improve the accuracy of the prediction model and obtain a more reliable prediction result.

Description

Zinc liquid trace metal ion concentration prediction method based on partition modeling
Technical Field
The invention belongs to the field of nonlinear quantitative analysis of absorption spectra, and particularly relates to a zinc liquid trace metal ion concentration prediction method based on partition modeling.
Background
The concentration of trace metal ions in the zinc liquid is an important process parameter in the purification stage of zinc hydrometallurgy. A moderate amount of impurity ions can act as an activator for the reaction, but an excessively high concentration degrades the electrolysis effect, so accurate detection of the trace metal ion concentration is the premise and basis for stable operation of the subsequent process. The matrix ion concentration in the zinc liquid is high: the concentration ratio of the matrix ions to the trace metal ions to be detected reaches 170,000 to 200,000, and the signals of the trace metal ions are heavily overlapped and masked by the high-concentration zinc ions and other impurity metal ions. In addition, the zinc liquid has a complex composition, the metal ions have similar chemical characteristics, and mutual interference among the ions is severe under the influence of charge distribution. Together with instrument limitations such as stray light and energy, this makes the relationship between the concentration of the trace metal ions to be detected and the signal strongly nonlinear, so the concentration of the trace metal ions is difficult to detect.
In addition, existing concentration prediction models built for detecting trace metal ion concentrations do not distinguish between high and low concentration intervals or train a model per interval for mixed solutions with different trace metal ion concentrations; instead, a single prediction model is obtained by processing the whole concentration range uniformly. However, experimental studies show that the spectral signal is affected by zinc ions and other ions to different degrees at different trace metal ion concentrations. For example, in a mixed solution of zinc Zn(II), copper Cu(II), and cobalt Co(II), Zn(II) is the high-concentration matrix ion, Co(II) is the other interfering ion, and Cu(II) is the trace ion to be detected. As shown in Fig. 8, when the Cu(II) concentration is low, the Cu(II) spectral signal is weak and largely masked by Zn(II) and Co(II), the mutual influence and interference among the ions are severe, and the relationship between the Cu(II) concentration and the spectral signal is strongly nonlinear. When the Cu(II) concentration is higher, the Zn(II)-to-Cu(II) concentration ratio decreases, the interference and masking of the Cu(II) signal by Zn(II) weaken, the correlation between the Cu(II) concentration and the spectral signal becomes strong, and the linearity increases. Because the degree of influence on the spectral signal differs with the concentration of the trace metal ions to be detected, a model obtained in the prior art by uniformly processing mixed solutions of different concentrations has low accuracy, and the resulting predictions have low reliability.
Disclosure of Invention
The invention aims to provide a zinc liquid trace metal ion concentration prediction method based on partition modeling. Partitioned prediction strengthens the aggregation of characteristic information within each interval, and the method quickly and efficiently removes the wavelength bands that are masked, overlapped, or interfered with by high-concentration matrix ions, thereby avoiding interference and masking by the matrix zinc ions and impurity metal ions, improving the accuracy of the prediction model, and yielding a more reliable prediction result.
A zinc liquid trace metal ion concentration prediction method based on partition modeling comprises the following steps:
S1: acquiring a spectral signal diagram of the mixed solution samples, and calculating a correlation coefficient-stability value of each wavelength variable based on the spectral signals of the mixed solution samples;
the spectral signal diagram is a diagram of the relation between the spectral signal and the wavelength parameter, and the wavelength variables are wavelength parameter points selected according to a preset rule;
S2: acquiring the optimal wavelength variables based on the correlation coefficient-stability values of the wavelength variables;
S3: carrying out model training to obtain a concentration interval classification model of the mixed solution;
the mixed solution samples are divided, according to the concentration of the trace metal ions, into samples in a low concentration interval and samples in a high concentration interval, and the spectral signals at the optimal wavelength variables in the spectral signal of each mixed solution sample, together with the concentration interval classification label of that sample, are input into a model for training to obtain the concentration interval classification model;
the input data of the concentration interval classification model are: the spectral signals at the optimal wavelength variables in the spectral signal of the mixed solution; the output data are: the classification label of the concentration interval to which the mixed solution belongs;
S4: carrying out model training separately to obtain a concentration prediction model for the high concentration interval and a concentration prediction model for the low concentration interval;
a: inputting the spectral signals at the optimal wavelength variables in the spectral signals of the mixed solution samples belonging to the high concentration interval, together with the concentration of the trace metal ions to be detected in the corresponding samples, into a model for training to obtain the concentration prediction model of the high concentration interval;
b: inputting the spectral signals at the optimal wavelength variables in the spectral signals of the mixed solution samples belonging to the low concentration interval, together with the concentration of the trace metal ions to be detected in the corresponding samples, into a model for training to obtain the concentration prediction model of the low concentration interval;
the input data of both the concentration prediction model of the high concentration interval and the concentration prediction model of the low concentration interval are: the spectral signals at the optimal wavelength variables in the spectral signal of the mixed solution; the output data are: the concentration of the trace metal ions to be detected in the mixed solution;
S5: obtaining the concentration interval to which the mixed solution to be detected belongs based on the concentration interval classification model of step S3, and obtaining the predicted concentration of the trace metal ions to be detected based on the concentration prediction model of step S4 that corresponds to that interval, i.e., the high concentration interval model or the low concentration interval model.
According to experimental research, the influence degree of zinc ions and other ions on the spectrum signals is different when the concentration of trace metal ions is different. Therefore, the mixed solution sample is divided into two concentration intervals, namely a high concentration interval and a low concentration interval, and a concentration interval classification model is trained; and respectively carrying out model training on the mixed solution samples in the high-concentration interval and the low-concentration interval to obtain a concentration prediction model in the high-concentration interval and a concentration prediction model in the low-concentration interval, so as to realize partition prediction, further enhance the aggregation of characteristic information in the intervals, enhance the pertinence of the models and improve the accuracy.
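The prediction step S5 can be sketched as a simple routing function. This is a minimal illustration, not the patented implementation: the function name `predict_partitioned`, the labels "low" and "high", and the toy stand-in models are all hypothetical, and any trained classifier and regressors could be passed in their place.

```python
from typing import Callable, List, Sequence, Tuple


def predict_partitioned(
    spectrum: Sequence[float],
    selected: Sequence[int],
    classify: Callable[[List[float]], str],
    predict_low: Callable[[List[float]], float],
    predict_high: Callable[[List[float]], float],
) -> Tuple[str, float]:
    """Route one spectrum through the interval classifier (S3) and then
    through the matching interval-specific regressor (S4), as in S5."""
    # Keep only the signals at the optimal wavelength variables (S2).
    features = [spectrum[j] for j in selected]
    interval = classify(features)
    if interval == "high":
        return interval, predict_high(features)
    return interval, predict_low(features)


# Hypothetical stand-in models, used only to make the sketch runnable.
def toy_classify(f: List[float]) -> str:
    return "high" if f[0] > 0.5 else "low"


def toy_low(f: List[float]) -> float:
    return 2.0 * f[0] ** 2  # pretend the low interval is nonlinear


def toy_high(f: List[float]) -> float:
    return 1.5 * f[0] + 0.1  # pretend the high interval is near-linear


result_low = predict_partitioned([0.9, 0.2, 0.1], [1, 2], toy_classify, toy_low, toy_high)
result_high = predict_partitioned([0.1, 0.8, 0.3], [1, 2], toy_classify, toy_low, toy_high)
```

Passing the models as callables keeps the routing logic independent of the model type, which matches the text's statement that the classifier and the two regressors are trained separately.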
Preferably, the process of acquiring the optimal wavelength variable in step S2 is as follows:
S21: sorting the wavelength variables in descending order of correlation coefficient-stability value;
S22: selecting different numbers of wavelength variables in sequence to construct a concentration prediction initial model for each selection;
the training input data of the concentration prediction initial model for each selection are: the spectral signals at the currently selected wavelength variables in the spectral signal of each sample, and the concentration of the trace metal ions to be detected in each sample;
the input data of the resulting concentration prediction initial model are: the spectral signals at the currently selected wavelength variables in the spectral signal of the mixed solution; the output data are: the concentration of the trace metal ions to be detected in the mixed solution;
S23: obtaining the optimal wavelength variables based on the model effect of each concentration prediction initial model;
the optimal wavelength variables are the wavelength variables selected for the best model effect, and the best model effect is: the model error is minimal and the number of selected wavelength variables is minimal.
The invention uses the correlation coefficient-stability value as the importance index for ranking and selecting wavelength variables. The correlation coefficient is used because, although the complex environment and the mutual influence among ions in the mixed solution make the relationship between the Cu(II) concentration and the spectral signal nonlinear, a positive correlation between them still exists. The larger the correlation coefficient between the spectral signal and the component to be detected, the more trace metal ion concentration information the wavelength variable contains, the higher the signal sensitivity, the smaller the interference from other ions, and the less blank information and redundant noise it contains. In addition, to improve the stability of wavelength variable selection and the reliability of the model, the wavelength variables are ranked and selected using the correlation coefficient-stability value derived from the correlation coefficient as the importance index. This effectively eliminates noise and blank information and avoids, as far as possible, interference and masking by the matrix zinc ions and impurity metal ions; it selects wavelength variables that are highly correlated with the trace metal ions to be detected, preserves the sensitivity to those ions to the greatest extent, and improves model efficiency by reducing the number of wavelength variables.
On the other hand, the method trains concentration prediction initial models and then obtains the optimal wavelength variables corresponding to the best model effect, so that the optimal wavelength variables are extracted more scientifically and accurately, and the number of wavelength variables is reduced while the model accuracy is improved. In particular, the existing MC-UVE method selects the number of wavelength variables by setting a stability threshold based on experience, eliminating uninformative variables whose stability values fall within the threshold range. However, trace metal ion concentrations against a high-concentration background have rarely been studied, so such thresholds lack empirical support. The method of the invention instead combines the sample characteristics with a regression model, which enhances its adaptability and pertinence to the samples.
Preferably, the best model effect in step S2 is: the cross-validation mean square error is minimal and the number of wavelength variables is minimal.
Preferably, in step S22, each time the wavelength variables are sequentially selected, different numbers of wavelength variables are sequentially selected from the first wavelength variable in the sequence.
The larger the correlation coefficient, the larger the correlation coefficient-stability value obtained. The larger the correlation coefficient is, the more the concentration information of the trace metal ions contained in the wavelength variable is, the higher the signal sensitivity is, the smaller the interference of other ions is, and the less blank information and redundant noise are contained, so that the wavelength variable selected each time from the first wavelength variable in the sequence can be ensured to retain the wavelength variable with the high correlation with the trace metal ions to be detected as much as possible.
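Steps S21 to S23 can be sketched as a search over nested subsets of the ranked wavelength variables. This is a hedged illustration: `select_optimal_variables` and `toy_cv_error` are hypothetical names, and the error function here is a made-up stand-in for the cross-validation mean square error of a trained concentration prediction initial model.

```python
from typing import Callable, List, Sequence, Tuple


def select_optimal_variables(
    stability: Sequence[float],
    cv_error: Callable[[List[int]], float],
) -> Tuple[List[int], float]:
    """Return the top-n ranked variable indices with the lowest error."""
    # S21: rank wavelength-variable indices by stability value, largest first.
    ranked = sorted(range(len(stability)), key=lambda j: stability[j], reverse=True)
    best_subset: List[int] = []
    best_error = float("inf")
    # S22: evaluate one model per nested subset (top 1, top 2, ..., all).
    for n in range(1, len(ranked) + 1):
        subset = ranked[:n]
        err = cv_error(subset)
        # S23: the strict "<" keeps the smaller subset when errors tie.
        if err < best_error:
            best_subset, best_error = subset, err
    return best_subset, best_error


# Hypothetical error function: variables 0 and 2 carry information,
# every other variable only adds a small penalty.
def toy_cv_error(subset: List[int]) -> float:
    informative = {0, 2}
    missing = len(informative - set(subset))
    extra = len(set(subset) - informative)
    return 1.0 * missing + 0.1 * extra


chosen, err = select_optimal_variables([0.8, 0.1, 0.9, 0.3], toy_cv_error)
```

Because the subsets are nested and evaluated from smallest to largest, the first subset that reaches the minimum error is also the one with the fewest wavelength variables.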
Further preferably, the concentration interval classification model, the concentration prediction model of the high concentration interval, the concentration prediction model of the low concentration interval, and the concentration prediction initial model are all constructed based on a support vector machine;
wherein the model training process is as follows: based on a particle swarm optimization algorithm, the penalty coefficient C and the kernel parameter σ of the support vector machine model are optimized with the cross-validation mean square error as the fitness function.
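The parameter search described above can be illustrated with a basic particle swarm optimizer over the pair (C, σ). This is only a sketch under stated assumptions: the inertia weight, acceleration constants, swarm size, and search bounds are illustrative choices not given in the source, and `toy_cv_mse` is a hypothetical smooth function standing in for the real cross-validation mean square error of a support vector machine.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    fitness: Callable[[float, float], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 30,
    n_iter: int = 200,
    w: float = 0.7,    # inertia weight (assumed value)
    c1: float = 1.5,   # cognitive acceleration (assumed value)
    c2: float = 1.5,   # social acceleration (assumed value)
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize fitness(C, sigma) inside the given box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(*p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp each coordinate so the particle stays inside the bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(*pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Hypothetical fitness with its minimum at C = 10, sigma = 1.
def toy_cv_mse(C: float, sigma: float) -> float:
    return (C - 10.0) ** 2 / 100.0 + (sigma - 1.0) ** 2


best, best_val = pso_minimize(toy_cv_mse, bounds=[(0.1, 100.0), (0.01, 10.0)])
```

In a real setting, `fitness` would train a support vector machine with the candidate C and σ and return its cross-validation mean square error, which is far more expensive than this toy function.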
Further preferably, the process of calculating the correlation coefficient-stability value of each wavelength variable based on the spectrum signal of the sample in step S1 is as follows:
S11: randomly selecting mixed solution samples from the collected mixed solution samples according to a preset proportion, based on the Monte Carlo uninformative variable elimination method;
S12: calculating, for each wavelength variable, the correlation coefficient between its spectral signal and the concentration by a preset correlation coefficient formula, based on the mixed solution samples selected in step S11;
S13: repeating steps S11 and S12 K times;
S14: calculating the correlation coefficient-stability value of each wavelength variable by a preset correlation coefficient-stability value formula, based on the correlation coefficients of each wavelength variable obtained from the K executions of steps S11 and S12, wherein K is a positive integer.
Further preferably: the preset correlation coefficient formula is as follows:
Figure BDA0001812094930000061
In the formula, $R_j$ is the correlation coefficient of the j-th wavelength variable, $I$ is the number of samples randomly selected in each Monte Carlo sampling, $J$ is the total number of wavelength variables, $x_{ij}$ is the absorbance at wavelength variable j in the i-th sample after each Monte Carlo sampling, and $y_i$ is the concentration of the trace metal ions to be detected in the i-th sample after each Monte Carlo sampling.
Further preferably, the preset correlation coefficient-stability value formula is as follows:
Figure BDA0001812094930000062
wherein $K$ represents the total number of Monte Carlo samplings, mean(·) represents the mean, std(·) represents the standard deviation, and $R_{kj}$ represents the correlation coefficient of wavelength variable j in the k-th Monte Carlo sampling.
The ordinary correlation coefficient method depends heavily on the samples: adding or losing a single random sample changes the value, and the method is particularly limited when the sample distribution is unbalanced. The existing Monte Carlo uninformative variable elimination method (MC-UVE) is a wavelength variable selection method built on the regression coefficients of linear partial least squares (PLS); it takes the linear regression coefficient as the wavelength importance index and cannot be applied to the nonlinear scenario of the invention. The correlation coefficient-stability value formula is therefore proposed. It combines Monte Carlo random sampling with the correlation coefficient as the importance index of a wavelength variable and selects the wavelength variables whose correlation coefficients have a significant influence. It is suitable for linear scenarios and for nonlinear scenarios with positive correlation, makes the fullest possible use of the sample set, improves the stability of wavelength variable selection and the reliability of the model, and reduces errors caused by sample imbalance.
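Steps S11 to S14 can be sketched as follows. Two points are assumptions rather than facts from the source, because the formula images are not available in this text: the correlation coefficient is taken to be the Pearson coefficient, and the stability value is taken to be mean over standard deviation of the K coefficients, by analogy with the MC-UVE stability index. The function names, the sampling ratio, and the synthetic data are hypothetical.

```python
import math
import random
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation (assumed form of the preset formula)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def correlation_stability(
    X: Sequence[Sequence[float]],  # X[i][j]: absorbance of sample i at variable j
    y: Sequence[float],            # y[i]: trace ion concentration of sample i
    K: int = 200,                  # number of Monte Carlo samplings (K >= 2)
    ratio: float = 0.8,            # preset sampling proportion (assumed value)
    seed: int = 0,
) -> List[float]:
    rng = random.Random(seed)
    n_samples, n_vars = len(X), len(X[0])
    n_pick = max(2, int(ratio * n_samples))
    runs: List[List[float]] = [[] for _ in range(n_vars)]
    for _ in range(K):  # S13: repeat S11 and S12 K times
        idx = rng.sample(range(n_samples), n_pick)  # S11: random subset
        ys = [y[i] for i in idx]
        for j in range(n_vars):  # S12: one coefficient per wavelength variable
            runs[j].append(pearson([X[i][j] for i in idx], ys))
    values = []
    for r in runs:  # S14: stability = mean / std (assumed form)
        mean = sum(r) / K
        std = math.sqrt(sum((v - mean) ** 2 for v in r) / (K - 1))
        values.append(mean / std if std > 0 else float("inf"))
    return values


# Synthetic data: variable 0 tracks the concentration, variable 1 is noise.
data_rng = random.Random(1)
y_toy = [0.1 * i for i in range(30)]
X_toy = [[2.0 * c + data_rng.gauss(0.0, 0.05), data_rng.gauss(0.0, 1.0)] for c in y_toy]
stab = correlation_stability(X_toy, y_toy)
```

On data like this, the informative variable has a coefficient close to 1 in every sampling, so its mean is large and its spread is small, which gives it a much larger stability value than the noise variable.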
Further preferably, when the wavelength parameter is the wavelength, the preset selection rule of the wavelength variables in step S1 is: scanning at a 1 nm scanning interval. For example, with a scanning range of 400-800 nm and a scanning interval of 1 nm, 401 wavelength points are obtained.
Further preferably, the wavelength parameter is the wavelength, and the spectral signal is absorbance. The wavelength parameter is a wavelength-dependent parameter; in other possible embodiments it may be another parameter that can implement the present solution, such as the wave number.
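The count of 401 wavelength points follows from including both endpoints of the 400-800 nm range at a 1 nm step, as this short check shows (the variable names are illustrative only):

```python
# Scanning range and interval taken from the example in the text.
start_nm, stop_nm, step_nm = 400, 800, 1

# Both endpoints are included, hence stop_nm + 1.
wavelengths = list(range(start_nm, stop_nm + 1, step_nm))
n_points = len(wavelengths)  # (800 - 400) / 1 + 1
```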
Advantageous effects
1. According to the invention, experimental researches show that trace metal ions to be detected in zinc liquid present nonlinearity in high and low concentration intervals in different degrees, a mixed solution sample is divided into two concentration intervals, namely a high concentration interval and a low concentration interval according to the characteristic, and a concentration interval classification model is trained; and respectively carrying out model training on the mixed solution samples in the high-concentration interval and the low-concentration interval to obtain a concentration prediction model in the high-concentration interval and a concentration prediction model in the low-concentration interval, so as to realize partition prediction, further enhance the aggregation of characteristic information in the intervals, enhance the pertinence of the models and improve the accuracy.
2. The method uses the correlation coefficient-stability value as the importance index for ranking and selecting wavelength variables, which effectively eliminates noise and blank information and avoids, as far as possible, interference and masking by the matrix zinc ions and impurity metal ions; it selects wavelength variables highly correlated with the trace metal ions to be detected, preserves the sensitivity to those ions to the greatest extent, and improves model efficiency by reducing the number of wavelength variables. Compared with the existing correlation coefficient method or the Monte Carlo uninformative variable elimination (MC-UVE) method, the proposed correlation coefficient-stability value formula is better suited to the nonlinear relationship between the Cu(II) concentration and the spectral signal caused by the complex environment and the mutual influence among ions in the mixed solution, so the correlation coefficient-stability value of each wavelength variable is more accurate.
3. The method trains concentration prediction initial models and then obtains the optimal wavelength variables corresponding to the best model effect, so that the optimal wavelength variables are extracted more scientifically and accurately, and the number of wavelength variables is reduced while the model accuracy is improved. In particular, the existing approach selects the number of wavelength variables by setting a stability threshold based on experience, but trace metal ion concentrations against a high-concentration background have been studied too little for such a threshold to have empirical support. By using model-based selection combined with the sample characteristics, the method overcomes this problem, improves adaptability, and is more targeted.
Drawings
Fig. 1 is a schematic flow chart of a zinc liquid trace metal ion concentration prediction method based on partition modeling according to an embodiment of the present invention.
Fig. 2 is a diagram of original spectral signals of three metal ions provided by an embodiment of the present invention.
Fig. 3 is a graph of derivative de-noised spectral signals of three metal ions according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the correlation coefficient-stability values of the wavelength variables of Cu(II) obtained by the method provided in the embodiment of the present invention.
Fig. 5 is a graph of the correlation coefficient-stability value of each wavelength variable of Cu(II) in MC-UVE PLS.
Fig. 6 is a schematic diagram of the cross-validation mean square error of support vector regression models built with different numbers of wavelength variables according to an embodiment of the present invention.
Fig. 7 is a graph of the Cu(II) concentration variation in an environment of high-concentration Zn(II) and trace Co(II) interference according to an embodiment of the present invention.
Fig. 8 is a graph of the relationship between the Cu(II) concentration and the spectral signal at 9 selected wavelength points in an environment of high-concentration Zn(II) and trace Co(II) interference according to the embodiment of the present invention.
Fig. 9 is a schematic diagram of an error between an actual value and a predicted value of a concentration of a trace metal ion in a test set sample according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to the following examples.
The ultraviolet-visible spectrophotometry (UV-Vis) is used for qualitatively and quantitatively measuring a substance to be measured by detecting the absorbance within the wavelength range of 190-800 nm, and has been widely applied to the detection of low-concentration multi-metal ions in a solution due to the advantages of high accuracy, good reproducibility, simple analysis method and the like. According to the characteristics of large concentration ratio of metal ions in the zinc liquid, various types of metal ions and the like, and the requirements on rapidity and stability of a detection instrument in online detection, the ultraviolet-visible spectrophotometry is suitable for measuring the concentration of trace metal ions in the zinc liquid. The invention adopts ultraviolet visible spectrophotometry (UV-Vis) to detect.
The invention provides a zinc liquid trace metal ion concentration prediction method based on partition modeling, which comprises the following steps:
s1: acquiring a spectrum signal diagram of the mixed solution sample, and calculating a correlation coefficient-stability value of each wavelength variable based on the spectrum signal of the mixed solution sample;
The spectral signal diagram of the mixed solution samples acquired by the invention is preferably obtained by processing the original spectral signal with derivative spectroscopy and then denoising the derivative spectral data with a wavelet function. The spectral data are obtained by ultraviolet-visible spectrophotometry; specifically, the full-band spectral signal of the mixed solution to be detected, which contains several metal ions, is acquired. For example, the absorbance at 401 wavelength points can be obtained at 1 nm intervals over the 400-800 nm wavelength range. Derivative spectroscopy is applied to the original spectral signal to separate the overlapping signals of the metal ions and recover the spectral peak of the trace metal ions to be detected. To eliminate the noise introduced by derivative spectroscopy, improve the signal-to-noise ratio, and smooth the derivative spectral signal, the derivative spectrum is denoised with the classical db3 wavelet function. The spectral signal diagram in this example is a graph of absorbance versus wavelength.
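The derivative step of this preprocessing can be sketched with finite differences. The source does not state the derivative order or the numerical scheme, so the first-order central difference below is an assumption, and `first_derivative` is a hypothetical helper name. The db3 wavelet denoising step is not shown here; it would normally rely on a wavelet library.

```python
from typing import List, Sequence


def first_derivative(absorbance: Sequence[float], step_nm: float = 1.0) -> List[float]:
    """First-order derivative spectrum on an evenly spaced wavelength grid."""
    n = len(absorbance)
    if n < 2:
        raise ValueError("need at least two wavelength points")
    d = [0.0] * n
    # One-sided differences at the two ends of the spectrum.
    d[0] = (absorbance[1] - absorbance[0]) / step_nm
    d[-1] = (absorbance[-1] - absorbance[-2]) / step_nm
    # Central differences at all interior points.
    for i in range(1, n - 1):
        d[i] = (absorbance[i + 1] - absorbance[i - 1]) / (2.0 * step_nm)
    return d


# Toy "spectrum" A(w) = w^2 on 400..405 nm, whose exact derivative is 2w.
toy_spectrum = [float(w * w) for w in range(400, 406)]
deriv = first_derivative(toy_spectrum)
```

For a quadratic signal the central difference is exact at interior points, which makes the toy spectrum a convenient sanity check.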
The specific process of step S1 is as follows:
S11: randomly selecting mixed solution samples from the collected mixed solution samples according to a preset proportion, based on the Monte Carlo uninformative variable elimination method;
S12: calculating, for each wavelength variable, the correlation coefficient between its spectral signal and the concentration by a preset correlation coefficient formula, based on the mixed solution samples selected in step S11;
S13: repeating steps S11 and S12 K times;
S14: calculating the correlation coefficient-stability value of each wavelength variable by a preset correlation coefficient-stability value formula, based on the correlation coefficients of each wavelength variable obtained from the K executions of steps S11 and S12, wherein K is a positive integer.
The preset correlation coefficient formula and the preset correlation coefficient-stability value formula are given in the Disclosure of Invention and are not repeated here. In this embodiment, the 400-800 nm wavelength range at 1 nm intervals contains 401 wavelength points, i.e., there are 401 wavelength variables. As the formulas show, each Monte Carlo sampling yields the correlation coefficients of all 401 wavelength variables for that sampling; after K Monte Carlo samplings, each of the 401 wavelength variables has K calculated correlation coefficients, from which its correlation coefficient-stability value is calculated.
S2: and acquiring the optimal wavelength variable based on the correlation coefficient-stability value of the wavelength variable. The specific implementation process is as follows:
s21: and sorting the wavelength variables according to the sequence of the correlation coefficient-stability values from large to small. I.e., the higher the correlation coefficient-stability value, the higher the number of wavelength variables.
S22: and selecting different numbers of wavelength variables in sequence to respectively construct a concentration prediction initial model.
In this embodiment, the wavelength variables are selected according to the number change rule of 1,2,3, …,401, that is, the first wavelength variable is selected for the first time, the two wavelength variables in the first two orders are selected for the second time, the three wavelength variables in the first three orders are selected for the third time, and the operations are sequentially performed, and all the 401 wavelength variables are selected for the last time. Then, a concentration prediction initial model is respectively constructed for a certain number of wavelength variables selected each time, and the concentration prediction initial model is used for searching the optimal wavelength variables.
For each selection of wavelength variables, the training input data of the corresponding concentration prediction initial model are: the spectral signal at the currently selected wavelength variables in the spectral signal of each sample, and the concentration of the trace metal ion to be measured in each sample;
the input data of the resulting concentration prediction initial model are: the spectral signal at the currently selected wavelength variables in the spectral signal of the mixed solution; the output data are: the concentration of the trace metal ion to be measured in the mixed solution.
In this embodiment, a support vector machine regression model is used; in other feasible embodiments, other models capable of realizing the same function may be adopted, and the present invention places no specific limitation on this.
S23: the optimal wavelength variable is obtained based on the model effect of each concentration prediction initial model.
The optimal wavelength variables are the wavelength variables selected under the optimal model effect. The optimal model effect is the minimum cross-validation mean square error, when the cross-validation mean square error and the number of wavelength variables are not weighed jointly. After each concentration prediction initial model is trained, its cross-validation mean square error is calculated from the input data.
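Steps S21 to S23 can be sketched as a loop over the ranked variables. The sketch below is an illustration under stated assumptions: a leave-one-out k-nearest-neighbour regressor stands in for the support vector regression model so that the example stays dependency-free, and the names `loo_cv_mse` and `select_optimal_variables` are hypothetical.

```python
def loo_cv_mse(X, y, k=1):
    """Leave-one-out cross-validation MSE of a k-nearest-neighbour regressor.

    Stand-in for the cross-validated SVR of the embodiment.
    """
    n = len(X)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
        pred = sum(y[j] for j in others[:k]) / k
        total += (pred - y[i]) ** 2
    return total / n


def select_optimal_variables(spectra, conc, stability, k=1):
    """Rank variables by stability, then pick the top-n set with minimum CV MSE."""
    n_vars = len(stability)
    # S21: descending order of stability value
    order = sorted(range(n_vars), key=lambda j: stability[j], reverse=True)
    best_n, best_mse, curve = 0, float("inf"), []
    # S22: build one model for the top 1, top 2, ..., top n_vars variables
    for n in range(1, n_vars + 1):
        cols = order[:n]
        X = [[row[j] for j in cols] for row in spectra]
        mse = loo_cv_mse(X, conc, k)
        curve.append(mse)
        if mse < best_mse:  # strict '<' keeps the smaller set on ties
            best_n, best_mse = n, mse
    # S23: the variables behind the best model are the optimal ones
    return order[:best_n], best_mse, curve
```

The returned `curve` plays the role of the cross-validation error plotted against the number of variables.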
S3: Perform model training to obtain a concentration interval classification model of the mixed solution;
the mixed solution samples are divided into low-concentration-interval samples and high-concentration-interval samples, and for each mixed solution sample the spectral signal at the optimal wavelength variables, together with the concentration interval classification label of that sample, is input into the model to be trained to obtain the concentration interval classification model. In this embodiment, the boundary between the low-concentration interval and the high-concentration interval is determined from experiments and experience, mainly from plots of trace metal ion concentration against spectral signal at representative wavelength points: the two intervals are separated where the trend of the curve changes with concentration.
The input data of the concentration interval classification model are: the spectral signal at the optimal wavelength variables in the spectral signal of the mixed solution; the output data are: the classification label of the concentration interval to which the mixed solution belongs.
In this embodiment, the concentration interval classification model is a support vector machine classification model; in other feasible embodiments, other models that achieve the same function may be used.
S4: Perform model training separately to obtain a concentration prediction model for the high-concentration interval and a concentration prediction model for the low-concentration interval;
a: inputting a spectral signal corresponding to an optimal wavelength variable in a spectral signal of a mixed solution sample belonging to a high-concentration interval and a concentration of a trace metal ion to be detected in a corresponding mixed solution sample into a model for training to obtain a concentration prediction model of the high-concentration interval;
b: inputting a spectral signal corresponding to an optimal wavelength variable in a spectral signal of a mixed solution sample belonging to a low concentration interval and a concentration of a trace metal ion to be detected in a corresponding mixed solution sample into a model for training to obtain a concentration prediction model of the low concentration interval;
the input data of the resulting high-concentration-interval and low-concentration-interval concentration prediction models are both: the spectral signal at the optimal wavelength variables in the spectral signal of the mixed solution; the output data are: the concentration of the trace metal ion to be measured in the mixed solution.
In this embodiment, both the high-concentration-interval and the low-concentration-interval concentration prediction models are support vector machine regression models; in other feasible embodiments, other models that achieve the same function may be adopted.
S5: Obtain the concentration interval to which the mixed solution to be measured belongs using the concentration interval classification model of step S3, then obtain the predicted concentration of the trace metal ion to be measured using the corresponding high-concentration-interval or low-concentration-interval concentration prediction model of step S4.
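The routing in step S5 can be sketched as below. It is a minimal sketch: the three models are passed in as plain callables, and the label strings "high" and "low" as well as the function name `predict_concentration` are assumptions made for illustration.

```python
def predict_concentration(signal, classify_interval, predict_high, predict_low):
    """Classify the concentration interval first, then apply the matching regressor.

    signal:            spectral signal at the optimal wavelength variables
    classify_interval: callable returning "high" or "low" (hypothetical labels)
    predict_high/low:  callables returning a concentration for that interval
    """
    label = classify_interval(signal)
    if label == "high":
        return label, predict_high(signal)
    if label == "low":
        return label, predict_low(signal)
    raise ValueError(f"unknown concentration interval label: {label!r}")


# Toy stand-ins for the trained models (not real SVM models).
toy_classifier = lambda s: "high" if sum(s) > 1.0 else "low"
toy_high = lambda s: 2.0 * sum(s)
toy_low = lambda s: 0.5 * sum(s)

label, conc = predict_concentration([0.8, 0.6], toy_classifier, toy_high, toy_low)
```

Keeping the classifier and the two regressors as separate callables mirrors the description, where each is trained independently in steps S3 and S4.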
In this embodiment, the concentration prediction initial model of step S22 and the high-concentration-interval and low-concentration-interval concentration prediction models of step S4 are support vector regression models between the spectral signal x of the mixed solution and the concentration y of the trace metal ion to be measured, with the following optimization problem:
Figure BDA0001812094930000131
s.t. f(x_i) - y_i ≤ ξ_i
Figure BDA0001812094930000132
Figure BDA0001812094930000133
where i = 1, 2, …, N, and N is the total number of mixed solution samples input to the model. For example, in the concentration prediction initial model used to obtain the optimal wavelength variables in step S2, N is the total number of mixed solution samples acquired in step S1; in the high-concentration-interval concentration prediction model of S4, N is the total number of mixed solution samples from step S1 that belong to the high-concentration interval; in the low-concentration-interval concentration prediction model of S4, N is the total number of mixed solution samples from step S1 that belong to the low-concentration interval.
ω denotes the weight vector, C the penalty coefficient, ξ_i and ξ_i* the slack variables, ε the deviation, x_i the spectral signal at the selected wavelengths of the i-th group of mixed solution, and y_i the trace metal ion concentration of the i-th group of mixed solution. Introducing a Lagrange function converts the optimization problem into its dual problem, whose solution gives the objective function of the support vector regression model:
Figure BDA0001812094930000135
where x is the input data, namely the spectral signal at the selected optimal wavelengths of the mixed solution to be measured; f(x) is the output, namely the concentration of the trace metal ion in the mixed solution to be measured; x_i is the spectral signal at the selected wavelengths of the i-th group of mixed solution in the training set; α_i and α_i* are the Lagrange multipliers; and b is the decision threshold.
Figure BDA0001812094930000137
k(x_i, x) denotes the radial basis kernel function and σ denotes the kernel parameter.
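For reference, the textbook ε-insensitive support vector regression problem, its dual solution and a radial basis kernel, written with the symbols defined here, take the form below. This is an assumption about the general form of the formula figures, not a transcription of them; in the textbook form the deviation ε enters both inequality constraints, and the scaling of σ in the kernel varies between conventions.

```latex
\begin{aligned}
\min_{\omega,\,b,\,\xi,\,\xi^{*}}\quad
  & \tfrac{1}{2}\lVert\omega\rVert^{2}
    + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right) \\
\text{s.t.}\quad
  & f(x_i) - y_i \le \varepsilon + \xi_i, \\
  & y_i - f(x_i) \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0,\quad \xi_i^{*} \ge 0,\quad i = 1, 2, \dots, N
\end{aligned}
\qquad
\begin{aligned}
f(x) &= \sum_{i=1}^{N}\left(\alpha_i - \alpha_i^{*}\right) k(x_i, x) + b, \\
k(x_i, x) &= \exp\!\left(-\frac{\lVert x_i - x\rVert^{2}}{2\sigma^{2}}\right)
\end{aligned}
```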
The penalty coefficient C and the kernel parameter σ of the support vector machine model are optimized with a particle swarm optimization algorithm, using the cross-validation mean square error as the fitness function.
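A particle swarm search over (C, σ) can be sketched as follows. This is a generic, minimal particle swarm optimizer, not the configuration used in the embodiment: the inertia weight `w`, the acceleration constants `c1` and `c2`, the bounds, and the toy fitness that stands in for the cross-validation mean square error are all assumptions.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness(position) inside box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # update the position and clamp it to the search box
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_mse(params):
    """Hypothetical stand-in for CVmse as a function of (C, sigma)."""
    c, sigma = params
    return (c - 10.0) ** 2 + (sigma - 2.0) ** 2


bounds = [(0.1, 100.0), (0.01, 10.0)]  # assumed search ranges for C and sigma
best_params, best_val = pso_minimize(toy_cv_mse, bounds, seed=3)
```

In practice `toy_cv_mse` would be replaced by a function that trains the support vector model with the given (C, σ) and returns its cross-validation mean square error.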
Similarly, the objective function obtained by training the support vector machine model for the concentration interval classification model is as follows:
Figure BDA0001812094930000141
where x is the input data, namely the spectral signal at the selected optimal wavelengths of the mixed solution to be measured; f(x) is the output, namely the classification label of the concentration interval of the trace metal ion in the mixed solution to be measured (the high-concentration-interval label or the low-concentration-interval label); sgn[·] is the sign function; x_i is the spectral signal at the selected wavelengths of the i-th group of mixed solution in the training set; α_i and
Figure BDA0001812094930000142
denote the Lagrange multipliers, and b denotes the decision threshold.
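Evaluating trained models of this kind reduces to a kernel-weighted sum over the training samples. The sketch below assumes the common textbook forms: a Gaussian kernel with 2σ² in the denominator, a regression output Σ(α_i − α_i*)·k(x_i, x) + b, and a classification output sgn[Σ α_i·y_i·k(x_i, x) + b]; the function names and the ±1 label convention are hypothetical.

```python
import math


def rbf_kernel(xi, x, sigma):
    """Radial basis kernel; the 2*sigma**2 scaling is an assumed convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Regression output; dual_coefs[i] holds alpha_i - alpha_i*."""
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, dual_coefs)) + b


def svc_predict(x, support_vectors, alphas, labels, b, sigma):
    """Classification output in {+1, -1}; labels[i] is the class of sample i."""
    score = sum(a * y * rbf_kernel(sv, x, sigma)
                for sv, a, y in zip(support_vectors, alphas, labels)) + b
    return 1 if score >= 0 else -1
```

Mapping +1 and -1 to the high-concentration and low-concentration labels is a choice left to the caller.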
In conclusion, the method provided by the invention obtains a more accurate prediction result and has advantages over existing analysis methods, for the following reasons:
The analytical calibration methods commonly used in the prior art are mainly linear regression methods and nonlinear calibration methods. Linear regression methods include principal component regression (PCR) and partial least squares (PLS); nonlinear calibration methods include artificial neural networks (ANN), support vector regression (SVR), and the like. The spectral signals of the metal ions in the zinc liquid interfere with and overlap one another severely, and traditional calibration methods based on the full spectral band take in a large amount of noise and redundant information, so they struggle to meet requirements such as detection accuracy. To address this limitation, regression methods combined with wavelength variable selection have been developed in recent years, mainly interval partial least squares (IPLS), moving window partial least squares (MWPLS), Monte Carlo uninformative variable elimination (MC-UVE), competitive adaptive reweighted sampling (CARS), and the like. These methods achieve good accuracy when the system is linear, additive, and approximately obeys the Lambert-Beer law. IPLS and MWPLS select spectral intervals; MC-UVE and CARS select individual wavelength points, where MC-UVE can reject noisy wavelength points in a spectrum and CARS can reduce the influence of collinear variables and remove uninformative variables.
However, when the influence of high-concentration matrix ions and the interaction between components cause severe signal overlap and masking and a nonlinear relation between spectral variables and concentration, IPLS and MWPLS have difficulty choosing the interval width for lack of prior experience and may omit important characteristic wavelength variables outside the selected intervals, while MC-UVE and CARS score well at the peaks of the matrix ion spectrum and therefore tend to select wavelength points that carry matrix ion information, which degrades the model accuracy for trace metal ions.
Regarding wavelength variable selection, first, the proposed correlation coefficient-stability value formula fits the premise that the relation between trace metal ion concentration and spectral signal is nonlinear yet positively correlated, whereas the formula of the existing MC-UVE method is based on linear characteristics, so the correlation coefficient-stability values calculated by this method are more accurate. In addition, the optimal wavelength variables are selected using the constructed concentration prediction initial models, which takes the characteristics of the samples into account and does not rely on empirical values. By contrast, IPLS and MWPLS have difficulty choosing the interval width for lack of experience and may omit important characteristic wavelength variables outside the selected intervals, and MC-UVE usually chooses the number of variables by setting a stability threshold from experience, yet studies on trace metal ion concentration under a high-concentration background are too few to provide that experience. Sorting and selecting wavelength variables by correlation coefficient-stability value picks the important characteristic wavelength variables first and improves the stability and applicability of wavelength variable screening. It also avoids the problem that MC-UVE and CARS score well at the peaks of the matrix ion spectrum and tend to select wavelength points carrying matrix ion information, which degrades the model accuracy for trace metal ions.
The following illustrates a zinc liquid trace metal ion concentration partition modeling method based on variable sorting selection according to an embodiment of the present invention.
In the mixed solution to be measured, zinc Zn(II) is the high-concentration matrix ion, cobalt Co(II) is the interfering ion, and copper Cu(II) is the trace ion to be measured. According to the experimental requirements, single-ion solutions of Zn(II), Cu(II) and Co(II) and mixed solutions with high concentration ratios were prepared in a nitroso R salt and sodium acetate system. The Zn(II) concentration ranged from 70 to 100 g/L in steps of 10 g/L, and the Cu(II) and Co(II) concentrations ranged from 0.5 to 4.0 mg/L in steps of 0.5 mg/L, giving concentration ratios of 170,000 to 200,000. The solutions were diluted to volume with deionized water. With a reagent blank (containing no metal ions) as reference, a 1 nm interval and a 400-800 nm scanning range, the spectral signals of the single-ion solutions and of 62 groups of mixed solutions were measured. Using the KS algorithm, 48 groups of mixed solutions were assigned to the training sample set and 14 groups to the test sample set.
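The training/test split can be sketched as below, assuming that "KS algorithm" refers to the Kennard-Stone selection method commonly used in chemometrics; the source does not expand the abbreviation, and the function name `kennard_stone` is hypothetical. The sketch starts from the two most distant samples and repeatedly adds the sample whose nearest already-selected sample is farthest away.

```python
import math


def kennard_stone(samples, n_train):
    """Return (train_indices, test_indices) by Kennard-Stone selection."""
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")
    dist = [[math.dist(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]
    # Seed the training set with the two most distant samples.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: dist[p[0]][p[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_train:
        # Pick the candidate farthest from its closest selected sample.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


train_idx, test_idx = kennard_stone([[0.0], [1.0], [2.0], [3.0], [10.0]], 3)
```

For the data set described here, `samples` would hold the 62 spectra and `n_train` would be 48, leaving 14 samples for testing.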
FIG. 2 shows the original spectral signals of the Zn(II), Cu(II) and Co(II) single ions according to an embodiment of the present invention. As shown in FIG. 2, the high-concentration Zn(II) heavily masks the trace Cu(II) signal in the 400-500 nm band, and the Zn(II) signal is lower beyond 500 nm, while the signal of the interfering Co(II) remains high throughout the 400-600 nm band and largely masks the Cu(II) signal. Under the masking effect of the Zn(II) and Co(II) ions, the Cu(II) spectral signal is difficult to detect. In addition, the zinc liquid has a high concentration and complex composition, and its metal ions have similar chemical characteristics. During measurement by ultraviolet-visible spectrophotometry, charge distribution interactions change the molar absorption coefficient, and the solution deviates severely from the preconditions under which the Lambert-Beer law holds, such as a dilute solution and no mutual interference between ions. Together with limitations such as stray light from the dispersive element of the instrument and the energy of the light source, this means that the concentration of the trace Cu(II) to be measured and its spectral signal no longer have a good linear relation.
To address the problem that the Cu(II) signal in the original spectral signal is almost completely overlapped and masked by the Zn(II) and Co(II) signals, and to recover the Cu(II) spectral peak, the invention preprocesses the spectral signals with derivative spectroscopy combined with wavelet denoising. FIG. 3 shows the derivative, denoised spectral signals of the three metal ions according to an embodiment of the invention; the interference of Zn(II) and Co(II) with Cu(II) is far smaller than in the original spectral signals. However, because the three metal ions have similar chemical characteristics, the complexes they form in the liquid to be measured have similar absorption spectra, and the spectral signals still partially overlap.
Although the complex environment and the interaction between ions in the solution make the relation between Cu(II) concentration and spectral signal nonlinear, the two remain positively correlated. The larger the correlation coefficient between the spectral signal and the component to be measured, the more Cu(II) concentration information the wavelength variable contains, the higher the signal sensitivity, the smaller the interference from other ions, and the less blank information and redundant noise it contains. MC-UVE is an uninformative variable elimination method based on the stability values of model variables; by using Monte Carlo random sampling to exploit the sample set as fully as possible, it reduces the sample dependence and instability of the ordinary correlation coefficient method.
From the 48 groups of training samples, 36 samples (75%) are drawn at random to obtain the correlation coefficient of each wavelength variable; after 100 such samplings a correlation coefficient-wavelength matrix is obtained, from which the correlation coefficient-stability value of each wavelength variable is calculated. FIG. 4 shows the correlation coefficient-stability values of each wavelength variable for Cu(II) obtained with the method of the invention according to an embodiment. As shown in FIG. 4, the wavelength variables at 500-541 nm have relatively high stability values. In this band the Cu(II) spectral peak is pronounced, the Zn(II) spectral signal falls rapidly to near zero absorbance, and the Co(II) spectral signal settles smoothly near zero absorbance, which indicates that these wavelength variables contain more useful information about the ion to be measured and are less affected by the interfering ions, and that the spectral signals are smooth and contain little noise. The method of the invention is compared with the Monte Carlo uninformative variable elimination-partial least squares (MC-UVE PLS) method. FIG. 5 shows the stability value of each wavelength variable for Cu(II) in MC-UVE PLS. The stability values of the 500-541 nm wavelength variables are high, which is similar to the result of the present invention, but the stability values at 400-420 nm are also high, in places even higher than in the band where the Cu(II) spectral signal is pronounced. In the 400-420 nm band the matrix ion Zn(II) has a very large negative spectral signal, and the spectral line fluctuates strongly and contains a large amount of noise. MC-UVE PLS is therefore likely to select many wavelength variables in the 400-420 nm band, which harms the model and makes the method unsuitable for detecting trace metal ion concentrations under the influence of high-concentration matrix ions and interfering impurity ions.
After the wavelength variables are sorted in descending order of the correlation coefficient-stability values in FIG. 4, the number of selected wavelength variables is particularly important for the stability and generalization ability of the model. With too few variables, less redundant noise is selected, but part of the sample information is lost and the characteristics of the sample are not fully reflected; with too many variables, a large amount of noise, blank information and interfering bands is introduced, which lowers the contribution of the variable set as a whole to the model. MC-UVE usually chooses the number of variables by setting stability thresholds from experience and eliminates the uninformative variables whose stability values fall between the thresholds. However, trace metal ion concentration under a high-concentration background has rarely been studied and such experience is lacking, so the number of wavelength variables needs to be chosen in combination with the sample characteristics and a regression model, to make the selection better adapted and targeted to the samples.
To extract as much useful Cu(II) information as possible and to reduce the influence of interfering information and noise, wavelength variables with large stability values are selected first. Then 1, 2, 3, …, 401 wavelength variables are selected in order, and a particle swarm optimization-support vector regression (PSO-SVR) model is established for each selection and trained to give the concentration prediction initial model. Each PSO-SVR model uses the cross-validation mean square error (CVmse) as the fitness, updates the particle velocities and positions, calculates the fitness of the population particles and updates the global optimum; after the iteration termination condition is reached, the concentration prediction initial model is determined, and the CVmse values of the models for the 401 variable counts are output. FIG. 6 shows the cross-validation mean square error of the support vector regression models for different numbers of wavelength variables according to an embodiment of the present invention. In FIG. 6 the CVmse reaches its minimum when the number of variables is 50; it then stays essentially constant as the number of variables increases, but begins to rise sharply once the number of variables reaches 150. To simplify the model and reduce the computation time while improving model accuracy, the number of optimal wavelength variables is set to 50; the selected wavelength variables lie at 495-541 nm and 406-408 nm.
FIG. 7 shows the derivative spectral signals of the mixed solution as the Cu(II) concentration changes from 0 to 4.2 mg/L in an environment of high-concentration Zn(II) and trace Co(II) interference according to an embodiment of the present invention, and FIG. 8 shows the relation between Cu(II) concentration and spectral signal at 9 representative wavelength points selected from FIG. 7. As shown in FIG. 8, when the Cu(II) concentration is low, the Cu(II) spectral signal is weak and heavily masked by Zn(II) and Co(II), the ions strongly affect and interfere with one another, and the relation between Cu(II) concentration and spectral signal is strongly nonlinear. When the Cu(II) concentration is higher, the Zn(II) to Cu(II) concentration ratio falls, the interference and masking of the Cu(II) signal by Zn(II) weaken, the correlation between Cu(II) concentration and spectral signal is strong, and the linearity improves. Because the two concentration intervals affect the spectral signal to different degrees, the data are divided into two intervals by Cu(II) concentration, and the low-concentration and high-concentration intervals are fitted separately. Compared with a linear fit over the full interval, the fitting root mean square error (RMSE) falls by 37.9-59.2%, and by 50.1% on average over the 9 wavelength points; compared with a nonlinear fit over the full interval, the RMSE falls by 11.0-35.1%, and by 23.4% on average over the 9 wavelength points. The partitioned fit is thus clearly better: it highlights the solution characteristics and concentrates the characteristic information within each interval.
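The comparison between a full-interval fit and a partitioned fit can be sketched with ordinary least-squares lines. The data and the threshold below are synthetic and chosen only to illustrate the computation; they are not the measurements behind FIG. 8, and the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept of y against x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def sum_sq_residuals(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def full_vs_partition_rmse(conc, signal, threshold):
    """RMSE of one line over all points vs. separate lines below/above threshold."""
    n = len(conc)
    full = math.sqrt(sum_sq_residuals(conc, signal) / n)
    low = [(c, s) for c, s in zip(conc, signal) if c < threshold]
    high = [(c, s) for c, s in zip(conc, signal) if c >= threshold]
    sse = sum(sum_sq_residuals([c for c, _ in part], [s for _, s in part])
              for part in (low, high))
    return full, math.sqrt(sse / n)


# Synthetic signal with a change of slope at 2.0 mg/L (illustrative only).
conc = [0.5 * i for i in range(9)]
signal = [0.1 * c if c < 2.0 else 0.2 + 1.0 * (c - 2.0) for c in conc]
full_rmse, part_rmse = full_vs_partition_rmse(conc, signal, threshold=2.0)
```

Because each segment is fitted by least squares on its own points, the pooled error of the partitioned fit can never exceed that of the single line, which is the effect the RMSE reductions above quantify.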
The samples are divided into a Cu(II) high-concentration interval and a Cu(II) low-concentration interval, and the spectral signals at the 50 wavelength variables are input into a particle swarm optimization-support vector classification (PSO-SVC) model for training to obtain the concentration interval classification model. A PSO-SVR model is then established for each concentration interval and trained to obtain the high-concentration-interval and low-concentration-interval concentration prediction models. Table 1 gives the confusion matrix of the Cu(II) concentration interval predictions for the 14 test samples; the prediction accuracy is 100%. If a few interval prediction errors occur in an experiment, the influence on the overall model is small; the error of the affected sample increases to some extent, which indicates that the sample is strongly atypical, but the error is still lower than that of the model without interval prediction.
TABLE 1 confusion matrix of prediction results of Cu (II) concentration intervals
Figure BDA0001812094930000191
After the concentration interval classification model is established, a PSO-SVR model is built for each subinterval; finally, for the trace metal ion to be measured, the established models first predict the concentration interval and then calculate the concentration value.
For the 14 groups of test samples, the following models were established: full-band partial least squares (PLS), full-band particle swarm optimization-support vector regression (PSO-SVR), competitive adaptive reweighted sampling-partial least squares (CARS-PLS), Monte Carlo uninformative variable elimination-partial least squares (MC-UVE PLS), Monte Carlo uninformative variable elimination-least squares support vector machine (MC-UVE LS SVM), and the partition-modeling-based support vector machine model (VR-S C-SVR). To demonstrate the effect of partitioning, a support vector machine model without concentration partitioning (VR-S SVR) was also established. The number of wavelength variables, the maximum relative error, the coefficient of determination (R²) and the root mean square error of prediction (RMSEP) are used as evaluation indices of the 7 models. Table 2 compares the results of the 7 modeling methods.
TABLE 2 comparative results of 7 modeling methods for Cu (II) concentration prediction
Figure BDA0001812094930000201
As shown in Table 2, the full-band PLS and full-band PSO-SVR models use a very large number of variables, are highly complex, and have low accuracy because of the large amount of noise, redundancy and interfering information. CARS PLS, MC-UVE PLS and MC-UVE LS SVM reduce the number of wavelength variables and improve model accuracy, but because they select some wavelength variables in the 400-420 nm band, they introduce the interfering signal of the high-concentration matrix ion Zn(II) and a large amount of noise; their accuracy is not satisfactory, with a maximum relative error of about 10%, which does not meet experimental or industrial field requirements. VR-S SVR and VR-S C-SVR combine the advantages of the correlation coefficient and MC-UVE: they select wavelength variables where the Cu(II) spectral signal is pronounced, reduce the interference of the high-concentration Zn(II) and the impurity Co(II), and remove noise and useless information, giving fewer wavelength variables and better maximum relative error, RMSEP and R². Because VR-S C-SVR first predicts the concentration interval and then uses models built separately for the two concentration intervals, it reflects the characteristics of the Cu(II) spectral signal in the mixed solution better, with a lower maximum relative error and RMSEP and a better R², and clearly performs best.
FIG. 9 shows the errors between the actual and predicted trace metal ion concentrations of the test set samples according to an embodiment of the present invention. Over the 14 samples, the maximum relative error is 6.94% and the average relative error is 2.74%; 13 relative errors lie between 0% and 5%, and 1 lies between 5% and 10%. Compared with the other linear and nonlinear regression methods, under a high-concentration Zn(II) background and Co(II) impurity interference the method reduces the number of wavelengths by 25.37% to 87.53% and improves model accuracy by 21.10% to 76.08%.
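The evaluation indices used in this comparison can be computed as sketched below. The definitions are the commonly used ones and are an assumption, since the text does not spell them out; the function names are hypothetical.

```python
import math


def relative_errors(y_true, y_pred):
    """Absolute relative error of each prediction (true values must be non-zero)."""
    return [abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)]


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not data from the test set.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.1, 1.9, 4.0]
errs = relative_errors(y_true, y_pred)
max_rel, mean_rel = max(errs), sum(errs) / len(errs)
```

The maximum and mean of `errs` correspond to the maximum and average relative error quoted for the 14 test samples.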
In the zinc liquid trace metal ion concentration prediction method based on partition modeling, the variable sorting and selection method based on the correlation coefficient-stability value improves the stability and applicability of wavelength variable screening. It quickly and efficiently removes the bands in which the Zn(II) and Co(II) spectral signals mask, overlap and interfere, selects wavelength variables highly correlated with Cu(II), reduces the number of wavelength variables, and lowers the complexity and running time of the model. Based on the statistical observation that Cu(II) in the mixed solution shows different nonlinear characteristics in the high and low concentration intervals, the concentration range is divided into high and low intervals for partition modeling, which makes the model more targeted and improves its accuracy.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present invention; their description is specific and detailed, but is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (8)

1. A zinc liquid trace metal ion concentration prediction method based on partition modeling, characterized by comprising the following steps:
S1: acquiring a spectrum signal diagram of the mixed solution sample, and calculating a correlation coefficient-stability value of each wavelength variable based on the spectrum signal of the mixed solution sample;
the spectrum signal diagram is a diagram of the relation between spectrum signal and wavelength parameter, the wavelength variables are wavelength parameter points selected according to a preset rule, and the formula of the correlation coefficient-stability value is as follows:
Figure FDA0002927696270000011
wherein K represents the total number of Monte Carlo samplings, mean(·) represents the mean, std(·) represents the standard deviation, and R_kj represents the correlation coefficient of the wavelength variable j in the k-th Monte Carlo sampling;
the calculation formula of the correlation coefficient is as follows:
Figure FDA0002927696270000012
Figure FDA0002927696270000013
in the formula, R_j is the correlation coefficient of the j-th wavelength variable, I represents the number of samples randomly selected in each Monte Carlo uninformative variable elimination run, J represents the total number of wavelength variables, x_ij represents the absorbance corresponding to the wavelength variable j in the i-th sample after each Monte Carlo sampling, and y_i represents the concentration of the trace metal ion to be measured in the i-th sample after each Monte Carlo sampling;
S2: acquiring an optimal wavelength variable based on the correlation coefficient-stability value of the wavelength variable;
S3: carrying out model training to obtain a concentration interval classification model of the mixed solution;
dividing the mixed solution sample into a mixed solution sample in a low concentration interval and a mixed solution sample in a high concentration interval according to the concentration of trace metal ions, and inputting a spectral signal corresponding to an optimal wavelength variable in a spectral signal of each mixed solution sample and a concentration interval classification label to which the mixed solution sample belongs into a model for training to obtain a concentration interval classification model;
the input data of the concentration interval classification model are: the spectral signal corresponding to the optimal wavelength variable in the spectral signals of the mixed solution; the output data are: the concentration interval classification label to which the mixed solution belongs;
S4: carrying out model training separately to obtain a concentration prediction model for the high concentration interval and a concentration prediction model for the low concentration interval;
a: inputting a spectral signal corresponding to an optimal wavelength variable in a spectral signal of a mixed solution sample belonging to a high-concentration interval and a concentration of a trace metal ion to be detected in a corresponding mixed solution sample into a model for training to obtain a concentration prediction model of the high-concentration interval;
b: inputting a spectral signal corresponding to an optimal wavelength variable in a spectral signal of a mixed solution sample belonging to a low concentration interval and a concentration of a trace metal ion to be detected in a corresponding mixed solution sample into a model for training to obtain a concentration prediction model of the low concentration interval;
the input data of both the concentration prediction model of the high concentration interval and the concentration prediction model of the low concentration interval are: the spectral signal corresponding to the optimal wavelength variable in the spectral signals of the mixed solution; the output data are: the concentration of the trace metal ions to be detected in the mixed solution;
S5: obtaining the concentration interval to which the mixed solution to be detected belongs based on the concentration interval classification model of step S3, and obtaining the predicted concentration of the trace metal ions to be detected based on the concentration prediction model of step S4 that corresponds to that interval, namely the high concentration interval model or the low concentration interval model.
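The two-stage flow of steps S3 to S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `train_partition_model`, `nearest_centroid`, `nearest_neighbour`, and the `threshold` that separates the low and high intervals are all hypothetical, and the nearest-centroid classifier and nearest-neighbour regressor are simple stand-ins for the trained classification and prediction models.

```python
import math

def nearest_centroid(X, labels):
    """Stand-in classifier: return the label of the closest class centroid."""
    centroids = {}
    for lab in set(labels):
        rows = [x for x, l in zip(X, labels) if l == lab]
        centroids[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

def nearest_neighbour(X, y):
    """Stand-in regressor: return the concentration of the closest training spectrum."""
    return lambda x: min(zip(X, y), key=lambda pair: math.dist(x, pair[0]))[1]

def train_partition_model(X, y, threshold,
                          train_classifier=nearest_centroid,
                          train_regressor=nearest_neighbour):
    """S3/S4: train one interval classifier and one regressor per interval.

    X: absorbances at the selected wavelengths, one row per sample.
    y: known concentrations. threshold: assumed low/high boundary.
    """
    labels = ["high" if c >= threshold else "low" for c in y]
    classifier = train_classifier(X, labels)
    regressors = {}
    for lab in ("low", "high"):
        idx = [i for i, l in enumerate(labels) if l == lab]
        regressors[lab] = train_regressor([X[i] for i in idx],
                                          [y[i] for i in idx])

    def predict(x):
        """S5: classify the interval, then apply that interval's regressor."""
        interval = classifier(x)
        return interval, regressors[interval](x)

    return predict

# Toy data (invented for illustration): two low and two high samples.
X = [[0.10, 0.05], [0.12, 0.06], [0.50, 0.30], [0.55, 0.33]]
y = [0.2, 0.3, 2.0, 2.4]
predict = train_partition_model(X, y, threshold=1.0)
```

Passing other callables as `train_classifier` and `train_regressor` swaps in different model types without changing the routing logic of S5.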
2. The method of claim 1, wherein: the process of obtaining the optimum wavelength variable in step S2 is as follows:
S21: sorting the wavelength variables in descending order of correlation coefficient-stability value;
S22: selecting different numbers of wavelength variables in that order, and constructing a concentration prediction initial model for each selection;
the training input data of a corresponding concentration prediction initial model when the wavelength variable is selected each time is as follows: the spectrum signal corresponding to the currently selected wavelength variable in the spectrum signal of each sample and the concentration of the trace metal ions to be detected in each sample;
the input data of each resulting concentration prediction initial model are: the spectral signal of the currently selected wavelength variable in the spectral signals of the mixed solution; the output data are: the concentration of the trace metal ions to be detected in the mixed solution;
S23: obtaining the optimal wavelength variable based on the model effect of each concentration prediction initial model;
the optimal wavelength variable is the wavelength variable selection that gives the best model effect, where the best model effect means: the model error is smallest and the number of selected wavelength variables is smallest.
3. The method of claim 2, wherein: the best model effect in step S2 is: the cross-validation mean square error is smallest and the number of wavelength variables is smallest.
4. The method of claim 2, wherein: in step S22, when the wavelength variables are sequentially selected, different numbers of wavelength variables are sequentially selected from the first wavelength variable in the sequence.
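The ranking and subset search of steps S21 to S23 can be sketched as below. The function name `select_wavelengths` and the caller-supplied `model_error` callable are hypothetical; `model_error` is assumed to train a concentration prediction initial model on the given wavelengths and return its error (for example the cross-validation mean square error of claim 3).

```python
def select_wavelengths(scores, model_error):
    """Return the top-ranked wavelength subset with the lowest model error.

    scores: dict mapping wavelength -> correlation coefficient-stability value.
    model_error: hypothetical callable that takes a list of wavelengths and
        returns the error of a model trained on them.
    """
    # S21: sort wavelengths by score, largest first.
    ranked = sorted(scores, key=scores.get, reverse=True)
    best_subset, best_error = None, float("inf")
    # S22: grow the subset one wavelength at a time, always starting
    # from the first wavelength in the ranking (claim 4).
    for n in range(1, len(ranked) + 1):
        subset = ranked[:n]
        error = model_error(subset)
        # S23: strict '<' keeps the smaller subset when errors tie.
        if error < best_error:
            best_subset, best_error = subset, error
    return best_subset, best_error

# Toy example: the error depends only on subset size (values invented).
toy_scores = {500: 3.0, 510: 1.0, 520: 2.0}
toy_errors = {1: 0.5, 2: 0.2, 3: 0.2}
subset, error = select_wavelengths(toy_scores, lambda s: toy_errors[len(s)])
```

In the toy run, two and three wavelengths give the same error, so the search keeps the two-wavelength subset, matching the requirement that both the error and the number of variables be smallest.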
5. The method of claim 2, wherein: the concentration interval classification model, the concentration prediction model of the high concentration interval, the concentration prediction model of the low concentration interval and the concentration prediction initial model are all constructed on the basis of a support vector machine;
wherein the model training process is as follows: based on a particle swarm optimization algorithm, the penalty coefficient C and the kernel parameter σ of the support vector machine model are optimized using the cross-validation mean square error as the fitness function.
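A minimal particle swarm optimization sketch for tuning two parameters such as (C, σ) is given below. The function name `pso_minimize`, the swarm size, the iteration count, the inertia weight `w`, the acceleration constants `c1` and `c2`, and the search bounds are all assumptions chosen for illustration. The `fitness` argument would be the cross-validation mean square error of the support vector machine; here a toy quadratic stands in for it so that the example runs on its own.

```python
import random

def pso_minimize(fitness, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness(position) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # best position seen by the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Toy fitness standing in for cross-validation MSE; minimum at C=10, sigma=0.5.
def toy_fitness(params):
    C, sigma = params
    return (C - 10.0) ** 2 + (sigma - 0.5) ** 2

search_bounds = [(0.1, 100.0), (0.01, 10.0)]
best_params, best_fit = pso_minimize(toy_fitness, search_bounds)
```

With a real model, each call to `fitness` retrains and cross-validates the support vector machine, so the swarm size and iteration count directly set the training cost.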
6. The method of claim 1, wherein: the process of calculating the correlation coefficient-stability value of each wavelength variable based on the spectrum signal of the sample in step S1 is as follows:
S11: randomly selecting mixed solution samples from the collected mixed solution samples according to a preset proportion, based on the Monte Carlo uninformative variable elimination method;
S12: calculating, for each wavelength variable, the correlation coefficient between the corresponding spectral signal and the concentration using a preset correlation coefficient formula, based on the mixed solution samples selected in step S11;
S13: repeating steps S11 and S12 K times;
S14: calculating the correlation coefficient-stability value of each wavelength variable using a preset correlation coefficient-stability value formula, based on the correlation coefficients of each wavelength variable obtained from the K executions of steps S11 and S12, wherein K is a positive integer.
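Steps S11 to S14 can be sketched as follows. The formulas in claim 1 are given as figures that are not reproduced in this text, so two assumptions are made here and should be checked against the original figures: the correlation coefficient is taken to be the Pearson coefficient, and the correlation coefficient-stability value is taken to be |mean(R_kj)| / std(R_kj) over the K samplings, by analogy with the stability measure commonly used in Monte Carlo uninformative variable elimination. The function names, the default `ratio`, and the toy data are hypothetical.

```python
import random
from statistics import mean, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient (assumed form of the claim's R_j)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def correlation_stability(X, y, K=200, ratio=0.8, seed=0):
    """Return one assumed stability score |mean(R)| / std(R) per wavelength.

    X: absorbances, one row per sample and one column per wavelength variable.
    y: known concentrations. ratio: assumed fraction of samples drawn per run.
    """
    rng = random.Random(seed)
    n, J = len(X), len(X[0])
    I = max(2, round(ratio * n))             # samples drawn per Monte Carlo run
    R = [[] for _ in range(J)]
    for _ in range(K):                       # S13: repeat K times
        idx = rng.sample(range(n), I)        # S11: random subset, no replacement
        ys = [y[i] for i in idx]
        for j in range(J):                   # S12: correlation per wavelength
            R[j].append(pearson([X[i][j] for i in idx], ys))
    scores = []
    for r in R:                              # S14: combine the K coefficients
        s = stdev(r)
        scores.append(float("inf") if s == 0 else abs(mean(r)) / s)
    return scores

# Toy data (invented): column 0 tracks the concentration, column 1 does not.
conc = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
col0 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.1, 8.8, 10.2]
col1 = [5, 1, 4, 2, 6, 3, 5, 1, 4, 2]
spectra = [[a, b] for a, b in zip(col0, col1)]
scores = correlation_stability(spectra, conc, K=50)
```

A wavelength whose correlation with concentration is both strong and consistent across random subsets gets a high score, which is why the informative column scores far above the noisy one in the toy data.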
7. The method of claim 1, wherein: when the wavelength parameter is wavelength, the preset selection rule of the wavelength variables in step S1 is: scanning at a 1 nm scanning interval.
8. The method of claim 1, wherein: the wavelength parameter is wavelength, and the spectral signal is absorbance.
CN201811124589.0A 2018-09-26 2018-09-26 Zinc liquid trace metal ion concentration prediction method based on partition modeling Active CN109187392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811124589.0A CN109187392B (en) 2018-09-26 2018-09-26 Zinc liquid trace metal ion concentration prediction method based on partition modeling

Publications (2)

Publication Number Publication Date
CN109187392A CN109187392A (en) 2019-01-11
CN109187392B true CN109187392B (en) 2021-11-19

Family

ID=64907261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811124589.0A Active CN109187392B (en) 2018-09-26 2018-09-26 Zinc liquid trace metal ion concentration prediction method based on partition modeling

Country Status (1)

Country Link
CN (1) CN109187392B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109507181B (en) * 2019-01-17 2020-06-30 中南大学 Method for rapidly and quantitatively detecting concentration of trace cobalt, nickel and iron ions in zinc smelting solution
CN109799203B (en) * 2019-01-26 2021-05-07 上海交通大学 Wide-range high-precision spectrum detection method for COD concentration in water body
CN112750507B (en) * 2021-01-15 2023-12-22 中南大学 Method for simultaneously detecting nitrate and nitrite contents in water based on hybrid machine learning model
CN113450883B (en) * 2021-06-25 2023-03-21 中南大学 Solution ion concentration detection method based on multispectral fusion
CN113567375B (en) * 2021-07-29 2022-05-10 中南大学 Self-adaptive multi-metal ion concentration regression prediction method and system based on linear feature separation
CN113640225B (en) * 2021-08-23 2024-04-19 广西埃索凯新材料科技有限公司 Sulfuric acid concentration monitoring system applied to manganese sulfate production
US20230195079A1 (en) * 2021-12-21 2023-06-22 International Business Machines Corporation Characterizing liquids based on features extracted from time-dependent, differential signal measurements

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107924390A (en) * 2015-07-02 2018-04-17 通用电气健康护理生物科学股份公司 The method and system of the concentration range of sample is determined by means of calibration curve

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1235035C (en) * 2003-11-20 2006-01-04 上海交通大学 Spectrum quantitative automatic analysis method
CA2681681A1 (en) * 2009-10-06 2010-06-08 Colin Irvin Wong Mapping concentrations of airborne matter
EP2657681A1 (en) * 2012-04-26 2013-10-30 Roche Diagnostics GmbH Improvement of the sensitivity and the dynamic range of photometric assays by generating multiple calibration curves
CN103592255A (en) * 2013-11-22 2014-02-19 山东东阿阿胶股份有限公司 Soft method for measuring total protein content of donkey-hide gelatin skin solution on basis of near infrared spectrum technology
CN103868858B (en) * 2014-03-03 2016-02-10 上海交通大学 A kind ofly determine the method that saliferous clay dominates salinity spectral response best band
CN104237159A (en) * 2014-09-15 2014-12-24 甘肃银光化学工业集团有限公司 Method for analyzing content of dibutyl phthalate in mixed material through near infrared spectrum
CN105823751B (en) * 2016-03-22 2018-10-02 东北大学 Infrared spectrum Multivariate Correction regression modeling method based on λ-SPXY algorithms
CN106153561A (en) * 2016-06-21 2016-11-23 中南大学 The many metal ion inspections of uv-vis spectra based on wavelength screening
CN106644953A (en) * 2016-09-14 2017-05-10 天津工业大学 Method for improving simultaneous detection sensitivity and accuracy of multiple heavy metal ions
CN106918567B (en) * 2017-03-27 2019-05-28 中南大学 A kind of method and apparatus measuring trace metal ion concentration
CN108181262A (en) * 2017-12-18 2018-06-19 浙江工业大学 A kind of method using Near Infrared Spectroscopy for Rapid Sargassum horneri content of cellulose

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
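A spectrophotometric method for measuring trace cobalt ion concentration under a high-zinc background; Zhu Hongqiu et al.; Spectroscopy and Spectral Analysis; 2017-12-31; Vol. 37, No. 12; pp. 3882-3888 *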

Also Published As

Publication number Publication date
CN109187392A (en) 2019-01-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant