CN106483077B - Method for measuring element content of combustion body based on principal component and multiple linear regression - Google Patents

Method for measuring element content of combustion body based on principal component and multiple linear regression Download PDF

Info

Publication number
CN106483077B
CN106483077B CN201510552931.7A CN201510552931A CN106483077B CN 106483077 B CN106483077 B CN 106483077B CN 201510552931 A CN201510552931 A CN 201510552931A CN 106483077 B CN106483077 B CN 106483077B
Authority
CN
China
Prior art keywords
principal component
light intensity
matrix
intensity value
formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510552931.7A
Other languages
Chinese (zh)
Other versions
CN106483077A (en
Inventor
陈浩
呙星
张仁李
盛卫星
马晓峰
韩玉兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN201510552931.7A priority Critical patent/CN106483077B/en
Publication of CN106483077A publication Critical patent/CN106483077A/en
Application granted granted Critical
Publication of CN106483077B publication Critical patent/CN106483077B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)

Abstract

本发明提出一种基于主成分和多元线性回归的燃烧体元素含量测定方法,首先,对燃烧体发射光谱的光强度值进行数据筛选,对筛选后的光强度值进行去噪处理;然后,根据去噪处理后的光强度值获得光强度值的主成分得分;最后,根据光强度值的主成分得分计算获得燃烧体中待测元素的含量。本发明通过分析燃烧体的发射光谱的光强度值进行元素含量的测定,测定方法简单高效,测量精度高。The invention proposes a method for determining element content of a combustion body based on principal components and multiple linear regression. First, data screening is performed on the light intensity value of the emission spectrum of the combustion body, and the filtered light intensity value is denoised; then, according to The main component score of the light intensity value is obtained from the denoised light intensity value; finally, the content of the element to be tested in the combustion body is calculated according to the main component score of the light intensity value. In the invention, the element content is determined by analyzing the light intensity value of the emission spectrum of the combustion body, the determination method is simple and efficient, and the measurement accuracy is high.

Description

基于主成分和多元线性回归的燃烧体元素含量测定方法Determination method of element content in combustion body based on principal components and multiple linear regression

技术领域technical field

本发明属于光学技术领域,具体涉及一种基于主成分分析和多元线性回归法的燃烧体元素含量测定方法。The invention belongs to the technical field of optics, and in particular relates to a method for measuring element content of a combustion body based on principal component analysis and multiple linear regression method.

背景技术Background technique

对于燃烧体,要想测定其中含有哪些元素及各元素含量,需要通过复杂的化学测量分析方法,较为普遍的做法是采用对物质燃烧后的残留物进行元素分析,进而可以分析出原燃烧体中的元素及其含量,这种方法测量难度较大,同时对于含有无机元素的燃烧体,很难对燃烧后的残留物进行元素分析,此外,运用化学领域的方法测量元素及其含量的代价较高,测量成本较高。For the combustion body, in order to determine which elements and the content of each element contained in it, complex chemical measurement and analysis methods are required. Elements and their contents are difficult to measure by this method. At the same time, for combustion bodies containing inorganic elements, it is difficult to perform elemental analysis on the residues after combustion. In addition, the cost of measuring elements and their contents by methods in the chemical field is relatively high. high, and the measurement cost is high.

发明内容SUMMARY OF THE INVENTION

本发明的目的在于提供一种基于主成分分析和多元线性回归的燃烧体元素含量测定方法,通过分析燃烧体的发射光谱的光强度值进行元素含量的测定,测定方法简单高效,测量精度高。The purpose of the present invention is to provide a method for measuring the element content of a combustion body based on principal component analysis and multiple linear regression. The element content is measured by analyzing the light intensity value of the emission spectrum of the combustion body.

为了解决上述技术问题,本发明提供一种基于主成分和多元线性回归的燃烧体元素含量测定方法,首先,对燃烧体发射光谱的光强度值进行数据筛选,对筛选后的光强度值进行去噪处理;然后,根据去噪处理后的光强度值获得光强度值的主成分得分;最后,根据光强度值的主成分得分计算获得燃烧体中待测元素的含量,计算方式如公式所示,In order to solve the above-mentioned technical problems, the present invention provides a method for measuring element content of a combustion body based on principal components and multiple linear regression. First, data screening is performed on the light intensity value of the emission spectrum of the combustion body, and the filtered light intensity value is removed. Noise processing; then, the principal component score of the light intensity value is obtained according to the denoised light intensity value; finally, the content of the element to be tested in the combustion body is calculated according to the principal component score of the light intensity value, and the calculation method is shown in the formula ,

Y=b0+b1S1+...+bpSp Y=b 0 +b 1 S 1 +...+b p S p

其中,Y表示燃烧体中待测元素Y的含量,S1,S2,….Sp表示主成分得分,b0为常数项,b1…bp为回归系数,b0以及b1…bp均为已知量。Among them, Y represents the content of the element Y to be measured in the combustion body, S 1 , S 2 , … .Sp represents the principal component score, b 0 is a constant term, b 1 … b p is the regression coefficient, b 0 and b 1bp are all known quantities.

本发明与现有技术相比,其显著优点在于,(1)不需要采用化学领域的分析测量方法,很容易在实际中实现;(2)采用主成分分析法,测量精度比较高;(3)采用数学建模的方法,方法较为创新;(4)理论上可以用于测量自然界中的大部分元素,方法用途较广。Compared with the prior art, the present invention has significant advantages in that (1) it does not need to adopt the analysis and measurement method in the field of chemistry, and it is easy to realize in practice; (2) the principal component analysis method is adopted, and the measurement accuracy is relatively high; (3) ) using the mathematical modeling method, the method is more innovative; (4) theoretically can be used to measure most of the elements in nature, the method is widely used.

具体实施方式Detailed ways

容易理解,依据本发明的技术方案,在不变更本发明的实质精神的情况下,本领域的一般技术人员可以想象出本发明基于主成分分析和多元线性回归的燃烧体元素含量测定方法的多种实施方式。因此,以下具体实施方式仅是对本发明的技术方案的示例性说明,而不应当视为本发明的全部或者视为对本发明技术方案的限制或限定。It is easy to understand that, according to the technical solution of the present invention, without changing the essential spirit of the present invention, those skilled in the art can imagine the multiplication of the method for determining the element content of the combustion body based on principal component analysis and multiple linear regression of the present invention. an implementation. Therefore, the following specific embodiments are only exemplary descriptions of the technical solutions of the present invention, and should not be regarded as the whole of the present invention or as limitations or restrictions on the technical solutions of the present invention.

本发明对主成分得分变量S1,S2,….Sp与因变量Y(即燃烧体中待测元素Y的含量)进行多元线性回归分析,建立它们之间的多元线性回归模型方程,如式(1)所示:The present invention performs multiple linear regression analysis on the principal component score variables S 1 , S 2 , . As shown in formula (1):

Y=b0+b1S1+...+bpSp (1)Y=b 0 +b 1 S 1 +...+b p S p (1)

式(1)中,b0为常数项;b1…bp为回归系数,b1为S1,S2…固定时,S1每增加一个单位对Y的效应,即S1对Y的偏回归系数;同理b2为S1,S2…固定时,S2每增加一个单位对Y的效应,即S2对Y的偏回归系数等等。回归系数可以通过实验获得,在每次进行燃烧体中某待测元素含量时,只需根据光强度值获得主成分得分S1、S2…Sp,将主成分得分代入式(1)即可实现对燃烧体中某元素Y含量的测定。In formula (1), b 0 is a constant term; b 1 ... b p is the regression coefficient, and b 1 is S 1 , S 2 ... When fixed, the effect of each additional unit of S 1 on Y is the effect of S 1 on Y. Partial regression coefficient; Similarly, when b 2 is S 1 , S 2 . . . is fixed, the effect of each additional unit of S 2 on Y is the partial regression coefficient of S 2 on Y and so on. The regression coefficient can be obtained through experiments. When the content of a certain element to be measured in the combustion body is carried out each time, it is only necessary to obtain the principal component scores S 1 , S 2 . . . It can realize the determination of the content of a certain element Y in the combustion body.

一、回归系数是事先通过多次试验获得,每次使用时,只需通过查表即可使用,回归系数获得过程如下:1. The regression coefficient is obtained through multiple experiments in advance. Each time it is used, it can be used only by looking up the table. The process of obtaining the regression coefficient is as follows:

在多次试验获中,测得的包含元素Y的燃烧体发射光谱的光强度值,并在实验前通过其他方式非常艰难地测得燃烧体中待测元素Y的含量值(即先验信息),设每次试验中,燃烧体发射光谱中L个光波长对应的光强度值为X1,X2,…XL,且X1~XL的光波长的波长差是一个常数。In many experiments, the light intensity value of the emission spectrum of the combustion body containing element Y was measured, and the content value of the element Y to be measured in the combustion body was measured very difficultly by other means before the experiment (that is, the prior information). ), suppose that in each test, the light intensity values corresponding to L light wavelengths in the emission spectrum of the combustion body are X 1 , X 2 , . . . XL , and the wavelength difference of the light wavelengths from X 1 to XL is a constant.

第一步,根据相关光谱理论知识,对发射光谱的光强度值进行筛选操作,获得筛选后的发射光谱的光强度值。The first step is to perform a screening operation on the light intensity value of the emission spectrum according to the relevant spectral theoretical knowledge, and obtain the light intensity value of the filtered emission spectrum.

根据对相关光谱理论知识的研究,可知光谱数据的筛选正确与否对数据的后续处理分析有一定的影响作用。假设需要测量的为元素Y,根据现有知识可知,元素Y的燃烧发射光谱波长应当为lamda1、lamda2…、lamdan共n个波长值。由于光谱线测量时发射光谱的波长值可能会发生一定的平移,因此需要先将光强度值X1,X2,…XL所对应的L个光波长值分别与已知标准的n个波长值lamda1,lamda2…lamdan进行比较,如果两者相差值在预先设定的误差Δl范围内的,则认为此波长所对应的谱线是元素Y的发射光谱谱线,予以保留,否则,则认为此波长所对应的谱线不是元素Y的发射光谱 谱线,予以剔除;设经前述筛选后保留了m个光波长对应的光强度值X1,X2,…Xm,m<L。According to the research on relevant spectral theoretical knowledge, it can be known that the correctness of spectral data screening has a certain influence on the subsequent processing and analysis of the data. Assuming that the element Y needs to be measured, according to the existing knowledge, the wavelength of the combustion emission spectrum of element Y should be lamda1, lamda2..., lamdan, a total of n wavelength values. Since the wavelength value of the emission spectrum may shift to a certain extent during the spectral line measurement, it is necessary to firstly compare the L light wavelength values corresponding to the light intensity values X 1 , X 2 , ... XL with the known standard n wavelengths respectively. Compare the values of lamda1, lamda2… The spectral line corresponding to this wavelength is not the emission spectral line of the element Y, so it is rejected; it is assumed that after the aforementioned screening , the light intensity values X 1 , X 2 , . . .

第二步,根据相关去噪理论,对筛选后的发射光谱的光强度值X1,X2,…Xm进行去噪预处理。发射光谱数据在采集时往往会受到诸多因素的影响,使得光谱数据中除了包含有用的信息外,还混杂有背景和仪器噪声等无关信息。在分析处理光谱数据时,这些噪声会对分析结果产生不良的影响,降低光谱分析精度,因此需要先进行去噪预处理。作为一种优选方案,本发明去噪预处理采用信号平滑法,去除高频噪声,提高光谱信号的信噪比,信号平滑中具体采用移动平均平滑(Moving Aerage Smoothing,MAS)法。设经去噪预处理后的m个光波长对应的光强度值为

Figure BDA0000794251510000031
In the second step, according to the relevant denoising theory, denoising preprocessing is performed on the light intensity values X 1 , X 2 , . . . X m of the filtered emission spectrum. The emission spectrum data is often affected by many factors during the collection, so that the spectrum data contains not only useful information, but also irrelevant information such as background and instrument noise. When analyzing and processing spectral data, these noises will adversely affect the analysis results and reduce the accuracy of spectral analysis, so it is necessary to perform denoising preprocessing first. As a preferred solution, the denoising preprocessing of the present invention adopts a signal smoothing method to remove high-frequency noise and improve the signal-to-noise ratio of the spectral signal. The moving average smoothing (MAS) method is specifically used for signal smoothing. Set the light intensity corresponding to m light wavelengths after denoising preprocessing as
Figure BDA0000794251510000031

第三步,根据主成分分析方法,对去噪预处理后的光强度值为

Figure BDA0000794251510000032
进行主成分分析,获得最终的主成分得分Si(i=1,2,...p)。In the third step, according to the principal component analysis method, the light intensity value after denoising preprocessing is
Figure BDA0000794251510000032
Principal component analysis is performed to obtain the final principal component score S i (i=1,2,...p).

发射光谱的光强度值数据量大,其中包含了丰富的物质成分和结构信息,通过主成分分析法分析光谱数据,用少数几个主成分变量代替原始变量,主成分变量是原始变量的线性组合,能够最大程度地代表原始变量的信息。主成分分析法通过正交变换使主成分之间彼此互不相关,以此克服数据共线性问题。主成分分析具体分析过程如下:The light intensity value of the emission spectrum has a large amount of data, which contains rich material composition and structural information. The spectral data is analyzed by the principal component analysis method, and a few principal component variables are used to replace the original variables. The principal component variables are the linear combination of the original variables. , which can best represent the information of the original variable. The principal component analysis method makes the principal components uncorrelated with each other through orthogonal transformation, so as to overcome the problem of data collinearity. The specific analysis process of principal component analysis is as follows:

1.1设对于

Figure BDA0000794251510000033
共m个光强度值,每一个波长对应的光强度值有n个数据(即n次试验获得的结果),将n次试验获得光强度值组成自变量矩阵X如式(2)所示,其中xnm代表第n次试验中第m个波长对应的光强度值。1.1 Assuming that for
Figure BDA0000794251510000033
There are m light intensity values in total, and the light intensity value corresponding to each wavelength has n data (that is, the results obtained from n tests), and the light intensity values obtained from n tests are composed of an independent variable matrix X, as shown in formula (2), where x nm represents the light intensity value corresponding to the mth wavelength in the nth experiment.

Figure BDA0000794251510000034
Figure BDA0000794251510000034

n表示矩阵X的行数,m表示矩阵X的列数n represents the number of rows of matrix X, m represents the number of columns of matrix X

1.2计算自变量矩阵X的相关系数矩阵R,关系数矩阵R如式(3)所示:1.2 Calculate the correlation coefficient matrix R of the independent variable matrix X, and the correlation coefficient matrix R is shown in formula (3):

Figure BDA0000794251510000035
Figure BDA0000794251510000035

式(3)中,rij(i,j=1,2,...,m)为光强度值

Figure BDA0000794251510000036
Figure BDA0000794251510000037
的相关系数,其计算公式如式(4)所示:In formula (3), r ij (i,j=1,2,...,m) is the light intensity value
Figure BDA0000794251510000036
and
Figure BDA0000794251510000037
The correlation coefficient of , and its calculation formula is shown in formula (4):

Figure BDA0000794251510000041
Figure BDA0000794251510000041

其中,

Figure BDA0000794251510000042
表示变量xi的平均值,xi表示矩阵X中第i列数据;k表示矩阵X的行数,k=1,2,...,n。in,
Figure BDA0000794251510000042
Represents the average value of variable x i , x i represents the data in the i-th column of matrix X; k represents the number of rows in matrix X, k=1,2,...,n.

1.3根据相关系数矩阵R求解出特征方程|λI-R|=0所需使用的特征值λ12...λm,其中I为一个单位矩阵,求解特征值λ12...λm常用的方法是雅可比(Jacobi)法,并将求解出的特征值λ12...λm按照从大到小的顺序排列λ1≥λ2≥...≥λm≥01.3 According to the correlation coefficient matrix R, the eigenvalues λ 1 , λ 2 ... λ m required to be used for the characteristic equation |λI-R|=0 are solved, where I is a unit matrix, and the eigenvalues λ 1 , λ 2 are solved. ..λ m The commonly used method is the Jacobi method, and the solved eigenvalues λ 1 , λ 2 ... λ m are arranged in descending order λ 1 ≥λ 2 ≥...≥ λ m ≥ 0

1.4将特征值λ12...λm代入特征方程|λI-R|=0分别求出特征值λi对应的特征向量ei,特征向量ei为m行1列的矩阵,其中i取1~m范围内的整数。1.4 Substitute the eigenvalues λ 1 , λ 2 ... λ m into the characteristic equation |λI-R|=0 to obtain the eigenvectors e i corresponding to the eigenvalues λ i respectively, and the eigenvectors e i are a matrix with m rows and 1 column, where i is an integer in the range of 1 to m.

1.5根据特征值λ12...λm计算主成分的贡献率和累计贡献率,贡献率计算公式(5)所示,累计贡献率如公式(6)所示:1.5 Calculate the contribution rate and cumulative contribution rate of the principal component according to the eigenvalues λ 1 , λ 2 ... λ m , the contribution rate calculation formula (5) is shown, and the cumulative contribution rate is shown in the formula (6):

Figure BDA0000794251510000043
Figure BDA0000794251510000043

Figure BDA0000794251510000044
Figure BDA0000794251510000044

累计贡献率表示所生成的新变量(即主成分)所包含原始数据信息的程度,为了更大程度地代表原来的变量,一般选取主成分累计贡献率≥95%的特征值λ12,...λp对应的主成分,p≤m。The cumulative contribution rate indicates the degree of original data information contained in the new variable (ie principal component) generated. In order to represent the original variable to a greater extent, the eigenvalues λ 1 , λ 2 with the cumulative contribution rate of the principal component ≥95% are generally selected. ,...λ p corresponds to the principal component, p≤m.

1.6根据特征值λ12...λm和特征向量ei计算各主成分载荷Ii,计算公式如式(7)所示:1.6 Calculate each principal component load I i according to the eigenvalues λ 1 , λ 2 ... λ m and the eigenvectors e i , the calculation formula is shown in formula (7):

Figure BDA0000794251510000045
Figure BDA0000794251510000045

主成分载荷Ii为m行1列的矩阵;The principal component load I i is a matrix with m rows and 1 column;

然后根据求出的各主成分的载荷Ii,计算得到各主成分得分Si,如公式(8)所示,Then, according to the obtained load I i of each principal component, the score S i of each principal component is calculated, as shown in formula (8),

Figure BDA0000794251510000046
Figure BDA0000794251510000046

其中,Si为n行1列的矩阵,表示的是主成分变量Si在n次试验下的得分数值。Among them, Si is a matrix with n rows and 1 column, which represents the score value of the principal component variable Si under n trials.

由上可知,主成分分析是一种有损降维的方法,当主成分选取过少时,信息损失量将会比较大,当主成分数量选取过多时,计算机的运算量将会大大增加,因此在实际计算过程中应该根据需要合理选取主成分变量个数。It can be seen from the above that principal component analysis is a lossy dimensionality reduction method. When too few principal components are selected, the amount of information loss will be relatively large. In the calculation process, the number of principal component variables should be reasonably selected according to the needs.

第四步,根据上面主成分分析求出的主成分得分Si(i=1,2,...p)和已知的燃烧体中元素Y的含量值Yi(i=1,2,...n),采用最小二乘法进行拟合,可以求出回归系数b1…bpThe fourth step is based on the principal component score Si ( i =1,2,...p) obtained by the above principal component analysis and the known content value of element Y in the combustion body Yi ( i =1,2, ...n), the least squares method is used for fitting, and the regression coefficients b 1 ...b p can be obtained.

二、主成分得分S1、S2…Sp的确定2. Determination of principal component scores S 1 , S 2 ... Sp

设对于当前需要测定其中元素Y含量的燃烧体的发射光谱的光强度值为z1,z2,…zL,按照前述筛选与预处理方法做相应的处理,得到处理后的光强度值

Figure BDA0000794251510000051
Figure BDA0000794251510000052
Figure BDA0000794251510000053
组成一个1行m列的矩阵Z,根据公式9可获得主成分得分的值S1、S2…Sp。Assuming that the light intensity values of the emission spectrum of the combustion body whose content of element Y needs to be determined at present are z 1 , z 2 , .
Figure BDA0000794251510000051
Figure BDA0000794251510000052
Will
Figure BDA0000794251510000053
A matrix Z with 1 row and m columns is formed, and the values S 1 , S 2 . . . S p of the principal component scores can be obtained according to formula 9.

Figure BDA0000794251510000054
Figure BDA0000794251510000054

其中,i=1,2,...p。where i=1,2,...p.

本发明可以通过以下实验进一步说明。The present invention can be further illustrated by the following experiments.

本实验中的燃烧体发射光谱波长范围为245nm~1044nm,共3648个光波长值,变量w1,w2…w3648分别代表某一波长值对应的光强度值,且w1~w3648光波长差是一个常数,光强度值是某一种固定参考强度的倍数,每一个变量都有19次试验的结果,基于此,完成对燃烧体中碳元素含量的测定。The wavelength range of the emission spectrum of the combustion body in this experiment is 245nm to 1044nm, with a total of 3648 light wavelength values. The variables w1, w2...w3648 respectively represent the light intensity value corresponding to a certain wavelength value, and the light wavelength difference between w1 and w3648 is a constant. , the light intensity value is a multiple of a certain fixed reference intensity, and each variable has the results of 19 trials. Based on this, the determination of the carbon content in the combustion body is completed.

第一步,对19次试验的(不是本次待测燃烧体)原始的光强度值进行发射光谱的光强度值数据的筛选,获得筛选后的光强强度值,由查询到的碳元素发射光谱资料可得:碳元素发射光谱波长范围为175.136nm~453.18nm(具体波长值见附录),根据相关的光谱理论知识可得由于光谱线测量时发射光谱的波长值可能会发生一定的平移,因此选取在碳元素发射光谱波长范围附近间隔小于等于0.2nm处的谱线为碳元素的发射光谱谱线,筛选后的光谱数据包含954个波长,w1,w2…w954分别代表对应波长的光强度值。The first step is to screen the light intensity value data of the emission spectrum for the original light intensity values of the 19 tests (not the combustion body to be tested this time), and obtain the filtered light intensity value, which is emitted by the carbon element inquired. Spectral data can be obtained: the wavelength range of carbon emission spectrum is 175.136nm ~ 453.18nm (see appendix for the specific wavelength value). According to the relevant spectral theory knowledge, the wavelength value of the emission spectrum may shift to a certain extent during the measurement of the spectral line. Therefore, the spectral lines at intervals of less than or equal to 0.2 nm near the wavelength range of the emission spectrum of carbon element are selected as the emission spectral lines of carbon element. The filtered spectral data contains 954 wavelengths, and w1, w2...w954 respectively represent the light intensity of the corresponding wavelength. value.

第二步,对第一步筛选出的发射光谱的光强度值进行数据预处理,获得预处理后的光强度值,采用移动平均平滑法对光谱数据进行预处理,经过预处理后的954个光波长对应的光强度值为

Figure BDA0000794251510000055
The second step is to perform data preprocessing on the light intensity values of the emission spectrum screened in the first step to obtain the preprocessed light intensity values, and use the moving average smoothing method to preprocess the spectral data. The light intensity value corresponding to the light wavelength is
Figure BDA0000794251510000055

第三步,对第二步预处理后的光强度值进行主成分分析(PCA),获得主成分得分数值,在Matlab(一种数学分析计算软件)中运用主成分分析方法进行求解,

Figure BDA0000794251510000056
Figure BDA0000794251510000061
为预处理后的光强度值变量,共有19次试验的结果,将19次试验的结果组成矩阵W(19行954列的数据)如下,其中n为19,m为954,wnm代表第n次试验中第m个波长对应的光强度值。The third step is to perform principal component analysis (PCA) on the light intensity value preprocessed in the second step to obtain the principal component score value, and use the principal component analysis method in Matlab (a mathematical analysis and calculation software) to solve,
Figure BDA0000794251510000056
Figure BDA0000794251510000061
For the variable of light intensity value after preprocessing, there are a total of 19 test results, the results of 19 tests are composed of matrix W (data in 19 rows and 954 columns) as follows, where n is 19, m is 954, and w nm represents the nth The light intensity value corresponding to the mth wavelength in this experiment.

Figure BDA0000794251510000062
Figure BDA0000794251510000062

(1)对数据矩阵W按列进行标准化(每一列中的数据除以每一列数据的总和),获得标准化后的数据矩阵。(1) Normalize the data matrix W by column (the data in each column is divided by the sum of the data in each column) to obtain the normalized data matrix.

(2)对(1)中的标准化后的数据矩阵W,按照以下公式求出矩阵W的相关系数矩阵R,(2) For the standardized data matrix W in (1), obtain the correlation coefficient matrix R of the matrix W according to the following formula,

Figure BDA0000794251510000063
Figure BDA0000794251510000063

式中,rij(i,j=1,2,...,m)为原变量

Figure BDA0000794251510000064
Figure BDA0000794251510000065
的相关系数,其计算公式如下所示,
Figure BDA0000794251510000066
代表变量wi的平均值。In the formula, r ij (i,j=1,2,...,m) is the original variable
Figure BDA0000794251510000064
and
Figure BDA0000794251510000065
The correlation coefficient of , and its calculation formula is as follows:
Figure BDA0000794251510000066
represents the mean of the variable wi .

Figure BDA0000794251510000067
Figure BDA0000794251510000067

(3)对(2)中的相关系数矩阵R求解特征值λi和对应的特征向量ei(i=1,2....954),同时计算贡献率和累积贡献率,取累计贡献率达到96%时的特征值λ12,…,λp所对应的为第一、第二,…,第p(p≤954)个主成分,求出每个主成分变量对应的贡献率如表1所示,共求出16个主成分。(3) Solve the eigenvalue λ i and the corresponding eigenvector e i (i=1, 2....954) for the correlation coefficient matrix R in (2), calculate the contribution rate and the cumulative contribution rate at the same time, and take the cumulative contribution When the rate reaches 96%, the eigenvalues λ 1 , λ 2 ,…,λ p correspond to the first, second,…, pth (p≤954) principal components, and find the corresponding The contribution rate is shown in Table 1, and a total of 16 principal components were obtained.

表1各主成分贡献率Table 1 Contribution rate of each principal component

各主成分贡献率The contribution rate of each principal component 主成分变量1principal component variable 1 0.6980160.698016 主成分变量2Principal Component Variable 2 0.2157540.215754 主成分变量3Principal Component Variable 3 0.0075370.007537 主成分变量4Principal Component Variable 4 0.0066900.006690 主成分变量5Principal Component Variables 5 0.0063840.006384 主成分变量6Principal Component Variables 6 0.0062270.006227 主成分变量7Principal Component Variables 7 0.0061240.006124 主成分变量8Principal Component Variables 8 0.0058490.005849 主成分变量9Principal Component Variables 9 0.0055560.005556 主成分变量10Principal Component Variables 10 0.0054760.005476 主成分变量11Principal Component Variables 11 0.0053390.005339 主成分变量12Principal Component Variables 12 0.0049110.004911 主成分变量13Principal Component Variables 13 0.0048320.004832 主成分变量14Principal Component Variables 14 0.0046340.004634 主成分变量15Principal Component Variables 15 0.0045040.004504 主成分变量16Principal Component Variables 16 0.004327 0.004327

(4)根据(3)中求出的特征值和特征向量计算主成分变量载荷,其计算公式为:(4) Calculate the principal component variable load according to the eigenvalues and eigenvectors obtained in (3), and the calculation formula is:

Figure BDA0000794251510000071
(954行1列的矩阵)
Figure BDA0000794251510000071
(Matrix with 954 rows and 1 column)

根据各主成分的载荷计算得到各主成分的得分:The score of each principal component is calculated according to the load of each principal component:

Figure BDA0000794251510000072
Figure BDA0000794251510000072

其中,Si为19行1列的矩阵,表示的是主成分变量Si在19次样本试验下的得分。Among them, Si is a matrix with 19 rows and 1 column, which represents the score of the principal component variable Si under 19 sample trials.

利用上式求出每一个主成分变量(共16个)得分,每一个主成分变量都有19次试验的结果。Use the above formula to obtain the score of each principal component variable (16 in total), and each principal component variable has the results of 19 trials.

第四步,对第三步中求出的16个主成分得分变量S1,S2,…S16和试验中先验信息(实验前测得的碳元素含量)Y进行多元线性回归拟合,16个主成分得分变量作为自变量,碳元素含量作为因变量,求得碳含量Y与16个主成分变量的关系:The fourth step is to perform multiple linear regression fitting on the 16 principal component score variables S 1 , S 2 , ... S 16 obtained in the third step and the prior information in the experiment (the carbon content measured before the experiment) Y , 16 principal component score variables are used as independent variables, and carbon element content is used as a dependent variable, and the relationship between carbon content Y and 16 principal component variables is obtained:

Y=b0+b1S1+...+b16S16 Y=b 0 +b 1 S 1 +...+b 16 S 16

在Matlab中调用regress函数(最小二乘拟合法)求得上式中各项系数如表2所示:Call the regress function (least square fitting method) in Matlab to obtain the coefficients in the above formula as shown in Table 2:

表2多元线性回归方程拟合系数Table 2 Multiple linear regression equation fitting coefficients

多元线性回归拟合系数Multiple Linear Regression Fit Coefficients b<sub>0</sub>b<sub>0</sub> 7.0005647.000564 b<sub>1</sub>b<sub>1</sub> 0.3468450.346845 b<sub>2</sub>b<sub>2</sub> -0.015874-0.015874 b<sub>3</sub>b<sub>3</sub> -46.693979-46.693979 b<sub>4</sub>b<sub>4</sub> -33.287644-33.287644 b<sub>5</sub>b<sub>5</sub> 55.12868055.128680 b<sub>6</sub>b<sub>6</sub> 7.8276107.827610 b<sub>7</sub>b<sub>7</sub> 2.9223072.922307 b<sub>8</sub>b<sub>8</sub> 4.5684494.568449 b<sub>9</sub>b<sub>9</sub> -20.499940-20.499940 b<sub>10</sub>b<sub>10</sub> -40.594263-40.594263 b<sub>11</sub>b<sub>11</sub> 96.92787696.927876 b<sub>12</sub>b<sub>12</sub> 5.3122275.312227 b<sub>13</sub>b<sub>13</sub> 14.55807314.558073 b<sub>14</sub>b<sub>14</sub> 123.090712123.090712 b<sub>15</sub>b<sub>15</sub> -58.641999-58.641999 b<sub>16</sub>b<sub>16</sub> 9.248874 9.248874

运用拟合后得出的关系模型方程求解碳含量,实验前测得燃烧体碳含量真实值如表3所示:The relationship model equation obtained after fitting is used to solve the carbon content. The real value of the carbon content of the combustion body measured before the experiment is shown in Table 3:

表3碳含量真实值Table 3 Carbon content true value

试验次数nnumber of trials n 碳含量真实值True value of carbon content 11 21.00000021.000000 22 20.00000020.000000 33 18.00000018.000000 44 20.00000020.000000 55 18.48000018.480000 66 25.04000025.040000 77 29.35000029.350000 88 19.00000019.000000 99 24.00000024.000000 1010 19.60000019.600000 1111 25.00000025.000000 1212 25.31000025.310000 1313 30.00000030.000000 1414 27.75000027.750000 1515 15.17000015.170000 1616 20.00000020.000000 1717 17.92000017.920000 1818 20.44000020.440000 1919 14.280000 14.280000

运用拟合出的方程求出的碳含量如表4所示:The carbon content calculated using the fitted equation is shown in Table 4:

表4碳含量测量值Table 4 Carbon Content Measurements

试验次数nnumber of trials n 碳含量测量值Carbon content measurement 11 22.21492322.214923 22 20.84507820.845078 33 18.37391318.373913 44 20.35542420.355424 55 18.52450518.524505 66 23.56523523.565235 77 28.69186128.691861 88 18.74252918.742529 99 22.50987722.509877 1010 19.19090719.190907 1111 23.60210023.602100 1212 26.44718426.447184 1313 31.45960731.459607 1414 27.59899827.598998 1515 15.61303315.613033 1616 21.48484621.484846 1717 18.47247718.472477 1818 21.34211021.342110 1919 15.508944 15.508944

经分析可知,本发明通过建立主成分分析模型和多元线性回归模型,通过分析燃烧体的发射光谱的光强度值进行元素含量的测定,碳含量真实值和预测值之间的误差比较少,测量精度较高、方法简单高效。It can be seen from the analysis that the present invention determines the element content by establishing a principal component analysis model and a multiple linear regression model, and analyzing the light intensity value of the emission spectrum of the combustion body. The precision is high, and the method is simple and efficient.

Claims (1)

1. A method for measuring the element content of a combustion body based on principal components and multiple linear regression is characterized in that,
carrying out data screening on the light intensity value of the emission spectrum of the combustion body, and carrying out denoising treatment on the screened light intensity value;
obtaining a principal component score of the light intensity value according to the de-noised light intensity value;
the content of the element to be measured in the combustion body is obtained by calculation according to the principal component score of the light intensity value, the calculation mode is shown as formula (1),
Y=b0+b1S1+...+bpSp(1)
wherein Y represents the content of the element to be measured in the combustion body, and S1,S2,….SpRepresents the score of the principal component, b0Is a constant term, b1…bpAs a regression coefficient, b0And b1…bpAre all known amounts;
the regression coefficient acquisition process is as follows:
firstly, screening the light intensity value of the emission spectrum of the combustion body, and denoising the screened light intensity value to obtain the light intensity values corresponding to the m light wavelengths subjected to screening and denoising treatment
Figure FDA0002521666450000011
L is the number of light wavelengths in the emission spectrum of the combustion body;
secondly, according to the principal component analysis method, the light intensity value after denoising pretreatment is carried out
Figure FDA0002521666450000012
Performing principal component analysis to obtain principal component score SiP, p is the number of principal components;
1.1 is provided for
Figure FDA0002521666450000013
M light intensity values are totally obtained, the light intensity values obtained by n times of tests form an independent variable matrix X shown as a formula (2), wherein XnmRepresents the light intensity value corresponding to the mth wavelength in the nth test,
Figure FDA0002521666450000014
where n denotes the number of rows of the matrix X and m denotes the number of columns of the matrix X
1.2, calculating a correlation coefficient matrix R of the independent variable matrix X, wherein the correlation coefficient matrix R is shown as a formula (3):
Figure FDA0002521666450000015
in the formula (3), rij,i,j=1,2,...,m,rijAs a light intensity value
Figure FDA0002521666450000016
And
Figure FDA0002521666450000017
the calculation formula of the correlation coefficient (c) is shown in formula (4):
Figure FDA0002521666450000021
wherein,
Figure FDA0002521666450000022
represents the variable xiAverage value of (1), xiRepresents the ith column of data in the matrix X; k represents the number of rows of matrix X, k being 1,2.., n;
1.3 solving the eigenvalue lambda according to the correlation coefficient matrix R12...λm
1.4 dividing the characteristic value lambda12...λmSubstituting the characteristic equation | λ I-R | ═ 0 to respectively obtain characteristic values λiCorresponding feature vector eiFeature vector eiIs a matrix with m rows and 1 column, wherein i is an integer ranging from 1 to m;
1.5 according to the characteristic value lambda12...λmCalculating the contribution rate and the cumulative contribution rate of the principal component, wherein the contribution rate is calculated by formula (5), and the cumulative contribution rate is expressed by formula (6):
Figure FDA0002521666450000023
Figure FDA0002521666450000024
selecting p characteristic values lambda of which the accumulated contribution rate of the main components is more than or equal to 95 percent12,...λpTaking the corresponding principal component as the final principal component score, wherein p is less than or equal to m;
1.6 according to the characteristic value lambda12...λmAnd a feature vector eiCalculating the loads I of the main componentsiThe calculation formula is shown in formula (7):
Figure FDA0002521666450000025
principal component load IiA matrix of m rows and 1 column;
then, based on the obtained loads I of the respective principal componentsiCalculating to obtain score S of each principal componentiAs shown in the formula (8),
Figure FDA0002521666450000026
wherein S isiIs a matrix of n rows and 1 column, and represents a principal component variable SiScore values under n trials;
thirdly, according to the score S of the principal componenti(i ═ 1,2.. p) and the known content value Y of the element Y in the combustion bodyi(i ═ 1,2.. n), fitting was performed by the least square method, and the regression coefficient b was determined1…bp
Screening and denoising light intensity values of emission spectra of elements to be detected in a combustion body to form a matrix Z with 1 row and m columns, and obtaining principal component scores according to the following formula (9):
Figure FDA0002521666450000031
CN201510552931.7A 2015-09-01 2015-09-01 Method for measuring element content of combustion body based on principal component and multiple linear regression Active CN106483077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510552931.7A CN106483077B (en) 2015-09-01 2015-09-01 Method for measuring element content of combustion body based on principal component and multiple linear regression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510552931.7A CN106483077B (en) 2015-09-01 2015-09-01 Method for measuring element content of combustion body based on principal component and multiple linear regression

Publications (2)

Publication Number Publication Date
CN106483077A CN106483077A (en) 2017-03-08
CN106483077B true CN106483077B (en) 2020-10-02

Family

ID=58237921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510552931.7A Active CN106483077B (en) 2015-09-01 2015-09-01 Method for measuring element content of combustion body based on principal component and multiple linear regression

Country Status (1)

Country Link
CN (1) CN106483077B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101403696A (en) * 2008-10-21 2009-04-08 浙江大学 Method for measuring gasoline olefin content based on Raman spectrum
CN102798597A (en) * 2012-08-13 2012-11-28 浙江大学 Soil total nitrogen content detection apparatus and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102313722B (en) * 2011-09-05 2013-03-20 华南理工大学 A Coal Quality Industrial Analysis Method Based on Multiple Linear Regression

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101403696A (en) * 2008-10-21 2009-04-08 浙江大学 Method for measuring gasoline olefin content based on Raman spectrum
CN102798597A (en) * 2012-08-13 2012-11-28 浙江大学 Soil total nitrogen content detection apparatus and method

Also Published As

Publication number Publication date
CN106483077A (en) 2017-03-08

Similar Documents

Publication Publication Date Title
CN104949936B (en) Sample component assay method based on optimization Partial Least-Squares Regression Model
CN106124449B (en) A kind of soil near-infrared spectrum analysis prediction technique based on depth learning technology
CN105352895B (en) High-spectrum remote sensing data vegetation information extraction method
CN101430276B (en) Wavelength variable optimization method in spectrum analysis
CN101520412A (en) Near infrared spectrum analyzing method based on isolated component analysis and genetic neural network
CN107748146A (en) A kind of crude oil attribute method for quick predicting based near infrared spectrum detection
CN105158200B (en) A kind of modeling method for improving the Qualitative Analysis of Near Infrared Spectroscopy degree of accuracy
CN105891147A (en) Near infrared spectrum information extraction method based on canonical correlation coefficients
CN103234922A (en) Rapid soil organic matter detection method based on large sample soil visible-near infrared spectrum classification
CN101915744A (en) Near-infrared spectroscopy non-destructive testing method and device for substance composition content
CN107247033B (en) The method of identifying the maturity of Huanghua pear based on the fast decay elimination algorithm and PLSDA
CN111504979A (en) Method for improving mixture component identification precision by using Raman spectrum of known mixture
CN103226093A (en) Calibration curve creation method, calibration curve creation device and target component determination device
CN111855608A (en) A near-infrared non-destructive testing method for apple acidity based on fusion feature wavelength selection algorithm
CN110779875B (en) Method for detecting moisture content of winter wheat ear based on hyperspectral technology
CN105486655A (en) Rapid detection method for organic matters in soil based on infrared spectroscopic intelligent identification model
JP4315975B2 (en) Noise component removal method
CN106918567A (en) A kind of method and apparatus for measuring trace metal ion concentration
CN107132190A (en) A kind of soil organism spectra inversion model calibration samples collection construction method
CN114002162B (en) Soil organic carbon content estimation method, equipment, storage medium and program product
CN113496218B (en) Evaluation method and system for hyperspectral remote sensing sensitive wave band selection mode
CN114460116A (en) A Quantitative Analysis Method of Element Content by Support Vector Machine Regression Combined with Sensitivity Analysis
CN106153561A (en) The many metal ion inspections of uv-vis spectra based on wavelength screening
CN102135496A (en) Infrared spectrum quantitative analysis method and infrared spectrum quantitative analysis device based on multi-scale regression
CN102128805A (en) Method and device for near infrared spectrum wavelength selection and quick quantitative analysis of fruit

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant