Summary of the invention
The purpose of the present invention is to provide a Raman spectrum preprocessing method that filters noise more effectively and improves the signal-to-noise ratio (SNR) of the spectral data.
The purpose of the invention is achieved by the following technical scheme, the specific steps of which are as follows:
1) obtain the original Raman spectrum information;
2) apply an adaptive threshold denoising algorithm based on the wavelet transform to filter high-frequency noise from the original Raman spectrum information;
3) apply a baseline correction algorithm based on asymmetric least squares to the spectral information processed in step 2), removing the interference of the fluorescence background with spectral analysis;
4) apply standard normalization to the Raman spectrum data processed in step 3), eliminating the influence of noise and data dimension (measurement scale) on spectral analysis;
5) fit the Raman spectrum data processed in step 4) with a cubic smoothing spline.
Further, the specific steps of the adaptive threshold denoising algorithm based on the wavelet transform in step 2) are as follows:
A population-based genetic algorithm takes the wavelet function, decomposition scale, threshold and threshold function as inputs and the comprehensive denoising performance index as the fitness function, and optimizes the parameters of the wavelet threshold filtering algorithm; the optimized wavelet function, decomposition scale, threshold and threshold function are then used to perform wavelet threshold denoising on the original Raman spectrum, filtering out the high-frequency noise;
The specific steps of the wavelet threshold filtering algorithm are as follows:
2-1) select the wavelet function and determine the decomposition scale for the original spectral signal;
2-2) using the parameters of step 2-1), apply the multiresolution wavelet transform to the original spectral signal, decomposing it into high-frequency wavelet coefficients and low-frequency wavelet coefficients;
2-3) apply the threshold function to the high-frequency wavelet coefficients of the decomposition, quantizing them segment by segment;
2-4) reconstruct the spectral signal from the high-frequency wavelet coefficients obtained in step 2-3) together with the low-frequency coefficients;
2-5) compute the comprehensive denoising performance index from the reconstructed spectral signal and the spectral signal before decomposition;
2-6) if the comprehensive denoising performance index of step 2-5) meets the requirement, output the spectral signal reconstructed in step 2-4), from which the high-frequency noise has been removed; otherwise, return to step 2-1).
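Steps 2-2) to 2-4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: it assumes the Haar wavelet and a soft threshold function with one threshold per decomposition level, neither of which is fixed by the text, and all function names are hypothetical. The signal length must be divisible by 2 raised to the number of levels.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(signal, levels):
    """Multi-level Haar transform: returns (low-frequency coefficients,
    list of high-frequency coefficient lists, finest level first)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])  # high frequency
        approx = [(a + b) / SQRT2 for a, b in pairs]         # low frequency
    return approx, details

def soft_threshold(coeffs, thr):
    """Assumed threshold function: shrink each coefficient toward zero by thr."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def haar_reconstruct(approx, details):
    """Inverse transform: rebuild the signal from coarse to fine."""
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        approx = rebuilt
    return approx

def wavelet_denoise(signal, levels, thresholds):
    """Steps 2-2) to 2-4): decompose, threshold each detail level, reconstruct."""
    approx, details = haar_decompose(signal, levels)
    details = [soft_threshold(d, t) for d, t in zip(details, thresholds)]
    return haar_reconstruct(approx, details)

# Hypothetical usage: two levels, a separate threshold for each level.
denoised = wavelet_denoise([1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1], 2, [0.2, 0.1])
```

In the method described above, the wavelet function, the number of levels, the thresholds and the threshold function would be the quantities tuned by the genetic algorithm rather than fixed by hand.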
Further, the baseline correction algorithm based on asymmetric least squares described in step 3) comprises the following specific steps:
3-1) Estimate the initial baseline
The objective function optimized by the asymmetric least squares algorithm is defined in terms of the following quantities: $D_1$ is the first-order difference matrix; $D$ is the second-order difference matrix; $W$ is a diagonal weight matrix formed from the vector $w$, i.e. $W = \mathrm{diag}(w)$; $y$ and $z$ are, respectively, the Raman signal after the processing of step 2) and the estimated background (baseline) signal; $\lambda$ and $\lambda_1$ are regularization parameters; and the weight coefficients $w_i$ are chosen asymmetrically;
Minimizing the objective function is converted into an iterative linear equation, which is solved to obtain the initial baseline estimate $z^{(0)}$ of the original spectrum $y$;
3-2) Obtain a new baseline with an iterative algorithm
Using the initial baseline estimate $z^{(0)}$ obtained in step 3-1), determine the positions at which the corrected spectrum $y - z^{(0)}$ is negative, and construct the matrix $W$ so as to regularize the negative part: when $y_i > z_i$, $w_i = p$, and when $y_i \le z_i$, $w_i = 1 - p$, where $i$ is the Raman pixel index and $p$ is a regularization parameter. $p$ is chosen by optimization so as to minimize the energy of the negative part of the corrected spectrum $y - z$; the newly constructed, regularized matrix $W$ is substituted back into the iterative linear equation, which is solved for the new baseline;
3-3) Compute the background value of the corrected spectrum $y - z^{(i)}$ with a polynomial fitting algorithm, and iterate until the change in the background value falls within a preset range; the algorithm then stops, yielding the baseline-corrected Raman spectrum data.
Further, the standard normalization described in step 4) is carried out as follows:
The different variables are scaled so that the variance of each variable is 1, that is:
$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, \dots, I; \; j = 1, \dots, J$$
where $I$ is the number of sample points, $J$ is the number of variables, $x_{ij}$ is the element in row $i$ and column $j$ of the matrix $X$ before standardization, $x_{ij}^{*}$ is the standardized value, $\bar{x}_j$ is the mean of column $j$ of $X$, and $s_j$ is the standard deviation of column $j$ of $X$.
Further, the fitting of the Raman spectrum data with a cubic smoothing spline described in step 5) is carried out as follows:
The Raman spectrum data after standard normalization are taken as the input of the cubic smoothing spline fit and substituted into the following system of equations:
$$(Q^{T} D^{2} Q + pT)\,c = p\,Q^{T} y$$
$$a = y - p^{-1} D^{2} Q c$$
A numerical algorithm is used to compute the coefficients of the cubic polynomials, which gives the cubic smoothing spline curve and thus the fit of the Raman spectrum data. Here $p$ is the Lagrange parameter, $T$ is a positive definite tridiagonal matrix of order $n-1$, $Q$ is a tridiagonal matrix with $n+1$ rows and $n-1$ columns, $D = \mathrm{diag}(\delta y_0, \dots, \delta y_n)$ is a diagonal matrix, $a$ is the vector of constant terms of the cubic smoothing spline function, $c$ is the vector of quadratic coefficients of the cubic smoothing spline function, and $y$ is the normalized Raman scattering intensity.
By adopting the above technical scheme, the present invention has the following advantages:
The present invention adopts an adaptive wavelet threshold denoising algorithm that improves both the wavelet threshold filtering algorithm and the threshold rule. The high-frequency information is thresholded segment by segment and scale by scale, which makes the thresholds more practical, reduces the deviation caused by misjudging wavelet coefficients during thresholding, and improves the signal-to-noise ratio of the Raman spectrum. In addition, the parameters of the wavelet threshold algorithm are selected by an intelligent optimization algorithm, so the method adapts better to the different kinds of noise in Raman signals and filters them more effectively.
Secondly, the baseline correction algorithm based on asymmetric least squares of the present invention estimates the initial background value in combination with polynomial fitting and estimates the baseline with a Whittaker smoother. A smoothness constraint is imposed on the baseline being sought and a first-derivative constraint term is added, so that the given spectrum is fitted asymmetrically. The error between the fitted values and the raw data is very small, and the interference of the fluorescence background with spectral analysis is removed more effectively.
Finally, the present invention couples filtering with baseline correction and combines them with standard normalization and cubic smoothing spline fitting, achieving adaptive filtering, denoising and baseline correction of Raman spectra and improving their signal-to-noise ratio.
Other advantages, objects and features of the present invention will be set forth in part in the description that follows, and in part will become apparent to those skilled in the art upon examination of the following, or may be learned from practice of the invention. The objects and other advantages of the present invention may be realized and attained by means of the following description and claims.
Embodiment
The invention is further described below with reference to the drawings and embodiments.
A Raman spectrum preprocessing method comprises the following specific steps:
1) obtain the original Raman spectrum information;
2) apply an adaptive threshold denoising algorithm based on the wavelet transform to filter high-frequency noise from the original Raman spectrum information;
3) apply a baseline correction algorithm based on asymmetric least squares to the spectral information processed in step 2), removing the interference of the fluorescence background with spectral analysis;
4) apply standard normalization to the Raman spectrum data processed in step 3), eliminating the influence of noise and data dimension (measurement scale) on spectral analysis;
5) fit the Raman spectrum data processed in step 4) with a cubic smoothing spline.
As shown in Figure 2, the specific steps of the adaptive threshold denoising algorithm based on the wavelet transform in step 2) are as follows:
A population-based genetic algorithm takes the wavelet function, decomposition scale, threshold and threshold function as inputs and the comprehensive denoising performance index as the fitness function, and optimizes the parameters of the wavelet threshold filtering algorithm; the optimized wavelet function, decomposition scale, threshold and threshold function are then used to perform wavelet threshold denoising on the original Raman spectrum, filtering out the high-frequency noise;
The specific steps of the wavelet threshold filtering algorithm are as follows:
2-1) select the wavelet function and determine the decomposition scale for the original spectral signal;
2-2) using the parameters of step 2-1), apply multiresolution decomposition to the original spectral signal, decomposing it into high-frequency wavelet coefficients and low-frequency wavelet coefficients;
2-3) apply the threshold function to the high-frequency wavelet coefficients of the decomposition, quantizing them segment by segment;
2-4) reconstruct the spectral signal from the high-frequency coefficients obtained in step 2-3) together with the low-frequency coefficients;
2-5) compute the comprehensive denoising performance index from the reconstructed spectral signal and the spectral signal before decomposition;
2-6) if the comprehensive denoising performance index of step 2-5) meets the requirement, output the spectral signal reconstructed in step 2-4), from which the high-frequency noise has been removed; otherwise, return to step 2-1).
The adaptive threshold denoising algorithm based on the wavelet transform filters high-frequency noise out of the Raman spectrum. To improve the signal-to-noise ratio of the filtered Raman data, the algorithm builds on the multiresolution wavelet transform and applies the threshold function to the wavelet coefficients of each layer segment by segment and scale by scale. This makes the thresholds more practical, reduces the deviation caused by misjudging wavelet coefficients during thresholding, filters out the high-frequency information, and improves the signal-to-noise ratio of the Raman spectrum. The key step is the choice of the threshold and of the threshold quantization method, since it largely determines the quality of the denoising. For this reason, a genetic algorithm is used to optimize the wavelet function, decomposition scale and threshold of the wavelet threshold algorithm; the parameters are adjusted adaptively so that different kinds of noise are filtered more effectively, which facilitates the analysis and processing of Raman spectral signals.
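The text does not define the comprehensive denoising performance index of steps 2-5) and 2-6). As a hedged illustration only, the sketch below assumes it is built from two common measures, the root-mean-square error and the signal-to-noise ratio between the signal before decomposition and the reconstructed signal; the function names and the acceptance rule in `meets_requirement` are hypothetical.

```python
import math

def rmse(original, denoised):
    """Root-mean-square error between the original and the denoised signal."""
    total = sum((o - d) ** 2 for o, d in zip(original, denoised))
    return math.sqrt(total / len(original))

def snr_db(original, denoised):
    """Signal-to-noise ratio in dB: power of the original signal divided by
    the power of the removed component (original minus denoised)."""
    signal_power = sum(o ** 2 for o in original)
    noise_power = sum((o - d) ** 2 for o, d in zip(original, denoised))
    if noise_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

def meets_requirement(original, denoised, min_snr_db):
    """Assumed acceptance test for step 2-6): accept when the SNR is high enough."""
    return snr_db(original, denoised) >= min_snr_db
```

A fitness function for the genetic algorithm could combine such measures, for example by rewarding a high SNR and a low RMSE, but the exact weighting is not given in the text.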
The baseline correction algorithm based on asymmetric least squares described in step 3) comprises the following specific steps:
3-1) Estimate the initial baseline
The objective function optimized by the asymmetric least squares algorithm is defined in terms of the following quantities: $D_1$ is the first-order difference matrix; $D$ is the second-order difference matrix; $W$ is a diagonal weight matrix formed from the vector $w$, i.e. $W = \mathrm{diag}(w)$; $y$ and $z$ are, respectively, the Raman signal after the processing of step 2) and the estimated background (baseline) signal; and the weight coefficients $w_i$ are chosen asymmetrically. The first term of the objective function reflects the baseline fitting error, the second term reflects the fitting error of the first derivative, and the third term is the baseline smoothness constraint. $\lambda$ and $\lambda_1$ are regularization parameters; $\lambda$ ranges from $10^{2}$ to $10^{5}$ and $\lambda_1$ ranges from $10^{-2}$ to $10^{-10}$. Their main role is to minimize the fitting error while ensuring the smoothness of the baseline.
Minimizing the objective function is converted into an iterative linear equation, which is solved to obtain the initial baseline estimate $z^{(0)}$ of the original spectrum $y$;
3-2) Obtain a new baseline with an iterative algorithm
Using the initial baseline estimate $z^{(0)}$ obtained in step 3-1), determine the positions at which the corrected spectrum $y - z^{(0)}$ is negative, and construct the matrix $W$ so as to regularize the negative part: when $y_i > z_i$, $w_i = p$, and when $y_i \le z_i$, $w_i = 1 - p$, where $i$ is the Raman pixel index and $p$ is a regularization parameter. $p$ is chosen by optimization in the range 0.000001 to 0.001 so as to minimize the energy of the negative part of the corrected spectrum $y - z$; the newly constructed, regularized matrix $W$ is substituted back into the iterative linear equation, which is solved for the new baseline;
3-3) Compute the background value of the corrected spectrum $y - z^{(i)}$ with a polynomial fitting algorithm, and iterate until the change in the background value falls within a preset range; the algorithm then stops, yielding the baseline-corrected Raman spectrum data.
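The iteration of steps 3-1) and 3-2) can be sketched as follows. The sketch assumes an objective of the form

$$S(z) = (y - z)^{T} W (y - z) + \lambda_1 \lVert D_1 (y - z) \rVert^2 + \lambda \lVert D z \rVert^2,$$

which matches the three terms described above (fitting error, first-derivative fitting error, smoothness constraint) but is an assumption rather than the exact claimed formula. Setting its gradient to zero gives the linear equation $(W + \lambda_1 D_1^{T} D_1 + \lambda D^{T} D)\,z = (W + \lambda_1 D_1^{T} D_1)\,y$. The function names, the dense solver, the default parameter values and the fixed iteration count are illustrative choices, and the polynomial-fitting stop criterion of step 3-3) is not included.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (dense, small n)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            if f:
                for k in range(col, n + 1):
                    m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def add_penalty(mat, stencil, weight):
    """Add weight * D^T D to mat, where each row of D is `stencil` shifted by one."""
    n, k = len(mat), len(stencil)
    for start in range(n - k + 1):
        for i, si in enumerate(stencil):
            for j, sj in enumerate(stencil):
                mat[start + i][start + j] += weight * si * sj

def asls_baseline(y, lam=1e4, lam1=1e-4, p=0.001, iterations=10):
    """Estimate the baseline z of spectrum y with asymmetric weights."""
    n = len(y)
    w = [1.0] * n                                  # initial weights for z^(0)
    z = list(y)
    for _ in range(iterations):
        left = [[0.0] * n for _ in range(n)]
        for i in range(n):
            left[i][i] = w[i]
        add_penalty(left, [-1.0, 1.0], lam1)       # lam1 * D1^T D1
        right = [row[:] for row in left]           # W + lam1 * D1^T D1
        add_penalty(left, [1.0, -2.0, 1.0], lam)   # + lam * D^T D
        rhs = [sum(right[i][j] * y[j] for j in range(n)) for i in range(n)]
        z = solve(left, rhs)
        # Asymmetric update: small weight p above the baseline, 1 - p below it.
        w = [p if yi > zi else 1.0 - p for yi, zi in zip(y, z)]
    return z
```

The dense solver keeps the sketch free of dependencies; for spectra with thousands of pixels a banded or sparse solver would be the natural replacement, since the system matrix is pentadiagonal.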
Standard normalization is applied to the Raman spectrum data to eliminate the influence of noise and data dimension (measurement scale) on spectral analysis. Standardization highlights the correlations among process variables, removes part of the nonlinear characteristics present in the process, eliminates the influence of the different measurement scales of the process variables on the model, and simplifies the structure of the data model. Standard normalization usually comprises two steps: centering the data and removing the dimension (scaling). The standard normalization described in step 4) is carried out as follows:
The different variables are scaled so that the variance of each variable is 1, that is:
$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1, \dots, I; \; j = 1, \dots, J$$
where $I$ is the number of sample points, $J$ is the number of variables, $x_{ij}$ is the element in row $i$ and column $j$ of the matrix $X$ before standardization, $x_{ij}^{*}$ is the standardized value, $\bar{x}_j$ is the mean of column $j$ of $X$, and $s_j$ is the standard deviation of column $j$ of $X$.
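The two steps of standard normalization, centering and scaling, can be illustrated with a short sketch. It assumes that rows of the matrix are sample points and columns are variables, and that the scale of each column is its sample standard deviation (denominator $I - 1$), which the text does not specify; the function name `autoscale` is hypothetical.

```python
import statistics

def autoscale(matrix):
    """Column-wise standard normalization: subtract the column mean,
    then divide by the column standard deviation."""
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # sample std, I - 1 denominator
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in matrix]

# Hypothetical usage: three sample points, two variables on very different scales.
scaled = autoscale([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```

After scaling, both columns have zero mean and unit variance, so a variable measured on a larger scale no longer dominates the later analysis. A constant column has zero standard deviation and would need separate handling.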
The fitting of the Raman spectrum data with a cubic smoothing spline described in step 5) is carried out as follows:
The Raman spectrum data after standard normalization are taken as the input of the cubic smoothing spline fit and substituted into the following system of equations:
$$(Q^{T} D^{2} Q + pT)\,c = p\,Q^{T} y$$
$$a = y - p^{-1} D^{2} Q c$$
A numerical algorithm is used to compute the coefficients of the cubic polynomials, which gives the cubic smoothing spline curve and thus the fit of the Raman spectrum data. Here $p$ is the Lagrange parameter, $T$ is a positive definite tridiagonal matrix of order $n-1$, $Q$ is a tridiagonal matrix with $n+1$ rows and $n-1$ columns, $D = \mathrm{diag}(\delta y_0, \dots, \delta y_n)$ is a diagonal matrix, $a$ is the vector of constant terms of the cubic smoothing spline function, $c$ is the vector of quadratic coefficients of the cubic smoothing spline function, and $y$ is the normalized Raman scattering intensity.
The cubic smoothing spline is a smoothing method based on spline functions: a smoothing factor is added to the spline fit, which gives the fit a controllable degree of smoothing.
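Given $x_i, y_i$ ($i = 0, \dots, n$) with $x_0 < x_1 < \dots < x_n$, the smoothing function $f(x)$ is obtained as the solution of a minimization problem over functions $g(x)$ that satisfy a given constraint.
By introducing an auxiliary variable $z$ and a Lagrange parameter $p$, the original constrained minimization problem is converted into an unconstrained one.
From the Euler-Lagrange equation, one obtains:
$$f''''(x) = 0, \quad x_i < x < x_{i+1}, \quad i = 0, \dots, n-1$$
together with matching conditions on the derivatives $f^{(k)}$, $k = 0, 1, 2, 3$, at the knots $x_i$, where $f^{(k)}(x_i)_-$ and $f^{(k)}(x_i)_+$ denote the left-hand and right-hand limits at $x_i$, and with the convention
$$f''(x_0)_- = f'''(x_0)_- = f''(x_n)_+ = f'''(x_n)_+ = 0$$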
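As a sketch under stated assumptions, the minimization problem can be written in the classical smoothing-spline form, which is consistent with the auxiliary variable $z$, the Lagrange parameter $p$ and the matrix equations used in this section. The tolerance $S$ is an assumed symbol that does not appear elsewhere in the text, and the formulas below are offered as a plausible reading, not as the exact claimed expressions. The first line is the constrained problem, the second is the functional obtained after introducing $z$ and $p$, and the third gives the matching conditions at the knots.

```latex
\begin{aligned}
&\min_{g} \int_{x_0}^{x_n} g''(x)^2 \, dx
\quad \text{subject to} \quad
\sum_{i=0}^{n} \left( \frac{g(x_i) - y_i}{\delta y_i} \right)^{2} \le S \\[6pt]
&\int_{x_0}^{x_n} g''(x)^2 \, dx
+ p \left\{ \sum_{i=0}^{n} \left( \frac{g(x_i) - y_i}{\delta y_i} \right)^{2} + z^2 - S \right\} \\[6pt]
&f^{(k)}(x_i)_- - f^{(k)}(x_i)_+ =
\begin{cases}
0, & k = 0, 1, 2 \\[4pt]
2p \, \dfrac{f(x_i) - y_i}{\delta y_i^{2}}, & k = 3
\end{cases}
\end{aligned}
```

With $d_i = (c_{i+1} - c_i)/(3 h_i)$, the $k = 3$ condition reduces to $Qc = pD^{-2}(y - a)$, which agrees with the relation stated for $k = 3$ in this section.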
The extremal function $f(x)$ can be written piecewise as a cubic polynomial:
$$f(x) = a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3, \quad x_i \le x < x_{i+1}$$
Because $f$, $f'$ and $f''$ are continuous at the knots, the solution is a cubic spline curve.
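From the relations between the spline coefficients, the following are obtained.
For $k = 2$:
$$c_0 = c_n = 0, \quad d_i = \frac{c_{i+1} - c_i}{3 h_i}, \quad i = 0, \dots, n-1$$
For $k = 0$ (continuity of $f$ at the knots):
$$b_i = \frac{a_{i+1} - a_i}{h_i} - c_i h_i - d_i h_i^2, \quad i = 0, \dots, n-1$$
For $k = 1$:
$$Tc = Q^{T} a$$
For $k = 3$:
$$Qc = p D^{-2} (y - a)$$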
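where:
$$h_i = x_{i+1} - x_i$$
$$c = (c_1, \dots, c_{n-1})^{T}$$
$$y = (y_0, y_1, \dots, y_n)^{T}$$
$$a = (a_0, a_1, \dots, a_n)^{T}$$
$$D = \mathrm{diag}(\delta y_0, \dots, \delta y_n)$$
$T$ is a positive definite tridiagonal matrix of order $n-1$, with
$$t_{i,i} = \frac{2 (h_{i-1} + h_i)}{3}, \quad t_{i,i+1} = t_{i+1,i} = \frac{h_i}{3}$$
$Q$ is a tridiagonal matrix with $n+1$ rows and $n-1$ columns, with
$$q_{i-1,i} = \frac{1}{h_{i-1}}, \quad q_{i,i} = -\frac{1}{h_{i-1}} - \frac{1}{h_i}, \quad q_{i+1,i} = \frac{1}{h_i}$$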
Therefore:
$$(Q^{T} D^{2} Q + pT)\,c = p\,Q^{T} y$$
$$a = y - p^{-1} D^{2} Q c$$
Solving the above system of equations by a numerical method yields the coefficients of the cubic polynomials, from which the complete cubic smoothing spline curve is obtained.
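A minimal numerical sketch of this solution step is given below. It builds $Q$ and $T$ from the definitions above, solves $(Q^{T} D^{2} Q + pT)\,c = p\,Q^{T} y$ for the interior coefficients $c_1, \dots, c_{n-1}$, and then recovers $a$, $d$ and $b$. The function names and the dense Gaussian elimination are illustrative assumptions, the choice of $p$ and $\delta y_i$ is left to the caller, and at least three data points are required.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (dense, small n)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            if f:
                for k in range(col, n + 1):
                    m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def smoothing_spline_coefficients(x, y, dy, p):
    """Return the coefficient lists (a, b, c, d) of the cubic smoothing spline."""
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]
    m = n - 1                                   # number of interior knots
    q = [[0.0] * m for _ in range(n + 1)]       # Q: (n+1) rows, (n-1) columns
    t = [[0.0] * m for _ in range(m)]           # T: order n-1
    for k in range(m):
        i = k + 1                               # column k belongs to knot i
        q[i - 1][k] = 1.0 / h[i - 1]
        q[i][k] = -1.0 / h[i - 1] - 1.0 / h[i]
        q[i + 1][k] = 1.0 / h[i]
        t[k][k] = 2.0 * (h[i - 1] + h[i]) / 3.0
        if k + 1 < m:
            t[k][k + 1] = t[k + 1][k] = h[i] / 3.0
    # Left side Q^T D^2 Q + p T and right side p Q^T y.
    lhs = [[sum(q[r][j] * dy[r] ** 2 * q[r][k] for r in range(n + 1)) + p * t[j][k]
            for k in range(m)] for j in range(m)]
    rhs = [p * sum(q[r][k] * y[r] for r in range(n + 1)) for k in range(m)]
    c_inner = solve(lhs, rhs)
    c = [0.0] + c_inner + [0.0]                 # c_0 = c_n = 0
    qc = [sum(q[r][k] * c_inner[k] for k in range(m)) for r in range(n + 1)]
    a = [y[r] - dy[r] ** 2 * qc[r] / p for r in range(n + 1)]   # a = y - D^2 Q c / p
    d = [(c[i + 1] - c[i]) / (3.0 * h[i]) for i in range(n)]
    b = [(a[i + 1] - a[i]) / h[i] - c[i] * h[i] - d[i] * h[i] ** 2 for i in range(n)]
    return a, b, c, d
```

On interval $i$ the fitted value is $a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3$. A larger $p$ pulls the curve closer to the data, while a smaller $p$ gives a smoother curve.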
The Raman spectra obtained after processing with the method of the present invention are shown in Fig. 3, Fig. 4, Fig. 5 and Fig. 6.
Table 1 compares the performance indices of the adaptive threshold denoising algorithm based on the wavelet transform with those of a smoothing filter algorithm.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical scheme of the present invention and not to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical scheme of the present invention may be modified or replaced with equivalents without departing from its spirit and scope, and all such modifications and replacements fall within the scope of the claims of the present invention.