A method for identifying the place of origin of jade by combining spectral normalization with multivariate statistical models
Technical Field
The present invention relates to a method for identifying the place of origin of jade. Specifically, the method acquires spectra by laser-induced breakdown spectroscopy (LIBS), pre-processes them by spectral normalization, and then analyzes the spectral data with two multivariate statistical models, principal component analysis (PCA) and the support vector machine (SVM), to identify the place of origin of the jade.
Background Art
Jade is a class of precious minerals that is usually made into jewellery, ornaments and works of art. High-quality jade has very high economic and artistic value. In China, jade culture has a history of more than a thousand years and carries rich cultural meaning. The quality and price of jade are determined mainly by the place of origin of the raw material. Identifying the place of origin of jade is more difficult than distinguishing genuine jade from fake. Traditional origin identification based on colour and lustre, empirical judgement and conventional optical inspection has an accuracy of only about 60%, must be carried out by experts, and is strongly affected by human factors. Although Raman spectroscopy and inductively coupled plasma spectrometry have been used to compare jade samples from different places of origin, no systematic identification method has yet been established.
Laser-induced breakdown spectroscopy (LIBS) is an emerging elemental analysis technique. It requires no sample pre-treatment, causes little damage to the sample, is fast, and can measure multiple elements simultaneously, so it has great potential for identifying the place of origin of jade. However, LIBS spectral data are typically large in volume, high in dimension and sensitive to fluctuations in experimental conditions, which makes them difficult to use directly for origin identification. The present invention therefore pre-processes the spectra by spectral normalization to eliminate the influence of fluctuating experimental conditions, and combines this with multivariate statistical models: principal component analysis (PCA) is first applied to extract the principal components, removing redundant data and noise and reducing the data dimension; a support vector machine (SVM) then classifies the extracted principal components to determine the place of origin of the jade.
Summary of the Invention
The purpose of the present invention is to overcome the low accuracy and the strong dependence on human factors of traditional methods for identifying the place of origin of jade. It proposes identification based on LIBS combined with spectral normalization and the multivariate statistical models PCA and SVM, replacing qualitative, experience-based judgement with a more scientific quantitative discrimination and thereby improving the accuracy of the identification result.
The technical solution of the present invention is as follows:
A method for identifying the place of origin of jade by combining spectral normalization with multivariate statistical models, characterized in that the method comprises the following steps:
1) Take a group of jade samples of known place of origin as calibration samples for modelling. Samples from the same place of origin are assigned to the same class, and samples from different places of origin are assigned to different classes;
2) Measure the calibration samples with a laser-induced breakdown spectroscopy (LIBS) experimental system to obtain the spectra of this group of calibration samples, which contain the LIBS characteristic spectral lines of the various elements in each calibration sample and the intensities of these lines;
3) Normalize the LIBS spectra of all calibration samples: select a characteristic spectral line of relatively high intensity as the standard characteristic line; for the spectrum of each calibration sample, divide the intensity of every characteristic line by the intensity of the standard characteristic line and keep the quotient as the normalized intensity, forming the normalized characteristic-line intensity matrix X,
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \tag{1}$$
where n is the number of jade samples used for calibration, p is the number of characteristic spectral lines, and $x_{i1}, x_{i2}, \ldots, x_{ip}$ are the normalized intensities of the characteristic lines of the i-th jade sample;
4) Perform principal component analysis on the matrix X to extract the principal components: diagonalize the matrix, i.e. find an orthogonal matrix A such that
$$A^{T} X^{T} X A = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_p \end{bmatrix} \tag{2}$$
where $A^{T}$ and $X^{T}$ denote the transposes of the matrices A and X respectively, and $\lambda_1, \lambda_2, \ldots, \lambda_p$ are the eigenvalues on the diagonal, sorted in descending order, i.e. $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$;
Select the first m eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ such that their sum is greater than or equal to 95% of the sum of all eigenvalues, i.e.
$$\frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{p} \lambda_i} \ge 95\% \tag{3}$$
The dimensions corresponding to these first m eigenvalues are the first m principal components of the matrix X, denoted $S_1, S_2, \ldots, S_m$, which satisfy
$$S_1 = X A_1,\quad S_2 = X A_2,\quad \ldots,\quad S_m = X A_m \tag{4}$$
where $A_1, A_2, \ldots, A_m$ are the 1st, 2nd, ..., m-th columns of the orthogonal matrix A;
The principal component matrix obtained by the principal component analysis is denoted S, so that
$$S = \begin{bmatrix} S_1 & S_2 & \cdots & S_m \end{bmatrix} = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1m} \\ s_{21} & s_{22} & \cdots & s_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n1} & s_{n2} & \cdots & s_{nm} \end{bmatrix} \tag{5}$$
where n is the number of jade samples used for calibration, m is the number of selected eigenvalues, i.e. the number of principal components, and $s_{i1}, s_{i2}, \ldots, s_{im}$ are the first m principal components of the i-th calibration sample;
5) Suppose the n jade samples used for calibration come from k places of origin, the i-th place of origin containing $C_i$ samples, i.e.
$$C_1 + C_2 + \cdots + C_k = n \tag{6}$$
A model is built and calibrated with the support vector machine method, classifying the n calibration samples by place of origin, two classes at a time;
First regard the samples from the first place of origin as one class and the samples from the remaining k-1 places of origin as the other class. For these two classes, apply the support vector machine method, i.e. find an m-dimensional vector ω and a constant b such that the principal components $[s_{i1}\ s_{i2}\ \cdots\ s_{im}]$ of every jade sample i from the first place of origin satisfy
$$\omega^{T}[s_{i1}\ s_{i2}\ \cdots\ s_{im}] + b \ge +1 \tag{7}$$
and there exists a sample for which equality holds, i.e. the principal components $[s_{i^{*}1}\ s_{i^{*}2}\ \cdots\ s_{i^{*}m}]$ of some jade sample $i^{*}$ from the first place of origin satisfy
$$\omega^{T}[s_{i^{*}1}\ s_{i^{*}2}\ \cdots\ s_{i^{*}m}] + b = +1 \tag{8}$$
The principal components $[s_{j1}\ s_{j2}\ \cdots\ s_{jm}]$ of every jade sample j in the other class satisfy
$$\omega^{T}[s_{j1}\ s_{j2}\ \cdots\ s_{jm}] + b \le -1 \tag{9}$$
and there exists a sample for which equality holds, i.e. the principal components $[s_{j^{*}1}\ s_{j^{*}2}\ \cdots\ s_{j^{*}m}]$ of some jade sample $j^{*}$ in the other class satisfy
$$\omega^{T}[s_{j^{*}1}\ s_{j^{*}2}\ \cdots\ s_{j^{*}m}] + b = -1 \tag{10}$$
The two classes of samples are thus separated by the hyperplane $\omega^{T}s + b = 0$, where s is a vector of principal components, and the separation distance is $\frac{2}{\|\omega\|}$, where $\|\omega\|$ denotes the norm of the vector ω;
More than one pair of vector ω and constant b satisfies the above conditions. Take the pair for which $\|\omega\|$ is smallest, i.e. for which the separation distance $\frac{2}{\|\omega\|}$ between the two classes is largest, and denote it $(\omega_1^{*}, b_1^{*})$; this is the best way of dividing the jade samples of the first place of origin from those of the remaining k-1 places of origin;
After the jade samples of the first place of origin have been separated, regard the samples of the second place of origin as one class and the samples of the remaining k-2 places of origin as the second class, divide them in the same way, and record the corresponding $(\omega_2^{*}, b_2^{*})$; continue in this way until the samples of all places of origin have been separated;
6) Take a group of jade samples of unknown place of origin as test samples and predict their origin as follows:
First measure the test samples with the LIBS experimental system to obtain their spectra, and normalize the spectra to obtain the normalized characteristic-line intensity matrix X′. Extract the first m principal components of X′, denoted S′, so that
$$S' = \begin{bmatrix} s_{11}' & s_{12}' & \cdots & s_{1m}' \\ s_{21}' & s_{22}' & \cdots & s_{2m}' \\ \vdots & \vdots & \ddots & \vdots \\ s_{q1}' & s_{q2}' & \cdots & s_{qm}' \end{bmatrix} \tag{11}$$
where q is the number of jade samples used for testing, m is the number of selected eigenvalues, i.e. the number of principal components, and $s_{i1}', s_{i2}', \ldots, s_{im}'$ are the principal components of the i-th test sample;
Classify the q test samples with the support vector machine method, two classes at a time, to predict their places of origin. First separate out the samples of the first place of origin: for the i-th jade sample, if its principal components $[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}']$ satisfy $(\omega_1^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_1^{*} \ge 0$, it is regarded as a sample of the first place of origin; if they satisfy $(\omega_1^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_1^{*} < 0$, it is regarded as a sample of the other places of origin. Then separate out the samples of the second place of origin from the samples of the other places of origin, i.e. for the i-th jade sample,
if its principal components satisfy $(\omega_2^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_2^{*} \ge 0$, it is regarded as a sample of the second place of origin;
if they satisfy $(\omega_2^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_2^{*} < 0$, it is regarded as a sample of the other places of origin. Continue in this way until the places of origin of all test samples have been predicted;
7) Compare the predicted place of origin of each test sample with its true place of origin to verify the correctness of the origin identification.
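Steps 3) and 4) above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the inventors' implementation: the function names `normalize_spectra` and `pca_by_threshold`, the parameters `ref_index` and `threshold`, and the sample intensities are all hypothetical, and the sketch assumes that the matrix being diagonalized is $X^{T}X$ without mean-centring, since the text does not mention centring.

```python
import numpy as np


def normalize_spectra(intensities, ref_index):
    """Step 3: divide each characteristic-line intensity by the reference-line intensity."""
    intensities = np.asarray(intensities, dtype=float)
    ref = intensities[:, [ref_index]]  # shape (n, 1): one reference value per sample
    return intensities / ref


def pca_by_threshold(X, threshold=0.95):
    """Step 4: diagonalize X^T X and keep the first m components reaching `threshold`."""
    eigvals, A = np.linalg.eigh(X.T @ X)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # re-sort so lambda_1 >= ... >= lambda_p
    eigvals, A = eigvals[order], A[:, order]
    ratio = np.cumsum(eigvals) / np.sum(eigvals)
    # Smallest m whose cumulative eigenvalue share is >= threshold, capped at p.
    m = min(int(np.searchsorted(ratio, threshold)) + 1, len(eigvals))
    return X @ A[:, :m], A[:, :m], eigvals


# Hypothetical intensities: 4 samples x 3 characteristic lines (column 1 is the reference).
raw = np.array([[10.0, 20.0, 5.0],
                [20.0, 40.0, 11.0],
                [5.0, 10.0, 2.0],
                [8.0, 16.0, 4.4]])
X = normalize_spectra(raw, ref_index=1)
S, A_m, eigvals = pca_by_threshold(X, threshold=0.95)
print("normalized matrix X:\n", X)
print("number of principal components kept:", S.shape[1])
```

In this sketch every line is divided by the same reference line, so a change that scales a whole spectrum by a common factor cancels out; this is one way to read the statement that normalization suppresses fluctuating experimental conditions.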
The present invention has the following advantages and prominent technical effects:
With LIBS, the mass ablated from the jade sample is on the order of nanograms, so the damage is very small and the identification process has almost no effect on the condition of the jade, allowing essentially non-destructive identification. In practice only one spectrum needs to be acquired per sample, and a batch of samples takes only a few minutes from spectrum acquisition to completed analysis and identification, allowing rapid identification. Using LIBS spectral line data as the classification index replaces qualitative, experience-based judgement with a more scientific quantitative discrimination and significantly improves the correctness of origin identification. Pre-processing by spectral normalization avoids the influence of fluctuating experimental conditions on the identification of the place of origin. Pre-processing the raw data by PCA retains only the important principal components and removes unnecessary dimensions, greatly reducing the time and space complexity of the model computation. Modelling the principal components with an SVM and classifying them to identify the place of origin combines the strong classification ability of the SVM with the dimension reduction of PCA, so that high identification accuracy can be obtained.
Brief Description of the Drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2a shows the distribution of the samples from each place of origin, plotted with the first and second principal components as the horizontal and vertical coordinates; Fig. 2b shows the percentage of the sum of all eigenvalues accounted for by the eigenvalue of each of the first seven principal components.
Fig. 3 shows the origin identification results.
Detailed Description of the Embodiments
The present invention is further described below with reference to the accompanying drawings and an embodiment.
The method provided by the present invention for identifying the place of origin of jade by combining spectral normalization with multivariate statistical models comprises the following steps:
1) Take a group of jade samples of known place of origin as calibration samples for modelling. Samples from the same place of origin are assigned to the same class, and samples from different places of origin are assigned to different classes;
2) Measure the calibration samples with a laser-induced breakdown spectroscopy (LIBS) experimental system to obtain the spectra of this group of calibration samples, which contain the LIBS characteristic spectral lines of the various elements in each calibration sample and the intensities of these lines, mainly the atomic and ionic lines of elements such as Ca, Mg and Si and their intensities;
3) Normalize the LIBS spectra of all calibration samples: select a characteristic spectral line of relatively high intensity as the standard characteristic line, for example the atomic line of Ca at a wavelength of 616.129 nm, which has high intensity and a good line shape and is therefore suitable as the standard characteristic line; for the spectrum of each calibration sample, divide the intensity of every characteristic line by the intensity of the standard characteristic line and keep the quotient as the normalized intensity, forming the normalized characteristic-line intensity matrix X,
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \tag{1}$$
where n is the number of jade samples used for calibration, p is the number of characteristic spectral lines, and $x_{i1}, x_{i2}, \ldots, x_{ip}$ are the normalized intensities of the characteristic lines of the i-th jade sample;
4) Perform principal component analysis on the matrix X to extract the principal components: diagonalize the matrix, i.e. find an orthogonal matrix A such that
$$A^{T} X^{T} X A = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_p \end{bmatrix} \tag{2}$$
where $A^{T}$ and $X^{T}$ denote the transposes of the matrices A and X respectively, and $\lambda_1, \lambda_2, \ldots, \lambda_p$ are the eigenvalues on the diagonal, sorted in descending order, i.e. $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$;
Select the first m eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ such that their sum is greater than or equal to 95% of the sum of all eigenvalues (the smallest eigenvalues, which make up the remaining 5%, are discarded; the dimensions corresponding to these eigenvalues contribute very little to the classification and can be removed directly. The discarded percentage can be adjusted, usually between 5% and 10%, which both ensures that the extracted principal components carry sufficient information and avoids mixing in useless components), i.e.
$$\frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{p} \lambda_i} \ge 95\% \tag{3}$$
The dimensions corresponding to these first m eigenvalues are the first m principal components of the matrix X, denoted $S_1, S_2, \ldots, S_m$, which satisfy
$$S_1 = X A_1,\quad S_2 = X A_2,\quad \ldots,\quad S_m = X A_m \tag{4}$$
where $A_1, A_2, \ldots, A_m$ are the 1st, 2nd, ..., m-th columns of the orthogonal matrix A;
The principal component matrix obtained by the principal component analysis is denoted S, so that
$$S = \begin{bmatrix} S_1 & S_2 & \cdots & S_m \end{bmatrix} = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1m} \\ s_{21} & s_{22} & \cdots & s_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n1} & s_{n2} & \cdots & s_{nm} \end{bmatrix} \tag{5}$$
where n is the number of jade samples used for calibration, m is the number of selected eigenvalues, i.e. the number of principal components, and $s_{i1}, s_{i2}, \ldots, s_{im}$ are the first m principal components of the i-th calibration sample. These m principal components are in fact linear combinations of the original spectral intensity data, but compared with the original spectrum they further extract the information useful for identifying the place of origin of the jade sample and remove useless information such as noise;
5) Suppose the n jade samples used for calibration come from k places of origin, the i-th place of origin containing $C_i$ samples, i.e.
$$C_1 + C_2 + \cdots + C_k = n \tag{6}$$
A model is built and calibrated with the support vector machine method, classifying the n calibration samples by place of origin, two classes at a time;
First regard the samples from the first place of origin (for example, jade originating from Luodian) as one class and the samples from the remaining k-1 places of origin (jade from other places of origin such as Xinjiang, Qinghai, Russia and Korea) as the other class. For these two classes, apply the support vector machine method, i.e. find an m-dimensional vector ω and a constant b such that the principal components $[s_{i1}\ s_{i2}\ \cdots\ s_{im}]$ of every jade sample i from the first place of origin satisfy
$$\omega^{T}[s_{i1}\ s_{i2}\ \cdots\ s_{im}] + b \ge +1 \tag{7}$$
and there exists a sample for which equality holds, i.e. the principal components $[s_{i^{*}1}\ s_{i^{*}2}\ \cdots\ s_{i^{*}m}]$ of some jade sample $i^{*}$ from the first place of origin satisfy
$$\omega^{T}[s_{i^{*}1}\ s_{i^{*}2}\ \cdots\ s_{i^{*}m}] + b = +1 \tag{8}$$
The principal components $[s_{j1}\ s_{j2}\ \cdots\ s_{jm}]$ of every jade sample j in the other class satisfy
$$\omega^{T}[s_{j1}\ s_{j2}\ \cdots\ s_{jm}] + b \le -1 \tag{9}$$
and there exists a sample for which equality holds, i.e. the principal components $[s_{j^{*}1}\ s_{j^{*}2}\ \cdots\ s_{j^{*}m}]$ of some jade sample $j^{*}$ in the other class satisfy
$$\omega^{T}[s_{j^{*}1}\ s_{j^{*}2}\ \cdots\ s_{j^{*}m}] + b = -1 \tag{10}$$
The two classes of samples are thus separated by the hyperplane $\omega^{T}s + b = 0$, where s is a vector of principal components; the two classes lie on opposite sides of this hyperplane, and the separation distance is $\frac{2}{\|\omega\|}$, where $\|\omega\|$ denotes the norm of the vector ω.
More than one pair of vector ω and constant b satisfies the above conditions. Take the pair for which $\|\omega\|$ is smallest, i.e. for which the separation distance $\frac{2}{\|\omega\|}$ between the two classes is largest and the distinction most pronounced, and denote it $(\omega_1^{*}, b_1^{*})$; this is the best way of dividing the jade samples of the first place of origin from those of the remaining k-1 places of origin;
After the jade samples of the first place of origin have been separated, regard the samples of the second place of origin as one class (for example, after the Luodian jade has been separated, the Xinjiang jade samples are separated next) and the samples of the remaining k-2 places of origin as the second class, divide them in the same way, and record the corresponding $(\omega_2^{*}, b_2^{*})$; continue in this way until the samples of all places of origin have been separated;
6) Take a group of jade samples of unknown place of origin as test samples and predict their origin as follows:
First measure the test samples with the LIBS experimental system to obtain their spectra, and normalize the spectra to obtain the normalized characteristic-line intensity matrix X′. Extract the first m principal components of X′, denoted S′, so that
$$S' = \begin{bmatrix} s_{11}' & s_{12}' & \cdots & s_{1m}' \\ s_{21}' & s_{22}' & \cdots & s_{2m}' \\ \vdots & \vdots & \ddots & \vdots \\ s_{q1}' & s_{q2}' & \cdots & s_{qm}' \end{bmatrix} \tag{11}$$
where q is the number of jade samples used for testing, m is the number of selected eigenvalues, i.e. the number of principal components, and $s_{i1}', s_{i2}', \ldots, s_{im}'$ are the principal components of the i-th test sample;
Classify the q test samples with the support vector machine method, two classes at a time, to predict their places of origin. First separate out the samples of the first place of origin: for the i-th jade sample, if its principal components $[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}']$ satisfy $(\omega_1^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_1^{*} \ge 0$, it is regarded as a sample of the first place of origin; if they satisfy $(\omega_1^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_1^{*} < 0$, it is regarded as a sample of the other places of origin. Then separate out the samples of the second place of origin from the samples of the other places of origin, i.e. for the i-th jade sample,
if its principal components satisfy $(\omega_2^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_2^{*} \ge 0$, it is regarded as a sample of the second place of origin;
if they satisfy $(\omega_2^{*})^{T}[s_{i1}'\ s_{i2}'\ \cdots\ s_{im}'] + b_2^{*} < 0$, it is regarded as a sample of the other places of origin. Continue in this way until the places of origin of all test samples have been predicted;
7) Compare the predicted place of origin of each test sample with its true place of origin to verify the correctness of the origin identification.
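The sequential decision rule of step 6) and the separation distance of step 5) can be sketched in plain Python as follows. This is an illustration only: the function names `margin` and `predict_origin` are hypothetical, the hyperplanes in the example are made-up two-dimensional values rather than trained results, and the sketch assumes that k places of origin need k-1 pairs $(\omega_t^{*}, b_t^{*})$, with the last place of origin being whatever remains after the final split.

```python
import math


def margin(omega):
    """Separation 2 / ||omega|| between the planes omega.s + b = +1 and omega.s + b = -1."""
    return 2.0 / math.sqrt(sum(w * w for w in omega))


def predict_origin(s, classifiers, origins):
    """Apply the (omega_t, b_t) pairs in order; the first non-negative decision wins.

    s           -- principal components of one test sample
    classifiers -- list of (omega, b) pairs, one per origin except the last
    origins     -- list of k origin names, in the order they were separated
    """
    for (omega, b), origin in zip(classifiers, origins):
        decision = sum(w * x for w, x in zip(omega, s)) + b
        if decision >= 0:
            return origin
    return origins[-1]  # assumption: the sample belongs to the remaining origin


# Made-up example with m = 2 principal components and k = 3 origins.
origins = ["Luodian", "Xinjiang", "Qinghai"]
classifiers = [((1.0, 0.0), -2.0),   # hypothetical (omega_1*, b_1*)
               ((0.0, 1.0), -2.0)]   # hypothetical (omega_2*, b_2*)

for sample in [(3.0, 0.0), (0.0, 3.0), (0.0, 0.0)]:
    print(sample, "->", predict_origin(sample, classifiers, origins))
print("margin for omega = (3, 4):", margin((3.0, 4.0)))
```

The order of the classifiers matters in this scheme: a sample is only tested against the second hyperplane if the first one has already assigned it to "the other places of origin".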
Embodiment: origin identification of Khotan jade samples from 5 places of origin.
The places of origin of 638 Khotan jade samples were identified, of which 114 originated from Luodian, 114 from Xinjiang, 110 from Qinghai, 150 from Russia and 150 from Korea.
The main steps are as follows:
1) First, 500 Khotan jade samples of known place of origin (from Luodian, Xinjiang, Qinghai, Russia and Korea, 100 samples per place of origin) were used as calibration samples to build the model. Each sample was measured with the LIBS experimental system to obtain its spectrum, and the characteristic spectral lines were identified. These mainly include atomic lines of Mg at 285.2 nm, 382.9 nm, 383.2 nm, etc., atomic lines of Ca at 487.812 nm, 616.129 nm, 714.814 nm, etc., and atomic and ionic lines of Si at 288.157 nm, 265.977 nm, etc. After background subtraction, the integrated area of each spectral line region was taken as the line intensity.
2) The atomic line of Ca at 616.129 nm was selected as the standard spectral line, and the spectra were normalized.
3) The line intensity data of all calibration samples were pre-processed by PCA. The eigenvalues of the first five principal components account for 57.5%, 25.9%, 7.1%, 4.0% and 1.8% of the sum of all eigenvalues respectively; their cumulative share reaches 96.3%, exceeding 95%, so the first 5 principal components of each sample were extracted.
4) A model was built with the SVM method, dividing the 500 calibration samples by place of origin, two classes at a time. The model accuracy reached 100%, i.e. the place of origin of every calibration sample was correctly identified during modelling.
5) To verify the correctness of the identification method, 138 Khotan jade samples whose places of origin were treated as unknown (14 from Luodian, 14 from Xinjiang, 10 from Qinghai, 50 from Russia and 50 from Korea) were used as test samples, and their places of origin were predicted with the established model. The accuracy of the predictions also reached 100%, giving a good origin identification result.
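The figures quoted in this embodiment can be cross-checked with a few lines of Python. The script below is only an arithmetic check of the numbers stated above, not part of the identification method; the variable names are hypothetical.

```python
# Eigenvalue shares (percent of the eigenvalue sum) of the first five principal components.
shares = [57.5, 25.9, 7.1, 4.0, 1.8]

# Find the smallest m whose cumulative share reaches the 95% threshold of equation (3).
cumulative = 0.0
m = 0
for share in shares:
    cumulative += share
    m += 1
    if cumulative >= 95.0:
        break
print("principal components kept:", m)
print("cumulative share (%):", round(cumulative, 1))

# Sample counts per place of origin: calibration set, test set and overall total.
calibration = {"Luodian": 100, "Xinjiang": 100, "Qinghai": 100, "Russia": 100, "Korea": 100}
test = {"Luodian": 14, "Xinjiang": 14, "Qinghai": 10, "Russia": 50, "Korea": 50}
total = {"Luodian": 114, "Xinjiang": 114, "Qinghai": 110, "Russia": 150, "Korea": 150}

# Each origin's calibration and test samples should add up to its stated total.
consistent = all(calibration[o] + test[o] == total[o] for o in total)
print("calibration:", sum(calibration.values()),
      "test:", sum(test.values()),
      "total:", sum(total.values()),
      "consistent:", consistent)
```

Four components give a cumulative share of 94.5%, just below the threshold, which is why the fifth component is needed.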
Fig. 2a shows the distribution of the samples from each place of origin, plotted with the first and second principal components as the horizontal and vertical coordinates; Fig. 2b shows the percentage of the sum of all eigenvalues accounted for by the eigenvalue of each of the first seven principal components.
Fig. 3 shows the origin identification results: for the jade samples from the 5 places of origin, the identification accuracy reached 100% in both calibration and verification.