WO2017063174A1 - A method for quantitatively determining the origin of nephrite - Google Patents

A method for quantitatively determining the origin of nephrite

Info

Publication number
WO2017063174A1
Authority
WO
WIPO (PCT)
Prior art keywords
lda
discriminant
origin
group
sample
Application number
PCT/CN2015/092024
Other languages
English (en)
French (fr)
Inventor
罗泽敏
沈锡田
杨明星
Original Assignee
中国地质大学(武汉)
Application filed by 中国地质大学(武汉)
Publication of WO2017063174A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N27/00 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
    • G01N27/62 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating the ionisation of gases, e.g. aerosols; by investigating electric discharges, e.g. emission of cathode
    • G01N27/622 Ion mobility spectrometry
    • G01N27/623 Ion mobility spectrometry combined with mass spectrometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/64 Fluorescence; Phosphorescence
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N27/00 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
    • G01N27/62 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating the ionisation of gases, e.g. aerosols; by investigating electric discharges, e.g. emission of cathode
    • G01N27/68 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating the ionisation of gases, e.g. aerosols; by investigating electric discharges, e.g. emission of cathode using electric discharge to ionise a gas
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00

Definitions

  • The invention belongs to the field of origin identification, and particularly relates to a method that quantitatively determines the origin of nephrite by quantitatively measuring the trace element content of a sample and applying statistical binary iterative linear discriminant analysis.
  • East Asian countries, especially China, have had a strong interest in and love for nephrite since ancient times.
  • The history of nephrite use in China dates back to the Neolithic Age (Wen and Jing, 1996; Tsien et al., 1996; Wen, 2001; Harlow and Sorensen, 2005).
  • Nephrite was regarded as a symbol of the power and wealth of emperors and nobility, and as a representation of good character.
  • Today, nephrite jewelry from China, Russia, South Korea and other places is still popular in the Chinese jade market and has become a focus of gemstone research.
  • Research on the geological source of nephrite is of great significance for quality grading, price evaluation, and tracing the provenance of ancient jade.
  • Nephrite is a polycrystalline aggregate of gem-quality tremolite or actinolite minerals. According to its metallogenic mechanism, it is generally divided into marble-type nephrite and serpentine-type nephrite (Harlow and Sorensen, 2005). The former is mainly produced in East Asia, for example Xinjiang and Qinghai in China, Baikal in Russia, and Chuncheon in South Korea. The latter is mainly produced in New Zealand, Canada, Taiwan (China), the United States and other places.
  • The two types of nephrite are easy to distinguish because they differ significantly in Fe, Cr, Co, Ni, and oxygen and strontium isotopes (Yui and Kwon, 2002; Harlow and Sorensen, 2005; Siqin et al., 2012; Adamo and Bocchio, 2013). Marble-type nephrite has a low content of these coloring elements and is generally light in color, often appearing as white jade, bluish-white jade, green jade, yellow jade, etc.; serpentine-type nephrite is generally dark green and has high contents of Fe, Cr, Co, Ni and so on.
  • Figure 1 shows marble-type nephrite from four different origins in the East Asian market. The samples are very similar in appearance, color, transparency, and luster, and in conventional gemological parameters such as relative density, refractive index, and microscopic characteristics.
  • Trace elements in minerals often carry fingerprint information about the geological environment in which the mineral formed, which can be used to trace the origin of gemstone minerals (Breeding and Shen, 2010; Blodgett and Shen, 2011; Shen et al., 2011; Zhong et al., 2013).
  • However, the trace element contents of different origins may overlap to varying extents.
  • LDA: Linear Discriminant Analysis
  • The basic idea of LDA is, given a known sample classification, to weigh the grouping (discriminating) ability of each variable and to screen out the combination of characteristic variables that gives the best spatial projection direction, that is, the Fisher linear discriminant function.
  • When each group of data is projected onto this direction, the difference between groups is the largest and the difference within each group is the smallest.
  • The LDA method has been used to discriminate the geological sources of certain single-crystal gemstones, such as copper-bearing Paraíba tourmaline, ruby, and sapphire (Blodgett and Shen, 2011), but its use for classifying nephrite origins has not yet been reported.
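  • The Fisher projection idea described above can be sketched in a few lines. This is a minimal illustration for two groups and two variables, not the patent's implementation: the function names and the numeric values are invented for the example, and the direction is computed with the textbook formula w = Sw^-1 (m1 - m2).

```python
# Minimal two-group Fisher LDA for two variables (illustrative only).
# All data values below are invented; they are not measurements from the patent.

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter_matrix(rows, mean):
    # 2x2 within-group scatter: sum of outer products of deviations from the mean
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def fisher_direction(group1, group2):
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    sw = [[s1[a][b] + s2[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    # w = Sw^-1 (m1 - m2): maximizes between-group over within-group spread
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, x):
    return w[0] * x[0] + w[1] * x[1]

origin_a = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.8, 2.1)]  # hypothetical contents
origin_b = [(3.0, 0.9), (3.4, 1.2), (2.8, 0.7), (3.1, 1.1)]  # hypothetical contents
w = fisher_direction(origin_a, origin_b)
scores_a = [project(w, x) for x in origin_a]
scores_b = [project(w, x) for x in origin_b]
print(min(scores_a) > max(scores_b))  # the projected groups do not overlap
```

  • With more than two variables the same formula applies, but the 2x2 inverse would be replaced by a general linear solve.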
  • The object of the present invention is to propose a method for quantitatively determining the origin of nephrite.
  • A method for quantitatively determining the origin of nephrite, comprising the steps of:
  • The quantitative test methods include laser ablation inductively coupled plasma mass spectrometry, laser-induced breakdown spectroscopy, glow discharge mass spectrometry, external-beam proton excitation, secondary ion mass spectrometry, and X-ray fluorescence spectrometry.
  • The sample test points carrying trace element content information are divided into two groups, a "training group" and a "test group", each of which covers nephrite samples from all origins;
  • Descriptive analysis is performed on the trace element contents, obtained in step 2), of all test points from the N producing areas to check whether there are statistical differences between producing areas. Specifically, a multivariate analysis of variance (or F-value test) or a chi-square test can be used. At the 95% confidence level, the equality of group means is tested for each trace element variable. If the significance level sig < 0.05, the group means of that variable are not equal and the variable can be used in the next step of discriminant analysis.
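  • As a rough illustration of the group-mean test described above, the sketch below computes a one-way F value for each element and keeps the elements whose F value exceeds a critical value. It is a simplification (a univariate F test per element rather than a full multivariate analysis of variance); the element names, contents, and the helper `f_value` are hypothetical, and the critical value 4.26 is the tabulated F(0.05; 2, 9) for this toy layout of 3 groups with 4 points each.

```python
def f_value(groups):
    """One-way ANOVA F value for one variable measured in several groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical contents (ppm) of two elements at 4 test points in each of 3 origins
contents = {
    "Sr": [[10, 11, 9, 10], [20, 21, 19, 20], [30, 29, 31, 30]],
    "Zr": [[5, 6, 4, 5], [5, 4, 6, 5], [6, 5, 4, 5]],
}
F_CRITICAL = 4.26  # F table value for alpha = 0.05, df = (2, 9)

significant = [name for name, groups in contents.items()
               if f_value(groups) > F_CRITICAL]
print(significant)  # elements whose group means differ at the 95% level
```

  • In this toy data only "Sr" passes, because the three "Zr" group means are identical.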
  • IB-LDA: binary iterative discriminant analysis
  • In each two-group comparison, the statistical software screens out the characteristic trace elements that best separate the two groups and gives the discrimination probability.
  • In IB-LDA, all the sample points of each origin in the "training group" are in turn taken as the first group and compared with the remaining origins packaged together (the second group), and the present invention extracts the origin with the highest discrimination probability as the origin screened out in the first round of IB-LDA.
  • The remaining origins then enter the second round of IB-LDA analysis, which uses the same pairwise comparison to screen out the origin with the highest discrimination probability in that round, and the process continues until every origin has been screened out.
  • For N origins, a total of N-1 rounds of screening are performed.
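  • The iterative screening described above can be sketched as a simple loop. This is only an outline under stated assumptions: `separability` is a hypothetical callback standing in for the two-group discriminant analysis and its discrimination probability, and the toy version used here (distance between invented one-dimensional group means) is not the patent's statistic.

```python
def ib_lda_order(origins, separability):
    """Screen out one origin per round; N origins need N-1 rounds.

    `separability(origin, others)` is a hypothetical callback returning the
    discrimination probability of `origin` against the pooled `others`.
    """
    remaining = list(origins)
    rounds = []
    while len(remaining) > 1:
        scored = [(separability(o, [r for r in remaining if r != o]), o)
                  for o in remaining]
        best_score, best = max(scored)  # ties are broken by origin name
        rounds.append((best, best_score))
        remaining.remove(best)
    # The last round separates the final two origins from each other.
    return rounds, remaining[0]

# Toy stand-in: invented 1-D group means; a larger gap means easier separation
toy_means = {"A": 0.0, "B": 1.0, "C": 1.5, "D": 5.0}

def toy_separability(origin, others):
    return min(abs(toy_means[origin] - toy_means[o]) for o in others)

rounds, last = ib_lda_order(toy_means, toy_separability)
print(rounds, last)
```

  • Four toy origins give three rounds, matching the N-1 count in the text; the most isolated origin ("D") is screened out first.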
  • The characteristic trace elements and discriminant coefficients given by IB-LDA constitute the binary iterative canonical discriminant function DFY_i, referred to as the discriminant function.
  • The absolute value of a discriminant coefficient reflects the importance of the variable to the group discrimination;
  • The trace elements and the classification coefficients constitute the binary iterative classification functions CF_i1 and CF_i2, where the subscript i is the round of IB-LDA being performed.
  • The discriminant functions and classification functions of the origins screened out in sequence form the IB-LDA discriminant model, and the classification functions allow rapid classification.
  • "Test group" sample points are taken from each producing area, their trace element contents are substituted into the classification functions of the origins screened out in step A, the origin of each "test group" sample point is judged according to the scores and compared with its actual origin, and the feasibility of the IB-LDA discrimination method is tested by the error rate (ER).
  • ER: error rate
  • Select a nephrite sample of unknown origin from the market and record it as the "sample to be tested".
  • Perform the trace element test on its flat polished surface according to steps 1) and 2), and substitute the trace element content data into the classification functions established by IB-LDA in step 4).
  • The origin of the "sample to be tested" is judged according to the score values, and the discrimination accuracy is given;
  • Each producing area is regarded as an independent group, numbered 1 to N in turn, and each trace element is regarded as an independent variable. The trace element information of each origin group is input into the statistical software, and linear discriminant analysis (LDA) gives the trace element variables, discriminant function coefficients and classification function coefficients that distinguish the N groups of data to the maximum extent, from which the discriminant functions and the classification functions are constructed;
  • LDA: linear discriminant analysis
  • Step 1): In the field, collect nephrite samples from N known origins (N is the number of origins, a positive integer from 2 to 50), ensuring that the sample sources are reliable. A certain number of representative samples are selected from each producing area and prepared with two parallel, well-polished faces.
  • The trace elements referred to in step 2) are defined relative to the main elements.
  • An element written in the chemical formula of the mineral is called a main element, and a minor element that is present in the mineral but not included in its chemical formula is called a trace element.
  • The chemical formula of nephrite is generally Ca₂(Mg,Fe)₅(Si₈O₂₂)(F,OH)₂, and elements other than Ca, Mg, Fe, Si, O, H, and F are regarded as trace elements.
  • The geological fingerprint characteristics carried by trace elements are of great value for determining gemstone origin (Breeding and Shen, 2010; Blodgett and Shen, 2011; Zhong et al., 2013).
  • The present invention preferably uses a highly sensitive laser ablation inductively coupled plasma mass spectrometer (LA-ICP-MS) to accurately determine the trace element content of all samples.
  • The test sample points are divided into two groups, a "training group" and a "test group", each containing samples from all origins. The "training group" sample points and their trace element information are used to establish the origin discrimination model; the "test group" sample points and their trace element information are used to test the feasibility of the model built from the "training group".
  • In step 2), the sample points of each origin in the "training group" account for 1/2-2/3 of the total sample points, and the sample points of each origin in the "test group" account for 1/3-1/2 of the total sample points.
  • The present invention does not exclude random grouping of the sample quantities for the "training group" and the "test group".
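  • A per-origin random split of this kind might look like the following sketch. The function name, the fixed seed, and the placeholder data (8 origins with 60 test points each) are assumptions made for illustration; only the 2/3 to 1/3 proportion comes from the text.

```python
import random

def split_by_origin(points_by_origin, train_fraction=2 / 3, seed=0):
    """Randomly split each origin's test points into training and test groups."""
    rng = random.Random(seed)
    train, test = {}, {}
    for origin, points in points_by_origin.items():
        shuffled = list(points)
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train_fraction)
        train[origin] = shuffled[:n_train]
        test[origin] = shuffled[n_train:]
    return train, test

# Placeholder data: 8 hypothetical origins with 60 test-point ids each
data = {f"origin_{k}": list(range(60)) for k in range(1, 9)}
train, test = split_by_origin(data)
n_train = sum(len(v) for v in train.values())
n_test = sum(len(v) for v in test.values())
print(n_train, n_test)  # 320 160
```

  • Splitting within each origin, rather than over the pooled points, guarantees that both groups contain every origin, as the method requires.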
  • The laser energy density used in the measurement is 5-40 J/cm², preferably 5-10 J/cm²; the laser spot size is 16-120 μm, preferably 32-44 μm; and 34-65 trace elements are measured.
  • The external reference samples used are one or more of the NIST, USGS, MPI-DING, CGSC, and GSJ natural or synthetic glass standard series, and ²⁹Si is selected as the internal standard for data processing. Other experimental parameter settings and the method for converting signals to trace element contents follow conventional LA-ICP-MS practice, as described in the references (Liu et al., 2008, 2010).
  • Step 3) uses a multivariate analysis of variance or a chi-square test to test the equality of the group means of each trace element across the N producing areas.
  • If the significance level sig < 0.05, the group means of the variable are not equal, there is a significant difference between the groups, and it is meaningful to carry out the next step of discriminant analysis.
  • A confidence level of 99% or higher can also be used. If the significance level sig < 0.01, the trace element variable differs very significantly between the groups.
  • the present invention does not exclude the use of chi-square tests or other methods for performing statistical significance tests.
  • Each round of IB-LDA or LDA can determine the trace element variables either by the direct method ("enter independent variables together") or by the stepwise screening method, which adds variables step by step.
  • In the "stepwise screening method", one of the "Wilks' Lambda", "Mahalanobis distance", "unexplained variance", "smallest F ratio", and "Rao's V" methods may be used, according to each trace element pair.
  • When F values are used as the criterion, the system default critical F value for entering a variable is 3.84, and the critical F value for removing a variable is 2.71.
  • When the probability of F is used as the criterion, the system default for entering a variable is 0.05, and the default for removing a variable is 0.10;
  • IB-LDA extracts, by means of two-group discriminant analysis, the trace elements that contribute to separating the two groups together with the corresponding discriminant coefficients and classification coefficients.
  • The characteristic trace elements and their discriminant coefficients combine into an optimal projection direction, along which the difference between the two groups is maximized and the difference within each group is minimized.
  • This projection direction corresponds to the canonical discriminant function DFY_i in IB-LDA, which is called the discriminant function.
  • Its mathematical expression is shown in formula (1):
  • The characteristic variables correspond to the characteristic trace elements contained in nephrite; a_1k, a_2k, a_3k, ..., a_mk are the linear discriminant coefficients of the characteristic elements, and b_k is the constant term of the linear discriminant function.
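  • Formula (1) itself is not reproduced in this text. Given the coefficients and constant term defined above, a linear function of the following general form would be consistent with the description; the symbols X_1, ..., X_m for the contents of the m characteristic trace elements, and the exact layout, are assumptions rather than the patent's own notation:

```latex
DFY_k = a_{1k} X_1 + a_{2k} X_2 + a_{3k} X_3 + \cdots + a_{mk} X_m + b_k
```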
  • The discrimination probability of the origin screened out in each round of IB-LDA can be expressed by the cross-validation accuracy CV_i (cross-validation correctly classified rate), where i is the round of IB-LDA being performed.
  • Cross-validation is based on the "leave one (or several) out" principle: in the discriminant analysis, one or several sample points are randomly withheld as "omitted samples", and the discriminant function is established with the remaining sample points.
  • CV_ori is defined as the product of the two-group cross-validation accuracies from CV_1 of the first round through CV_i of the i-th round.
  • CV_IB-total is defined as the arithmetic mean of the CV_ori values of all the screened origins, and its calculation formula is shown in (5):
  • The initial grouped cases correctly classified rate (OG_i) can also be used to indicate the discrimination probability of the origin screened out in each round of IB-LDA.
  • In this case no samples are withheld; all initial sample points are entered into the software to establish the discriminant function, and each sample point is then discriminated.
  • OG_IB-total is defined as the arithmetic mean of the OG_ori values of all the screened origins, as shown in (6) and (7):
  • $OG_{ori} = OG_1 \times \cdots \times OG_{i-1} \times OG_i$ (6)
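  • The product and mean definitions above are easy to check numerically. The sketch below uses invented per-round accuracies for a four-origin case; the function names are hypothetical, and the choice to give the last remaining origin the same value as the final round is an assumption made here (it matches the equal last two CV_ori values reported later in the text).

```python
def cumulative_accuracy(round_rates):
    """Accuracy of the origin screened out in round i: product of rounds 1..i."""
    result, running = [], 1.0
    for rate in round_rates:
        running *= rate
        result.append(running)
    return result

def total_accuracy(origin_rates):
    """Arithmetic mean over all origins (the 'IB-total' value)."""
    return sum(origin_rates) / len(origin_rates)

round_cv = [1.00, 0.99, 0.98]      # invented CV_1..CV_3 for 4 origins (3 rounds)
cv_ori = cumulative_accuracy(round_cv)
cv_ori.append(cv_ori[-1])          # assumption: last origin shares the final value
cv_total = total_accuracy(cv_ori)
print([round(v, 4) for v in cv_ori], round(cv_total, 4))
```

  • The same two functions apply unchanged to the OG_i values, since OG_ori and OG_IB-total are defined in the same way.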
  • Step 6) of the method of the present invention may be: comparing the relative merits of IB-LDA and LDA in the quantitative discrimination of nephrite origin.
  • The discriminating ability of a discriminant function is evaluated by one or more of the Wilks' Lambda value, the canonical correlation coefficient, and the eigenvalue.
  • Wilks' Lambda, referred to as the Λ value, is the within-group sum of squared deviations divided by the total sum of squared deviations. It is often used in discriminant tests; a smaller Λ value indicates that the discriminant function has a higher discriminating ability.
  • The canonical correlation coefficient is the square root of the between-group sum of squared deviations divided by the total sum of squared deviations, and the eigenvalue is the between-group sum of squared deviations divided by the within-group sum of squared deviations.
  • For a discriminant function, the larger the canonical correlation coefficient or the eigenvalue, the stronger its discriminating ability.
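  • The three quality measures above are all ratios of the same sums of squares, which the following sketch makes explicit for a single discriminant function. The function name and the toy discriminant scores are invented for illustration.

```python
import math

def discriminant_power(groups):
    """groups: one list of discriminant-function scores per group."""
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    ss_within = sum((s - sum(g) / len(g)) ** 2 for g in groups for s in g)
    ss_between = ss_total - ss_within
    return {
        "wilks_lambda": ss_within / ss_total,
        "canonical_correlation": math.sqrt(ss_between / ss_total),
        "eigenvalue": ss_between / ss_within,
    }

# Invented scores of two groups on one discriminant function
power = discriminant_power([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])
print(power)
```

  • Because the total sum of squares is the within-group plus the between-group part, Wilks' Lambda equals 1 / (1 + eigenvalue), so a small Lambda and a large eigenvalue express the same thing.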
  • For the "test group" samples, the misjudgment rate is expressed by ER (error rate).
  • The discrimination accuracy is preferably expressed by the cross-validation discrimination accuracy of the origin (CV_ori); the discrimination accuracy of the initial data classification (OG_ori) can also be used.
  • Step 5) of the method of the present invention may specifically be: take a nephrite sample to be tested, quantitatively determine its trace element contents, and substitute them into the IB-LDA discriminant model established in step 4), that is, substitute them in sequence into the classification functions of the screened origins and discriminate the origin according to the scores. The discrimination accuracy is preferably expressed by the cross-validation accuracy CV_ori of the identified origin, and can also be expressed by the initial grouped cases accuracy OG_ori.
  • The order in which the "sample to be tested" is substituted into the origin classification functions is the same as the order in which the classification function of each origin was established in step 4). If the actual origin of the sample to be tested is not among the origins used for IB-LDA modeling, misjudgment may occur, so the IB-LDA model should cover samples from as many nephrite origins as possible.
  • The statistical software used for IB-LDA in the present invention is the freely available R language.
  • The method of the present invention does not exclude the use of other software such as SPSS, MATLAB, S-PLUS, SAS, or Excel to perform IB-LDA operations.
  • The invention, a method for quantitatively determining nephrite origin based on binary iterative linear discriminant analysis (IB-LDA), has great advantages over traditional qualitative identification by the naked eye and over the LDA method.
  • IB-LDA: binary iterative linear discriminant analysis
  • The method of the present invention explores in depth the role of the geological information carried by trace elements in discriminating nephrite origin; the trace element contents are obtained by quantitative measurement with high-sensitivity laser ablation inductively coupled plasma mass spectrometry;
  • The method of the present invention improves the conventional LDA method and proposes a binary iterative discriminant analysis (IB-LDA) method.
  • IB-LDA: binary iterative discriminant analysis
  • The discriminant accuracy rate was 92.9%; 7 rounds of IB-LDA were performed on the 8 producing areas, and the arithmetic mean of the Wilks' Lambda values of the 7 discriminant functions was 0.152, significantly smaller than the arithmetic mean of 0.172 for the 7 discriminant functions obtained by traditional LDA from the 8 origins. IB-LDA randomly selected 5 "test group" sample points (4 test points per origin) and obtained an average cross-validation accuracy of 95%, that is, on average 95% of the sample points were correctly assigned to their actual origin, which verifies the feasibility of this technical method.
  • Figure 1 shows marble-type nephrite products from four different origins in the East Asian market.
  • (a), (b), (c), and (d) are from Hetian (Xinjiang), Golmud (Qinghai), Baikal (Russia), and Chuncheon (South Korea), respectively.
  • Figure 2 shows the distribution of the eight major marble-type nephrite producing areas in East Asia: Xinjiang West (Xinjiang, China), Xinjiang East (Xinjiang, China), Golmud (Qinghai, China), Baikal (Russia), Chuncheon (South Korea), Xiuyan (Liaoning, China), Luodian (Guizhou, China), and Liyang (Jiangsu, China).
  • Figure 3 shows photographs of samples from the eight major marble-type nephrite producing areas in East Asia. Each producing area is represented by one sample of representative color: (1) Xinjiang West; (2) Xinjiang East; (3) Golmud; (4) Baikal; (5) Chuncheon; (6) Xiuyan; (7) Luodian; (8) Liyang.
  • Figure 4 uses six rounds of IB-LDA as an example to illustrate the process of binary iterative discriminant analysis of nephrite origin (six rounds of IB-LDA were performed on the seven origins other than Liyang).
  • The abscissa and ordinate represent the classification functions of each round of IB-LDA.
  • Solid symbols of two different shapes represent the sample points of the origin screened out by IB-LDA and the "ungrouped" sample points in the "training group"; the corresponding hollow symbols represent the sample points of the screened origin and the "ungrouped" sample points in the "test group", labeled as "test points".
  • The elliptical wireframes represent the ranges of classification function scores of the "training group" sample points.
  • Figure 5 is a histogram comparing the cross-validation discrimination accuracy of IB-LDA and traditional LDA in identifying the 8 nephrite producing areas.
  • On the abscissa, LY, LD, XY, CC, XJW, BK, XJE, and GM represent the origins screened out in the 7 rounds of IB-LDA: Liyang, Luodian, Xiuyan, Chuncheon, Xinjiang West, Baikal, Xinjiang East, and Golmud, respectively; "total" represents the overall cross-validation accuracy of the eight producing areas.
  • Figure 6 illustrates how IB-LDA discriminates the origin of nephrite samples, using a children's shape-sorting block game as an analogy.
  • Figure 6a shows the process of establishing a discriminant model by first determining the "sieve" of each round of IB-LDA based on the "test set" building blocks;
  • Figure 6b shows the "test set" building blocks then being brought in sequence into the sieves of Figure 6a to test IB-LDA.
  • Embodiment 1: Binary Iterative Linear Discriminant Analysis (IB-LDA) Method and Discrimination Results
  • The samples in this example are a large number of nephrite samples collected over many years by the inventors' research team in the mining areas of marble-type nephrite in East Asia. Taking the eight most common nephrite origins as the research object, the IB-LDA method was used to establish the discrimination model. The geographical locations of the eight origins are shown in Figure 2. In Figure 2, the nephrite produced in Xinjiang is divided into Xinjiang West and Xinjiang East on the basis of significant differences in geological mineralization conditions (Tang Yanling et al., 1994).
  • Numbers 1-8 represent Xinjiang West (the mines in western Xinjiang, China, including the famous Hetian, Yecheng, and Moyuhe areas), Xinjiang East (eastern Xinjiang, China, including Qiemo and Ruoqiang), Golmud (Qinghai, China), Baikal (Russia), Chuncheon (South Korea), Xiuyan (Liaoning, China), Luodian (Guizhou, China), and Liyang (Jiangsu, China).
  • A total of 160 representative samples from the producing areas were selected and prepared with two parallel polished faces and uniform dimensions (as shown in Figure 3).
  • LA-ICP-MS: high-sensitivity laser ablation inductively coupled plasma mass spectrometer
  • The test results are converted to concentration values using working curves established with standard samples.
  • The laser energy density reaching the surface of the sample is 5-10 J/cm², and the laser spot size is 32-44 μm. Since the element contents of the standard samples are collected simultaneously during the measurement, the same test effect can be obtained for the samples.
  • The 480 LA-ICP-MS test points of the samples from the 8 producing areas were divided into two groups, a "training group" and a "test group".
  • The "training group" accounted for 2/3 of the total, 320 points, corresponding to 40 points per producing area.
  • The "test group" accounted for 1/3 of the total, 160 points, corresponding to 20 points per producing area.
  • Multivariate analysis of variance or an F-value test was applied to all test points of the 8 producing areas to determine whether there were statistical differences between producing areas.
  • The trace element contents of the 480 test points of the 8 producing areas are input into the statistical software. The 34 trace elements that are input exclude the first-row transition elements, because nephrite from a single origin may occur in several colors, and nephrite of different colors may differ greatly in its content of transition-metal coloring elements. Removing the first-row transition coloring elements minimizes the effect of color on the discrimination of nephrite origin.
  • IB-LDA: binary iterative linear discriminant analysis method
  • The first step of IB-LDA is to divide the test points of the "training group" origins into two groups each time and to carry out multiple rounds of two-group discriminant comparison.
  • In each comparison, all nephrite samples of one origin (the first group) and all samples of the "other origins package" (the second group) are subjected to two-group discriminant analysis, so that the difference between this origin and the other origins is reflected to the greatest extent.
  • Each producing area is in turn taken as the first group and compared with the other producing areas, and the origin that separates from the "other origins package" with the maximum cross-validation accuracy (CV_i) is screened out.
  • The remaining producing areas, called "ungrouped", enter the next round of IB-LDA analysis and repeat the above process until the last origin is identified.
  • For N origins, the binary iterative method requires N-1 rounds of IB-LDA analysis.
  • CF_i1 is the classification function of the origin screened out in round i of IB-LDA, and CF_i2 is the classification function of the "ungrouped" origins.
  • The second round of IB-LDA analysis was carried out on the remaining 7 producing areas.
  • [element] represents the content of the element.
  • "Test group" sample points: select the same number of "test group" sample points for each producing area and substitute their trace element information into the classification functions of each round of IB-LDA in order. First substitute them into the classification functions of the Liyang group and of the corresponding "ungrouped" group from the first round of IB-LDA; a "test group" sample point belongs to whichever group's classification function gives the higher score. That is, if a test point scores higher on the classification function of the Liyang group, it is judged to be a sample of Liyang origin and no further analysis is needed; otherwise it passes to the classification functions of the next round.
  • The entire procedure can be implemented as a computer program, so that the origin of a sample can be determined directly from its measured trace element contents.
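  • A program of the kind mentioned above could follow the sketch below, which passes a sample through the rounds in screening order and stops at the first round whose origin classification function scores higher than the "ungrouped" one. The origin labels, element names, coefficients, and test points are all invented, and the error-rate definition (misjudged points divided by all test points, as a percentage) is an assumption, since the patent's ER formula is not reproduced in this text.

```python
def score(coefficients, constant, sample):
    """Linear classification function: constant + sum(coefficient * content)."""
    return constant + sum(c * sample[e] for e, c in coefficients.items())

def classify(sample, rounds, last_origin):
    """rounds: [(origin, cf_origin, cf_ungrouped), ...] in screening order;
    each cf is a (coefficients_by_element, constant) pair."""
    for origin, cf_origin, cf_ungrouped in rounds:
        if score(*cf_origin, sample) > score(*cf_ungrouped, sample):
            return origin          # screened out in this round, stop here
    return last_origin             # never screened out: the final origin

def error_rate(predicted, actual):
    """Assumed definition: percentage of test points assigned to a wrong origin."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return 100.0 * wrong / len(actual)

# Hypothetical two-round model for three origins "A", "B", "C"
rounds = [
    ("A", ({"Sr": 2.0}, -10.0), ({"Sr": 1.0}, 0.0)),
    ("B", ({"Ba": 1.0}, -3.0), ({"Ba": 0.0}, 0.0)),
]
test_points = [{"Sr": 12.0, "Ba": 1.0}, {"Sr": 5.0, "Ba": 4.0}, {"Sr": 5.0, "Ba": 1.0}]
actual = ["A", "B", "B"]
predicted = [classify(p, rounds, "C") for p in test_points]
print(predicted, error_rate(predicted, actual))
```

  • The rounds must be applied in the same order in which the model was built, because each "ungrouped" classification function only describes the origins that were still unscreened at that round.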
  • The elliptical wireframes correspond to the classification function score ranges of the "training group" sample points; the gray shaded area is the classification function score range of the "ungrouped" origins.
  • Solid symbols of two different shapes represent, in the "training group", the sample points of the origin screened out in that round of IB-LDA and the "ungrouped" sample points; the corresponding hollow symbols represent the sample points of the screened origin and the "ungrouped" sample points in the "test group", labeled as "test points".
  • The CV_i marked in the lower right corner is the cross-validation accuracy of the nephrite origin screened out in each round of IB-LDA, without considering prior probabilities.
  • The diagonally hatched bars in Fig. 5 give the cross-validation discrimination accuracy (CV_ori) of each of the eight origins identified by IB-LDA and the total cross-validation discrimination accuracy CV_IB-total. If an origin is discriminated in the i-th round, its cross-validation discrimination accuracy (CV_ori) is the product of the two-group cross-validation accuracies from CV_1 of the first round through CV_i of the i-th round.
  • The CV_ori values were 100%, 99.5%, 98.4%, 96.4%, 93.6%, 92.6%, 89.6%, and 89.6%, respectively.
  • The overall probability that the eight origins were accurately discriminated, CV_IB-total, which is the arithmetic mean of the CV_ori values of all the screened origins, is 95.0%.
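  • The reported total can be checked directly from the eight CV_ori values listed above. The second calculation, which backs out per-round accuracies from the product definition, is only an inference made here; those per-round values are not stated in the text.

```python
# CV_ori values (%) of the eight origins, as listed in the text
cv_ori = [100.0, 99.5, 98.4, 96.4, 93.6, 92.6, 89.6, 89.6]

cv_total = sum(cv_ori) / len(cv_ori)
print(round(cv_total, 1))  # 95.0

# Inferred, not reported: per-round CV_i implied by the product definition
# (7 rounds; the last two origins are separated in the same final round).
implied_rounds = [cv_ori[0] / 100.0] + [cv_ori[i] / cv_ori[i - 1] for i in range(1, 7)]
print([round(r, 3) for r in implied_rounds])
```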
  • A children's shape-sorting block game is used as an illustration.
  • Assigning nephrite from different origins to the corresponding origin is similar to dropping blocks of different shapes into boxes with openings of the corresponding shapes.
  • Only the six rounds of IB-LDA that distinguish the seven producing areas other than Liyang are illustrated.
  • In Fig. 6(a), it is assumed that there are 7 building blocks of different shapes representing nephrite samples from seven different origins.
  • The "samples to be tested" of unknown origin are two nephrite products reported to come from Xinjiang (their specific origin within Xinjiang has not been determined).
  • According to step 2) of Example 1, smooth, flat areas of the sample surfaces were first selected for the LA-ICP-MS test; each sample was tested at 3 points, giving a total of 6 test points.
  • The trace element contents are substituted in sequence into the classification functions established by IB-LDA for the eight origins in step 4) of Example 1.
  • The "sample to be tested" belongs to the origin whose group classification function gives the higher score.
  • The result is that, after 7 rounds of IB-LDA analysis, the classification function scores all fall into the Xinjiang East range, so the "samples to be tested" are identified as coming from the eastern producing area of Xinjiang, and the discrimination accuracy, CV_ori7, is 89.6%.
  • This embodiment compares the discrimination performance of traditional LDA and IB-LDA.
  • For the traditional LDA analysis, the samples and trace element information were the same as in Example 1; that is, discriminant analysis was performed on the 480 sample test points from the 8 origins.
  • Unlike IB-LDA, traditional LDA treats the samples of each origin as one group and compares all groups simultaneously.
  • Its result therefore comes from a single round of discriminant analysis of the 8 origins, whereas IB-LDA gives the maximum discriminant probability of each origin relative to the other origins.
  • Step 1: each of the 8 origins is treated as an independent group.
  • The trace element contents of the 320 "training group" test points from the 8 origins are imported into the statistical software and discriminant analysis is selected, preferably with the "stepwise" method.
  • The stepwise method selects the characteristic trace element variables, discriminant function coefficients and classification function coefficients that best separate the 8 groups simultaneously, giving seven discriminant functions (DFY_1, DFY_2, ..., DFY_7) and eight classification functions (CF_1, CF_2, ..., CF_8).
  • Step 2: the same "test group" sample points as in the second step of the IB-LDA analysis, 4 per origin from the 8 origins, are substituted into the classification functions established in Step 1 of LDA. The origin of each "test group" point is judged from its classification function scores and compared with its actual origin, and the error rate is calculated to check the feasibility of the LDA method.
  • For correctly judged samples, the discriminant accuracy can be expressed by the cross-validation accuracy or by the original-grouped-cases accuracy.
  • The discriminating power of the discriminant functions established by IB-LDA and by traditional LDA is compared through the Wilks' lambda values, the canonical correlation coefficients or the eigenvalues.
  • The open (unhatched) bars in Fig. 5 give the results of traditional LDA for the 8 origins. With all 8 origins analysed simultaneously, the corresponding CV_total is 92.9%, lower than the CV_IB-total of 95.0% obtained by IB-LDA.
  • Compared with LDA, IB-LDA improved the discriminant accuracy for four origins: Luodian, Xiuyan, Xinjiang West and Xinjiang East.
  • For example, the Xinjiang West samples were discriminated in the fifth round of IB-LDA with a cross-validation accuracy CV_ori of 93.6%, clearly higher than the 80% cross-validation accuracy obtained for them by traditional LDA.
  • The mean Wilks' lambda of the seven discriminant functions obtained by IB-LDA is 0.152, while that of the seven discriminant functions of traditional LDA is 0.172. The former is clearly smaller, indicating that the IB-LDA used in this patent discriminates nephrite origin samples noticeably better than traditional LDA.
  • The discriminating power of the functions established by the two methods can also be compared through canonical correlation coefficients and eigenvalues.
  • The mean canonical correlation coefficient of the seven discriminant functions obtained by IB-LDA is 0.920, clearly larger than the mean of 0.834 for traditional LDA, which likewise confirms that IB-LDA discriminates better than traditional LDA.
  • The method of the invention, which quantitatively discriminates the origin of Hetian jade (nephrite) by trace-element-based iterative-binary linear discriminant analysis (IB-LDA), has great advantages over traditional qualitative visual identification and the traditional LDA method.
  • First, the role of trace elements in discriminating nephrite origin is explored in depth; the trace elements are measured quantitatively by high-sensitivity laser ablation inductively coupled plasma mass spectrometry.
  • Statistical analysis offers an important advantage in processing the large volume of trace element information from a large number of samples.
  • The invention successfully improves on the traditional LDA method by proposing the IB-LDA method.
  • For 480 sample points from 8 nephrite origins in East Asia, IB-LDA reaches a total cross-validation accuracy of 95.0%, clearly higher than the 92.9% obtained by traditional LDA.
  • The mean Wilks' lambda of IB-LDA is 0.152, clearly smaller than the mean of 0.172 for traditional LDA.
  • In five random draws of "test group" sample points (4 test points per origin per draw), the average error rate of IB-LDA was 5%; that is, on average 95% of the sample points were correctly assigned to their actual origin, which verifies the feasibility of the method.

Abstract

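A method for quantitatively discriminating the origin of nephrite, comprising the steps of: 1) preparing nephrite samples of known origin; 2) quantitatively measuring the trace element contents of the samples and dividing the test points into a "training group" and a "test group"; 3) analysing the significance level of the trace element contents to test whether statistical differences exist between origins; 4) applying iterative-binary linear discriminant analysis (IB-LDA) to the "training group" data for pairwise discrimination, and using the "test group" to check the feasibility of the method; 5) using IB-LDA to predict the origin of a "sample to be tested". By quantitatively discriminating nephrite origin through trace-element-based IB-LDA, the traditional simultaneous comparison of multiple groups is simplified to pairwise comparisons. This not only yields the discriminant probability of each origin relative to the others but also significantly improves the discriminating power of the discriminant functions, giving the method clear advantages over qualitative visual identification and traditional linear discriminant analysis.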

Description

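A method for quantitatively discriminating the origin of nephrite
Technical Field
The invention belongs to the field of origin identification, and specifically relates to a method for quantitatively discriminating the origin of nephrite by quantitatively measuring the trace element contents of samples and applying statistical iterative-binary linear discriminant analysis.
Background Art
East Asian countries, China in particular, have had a deep interest in and fondness for nephrite since ancient times. The Chinese use of nephrite can be traced back to the Neolithic (Wen and Jing, 1996; Tsien et al., 1996; Wen, 2001; Harlow and Sorensen, 2005). In ancient times nephrite was regarded as a symbol of the power and wealth of emperors and nobles and stood for the virtues of a gentleman. Today, nephrite ornaments from China, Russia, Korea and elsewhere remain highly popular in the Chinese jade market and have become a key subject of gemmological research. Research on the geological source of nephrite is of great significance for quality grading, price evaluation and provenance tracing of ancient jade artefacts.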
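Nephrite is a gem-quality polycrystalline aggregate of tremolite or actinolite. According to differences in ore-forming mechanism it is generally divided into marble-type nephrite and serpentinite-type nephrite (Harlow and Sorensen, 2005). The former occurs mainly in East Asia, chiefly in Xinjiang and Qinghai (China), Lake Baikal (Russia) and Chuncheon (Korea). The latter occurs mainly in New Zealand, Canada, China, Taiwan, the USA and elsewhere. The two types are relatively easy to tell apart because they differ markedly in Fe, Cr, Co, Ni and in oxygen and deuterium isotope contents (Yui and Kwon, 2002; Harlow and Sorensen, 2005; Siqin et al., 2012; Adamo and Bocchio, 2013). Marble-type nephrite is low in these chromophore elements and is generally light in colour, commonly appearing as white jade, greenish-white jade, green jade or yellow jade; serpentinite-type nephrite is generally spinach green and contains higher Fe, Cr, Co and Ni. Within serpentinite-type deposits, earlier literature indicates that differences in the composition of chromite inclusions and in strontium isotope content can serve as a basis for distinguishing serpentinite-type nephrite of different origins (Adams et al., 2007; Zhang and Gan, 2011; Zhang et al., 2012).
For marble-type nephrite there is as yet no widely accepted method for distinguishing origins. There are probably two main reasons: first, East Asia has many marble-type nephrite deposits (Yin et al., 2014); second, nephrite from these different deposits is very similar in appearance (Wu et al., 2002; Ling et al., 2013). How to distinguish them scientifically and effectively has long been a difficulty for the jewellery trade and for gemmology. For example, Fig. 1 shows marble-type nephrite from four different origins on the East Asian market. The pieces are very close in appearance (colour, transparency, lustre) and in conventional gemmological parameters such as relative density, refractive index and microscopic features, and are difficult to distinguish accurately by eye or with conventional instruments. Among the many marble-type nephrite deposits of East Asia, nephrite from Xinjiang (Xinjiang Hetian jade) is the most favoured by consumers and commands the highest market price. This is related to its warm, smooth texture and fine structure, and also to the prestige Xinjiang nephrite has accumulated in Chinese jade culture over a long time. Marble-type nephrite from other origins, such as Lake Baikal (Russia), Chuncheon (Korea) and Golmud (Qinghai, China), is relatively cheaper. Precisely because of the wide price gap caused by origin, nephrite from other origins is sometimes passed off as Xinjiang nephrite in the jade market, which disrupts the market and the trade.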
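In recent years the jade trade and gemmologists have been working to find ways of distinguishing origins. Some in the trade hold that the origin of nephrite can be judged to some extent by observing its structure with the naked eye or under magnification, but this approach depends on long experience and carries considerable subjective uncertainty. Some researchers have used optical microscopy or scanning electron microscopy to observe the microstructure of samples, or electron microprobe analysis, X-ray fluorescence spectrometry and similar methods to compare the contents of major elements (such as Ca, Mg, Si, F) between origins (Wu et al., 2002; Ling et al., 2013), but the origins overlap strongly and the results are poor. To date, no mature, systematic and quantitative method for distinguishing nephrite origin has been reported in the literature. To protect the legitimate interests of consumers and regulate the nephrite market, a scientific, quantitative method and standard for discriminating nephrite origin is urgently needed.
According to trace element geochemistry, the trace elements in a mineral often carry fingerprint information about the geological environment in which the mineral formed, and can be used to trace the origin of gem minerals (Breeding and Shen, 2010; Blodgett and Shen, 2011; Shen et al., 2011; Zhong et al., 2013). However, as the number of origins and the chemical complexity of the samples increase, trace element contents may overlap to varying degrees between origins, and inspecting the content of one or a few trace elements by eye can hardly extract the differences between many origins quickly and effectively (Siqin et al., 2012). Several trace elements and their joint effect then need to be considered, and the main task becomes choosing a suitable statistical method to extract and optimise the trace element variables that effectively separate the origins.
Linear discriminant analysis (LDA) is a statistical method widely used in China and abroad for discriminating and classifying multiple groups (Fisher, 1936; Yu and Yang, 2001; McLachlan, 2004; Guo et al., 2007). Its basic idea is that, with the sample classification known, characteristic variables are selected according to the weight (discriminating power) of each variable for the grouping and combined into an optimal projection direction, the Fisher linear discriminant function. When the data of each group are projected onto this direction, the differences between groups are maximised while the differences within groups are minimised. LDA has already been used to discriminate the geological source of some single-crystal gemstones, such as copper-bearing Paraiba tourmaline, ruby and sapphire (Blodgett and Shen, 2011), but its use for distinguishing nephrite origin has not yet been reported.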
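Summary of the Invention
In view of the shortcomings of the prior art, the object of the invention is to provide a method for quantitatively discriminating the origin of nephrite.
The object of the invention is achieved by the following technical solution:
A method for quantitatively discriminating the origin of nephrite, comprising the steps of:
1) Selecting origins and preparing samples
Nephrite samples from 2-50 origins of known, verified source are selected (covering as far as possible all nephrite origins discovered to date); the number of origins is denoted N, and the sample surfaces are well polished;
2) Quantitatively measuring the trace element contents of nephrite samples from different origins
The trace element contents of all nephrite samples are measured quantitatively by one or more of laser ablation inductively coupled plasma mass spectrometry, laser-induced breakdown spectroscopy, glow discharge mass spectrometry, external-beam proton excitation, secondary ion mass spectrometry and X-ray fluorescence spectrometry. The sample test points carrying the trace element information are divided into a "training group" and a "test group", both of which cover nephrite samples from all origins;
3) Preliminary analysis of the trace element contents
A descriptive analysis is carried out on the trace element contents of all test points of the N origins measured in step 2), to test whether statistical differences exist between origins. Specifically, multivariate analysis of variance (also called the F-test) or the chi-square test may be used to test, at 95% confidence, the equality of group means of each trace element variable. A significance level sig < 0.05 indicates that the group means of that variable are unequal and that the variable can be used in the subsequent discriminant analysis.
4) Analysing the trace elements by iterative-binary linear discriminant analysis
A. Building the iterative-binary linear discriminant analysis model from the "training group" data
Multiple rounds of iterative-binary discriminant analysis (IB-LDA) are carried out on the "training group" data points. Specifically, in each round of IB-LDA the "training group" data points are divided into two groups and a two-group linear discriminant analysis is carried out on their trace element data. In round 1, for example, all nephrite sample points of one known origin in the "training group" are labeled as the first group, and all nephrite sample points of the remaining N-1 origins are packaged together and labeled as the second group. Each trace element is treated as an independent variable, a two-group linear discriminant analysis is carried out, and the statistical software selects the characteristic trace elements that best separate the two groups and gives a discriminant probability. Within this round, every origin in the "training group" is in turn taken as the first group and compared with the package of other origins (the second group); the origin with the largest discriminant probability is extracted as the origin screened out by round 1 of IB-LDA.
For the remaining N-1 origins, round 2 of IB-LDA is started using the same pairwise comparison, which screens out the origin with the largest discriminant probability in round 2. The process continues until every origin has been screened out. For N origins, N-1 rounds of screening are carried out in total. For each screened-out origin, the characteristic trace elements and discriminant coefficients given by IB-LDA form the iterative-binary canonical discriminant function DF Y_i (the discriminant function for short); the absolute value of a discriminant coefficient reflects the importance of that variable for the grouping. The characteristic trace elements and classification coefficients form the iterative-binary classification functions CF_i1 and CF_i2, where the subscript i is the round of IB-LDA. The discriminant functions and classification functions of the origins screened out in turn make up the IB-LDA discriminant model; the classification functions allow rapid classification.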
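The screening loop described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the data are one-dimensional toy values, and a leave-one-out nearest-centroid rule stands in for the two-group LDA and its cross-validation accuracy, purely to keep the example self-contained.

```python
from statistics import mean

def loo_centroid_accuracy(group_a, group_b):
    """Leave-one-out accuracy of a 1-D nearest-centroid rule.

    Stand-in (assumption) for the two-group LDA cross-validation accuracy CV_i.
    """
    labelled = [(x, 0) for x in group_a] + [(x, 1) for x in group_b]
    correct = 0
    for k, (x, label) in enumerate(labelled):
        rest = labelled[:k] + labelled[k + 1:]
        centroids = [mean(v for v, g in rest if g == grp) for grp in (0, 1)]
        predicted = 0 if abs(x - centroids[0]) <= abs(x - centroids[1]) else 1
        correct += predicted == label
    return correct / len(labelled)

def ib_screening_order(data, score=loo_centroid_accuracy):
    """Return [(origin, CV_i), ...] in the order the origins are screened out."""
    remaining = dict(data)
    order = []
    while len(remaining) > 2:
        best = None
        for origin, points in remaining.items():
            # Package all other remaining origins into the second group.
            others = [x for o, pts in remaining.items() if o != origin for x in pts]
            cv = score(points, others)
            if best is None or cv > best[1]:
                best = (origin, cv)
        order.append(best)
        del remaining[best[0]]
    # Final round: the last two origins are separated from each other.
    (a, pa), (b, pb) = remaining.items()
    cv = score(pa, pb)
    order.extend([(a, cv), (b, cv)])
    return order

# Hypothetical toy data: three "origins" described by a single variable.
toy = {"A": [0.0, 0.1, 0.2], "B": [5.0, 5.1, 5.2], "C": [5.6, 5.7, 5.8]}
order = ib_screening_order(toy)
```

For N origins the loop runs N-2 one-versus-rest rounds plus one final two-origin round, N-1 rounds in all, matching the count given in the text. In practice the `score` argument would be replaced by a real two-group LDA with stepwise variable selection.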
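B. Checking the feasibility of the IB-LDA method with the "test group" samples
The same number of "test group" sample points is taken from each origin, and their trace element contents are substituted in turn into the classification functions of the origins screened out in step A. The origin of each "test group" sample point is judged from its scores and compared with its actual origin, and the feasibility of the IB-LDA method is checked through the error rate (ER).
5) Discriminating the origin of a "sample to be tested"
A nephrite sample of unknown origin from the market is selected and denoted the "sample to be tested". Its flat polished surface is tested for trace elements according to steps 1)-2), the trace element contents are substituted in turn into the classification functions established by IB-LDA in step 4), the origin of the "sample to be tested" is judged from the scores, and the discriminant accuracy is given;
6) Linear discriminant analysis of the trace elements of the N origins
For the data points of the N origins in the "training group", each origin is treated as an independent group, numbered 1 to N in turn, and each trace element is treated as an independent variable. The trace element information of each group is entered into the statistical software, and linear discriminant analysis (LDA) gives the trace element variables, discriminant function coefficients and classification function coefficients that best separate the N groups simultaneously, from which the discriminant functions and classification functions are constructed;
Then, for the "test group", the same proportion of data points is selected from each origin and substituted into the classification functions established from the "training group". The error rate and the discriminant accuracy of each "test group" sample point are calculated; the error rate can be used to evaluate the feasibility of the discriminant function model;
The relative merits of the two methods are measured by comparing the Wilks' lambda values, the canonical correlation coefficients or the eigenvalues of the discriminant functions established by LDA and IB-LDA.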
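In step 1), nephrite samples are collected in the field from N known origins in China and abroad (N is the number of origins, a positive integer in the range 2-50) to guarantee a reliable source. A certain number of representative samples is selected from each origin and made into well-polished slices with two parallel faces.
The trace elements referred to in step 2) are defined relative to the major elements. Elements written into the chemical formula are called major elements; elements that are not part of the chemical formula but are present in the mineral in small amounts are called trace elements. The chemical formula of nephrite is generally Ca2(Mg,Fe)5(Si8O22)(F,OH)2, and elements other than Ca, Mg, Fe, Si, O, H and F are regarded as trace elements.
The geological fingerprint carried by trace elements is of great value for distinguishing the origin of gemstones (Breeding and Shen, 2010; Blodgett and Shen, 2011; Zhong et al., 2013). The invention preferably uses a high-sensitivity laser ablation inductively coupled plasma mass spectrometer (LA-ICP-MS) to measure the trace element contents of all samples accurately. The measured sample points are divided into a "training group" and a "test group", each containing samples from all origins. The "training group" sample points and their trace element information are used to build the origin discrimination model; the "test group" sample points and their trace element information are used to check the feasibility of the model built from the "training group".
In the method of the invention, in step 2) the number of "training group" sample points of each origin accounts for 1/2-2/3 of its total number of sample points, and the number of "test group" sample points accounts for 1/3-1/2. The invention does not exclude random assignment of the sample points to the "training group" and "test group".
In the laser ablation inductively coupled plasma mass spectrometry of step 2), the laser energy density is 5-40 J/cm², preferably 5-10 J/cm²; the laser spot size is 16-120 μm, preferably 32-44 μm; and 34-65 trace elements are measured.
In the laser ablation inductively coupled plasma mass spectrometry, the external reference materials are one or more of the NIST, USGS, MPI-DING, CGSC and GSJ series of natural or synthetic glass standards, and ²⁹Si is chosen as the internal standard for data processing. The other experimental parameters and the conversion to trace element contents may follow conventional LA-ICP-MS practice, e.g. the references (Liu et al., 2008, 2010).
Step 3) uses multivariate analysis of variance or the chi-square test to test the equality of group means of each trace element across the N origins. At 95% confidence, a significance level sig < 0.05 indicates that the group means of that variable are unequal and that significant differences exist between the groups, so the subsequent discriminant analysis is meaningful. A confidence of 99% or higher may also be used, in which case sig < 0.01 indicates a highly significant difference in the trace element variable between groups. The invention does not exclude the chi-square test or other methods of statistical significance testing.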
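As an illustration of the test of equality of group means, the sketch below computes the univariate one-way F statistic for a single trace element across several origins. It is an assumption-laden minimal example: the function name and the data are hypothetical, and converting F to the significance level sig requires the F distribution, which a statistics package would normally provide.

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F statistic for one variable measured in several groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group and within-group sums of squared deviations.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical contents of one element at three origins.
f_value = one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

A large F relative to the critical value for (k-1, n-k) degrees of freedom corresponds to a small sig, i.e. unequal group means for that element.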
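In steps 4) and 6), each round of IB-LDA or LDA may determine the trace element variables either by the direct method ("enter independents together") or by the "stepwise" method, in which variables are added step by step. Within the stepwise method, one of the "Wilks' lambda", "Mahalanobis distance", "unexplained variance", "smallest F ratio" and "Rao's V" criteria may be used to select, according to the discriminating power of each trace element for the origins, the trace element variables that contribute to discrimination. The default criterion for entering and removing variables in the stepwise method may be either "use F value" or "use probability of F", which decides whether an element variable is added to or removed from the discriminant function. The system default critical F value is 3.84 for entry and 2.71 for removal; the system default probability of F is 0.05 for entry and 0.10 for removal;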
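The entry and removal thresholds quoted above can be expressed as a small decision rule. This is only a sketch of the "use F value" criterion: the function name is hypothetical, and the treatment of values exactly equal to a threshold is an assumption, since the text does not specify it.

```python
F_TO_ENTER = 3.84   # default critical F for adding a variable (from the text)
F_TO_REMOVE = 2.71  # default critical F for removing a variable (from the text)

def stepwise_decision(f_value, in_model):
    """Decide what happens to one variable at one stepwise step."""
    if in_model:
        # A variable already in the function is dropped if its F falls too low.
        return "remove" if f_value < F_TO_REMOVE else "keep"
    # A candidate variable is added only if its F is high enough.
    return "enter" if f_value > F_TO_ENTER else "skip"

decisions = [
    stepwise_decision(5.0, in_model=False),
    stepwise_decision(3.0, in_model=False),
    stepwise_decision(3.0, in_model=True),
    stepwise_decision(2.0, in_model=True),
]
```

Keeping the removal threshold below the entry threshold prevents a variable from being added and removed repeatedly in successive steps.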
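Through two-group discriminant analysis, IB-LDA extracts the trace elements that contribute most to separating the two groups, together with the corresponding discriminant coefficients and classification coefficients. The characteristic trace elements and their discriminant coefficients combine into an optimal projection direction (along which the difference between the two groups is maximised and the difference within groups is minimised). This projection direction corresponds to the canonical discriminant function DF Y_i of IB-LDA (the discriminant function for short), whose mathematical expression is given by formula (1):
$DF\,Y_i = a_{1i}X_1 + a_{2i}X_2 + a_{3i}X_3 + \cdots + a_{mi}X_m + a_{0i} \qquad (1)$
where DF Y_i is the linear discriminant function of the origin screened out in round i of IB-LDA, i = 1, 2, ..., N-1; X_1, X_2, X_3, ..., X_m are the selected characteristic variables, corresponding to the characteristic trace elements of nephrite; a_1i, a_2i, a_3i, ..., a_mi are the linear discriminant coefficients of the characteristic elements; and a_0i is the constant term of the linear discriminant function.
In step 4), the classification functions CF_i of IB-LDA have the mathematical expressions given by formulas (2) and (3):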
CFi1=b11X1+b21X2+b31X3+……+bmi1Xm+bi1          (2)
CFi2=b12X1+b22X2+b32X3+……+bmi2Xm+bi2          (3)
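where CF_i1 and CF_i2 are the classification functions of the origin screened out in round i of IB-LDA and of the undistinguished group, respectively, i = 1, 2, ..., N-1; X_1, X_2, X_3, ..., X_m are the characteristic variables selected in each round of IB-LDA, corresponding to the characteristic trace elements of nephrite; b11, b21, ..., bmi1, ..., bmi2 are the classification function coefficients of the characteristic variables; and bi1 and bi2 are the constant terms of the classification functions.
For the "training group", the discriminant probability of the origin screened out in each round of IB-LDA can be expressed by the cross-validation correctly classified rate CV_i, where i is the round of IB-LDA. Cross-validation is based on the principle of "leaving one or several samples out": during discriminant analysis, one or several sample points are randomly withheld as "omitted samples", the discriminant function is built from the remaining sample points, the function is used to judge the origin of the "omitted samples", and the result is compared with their actual origin to check the reliability of the discriminant function. The process is repeated until every sample point has been tested as an "omitted sample"; the percentage of correctly discriminated "omitted samples" in the total number of samples is the cross-validation accuracy. Cross-validation here expresses the probability that an origin is accurately discriminated, and it also assists model building while the discriminant function is being established, helping to prevent over-fitting. CV_i is calculated by formula (9):
$CV_i = \dfrac{\text{number of correctly discriminated omitted samples}}{\text{total number of samples}} \times 100\% \qquad (9)$
If an origin is screened out in round i of IB-LDA, its discriminant probability is CV_ori, defined as the product of the two-group accuracies from CV_1 of the first round through CV_i of the i-th round:
$CV_{ori} = CV_1 \times \cdots \times CV_{i-1} \times CV_i \qquad (4)$
"or" is an abbreviation of "origin".
The probability that all origins screened out by IB-LDA are accurately discriminated is preferably expressed by CV_IB-total, defined as the arithmetic mean of the CV_ori values of all screened-out origins, as given by formula (5):
$CV_{IB\text{-}total} = \dfrac{1}{N}\sum_{ori=1}^{N} CV_{ori} \qquad (5)$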
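Alternatively, the original grouped cases correctly classified rate (OG_i) may be used to express the discriminant probability of the origin screened out in each round of IB-LDA. For original grouped cases no samples are withheld; the discriminant function is built from all the initial sample points entered into the software and is then used to discriminate each sample point.
OG_i is calculated by formula (10):
$OG_i = \dfrac{\text{number of correctly discriminated original grouped cases}}{\text{total number of samples}} \times 100\% \qquad (10)$
If an origin is discriminated in round i of IB-LDA, its original-grouped-cases accuracy OG_ori is defined as the product of the two-group values from OG_1 of the first round through OG_i of the i-th round. The probability that all origins are accurately discriminated can also be expressed by OG_IB-total, defined as the arithmetic mean of the OG_ori values of all screened-out origins. The formulas are (6) and (7), respectively:
$OG_{ori} = OG_1 \times \cdots \times OG_{i-1} \times OG_i \qquad (6)$
$OG_{IB\text{-}total} = \dfrac{1}{N}\sum_{ori=1}^{N} OG_{ori} \qquad (7)$
Step 6) of the method may be: comparing the relative merits of IB-LDA and LDA for the quantitative discrimination of nephrite origin. Specifically, the discriminating power of the discriminant functions established in step 4) and in step 6) is evaluated with one or more of the Wilks' lambda value, the canonical correlation coefficient and the eigenvalue. Wilks' lambda (λ for short) is the within-group sum of squared deviations divided by the total sum of squared deviations and is commonly used in discriminant tests; a smaller λ indicates a discriminant function with greater discriminating power. The canonical correlation coefficient is the square root of the between-group sum of squared deviations divided by the total sum of squared deviations, and the eigenvalue is the between-group sum of squared deviations divided by the within-group sum of squared deviations. For a discriminant function, the larger the canonical correlation coefficient or the eigenvalue, the greater its discriminating power.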
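The three measures defined above can be computed directly from the discriminant scores of the groups. The following is a minimal sketch under the definitions given in the text, for the scores of a single discriminant function; the function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean

def discriminant_power(scores_by_group):
    """Wilks' lambda, canonical correlation and eigenvalue of one discriminant function."""
    all_scores = [s for g in scores_by_group for s in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in scores_by_group)
    ss_within = sum((s - mean(g)) ** 2 for g in scores_by_group for s in g)
    ss_total = ss_between + ss_within
    return {
        "wilks_lambda": ss_within / ss_total,            # smaller is better
        "canonical_corr": sqrt(ss_between / ss_total),   # larger is better
        "eigenvalue": ss_between / ss_within,            # larger is better
    }

# Hypothetical discriminant scores for two groups.
power = discriminant_power([[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]])
```

With these definitions the three quantities carry the same information: λ = 1/(1 + eigenvalue), and the squared canonical correlation equals 1 - λ, which is why a smaller λ and a larger canonical correlation point in the same direction.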
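In steps 4) and 6) of the method, the error rate for the "test group" samples is denoted ER, with the mathematical expression
$ER = \dfrac{\text{number of misjudged test group sample points}}{\text{total number of test group sample points}} \times 100\%$
The order in which the "test group" samples are substituted into the classification functions of the origins is the same as the order in which those classification functions were established in step 4). If a "test group" sample point is correctly assigned to its actual origin, its discriminant accuracy is preferably expressed by the cross-validation accuracy of that origin (CV_ori), or alternatively by the original-grouped-cases accuracy (OG_ori).
Step 5) of the method may specifically be: taking the nephrite "sample to be tested", quantitatively measuring its trace element contents, and substituting them into the IB-LDA discriminant model established in step 4), i.e. into the classification functions of the screened-out origins in turn, and judging its origin from the scores. Its discriminant accuracy is preferably expressed by the cross-validation accuracy CV_ori of the origin to which it is assigned, or alternatively by the original-grouped-cases accuracy OG_ori. The order in which the "sample to be tested" is substituted into the classification functions is the same as the order in which they were established in step 4). If the actual origin of the sample is not among the origins included when the IB-LDA model was built, a misjudgment may result, so the IB-LDA model should cover samples from as many nephrite origins as possible.
The statistical software used here for IB-LDA is the free R language; the method of the invention does not exclude carrying out IB-LDA with other software such as SPSS, MATLAB, S-PLUS, SAS or EXCEL.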
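Beneficial effects of the invention:
The method of the invention, which quantitatively discriminates nephrite origin by trace-element-based iterative-binary linear discriminant analysis (IB-LDA), has great advantages over traditional qualitative visual identification and the LDA method.
(1) The method exploits in depth the geological information carried by trace elements for discriminating nephrite origin; the trace elements are measured quantitatively by high-sensitivity laser ablation inductively coupled plasma mass spectrometry;
(2) Statistical analysis shows the clear advantages of speed, quantification and accuracy in processing the large volume of trace element information from large sample sets;
(3) The method successfully improves on traditional LDA by proposing iterative-binary discriminant analysis (IB-LDA). For nephrite samples of different origins, the traditional simultaneous comparison of multiple groups is simplified to multiple rounds of pairwise comparison. This not only yields the maximum discriminant probability of each origin relative to the others, but also significantly improves the ability of the discriminant functions to discriminate origin. For 480 sample points from 8 nephrite origins in East Asia, the total cross-validation accuracy of IB-LDA reached 95.0%, clearly higher than the 92.9% of traditional LDA for the same 8 origins. Over the 7 rounds of IB-LDA for the 8 origins, the arithmetic mean of the Wilks' lambda of the 7 discriminant functions is 0.152, clearly smaller than the arithmetic mean of 0.172 for the 7 discriminant functions obtained when traditional LDA analyses the 8 origins simultaneously. In five random draws of "test group" sample points (4 test points per origin per draw), IB-LDA achieved an average accuracy of 95%; that is, on average 95% of the sample points were correctly assigned to their actual origin, which verifies the feasibility of the method.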
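Brief Description of the Drawings
Fig. 1 shows finished nephrite pieces from 4 different marble-type origins commonly seen on the East Asian market; (a), (b), (c), (d) are Hetian (Xinjiang), Golmud (Qinghai), Lake Baikal (Russia) and Chuncheon (Korea), respectively.
Fig. 2 is a map of the 8 main marble-type nephrite origins in East Asia: Xinjiang West (Xinjiang, China), Xinjiang East (Xinjiang, China), Golmud (Qinghai, China), Lake Baikal (Russia), Chuncheon (Korea), Xiuyan (Liaoning, China), Luodian (Guizhou, China) and Liyang (Jiangsu, China).
Fig. 3 shows sample photographs from the 8 main marble-type nephrite origins in East Asia, with one sample of representative colour per origin: (1) Xinjiang West; (2) Xinjiang East; (3) Golmud; (4) Lake Baikal; (5) Chuncheon; (6) Xiuyan; (7) Luodian; (8) Liyang.
Fig. 4 illustrates the discrimination of nephrite origin by iterative-binary discriminant analysis, using six rounds of IB-LDA as an example (six rounds were carried out for the seven origins other than Liyang). The horizontal and vertical axes are the classification functions of each round of IB-LDA. In each panel, two differently shaped solid symbols represent the "training group" sample points of the origin screened out in that round and of the "undistinguished group"; the corresponding hollow symbols are the "test group" sample points of the screened-out origin and of the "undistinguished group", labeled "test points". The ellipses show the range of classification function scores of the "training group" sample points.
Fig. 5 is a bar chart comparing the cross-validation accuracy of IB-LDA and traditional LDA for the 8 nephrite origins. On the horizontal axis, LY, LD, XY, CC, XJW, BK, XJE and GM denote the origins screened out in turn by the 7 rounds of IB-LDA: Liyang, Luodian, Xiuyan, Chuncheon, Xinjiang West, Lake Baikal, Xinjiang East and Golmud; "total" denotes the total cross-validation accuracy for the 8 origins.
Fig. 6 uses a children's shape-sorting block game to illustrate how IB-LDA discriminates nephrite origin samples. Fig. 6a shows how the "sieve" of each round of IB-LDA is first determined from the "training group" blocks to build the discriminant model; Fig. 6b shows how the "test group" blocks are then passed in turn through the sieves of Fig. 6a to check the feasibility of the IB-LDA method, and how the blocks "to be tested" are discriminated in turn.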
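Detailed Description of the Embodiments
The technical solution of the invention is further illustrated below by specific examples. Those skilled in the art will understand that the examples serve only to illustrate the invention and do not limit its scope.
Unless otherwise stated, the technical means used in the examples are conventional in the art.
Example 1: Iterative-binary linear discriminant analysis (IB-LDA) method and discrimination results
1) The samples in this example are a large number of nephrite samples collected in the field over many years by the inventors' research team, who visited the marble-type nephrite mining areas of East Asia in person. The 8 most common nephrite origins at present were taken as the subject, and the IB-LDA method was used to build the origin discrimination model. The locations of the 8 origins are shown in Fig. 2. The division of Xinjiang nephrite into Xinjiang West and Xinjiang East in Fig. 2 is based on the marked difference in their geological ore-forming conditions (Tang Yanling et al., 1994). The numbers 1-8 denote Xinjiang West (deposits in western Xinjiang, China, including the well-known Hetian, Yecheng and Moyu River areas), Xinjiang East (deposits in eastern Xinjiang, China, including Qiemo and Ruoqiang), Golmud (Qinghai, China), Lake Baikal (Russia), Chuncheon (Korea), Xiuyan (Liaoning, China), Luodian (Guizhou, China) and Liyang (Jiangsu, China).
A total of 160 representative samples from these origins were selected and prepared as slices of uniform size with two parallel polished faces (as shown in Fig. 3).
2) The trace element compositions of the nephrite slices from the different origins were measured quantitatively by high-sensitivity laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS). Test conditions: Geolas 193 nm laser system and Agilent 7500 ICP-MS. The laser energy density was set to 6 J/cm² and the laser spot size to 44 μm. The reference materials measured alongside the samples were the synthetic glass NIST SRM610 and BCR-2G, BHVO-2G and BIR-1G of the USGS series. ²⁹Si was chosen as the internal standard for quantitative analysis. Each sample was tested at 3 different, equally spaced points (about 5 mm apart). A total of 45 elements were measured, of which 41 are trace elements other than Ca, Mg, Fe and Si. The elements are: Li, Be, Na, Mg, Al, Si, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Rb, Sr, Y, Zr, Nb, Cs, Ba, La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, Hf, Ta, W, Bi, Pb, Th, U. The results were converted to concentrations using the calibration curves established from the reference materials.
Under the test conditions, the laser energy density reaching the sample surface is 5-10 J/cm² and the laser spot size is 32-44 μm; because the element contents of the reference materials are acquired in the same measurement, equivalent results are obtained for the samples.
The 480 LA-ICP-MS test points from the 8 origins were divided into a "training group" and a "test group". The "training group" accounts for 2/3, 320 points in total, i.e. 40 sample points per origin; the "test group" accounts for 1/3, 160 points in total, i.e. 20 sample points per origin.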
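A per-origin split in the stated 2/3 to 1/3 proportion might be sketched as follows. The text does not say how points were assigned to the two groups in this example, so the random shuffle, the function name and the placeholder data are all assumptions for illustration.

```python
import random

def split_points(points_by_origin, train_fraction=2 / 3, seed=0):
    """Split each origin's test points into training and test groups."""
    rng = random.Random(seed)
    train, test = {}, {}
    for origin, points in points_by_origin.items():
        shuffled = points[:]
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train_fraction)
        train[origin] = shuffled[:n_train]
        test[origin] = shuffled[n_train:]
    return train, test

# Placeholder data: 8 origins with 60 test points each (480 points in total).
points = {f"origin_{k}": list(range(60)) for k in range(8)}
train, test = split_points(points)
n_train = sum(len(v) for v in train.values())
n_test = sum(len(v) for v in test.values())
```

Splitting within each origin, rather than over the pooled 480 points, keeps every origin represented in both groups, as the method requires.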
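3) Preliminary analysis of the trace element contents
Multivariate analysis of variance (the F-test) was applied to all test points of the 8 origins for a preliminary judgment of whether statistical differences exist between origins. Specifically, the trace element contents of the 480 test points from the 8 origins were entered into the statistical software. The 34 elements other than the first transition series were entered, because nephrite from a single origin may occur in several colours and nephrite of different colours may differ considerably in the contents of the transition-series chromophore elements; excluding these elements minimises the influence of colour on origin discrimination. The equality of group means of these trace element variables was then tested at 95% confidence. The significance level for the group means of all trace elements was sig < 0.05, indicating significant differences between the groups, so the subsequent discriminant analysis is meaningful.
4) IB-LDA analysis of the trace element contents of the sample points from the 8 origins
For the test points of the samples from the 8 origins, iterative-binary linear discriminant analysis (IB-LDA) was used to select the characteristic element variables and build the discriminant model.
In the first step of IB-LDA, the "training group" test points are divided into two groups each time, and multiple rounds of two-group discriminant comparison are carried out. In each round, all nephrite samples of "one origin" (the first group) are subjected to two-group discriminant analysis against all samples of the "package of other origins" (the second group), so as to bring out as far as possible the difference between this origin and the others. In the first round, every origin is in turn taken as the first group and compared with the package of other origins; the origin with the highest cross-validation accuracy (CV_i) relative to the "package of other origins" is picked out. The remaining origins, called the "undistinguished group", enter the next round of IB-LDA, and the process is repeated until the last origin has been discriminated. For N origins, the iterative-binary method requires N-1 IB-LDA analyses.
In this example, 7 rounds of IB-LDA were carried out on the "training group" test points of the 8 nephrite origins. In the first round, all test points of any one origin formed the first group, and all test points of the other 7 origins were packaged as the second group; two-group discriminant analysis was carried out, and the stepwise method selected the trace elements, discriminant coefficients and classification coefficients contributing most to separating the two groups. In round 1 the Liyang samples were found to have the largest discriminant probability relative to the package of other origins, so Liyang was screened out and the other 7 origins formed the "undistinguished group". The selected characteristic trace elements and discriminant coefficients form the discriminant function DF Y_1 (i = 1) of the first round, and the characteristic trace elements and classification coefficients form a pair of classification functions CF_i1 and CF_i2 (i = 1). CF_i1 is the classification function of the origin screened out in each round, and CF_i2 is the classification function of the "undistinguished group". Next, the second round of IB-LDA was carried out on the remaining 7 origins in the same way: any one origin was taken as the first group and the other 6 origins were packaged as the second group for two-group discriminant analysis. In the second round the Luodian samples were found to have the largest discriminant probability, and the newly selected trace elements and coefficients form new discriminant and classification functions. The remaining 6 undistinguished origins (the "undistinguished group") proceed to the next round, and in each round one origin is picked out. In the end, all origins are separated by IB-LDA to the greatest possible extent (as shown in Fig. 4). The discrimination probability of the origin screened out in each round is expressed by the cross-validation accuracy CV_ori.
For the "training group" samples of the 8 origins, the 7 rounds of IB-LDA screened out, in order, Liyang, Luodian, Xiuyan, Chuncheon, Xinjiang West, Lake Baikal, Xinjiang East and Golmud (this order must not be changed). Liyang has the highest CV_i in round 1, Luodian has the highest CV_i in round 2, and so on; Xinjiang East and Golmud are discriminated together in the last round. The discrimination of "test group" sample points and of unknown samples "to be tested" follows the same order. The 7 rounds of IB-LDA give 7 different sets of characteristic elements and classification coefficients, from which the classification functions of each screened-out origin and of its corresponding "undistinguished group" are established. The classification functions of round 7 (the last round in this example, i = 7) are listed here as formulas (11) and (12). They can be used to distinguish nephrite samples from Xinjiang East and Golmud; the seven elements Li, Be, Al, K, Nb, Ba and La are the characteristic elements selected by IB-LDA in this example.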
CF71(Xinjiang-East)=–2.058[Li]+1.554[Be]+0.005[Al]+0.012[K]–9.807[Nb]–0.461[Ba]+6.452[La]–11.891                  (11)
CF72(Geermu)=0.452[Li]–0.682[Be]+0.001[Al]+4.062[Nb]+0.431[Ba]+0.145[La]–2.376                    (12)
In the formulas, [element] denotes the content of that element.
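Formulas (11) and (12) can be evaluated for a sample as in the sketch below. The coefficients are copied from the formulas; everything else is illustrative. In particular, formula (12) contains no K term, which is treated here as a zero coefficient (an assumption), and the sample contents are invented values with no connection to real measurements.

```python
# Coefficients of formula (11), Xinjiang East, and formula (12), Golmud ("Geermu").
CF71 = {"Li": -2.058, "Be": 1.554, "Al": 0.005, "K": 0.012,
        "Nb": -9.807, "Ba": -0.461, "La": 6.452, "const": -11.891}
CF72 = {"Li": 0.452, "Be": -0.682, "Al": 0.001,
        "Nb": 4.062, "Ba": 0.431, "La": 0.145, "const": -2.376}

def score(cf, sample):
    """Classification function score; elements missing from the sample count as 0."""
    return cf["const"] + sum(
        coeff * sample.get(element, 0.0)
        for element, coeff in cf.items() if element != "const"
    )

def classify_round7(sample):
    """Assign the sample to the group whose classification function scores higher."""
    return "Xinjiang-East" if score(CF71, sample) > score(CF72, sample) else "Geermu"

hypothetical_sample = {"La": 5.0}  # invented content, for illustration only
result = classify_round7(hypothetical_sample)
```

Only a sample that has passed through rounds 1 to 6 as "undistinguished" should reach this round, since the round 7 functions separate Xinjiang East from Golmud and nothing else.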
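5) Verifying the feasibility of the IB-LDA method with the "test group" samples
Discrimination of the "test group" sample points: the same number of "test group" sample points is selected from each origin, and their trace element information is substituted, in order, into the classification functions of each round of IB-LDA. The first substitution is into the functions of the Liyang group screened out in round 1 and of the corresponding "undistinguished group". The point belongs to whichever group gives the higher classification function score; that is, if the Liyang classification function gives the higher score, the point is judged to be a Liyang sample and no further analysis is needed. If it belongs to the "undistinguished group", it enters the next round: its trace element information is substituted into the classification functions of the Luodian group screened out in round 2 and of the corresponding "undistinguished group" to see whether its score places it in the Luodian group. If it does not belong to Luodian, it is carried into the next round of IB-LDA, and so on in turn. Every "test group" sample point is eventually assigned to one of the 8 origins. The assigned origin is compared with the true origin and the overall error rate (ER) is calculated to check the feasibility of the IB-LDA method. For correctly judged sample points, the discriminant accuracy is expressed by the cross-validation accuracy (CV_ori) or the original-grouped-cases accuracy (OG_ori) of the origin to which they belong.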
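The round-by-round assignment of a test point and the resulting error rate can be sketched as below. This is an illustrative outline only: the function names are hypothetical, the "classification functions" are toy one-dimensional stand-ins rather than fitted functions, and the error rate is taken as the fraction of misjudged test points.

```python
def classify_cascade(sample, rounds, last_origin):
    """Pass a sample through the rounds in screening order.

    rounds: list of (origin, cf_origin, cf_rest), where each cf is a callable
    returning the classification function score of the sample.
    """
    for origin, cf_origin, cf_rest in rounds:
        if cf_origin(sample) > cf_rest(sample):
            return origin          # assigned; no further rounds needed
    return last_origin             # still undistinguished after the final round

def error_rate(predicted, actual):
    """Fraction of test points assigned to the wrong origin."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Toy stand-ins for fitted classification functions (assumption).
rounds = [
    ("A", lambda s: -abs(s - 0.0), lambda s: -abs(s - 5.0)),
    ("B", lambda s: -abs(s - 4.0), lambda s: -abs(s - 6.0)),
]
samples = [0.5, 4.2, 6.5]
predicted = [classify_cascade(s, rounds, last_origin="C") for s in samples]
er = error_rate(predicted, ["A", "B", "B"])
```

Because an early wrong assignment cannot be corrected in later rounds, the order of the rounds has to match the order in which the origins were screened out during training.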
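The whole procedure can be implemented in a computer program, so that entering the trace elements of a sample to be tested directly gives its origin.
See Fig. 4, which uses six rounds of IB-LDA as an example to illustrate the quantitative discrimination of nephrite origin by iterative-binary linear discriminant analysis. In each round of IB-LDA, the origin with the largest discrimination probability relative to the package of other origins is screened out. For the 7 origins other than Liyang, the origins screened out by the six rounds are, in order, Luodian (a), Xiuyan (b), Chuncheon (c), Xinjiang West (d), Lake Baikal (e), Xinjiang East (f) and Golmud (f). The horizontal axis (CF_i1) and vertical axis (CF_i2) of each panel are the iterative-binary classification functions of round i (i = 1-6). The ellipses show the range of classification function scores of the "training group" sample points, and the grey shaded area is the score range of the "undistinguished group". In each panel, two differently shaped solid symbols represent the "training group" sample points of the origin screened out in that round and of the "undistinguished group"; the corresponding hollow symbols are the "test group" sample points of the screened-out origin and of the "undistinguished group", all labeled "test points". The CV_i marked in the lower right corner is the cross-validation accuracy for the nephrite origin screened out in each round of IB-LDA, without considering prior probabilities.
The hatched bars in Fig. 5 give the cross-validation accuracy (CV_ori) of each of the 8 origins discriminated in turn by IB-LDA and the total cross-validation accuracy CV_IB-total. If an origin is discriminated in round i, its cross-validation accuracy (CV_ori) is the product of the two-group accuracies from CV_1 of the first round through CV_i of the i-th round. From left to right, the CV_ori values of Liyang (LY), Luodian (LD), Xiuyan (XY), Chuncheon (CC), Xinjiang West (XJW), Lake Baikal (BK), Xinjiang East (XJE) and Golmud (GM) are 100%, 99.5%, 98.4%, 96.4%, 93.6%, 92.6%, 89.6% and 89.6%. The probability that the 8 origins are accurately discriminated, CV_IB-total, is the arithmetic mean of the CV_ori values of all screened-out origins, 95.0%.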
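The arithmetic behind these figures can be checked with a few lines. The CV_ori values are those reported above; the helper that accumulates per-round accuracies into CV_ori is a hypothetical illustration of formula (4), and the per-round values passed to it are invented, because the individual CV_i values are not listed in the text.

```python
# CV_ori values (in percent) reported for the eight origins.
cv_ori = {"LY": 100.0, "LD": 99.5, "XY": 98.4, "CC": 96.4,
          "XJW": 93.6, "BK": 92.6, "XJE": 89.6, "GM": 89.6}

def cumulative_cv(cv_rounds):
    """CV_ori after each round: running product of the per-round accuracies."""
    out, running = [], 1.0
    for cv in cv_rounds:
        running *= cv
        out.append(running)
    return out

# Arithmetic mean over the eight origins, as in formula (5).
cv_ib_total = sum(cv_ori.values()) / len(cv_ori)
print(round(cv_ib_total, 1))  # 95.0

# Invented per-round accuracies, only to show how the product accumulates.
example = cumulative_cv([1.0, 0.995, 0.99])
```

The unrounded mean is about 94.96%, which rounds to the reported 95.0%. Because CV_ori is a running product, an origin screened out in a later round can never have a higher CV_ori than one screened out earlier, which is consistent with the decreasing sequence of reported values.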
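To make it easier to understand how the iterative-binary LDA method discriminates nephrite origin, a children's shape-sorting block game is used as an illustration. Assigning nephrite from different origins to the corresponding origin is similar to dropping blocks of different shapes into boxes with matching openings. Only the 6 rounds of IB-LDA that separate the 7 origins other than Liyang are used here. As shown in Fig. 6(a), assume that 7 differently shaped blocks represent nephrite samples from 7 different origins: a triangle, four-pointed star, five-pointed star, six-pointed star, rhombus, pentagon and hexagon represent nephrite samples from Luodian, Xiuyan, Chuncheon, Xinjiang West, Lake Baikal, Xinjiang East and Golmud, respectively. The child sorting the blocks does not know their shapes in advance. To put the blocks into the right boxes, boxes with different openings (origin sieves) must first be built through two-group discriminant analysis of the "training group". After 6 rounds of IB-LDA on the 280 "training group" sample points from the 7 origins other than Liyang, 7 sieves with different openings were selected in turn; the shape of each opening corresponds to the iterative-binary linear discriminant function of that sieve. The order of the sieves must not be changed, because they correspond directly to the origin screened out in each round of IB-LDA.
Then 28 "test group" sample points from these 7 origins, 4 randomly selected per origin, are passed through the sieves in turn to check the reliability of the sieves built from the "training group" (Fig. 6(b) is the flow chart of the second step of IB-LDA, in which the sieves built in the first step are used to discriminate the "test group" sample points step by step). That is, 28 blocks are used as the "test group" and dropped in turn into the boxes with triangle, four-pointed star, five-pointed star, six-pointed star, rhombus, pentagon and hexagon openings. When the shape of a "test group" block matches the shape of a sieve, its origin is determined. After 5 such random tests, our analysis found that 95% of the 140 "test group" sample points were correctly assigned to their original origin, i.e. the error rate was 5%.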
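Example 2: Testing samples of unknown origin
In this example, "samples to be tested" of unknown origin were tested: origin discrimination analysis was carried out on 2 finished nephrite pieces said to come from Xinjiang (their origin had not been confirmed). As in step 2) of Example 1, a smooth, flat area of each sample surface was first selected for LA-ICP-MS testing; each sample was tested at 3 points, giving 6 test points in total. Their trace element contents were substituted in turn into the classification functions established by IB-LDA from the 8 origins in step 4) of Example 1. Whichever group's classification function gives the higher score determines the origin of the "sample to be tested".
The result is that, after 7 rounds of IB-LDA analysis, the classification function scores of all 6 test points fall within the Xinjiang East range, so the "samples to be tested" are identified as coming from Xinjiang East, each with a discriminant accuracy CV_ori (round 7) of 89.6%.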
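Example 3: Origin discrimination results of the traditional linear discriminant analysis (LDA) method
This example compares the discrimination performance of traditional LDA and IB-LDA. For the traditional LDA analysis, the samples and trace element information were the same as in Example 1; that is, discriminant analysis was carried out on the 480 sample test points from the 8 origins. Unlike IB-LDA, traditional LDA treats the samples of each origin as one group and compares all groups simultaneously. Its result comes from a single round of discriminant analysis of the 8 origins, whereas IB-LDA gives the maximum discriminant probability of each origin relative to the others.
The steps of the traditional LDA analysis are as follows:
Step 1: each of the 8 origins is treated as an independent group. The trace element contents of the 320 "training group" test points from the 8 origins are imported into the statistical software and discriminant analysis is selected, preferably with the "stepwise" method, to select the characteristic trace element variables, discriminant function coefficients and classification function coefficients that best separate the 8 groups simultaneously, giving 7 discriminant functions (DFY_1, DFY_2, ..., DFY_7) and 8 classification functions (CF_1, CF_2, ..., CF_8);
Step 2: the same "test group" sample points as in the second step of the IB-LDA analysis, 4 per origin from the 8 origins, are substituted into the classification functions established in Step 1 of LDA. The origin of each "test group" sample point is judged from its classification function scores and compared with its actual origin, and the error rate is calculated to check the feasibility of the LDA method. For correctly judged samples, the discriminant accuracy can be expressed by the cross-validation accuracy or by the original-grouped-cases accuracy.
The discriminating power of the discriminant functions established by IB-LDA and by traditional LDA is compared through the Wilks' lambda values, the canonical correlation coefficients or the eigenvalues.
In this example, the open (unhatched) bars in Fig. 5 give the results of traditional LDA for the 8 origins. With all 8 origins analysed simultaneously by LDA, the corresponding CV_total is 92.9%, lower than the CV_IB-total of 95.0% for IB-LDA. Compared with LDA, IB-LDA improved the discriminant accuracy for the 4 origins Luodian, Xiuyan, Xinjiang West and Xinjiang East. For example, the Xinjiang West samples were discriminated in round 5 of IB-LDA with a cross-validation accuracy CV_ori of 93.6%, clearly higher than the 80% cross-validation accuracy obtained for them by traditional LDA.
In Examples 1 and 3, the mean Wilks' lambda of the 7 discriminant functions obtained by IB-LDA is 0.152, and that of the 7 discriminant functions obtained by traditional LDA is 0.172. The former is clearly smaller, indicating that the IB-LDA used in this patent discriminates nephrite origin samples noticeably better than traditional LDA. The discriminating power of the functions established by the two methods can also be compared through canonical correlation coefficients, eigenvalues and the like: the mean canonical correlation coefficient of the 7 discriminant functions obtained by IB-LDA is 0.920, clearly larger than the mean of 0.834 for traditional LDA, which likewise confirms that IB-LDA discriminates better than traditional LDA.
The above examples merely describe preferred embodiments of the invention and do not limit its scope. Without departing from the design spirit of the invention, all variations and improvements made to the technical solution by those of ordinary skill in the art shall fall within the scope of protection defined by the claims.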
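Industrial Applicability
The method of the invention, which quantitatively discriminates the origin of Hetian jade (nephrite) by trace-element-based iterative-binary linear discriminant analysis (IB-LDA), has great advantages over traditional qualitative visual identification and the traditional LDA method. First, the role of trace elements in discriminating nephrite origin is explored in depth; the trace elements are measured quantitatively by high-sensitivity laser ablation inductively coupled plasma mass spectrometry, and statistical analysis offers an important advantage in processing the large volume of trace element information from a large number of samples. The invention successfully improves on the traditional LDA method by proposing the IB-LDA method. For nephrite samples of different origins, simplifying the traditional multi-group comparison to pairwise comparisons not only yields the discriminant probability of each origin relative to the others, but also significantly improves the accuracy of origin discrimination. For 480 sample points from 8 nephrite origins in East Asia, IB-LDA reaches a total cross-validation accuracy of 95.0%, clearly higher than the 92.9% of traditional LDA; the mean Wilks' lambda of IB-LDA is 0.152, clearly smaller than the mean of 0.172 for traditional LDA. In five random draws of "test group" sample points (4 test points per origin per draw), the average error rate of IB-LDA was 5%; that is, on average 95% of the sample points were correctly assigned to their actual origin, which verifies the feasibility of the method.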

Claims (10)

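  1. A method for quantitatively discriminating the origin of nephrite, characterised by comprising the steps of:
    1) Selecting origins and preparing samples
    Nephrite samples from 2-50 origins of known, verified source are selected; the number of origins is denoted N, and the sample surfaces are well polished;
    2) Quantitatively measuring the trace element contents of nephrite samples from different origins
    The trace element contents of all nephrite samples are measured quantitatively by one or more of laser ablation inductively coupled plasma mass spectrometry, laser-induced breakdown spectroscopy, glow discharge mass spectrometry, external-beam proton excitation, secondary ion mass spectrometry and X-ray fluorescence spectrometry;
    The sample test points carrying the trace element information are divided into a "training group" and a "test group", both of which cover nephrite samples from all origins;
    3) Preliminary analysis of the trace element contents
    A descriptive analysis of the trace element contents of all test points of the N origins measured in step 2) is carried out by the F-test or the chi-square test, to test whether statistical differences exist between origins;
    4) Analysing the trace element data by iterative-binary linear discriminant analysis
    A. Building the iterative-binary linear discriminant analysis model from the "training group" data: multiple rounds of iterative-binary discriminant analysis are carried out on the "training group" data points, the full name being Iterative-binary Linear Discriminant Analysis, IB-LDA for short. Specifically, in each round of IB-LDA the "training group" data points are divided into two groups and a two-group linear discriminant analysis is carried out on their trace element data. In round 1, for example, all nephrite sample points of one known origin in the "training group" are labeled as the first group, and all nephrite sample points of the remaining N-1 known origins are packaged together and labeled as the second group; each trace element is treated as an independent variable, a two-group linear discriminant analysis is carried out, and the statistical software selects the characteristic trace elements that best separate the two groups and gives a discriminant probability. Within this round, every origin in the "training group" is in turn taken as the first group and compared with the package of other origins (the second group), and the origin with the largest discriminant probability is extracted as the origin screened out by round 1 of IB-LDA. For the remaining N-1 origins, round 2 of IB-LDA is started using the same pairwise comparison, which screens out the origin with the largest discriminant probability in round 2; the process continues until every origin has been screened out, with N-1 rounds of screening in total for N origins. For each screened-out origin, the characteristic trace elements and discriminant coefficients given by IB-LDA form its iterative-binary discriminant function DFY_i, and the absolute value of a discriminant coefficient reflects the importance of that variable for the grouping; the characteristic trace elements and classification coefficients form the iterative-binary classification functions CF_i1 and CF_i2, where the subscript i is the round of IB-LDA. The discriminant functions and classification functions of the origins screened out in turn make up the IB-LDA discriminant model, and the classification functions allow rapid classification;
    B. Checking the feasibility of the IB-LDA method with the "test group" samples
    The same number of "test group" sample points is taken from each origin, and their trace element contents are substituted in turn into the classification functions of the origins screened out in step A. The origin of each "test group" sample point is judged from the scores: the point is assigned to whichever group's function gives the higher score. The predicted result is compared with the actual origin, and the error rate and the discriminant accuracy of each "test" sample point are calculated; the error rate can be used to evaluate the feasibility of the discriminant model established by IB-LDA;
    5) Discriminating the origin of a "sample to be tested"
    A nephrite sample of unknown origin from the market is selected and denoted the "sample to be tested"; its flat polished surface is tested for trace elements according to steps 1)-2), the trace element contents are substituted in turn into the classification functions established by IB-LDA in step 4), the origin of the "sample to be tested" is judged from the scores, and the discriminant accuracy is given;
    6) Linear discriminant analysis of the trace elements of the N origins
    For the data points of the N origins in the "training group", each origin is treated as an independent group, numbered 1, 2, ... to N in turn, and each trace element is treated as an independent variable; the trace element information of each group is entered into the statistical software, and LDA gives the trace element variables, discriminant function coefficients and classification function coefficients that best separate the N groups simultaneously, from which the discriminant functions and classification functions are constructed;
    Then, for the "test group", the same proportion of data points is selected from each origin and substituted into the classification functions established from the "training group"; the error rate and the discriminant accuracy of each "test group" sample point are calculated, and the error rate can be used to evaluate the feasibility of the discriminant model established by LDA;
    The relative merits of the two discriminant methods are measured by comparing the Wilks' lambda values, the canonical correlation coefficients or the eigenvalues of the discriminant functions established by LDA and IB-LDA.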
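  2. The method according to claim 1, characterised in that in step 1) the number of "training group" sample points of each origin accounts for 1/2-2/3 of its total number of sample points, and the number of "test group" sample points of each origin accounts for 1/3-1/2 of its total number of sample points.
  3. The method according to claim 1, characterised in that in the laser ablation inductively coupled plasma mass spectrometry of step 2) the laser energy density is 5-40 J/cm², preferably 5-10 J/cm²; the laser spot size is 16-120 μm, preferably 32-44 μm; and 34-65 trace elements are measured.
  4. The method according to claim 3, characterised in that in the laser ablation inductively coupled plasma mass spectrometry the external reference materials are one or more of the NIST, USGS, MPI-DING, CGSC and GSJ series of natural or synthetic glass samples, and ²⁹Si is chosen as the internal standard for data processing.
  5. The method according to claim 1, characterised in that step 3) uses multivariate analysis of variance or the chi-square test to test the equality of group means of each trace element across the N origins; at 95% confidence, a significance level sig < 0.05 indicates that the group means of that variable are unequal and that significant differences exist between the groups, so the subsequent discriminant analysis is meaningful. A confidence of 99% or higher may also be used, in which case sig < 0.01 indicates a highly significant difference in the trace element variable between groups.
  6. The method according to claim 1, characterised in that in steps 4) and 6) each round of IB-LDA or LDA may determine the trace element variables either by the direct method ("enter independents together") or by the "stepwise" method, in which variables are added step by step; within the stepwise method, one of the "Wilks' lambda", "Mahalanobis distance", "unexplained variance", "smallest F ratio" and "Rao's V" criteria may be used to select, according to the discriminating power of each trace element for the origins, the trace element variables that contribute to discrimination;
    In the stepwise method, the criterion for entering and removing variables may be either "use F value" or "use probability of F", which decides whether an element variable is added to or removed from the discriminant function;
    Through two-group discriminant analysis, IB-LDA extracts the trace elements that contribute most to separating the two groups, together with the corresponding discriminant coefficients and classification coefficients; the characteristic trace elements and their discriminant coefficients combine into an optimal projection direction, which corresponds to the canonical discriminant function DF Y_i of iterative-binary linear discriminant analysis (the discriminant function for short), whose mathematical expression is given by formula (1):
    $DF\,Y_i = a_{1i}X_1 + a_{2i}X_2 + a_{3i}X_3 + \cdots + a_{mi}X_m + a_{0i} \qquad (1)$
    where DF Y_i is the discriminant function of the origin screened out in round i of IB-LDA, with subscript i = 1, 2, ..., N-1; X_1, X_2, X_3, ..., X_m are the characteristic variables selected in each round of IB-LDA, corresponding to the characteristic trace elements of nephrite; a_1i, a_2i, a_3i, ..., a_mi are the discriminant coefficients of the characteristic variables; and a_0i is the constant term of the discriminant function.
  7. The method according to claim 1, characterised in that in step 4) the classification functions CF_i of IB-LDA have the mathematical expressions given by formulas (2) and (3):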
    CFi1=b11X1+b21X2+b31X3+……+bmi1Xm+bi1        (2)
    CFi2=b12X1+b22X2+b32X3+……+bmi2Xm+bi2        (3)
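    where CF_i1 and CF_i2 are the classification functions of the origin screened out in round i of IB-LDA and of the "undistinguished group", respectively, i = 1, 2, ..., N-1; X_1, X_2, X_3, ..., X_m are the characteristic variables selected in each round of IB-LDA, corresponding to the characteristic trace elements of nephrite; b11, b12, b21, ..., bmi1, bmi2 are the classification function coefficients of the characteristic variables; and bi1 and bi2 are the constant terms of the classification functions.
  8. The method according to claim 1, characterised in that in step 4) the discriminant probability of the origin screened out in each round of IB-LDA for the "training group" may be expressed by the cross-validation accuracy CV_i; if an origin is screened out in round i of IB-LDA, the probability that it is finally accurately discriminated is CV_ori, defined as the product of the two-group accuracies from CV_1 of the first round through CV_i of the i-th round:
    $CV_{ori} = CV_1 \times \cdots \times CV_{i-1} \times CV_i \qquad (4)$
    i is the round of IB-LDA; "or" is an abbreviation of "origin";
    The probability that all origins screened out by IB-LDA are accurately discriminated is expressed by CV_IB-total, defined as the arithmetic mean of the CV_ori values of all screened-out origins, as given by formula (5):
    $CV_{IB\text{-}total} = \dfrac{1}{N}\sum_{ori=1}^{N} CV_{ori} \qquad (5)$
    Or, the original-grouped-cases accuracy OG_i is used to express the discriminant probability of the origin screened out in each round of IB-LDA; if an origin is screened out in round i of IB-LDA, its original-grouped-cases accuracy OG_ori is defined as the product of the two-group values from OG_1 of the first round through OG_i of the i-th round:
    $OG_{ori} = OG_1 \times \cdots \times OG_{i-1} \times OG_i \qquad (6)$
    Or, the probability that all origins screened out by IB-LDA are accurately discriminated is expressed by OG_IB-total, defined as the arithmetic mean of the OG_ori values of all screened-out origins, as given by formula (7):
    $OG_{IB\text{-}total} = \dfrac{1}{N}\sum_{ori=1}^{N} OG_{ori} \qquad (7)$
  9. The method according to claim 1, characterised in that the statistical significance of the discriminant functions established in steps 4) and 6) is evaluated with one or more of the Wilks' lambda value, the canonical correlation coefficient and the eigenvalue; Wilks' lambda (λ for short) is the within-group sum of squared deviations divided by the total sum of squared deviations and is commonly used in discriminant tests, a smaller λ indicating a discriminant function with greater discriminating power; the canonical correlation coefficient is the square root of the between-group sum of squared deviations divided by the total sum of squared deviations, and the eigenvalue is the between-group sum of squared deviations divided by the within-group sum of squared deviations; for a discriminant function, the larger the canonical correlation coefficient or the eigenvalue, the greater its discriminating power.
  10. The method according to any one of claims 1-9, characterised in that in steps 4) and 6) the error rate for the "test group" samples is denoted ER, with the mathematical expression
    $ER = \dfrac{\text{number of misjudged test group sample points}}{\text{total number of test group sample points}} \times 100\%$
    For each "test group" sample, if the sample is correctly discriminated in round i, in agreement with its actual origin, its discriminant accuracy is preferably expressed by the cross-validation accuracy CV_ori of the corresponding origin, or by the original-grouped-cases accuracy OG_ori; for the "sample to be tested" of step 5), if the sample is discriminated in round i, its discriminant accuracy is expressed directly by the cross-validation accuracy CV_ori of the origin to which it is assigned, or by the original-grouped-cases accuracy OG_ori.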
PCT/CN2015/092024 2015-10-13 2015-10-15 一种定量判别软玉产地的方法 WO2017063174A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510659192.1 2015-10-13
CN201510659192.1A CN105181907B (zh) 2015-10-13 2015-10-13 一种定量判别软玉产地的方法

Publications (1)

Publication Number Publication Date
WO2017063174A1 true WO2017063174A1 (zh) 2017-04-20

Family

ID=54904130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/092024 WO2017063174A1 (zh) 2015-10-13 2015-10-15 一种定量判别软玉产地的方法

Country Status (2)

Country Link
CN (1) CN105181907B (zh)
WO (1) WO2017063174A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112229863A (zh) * 2020-09-30 2021-01-15 上海海关工业品与原材料检测技术中心 一种铁矿石的原产国或品牌的鉴别方法
CN114441577A (zh) * 2022-01-24 2022-05-06 深圳市吉尔德技术有限公司 一种基于微量元素线性判别式分析的祖母绿产地判定方法
CN115372396A (zh) * 2022-10-26 2022-11-22 中国科学院地质与地球物理研究所 一种斜长石标样的确认方法
CN116559099A (zh) * 2023-07-07 2023-08-08 泉州海关综合技术服务中心 一种茶叶中重金属测定的设备和方法
BE1030200B1 (de) * 2022-03-30 2023-08-16 Univ China Geosciences Wuhan Verfahren zur Unterscheidung verschiedener Kalksilikatfels-Lagerstätten basierend auf der Variationen in der Granatzusammensetzung
CN116777915A (zh) * 2023-08-23 2023-09-19 国检中心深圳珠宝检验实验室有限公司 一种海蓝宝石的子类识别方法、装置及存储介质

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845522B (zh) * 2016-12-26 2020-01-31 华北理工大学 一种冶金成球过程中的分类判别系统
CN106770617A (zh) * 2017-04-10 2017-05-31 山东省分析测试中心 一种利用微量元素和稀土元素含量测定结合多元统计分析对丹参进行产地溯源的方法
CN107423750A (zh) * 2017-05-10 2017-12-01 贵州大学 一种有效的辣椒产地溯源方法
CN107727603A (zh) * 2017-11-01 2018-02-23 中国地质大学(武汉) 一种鉴别寿山高山系水坑石的系统及方法
CN108535258A (zh) * 2018-03-16 2018-09-14 上海交通大学 一种快速无损判别区分软玉产地的方法
CN108419751A (zh) * 2018-03-26 2018-08-21 四川大学 一种四川省阿坝州黑水县产凤尾鸡
CN109145990A (zh) * 2018-08-22 2019-01-04 云图元睿(上海)科技有限公司 基于典型相关的高维市场细分方法及装置
CN110987996B (zh) * 2019-12-03 2023-04-07 上海海关工业品与原材料检测技术中心 一种判别进口铁矿石产地及品牌的方法
CN111220693A (zh) * 2020-03-16 2020-06-02 阿拉山口海关技术中心 一种基于多元素含量统计学的不同产地和田玉鉴别方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1008952A2 (en) * 1998-12-11 2000-06-14 Florida Department of Citrus System and method for identifying the geographic origin of a fresh commodity
CN101701916A (zh) * 2009-12-01 2010-05-05 中国农业大学 一种玉米品种快速鉴定、鉴别方法
CN103487537A (zh) * 2013-07-30 2014-01-01 中国标准化研究院 一种基于遗传算法优化西湖龙井茶产地检测方法
CN103913477A (zh) * 2014-04-16 2014-07-09 国家黄金钻石制品质量监督检验中心 一种泰山玉的产地鉴定方法
CN104502299A (zh) * 2014-12-12 2015-04-08 深圳市计量质量检测研究院 一种利用近红外光谱技术鉴别五常稻花香大米的方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9086344B2 (en) * 2011-08-22 2015-07-21 Weiqian Yi Antique identification method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112229863A (zh) * 2020-09-30 2021-01-15 上海海关工业品与原材料检测技术中心 一种铁矿石的原产国或品牌的鉴别方法
CN114441577A (zh) * 2022-01-24 2022-05-06 深圳市吉尔德技术有限公司 一种基于微量元素线性判别式分析的祖母绿产地判定方法
BE1030200B1 (de) * 2022-03-30 2023-08-16 Univ China Geosciences Wuhan Verfahren zur Unterscheidung verschiedener Kalksilikatfels-Lagerstätten basierend auf der Variationen in der Granatzusammensetzung
CN115372396A (zh) * 2022-10-26 2022-11-22 中国科学院地质与地球物理研究所 一种斜长石标样的确认方法
CN116559099A (zh) * 2023-07-07 2023-08-08 泉州海关综合技术服务中心 一种茶叶中重金属测定的设备和方法
CN116559099B (zh) * 2023-07-07 2023-09-19 泉州海关综合技术服务中心 一种茶叶中重金属测定的设备和方法
CN116777915A (zh) * 2023-08-23 2023-09-19 国检中心深圳珠宝检验实验室有限公司 一种海蓝宝石的子类识别方法、装置及存储介质

Also Published As

Publication number Publication date
CN105181907A (zh) 2015-12-23
CN105181907B (zh) 2017-01-04

Similar Documents

Publication Publication Date Title
WO2017063174A1 (zh) 一种定量判别软玉产地的方法
CN101701916B (zh) 一种玉米品种快速鉴定、鉴别方法
CN106355011B (zh) 一种地球化学数据元素序结构分析方法及装置
Qi et al. Rapid classification of archaeological ceramics via laser-induced breakdown spectroscopy coupled with random forest
Benn Clast morphology
CN109993459B (zh) 一种复杂多含水层矿井突水水源识别方法
CN110455803A (zh) 一种文物鉴定系统及鉴定方法
CN105843870B (zh) 重复性和再现性的分析方法及其应用
CN107238409B (zh) 一种用于识别宝石身份的方法及其识别系统
CN106560704A (zh) 联合同位素和微量元素检验的武夷岩茶产地鉴别方法
CN107478595A (zh) 一种快速鉴别珍珠粉真伪及定量预测掺伪贝壳粉含量的方法
CN102183500A (zh) 基于荧光特征参量欧氏距离的白酒鉴别方法
CN103822897A (zh) 一种基于红外光谱的白酒鉴定及溯源方法
TW201523467A (zh) 寶石認證及檢驗的方法和系統
CN108535258A (zh) 一种快速无损判别区分软玉产地的方法
CN105181809A (zh) 一种基于多波谱分析的珠宝品质鉴定方法及系统
CN106126883A (zh) 油套管质量水平评价方法
CN106323937B (zh) 一种高辨识力的原油指纹谱构建及鉴别方法
Luo et al. Origin determination of dolomite-related white nephrite through iterative-binary linear discriminant analysis
Katsurada et al. Geographic origin determination of Paraiba tourmaline
CN115901694A (zh) 一种鉴别琥珀产地的方法
Buchanan et al. On identifying stone tool production techniques: An experimental and statistical assessment of pressure versus soft hammer percussion flake form
CN110174392A (zh) 一种高辨识力多组分复杂油品的指纹谱构建及鉴别方法
Hu et al. Rapid authentication of green tea grade by excitation-emission matrix fluorescence spectroscopy coupled with multi-way chemometric methods
Grunsky et al. Classification of distinct eruptive phases of the diamondiferous Star kimberlite, Saskatchewan, Canada based on statistical treatment of whole rock geochemical analyses

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15906059

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15906059

Country of ref document: EP

Kind code of ref document: A1