CN110749565A - Method for rapidly identifying the storage years of Pu'er tea
- Publication number
- CN110749565A CN201911201641.2A
- Authority
- CN
- China
- Prior art keywords
- spectrum
- tea
- sample
- matrix
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/3563—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing solids; Preparation of samples therefor
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/359—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using near infrared light
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Pathology (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Algebra (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Immunology (AREA)
- Computing Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Operations Research (AREA)
- Probability & Statistics with Applications (AREA)
- Investigating Or Analysing Materials By Optical Means (AREA)
Abstract
The invention discloses a method for rapidly identifying the storage year of Pu'er tea, comprising the following steps. Collecting the raw spectrum: Pu'er tea leaves are ground and crushed to prepare a Pu'er tea sample, and the raw spectrum of the sample is measured with a near-infrared spectrometer. Preprocessing the raw spectrum: the raw spectrum is preprocessed with a combination of the first derivative and multiplicative scatter correction, and the preprocessed spectral data are divided into a calibration set and a validation set. Constructing a discriminant partial least squares model: a discriminant partial least squares model is built from the spectral data of the calibration set samples, and its validity is verified with the spectral data of the validation set samples. Identifying the sample: the near-infrared spectrum of a Pu'er tea sample of unknown storage year is acquired, preprocessed in the same way, and fed into the validated discriminant partial least squares model to obtain the storage year of the Pu'er tea. The method is simple to operate and gives highly accurate identification results.
Description
Technical Field
The invention relates to the technical field of tea identification, and in particular to a method for rapidly identifying the storage year of Pu'er tea.
Background
Pu'er tea is a tea product unique to Yunnan province, China. It is produced mainly in Xishuangbanna, Lincang, Pu'er and other areas of the province and has a distinctive taste and aroma. Studies report that Pu'er tea lowers blood lipids, has antibacterial and antiviral activity and has a laxative effect; these health benefits have made it popular with consumers and given it a large market. Pu'er tea is priced higher than most other teas on the market, prices differ widely between types, and the various types look alike, so ordinary consumers find them hard to tell apart. Sensory evaluation and physicochemical index testing are currently the two main methods for assessing the quality of Pu'er tea. However, sensory evaluation depends mainly on the experience of the evaluator and is prone to subjective bias, while physicochemical testing is complicated, time-consuming and labor-intensive.
In recent years, various analytical techniques have been combined with chemometric methods for the quantitative and qualitative analysis of Pu'er tea. The analytical techniques mainly include infrared spectroscopy, the electronic nose, laser-induced breakdown spectroscopy and surface-enhanced Raman spectroscopy; the chemometric methods include principal component analysis, artificial neural networks, linear discriminant analysis and support vector machines.
Near-infrared spectroscopy has been developed and widely adopted in recent years. It is a rapid, non-destructive detection method with low analysis cost and high speed, and is widely used in qualitative and quantitative analysis. The near-infrared spectrum of a sample is measured, and the spectral data are analyzed with chemometric methods to identify the sample.
Among existing spectroscopic methods for studying the storage year of Pu'er tea, some distinguish teas of different years by comparing absorption peaks and absorbance ratios, but cannot classify unknown samples. Others use chemometric methods for qualitative discrimination, but their accuracy is unsatisfactory. The spectral matrix obtained by near-infrared measurement contains much useless noise information, which causes large errors in the identification result.
Disclosure of Invention
To solve the above technical problems, the invention provides a method for rapidly identifying the storage year of Pu'er tea that is simple to operate and gives highly accurate identification results.
To achieve this purpose, the technical scheme of the invention is as follows:
A method for rapidly identifying the storage year of Pu'er tea comprises the following steps:
(1) collecting the raw spectrum: grinding and crushing Pu'er tea leaves to prepare a Pu'er tea sample, and measuring the raw spectrum of the sample with a near-infrared spectrometer;
(2) preprocessing the raw spectrum: preprocessing the raw spectrum with a combination of the first derivative and multiplicative scatter correction, and dividing the preprocessed spectral data into a calibration set and a validation set;
(3) constructing a discriminant partial least squares model: building a discriminant partial least squares model from the spectral data of the calibration set, and verifying the validity of the model with the spectral data of the validation set;
(4) identifying the sample: acquiring the near-infrared spectrum of a Pu'er tea sample of unknown storage year, preprocessing it with the combination of the first derivative and multiplicative scatter correction, and feeding the processed spectrum into the validated discriminant partial least squares model to obtain the storage year of the Pu'er tea.
In the above scheme, step (1) uses a Fourier transform near-infrared spectrometer with polytetrafluoroethylene as the background reference, a quartz halogen lamp as the light source, and a high-throughput double-axis Michelson interferometer.
In the above scheme, the preprocessing method in step (2) is as follows:
1) Calculate the average spectrum $\bar{X}$ of all samples:
$\bar{X}_j = \frac{1}{n}\sum_{i=1}^{n} X_{i,j}, \quad j = 1, \dots, k$; (1)
where n is the number of Pu'er tea samples, k is the number of wavenumber points on each spectrum, and $X_{i,j}$ is the absorbance of the ith sample at the jth wavenumber point;
2) Establish a linear regression between each spectrum $X_i$ and the average spectrum $\bar{X}$ to obtain $a_i$ and $b_i$:
$X_i = a_i + b_i \bar{X}$; (2)
3) Perform multiplicative scatter correction: from each spectrum $X_i$ and the corresponding $a_i$ and $b_i$, obtain the corrected spectrum $X_{i(\mathrm{MSC})}$:
$X_{i(\mathrm{MSC})} = (X_i - a_i)/b_i$; (3)
4) Apply the direct difference method to $X_{i(\mathrm{MSC})}$ to calculate the first derivative with difference width g at wavenumber point k:
$X_{i(1\mathrm{st})} = (X_{i,k+g} - X_{i,k})/g$; (4)
where $X_{i,k+g}$ and $X_{i,k}$ are the absorbances at wavenumber points k+g and k on the ith sample spectrum.
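The preprocessing steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the routine used in the patent: the function names `msc` and `first_derivative` are hypothetical, the regression against the mean spectrum is done with `np.polyfit`, and the default gap `g=1` is an assumption because the source does not state the value of g.

```python
import numpy as np

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    spectra: array of shape (n_samples, n_points), one spectrum per row.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean_spectrum = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        # Least-squares fit x ~ a + b * mean_spectrum (slope first, then intercept).
        b, a = np.polyfit(mean_spectrum, x, deg=1)
        corrected[i] = (x - a) / b          # formula (3)
    return corrected

def first_derivative(spectra, g=1):
    """Direct-difference first derivative with gap g, formula (4).

    The result has g fewer points per spectrum than the input.
    """
    spectra = np.asarray(spectra, dtype=float)
    return (spectra[:, g:] - spectra[:, :-g]) / g
```

Applying `first_derivative(msc(raw))` follows the order in which the steps are listed above: scatter correction first, then differencing.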
Further, the specific method of step (3) is as follows:
1) Decompose the spectral matrix $X_{n\times k}$ and the class matrix $C_{n\times 1}$ of the calibration set samples into principal components:
$X_{n\times k} = T_{n\times d} \cdot P_{d\times k} + E_{n\times k}$; (5)
$C_{n\times 1} = U_{n\times d} \cdot Q_{d\times 1} + F_{n\times 1}$; (6)
where $T_{n\times d}$ and $U_{n\times d}$ are the absorbance characteristic factor matrix and the class characteristic factor matrix, $P_{d\times k}$ and $Q_{d\times 1}$ are the absorbance loading matrix and the class loading matrix, and $E_{n\times k}$ and $F_{n\times 1}$ are error matrices;
2) Perform multiple linear regression of $U_{n\times d}$ on $T_{n\times d}$:
$U_{n\times d} = T_{n\times d} \cdot B$; (7)
$B = (X'_{n\times k} \cdot X_{n\times k})^{-1} \cdot X'_{n\times k} \cdot C_{n\times 1}$; (8)
3) Determine the number of characteristic factors d:
substituting equations (7) and (8) into equations (5) and (6) yields:
$C_{n\times 1} = T_{n\times d} \cdot B \cdot Q_{d\times 1} + F_{n\times 1}$; (9)
the number of characteristic factors d is determined from the root mean square error of cross-validation (RMSECV) between the true and predicted values of the calibration set;
4) Verify the validity of the discriminant partial least squares model with the spectral data of the validation set samples.
Preferably, the threshold of the discriminant partial least squares model is set to 0.5: when the absolute value of the difference between the predicted value and the actual value of a validation set sample is less than 0.5, the sample is considered correctly discriminated. The number of characteristic factors d is chosen as 6 to build the discriminant partial least squares model.
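The 0.5 threshold rule can be expressed as a short check. This is a sketch under the assumption that both the predicted and the actual values are numeric class labels; the function names `is_correct` and `accuracy` are hypothetical and do not come from the source.

```python
def is_correct(predicted, actual, threshold=0.5):
    """A sample counts as correctly discriminated when |predicted - actual| < threshold."""
    return abs(predicted - actual) < threshold

def accuracy(predicted_values, actual_values, threshold=0.5):
    """Fraction of samples that satisfy the threshold rule."""
    hits = sum(
        is_correct(p, a, threshold)
        for p, a in zip(predicted_values, actual_values)
    )
    return hits / len(actual_values)
```

For example, `accuracy([1.1, 2.6, 3.0], [1, 2, 3])` counts the second sample as an error because its prediction is 0.6 away from its label.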
Preferably, in step (1), the near-infrared spectrum is collected over the wavenumber range 10000-4000 cm⁻¹ with 32 repeated scans per acquisition and a spectral resolution of 4 cm⁻¹; each sample is measured 3 times and the measurements are averaged to give the final spectrum.
With the above technical scheme, the method for rapidly identifying the storage year of Pu'er tea provided by the invention has the following advantages:
1. Near-infrared spectroscopy is fast, efficient, low-cost and widely applicable, and can analyze solid samples directly. Full-spectrum analysis combined with chemometric methods can extract a large amount of useful information from the near-infrared spectrum.
2. In preprocessing, multiplicative scatter correction (MSC) corrects each spectrum against the average spectrum and can eliminate light-scattering effects caused by optical path differences and uneven sample particle size and density. The first derivative (1st Der), computed by the direct difference method, effectively removes baseline drift and background interference and improves the signal-to-noise ratio and resolution of the spectrum.
3. Discriminant partial least squares (PLS-DA) is a discriminant method based on partial least squares (PLS). It effectively combines multiple linear regression, canonical correlation analysis and principal component analysis, and discriminates well by projecting a high-dimensional data matrix into a lower-dimensional space. When decomposing the spectral matrix, PLS-DA brings the information of the class matrix into the spectral matrix before the orthogonal decomposition. This effectively removes useless noise from the spectral matrix as well as useless information from the class matrix, which helps to obtain an optimal calibration model.
The discrimination model for the storage year of Pu'er tea, built by combining near-infrared spectroscopy with the discriminant partial least squares algorithm, is highly accurate. Its results can be verified by physicochemical methods, which confirms the validity of the discrimination, and the method is simple to operate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 shows the raw spectra of Pu'er tea of five different storage years;
FIG. 2 shows the spectra after first-derivative and MSC preprocessing;
FIG. 3 shows the PLS-DA discriminant model results for the Pu'er tea calibration set samples;
FIG. 4 shows the PLS-DA discriminant model results for the Pu'er tea validation set samples.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The invention provides a method for rapidly identifying the storage year of Pu'er tea. A specific embodiment is as follows:
1. Samples
The samples are Pu'er teas of five different storage years from the unworked mountain of Pu'er city, Yunnan province, produced in 2010, 2012, 2014, 2016 and 2018 respectively, and are well representative. 40 samples were prepared from the Pu'er tea of each storage year, giving 200 samples in total. For each sample, 5 g of tea leaves (weighed to a precision of 0.01 g) were ground and crushed in a solid sample crusher (Wanke Instruments Co., Ltd.) and then placed in a sample bottle as a Pu'er tea sample.
2. Instrument and spectral acquisition
A Fourier transform near-infrared spectrometer (ABB MB3600, Switzerland) equipped with a high-sensitivity InGaAs detector, a diffuse reflection probe and a solid sample testing kit was used for the near-infrared measurements. The light source of the spectrometer is a quartz halogen lamp, and a high-throughput double-axis Michelson interferometer ensures stability and repeatability. Near-infrared spectral data were acquired with Horizon MB software (version 3.4.0.3, ABB MB3600, Switzerland). The scanning wavenumber range of the spectrometer is 10000-4000 cm⁻¹, and data were recorded at intervals of 1.928 cm⁻¹, giving 3112 variables per spectrum.
Each spectrum was obtained from 32 consecutive scans at a spectral resolution of 4 cm⁻¹; each sample was measured 3 times and the measurements were averaged to give the final spectrum. Polytetrafluoroethylene (PTFE, model skg8613g, ABB, Switzerland) was used as the background reference. The raw spectra collected are shown in FIG. 1.
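The acquisition settings can be mirrored in a small sketch that averages the three replicate measurements of each sample and rebuilds the wavenumber axis. The array layout `(n_samples, n_repeats, n_points)`, the function names, and the assumption of an evenly spaced grid that includes both end points are illustrative choices, not details given in the source.

```python
import numpy as np

def average_replicates(replicates):
    """Average repeated measurements of each sample.

    replicates: array of shape (n_samples, n_repeats, n_points).
    Returns an array of shape (n_samples, n_points).
    """
    return np.asarray(replicates, dtype=float).mean(axis=1)

def wavenumber_axis(start=10000.0, stop=4000.0, n_points=3112):
    """Evenly spaced wavenumber grid in cm^-1 (assumed to include both ends)."""
    return np.linspace(start, stop, n_points)
```

With these defaults the spacing between neighbouring points comes out close to the 1.928 cm⁻¹ interval quoted above.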
3. Spectral preprocessing
The invention preprocesses the spectra with a combination of the first derivative (1st Der) and multiplicative scatter correction (MSC). The specific process is as follows:
1) Calculate the average spectrum $\bar{X}$ of all samples:
$\bar{X}_j = \frac{1}{n}\sum_{i=1}^{n} X_{i,j}, \quad j = 1, \dots, k$; (1)
where n is 200, k is 3112, and $X_{i,j}$ is the absorbance of the ith sample at the jth wavenumber point;
2) Establish a linear regression between each spectrum $X_i$ and the average spectrum $\bar{X}$ to obtain $a_i$ and $b_i$:
$X_i = a_i + b_i \bar{X}$; (2)
3) Perform multiplicative scatter correction: from each spectrum $X_i$ and the corresponding $a_i$ and $b_i$, obtain the corrected spectrum $X_{i(\mathrm{MSC})}$:
$X_{i(\mathrm{MSC})} = (X_i - a_i)/b_i$; (3)
4) Apply the direct difference method to $X_{i(\mathrm{MSC})}$ to calculate the first derivative with difference width g at wavenumber point k:
$X_{i(1\mathrm{st})} = (X_{i,k+g} - X_{i,k})/g$; (4)
where $X_{i,k+g}$ and $X_{i,k}$ are the absorbances at wavenumber points k+g and k on the ith sample spectrum.
The spectrum after pretreatment is shown in FIG. 2.
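A short worked step may help show why dividing by the regression slope and subtracting the intercept removes scatter effects. Here $\bar{X}$ denotes the average spectrum and $e_i$ is a residual term introduced only for this illustration; it does not appear in the source.

```latex
\begin{aligned}
X_i &= a_i + b_i\,\bar{X} + e_i \\
X_{i(\mathrm{MSC})} &= \frac{X_i - a_i}{b_i} = \bar{X} + \frac{e_i}{b_i}
\end{aligned}
```

Under this assumed model the additive offset $a_i$ and the multiplicative factor $b_i$ drop out of the corrected spectrum, leaving the common average spectrum plus the sample-specific residual.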
The data of the 200 Pu'er tea samples of different storage years were divided into a calibration set and a validation set at a ratio of 3:1. For each year, 30 samples were assigned to the calibration set and 10 to the validation set, so that across the five storage years the calibration set contains 150 samples and the validation set contains 50.
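The per-year split can be sketched as follows. The source does not say how samples were assigned within each year, so the seeded random shuffle here is an assumption, as is the function name `stratified_split`.

```python
import random

def stratified_split(labels, n_calibration=30, seed=0):
    """Split sample indices so each class contributes n_calibration calibration samples.

    labels: one class label (for example a storage year) per sample.
    Returns (calibration_indices, validation_indices), both sorted.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    calibration, validation = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)  # assumed random assignment within each year
        calibration.extend(indices[:n_calibration])
        validation.extend(indices[n_calibration:])
    return sorted(calibration), sorted(validation)
```

With 40 samples for each of five years, this yields 150 calibration and 50 validation indices, matching the counts stated above.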
4. Construction of discriminant partial least squares model (PLS-DA)
PLS-DA is a partial least squares algorithm based on discriminant analysis. It performs PLS analysis on a matrix representing the class attributes of the samples and a matrix containing their spectral data, builds a PLS discriminant model relating the classification variable to the spectral data, and then predicts the class of the unknown samples in the validation set. The method replaces the concentration matrix of a partial least squares regression model with the class information matrix, which effectively removes noise from the spectral matrix as well as useless information from the class matrix and helps to obtain an optimal calibration model. The specific modeling method is as follows:
1) Decompose the spectral matrix $X_{n\times k}$ and the class matrix $C_{n\times 1}$ of the calibration set samples into principal components:
$X_{n\times k} = T_{n\times d} \cdot P_{d\times k} + E_{n\times k}$; (5)
$C_{n\times 1} = U_{n\times d} \cdot Q_{d\times 1} + F_{n\times 1}$; (6)
where $T_{n\times d}$ and $U_{n\times d}$ are the absorbance characteristic factor matrix and the class characteristic factor matrix, $P_{d\times k}$ and $Q_{d\times 1}$ are the absorbance loading matrix and the class loading matrix, and $E_{n\times k}$ and $F_{n\times 1}$ are error matrices;
in the calibration set, n is 150 and k is 3112.
2) Perform multiple linear regression of $U_{n\times d}$ on $T_{n\times d}$:
$U_{n\times d} = T_{n\times d} \cdot B$; (7)
$B = (X'_{n\times k} \cdot X_{n\times k})^{-1} \cdot X'_{n\times k} \cdot C_{n\times 1}$; (8)
3) Determine the number of characteristic factors d:
substituting equations (7) and (8) into equations (5) and (6) yields:
$C_{n\times 1} = T_{n\times d} \cdot B \cdot Q_{d\times 1} + F_{n\times 1}$; (9)
the number of characteristic factors d is determined from the root mean square error of cross-validation (RMSECV) between the true and predicted values of the calibration set;
In the invention, RMSECV decreases continuously as the number of characteristic factors increases. When d = 6, RMSECV is 0.0041; when d > 6, RMSECV tends to be stable and the residual matrix $F_{n\times 1}$ is negligible. Therefore 6 characteristic factors are selected for modeling.
4) The validity of the discriminant partial least squares model is verified with the spectral data of the validation set samples. Model performance is judged by the discrimination accuracy on the validation set: the higher the accuracy, the better the model. Spectral preprocessing and modeling in this study were done with Matlab and PLS_Toolbox. FIG. 3 shows the PLS-DA discriminant model results for the Pu'er tea calibration set samples, and FIG. 4 shows the results for the validation set samples. As can be seen from FIG. 3 and FIG. 4, with 1st Der and MSC preprocessing none of the 50 validation set samples is misclassified, and the discrimination accuracy is 100.00%.
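Choosing the number of characteristic factors from RMSECV can be illustrated with a compact NumPy sketch. It uses the standard NIPALS algorithm for a single response (PLS1) with the numeric class label as the response, which is a swapped-in textbook routine and not the PLS_Toolbox implementation used in the study. The function names and the interleaved 5-fold scheme are assumptions.

```python
import numpy as np

def pls1_fit(X, y, d):
    """Fit a PLS1 model with d factors using NIPALS; returns (x_mean, y_mean, coef)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(d):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # spectral loadings
        qa = yr @ t / tt                # response loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - qa * t                # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean

def rmsecv(X, y, d, n_folds=5):
    """Root mean square error of cross-validation with interleaved folds."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    folds = np.arange(len(y)) % n_folds
    errors = np.empty(len(y))
    for f in range(n_folds):
        held_out = folds == f
        model = pls1_fit(X[~held_out], y[~held_out], d)
        errors[held_out] = pls1_predict(model, X[held_out]) - y[held_out]
    return float(np.sqrt(np.mean(errors ** 2)))
```

Evaluating `rmsecv(X, y, d)` for a range of d and picking the point where the curve levels off corresponds to the selection described above. If the folds are not stratified by year, the calibration samples should be shuffled first.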
5. Identifying the storage year of Pu'er tea
A near-infrared scan of the Pu'er tea sample to be identified is performed to obtain its near-infrared spectral data. The data are preprocessed with 1st Der and MSC and then fed into the validated model to identify the storage year of the Pu'er tea.
6. Conclusion
Pu'er tea samples of five different storage years from Pu'er city, Yunnan province, were identified using near-infrared spectroscopy and the partial least squares algorithm. Preprocessing the near-infrared spectra with 1st Der and MSC and building a PLS-DA model gives a good identification result and allows the storage year of Pu'er tea to be determined accurately.
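The last step, turning the model output into a storage year, can be sketched as below. It assumes the class labels 1 to 5 assigned to the years 2010 to 2018 in the comparison of preprocessing methods and the 0.5 threshold of the preferred scheme. The function name `decode_prediction` and the choice to return `None` for an out-of-range value are illustrative.

```python
YEAR_TO_CLASS = {2010: 1, 2012: 2, 2014: 3, 2016: 4, 2018: 5}
CLASS_TO_YEAR = {label: year for year, label in YEAR_TO_CLASS.items()}

def decode_prediction(value, threshold=0.5):
    """Return the year whose class label lies within `threshold` of `value`.

    Returns None when the value is not close enough to any class label.
    """
    for label, year in CLASS_TO_YEAR.items():
        if abs(value - label) < threshold:
            return year
    return None
```

A predicted value of 3.2, for instance, is decoded as 2014, while a value of 0.4 matches no class.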
Comparison of different preprocessing methods:
The data of the 200 Pu'er tea samples of different storage years were divided into a calibration set and a validation set at a ratio of 3:1. For each year, 30 samples were assigned to the calibration set and 10 to the validation set, so that across the five storage years the calibration set contains 150 samples and the validation set contains 50.
PLS-DA discrimination models built from differently preprocessed spectral data
The raw spectral data were preprocessed with several methods, and the spectral data obtained with each method were divided into a calibration set and a validation set. Following the PLS-DA procedure, classification variables were assigned to the calibration set samples of each storage year: 1, 2, 3, 4 and 5 for the Pu'er tea of 2010, 2012, 2014, 2016 and 2018 respectively. Regression analysis was then performed between the spectra of the calibration set samples and their classification variables to build a PLS model relating the spectral features to the classification variable. The discrimination accuracies obtained with the different spectral preprocessing methods were compared, and the best preprocessing method was selected on that basis. The discrimination results of the PLS-DA models built with the different preprocessing methods for the storage year of Pu'er tea are given in Table 1.
TABLE 1 Calibration and prediction results of PLS-DA models built with different preprocessing methods
Preprocessing method | Calibration set misclassifications | Calibration set accuracy | Validation set misclassifications | Validation set accuracy |
Original spectrum | 15 | 90.00% | 17 | 66.00% |
MSC | 14 | 90.67% | 5 | 90.00% |
SNV | 15 | 90.00% | 5 | 90.00% |
Normalization | 16 | 89.33% | 6 | 88.00% |
1st Der | 1 | 99.33% | 1 | 98.00% |
2nd Der | 0 | 100.00% | 1 | 98.00% |
MSC+1st Der | 0 | 100.00% | 0 | 100.00% |
As shown in Table 1, when the spectral data are preprocessed with 1st Der combined with MSC, the PLS-DA model achieves the highest discrimination accuracy: 100% on both the calibration set and the validation set, with all 150 calibration samples and all 50 validation samples correctly discriminated. In summary, the PLS-DA model built on spectra preprocessed with 1st Der combined with MSC has the highest discrimination accuracy and better stability, which benefits sample discrimination. The invention therefore uses 1st Der combined with MSC as the optimal preprocessing method and builds a PLS-DA model for discriminant analysis of the storage year of Pu'er tea.
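The accuracies in Table 1 follow directly from the misclassification counts and the set sizes (150 calibration samples, 50 validation samples). The following sketch recomputes them; the dictionary layout and the function name `accuracy_percent` are illustrative.

```python
# (calibration misclassifications, validation misclassifications) from Table 1
TABLE_1 = {
    "Original spectrum": (15, 17),
    "MSC": (14, 5),
    "SNV": (15, 5),
    "Normalization": (16, 6),
    "1st Der": (1, 1),
    "2nd Der": (0, 1),
    "MSC+1st Der": (0, 0),
}

def accuracy_percent(errors, total):
    """Discrimination accuracy in percent, rounded to two decimals."""
    return round(100.0 * (total - errors) / total, 2)

for method, (cal_errors, val_errors) in TABLE_1.items():
    print(method, accuracy_percent(cal_errors, 150), accuracy_percent(val_errors, 50))
```

The recomputed values agree with the percentages listed in the table, which is a quick consistency check on the reconstructed rows.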
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (6)
1. A method for rapidly identifying the storage years of Pu'er tea, characterized by comprising the following steps:
(1) collecting an original spectrum: grinding and crushing Pu'er tea leaves to prepare a Pu'er tea sample, and measuring the original spectrum of the sample with a near-infrared spectrometer;
(2) preprocessing the original spectrum: preprocessing the original spectrum by a combination of the first derivative and multiplicative scatter correction (MSC), and dividing the preprocessed spectral data into a correction set and a verification set;
(3) constructing a discriminant partial least squares model: establishing the discriminant partial least squares model from the spectral data of the correction set, and verifying the effectiveness of the model with the spectral data of the verification set;
(4) identifying the sample: collecting a near-infrared spectrum of a Pu'er tea sample of unknown storage years to be identified, preprocessing the spectrum by the combination of the first derivative and multiplicative scatter correction, and importing the preprocessed spectrum into the validated discriminant partial least squares model to obtain the storage years of the Pu'er tea.
2. The method for rapidly identifying the storage years of Pu'er tea according to claim 1, wherein in the step (1) a Fourier transform near-infrared spectrometer is used, polytetrafluoroethylene is used for the background spectrum, the light source is a quartz halogen lamp, and a high-flux double-axis Michelson interferometer is adopted.
3. The method for rapidly identifying the storage years of Pu'er tea according to claim 1, wherein the preprocessing method in the step (2) is as follows:
1) calculating the average spectrum of all samples:
$\bar{X}_j = \frac{1}{n}\sum_{i=1}^{n} X_{i,j}$; (1)
wherein n is the number of Pu'er tea samples, k is the number of wavelength points on each spectrum, and $X_{i,j}$ is the absorbance value of the i-th sample at the j-th wavelength point;
2) establishing a linear regression relationship between each spectrum $X_i$ and the average spectrum $\bar{X}$ to obtain $a_i$ and $b_i$:
$X_i = a_i + b_i \bar{X}$; (2)
3) performing multiplicative scatter correction: obtaining the corrected spectrum $X_{i(\mathrm{MSC})}$ from each spectrum $X_i$ and the corresponding $a_i$ and $b_i$:
$X_{i(\mathrm{MSC})} = (X_i - a_i)/b_i$; (3)
4) calculating, by the direct difference method, the first derivative of $X_{i(\mathrm{MSC})}$ at wavelength point k with a difference width of g according to the following formula:
$X_{i(\mathrm{1st})} = (X_{i,k+g} - X_{i,k})/g$; (4)
wherein $X_{i,k+g}$ and $X_{i,k}$ are the absorbances of the i-th spectrum at wavenumber points k+g and k, respectively.
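The preprocessing of claim 3 can be sketched in Python with NumPy as follows. This is an illustrative sketch, not the patent's implementation: it assumes the standard MSC form in which each spectrum is regressed on the mean spectrum of the set, and the function names `msc` and `first_derivative`, the synthetic data, and the gap `g=2` are all hypothetical choices.

```python
import numpy as np

def msc(spectra):
    """Multiplicative scatter correction of an (n, k) absorbance matrix."""
    spectra = np.asarray(spectra, dtype=float)
    mean_spectrum = spectra.mean(axis=0)  # average over the n samples
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        # Least-squares fit x ~ a_i + b_i * mean_spectrum (assumed regression form).
        b_i, a_i = np.polyfit(mean_spectrum, x, deg=1)
        corrected[i] = (x - a_i) / b_i  # equation (3)
    return corrected

def first_derivative(spectra, g=1):
    """Direct-difference first derivative with difference width g, equation (4)."""
    spectra = np.asarray(spectra, dtype=float)
    return (spectra[:, g:] - spectra[:, :-g]) / g

# Synthetic spectra: one underlying shape with different offsets and scalings.
base = np.sin(np.linspace(0.0, 3.0, 50)) + 1.0
raw = np.array([0.1 + 0.8 * base, 0.3 + 1.2 * base, -0.2 + 1.0 * base])

corrected = msc(raw)
preprocessed = first_derivative(corrected, g=2)
```

Because the synthetic spectra differ only by an offset and a scale factor, MSC maps all of them onto the mean spectrum; real spectra would retain their chemical differences. The derivative shortens each spectrum by g points.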
4. The method for rapidly identifying the storage years of Pu'er tea according to claim 3, wherein the specific method of the step (3) is as follows:
1) performing principal component decomposition on the spectral matrix $X_{n\times k}$ and the category matrix $C_{n\times 1}$ of the correction set samples:
$X_{n\times k} = T_{n\times d} \cdot P_{d\times k} + E_{n\times k}$; (5)
$C_{n\times 1} = U_{n\times d} \cdot Q_{d\times 1} + F_{n\times 1}$; (6)
in the above formulas, $T_{n\times d}$ and $U_{n\times d}$ are respectively the absorbance characteristic factor matrix and the class characteristic factor matrix, $P_{d\times k}$ and $Q_{d\times 1}$ are respectively the absorbance loading matrix and the class loading matrix, and $E_{n\times k}$ and $F_{n\times 1}$ are error matrices;
2) performing multiple linear regression between $T_{n\times d}$ and $U_{n\times d}$:
$U_{n\times d} = T_{n\times d} \cdot B$; (7)
$B = (X'_{n\times k} \cdot X_{n\times k})^{-1} \cdot X'_{n\times k} \cdot C_{n\times 1}$; (8)
3) determining the number of characteristic factors d:
substituting equations (7) and (8) into equations (5) and (6) yields:
$C_{n\times 1} = T_{n\times d} \cdot B \cdot Q_{d\times 1} + F_{n\times 1}$; (9)
and determining the number of characteristic factors d according to the root mean square error of cross-validation between the true values and the predicted values of the correction set;
4) verifying the effectiveness of the discriminant partial least squares model by using the spectral data of the verification set samples.
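The model-building step of claim 4 can be illustrated with a small partial least squares fit. The sketch below uses the common NIPALS PLS1 algorithm for a single class column, which is an assumption: the claim does not name a specific algorithm. The function names, the synthetic two-class data, the 0/1 class coding, and the choice d=2 are all hypothetical, and the cross-validation used to choose d is not shown.

```python
import numpy as np

def pls1_fit(X, c, d):
    """Fit a PLS1 model with d latent factors; returns (coef, x_mean, c_mean)."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float).ravel()
    x_mean, c_mean = X.mean(axis=0), c.mean()
    E, f = X - x_mean, c - c_mean          # centered spectra and class values
    weights, loadings, q = [], [], []
    for _ in range(d):
        w = E.T @ f
        w /= np.linalg.norm(w)             # weight vector for this factor
        t = E @ w                          # scores (one column of T)
        tt = t @ t
        p = E.T @ t / tt                   # spectral loading (one row of P)
        q_a = f @ t / tt                   # class loading (one entry of Q)
        E = E - np.outer(t, p)             # deflate the spectral residual
        f = f - q_a * t                    # deflate the class residual
        weights.append(w)
        loadings.append(p)
        q.append(q_a)
    W, P = np.array(weights).T, np.array(loadings).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(q))
    return coef, x_mean, c_mean

def pls1_predict(model, X):
    """Predict class values for new spectra from a fitted model."""
    coef, x_mean, c_mean = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + c_mean

# Synthetic two-class data: class 1 has higher values in the first 5 variables.
rng = np.random.default_rng(0)
labels = np.repeat([0.0, 1.0], 20)
X_demo = rng.normal(0.0, 0.01, size=(40, 20))
X_demo[labels == 1.0, :5] += 0.5

model = pls1_fit(X_demo, labels, d=2)
predicted = pls1_predict(model, X_demo)
```

In practice the same fit would be repeated for several values of d, and the value with the lowest cross-validated root mean square error on the correction set would be kept.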
5. The method for rapidly identifying the storage years of Pu'er tea according to claim 4, wherein the threshold of the discriminant partial least squares model is set to 0.5, and the model discrimination is judged correct when the absolute value of the difference between the predicted value and the actual value of a verification set sample is less than 0.5; and the number of characteristic factors d is selected as 6 to establish the discriminant partial least squares model.
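The decision rule of claim 5 can be written as a short check. This is a sketch under assumptions: the function name `judge` is hypothetical, and the example class values (0 and 1) and predictions are made-up numbers used only to show the rule, not data from the patent.

```python
import numpy as np

def judge(predicted, actual, threshold=0.5):
    """Return a per-sample correctness mask and the accuracy in percent."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # A sample is judged correctly when |predicted - actual| < threshold.
    correct = np.abs(predicted - actual) < threshold
    return correct, 100.0 * correct.mean()

# Hypothetical predictions for four verification set samples.
correct, accuracy = judge([0.1, 0.8, 1.6, 0.4], [0, 1, 1, 0])
```

Here the third sample deviates from its class value by 0.6, so it is counted as misjudged and the accuracy is 75%.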
6. The method according to claim 2, wherein in the step (1) the near-infrared spectrum is collected over the wavenumber range 10000-4000 cm⁻¹, each spectrum is acquired with 32 repeated scans at a spectral resolution of 4 cm⁻¹, and each sample is measured 3 times and the measurements are averaged to give the final measured spectrum.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911201641.2A CN110749565A (en) | 2019-11-29 | 2019-11-29 | Method for rapidly identifying storage years of Pu' er tea |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110749565A true CN110749565A (en) | 2020-02-04 |
Family
ID=69285096
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911201641.2A Pending CN110749565A (en) | 2019-11-29 | 2019-11-29 | Method for rapidly identifying storage years of Pu' er tea |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110749565A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111208251A (en) * | 2020-01-16 | 2020-05-29 | 中国农业科学院茶叶研究所 | Method for judging year of white tea by taking S-linalool and R/S-dihydroactinidiolide as markers |
CN111415715A (en) * | 2020-04-17 | 2020-07-14 | 北京北分瑞利分析仪器(集团)有限责任公司 | Intelligent correction method, system and device based on multivariate spectral data |
CN111665216A (en) * | 2020-06-02 | 2020-09-15 | 中南民族大学 | Method for judging pollution degree of escherichia coli and staphylococcus aureus in quick-frozen rice-flour product |
CN112161968A (en) * | 2020-09-02 | 2021-01-01 | 滨州医学院 | Donkey-hide gelatin brand identification method based on data fusion |
CN112161949A (en) * | 2020-09-16 | 2021-01-01 | 贵州国台酒业股份有限公司 | Method for identifying Maotai-flavor liquor brewing process based on infrared spectrum technology |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104297203A (en) * | 2014-09-28 | 2015-01-21 | 安徽农业大学 | Rapid discriminant method for fermentation quality of congou black tea on basis of near-infrared spectrum analysis technology |
CN105651742A (en) * | 2016-01-11 | 2016-06-08 | 北京理工大学 | Laser-induced breakdown spectroscopy based explosive real-time remote detection method |
Non-Patent Citations (5)
Title |
---|
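Yan Yanlu et al.: "Principles, Technology and Applications of Near-Infrared Spectroscopy Analysis", 31 January 2013, China Light Industry Press * |
Shang Yanyi et al.: "Analysis and Mechanism Study of Rice Blast Disease in Cold-Region Rice Based on Spectral Technology", 30 June 2016, Harbin Engineering University Press * |
Li Zhigang et al.: "Spectral Data Processing and Quantitative Analysis Technology", 31 July 2017, Beijing University of Posts and Telecommunications Press * |
Zheng Hua et al.: "Identification of ripened Liubao tea of different aging times by FTIR combined with PLS-DA", "Food Industry" * |
Chen Lu et al.: "Rapid identification of honeysuckle from different producing areas based on near-infrared spectroscopy", "Quality and Safety of Agro-Products" * |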
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111208251A (en) * | 2020-01-16 | 2020-05-29 | 中国农业科学院茶叶研究所 | Method for judging year of white tea by taking S-linalool and R/S-dihydroactinidiolide as markers |
CN111415715A (en) * | 2020-04-17 | 2020-07-14 | 北京北分瑞利分析仪器(集团)有限责任公司 | Intelligent correction method, system and device based on multivariate spectral data |
CN111415715B (en) * | 2020-04-17 | 2023-09-01 | 北京北分瑞利分析仪器(集团)有限责任公司 | Intelligent correction method, system and device based on multi-element spectrum data |
CN111665216A (en) * | 2020-06-02 | 2020-09-15 | 中南民族大学 | Method for judging pollution degree of escherichia coli and staphylococcus aureus in quick-frozen rice-flour product |
CN112161968A (en) * | 2020-09-02 | 2021-01-01 | 滨州医学院 | Donkey-hide gelatin brand identification method based on data fusion |
CN112161949A (en) * | 2020-09-16 | 2021-01-01 | 贵州国台酒业股份有限公司 | Method for identifying Maotai-flavor liquor brewing process based on infrared spectrum technology |
CN112161949B (en) * | 2020-09-16 | 2023-07-18 | 贵州国台酒业集团股份有限公司 | Method for identifying Maotai-flavor liquor brewing process based on infrared spectrum technology |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110749565A (en) | Method for rapidly identifying storage years of Pu' er tea | |
CN101915744B (en) | Near infrared spectrum nondestructive testing method and device for material component content | |
CN103776777B (en) | Method for identifying ginsengs with different growth patterns by using near infrared spectrum technology and determining content of components in ginsengs | |
CN104965973B (en) | A kind of Apple Mould Core multiple-factor Non-Destructive Testing discrimination model and method for building up thereof | |
CN102937575B (en) | Watermelon sugar degree rapid modeling method based on secondary spectrum recombination | |
CN108760647A (en) | A kind of wheat content of molds line detecting method based on Vis/NIR technology | |
CN111272696A (en) | Method for rapidly detecting essence doped in Pu' er tea | |
WO2020248961A1 (en) | Method for selecting spectral wavenumber without reference value | |
CN104596979A (en) | Method for measuring cellulose of reconstituted tobacco by virtue of near infrared reflectance spectroscopy technique | |
CN105181761A (en) | Method for rapidly identifying irradiation absorbed dose of tea by using electronic nose | |
CN104596975A (en) | Method for measuring lignin of reconstituted tobacco by paper-making process by virtue of near infrared reflectance spectroscopy technique | |
CN106018331A (en) | Stability evaluation method of multi-channel spectrum system and pretreatment optimization method | |
CN113655027B (en) | Method for near infrared rapid detection of tannin content in plants | |
CN110672578A (en) | Model universality and stability verification method for polar component detection of frying oil | |
CN104596976A (en) | Method for determining protein of paper-making reconstituted tobacco through near infrared reflectance spectroscopy technique | |
Hu et al. | Identification and quantification of adulterated Tieguanyin based on the fluorescence hyperspectral image technique | |
CN105699314B (en) | A method of detecting soil stabilization carbon isotope ratio using middle infrared spectrum | |
CN105784629B (en) | The method that the stable carbon isotope ratio of soil is quickly detected using middle infrared spectrum | |
CN107238557A (en) | A kind of method of utilization near infrared spectroscopy quick detection calcium carbonate particle diameter distribution | |
CN111289451B (en) | Method for quantitatively calculating concentration of complex spectral components | |
Jiang et al. | The utility of Fourier transform near-infrared spectroscopy to identify geographical origins of Chinese pears | |
CN112763448A (en) | ATR-FTIR technology-based method for rapidly detecting content of polysaccharides in rice bran | |
CN106568740A (en) | Method for rapid judging of varieties of fresh tea leaves by near infrared spectroscopy | |
CN107884360B (en) | Cigarette paper combustion improver detection method | |
CN110907392A (en) | Melamine detection system based on infrared spectroscopic analysis and application and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200204 |