WO2020232959A1 - Near infrared spectral feature extraction method and system based on functional principal component analysis - Google Patents
- Publication number
- WO2020232959A1 (application PCT/CN2019/111602)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- infrared spectrum
- function
- principal component
- formula
- band
Classifications
- G: PHYSICS
- G01: MEASURING; TESTING
- G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17: Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25: Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35: as G01N21/31, using infrared light
- G01N21/3563: as G01N21/35, for analysing solids; Preparation of samples therefor
- G01N21/359: as G01N21/35, using near infrared light
- G01N2021/3572: Preparation of samples, e.g. salt matrices
Definitions
- The invention relates to the technical field of non-destructive analysis by near-infrared spectroscopy, and more specifically to a near-infrared spectral feature extraction method and system based on functional principal component analysis.
- Near-infrared light transmits well through conventional optical fibers, and near-infrared instruments are simple, fast, and non-destructive and need little sample preparation. The technique suits almost all kinds of samples (liquid, viscous, coated, powdered, and solid) and allows simultaneous multi-component, multi-channel determination. It is widely used in agriculture and animal husbandry, food, chemical, petrochemical, pharmaceutical, and tobacco applications, and has broad scope for use in scientific research, teaching, and production process control.
- X hydrogen-containing group
- Different groups such as methyl, methylene, benzene ring, etc.
- Material quality parameters (such as component content) are also related to their composition and structure information.
- Applying chemometric methods to correlate the two establishes a qualitative or quantitative relationship between them, namely the calibration model. Once the calibration model is established, measuring the near-infrared spectrum of an unknown sample is enough to predict the quality parameters of that sample.
- Near-infrared spectral data are high-dimensional and have overlapping bands, which makes it difficult to extract the key principal component information of a sample.
- How to map features from a high-dimensional space to a low-dimensional space, so that the key principal component information of sample spectral data can be extracted, is a technical problem that urgently needs to be solved.
- PCA: principal component analysis
- LDA: linear discriminant analysis
- GA: genetic algorithm
- UVE: uninformative variable elimination
- iPLS: interval partial least squares
- SPA: successive projections algorithm
- The dimensionality reduction algorithms commonly used in the prior art start only from the spectral data themselves, that is, from the discrete points of the spectral data, to map features from a high-dimensional space to a low-dimensional space.
- The internal structure of spectral data, however, is functional, that is, continuous.
- As a result, prior-art dimensionality reduction algorithms leave a lot of potential feature information, such as derivative and order information, unmined.
- The technical problem to be solved by the present invention is to provide a near-infrared spectral feature extraction method and system based on functional principal component analysis, so as to solve the problem described in the background art that prior-art dimensionality reduction algorithms cannot obtain feature information such as derivative and order.
- the present invention provides the following technical solutions:
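- A near-infrared spectral feature extraction method based on functional principal component analysis, including the following steps:
  S1. Collect near-infrared spectral data from multiple kinds of samples;
  S2. Preprocess the near-infrared spectral data using standard normal transformation;
  S3. Obtain the spline function of the preprocessed near-infrared spectral data;
  S4. Center the spline function;
  S5. Calculate the covariance of the centered spline function between different band functions;
  S6. Calculate the j-th eigenvalue of the covariance;
  S7. Calculate the cumulative contribution from the eigenvalues of the covariance, and take the principal components whose contribution exceeds the threshold as the feature values of the near-infrared spectral bands;
  S8. Use the feature values of the near-infrared spectral bands to calculate the functional principal component scores in the equations of the different bands.
- The feature mapping from high dimensionality to low dimensionality is thereby completed, and further information such as the order and derivative of the intrinsic function of the spectral data can be mined at the same time.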
- Step S1 includes:
- The spectrometer is a NIRQuest512 near-infrared spectrometer produced by Ocean Optics (United States), equipped with an HL-2000 series halogen light source with a wavelength range of 360 nm to 2000 nm.
- The spectrometer has a resolution of 3 cm⁻¹, an integration time of 45 s, and a scanning wavelength range of 900 nm to 1700 nm, with a built-in high-stability Hamamatsu indium gallium arsenide (InGaAs) array detector with 512 pixels and 32 scans;
- The samples to be tested are dried slices of wild matsutake, Agaricus blazei, old man's head, and Pleurotus eryngii, and spectra are sampled from the dried slices.
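Step S2, which follows acquisition, preprocesses each spectrum with the standard normal transformation. A minimal sketch is given below, assuming this refers to the standard normal variate (SNV) transform commonly applied to near-infrared spectra, in which each spectrum is centered by its own mean and scaled by its own standard deviation. The function name `snv` and the toy values are illustrative and do not come from the source.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: give each row (one spectrum) zero mean and unit std."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Toy data: 3 spectra sampled at 5 wavelengths (made-up values, not real measurements).
raw = np.array([[0.10, 0.20, 0.40, 0.30, 0.20],
                [0.30, 0.50, 0.90, 0.70, 0.50],
                [1.00, 1.10, 1.30, 1.20, 1.10]])
corrected = snv(raw)
```

The three toy spectra differ only by an offset and a scale factor, so SNV maps them onto the same curve; this is the kind of baseline and scatter variation the preprocessing is meant to remove.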
- Step S3 includes:
- In formula (1), $\phi_k(t)$ is the k-th B-spline basis function of the near-infrared spectral band, k is less than or equal to m, m is the number of B-spline basis functions, C is the coefficient matrix, and X(t) is the functional form of the near-infrared spectral data
- t is the band of the near-infrared spectrum
- $\sum$ denotes summation
- $\mathrm{PEN}_2(X)$ is the roughness penalty
- DDX(t) is the second derivative of the function X(t);
- x is the observed near-infrared spectral data
- $x_j$ is the observation of the j-th near-infrared spectrum
- j is a positive integer less than m
- $\mathrm{SMSSE}(x \mid c)$ is the residual sum of squares to be minimized
- $t_j$ is the j-th near-infrared spectral band
- k ≤ K ≤ j, and $\phi_k(t_j)$ is the B-spline basis function at the j-th near-infrared spectral band
- The coefficient matrix C is estimated, and the roughness penalty is used to smooth the function, which effectively avoids over-fitting.
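The formulas of step S3 are not reproduced in this text. A reconstruction consistent with the symbol definitions above, with formulas (2) and (3) as stated in the claims, and with the standard roughness-penalty smoothing formulation is sketched below. The index ranges (k = 1, ..., m, and N for the number of observed bands), the scalar coefficients $c_k$ taken from C, and the number (4) for the residual term are assumptions; DDX(t) is written as $D^2X(t)$.

```latex
\begin{align}
X(t) &= \sum_{k=1}^{m} c_k\,\phi_k(t) \tag{1}\\
\mathrm{PEN}_2(X) &= \int \left[ D^2 X(t) \right]^2 dt \tag{2}\\
\mathrm{PENSSE}_{\lambda} &= \mathrm{SMSSE}(x \mid c) + \gamma\,\mathrm{PEN}_2(X) \tag{3}\\
\mathrm{SMSSE}(x \mid c) &= \sum_{j=1}^{N} \Bigl[ x_j - \sum_{k=1}^{m} c_k\,\phi_k(t_j) \Bigr]^2 \tag{4}
\end{align}
```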
- Step S4 includes:
- The sample functions are centered. In the centering formula:
- i is the sample index
- n is the total number of samples
- $X_i(t)$ is the function of near-infrared spectral band t for the i-th sample
- c denotes centering
- s.t. denotes the constraint (subject to)
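A form of the centering formula consistent with these definitions is sketched below. The symbol $\bar{X}(t)$ for the mean function over the n samples is assumed notation, and $X_i^{c}(t)$ denotes the centered function of the i-th sample, as in the claims.

```latex
X_i^{c}(t) = X_i(t) - \bar{X}(t), \qquad \text{s.t.}\quad \bar{X}(t) = \frac{1}{n}\sum_{i=1}^{n} X_i(t)
```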
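- Step S5 includes:
- In the covariance formula, s is a near-infrared spectral band different from t, and $X_i^{c}(s)$ is the centered function of near-infrared spectral band s for the i-th sample.
- V(s, t) is the covariance between the two bands s and t.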
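The covariance formula itself is not reproduced here. The usual sample covariance function in functional principal component analysis, written with the symbols defined above, would be as follows; normalizing by n rather than n − 1 is an assumption.

```latex
V(s,t) = \frac{1}{n}\sum_{i=1}^{n} X_i^{c}(s)\,X_i^{c}(t)
```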
- Step S6 includes:
- $\xi_j(t)$ is the j-th principal component weight function at band t
- $\xi_j(s)$ is the j-th principal component weight function at band s
- j is a positive integer
- $\rho_j$ is the eigenvalue
- s.t. denotes the constraint (subject to)
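With these symbols, the eigen-equation of step S6 presumably takes the standard functional form below, in which the constraint normalizes each weight function. This is a sketch of the textbook formulation, not a transcription of the source formula; standard treatments additionally require the weight functions of different components to be mutually orthogonal.

```latex
\int V(s,t)\,\xi_j(s)\,ds = \rho_j\,\xi_j(t), \qquad \text{s.t.}\quad \int \xi_j^{2}(t)\,dt = 1
```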
- Step S7 includes:
- M is the number of principal components
- The threshold is set to 90%.
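The selection rule in step S7 can be illustrated with a short sketch. It assumes the cumulative contribution of the first M components is the sum of the M largest eigenvalues divided by the sum of all eigenvalues, which is the usual definition; the function name `select_components` and the eigenvalues are made up for illustration.

```python
import numpy as np

def select_components(eigenvalues, threshold=0.90):
    """Return (M, cumulative): M is the smallest number of leading components
    whose cumulative contribution reaches the threshold."""
    rho = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    cumulative = np.cumsum(rho) / rho.sum()
    M = int(np.searchsorted(cumulative, threshold)) + 1
    return min(M, rho.size), cumulative

# Made-up eigenvalues, for illustration only.
M, cumulative = select_components([6.0, 2.5, 1.0, 0.3, 0.2])
```

With these values the cumulative contributions are 0.60, 0.85, 0.95, 0.98, and 1.00, so three components are kept at the 90% threshold.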
- Step S8 includes:
- $f_j$ is the j-th functional principal component.
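The score formula is not reproduced in this text. In the standard formulation, the score of a sample on the j-th component is the inner product of its centered function with the j-th weight function, as sketched below; the sample index i on $f_{ij}$ is an assumption added for clarity.

```latex
f_{ij} = \int \xi_j(t)\,X_i^{c}(t)\,dt, \qquad j = 1,\dots,M
```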
- The present invention also provides a system that uses the near-infrared spectral feature extraction method based on functional principal component analysis according to any one of the above solutions, including:
- an acquisition module, used to collect near-infrared spectral data from multiple kinds of samples
- a preprocessing module, used to preprocess the near-infrared spectral data using standard normal transformation
- an obtaining module, used to obtain the spline function of the preprocessed near-infrared spectral data
- a centering module, used to center the spline function
- a covariance module, used to calculate the covariance of the centered spline function between different band functions
- an eigenvalue module, used to calculate the j-th eigenvalue of the covariance
- a contribution module, used to calculate the cumulative contribution from the eigenvalues of the covariance, with the principal components whose contribution exceeds the threshold taken as the feature values of the near-infrared spectral bands;
- a principal component score module, used to calculate the functional principal component scores in the equations of the different bands from the feature values of the near-infrared spectral bands.
- The present invention treats near-infrared spectral data as a continuous function and uses full-band information. While accurately extracting band features that carry useful information, it obtains eigenvalues from the covariance of the different band functions, calculates contributions from those eigenvalues to obtain the feature values of the near-infrared spectral bands, and from these obtains the functional principal component scores in the equations of the different bands. This allows further mining of the order and derivative (rate of change, slope, curvature, etc.) information of the intrinsic function of the spectral data;
- By obtaining the functional principal component scores in the equations of the different bands together with the rate of change, slope, curvature, and other information, the present invention enhances the robustness of the calibration model and improves its predictive ability.
- The invention provides a new feature extraction method for near-infrared spectral data and has high practical value.
- FIG. 1 is a schematic flowchart of a near-infrared spectrum feature extraction method based on functional principal component analysis according to Embodiment 1 of the present invention.
- FIG. 2 is a schematic diagram of B-spline basis functions in a near-infrared spectrum feature extraction method based on functional principal component analysis according to Embodiment 1 of the present invention
- FIG. 3 is a schematic diagram of the functional description of the near-infrared spectra of edible fungi in a near-infrared spectrum feature extraction method based on functional principal component analysis according to Embodiment 1 of the present invention
- FIG. 4 is a load distribution diagram of functional principal component analysis of edible fungi in a near-infrared spectral feature extraction method based on functional principal component analysis according to Embodiment 1 of the present invention
- FIG. 5 is a diagram of two principal component analysis results of four kinds of edible fungi spectral data in a near-infrared spectral feature extraction method based on functional principal component analysis according to Embodiment 1 of the present invention.
- A near-infrared spectral feature extraction method based on functional principal component analysis uses near-infrared spectral analysis to distinguish different types of edible fungi and to obtain order and derivative (rate of change, slope, curvature, etc.) information;
- The present invention is not limited to screening the types of edible fungi. The steps are as follows:
- The nutrient components include protein, fat, and a variety of amino acids. The spectrometer is a NIRQuest512 near-infrared spectrometer produced by Ocean Optics (United States). The HL-2000 series tungsten halogen light source has a wavelength range of 360 nm to 2000 nm. The spectrometer has a resolution of 3 cm⁻¹, an integration time of 45 s, and a scanning wavelength range of 900 nm to 1700 nm, with a built-in high-stability Hamamatsu indium gallium arsenide (InGaAs) array detector with 512 pixels and 32 scans;
- The samples to be tested are wild matsutake, Agaricus blazei, old man's head, and Pleurotus eryngii, 166 dried slices in total, and spectra are sampled from all 166 dried slices;
- Fig. 2 is a schematic diagram of the B-spline basis function in a near-infrared spectrum feature extraction method based on functional principal component analysis according to an embodiment of the present invention.
- The B-spline function is a seventh-order spline function with 21 basis functions. Using these B-spline functions, the near-infrared spectral data of each sample are described as a function by formula (1), in which:
- $\phi_k(t)$ is the k-th B-spline basis function of the near-infrared spectral band
- k is less than or equal to m
- m is the number of corresponding B-spline basis functions
- C is the coefficient matrix
- X(t) is the functional form of the near-infrared spectral data
- t is the band of the near-infrared spectrum
- $\sum$ denotes summation, so that the near-infrared spectral data can be described as a linear combination of a chosen set of basis functions $\phi_k(t)$;
- $\mathrm{PEN}_2(X)$ is the roughness penalty
- DDX(t) is the second derivative of the function X(t);
- $x_j$ is the observation of the j-th near-infrared spectrum
- $\mathrm{SMSSE}(x \mid c)$ is the residual sum of squares to be minimized
- $t_j$ is the band of the j-th near-infrared spectrum
- k ≤ K ≤ j, and $\phi_k(t_j)$ is the B-spline basis function at the j-th near-infrared spectral band
- Equation (4) can be estimated by the least squares method, but least squares estimation is susceptible to noise and over-fitting. The roughness penalty is therefore used to smooth the function: the integral of the squared second derivative controls the smoothness of the curve. The residual sum of squares and the roughness penalty are then combined to estimate the coefficient matrix C, where γ is the smoothing coefficient; the larger γ is, the closer the fit is to a straight line. Fitting here means that, given some points in space, a continuous surface of known form and unknown parameters is found that approximates those points as closely as possible;
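The penalized estimate described here can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the patent's implementation: it uses a cubic basis with 11 functions instead of the seventh-order, 21-function basis of the embodiment, it approximates the second derivatives of the basis functions by finite differences on a fine grid, and the names `bspline_basis` and `fit_penalized` and the synthetic signal are invented for the example.

```python
import numpy as np

def bspline_basis(t, knots, degree):
    """Evaluate all B-spline basis functions at points t (Cox-de Boor recursion)."""
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    n_int = len(knots) - 1
    B = np.zeros((t.size, n_int))
    for i in range(n_int):                        # degree 0: interval indicators
        B[:, i] = (knots[i] <= t) & (t < knots[i + 1])
    last = np.nonzero(np.diff(knots) > 0)[0][-1]
    B[t == knots[-1], last] = 1.0                 # close the final interval on the right
    for d in range(1, degree + 1):
        Bn = np.zeros((t.size, n_int - d))
        for i in range(n_int - d):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                Bn[:, i] += (t - knots[i]) / left * B[:, i]
            if right > 0:
                Bn[:, i] += (knots[i + d + 1] - t) / right * B[:, i + 1]
        B = Bn
    return B

def fit_penalized(t, x, knots, degree, gamma, n_quad=2001):
    """Minimize SMSSE(x|c) + gamma * PEN_2(X) over the coefficients c."""
    Phi = bspline_basis(t, knots, degree)
    grid = np.linspace(knots[0], knots[-1], n_quad)
    h = grid[1] - grid[0]
    # Second derivatives of the basis functions by finite differences (approximation).
    D2 = np.diff(bspline_basis(grid, knots, degree), n=2, axis=0) / h**2
    R = h * D2.T @ D2                             # approximates the roughness penalty matrix
    c = np.linalg.solve(Phi.T @ Phi + gamma * R, Phi.T @ x)
    return c, Phi

degree = 3                                        # cubic, for illustration only
breaks = np.linspace(900.0, 1700.0, 9)
knots = np.r_[[breaks[0]] * degree, breaks, [breaks[-1]] * degree]
t = np.linspace(900.0, 1700.0, 120)               # band positions in nm
rng = np.random.default_rng(0)
x = np.sin((t - 900.0) / 130.0) + 0.05 * rng.standard_normal(t.size)  # synthetic signal
c_rough, Phi = fit_penalized(t, x, knots, degree, gamma=0.0)
c_smooth, _ = fit_penalized(t, x, knots, degree, gamma=1e10)
```

Setting `gamma=0.0` gives the plain least squares fit, while a large `gamma` pulls the fitted curve `Phi @ c_smooth` toward a straight line, which matches the behaviour described for the smoothing coefficient.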
- FIG. 3 is a schematic diagram of functional description of the near-infrared spectrum of edible fungi in a method for extracting near-infrared spectrum features based on functional principal component analysis according to an embodiment of the present invention.
- The differences between the near-infrared spectral data require the functions of those data to be centered.
- In the centering formula:
- i is the sample index
- n is the total number of samples
- $X_i(t)$ is the function of near-infrared spectral band t for the i-th sample
- c denotes centering
- s.t. denotes the constraint (subject to)
- V(s, t) is the covariance between two different band functions, and $X_i^{c}(s)$ is the centered function of near-infrared spectral band s for the i-th sample;
- $\xi_j(t)$ is the j-th principal component weight function at band t
- $\xi_j(s)$ is the j-th principal component weight function at band s
- j is a positive integer
- $\rho_j$ is the eigenvalue
- s.t. denotes the constraint (subject to)
- FIG. 4 is a functional principal component analysis based on the embodiment of the present invention.
- the threshold in this embodiment is set to 90%
- FIG. 5 is a diagram of two principal component analysis results of four kinds of edible fungi spectral data in a near-infrared spectrum feature extraction method based on functional principal component analysis provided by an embodiment of the present invention.
- The derivative information includes the rate of change, slope, curvature, etc.
- The four edible fungi are wild matsutake, Agaricus blazei, old man's head, and Pleurotus eryngii. The abscissa in Figure 5 represents the first principal component and the ordinate represents the second principal component.
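Steps S4 to S8 can be put together in a short numerical sketch. It assumes the smoothed curves have been evaluated on a uniform grid of bands, so that each integral is replaced by a sum weighted by the grid spacing h; this grid discretization is a simplification chosen for the example, not necessarily how the patent computes the decomposition. The function name `fpca_on_grid` and the synthetic curves are invented.

```python
import numpy as np

def fpca_on_grid(X, h, M):
    """X: (n_samples, n_points) smoothed curves on a uniform grid with spacing h."""
    Xc = X - X.mean(axis=0)                  # centering (S4)
    V = Xc.T @ Xc / X.shape[0]               # covariance V(s, t) on the grid (S5)
    lam, U = np.linalg.eigh(h * V)           # discretized eigen-equation (S6)
    order = np.argsort(lam)[::-1][:M]        # keep the M largest eigenvalues
    rho = lam[order]
    xi = U[:, order] / np.sqrt(h)            # rescale so the integral of xi^2 is 1
    scores = h * Xc @ xi                     # score = integral of xi_j(t) * Xc_i(t) dt (S8)
    return rho, xi, scores

rng = np.random.default_rng(1)
t = np.linspace(900.0, 1700.0, 200)
h = t[1] - t[0]
# Synthetic curves built from two known shapes plus small noise (not real spectra).
a = rng.standard_normal((40, 2)) * [3.0, 1.0]
X = (a[:, :1] * np.sin(np.pi * (t - 900.0) / 800.0)
     + a[:, 1:] * np.cos(np.pi * (t - 900.0) / 800.0)
     + 0.01 * rng.standard_normal((40, t.size)))
rho, xi, scores = fpca_on_grid(X, h, M=2)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the kind of two-component scatter described for Figure 5, and the derivative of each column of `xi` can be inspected for the rate-of-change information that the method emphasizes.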
- A near-infrared spectral feature extraction system based on functional principal component analysis, including:
- an acquisition module, used to collect near-infrared spectral data from multiple kinds of samples
- a preprocessing module, used to preprocess the near-infrared spectral data using standard normal transformation
- an obtaining module, used to obtain the spline function of the preprocessed near-infrared spectral data
- a centering module, used to center the spline function
- a covariance module, used to calculate the covariance of the centered spline function between different band functions
- an eigenvalue module, used to calculate the j-th eigenvalue of the covariance
- a contribution module, used to calculate the cumulative contribution from the eigenvalues of the covariance, with the principal components whose contribution exceeds the threshold taken as the feature values of the near-infrared spectral bands;
- a principal component score module, used to calculate the functional principal component scores in the equations of the different bands from the feature values of the near-infrared spectral bands.
- The terms "installed" and "connected" should be interpreted broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, or indirect through an intermediate medium; and it may be internal communication between two components.
- The specific meaning of the above terms in the present invention can be understood according to the specific situation.
Claims (10)
- A near-infrared spectral feature extraction method based on functional principal component analysis, characterized in that the steps are as follows:
  S1. Collect near-infrared spectral data from multiple kinds of samples;
  S2. Preprocess the near-infrared spectral data using standard normal transformation;
  S3. Obtain the spline function of the preprocessed near-infrared spectral data;
  S4. Center the spline function;
  S5. Calculate the covariance of the centered spline function between different band functions;
  S6. Calculate the j-th eigenvalue of the covariance;
  S7. Calculate the cumulative contribution from the eigenvalues of the covariance, and take the principal components whose contribution exceeds the threshold as the feature values of the near-infrared spectral bands;
  S8. Use the feature values of the near-infrared spectral bands to calculate the functional principal component scores in the equations of the different bands.
- The near-infrared spectral feature extraction method based on functional principal component analysis according to claim 1, characterized in that, in step S1, near-infrared spectral data of the samples to be tested are collected, and the content of the nutrient components is determined through physical and chemical tests; the nutrient components include protein, fat, and a variety of amino acids.
- The near-infrared spectral feature extraction method based on functional principal component analysis according to claim 2, characterized in that the samples to be tested are dried slices of several wild matsutake, Agaricus blazei, old man's head, and Pleurotus eryngii, and spectra are sampled from the dried slices.
- The near-infrared spectral feature extraction method based on functional principal component analysis according to claim 1, characterized in that step S3 comprises:
  S31. Obtain the B-spline function of the near-infrared spectral data of each sample using formula (1), in which $\phi_k(t)$ is the k-th B-spline basis function of the near-infrared spectral band, k is less than or equal to m, m is the number of B-spline basis functions, C is the coefficient matrix, X(t) is the functional form of the near-infrared spectral data, t is the band of the near-infrared spectrum, and $\sum$ denotes summation;
  S32. Smooth the function X(t) using the following formula:
  $\mathrm{PEN}_2(X) = \int [DDX(t)]^2 \, dt$ (2)
  where $\mathrm{PEN}_2(X)$ is the roughness penalty and DDX(t) is the second derivative of the function X(t);
  S33. Calculate the coefficient matrix C of the near-infrared spectral data function using the following formula:
  $\mathrm{PENSSE}_{\lambda} = \mathrm{SMSSE}(x \mid c) + \gamma\,\mathrm{PEN}_2(X)$ (3)
  where $\mathrm{PENSSE}_{\lambda}$ is the sum of the residual sum of squares and the roughness penalty, and γ is the smoothing coefficient; x is the observed near-infrared spectral data, $x_j$ is the observation of the j-th near-infrared spectrum, j is a positive integer less than m, $\mathrm{SMSSE}(x \mid c)$ is the residual sum of squares to be minimized, $t_j$ is the j-th near-infrared spectral band, k ≤ K ≤ j, and $\phi_k(t_j)$ is the B-spline basis function at the j-th near-infrared spectral band.
- 根据权利要求1所述的基于函数性主元分析的近红外光谱特征提取方法,其特征在于,所述步骤S4包括:The near-infrared spectrum feature extraction method based on functional principal component analysis according to claim 1, wherein said step S4 comprises:利用公式对样本数据进行中心化处理,所述公式如下:Use the formula to centralize the sample data, the formula is as follows:式中,i为样本序号,n为样本总量, 为n个样本近红外光谱波段的函数均值,X i(t)为第i个样本的近红外光谱波段t的函数, 为中心化处理之后的第i个样本的近红外光谱波段t的函数,c表示中心化,s.t.表示条件函数。 In the formula, i is the sample number, n is the total number of samples, Is the mean value of the NIR spectral band function of n samples, X i (t) is the function of the i-th NIR spectral band t, It is a function of the near-infrared spectrum band t of the i-th sample after the centering process, c means centering, and st means conditional function.
- 据权利要求5所述的基于函数性主元分析的近红外光谱特征提取方法,其特征在于,所述步骤S5包括:The near-infrared spectrum feature extraction method based on functional principal component analysis according to claim 5, wherein said step S5 comprises:所述协方差的计算公式如下:The calculation formula of the covariance is as follows:其中,s代表与t不同的近红外光谱的波段;X i c(s)表示中心化处理后的第i个样本的近红外光谱波段s的函数,V(s,t)表示s,t两个波段的协方差。 Among them, s represents the band of the near-infrared spectrum that is different from t; X i c (s) represents the function of the near-infrared spectrum band s of the i-th sample after the centering process, and V(s, t) represents s, t two The covariance of two bands.
- The near-infrared spectral feature extraction method based on functional principal component analysis according to claim 6, wherein step S6 comprises: calculating the $j$-th eigenvalue of the covariance using a formula in which $\xi_j(t)$ is the $j$-th principal component weight function at band $t$, $\xi_j(s)$ is the $j$-th principal component weight function at band $s$, $j$ is a positive integer, $\rho_j$ is the eigenvalue, and s.t. denotes the constraint condition.
- The near-infrared spectral feature extraction method based on functional principal component analysis according to claim 7, wherein step S7 comprises: calculating the cumulative contribution using a formula; selecting the $M$ principal components whose contribution exceeds the threshold as the feature values of the near-infrared bands; building a quantitative model; and completing the qualitative or quantitative analysis of the sample under test; where $M$ is the number of principal components.
- The near-infrared spectral feature extraction method based on functional principal component analysis according to claim 7, wherein step S8 comprises: calculating, using a formula, the functional principal component scores of the centered functions for the different near-infrared bands.
- An extraction system using the near-infrared spectral feature extraction method based on functional principal component analysis according to any one of claims 1-9, comprising:
  - an acquisition module for collecting near-infrared spectral data from multiple kinds of samples;
  - a preprocessing module for preprocessing the near-infrared spectral data using the standard normal variate transformation;
  - a spline module for obtaining the spline function of the preprocessed near-infrared spectral data;
  - a centering module for centering the spline function;
  - a covariance module for calculating the covariance of the centered spline function between the functions of different bands;
  - an eigenvalue module for calculating the $j$-th eigenvalue of the covariance;
  - a contribution module for calculating the cumulative contribution from the eigenvalues of the covariance, the principal components whose contribution exceeds the threshold being taken as the feature values of the near-infrared bands;
  - a principal component score module for calculating, from the feature values of the near-infrared bands, the functional principal component scores for the different bands.
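The steps recited in the claims above (preprocessing, centering, covariance, eigenvalues, cumulative contribution, scores) can be illustrated with a minimal numerical sketch. It is not the patented implementation: the formulas are not reproduced in this text, so the sketch assumes the textbook discretized form of functional principal component analysis, with curves sampled on a common wavelength grid, a covariance normalized by 1/n, integrals approximated by trapezoidal quadrature, and a cumulative-contribution threshold of 0.95. The penalized B-spline smoothing step is omitted. The function names `snv`, `trapezoid_weights` and `fpca` are hypothetical.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: give each spectrum (row) zero mean and unit std."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def trapezoid_weights(grid):
    """Quadrature weights w so that sum(w * f(grid)) approximates the integral of f."""
    grid = np.asarray(grid, dtype=float)
    w = np.empty_like(grid)
    w[0] = (grid[1] - grid[0]) / 2.0
    w[-1] = (grid[-1] - grid[-2]) / 2.0
    w[1:-1] = (grid[2:] - grid[:-2]) / 2.0
    return w


def fpca(curves, grid, threshold=0.95):
    """Discretized FPCA (assumed textbook form, not taken from the patent).

    curves: (n_samples, n_bands) array of function values on `grid`.
    Returns eigenvalues rho, weight functions xi (one per column),
    cumulative contributions, and principal component scores.
    """
    n = curves.shape[0]
    w = trapezoid_weights(grid)

    # Centering: subtract the mean function from every sample curve.
    centered = curves - curves.mean(axis=0)

    # Covariance surface V(s, t) on the grid, normalized by 1/n (assumption).
    V = centered.T @ centered / n

    # Eigenproblem  integral V(s, t) xi(s) ds = rho * xi(t), i.e. V W xi = rho xi.
    # Symmetrize with u = W^(1/2) xi so a symmetric solver can be used.
    sw = np.sqrt(w)
    rho, u = np.linalg.eigh(sw[:, None] * V * sw[None, :])
    order = np.argsort(rho)[::-1]
    rho = np.clip(rho[order], 0.0, None)
    xi = u[:, order] / sw[:, None]  # satisfies integral xi_j(t)^2 dt = 1

    # Cumulative contribution and the smallest M that reaches the threshold.
    cum = np.cumsum(rho) / rho.sum()
    M = min(int(np.searchsorted(cum, threshold)) + 1, len(rho))

    # Scores: integral xi_j(t) * X_i^c(t) dt via quadrature.
    scores = (centered * w) @ xi[:, :M]
    return rho[:M], xi[:, :M], cum[:M], scores
```

A typical call would be `rho, xi, cum, scores = fpca(snv(raw_spectra), wavelengths)`, after which the `scores` matrix can serve as the feature input of a downstream calibration model. Solving the symmetrized problem keeps the weight functions orthonormal under the quadrature inner product, so the variance of each score column equals the corresponding eigenvalue.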
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910427265.2A CN110006844A (en) | 2019-05-22 | 2019-05-22 | Near infrared spectrum feature extracting method and system based on functionality pivot analysis |
CN201910427265.2 | 2019-05-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020232959A1 (en) | 2020-11-26 |
Family
ID=67177642
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/111602 WO2020232959A1 (en) | 2019-05-22 | 2019-10-17 | Near infrared spectral feature extraction method and system based on functional principal component analysis |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110006844A (en) |
WO (1) | WO2020232959A1 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115963074A (en) * | 2023-02-23 | 2023-04-14 | 中国人民解放军国防科技大学 | Rapid detection method and system for spore and hypha ratio of microbial material |
CN116881705A (en) * | 2023-09-07 | 2023-10-13 | 佳木斯大学 | Near infrared spectrum data processing system of calyx seu fructus physalis |
CN117291445A (en) * | 2023-11-27 | 2023-12-26 | 国网安徽省电力有限公司电力科学研究院 | Multi-target prediction method based on state transition under comprehensive energy system |
CN117473207A (en) * | 2023-12-28 | 2024-01-30 | 深圳市新景环境技术有限公司 | Paint spraying waste gas treatment equipment and treatment method thereof |
CN117589741A (en) * | 2024-01-18 | 2024-02-23 | 天津博霆光电技术有限公司 | Indocyanine green intelligent detection method based on optical characteristics |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110006844A (en) * | 2019-05-22 | 2019-07-12 | 安徽大学 | Near infrared spectrum feature extracting method and system based on functionality pivot analysis |
CN112837816B (en) * | 2021-02-09 | 2022-11-29 | 清华大学 | Physiological state prediction method, computer device, and storage medium |
CN114298107A (en) * | 2021-12-29 | 2022-04-08 | 安徽大学 | Near infrared spectrum net signal extraction method and system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101738373A (en) * | 2008-11-24 | 2010-06-16 | 中国农业大学 | Method for distinguishing varieties of crop seeds |
CN101915744A (en) * | 2010-07-05 | 2010-12-15 | 北京航空航天大学 | Near infrared spectrum nondestructive testing method and device for material component content |
CN102519903A (en) * | 2011-11-22 | 2012-06-27 | 山东理工大学 | Method for measuring whiteness value of Agaricus bisporus by using near infrared spectrum |
WO2012142076A1 (en) * | 2011-04-12 | 2012-10-18 | The General Hospital Corporation | System and method for monitoring glucose or other compositions in an individual |
CN105139412A (en) * | 2015-09-25 | 2015-12-09 | 深圳大学 | Hyperspectral image corner detection method and system |
CN108780730A (en) * | 2016-03-07 | 2018-11-09 | 英国质谱公司 | Spectrum analysis |
CN109409350A (en) * | 2018-10-23 | 2019-03-01 | 桂林理工大学 | A kind of Wavelength selecting method based on PCA modeling reaction type load weighting |
CN110006844A (en) * | 2019-05-22 | 2019-07-12 | 安徽大学 | Near infrared spectrum feature extracting method and system based on functionality pivot analysis |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103985000B (en) * | 2014-06-05 | 2017-04-26 | 武汉大学 | Medium-and-long term typical daily load curve prediction method based on function type nonparametric regression |
CN104778337B (en) * | 2015-04-30 | 2017-03-22 | 北京航空航天大学 | Method for predicting remaining service life of lithium battery based on FPCA (functional principal component analysis) and Bayesian updating |
CN106645014B (en) * | 2016-09-23 | 2019-04-30 | 上海理工大学 | Substance identification based on tera-hertz spectra |
-
2019
- 2019-05-22 CN CN201910427265.2A patent/CN110006844A/en active Pending
- 2019-10-17 WO PCT/CN2019/111602 patent/WO2020232959A1/en active Application Filing
Non-Patent Citations (3)
Title |
---|
CHEN QUANSHENG; ZHAO JIEWEN EDITED BY CHEN QUANSHENG: "Tea Quality and Safety Detection and Analysis Method", 31 March 2001, CHINA LIGHT INDUSTRY PRESS, CN, ISBN: 978-7-5019-7971-4, article CHEN QUANSHENG; ZHAO JIEWEN EDITED BY CHEN QUANSHENG: "Chapter 3, Near-infrared spectral analysis", pages: 101 - 103, XP009524358 * |
GONG, HUILI: "Feature extraction and similarity measure on tobacco near infrared spectra", CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, BASIC SCIENCES, no. 02, 15 February 2015 (2015-02-15), pages 1 - 115, XP055754936 * |
LI, YUQIANG ET AL.: "NIR spectral feature selection using lasso method and its application in the classification analysis", SPECTROSCOPY AND SPECTRAL ANALYSIS, vol. 39, no. 12, 31 December 2019 (2019-12-31), pages 3809 - 3815, XP055754956 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115963074A (en) * | 2023-02-23 | 2023-04-14 | 中国人民解放军国防科技大学 | Rapid detection method and system for spore and hypha ratio of microbial material |
CN115963074B (en) * | 2023-02-23 | 2023-06-02 | 中国人民解放军国防科技大学 | Method and system for rapidly detecting spore hypha ratio of microbial material |
CN116881705A (en) * | 2023-09-07 | 2023-10-13 | 佳木斯大学 | Near infrared spectrum data processing system of calyx seu fructus physalis |
CN116881705B (en) * | 2023-09-07 | 2023-11-21 | 佳木斯大学 | Near infrared spectrum data processing system of calyx seu fructus physalis |
CN117291445A (en) * | 2023-11-27 | 2023-12-26 | 国网安徽省电力有限公司电力科学研究院 | Multi-target prediction method based on state transition under comprehensive energy system |
CN117291445B (en) * | 2023-11-27 | 2024-02-13 | 国网安徽省电力有限公司电力科学研究院 | Multi-target prediction method based on state transition under comprehensive energy system |
CN117473207A (en) * | 2023-12-28 | 2024-01-30 | 深圳市新景环境技术有限公司 | Paint spraying waste gas treatment equipment and treatment method thereof |
CN117473207B (en) * | 2023-12-28 | 2024-03-29 | 深圳市新景环境技术有限公司 | Paint spraying waste gas treatment equipment and treatment method thereof |
CN117589741A (en) * | 2024-01-18 | 2024-02-23 | 天津博霆光电技术有限公司 | Indocyanine green intelligent detection method based on optical characteristics |
CN117589741B (en) * | 2024-01-18 | 2024-04-05 | 天津博霆光电技术有限公司 | Indocyanine green intelligent detection method based on optical characteristics |
Also Published As
Publication number | Publication date |
---|---|
CN110006844A (en) | 2019-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020232959A1 (en) | Near infrared spectral feature extraction method and system based on functional principal component analysis | |
Saeys et al. | Multivariate calibration of spectroscopic sensors for postharvest quality evaluation: A review | |
Kimuli et al. | Utilisation of visible/near-infrared hyperspectral images to classify aflatoxin B1 contaminated maize kernels | |
Mishra et al. | Partial least square regression versus domain invariant partial least square regression with application to near-infrared spectroscopy of fresh fruit | |
Song et al. | Differentiation of organic and non-organic apples using near infrared reflectance spectroscopy—A pattern recognition approach | |
Singh et al. | Nondestructive identification of barley seeds variety using near‐infrared hyperspectral imaging coupled with convolutional neural network | |
Shahin et al. | Quantification of mildew damage in soft red winter wheat based on spectral characteristics of bulk samples: a comparison of visible-near-infrared imaging and near-infrared spectroscopy | |
Torres et al. | Developing universal models for the prediction of physical quality in citrus fruits analysed on-tree using portable NIRS sensors | |
Munawar et al. | Near infrared spectroscopy as a fast and non-destructive technique for total acidity prediction of intact mango: Comparison among regression approaches | |
Beghi et al. | Rapid evaluation of grape phytosanitary status directly at the check point station entering the winery by using visible/near infrared spectroscopy | |
CN109470648A (en) | A kind of single grain crop unsound grain quick nondestructive determination method | |
Fadock et al. | Visible-near infrared reflectance spectroscopy for nondestructive analysis of red wine grapes | |
Yu et al. | Rapid and visual measurement of fat content in peanuts by using the hyperspectral imaging technique with chemometrics | |
Puttipipatkajorn et al. | Development of calibration models for rapid determination of moisture content in rubber sheets using portable near-infrared spectrometers | |
Pourdarbani et al. | Using metaheuristic algorithms to improve the estimation of acidity in Fuji apples using NIR spectroscopy | |
Liu et al. | Identification of heat damage in imported soybeans based on hyperspectral imaging technology | |
Feng et al. | Nondestructive determination of soluble solids content and pH in red bayberry (Myrica rubra) based on color space | |
Courand et al. | Evaluation of a robust regression method (RoBoost-PLSR) to predict biochemical variables for agronomic applications: Case study of grape berry maturity monitoring | |
Wachendorf et al. | Prediction of the clover content of red clover‐and white clover‐grass mixtures by near‐infrared reflectance spectroscopy | |
Hao et al. | Combined hyperspectral imaging technology with 2D convolutional neural network for near geographical origins identification of wolfberry | |
Shen et al. | Discrimination of blended Chinese rice wine ages based on near-infrared spectroscopy | |
CN116578851A (en) | Method for predicting effective boron content of hyperspectral soil | |
CN106442400B (en) | A kind of method that near infrared spectrum quickly determines soil type fresh tea leaves | |
Xu et al. | Comparative study of different wavelength selection methods in the transfer of crop kernel qualitive near-infrared models | |
Chen et al. | Hyperspectral reflectance imaging for detecting typical defects of durum kernel surface |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19929260; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 19929260; Country of ref document: EP; Kind code of ref document: A1 |
122 | Ep: pct application non-entry in european phase | Ref document number: 19929260; Country of ref document: EP; Kind code of ref document: A1 |
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30/05/2022) |
122 | Ep: pct application non-entry in european phase | Ref document number: 19929260; Country of ref document: EP; Kind code of ref document: A1 |