CN103776797A - Method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy - Google Patents

Info

Publication number
CN103776797A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410065240.XA
Other languages
Chinese (zh)
Other versions
CN103776797B (en)
Inventor
赵志磊
李小亭
陈培云
吴广臣
刘秀华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei University
Original Assignee
Hebei University
Application filed by Hebei University
Priority to CN201410065240.XA
Publication of CN103776797A
Application granted
Publication of CN103776797B
Legal status: Expired - Fee Related

Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention provides a method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy, which comprises the following steps: (A) establishing a near infrared spectroscopy identification model of Pingli fiveleaf gynostemma herb: (A-1) selecting the spectral range of 4,000-12,500 cm⁻¹ and scanning the near infrared spectrum of the Pingli fiveleaf gynostemma herb; (A-2) preprocessing the data in the spectral range of 4,000-9,500 cm⁻¹; (A-3) extracting the principal components; and (A-4) establishing an artificial neural network model: determining the structure of the neural network by an artificial neural network algorithm according to the characteristics of the input and output data, training the neural network with the training data, and establishing, with MATLAB software, a BP artificial neural network model with 10 input-layer nodes, 5 hidden-layer nodes and 2 output-layer nodes; and (B) identifying the unknown sample: scanning the near infrared spectrum of the unknown sample under the same conditions, selecting the number of principal components, judging the authenticity of the unknown sample with the trained neural network model, and representing the output nodes with binary codes, wherein 10 represents Pingli fiveleaf gynostemma herb and 01 represents non-Pingli fiveleaf gynostemma herb.

Description

Method for identifying Pingli gynostemma pentaphylla by near infrared spectrum
Technical Field
The invention relates to a method for identifying Pingli gynostemma pentaphylla by near infrared spectrum, in particular to a method for identifying Pingli gynostemma pentaphylla by combining near infrared spectrum technology with artificial neural network algorithm, belonging to the field of near infrared spectrum detection and analysis.
Background
Gynostemma pentaphyllum, also known as fiveleaf gynostemma herb and scandent schefflera root, lowers blood pressure, blood lipids and blood sugar, protects the heart and liver, regulates fat and aids weight loss, and is known as the 'longevity herb'. Since the quality supervision authority granted protected-origin status to Shaanxi Pingli Gynostemma pentaphyllum in 2004, its price has multiplied, and inferior and counterfeit products appear on the market from time to time. A technique for identifying the origin of Shaanxi Pingli Gynostemma pentaphyllum is therefore urgently needed, in order to distinguish Gynostemma pentaphyllum of different origins effectively and to protect the rights and interests of consumers.
Near infrared spectroscopy is fast, information-rich, needs little sample pretreatment and causes no environmental pollution. It is widely applied in many fields and has become one of the most popular spectral analysis techniques in current research. Because the near infrared spectrum contains a great deal of information about the sample, combining near infrared analysis with pattern recognition allows the grade and category of a sample to be distinguished more effectively. Near-infrared pattern recognition is a technique that infers the class of a substance from its near-infrared data by applying chemical pattern recognition methods, and all methods of chemical pattern recognition can be used for it. At present, near-infrared pattern recognition is widely applied in agriculture, medicine, food, petroleum and other fields, and plays an important role in authenticity discrimination, grade classification and identification of the place of origin. However, every recognition model established in this way is specific to a particular product. The applicant has previously used near infrared spectroscopy combined with the Mahalanobis distance algorithm and the qualification test to effectively identify the rice with the rice water, and has successfully distinguished virgin olive oil from olive pomace oil using the Fisher discriminant algorithm. The present invention discloses a method for identifying Pingli Gynostemma pentaphyllum based on near infrared spectroscopy and an artificial neural network algorithm. At present, research on Gynostemma pentaphyllum by scholars at home and abroad focuses mainly on its chemical components and pharmacological actions.
It mainly contains saponin [1], polysaccharide [2], amino acid [4], flavone [3], organic acid and trace elements [4] and other chemical components. The reports prove that the components of the gynostemma pentaphylla in different producing areas are different, so the methods have certain reference value for identifying the authenticity of the gynostemma pentaphylla in different producing areas, but the report for identifying the authenticity of the gynostemma pentaphylla by utilizing the component difference is not found at present.
Disclosure of Invention
The invention aims to provide a method for quickly and accurately identifying the authenticity of Pingli Gynostemma pentaphyllum by combining near infrared spectroscopy with an artificial neural network algorithm.
The technical scheme of the invention is as follows: the method for identifying Pingli gynostemma pentaphylla by using the near infrared spectrum comprises the following steps:
A. Establishing a near infrared spectrum identification model of Pingli Gynostemma pentaphyllum
A-1, selecting the spectral range 4000-12500 cm⁻¹ and scanning the near infrared spectrum of Pingli Gynostemma pentaphyllum;
A-2, preprocessing the data in the spectral range 4000-12500 cm⁻¹;
A-3, extracting the principal components;
A-4, establishing an artificial neural network model: determining the structure of the neural network with an artificial neural network algorithm according to the characteristics of the input and output data, and training the neural network with the training data to obtain an identification model of Pingli Gynostemma pentaphyllum;
B. Identification of unknown samples
Scanning the near infrared spectrum of an unknown sample under the same conditions, selecting the number of principal components, and judging the authenticity of the unknown sample with the trained neural network model; the output nodes are represented by binary codes, where 10 represents Pingli Gynostemma pentaphyllum and 01 represents non-Pingli Gynostemma pentaphyllum.
In the above method for identifying Pingli Gynostemma pentaphyllum by near infrared spectrum, scanning the near infrared spectrum in step A-1 comprises: drying and crushing an effective amount of the Gynostemma pentaphyllum sample, placing it evenly in a quartz sample cell, and scanning the absorption spectrum with a Fourier transform near infrared spectrometer; the scanning mode is rotating diffuse reflection and the resolution is 8 cm⁻¹; each sample is scanned multiple times, and the average spectrum is taken as the final analysis spectrum of the sample.
In the above method, preprocessing the near infrared spectral data in step A-2 comprises: applying multiplicative scatter correction (MSC) and vector normalization to the spectra of the Gynostemma pentaphyllum samples. This preprocessing eliminates the influence of interference factors such as sample non-uniformity, light scattering and instrument noise, and thereby improves the prediction accuracy and stability of the model.
In the above method, the principal component extraction in step A-3 reduces the dimension of the spectral information by principal component analysis. The cumulative contribution rate of the first 10 principal components is 99.99%; using this limited number of inputs reduces the computational complexity of the model and improves its prediction accuracy.
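For orientation, the cumulative contribution rate can be written as follows. This assumes the usual definition in terms of the eigenvalues of the sample correlation matrix sorted in descending order; the symbols $\lambda_k$ and $\eta_m$ are chosen here for illustration and do not appear in the patent text.

$$\eta_m = \frac{\sum_{k=1}^{m} \lambda_k}{\sum_{k=1}^{p} \lambda_k}, \qquad \eta_{10} = 99.99\%$$

Under that definition, the first 10 principal components would retain 99.99% of the total variance of the standardized spectra, which is why 10 scores per sample are used as network inputs.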
In the above method, establishing the artificial neural network model in step A-4 means establishing, with MATLAB software, a BP artificial neural network model with 10 input-layer nodes, 5 hidden-layer nodes and 2 output-layer nodes:
A-4-1, determining the number of input-layer nodes: taking the 10 principal component scores as parameters, the number of input-layer nodes of the network is set to 10;
A-4-2, determining the number of hidden nodes with the following equation:
$$L = \sqrt{0.43mn + 0.12n^{2} + 2.54m + 0.77n + 0.35} + 0.51$$
where m and n are the numbers of input nodes and output nodes respectively; the formula gives an initial value for the number of hidden nodes, which is then corrected by the step-by-step growth method to obtain the empirical value 5;
A-4-3, determining the number of output-layer nodes: the neural network has 2 output nodes, corresponding to the two possible results of judging whether a Gynostemma pentaphyllum sample comes from the Pingli producing area or from a non-Pingli producing area.
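As a worked illustration of step A-4-2, the following minimal sketch evaluates the empirical formula for m = 10 inputs and n = 2 outputs. It assumes that the five-term sum sits under a square root, as in the commonly cited form of this rule; the function name `initial_hidden_nodes` is hypothetical.

```python
import math


def initial_hidden_nodes(m: int, n: int) -> float:
    """Empirical starting value for the hidden-layer size.

    m: number of input nodes, n: number of output nodes.
    Assumption: the polynomial in m and n is under a square root.
    """
    inner = 0.43 * m * n + 0.12 * n ** 2 + 2.54 * m + 0.77 * n + 0.35
    return math.sqrt(inner) + 0.51


if __name__ == "__main__":
    start = initial_hidden_nodes(10, 2)
    print(round(start, 2))  # 6.54
```

Under this assumption the starting value is about 6.5 hidden nodes, which is of the same order as the empirical value 5 reached after correction.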
In the above method for identifying Pingli Gynostemma pentaphyllum by near infrared spectrum, the principal components of the sample are determined as follows:
Let $x_1, x_2, \ldots, x_n$ be a sample drawn from the population $x$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ $(i = 1, 2, \ldots, n)$;
The sample observation matrix is written as:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$
Each row of $X$ corresponds to a sample and each column corresponds to a variable;
The sample covariance matrix and the sample correlation coefficient matrix are written as:
$$S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' = (s_{ij})$$
$$\hat{R} = (r_{ij}), \qquad r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}\, s_{jj}}}$$
where $\bar{x}$ is the sample mean;
Taking $S$ as an estimate of $\Sigma$ and $\hat{R}$ as an estimate of $R$, the principal components of the sample can be determined from $S$ or $\hat{R}$.
In the above method, the principal components of the sample are obtained starting from the sample correlation coefficient matrix $\hat{R}$:
Let $\hat{\lambda}_1^*, \hat{\lambda}_2^*, \ldots, \hat{\lambda}_p^*$ be the p eigenvalues of $\hat{R}$ and $\hat{t}_1^*, \hat{t}_2^*, \ldots, \hat{t}_p^*$ the corresponding orthonormal eigenvectors; the p principal components of the sample are
$$\hat{y}_i^* = \hat{t}_i^{*\prime} x^*, \qquad i = 1, 2, \ldots, p$$
Substituting the standardized observation $x_i^*$ of sample $x_i$ into the j-th principal component gives the j-th principal component score of sample $x_i$:
$$\hat{y}_{ij}^* = \hat{t}_j^{*\prime} x_i^* \qquad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, p).$$
The invention selects the spectral range 4000-[…] cm⁻¹; according to analysis of the original spectra, this interval contains the main near-infrared absorption peaks of the Gynostemma pentaphyllum samples. Multiplicative scatter correction and vector normalization are applied to the spectra of the samples, which eliminates the influence of interference factors such as sample non-uniformity, light scattering and instrument noise and improves the prediction accuracy and stability of the neural network model. The scores of the first 10 principal components are extracted and used as the new input variables; this limited number of inputs reduces the computational complexity of the model and improves its prediction accuracy. The method can quickly identify the authenticity of Pingli Gynostemma pentaphyllum, and the identification accuracy on both the training set and the prediction set is 100%. The established discrimination model is of great significance for identifying the authenticity of Pingli Gynostemma pentaphyllum.
Drawings
FIG. 1 is a diagram of the neural network architecture.
The symbols in the figure denote:
$x_j$: the input of the j-th node of the input layer, j = 1, …, M;
$w_{ij}$: the weight from the i-th node of the hidden layer to the j-th node of the input layer;
$\theta_i$: the threshold of the i-th node of the hidden layer;
$\Phi(x)$: the activation function of the hidden layer;
$w_{ki}$: the weight from the k-th node of the output layer to the i-th node of the hidden layer, i = 1, …, q;
$a_k$: the threshold of the k-th node of the output layer, k = 1, …, L;
$\Psi(x)$: the activation function of the output layer;
$o_k$: the output of the k-th node of the output layer.
FIG. 2 is a near infrared spectrum of Pingli gynostemma pentaphyllum
FIG. 3 cumulative contribution rate of the top ten principal component scores
FIG. 4 is a program diagram of a neural network computational implementation
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
1. Instruments and reagents: the sample spectrum was collected using MPA near infrared spectrometer and diffuse reflectance accessory from Bruker, Germany, with its own analytical software and MATLAB software.
1-1, instrument noise
Preparing a voltage-stabilized power supply, starting up the device, preheating until the instrument is sufficiently stable, and ensuring the proper test environment temperature to be 15-25 ℃;
1-2, wavelength accuracy and reproducibility
The accuracy of the wavelength was corrected with a low pressure mercury lamp and a methylene blue solution with a mass fraction of 0.005% to prevent drift.
2. Sample preparation and spectral scanning: the gynostemma pentaphylla samples were purchased at the place of origin.
TABLE 1 Gynostemma pentaphyllum sample information table
Figure BDA0000469649420000071
3. Establishing the near infrared spectrum identification model of geographical-indication Gynostemma pentaphyllum
3-1, scanning the gynostemma pentaphylla near infrared spectrogram
The Gynostemma pentaphyllum sample is dried at 60 °C for 4 hours, crushed and passed through a 60-mesh sieve. About 50 g of the sample is placed evenly in the sample cell, and its rotating diffuse reflection spectrum is collected with an MPA near infrared spectrometer and diffuse reflection accessory from Bruker, Germany, as shown in FIG. 2. The light source of the instrument is a 20 W tungsten halogen lamp, and the spectral range is 4000-12500 cm⁻¹. The spectrum acquisition software is OPUS 6.5 from Bruker, with 64 scans and a resolution of 8 cm⁻¹; the reference is the built-in background of the instrument. Each sample is scanned 3 times, and the average of the 3 spectra is taken as the sample spectrum.
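For readers more used to wavelengths, the scanned wavenumber range can be converted with the standard relation between wavelength and wavenumber. The conversion below is added for orientation only and is not part of the patent text.

$$\lambda\,[\mathrm{nm}] = \frac{10^{7}}{\tilde{\nu}\,[\mathrm{cm^{-1}}]}, \qquad \frac{10^{7}}{4000} = 2500\ \mathrm{nm}, \qquad \frac{10^{7}}{12500} = 800\ \mathrm{nm}$$

The instrument range of 4000-12500 cm⁻¹ therefore corresponds to wavelengths from 2500 nm down to 800 nm.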
3-2, preprocessing the spectral data
The discrimination model is established in the band 4000-[…] cm⁻¹. MATLAB software is used to process the sample spectra: the spectra are preprocessed with multiplicative scatter correction and vector normalization, and principal component analysis is then used to reduce the dimension of the spectra of the Gynostemma pentaphyllum samples.
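The two preprocessing steps named above can be sketched as follows. This is a minimal illustration, not the patent's MATLAB code: it assumes that scatter correction regresses each spectrum on the mean spectrum of the set, and that vector normalization means centering a spectrum and dividing by its Euclidean norm. All function names are hypothetical.

```python
import math
from typing import List, Optional

Spectrum = List[float]


def mean_spectrum(spectra: List[Spectrum]) -> Spectrum:
    """Point-by-point average of a set of equally long spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def msc(spectra: List[Spectrum], reference: Optional[Spectrum] = None) -> List[Spectrum]:
    """Scatter correction: fit x = a + b * ref, then return (x - a) / b."""
    ref = reference if reference is not None else mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for x in spectra:
        x_mean = sum(x) / len(x)
        slope = sum((r - ref_mean) * (v - x_mean) for r, v in zip(ref, x)) / ref_ss
        intercept = x_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in x])
    return corrected


def vector_normalize(x: Spectrum) -> Spectrum:
    """Center the spectrum, then scale it to unit Euclidean length."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


if __name__ == "__main__":
    # Toy spectra: the same shape with different offsets and gains.
    base = [0.2, 0.5, 0.9, 0.4, 0.7]
    toy = [[a + b * v for v in base] for a, b in ((0.0, 1.0), (0.1, 1.5), (-0.05, 0.8))]
    for row in msc(toy):
        print([round(v, 4) for v in vector_normalize(row)])
```

In the toy data every spectrum is an offset and scaled copy of one shape, so the correction maps all of them onto the mean spectrum; real spectra would only be brought closer together.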
3-3, extracting main components;
The principal components of the sample are determined as follows:
Let $x_1, x_2, \ldots, x_n$ be a sample drawn from the population $x$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ $(i = 1, 2, \ldots, n)$;
The sample observation matrix is written as:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$
Each row of $X$ corresponds to a sample and each column corresponds to a variable;
The sample covariance matrix and the sample correlation coefficient matrix are written as:
$$S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' = (s_{ij})$$
$$\hat{R} = (r_{ij}), \qquad r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}\, s_{jj}}}$$
where $\bar{x}$ is the sample mean;
Taking $S$ as an estimate of $\Sigma$ and $\hat{R}$ as an estimate of $R$, the principal components of the sample can be determined from $S$ or $\hat{R}$.
The principal components of the sample are obtained starting from the sample correlation coefficient matrix $\hat{R}$:
Let $\hat{\lambda}_1^*, \hat{\lambda}_2^*, \ldots, \hat{\lambda}_p^*$ be the p eigenvalues of $\hat{R}$ and $\hat{t}_1^*, \hat{t}_2^*, \ldots, \hat{t}_p^*$ the corresponding orthonormal eigenvectors; the p principal components of the sample are
$$\hat{y}_i^* = \hat{t}_i^{*\prime} x^*, \qquad i = 1, 2, \ldots, p$$
Substituting the standardized observation $x_i^*$ of sample $x_i$ into the j-th principal component gives the j-th principal component score of sample $x_i$:
$$\hat{y}_{ij}^* = \hat{t}_j^{*\prime} x_i^* \qquad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, p).$$
Using MATLAB software, the scores of the first 10 principal components are taken as the input-layer nodes of the network. The cumulative contribution rate of the first ten principal components reaches 99.99% (FIG. 3 and Table 2), so they represent most of the useful information in the Gynostemma pentaphyllum spectra; the scores of the first ten principal components are therefore used as the parameters of the input-layer nodes.
TABLE 2 cumulative contribution of the first ten principal components
Figure BDA0000469649420000101
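The extraction of principal component scores and cumulative contribution rates from the sample correlation matrix can be sketched as below. The sketch uses only the Python standard library, so the eigenvalues are found by power iteration with deflation; this solver is a stand-in chosen for self-containment and is not the routine used in the patent's MATLAB implementation. All function names and the toy data are hypothetical.

```python
import math


def standardize(X):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
            for j in range(p)]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in X]


def correlation_matrix(Z):
    """Sample correlation matrix of already standardized data Z."""
    n, p = len(Z), len(Z[0])
    return [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(p)]
            for a in range(p)]


def top_eigenpairs(R, k, iters=1000):
    """Largest k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    p = len(R)
    A = [row[:] for row in R]
    pairs = []
    for _ in range(k):
        v = [1.0 / (j + 1) for j in range(p)]
        scale = math.sqrt(sum(c * c for c in v))
        v = [c / scale for c in v]
        for _step in range(iters):
            w = [sum(A[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:
                break
            v = [c / norm for c in w]
        lam = sum(v[a] * sum(A[a][b] * v[b] for b in range(p)) for a in range(p))
        pairs.append((lam, v))
        # Deflation: remove the component that was just found.
        A = [[A[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    return pairs


def pca_scores(X, k):
    """Return (scores, cumulative contribution rates) for the first k components."""
    Z = standardize(X)
    R = correlation_matrix(Z)
    pairs = top_eigenpairs(R, k)
    total = float(len(R))  # the trace of a correlation matrix equals p
    cumulative, running = [], 0.0
    for lam, _vec in pairs:
        running += lam
        cumulative.append(running / total)
    scores = [[sum(t[j] * z[j] for j in range(len(z))) for _lam, t in pairs]
              for z in Z]
    return scores, cumulative


if __name__ == "__main__":
    toy = [[1.0, 2.1, 0.5], [2.0, 3.9, 0.7], [3.0, 6.2, 0.2],
           [4.0, 8.1, 0.9], [5.0, 9.8, 0.4]]
    toy_scores, toy_cumulative = pca_scores(toy, 2)
    print([round(c, 4) for c in toy_cumulative])
```

With real spectra, `k` would be 10 and each row of `scores` would supply the 10 inputs of the network for one sample.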
The output nodes of the network use binary codes to represent the producing area. As shown in Table 3, the samples from the eight producing areas are divided into only two categories: Gynostemma pentaphyllum from the original producing area, Pingli, represented by binary code 10; and Gynostemma pentaphyllum from other producing areas, represented by binary code 01.
Table 3 output node origin code
Figure BDA0000469649420000102
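The origin coding described above can be illustrated with a short decoding helper. The patent does not state how continuous network outputs are turned into the codes 10 and 01; the sketch below assumes that the larger of the two output activations wins, and the function name `decode_output` is hypothetical.

```python
from typing import Sequence, Tuple


def decode_output(outputs: Sequence[float]) -> Tuple[str, str]:
    """Map two output-node activations to the binary origin code and a label.

    Assumption: the node with the larger activation is set to 1.
    """
    if len(outputs) != 2:
        raise ValueError("expected exactly two output-node activations")
    code = "10" if outputs[0] >= outputs[1] else "01"
    label = "Pingli" if code == "10" else "non-Pingli"
    return code, label


if __name__ == "__main__":
    print(decode_output([0.97, 0.02]))  # ('10', 'Pingli')
    print(decode_output([0.10, 0.80]))  # ('01', 'non-Pingli')
```

A stricter rule, for example rejecting samples whose two activations are close together, could be substituted without changing the coding itself.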
3-4, establishing artificial neural network model
3-4-1, determining the number of input-layer nodes: taking the 10 principal component scores as parameters, the number of input-layer nodes of the network is set to 10. The samples are divided into a training set and a prediction set. The training set is used to train the neural network and establish the artificial neural network model; the prediction set is used to verify the accuracy of the network model. If the accuracy does not meet the requirement, the model is trained again, that is, the parameters are optimized, until an accurate neural network model is established.
In this embodiment, a total of 405 samples were collected: 90 from the Pingli producing area and 45 from each of the other seven producing areas (90 + 7 × 45 = 405). The 405 samples were randomly divided into a training set of 270 samples and a prediction set of 135 samples.
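The random division into 270 training samples and 135 prediction samples can be sketched as follows. The patent does not describe the randomization procedure, so the shuffle and the fixed seed here are assumptions made for reproducibility, and `split_samples` is a hypothetical helper.

```python
import random
from typing import Iterable, List, Tuple


def split_samples(sample_ids: Iterable[int], n_train: int,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle the sample identifiers and cut them into two disjoint sets."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    training, prediction = split_samples(range(405), 270)
    print(len(training), len(prediction))  # 270 135
```

A split stratified by producing area would be a reasonable alternative, but the text only states that the division was random.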
3-4-2, determining the number of hidden nodes with the following equation:
$$L = \sqrt{0.43mn + 0.12n^{2} + 2.54m + 0.77n + 0.35} + 0.51$$
where m and n are the numbers of input nodes and output nodes respectively; the formula gives an initial value for the number of hidden nodes, which is then corrected by the step-by-step growth method to obtain the empirical value 5;
3-4-3, determining the number of output-layer nodes: the neural network has 2 output nodes, corresponding to the two possible results of judging whether a Gynostemma pentaphyllum sample comes from the Pingli producing area or from a non-Pingli producing area.
Following the calculation procedure shown in FIG. 4, the transfer function of the input layer and the hidden layer of the network is set to tansig, and the function given for the output layer is traingdx (in MATLAB, traingdx denotes training by gradient descent with momentum and an adaptive learning rate). The training goal is set to 1×10⁻⁶, the learning rate of the network is 0.05, the number of training iterations is 1000, and the number of hidden nodes is 5. MATLAB software is used to establish a BP artificial neural network model with 10 input-layer nodes, 5 hidden-layer nodes and 2 output-layer nodes.
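A minimal pure-Python sketch of a 10-5-2 back-propagation network with the stated settings (learning rate 0.05, at most 1000 iterations, error goal 1×10⁻⁶) is given below. It is not the patent's MATLAB program. Several choices are assumptions: the hidden layer uses tanh (mathematically equivalent to tansig), the output layer is linear, the error is the mean squared error, and training uses batch gradient descent with a fixed momentum of 0.9 rather than the adaptive learning rate of MATLAB's gradient-descent-with-momentum trainer. All names are hypothetical.

```python
import math
import random


def init_layer(n_out, n_in, rng):
    """Weight rows of length n_in + 1; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(x, W1, W2):
    """Return hidden activations (tanh) and linear outputs for one input vector."""
    h = [math.tanh(sum(w[j] * x[j] for j in range(len(x))) + w[-1]) for w in W1]
    o = [sum(w[i] * h[i] for i in range(len(h))) + w[-1] for w in W2]
    return h, o


def train(samples, targets, n_hidden=5, lr=0.05, momentum=0.9,
          epochs=1000, goal=1e-6, seed=0):
    """Batch gradient descent with momentum on the mean squared error."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0]), len(targets[0])
    W1, W2 = init_layer(n_hidden, n_in, rng), init_layer(n_out, n_hidden, rng)
    V1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    V2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
    mse = float("inf")
    for _ in range(epochs):
        G1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        G2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        sq_err = 0.0
        for x, t in zip(samples, targets):
            h, o = forward(x, W1, W2)
            d_out = [o[k] - t[k] for k in range(n_out)]
            sq_err += sum(d * d for d in d_out)
            # Back-propagate the output error through the tanh hidden layer.
            d_hid = [(1.0 - h[i] ** 2) * sum(d_out[k] * W2[k][i] for k in range(n_out))
                     for i in range(n_hidden)]
            for k in range(n_out):
                for i in range(n_hidden):
                    G2[k][i] += d_out[k] * h[i]
                G2[k][-1] += d_out[k]
            for i in range(n_hidden):
                for j in range(n_in):
                    G1[i][j] += d_hid[i] * x[j]
                G1[i][-1] += d_hid[i]
        mse = sq_err / (len(samples) * n_out)
        if mse <= goal:
            break
        step = lr / len(samples)
        for W, V, G in ((W1, V1, G1), (W2, V2, G2)):
            for r in range(len(W)):
                for c in range(len(W[r])):
                    V[r][c] = momentum * V[r][c] - step * G[r][c]
                    W[r][c] += V[r][c]
    return W1, W2, mse


if __name__ == "__main__":
    # Toy stand-in for principal component scores: two well separated groups.
    demo_rng = random.Random(1)
    xs = [[1.0 + demo_rng.uniform(-0.1, 0.1) for _ in range(10)] for _ in range(6)]
    xs += [[-1.0 + demo_rng.uniform(-0.1, 0.1) for _ in range(10)] for _ in range(6)]
    ts = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
    _, _, final_mse = train(xs, ts)
    print(final_mse)
```

Targets [1, 0] and [0, 1] correspond to the origin codes 10 and 01; with real data the inputs would be the 10 principal component scores of each training sample.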
4. Unknown sample identification
4-1, sample treatment: about 50 g of the unknown sample is dried at 60 °C for 4 hours, crushed, passed through a 60-mesh sieve and placed evenly in the sample cell, and the rotating diffuse reflection spectrum of the sample is collected. The spectral range is 4000-12500 cm⁻¹, the number of scans is 64, the resolution is 8 cm⁻¹, and the reference is the built-in background of the instrument. Each sample is scanned 3 times, and the average of the 3 spectra is taken as the sample spectrum.
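Averaging the repeated scans of one sample is a point-by-point mean, as the following minimal sketch shows. It assumes that all scans are sampled at the same wavenumbers; `average_spectrum` is a hypothetical helper.

```python
from typing import List, Sequence


def average_spectrum(scans: Sequence[Sequence[float]]) -> List[float]:
    """Point-by-point mean of repeated scans of the same sample."""
    if not scans:
        raise ValueError("at least one scan is required")
    length = len(scans[0])
    if any(len(scan) != length for scan in scans):
        raise ValueError("all scans must have the same number of points")
    return [sum(values) / len(scans) for values in zip(*scans)]


if __name__ == "__main__":
    print(average_spectrum([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]))  # [2.0, 4.0]
```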
4-2 neural network recognition model
The near-infrared spectrum of the unknown sample in the spectral range 4000-[…] cm⁻¹ is selected and processed with multiplicative scatter correction and vector normalization. The scores of the first 10 principal components are extracted with MATLAB software and input into the discrimination model. If the model output is 10, the sample is judged to be Pingli Gynostemma pentaphyllum; if the output is 01, it is judged not to be Pingli Gynostemma pentaphyllum. The results show that the recognition accuracy for the 270 training samples is 100% and the recognition accuracy for the 135 prediction samples is 100%; the specific verification results are shown in Table 4.
TABLE 4 Identification results of the artificial neural network for Gynostemma pentaphyllum
Figure BDA0000469649420000121
The above description is only provided as an implementable technical solution of the method for identifying the Pingli gynostemma pentaphylla by using the near infrared spectrum of the invention, and is not a single limitation condition for the technical solution.
Reference documents:
[1] "bamboo-Ben Chang Song", a research on the composition of Cucurbitaceae plants-saponin component of Gynostemma pentaphyllum (Ma) Makino.) Makino, volume 7, phase 5, 1985, month 4
[2] Research and development of water-soluble polysaccharide of Gynostemma pentaphyllum Makino by WANGSHUANGJING, Lord Shihui, alkali, food research and development, Vol.27, No. 5, 2006, 5 months
[3] Research on refining process of gynostemma pentaphylla flavonoid compound by using Wangqinghao, Zhangluo and macroporous adsorption resin [ J ] forest chemical communication, volume 39, 6 th year, 6 th 2005
[4] Dang Shilin, the well-known of the world, analysis of amino acids, vitamins and various chemical elements in Gynostemma pentaphyllum [ J ]. Vol.19, university of Hunan medical sciences, Vol.6, 1994, 6 months.

Claims (7)

1. A method for identifying Pingli gynostemma pentaphylla by near infrared spectrum is characterized by comprising the following steps:
A. Establishing a near infrared spectrum identification model of Pingli Gynostemma pentaphyllum
A-1, selecting the spectral range 4000-12500 cm⁻¹ and scanning the near infrared spectrum of Pingli Gynostemma pentaphyllum;
A-2, selecting the spectral range 4000-[…] cm⁻¹ and preprocessing the data;
A-3, extracting the principal components;
A-4, establishing an artificial neural network model: determining the structure of the neural network with an artificial neural network algorithm according to the characteristics of the input and output data, and training the neural network with the training data to obtain an identification model of Pingli Gynostemma pentaphyllum;
B. Identification of unknown samples
Scanning the near infrared spectrum of an unknown sample under the same conditions, selecting the number of principal components, and judging the authenticity of the unknown sample with the trained neural network model; the output nodes are represented by binary codes, where 10 represents Pingli Gynostemma pentaphyllum and 01 represents non-Pingli Gynostemma pentaphyllum.
2. The method for identifying Pingli Gynostemma pentaphyllum by near infrared spectrum according to claim 1, wherein scanning the near infrared spectrum in step A-1 comprises: drying and crushing an effective amount of the Gynostemma pentaphyllum sample, placing it evenly in a quartz sample cell, and scanning the absorption spectrum with a Fourier transform near infrared spectrometer; the scanning mode is rotating diffuse reflection and the resolution is 8 cm⁻¹; each sample is scanned multiple times, and the average spectrum is taken as the final analysis spectrum of the sample.
3. The method for identifying Pingli Gynostemma pentaphyllum by near infrared spectrum according to claim 1, wherein preprocessing the near infrared spectral data in the range 4000-[…] cm⁻¹ in step A-2 comprises: applying multiplicative scatter correction and vector normalization to the spectra of the Gynostemma pentaphyllum samples, which eliminates the influence of interference factors such as sample non-uniformity, light scattering and instrument noise and thereby improves the prediction accuracy and stability of the model.
4. The method for identifying Pingli Gynostemma pentaphyllum by near infrared spectrum according to claim 1, wherein the principal component extraction in step A-3 reduces the dimension of the spectral information by principal component analysis; the cumulative contribution rate of the first 10 principal components is 99.99%, and using this limited number of inputs reduces the computational complexity of the model and improves its prediction accuracy.
5. The method for identifying Pingli Gynostemma pentaphyllum by near infrared spectrum according to claim 1, wherein establishing the artificial neural network model in step A-4 means establishing, with MATLAB software, a BP artificial neural network model with 10 input-layer nodes, 5 hidden-layer nodes and 2 output-layer nodes:
A-4-1, determining the number of input-layer nodes: taking the 10 principal component scores as parameters, the number of input-layer nodes of the network is set to 10;
A-4-2, determining the number of hidden nodes with the following equation:
$$L = \sqrt{0.43mn + 0.12n^{2} + 2.54m + 0.77n + 0.35} + 0.51$$
where m and n are the numbers of input nodes and output nodes respectively; the formula gives an initial value for the number of hidden nodes, which is then corrected by the step-by-step growth method to obtain the empirical value 5;
A-4-3, determining the number of output-layer nodes: the neural network has 2 output nodes, corresponding to the two possible results of judging whether a Gynostemma pentaphyllum sample comes from the Pingli producing area or from a non-Pingli producing area.
6. The method for identifying Gynostemma pentaphyllum by near infrared spectroscopy according to claim 3, wherein: the principal component of the sample is determined by the following method:
let $x_1, x_2, \ldots, x_n$ be a sample taken from the population $x$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ $(i = 1, 2, \ldots, n)$;
the sample observation matrix is recorded as:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$
each row of $X$ corresponds to a sample and each column corresponds to a variable;
the sample covariance matrix and the sample correlation coefficient matrix are recorded as:
$$S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' = (s_{ij})$$
$$\hat{R} = (r_{ij}), \quad r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii} s_{jj}}}$$
where $\bar{x}$ is the sample mean;
taking $S$ as an estimate of $\Sigma$ and $\hat{R}$ as an estimate of $R$, the principal components of the sample can be determined from $S$ or $\hat{R}$.
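The two estimators in claim 6 can be computed directly from an observation matrix. The sketch below follows the definitions of the sample covariance matrix (with the 1/(n-1) factor) and of the correlation coefficients; the function names and the small observation matrix are hypothetical.

```python
from math import sqrt

def covariance_matrix(X):
    """Sample covariance matrix S = (s_ij) with the 1/(n-1) factor."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    return [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in X) / (n - 1)
             for j in range(p)] for i in range(p)]

def correlation_matrix(S):
    """Sample correlation matrix with r_ij = s_ij / sqrt(s_ii * s_jj)."""
    p = len(S)
    return [[S[i][j] / sqrt(S[i][i] * S[j][j]) for j in range(p)]
            for i in range(p)]

# Hypothetical observations: n = 4 samples (rows), p = 2 variables (columns).
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 9.0]]
S = covariance_matrix(X)
R = correlation_matrix(S)
```

The correlation matrix has a unit diagonal regardless of the scale of each variable, which is why it is a natural starting point when the variables are not directly comparable in magnitude.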
7. The method for identifying Pingli Gynostemma pentaphyllum by near infrared spectroscopy according to claim 6, wherein the principal components of the sample are obtained starting from the sample correlation coefficient matrix $\hat{R}$:
let $\hat{\lambda}_1^* \ge \hat{\lambda}_2^* \ge \cdots \ge \hat{\lambda}_p^* \ge 0$ be the $p$ eigenvalues of $\hat{R}$, and let $\hat{t}_1^*, \hat{t}_2^*, \ldots, \hat{t}_p^*$ be the corresponding orthonormal eigenvectors; the $p$ principal components of the sample are then
$$\hat{y}_i^* = \hat{t}_i^{*\prime} x^*, \quad i = 1, 2, \ldots, p$$
Substituting the standardized observation $x_i^*$ of sample $x_i$ into the $j$-th principal component gives the $j$-th principal component score of sample $x_i$:
$$\hat{y}_{ij}^* = \hat{t}_j^{*\prime} x_i^* \quad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, p).$$
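The score calculation in claim 7 can be illustrated for the special case of p = 2 variables, where the eigen-decomposition of the correlation matrix has a closed form: the eigenvalues are 1 + r and 1 - r, with eigenvectors (1, 1)/√2 and (1, -1)/√2. This restriction to two variables is an assumption made to keep the sketch free of numerical libraries; a real spectrum would need a general eigen-solver. The function names and data are hypothetical.

```python
from math import sqrt

def standardize(X):
    """Centre each column and divide by its sample standard deviation."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    std = [sqrt(sum((row[j] - mean[j]) ** 2 for row in X) / (n - 1))
           for j in range(p)]
    return [[(row[j] - mean[j]) / std[j] for j in range(p)] for row in X]

def pca_two_variables(X):
    """Principal component scores from the 2x2 sample correlation matrix."""
    Z = standardize(X)
    n = len(Z)
    # Off-diagonal entry of the correlation matrix.
    r = sum(z[0] * z[1] for z in Z) / (n - 1)
    c = 1.0 / sqrt(2.0)
    # Closed-form eigenpairs of [[1, r], [r, 1]], largest eigenvalue first.
    pairs = [(1.0 + r, (c, c)), (1.0 - r, (c, -c))]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    eigenvalues = [value for value, _ in pairs]
    vectors = [vec for _, vec in pairs]
    # Score of sample i on component j: inner product of t_j and x_i*.
    scores = [[t[0] * z[0] + t[1] * z[1] for t in vectors] for z in Z]
    return eigenvalues, vectors, scores

# Hypothetical observations: n = 4 samples, p = 2 variables.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 9.0]]
eigenvalues, vectors, scores = pca_two_variables(X)
```

The sample variance of each score column equals the corresponding eigenvalue, and the eigenvalues sum to p, which is the property that makes the cumulative contribution rate meaningful.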
CN201410065240.XA 2014-02-25 2014-02-25 Method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy Expired - Fee Related CN103776797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410065240.XA CN103776797B (en) 2014-02-25 2014-02-25 Method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410065240.XA CN103776797B (en) 2014-02-25 2014-02-25 Method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy

Publications (2)

Publication Number Publication Date
CN103776797A (en) 2014-05-07
CN103776797B (en) 2016-09-21

Family

ID=50569301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410065240.XA Expired - Fee Related CN103776797B (en) 2014-02-25 2014-02-25 Method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy

Country Status (1)

Country Link
CN (1) CN103776797B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1603794A (en) * 2004-11-02 2005-04-06 江苏大学 Method and device for rapidly detecting tenderness of beef utilizing near infrared technology
CN101532954A (en) * 2008-03-13 2009-09-16 天津天士力现代中药资源有限公司 Method for identifying traditional Chinese medicinal materials by combining infra-red spectra with cluster analysis
CN101957316A (en) * 2010-01-18 2011-01-26 河北大学 Method for authenticating Xiangshui rice by near-infrared spectroscopy
CN101968438A (en) * 2010-09-25 2011-02-09 西北农林科技大学 Method for distinguishing water injection of raw material muscles quickly
CN101995392A (en) * 2010-11-15 2011-03-30 中华人民共和国上海出入境检验检疫局 Method for rapidly detecting adulteration of olive oil
CN102636452A (en) * 2012-05-03 2012-08-15 中国科学院长春光学精密机械与物理研究所 NIR (Near Infrared Spectrum) undamaged identification authenticity method for wild ginseng

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NING LI ET AL.: "Fast discrimination of traditional Chinese medicine according to geographical origins with FTIR spectroscopy and advanced pattern recognition techniques", 《OPTICS EXPRESS》 *
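Jiang Jian et al.: "Study on a quality identification method for Schisandra chinensis based on principal component analysis and artificial neural network", 《红外》 (Infrared) *
Guo Ping et al.: "Fourier transform infrared and Raman spectroscopic analysis of the Chinese herbal medicine Gynostemma pentaphyllum", 《光谱学与光谱分析》 (Spectroscopy and Spectral Analysis) *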

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106485094A (en) * 2016-11-30 2017-03-08 华东理工大学 A kind of PX oxidation reaction production process agent model modeling method
CN108324286A (en) * 2018-01-26 2018-07-27 重庆大学 A kind of infrared light noninvasive dynamics monitoring device based on PCA-NARX correcting algorithms
CN108324286B (en) * 2018-01-26 2020-11-10 重庆大学 Infrared noninvasive blood glucose detection device based on PCA-NARX correction algorithm
CN108444944A (en) * 2018-02-24 2018-08-24 盐城工学院 A kind of radix polygoni multiflori powder place of production discrimination method for the spectrometry that diffused based on near-infrared
CN108509997A (en) * 2018-04-03 2018-09-07 深圳市药品检验研究院(深圳市医疗器械检测中心) A method of Chemical Pattern Recognition is carried out to the true and false that Chinese medicine Chinese honey locust is pierced based on near-infrared spectrum technique
WO2019192433A1 (en) * 2018-04-03 2019-10-10 深圳市药品检验研究院(深圳市医疗器械检测中心) Method for chemical pattern recognition of authenticity of traditional chinese medicine chinese honeylocust spine based on near-infrared spectroscopy
US11656176B2 (en) 2018-04-03 2023-05-23 Shenzhen Institute For Drug Control (Shenzhen Testing Center Of Medical Devices) Near-infrared spectroscopy-based method for chemical pattern recognition of authenticity of traditional Chinese medicine Gleditsiae spina
WO2023279338A1 (en) * 2021-07-08 2023-01-12 Shanghaitech University Neural spectral field reconstruction for spectrometer

Also Published As

Publication number Publication date
CN103776797B (en) 2016-09-21

Similar Documents

Publication Publication Date Title
CN107677647B (en) Method for identifying origin of traditional Chinese medicinal materials based on principal component analysis and BP neural network
CN103776797B (en) Method for identifying Pingli fiveleaf gynostemma herb through near infrared spectroscopy
CN104677875B (en) A kind of three-dimensional fluorescence spectrum combines the method that parallel factor differentiates different brands Chinese liquor
CN108875913B (en) Tricholoma matsutake rapid nondestructive testing system and method based on convolutional neural network
CN109187392B (en) Zinc liquid trace metal ion concentration prediction method based on partition modeling
Griffiths et al. Self-weighted correlation coefficients and their application to measure spectral similarity
CN101957316B (en) Method for authenticating Xiangshui rice by near-infrared spectroscopy
CN105279379A (en) Terahertz spectroscopy feature extraction method based on convex combination kernel function principal component analysis
CN106596513A (en) Tea leaf variety identification method based on laser induced breakdown spectroscopy
CN108573105A (en) The method for building up of soil heavy metal content detection model based on depth confidence network
CN103278467A (en) Rapid nondestructive high-accuracy method with for identifying abundance degree of nitrogen element in plant leaf
Li et al. Manufacturer identification and storage time determination of “Dong’e Ejiao” using near infrared spectroscopy and chemometrics
CN106383088A (en) A seed purity rapid nondestructive testing method based on a multispectral imaging technique
Huang et al. Applications of machine learning in pine nuts classification
CN104568824A (en) Method and device for detecting freshness grade of shrimps based on visible/near-infrared spectroscopy
Xiao et al. Rapid detection of maize seed germination rate based on Gaussian process regression with selection kernel function
Fan et al. Non-destructive detection of single-seed viability in maize using hyperspectral imaging technology and multi-scale 3D convolutional neural network
Zhang et al. Brand Identification of Soybean Milk Powder based on Raman Spectroscopy Combined with Random Forest Algorithm
CN114112983A (en) Python data fusion-based Tibetan medicine all-leaf artemisia rupestris L producing area distinguishing method
Liu et al. Technical exploration of the origins, storage periods and species identification of Boletus bainiugan
Qi et al. Composition analysis of white mineral pigment based on convolutional neural network and Raman spectrum
CN113406037B (en) Infrared spectrum online rapid identification analysis method based on sequence forward selection
Xiao et al. Identification of geographical origin and adulteration of Northeast China soybeans by mid-infrared spectroscopy and spectra augmentation
Kong et al. An integrated field and hyperspectral remote sensing method for the estimation of pigments content of Stipa Purpurea in Shenzha, Tibet
Tang et al. Determining the content of nitrogen in rubber trees by the method of NIR spectroscopy

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160921

Termination date: 20180225
