WO2019192433A1 - Method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy

Info

Publication number
WO2019192433A1
WO2019192433A1 PCT/CN2019/080873 CN2019080873W WO2019192433A1 WO 2019192433 A1 WO2019192433 A1 WO 2019192433A1 CN 2019080873 W CN2019080873 W CN 2019080873W WO 2019192433 A1 WO2019192433 A1 WO 2019192433A1
Authority
WO
WIPO (PCT)
Prior art keywords
saponin
pattern recognition
batches
authenticity
chinese medicine
Prior art date
Application number
PCT/CN2019/080873
Other languages
English (en)
French (fr)
Inventor
王铁杰
王丽君
闫研
王珏
殷果
江坤
李菁
王洋
回音
Original Assignee
深圳市药品检验研究院(深圳市医疗器械检测中心)
Application filed by 深圳市药品检验研究院(深圳市医疗器械检测中心)
Priority to US17/043,325 (US11656176B2)
Publication of WO2019192433A1 (zh)

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/359 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using near infrared light
    • G01N21/3504 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing gases, e.g. multi-gas analysis
    • G01N21/3563 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing solids; Preparation of samples therefor
    • G01N21/3577 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing liquids, e.g. polluted water
    • G01N2201/00 Features of devices classified in G01N21/00
    • G01N2201/12 Circuits of general importance; Signal processing
    • G01N2201/129 Using chemometrical methods
    • G01N2201/1296 Using chemometrical methods using neural networks
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2115 Selection of the most significant subset of features by evaluating different subsets according to an optimisation criterion, e.g. class separability, forward selection or backward elimination
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/23 Clustering techniques
    • G06F18/24 Classification techniques

Definitions

  • The present application belongs to the technical field of chemical analysis and relates to a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy.
  • Gleditsiae Spina is the dried thorn of Gleditsia sinensis Lam. (Fabaceae). It is used to detoxify and reduce swelling, kill parasites and expel pus (Chinese Pharmacopoeia, 2015 edition, Part One [S]. 2015: 177-178). Modern pharmacological tests have shown that flavonoids in Gleditsiae Spina, such as fustin and quercetin, have good anti-tumor effects (Xu Zhe, Zhao Xiaodi, Wang Yimeng et al. Isolation, identification and activity determination of anti-tumor active ingredients of Gleditsiae Spina [J]. Journal of Shenyang Pharmaceutical University. 2008, (2): 108-111).
  • Near-infrared spectroscopy has the characteristics of fast analysis speed, simple pre-treatment, environmental protection and no pollution. It can directly measure solid, liquid and gas samples. At present, it has been widely used in the field of pharmacy for the authenticity identification of drugs, the identification of origin, and the quantitative analysis of adulteration.
  • Chemical pattern recognition accommodates the fuzziness and holistic nature of the compositional information of traditional Chinese medicines; it is a newer technique that uses a computer to describe and classify the chemical composition information in samples.
  • In view of the shortcomings of the prior art, the purpose of the present application is to provide a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy.
  • The present application provides a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina based on near-infrared spectroscopy, the method comprising the following steps:
  • step (5): substituting the five characteristic wavenumber points $x_{8}$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test set samples into the discriminant functions obtained in step (4) to determine the accuracy with which Gleditsiae Spina and its counterfeits are discriminated.
  • Chemical pattern recognition of the authenticity of Gleditsiae Spina is achieved by combining near-infrared spectral acquisition with first-derivative preprocessing, the successive projections algorithm, the Kennard-Stone algorithm and the stepwise algorithm.
  • The result of the discrimination method is accurate and reliable: it accurately distinguishes Gleditsiae Spina from its counterfeits and provides a scientific basis for the quality evaluation of Gleditsiae Spina.
  • Preferably, the counterfeits of step (1) are mountain Gleditsia thorn (山皂角刺), wild Gleditsia thorn (野皂角刺) and Rubus (悬钩子).
  • Preferably, the near-infrared spectrum of step (1) is acquired over 12000 to 4000 cm⁻¹ with an instrument resolution of 4 cm⁻¹ and 32 scans.
  • The successive projections algorithm is used to screen the characteristic wavenumber points (i.e., characteristic variables) in the range of 5000 to 4200 cm⁻¹. Because the spectral interval 11800 to 7500 cm⁻¹ contains 2230 variables, the interval 6500 to 5500 cm⁻¹ contains 519 variables and the interval 5000 to 4200 cm⁻¹ contains 416 variables, the successive projections algorithm is used to compress the data effectively and eliminate the interference of collinear data with the model, which greatly reduces model complexity and facilitates modeling.
  • The stepwise method introduces variables one at a time. The stepwise rule uses the minimum F value method: a variable with a large influence on the classification is added when its F value is greater than 3.84, and a variable with a small influence is removed when its F value is less than 2.71. This reduces the misjudgment rate and improves the accuracy of the model.
  • In step (2), first-derivative preprocessing is applied to the 5000 to 4200 cm⁻¹ band, which gives higher modeling accuracy. Preprocessing by Savitzky-Golay (SG) smoothing, Vector Normalization (VN), Min-Max Normalization (MMN) or Second Derivative (2nd D) gives lower modeling accuracy than the first derivative.
  • Preferably, the training set of step (3) comprises 32 batches of samples: 24 batches of Gleditsiae Spina (皂角刺), 3 batches of mountain Gleditsia thorn (山皂角刺), 2 batches of wild Gleditsia thorn (野皂角刺) and 3 batches of Rubus (悬钩子). The test set comprises 11 batches: 8 batches of Gleditsiae Spina, 1 batch of mountain Gleditsia thorn, 1 batch of wild Gleditsia thorn and 1 batch of Rubus.
  • To verify the accuracy of the authenticity discrimination, systematic (hierarchical) cluster analysis is performed on the five characteristic wavenumber points extracted by the stepwise method in step (4).
  • Preferably, the cluster analysis uses the sum-of-squared-deviations (Ward's) method, with squared Euclidean distance as the distance measure.
  • The cluster analysis shows that the five extracted characteristic wavenumbers accurately and effectively distinguish genuine Gleditsiae Spina from counterfeits and also distinguish the different types of counterfeits.
  • To further verify the accuracy of the authenticity discrimination, a BP neural network model is used to verify the accuracy of the pattern recognition result for the characteristic wavenumber points obtained in step (3).
  • Preferably, the BP neural network model takes the characteristic wavenumber points extracted by the successive projections algorithm as its input: the input layer has as many nodes as there are characteristic wavenumber points, the hidden layer has 10 nodes and the output layer has 4 nodes.
  • The code of Gleditsiae Spina (皂角刺) is [1 0 0 0], the code of mountain Gleditsia thorn (山皂角刺) is [0 1 0 0], the code of wild Gleditsia thorn (野皂角刺) is [0 0 1 0] and the code of Rubus (悬钩子) is [0 0 0 1].
  • The learning algorithm of the neural network is the conjugate gradient algorithm, the training rule is the Levenberg-Marquardt algorithm, and the sample set is randomly assigned to a training set, a validation set and a test set.
  • To screen for the best modeling conditions, BP neural network models are built from the training set data under different spectral ranges and different preprocessing methods. To further check predictive performance, samples of the validation set and test set are used to verify the recognition ability of the models.
  • The results show that with the spectral interval 5000 to 4200 cm⁻¹ and first-derivative preprocessing, the classification accuracy of the model for the training set, validation set and test set is 100% in each case, which indicates that the BP artificial neural network model effectively distinguishes genuine Gleditsiae Spina from counterfeits.
  • As a preferred technical solution, the method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy specifically includes the following steps:
  • step (3): Gleditsiae Spina and its counterfeits, mountain Gleditsia thorn (山皂角刺), wild Gleditsia thorn (野皂角刺) and Rubus (悬钩子), are divided into training set samples and test set samples. The training set comprises 32 batches: 24 batches of Gleditsiae Spina, 3 batches of mountain Gleditsia thorn, 2 batches of wild Gleditsia thorn and 3 batches of Rubus. The test set comprises 11 batches: 8 batches of Gleditsiae Spina, 1 batch of mountain Gleditsia thorn, 1 batch of wild Gleditsia thorn and 1 batch of Rubus;
  • step (5): substituting the five characteristic wavenumber points $x_{8}$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test set samples into the discriminant functions obtained in step (4) to determine the accuracy with which Gleditsiae Spina and its counterfeits are discriminated;
  • Systematic cluster analysis is performed on the five characteristic wavenumber points extracted by the stepwise method of step (4) to verify the accuracy of the discriminant functions, and a BP neural network model is used to verify the accuracy of the pattern recognition result for the characteristic wavenumber points obtained in step (3).
  • The cluster analysis uses the sum-of-squared-deviations (Ward's) method, with squared Euclidean distance as the distance measure.
  • The BP neural network model takes the characteristic wavenumber points extracted by the successive projections algorithm as its input; the input layer has as many nodes as there are characteristic wavenumber points, the hidden layer has 10 nodes and the output layer has 4 nodes.
  • By combining near-infrared spectral acquisition with first-derivative preprocessing, the successive projections algorithm, the Kennard-Stone algorithm and the stepwise algorithm, the method makes the identification result accurate and reliable and accurately distinguishes Gleditsiae Spina from its counterfeits.
  • This application is the first to establish a chemical pattern recognition method for the quality of Gleditsiae Spina based on near-infrared spectroscopy. It accurately distinguishes Gleditsiae Spina from its counterfeits and provides a scientific basis for the quality evaluation of Gleditsiae Spina.
  • The present application establishes a chemical pattern recognition method for distinguishing genuine Gleditsiae Spina from counterfeits by cluster analysis, discriminant analysis and BP neural network analysis. It overcomes the subjectivity of traditional identification methods and is more scientific and comprehensive.
  • Fig. 1 shows the original average near-infrared spectra of Gleditsiae Spina (皂角刺) and its counterfeits: mountain Gleditsia thorn (山皂角刺), wild Gleditsia thorn (野皂角刺) and Rubus (悬钩子).
  • Fig. 2A shows the near-infrared spectra obtained by preprocessing the original average spectra with Savitzky-Golay (SG) smoothing and Vector Normalization (VN).
  • Fig. 2B shows the near-infrared spectra obtained by preprocessing the original average spectra with Savitzky-Golay (SG) smoothing and Min-Max Normalization (MMN).
  • Fig. 2C shows the near-infrared spectra obtained by preprocessing the original average spectra with the First Derivative (1st D) method.
  • Fig. 2D shows the near-infrared spectra obtained by preprocessing the original average spectra with the Second Derivative (2nd D) method.
  • Fig. 3 shows the cluster analysis results of the present application.
  • Instruments: VERTEX 70 Fourier transform near-infrared spectrometer (Bruker, Germany) with an indium gallium arsenide (InGaAs) detector; RT-04A high-speed pulverizer (Hong Kong ⁇ Pharmaceutical Machinery Company).
  • Software: spectral data were preprocessed with OPUS 6.5 (Bruker, Germany); the successive projections algorithm, the Kennard-Stone algorithm and the BP neural network were run in Matlab R2014a (MathWorks, USA); cluster analysis and discriminant analysis were performed with SPSS 21.0 (IBM, USA).
  • the samples used are as follows:
  • The method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) includes the following steps:
  • The preprocessing method is screened first.
  • The candidate preprocessing methods are Savitzky-Golay (SG) smoothing, Vector Normalization (VN), Min-Max Normalization (MMN), First Derivative (1st D) and Second Derivative (2nd D). These methods, and combinations of some of them, are applied to the original spectra of the samples, and the influence of each preprocessing method on modeling accuracy is examined.
  • The preprocessed spectra are shown in Figure 2.
  • The successive projections algorithm is used to screen the characteristic wavenumber points (characteristic variables) in each interval, and the extracted data are used as the independent variables for stepwise discriminant analysis.
  • Variables are introduced step by step with Wilks' Lambda as the index to establish the canonical discriminant functions.
  • From the canonical discriminant function scores of Gleditsiae Spina and its counterfeits, the probability of assignment to Gleditsiae Spina and to each type of counterfeit is determined.
  • The classification accuracy under each method is shown in Table 2. Table 2 shows that for the 5000 to 4200 cm⁻¹ band, discriminant models built on the original spectra, on SG+VN data and on first-derivative data all classify Gleditsiae Spina and its counterfeits with 100% accuracy.
  • Internal cross-validation was used to examine the discrimination results. As shown in Table 3, with the 5000 to 4200 cm⁻¹ band and the original spectra, one Rubus (悬钩子) sample was misjudged as Gleditsiae Spina (皂角刺), and the cross-validation accuracy was 96.9%. With the 5000 to 4200 cm⁻¹ band and SG+VN preprocessing, three Gleditsiae Spina samples were misjudged: one as mountain Gleditsia thorn (山皂角刺), one as wild Gleditsia thorn (野皂角刺) and one as Rubus, and the cross-validation accuracy was 90.6%. With the 5000 to 4200 cm⁻¹ band and first-derivative preprocessing, there were no misjudgments and the cross-validation accuracy was 100%, so this discriminant model is the most effective.
  • GS: Gleditsiae Spina (皂角刺)
  • GJ: mountain Gleditsia thorn (山皂角刺)
  • GM: wild Gleditsia thorn (野皂角刺)
  • RC: Rubus (悬钩子).
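The cross-validation percentages quoted for Table 3 are consistent with 1, 3 and 0 misjudged batches out of the 32 training batches. The short check below reproduces that arithmetic; the denominator of 32 is an inference from the reported figures rather than something the text states, and the helper name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(n_total, n_misclassified):
    """Share of correctly classified batches, in percent."""
    return 100.0 * (n_total - n_misclassified) / n_total


# Assumed denominator: the 32 training batches described in step (3).
for label, errors in [("original spectra", 1), ("SG+VN", 3), ("first derivative", 0)]:
    print(f"{label}: {accuracy_percent(32, errors):.1f}%")
```

With these inputs the three printed values match the 96.9%, 90.6% and 100% reported in the text.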
  • First-derivative preprocessing gives the most accurate discrimination, so it is applied to the 5000 to 4200 cm⁻¹ band.
  • step (3): Gleditsiae Spina (皂角刺) and its counterfeits, mountain Gleditsia thorn (山皂角刺), wild Gleditsia thorn (野皂角刺) and Rubus (悬钩子), are divided into training set samples and test set samples. The training set comprises 32 batches: 24 batches of Gleditsiae Spina, 3 batches of mountain Gleditsia thorn, 2 batches of wild Gleditsia thorn and 3 batches of Rubus. The test set comprises 11 batches: 8 batches of Gleditsiae Spina, 1 batch of mountain Gleditsia thorn, 1 batch of wild Gleditsia thorn and 1 batch of Rubus;
  • step (5): substituting the five characteristic wavenumber points $x_{8}$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test set samples into the discriminant functions obtained in step (4) to determine the accuracy with which Gleditsiae Spina and its counterfeits are discriminated.
  • The 11 batches of test set samples are substituted into the discriminant functions to classify Gleditsiae Spina and its counterfeits.
  • The results are shown in Table 4.
  • The discrimination accuracy for the 11 batches is 100%, which indicates that the canonical discriminant functions accurately classify Gleditsiae Spina and its counterfeits.
  • GS: Gleditsiae Spina (皂角刺)
  • GJ: mountain Gleditsia thorn (山皂角刺)
  • GM: wild Gleditsia thorn (野皂角刺)
  • RC: Rubus (悬钩子).
  • Systematic (hierarchical) cluster analysis is carried out on the five characteristic variables extracted by the stepwise method.
  • The clustering method is the sum-of-squared-deviations (Ward's) method, and the distance measure is squared Euclidean distance.
  • The clustering dendrogram is shown in Figure 3. Genuine Gleditsiae Spina (皂角刺) samples No. 1-32 are grouped into Class I and the counterfeit samples No. 33-43 into Class II. Within the counterfeits, No. 33-36 are mountain Gleditsia thorn (山皂角刺) and form Class III, No. 37-39 are wild Gleditsia thorn (野皂角刺) and form Class IV, and No. 40-43 are Rubus (悬钩子) and form Class V. The clustering results are consistent with the results of morphological identification.
  • The clustering results show that the five extracted characteristic wavenumbers accurately and effectively distinguish genuine Gleditsiae Spina from counterfeits and also distinguish the different types of counterfeits.
  • The characteristic variables extracted by the successive projections algorithm are used as the input of the neural network.
  • The number of input-layer nodes equals the number of characteristic variables, the hidden layer contains 10 nodes and the output layer contains 4 nodes.
  • A three-layer BP neural network model is thus established.
  • The code of Gleditsiae Spina (皂角刺) is [1 0 0 0].
  • The code of mountain Gleditsia thorn (山皂角刺) is [0 1 0 0].
  • The code of wild Gleditsia thorn (野皂角刺) is [0 0 1 0].
  • The code of Rubus (悬钩子) is [0 0 0 1].
  • The learning algorithm of the neural network is the conjugate gradient algorithm, the training rule is the Levenberg-Marquardt algorithm, and the sample set is randomly assigned to a training set, a validation set and a test set.
  • BP neural network models are built from the training set data under different spectral ranges and different preprocessing methods.
  • Samples of the validation set and test set are used to verify the recognition ability of the BP neural network models.
  • The classification results are shown in Table 5. With the spectral interval 5000 to 4200 cm⁻¹ and first-derivative preprocessing, the classification accuracy of the model for the training set, validation set and test set is 100% in each case, which indicates that the BP artificial neural network model effectively distinguishes genuine Gleditsiae Spina from counterfeits.
  • By combining near-infrared spectral acquisition with the successive projections algorithm, first-derivative preprocessing, the Kennard-Stone algorithm and the stepwise algorithm, the present application makes the result of the discrimination method accurate and reliable and accurately distinguishes Gleditsiae Spina (皂角刺) from its counterfeits.
  • Sampling with the fiber-optic probe may introduce noise at the two ends of the spectrum, so the noisy bands 12000 to 11800 cm⁻¹ and 4200 to 4000 cm⁻¹ are excluded.
  • The volume of near-infrared spectral data is large: the spectral interval 11800 to 7500 cm⁻¹ contains 2230 variables, the interval 6500 to 5500 cm⁻¹ contains 519 variables and the interval 5000 to 4200 cm⁻¹ contains 416 variables. Compressing the data with the successive projections algorithm eliminates the interference of collinear data with the model, which greatly reduces model complexity and facilitates modeling.
  • In the stepwise discriminant analysis, variables are introduced one at a time. The stepwise rule uses the minimum F value method: a variable with a large influence on the classification is added when its F value is greater than 3.84, and a variable with a small influence is removed when its F value is less than 2.71. This reduces the misjudgment rate and improves the accuracy of the model.
  • Savitzky-Golay smoothing effectively removes high-frequency noise and improves the signal-to-noise ratio; vector normalization and min-max normalization correct spectral errors caused by particle scattering; the first and second derivatives remove baseline offset and drift and improve resolution and sensitivity. In the investigation of preprocessing methods, first-derivative preprocessing gave the most accurate discrimination results.
  • The cluster analysis results show that the Gleditsiae Spina (皂角刺) samples fall mainly into two classes. Samples 10, 11, 21, 18, 25, 26, 27, 28, 29, 30, 31, 32, 22, 4 and 23, from Hubei, Hebei, Beijing, Hebei Xinle, Shandong Taian and Anhui Zhangzhou, are grouped into one class, indicating that the quality of Gleditsiae Spina from these production areas is relatively close.
  • Samples 12, 13, 14, 15, 16, 17, 19, 20, 5, 6, 7, 8, 1, 2, 3, 9 and 24, from Luoyang, Henan, Shandong, Zaozhuang, Shaanxi Lishui, Shanxi Yuncheng, Hubei, Shenyang, Guangxi and other places, are grouped into the other class, indicating that the quality of Gleditsiae Spina from these production areas is similar. The differences between the two classes may be caused by factors such as growth period and the temperature, light and rainfall of the producing area, and need further study.
  • The BP neural network analysis shows that BP artificial neural network modeling performs well. Table 5 shows that the classification accuracy of the models on the training set ranges from 82.6% to 100%, and 11 of the 15 models built under different conditions reach 100% on the training set. Prediction accuracy ranges from 63.6% to 100% on the validation set and from 44.4% to 100% on the test set, and several models classify both the validation set and the test set with 100% accuracy. After optimization and screening, the model built on the 5000 to 4200 cm⁻¹ spectral range with first-derivative preprocessing is the best BP neural network model: its classification accuracy on the training set, validation set and test set is 100% in each case.
  • The above embodiments illustrate the method of the present application for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺).
  • The present application is not limited to these embodiments; that is, it does not depend on them to be carried out. It should be apparent to those skilled in the art that any modification of the present application, equivalent substitution of the materials selected, addition of auxiliary components, selection of specific manners and the like all fall within the scope of the present application.

Abstract

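The present application provides a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy. The method combines near-infrared spectral acquisition, first-derivative preprocessing, the successive projections algorithm, the Kennard-Stone algorithm and the stepwise algorithm to perform chemical pattern recognition of the authenticity of Gleditsiae Spina, so that the result of the pattern recognition method is accurate and reliable and Gleditsiae Spina can be accurately distinguished from its counterfeits. The present application is the first to establish a chemical pattern recognition method for the quality of Gleditsiae Spina based on near-infrared spectroscopy; it accurately distinguishes Gleditsiae Spina from its counterfeits and provides a scientific basis for the quality evaluation of Gleditsiae Spina.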

Description

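Method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy
Technical Field
The present application belongs to the technical field of chemical analysis and relates to a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina based on near-infrared spectroscopy.
Background
Gleditsiae Spina is the dried thorn of Gleditsia sinensis Lam. (Fabaceae). It is used to detoxify and reduce swelling, kill parasites and expel pus (Chinese Pharmacopoeia, 2015 edition, Part One [S]. 2015: 177-178). Modern pharmacological tests have shown that flavonoids in Gleditsiae Spina, such as fustin and quercetin, have good anti-tumor effects (Xu Zhe, Zhao Xiaodi, Wang Yimeng et al. Isolation, identification and activity determination of anti-tumor active ingredients of Gleditsiae Spina [J]. Journal of Shenyang Pharmaceutical University. 2008, (2): 108-111). As market demand has grown, thorns of other plants that look similar but are of inferior quality, such as mountain Gleditsia thorn (山皂角刺), wild Gleditsia thorn (野皂角刺) and Rubus (悬钩子), have been sold as adulterants. These counterfeits closely resemble the genuine product in appearance and are even harder to identify by eye once processed into decoction pieces or powder. At present, traditional morphological and microscopic identification methods do not address the chemical components responsible for the efficacy of a traditional Chinese medicine. Physicochemical identification evaluates only individual components of the complex composition of a medicinal material and can hardly reflect its overall quality (Wang Tiejie, Luo Xu, Wang Xi et al. Chemical pattern recognition of the quality of the traditional Chinese medicine Gentiana (龙胆) [J]. Acta Pharmaceutica Sinica, 1992, (6): 456-461; Wang Yang, Shen Li, Jiang Kun et al. Study on chemical pattern recognition of the quality of the traditional Chinese medicine Amomi Fructus (砂仁) [J]. Chinese Journal of Pharmaceutical Analysis. 2016, (10): 1863-1869). Determining active components such as the flavonoids in Gleditsiae Spina therefore cannot represent its overall efficacy.
Near-infrared spectroscopy offers fast analysis, simple sample pretreatment and no pollution, and it can measure solid, liquid and gas samples directly. In pharmacy it is already widely used for authenticity identification of drugs, identification of origin and quantitative analysis of adulteration. Chemical pattern recognition accommodates the fuzziness and holistic nature of the compositional information of traditional Chinese medicines; it is a newer technique that uses a computer to describe and classify the chemical composition information in samples.
However, no chemical pattern recognition method for genuine and counterfeit Gleditsiae Spina exists in the field so far, and how to distinguish genuine Gleditsiae Spina from counterfeits quickly and accurately remains a research focus.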
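Summary
In view of the shortcomings of the prior art, the purpose of the present application is to provide a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) based on near-infrared spectroscopy.
To achieve this purpose, the present application adopts the following technical solutions:
The present application provides a method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina based on near-infrared spectroscopy, the method comprising the following steps: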
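(1) Acquire the near-infrared spectra of samples of Gleditsiae Spina (皂角刺) and its counterfeits, subtract the built-in reference background, collect spectra at three different positions on the surface of each sample, and take the average spectrum as the original spectrum;
(2) Remove the interference peaks from the original spectrum to obtain the bands 11800 to 7500 cm⁻¹, 6500 to 5500 cm⁻¹ and 5000 to 4200 cm⁻¹, select the 5000 to 4200 cm⁻¹ band as the band for model analysis, and preprocess it with the first-derivative method;
(3) Use the successive projections algorithm to screen the characteristic wavenumber points of the first-derivative-preprocessed 5000 to 4200 cm⁻¹ range and, based on these points, use the Kennard-Stone algorithm to divide the Gleditsiae Spina and counterfeit samples to be discriminated into training set samples and test set samples;
(4) Build the discriminant model on the training set samples, extract the five characteristic wavenumber points $x_{8}$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ by the stepwise method, and introduce these five points to establish the following discriminant functions:
$$F_1 = 36387.907x_{8} + 24242.533x_{13} + 9262.246x_{16} + 11456.025x_{19} + 13209.943x_{21} + 3.210,$$
$$F_2 = -43757.506x_{8} + 40701.987x_{13} + 24623.897x_{16} + 28906.269x_{19} - 20234.651x_{21} + 4.496;$$
(5) Substitute the five characteristic wavenumber points $x_{8}$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test set samples into the discriminant functions obtained in step (4) to determine the accuracy with which Gleditsiae Spina and its counterfeits are discriminated.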
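As a hedged illustration of steps (4) and (5), the two discriminant functions can be evaluated directly from the five first-derivative values of a sample. The sketch below is not part of the application: the name `discriminant_scores` and the dictionary layout are assumptions, and it stops at the two scores because the class centroids needed for the final assignment are not given in this section.

```python
VARIABLES = ("x8", "x13", "x16", "x19", "x21")

# Coefficients copied from the discriminant functions F1 and F2 in step (4).
COEFFS = {
    "F1": {"x8": 36387.907, "x13": 24242.533, "x16": 9262.246,
           "x19": 11456.025, "x21": 13209.943, "const": 3.210},
    "F2": {"x8": -43757.506, "x13": 40701.987, "x16": 24623.897,
           "x19": 28906.269, "x21": -20234.651, "const": 4.496},
}


def discriminant_scores(sample):
    """Return {"F1": ..., "F2": ...} for one sample.

    sample: dict holding the first-derivative value at each of the five
    characteristic wavenumber points (keys as in VARIABLES).
    """
    return {
        name: coef["const"] + sum(coef[v] * sample[v] for v in VARIABLES)
        for name, coef in COEFFS.items()
    }
```

For a sample whose five values are all zero, the scores reduce to the two constant terms, which is a quick sanity check on the transcription of the coefficients.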
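In the present application, chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina (皂角刺) is achieved by combining near-infrared spectral acquisition with first-derivative preprocessing, the successive projections algorithm, the Kennard-Stone algorithm and the stepwise algorithm, and the result of the discrimination method is accurate and reliable. The method accurately distinguishes Gleditsiae Spina from its counterfeits and provides a scientific basis for the quality evaluation of Gleditsiae Spina.
Preferably, the counterfeits of step (1) are mountain Gleditsia thorn (山皂角刺), wild Gleditsia thorn (野皂角刺) and Rubus (悬钩子).
Preferably, the near-infrared spectrum of step (1) is acquired over 12000 to 4000 cm⁻¹ with an instrument resolution of 4 cm⁻¹ and 32 scans.
Preferably, the interference peaks of step (2) are the peaks in the bands 12000 to 11800 cm⁻¹, 4200 to 4000 cm⁻¹, 7500 to 6500 cm⁻¹ and 5500 to 5000 cm⁻¹. The bands 12000 to 11800 cm⁻¹ and 4200 to 4000 cm⁻¹ may be inaccurate because of instrument instability and some external factors, and the peaks in the bands 7500 to 6500 cm⁻¹ and 5500 to 5000 cm⁻¹ are water peaks, so these interference peaks are removed from the analysis.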
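The band selection described above amounts to keeping only the points that fall inside the three retained wavenumber ranges. A minimal sketch follows, assuming the spectrum is held as two parallel lists and that points on a range boundary are kept; the names `KEPT_BANDS`, `in_kept_band` and `select_band` are hypothetical.

```python
# Retained bands after removing the interference peaks, as (high, low) in cm^-1.
KEPT_BANDS = [(11800, 7500), (6500, 5500), (5000, 4200)]


def in_kept_band(wavenumber):
    """True if the point lies in one of the retained bands (boundaries included)."""
    return any(low <= wavenumber <= high for high, low in KEPT_BANDS)


def select_band(wavenumbers, absorbance, band):
    """Return the (wavenumbers, absorbance) points that fall inside one band."""
    high, low = band
    kept = [(w, a) for w, a in zip(wavenumbers, absorbance) if low <= w <= high]
    return [w for w, _ in kept], [a for _, a in kept]
```

Calling `select_band(wavenumbers, absorbance, (5000, 4200))` would isolate the band that the application selects for modeling.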
在本申请中,去除干扰峰后,得到11800~7500cm -1、6500~5500cm -1以及5000~4200cm -1三个谱段峰,11800~7500cm -1、6500~5500cm -1谱段峰建立判别模型未能对正品和伪品进行准确判别,其中由5000~4200cm -1谱段峰建立的判别模型,能够对正品和伪品进行准确的判别。
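In the present application, the successive projections algorithm is used to screen the characteristic wavenumber points (i.e. characteristic variables) in the 5000-4200 cm⁻¹ range. The 11800-7500 cm⁻¹ interval contains 2230 variables, the 6500-5500 cm⁻¹ interval 519 variables and the 5000-4200 cm⁻¹ interval 416 variables. The successive projections algorithm compresses the data effectively, eliminating the interference of collinear data with the model and greatly reducing model complexity, which facilitates modeling.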
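The core of the successive projections algorithm is a chain of orthogonal projections: starting from one variable, each step picks the remaining variable that has the largest norm after the part collinear with the previous picks is projected out. The sketch below shows only this selection chain; the function name, the fixed starting variable and the toy data are assumptions, and full implementations usually also compare several starting variables and chain lengths on a validation set, which is omitted here.

```python
import math


def spa_select(columns, start, n_select):
    """Projection chain of the successive projections algorithm.

    columns: list of variables, each a list of values over the samples.
    start: index of the first variable; n_select: number of variables to pick.
    Returns the selected variable indices in order of selection.
    """
    if not 1 <= n_select <= len(columns):
        raise ValueError("n_select must be between 1 and the number of variables")
    work = [list(c) for c in columns]  # copies that get projected in place
    selected = [start]
    while len(selected) < n_select:
        ref = work[selected[-1]]
        ref_sq = sum(v * v for v in ref)
        best, best_norm = None, -1.0
        for j, col in enumerate(work):
            if j in selected:
                continue
            # Remove the component of this column that lies along the last pick.
            if ref_sq > 0.0:
                scale = sum(a * b for a, b in zip(col, ref)) / ref_sq
                col = [a - scale * b for a, b in zip(col, ref)]
                work[j] = col
            norm = math.sqrt(sum(v * v for v in col))
            if norm > best_norm:
                best, best_norm = j, norm
        selected.append(best)
    return selected


# Toy data: variable 1 is collinear with variable 0, so it is never preferred.
toy = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
print(spa_select(toy, start=0, n_select=3))  # [0, 2, 3]
```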
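In the present application, variables are introduced step by step by the stepwise method. The stepping rule is the minimum F value method: a variable with a large influence on the classification is entered when its F value is greater than 3.84, and a variable with a small influence on the classification is removed when its F value is less than 2.71. This lowers the misclassification rate and improves the accuracy of the model.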
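The entry and removal thresholds can be read as a simple decision rule applied to each variable at every step. The sketch below encodes only that rule; computing the F value itself is not shown, and the function and constant names are hypothetical.

```python
F_TO_ENTER = 3.84   # threshold stated in the text for adding a variable
F_TO_REMOVE = 2.71  # threshold stated in the text for dropping a variable


def stepwise_action(f_value, in_model):
    """Decide what to do with one variable given its F value."""
    if not in_model and f_value > F_TO_ENTER:
        return "enter"
    if in_model and f_value < F_TO_REMOVE:
        return "remove"
    return "no change"


print(stepwise_action(5.2, in_model=False))  # enter
print(stepwise_action(1.9, in_model=True))   # remove
print(stepwise_action(3.0, in_model=True))   # no change
```

Keeping the removal threshold below the entry threshold prevents a variable from being entered and removed again in alternating steps.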
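In the present application, the 5000-4200 cm⁻¹ band is preprocessed in step (2) with the first-derivative method, which gives a higher modeling accuracy. Preprocessing with Savitzky-Golay (SG) smoothing, vector normalization (VN), min-max normalization (MMN) or the second derivative (2nd D) in each case gave a lower modeling accuracy than first-derivative preprocessing.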
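A first derivative removes any constant baseline offset from a spectrum, which is one reason it can help classification. The sketch below uses plain finite differences to show this; it is not the derivative routine of the spectrometer software, which may combine differentiation with smoothing, and the function name and toy values are assumptions.

```python
def first_derivative(wavenumbers, absorbance):
    """Central-difference first derivative (one-sided at the two ends)."""
    n = len(absorbance)
    if n < 2 or n != len(wavenumbers):
        raise ValueError("need two or more points and matching axis length")
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        deriv.append((absorbance[hi] - absorbance[lo]) / (wavenumbers[hi] - wavenumbers[lo]))
    return deriv


# Toy spectrum and the same spectrum shifted by a constant baseline offset.
axis = [4200.0, 4202.0, 4204.0, 4206.0]
raw = [1.0, 2.0, 4.0, 7.0]
shifted = [a + 0.5 for a in raw]
print(first_derivative(axis, raw))      # [0.5, 0.75, 1.25, 1.5]
print(first_derivative(axis, shifted))  # identical: the offset has vanished
```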
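Preferably, the training set in step (3) comprises 32 batches of samples, namely 24 batches of Gleditsiae Spina, 3 batches of Gleditsia japonica thorn, 2 batches of Gleditsia microphylla thorn and 3 batches of Rubus cochinchinensis, and the test set comprises 11 batches of samples, namely 8 batches of Gleditsiae Spina, 1 batch of Gleditsia japonica thorn, 1 batch of Gleditsia microphylla thorn and 1 batch of Rubus cochinchinensis.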
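The Kennard-Stone algorithm builds a training set that spans the data space evenly: it starts with the two samples that are farthest apart and then repeatedly adds the sample whose nearest already-chosen neighbour is farthest away. A minimal sketch under the assumption of Euclidean distance on the characteristic variables; the function names and the one-dimensional toy data are hypothetical, and whether the split is applied per class is not specified here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two samples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kennard_stone(samples, n_train):
    """Return the indices of n_train samples chosen by Kennard-Stone."""
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")
    # Start with the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    i0, j0 = max(pairs, key=lambda p: sq_dist(samples[p[0]], samples[p[1]]))
    chosen = [i0, j0]
    while len(chosen) < n_train:
        remaining = [k for k in range(n) if k not in chosen]
        # Add the sample whose nearest chosen neighbour is farthest away.
        nxt = max(remaining,
                  key=lambda k: min(sq_dist(samples[k], samples[c]) for c in chosen))
        chosen.append(nxt)
    return chosen


# Toy one-dimensional data; the unchosen index forms the test set.
data = [[0.0], [1.0], [2.0], [10.0]]
train = kennard_stone(data, 3)
test = [k for k in range(len(data)) if k not in train]
print(train, test)  # [0, 3, 2] [1]
```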
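To verify the accuracy of the method of the present application in discriminating the authenticity of Gleditsiae Spina, hierarchical cluster analysis is performed on the five characteristic wavenumber points extracted by the stepwise method in step (4).
Preferably, the cluster analysis uses Ward's method (the sum of squared deviations method), with the squared Euclidean distance as the distance measure.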
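For reference, Ward's method merges at each step the pair of clusters whose fusion gives the smallest increase in the total within-cluster sum of squares. In standard textbook notation (the symbols are not taken from the application), the increase caused by merging clusters $A$ and $B$ is

$$\Delta(A, B) = \frac{n_A n_B}{n_A + n_B}\,\lVert \bar{x}_A - \bar{x}_B \rVert^2,$$

where $n_A$ and $n_B$ are the numbers of samples in the two clusters and $\bar{x}_A$, $\bar{x}_B$ are their centroids, here over the five characteristic variables. For two single samples this reduces to half of their squared Euclidean distance, which is why the squared Euclidean distance is the natural distance measure for this method.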
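In the present application, the cluster analysis shows that the five extracted characteristic wavenumbers accurately and effectively distinguish genuine Gleditsiae Spina from counterfeits and also distinguish the different kinds of counterfeits.
To further verify the accuracy of the method of the present application in discriminating the authenticity of Gleditsiae Spina, a BP neural network model is used to verify the accuracy of the pattern recognition results for the characteristic wavenumber points obtained in step (3).
Preferably, the BP neural network model is built by taking the characteristic wavenumber points extracted by the successive projections algorithm as the input of the neural network; the input layer has as many nodes as there are characteristic wavenumber points, the hidden layer has 10 nodes and the output layer has 4 nodes.
In the present application, Gleditsiae Spina is coded as [1 0 0 0], Gleditsia japonica thorn as [0 1 0 0], Gleditsia microphylla thorn as [0 0 1 0] and Rubus cochinchinensis as [0 0 0 1]. The learning algorithm of the neural network is the conjugate gradient algorithm, the training rule is the Levenberg-Marquardt algorithm, and the sample set is randomly divided into a training set, a validation set and a test set. To find the best modeling conditions, BP neural network models are built from the training-set data for different spectral bands and different preprocessing methods; to further test the predictive performance of the BP neural network models, the validation-set and test-set samples are used to verify their recognition ability. The results show that with the 5000-4200 cm⁻¹ band and first-derivative preprocessing, the classification accuracy of the model on the training set, the validation set and the test set is 100% in each case, indicating that the BP artificial neural network model effectively recognizes genuine and counterfeit Gleditsiae Spina.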
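The four class codes are a one-hot encoding, with one output node per class. The sketch below shows the encoding and one common way to read a prediction back from the four output activations, namely taking the node with the largest value; that decoding rule, the English class labels and the function names are assumptions for this example.

```python
# Class order follows the codes given in the text.
CLASSES = [
    "Gleditsiae Spina",             # [1 0 0 0]
    "Gleditsia japonica thorn",     # [0 1 0 0]
    "Gleditsia microphylla thorn",  # [0 0 1 0]
    "Rubus cochinchinensis",        # [0 0 0 1]
]


def encode(label):
    """One-hot target vector for a class label."""
    if label not in CLASSES:
        raise ValueError("unknown class: " + label)
    return [1 if c == label else 0 for c in CLASSES]


def decode(outputs):
    """Map four output-node activations to the class with the largest one."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[best]


print(encode("Gleditsia microphylla thorn"))  # [0, 0, 1, 0]
print(decode([0.05, 0.10, 0.02, 0.83]))       # Rubus cochinchinensis
```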
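As a preferred technical solution of the present application, the near-infrared spectroscopy-based method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina specifically comprises the following steps:
(1) collecting near-infrared spectra of samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis over the range 12000-4000 cm⁻¹ at an instrument resolution of 4 cm⁻¹ with 32 scans, subtracting the built-in reference background, collecting spectra at three different positions on the surface of each sample, and taking the averaged spectrum as the original spectrum;
(2) removing the interference peaks of the original spectrum in the 12000-11800 cm⁻¹, 4200-4000 cm⁻¹, 7500-6500 cm⁻¹ and 5500-5000 cm⁻¹ bands to obtain the 11800-7500 cm⁻¹, 6500-5500 cm⁻¹ and 5000-4200 cm⁻¹ bands, selecting the 5000-4200 cm⁻¹ band as the band for model analysis, and preprocessing the 5000-4200 cm⁻¹ band with the first-derivative method;
(3) using the successive projections algorithm to screen characteristic wavenumber points in the first-derivative-preprocessed 5000-4200 cm⁻¹ range and, based on the characteristic wavenumber points, using the Kennard-Stone algorithm to divide the samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis into training-set samples and test-set samples, wherein the training set comprises 32 batches of samples, namely 24 batches of Gleditsiae Spina, 3 batches of Gleditsia japonica thorn, 2 batches of Gleditsia microphylla thorn and 3 batches of Rubus cochinchinensis, and the test set comprises 11 batches of samples, namely 8 batches of Gleditsiae Spina, 1 batch of Gleditsia japonica thorn, 1 batch of Gleditsia microphylla thorn and 1 batch of Rubus cochinchinensis;
(4) building a discriminant model from the training-set samples, extracting five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ by the stepwise method, and introducing these five points to establish the following discriminant functions:
$$F_1 = 36387.907x_8 + 24242.533x_{13} + 9262.246x_{16} + 11456.025x_{19} + 13209.943x_{21} + 3.210,$$
$$F_2 = -43757.506x_8 + 40701.987x_{13} + 24623.897x_{16} + 28906.269x_{19} - 20234.651x_{21} + 4.496;$$
(5) substituting the five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test-set samples into the discriminant functions obtained in step (4) to discriminate Gleditsiae Spina from its counterfeits and determine the discrimination accuracy;
(6) performing hierarchical cluster analysis on the five characteristic wavenumber points extracted by the stepwise method in step (4) to verify the discrimination accuracy of the discriminant functions, and using a BP neural network model to verify the accuracy of the pattern recognition results for the characteristic wavenumber points obtained in step (3). The cluster analysis uses Ward's method (the sum of squared deviations method) with the squared Euclidean distance as the distance measure. The BP neural network model is built by taking the characteristic wavenumber points extracted by the successive projections algorithm as the input of the neural network; the input layer has as many nodes as there are characteristic wavenumber points, the hidden layer has 10 nodes and the output layer has 4 nodes.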
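Compared with the prior art, the present application has the following beneficial effects:
The method of the present application for chemical pattern recognition of the authenticity of Gleditsiae Spina combines near-infrared spectral acquisition with first-derivative preprocessing, the successive projections algorithm, the Kennard-Stone algorithm and the stepwise method, so that the recognition results are accurate and reliable. The present application establishes, for the first time, a near-infrared spectroscopy-based chemical pattern recognition method for the quality of Gleditsiae Spina; it accurately distinguishes Gleditsiae Spina from its counterfeits and provides a scientific basis for its quality evaluation.
The present application is the first to establish a chemical pattern recognition method for distinguishing genuine from counterfeit Gleditsiae Spina using cluster analysis, discriminant analysis and a BP neural network. It overcomes the subjectivity of traditional identification methods and is more scientific and comprehensive.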
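Brief Description of the Drawings
Figure 1 shows the original averaged near-infrared spectra obtained by near-infrared spectral acquisition of samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis.
Figure 2A shows the near-infrared spectra obtained by preprocessing the original averaged spectra with Savitzky-Golay (SG) smoothing and vector normalization (VN).
Figure 2B shows the near-infrared spectra obtained by preprocessing the original averaged spectra with Savitzky-Golay (SG) smoothing and min-max normalization (MMN).
Figure 2C shows the near-infrared spectra obtained by preprocessing the original averaged spectra with the first derivative (1st D).
Figure 2D shows the near-infrared spectra obtained by preprocessing the original averaged spectra with the second derivative (2nd D).
Figure 3 shows the result of the cluster analysis of the present application.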
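Detailed Description
The technical solutions of the present application are further described below through specific embodiments. Those skilled in the art should understand that the examples are provided only to aid understanding of the present application and should not be regarded as specific limitations on it.
Example 1
The instruments and software used in this example were as follows:
A VERTEX 70 Fourier transform near-infrared spectrometer (Bruker, Germany) with an indium gallium arsenide (InGaAs) detector, and an RT-04A high-speed pulverizer (Hong Kong Hongquan (泓荃) Pharmaceutical Machinery Co.). Spectral data were preprocessed with OPUS 6.5 software (Bruker, Germany); the successive projections algorithm and the Kennard-Stone algorithm were run, and the BP neural network was built, in Matlab R2014a (MathWorks, USA); cluster analysis and discriminant analysis were performed with SPSS 21.0 (IBM, USA).
The samples used in this example were as follows:
A total of 43 batches of samples were collected: 32 batches of Gleditsiae Spina (Gleditsia sinensis Lam.), 4 batches of Gleditsia japonica thorn (Gleditsia japonica Miq.), 3 batches of Gleditsia microphylla thorn (Gleditsia microphylla Gordon ex Y.T.Lee) and 4 batches of Rubus cochinchinensis (Rubus cochinchinensis Tratt.). All samples were authenticated and confirmed as genuine Gleditsiae Spina or as its typical counterfeits. The samples were dried, pulverized and passed through a 50-mesh sieve before use. Sample source information is given in Table 1.
Table 1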
Figure PCTCN2019080873-appb-000001
Figure PCTCN2019080873-appb-000002
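The method for chemical pattern recognition of the authenticity of Gleditsiae Spina specifically comprised the following steps:
(1) Spectra of samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis were collected with a fiber-optic probe over the range 12000-4000 cm⁻¹ at an instrument resolution of 4 cm⁻¹ with 32 scans. The built-in reference background was subtracted, spectra were collected at three different positions on the surface of each sample, and the averaged spectrum was taken as the original spectrum. The original averaged near-infrared spectra are shown in Figure 1.
(2) After removal of the interference peaks at 12000-11800 cm⁻¹ and 4200-4000 cm⁻¹ and the water peaks at 7500-6500 cm⁻¹ and 5500-5000 cm⁻¹, the full spectrum was divided into three intervals: 11800-7500 cm⁻¹, 6500-5500 cm⁻¹ and 5000-4200 cm⁻¹.
Preprocessing of the original averaged near-infrared spectra
The preprocessing methods were screened first. The candidates were Savitzky-Golay (SG) smoothing, vector normalization (VN), min-max normalization (MMN), the first derivative (1st D) and the second derivative (2nd D). These methods, and combinations of some of them, were applied to the original sample spectra to examine how the preprocessing method affects modeling accuracy. The preprocessed spectra are shown in Figures 2A to 2D.
The successive projections algorithm was used to screen the characteristic wavenumber points (characteristic variables) within each interval, and the data extracted by the algorithm were used as independent variables for stepwise discriminant analysis. Canonical discriminant functions were established with Wilks' Lambda as the criterion for stepwise entry of variables, and the probability of classifying a sample as genuine Gleditsiae Spina or as each kind of counterfeit was determined from the discriminant scores of the canonical functions. The classification accuracy for each method is given in Table 2. As Table 2 shows, for the 5000-4200 cm⁻¹ band the discriminant models built on the original spectra, on SG+VN-preprocessed data and on first-derivative-preprocessed data all classified Gleditsiae Spina and its counterfeits with 100% accuracy.
Table 2 Classification accuracy of the discriminant analysis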
Figure PCTCN2019080873-appb-000003
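For reference, Wilks' Lambda, the criterion used above for stepwise entry of variables, compares the scatter within groups with the total scatter. In standard textbook notation (the symbols are not taken from the application):

$$\Lambda = \frac{\det(W)}{\det(T)} = \frac{\det(W)}{\det(W + B)},$$

where $W$ is the within-group sum-of-squares-and-cross-products matrix, $B$ the between-group matrix and $T = W + B$ the total matrix, all computed over the variables currently in the model. $\Lambda$ lies between 0 and 1, and smaller values mean better separated groups, so at each step the candidate variable that lowers $\Lambda$ the most is the one considered for entry.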
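To verify the validity of the discriminant model, the discrimination results were examined by internal cross-validation, as shown in Table 3. With the 5000-4200 cm⁻¹ band and the original spectra, 1 Rubus cochinchinensis sample was misclassified as genuine Gleditsiae Spina, and the cross-validation accuracy was 96.9%. With the 5000-4200 cm⁻¹ band and SG+VN preprocessing, 3 Gleditsia japonica thorn samples were misclassified (1 as genuine Gleditsiae Spina, 1 as Gleditsia microphylla thorn and 1 as Rubus cochinchinensis), and the cross-validation accuracy was 90.6%. With the 5000-4200 cm⁻¹ band and first-derivative preprocessing, there were no misclassifications and the cross-validation accuracy was 100%. The discriminant model is therefore highly valid.
Table 3 Cross-validation accuracy of the discriminant analysis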
Figure PCTCN2019080873-appb-000004
Here, GS: Gleditsiae Spina; GJ: Gleditsia japonica thorn; GM: Gleditsia microphylla thorn; RC: Rubus cochinchinensis.
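As a check on the arithmetic, the three cross-validation accuracies quoted for the 5000-4200 cm⁻¹ band are consistent with cross-validation over 32 samples, the size of the training set, with 1, 3 and 0 misclassifications respectively (the denominator is an inference from the percentages, not a statement in the text):

$$\frac{32 - 1}{32} = 96.875\% \approx 96.9\%, \qquad \frac{32 - 3}{32} = 90.625\% \approx 90.6\%, \qquad \frac{32 - 0}{32} = 100\%.$$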
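The above examination of how preprocessing affects modeling accuracy shows that first-derivative preprocessing gives the most accurate discrimination; the 5000-4200 cm⁻¹ band was therefore preprocessed with the first-derivative method.
(3) The successive projections algorithm was used to screen characteristic wavenumber points in the first-derivative-preprocessed 5000-4200 cm⁻¹ range and, based on the characteristic wavenumber points, the Kennard-Stone algorithm was used to divide the samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis into a training set and a test set. The training set comprised 32 batches of samples, namely 24 batches of Gleditsiae Spina, 3 batches of Gleditsia japonica thorn, 2 batches of Gleditsia microphylla thorn and 3 batches of Rubus cochinchinensis, and the test set comprised 11 batches of samples, namely 8 batches of Gleditsiae Spina, 1 batch of Gleditsia japonica thorn, 1 batch of Gleditsia microphylla thorn and 1 batch of Rubus cochinchinensis;
(4) A discriminant model was built from the training-set samples. Five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ were extracted by the stepwise method and introduced to establish the following discriminant functions:
$$F_1 = 36387.907x_8 + 24242.533x_{13} + 9262.246x_{16} + 11456.025x_{19} + 13209.943x_{21} + 3.210,$$
$$F_2 = -43757.506x_8 + 40701.987x_{13} + 24623.897x_{16} + 28906.269x_{19} - 20234.651x_{21} + 4.496;$$
(5) The five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test-set samples were substituted into the discriminant functions obtained in step (4) to discriminate Gleditsiae Spina from its counterfeits and determine the discrimination accuracy. That is, the 11 batches of test-set samples were substituted into the discriminant functions to classify them; the results are given in Table 4. The discrimination accuracy for the 11 batches was 100%, indicating that the canonical discriminant functions accurately classify Gleditsiae Spina and its counterfeits.
Table 4 External validation results of the discriminant analysis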
Figure PCTCN2019080873-appb-000005
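The symbols have the following meanings: GS: Gleditsiae Spina; GJ: Gleditsia japonica thorn; GM: Gleditsia microphylla thorn; RC: Rubus cochinchinensis.
(6) Cluster analysis
To further verify the soundness of the selected characteristic wavenumbers and the reasonableness of the discriminant analysis model, hierarchical cluster analysis was performed on the five characteristic variables extracted by the stepwise method, using Ward's method (the sum of squared deviations method) with the squared Euclidean distance as the distance measure. The resulting dendrogram is shown in Figure 3. As Figure 3 shows, genuine Gleditsiae Spina samples No. 1-32 cluster into class I and counterfeit samples No. 33-43 into class II. Among the counterfeits, No. 33-36 (Gleditsia japonica thorn) form class III, No. 37-39 (Gleditsia microphylla thorn) form class IV and No. 40-43 (Rubus cochinchinensis) form class V. The clustering results agree with the macroscopic identification results in every case. They show that the five extracted characteristic wavenumbers accurately and effectively distinguish genuine Gleditsiae Spina from counterfeits and also distinguish the different kinds of counterfeits.
(7) BP neural network analysis
The characteristic variables extracted by the successive projections algorithm were used as the input of the neural network. A three-layer BP neural network model was built in which the input layer has as many nodes as there are characteristic variables, the hidden layer has 10 nodes and the output layer has 4 nodes. Gleditsiae Spina was coded as [1 0 0 0], Gleditsia japonica thorn as [0 1 0 0], Gleditsia microphylla thorn as [0 0 1 0] and Rubus cochinchinensis as [0 0 0 1]. The learning algorithm of the neural network was the conjugate gradient algorithm, the training rule was the Levenberg-Marquardt algorithm, and the sample set was randomly divided into a training set, a validation set and a test set. To find the best modeling conditions, BP neural network models were built from the training-set data for different spectral bands and different preprocessing methods; to further test the predictive performance of the models, the validation-set and test-set samples were used to verify their recognition ability. The classification results are given in Table 5. With the 5000-4200 cm⁻¹ band and first-derivative preprocessing, the classification accuracy of the model on the training set, the validation set and the test set was 100% in each case, indicating that the BP artificial neural network model effectively recognizes genuine and counterfeit Gleditsiae Spina.
Table 5 Classification and recognition results of the BP neural network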
Figure PCTCN2019080873-appb-000006
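The layer structure of the network described in step (7) can be sketched as a forward pass through a 10-node hidden layer and a 4-node output layer. The sketch is not the Matlab model of this example: the weights are random and untrained, the tanh and softmax activations are assumptions, training is not shown, and the input size of 5 is a placeholder for the number of characteristic variables.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Weighted sum of the inputs plus bias for every node of a layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Turn the four output activations into values that sum to 1."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(x, hidden, output):
    """Input layer -> 10 hidden nodes (tanh) -> 4 output nodes (softmax)."""
    h = [math.tanh(v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))


rng = random.Random(0)
n_features = 5  # placeholder: number of characteristic variables fed to the network
hidden = init_layer(n_features, 10, rng)
output = init_layer(10, 4, rng)
probs = forward([0.1] * n_features, hidden, output)
print(len(probs), round(sum(probs), 6))  # 4 1.0
```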
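The above analysis shows that, by combining near-infrared spectral acquisition with the successive projections algorithm, first-derivative preprocessing, the Kennard-Stone algorithm and the stepwise method, the present application obtains accurate and reliable discrimination results and accurately distinguishes Gleditsiae Spina from its counterfeits.
In the present application, sampling with a fiber-optic probe can introduce noise at the two ends of the spectrum, so the spurious peaks in the 12000-11800 cm⁻¹ and 4200-4000 cm⁻¹ bands are removed. Water has strong, broad absorption peaks at 6897 cm⁻¹ and 5181 cm⁻¹; to prevent water-peak information from overlapping with sample information, the water absorption peaks in the 7500-6500 cm⁻¹ and 5500-5000 cm⁻¹ intervals are removed.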
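The near-infrared spectral data in the present application are voluminous and redundant: the 11800-7500 cm⁻¹ interval contains 2230 variables, the 6500-5500 cm⁻¹ interval 519 variables and the 5000-4200 cm⁻¹ interval 416 variables. The successive projections algorithm compresses the data effectively, eliminating the interference of collinear data with the model and greatly reducing model complexity, which facilitates modeling. In the stepwise discriminant analysis, variables are further introduced step by step by the stepwise method. The stepping rule is the minimum F value method: a variable with a large influence on the classification is entered when its F value is greater than 3.84, and a variable with a small influence on the classification is removed when its F value is less than 2.71. This lowers the misclassification rate and improves the accuracy of the model.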
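The variable counts quoted above imply an average data-point spacing of a little under 2 cm⁻¹ in every interval, that is, a point density finer than the 4 cm⁻¹ nominal resolution. The short calculation below only divides each interval width by its variable count; the dictionary layout is an assumption for this example.

```python
# (interval width in cm^-1, number of variables) as stated in the text.
bands = {
    "11800-7500": (11800 - 7500, 2230),
    "6500-5500": (6500 - 5500, 519),
    "5000-4200": (5000 - 4200, 416),
}

# Average spacing between neighbouring variables in each interval.
spacing = {name: width / count for name, (width, count) in bands.items()}
for name, value in spacing.items():
    print(f"{name} cm^-1: about {value:.2f} cm^-1 per variable")
```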
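Savitzky-Golay smoothing effectively smooths high-frequency noise and improves the signal-to-noise ratio. Vector normalization and min-max normalization correct spectral errors caused by scattering from sample particles. The first and second derivatives remove baseline offset and baseline drift from the spectrum, respectively, and improve resolution and sensitivity. In the examination of preprocessing methods, first-derivative preprocessing was found to give the most accurate discrimination results.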
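The two normalizations named above can be sketched as follows. The definitions used here are common textbook ones and are assumptions: vector normalization as mean-centring followed by scaling to unit Euclidean length, and min-max normalization as rescaling to the range 0 to 1. The exact conventions of the spectrometer software may differ, for example in the target range.

```python
import math


def vector_normalize(spectrum):
    """Mean-centre the spectrum, then scale it to unit Euclidean length."""
    mean = sum(spectrum) / len(spectrum)
    centred = [v - mean for v in spectrum]
    norm = math.sqrt(sum(v * v for v in centred))
    if norm == 0.0:
        raise ValueError("a flat spectrum cannot be vector-normalized")
    return [v / norm for v in centred]


def min_max_normalize(spectrum):
    """Rescale the spectrum linearly so that it runs from 0 to 1."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        raise ValueError("a flat spectrum cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in spectrum]


toy = [2.0, 4.0, 6.0]  # made-up absorbance values
print(min_max_normalize(toy))  # [0.0, 0.5, 1.0]
print(vector_normalize(toy))
```

Both transforms make spectra of different overall intensity comparable, which is why they are used against multiplicative scattering effects.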
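The cluster analysis shows that genuine Gleditsiae Spina falls mainly into two groups. Samples No. 10, 11, 21, 18, 25, 26, 27, 28, 29, 30, 31, 32, 22, 4 and 23, from Wuhan (Hubei), various places in Henan, Beijing, Xinle (Hebei), Tai'an (Shandong) and Bozhou (Anhui), cluster together, indicating that Gleditsiae Spina from these producing areas is relatively similar in quality. Samples No. 12, 13, 14, 15, 16, 17, 19, 20, 5, 6, 7, 8, 1, 2, 3, 9 and 24, from Luoyang and other places in Henan, Zaozhuang and other places in Shandong, Zhashui (Shaanxi), Yuncheng (Shanxi), Xiangyang (Hubei) and various places in Guangxi, cluster together, indicating that Gleditsiae Spina from these producing areas is similar in quality. The difference between the two groups may result from factors such as the growth years of the plant and the temperature, sunlight and rainfall of the producing area, and warrants further study.
The BP neural network analysis shows that BP artificial neural network modeling performs well. As Table 5 shows, the classification accuracy of the models on the training set ranged from 82.6% to 100%, and 11 of the 15 models built under different conditions reached 100% on the training set. The prediction accuracy ranged from 63.6% to 100% on the validation set and from 44.4% to 100% on the test set, and several models classified the validation set and the test set with 100% accuracy. After optimization, the model built on the 5000-4200 cm⁻¹ band with first-derivative preprocessing was the best BP neural network model, with a classification accuracy of 100% on the training set, the validation set and the test set.
The above examples illustrate the method of the present application for chemical pattern recognition of the authenticity of Gleditsiae Spina, but the present application is not limited to these examples; that is, the present application does not have to rely on them in order to be implemented. Those skilled in the art should understand that any improvement to the present application, equivalent replacement of the raw materials selected, addition of auxiliary components, selection of specific modes and the like all fall within the scope of protection and disclosure of the present application.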

Claims (10)

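  1. A near-infrared spectroscopy-based method for chemical pattern recognition of the authenticity of the traditional Chinese medicine Gleditsiae Spina, comprising the following steps:
    (1) collecting near-infrared spectra of samples of Gleditsiae Spina and its counterfeits, subtracting the built-in reference background, collecting spectra at three different positions on the surface of each sample, and taking the averaged spectrum as the original spectrum;
    (2) removing the interference peaks from the original spectrum to obtain the 11800-7500 cm⁻¹, 6500-5500 cm⁻¹ and 5000-4200 cm⁻¹ bands, selecting the 5000-4200 cm⁻¹ band as the band for model analysis, and preprocessing the 5000-4200 cm⁻¹ band with the first-derivative method;
    (3) using the successive projections algorithm to screen characteristic wavenumber points in the first-derivative-preprocessed 5000-4200 cm⁻¹ range and, based on the characteristic wavenumber points, using the Kennard-Stone algorithm to divide the Gleditsiae Spina and counterfeit samples to be discriminated into training-set samples and test-set samples;
    (4) building a discriminant model from the training-set samples, extracting five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ by the stepwise method, and introducing these five points to establish the following discriminant functions:
    $$F_1 = 36387.907x_8 + 24242.533x_{13} + 9262.246x_{16} + 11456.025x_{19} + 13209.943x_{21} + 3.210,$$
    $$F_2 = -43757.506x_8 + 40701.987x_{13} + 24623.897x_{16} + 28906.269x_{19} - 20234.651x_{21} + 4.496;$$
    (5) substituting the five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test-set samples into the discriminant functions obtained in step (4) to discriminate Gleditsiae Spina from its counterfeits and determine the discrimination accuracy.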
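  2. The method according to claim 1, wherein the counterfeits in step (1) are Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis.
  3. The method according to claim 1 or 2, wherein the training set in step (3) comprises 32 batches of samples, namely 24 batches of Gleditsiae Spina, 3 batches of Gleditsia japonica thorn, 2 batches of Gleditsia microphylla thorn and 3 batches of Rubus cochinchinensis, and the test set comprises 11 batches of samples, namely 8 batches of Gleditsiae Spina, 1 batch of Gleditsia japonica thorn, 1 batch of Gleditsia microphylla thorn and 1 batch of Rubus cochinchinensis.
  4. The method according to any one of claims 1-3, wherein in step (1) the near-infrared spectra are collected over the range 12000-4000 cm⁻¹ at an instrument resolution of 4 cm⁻¹ with 32 scans.
  5. The method according to any one of claims 1-4, wherein the interference peaks in step (2) are the peaks in the 12000-11800 cm⁻¹, 4200-4000 cm⁻¹, 7500-6500 cm⁻¹ and 5500-5000 cm⁻¹ bands.
  6. The method according to any one of claims 1-5, wherein hierarchical cluster analysis is performed on the five characteristic wavenumber points extracted by the stepwise method in step (4) to verify the discrimination accuracy of the discriminant functions.
  7. The method according to claim 6, wherein the cluster analysis uses Ward's method (the sum of squared deviations method), with the squared Euclidean distance as the distance measure.
  8. The method according to any one of claims 1-7, wherein a BP neural network model is used to verify the accuracy of the pattern recognition results for the characteristic wavenumber points obtained in step (3).
  9. The method according to claim 8, wherein the BP neural network model is built by taking the characteristic wavenumber points extracted by the successive projections algorithm as the input of the neural network, the input layer having as many nodes as there are characteristic wavenumber points, the hidden layer having 10 nodes and the output layer having 4 nodes.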
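  10. The method according to any one of claims 1-9, wherein the method comprises the following steps:
    (1) collecting near-infrared spectra of samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis over the range 12000-4000 cm⁻¹ at an instrument resolution of 4 cm⁻¹ with 32 scans, subtracting the built-in reference background, collecting spectra at three different positions on the surface of each sample, and taking the averaged spectrum as the original spectrum;
    (2) removing the interference peaks of the original spectrum in the 12000-11800 cm⁻¹, 4200-4000 cm⁻¹, 7500-6500 cm⁻¹ and 5500-5000 cm⁻¹ bands to obtain the 11800-7500 cm⁻¹, 6500-5500 cm⁻¹ and 5000-4200 cm⁻¹ bands, selecting the 5000-4200 cm⁻¹ band as the band for model analysis, and preprocessing the 5000-4200 cm⁻¹ band with the first-derivative method;
    (3) using the successive projections algorithm to screen characteristic wavenumber points in the first-derivative-preprocessed 5000-4200 cm⁻¹ range and, based on the characteristic wavenumber points, using the Kennard-Stone algorithm to divide the samples of Gleditsiae Spina and its counterfeits Gleditsia japonica thorn, Gleditsia microphylla thorn and Rubus cochinchinensis into training-set samples and test-set samples, wherein the training set comprises 32 batches of samples, namely 24 batches of Gleditsiae Spina, 3 batches of Gleditsia japonica thorn, 2 batches of Gleditsia microphylla thorn and 3 batches of Rubus cochinchinensis, and the test set comprises 11 batches of samples, namely 8 batches of Gleditsiae Spina, 1 batch of Gleditsia japonica thorn, 1 batch of Gleditsia microphylla thorn and 1 batch of Rubus cochinchinensis;
    (4) building a discriminant model from the training-set samples, extracting five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ by the stepwise method, and introducing these five points to establish the following discriminant functions:
    $$F_1 = 36387.907x_8 + 24242.533x_{13} + 9262.246x_{16} + 11456.025x_{19} + 13209.943x_{21} + 3.210,$$
    $$F_2 = -43757.506x_8 + 40701.987x_{13} + 24623.897x_{16} + 28906.269x_{19} - 20234.651x_{21} + 4.496;$$
    (5) substituting the five characteristic wavenumber points $x_8$, $x_{13}$, $x_{16}$, $x_{19}$ and $x_{21}$ of the test-set samples into the discriminant functions obtained in step (4) to discriminate Gleditsiae Spina from its counterfeits and determine the discrimination accuracy;
    (6) performing hierarchical cluster analysis on the five characteristic wavenumber points extracted by the stepwise method in step (4) to verify the discrimination accuracy of the discriminant functions, and using a BP neural network model to verify the accuracy of the pattern recognition results for the characteristic wavenumber points obtained in step (3), wherein the cluster analysis uses Ward's method (the sum of squared deviations method) with the squared Euclidean distance as the distance measure, and the BP neural network model is built by taking the characteristic wavenumber points extracted by the successive projections algorithm as the input of the neural network, the input layer having as many nodes as there are characteristic wavenumber points, the hidden layer having 10 nodes and the output layer having 4 nodes.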
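PCT/CN2019/080873 2018-04-03 2019-04-01 Near-infrared spectroscopy-based method for chemical pattern recognition of authenticity of traditional Chinese medicine Gleditsiae spina WO2019192433A1 (zh)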

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/043,325 US11656176B2 (en) 2018-04-03 2019-04-01 Near-infrared spectroscopy-based method for chemical pattern recognition of authenticity of traditional Chinese medicine Gleditsiae spina

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810290323.7 2018-04-03
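CN201810290323.7A CN108509997A (zh) 2018-04-03 2018-04-03 Near-infrared spectroscopy-based method for chemical pattern recognition of authenticity of traditional Chinese medicine Gleditsiae spina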

Publications (1)

Publication Number Publication Date
WO2019192433A1 true WO2019192433A1 (zh) 2019-10-10

Family

ID=63380013

Family Applications (1)

Application Number Title Priority Date Filing Date
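PCT/CN2019/080873 WO2019192433A1 (zh) 2018-04-03 2019-04-01 Near-infrared spectroscopy-based method for chemical pattern recognition of authenticity of traditional Chinese medicine Gleditsiae spina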

Country Status (3)

Country Link
US (1) US11656176B2 (zh)
CN (1) CN108509997A (zh)
WO (1) WO2019192433A1 (zh)







Also Published As

Publication number Publication date
CN108509997A (zh) 2018-09-07
US20210025815A1 (en) 2021-01-28
US11656176B2 (en) 2023-05-23


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19782231

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19782231

Country of ref document: EP

Kind code of ref document: A1