CN116990430A - Method for identifying ginseng medicinal materials in different producing areas - Google Patents

Method for identifying ginseng medicinal materials in different producing areas

Publication number
CN116990430A
Authority
CN
China
Prior art keywords
ginseng
vector machine
ginseng medicinal
peak
support vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210430699.XA
Other languages
Chinese (zh)
Inventor
梁鑫淼
张迟
金红利
薛倩倩
刘喆
陆绍铭
李效农
刘艳芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Institute of Chemical Physics of CAS
Original Assignee
Dalian Institute of Chemical Physics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Institute of Chemical Physics of CAS
Priority to CN202210430699.XA
Publication of CN116990430A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/88 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8624 Detection of slopes or peaks; baseline correction
    • G01N30/8631 Peaks
    • G01N30/8634 Peak quality criteria
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/86 Signal analysis
    • G01N30/8651 Recording, data acquisition, archiving and storage
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N2030/022 Column chromatography characterised by the kind of separation mechanism
    • G01N2030/027 Liquid chromatography
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N30/88 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86
    • G01N2030/8809 Integrated analysis systems specially adapted therefor, not covered by a single one of the groups G01N30/04 - G01N30/86 analysis specially adapted for the sample

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention discloses a method for identifying ginseng medicinal materials from different producing areas. The method uses a high performance liquid chromatography-mass spectrometer (LC-MS) to acquire composition information on the ginseng medicinal materials and combines it with a support vector machine (SVM) to accurately identify ginseng medicinal materials from different producing areas. This is the first ginseng origin identification method based on LC-MS technology and a support vector machine. It can accurately predict the origin of ginseng, is rapid, accurate, and stable, and is of important significance for the quality evaluation of ginseng medicinal materials and for clinical medication safety.

Description

Method for identifying ginseng medicinal materials in different producing areas
Technical Field
The invention relates to the technical field of detection of traditional Chinese medicinal materials, in particular to a method for identifying ginseng medicinal materials in different producing areas.
Background
Ginseng is the dried root and rhizome of Panax ginseng C.A. Mey. (Araliaceae) and has been used in China as a traditional herbal medicine for over 2000 years. Ginseng contains various chemical components such as ginsenosides, polysaccharides, and amino acids. Modern pharmacological research shows that ginsenosides are the main active ingredients of ginseng and have various pharmacological activities, including regulating blood sugar, anti-oxidation, anti-tumor, immune regulation, anti-inflammation, and neuroprotection.
Research reports show that the types and contents of ginsenosides in ginseng are influenced by the place of production, including its geographical environment (illumination, temperature, humidity, soil fertility, and the like); moreover, the cultivation years, harvesting time, storage mode, and processing method of ginseng differ from one place of production to another. The place of production is therefore closely related to ginsenoside content, which suggests that it affects the quality of ginseng and ultimately its efficacy. However, the quality of ginseng from different producing areas is uneven, which makes quality monitoring difficult, and the ginseng market is flooded with counterfeit and easily confused products; in addition, ginseng from different producing areas differs little in appearance and is difficult to distinguish. These problems seriously affect the effective use of ginseng and its promotion in international markets. The current edition of the Chinese Pharmacopoeia evaluates the quality of ginseng by 3 index components such as ginsenoside Rg1, but measuring the contents of a limited number of components can hardly clarify the differences between ginseng from different producing areas, and cannot discriminate quality. The similarity evaluation of high performance liquid chromatography fingerprints commonly used in the traditional Chinese medicine industry also has difficulty distinguishing ginseng medicinal materials from different origins that differ only subtly. The literature reports methods for identifying ginseng from different regions using Raman spectroscopy [H.G.M. Edwards. Analytical and Bioanalytical Chemistry. 2007, 389(7): 2203-2215] and DNA molecular identification techniques [G. Li. Journal of Ginseng Research. 2017, 41(3): 326-329]. However, spectroscopy provides limited accuracy and cannot be correlated with intrinsic components, while DNA molecular identification is complex to operate and places high requirements on the processing environment and mode of the medicinal materials. Therefore, it is important to establish a rapid and efficient identification method for ginseng from different places of origin that can be correlated with its components.
The support vector machine (SVM) is a widely applied machine learning method with strong generalization capability and high accuracy. The SVM is a binary classification model that seeks a hyperplane separating the samples, the separation principle being margin maximization. It shows many unique advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems. Liquid chromatography-mass spectrometry (LC-MS) is widely applied in the field of traditional Chinese medicine to chemical component characterization, quality evaluation and control, screening of potential active components, studies of mechanisms of action, and the like, and has become a powerful tool for traditional Chinese medicine research. LC-MS can rapidly obtain comprehensive and accurate information on medicinal materials and allows the contained components to be identified. Combining LC-MS with the support vector machine provides a new method for the rapid and efficient identification of the place of production of traditional Chinese medicines.
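For reference, the margin-maximization principle mentioned above is usually written as the optimization problem below. This is the standard textbook hard-margin formulation, not a formula taken from this patent; the symbols $w$, $b$, $x_i$, $y_i$ and $n$ are introduced here only for illustration.

```latex
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,\bigl(w^{\top}x_i + b\bigr)\ \ge\ 1,\qquad i = 1,\dots,n
```

Here $x_i$ is a sample vector, $y_i \in \{-1, +1\}$ is its class label, and the margin between the two classes has width $2/\lVert w\rVert$, so minimizing $\lVert w\rVert$ maximizes the margin.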
Disclosure of Invention
In view of the above, the present invention aims to provide a method for identifying ginseng medicinal materials from different producing areas, which uses a high performance liquid chromatography-mass spectrometer (LC-MS) to obtain composition information on the ginseng medicinal materials and combines it with a support vector machine (SVM) to accurately identify ginseng medicinal materials from different producing areas.
To achieve the above purpose, the technical scheme of the invention is as follows:
The invention relates to a method for identifying ginseng medicinal materials from different producing areas, which comprises the following steps:
A. Establishing a ginseng origin database:
1) Sample solution preparation: pulverizing ginseng medicinal materials from at least two different places of production, adding aqueous methanol and/or ethanol solution, performing reflux extraction and/or ultrasonic extraction, and filtering to obtain the sample solution;
2) LC-MS analysis: precisely drawing the sample solution obtained in step 1), injecting it into a high performance liquid chromatography-mass spectrometer, and recording the total ion current (TIC) chromatogram;
3) Spectral data preprocessing: detecting the mass spectrum peaks in the total ion current (TIC) chromatogram to obtain the mass spectrum characteristic peaks of the ginseng medicinal material, and identifying the characteristic peaks to determine their component information (including name, molecular formula, and molecular weight); normalizing the peak areas of the obtained mass spectrum characteristic peaks;
4) Construction and application of the support vector machine (SVM) classifier: using the normalized peak areas obtained in step 3) as the vectors of the support vector machine classifier, constructing a support vector machine model, and obtaining a visual result of the model, thereby obtaining a ginseng origin database containing two or more known origins.
B. Identification of ginseng origin:
taking the ginseng medicinal material whose origin is to be identified and processing it according to steps 1), 2) and 3) of A to obtain the normalized peak area data of its mass spectrum characteristic peaks; substituting these data as a vector into the support vector machine model obtained in A; if the vector falls within one of the two or more regions of the ginseng origin database, the origin is determined to be the corresponding province in the ginseng origin database; otherwise, the origin is determined not to be in the ginseng origin database.
Further, in the step 1), the ginseng medicinal materials used to prepare the sample solutions originate from 2 or more different provinces, with at least 1 batch from each province.
Further, in the step 1), ginseng medicinal material powder (passed through a No. 4 pharmacopoeia sieve) is precisely weighed and placed in a stoppered conical flask; methanol or ethanol solution with a concentration of 30%-80% is precisely added at a material-to-liquid ratio of 1:20 to 1:100; the flask is weighed, refluxed or sonicated for 10-50 min, cooled, weighed again, shaken, and left to stand; and the extract is collected and filtered to obtain the sample solution.
Further, in the step 2), the chromatographic column packing used by the high performance liquid chromatography-mass spectrometer is octadecylsilane chemically bonded silica; the column length is 100-250 mm, the column diameter is 2.1-4.6 mm, and the packing particle size is 1.7-5 μm; mobile phase A is water containing 0.01-0.1% formic acid or 0.01-0.1% acetic acid as an additive, mobile phase B is acetonitrile containing 0.01-0.1% formic acid or 0.01-0.1% acetic acid as an additive, and the flow rate is 0.2-0.5 mL/min; the elution gradient is: 0-10 min, 19% B; 10-16 min, 19%-28% B; 16-30 min, 28%-34% B; the injection volume is 0.5-5 μL; mass spectrometry scan mode: MS or auto MS/MS; acquisition mode: negative ion mode; MS acquisition range: m/z 100-2000; auto MS/MS acquisition range: m/z 100-2000; curtain gas temperature: 300-350 °C; sheath gas temperature: 300-350 °C; drying gas flow rate: 6-10 L/min; ionization voltage: -3500 to -3000 V; capillary outlet voltage: 60-90 V; collision energy: 20-60 V.
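As an illustration of the elution gradient above, the following minimal Python sketch returns the percentage of mobile phase B at a given time. It assumes linear ramps between the stated breakpoints, which the text does not state explicitly; the names `GRADIENT` and `percent_b` are hypothetical.

```python
from typing import List, Tuple

# Gradient breakpoints (time in min, %B) as listed in the text.
GRADIENT: List[Tuple[float, float]] = [
    (0.0, 19.0),   # 0-10 min: 19% B (isocratic)
    (10.0, 19.0),
    (16.0, 28.0),  # 10-16 min: 19% -> 28% B
    (30.0, 34.0),  # 16-30 min: 28% -> 34% B
]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-30 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


# Example: composition halfway through the second segment.
mid_ramp = percent_b(13.0)
```

A table of breakpoints keeps the program easy to compare against the instrument method; a different ramp shape would only change the interpolation line.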
Further, in the step 3), the characteristic peaks in the total ion current (TIC) chromatogram are determined as follows: a peak with an extracted ion chromatogram (EIC) peak area of at least 1000, a fragment peak height of at least 1000, a mass matching error of at most 5 ppm, and a compound identification score of at least 60 is taken as a characteristic peak; the results are then screened for missing values, that is, only peaks whose peak area is non-zero in more than 80% of the samples are retained; and the characteristic peaks are identified on the basis of their fragment information in combination with domestic and international literature reports and public databases.
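The screening rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Peak` record, its field names, and both function names are hypothetical, the error threshold is applied to the absolute ppm error, and the sample values in the usage lines are made up rather than taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Peak:
    eic_area: float         # extracted ion chromatogram peak area
    fragment_height: float  # fragment peak height
    ppm_error: float        # mass matching error, in ppm
    score: float            # compound identification score


def passes_quality(p: Peak) -> bool:
    """Apply the four thresholds stated in the text."""
    return (p.eic_area >= 1000 and p.fragment_height >= 1000
            and abs(p.ppm_error) <= 5 and p.score >= 60)


def keep_by_missing_values(areas_by_peak: Dict[str, List[float]],
                           min_fraction: float = 0.8) -> List[str]:
    """Keep peaks whose area is non-zero in MORE than min_fraction of samples."""
    kept = []
    for name, areas in areas_by_peak.items():
        nonzero = sum(1 for a in areas if a != 0)
        if nonzero / len(areas) > min_fraction:
            kept.append(name)
    return kept


# Made-up example: peak "a" is present in 5/5 samples, peak "b" in 3/5.
kept = keep_by_missing_values({"a": [1, 2, 3, 4, 5], "b": [1, 0, 0, 4, 5]})
```

Note that "more than 80%" is implemented as a strict inequality, so a peak present in exactly 80% of samples is dropped.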
Further, in the step 3), subsequent processing is performed on ginseng medicinal materials for which more than 50 mass spectrum characteristic peaks have been identified, and the peak areas are normalized. The normalization methods include, but are not limited to, the mean normalization method and the Z-score normalization method.
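The two normalization methods named above can be sketched as follows. The text does not give their exact definitions, so this sketch makes two assumptions: mean normalization divides each peak area by that peak's mean across samples, and the Z-score uses the population standard deviation. The function names and the example areas are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def mean_normalize(areas: List[float]) -> List[float]:
    """Divide each area of one peak by its mean across samples (assumed definition)."""
    m = mean(areas)
    return [a / m for a in areas]


def z_score_normalize(areas: List[float]) -> List[float]:
    """Subtract the mean and divide by the population standard deviation."""
    m = mean(areas)
    s = pstdev(areas)
    return [(a - m) / s for a in areas]


# Made-up areas of one characteristic peak in three samples.
example_areas = [1000.0, 2000.0, 3000.0]
by_mean = mean_normalize(example_areas)
by_z = z_score_normalize(example_areas)
```

Both functions operate on one peak at a time, so a full data set would be normalized column by column.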
Further, in the step 4), the present invention uses a kernel support vector machine model, i.e., one containing a kernel function, to implement origin identification. The normalized mass spectrum characteristic peak areas are taken as sample vectors, and the support vector machine extracts support vectors from them; the support vectors delimit the boundaries between samples of different classes so as to achieve classification, thereby identifying ginseng from different producing areas.
Further, in the step 4), the data are mapped by a high-dimensional function transformation.
Further, in the step 4), the kernels used for the high-dimensional transformation of the sample data include, but are not limited to, the sigmoid kernel and the radial basis function kernel.
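For orientation, the two kernels named above are commonly defined as in the sketch below. These are the usual textbook forms, not definitions taken from the patent; the parameter names `gamma` and `coef0` and the function names are assumptions.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x: Sequence[float], y: Sequence[float],
                   gamma: float = 1.0, coef0: float = 0.0) -> float:
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


# Identical vectors have an RBF similarity of exactly 1.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
```

Each kernel returns a similarity between two sample vectors, which lets the classifier separate classes that are not linearly separable in the original peak-area space.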
The method uses a high performance liquid chromatography-mass spectrometer (LC-MS) to obtain composition information on ginseng medicinal materials and combines it with a support vector machine (SVM) to accurately identify ginseng medicinal materials from different producing areas. This is the first ginseng origin identification method based on LC-MS technology and a support vector machine. It can accurately predict the origin of ginseng, is rapid, accurate, and stable, and is of important significance for the quality evaluation of ginseng medicinal materials and for clinical medication safety.
The invention has the following beneficial effects: the method is highly specific and identifies a large number of peaks; the mass spectrum characteristic peaks obtained comprehensively and characteristically represent the composition of the ginseng medicinal material; and, combined with a support vector machine, the method can be used to identify ginseng from different producing areas and to search for differential markers.
Drawings
FIG. 1 is the extracted ion chromatogram of the characteristic peaks of the ginseng medicinal material in the first embodiment of the present invention;
FIG. 2 is the classification boundary result of the support vector machine classifier on the known samples in the third embodiment of the present invention;
FIG. 3 is the detailed identification result of the support vector machine classifier on the known samples in the third embodiment of the present invention;
FIG. 4 is the classification boundary result of the support vector machine classifier on the unknown samples in the fourth embodiment of the present invention.
Detailed Description
In order to more fully understand the features and technical content of the present invention, a method for identifying ginseng medicinal materials in different places will be further described in detail through specific examples.
Embodiment one:
1 Domestic and international literature of recent decades on ginseng components was searched. The literature consulted includes [Wen-zhi Yang. Analytica Chimica Acta. 2012 (739): 56-66], [Wei Wu. Journal of Pharmaceutical and Biomedical Analysis. 2015 (107): 141-145], [Peng Liu. Chinese Journal of Natural Medicines. 2015 (13): 471-480], [Li Yang. Ginseng chemical component and pharmacological research progress [J]. Chinese herbal medicine, 2009, 40(01): 164-166], and [Guo Xiuli. Ginseng chemical component and pharmacological research progress [J]. Chinese medicine clinical research, 2012, 4(14): 26-27]. The search terms, in Chinese and English, included ginseng, ginsenoside, chemical components of ginseng, and the like. The saponin component information was recorded; 405 entries in total were retrieved and sorted. An example of a recorded entry is: name: ginsenoside Re, molecular formula: C48H82O18, exact molecular weight: 946.5501.
2 Public databases such as SciFinder, PubMed, and MassBank were searched for ginsenoside component information, which was recorded; 350 entries in total were retrieved and sorted. An example of a recorded entry is: name: ginsenoside Re, molecular formula: C48H82O18, exact molecular weight: 946.5501.
3 The ginsenoside standard experiments included 7 standards: ginsenoside Rg1, ginsenoside Re, ginsenoside Rb1, ginsenoside Rf, ginsenoside Rc, ginsenoside Rb2, and ginsenoside Rd. The primary and secondary mass spectrum fragment information of the standards was obtained by LC-MS. For example, ginsenoside Re gives the primary ion m/z 945.5426 [M-H]⁻ and the secondary fragment ions m/z 799.4880 [M-H-Rha]⁻, 783.4926 [M-H-Glc]⁻, 637.4346 [M-H-Glc-Rha]⁻, and 475.3818 [M-H-Glc-Glc-Rha]⁻.
4 After steps 1, 2 and 3, duplicate search results were removed, giving 472 ginsenoside entries, each containing the detailed information of 1 saponin component, for example: name: ginsenoside Re, molecular formula: C48H82O18, exact molecular weight: 946.5501; name: ginsenoside Rg1, molecular formula: C42H72O14, exact molecular weight: 800.4922; name: ginsenoside Rf, molecular formula: C42H72O14, exact molecular weight: 800.4922. This information is used to identify the characteristic peaks.
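To show how a database entry from this embodiment can be matched against a measured ion, the sketch below computes the theoretical [M-H]⁻ m/z of ginsenoside Re from its exact molecular weight and the ppm error of the measured primary ion reported in step 3. The proton mass constant and the function names are assumptions introduced here; the patent does not describe its matching calculation.

```python
PROTON_MASS = 1.007276  # Da; assumed constant, not stated in the text


def deprotonated_mz(monoisotopic_mass: float) -> float:
    """Theoretical m/z of the singly charged [M-H]- ion."""
    return monoisotopic_mass - PROTON_MASS


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Ginsenoside Re: exact molecular weight 946.5501, measured [M-H]- 945.5426.
theoretical = deprotonated_mz(946.5501)
error = ppm_error(945.5426, theoretical)
within_tolerance = abs(error) <= 5  # the 5 ppm matching criterion
```

Because ginsenoside Rg1 and ginsenoside Rf share the formula C42H72O14, a mass match alone cannot tell them apart, which is why fragment information and retention time are also recorded.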
Embodiment two:
1 instrument and reagent
1.1 instruments
An Agilent 1290 ultra-high performance liquid chromatograph coupled to a quadrupole time-of-flight mass spectrometer (6545 Q-TOF-MS, Agilent, USA), equipped with a Dual AJS ESI ion source.
The column was an Acquity UPLC BEH C18 (2.1 × 100 mm, 1.7 μm, Waters, USA).
1.2 reagents
Chromatographic grade acetonitrile was purchased from Fisher, mass spectrometry grade formic acid was purchased from Sigma, analytical grade ethanol was purchased from Energy Chemical, and ultrapure water was prepared in the laboratory (Milli-Q IQ 7000).
1.3 samples
31 batches of ginseng samples were collected in the three northeastern provinces of China: 19 batches from Jilin province, 8 batches from Heilongjiang province, and 4 batches from Liaoning province. All samples were identified as the dried roots of the Araliaceae plant Panax ginseng C.A. Mey. by Yang Xiaoping of the Dalian Institute of Chemical Physics, Chinese Academy of Sciences. Information on the 31 batches of ginseng samples is shown in Table 1.
Table 1 Information on the 31 batches of ginseng samples
2 method
2.1 preparation of sample solutions
Take 1 g of ginseng medicinal material powder (preferably passed through a No. 4 pharmacopoeia sieve), weigh it precisely, place it in a stoppered conical flask, precisely add 40% (volume concentration) aqueous ethanol at a material-to-liquid mass ratio of 1:50, and weigh; extract ultrasonically at 400 kW for 45 min, cool to room temperature, make up the lost weight with 40% ethanol solution, and filter the extract through a 0.22 μm filter membrane to obtain the ginseng sample solution.
2.2 chromatographic conditions
Mobile phase A was water containing 0.01% formic acid (volume concentration) as an additive, mobile phase B was acetonitrile containing 0.01% formic acid (volume concentration) as an additive, and the flow rate was 0.4 mL/min. The elution gradient was: 0-10 min, 19% B; 10-16 min, 19%-28% B; 16-30 min, 28%-34% B. Mass spectrometry scan mode: MS and auto MS/MS; acquisition mode: negative ion mode; MS acquisition range: m/z 400-1700; MS/MS acquisition range: m/z 100-1700; curtain gas temperature: 320 °C; sheath gas temperature: 320 °C; drying gas flow rate: 8 L/min; ionization voltage: -3500 V; capillary outlet voltage: 75 V; collision energy: 40, 60 V.
3 analysis of operations
3.1 obtaining and identifying characteristic peaks of ginseng medicinal material by mass spectrum
Test solutions were prepared according to the method under item 2.1, and all test solutions were analyzed under the chromatographic conditions of item 2.2 to obtain the total ion current (TIC) chromatogram of the ginseng medicinal material, as shown in FIG. 1. Peaks with an EIC peak area of at least 1000, a fragment peak height of at least 1000, a mass matching error of at most 5 ppm, and an identification score of at least 60 were taken as mass spectrum characteristic peaks. The results were then screened for missing values, that is, only peaks whose peak area was non-zero in more than 80% of the samples were retained, and the molecular formula, retention time, peak height, peak area, and other information of each characteristic peak were obtained. In this example there were 69 mass spectrum characteristic peaks, which were identified with reference to the saponin information obtained in the first embodiment; the results after identification are shown in the following table:
Table 1 Information on the 69 mass spectrum characteristic peaks
3.2 normalization treatment of characteristic peaks of mass spectrum of ginseng medicinal material
The 69 mass spectrum characteristic peaks of the ginseng medicinal material obtained from the LC-MS spectra were processed by the mean normalization method. The normalization is expressed as follows:
$$x_i' = \frac{x_i}{\bar{x}_i}$$
wherein $x_i$ is the peak area data of mass spectrum characteristic peak $i$ ($i$ is a natural number from 1 to 69) of a sample, $\bar{x}_i$ is the mean peak area of that characteristic peak across the different samples, and $x_i'$ is the normalized peak area data, i.e., the vector for the support vector machine classifier. Taking peak No. 1 as an example, the normalization results are shown in the following table.
Table 2 example of normalization of characteristic peaks of mass spectrum
Embodiment III:
1 Construction of the SVM classifier
The support vector machine (SVM) is a classification model: a linear classifier that, through iterative calculation, finds the separating surface in feature space that maximizes the margin between sample classes. It selects feature classification vectors (support vectors) from the data and uses them to delimit the classification margin, thereby achieving classification. In this example, 31 batches of ginseng with known origin information were used to construct the SVM classifier: 19 batches from Jilin province, 8 batches from Heilongjiang province, and 4 batches from Liaoning province. Python 3.7 was used as the programming language, and the sigmoid function was selected as the kernel function. The normalized data of the 69 mass spectrum characteristic peaks of the 31 batches obtained in the second embodiment were taken as the matrix x_i and input into the SVM classifier, with the regularization parameter set to 10 and the gamma value set to 1, and leave-one-out cross-validation was performed. The output bitmap of the SVM classifier was taken as the classification boundary to obtain the visual result of the model, in which three shaded regions represent the sample ranges of the three provinces.
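The configuration described above can be sketched with scikit-learn as follows. The patent names only Python 3.7, so the use of scikit-learn and NumPy is an assumption, and the feature matrix is random synthetic data standing in for the 31 × 69 normalized peak areas. The resulting accuracy is therefore meaningless and is not the patent's result.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Class sizes follow the text: 19 Jilin, 8 Heilongjiang, 4 Liaoning batches.
labels = np.array(["Jilin"] * 19 + ["Heilongjiang"] * 8 + ["Liaoning"] * 4)

# Synthetic placeholder for the 31 x 69 normalized peak-area matrix (NOT patent data).
X = rng.normal(loc=1.0, scale=0.2, size=(31, 69))

# Sigmoid kernel, regularization parameter C = 10, gamma = 1, as stated in the text.
classifier = SVC(kernel="sigmoid", C=10, gamma=1)

# Leave-one-out: each batch is predicted by a model trained on the other 30.
predicted = cross_val_predict(classifier, X, labels, cv=LeaveOneOut())
accuracy = float(np.mean(predicted == labels))
```

Leave-one-out suits a set this small because every batch serves once as the held-out sample while the remaining 30 are used for training.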
2 Identification results of the SVM classifier
An SVM classification model was generated according to the above parameters, as shown in FIG. 2. The boundaries of the color blocks in the output image are the classification boundaries: region A is the classification region of Liaoning province, region B is that of Heilongjiang province, and region C is that of Jilin province. The detailed identification results are shown in Table 3 and FIG. 3. In FIG. 3, the abscissa shows the 31 samples of different batches and the ordinate shows the three places of origin. A five-pointed star (★) indicates that the SVM classifier's identification of the sample agrees with its actual origin, i.e., the identification is correct; a square (■) indicates that the SVM classifier's identification does not agree with the actual origin, i.e., the sample is misclassified. The final classification accuracy of the model was 100%.
Table 3 identification results of Ginseng radix samples from Jilin province, heilongjiang province and Liaoning province
Embodiment four:
1 instrument and reagent
1.1 instruments: the same as "1.1" in Embodiment two.
1.2 reagents: the same as "1.2" in Embodiment two.
1.3 samples: 8 batches of ginseng samples were collected in the three northeastern provinces of China: 5 batches from Jilin province, 2 batches from Heilongjiang province, and 1 batch from Liaoning province. All samples were identified as the dried roots of the Araliaceae plant Panax ginseng C.A. Mey. by Yang Xiaoping of the Dalian Institute of Chemical Physics, Chinese Academy of Sciences. Information on the 8 batches of ginseng samples is shown in Table 4. The sample origin information was not used during modeling; it was used only afterwards to verify the model predictions by comparison.
Table 4 Information on the 8 batches of ginseng samples
2 method
2.1 preparation of sample solutions: the same as "2.1" in Embodiment two.
2.2 chromatographic conditions: the same as "2.2" in Embodiment two.
3 analysis of operations
3.1 obtaining and identifying the mass spectrum characteristic peaks of the ginseng medicinal material: the same as "3.1" in Embodiment two.
3.2 normalization of the mass spectrum characteristic peaks of the ginseng medicinal material: the same as "3.2" in Embodiment two.
4 Construction of the SVM classifier
The normalized mass spectrum characteristic peaks were substituted as vectors into the SVM classifier constructed under "1" in Embodiment three. The 8 batches of ginseng samples in Table 4 were treated as unknown samples, their origins were predicted, and the prediction results were visualized.
5 Identification results of the SVM classifier
In this example, the SVM classifier was used to identify the origins of the 8 batches of unknown ginseng samples; the shaded region into which a sample falls is the origin assigned to it by the SVM classifier. As shown in FIG. 4, 1 batch falls in region A, i.e., its origin is identified as Liaoning province; 2 batches fall in region B, i.e., their origin is identified as Heilongjiang province; and 5 batches fall in region C, i.e., their origin is identified as Jilin province. All samples fall in the correct region, and the actual origins are consistent with the origins predicted by the SVM classifier.
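The prediction step of this embodiment can be sketched as follows, again assuming scikit-learn and using random synthetic matrices in place of the real normalized peak areas (31 known batches, 8 unknown batches, 69 peaks each). The variable names are hypothetical and the predicted labels carry no meaning.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Known batches: 19 Jilin, 8 Heilongjiang, 4 Liaoning (sizes from Embodiment three).
train_labels = np.array(["Jilin"] * 19 + ["Heilongjiang"] * 8 + ["Liaoning"] * 4)

# Synthetic placeholders for the normalized peak-area matrices (NOT patent data).
X_train = rng.normal(loc=1.0, scale=0.2, size=(31, 69))
X_unknown = rng.normal(loc=1.0, scale=0.2, size=(8, 69))

# Same settings as the classifier of Embodiment three.
model = SVC(kernel="sigmoid", C=10, gamma=1).fit(X_train, train_labels)

# One predicted province per unknown batch.
predicted_origin = model.predict(X_unknown)
```

A plain `SVC` always returns one of the trained classes, so the "origin not in the database" outcome of step B would need an additional decision rule that the text does not specify.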
The technical scheme provided by the invention is compared with the prior art as follows:
With this detection method, ginseng medicinal materials from different producing areas can be identified through specific, preferred sample pretreatment and LC-MS analysis methods, preprocessing of the raw spectral data, and classification by a constructed support vector machine model. By using LC-MS, the invention obtains rich chemical component information on the ginseng medicinal material, and the series of raw data preprocessing steps gives the support vector machine model a higher accuracy. In addition, the invention can identify the characteristic peaks, which supports the quality control and safe clinical use of ginseng medicinal materials.
The present invention is not limited to the above-mentioned embodiments, but can be modified or changed in accordance with the scope of the appended claims without departing from the principles of the present invention.

Claims (9)

1. A method for identifying ginseng medicinal materials from different producing areas, characterized by comprising the following steps:
A. establishing a ginseng origin database:
1) Sample solution preparation: pulverizing ginseng medicinal materials from at least two different places of production, adding aqueous methanol and/or ethanol solution, performing reflux extraction and/or ultrasonic extraction, and filtering to obtain the sample solution;
2) LC-MS analysis: precisely drawing the sample solution obtained in step 1), injecting it into a high performance liquid chromatography-mass spectrometer, and recording the total ion current (TIC) chromatogram;
3) Spectral data preprocessing: detecting the mass spectrum peaks in the total ion current (TIC) chromatogram to obtain the mass spectrum characteristic peaks of the ginseng medicinal material, identifying the characteristic peaks, and determining the characteristic peak component information (including name, molecular formula, and molecular weight); normalizing the peak areas of the obtained mass spectrum characteristic peaks;
4) Construction and application of the support vector machine (SVM) classifier: using the normalized peak areas obtained in step 3) as the vectors of the support vector machine classifier, constructing a support vector machine model, and obtaining a visual result of the model, thereby obtaining a ginseng origin database containing two or more known origins;
B. identification of ginseng origin:
taking the ginseng medicinal material whose origin is to be identified and processing it according to steps 1), 2) and 3) of A to obtain the normalized peak area data of its mass spectrum characteristic peaks; substituting these data as a vector into the support vector machine model obtained in A; if the vector falls within one of the two or more regions of the ginseng origin database, the origin is determined to be the corresponding province in the ginseng origin database; otherwise, the origin is determined not to be in the ginseng origin database;
the identification of different producing areas is realized through a support vector machine classifier.
2. The method according to claim 1, characterized in that: in step A 1), the ginseng medicinal materials used to prepare the sample solutions come from any 2 or more known producing areas in different provinces, autonomous regions or municipalities directly under the central government, with at least 1 batch of ginseng medicinal material from each province, autonomous region or municipality.
3. The method according to claim 1, wherein in step 1), ginseng powder is taken, aqueous methanol and/or ethanol solution with a volume concentration of 30%-80% is added at a material-to-liquid mass ratio of 1:20 to 1:100, the mixture is weighed, refluxed and/or ultrasonicated for 10-50 min, cooled to room temperature and weighed again, the lost weight is made up with the aqueous methanol and/or ethanol solution, and the mixture is shaken well and allowed to stand; the extract is collected and filtered to obtain the sample solution.
4. The method according to claim 1, wherein in step A 2), the chromatographic column of the high performance liquid chromatography-mass spectrometer is packed with octadecylsilane chemically bonded silica, with a column length of 100-250 mm, a diameter of 2.1-4.6 mm and a particle size of 1.7-5 μm; mobile phase A is water containing 0.01-0.1% formic acid or 0.01-0.1% acetic acid (volume concentration) as additive, and mobile phase B is acetonitrile containing 0.01-0.1% formic acid or 0.01-0.1% acetic acid (volume concentration) as additive; the flow rate is 0.2-0.5 mL/min; the elution gradient is: 0-10 min, 19% B; 10-16 min, 19%-28% B; 16-30 min, 28%-34% B; the injection volume is 0.5-5 μL; mass spectrometry scan mode: MS or auto MS/MS; acquisition mode: negative ion mode; MS acquisition range: m/z 100-2000; auto MS/MS acquisition range: m/z 100-2000; curtain gas temperature: 300-350 °C; sheath gas temperature: 300-350 °C; drying gas flow rate: 6-10 L/min; ionization voltage: -3500 to -3000 V; capillary outlet voltage: 60-90 V; collision energy: 20-60 V.
5. The method according to claim 1, wherein the characteristic peaks in step A are identified as follows: in the mass spectrum of the ginseng medicinal material, a peak with an EIC peak area of 1000 or more, a fragment peak height of 1000 or more, a matching error of 5 ppm or less and a compound qualitative score of 60 or more is identified as a characteristic peak; the identification results are then screened by missing values, i.e. only peaks whose peak area is non-zero in more than 80% of the samples are retained; and the characteristic peaks are identified on the basis of one or more of the following: fragment information of the characteristic peaks combined with published literature reports, public databases at home and abroad, and chromatography-mass spectrometry detection data of reference standards of ginseng medicinal material components.
6. The method according to claim 1, wherein the number of characteristic peaks identified in step A is greater than 50, and the characteristic peaks are normalized before subsequent processing, the normalization including, but not limited to: mean normalization and/or Z-score normalization.
7. The method according to claim 1, wherein the support vector machine model of step A 4) is a kernel support vector machine model comprising a kernel function; the normalized peak areas of the mass spectrum characteristic peaks are used as sample vectors.
8. The method of claim 7, wherein the kernel support vector machine model comprising a kernel function applies a high-dimensional transformation mapping to the data.
9. The method of claim 7, wherein the high-dimensional transformation includes, but is not limited to: sigmoid kernels and/or radial basis function kernels.
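For illustration only, the two kernel functions named in claim 9 and the decision function of a binary kernel support vector machine can be sketched in Python as below. This is a sketch under stated assumptions, not the claimed model: the function names, the values of `gamma` and `coef0`, and the support vectors, coefficients and bias are hypothetical placeholders rather than parameters obtained from ginseng data. The formulas themselves are the standard ones: the radial basis function kernel exp(-gamma * ||x - y||^2), the sigmoid kernel tanh(gamma * <x, y> + coef0), and the decision value f(x) = sum_i (alpha_i * y_i) * K(sv_i, x) + b.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * dot(x, y) + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def decision_value(x, support_vectors, dual_coefs, bias, kernel):
    """Binary kernel SVM decision value; its sign gives the predicted class.

    `dual_coefs` holds alpha_i * y_i for each support vector.
    """
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias


# Hypothetical trained parameters for a two-class toy problem.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
dual_coefs = [1.0, -1.0]
bias = 0.0

near_first = decision_value([0.1, 0.0], support_vectors, dual_coefs, bias, rbf_kernel)
near_second = decision_value([2.0, 2.0], support_vectors, dual_coefs, bias, rbf_kernel)
print(near_first > 0, near_second < 0)
```

A classifier for three or more origins is commonly assembled from several such binary decision functions (one-vs-one or one-vs-rest); which scheme the method uses is not stated in the claims.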

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210430699.XA CN116990430A (en) 2022-04-22 2022-04-22 Method for identifying ginseng medicinal materials in different producing areas

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210430699.XA CN116990430A (en) 2022-04-22 2022-04-22 Method for identifying ginseng medicinal materials in different producing areas

Publications (1)

Publication Number Publication Date
CN116990430A (en) 2023-11-03

Family

ID=88521878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210430699.XA Pending CN116990430A (en) 2022-04-22 2022-04-22 Method for identifying ginseng medicinal materials in different producing areas

Country Status (1)

Country Link
CN (1) CN116990430A (en)

Similar Documents

Publication Publication Date Title
CN105574474B (en) A kind of biometric image recognition methods based on Information in Mass Spectra
Peng et al. The difference of origin and extraction method significantly affects the intrinsic quality of licorice: A new method for quality evaluation of homologous materials of medicine and food
CN105572212A (en) Visual mass spectrometry information-based sun-dried ginseng and red ginseng rapid identification method
Kim et al. Chemical fingerprinting of Codonopsis pilosula and simultaneous analysis of its major components by HPLC–UV
CN106841428A (en) A kind of discrimination method of organic liquid milk
CN115060822B (en) Fingerprint spectrum quantitative analysis method based on traditional Chinese medicine 'imprinting template' component clusters
CN110715994A (en) Method for analyzing difference chemical components of spina date seed and spina date seed by using UHPLC-Q-Orbitrap MS
CN112710765A (en) Fingerprint detection method of gardenia medicinal material and application thereof
CN114113381A (en) Characteristic polypeptide of saloon, application thereof and method for identifying comfortable saloon
CN107449849B (en) Traditional Chinese medicine identification method
CN112114079B (en) Method for simultaneously detecting 9 chemical components in quisqualis indica
CN113759003B (en) Licorice origin distinguishing method based on UPLC fingerprint spectrum and chemometrics method
Xing et al. Characterization of volatile organic compounds in Polygonum multiflorum and two of its processed products based on multivariate statistical analysis for processing technology monitoring
CN110887921B (en) Method for efficiently and rapidly analyzing characteristic volatile components of eucommia leaves and fermentation product thereof
CN116990430A (en) Method for identifying ginseng medicinal materials in different producing areas
Yue et al. Multiresidue screening of pesticides in Panax Ginseng CA Meyer by ultra‐high‐performance liquid chromatography with quadrupole time‐of‐flight mass spectrometry
CN113899826A (en) Method and system for classifying astragalus seeds
CN111413423B (en) Method for constructing UPLC (ultra performance liquid chromatography) characteristic spectrum of cortex lycii radicis and method for detecting cortex lycii radicis
CN114814057A (en) Method for distinguishing true and false of selaginella tamariscina varieties through non-targeted metabonomics and application
CN113917009A (en) Construction method and application of bupleurum chinense non-saponin component HPLC fingerprint
CN109884222B (en) HPLC fingerprint spectrum establishment method of caulis Sinomenii
CN114295751A (en) Method for evaluating quality of radix linderae based on multi-wavelength fingerprint spectrum
CN113341031A (en) Method for identifying quality goods, counterfeit goods and substitutes of rhodiola root medicinal materials
CN113267582B (en) Construction method of scandent stigmata fingerprint
CN113419010B (en) Method for constructing characteristic spectrum of fritillary medicinal materials and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination