CN117334257A - Lucid ganoderma multi-level screening mass spectrum database and establishment method and application thereof - Google Patents
- Publication number
- CN117334257A (application CN202311192034.0A)
- Authority
- CN
- China
- Prior art keywords
- database
- mass spectrum
- ganoderma lucidum
- mass
- less
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/90—Programming languages; Computing architectures; Database systems; Data warehousing
Abstract
The invention belongs to the technical field of natural product detection, and particularly relates to a ganoderma lucidum multi-level screening mass spectrum database and an establishment method and application thereof. Establishing the database comprises the following steps: (1) constructing a primary mass spectrum database based on ganoderma lucidum compound standards; (2) based on mass spectrum data of measured ganoderma lucidum samples and the published component information of ganoderma lucidum, constructing a secondary mass spectrum database by using an information processing platform to simulate fragmentation, predict mobility values and compare data; (3) based on open-source databases, constructing a tertiary mass spectrum database by applying multiple mass spectrum data processing strategies for complex traditional Chinese medicine components together with phytochemical taxonomy, combined with an information processing platform that simulates fragmentation and predicts mobility values. The invention enables qualitative screening of ganoderma lucidum related samples without reference standards, and is high-throughput, accurate, simple and rapid.
Description
Technical Field
The invention belongs to the technical field of natural product detection, and particularly relates to a ganoderma lucidum multi-level screening mass spectrum database and an establishment method and application thereof.
Background
Ganoderma lucidum is the dried fruiting body of the Polyporaceae fungus Ganoderma lucidum or Ganoderma sinense, and is one of the best-known medicinal fungi in China. Its efficacy and clinical applications are recorded in ancient medical works such as Shennong Bencao Jing (Shennong's Classic of Materia Medica), Xinxiu Bencao (Newly Revised Materia Medica) and Bencao Gangmu (Compendium of Materia Medica), and modern pharmacological and clinical research also shows that ganoderma lucidum has anti-tumor, immunoregulatory, liver-protecting and anti-aging effects and helps prevent and treat cardiovascular diseases.
Basic research shows that ganoderma lucidum has a complex chemical composition and contains various active components such as ganoderma polysaccharides, triterpenes, proteins, amino acids, sterols and fatty acids. In particular, the ganoderma triterpenes, one of the main classes of active ingredients of ganoderma, have anti-tumor, immunoregulatory, antiviral, antiepileptic, anti-inflammatory and neuroprotective activities. More than 320 ganoderma lucidum triterpenes have been reported to date; many of them are isomers, which makes identification difficult.
At present, ganoderma lucidum components are mostly analyzed by colorimetry, liquid chromatography and liquid chromatography-mass spectrometry. Each method has advantages and disadvantages. Colorimetry is inexpensive, but it can only measure total content and is subject to considerable interference. Liquid chromatography is the dominant detection technique and is usually combined with an ultraviolet-visible detector or similar for component detection, but it is limited by the availability of standards, by sensitivity and by matrix interference, so it is difficult to identify every chromatographic peak during analysis.
Chinese patent application CN201210222763.1 discloses a quality analysis method for medicines or health products, in particular a quality control method for ganoderma lucidum water extract. To control product quality in a targeted way, the sum of the crude polysaccharide content and the content of ganoderic acids A, B and C2 is used as the quality control index for the water extract. Specifically, crude ganoderma lucidum polysaccharide is extracted by water extraction and alcohol precipitation, the polysaccharides with a glucan structure are precipitated with a copper reagent, and their content is measured by ultraviolet spectrophotometry, which eliminates the influence of excipients on the measurement and provides a basis for quality control of the water extract. However, the method can only measure the total content of the above components and is subject to considerable interference.
Chinese patent application CN201910560738 relates to a characteristic spectrum and its construction method and application, in particular an HS-SPME/GC-MS characteristic spectrum of the volatile components of ganoderma lucidum. By establishing this characteristic spectrum, the application performs qualitative analysis of the common peaks, distinguishes and identifies ganoderma lucidum under test from other edible fungi and related products, and distinguishes different varieties of ganoderma lucidum by GC-MS characteristic spectrum analysis. However, because ganoderma lucidum components are complex and many isomers exist, the detection is not accurate enough.
With the development of mass spectrometry, particularly high-resolution mass spectrometry, the technique has been widely applied in the analysis field by virtue of its high throughput, high resolution, accurate molecular weight measurement and spectral library searching, which together allow natural products to be identified rapidly.
The analysis of ganoderma lucidum chemical components by mass spectrometry or liquid chromatography-mass spectrometry has been reported in the literature. However, because the components are complex and many isomers exist, no systematic mass spectrum database of ganoderma lucidum chemical components has yet been established, few chemical components can be identified directly, and the accuracy is low. On this basis, the present invention provides a method for identifying secondary metabolites in ganoderma lucidum by establishing a multi-level high-throughput screening mass spectrum database consisting mainly of a primary database (standard library), a secondary database (local library) and a tertiary database (online database).
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a ganoderma lucidum multi-level screening mass spectrum database and an establishment method and application thereof. By constructing a primary database (standard library), a secondary database (local library) and a tertiary database (online database) as its main body, a multi-level mass spectrum database capable of accurately identifying complex secondary metabolites in ganoderma lucidum is formed, and the database is used for high-throughput screening and identification of secondary metabolites in ganoderma lucidum.
To achieve the above purpose, the invention adopts the following technical scheme:
A method for establishing a ganoderma lucidum multi-level screening mass spectrum database comprises the following steps:
(1) Based on ganoderma lucidum compound standards, acquiring data using different acquisition modes of a high-resolution mass spectrometer, analyzing the data of the standards and constructing a primary mass spectrum database;
(2) Based on mass spectrum data of measured ganoderma lucidum samples and the published component information of ganoderma lucidum, constructing a secondary mass spectrum database by using an information processing platform to simulate fragmentation, predict mobility values and compare data;
(3) Based on open-source databases, constructing a tertiary mass spectrum database by applying multiple mass spectrum data processing strategies for complex traditional Chinese medicine components together with phytochemical taxonomy, combined with an information processing platform that simulates fragmentation and predicts mobility values.
Specifically, the primary mass spectrum database based on standards is established as follows: mass spectrum information of the standards is acquired using different types of mass spectrometers under different acquisition modes and different mass spectrum parameter conditions, and a primary mass spectrum database is established that contains information such as accurate molecular mass, structural formula, retention time, accurate molecular weight and ion intensity of the primary fragments, accurate molecular weight and ion intensity of the secondary fragments, ion mobility drift time and collision cross section.
Preferably, the standards in step (1) cover representative ganoderma lucidum triterpenes whose parent nuclei are skeletons of 24, 27 and 30 carbons; the different modes include MS^E, DDA, DIA, Full Scan, MIM-EPI, SONAR, SWATH and HDMS; the primary mass spectrum database comprises accurate molecular mass, structural formula, retention time, accurate molecular weight and ion intensity of the primary fragments, accurate molecular weight and ion intensity of the secondary fragments, ion mobility drift time and collision cross section.
Preferably, in step (1), the standards cover representative ganoderma lucidum triterpenes whose parent nuclei are skeletons of 24, 27 and 30 carbons; mass spectrum information such as accurate retention time, accurate molecular weight and ion intensity of the primary and secondary fragments, ion mobility drift time and collision cross section is collected, and the mass spectrum fragmentation rules of the different parent nucleus skeletons are summarized at the same time; the different modes include MS^E, DDA, DIA, Full Scan, MIM-EPI, SONAR, SWATH and HDMS.
Preferably, the primary database in step (1) includes information such as chemical formula, structural formula, retention time, accurate molecular weight and ion intensity of the primary fragments, accurate molecular weight and ion intensity of the secondary fragments, ion mobility drift time and collision cross section.
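The fields listed for the primary database can be pictured as one record per standard. The following is only a minimal sketch of such a record; the class names, the use of a SMILES string for the structural formula, the units and all numeric values are illustrative assumptions and are not taken from the invention.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fragment:
    mz: float         # accurate mass-to-charge ratio of the fragment ion
    intensity: float  # ion intensity (arbitrary units)


@dataclass
class PrimaryEntry:
    """One standard in a hypothetical primary (standard library) database."""
    name: str
    formula: str                       # chemical formula
    structure: str                     # structural formula; SMILES is an assumption
    exact_mass: float                  # accurate molecular mass, Da
    retention_time_min: float          # retention time, minutes (assumed unit)
    primary_fragments: List[Fragment] = field(default_factory=list)
    secondary_fragments: List[Fragment] = field(default_factory=list)
    drift_time_ms: Optional[float] = None  # ion mobility drift time
    ccs_a2: Optional[float] = None         # collision cross section, square angstroms
    acquisition_mode: str = "MS^E"         # one of the modes named in the text


def base_peak(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the most intense fragment, or None for an empty list."""
    return max(fragments, key=lambda f: f.intensity) if fragments else None


# Placeholder entry: every value below is invented for illustration only.
example = PrimaryEntry(
    name="hypothetical triterpene standard",
    formula="C30H44O7",
    structure="(structure omitted)",
    exact_mass=516.3087,
    retention_time_min=10.0,
    secondary_fragments=[Fragment(300.0, 40.0), Fragment(450.0, 100.0)],
    drift_time_ms=5.0,
    ccs_a2=225.0,
)
print(base_peak(example.secondary_fragments))
```

Keeping drift time and collision cross section optional reflects that not every acquisition mode in the list provides ion mobility data.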
The primary database in the invention has the following characteristics: (1) The scanning modes are comprehensive, covering essentially all types of mass spectrometers and their scanning modes, including MS^E, DDA, DIA, Full Scan, MIM-EPI, SONAR, SWATH and HDMS; (2) The number and types of standards are comprehensive, covering representative ganoderma lucidum triterpenes with 24-carbon, 27-carbon and 30-carbon parent nucleus skeletons. Compounds with the same parent nucleus skeleton follow similar fragmentation rules, so collecting the mass spectrum information of representative compounds helps summarize the fragmentation rules of the different types of parent nucleus skeletons. These rules are important for identifying unknown compounds for which no standard is available, because they help identify the basic parent nucleus skeleton of the unknown compound; (3) The compound information is accurate, including accurate retention time, accurate molecular weight and ion intensity of the primary and secondary fragments, ion mobility drift time and collision cross section. The ion mobility drift time and collision cross section are closely related to the mass-to-charge ratio, the charge number and the three-dimensional structure of the ions, so they are important for distinguishing isomers and structurally similar compounds. They also add a further dimension of separation on top of retention time and mass-to-charge ratio, which makes the identification result more accurate.
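To illustrate why collision cross section adds a dimension of separation, the sketch below matches a measured feature against reference entries on mass-to-charge ratio, retention time and collision cross section together. The tolerance values (5 ppm, 0.2 min, 2 %) and both reference isomers are assumptions chosen for the example; the source does not specify matching tolerances.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reference:
    name: str
    mz: float      # reference mass-to-charge ratio
    rt_min: float  # reference retention time, minutes
    ccs: float     # reference collision cross section, square angstroms


@dataclass
class Feature:
    mz: float
    rt_min: float
    ccs: float


def ppm_error(measured: float, reference: float) -> float:
    """Mass error in parts per million."""
    return (measured - reference) / reference * 1e6


def percent_error(measured: float, reference: float) -> float:
    """Relative deviation in percent."""
    return (measured - reference) / reference * 100.0


def matches(feature: Feature, ref: Reference,
            ppm_tol: float = 5.0, rt_tol_min: float = 0.2,
            ccs_tol_pct: float = 2.0) -> bool:
    """True only if all three dimensions agree within the assumed tolerances."""
    return (abs(ppm_error(feature.mz, ref.mz)) <= ppm_tol
            and abs(feature.rt_min - ref.rt_min) <= rt_tol_min
            and abs(percent_error(feature.ccs, ref.ccs)) <= ccs_tol_pct)


# Two hypothetical isomers: identical m/z, near-identical retention time,
# distinguishable only by collision cross section.
library: List[Reference] = [
    Reference("isomer_A", 515.3000, 10.0, 220.0),
    Reference("isomer_B", 515.3000, 10.1, 232.0),
]
feature = Feature(mz=515.3010, rt_min=10.05, ccs=231.0)
hits = [ref.name for ref in library if matches(feature, ref)]
print(hits)
```

With mass and retention time alone, both isomers would be accepted; the collision cross section check is what leaves a single candidate in this constructed case.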
Preferably, the information processing platform in step (2) includes at least one of PeakView software, UNIFI software, MassLynx software, Progenesis QI software, Mass Frontier software, Compound Discoverer software, Tracefilter software, MassHunter software, Skyline software, MS-DIAL software and the AllCCS platform; the UNIFI software performs simulated fragmentation matching, and the AllCCS platform performs mobility value prediction.
Specifically, the secondary database based on the reported component information of the ganoderma genus is established as follows: mass spectrum data of ganoderma lucidum samples are collected, ganoderma-related components are selected to construct an initial data set, and, based on this initial data set, information processing platforms such as UNIFI, GNPS and SIRIUS 4 are used to perform dual accurate simulation, namely simulated fragmentation and mobility value prediction, so as to construct a secondary mass spectrum database capable of accurately identifying compound structures.
Preferably, the sources of the reported component information of the ganoderma genus in step (2) include published literature and large chemical databases (Reaxys, PubChem, SciFinder, TCM, etc.).
Preferably, the information processing platform in step (2) includes PeakView software, UNIFI software, MassLynx software, Progenesis QI software, Mass Frontier software, Compound Discoverer software, Tracefilter software, MassHunter software, Skyline software, MS-DIAL software and the AllCCS platform.
Further preferably, UNIFI software is selected for simulated fragmentation matching and the AllCCS platform is selected for mobility value prediction.
The secondary mass spectrum database in the invention has the following characteristics: (1) Broad coverage: it covers all reported component information of ganoderma lucidum, including literature data and information on ganoderma lucidum chemical components from large chemical databases such as Reaxys, SciFinder and PubChem; (2) High accuracy: accurate fragment simulation and mobility value prediction are performed on the compounds using multiple information platforms such as UNIFI, GNPS, SIRIUS and AllCCS to obtain accurate compound identification information.
Specifically, the tertiary database based on open-source databases updated in real time is established as follows: multiple mass spectrum data processing strategies for complex traditional Chinese medicine components and phytochemical taxonomy are applied, combined with information platforms such as UNIFI, GNPS and SIRIUS 4 that simulate fragmentation and predict mobility values for accurate confirmation, so as to construct a tertiary mass spectrum database capable of accurately identifying compound structures.
Preferably, the open-source databases in step (3) comprise more than 350 open-source databases such as ChemSpider, COCONUT, Super Natural II, NPASS, MassBank, KEGG and FooDB, and have the following characteristics: (1) Broad coverage: they cover all currently known chemical components, including synthetic products and natural products; (2) Timeliness: the information in the databases is updated regularly, which keeps it accurate and current.
Preferably, the mass spectrum data processing strategies for complex traditional Chinese medicine components in step (3) comprise at least one of: a mass spectral tree similarity filtering technique based on template compounds, a "fragment tree" strategy for de novo identification of unknown compounds based on fragment fingerprint features, a molecular network based on secondary fragment similarity scores, and a compound prediction strategy based on molecular descriptors; the phytochemical taxonomy includes a ganoderma chemical taxonomy based on ganoderma-related biosynthetic pathways and/or on the parent nuclei of known chemical structures established by studies of ganoderma chemical composition.
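One of the strategies named above, a molecular network based on secondary fragment similarity scores, can be sketched as follows. This is a simplified cosine score with greedy peak matching and a fixed m/z tolerance, written for illustration; the tolerance, the score threshold and the three spectra are assumptions, and platforms such as GNPS use more elaborate scoring than this.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def cosine_score(a: List[Peak], b: List[Peak], tol: float = 0.01) -> float:
    """Cosine similarity of two fragment spectra with greedy peak matching."""
    # Collect every pair of peaks whose m/z values agree within the tolerance.
    candidates = []
    for i, (mz_a, int_a) in enumerate(a):
        for j, (mz_b, int_b) in enumerate(b):
            if abs(mz_a - mz_b) <= tol:
                candidates.append((int_a * int_b, i, j))
    # Accept the strongest pairs first; each peak is used at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    norm_a = math.sqrt(sum(inten ** 2 for _, inten in a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_network(spectra: Dict[str, List[Peak]],
                  threshold: float = 0.7) -> List[Tuple[str, str, float]]:
    """Connect every pair of spectra whose score reaches the threshold."""
    edges = []
    for name_a, name_b in combinations(sorted(spectra), 2):
        score = cosine_score(spectra[name_a], spectra[name_b])
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges


# Invented spectra: A and B share both fragments, C shares none.
spectra = {
    "A": [(100.0, 10.0), (200.0, 5.0)],
    "B": [(100.005, 10.0), (200.0, 5.0)],
    "C": [(150.0, 8.0)],
}
edges = build_network(spectra)
print(edges)
```

In such a network, an unknown feature linked to an identified compound is a candidate for sharing its parent nucleus skeleton, which is where the fragmentation rules summarized from standards become useful.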
The tertiary mass spectrum database in the invention has the following characteristics: (1) Timeliness: the compound information in the database can be updated and expanded in real time; (2) Accuracy: based on mass spectrum data processing strategies for complex traditional Chinese medicine components and on phytochemical taxonomy, and combined with scientific information systems that simulate fragmentation and predict mobility values, accurate compound identification information can be obtained.
The invention also relates to a database established by the establishment method, and the database is updated periodically or in real time.
The invention also relates to application of the database established by the establishment method in identifying secondary metabolites in ganoderma lucidum.
Preferably, identifying secondary metabolites in ganoderma lucidum comprises the following steps:
(1) Preparing a sample solution and collecting data;
(2) Using an information processing platform, comparing the mass spectrum data of the sample to be identified with the primary, secondary and tertiary mass spectrum databases respectively to identify its components.
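The comparison against the three databases can be sketched as a tiered lookup. The code below assumes, for illustration only, that the databases are consulted in order of decreasing confidence (primary, then secondary, then tertiary) and that a match is decided on mass accuracy alone with a 5 ppm window; the source states only that the sample data are compared with each database. All entries are invented placeholders.

```python
from typing import List, Optional, Tuple

# Each tier is (level name, list of (compound name, reference m/z)).
Tier = Tuple[str, List[Tuple[str, float]]]


def identify(mz: float, tiers: List[Tier],
             ppm_tol: float = 5.0) -> Optional[Tuple[str, str]]:
    """Return (level, compound) for the first tier containing a match."""
    for level, entries in tiers:
        for name, ref_mz in entries:
            if abs(mz - ref_mz) / ref_mz * 1e6 <= ppm_tol:
                return level, name
    return None  # not found at any level


# Hypothetical databases with one placeholder entry each.
tiers: List[Tier] = [
    ("primary", [("standard_X", 515.3000)]),
    ("secondary", [("literature_Y", 499.3050)]),
    ("tertiary", [("online_Z", 457.2950)]),
]

for measured in (515.3005, 457.2951, 300.0000):
    print(measured, identify(measured, tiers))
```

Recording the level at which a hit was found lets a report distinguish identifications confirmed by a standard from those supported only by simulated fragmentation and predicted mobility values.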
Preferably, the preparation method of the sample solution in the step (1) includes: extracting Ganoderma sample with solvent under ultrasonic wave, and filtering to obtain the final product; the solvent is selected from water or 70-100% methanol, preferably 100% methanol; the ultrasonic extraction time is 20-60min, and the filtration is 0.22-0.5 μm microporous membrane filtration.
Specifically, the data acquisition methods for the sample solution provided by the invention include: direct injection mass spectrometry, liquid chromatography-mass spectrometry, desorption electrospray ionization mass spectrometry (DESI-MS), direct analysis in real time mass spectrometry (DART-MS), and matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS).
Preferably, the device for data acquisition in step (1) is selected from one or more of a mass spectrometer, a liquid chromatography-mass spectrometer, a desorption electrospray ionization mass spectrometer (DESI-MS), a direct analysis in real time mass spectrometer (DART-MS) and a matrix-assisted laser desorption ionization mass spectrometer (MALDI-MS), preferably a liquid chromatography-mass spectrometer; the ion source used by the liquid chromatography-mass spectrometer is selected from one of ESI, ESCI, APCI, EI and CI, preferably ESI.
Further preferably, the data acquisition device is selected from a liquid chromatography-mass spectrometer and a direct injection mass spectrometer, preferably a liquid chromatography-mass spectrometer.
Because most secondary metabolites in ganoderma belong to the same compound class and have many isomers, the invention examined how different methods of acquiring mass spectrum data, such as direct injection mass spectrometry, liquid chromatography-mass spectrometry and DESI-MS, affect the identification result. The results show that liquid chromatography-mass spectrometry provides higher peak capacity and better resolution of isomers of the same compound class in ganoderma.
Preferably, the mode of data acquisition in step (1) is selected from one or more of DDA, DIA, Full Scan, MS^E, MIM-EPI, SONAR, SWATH and HDMS; preferably, SONAR and HDMS.
In the present invention, mass spectrum acquisition modes such as DDA, DIA, Full Scan, MS^E, MIM-EPI, SONAR, SWATH and HDMS were compared. The results show that the SONAR + HDMS mode assigns high-energy fragments while greatly reducing spectral complexity and improving the signal-to-noise ratio. It also adds one separation dimension (the mobility value) to conventional acquisition, so each compound is confirmed in one additional dimension and the identification accuracy is further improved.
Specifically, the comparison and identification against the primary mass spectrum database in step (2) is as follows: the acquired information is imported into the information processing platform and matched against the primary mass spectrum database. Qualitative analysis confirmation criteria: retention time deviation within ±0.15 min; mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 6 matched fragments; mobility value deviation less than 5%. When the comparison between a screened component and a component in the primary database meets all of these criteria simultaneously, the screened component is identified as that component.
Specifically, the comparison and identification against the secondary mass spectrum database in step (2) is as follows: the acquired information is imported into the information processing platform and subjected to dual simulation matching against the secondary mass spectrum database, that is, simulated fragmentation combined with mobility value prediction, to obtain a matching degree (0-50; the higher the matching degree, the higher the prediction accuracy). Qualitative analysis confirmation criteria: retention time deviation within ±0.15 min; mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 4 matched fragments; mobility value deviation less than 5%; matching degree of at least 30. When the comparison between a screened component and a component in the secondary database meets all of these criteria simultaneously, the screened component is identified as that component.
Specifically, the comparison and identification against the tertiary mass spectrum database in step (2) is as follows: the acquired information is imported into the information processing platform and subjected to dual simulation matching against the tertiary mass spectrum database, that is, simulated fragmentation combined with mobility value prediction, to obtain a matching degree. Qualitative analysis confirmation criteria: the compound type occurs in the genus Ganoderma; mass error of the parent ion monoisotopic peak less than 4.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 4.5 ppm with at least 3 matched fragments; mobility value deviation less than 8%; oil-water partition coefficient consistent with the chromatographic retention time of the component; matching degree of at least 20. When the comparison between a screened component and a component in the tertiary database meets all of these criteria simultaneously, the screened component is identified as that component.
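The numeric confirmation criteria for the three database levels can be expressed as a small rule table. The following is a minimal sketch only: the class names, field names and the `passes` function are hypothetical, the parts-per-million formula is the conventional definition rather than one stated in this description, and the two non-numeric tertiary criteria (occurrence in the genus Ganoderma, and oil-water partition coefficient consistent with retention time) are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierCriteria:
    """Numeric thresholds for one database level (hypothetical structure)."""
    ppm_tol: float                       # parent/fragment mass error must be below this
    min_fragments: int                   # minimum number of matched fragments
    mobility_tol_pct: float              # mobility value deviation must be below this
    isotope_tol_pct: float = 5.0         # isotope relative abundance deviation limit
    rt_tol_min: Optional[float] = None   # None: retention time is not checked
    min_score: Optional[float] = None    # None: matching degree is not used


# Thresholds as listed in the description for each level.
PRIMARY = TierCriteria(3.5, 6, 5.0, rt_tol_min=0.15)
SECONDARY = TierCriteria(3.5, 4, 5.0, rt_tol_min=0.15, min_score=30)
TERTIARY = TierCriteria(4.5, 3, 8.0, min_score=20)


@dataclass(frozen=True)
class Candidate:
    """One screened component compared with one database entry."""
    parent_ppm: float          # signed mass error of the parent ion, in ppm
    isotope_dev_pct: float
    matched_fragments: int     # fragments already within the level's ppm tolerance
    mobility_dev_pct: float
    rt_dev_min: float = 0.0
    score: float = 0.0         # matching degree on the 0-50 scale


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Conventional ppm mass error (assumed definition)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes(c: Candidate, t: TierCriteria) -> bool:
    """True only if every numeric criterion of the level is met at once."""
    checks = [
        abs(c.parent_ppm) < t.ppm_tol,
        abs(c.isotope_dev_pct) < t.isotope_tol_pct,
        c.matched_fragments >= t.min_fragments,
        abs(c.mobility_dev_pct) < t.mobility_tol_pct,
    ]
    if t.rt_tol_min is not None:
        checks.append(abs(c.rt_dev_min) <= t.rt_tol_min)
    if t.min_score is not None:
        checks.append(c.score >= t.min_score)
    return all(checks)
```

A candidate with only 4 matched fragments would fail `PRIMARY` in this sketch but could still pass `SECONDARY` if its matching degree is at least 30, which mirrors how the criteria relax from one level to the next.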
Preferably, the information processing platform in step (2) is selected from one or more of PeakView software, UNIFI software, MassLynx software, Progenesis QI software, MassFrontier software, Compound Discoverer software, Tracefilter software, MassHunter software, Skyline software, MS-DIAL software and the AllCCS platform. Preferably, UNIFI software is used for simulated fragmentation matching and the AllCCS platform is used for mobility value prediction.
Compared with the prior art, the invention has the following beneficial effects:
(1) The multi-level ganoderma lucidum database enables qualitative screening of ganoderma-related samples without reference standards. It is high-throughput, accurate, simple and rapid, it addresses the limited availability of standards, the heavy workload and the delayed database updates of conventional identification, and it provides technical support for basic research on ganoderma lucidum substances and for identification of pharmacologically active substances;
(2) The multi-level database constructed by the invention adds one separation dimension (ion mobility) to the conventional method, so each compound is confirmed in one additional dimension (the mobility value) and the identification accuracy is further improved. In addition, several information processing platforms, such as the UNIFI scientific information system and molecular network platforms, are combined to perform simulation matching of fragments and mobility values, which greatly improves the accuracy and reliability of compound identification;
(3) The three-level database constructed by the invention uses open source databases updated in real time together with several mass spectrum data processing strategies for complex traditional Chinese medicine components, such as a molecular network (MN) based on secondary fragment similarity scores and a compound prediction strategy based on molecular descriptors. It also combines phytochemical taxonomy with information processing platforms such as UNIFI, GNPS and SIRIUS 4. This avoids errors caused by isomers and by the very large number of compounds in open source databases, and further improves the accuracy and reliability of compound identification.
Drawings
FIG. 1 is a flowchart of the establishment and application of a ganoderma lucidum multi-layer high throughput screening mass spectrometry database;
FIG. 2 is a schematic diagram of a primary database creation and comparison process;
FIGS. 3 and 4 are schematic diagrams of a secondary database creation and comparison process;
FIG. 5 is a schematic diagram of a three-level database creation and comparison process;
FIGS. 6 and 7 are visual views of the authentication results of the GNPS information processing platform;
FIG. 8 is a Principal Component Analysis (PCA) plot of different Ganoderma samples based on the identification of a multi-level screening database, wherein BR is white ganoderma lucidum; CC-Ganoderma lucidum; CZ-ganoderma lucidum; GT-Ganoderma tsugae; GW-Wei-Ganoderma lucidum; NF-south ganoderma; SS-Ganoderma Applanatum; WB-ganoderma lucidum without stem.
Detailed Description
In order to make the person skilled in the art better understand the present invention, the technical solution of the present invention will be further explained with reference to the drawings and the examples, but the scope of the present invention is not limited in any way by the examples.
Example 1 Establishment of the ganoderma lucidum multi-layer high-throughput screening mass spectrum database
As shown in FIG. 1, the ganoderma lucidum multi-layer high-throughput screening mass spectrum database of the invention is established as follows:
(1) First-level database creation
Standards of the compounds contained in ganoderma lucidum (containing purity and NMR identification certificates) were collected, and the names and CAS numbers of the obtained compounds are shown in table 1 in detail.
5 mg of each ganoderma lucidum compound standard was weighed, diluted to 50 mL with methanol and mixed well to prepare a single-standard solution of 0.1 mg/mL. The solution was injected into a Synapt XS high-resolution liquid chromatography-mass spectrometer with the following instrument parameters:
The chromatographic column was a C18 UPLC column (2.1 mm × 100 mm, 1.8 μm); the mobile phase was 0.1% (v/v) formic acid in water (A) and acetonitrile (B), with gradient elution: 0-9 min, 20-28% B; 9-28 min, 28-60% B; 28-45 min, 60-100% B. The flow rate was 0.3 mL·min⁻¹, the column temperature was 30 °C and the injection volume was 1 μL. Analysis used an electrospray ion source in negative ion mode, with a mass scan range of 100-1500 Da, an ion source temperature of 150 °C, a capillary voltage of 3 kV, a cone voltage of 40 V, a collision energy of 20-50 eV, an atomizing gas pressure of 6.0 bar, an atomizing gas temperature of 500 °C and an atomizing gas flow of 1000 L·h⁻¹. A leucine-enkephalin solution (ESI⁻: 554.2615 Da) at a concentration of 200 pg·mL⁻¹ was used as the calibration solution.
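The gradient program above can be written as a table of time points, which makes it easy to look up the mobile phase composition at any moment of the run. This is a minimal sketch that assumes each segment is a linear ramp between its stated end points (the description gives only the end points); the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % acetonitrile B) break points taken from the stated gradient.
GRADIENT = [(0.0, 20.0), (9.0, 28.0), (28.0, 60.0), (45.0, 100.0)]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min, assuming linear ramps between break points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-45 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: break points cover the whole range")


if __name__ == "__main__":
    for t in (0, 9, 18.5, 28, 45):
        print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

Under the linear-ramp assumption, %A is simply 100 minus the returned value.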
For each compound, the retention time, the accurate molecular weight and ion intensity of the primary and secondary fragments, the mobility drift time, the collision cross section and similar information were recorded, and the mass spectrometric fragmentation rules of the different parent nucleus skeletons were summarized. On this basis, a primary database built from the ganoderma lucidum compound standards was formed, as shown in FIG. 2.
TABLE 1 ganoderma lucidum compounds covered by the current primary database
(2) Establishment of secondary database
Samples of different species of the genus Ganoderma were collected (Ganoderma applanatum, Ganoderma australe, Ganoderma boninense, Ganoderma gibbosum, Ganoderma leucocontextum, Ganoderma multipileum, Ganoderma resinaceum, Ganoderma lucidum, Ganoderma sinense, Ganoderma tsugae and Ganoderma weberianum). 0.1 g of each was weighed into a 50 mL centrifuge tube, 40 mL of methanol was accurately added, and the mixture was ultrasonically extracted for 30 min. 1 mL of supernatant was transferred to a 1.5 mL centrifuge tube and centrifuged at 13000 r/min for 10 min, and the resulting supernatant was injected into a Synapt XS high-resolution liquid chromatography-mass spectrometer with the following instrument parameters:
The chromatographic column was a C18 UPLC column (2.1 mm × 100 mm, 1.8 μm); the mobile phase was 0.1% (v/v) formic acid in water (A) and acetonitrile (B), with gradient elution: 0-9 min, 20-28% B; 9-28 min, 28-60% B; 28-45 min, 60-100% B. The flow rate was 0.3 mL·min⁻¹, the column temperature was 30 °C and the injection volume was 1 μL. Analysis used an electrospray ion source in negative ion mode, with a mass scan range of 100-1500 Da, an ion source temperature of 150 °C, a capillary voltage of 3 kV, a cone voltage of 40 V, a collision energy of 20-50 eV, an atomizing gas pressure of 6.0 bar, an atomizing gas temperature of 500 °C and an atomizing gas flow of 1000 L·h⁻¹. A leucine-enkephalin solution (ESI⁻: 554.2615 Da) at a concentration of 200 pg·mL⁻¹ was used as the calibration solution.
Sample data were collected, and the retention time, accurate molecular weight, ion intensity, mobility drift time, collision cross section and similar information of all primary and secondary ions were recorded.
Data on all reported compounds of the genus Ganoderma were collected from the relevant literature and large chemical databases to form an initial data set containing, for each compound, the ganoderma species of origin, name, molecular formula, structural formula, secondary fragments, oil-water partition coefficient and similar information. Using information processing platforms such as UNIFI, GNPS and SIRIUS 4, all components detected in the collected ganoderma samples were compared against this data set by dual simulation matching, that is, simulated fragmentation combined with mobility value prediction. A component was considered identified when it satisfied all of the following: mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 4 matched fragments; mobility value deviation less than 5%; oil-water partition coefficient consistent with the chromatographic retention time of the component; matching degree of at least 30. The identified components then formed the secondary mass spectrum database, which enables accurate identification of compound structure and retention time, as shown in FIGS. 3 and 4.
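Checking the parent ion mass error requires a theoretical m/z for each entry of the initial data set, which can be derived from the recorded molecular formula. The sketch below is illustrative only: it assumes the parent ion is the deprotonated molecule [M-H]⁻ (the description states negative ion mode but does not name the adduct), it handles only C, H, N and O, and the function names are hypothetical. The formula C30H44O7 (that of ganoderic acid A) is used purely as an example and is not taken from the tables of this document.

```python
import re

# Monoisotopic masses of the most abundant isotopes, in Da.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
}
PROTON = 1.007276466621  # mass of a proton, in Da


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C30H44O7' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_deprotonated(formula: str) -> float:
    """Theoretical m/z of [M-H]-, assuming a singly charged deprotonated ion."""
    return monoisotopic_mass(formula) - PROTON


if __name__ == "__main__":
    print(f"{mz_deprotonated('C30H44O7'):.4f}")
```

An element outside the small mass table raises a `KeyError`, which is deliberate: silently skipping an unknown element would produce a wrong theoretical mass.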
(3) Three-level database creation
Using many open source databases updated in real time, such as ChemSpider, MassBank and KEGG, in combination with platforms such as the UNIFI scientific information system, GNPS and SIRIUS 4, simulated fragmentation and mobility value prediction were performed, and the components of the ganoderma samples left unidentified by the primary and secondary databases were compared by this dual simulation matching to obtain a matching degree. A component was considered identified when it satisfied all of the following: the compound type occurs in the genus Ganoderma; mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 3 matched fragments; mobility value deviation less than 8%; oil-water partition coefficient consistent with the chromatographic retention time of the component; matching degree of at least 20. The identified components then formed the tertiary mass spectrum database, which enables accurate identification of compound structure, as shown in FIG. 5.
Example 2 identification of Ganoderma lucidum Secondary metabolite based on Multi-level Mass Spectrometry database
(1) Sample preparation to be tested
0.1 g of ganoderma lucidum (Ganoderma lucidum) sample powder was accurately weighed into a 50 mL centrifuge tube, 40 mL of methanol was accurately added, and the mixture was ultrasonically extracted for 30 min. 1 mL of supernatant was transferred to a 1.5 mL centrifuge tube and centrifuged at 13000 r/min for 10 min, and the resulting supernatant was kept as the sample to be tested.
(2) Primary and secondary mass spectrogram acquisition of sample to be detected
The sample to be tested was analyzed on a Synapt XS high-resolution liquid chromatography-mass spectrometer with the following acquisition parameters:
Chromatographic conditions: the chromatographic column was a C18 UPLC column (2.1 mm × 100 mm, 1.8 μm); the mobile phase was 0.1% (v/v) formic acid in water (A) and acetonitrile (B), with gradient elution: 0-9 min, 20-28% B; 9-28 min, 28-60% B; 28-45 min, 60-100% B. The flow rate was 0.3 mL·min⁻¹, the column temperature was 30 °C and the injection volume was 1 μL.
Mass spectrometry conditions: analysis used an electrospray ion source in negative ion mode, with a mass scan range of 100-1500 Da, an ion source temperature of 150 °C, a capillary voltage of 3 kV, a cone voltage of 40 V, a collision energy of 20-50 eV, an atomizing gas pressure of 6.0 bar, an atomizing gas temperature of 500 °C and an atomizing gas flow of 1000 L·h⁻¹. A leucine-enkephalin solution (ESI⁻: 554.2615 Da) at a concentration of 200 pg·mL⁻¹ was used as the calibration solution.
(3) Qualitative analysis of the composition
First, the collected information was imported into a scientific information system such as UNIFI and matched against the primary database. Qualitative analysis confirmation criteria: retention time deviation within ±0.15 min; mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 6 matched fragments; mobility value deviation less than 5%. On this basis, 29 components were identified from the primary database.
Second, dual simulation matching (simulated fragmentation combined with mobility value prediction) was performed against the secondary database to obtain a matching degree. Qualitative analysis confirmation criteria: retention time deviation within ±0.15 min; mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 4 matched fragments; mobility value deviation less than 5%; matching degree of at least 30. On this basis, 142 components were identified from the secondary database.
Finally, a further comparison was made against the tertiary database to obtain a matching degree. Qualitative analysis confirmation criteria: the compound type occurs in the genus Ganoderma; mass error of the parent ion monoisotopic peak less than 4.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 4.5 ppm with at least 4 matched fragments; mobility value deviation less than 8%; oil-water partition coefficient consistent with the retention time of the component; matching degree of at least 20. On this basis, 33 components were identified from the tertiary database.
In total, searching the multi-level ganoderma lucidum database identified 204 components in the ganoderma lucidum sample; the specific identification information is shown in Table 2 below. This embodiment has the following advantages over existing methods: (1) Accurate identification: conventional identification relies on two dimensions, retention time and primary and secondary fragment information, whereas this method adds the mobility value as a further separation dimension, which makes the identification result more accurate. (2) High identification efficiency: the conventional method relies on manual assignment and has a long identification cycle, needing more than two months to identify 100 compounds, whereas in this embodiment the software searches the database and identifies components automatically, so that more than 200 components can be identified from a ganoderma lucidum sample in a single run (within 30 minutes), a marked improvement in analysis efficiency. (3) Low technical threshold: the conventional method requires personnel with strong knowledge of analytical chemistry, phytochemistry and mass spectrometry to judge identification results accurately, whereas in this embodiment the software searches and evaluates the data automatically, so less technical expertise is required of the operator.
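The level-by-level search in this example (primary first, then secondary, then tertiary, with each component assigned by the first level that identifies it) can be sketched as a simple cascade. This is a hypothetical outline, not the behavior of UNIFI or any other named platform: the `cascade` function and the idea of a per-level matcher callable are assumptions introduced for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple

# A matcher takes a feature ID and returns a compound name, or None if no
# database entry at that level meets all confirmation criteria (hypothetical).
Matcher = Callable[[Hashable], Optional[str]]


def cascade(
    features: Iterable[Hashable],
    tiers: List[Tuple[str, Matcher]],
) -> Tuple[Dict[Hashable, Tuple[str, str]], Dict[str, int]]:
    """Assign each feature to the first level that identifies it.

    Returns (assignments, per-level counts). Features that no level
    identifies are simply absent from the assignments.
    """
    assignments: Dict[Hashable, Tuple[str, str]] = {}
    counts = {name: 0 for name, _ in tiers}
    for feature in features:
        for name, matcher in tiers:
            hit = matcher(feature)
            if hit is not None:
                assignments[feature] = (name, hit)
                counts[name] += 1
                break  # later levels only see features still unidentified
    return assignments, counts
```

Because each component is counted at exactly one level, the per-level counts add up to the total, as with the 29, 142 and 33 components reported above, which sum to 204.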
TABLE 2 identification of Ganoderma lucidum (Ganoderma lucidum) secondary metabolite based on Multi-level Mass Spectrometry database
Example 3 identification of secondary metabolites of different varieties of Ganoderma lucidum based on Multi-level Mass Spectrometry database
(1) Material preparation
0.1 g of powder of each of the different ganoderma samples (Ganoderma applanatum, Ganoderma australe, Ganoderma leucocontextum, Ganoderma multipileum, Ganoderma resinaceum, Ganoderma lucidum, Ganoderma sinense, Ganoderma tsugae and Ganoderma weberianum) was accurately weighed into separate 50 mL centrifuge tubes, 40 mL of methanol was accurately added to each, and the mixtures were ultrasonically extracted for 30 min. 1 mL of each supernatant was transferred to a 1.5 mL centrifuge tube and centrifuged at 13000 r/min for 10 min, and the resulting supernatant was kept as the sample to be tested.
(2) Primary and secondary mass spectrogram acquisition of sample to be detected
The samples to be tested were analyzed on a Synapt XS liquid chromatography-mass spectrometer with the following instrument parameters:
Chromatographic conditions: the chromatographic column was a C18 UPLC column (2.1 mm × 100 mm, 1.8 μm); the mobile phase was 0.1% (v/v) formic acid in water (A) and acetonitrile (B), with gradient elution: 0-9 min, 20-28% B; 9-28 min, 28-60% B; 28-45 min, 60-100% B. The flow rate was 0.3 mL·min⁻¹, the column temperature was 30 °C and the injection volume was 1 μL.
Mass spectrometry conditions: analysis used an electrospray ion source in negative ion mode, with a mass scan range of 100-1500 Da, an ion source temperature of 150 °C, a capillary voltage of 3 kV, a cone voltage of 40 V, a collision energy of 20-50 eV, an atomizing gas pressure of 6.0 bar, an atomizing gas temperature of 500 °C and an atomizing gas flow of 1000 L·h⁻¹. A leucine-enkephalin solution (ESI⁻: 554.2615 Da) at a concentration of 200 pg·mL⁻¹ was used as the calibration solution.
(3) Qualitative analysis of the composition
First, the collected information was imported into a scientific information system such as UNIFI and matched against the primary database. Qualitative analysis confirmation criteria: retention time deviation within ±0.15 min; mass error of the parent ion monoisotopic peak less than 3.5 ppm; fragment ion mass error less than 3.5 ppm; isotope relative abundance deviation less than 5%; mobility value deviation less than 5%; at least 6 matched fragments.
Second, dual simulation matching (simulated fragmentation combined with mobility value prediction) was performed against the secondary database to obtain a matching degree. In addition, the GNPS information processing platform was used to further simulate the fragmentation of unidentified components and compare them with the initial data set of the secondary database; the visualized results are shown in FIGS. 6 and 7, and the components identified in this way were entered into the secondary database. Qualitative analysis confirmation criteria: retention time deviation within ±0.15 min; mass error of the parent ion monoisotopic peak less than 3.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 3.5 ppm with at least 4 matched fragments; mobility value deviation less than 5%; matching degree of at least 30.
Finally, a further comparison was made against the tertiary database. Qualitative analysis confirmation criteria: the compound type occurs in the genus Ganoderma; mass error of the parent ion monoisotopic peak less than 4.5 ppm; isotope relative abundance deviation less than 5%; fragment ion mass error less than 4.5 ppm; at least 3 matched fragments; mobility value deviation less than 8%; matching degree of at least 20; oil-water partition coefficient consistent with the retention time of the component.
Searching the multi-level ganoderma lucidum database identified a total of 408 secondary metabolites in the different ganoderma samples (Table 3). Multivariate statistical analysis of the identified components (FIG. 8) showed that the different ganoderma samples can be distinguished on the basis of the identified compounds. This example shows that the method has the following advantages over the conventional method: (1) High identification efficiency: multiple data sets can be identified automatically in a short time, and more components can be identified. (2) Wide applicability: in addition to the botanical sources specified under the Ganoderma entry of the Chinese Pharmacopoeia (Ganoderma lucidum and Ganoderma sinense), samples of other ganoderma species can also be identified.
TABLE 3 identification results and intensity values of different varieties of Ganoderma lucidum secondary metabolites based on a multi-level mass spectrum database
The foregoing detailed description is directed to one feasible embodiment of the present invention and is not intended to limit its scope; all equivalent implementations or modifications that do not depart from the invention fall within its scope.
Claims (10)
1. The method for establishing the ganoderma lucidum multi-level screening mass spectrum database is characterized by comprising the following steps of:
(1) Based on ganoderma lucidum compound standards, acquiring data with different acquisition modes of a high-resolution mass spectrometer, analyzing the data of the ganoderma lucidum compound standards and constructing a primary mass spectrum database;
(2) Based on the mass spectrum data of measured ganoderma lucidum samples and the published component information of ganoderma lucidum, simulating fragmentation and predicting mobility values with an information processing platform and performing data comparison to construct a secondary mass spectrum database;
(3) Based on open source databases, constructing a tertiary mass spectrum database by using several mass spectrum data processing strategies for complex traditional Chinese medicine components and phytochemical taxonomy, in combination with an information processing platform that simulates fragmentation and predicts mobility values.
2. The method according to claim 1, wherein the standards in step (1) comprise representative ganoderma lucidum triterpene components with different skeletons having 24-carbon, 27-carbon and 30-carbon parent nuclei; the different acquisition modes include MS^E, DDA, DIA, Full Scan, MIM-EPI, SONAR, SWATH and HDMS; the primary mass spectrum database comprises accurate molecular mass, structural formula, retention time, accurate molecular weight and ion intensity of primary fragments, accurate molecular weight and ion intensity of secondary fragments, ion mobility drift time and collision cross section.
3. The method according to claim 1, wherein the information processing platform in step (2) includes at least one of PeakView software, UNIFI software, MassLynx software, Progenesis QI software, MassFrontier software, Compound Discoverer software, Tracefilter software, MassHunter software, Skyline software, MS-DIAL software and the AllCCS platform; the UNIFI software performs simulated fragmentation matching, and the AllCCS platform performs mobility value prediction.
4. The method according to claim 1, wherein the mass spectrum data processing strategy for complex traditional Chinese medicine components in step (3) comprises at least one of: a mass spectrum tree similarity filtering technique based on template compounds, a "fragment tree" strategy for de novo identification of unknown compounds based on fragment fingerprint features, a molecular network based on secondary fragment similarity scores, and a compound prediction strategy based on molecular descriptors; the phytochemical taxonomy includes a ganoderma chemotaxonomy formed on the basis of ganoderma-related biosynthetic pathways and/or parent nuclei of known chemical structures established by studies of ganoderma chemical composition.
5. A database built by the building method according to any one of claims 1-4, characterized in that the database is updated periodically or in real time.
6. Use of a database created by the method of any one of claims 1-4 for identifying secondary metabolites in ganoderma lucidum.
7. The use according to claim 6, wherein the step of identifying secondary metabolites in ganoderma lucidum comprises:
(1) Preparing a sample solution and collecting data;
(2) Using an information processing platform, comparing the mass spectrum data of the sample to be identified with the primary, secondary and tertiary mass spectrum databases respectively to identify its components.
8. The use according to claim 7, wherein the sample solution in step (1) is prepared by ultrasonically extracting a ganoderma sample with a solvent and filtering the extract; the solvent is selected from water or 70-100% methanol; the ultrasonic extraction time is 20-60 min.
9. The use of claim 7, wherein the data acquisition device in step (1) is selected from one or more of a mass spectrometer, a liquid chromatography-mass spectrometer, DESI-MS, DART-MS and MALDI-MS; the ion source used by the liquid chromatography-mass spectrometer is selected from one of ESI, ESCI, APCI, EI and CI; the mode of data acquisition is selected from one or more of DDA, DIA, Full Scan, MS^E, MIM-EPI, SONAR, SWATH and HDMS.
10. The use according to claim 7, wherein the confirmation criteria for the comparison and identification in step (2) comprise:
S1: qualitative confirmation criteria for the primary mass spectrum database: the retention time deviation is within ±0.15 min; the monoisotopic mass error of the parent ion is less than 3.5 ppm; the isotope relative abundance deviation is less than 5%; the mass error of the fragment ions is less than 3.5 ppm and at least 6 fragments are matched; the mobility value deviation is less than 5%; a screened component is identified as a component in the primary database when the comparison with that component meets all of these criteria simultaneously;
S2: qualitative confirmation criteria for the secondary mass spectrum database: the retention time deviation is within ±0.15 min; the monoisotopic mass error of the parent ion is less than 3.5 ppm; the isotope relative abundance deviation is less than 5%; the mass error of the fragment ions is less than 3.5 ppm and at least 4 fragments are matched; the mobility value deviation is less than 5%; the matching degree is greater than or equal to 30, the matching degree being obtained by dual simulated matching against the secondary mass spectrum database, combining simulated fragmentation with mobility value prediction; a screened component is identified as a component in the secondary database when the comparison with that component meets all of these criteria simultaneously;
S3: qualitative confirmation criteria for the tertiary mass spectrum database: the monoisotopic mass error of the parent ion is less than 4.5 ppm; the isotope relative abundance deviation is less than 5%; the mass error of the fragment ions is less than 4.5 ppm and at least 3 fragments are matched; the mobility value deviation is less than 8%; the matching degree is greater than or equal to 20; a screened component is identified as a component in the tertiary database when the comparison with that component meets all of these criteria simultaneously.
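The three sets of confirmation criteria in claim 10 can be illustrated with a minimal Python sketch. It is not the information processing platform referred to in the claims. The data layout (dictionaries with `rt`, `mz`, `isotope`, `fragments` and `mobility` keys), the function names and the numeric example values are all hypothetical, and the sketch assumes that the isotope and mobility deviations are relative deviations expressed in percent and that the matching degree is supplied by an external tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Criteria:
    rt_tol_min: Optional[float]       # None: retention time is not checked
    ppm_tol: float                    # parent and fragment mass error limit (ppm)
    isotope_dev_pct: float            # isotope relative abundance deviation limit (%)
    min_fragments: int                # minimum number of matched fragments
    mobility_dev_pct: float           # mobility value deviation limit (%)
    min_match_score: Optional[float]  # None: matching degree is not required


# Thresholds as stated in S1, S2 and S3 of claim 10.
CRITERIA = {
    "primary": Criteria(0.15, 3.5, 5.0, 6, 5.0, None),
    "secondary": Criteria(0.15, 3.5, 5.0, 4, 5.0, 30.0),
    "tertiary": Criteria(None, 4.5, 5.0, 3, 8.0, 20.0),
}


def ppm_error(measured_mz: float, reference_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def pct_dev(measured: float, reference: float) -> float:
    """Relative deviation in percent (an assumed reading of 'deviation')."""
    return abs(measured - reference) / reference * 100.0


def count_matched(obs_frags, ref_frags, ppm_tol: float) -> int:
    """Count reference fragments that have an observed fragment within tolerance."""
    return sum(
        1 for r in ref_frags
        if any(abs(ppm_error(o, r)) < ppm_tol for o in obs_frags)
    )


def confirm(level: str, obs: dict, ref: dict, match_score=None) -> bool:
    """Return True only if every criterion of the given level is met."""
    c = CRITERIA[level]
    if c.rt_tol_min is not None and abs(obs["rt"] - ref["rt"]) > c.rt_tol_min:
        return False
    if abs(ppm_error(obs["mz"], ref["mz"])) >= c.ppm_tol:
        return False
    if pct_dev(obs["isotope"], ref["isotope"]) >= c.isotope_dev_pct:
        return False
    if count_matched(obs["fragments"], ref["fragments"], c.ppm_tol) < c.min_fragments:
        return False
    if pct_dev(obs["mobility"], ref["mobility"]) >= c.mobility_dev_pct:
        return False
    if c.min_match_score is not None:
        if match_score is None or match_score < c.min_match_score:
            return False
    return True


# Hypothetical database entry and screened component (illustrative numbers only).
REF = {
    "rt": 12.40, "mz": 515.3014, "isotope": 0.330, "mobility": 230.0,
    "fragments": [497.2909, 453.3010, 435.2905, 301.1809, 285.1860, 249.1496],
}
OBS = {
    "rt": 12.45, "mz": 515.3020, "isotope": 0.335, "mobility": 232.0,
    "fragments": [m + 0.0005 for m in REF["fragments"]],
}

if __name__ == "__main__":
    print(confirm("primary", OBS, REF))
```

In this sketch a component that fails the primary criteria, for example because only four fragments match, could still be confirmed against the secondary or tertiary database if a sufficient matching degree is passed as `match_score`.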
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311192034.0A CN117334257A (en) | 2023-09-15 | 2023-09-15 | Lucid ganoderma multi-level screening mass spectrum database and establishment method and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311192034.0A CN117334257A (en) | 2023-09-15 | 2023-09-15 | Lucid ganoderma multi-level screening mass spectrum database and establishment method and application thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117334257A (en) | 2024-01-02 |
Family
ID=89276375
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311192034.0A Pending CN117334257A (en) | 2023-09-15 | 2023-09-15 | Lucid ganoderma multi-level screening mass spectrum database and establishment method and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117334257A (en) |
-
2023
- 2023-09-15 CN CN202311192034.0A patent/CN117334257A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||