CN114594171B - Metabolome deep annotation method - Google Patents
- Publication number: CN114594171B
- Application number: CN202011407735.8A
- Authority: CN (China)
- Prior art keywords: metabolites, metabolite, molecular, mass spectrum, actual measurement
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G: PHYSICS
- G01: MEASURING; TESTING
- G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N30/00: Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
- G01N30/02: Column chromatography
- G01N30/26: Conditioning of the fluid carrier; Flow patterns
- G01N30/28: Control of physical parameters of the fluid carrier
- G01N30/34: Control of physical parameters of the fluid carrier of fluid composition, e.g. gradient
- G01N30/62: Detectors specially adapted therefor
- G01N30/72: Mass spectrometers
- G01N30/7233: Mass spectrometers interfaced to liquid or supercritical fluid chromatograph
- G01N30/724: Nebulising, aerosol formation or ionisation
- G01N30/7266: Nebulising, aerosol formation or ionisation by electric field, e.g. electrospray
- G01N30/86: Signal analysis
- G01N30/8675: Evaluation, i.e. decoding of the signal into analytical information
- G01N30/8679: Target compound analysis, i.e. whereby a limited number of peaks is analysed
- G01N30/8686: Fingerprinting, e.g. without prior knowledge of the sample components
Abstract
The invention discloses a deep annotation method for the metabolome of complex biological samples. An extract of the biological sample is subjected to non-targeted metabolomics analysis by ultra-high performance liquid chromatography-high resolution mass spectrometry to obtain chromatography-mass spectrometry information on its metabolome. Matching candidate metabolites are screened from a metabolomics database according to the experimental primary ion mass-to-charge ratios and experimental retention times in the non-targeted data, and a metabolite molecular structure association network is constructed from the molecular fingerprint similarity of the candidate metabolites. The metabolome is then identified on a large scale from the non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry experimental data, with the molecular structure association network as the background network. The method does not depend on a large-scale database of experimental secondary spectra and offers higher identification coverage and reliability.
Description
Technical Field
The invention relates to the fields of analytical chemistry and metabonomics, in particular to a metabolite deep annotation method based on a molecular structure association network.
Background Art
Metabolites are highly diverse and species-specific, and metabolite identification has long been a bottleneck in metabolomics and analytical chemistry. Non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry (UHPLC-HRMS) is one of the main technologies of metabolomics research, and with the continuing progress of high resolution mass spectrometry, generating high-throughput metabolomics data is no longer the main bottleneck. UHPLC-HRMS-based non-targeted metabolomics methods can detect up to tens of thousands of mass spectral peaks (metabolic features) in a single run, yet these typically correspond to fewer than 1000 metabolites, of which usually only a few hundred can be identified. Because so little of the non-targeted experimental data is annotated, many of the differential metabolites that are discovered have unknown structures and cannot be used in follow-up work such as functional mechanism studies.
Highly reliable metabolite identification by mass spectrometry typically requires search matching on accurate mass, retention time, and secondary mass spectra (MS/MS). Metabolome databases currently record a large number of endogenous metabolites, but they lack chromatographic retention times and contain few experimental secondary spectra; most recorded secondary spectra are theoretical predictions that differ considerably from measured spectra. In addition, secondary spectra acquired on different types of mass spectrometers reproduce poorly, which limits database searching and seriously hampers effective metabolite identification. A deep annotation method for non-targeted UHPLC-HRMS metabolome data is therefore urgently needed.
Disclosure of Invention
The invention provides a method for large-scale qualitative identification of the metabolome. To this end, a biological sample extract is subjected to non-targeted metabolomics analysis by ultra-high performance liquid chromatography-high resolution mass spectrometry, and the chromatography-mass spectrometry information on its metabolome is obtained; candidate metabolites are collected from a metabolome database based on the non-targeted metabolome data; a metabolite molecular structure association network is constructed from the molecular fingerprint similarity of the candidate metabolites; and the metabolome is identified on a large scale from the non-targeted experimental data, with the molecular structure association network as the background network. The technical scheme adopted by the invention comprises the following steps:
First, an extract of the biological sample to be tested is subjected to non-targeted metabolomics analysis by ultra-high performance liquid chromatography-high resolution mass spectrometry. Chromatography-mass spectrometry information on the extract metabolome is acquired, including the experimentally measured retention time of each metabolite peak, $t_{r,\mathrm{measured}}$; the primary mass spectrum information, i.e. the primary ion mass-to-charge ratio $m/z_{\mathrm{measured}}$; and the corresponding secondary mass spectrum information, i.e. the mass-to-charge ratios and intensities of the secondary ions. Primary ions are the ions collected directly after the compounds are ionized; secondary ions are the ions collected after the primary ions have been fragmented by collision at a given energy;
Second, a molecular structure database of candidate metabolites is constructed. The inputs are the primary ion mass-to-charge ratios $m/z_{\mathrm{measured}}$ and experimental retention times $t_{r,\mathrm{measured}}$ of all metabolites in the biological sample extract, obtained in the first step. The theoretical primary ion mass-to-charge ratio $m/z_{\mathrm{theoretical}}$ is obtained from the molecular formula of each metabolite in an open source metabolome database, and the predicted retention time $t_{r,\mathrm{predicted}}$ of each metabolite is obtained from a retention time prediction model constructed from known metabolite structure-retention relationships. The $m/z_{\mathrm{theoretical}}$ and $t_{r,\mathrm{predicted}}$ of the metabolites in the open source metabolome database are matched against the $m/z_{\mathrm{measured}}$ and $t_{r,\mathrm{measured}}$ of the experimental metabolite data. Metabolites that simultaneously satisfy

$$\left|t_{r,\mathrm{predicted}} - t_{r,\mathrm{measured}}\right| < 2\ \mathrm{min}$$

and

$$\frac{\left|m/z_{\mathrm{theoretical}} - m/z_{\mathrm{measured}}\right|}{m/z_{\mathrm{theoretical}}} \times 10^{6} < 5\ \mathrm{ppm}$$

are taken as candidate metabolites and used to construct a candidate metabolite database; the database contains the simplified molecular-input line-entry system (SMILES) string, name, molecular formula, molecular structure and predicted retention time of each metabolite;
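The screening rule of the second step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DbEntry` and `Peak` records, the function names and all numeric peak values are hypothetical, and an entry is kept as soon as any one measured peak satisfies both conditions.

```python
from dataclasses import dataclass

PPM_TOL = 5.0      # mass tolerance from the text, in ppm
RT_TOL_MIN = 2.0   # retention-time tolerance from the text, in minutes


@dataclass(frozen=True)
class DbEntry:
    name: str
    mz_theoretical: float
    rt_predicted: float  # minutes


@dataclass(frozen=True)
class Peak:
    mz_measured: float
    rt_measured: float  # minutes


def ppm_error(mz_theoretical: float, mz_measured: float) -> float:
    """Relative mass error in ppm, normalised by the theoretical m/z."""
    return abs(mz_theoretical - mz_measured) / mz_theoretical * 1_000_000


def is_candidate(entry: DbEntry, peak: Peak) -> bool:
    """True when both the RT and the mass condition hold (strict '<')."""
    rt_ok = abs(entry.rt_predicted - peak.rt_measured) < RT_TOL_MIN
    mz_ok = ppm_error(entry.mz_theoretical, peak.mz_measured) < PPM_TOL
    return rt_ok and mz_ok


def screen_candidates(db, peaks):
    """Keep every database entry matched by at least one measured peak."""
    return [e for e in db if any(is_candidate(e, p) for p in peaks)]


# Hypothetical data, for illustration only.
db = [
    DbEntry("entry_A", 383.1607, 6.62),
    DbEntry("entry_B", 409.2127, 8.90),
    DbEntry("entry_C", 330.1341, 5.91),
]
peaks = [
    Peak(383.1612, 7.10),   # about 1.3 ppm and 0.48 min from entry_A
    Peak(409.2127, 12.00),  # exact mass of entry_B but 3.1 min away
    Peak(330.1400, 5.90),   # about 17.9 ppm from entry_C
]
candidates = screen_candidates(db, peaks)
```

In this toy run only `entry_A` survives, because the other two entries each fail one of the two conditions.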
Third, a molecular structure association network of the metabolome is constructed. A molecular fingerprint is obtained from the molecular structure of each metabolite in the candidate metabolite database; the fingerprint may be any one of the Morgan, MACCS, atom-pair and Daylight fingerprints. The similarity between the molecular fingerprints of every pair of candidate metabolites is calculated with the open source tool RDKit. A similarity threshold is set; with metabolites as nodes and molecular fingerprint similarity as edges, metabolites whose fingerprint similarity is greater than or equal to the threshold are connected, giving a molecular structure association network at the metabolome level;
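The network construction of the third step can be sketched as follows. The text computes similarities with RDKit and does not name the similarity measure; this sketch assumes the Tanimoto coefficient and, to stay dependency-free, represents each fingerprint as a plain set of on-bit indices. The function names and the toy fingerprints are hypothetical.

```python
from itertools import combinations


def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def build_network(fingerprints: dict, threshold: float) -> dict:
    """Return an adjacency map: node -> {neighbour: similarity}.

    An edge is added when the similarity is >= threshold.
    """
    network = {name: {} for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        sim = tanimoto(fingerprints[a], fingerprints[b])
        if sim >= threshold:
            network[a][b] = sim
            network[b][a] = sim
    return network


# Toy fingerprints (hypothetical on-bit indices, not real molecules).
fps = {
    "met_1": frozenset({1, 2, 3, 4, 5, 6, 7, 8}),
    "met_2": frozenset({1, 2, 3, 4, 5, 6, 7, 9}),  # 7 shared bits, 9 in the union
    "met_3": frozenset({20, 21, 22}),              # shares nothing with the others
}
net = build_network(fps, threshold=0.7)
```

With a threshold of 0.7, `met_1` and `met_2` are connected (similarity 7/9) and `met_3` remains an isolated node.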
Fourth, the metabolites are identified on a large scale based on the molecular structure association network. The molecular structure association network constructed in the third step is taken as the background network and the candidate metabolite database as the reference. 5-50 metabolites are selected from the background network and, using standard samples of these metabolites, identified in the non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry metabolome experimental data. These 5-50 metabolites serve as seed metabolites: they are mapped into the established molecular structure association network and their adjacent metabolites are obtained from it, where adjacent metabolites are those directly connected to a seed by an edge in the network. The secondary mass spectrum of each seed metabolite is assigned to its adjacent metabolites as their pseudo secondary mass spectrum, and the search thresholds are set to $|t_{r,\mathrm{predicted}} - t_{r,\mathrm{measured}}| < 2$ min, $|m/z_{\mathrm{theoretical}} - m/z_{\mathrm{measured}}| / m/z_{\mathrm{theoretical}} \times 10^{6} < 5$ ppm, and a similarity of at least 0.5 between the experimental secondary mass spectrum of the metabolite peak and the pseudo secondary mass spectrum of the adjacent metabolite. The experimental data are searched for metabolite peaks that match the $m/z_{\mathrm{theoretical}}$, $t_{r,\mathrm{predicted}}$ and pseudo secondary mass spectrum of each adjacent metabolite; a successful match identifies the metabolite peak. The identified metabolites are used as new seeds, and the identification process is repeated until no new metabolites are identified. When there are several matching results, they are scored, and the metabolite peak with the higher score is regarded as the more accurate identification; the metabolite so identified is no longer used as a new seed.
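$$\mathrm{Score} = 0.25 \times \left(1 - \frac{\left|m/z_{\mathrm{theoretical}} - m/z_{\mathrm{measured}}\right| \times 10^{6}}{m/z_{\mathrm{theoretical}} \times 5}\right) + 0.25 \times \left(1 - \frac{\left|t_{r,\mathrm{predicted}} - t_{r,\mathrm{measured}}\right|}{2}\right) + 0.5 \times S_{\mathrm{MS/MS}}$$

where $t_{r,\mathrm{predicted}}$ is the retention time of the metabolite, $t_{r,\mathrm{measured}}$ is the experimental value, and $S_{\mathrm{MS/MS}}$ is the secondary spectrum similarity.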
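As a worked illustration of the score, the following sketch evaluates the three weighted terms for one match. The function name and the example numbers are hypothetical; the secondary spectrum similarity is taken as a given input because the text does not specify how it is computed.

```python
def match_score(mz_theoretical, mz_measured, rt_predicted, rt_measured,
                msms_similarity, ppm_tol=5.0, rt_tol_min=2.0):
    """Weighted score: 0.25 mass term + 0.25 RT term + 0.5 MS/MS similarity."""
    ppm = abs(mz_theoretical - mz_measured) / mz_theoretical * 1_000_000
    mass_term = 1.0 - ppm / ppm_tol                      # 1 at 0 ppm, 0 at 5 ppm
    rt_term = 1.0 - abs(rt_predicted - rt_measured) / rt_tol_min  # 1 at 0 min, 0 at 2 min
    return 0.25 * mass_term + 0.25 * rt_term + 0.5 * msms_similarity


# Hypothetical match: 2.5 ppm mass error, 1.0 min RT error, similarity 0.8.
score = match_score(400.0, 400.001, 7.0, 8.0, 0.8)

# A perfect match on all three criteria gives the maximum score of 1.
perfect = match_score(400.0, 400.0, 7.0, 7.0, 1.0)
```

Here the mass and retention time terms each contribute 0.25 × 0.5 and the spectrum term contributes 0.5 × 0.8, so the score is about 0.65. When several peaks match the same metabolite, the one with the largest score would be selected, for example with `max(...)` over the candidate matches.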
On the premise that structurally similar metabolites have similar MS/MS spectra, the invention establishes a large-scale identification method, guided by experimental data and based on a molecular structure association network, that achieves structural identification of unknown metabolites. By establishing a candidate metabolite database and the molecular structure association network of the candidate metabolites, the network is used to identify metabolites that have no standard MS/MS spectrum, so that structural identification no longer depends on a large-scale standard MS/MS database. The invention is thus a metabolome deep annotation method that is independent of any large-scale database of experimental secondary spectra; it enables large-scale, reliable metabolome annotation and markedly expands annotation coverage.
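The repeated seed expansion described in the fourth step can be sketched as a breadth-first traversal of the network. This is a minimal sketch under assumptions: `propagate` and `find_match` are hypothetical names, `find_match` stands in for the full search on mass, retention time and spectrum similarity, and the experimental spectrum of a newly identified peak is assumed to become the pseudo spectrum for that metabolite's own neighbours.

```python
from collections import deque


def propagate(network, seed_spectra, find_match):
    """Expand identifications from seeds until no new metabolite is found.

    network:      node -> iterable of adjacent nodes
    seed_spectra: seed metabolite -> its experimental MS/MS spectrum
    find_match:   callable(metabolite, pseudo_spectrum) -> matched
                  experimental spectrum, or None when nothing matches
    """
    identified = dict(seed_spectra)
    queue = deque(seed_spectra)
    while queue:
        seed = queue.popleft()
        for neighbour in network.get(seed, ()):
            if neighbour in identified:
                continue  # already identified, do not search again
            matched = find_match(neighbour, identified[seed])
            if matched is not None:
                identified[neighbour] = matched
                queue.append(neighbour)  # newly identified: becomes a seed
    return identified


# Toy chain A - B - C - D; only B and C have a detectable peak.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
detectable = {"B": "spec_B", "C": "spec_C"}


def toy_find_match(metabolite, pseudo_spectrum):
    # Placeholder search: ignores the pseudo spectrum and only checks
    # whether a peak for this metabolite exists in the toy data.
    return detectable.get(metabolite)


result = propagate(network, {"A": "spec_A"}, toy_find_match)
```

Starting from the single seed `A`, the toy run identifies `B` and then `C`, and stops at `D` because no peak matches it.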
Drawings
FIG. 1 is a molecular structure association network (metabolite molecular fingerprint similarity threshold of 0.7);
FIG. 2 is a partial enlarged view of the molecular structure association network;
FIG. 3 is a schematic diagram of the metabolite identification process based on the molecular structure association network;
FIG. 4A is the molecular structure association network obtained from the maize filament data in mass spectrometry positive ion mode;
FIG. 4B is the molecular structure association network obtained from the maize filament data in mass spectrometry negative ion mode.
Detailed Description
The invention is described in detail below with reference to the attached drawings: the present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are provided, but the protection scope of the present invention is not limited to the following embodiments.
Example 1
To confirm the effectiveness and feasibility of the invention, a mixed standard consisting of 173 hydroxycinnamamides (including N-cinnamoyl-putrescine, N-(p-coumaroyl)-cadaverine, N-(p-coumaroyl)-agmatine, N'-Caffeoyl-feruloyl-putrescine, N',N''-Caffeoyl-feruloyl-spermidine, N,N',N''-Tris-feruloyl-spermine, etc.) was added to a plant extract at a final concentration of 100 to 200 ng/mL. Ultra-high performance liquid chromatography-high resolution mass spectrometry data were acquired on the spiked plant extract, and the identification of the hydroxycinnamamides in the collected non-targeted metabolomics data is used as an example to illustrate the principle of the invention.
Extraction of the plant tissue metabolome: the metabolites in maize filament were extracted by a plant metabolomics method. First, 50 mg of maize filament powder was weighed into a 1.5 mL centrifuge tube, 1.0 mL of methanol/water (4:1, v/v) extractant was added, and the mixture was vortexed for 5 min and centrifuged at 15000 rpm and 4 °C for 10 min. 700 μL of the supernatant was lyophilized in a vacuum centrifugal concentrator. 100 μL of methanol/water (4:1, v/v) was added to the lyophilized sample, which was vortexed for 1 min and centrifuged at 15000 rpm and 4 °C for 10 min in a high-speed centrifuge.
Non-targeted chromatography-mass spectrometry information acquisition: data were acquired on an ACQUITY UHPLC ultra-high performance liquid chromatography system (UPLC, Waters, Milford, MA, USA) coupled to a Q Exactive HF high resolution mass spectrometer (Thermo Fisher Scientific, Rockford, IL, USA).
The liquid chromatography conditions in the positive ion mode of the electrospray ionization source were as follows: mobile phases A and B were 0.1% formic acid in water (v/v) and 0.1% formic acid in acetonitrile (v/v), respectively, at a flow rate of 0.35 mL/min. The elution gradient started at 5% B and was held for 1 min; it increased linearly to 100% B at 23 min and was held for 4 min, then returned linearly to the initial composition within 0.1 min and was held for 2.9 min, for a total analysis time of 30 min. The sample was separated on an ACQUITY BEH C18 column (100 mm × 2.1 mm, 1.7 μm; Waters, Milford, MA, USA). The column temperature was 50 °C, the sample chamber temperature was set to 4 °C, and the injection volume was 5 μL.
The liquid chromatography conditions in the negative ion mode of the electrospray ionization source were as follows: mobile phases A and B were 6.5 mM ammonium bicarbonate in water and 6.5 mM ammonium bicarbonate in 95% methanol/water (v/v), respectively. The flow rate was 0.35 mL/min. The elution gradient started at 2% B and was held for 1 min, increased linearly to 100% B at 18 min and was held until 22 min, then returned to the initial composition at 22.1 min and was held until 25 min. The sample was separated on an ACQUITY HSS T3 column (100 mm × 2.1 mm, 1.8 μm; Waters, Milford, MA, USA). The column temperature was 50 °C, the sample chamber temperature was set to 4 °C, and the injection volume was 5 μL.
The Q Exactive HF mass spectrometry conditions were as follows. The scan mode was full scan with automatically triggered data-dependent secondary mass spectrometry (full MS/data-dependent MS²). For the full scan, the resolution was 120,000, and the automatic gain control target (AGC target) and maximum injection time (maximum IT) were set to 3×10⁶ ions and 100 ms, respectively; the full-scan mass range was m/z 85-1250. For the secondary mass spectra, the AGC target and maximum IT were set to 1×10⁵ ions and 50 ms, respectively, and the isolation window was m/z 1.0. The collision energy was a stepped normalized collision energy (NCE) of 15%, 30% and 45%. Acquisition of secondary mass spectra was triggered by the 10 most intense ions in each full-scan cycle. An inclusion list was added and set to on. The electrospray voltages in positive and negative ion mode were 3.5 kV and 3.0 kV, respectively; the ion transfer tube temperature was 320 °C and the auxiliary gas temperature was 350 °C. The sheath gas and auxiliary gas flow rates were 45 and 10 (arbitrary units), respectively, and the S-lens was set to 50.0 (arbitrary units).
Acquisition of experimental chromatography-mass spectrometry information: from the non-targeted metabolomics data of the spiked extract, a peak table containing the experimental retention time $t_{r,\mathrm{measured}}$ and the primary mass spectrum information, i.e. the primary ion mass-to-charge ratio $m/z_{\mathrm{measured}}$, was obtained with the software Compound Discoverer 3.1 and exported as an Excel table. The raw data were converted with the software ProteoWizard into .mgf files containing the corresponding secondary mass spectrum information, i.e. the mass-to-charge ratios and intensities of the secondary ions. The $m/z_{\mathrm{measured}}$ and $t_{r,\mathrm{measured}}$ of each metabolite peak in the experimental data were matched to the corresponding secondary mass spectrum with a mass window of 10 ppm and a retention time window of 10 s. From the collected non-targeted metabolomics data, the experimental retention time $t_{r,\mathrm{measured}}$, the primary ion mass-to-charge ratio $m/z_{\mathrm{measured}}$ and the corresponding secondary mass spectrum information of the 173 hydroxycinnamamides were extracted.
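The pairing of peak-table entries with secondary spectra can be sketched as below. This is an illustration only: the function name, the dictionary keys and all values are hypothetical, the windows are assumed to be symmetric (within 10 ppm and within 10 s on either side), and retention times are assumed to be in minutes in the peak table and in seconds in the spectrum records.

```python
def link_msms(peaks, spectra, ppm_tol=10.0, rt_tol_s=10.0):
    """Map each MS1 peak to the titles of the MS/MS spectra that fall
    inside its mass and retention-time windows.

    peaks:   peak id -> (m/z, retention time in minutes)
    spectra: list of dicts with 'title', 'precursor_mz' and 'rt_s'
    """
    links = {}
    for peak_id, (mz, rt_min) in peaks.items():
        hits = []
        for spec in spectra:
            ppm = abs(spec["precursor_mz"] - mz) / mz * 1_000_000
            rt_diff_s = abs(spec["rt_s"] - rt_min * 60.0)
            if ppm <= ppm_tol and rt_diff_s <= rt_tol_s:
                hits.append(spec["title"])
        links[peak_id] = hits
    return links


# Hypothetical peak table and spectrum records, for illustration only.
peaks = {"peak_1": (383.1607, 6.62), "peak_2": (409.2127, 8.90)}
spectra = [
    {"title": "scan_100", "precursor_mz": 383.1610, "rt_s": 400.0},  # inside both windows of peak_1
    {"title": "scan_200", "precursor_mz": 383.1610, "rt_s": 460.0},  # about 63 s from peak_1
    {"title": "scan_300", "precursor_mz": 409.2200, "rt_s": 534.0},  # about 18 ppm from peak_2
]
links = link_msms(peaks, spectra)
```

In this toy data `peak_1` is linked to `scan_100` only, and `peak_2` has no linked spectrum.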
Construction of the retention time prediction model: 127 hydroxycinnamamide standards (including N-(p-Coumaroyl)-spermidine, N-Sinapoyl-tyramine, N'-Cinnamoyl-Sinapoyl-putrescine, N'-(p-Coumaroyl)-bis-caffeoyl-spermidine, etc.) were analyzed under the same ultra-high performance liquid chromatography-high resolution mass spectrometry data acquisition conditions as the plant extract to obtain their measured liquid chromatography retention times. The SDF file of each standard was submitted to the open source website ChemDes (http://www.scbdd.com/ChemDes) to calculate its 1D and 2D molecular descriptors. A retention time prediction model was then built by multiple linear regression with the stepwise method, taking the liquid chromatography retention time as the dependent variable and the molecular descriptors as independent variables.
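The regression at the core of such a model can be sketched in plain Python. This sketch fits an ordinary multiple linear regression by solving the normal equations; it does not perform the stepwise descriptor selection mentioned above, and the two descriptors and all numbers are made up so that the fit can be checked by hand (the toy retention times follow 1.0 + 2.0·x1 + 0.5·x2 exactly).

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Normal equations: (X^T X) coef = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    M = [A[i] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (M[i][n] - s) / M[i][i]
    return coef  # [intercept, slope_1, slope_2, ...]


def predict(coef, descriptors):
    """Predicted retention time for one descriptor vector."""
    return coef[0] + sum(c * d for c, d in zip(coef[1:], descriptors))


# Made-up descriptor values (x1, x2) and retention times in minutes.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [4.0, 5.5, 9.0, 10.5, 13.5]
coef = fit_linear(X, y)
rt_new = predict(coef, (2, 2))
```

On this toy data the fit recovers an intercept of 1.0 and slopes of 2.0 and 0.5, so a compound with descriptors (2, 2) is predicted at 6.0 min. A real model would use many more descriptors and a selection step to avoid overfitting.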
Candidate metabolites were collected from the open-source plant hydroxycinnamamide metabolome database (https://pubs.acs.org/doi/abs/10.1021/acs.analchem.8b03654), which records 846 hydroxycinnamamides. First, the theoretical primary ion mass-to-charge ratio $m/z_{\text{theory}}$ of each hydroxycinnamamide was obtained from its molecular formula in the database, and the predicted retention time $t_{r,\text{predicted}}$ of all 846 hydroxycinnamamides was calculated with the retention time prediction model constructed above. The database was then searched with the primary ion mass-to-charge ratio $m/z_{\text{measured}}$ and experimental retention time $t_{r,\text{measured}}$ of the 173 hydroxycinnamamides obtained from the non-targeted metabonomics experiment on the standard-spiked plant extract. Database entries that simultaneously satisfied

$$|t_{r,\text{predicted}} - t_{r,\text{measured}}| < 2\ \text{min}$$

and

$$\frac{|m/z_{\text{theory}} - m/z_{\text{measured}}|}{m/z_{\text{theory}}} \times 10^{6} < 5\ \text{ppm}$$

were retained, giving 220 hydroxycinnamamides as candidate metabolites. Their SMILES, names, molecular formulas, molecular structures and predicted retention times were collected to construct the candidate hydroxycinnamamide database.
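The two screening conditions can be applied as in the sketch below. The function and field names are hypothetical. The first two database entries and the two measured peaks use values quoted in this example, while the third entry is an invented non-matching compound added only to show the filter rejecting something.

```python
def ppm_error(mz_theory, mz_measured):
    """Relative mass error in ppm, normalised by the theoretical m/z."""
    return abs(mz_theory - mz_measured) / mz_theory * 1e6

def is_candidate(entry, peaks, rt_tol_min=2.0, ppm_tol=5.0):
    """True if at least one measured peak satisfies both screening conditions."""
    return any(
        abs(entry["rt_pred"] - rt) < rt_tol_min
        and ppm_error(entry["mz_theory"], mz) < ppm_tol
        for mz, rt in peaks
    )

database = [
    {"name": "N-Sinapoyl-serotonin", "mz_theory": 383.1607, "rt_pred": 6.62},
    {"name": "N-Caffeoyl-serotonin", "mz_theory": 339.1345, "rt_pred": 6.19},
    {"name": "hypothetical non-matching entry", "mz_theory": 500.0000, "rt_pred": 6.50},
]
# Measured peaks as (m/z, retention time in min)
measured_peaks = [(383.1594, 6.97), (339.1332, 5.89)]

candidates = [e for e in database if is_candidate(e, measured_peaks)]
```

Running the filter over the full database in this way would yield the candidate list from which the candidate database is built.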
Construction of the molecular structure association network: Morgan fingerprints were obtained from the molecular structures of the candidate hydroxycinnamamides, and the similarity between the Morgan fingerprints of every pair of candidates was calculated. With the molecular fingerprint similarity threshold set to 0.7, the candidate hydroxycinnamamides were taken as nodes, and every pair whose Morgan fingerprint similarity was greater than or equal to the threshold was connected by an edge. The resulting molecular structure association network contains 220 nodes and 3866 edges.
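The network construction can be illustrated with the sketch below. Two assumptions are made: similarity is computed as the Tanimoto coefficient, which is a common choice for fingerprints but is not stated in the text, and each fingerprint is given as a set of on-bit indices. The three fingerprints are invented; in practice the bits would come from a cheminformatics toolkit such as RDKit.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0

def build_network(fingerprints, threshold=0.7):
    """Return an adjacency dict: node -> {adjacent node: similarity}."""
    network = {name: {} for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        similarity = tanimoto(fingerprints[a], fingerprints[b])
        if similarity >= threshold:
            network[a][b] = similarity
            network[b][a] = similarity
    return network

# Hypothetical on-bit sets for three candidates
fingerprints = {
    "A": {1, 2, 3, 4, 5, 6, 7, 8},
    "B": {1, 2, 3, 4, 5, 6, 7, 9},  # shares 7 of 9 bits with A: 0.78
    "C": {1, 2, 20, 21, 22},        # shares 2 of 11 bits with A and with B: 0.18
}
network = build_network(fingerprints, threshold=0.7)
edge_count = sum(len(neighbors) for neighbors in network.values()) // 2
```

Here only A and B are connected, so B is the single adjacent metabolite of A and C remains isolated at this threshold.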
Metabolite identification based on the molecular structure association network: the constructed molecular structure association network was used as the background network to identify the metabolites in the standard-spiked extract measured by non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry metabonomics. The specific identification procedure is as follows:
1) Using standard samples, 6 hydroxycinnamamides were identified as seed metabolites in the non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry metabolome data of the standard-spiked plant extract. The seed metabolites were mapped into the established molecular structure association network, and their adjacent metabolites were obtained from the network; adjacent metabolites are metabolites directly connected to the seed by an edge in the molecular structure association network. FIG. 2 shows a partial enlarged view of the molecular structure association network, in which seed metabolite 1 is N-Caffeoyl-5-methoxytryptamine. It has 5 adjacent metabolites: adjacent metabolite 1 is N-Sinapoyl-serotonin, adjacent metabolite 2 is N,N'-Feruloyl-cinnamoyl-cadaverine, adjacent metabolite 3 is N,N'-(p-Coumaroyl)-feruloyl-agmatine, adjacent metabolite 4 is N-Feruloyl-octopamine and adjacent metabolite 5 is N-Caffeoyl-serotonin. The $m/z_{\text{theory}}$ and $t_{r,\text{predicted}}$ of adjacent metabolites 1 to 5 are m/z 383.1607, 6.62 min; m/z 409.2127, 8.90 min; m/z 453.2138, 8.14 min; m/z 330.1341, 5.91 min; and m/z 339.1345, 6.19 min, respectively.
2) The secondary mass spectrum of the seed metabolite is assigned to each adjacent metabolite as its "pseudo-secondary mass spectrum". The search thresholds are set as follows:

$$|t_{r,\text{predicted}} - t_{r,\text{measured}}| < 2\ \text{min},$$

$$\frac{|m/z_{\text{theory}} - m/z_{\text{measured}}|}{m/z_{\text{theory}}} \times 10^{6} < 5\ \text{ppm},$$

and the similarity between the experimental secondary mass spectrum of the metabolite peak and the pseudo-secondary mass spectrum of the adjacent metabolite is greater than or equal to 0.5.
The identification procedure is illustrated as follows. As shown in FIG. 3, the secondary spectrum of seed metabolite 1 (the red spectrum in the figure) is taken as the "pseudo-secondary spectrum" of its 5 adjacent metabolites. For each adjacent metabolite, the experimental data are searched for a metabolite peak that matches its $m/z_{\text{theory}}$, its $t_{r,\text{predicted}}$ and its pseudo-secondary spectrum. A metabolite peak with a retention time of 6.97 min and an [M+H]+ ion at m/z 383.1594 was found in the experimental data. Compared with adjacent metabolite 1 (N-Sinapoyl-serotonin), this peak gives $|t_{r,\text{predicted}} - t_{r,\text{measured}}| = 0.35$ min and $\Delta m = |m/z_{\text{theory}} - m/z_{\text{measured}}| / m/z_{\text{theory}} \times 10^{6} = 3.4$ ppm, and the similarity between its experimental secondary spectrum (blue) and the pseudo-secondary spectrum (red) of adjacent metabolite 1 is 0.86. The metabolite peak was therefore identified as N-Sinapoyl-serotonin. In the same way, 3 metabolite peaks with ($m/z_{\text{measured}}$, $t_{r,\text{measured}}$, secondary spectrum similarity) of m/z 409.2109, 9.34 min, 0.78; m/z 453.2118, 7.92 min, 0.76; and m/z 330.1330, 5.71 min, 0.86 matched adjacent metabolites 2, 3 and 4, respectively, and these 3 metabolite peaks were also successfully identified.
3) When several matching results are found in the experimental data for one adjacent metabolite, the matching results are scored according to the following rule:

$$\text{score} = 0.25 \times \left(1 - \frac{|m/z_{\text{theory}} - m/z_{\text{measured}}| \times 10^{6}}{m/z_{\text{theory}} \times 5}\right) + 0.25 \times \left(1 - \frac{|t_{r,\text{predicted}} - t_{r,\text{measured}}|}{2}\right) + 0.5 \times \text{secondary spectrum similarity}$$
For example, 3 metabolite peaks matching adjacent metabolite 5 were found in the experimental data, all of which meet the search thresholds. Their $m/z_{\text{measured}}$, $t_{r,\text{measured}}$ and secondary mass spectrum similarity are m/z 339.1332, 5.89 min, 0.77; m/z 339.1330, 5.47 min, 0.61; and m/z 339.1335, 6.63 min, 0.63, respectively. Scoring these 3 results gives 0.66, 0.50 and 0.62, respectively. The identification results are output in descending order of score; a higher score indicates a more reliable identification. Metabolite peaks identified in this case are no longer used as seeds in the next round of identification.
4) The metabolites identified above are then used as new seeds, and the identification procedure is repeated until no new metabolite peaks are identified. For example, the metabolite peak (m/z 383.1594, 6.97 min) was successfully identified in the experimental data as N-Sinapoyl-serotonin (adjacent metabolite 1 in FIG. 2), and its experimental secondary spectrum was assigned to next-order adjacent metabolite 1 in FIG. 2 (N,N',N''-Feruloyl-bis-cinnamoyl-spermidine) as its "pseudo-secondary spectrum". The $m/z_{\text{theory}}$ and $t_{r,\text{predicted}}$ of next-order adjacent metabolite 1 are 582.2968 and 11.19 min. A metabolite peak meeting the thresholds (m/z 582.2948, 11.65 min) was found in the experimental data, and the similarity between its experimental secondary spectrum and the pseudo-secondary spectrum is 0.75, so the match is successful. The metabolite peak (m/z 582.2948, 11.65 min) was identified as N,N',N''-Feruloyl-bis-cinnamoyl-spermidine and used as a new seed to repeat the above identification procedure.
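Steps 1) to 4) can be summarised in the following sketch of the seed propagation. Several parts are assumptions rather than the patented implementation: the function names are hypothetical, the retention time term of the score is normalised by the 2 min window in the same way as the mass term is normalised by 5 ppm, the spectral similarity is supplied as a lookup of the similarity values quoted in the text instead of being computed from spectra, and the seed is given a placeholder peak whose m/z and retention time are invented. The remaining theoretical and measured values are those quoted above.

```python
from collections import deque

def ppm_error(mz_theory, mz_measured):
    return abs(mz_theory - mz_measured) / mz_theory * 1e6

def match_score(ppm, delta_rt, similarity, ppm_tol=5.0, rt_tol=2.0):
    """25 % mass accuracy + 25 % retention time agreement + 50 % spectrum similarity."""
    return 0.25 * (1 - ppm / ppm_tol) + 0.25 * (1 - delta_rt / rt_tol) + 0.5 * similarity

def propagate(network, metabolites, peaks, seeds, similarity,
              ppm_tol=5.0, rt_tol=2.0, sim_min=0.5):
    """Breadth-first annotation; returns metabolite -> [(score, peak index), ...]."""
    annotations = {}
    queue = deque(seeds.items())  # (metabolite name, index of its identified peak)
    while queue:
        seed_name, seed_peak = queue.popleft()
        pseudo_ms2 = peaks[seed_peak]["ms2"]  # seed spectrum lent to its neighbors
        for neighbor in network.get(seed_name, ()):
            if neighbor in seeds or neighbor in annotations:
                continue
            mz_theory, rt_pred = metabolites[neighbor]
            hits = []
            for index, peak in enumerate(peaks):
                ppm = ppm_error(mz_theory, peak["mz"])
                delta_rt = abs(rt_pred - peak["rt"])
                sim = similarity(pseudo_ms2, peak["ms2"])
                if ppm < ppm_tol and delta_rt < rt_tol and sim >= sim_min:
                    hits.append((match_score(ppm, delta_rt, sim, ppm_tol, rt_tol), index))
            if hits:
                annotations[neighbor] = sorted(hits, reverse=True)
                if len(hits) == 1:  # ambiguous matches are ranked but not reused as seeds
                    queue.append((neighbor, hits[0][1]))
    return annotations

SPERMIDINE = "N,N',N''-Feruloyl-bis-cinnamoyl-spermidine"
network = {
    "seed 1": ["N-Sinapoyl-serotonin", "N-Caffeoyl-serotonin"],
    "N-Sinapoyl-serotonin": ["seed 1", SPERMIDINE],
    "N-Caffeoyl-serotonin": ["seed 1"],
    SPERMIDINE: ["N-Sinapoyl-serotonin"],
}
metabolites = {  # name -> (theoretical m/z, predicted retention time in min)
    "N-Sinapoyl-serotonin": (383.1607, 6.62),
    "N-Caffeoyl-serotonin": (339.1345, 6.19),
    SPERMIDINE: (582.2968, 11.19),
}
peaks = [
    {"mz": 353.1500, "rt": 6.50, "ms2": "s0"},  # placeholder seed peak (invented values)
    {"mz": 383.1594, "rt": 6.97, "ms2": "s1"},
    {"mz": 339.1332, "rt": 5.89, "ms2": "s2"},
    {"mz": 339.1330, "rt": 5.47, "ms2": "s3"},
    {"mz": 339.1335, "rt": 6.63, "ms2": "s4"},
    {"mz": 582.2948, "rt": 11.65, "ms2": "s5"},
]
# Stand-in for a real spectral comparison: the similarities quoted in the text.
quoted = {("s0", "s1"): 0.86, ("s0", "s2"): 0.77, ("s0", "s3"): 0.61,
          ("s0", "s4"): 0.63, ("s1", "s5"): 0.75}
annotations = propagate(network, metabolites, peaks, {"seed 1": 0},
                        lambda a, b: quoted.get((a, b), 0.0))
```

Under these assumptions the three peaks competing for N-Caffeoyl-serotonin are ranked in the same order as reported, and the recomputed scores agree with the reported 0.66, 0.62 and 0.50 to within about 0.01. The sketch does not prevent one peak from being assigned to several metabolites, which a complete implementation would need to handle.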
Using this method with 6 hydroxycinnamamides as the initial seed metabolites, 167 hydroxycinnamamides were successfully identified, and the accuracy of the identification results was 98.8%. Of these, 141 were ranked first, 19 second, 5 third and 2 fourth. The reason some were not ranked first is that 80 of the 169 hydroxycinnamamides have isomers with similar retention times and similar secondary mass spectra.
The identification results were compared with the conventional database search method. The database ReSpect (http://spectra.psc.riken.jp/) contains only 23 hydroxycinnamamides and METLIN (https://metlin.scripps.edu) contains 44. Moreover, these databases contain almost no secondary spectra of hydroxycinnamamides, so a search can rely only on the primary ion mass-to-charge ratio; the reliability of the identification results cannot be guaranteed and the coverage is limited.
The results show that the metabolite identification method based on the molecular structure association network does not depend on a large-scale database of experimental secondary spectra and still achieves reliable identification, and that the coverage of metabolome annotation can be significantly expanded by using open-source structure databases.
Example 2
The invention is used to identify the metabolites in an actual biological sample extract. The metabolome of a plant tissue (maize silk) was extracted, ultra-high performance liquid chromatography-high resolution mass spectrometry data were acquired for the maize silk tissue extract, and the resulting non-targeted metabolome data were subjected to qualitative analysis.
The procedure and conditions are the same as in Example 1, except as noted below:
Extraction of plant tissue metabolome: as in Example 1.
Non-targeted metabonomics data acquisition: as in Example 1.
Acquisition of experimental chromatography-mass spectrometry information: based on the non-targeted metabonomics data of the maize silk extract, a peak table was obtained using the software Compound Discoverer 3.1. The table includes the experimental retention time $t_{r,\text{measured}}$ and the primary mass spectrum information, i.e. the primary ion mass-to-charge ratio $m/z_{\text{measured}}$, and was exported as an Excel table. The raw data were converted with the software ProteoWizard into an .mgf file containing the corresponding secondary mass spectrum information, i.e. the mass-to-charge ratios and intensities of the secondary ions.
Construction of the retention time prediction models: 254 standard samples (including 1,3-Dihydroxyacetone, benzoic acid, methionine sulfoxide, 7-Methoxycoumarin, Vibralactone B, nardosinone, etc.) were analyzed in positive ion mode and 327 standard samples (including 3-Hydroxypropanoic acid, 2-hydroxyquinoline, coixol, 6-Benzylaminopurine, quercetin, daphnoretin, etc.) were analyzed in negative ion mode, and their experimental liquid chromatography retention times were obtained. The 1D and 2D molecular descriptors of each standard sample were calculated from its SDF file on the open-source website ChemDes (http://www.scbdd.com/ChemDes). Retention time prediction models for the positive ion mode and the negative ion mode were then constructed separately by multiple linear regression using the stepwise method, with the liquid chromatography retention time as the dependent variable and the molecular descriptors as independent variables.
Candidate metabolites were collected from the open-source metabolome databases Universal Natural Products Database, UNPD (http://pkuxxj.pku.edu.cn/UNPD/), Plant Metabolic Network (https://plantcyc.org/) and KEGG (https://www.genome.jp/kegg/). First, the theoretical primary ion mass-to-charge ratio $m/z_{\text{theory}}$ of each metabolite was obtained from its molecular formula in the databases, and the predicted retention time $t_{r,\text{predicted}}$ of each metabolite was calculated with the retention time prediction models described above. The databases were then searched with the primary ion mass-to-charge ratio $m/z_{\text{measured}}$ and experimental retention time $t_{r,\text{measured}}$ of the metabolite peaks obtained from the non-targeted metabonomics experiment on the plant extract. Database entries that simultaneously satisfied

$$|t_{r,\text{predicted}} - t_{r,\text{measured}}| < 2\ \text{min}$$

and

$$\frac{|m/z_{\text{theory}} - m/z_{\text{measured}}|}{m/z_{\text{theory}}} \times 10^{6} < 5\ \text{ppm}$$

were taken as candidate metabolites. Their SMILES, names, molecular formulas, molecular structures and predicted retention times were collected to construct the candidate metabolite database.
Construction of the molecular structure association network: Morgan fingerprints of the candidate metabolites were obtained from their molecular structures, and the similarity between the Morgan fingerprints of every pair of candidate metabolites was calculated. With the molecular fingerprint similarity threshold set to 0.6, the candidate metabolites were taken as nodes, and every pair whose Morgan fingerprint similarity was greater than or equal to the threshold was connected by an edge. The molecular structure association network in positive ion mode comprises 1965 metabolites (nodes) and 28199 edges (see FIG. 4A); the network in negative ion mode comprises 1945 metabolites (nodes) and 34451 edges (see FIG. 4B).
Metabolite identification based on the molecular structure association network: the constructed molecular structure association network was used as the background network to identify the experimental data collected by non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry metabonomics and to determine the metabolites in the biological sample to be tested. The identification procedure is the same as in Example 1.
This process shows that abundant candidate metabolites can be obtained from the metabolome data of a complex plant tissue extract. When the similarity between the Morgan fingerprints of the candidate metabolites is calculated and the molecular fingerprint similarity threshold is set to 0.6, a complete, connected network can be formed, which enables large-scale identification of the metabolome.
Claims (4)
1. A metabolome deep annotation method, characterized by:
firstly, performing non-targeted metabonomics analysis on the plant extract to be tested by ultra-high performance liquid chromatography-high resolution mass spectrometry, and obtaining the chromatography-mass spectrometry information of the extract metabolome, including the experimentally detected retention time $t_{r,\text{measured}}$ of each metabolite peak, the mass-to-charge ratio $m/z_{\text{measured}}$ of the primary mass spectrum ion, and the mass-to-charge ratios and intensities of the corresponding secondary mass spectrum ions;
secondly, constructing a candidate metabolite molecular structure database: according to the primary ion mass-to-charge ratio $m/z_{\text{measured}}$ and experimental retention time $t_{r,\text{measured}}$ of all metabolites in the biological sample extract to be tested obtained in the first step, screening from an open-source metabonomics database the metabolites that match the primary ion mass-to-charge ratio $m/z_{\text{measured}}$ and experimental retention time $t_{r,\text{measured}}$ as candidate metabolites, and constructing a candidate metabolite database; the database contains the simplified molecular-input line-entry system (SMILES) string, name, molecular formula, molecular structure and predicted retention time of each metabolite;
thirdly, constructing a metabolite molecular structure association network: obtaining molecular fingerprints from the molecular structures of the metabolites in the candidate metabolite database; calculating the similarity between the molecular fingerprints of any two candidate metabolites, the similarity calculation being based on the open-source tool RDKit; setting the similarity threshold between molecular fingerprints to 0.5-0.8, taking the metabolites as nodes and the molecular fingerprint similarity as edges, and connecting the metabolites whose molecular fingerprint similarity is greater than or equal to the threshold to construct the molecular structure association network;
fourthly, performing metabolite identification based on the molecular structure association network: using the molecular structure association network constructed in the third step as the background network, identifying the experimental data collected by non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry, and determining the metabolites in the biological sample to be tested;
in the first step, the primary mass spectrum ions are the ions acquired directly by the mass spectrometer after ionization of a compound; the secondary mass spectrum ions are the ions acquired after the primary ions are fragmented by collision at a certain applied energy;
in the second step, the candidate metabolites are obtained as follows: the theoretical primary mass spectrum mass-to-charge ratio $m/z_{\text{theory}}$ in positive and negative ionization modes is obtained from the molecular formulas of the metabolites in a public metabolome database, and the predicted retention time $t_{r,\text{predicted}}$ is obtained from the structural parameters of the metabolites; the inclusion criteria for candidate metabolites are that both of the following are satisfied:

$$|t_{r,\text{predicted}} - t_{r,\text{measured}}| < 2\ \text{min}$$

and

$$\frac{|m/z_{\text{theory}} - m/z_{\text{measured}}|}{m/z_{\text{theory}}} \times 10^{6} < 5\ \text{ppm};$$
in the fourth step, the metabolite identification method based on the molecular structure association network is as follows: taking the candidate metabolite database as reference, 5-50 metabolites are selected from it and, using standard samples of these 5-50 metabolites, are identified as seed metabolites in the non-targeted ultra-high performance liquid chromatography-high resolution mass spectrometry metabolome experimental data; the seed metabolites are mapped into the established molecular structure association network, and the adjacent metabolites of the seed metabolites are obtained from the network; the secondary mass spectrum of a seed metabolite is assigned to each adjacent metabolite as its pseudo-secondary mass spectrum; a search threshold is set, and the experimental data are searched for metabolite peaks matching the $m/z_{\text{theory}}$, $t_{r,\text{predicted}}$ and pseudo-secondary mass spectrum of the adjacent metabolite; if the match is successful, the metabolite peak is identified; the identified metabolites are used as new seeds, and the identification procedure is repeated until no new metabolites are identified;
when there are several matching results, the matching results are scored and sorted from high to low by score, a metabolite peak with a higher score being identified with higher accuracy, and the metabolites identified in this case are no longer used as new seeds;
the search threshold is:

$$|t_{r,\text{predicted}} - t_{r,\text{measured}}| < 2\ \text{min},$$

$$\frac{|m/z_{\text{theory}} - m/z_{\text{measured}}|}{m/z_{\text{theory}}} \times 10^{6} < 5\ \text{ppm},$$

and the similarity between the experimental secondary mass spectrum of the metabolite peak and the pseudo-secondary mass spectrum of the adjacent metabolite is ≥ 0.5.
2. The method according to claim 1, wherein: the molecular fingerprint in the third step is any one of Morgan fingerprint, MACCS fingerprint, Atom-pair fingerprint and Daylight fingerprint.
3. The method according to claim 1, wherein: the predicted retention time of the metabolite is predicted by a retention time prediction model constructed by a known metabolite structure-retention relationship.
4. The method according to claim 1, wherein: adjacent metabolites are metabolites that are directly connected by an edge in the molecular structure association network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011407735.8A CN114594171B (en) | 2020-12-03 | 2020-12-03 | Metabolome deep annotation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114594171A CN114594171A (en) | 2022-06-07 |
CN114594171B true CN114594171B (en) | 2023-12-15 |
Family
ID=81813178
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011407735.8A Active CN114594171B (en) | 2020-12-03 | 2020-12-03 | Metabolome deep annotation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114594171B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107729721A (en) * | 2017-10-17 | 2018-02-23 | 中国科学院上海有机化学研究所 | A kind of metabolin identification and disorderly path analysis method |
WO2018072306A1 (en) * | 2016-10-23 | 2018-04-26 | 哈尔滨工业大学深圳研究生院 | Visualization network-based two-stage metabolite mass spectrometry detection method for compound |
CN110907575A (en) * | 2018-09-14 | 2020-03-24 | 中国科学院大连化学物理研究所 | Deep annotation method of hydroxycinnamic acid amide in plants |
CN111710363A (en) * | 2020-06-19 | 2020-09-25 | 苏州帕诺米克生物医药科技有限公司 | Method and device for determining metabolite pairing relationship |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012125121A1 (en) * | 2011-03-11 | 2012-09-20 | Agency For Science, Technology And Research | A method, an apparatus, and a computer program product for identifying metabolites from liquid chromatography-mass spectrometry measurements |
Also Published As
Publication number | Publication date |
---|---|
CN114594171A (en) | 2022-06-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||