CN111061818A - Metabolome and other omics combined analysis method and device - Google Patents

Metabolome and other omics combined analysis method and device

Info

Publication number
CN111061818A
Authority
CN
China
Prior art keywords
data
metabolome
omics
analysis
pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911380147.7A
Other languages
Chinese (zh)
Other versions
CN111061818B (en)
Inventor
郑洪坤
秦刚
张蕾
梁若冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Biomarker Technologies Co ltd
Original Assignee
Beijing Biomarker Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Biomarker Technologies Co ltd
Priority to CN201911380147.7A
Publication of CN111061818A
Application granted
Publication of CN111061818B
Active (legal status)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/283: Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The embodiment of the invention provides a metabolome and other omics combined analysis method and device. The method comprises the following steps: performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain a preliminary association relationship between the metabolome data and the other omics data; taking each item of metabolome data and the item of other omics data associated with it as a data pair, performing correlation analysis on each data pair, and screening the data pairs according to the correlation analysis result of each data pair; and performing constrained correspondence analysis on each screened data pair, determining the finally retained data pairs according to the constrained correspondence analysis result of each data pair, and finally determining the association relationship between the metabolome data and the other omics data according to the finally retained data pairs. The embodiment of the invention realizes the combined analysis of the metabolome and other omics, and the analysis result is more accurate.

Description

Metabolome and other omics combined analysis method and device
Technical Field
The invention belongs to the technical field of bioinformatics analysis, and particularly relates to a metabolome and other omics combined analysis method and device.
Background
Metabolomics is an omics discipline that emerged with the continued development of mass spectrometry and information technology, and it takes the complete set of metabolites in an organism as its object of study. The metabolome plays a key role in disease diagnosis and prevention, in the screening and development of new drugs, and in ecological research.
At present, the development of omics such as the transcriptome, proteome, metabolome and microbiome has advanced the understanding of physiological activity in organisms, but the complex life activities of an organism are difficult to elucidate through single-omics research. Different omics reveal the state of an organism under different spatial and temporal conditions, and multiple omics must be integrated to understand the activity mechanisms of the organism as a whole. Combined multi-omics analysis helps to explain the intrinsic mechanisms of organisms systematically, and how to integrate metabolome data with other omics data effectively and extract the key information is a problem that urgently needs to be solved.
Most existing multi-omics combined analysis methods rely mainly on correlation analysis, but correlation analysis alone is not accurate in determining the association relationships between omics data.
Disclosure of Invention
To overcome, or at least partially solve, the problem that existing multi-omics combined analysis methods give inaccurate results, the embodiments of the invention provide a metabolome and other omics combined analysis method and device.
According to a first aspect of the embodiments of the present invention, there is provided a metabolome and other omics combined analysis method, comprising:
performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain a preliminary association relationship between the metabolome data and the other omics data;
taking each item of metabolome data and the item of other omics data associated with it as a data pair, performing correlation analysis on each data pair, and screening the data pairs according to the correlation analysis result of each data pair;
and performing constrained correspondence analysis on each screened data pair, determining the finally retained data pairs according to the constrained correspondence analysis result of each data pair, and finally determining the association relationship between the metabolome data and the other omics data according to the finally retained data pairs.
Specifically, the step of performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain the association relationship between the metabolome data and the other omics data, comprises:
projecting the metabolome data and the other omics data into the same dimensional space based on the omicade4 software;
determining the association between any item of metabolome data and any item of data of any other omics according to the included angle, in the dimensional space, between the line joining the coordinates of the metabolome item to the origin and the line joining the coordinates of the other omics item to the origin;
and preliminarily determining that an association relationship exists between the metabolome data and the other omics data whose association is greater than a first preset threshold.
Specifically, the step of performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome comprises:
performing differential analysis on the expression level file of the metabolome data and on the expression level file of the other omics data;
and performing multiple co-inertia analysis on the differential analysis result of the metabolome data and the differential analysis result of the other omics data.
Specifically, the step of performing correlation analysis on each data pair and screening the data pairs according to the correlation analysis result of each data pair comprises:
performing correlation analysis on each data pair based on the Spearman method, obtaining the correlation coefficient and the correlation P value of each data pair, and screening out the data pairs whose correlation coefficient is greater than a second preset threshold and whose correlation P value is less than a third preset threshold.
Specifically, the step of performing constrained correspondence analysis on each screened data pair, and determining the finally retained data pairs according to the constrained correspondence analysis result of each data pair, comprises:
performing constrained correspondence analysis on each screened data pair based on the vegan software to obtain a scoring result for each data pair, wherein the metabolome data of the data pair is taken as the constraint of the constrained correspondence analysis;
and determining the finally retained data pairs according to the scoring result of each data pair.
Specifically, the other omics include one or more of the transcriptome, the proteome and the microbiome.
Specifically, the step of finally determining the association relationship between the metabolome data and the other omics data according to the finally retained data pairs is followed by:
if a metabolite produced by an organism carrying the transcriptome genes is a preset metabolite, obtaining the data of other omics associated with the preset metabolite according to the finally determined association relationship between the metabolome data and the other omics data;
and determining that a preset differential gene exists among the transcriptome genes if the organism has the data of other omics associated with the preset metabolite.
According to a second aspect of the embodiments of the present invention, there is provided a metabolome and other omics combined analysis device, comprising:
an obtaining module, configured to perform multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain a preliminary association relationship between the metabolome data and the other omics data;
a screening module, configured to take each item of metabolome data and the item of other omics data associated with it as a data pair, perform correlation analysis on each data pair, and screen the data pairs according to the correlation analysis result of each data pair;
and a determining module, configured to perform constrained correspondence analysis on each screened data pair, determine the finally retained data pairs according to the constrained correspondence analysis result of each data pair, and finally determine the association relationship between the metabolome data and the other omics data according to the finally retained data pairs.
According to a third aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor calls the program instructions to perform the metabolome and other omics combined analysis method provided in any one of the possible implementations of the first aspect.
According to a fourth aspect of the embodiments of the present invention, there is also provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the metabolome and other omics combined analysis method provided in any one of the possible implementations of the first aspect.
The embodiments of the invention provide a metabolome and other omics combined analysis method and device. The method first performs multiple co-inertia analysis on the metabolome data and the other omics data to determine a preliminary association relationship between them. It then performs correlation analysis on the metabolome data and other omics data that have a preliminary association, which screens out a more accurate set of results. Finally, it performs constrained correspondence analysis on the screened data to determine the final association relationship between the metabolome data and the other omics data. The combined analysis of the metabolome and other omics is thereby realized, and the analysis result is more accurate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic overall flow chart of the metabolome and other omics combined analysis method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the overall structure of the metabolome and other omics combined analysis device provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the overall structure of the electronic device provided by an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In an embodiment of the present invention, a metabolome and other omics combined analysis method is provided. FIG. 1 is a schematic overall flow chart of the method, which includes: S101, performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain a preliminary association relationship between the metabolome data and the other omics data;
wherein the metabolome data is data for a plurality of metabolites, and the other omics include one or more of the transcriptome, the proteome and the microbiome. The transcriptome data is data for a plurality of genes, the proteome data is data for a plurality of proteins, and the microbiome data is data for a plurality of microorganisms. Multiple co-inertia analysis is performed on the metabolome data and the other omics data, and the association relationships between the metabolome data and the data of the different other omics are preliminarily determined.
S102, taking each item of metabolome data and the item of other omics data associated with it as a data pair, performing correlation analysis on each data pair, and screening the data pairs according to the correlation analysis result of each data pair;
Correlation analysis is then performed on the two items in each data pair that were preliminarily determined to be associated, and the correlation between the two items is determined. The data pairs whose two items are strongly correlated are screened out.
S103, performing constrained correspondence analysis on each screened data pair, determining the finally retained data pairs according to the constrained correspondence analysis result of each data pair, and finally determining the association relationship between the metabolome data and the other omics data according to the finally retained data pairs.
On the basis of the screened, strongly correlated data pairs, the finally retained data pairs are further determined through constrained correspondence analysis. The other omics data in each finally retained data pair is taken as the data finally determined to be associated with the metabolome data in that pair, which yields a more accurate association analysis result.
In this embodiment, multiple co-inertia analysis is first performed on the metabolome data and the other omics data to determine a preliminary association relationship between them. Correlation analysis is then performed on the metabolome data and other omics data that have a preliminary association, which screens out a more accurate set of results. Constrained correspondence analysis is then performed on the screened data to determine the final association relationship between the metabolome data and the other omics data. The combined analysis of the metabolome and other omics is thereby realized, and the analysis result is more accurate.
On the basis of the above embodiment, in this embodiment, the step of performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain the association relationship between the metabolome data and the other omics data, includes: projecting the metabolome data and the other omics data into the same dimensional space based on the omicade4 software; determining the association between any item of metabolome data and any item of data of any other omics according to the included angle, in the dimensional space, between the line joining the coordinates of the metabolome item to the origin and the line joining the coordinates of the other omics item to the origin; and preliminarily determining that an association relationship exists between the metabolome data and the other omics data whose association is greater than a first preset threshold.
Multiple co-inertia analysis is an exploratory data analysis method for mining the relationships shared among multiple data sets. It can project multiple omics data sets into the same dimensional space and thereby display visually the association between each metabolite in the metabolome and each item of data in the other omics, such as the data of one or more of the transcriptome, the proteome and the microbiome. In this embodiment, the omicade4 software is used for the analysis, and the data of the omics are displayed in a two-dimensional plot. The included angle between the lines joining the points of any two items of data to the origin reflects the association between the two items: the smaller the included angle, the stronger the association. It is preliminarily determined that an association relationship exists between the metabolome data and the other omics data whose association is greater than the first preset threshold.
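The angle criterion described above can be sketched as follows. This is a minimal illustration only, assuming two-dimensional coordinates have already been exported from the projection; it does not reproduce the omicade4 computation. The feature names, the coordinates and the 30-degree cutoff are hypothetical, and "angle below a cutoff" is used here as one possible reading of "association greater than the first preset threshold".

```python
import math

def angle_deg(p, q):
    """Angle in degrees between the lines joining the origin to points p and q."""
    dot = p[0] * q[0] + p[1] * q[1]
    norm = math.hypot(*p) * math.hypot(*q)
    if norm == 0.0:
        raise ValueError("a point at the origin has no direction")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def candidate_pairs(metabolite_xy, other_xy, max_angle_deg):
    """Return (metabolite, feature, angle) for pairs whose angle is below the cutoff."""
    pairs = []
    for m, p in metabolite_xy.items():
        for f, q in other_xy.items():
            a = angle_deg(p, q)
            if a < max_angle_deg:  # smaller angle = stronger association
                pairs.append((m, f, a))
    return sorted(pairs, key=lambda t: t[2])

# Hypothetical coordinates in the shared two-dimensional space.
metabolite_xy = {"met_A": (1.0, 0.2), "met_B": (-0.5, 1.0)}
gene_xy = {"gene_1": (2.0, 0.5), "gene_2": (-1.0, -1.0)}

pairs = candidate_pairs(metabolite_xy, gene_xy, max_angle_deg=30.0)
for m, f, a in pairs:
    print(f"{m} ~ {f}: {a:.1f} degrees")
```

Only the direction of each point matters in this sketch; the distance from the origin is ignored, so a cutoff on the angle is equivalent to a cutoff on the cosine similarity of the two coordinate vectors.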
On the basis of the above embodiment, the step of performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome in this embodiment specifically includes: performing differential analysis on the expression level file of the metabolome data and on the expression level file of the other omics data; and performing multiple co-inertia analysis on the differential analysis result of the metabolome data and the differential analysis result of the other omics data.
Specifically, before the combined analysis of the metabolome and the other omics is performed, the expression level files of the metabolome and of the other omics need to be prepared. Differential analysis is generally performed on these expression level files, and the differential analysis results of the metabolome and of the other omics are used as the input to the multiple co-inertia analysis. The correlation analysis and the constrained correspondence analysis are also performed based on the expression level files.
On the basis of the foregoing embodiment, in this embodiment, the step of performing correlation analysis on each data pair and screening the data pairs according to the correlation analysis result of each data pair includes: performing correlation analysis on each data pair based on the Spearman method, obtaining the correlation coefficient and the correlation P value of each data pair, and screening out the data pairs whose correlation coefficient is greater than a second preset threshold and whose correlation P value is less than a third preset threshold.
Specifically, correlation analysis is performed on the metabolome data and the other omics data in each data pair based on the Spearman method, and the correlation analysis result is the correlation coefficient and the correlation P value between the two items of data in each data pair. The data pairs whose correlation coefficient is greater than the second preset threshold and whose correlation P value is less than the third preset threshold are screened out; for example, the second preset threshold may be 0.8 and the third preset threshold 0.05. In this way the other omics data associated with the metabolome data, such as genes, proteins and microorganisms, are screened out. The screening result is visualized as a network diagram.
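The Spearman screening step can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the embodiment's implementation: the sample values and feature names are hypothetical, and the P value is computed with an exact two-sided permutation test (practical only for small sample counts), which is swapped in here for whatever P value calculation the embodiment uses. The thresholds 0.8 and 0.05 are the example values given above.

```python
from itertools import permutations

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5

def spearman_exact(x, y):
    """Spearman rho (Pearson on ranks) and an exact two-sided permutation P value."""
    rx, ry = ranks(x), ranks(y)
    rho = pearson(rx, ry)
    hits = total = 0
    for perm in permutations(ry):  # n! permutations: keep n small
        total += 1
        if abs(pearson(rx, perm)) >= abs(rho) - 1e-12:
            hits += 1
    return rho, hits / total

def screen_pairs(pairs, metab, other, min_rho=0.8, max_p=0.05):
    """Keep pairs with rho above min_rho and P value below max_p."""
    kept = []
    for m, f in pairs:
        rho, p = spearman_exact(metab[m], other[f])
        if rho > min_rho and p < max_p:
            kept.append((m, f, rho, p))
    return kept

# Hypothetical expression levels across seven samples.
metab = {"met_A": [1, 2, 3, 4, 5, 6, 7]}
other = {"gene_1": [1, 2, 3, 4, 5, 7, 6], "gene_2": [3, 1, 7, 2, 6, 4, 5]}
kept = screen_pairs([("met_A", "gene_1"), ("met_A", "gene_2")], metab, other)
for m, f, rho, p in kept:
    print(f"{m} ~ {f}: rho={rho:.3f}, p={p:.4f}")
```

As written, the filter keeps only positive correlations above the threshold, following the wording of the embodiment. Retaining strong negative correlations as well would require comparing the absolute value of the coefficient, which is a design choice the text does not specify.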
On the basis of the foregoing embodiment, in this embodiment, the step of performing constrained correspondence analysis on each screened data pair, and determining the finally retained data pairs according to the constrained correspondence analysis result of each data pair, includes: performing constrained correspondence analysis on each screened data pair based on the vegan software to obtain a scoring result for each data pair, wherein the metabolome data of the data pair is taken as the constraint of the constrained correspondence analysis; and determining the finally retained data pairs according to the scoring result of each data pair.
Specifically, the vegan software is used to perform constrained correspondence analysis on the data in each screened data pair, with the metabolite in each data pair taken as the constraint during the analysis. The other omics data associated with the metabolome data is finally determined according to the scoring result of each data pair.
On the basis of the foregoing embodiments, in this embodiment the step of finally determining the association relationship between the metabolome data and the other omics data according to the finally retained data pairs is followed by: if a metabolite produced by an organism carrying the transcriptome genes is a preset metabolite, obtaining the data of other omics associated with the preset metabolite according to the finally determined association relationship between the metabolome data and the other omics data; and determining that a preset differential gene exists among the transcriptome genes if the organism has the data of other omics associated with the preset metabolite.
Specifically, when determining whether a preset differential gene exists among the transcriptome genes, it is first determined whether a metabolite produced by the organism carrying the transcriptome genes is a preset metabolite. The preset differential gene gives the organism carrying the transcriptome genes a particular function, such as drought resistance. If a metabolite produced by the organism is a preset metabolite, the data of other omics corresponding to the preset metabolite is looked up according to the finally determined association relationship, and it is then determined whether the organism has that data. If so, it is determined that the preset differential gene exists among the transcriptome genes. The metabolite and the other omics data associated with it are thus used together to determine the differential genes among the transcriptome genes, which makes the determination more accurate.
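The decision logic of this application step can be sketched as a small lookup. This is a hedged illustration: the function name, the metabolite and feature names, and the rule that one associated feature is sufficient are all assumptions, since the text does not say whether any or all associated other-omics data must be present.

```python
def has_preset_differential_gene(organism_metabolites, organism_features,
                                 preset_metabolite, associations):
    """Apply the two-stage check described above (hypothetical helper)."""
    # Stage 1: the organism must produce the preset metabolite.
    if preset_metabolite not in organism_metabolites:
        return False
    # Stage 2: look up other-omics features associated with that metabolite
    # in the finally determined association relationship.
    linked = associations.get(preset_metabolite, set())
    # Assumption: one associated feature present in the organism is enough.
    return bool(linked & set(organism_features))

# Hypothetical final associations: metabolite -> associated other-omics features.
associations = {"proline": {"gene_P5CS", "protein_X"}}

result_a = has_preset_differential_gene(
    {"proline", "sucrose"}, {"gene_P5CS"}, "proline", associations)
result_b = has_preset_differential_gene(
    {"sucrose"}, {"gene_P5CS"}, "proline", associations)
print(result_a, result_b)
```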
In another embodiment of the present invention, a metabolome and other omics combined analysis device is provided for carrying out the methods of the preceding embodiments. The descriptions and definitions in the preceding method embodiments therefore also apply to the execution modules in this embodiment. FIG. 2 is a schematic diagram of the overall structure of the metabolome and other omics combined analysis device provided in the embodiment of the present invention, which includes an obtaining module 201, a screening module 202 and a determining module 203, wherein:
the obtaining module 201 is configured to perform multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain a preliminary association relationship between the metabolome data and the other omics data;
wherein the metabolome data is data for a plurality of metabolites, and the other omics include one or more of the transcriptome, the proteome and the microbiome. The transcriptome data is data for a plurality of genes, the proteome data is data for a plurality of proteins, and the microbiome data is data for a plurality of microorganisms. The obtaining module 201 performs multiple co-inertia analysis on the metabolome data and the other omics data, and preliminarily determines the association relationships between the metabolome data and the data of the different other omics.
The screening module 202 is configured to take each item of metabolome data and the item of other omics data associated with it as a data pair, perform correlation analysis on each data pair, and screen the data pairs according to the correlation analysis result of each data pair;
the screening module 202 performs correlation analysis on the two items in each data pair that were preliminarily determined to be associated, and determines the correlation between the two items. The data pairs whose two items are strongly correlated are screened out.
The determining module 203 is configured to perform constrained correspondence analysis on each screened data pair, determine the finally retained data pairs according to the constrained correspondence analysis result of each data pair, and finally determine the association relationship between the metabolome data and the other omics data according to the finally retained data pairs.
On the basis of the screened, strongly correlated data pairs, the determining module 203 further determines the finally retained data pairs through constrained correspondence analysis. The other omics data in each finally retained data pair is taken as the data finally determined to be associated with the metabolome data in that pair, which yields a more accurate association analysis result.
In this embodiment, multiple co-inertia analysis is first performed on the metabolome data and the other omics data to determine a preliminary association relationship between them. Correlation analysis is then performed on the metabolome data and other omics data that have a preliminary association, which screens out a more accurate set of results. Constrained correspondence analysis is then performed on the screened data to determine the final association relationship between the metabolome data and the other omics data. The combined analysis of the metabolome and other omics is thereby realized, and the analysis result is more accurate.
On the basis of the foregoing embodiment, the obtaining module in this embodiment is specifically configured to: project the metabolome data and the other omics data into the same dimensional space based on the omicade4 software; determine the association between any item of metabolome data and any item of data of any other omics according to the included angle, in the dimensional space, between the line joining the coordinates of the metabolome item to the origin and the line joining the coordinates of the other omics item to the origin; and preliminarily determine that an association relationship exists between the metabolome data and the other omics data whose association is greater than a first preset threshold.
On the basis of the foregoing embodiment, the obtaining module in this embodiment is specifically configured to: perform differential analysis on the expression level file of the metabolome data and on the expression level file of the other omics data; and perform multiple co-inertia analysis on the differential analysis result of the metabolome data and the differential analysis result of the other omics data.
On the basis of the above embodiment, the screening module in this embodiment is specifically configured to: perform correlation analysis on each data pair based on the Spearman method, obtain the correlation coefficient and the correlation P value of each data pair, and screen out the data pairs whose correlation coefficient is greater than a second preset threshold and whose correlation P value is less than a third preset threshold.
On the basis of the foregoing embodiment, the determining module in this embodiment is specifically configured to: perform constrained correspondence analysis on each screened data pair based on the vegan software to obtain a scoring result for each data pair, wherein the metabolome data of the data pair is taken as the constraint of the constrained correspondence analysis; and determine the finally retained data pairs according to the scoring result of each data pair.
On the basis of the above embodiments, the other omics in this embodiment include one or more of the transcriptome, the proteome and the microbiome.
On the basis of the above embodiments, this embodiment further includes an application module, configured to: if a metabolite produced by an organism carrying the transcriptome genes is a preset metabolite, obtain the data of other omics associated with the preset metabolite according to the finally determined association relationship between the metabolome data and the other omics data; and determine that a preset differential gene exists among the transcriptome genes if the organism has the data of other omics associated with the preset metabolite.
FIG. 3 illustrates the physical structure of an electronic device. As shown in FIG. 3, the electronic device may include a processor 301, a communication interface 302, a memory 303 and a communication bus 304, wherein the processor 301, the communication interface 302 and the memory 303 communicate with one another through the communication bus 304. The processor 301 may call logic instructions in the memory 303 to perform the following method: performing multiple co-inertia analysis on the metabolome data and on the data of one or more omics other than the metabolome, to obtain a preliminary association relationship between the metabolome data and the other omics data; taking each item of metabolome data and the item of other omics data associated with it as a data pair, performing correlation analysis on each data pair, and screening the data pairs according to the correlation analysis result of each data pair; and performing constrained correspondence analysis on each screened data pair, determining the finally retained data pairs according to the constrained correspondence analysis result of each data pair, and finally determining the association relationship between the metabolome data and the other omics data according to the finally retained data pairs.
In addition, the logic instructions in the memory 303 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example, including: performing multiple-inertia analysis on the data of the metabolome and the data of one or more other omics except the metabolome to obtain the association relationship of the data between the metabolome and the other omics; taking the data of the metabolome with the incidence relation and the data of other omics as a data pair, carrying out correlation analysis on each data pair, and screening the data pairs according to the correlation analysis result of each data pair; and performing restrictive correspondence analysis on each screened data pair, determining a finally-retained data pair according to the restrictive correspondence analysis result of each data pair, and finally determining the association relationship of the data between the metabolome and the other omics according to the finally-retained data pair.
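The correlation screening step can be sketched in Python as follows. Claim 4 specifies the Spearman method with a threshold on the correlation coefficient and a threshold on the P value; everything else here is an assumption made for illustration: the threshold values 0.8 and 0.05, the sample data, the function names, and in particular the Monte Carlo permutation test used to estimate the P value (the source does not say how the P value is computed).

```python
import math
import random


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y, n_perm=2000, seed=0):
    """Spearman rho plus a two-sided permutation P value (assumed method)."""
    rx, ry = average_ranks(x), average_ranks(y)
    rho = pearson(rx, ry)
    rng = random.Random(seed)
    shuffled = ry[:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(rho) - 1e-12:
            extreme += 1
    return rho, (extreme + 1) / (n_perm + 1)


def screen_pairs(pairs, metab, omics, min_rho=0.8, max_p=0.05):
    """Keep pairs with rho above min_rho and P value below max_p."""
    kept = []
    for m, f in pairs:
        rho, p = spearman(metab[m], omics[f])
        if rho > min_rho and p < max_p:
            kept.append((m, f, rho, p))
    return kept


# Hypothetical abundance / expression values across eight samples.
metab = {"m1": [1.0, 2.1, 2.9, 4.2, 5.1, 6.3, 7.0, 8.4]}
omics = {
    "g1": [10, 12, 15, 19, 22, 30, 31, 40],  # rises with m1 in every sample
    "g2": [5, 1, 4, 2, 8, 3, 7, 6],          # only loosely related to m1
}
candidate_pairs = [("m1", "g1"), ("m1", "g2")]
kept = screen_pairs(candidate_pairs, metab, omics)
```

With these values only the pair (m1, g1) passes both thresholds, because g2 has a rank correlation with m1 below the assumed coefficient threshold.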
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A metabolome and other omics joint analysis method, which is characterized by comprising the following steps:
performing multiple co-inertia analysis on the data of the metabolome and the data of one or more other omics besides the metabolome to obtain the association relationship of the data between the metabolome and the other omics;
taking the data of the metabolome and the data of the other omics that have the association relationship as data pairs, performing correlation analysis on each data pair, and screening the data pairs according to the correlation analysis result of each data pair;
and performing restrictive correspondence analysis on each screened data pair, determining a finally-retained data pair according to the restrictive correspondence analysis result of each data pair, and finally determining the association relationship of the data between the metabolome and the other omics according to the finally-retained data pair.
2. The metabolome and other omics joint analysis method of claim 1, wherein the step of performing multiple co-inertia analysis on the data of the metabolome and the data of one or more other omics besides the metabolome to obtain the association relationship of the data between the metabolome and the other omics comprises:
projecting the data of the metabolome and the data of the other omics into the same dimensional space based on the omicade4 software;
determining the association between any data of the metabolome and any data of any of the other omics according to the included angle, in the dimensional space, between the line connecting the coordinate of that data of the metabolome to the origin and the line connecting the coordinate of that data of the other omics to the origin;
and preliminarily determining that the association relationship exists between the data of the metabolome and the data of the other omics whose association is greater than a first preset threshold.
3. The metabolome and other omics joint analysis method of claim 1, wherein the step of performing multiple co-inertia analysis on the data of the metabolome and the data of one or more other omics besides the metabolome comprises:
performing differential analysis on the expression level file of the data of the metabolome and the expression level file of the data of the other omics;
performing multiple co-inertia analysis on the differential analysis result of the data of the metabolome and the differential analysis result of the data of the other omics.
4. The metabolome and other omics joint analysis method of claim 1, wherein the step of performing correlation analysis on each data pair and screening the data pairs according to the correlation analysis result of each data pair comprises:
performing correlation analysis on each data pair based on the Spearman method, obtaining a correlation coefficient and a correlation P value of each data pair, and screening out the data pairs whose correlation coefficient is greater than a second preset threshold and whose correlation P value is less than a third preset threshold.
5. The metabolome and other omics joint analysis method of claim 1, wherein the step of performing restrictive correspondence analysis on each screened data pair and determining the finally retained data pairs according to the restrictive correspondence analysis result of each data pair comprises:
performing, based on the vegan software, restrictive correspondence analysis on each screened data pair to obtain a scoring result of each data pair; wherein the data of the metabolome in the data pair is taken as the restrictive condition of the restrictive correspondence analysis;
and determining the finally reserved data pairs according to the scoring result of each data pair.
6. The metabolome and other omics joint analysis method of any one of claims 1 to 5, wherein the other omics comprise one or more of the transcriptome, the proteome and the microbiome.
7. The metabolome and other omics combined analysis method of any of claims 1 to 5 wherein the step of finally determining the data association between the metabolome and the other omics from the finally retained data pairs is followed by further steps of:
if a metabolite produced by an organism having a transcriptome gene is a preset metabolite, obtaining the data of the other omics that have an association relationship with the preset metabolite according to the finally determined association relationship of the data between the metabolome and the other omics;
and determining that the transcriptome gene has a preset differential gene if the organism has data of the other omics that have an association relationship with the preset metabolite.
8. A metabolome and other omics joint analysis device, comprising:
an acquisition module, configured to perform multiple co-inertia analysis on the data of the metabolome and the data of one or more other omics besides the metabolome to obtain the association relationship of the data between the metabolome and the other omics;
a screening module, configured to take the data of the metabolome and the data of the other omics that have the association relationship as data pairs, perform correlation analysis on each data pair, and screen the data pairs according to the correlation analysis result of each data pair;
and the determining module is used for performing restrictive correspondence analysis on each screened data pair, determining a finally-retained data pair according to the restrictive correspondence analysis result of each data pair, and finally determining the association relationship of the data between the metabolome and the other omics according to the finally-retained data pair.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the metabolome and other omics joint analysis method of any of claims 1 to 7.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the metabolome and other omics joint analysis method of any one of claims 1 to 7.
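The angle criterion of claim 2 can be illustrated with a short Python sketch. It assumes that the coordinates of each metabolite and each other-omics feature in the shared dimensional space are already available (for example, exported from the co-inertia analysis), that the association is measured by the cosine of the included angle, and that the first preset threshold is 0.9; the coordinates, the cosine measure, the threshold value and the function names are all illustrative assumptions rather than details taken from the claims.

```python
import math


def cosine_of_angle(u, v):
    """Cosine of the angle between the origin-to-u and origin-to-v lines."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a point at the origin defines no direction
    return dot / (norm_u * norm_v)


def preliminary_pairs(metab_coords, omics_coords, threshold=0.9):
    """Return (metabolite, feature) pairs whose cosine exceeds the threshold."""
    pairs = []
    for m, m_xy in metab_coords.items():
        for f, f_xy in omics_coords.items():
            if cosine_of_angle(m_xy, f_xy) > threshold:
                pairs.append((m, f))
    return pairs


# Hypothetical coordinates on the first two axes of the shared space.
metab_coords = {"m1": (1.0, 0.2), "m2": (-0.5, 1.0)}
omics_coords = {"g1": (2.0, 0.5), "g2": (0.3, -1.0)}

pairs = preliminary_pairs(metab_coords, omics_coords)
```

A small included angle gives a cosine close to 1, so m1 and g1, which point in nearly the same direction from the origin, form the only preliminary pair in this example.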
CN201911380147.7A 2019-12-27 2019-12-27 Metabolic group and other group combined analysis method and device Active CN111061818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911380147.7A CN111061818B (en) 2019-12-27 2019-12-27 Metabolic group and other group combined analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911380147.7A CN111061818B (en) 2019-12-27 2019-12-27 Metabolic group and other group combined analysis method and device

Publications (2)

Publication Number Publication Date
CN111061818A true CN111061818A (en) 2020-04-24
CN111061818B CN111061818B (en) 2023-06-30

Family

ID=70304192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911380147.7A Active CN111061818B (en) 2019-12-27 2019-12-27 Metabolic group and other group combined analysis method and device

Country Status (1)

Country Link
CN (1) CN111061818B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109817282A (en) * 2019-02-25 2019-05-28 上海市第六人民医院 A kind of the data correlation system and method for metabolome and microorganism group
CN109979527A (en) * 2019-03-08 2019-07-05 广州基迪奥生物科技有限公司 A kind of transcript profile and metabolism group data relation analysis method and system
US20190228841A1 (en) * 2018-01-23 2019-07-25 The Regents Of The University Of California Arrowland: an online multiscale interactive tool for -omics data visualization

Also Published As

Publication number Publication date
CN111061818B (en) 2023-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant