CN113377765A - Multi-omics data analysis system and data conversion method thereof

Publication number
CN113377765A
CN113377765A CN202110545036.8A CN202110545036A CN113377765A CN 113377765 A CN113377765 A CN 113377765A CN 202110545036 A CN202110545036 A CN 202110545036A CN 113377765 A CN113377765 A CN 113377765A
Authority
CN
China
Prior art keywords
data
omics
databases
module
user
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110545036.8A
Other languages
Chinese (zh)
Inventor
石明明
唐冲
揭文才
白洁
孙宇欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Technology Solutions Co Ltd
Original Assignee
BGI Technology Solutions Co Ltd
Application filed by BGI Technology Solutions Co Ltd
Priority to CN202110545036.8A
Publication of CN113377765A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a multi-omics data analysis system and a data conversion method thereof. The analysis system comprises an interaction module, a processing module and a plurality of omics databases. The interaction module provides an interactive form to a user in order to collect the user's data analysis requirement information concerning one or more omics databases. The processing module extracts or converts, from one or more omics databases and according to the data analysis requirement information, the associated data matching that information, and returns the associated data to the interaction module. The plurality of omics databases comprises a genomics database, a transcriptomics database, an epigenomics database and a proteomics database. By interacting with the user, the method collects the user's different data analysis requirements on one or more omics, and it uses the mutual mapping relationships between different omics data to support different kinds of analysis and mining of multi-omics data.

Description

Multi-omics data analysis system and data conversion method thereof
Technical Field
The invention belongs to the field of bioinformatics, and particularly relates to a multi-omics data analysis system and a data conversion method thereof.
Background
With the launch of genome projects for more and more species, biological research has entered the "omics" era. Genomics studies the structure, localization, function, evolution and editing of genomes through high-throughput sequencing and bioinformatics techniques; transcriptomics studies the expression levels of RNA in a specific environment and a specific tissue, and shows the dynamic changes and trends of gene expression; epigenomics focuses on the totality of modifications to the genetic material, and examines how an organism affects gene expression without altering the DNA sequence; proteomics studies the set of proteins contained in a sample, which directly reflects functional changes in the sample. Omics research has driven the development of technologies such as big data and bioinformatics, and has contributed to applications in medicine, the environment, industry, agriculture and other fields.
However, as research has progressed, it has become clear that no single omics can fully explain molecular biological mechanisms or solve every biological problem. Taking disease research as an example, the development of a disease is a complex network of changes, and its pathogenesis may involve variation at the DNA level, regulation of RNA, epigenetic changes and so on. Finding disease targets therefore requires joint analysis of data from multiple omics, and in recent years the concept of "multi-omics" has come to the attention of researchers.
Multi-omics comprises genomics, transcriptomics, epigenomics, proteomics, metabolomics and so on. It studies life processes from multiple angles and integrates and analyzes the high-throughput data of each omics; it is a comprehensive research system that reflects the metabolic processes of an organism as a whole. Taking disease as an example, a multi-omics joint analysis strategy can mine candidate disease markers in depth from multiple angles such as genes, proteins, small-molecule metabolites and gut microorganisms, and can reveal the mechanisms of disease occurrence and development through gene variation, the expression trends of genes and regulatory factors, and the construction of regulatory networks of influencing factors. In addition, joint multi-omics research has achieved significant results in molecular breeding, industrial fermentation, environmental monitoring and other areas, promoting industrial development.
Disclosure of Invention
In order to solve the problem in the prior art that sequencing products are analyzed only within their respective, isolated omics, to realize data conversion of sequencing products between different omics, and to improve the completeness of data mining, the invention provides, in a first aspect, a multi-omics data analysis system comprising an interaction module, a processing module and a plurality of omics databases. The interaction module is used for providing an interactive form to a user to collect the user's data analysis requirement information concerning one or more omics databases; the processing module is used for extracting or converting, from one or more omics databases and according to the data analysis requirement information, the associated data matching that information, and for returning the associated data to the interaction module; the plurality of omics databases comprises a genomics database, a transcriptomics database, an epigenomics database and a proteomics database.
In some embodiments of the present invention, the interaction module comprises a first interaction module, a second interaction module and a third interaction module. The first interaction module is used for providing a first interactive form to a user to collect the user's data display requirement information concerning one or more omics databases; the second interaction module is used for providing a second interactive form to the user to collect the user's data analysis requirement information concerning one or more omics databases; and the third interaction module is used for providing a third interactive form to the user to collect the user's data conversion requirement information concerning one or more omics databases.
Further, the processing module comprises a first processing module, a second processing module and a third processing module. The first processing module is used for extracting corresponding data from one or more omics databases according to the data display requirement information; the second processing module is used for extracting and analyzing corresponding data from one or more omics databases according to the data analysis requirement information; and the third processing module is used for matching, according to the data conversion requirement information, a plurality of data having the same characteristic value in one or more omics databases, and converting them into one another.
Preferably, in some embodiments of the present invention, the third processing module comprises a mapping conversion unit, an interaction relation conversion unit and a position conversion unit. The mapping conversion unit is used for mapping conversion of a plurality of associated data belonging to different omics databases; the interaction relation conversion unit is used for function prediction and determination on a plurality of interacting data belonging to different omics databases; and the position conversion unit is used for converting data adjacent to one or more omics data into the position of the corresponding omics, by means of an element that has coordinate information on the genome.
Preferably, in some embodiments of the present invention, the second processing module performs single-omics or multi-omics association analysis on data belonging to one or more omics databases.
Preferably, in some embodiments of the present invention, the processing module further comprises a scheduling module, and the scheduling module fulfils the user's composite analysis requirements through composite calls to at least two of the first processing module, the second processing module and the third processing module.
In a second aspect of the present invention, a data conversion method for a multi-omics data analysis system is provided, comprising the steps of: providing an interactive form to a user to collect the user's data analysis requirement information concerning one or more omics databases; extracting or converting, from one or more omics databases and according to the data analysis requirement information, the associated data matching that information; and displaying the associated data to the user.
Further, extracting or converting the associated data matching the analysis requirement information from one or more omics databases according to the data analysis requirement information comprises the following steps:
extracting data display requirement information, data analysis requirement information and data conversion requirement information, respectively, from the data analysis requirement information; extracting corresponding data from one or more omics databases according to the data display requirement information; extracting and analyzing corresponding data from one or more omics databases according to the data analysis requirement information; and matching, according to the data conversion requirement information, a plurality of data having the same characteristic value in one or more omics databases, and converting them into one another.
In a third aspect of the present invention, there is provided an electronic device comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method provided by the second aspect of the invention.
In a fourth aspect of the invention, a computer-readable medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method provided in the second aspect of the invention.
The invention has the beneficial effects that:
1. To address users' different multi-omics analysis requirements, the interactive form is divided into a core form and an extended form. The core form serves as the user interface for conventional data display and data analysis of a single omics, and collects the user's basic analysis requirement information; the extended form serves as the user interface for bringing in data display and data analysis of multiple other related omics, and collects the user's analysis requirement information or data conversion requirement information;
2. By using the mapping or correspondence relationships between omics IDs (including at least the correspondences described by the genetic code and the central dogma), the annotation information of the protein ID corresponding to a gene ID can be obtained, and methylation modification information for the regions upstream and downstream of a gene can be queried; in addition, analysis results such as expression levels and differential analysis can be added to the expansion columns, so that analysis results can be displayed flexibly;
3. Methods such as core table switching, column expansion, conversion and cyclic tool calling are built into the analysis system, meeting users' different multi-omics analysis requirements.
Drawings
FIG. 1 is a basic block diagram of a multi-omics data analysis system in some embodiments of the invention;
FIG. 2 is a schematic diagram of the structure of an interaction module in some embodiments of the invention;
FIG. 3 is a schematic diagram of the structure of a processing module in some embodiments of the invention;
FIG. 4 is a first schematic view of an interactive interface of the first interaction module in some embodiments of the invention;
FIG. 5 is a second schematic view of an interactive interface of the first interaction module in some embodiments of the invention;
FIG. 6 is a schematic diagram of an interactive interface between the mapping conversion unit of the third processing module and the interaction module in some embodiments of the invention;
FIG. 7 is a schematic diagram of an interactive interface between the interaction relation conversion unit of the third processing module and the interaction module in some embodiments of the invention;
FIG. 8 is a schematic diagram of an interactive interface between the position conversion unit of the third processing module and the interaction module in some embodiments of the invention;
FIG. 9 is a schematic diagram of an interactive interface between the association conversion unit of the third processing module and the interaction module in some embodiments of the invention;
FIG. 10 is an scRNA report generated by the association conversion unit during conversion in some embodiments of the invention;
FIG. 11 is a schematic diagram of an interactive interface after the association conversion unit has completed conversion in some embodiments of the invention;
FIG. 12 is a basic flow diagram of a data conversion method of a multi-omics data analysis system in some embodiments of the invention;
FIG. 13 is a detailed flow diagram of a data conversion method of a multi-omics data analysis system in some embodiments of the invention;
FIG. 14 is a schematic structural diagram of an electronic device in some embodiments of the invention.
Detailed Description
The principles and features of the invention are described below with reference to the accompanying drawings. The examples given serve only to explain the invention and are not intended to limit its scope.
Example 1
Referring to FIG. 1, in a first aspect of the present invention, there is provided a multi-omics data analysis system 1 comprising an interaction module 11, a processing module 12 and a plurality of omics databases 13. The interaction module 11 is configured to provide a user with an interactive form to collect the user's data analysis requirement information concerning the omics databases, and also to present the returned data to the user.
The processing module 12 is configured to extract or convert, from one or more omics databases and according to the data analysis requirement information, the associated data matching that information, and to return the associated data to the interaction module. The plurality of omics databases 13 comprises a genomics database, a transcriptomics database, an epigenomics database and a proteomics database. Optionally, the plurality of omics databases 13 may also comprise a metabolomics database.
Referring to FIG. 2, in some embodiments of the present invention, the interaction module 11 includes a first interaction module 111, a second interaction module 112 and a third interaction module 113. The first interaction module 111 is used for providing a first interactive form to a user to collect the user's data display requirement information concerning one or more omics databases; the second interaction module 112 is used for providing a second interactive form to the user to collect the user's data analysis requirement information concerning one or more omics databases; the third interaction module 113 is used for providing a third interactive form to the user to collect the user's data conversion requirement information concerning one or more omics databases. It is understood that, in addition to the interactive forms described above, the interaction module 11 (front end) of the invention includes a corresponding interface for transferring information to, or interacting with, the processing module 12 (back end).
Referring to FIGS. 4 and 5, the interactive form takes the form of a table. The core tables comprise a gene table, an RNA table, result tables of other sequencing products, and a summary table; the summary table displays several types of data at once by selecting characteristic values common to different products (single-omics data), so that different analysis requirement information can be collected or displayed. A core table supports the addition of columns: when the expand-column button is clicked, the fields to add can be chosen from four option groups (basic attributes, quantitative features, sample differences and annotations). The selected items are counted, with the count shown in the upper right corner of the option group, and the selected fields are listed above the table, which makes it easy to review or delete them. After OK is clicked, the selected fields are added to the core table as columns, and each row shows the data for those fields. The interaction module (front end) sends request information to the processing module according to the content of the interactive form; on receiving the request, the processing module (usually a back-end server) looks up the corresponding field information in the database and updates the data in the interactive form.
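By way of illustration only, the column expansion round trip described above may be sketched as follows. This is a minimal Python sketch under stated assumptions: the names `core_table`, `field_store` and `expand_columns`, the in-memory dictionaries standing in for the omics databases, and all field values are hypothetical and do not come from the invention.

```python
# Hypothetical core gene table: one dict per row, keyed by a shared ID.
core_table = [
    {"gene_id": "gene1"},
    {"gene_id": "gene2"},
]

# Hypothetical back-end store: field name -> {gene_id: value}.
# The values are placeholders, not real annotations.
field_store = {
    "symbol": {"gene1": "ABC1", "gene2": "XYZ2"},
    "log2fc": {"gene1": 1.5},
}


def expand_columns(table, fields, store, key="gene_id"):
    """Return a copy of `table` with each requested field added as a column.

    Values missing from the store are filled with None so that every row
    ends up with the same set of columns.
    """
    expanded = []
    for row in table:
        new_row = dict(row)  # copy, so the original core table is untouched
        for field in fields:
            new_row[field] = store.get(field, {}).get(row[key])
        expanded.append(new_row)
    return expanded


# The front end would send the selected field names; the back end replies
# with the updated table.
result = expand_columns(core_table, ["symbol", "log2fc"], field_store)
```

Returning a new table instead of modifying the core table in place keeps the original rows available if the user later deletes an expansion column.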
Referring to FIG. 3, in some embodiments, the processing module 12 includes a first processing module 121, a second processing module 122 and a third processing module 123. The first processing module 121 is configured to extract corresponding data from one or more omics databases according to the data display requirement information; the second processing module 122 is configured to extract and analyze corresponding data from one or more omics databases according to the data analysis requirement information; the third processing module 123 is configured to match, according to the data conversion requirement information, a plurality of data having the same characteristic value in one or more omics databases, and to convert them into one another.
Referring to FIGS. 6 to 11, for the different kinds of analysis requirement information described in the above embodiments, the processing module 12 needs to extract data from the corresponding databases. In some embodiments of the present invention, the third processing module 123 includes a mapping conversion unit, an interaction relation conversion unit and a position conversion unit. The mapping conversion unit is configured for mapping conversion of a plurality of associated data belonging to different omics databases; the interaction relation conversion unit is used for function prediction and determination on a plurality of interacting data belonging to different omics databases; and the position conversion unit is used for converting data adjacent to one or more omics data into the position of the corresponding omics, by means of an element that has coordinate information on the genome.
Specifically, referring to FIG. 6, the mapping conversion unit can convert between elements (omics data), for example "protein-gene" or "protein-transcript": the user clicks "association conversion" in the interactive form, selects the type of mapping relationship to convert, may at the same time assign a weight to the differential result of a given comparison group or sample in the scheme, and clicks OK.
It is understood that the principle of mapping conversion is to use the ID correspondences between elements (proteins, genes, transcripts). Illustratively, gene IDs for 53 model species are stored in the multi-omics databases. When results are entered into the system, a protein may carry the ID of a public database such as NCBI or Ensembl. The protein ID is first converted into an NCBI ID, and that ID is then converted into a transcript ID through a correspondence table, for example: transcript ID → protein ID; trans1 → P1; trans2 → P2. The mapping relationship refers to conversion between genes, transcripts, proteins and other biologically related entities; for example, given a gene ID, the transcript ID and protein ID corresponding to that gene can be found.
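The two-step ID conversion described above may be sketched as follows. This is a minimal Python sketch under stated assumptions: the dictionaries `external_to_ncbi` and `transcript_to_protein`, the function name, and the "EXT_" identifiers are hypothetical placeholders; only the pairs trans1 → P1 and trans2 → P2 come from the example in the text.

```python
# Step 1 (assumed form): normalise an external protein ID to an NCBI-style ID.
external_to_ncbi = {"EXT_P1": "P1", "EXT_P2": "P2"}

# Step 2: correspondence table from the text, transcript ID -> protein ID.
transcript_to_protein = {"trans1": "P1", "trans2": "P2"}

# Invert the correspondence table once so proteins can be looked up directly.
protein_to_transcripts = {}
for transcript_id, protein_id in transcript_to_protein.items():
    protein_to_transcripts.setdefault(protein_id, []).append(transcript_id)


def protein_to_transcript_ids(protein_id):
    """Convert a protein ID to its transcript IDs via the NCBI-style ID.

    IDs that are not in the normalisation table are assumed to be
    NCBI-style already; unknown proteins yield an empty list.
    """
    ncbi_id = external_to_ncbi.get(protein_id, protein_id)
    return protein_to_transcripts.get(ncbi_id, [])
```

For instance, `protein_to_transcript_ids("EXT_P1")` looks up `P1` and then returns the transcripts recorded for `P1`. A list is returned because one protein ID may, in general, correspond to more than one transcript.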
Referring to FIG. 7, the interaction relation conversion unit predicts and determines the function of interacting elements (omics data) such as genes, transcripts and proteins. The supported relationships include protein-protein interaction (PPI), targets, ceRNA, RNAplex and the like. For example, the user selects the PPI option in the interactive form, sets the score parameter, and clicks OK to convert to the elements that have an interaction relationship with the currently selected IDs.
It is understood that the principle is to screen the correspondences between the selected genes and other genes. Taking PPI as an example, genes and their interactions are downloaded from the public STRING database and are represented as follows:
source   target   score   sort
geneA    geneB    100     2
geneA    geneC    200     1
geneB    geneC    100     2
These relationships are recorded in the corresponding omics database. When the user operates the page, the ID list passed from the front-end page is used as the source to query the target column for the corresponding relationships, and the score parameter (for example, 50) is used to keep only the relationships whose score is greater than 50.
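The source-to-target lookup with a score threshold may be sketched as follows. This is a minimal Python sketch under stated assumptions: the list `interactions` holds the three example rows above, while the function name `find_partners` and the descending-score ordering of the result are illustrative choices, not part of the invention.

```python
# Interaction rows from the example table: (source, target, score).
interactions = [
    ("geneA", "geneB", 100),
    ("geneA", "geneC", 200),
    ("geneB", "geneC", 100),
]


def find_partners(source_ids, min_score, table=interactions):
    """Return rows whose source is in `source_ids` and whose score exceeds `min_score`.

    Rows are ordered by descending score (an assumed presentation choice).
    """
    wanted = set(source_ids)
    hits = [
        (source, target, score)
        for source, target, score in table
        if source in wanted and score > min_score
    ]
    return sorted(hits, key=lambda row: row[2], reverse=True)


# ID list as it might arrive from the front-end page, with score parameter 50.
partners = find_partners(["geneA"], 50)
```

With the example data, a query for `geneA` at threshold 50 keeps both of its interactions, while raising the threshold to 150 keeps only the interaction with `geneC`.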
Referring to FIG. 8, for methylation sequencing products, the position conversion unit can find the regions upstream or downstream of a differentially methylated sequence and perform table conversion on those regions. Public database information for preset species, including chromosome position information, gene information and an ID association table, is stored in the multi-omics databases of the multi-omics analysis system. The user fills in the position-related parameters, including the upstream and downstream distances and the type of table to convert to; the parameters are passed from the front-end page to the back end, which reads the database according to them, looks up the corresponding fields, and performs the association conversion through the ID relationship table. It is understood that the principle of this function is to determine positional relationships using the coordinate information of genes on chromosomes. For example: gene → chromosome → start → end; gene1 → chr1 → 100 → 201; gene2 → chr1 → 300 → 500. Searching for genes within 1 kb (1000 bp) upstream and downstream of gene1 finds gene2 through the positional relationship. More generally, position conversion relies on the genomic coordinates of omics data elements, and includes finding genes from genes, finding methylated regions from genes, finding genes from methylated regions, and the like.
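The coordinate-based search in the example above may be sketched as follows. This is a minimal Python sketch under stated assumptions: the rows for gene1 and gene2 come from the text, gene3 is a hypothetical extra row added to show that other chromosomes are ignored, and the function name `neighbours` and the simple interval-overlap test (which disregards strand) are illustrative.

```python
# Gene coordinates: gene ID -> (chromosome, start, end).
genes = {
    "gene1": ("chr1", 100, 201),
    "gene2": ("chr1", 300, 500),
    "gene3": ("chr2", 150, 400),  # hypothetical row on another chromosome
}


def neighbours(gene_id, upstream=1000, downstream=1000, table=genes):
    """Return IDs of genes overlapping the window around `gene_id`.

    The window runs from `start - upstream` to `end + downstream` on the
    same chromosome. Strand is ignored in this sketch.
    """
    chrom, start, end = table[gene_id]
    win_start = max(0, start - upstream)
    win_end = end + downstream
    return [
        other
        for other, (other_chrom, other_start, other_end) in table.items()
        if other != gene_id
        and other_chrom == chrom
        and other_start <= win_end
        and other_end >= win_start
    ]
```

The same overlap test could serve the other conversions mentioned above (gene to methylated region, methylated region to gene) if the table held regions of the target type instead of genes.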
Referring to FIGS. 9 and 11, in the analysis of single-cell products in some multi-omics settings, the added cell dimension means that a cell-dimension table must be added when switching core tables, so that the gene table and the cell table can be converted into each other. After the user opens "association conversion" from the project home page, the mapping relationships additionally include "Cell to Gene" (for the cell table) and "Gene to Cell" (for the gene table).
Referring to FIG. 10, the interconversion between the gene table and the cell table is realized as follows. A conventional RNA report includes one main gene table, gene_core_table, which contains the gene expression levels. The scRNA report provided by the invention comprises the following main tables:
gene master table, gene_core_table: the first column is gene_id;
cell master table, cell_core_table: the first column is cell_id;
expression table, cell_gene_exp: when a user switches to the cell table in the interactive form, selects cell_ids and sends an association conversion request from the page, the cell_ids are looked up in cell_core_table, the corresponding genes are found through the ID correspondences in the expression table, and the basic information selected from the gene table is displayed on the page, completing the conversion;
Cell to Gene: a number of cells are selected in the cell table and converted into a gene-by-cell matrix; after conversion, the values in the matrix cells default to the expression levels.
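The "Cell to Gene" conversion over the three tables may be sketched as follows. This is a minimal Python sketch under stated assumptions: the table names follow the text, but their contents, the extra columns `symbol` and `cluster`, the sparse (cell, gene, value) layout of `cell_gene_exp` and the fill value 0.0 for missing pairs are hypothetical.

```python
# Hypothetical contents of the three main tables of an scRNA report.
gene_core_table = {"gene1": {"symbol": "G1"}, "gene2": {"symbol": "G2"}}
cell_core_table = {"cell1": {"cluster": 0}, "cell2": {"cluster": 1}}
cell_gene_exp = [  # (cell_id, gene_id, expression level)
    ("cell1", "gene1", 5.0),
    ("cell2", "gene1", 2.0),
    ("cell2", "gene2", 3.0),
]


def cell_to_gene(cell_ids):
    """Convert selected cells into a gene-by-cell expression matrix.

    Returns {gene_id: {cell_id: expression}}; a gene appears only if it has
    at least one record in the selected cells, and pairs without a record
    are filled with 0.0.
    """
    selected = [cell_id for cell_id in cell_ids if cell_id in cell_core_table]
    matrix = {}
    for cell_id, gene_id, value in cell_gene_exp:
        if cell_id in selected and gene_id in gene_core_table:
            row = matrix.setdefault(gene_id, dict.fromkeys(selected, 0.0))
            row[cell_id] = value
    return matrix
```

The reverse direction, "Gene to Cell", could be sketched the same way by swapping the roles of the gene and cell identifiers.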
Preferably, in some embodiments of the present invention, the second processing module 122 performs single-omics or multi-omics association analysis on data belonging to one or more omics databases. The analyses include ordinary cluster analysis, association cluster analysis, ordinary network interaction graphs, association network interaction graphs, alternative splicing, multi-omics association, the chi-square test, correlation heat maps and the like. They operate on table data, or on the matrix data produced by table conversion, to meet the user's re-analysis requirements.
Preferably, in some embodiments of the present invention, the processing module 12 further includes a scheduling module, and the scheduling module fulfils the user's composite analysis requirements through composite calls to at least two of the first processing module 121, the second processing module 122 and the third processing module 123. All processing modules 12 support analysis of the core table of the current page, including the results of existing pipelines and the results of re-analysis tasks. In other words, after a tool has been called, the resulting table on the current page can be selected again as input for a further tool call; that is, tools can be called in a cycle to mine elements (data) such as target genes or proteins.
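The cyclic tool calling described above, in which the output table of one tool becomes the input of the next, may be sketched as follows. This is a minimal Python sketch under stated assumptions: the three step functions are hypothetical stand-ins for the first, second and third processing modules, and the names `run_composite`, `gene_to_protein` and `page_table` and all data values are illustrative.

```python
# Hypothetical gene -> protein correspondence used by the conversion step.
gene_to_protein = {"gene1": "P1", "gene3": "P3"}


def extract_rows(table):
    """Stand-in for data extraction: keep rows that have a score."""
    return [row for row in table if row["score"] is not None]


def analyse_rows(table):
    """Stand-in for analysis: rank rows by descending score."""
    return sorted(table, key=lambda row: row["score"], reverse=True)


def convert_rows(table):
    """Stand-in for mapping conversion: gene rows -> protein rows."""
    return [
        {"protein_id": gene_to_protein[row["gene_id"]], "score": row["score"]}
        for row in table
        if row["gene_id"] in gene_to_protein
    ]


def run_composite(table, steps):
    """Apply each step to the previous result, keeping every intermediate table."""
    history = [table]
    for step in steps:
        history.append(step(history[-1]))
    return history


page_table = [
    {"gene_id": "gene1", "score": 1.0},
    {"gene_id": "gene2", "score": None},
    {"gene_id": "gene3", "score": 2.5},
]
history = run_composite(page_table, [extract_rows, analyse_rows, convert_rows])
final_table = history[-1]
```

Keeping the intermediate tables in `history` mirrors the idea that any result shown on the current page can be picked up again as the starting point of a further call.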
Example 2
Referring to FIG. 12, in a second aspect of the present invention, there is provided a data conversion method of a multi-omics data analysis system, comprising the steps of: S100, providing an interactive form to a user to collect the user's data analysis requirement information concerning one or more omics databases; S200, extracting or converting, from one or more omics databases and according to the data analysis requirement information, the associated data matching that information; and S300, displaying the associated data to the user.
Further, in step S200 of some embodiments of the present invention, extracting or converting the associated data matching the analysis requirement information from one or more omics databases according to the data analysis requirement information includes the following steps: extracting data display requirement information, data analysis requirement information and data conversion requirement information, respectively, from the data analysis requirement information; extracting corresponding data from one or more omics databases according to the data display requirement information; extracting and analyzing corresponding data from one or more omics databases according to the data analysis requirement information; and matching, according to the data conversion requirement information, a plurality of data having the same characteristic value in one or more omics databases, and converting them into one another.
Referring to FIG. 13, in some embodiments of the present invention, the steps are specifically as follows. The user fills in or selects the corresponding screening conditions, which determine the one or more omics data to be analyzed and their related data. The front end (interaction module) converts the user's input on the omics data to be analyzed or converted into operation requests on one or more omics databases according to the interface parameters. The omics databases return results to the processing module according to the operation requests, and the processing module either converts the returned results (information) through the conversion page or displays them directly to the user for viewing.
Example 3
In a third aspect of the present invention, there is provided an electronic device comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method provided by the second aspect of the invention.
Referring to FIG. 14, an electronic device 500 may include a processing device (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage device 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic device 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; a storage device 508 including, for example, a hard disk; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While FIG. 14 illustrates an electronic device 500 having various devices, it is to be understood that not all illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 14 may represent one device or may represent a plurality of devices as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program, when executed by the processing device 501, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more computer programs which, when executed by the electronic device, cause the electronic device to:
Computer program code for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++ and Python, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A multi-omics data analysis system, characterized by comprising an interaction module, a processing module and a plurality of omics databases,
the interaction module is used for providing an interactive form to a user to collect the user's data analysis requirement information on one or more omics databases;
the processing module is used for extracting or converting, from one or more omics databases, the associated data matched with the data analysis requirement information according to the data analysis requirement information, and returning the associated data to the interaction module;
the plurality of omics databases comprises a genomics database, a transcriptomics database, an epigenomics database and a proteomics database.
2. The multi-omics data analysis system according to claim 1, wherein the interaction module comprises a first interaction module, a second interaction module and a third interaction module,
the first interaction module is used for providing a first interactive form to a user to collect the user's data display requirement information on one or more omics databases;
the second interaction module is used for providing a second interactive form to the user to collect the user's data analysis requirement information on one or more omics databases;
and the third interaction module is used for providing a third interactive form to the user to collect the user's data conversion requirement information on one or more omics databases.
3. The multi-omics data analysis system according to claim 2, wherein the processing module comprises a first processing module, a second processing module and a third processing module,
the first processing module is used for extracting corresponding data from one or more omics databases according to the data display requirement information;
the second processing module is used for extracting and analyzing corresponding data from one or more omics databases according to the data analysis requirement information;
and the third processing module is used for matching a plurality of data with the same characteristic value in one or more omics databases according to the data conversion requirement information and realizing the mutual conversion.
4. The multi-omics data analysis system according to claim 3, wherein the third processing module comprises a mapping conversion unit, an interaction relation conversion unit and a position conversion module unit,
the mapping conversion unit is used for mapping and converting a plurality of data which belong to different omics databases and have association relations;
the interaction relation conversion unit is used for carrying out function prediction and judgment on a plurality of data which belong to different omics databases and interact with one another;
and the position conversion module unit is used for performing position conversion of corresponding omics on the adjacent data of one or more omics data through an element with coordinate information on a genome.
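As a non-limiting illustration of position conversion through an element with coordinate information on a genome, the following Python sketch looks up data of another omics type located within a window around an anchor element. The `Feature` record, the `distance` and `nearby` functions and the `window` parameter are assumptions introduced for the example and are not recited in the claims.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """An element with coordinate information on a genome (hypothetical record)."""
    name: str
    omics: str
    chrom: str
    start: int
    end: int


def distance(a: Feature, b: Feature) -> int | None:
    """Gap between two features; 0 if they overlap, None if on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def nearby(anchor: Feature, candidates: list[Feature],
           target_omics: str, window: int) -> list[Feature]:
    """Return features of `target_omics` lying within `window` bases of the anchor."""
    hits = []
    for cand in candidates:
        if cand.omics != target_omics:
            continue
        gap = distance(anchor, cand)
        if gap is not None and gap <= window:
            hits.append(cand)
    return sorted(hits, key=lambda f: f.start)
```

For example, a gene from a genomics database could serve as the anchor, and `nearby` would return the epigenomics records adjacent to it on the same chromosome.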
5. The multi-omics data analysis system according to claim 3, wherein the second processing module performs single-omics analysis or multi-omics association analysis on data belonging to one or more omics databases.
6. The multi-omics data analysis system according to claim 3, wherein the processing module further comprises a scheduling module, and the scheduling module fulfils the composite analysis requirements of the user through composite calls to at least two of the first processing module, the second processing module and the third processing module.
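A composite call by a scheduling module can be sketched, purely as an illustration, as a chain in which the output of one processing module is the input of the next. The `Scheduler` class, the `plan` list and the two stand-in modules `convert_ids` and `count_records` are hypothetical names chosen for the example, and the toy ID mapping is assumed data.

```python
from __future__ import annotations

from typing import Any, Callable

Module = Callable[[Any], Any]


class Scheduler:
    """Chains registered processing modules to serve a composite request."""

    def __init__(self, modules: dict[str, Module]) -> None:
        self.modules = modules

    def run(self, plan: list[str], payload: Any) -> Any:
        # A composite request involves at least two distinct modules.
        if len(set(plan)) < 2:
            raise ValueError("a composite request must call at least two modules")
        for name in plan:
            payload = self.modules[name](payload)
        return payload


def convert_ids(gene_ids: list[str]) -> list[str]:
    """Stand-in for a conversion module: map gene IDs to protein IDs."""
    mapping = {"G1": "P1", "G2": "P2"}  # assumed toy mapping
    return [mapping[g] for g in gene_ids if g in mapping]


def count_records(ids: list[str]) -> dict[str, int]:
    """Stand-in for an analysis module: summarize the converted records."""
    return {"n": len(ids)}
```

A plan such as `["convert", "analyze"]` first converts the identifiers and then analyzes the converted set, which is one way a conversion followed by an analysis could be combined.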
7. A data conversion method of a multi-omics data analysis system, characterized by comprising the following steps:
providing an interactive form to a user to collect the user's data analysis requirement information on one or more omics databases;
extracting or converting, from one or more omics databases, associated data matched with the data analysis requirement information according to the data analysis requirement information;
and displaying the associated data matched with the analysis requirement information to a user.
8. The data conversion method of a multi-omics data analysis system according to claim 7, wherein said extracting or converting, from one or more omics databases, the associated data matched with the data analysis requirement information according to the data analysis requirement information comprises the steps of:
respectively extracting data display requirement information, data analysis requirement information and data conversion requirement information from the data analysis requirement information;
extracting corresponding data from one or more omics databases according to the data display requirement information;
extracting and analyzing corresponding data from one or more omics databases according to the data analysis requirement information;
and matching a plurality of data with the same characteristic value in one or more omics databases according to the data conversion requirement information, and realizing mutual conversion.
9. An electronic device, comprising: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data conversion method of a multi-omics data analysis system according to claim 7 or 8.
10. A computer-readable medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data conversion method of a multi-omics data analysis system according to claim 7 or 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110545036.8A CN113377765A (en) 2021-07-09 2021-07-09 Multi-group chemical data analysis system and data conversion method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110545036.8A CN113377765A (en) 2021-07-09 2021-07-09 Multi-group chemical data analysis system and data conversion method thereof

Publications (1)

Publication Number Publication Date
CN113377765A true CN113377765A (en) 2021-09-10

Family

ID=77571233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110545036.8A Pending CN113377765A (en) 2021-07-09 2021-07-09 Multi-group chemical data analysis system and data conversion method thereof

Country Status (1)

Country Link
CN (1) CN113377765A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113921084A (en) * 2021-12-13 2022-01-11 山东大学齐鲁医院 Multi-dimensional target prediction method and system for disease-related non-coding RNA (ribonucleic acid) regulation and control axis

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2674758A1 (en) * 2012-06-13 2013-12-18 Agilent Technologies, Inc. A computational method for mapping peptides to proteins using sequencing data
CN107368704A (en) * 2017-07-21 2017-11-21 上海桑格信息技术有限公司 The interactive analysis system and method for the transcriptome project for having reference gene group based on cloud computing platform
CN107862176A (en) * 2017-10-13 2018-03-30 浙江大学 A kind of multi-level bio-networks method for reconstructing of plant full-length genome based on multigroup Data Integration
WO2018197648A1 (en) * 2017-04-27 2018-11-01 Koninklijke Philips N.V. Interactive precision medicine explorer for genomic abberations and treatment options
CN109817282A (en) * 2019-02-25 2019-05-28 上海市第六人民医院 A kind of the data correlation system and method for metabolome and microorganism group
CN110428866A (en) * 2019-07-23 2019-11-08 哈尔滨工业大学 Cancer related pathways recognition methods based on network integration multiple groups data
CN110706750A (en) * 2019-10-28 2020-01-17 广州基迪奥生物科技有限公司 Dynamic interactive microbiology online analysis cloud platform and generation method thereof
CN111161804A (en) * 2019-12-27 2020-05-15 北京百迈客生物科技有限公司 Query method and system for species genomics database
CN111863136A (en) * 2020-07-17 2020-10-30 上海市第六人民医院 Integrated system and method for correlation analysis among multiple sets of chemical data
CN112397146A (en) * 2020-12-02 2021-02-23 广东美格基因科技有限公司 Microbial omics data interaction analysis system based on cloud platform

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113921084A (en) * 2021-12-13 2022-01-11 山东大学齐鲁医院 Multi-dimensional target prediction method and system for disease-related non-coding RNA (ribonucleic acid) regulation and control axis
CN113921084B (en) * 2021-12-13 2022-03-08 山东大学齐鲁医院 Multi-dimensional target prediction method and system for disease-related non-coding RNA (ribonucleic acid) regulation and control axis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination