CN114255885B - New drug research and development management system and method based on graph data - Google Patents

New drug research and development management system and method based on graph data

Info

Publication number
CN114255885B
Authority
CN
China
Prior art keywords
compound
disease
information
point type
compounds
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111526092.3A
Other languages
Chinese (zh)
Other versions
CN114255885A (en)
Inventor
张晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Create Link Technology Co ltd
Original Assignee
Zhejiang Create Link Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Create Link Technology Co ltd
Priority to CN202111526092.3A
Publication of CN114255885A
Application granted
Publication of CN114255885B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00: ICT specially adapted for the handling or processing of medical references
    • G16H70/40: ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00: ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30: Drug targeting using structural data; Docking or binding prediction
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30: Data warehousing; Computing architectures
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50: Molecular design, e.g. of drugs
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/90: Programming languages; Computing architectures; Database systems; Data warehousing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Bioethics (AREA)
  • Software Systems (AREA)
  • Toxicology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

An embodiment of the invention discloses a new drug research and development management system and method based on graph data. The system comprises: a data acquisition module for acquiring and integrating medical data, the medical data including compound information, disease information, target gene information, and side effect information; a graph data module for constructing a graph model from the medical data, in which each compound, disease, target gene, and side effect is a vertex and each association factor between vertices is an edge; and a query prediction module for passing acquired query information to the graph model for prediction and displaying the returned prediction result. The beneficial effect is that, by building an association network from the compound, disease, target gene, and side effect information, a graph model is obtained that helps new drug researchers quickly discover the relationships among compounds, diseases, and target genes, which speeds up new drug research and development and improves its efficiency.

Description

New drug research and development management system and method based on graph data
Technical Field
The invention relates to the technical field of information processing, and in particular to a new drug research and development management system and method based on graph data.
Background
New drug development is a very time-consuming, costly, and labor-intensive undertaking. Data on the order of billions of records accumulate during the development stage, covering how various compounds treat diseases, which genes the compounds target, which side effects the compounds cause while treating diseases, and so on. These data are huge in volume and complex in their associations. If the value of the associated data could be unlocked quickly, the new drug development cycle would be greatly shortened, and more patients could receive new drugs sooner and be relieved of their suffering.
However, when these data are stored in a relational database, ten or more TB-scale relational tables are produced, each query requires writing ten or so query statements that join multiple relational tables, and obtaining a result takes a great deal of time. Moreover, new drug research and development has many stages, and each stage involves a large number of associated queries over large volumes of data. The inability to query these massive associated data quickly has become a major obstacle to improving the efficiency of new drug research and development.
Disclosure of Invention
The object of the invention is to provide a new drug research and development management system and method based on graph data that help new drug researchers quickly discover the relationships among compounds, diseases, and target genes and thereby speed up development.
First aspect: a new drug research and development management system based on graph data, comprising:
a data acquisition module for acquiring and integrating medical data, wherein the medical data includes compound information, disease information, target gene information, and side effect information;
a graph data module for constructing a graph model from the medical data, wherein each compound, disease, target gene, and side effect is regarded as a vertex, and each association factor between vertices is regarded as an edge;
and a query prediction module for passing acquired query information to the graph model for prediction and displaying the returned prediction result.
Preferably, the compound information includes compound ID, compound name, data source, international compound identification, and similar compound information;
the disease information includes a disease ID, a disease name, and similar disease information;
the target gene information comprises target gene ID, target gene name, gene description and chromosome;
the side effect information includes a side effect ID and a side effect name.
Preferably, the association factors include a plurality of factors, namely similar compound, similar disease, binding, treatment, cause, and association, and each factor serves as a corresponding edge type.
Preferably, when the edge type is similar compound, the corresponding start vertex type and end vertex type are both compound;
when the edge type is similar disease, the corresponding start vertex type and end vertex type are both disease;
when the edge type is binding, the corresponding start vertex type is compound and the end vertex type is target gene;
when the edge type is treatment, the corresponding start vertex type is compound and the end vertex type is disease;
when the edge type is cause, the corresponding start vertex type is compound and the end vertex type is side effect;
when the edge type is association, the corresponding start vertex type is disease and the end vertex type is target gene.
Preferably, a graph query language is used for querying, and the prediction results are ranked.
Second aspect: a new drug research and development management method based on graph data, applied to the new drug research and development management system based on graph data of the first aspect, the method comprising:
acquiring and integrating medical data, wherein the medical data includes compound information, disease information, target gene information, and side effect information;
constructing a graph model from the medical data, wherein each compound, disease, target gene, and side effect is regarded as a vertex, and each association factor between vertices is regarded as an edge;
and passing acquired query information to the graph model for prediction, and displaying the returned prediction result.
Preferably, the compound information includes compound ID, compound name, data source, international compound identification, and similar compound information;
the disease information includes a disease ID, a disease name, and similar disease information;
the target gene information comprises target gene ID, target gene name, gene description and chromosome;
the side effect information includes a side effect ID and a side effect name.
Preferably, the association factors include a plurality of factors, namely similar compound, similar disease, binding, treatment, cause, and association, and each factor serves as a corresponding edge type.
Preferably, when the edge type is similar compound, the corresponding start vertex type and end vertex type are both compound;
when the edge type is similar disease, the corresponding start vertex type and end vertex type are both disease;
when the edge type is binding, the corresponding start vertex type is compound and the end vertex type is target gene;
when the edge type is treatment, the corresponding start vertex type is compound and the end vertex type is disease;
when the edge type is cause, the corresponding start vertex type is compound and the end vertex type is side effect;
when the edge type is association, the corresponding start vertex type is disease and the end vertex type is target gene.
Preferably, a graph query language is used for querying, and the prediction results are ranked.
The above technical solution has the following advantages: in the new drug research and development management system and method based on graph data, a graph model is obtained by building an association network from the compound, disease, target gene, and side effect information, so that the associations among compounds, diseases, target genes, and side effects are presented in full. This helps new drug researchers quickly discover the relationships among compounds, diseases, and target genes, speeds up new drug research and development, and improves its efficiency.
Drawings
FIG. 1 is a system block diagram of a new drug research and development management system based on graph data provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a graph model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a prediction result according to an embodiment of the present invention;
FIG. 4 is a flowchart of a new drug research and development management method based on graph data according to an embodiment of the present invention.
Detailed Description
Specific embodiments of the invention will be described in detail below, it being noted that the embodiments described herein are for illustration only and are not intended to limit the invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that: no such specific details are necessary to practice the invention. In other instances, well-known circuits, software, or methods have not been described in detail in order not to obscure the invention.
Throughout the specification, references to "one embodiment," "an embodiment," "one example," or "an example" mean: a particular feature, structure, or characteristic described in connection with the embodiment or example is included within at least one embodiment of the invention. Thus, the appearances of the phrases "in one embodiment," "in an embodiment," "one example," or "an example" in various places throughout this specification are not necessarily all referring to the same embodiment or example. Furthermore, the particular features, structures, or characteristics may be combined in any suitable combination and/or sub-combination in one or more embodiments or examples. Moreover, those of ordinary skill in the art will appreciate that the illustrations provided herein are for illustrative purposes and that the illustrations are not necessarily drawn to scale.
The present invention will be described in detail with reference to the accompanying drawings.
Referring to FIG. 1 and FIG. 2, a new drug research and development management system based on graph data provided by an embodiment of the present invention includes:
a data acquisition module for acquiring and integrating medical data, wherein the medical data includes compound information, disease information, target gene information, and side effect information.
Specifically, the medical data includes medical data published on the internet and data accumulated by pharmaceutical companies themselves, and these data are taken as the sample dataset. Scale of the sample dataset: it contains 137 diseases, 1552 compounds, 5734 side effects, and 20945 target genes, together with about 170,000 edges describing relationships between vertices such as similarity and treatment; wherein:
the contents of the sample dataset are as follows:
Compound information: such as compound ID, compound name, data source, international compound identifier, URL;
Disease information: such as disease ID, disease name, data source, URL;
Target gene information: such as target gene ID, target gene name, data source, URL, gene description, chromosome;
Side effect information: such as side effect ID, side effect name, data source, URL;
Similar compound information: such as the similarity between two compounds, data source;
Similar disease information: such as data source;
Information on compounds causing side effects, compounds binding to target genes, compounds treating diseases, and diseases being associated with target genes.
A graph data module for constructing a graph model from the medical data, wherein each compound, disease, target gene, and side effect is regarded as a vertex, and each association factor between vertices is regarded as an edge.
Specifically, the association factors include a plurality of factors, namely similar compound, similar disease, binding, treatment, cause, and association, and each factor serves as a corresponding edge type.
Referring to Table 1, the vertex types in the graph model are:
TABLE 1
Correspondingly, when the edge type is similar compound, the corresponding start vertex type and end vertex type are both compound;
when the edge type is similar disease, the corresponding start vertex type and end vertex type are both disease;
when the edge type is binding, the corresponding start vertex type is compound and the end vertex type is target gene;
when the edge type is treatment, the corresponding start vertex type is compound and the end vertex type is disease;
when the edge type is cause, the corresponding start vertex type is compound and the end vertex type is side effect;
when the edge type is association, the corresponding start vertex type is disease and the end vertex type is target gene.
Specifically, referring to Table 2, the edge types in the graph model are:
TABLE 2
Start vertex type | Edge type        | End vertex type | Attributes
Compound          | Similar compound | Compound        | Similarity, data source
Compound          | Binding          | Target gene     | Data source
Compound          | Treatment        | Disease         | Data source
Compound          | Cause            | Side effect     | Data source
Disease           | Association      | Target gene     | Data source
Disease           | Similar disease  | Disease         | Data source
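The vertex and edge types in Table 2 can be read as a small schema. The following Python sketch shows one possible way to encode that schema and reject edges whose endpoints have the wrong type. It is an illustration only: the names `EDGE_SCHEMA`, `Vertex`, `Edge`, and `validate_edge`, the English edge-type labels, and the sample identifiers are assumptions and do not come from the patent.

```python
from dataclasses import dataclass, field

# Edge type -> (start vertex type, end vertex type), following Table 2.
# The English labels are an assumed rendering of the edge-type names.
EDGE_SCHEMA = {
    "similar compound": ("compound", "compound"),
    "binding": ("compound", "target gene"),
    "treatment": ("compound", "disease"),
    "cause": ("compound", "side effect"),
    "association": ("disease", "target gene"),
    "similar disease": ("disease", "disease"),
}


@dataclass(frozen=True)
class Vertex:
    vertex_type: str  # "compound", "disease", "target gene" or "side effect"
    vertex_id: str    # hypothetical identifier


@dataclass
class Edge:
    edge_type: str
    start: Vertex
    end: Vertex
    # Edge attributes such as similarity or data source.
    attributes: dict = field(default_factory=dict)


def validate_edge(edge: Edge) -> bool:
    """Return True if the edge type is known and its endpoint types match the schema."""
    expected = EDGE_SCHEMA.get(edge.edge_type)
    if expected is None:
        return False
    return (edge.start.vertex_type, edge.end.vertex_type) == expected
```

Checking endpoint types at load time is one way to keep data integrated from several sources consistent with the graph model before any query is run.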
And a query prediction module for passing acquired query information to the graph model for prediction and displaying the returned prediction result.
Specifically, a graph query language is used for querying, and the prediction results are ranked. In practice, graph query languages such as Cypher and Gremlin can condense the dozens of associated queries needed in the original relational database into a single query, which reduces the amount of code. The results can also be ranked by the similarity between the compounds obtained. Each vertex type involved in a query corresponds to at least one of the association factors; see Table 2 for details.
Further, to facilitate a better understanding of the present solution, specific business requirements are exemplified below.
Business requirement 1:
In new drug research and development, searching for hit compounds takes a great deal of time and effort. At present, hit compounds are found by random screening, which is largely blind. Graph data technology can be used to predict hit compounds from the perspectives of similarity and shared mechanism of action, thereby improving the efficiency of new drug research and development.
Query description:
find the diseases similar to a given disease, for example CERVICAL CANCER (cervical cancer);
find compounds capable of treating the similar diseases and take them as predicted hit compounds.
Query statement:
// Find the diseases similar to the disease CERVICAL CANCER (cervical cancer), and the compounds that have a therapeutic effect on those similar diseases
MATCH p = (j:disease {name: 'CERVICAL CANCER'})-[r:`similar disease`]-(h1)-[r1:treatment]-(f)
// Return the matched paths: the similar diseases and the compounds that treat them, which are the predicted hit compounds
RETURN p
Referring to FIG. 3, which shows the query result: the query first finds the diseases similar to cervical cancer, namely uterine cancer and ovarian cancer; then, via the treatment association factor, it finds compounds capable of treating those similar diseases and takes them as predicted hit compounds;
from FIG. 3, compounds that may be able to treat the disease CERVICAL CANCER (cervical cancer) can be found through disease similarity, and compounds that can treat both similar diseases can be given priority for experimental verification.
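The two-hop traversal of business requirement 1 can also be sketched without a graph database. The following Python example is a minimal in-memory illustration, assuming a toy edge list; the compound names and the functions `similar_diseases` and `predict_hit_compounds` are hypothetical, and only the disease names are taken from the example above.

```python
# Toy graph. Compound names are made up for illustration only.
SIMILAR_DISEASE = [  # undirected "similar disease" edges
    ("cervical cancer", "uterine cancer"),
    ("cervical cancer", "ovarian cancer"),
]
TREATMENT = [  # (compound, disease) "treatment" edges
    ("compound A", "uterine cancer"),
    ("compound A", "ovarian cancer"),
    ("compound B", "ovarian cancer"),
    ("compound C", "sarcoma"),
]


def similar_diseases(disease):
    """Diseases joined to `disease` by a similar-disease edge, in either direction."""
    found = set()
    for a, b in SIMILAR_DISEASE:
        if a == disease:
            found.add(b)
        elif b == disease:
            found.add(a)
    return found


def predict_hit_compounds(disease):
    """Return (compound, similar diseases it treats) pairs, best candidates first."""
    neighbours = similar_diseases(disease)
    hits = {}
    for compound, treated in TREATMENT:
        if treated in neighbours:
            hits.setdefault(compound, set()).add(treated)
    # Compounds that treat more of the similar diseases come first.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


for compound, diseases in predict_hit_compounds("cervical cancer"):
    print(compound, sorted(diseases))
```

Sorting by the number of similar diseases treated mirrors the idea of giving priority to compounds that treat both similar diseases; it is one possible ranking, not one the patent prescribes.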
Business requirement 2:
Query description:
find the compounds capable of treating the disease sarcoma;
find compounds similar to those compounds and take them as predicted hit compounds.
Query statement:
// Find compounds similar to the compounds capable of treating the disease sarcoma
MATCH p = (j:disease {name: 'sarcoma'})-[r:treatment]-(h1)-[r1:`similar compound`]-(f)
// Return the compounds that treat the disease sarcoma, together with the predicted hit compounds
RETURN p
Finally, compounds that may be able to treat the disease sarcoma are found through compound similarity; they are ranked by similarity and then verified experimentally.
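The ranking step in business requirement 2 can be sketched as follows. This is a minimal Python illustration under assumed data: the compound names, the similarity scores, and the function `rank_similar_compounds` are hypothetical. As a design choice of this sketch, compounds already known to treat the disease are left out of the ranked list.

```python
# Toy data. All names and scores are made up for illustration only.
TREATMENT = [  # (compound, disease) "treatment" edges
    ("compound A", "sarcoma"),
    ("compound B", "sarcoma"),
]
SIMILAR_COMPOUND = [  # undirected edges: (compound, compound, similarity)
    ("compound A", "compound X", 0.91),
    ("compound B", "compound Y", 0.78),
    ("compound X", "compound B", 0.85),
    ("compound A", "compound B", 0.60),
]


def rank_similar_compounds(disease):
    """Rank compounds similar to known treatments by their best similarity score."""
    known = {compound for compound, treated in TREATMENT if treated == disease}
    best = {}
    for a, b, score in SIMILAR_COMPOUND:
        # The edge is undirected, so check it from both ends.
        for source, candidate in ((a, b), (b, a)):
            if source in known and candidate not in known:
                best[candidate] = max(score, best.get(candidate, 0.0))
    # Highest similarity first; ties are broken by name for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


for compound, score in rank_similar_compounds("sarcoma"):
    print(compound, score)
```

Keeping only the best score per candidate is an assumption; averaging over all known treatments would be an equally plausible ranking rule.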
Business requirement 3:
Query description:
find the compounds capable of treating the disease primary biliary cirrhosis;
find the target genes and side effects of those compounds;
find compounds that have the same target genes and side effects as those compounds, and take them as predicted hit compounds.
Query statement:
// Find the compounds capable of treating the disease primary biliary cirrhosis, and the compounds that have the same side effects and bind the same target genes as those compounds
MATCH p = (j:disease {name: 'primary biliary cirrhosis'})<-[r:treatment]-(h1:compound)-[r1:cause]->(f)<-[r2:cause]-(h2:compound)-[r3:bind]->(b)<-[r4:bind]-(h1)
// Return the compounds that share side effects and bound genes with the compounds treating primary biliary cirrhosis; these are the predicted hit compounds
RETURN p
Finally, compounds that may be able to treat the disease primary biliary cirrhosis are found through the binding genes and side effects they share with known treatments, and these compounds can then be verified experimentally.
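The shared-mechanism pattern of business requirement 3 requires a candidate compound to share both a side effect and a target gene with a compound that treats the disease. The following Python sketch shows that condition on assumed toy data; the compound, side effect, and gene names and the functions `index_by_compound` and `predict_by_shared_mechanism` are hypothetical.

```python
# Toy data. All names are made up for illustration only.
TREATMENT = [("compound A", "primary biliary cirrhosis")]
CAUSE = [  # (compound, side effect) "cause" edges
    ("compound A", "nausea"),
    ("compound B", "nausea"),
    ("compound C", "nausea"),
    ("compound D", "headache"),
]
BINDING = [  # (compound, target gene) "binding" edges
    ("compound A", "gene G1"),
    ("compound B", "gene G1"),
    ("compound C", "gene G2"),
    ("compound D", "gene G1"),
]


def index_by_compound(edges):
    """Group the end vertices of (compound, target) edges by compound."""
    index = {}
    for compound, target in edges:
        index.setdefault(compound, set()).add(target)
    return index


def predict_by_shared_mechanism(disease):
    """Compounds sharing at least one side effect AND one target gene with a known treatment."""
    side_effects = index_by_compound(CAUSE)
    genes = index_by_compound(BINDING)
    known = {compound for compound, treated in TREATMENT if treated == disease}
    hits = set()
    for h1 in known:
        for h2 in side_effects.keys() | genes.keys():
            if h2 in known:
                continue
            shared_effects = side_effects.get(h1, set()) & side_effects.get(h2, set())
            shared_genes = genes.get(h1, set()) & genes.get(h2, set())
            if shared_effects and shared_genes:
                hits.add(h2)
    return sorted(hits)


print(predict_by_shared_mechanism("primary biliary cirrhosis"))
```

In this toy data only compound B satisfies both conditions: compound C shares a side effect but not a gene, and compound D shares a gene but not a side effect.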
With the above scheme, a graph model is obtained by building an association network from the compound, disease, target gene, and side effect information, so that the associations among compounds, diseases, target genes, and side effects are presented in full. This helps new drug researchers quickly discover the relationships among compounds, diseases, and target genes, speeds up new drug research and development, and improves its efficiency.
Based on the same inventive concept as the system, and referring to FIG. 4, an embodiment of the invention further provides a new drug research and development management method based on graph data, which is applied to the above new drug research and development management system based on graph data. The method includes:
S101, acquiring and integrating medical data, wherein the medical data includes compound information, disease information, target gene information, and side effect information.
Specifically, the medical data includes medical data published on the internet and data accumulated by pharmaceutical companies themselves.
The compound information includes compound ID, compound name, data source, international compound identity, and similar compound information;
the disease information includes a disease ID, a disease name, and similar disease information;
the target gene information comprises target gene ID, target gene name, gene description and chromosome;
the side effect information includes a side effect ID and a side effect name.
S102, constructing a graph model from the medical data, wherein each compound, disease, target gene, and side effect is regarded as a vertex, and each association factor between vertices is regarded as an edge.
Specifically, the association factors include a plurality of factors, namely similar compound, similar disease, binding, treatment, cause, and association, and each factor serves as a corresponding edge type.
Correspondingly, when the edge type is similar compound, the corresponding start vertex type and end vertex type are both compound;
when the edge type is similar disease, the corresponding start vertex type and end vertex type are both disease;
when the edge type is binding, the corresponding start vertex type is compound and the end vertex type is target gene;
when the edge type is treatment, the corresponding start vertex type is compound and the end vertex type is disease;
when the edge type is cause, the corresponding start vertex type is compound and the end vertex type is side effect;
when the edge type is association, the corresponding start vertex type is disease and the end vertex type is target gene.
S103, passing acquired query information to the graph model for prediction, and displaying the returned prediction result.
Specifically, a graph query language is used for querying, and the prediction results are ranked. In practice, graph query languages such as Cypher and Gremlin can condense the dozens of associated queries needed in the original relational database into a single query, which reduces the amount of code. The results can also be ranked by the similarity between the compounds obtained.
It should be noted that, for more specific working processes and examples of the method, please refer to the foregoing system embodiment part, and no further description is provided herein.
With this method, the constructed graph model presents the associations among compounds, diseases, and genes in full, which helps new drug researchers quickly discover the relationships among compounds, diseases, and genes and speeds up new drug research and development.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention, and are intended to be included within the scope of the appended claims and description.

Claims (4)

1. A new drug research and development management system based on graph data, characterized by comprising:
a data acquisition module for acquiring and integrating medical data, wherein the medical data includes compound information, disease information, target gene information, and side effect information;
a graph data module for constructing a graph model from the medical data, wherein each compound, disease, target gene, and side effect is regarded as a vertex, and each association factor between vertices is regarded as an edge; the association factors include a plurality of factors, namely similar compound, similar disease, binding, treatment, cause, and association, and each factor serves as a corresponding edge type; each vertex type involved corresponds to at least one of the association factors;
a query prediction module for passing acquired query information to the graph model for prediction and displaying the returned prediction result;
in the prediction, hit compounds are predicted from the perspectives of similarity and shared mechanism of action;
similar diseases are searched for, and compounds capable of treating the similar diseases are found via the treatment association factor and taken as predicted hit compounds;
compounds having the same target genes and side effects as a given compound are found and taken as predicted hit compounds;
during querying, a graph query language is used, and the prediction results are ranked and then verified experimentally;
the compound information includes compound ID, compound name, data source, international compound identifier, and similar compound information, wherein the similar compound information includes the similarity between two compounds;
the disease information includes disease ID, disease name, and similar disease information;
the target gene information includes target gene ID, target gene name, gene description, and chromosome;
the side effect information includes side effect ID and side effect name;
together with information on compounds causing side effects, compounds binding to target genes, compounds treating diseases, and diseases being associated with target genes.
2. The new drug research and development management system based on graph data of claim 1, wherein: when the edge type is similar compound, the corresponding start vertex type and end vertex type are both compound;
when the edge type is similar disease, the corresponding start vertex type and end vertex type are both disease;
when the edge type is binding, the corresponding start vertex type is compound and the end vertex type is target gene;
when the edge type is treatment, the corresponding start vertex type is compound and the end vertex type is disease;
when the edge type is cause, the corresponding start vertex type is compound and the end vertex type is side effect;
when the edge type is association, the corresponding start vertex type is disease and the end vertex type is target gene.
3. A new drug research and development management method based on graph data, characterized in that it is applied to the new drug research and development management system based on graph data of claim 1, the method comprising:
acquiring and integrating medical data; wherein the medical data includes compound information, disease information, target gene information, and side effect information;
constructing a graph model from the medical data; wherein each compound, disease, target gene, and side effect is treated as a vertex, and the association factors between vertices are treated as edges; the association factors include similar compound, similar disease, binding, treating, causing, and linking, and each factor is taken as a corresponding edge type; each related point type corresponds to at least one association factor;
transmitting the acquired query information to the graph model for prediction, and displaying the returned prediction result;
during prediction, hit compounds are predicted from two angles: similarity and shared mechanism of action;
similar diseases are searched for, and compounds capable of treating the similar diseases are found through the treating association factor and taken as predicted hit compounds;
compounds that have the same target genes and side effects as a given compound are found and taken as predicted hit compounds;
during querying, a graph query language is used, and the prediction results are ranked and then verified experimentally;
the compound information includes a compound ID, a compound name, a data source, an international compound identifier, and similar compound information; wherein the similar compound information includes the similarity between two compounds;
the disease information includes a disease ID, a disease name, and similar disease information;
the target gene information includes a target gene ID, a target gene name, a gene description, and a chromosome;
the side effect information includes a side effect ID and a side effect name;
information on compounds causing side effects, compounds binding to target genes, compounds treating diseases, and diseases linking to target genes is also included.
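The mechanism-based angle of the method (finding compounds that share target genes and side effects with a given compound) can be sketched as a set-intersection query. The dictionaries `BINDS` and `CAUSES`, the IDs in them, the function name, and the ranking score (the count of shared target genes plus shared side effects) are all assumptions for illustration. The claim does not specify how results are ranked, and this sketch requires at least one shared item of each kind rather than identical sets.

```python
# Hypothetical adjacency data: compound -> set of target genes / side effects.
BINDS = {
    "C1": {"G1", "G2"},
    "C2": {"G1", "G2"},
    "C3": {"G1"},
    "C4": {"G3"},
}
CAUSES = {
    "C1": {"S1", "S2"},
    "C2": {"S1"},
    "C3": {"S1"},
    "C4": {"S1"},
}


def predict_by_shared_mechanism(binds, causes, reference):
    """Rank compounds sharing target genes AND side effects with `reference`."""
    ref_targets = binds.get(reference, set())
    ref_effects = causes.get(reference, set())
    ranked = []
    for compound in binds.keys() | causes.keys():
        if compound == reference:
            continue
        shared_targets = ref_targets & binds.get(compound, set())
        shared_effects = ref_effects & causes.get(compound, set())
        # Require overlap on both the binding and the causing edge types.
        if shared_targets and shared_effects:
            score = len(shared_targets) + len(shared_effects)  # assumed rule
            ranked.append((compound, score))
    # Sort by descending score, then by ID for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


print(predict_by_shared_mechanism(BINDS, CAUSES, "C1"))  # [('C2', 3), ('C3', 2)]
```

C4 is dropped because it shares a side effect with C1 but no target gene. In a graph database the same pattern would be expressed as a two-hop match over the binding and causing edge types.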
4. The new drug research and development management method based on graph data of claim 3, wherein: when the edge type is similar compound, the corresponding starting point type and ending point type are both compound;
when the edge type is similar disease, the corresponding starting point type and ending point type are both disease;
when the edge type is binding, the corresponding starting point type is compound, and the ending point type is target gene;
when the edge type is treating, the corresponding starting point type is compound, and the ending point type is disease;
when the edge type is causing, the corresponding starting point type is compound, and the ending point type is side effect;
when the edge type is linking, the corresponding starting point type is disease, and the ending point type is target gene.
CN202111526092.3A 2021-12-14 2021-12-14 New drug research and development management system and method based on graph data Active CN114255885B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111526092.3A CN114255885B (en) 2021-12-14 2021-12-14 New drug research and development management system and method based on graph data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111526092.3A CN114255885B (en) 2021-12-14 2021-12-14 New drug research and development management system and method based on graph data

Publications (2)

Publication Number Publication Date
CN114255885A CN114255885A (en) 2022-03-29
CN114255885B CN114255885B (en) 2024-09-13

Family

ID=80792178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111526092.3A Active CN114255885B (en) 2021-12-14 2021-12-14 New drug research and development management system and method based on graph data

Country Status (1)

Country Link
CN (1) CN114255885B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191014A (en) * 2019-12-26 2020-05-22 上海科技发展有限公司 Medicine relocation method, system, terminal and medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1490822A2 (en) * 2002-02-04 2004-12-29 Ingenuity Systems Inc. Drug discovery methods
KR101117603B1 (en) * 2011-08-16 2012-03-07 (주)신테카바이오 System and method for providing functional correlation information of biomedical data by generating inter-linkable maps
US20150371009A1 (en) * 2014-06-19 2015-12-24 Jake Yue Chen Drug identification models and methods of using the same to identify compounds to treat disease
CN109325131B (en) * 2018-09-27 2021-03-02 大连理工大学 Medicine identification method based on biomedical knowledge map reasoning
KR102225278B1 (en) * 2020-01-31 2021-03-10 주식회사 스탠다임 Prediction Method for Disease, Gene or Protein related Query Entity and built Prediction System using the same
CN113742443B (en) * 2020-05-29 2024-09-10 京东方科技集团股份有限公司 Multi-drug sharing query method, mobile terminal and storage medium
CN113707264B (en) * 2021-08-31 2024-09-06 平安科技(深圳)有限公司 Machine learning-based medicine recommendation method, device, equipment and medium

Also Published As

Publication number Publication date
CN114255885A (en) 2022-03-29

Similar Documents

Publication Publication Date Title
Agrawal et al. Large language models are few-shot clinical information extractors
Bravo et al. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research
Chao et al. Multi-view cluster analysis with incomplete data to understand treatment effects
CN109493925B (en) Method for determining incidence relation between medicine and medicine target
Dhombres et al. Interoperability between phenotypes in research and healthcare terminologies—Investigating partial mappings between HPO and SNOMED CT
Zhu et al. Biomedical text mining and its applications in cancer research
Veronin et al. A systematic approach to 'cleaning' of drug name records data in the FAERS database: a case report
Hettne et al. Rewriting and suppressing UMLS terms for improved biomedical term identification
Sinisi et al. Optimal personalised treatment computation through in silico clinical trials on patient digital twins
Wei et al. SimConcept: A hybrid approach for simplifying composite named entities in biomedicine
CN114860887A (en) Disease content pushing method, device, equipment and medium based on intelligent association
CN114255885B (en) New drug research and development management system and method based on graph data
Lin et al. Outcomes of out-of-hospital cardiac arrests after a decade of system-wide initiatives optimising community chain of survival in Taipei city
Weinzierl et al. The impact of learning Unified Medical Language System knowledge embeddings in relation extraction from biomedical texts
CN115376704A (en) Medicine-disease interaction prediction method fusing multi-neighborhood correlation information
CN113064960A (en) Method for accurately searching cases similar to patient's condition
Shi et al. Predicting binary, discrete and continued lncRNA-disease associations via a unified framework based on graph regression
Di Lena et al. MIMO: an efficient tool for molecular interaction maps overlap
Mortensen et al. Modest Use of Ontology Design Patterns in a Repository of Biomedical Ontologies.
Gravina et al. Controlling astrocyte-mediated synaptic pruning signals for schizophrenia drug repurposing with deep graph networks
Samuel et al. Mining online full-text literature for novel protein interaction discovery
CN114121293A (en) Clinical trial information mining and inquiring method and device
US20200303033A1 (en) System and method for data curation
CN112667809A (en) Text processing method and device, electronic equipment and storage medium
CN114765060A (en) Multi-attention method for predicting drug target interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant