CN110503158A - A kind of disease associated analysis method of drug based on time factor - Google Patents

Info

Publication number
CN110503158A
Authority
CN
China
Prior art keywords
disease
drug
community
analysis
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910800975.5A
Other languages
Chinese (zh)
Inventor
刘文丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Health And Medical Big Data Co Ltd
Original Assignee
Shandong Health And Medical Big Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Health And Medical Big Data Co Ltd
Priority to CN201910800975.5A
Publication of CN110503158A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Medicinal Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biotechnology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a drug-disease correlation analysis method based on a time factor, belonging to the technical field of applying complex networks to disease analysis. The method first measures the correlation between drugs and diseases using the number of patients as the weight, and then obtains the positive or negative correlation between a drug and a disease by analyzing patients' medication times and diagnosis times. The method can make up for shortcomings in research on disease-drug correlation and provide direction for experimental research in post-marketing drug studies, and therefore has good application value.

Description

A drug-disease correlation analysis method based on a time factor
Technical field
The present invention relates to the technical field of applying complex networks to disease analysis, and specifically provides a drug-disease correlation analysis method based on a time factor.
Background
At present, research on drugs and diseases proceeds mainly in two ways. The first is the clinical trial, which requires researchers to test repeatedly. This consumes substantial manpower and material resources and is also very time-consuming: a drug generally takes 10 to 15 years to go from research to use [1]. The second is based on real-world data and uses various data mining techniques to study the correlation between drugs and diseases. With the rapid development of computer hardware, the theory of complex networks, machine learning, and neural networks has matured, and these methods are increasingly welcomed by the medical field. Among them, complex network research is one of the important methods for studying the relationship between drugs and diseases with real-world data. The main applications of complex networks in drug and disease research, in China and abroad, are as follows.
First, the application of complex networks in traditional Chinese medicine research. Here, traditional Chinese medicine data are modeled as a network, the characteristics of the network are analyzed, communities are then found with a community discovery algorithm, and the communities are studied one by one. Using this method, the network of the hepatitis B syndrome population was found to have small-world characteristics.
Second, the application of complex networks in disease research. Diseases that are common in real life, such as diabetes, cancer, and cardiovascular disease, are influenced by both environment and genetics. A complex network is used to model the interaction network of the proteins related to such diseases, and the abnormal targets of the disease are identified through network analysis.
Third, the application of complex networks in research on the correlation between diseases and drugs. At present, when complex networks are used to study the correlation between diseases and drugs, the network is built mainly from diseases, genes, and drugs. The complex network is then analyzed to explore the distribution characteristics of a given class of drug side effects and to mine the correlation of diseases, genes, and so on with drug side effects.
Adding genetic factors to complex-network research on disease-drug correlation can indeed, to some extent, reveal the correlation between diseases and drug side effects. However, this approach ignores real-world individual patient factors, especially the time of disease onset and the time of medication. Because a drug is taken by a patient, it usually acts on the whole person rather than on a particular class of disease, so individual patient factors need to be included in the research. Moreover, over a lifetime an individual may contract different diseases in different periods or several classes of disease in the same period, and may take several classes of drugs at the same time or different classes of drugs at different times. Therefore, when individual factors are added to the network, the time factor is also an important feature that cannot be ignored.
Summary of the invention
The technical task of the present invention is, in view of the above problems, to provide a drug-disease correlation analysis method based on a time factor that can make up for the shortcomings of research on disease-drug correlation and provide direction for experimental research in post-marketing drug studies.
To achieve the above object, the present invention provides the following technical solution:
A drug-disease correlation analysis method based on a time factor: the method first measures the correlation between drugs and diseases using the number of patients as the weight, and then obtains the positive or negative correlation between a drug and a disease by analyzing patients' medication times and diagnosis times.
Preferably, the drug-disease correlation analysis method based on a time factor specifically comprises the following steps:
S1. Mining of drug-disease correlation
S11. Establish a complex network: build the network with drugs, diseases, and patients as nodes;
S12. Community discovery in the heterogeneous network;
S13. Correlation analysis, including analysis of community tightness and analysis of the correlation between drug and disease nodes within a community;
S2. Analysis of positive and negative correlation between drugs and diseases
S21. Sort out the mined drug-disease pairs with higher correlation;
S22. Analyze the positive or negative correlation of the remaining drug-disease pairs.
Preferably, the heterogeneous network community discovery of step S12 generates random walk sequences in the manner of node2vec, trains a model in the manner of word2vec, and divides communities with a clustering algorithm.
Preferably, the correlation analysis of step S13 includes analysis of community tightness and analysis of the correlation between drug and disease nodes within a community.
Preferably, the analysis of community tightness determines how tightly the nodes in a community are grouped by computing the average intra-class distance between the samples in the class, as shown in formula (1).
Here, M is the community tightness; the smaller M is, the tighter the community. N is the total number of drug, disease, and patient nodes in the community. X denotes the vectors of the drug, disease, and patient nodes in the community obtained by the node2vec algorithm.
Preferably, the correlation between drug and disease nodes within a community is determined by the number of patients connecting the drug and the disease: the more patients there are between a drug and a disease, the greater the correlation between that drug node and that disease node within the community.
Preferably, step S21 sorts out the mined drug-disease pairs with higher correlation, namely the pairs in which the drug and the disease are directly associated.
Preferably, the criteria by which step S22 analyzes the positive or negative correlation of the remaining drug-disease pairs are:
1) if the patient took the drug earlier than the disease was diagnosed, and the difference between the two times is within the threshold range, there is a negative correlation between the drug and the disease;
2) if the patient took the drug later than the disease was diagnosed, and the disease did not recur within the threshold time after the medication time, there is a positive correlation between the drug and the disease.
The threshold here must be determined according to the duration of effect of the drug actually involved, and should generally lie within that duration.
Compared with the prior art, the drug-disease correlation analysis method based on a time factor of the present invention has the following notable beneficial effects: the method can analyze in advance which classes of drugs are more strongly correlated with which classes of diseases; professional knowledge is then used to select among the analysis results, which provides direction for experimental research in post-marketing studies and improves the accuracy of that research. While making up for the shortcomings of complex networks in research on disease-drug correlation, the method provides preliminary research based on real-world data for experimental research in post-marketing drug studies, and has good application value.
Brief description of the drawings
Fig. 1 is a flowchart of the drug-disease correlation analysis method based on a time factor of the present invention.
Detailed description
The drug-disease correlation analysis method based on a time factor of the present invention is described in further detail below with reference to the drawing and the embodiment.
Embodiment
The drug-disease correlation analysis method based on a time factor of the present invention first represents the correlation between drugs and diseases using the number of patients as the weight, which makes up for the lack of individual factors in complex-network research on disease-drug correlation. Second, the method explores the positive or negative correlation between drugs and diseases under real-world data conditions by analyzing patients' medication times and diagnosis times.
A specific embodiment of the present invention is described below based on the Python language.
As shown in Fig. 1, the drug-disease correlation analysis method based on a time factor specifically comprises the following steps:
S1. Mining of drug-disease correlation.
S11. Establish a complex network: the network is built with drugs, diseases, and patients as nodes. The relationships used to build the network among drug, disease, and patient nodes are listed in Table 1, and the attributes of drug nodes, disease nodes, and patient nodes are listed in Table 2, Table 3, and Table 4, respectively.
The complex network is built with the networkx framework. Before the network is built, samples must be collected. Sampling must be comprehensive; the sampling institutions are generally general hospitals rather than specialized hospitals. Samples are collected item by item according to the attributes listed in Tables 1 to 4; no field may be empty, and the sampled data must be real and valid.
Table 1
Table 2
Serial number | Attribute name | Attribute description
1 | Drug name | Standard drug name
2 | Drug code | Standard drug code
Table 3
Serial number | Attribute name | Attribute description
1 | Disease name | Standard disease name
2 | Disease code | ICD disease code
Table 4
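The network construction of step S11 can be sketched as follows. The patent builds the network with networkx; to stay dependency-free this sketch uses plain dictionaries instead. The record layout, the node codes, and the choice of patient-drug and patient-disease edges are assumptions made for illustration, since the relationship table is not reproduced in this text.

```python
from collections import defaultdict

# Hypothetical sample records, one per patient encounter:
# (patient_id, drug_code, disease_code). Real fields would follow Tables 1-4.
RECORDS = [
    ("p1", "drugA", "I10"),
    ("p2", "drugA", "I10"),
    ("p2", "drugB", "E11"),
    ("p3", "drugA", "E11"),
]


def build_network(records):
    """Build an undirected adjacency map over typed drug, disease, and patient nodes."""
    adjacency = defaultdict(set)
    for patient_id, drug_code, disease_code in records:
        patient = ("patient", patient_id)
        drug = ("drug", drug_code)
        disease = ("disease", disease_code)
        # Assumed edge types: patient-drug and patient-disease.
        adjacency[patient].update({drug, disease})
        adjacency[drug].add(patient)
        adjacency[disease].add(patient)
    return dict(adjacency)


network = build_network(RECORDS)
for node in sorted(network):
    print(node, "->", sorted(network[node]))
```

Tagging each node with its type keeps the three node kinds distinguishable in later steps, which a heterogeneous network requires.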
S12. Community discovery in the heterogeneous network.
The node2vec analysis is carried out with the node2vec package built on the networkx framework. The aim is for nodes with homogeneity to learn vectors that are closer together, so that strongly associated nodes are found and form communities. That is, when the node2vec random walk is performed, the walk strategy is biased toward BFS (breadth-first search) rather than DFS (depth-first search). For this case, the walk parameters are set as follows.
Here p and q are the random walk parameters: p is the return parameter and q is the in-out parameter.
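The role of p and q can be illustrated with a minimal, dependency-free sketch of a node2vec-style second-order random walk. This is not the node2vec package the embodiment uses. The toy network and the values p=1.0 and q=2.0 are assumptions, because the patent's actual parameter settings are not reproduced in this text; q greater than 1 is chosen here because it discourages outward moves and so gives the BFS-like bias the text describes.

```python
import random


def biased_walk(adjacency, start, length, p=1.0, q=2.0, seed=0):
    """Generate one node2vec-style second-order random walk of up to `length` nodes."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(adjacency[current])
        if not neighbors:
            break  # isolated node: nothing to walk to
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))  # first step is unbiased
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # return to the node we just left
            elif candidate in adjacency[previous]:
                weights.append(1.0)      # stay one hop from the previous node
            else:
                weights.append(1.0 / q)  # move outward; q > 1 discourages this
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical toy network: two drugs, two diseases, three patients.
ADJACENCY = {
    "drugA": {"p1", "p2"}, "drugB": {"p3"},
    "I10": {"p1", "p2"}, "E11": {"p2", "p3"},
    "p1": {"drugA", "I10"}, "p2": {"drugA", "I10", "E11"}, "p3": {"drugB", "E11"},
}

walks = [biased_walk(ADJACENCY, node, length=6, seed=i)
         for i, node in enumerate(sorted(ADJACENCY))]
print(walks[0])
```

The resulting walks play the role of sentences for the word2vec training step, with each node acting as a word.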
After the node sequences are obtained from the random walks, a model is trained with the word2vec function of the Gensim framework in order to vectorize the nodes.
Clustering is then performed with a clustering function from the sklearn framework (such as kmeans) to divide the communities. Note that kmeans clustering requires the number of communities to be specified in advance; it can be set according to the specific sample.
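The community division can be illustrated with a small, dependency-free k-means sketch standing in for the sklearn function named above. The two-dimensional node vectors are made-up values rather than trained embeddings, and the farthest-point initialization is a simplification chosen to make the example deterministic.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=20):
    """Lloyd's k-means with farthest-point initialization; returns one label per vector."""
    centroids = [tuple(vectors[0])]
    while len(centroids) < k:
        # Next centroid: the vector farthest from all centroids chosen so far.
        centroids.append(tuple(max(
            vectors, key=lambda v: min(euclidean(v, c) for c in centroids))))
    labels = []
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical node vectors; in the embodiment these would come from word2vec.
EMBEDDINGS = {
    "drugA": (0.0, 0.1), "I10": (0.1, 0.0), "p1": (0.05, 0.05),
    "drugB": (5.0, 5.1), "E11": (5.1, 5.0), "p3": (5.05, 5.05),
}
names = list(EMBEDDINGS)
labels = kmeans([EMBEDDINGS[n] for n in names], k=2)
communities = {}
for name, label in zip(names, labels):
    communities.setdefault(label, []).append(name)
print(communities)
```

The value of k is the number of communities that must be fixed in advance, as the text notes.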
S13. Correlation analysis, including analysis of community tightness and analysis of the correlation between drug and disease nodes within a community.
The analysis of community tightness determines how tightly the nodes in a community are grouped by computing the average intra-class distance between the samples in the class, as shown in formula (1).
Here, M is the community tightness; the smaller M is, the tighter the community. N is the total number of drug, disease, and patient nodes in the community. X denotes the vectors of the drug, disease, and patient nodes in the community obtained by the node2vec algorithm.
The average intra-class distance is computed for each cluster obtained, and the communities with smaller intra-class distance, that is, the tighter communities, are selected for analysis so as to narrow the scope of computation. Within the selected communities, the drug and disease nodes that share many connected patients but have no direct association are counted.
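The counting step can be sketched as follows, using the number of shared patients as the weight of each drug-disease pair. The adjacency data, the typed-node layout, and the `known_pairs` argument for excluding pairs that already have a direct association are all hypothetical; the patent does not specify how direct associations are recorded.

```python
from itertools import product

# Hypothetical adjacency: each drug or disease node maps to its connected patients.
ADJACENCY = {
    ("drug", "drugA"): {"p1", "p2", "p3"},
    ("drug", "drugB"): {"p2"},
    ("disease", "I10"): {"p1", "p2"},
    ("disease", "E11"): {"p2", "p3"},
}


def shared_patient_counts(adjacency, community, known_pairs=frozenset()):
    """Count shared patients per drug-disease pair, skipping directly associated pairs."""
    drugs = sorted(n for n in community if n[0] == "drug")
    diseases = sorted(n for n in community if n[0] == "disease")
    counts = {}
    for drug, disease in product(drugs, diseases):
        pair = (drug[1], disease[1])
        if pair in known_pairs:
            continue  # already directly associated, so not a candidate
        shared = adjacency[drug] & adjacency[disease]
        if shared:
            counts[pair] = len(shared)
    return counts


community = list(ADJACENCY)
counts = shared_patient_counts(ADJACENCY, community, known_pairs={("drugA", "I10")})
# Highest patient count first; ties are broken alphabetically.
ranked_pairs = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked_pairs)
```

Pairs near the top of the ranking are the candidates passed on to the time-based analysis of step S2.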
S2. Analysis of positive and negative correlation between drugs and diseases.
S21. Sort out the mined drug-disease pairs with higher correlation, namely the pairs in which the drug and the disease are directly associated.
S22. Analyze the positive or negative correlation of the remaining drug-disease pairs.
The criteria for analyzing the positive or negative correlation of the remaining drug-disease pairs are:
1) if the patient took the drug earlier than the disease was diagnosed, and the difference between the two times is within the threshold range, there is a negative correlation between the drug and the disease, that is, the drug may cause the disease.
2) if the patient took the drug later than the disease was diagnosed, and the disease did not recur within the threshold time after the medication time, there is a positive correlation between the drug and the disease, that is, the drug may cure or alleviate the disease.
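The two criteria above can be sketched as a small classification function. The function name, the "undetermined" label for cases that neither criterion covers, the 30-day threshold, and the representation of recurrences as a list of dates are assumptions made for illustration.

```python
from datetime import date, timedelta


def classify_pair(medication_date, diagnosis_date, threshold, recurrence_dates=()):
    """Apply the two time-based criteria to one patient's drug-disease record."""
    if medication_date < diagnosis_date:
        # Criterion 1: drug taken first, diagnosis follows within the threshold.
        if diagnosis_date - medication_date <= threshold:
            return "negative"
        return "undetermined"
    if medication_date > diagnosis_date:
        # Criterion 2: drug taken after diagnosis, no recurrence within the threshold.
        window_end = medication_date + threshold
        recurred = any(medication_date < r <= window_end for r in recurrence_dates)
        return "undetermined" if recurred else "positive"
    return "undetermined"  # same-day records are covered by neither criterion


# Hypothetical threshold; the patent ties it to the drug's duration of effect.
THRESHOLD = timedelta(days=30)

print(classify_pair(date(2019, 1, 1), date(2019, 1, 20), THRESHOLD))
print(classify_pair(date(2019, 3, 1), date(2019, 2, 1), THRESHOLD))
```

In practice the per-patient labels would be aggregated for each drug-disease pair before a correlation is reported.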
The embodiment described above is only a preferred specific embodiment of the present invention; the usual variations and substitutions made by those skilled in the art within the scope of the technical solution of the present invention shall all fall within the protection scope of the present invention.

Claims (8)

1. A drug-disease correlation analysis method based on a time factor, characterized in that: the method first measures the correlation between drugs and diseases using the number of patients as the weight, and then obtains the positive or negative correlation between a drug and a disease by analyzing patients' medication times and diagnosis times.
2. The drug-disease correlation analysis method based on a time factor according to claim 1, characterized in that the method specifically comprises the following steps:
S1. Mining of drug-disease correlation
S11. Establish a complex network: build the network with drugs, diseases, and patients as nodes;
S12. Community discovery in the heterogeneous network;
S13. Correlation analysis, including analysis of community tightness and analysis of the correlation between drug and disease nodes within a community;
S2. Analysis of positive and negative correlation between drugs and diseases
S21. Sort out the mined drug-disease pairs with higher correlation;
S22. Analyze the positive or negative correlation of the remaining drug-disease pairs.
3. The drug-disease correlation analysis method based on a time factor according to claim 2, characterized in that: the heterogeneous network community discovery of step S12 generates random walk sequences in the manner of node2vec, trains a model in the manner of word2vec, and divides communities with a clustering algorithm.
4. The drug-disease correlation analysis method based on a time factor according to claim 3, characterized in that: the correlation analysis of step S13 includes analysis of community tightness and analysis of the correlation between drug and disease nodes within a community.
5. The drug-disease correlation analysis method based on a time factor according to claim 4, characterized in that: the analysis of community tightness determines how tightly the nodes in a community are grouped by computing the average intra-class distance between the samples in the class, as shown in formula (1).
Here, M is the community tightness; the smaller M is, the tighter the community. N is the total number of drug, disease, and patient nodes in the community. X denotes the vectors of the drug, disease, and patient nodes in the community obtained by the node2vec algorithm.
6. The drug-disease correlation analysis method based on a time factor according to claim 5, characterized in that: the correlation between drug and disease nodes within a community is determined by the number of patients connecting the drug and the disease; the more patients there are between a drug and a disease, the greater the correlation between that drug node and that disease node within the community.
7. The drug-disease correlation analysis method based on a time factor according to claim 6, characterized in that: step S21 sorts out the mined drug-disease pairs with higher correlation, namely the pairs in which the drug and the disease are directly associated.
8. The drug-disease correlation analysis method based on a time factor according to claim 7, characterized in that the criteria by which step S22 analyzes the positive or negative correlation of the remaining drug-disease pairs are:
1) if the patient took the drug earlier than the disease was diagnosed, and the difference between the two times is within the threshold range, there is a negative correlation between the drug and the disease;
2) if the patient took the drug later than the disease was diagnosed, and the disease did not recur within the threshold time after the medication time, there is a positive correlation between the drug and the disease.
CN201910800975.5A 2019-08-28 2019-08-28 A kind of disease associated analysis method of drug based on time factor Pending CN110503158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910800975.5A CN110503158A (en) 2019-08-28 2019-08-28 A kind of disease associated analysis method of drug based on time factor

Publications (1)

Publication Number Publication Date
CN110503158A (en) 2019-11-26

Family

ID=68590137

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910800975.5A Pending CN110503158A (en) 2019-08-28 2019-08-28 A kind of disease associated analysis method of drug based on time factor

Country Status (1)

Country Link
CN (1) CN110503158A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779142A (en) * 2011-06-28 2012-11-14 安徽大学 Quick community discovery method based on community closeness
CN104978474A (en) * 2014-04-11 2015-10-14 中国中医科学院中医临床基础医学研究所 Medicine effect evaluating method based on molecular network and medicine effect evaluating system
CN105653846A (en) * 2015-12-25 2016-06-08 中南大学 Integrated similarity measurement and bi-directional random walk based pharmaceutical relocation method
WO2017158152A1 (en) * 2016-03-17 2017-09-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Diagnosis of chronic obstructive pulmonary disease (copd)
CN108109700A (en) * 2017-12-19 2018-06-01 中国科学院深圳先进技术研究院 A kind of chronic disease Drug efficacy evaluation method and apparatus
CN109411033A (en) * 2018-11-05 2019-03-01 杭州师范大学 A kind of curative effect of medication screening technique based on complex network
CN109903854A (en) * 2019-01-25 2019-06-18 电子科技大学 A kind of core drug recognition methods based on TCM Literature

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
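FAN GONG et al.: "SMR: Medical Knowledge Graph Embedding for Safe Medicine Recommendation", 《ARXIV:1710.05980》 *
XIONG Zhengli et al.: "基于用户紧密度的在线社会网络社区发现算法" (Community discovery algorithm for online social networks based on user closeness), 《计算机工程》 (Computer Engineering) *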

Similar Documents

Publication Publication Date Title
Yang et al. Risk prediction of diabetes: big data mining with fusion of multifarious physical examination indicators
CN105653846B (en) Drug method for relocating based on integrated similarity measurement and random two-way migration
Wilkens et al. HierS: hierarchical scaffold clustering using topological chemical graphs
Veeramah et al. An early divergence of KhoeSan ancestors from those of other modern humans is supported by an ABC-based analysis of autosomal resequencing data
Anzar et al. NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer
CN109872772B (en) Method for excavating colorectal cancer radiotherapy specific genes by using weight gene co-expression network
Khaing Data mining based fragmentation and prediction of medical data
CN109920547A (en) A kind of diabetes prediction model construction method based on electronic health record data mining
CN110444248A (en) Cancer Biology molecular marker screening technique and system based on network topology parameters
CN109637579B (en) Tensor random walk-based key protein identification method
Zhao et al. Microbes and complex diseases: from experimental results to computational models
Li et al. Large-scale identification of potential drug targets based on the topological features of human protein–protein interaction network
CN105139390A (en) Image processing method for detecting pulmonary tuberculosis focus in chest X-ray DR film
WO2016191340A1 (en) Discovery and analysis of drug-related side effects
Lindblom et al. Bioinformatics for human genetics: promises and challenges
Zhu et al. TGSA: protein–protein association-based twin graph neural networks for drug response prediction with similarity augmentation
Hasan et al. Design protein-protein interaction network and protein-drug interaction network for common cancer diseases: A bioinformatics approach
CN111986814A (en) Modeling method of lupus nephritis prediction model of lupus erythematosus patient
Li et al. End-to-end interpretable disease–gene association prediction
KR101839572B1 (en) Apparatus Analyzing Disease-related Genes and Method thereof
CN110503158A (en) A kind of disease associated analysis method of drug based on time factor
CN108804871A (en) Key protein matter recognition methods based on maximum neighbours' subnet
CN113488119B (en) Drug small molecule numerical value characteristic structured database and establishment method thereof
CN105447337B (en) A kind of time series data processing method based on dynamic network map analysis
CN109492690B (en) Method for detecting CT image based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191126