CN112216353A - Method and device for predicting drug-target interaction relationship

Publication number: CN112216353A
Authority: CN (China)
Prior art keywords: drug, target, similarity matrix, predicting, disease
Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Application number: CN202011202566.4A
Other languages: Chinese (zh)
Other versions: CN112216353B (en)
Inventors: 郑莹, 吴峥
Current Assignee: Changsha University of Science and Technology (the listed assignees may be inaccurate)
Original Assignee: Changsha University of Science and Technology
Application filed by: Changsha University of Science and Technology
Priority to: CN202011202566.4A
Publications: CN112216353A (application), CN112216353B (granted)
Current legal status: Active

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30: Prediction of properties of chemical compounds, compositions or mixtures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50: Molecular design, e.g. of drugs
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a method and a device for predicting drug-target interaction relationships. The method comprises the following steps: introducing target-disease interactions alongside drug-target interactions to construct a three-layer drug-target-disease heterogeneous network; constructing a drug similarity matrix and a target similarity matrix based on the three-layer heterogeneous network, wherein the target similarity matrix comprises an inter-target Gaussian kernel similarity matrix and a target-disease Gaussian kernel similarity matrix; calculating the Kronecker product of the drug similarity matrix and the target similarity matrix, and obtaining a prediction result by the regularized least squares method; and verifying the prediction result. Compared with traditional prediction methods, the method adopts a more complete network structure model, establishes a richer similarity matrix space, and predicts new drug-target interactions from more angles; very-large-scale matrix operations are avoided during the calculation; and the method shows better prediction performance than the conventional FLapRLS and RLS_Kron methods.

Description

Method and device for predicting drug-target interaction relationship
Technical Field
The invention relates to the technical field of biological information processing, and in particular to a method and a device for predicting drug-target interaction relationships.
Background
Drug development is an important prerequisite for the treatment of diseases, and confirming drug-target relationships is an important step in the drug development process. Although drug development has made significant progress over the past few decades, its financial and time costs remain high. With the development of systems biology and network pharmacology, it has become clear that one drug can act on several different targets and, likewise, that one target can be acted on by different drugs. Within a drug-target relationship network, confirming drug-target relationships can accelerate drug development and improve the understanding of drug effects and of treatment options for diseases.
Network pharmacology, introduced in 2008, broke with the earlier idea of single drug-single target interaction. In the same year, the paper "Yamanishi Yoshihiro, Araki Michihiro, Gutteridge Alex et al. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces [J]. Bioinformatics, 2008, 24: i232-40" described and constructed drug-target interaction networks for four classes of targets: enzymes, ion channels, G protein-coupled receptors (GPCRs) and nuclear receptors. The originality of the proposed method lies in formulating the inference of drug-target interactions as a supervised learning problem on a bipartite graph, and its datasets have since become a set of reference standards. Because drug-target interactions can be abstracted as graphs in the sense of graph theory and studied with the associated algorithms, computing has become closely connected with pharmacology and biology, and on this basis a wide variety of research approaches have been explored in bioinformatics. The entire drug-target interaction network is abstracted as a bipartite graph: the drugs and the targets each form part of the vertex set V, and the drug-target interactions form the edge set E, giving the model G = (V, E). FIG. 1 is a simple bipartite graph example (where star nodes are targets, rectangular nodes are drugs, and the connecting edges are the abstracted drug-target interactions).
Current research on drug-target interaction prediction methods still has the following drawbacks:
1. the model used is usually a bipartite graph of drug-target interactions only; the model takes a single form, the resulting drug and target similarity matrices are simple, and only local prediction results can be obtained;
2. mainstream biological methods rely mainly on complete information about the drug or target, such as the full 3D structure of the target; in addition, the prior art is difficult to apply to large-scale screening and prediction, and because the screening is untargeted its cost is high and its results are unsatisfactory;
3. as the numbers of drugs and targets grow to the order of 10^4-10^5, the intermediate calculation generates a storage requirement of more than 10^16 matrix entries, and matrices of this size are extremely difficult to process at the current level of computing.
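To make the scale in item 3 concrete, the following worked arithmetic assumes m = n = 10^4 drugs and targets and 8 bytes per stored matrix entry; both figures are illustrative assumptions rather than values stated above.

```latex
(mn)^2 = (10^4 \cdot 10^4)^2 = 10^{16}\ \text{entries}, \qquad
10^{16} \times 8\ \text{bytes} = 8 \times 10^{16}\ \text{bytes} = 80\ \text{PB}
```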
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. To this end, the present invention proposes a method and apparatus for predicting drug-target interaction relationships.
In a first aspect of the present invention, there is provided a method for predicting a drug-target interaction relationship, comprising the steps of:
constructing a three-layer heterogeneous network of the drug-target-disease according to the known drug-target interaction relationship and target-disease interaction relationship;
constructing a drug similarity matrix and a target similarity matrix based on the drug-target-disease three-layer heterogeneous network, wherein the target similarity matrix is obtained by fitting an inter-target Gaussian kernel similarity matrix and a target-disease Gaussian kernel similarity matrix;
calculating the Kronecker product of the drug similarity matrix and the target similarity matrix, and obtaining a prediction result by the regularized least squares method;
and verifying the prediction result.
In a second aspect of the invention, there is provided an apparatus for predicting a drug-target interaction relationship, comprising: at least one control processor and a memory for communicative connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a method for predicting a drug-target interaction relationship according to the first aspect of the invention.
In a third aspect of the present invention, there is provided a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions for causing a computer to perform the method for predicting a drug-target interaction relationship according to the first aspect of the present invention.
According to the embodiment of the invention, at least the following technical effects are achieved:
(1) Target-disease interactions are introduced and combined with drug-target interactions to construct a three-layer heterogeneous network, and a drug similarity matrix and a target similarity matrix are constructed based on this network, the target similarity matrix being obtained by fitting an inter-target Gaussian kernel similarity matrix and a target-disease Gaussian kernel similarity matrix. Compared with traditional prediction methods, a more complete network structure model is adopted, a richer similarity matrix space is established, and new drug-target interactions are predicted from more angles.
(2) The final result is predicted using the regularized least squares method with the Kronecker product, and very-large-scale matrix operations are avoided during the calculation.
(3) Verification shows that the prediction results obtained by the method have better prediction performance than the common FLapRLS and RLS_Kron methods. The new drug-target interactions obtained by the method that have not yet been chemically verified are of high research value, so that subsequent chemical verification tests can be targeted and large-scale repeated testing without a clear target is avoided. In addition, because the drugs concerned have already passed steps such as clinical trials, repurposing them avoids the long development cycle of a newly developed drug when it comes to final commercialization.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is an example of a prior art bipartite graph;
FIG. 2 is a schematic flow chart of a method for predicting a drug-target interaction relationship provided by an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a method for predicting drug-target interaction relationships provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a three-layer heterogeneous network of drug-target-disease provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for predicting drug-target interaction relationship according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
Referring to FIG. 2 to FIG. 4, one embodiment of the present invention provides a method for predicting a drug-target interaction relationship, comprising the steps of:
S100, constructing a three-layer heterogeneous network of drug-target-disease according to the known drug-target interaction relationships and target-disease interaction relationships.
The drug-target interaction relationships can be provided entirely by the benchmark dataset; partly by the benchmark dataset and partly by an external dataset (i.e. the data that institutions have so far added to external databases to supplement and update the relevant information); or entirely by an external dataset. The target-disease interaction relationships are taken from a human disease relationship database.
Firstly, the IDs of all drugs, targets and diseases are unified and arranged in adjacency list form. The drug-target interactions in adjacency list form are abstracted into a drug-target bipartite graph model, and the target-disease interactions are likewise abstracted into a target-disease bipartite graph model. While each bipartite graph model is built, an adjacency matrix is generated (used as an auxiliary input to the subsequent regularized least squares method), in which an entry is 1 where an interaction exists and 0 otherwise. The three-layer drug-target-disease heterogeneous network is then constructed from the drug-target bipartite graph model and the target-disease bipartite graph model. The purpose of constructing a three-layer heterogeneous network is to determine the correlations between drugs, targets and diseases while keeping the drug-target interactions and target-disease interactions intact, so that the subsequent fitting and prediction can be carried out. As shown in FIG. 4, $Drug = \{d_1, d_2, \ldots, d_m\}$ denotes the drug layer (top layer in the figure), $Target = \{t_1, t_2, \ldots, t_n\}$ denotes the target layer (middle layer in the figure), and $Disease = \{r_1, r_2, \ldots, r_k\}$ denotes the disease layer (bottom layer in the figure); solid lines represent interactions within a layer, and dashed lines represent interactions between two layers.
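The conversion from adjacency lists to 0/1 adjacency matrices described in step S100 can be sketched as follows. This is a minimal illustration, not the embodiment's actual code: the helper name `build_adjacency` and the toy identifiers are hypothetical, and the two matrices are simply kept side by side to stand for the two bipartite layers.

```python
def build_adjacency(pairs, row_ids, col_ids):
    """Return a 0/1 adjacency matrix (list of rows) from (row_id, col_id) pairs."""
    row_index = {rid: i for i, rid in enumerate(row_ids)}
    col_index = {cid: j for j, cid in enumerate(col_ids)}
    matrix = [[0] * len(col_ids) for _ in row_ids]
    for rid, cid in pairs:
        # 1 marks a known interaction; every other entry stays 0.
        matrix[row_index[rid]][col_index[cid]] = 1
    return matrix


# Hypothetical toy identifiers, already unified across data sources.
drugs = ["d1", "d2", "d3"]
targets = ["t1", "t2"]
diseases = ["r1", "r2", "r3"]
drug_target_pairs = [("d1", "t1"), ("d2", "t1"), ("d3", "t2")]
target_disease_pairs = [("t1", "r1"), ("t1", "r3"), ("t2", "r2")]

# The two bipartite graphs share the target layer, which links the three layers.
Y_dt = build_adjacency(drug_target_pairs, drugs, targets)
Y_tr = build_adjacency(target_disease_pairs, targets, diseases)
network = {"drug_target": Y_dt, "target_disease": Y_tr}

print(Y_dt)  # [[1, 0], [1, 0], [0, 1]]
print(Y_tr)  # [[1, 0, 1], [0, 1, 0]]
```

Keeping one fixed ordering of target IDs for both matrices matters later, because the target similarity matrices derived from each layer are combined entry by entry.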
S200, constructing a drug similarity matrix and a target similarity matrix based on the drug-target-disease three-layer heterogeneous network, wherein the target similarity matrix is obtained by fitting a Gaussian kernel similarity matrix between targets and a target-disease Gaussian kernel similarity matrix.
As an alternative embodiment, the drug similarity matrix is obtained by fitting a drug chemical structure similarity matrix and a drug Gaussian kernel similarity matrix. Enriching the types of similarity matrix can improve the accuracy of the final prediction result. The four similarity matrices above are calculated as follows:
Firstly, constructing the drug chemical structure similarity matrix;
first, the SMILES (Simplified Molecular-Input Line-Entry System) strings of all drugs are computed, ensuring that each chemical structure has a unique corresponding character string;
then, each SMILES string is converted into the corresponding binary chemical fingerprint;
finally, the drug chemical structure similarity matrix is calculated from all binary chemical fingerprints using the Tanimoto coefficient. The calculation formula is as follows:
$$SIM_{chem}(d_x, d_y) = \frac{|f(d_x) \cap f(d_y)|}{|f(d_x) \cup f(d_y)|} \tag{1}$$
where $f(d_x)$ is the binary chemical fingerprint of drug $d_x$. The drug chemical structure similarity matrix is constructed for all drugs on this basis. Here, a binary chemical fingerprint is a bit string of 166 bits.
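The fingerprint comparison can be sketched as below, assuming the coefficient referred to above is the Tanimoto (Jaccard) coefficient, i.e. the number of bits set in both fingerprints divided by the number of bits set in either. The fingerprints are assumed to be precomputed; the 8-bit toy vectors stand in for the 166-bit strings, and the function names are hypothetical.

```python
def tanimoto(fp_x, fp_y):
    """Tanimoto coefficient of two equal-length binary fingerprints (0/1 sequences)."""
    if len(fp_x) != len(fp_y):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_x, fp_y) if a and b)    # bits set in both
    either = sum(1 for a, b in zip(fp_x, fp_y) if a or b)   # bits set in at least one
    # Two all-zero fingerprints share no features; treat their similarity as 0.
    return both / either if either else 0.0


def chem_similarity_matrix(fingerprints):
    """Pairwise Tanimoto similarity matrix for a list of fingerprints."""
    return [[tanimoto(fx, fy) for fy in fingerprints] for fx in fingerprints]


# Toy 8-bit fingerprints (illustrative only).
fps = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 0, 0, 1],
]
sim = chem_similarity_matrix(fps)
print(sim[0][1])  # 0.5: two shared bits out of four bits set in either fingerprint
```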
Secondly, constructing the drug Gaussian kernel similarity matrix;
first, the Gaussian kernel parameters of all drugs are calculated. The Gaussian kernel is a monotonic function of the Euclidean distance between any two points in space;
then, the Gaussian kernel similarity vector between each drug and all drugs is calculated, and the drug Gaussian kernel similarity matrix is constructed. The calculation formulas are as follows:
$$K_{GIP,d}(d_i, d_j) = \exp\left(-\gamma_d \lVert yd_i - yd_j \rVert^2\right) \tag{2}$$
$$\gamma_d = \gamma'_d \Big/ \left(\frac{1}{m}\sum_{i=1}^{m} \lVert yd_i \rVert^2\right) \tag{3}$$
where $D = \{d_1, d_2, \ldots, d_m\}$ is the set of all drugs and m is the number of drugs; the adjacency matrix $Y \in \mathbb{R}^{m \times n}$ represents the known drug-target interactions (its role in the formulas above is to provide the specific values of $yd_i$ and $yd_j$), and n is the number of targets; if there is a known association between drug $d_i$ and target $t_j$, then $y_{ij}$ is 1, otherwise it is 0. $yd_i = \{y_{i1}, y_{i2}, \ldots, y_{in}\}$ denotes the association vector of drug $d_i$ with all targets; $\gamma_d$ is the parameter controlling the kernel width, and $\gamma'_d$ is set to 1 based on experience with Gaussian kernels.
Thirdly, constructing the inter-target Gaussian kernel similarity matrix;
first, the Gaussian kernel parameters of all targets are calculated;
then, the Gaussian kernel similarity vector between each target and all targets is calculated, and the inter-target Gaussian kernel similarity matrix is constructed. The calculation formulas are as follows:
$$K_{GIP,t}(t_i, t_j) = \exp\left(-\gamma_t \lVert yt_i - yt_j \rVert^2\right) \tag{4}$$
$$\gamma_t = \gamma'_t \Big/ \left(\frac{1}{n}\sum_{i=1}^{n} \lVert yt_i \rVert^2\right) \tag{5}$$
where $T = \{t_1, t_2, \ldots, t_n\}$ is the set of all targets; if there is a known association between drug $d_i$ and target $t_j$, then $y_{ij}$ is 1, otherwise it is 0. $yt_i = \{y_{1i}, y_{2i}, \ldots, y_{mi}\}$ denotes the association vector of target $t_i$ with all drugs, i.e. the i-th column of Y; $\gamma_t$ is the parameter controlling the kernel width, and $\gamma'_t$ is set to 1 based on experience with Gaussian kernels.
Fourthly, constructing the target-disease Gaussian kernel similarity matrix;
first, the Gaussian kernel parameter of the target-disease network is calculated;
then, the Gaussian kernel similarity vectors are calculated from the associations between each target and all diseases, and the target-disease Gaussian kernel similarity matrix is constructed. The calculation formulas are as follows:
$$K_{GIP,S}(ts_i, ts_j) = \exp\left(-\gamma_{ts} \lVert yts_i - yts_j \rVert^2\right) \tag{6}$$
$$\gamma_{ts} = \gamma'_{ts} \Big/ \left(\frac{1}{k}\sum_{i=1}^{k} \lVert yts_i \rVert^2\right) \tag{7}$$
where $S = \{ts_1, ts_2, \ldots, ts_k\}$ is the set of targets taken from the target-disease interactions and k is the number of targets; the adjacency matrix $Y \in \mathbb{R}^{k \times n}$ here represents the known target-disease associations, so within this step n is the number of diseases. If there is a known association between target $ts_i$ and disease $dis_j$, then $y_{ij}$ is 1, otherwise it is 0. $yts_i = \{y_{i1}, y_{i2}, \ldots, y_{in}\}$ denotes the association vector of target $ts_i$ with all diseases; $\gamma_{ts}$ is the parameter controlling the kernel width, and $\gamma'_{ts}$ is set to 1 based on experience with Gaussian kernels.
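The three Gaussian kernel matrices of formulas (2), (4) and (6) share one computation on different interaction profiles, which the sketch below illustrates. It assumes that the kernel width is obtained by dividing the primed parameter by the mean squared norm of the profiles, the usual convention for Gaussian interaction profile kernels; the helper name `gip_kernel` and the toy matrices are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel similarity matrix for a list of 0/1 interaction profiles."""
    count = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / count
    if mean_sq_norm == 0:
        raise ValueError("at least one profile must contain a known interaction")
    # Assumed normalization: width parameter scaled by the mean squared profile norm.
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [[math.exp(-gamma * sq_dist(p, q)) for q in profiles] for p in profiles]


# Toy adjacency matrices (illustrative only).
Y_dt = [[1, 0], [1, 0], [0, 1]]        # 3 drugs x 2 targets
Y_td = [[1, 0, 1], [0, 1, 0]]          # 2 targets x 3 diseases

K_d = gip_kernel(Y_dt)                               # drug profiles = rows of Y_dt
K_t = gip_kernel([list(c) for c in zip(*Y_dt)])      # target profiles = columns of Y_dt
K_ts = gip_kernel(Y_td)                              # target profiles over diseases

print(len(K_d), len(K_t), len(K_ts))  # 3 2 2
```

Drugs d1 and d2 have identical profiles in this toy data, so their kernel value is exactly 1; profiles that share no interaction receive a value strictly below 1.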
S300, calculating the Kronecker product of the drug similarity matrix and the target similarity matrix, and obtaining a prediction result by the regularized least squares method.
Because each similarity matrix is a non-positive-definite square matrix, and several similarity matrices need to be fused into one large similarity matrix, step S300 applies the regularized least squares method to the Kronecker product of the matrices. The specific calculation process is as follows:
firstly, the drug chemical structure similarity matrix and the drug Gaussian kernel similarity matrix are combined by linear fitting, and the inter-target Gaussian kernel similarity matrix and the target-disease Gaussian kernel similarity matrix are likewise combined by linear fitting, with the weighting factor α set empirically to 0.5. It should be noted that α = 0.5 is a common choice that gives the two matrices equal weight, and it is not to be construed as limiting the scope of the invention.
$$SIM_{drug}(d_x, d_y) = (1-\alpha) \times K_{GIP,d}(d_x, d_y) + \alpha \times SIM_{chem}(d_x, d_y) \tag{8}$$
$$SIM_{tar}(t_x, t_y) = (1-\alpha) \times K_{GIP,t}(t_x, t_y) + \alpha \times K_{GIP,S}(ts_x, ts_y) \tag{9}$$
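The linear fitting of formulas (8) and (9) is an entry-wise weighted sum, sketched below. The function name `fuse` and the toy matrices are hypothetical, and the sketch assumes that both input matrices index the same entities in the same order, which is what allows them to be added entry by entry.

```python
def fuse(k_gip, sim_other, alpha=0.5):
    """Return (1 - alpha) * k_gip + alpha * sim_other, entry by entry."""
    if len(k_gip) != len(sim_other) or any(
        len(r1) != len(r2) for r1, r2 in zip(k_gip, sim_other)
    ):
        raise ValueError("matrices must have the same shape")
    return [
        [(1 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(k_gip, sim_other)
    ]


# Toy 2 x 2 matrices (illustrative only).
K_gip_d = [[1.0, 0.2], [0.2, 1.0]]     # stands for the drug Gaussian kernel matrix
SIM_chem = [[1.0, 0.6], [0.6, 1.0]]    # stands for the chemical structure matrix

SIM_drug = fuse(K_gip_d, SIM_chem)               # formula (8) with alpha = 0.5
SIM_drug_q = fuse(K_gip_d, SIM_chem, alpha=0.25) # a different, purely illustrative weight
print(round(SIM_drug[0][1], 6), round(SIM_drug_q[0][1], 6))  # 0.4 0.3
```

The same helper applies to formula (9) by passing the inter-target kernel matrix first and the target-disease kernel matrix second.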
The Kronecker product of the fitted $SIM_{drug}(d_x, d_y)$ and $SIM_{tar}(t_x, t_y)$ is then computed. The Kronecker product can be expressed abstractly as
$$W = SIM_{drug} \otimes SIM_{tar}$$
where W is a matrix of size (mn × mn), each entry of which is the product of the similarity score of a drug pair $(d_i, d_j)$ and that of a target pair $(t_p, t_q)$. Because the numbers m and n of drugs and targets in current datasets are of the order of 10^4-10^5, an eigenvalue decomposition is performed first, giving
$$SIM_{drug} = V_d \Lambda_d V_d^T$$
and
$$SIM_{tar} = V_t \Lambda_t V_t^T$$
where V denotes the orthogonal matrix formed by the eigenvectors of the drug or target similarity matrix (subscript d or t respectively), and Λ denotes the diagonal matrix formed by the corresponding eigenvalues. The final Kronecker product result is:
$$W = V \Lambda V^T \tag{10}$$
Figure BDA0002755844860000084
On this basis, however, the computed matrix is still of size (mn × mn), and the final result remains difficult to store and manipulate. The regularized least squares method is therefore introduced to obtain the prediction result, as follows:
Figure BDA0002755844860000091
where σ is the regularization parameter,
Figure BDA0002755844860000092
denotes the predicted result matrix, and $vec(Y^T)$ is the column vector formed by stacking all columns of the matrix $Y^T$. Substituting the two equations above gives:
Figure BDA0002755844860000093
By the matrix-equation property of the Kronecker product, and because transposition distributes over the Kronecker product, we have:
Figure BDA0002755844860000094
Figure BDA0002755844860000095
Therefore, equation (12) can be simplified as:
Figure BDA0002755844860000096
Let
Figure BDA0002755844860000097
When X is a column vector, the matrix-equation property can still be used; then
Figure BDA0002755844860000098
gives:
Figure BDA0002755844860000099
The prediction result is thus obtained by the regularized least squares method, and the predicted values are then sorted (finding the largest values in the matrix and their row and column positions).
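The way the eigenvalue decomposition avoids the (mn × mn) matrix can be sketched with NumPy as follows. This is a minimal illustration that assumes the standard Kronecker regularized least squares form, in which the prediction is W(W + σI)^{-1} applied to the stacked label vector; the function names and toy matrices are hypothetical, and the naive version is included only to check the result on a small example.

```python
import numpy as np


def kron_rls(sim_drug, sim_tar, Y, sigma=1.0):
    """Kronecker RLS scores computed from two small eigendecompositions."""
    lam_d, V_d = np.linalg.eigh(sim_drug)   # sim_drug = V_d diag(lam_d) V_d^T
    lam_t, V_t = np.linalg.eigh(sim_tar)    # sim_tar  = V_t diag(lam_t) V_t^T
    # Eigenvalues of the Kronecker product, arranged as an (m, n) grid.
    lam = np.outer(lam_d, lam_t)
    # Rotate Y into the joint eigenbasis, shrink each coefficient, rotate back.
    # Assumes lam + sigma has no zero entry (true for positive semidefinite inputs).
    Z = (V_d.T @ Y @ V_t) * (lam / (lam + sigma))
    return V_d @ Z @ V_t.T


def kron_rls_naive(sim_drug, sim_tar, Y, sigma=1.0):
    """Same scores via the explicit (mn x mn) matrix; feasible only for toy sizes."""
    W = np.kron(sim_drug, sim_tar)
    y = Y.reshape(-1)                       # row-major flattening equals vec(Y^T)
    y_hat = W @ np.linalg.solve(W + sigma * np.eye(W.shape[0]), y)
    return y_hat.reshape(Y.shape)


# Toy symmetric similarity matrices and adjacency matrix (illustrative only).
sim_drug = np.array([[1.0, 0.5, 0.1], [0.5, 1.0, 0.2], [0.1, 0.2, 1.0]])
sim_tar = np.array([[1.0, 0.3], [0.3, 1.0]])
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

scores = kron_rls(sim_drug, sim_tar, Y)

# Rank the pairs without a known interaction by predicted score, highest first.
unknown = np.argwhere(Y == 0).tolist()
ranked = sorted(unknown, key=lambda ij: -scores[ij[0], ij[1]])
print(np.allclose(scores, kron_rls_naive(sim_drug, sim_tar, Y)))  # True
```

Only the m × m and n × n matrices and the m × n score grid are ever stored, which is why the eigendecomposition route scales where the explicit product does not.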
S400, verifying the prediction result.
As an alternative embodiment, step S400 includes: first, performing ten-fold cross-validation of the prediction result using a test set; and then introducing an external dataset to verify the prediction result. This verification approach effectively demonstrates the generalization ability of the model.
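How the folds are formed is not specified above. The sketch below shows one common arrangement, offered purely as an assumption: all drug-target pairs are shuffled and divided into ten folds, and the entries of the held-out fold are set to 0 in the training matrix before prediction. The function names and the toy matrix are hypothetical.

```python
import random


def ten_fold_splits(m, n, n_folds=10, seed=0):
    """Yield (fold_index, held_out_pairs) over all m * n drug-target pairs."""
    pairs = [(i, j) for i in range(m) for j in range(n)]
    random.Random(seed).shuffle(pairs)      # fixed seed for reproducible folds
    for k in range(n_folds):
        yield k, pairs[k::n_folds]          # every n_folds-th pair goes to fold k


def mask_fold(Y, held_out):
    """Copy Y and hide the held-out entries by setting them to 0."""
    Y_train = [row[:] for row in Y]
    for i, j in held_out:
        Y_train[i][j] = 0
    return Y_train


# Toy 4 x 5 adjacency matrix in which every pair interacts (illustrative only).
Y = [[1] * 5 for _ in range(4)]
folds = [held for _, held in ten_fold_splits(4, 5)]
Y_train = mask_fold(Y, folds[0])
print(len(folds), len(folds[0]))  # 10 2
```

In each round the model would be fitted on the masked matrix and scored on the held-out pairs, so that every pair is predicted exactly once without having been seen during fitting.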
Simulation experiment I:
Based on the implementation of the above method, the following tables show the top-ranked new drug-target interactions (ranked by the magnitude of the regularized least squares output value) obtained by the present method on the enzyme and GPCR datasets. Verifying the prediction results against external databases gives high confidence, and 4 of the drug-target interactions have already been confirmed (the confirmed drug-target interactions are shown in bold in Tables 1 and 2 below):
Figure BDA0002755844860000101
Figure BDA0002755844860000111
TABLE 1
Figure BDA0002755844860000112
Figure BDA0002755844860000121
TABLE 2
As shown in Table 1, in the top-ten list for enzymes, the validated interactions all involve the target Cytochrome P450 2C19. The paper "Kirchheiner J, Muller G, Meineke I, Wernecke KD, Roots I, Brockmoller J: Effects of polymorphisms in CYP2D6, CYP2C9, and CYP2C19 on trimipramine pharmacokinetics. J Clin Psychopharmacol. 2003 Oct; 23(5): 459-66. doi: 10.1097/01.jcp.0000088909.24613.92" supports these four drug-target interactions by chemical methods. The drug-target interactions in Tables 1 and 2 that have not been chemically validated are of high research value.
Simulation experiment II:
Three-layer heterogeneous network models were constructed from multiple databases with different emphases, and the performance of the method was evaluated on the benchmark datasets named after the four major target classes, namely enzyme, ion channel, GPCR and nuclear receptor, as described in the paper "Yamanishi Yoshihiro, Araki Michihiro, Gutteridge Alex et al. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces [J]. Bioinformatics, 2008, 24: i232-40".
Figure BDA0002755844860000122
Figure BDA0002755844860000131
TABLE 3
DrugBank and DisGeNET are used as external data sources, and the others are used as verification databases.
Table 4 below shows the numbers of drugs and targets in the four benchmark datasets of "Yamanishi Yoshihiro, Araki Michihiro, Gutteridge Alex et al. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces [J]. Bioinformatics, 2008, 24: i232-40", together with their ratios.
Figure BDA0002755844860000141
TABLE 4
Table 5 below uses the datasets proposed in "Yamanishi Yoshihiro, Araki Michihiro, Gutteridge Alex et al. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces [J]. Bioinformatics, 2008, 24: i232-40" and compares AUC and AUPR, the areas under the ROC curve and the PR curve respectively, as evaluation criteria that reflect sensitivity and specificity. Because these evaluation criteria are standard, their sources and calculation methods are not described again.
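As a reminder of what the two criteria measure, the sketch below computes them on a toy ranking. It assumes AUC is taken as the pairwise ranking statistic (the probability that a known interaction outscores a non-interaction) and AUPR as average precision, which is one common estimator of the area under the PR curve; the function names and the toy data are hypothetical, and ties between scores are not treated specially in the average precision.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve as the fraction of correctly ordered pos/neg pairs."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A tie between a positive and a negative counts as half a correct ordering.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive label")
    return sum(precisions) / len(precisions)


# Toy held-out pairs: 1 = known interaction, 0 = no known interaction.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
aupr = average_precision(labels, scores)
print(round(auc, 4), round(aupr, 4))  # 0.8333 0.8333
```

Because known interactions are far rarer than unknown pairs in drug-target data, AUPR is usually the more demanding of the two criteria.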
Figure BDA0002755844860000142
Figure BDA0002755844860000151
TABLE 5
Among them, the FLapRLS method defines a kernel called the Gaussian Interaction Profile (GIP) kernel and predicts drug-target interactions using a simple classifier, (kernel) Regularized Least Squares (RLS). The RLS_Kron method first constructs the chemical space and the genomic space separately using two different methods, and then predicts drug-target interactions using the regularized least squares method with the Kronecker product. As Table 5 shows, the prediction performance of the present method is superior to that of the FLapRLS and RLS_Kron methods.
The embodiment of the method has the following beneficial effects:
(1) The method introduces target-disease interaction relationships and combines them with drug-target interaction relationships to construct a three-layer heterogeneous network, and on this basis constructs an inter-target Gaussian kernel similarity matrix, a target-disease Gaussian kernel similarity matrix, a drug chemical structure similarity matrix and a drug Gaussian kernel similarity matrix. Compared with traditional prediction methods, it adopts a more complete network structure model, establishes a rich and reliable similarity matrix space, and predicts new drug-target interactions from multiple angles.
(2) The method uses the regularized least squares method with the Kronecker product to predict the final result, and avoids very-large-scale matrix operations during the calculation.
(3) The final prediction results of the method are well verified by ten-fold cross-validation and by external databases, and the prediction performance measured by both ten-fold cross-validation and external verification is superior to that of the common FLapRLS and RLS_Kron methods. The new drug-target interactions proposed by the method that have not yet been chemically verified are of high research value, so that subsequent chemical verification tests can be targeted and large-scale repeated testing without a clear target is avoided. In addition, because the drugs concerned have already passed steps such as clinical trials, repurposing them avoids the long development cycle of a newly developed drug when it comes to final commercialization.
Referring to fig. 5, one embodiment of the present invention provides an apparatus for predicting drug-target interaction relationships, the apparatus comprising: one or more control processors and memory, one control processor being exemplified in fig. 5. The control processor and the memory may be connected by a bus or other means, as exemplified by the bus connection in fig. 5.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the apparatus in the embodiments of the present invention. The control processor implements the method for predicting drug-target interaction relationships described in the method embodiments above by executing non-transitory software programs, instructions, and modules stored in memory.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located from the control processor, and the remote memory may be connected to the device for predicting drug-target interaction relationships over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory and, when executed by the one or more control processors, perform the method for predicting drug-target interaction relationships of the above-described method embodiments.
Embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions for execution by one or more control processors, for example, one of the control processors in fig. 5, to cause the one or more control processors to perform the method for predicting a drug-target interaction relationship in the above method embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the embodiments can be implemented by software plus a general hardware platform. Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program that can be executed by associated hardware, and the computer program may be stored in a computer readable storage medium, and when executed, may include the processes of the above embodiments of the methods. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A method for predicting a drug-target interaction relationship, comprising the steps of:
constructing a drug-target-disease three-layer heterogeneous network according to known drug-target interaction relationships and target-disease interaction relationships;
constructing a drug similarity matrix and a target similarity matrix based on the drug-target-disease three-layer heterogeneous network, wherein the target similarity matrix is obtained by fitting an inter-target Gaussian kernel similarity matrix and a target-disease Gaussian kernel similarity matrix;
calculating the Kronecker product of the drug similarity matrix and the target similarity matrix, and obtaining a prediction result by a regularized least squares method;
and verifying the prediction result.
2. The method for predicting a drug-target interaction relationship of claim 1, wherein the drug similarity matrix is obtained by fitting a drug chemical structure similarity matrix and a drug Gaussian kernel similarity matrix.
3. The method for predicting a drug-target interaction relationship of claim 1, wherein said verifying the prediction result comprises: performing ten-fold cross-validation on the prediction result by using a test set.
4. The method for predicting a drug-target interaction relationship of claim 3, wherein said verifying the prediction result further comprises: introducing an external data set to verify the prediction result.
5. The method for predicting a drug-target interaction relationship of claim 1, wherein the construction of the target-disease Gaussian kernel similarity matrix comprises:
calculating a target-disease Gaussian kernel parameter;
and calculating, for each target, the Gaussian kernel similarity vector between that target and all diseases, and constructing the target-disease Gaussian kernel similarity matrix.
6. The method for predicting a drug-target interaction relationship of claim 1, wherein the construction of the inter-target Gaussian kernel similarity matrix comprises:
calculating a Gaussian kernel parameter over all targets;
and calculating, for each target, the Gaussian kernel similarity vector between that target and all targets, and constructing the inter-target Gaussian kernel similarity matrix.
7. The method for predicting a drug-target interaction relationship of claim 2, wherein the construction of the drug chemical structure similarity matrix comprises:
calculating the SMILES features of all drugs;
converting each SMILES feature into a corresponding binary chemical fingerprint;
and calculating the drug chemical structure similarity matrix from all binary chemical fingerprints according to the Tanimoto coefficient.
8. The method for predicting a drug-target interaction relationship of claim 2, wherein the construction of the drug Gaussian kernel similarity matrix comprises:
calculating a Gaussian kernel parameter over all drugs;
and calculating, for each drug, the Gaussian kernel similarity vector between that drug and all drugs, and constructing the drug Gaussian kernel similarity matrix.
9. An apparatus for predicting drug-target interaction relationships, comprising: at least one control processor and a memory for communicative connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the method for predicting drug-target interaction relationships of any one of claims 1 to 8.
10. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method for predicting a drug-target interaction relationship of any one of claims 1 to 8.
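The computational steps recited in claims 1, 2 and 5 to 8 can be illustrated with the following sketch. It is a hedged illustration under several assumptions, not the implementation of the patent: the Gaussian kernel is computed on binary interaction profiles with a bandwidth normalised by the mean squared profile norm; the coefficient of claim 7 is taken to be the Tanimoto coefficient on already-computed binary fingerprints (conversion from SMILES is omitted); the "fitting" of two similarity matrices is replaced by an equal-weight average because the claims do not state the combination rule; and the regularisation weight `sigma` is arbitrary. All function names and the toy data are hypothetical.

```python
import numpy as np


def gaussian_profile_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel between the rows of an interaction-profile matrix."""
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    # Kernel parameter: bandwidth scaled by the mean squared norm (assumed).
    gamma = gamma_prime / mean_sq if mean_sq > 0 else gamma_prime
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def tanimoto_similarity(fingerprints):
    """Pairwise Tanimoto coefficient between rows of a 0/1 fingerprint matrix."""
    fp = np.asarray(fingerprints, dtype=float)
    shared = fp @ fp.T                      # bits set in both fingerprints
    counts = fp.sum(axis=1)
    union = counts[:, None] + counts[None, :] - shared
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, shared / np.where(union > 0, union, 1.0), 1.0)


def kron_rls(drug_sim, target_sim, interactions, sigma=1.0):
    """Regularized least squares on K = kron(drug_sim, target_sim).

    Uses the eigendecompositions of the two factors, so the full Kronecker
    product matrix is never formed explicitly.
    """
    lam_d, vec_d = np.linalg.eigh(drug_sim)
    lam_t, vec_t = np.linalg.eigh(target_sim)
    lam = np.outer(lam_d, lam_t)            # eigenvalues of the Kronecker product
    projected = vec_d.T @ interactions @ vec_t
    return vec_d @ (lam / (lam + sigma) * projected) @ vec_t.T


# Toy data, made up for illustration only.
Y = np.array([[1, 0, 0, 1],
              [0, 1, 0, 0],
              [1, 0, 1, 0]], dtype=float)            # 3 drugs x 4 targets
target_disease = np.array([[1, 0],
                           [0, 1],
                           [1, 1],
                           [1, 0]], dtype=float)     # 4 targets x 2 diseases
fingerprints = np.array([[1, 1, 0, 0, 1, 0],
                         [0, 1, 1, 0, 0, 1],
                         [1, 1, 0, 1, 1, 0]])        # 3 drugs x 6 bits

# Equal-weight averaging stands in for the "fitting" step (assumption).
drug_sim = 0.5 * (tanimoto_similarity(fingerprints) + gaussian_profile_kernel(Y))
target_sim = 0.5 * (gaussian_profile_kernel(Y.T)
                    + gaussian_profile_kernel(target_disease))

scores = kron_rls(drug_sim, target_sim, Y, sigma=1.0)
print(np.round(scores, 3))
```

Entries of `scores` at positions where `Y` is zero can be ranked to propose candidate interactions. Working with the two small eigendecompositions rather than the explicit Kronecker product keeps the cost manageable when the number of drugs and targets grows.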

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011202566.4A CN112216353B (en) 2020-11-02 2020-11-02 Method and apparatus for predicting drug-target interaction relationship

Publications (2)

Publication Number Publication Date
CN112216353A (en) 2021-01-12
CN112216353B (en) 2024-04-02

Family

ID=74057941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011202566.4A Active CN112216353B (en) 2020-11-02 2020-11-02 Method and apparatus for predicting drug-target interaction relationship

Country Status (1)

Country Link
CN (1) CN112216353B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0724370A1 (en) * 1995-01-28 1996-07-31 Gpt Limited Service interactions prevention in telecommunication systems
CN102663214A (en) * 2012-05-09 2012-09-12 四川大学 Construction and prediction method of integrated drug target prediction system
CN105653846A (en) * 2015-12-25 2016-06-08 中南大学 Integrated similarity measurement and bi-directional random walk based pharmaceutical relocation method
CN106529205A (en) * 2016-11-03 2017-03-22 中南大学 Drug target relation prediction method based on drug substructure and molecule character description information
CN107506591A (en) * 2017-08-28 2017-12-22 中南大学 A kind of medicine method for relocating based on multivariate information fusion and random walk model
CN108520166A (en) * 2018-03-26 2018-09-11 中山大学 A kind of drug targets prediction technique based on multiple similitude network wandering
CN108647484A (en) * 2018-05-17 2018-10-12 中南大学 A kind of drug relationship prediction technique integrated based on multiple information with least square method
CN108920895A (en) * 2018-06-22 2018-11-30 中南大学 A kind of incidence relation prediction technique of circular rna and disease
KR20190000166A (en) * 2017-06-22 2019-01-02 한국과학기술원 Method and system for predicting drug repositioning candidate based on similarity between drug and metabolite
KR20190028417A (en) * 2019-03-12 2019-03-18 한국과학기술원 Method for predicting drug candidate for diseases by using human metabolite specific for the disease target metabolizing enzyme
CN110706740A (en) * 2019-09-29 2020-01-17 长沙理工大学 Method, device and equipment for predicting protein function based on module decomposition
CN111785320A (en) * 2020-06-28 2020-10-16 西安电子科技大学 Drug target interaction prediction method based on multilayer network representation learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
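XING CHEN et al.: "Drug–target interaction prediction by random walk on the heterogeneous network", 《MOLECULAR BIOSYSTEMS》, vol. 8, pages 1970 - 1978 *
YING ZHENG et al.: "A Machine Learning-Based Biological Drug−Target Interaction Prediction Method for a Tripartite Heterogeneous Network", 《ACS OMEGA》, vol. 6, pages 3037 - 3045 *
WU Zheng (吴峥): "Research on drug repositioning based on network structure analysis" (基于网络结构分析的药物重定位研究), 《China Excellent Master's Theses Full-text Database, Medicine and Health Sciences》, no. 01, pages 079 - 25 *
ZHANG Chenglin (张程林): "Research on the prediction of drug-target interactions and affinity" (药物-靶标相互作用与亲和力的预测研究), 《China Excellent Master's Theses Full-text Database, Medicine and Health Sciences》, no. 07, pages 079 - 98 *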

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113223609A (en) * 2021-05-17 2021-08-06 西安电子科技大学 Drug target interaction prediction method based on heterogeneous information network
CN113223609B (en) * 2021-05-17 2023-05-02 西安电子科技大学 Drug target interaction prediction method based on heterogeneous information network
WO2024037526A1 (en) * 2022-08-18 2024-02-22 京东方科技集团股份有限公司 Drug and target interaction prediction method and apparatus, and storage medium

Also Published As

Publication number Publication date
CN112216353B (en) 2024-04-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant