CN108830040B - Drug sensitivity prediction method based on cell line and drug similarity network - Google Patents


Info

Publication number
CN108830040B
Authority
CN
China
Prior art keywords
drug
cell line
value
similarity
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810578523.2A
Other languages
Chinese (zh)
Other versions
CN108830040A (en)
Inventor
李敏
王晓桐
王建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zaozhidao Technology Co ltd
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN201810578523.2A
Publication of CN108830040A
Application granted
Publication of CN108830040B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Toxicology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Chemical & Material Sciences (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a drug sensitivity prediction method based on a cell line and drug similarity network, which comprises the following steps: constructing a drug similarity network, a cell line similarity network and a drug-cell line relation network; obtaining the corresponding drug adjacency matrix, cell line adjacency matrix and drug-cell line relation initial matrix from the drug similarity network, the cell line similarity network and the drug-cell line relation network, respectively; and obtaining a drug sensitivity prediction matrix of the drug-cell line with an unbalanced double random walk algorithm based on the drug adjacency matrix, the cell line adjacency matrix and the drug-cell line relation initial matrix, wherein each element of the drug sensitivity prediction matrix obtained when the walk is finished is the predicted sensitivity value of the corresponding drug for the corresponding cell line. The method fully considers the characteristics of the drug similarity network and the cell line similarity network, and thereby improves the reliability of the drug sensitivity prediction result.

Description

Drug sensitivity prediction method based on cell line and drug similarity network
Technical Field
The invention belongs to the technical field of biomedicine, and particularly relates to a drug sensitivity prediction method based on a cell line and a drug similarity network.
Background
Over the past two decades, with substantial improvements in high-throughput analysis techniques, expectations have grown that personalized or precision medicine will become the future paradigm of medical science. Patients with the same cancer may respond differently to a particular drug treatment; personalized medicine aims to explain the cause of a particular patient's cancer at the molecular level and then tailor the treatment regimen to that patient. Compared with chemotherapy-based single-treatment approaches, personalized medicine observes tumor responses based on the established molecular characteristics of cancer cells, overcoming some of the limitations of conventional symptom-directed diagnosis and treatment. The most important step in personalized medicine is identifying biomarkers that can predict a patient's drug response, i.e. the patient's sensitivity to the drug. However, developing predictive biomarkers requires extensive experimentation, which is especially expensive in humans or animal models. Many studies therefore carry out large-scale drug screening on cultured human cell lines to determine predictive biomarkers. One of the earliest attempts was the NCI-60 study, which covered a panel of 60 human cell lines and their responses to more than 100,000 compounds. The drug response results for the NCI-60 dataset show that different types of cancer have different drug response characteristics, and that different tumors of the same cancer type may have different molecular patterns.
Two organizations, the Cancer Cell Line Encyclopedia (CCLE) and the Cancer Genome Project (CGP), jointly analyzed the pharmacological profiles of about 1,000 clinically relevant human cell lines and 149 cancer drugs, and gave examples of using an elastic net model to select expression and mutation features that predict drug response. It has since been found that bioinformatics algorithms can be used to predict drug response.
Zhang et al. first calculated drug similarity and cell line similarity separately and then predicted drug sensitivity by constructing a two-layer network of drugs and cell lines. Their article proposed two hypotheses: 1) genetically similar cell lines may respond very similarly to a given drug; 2) structurally related drugs may have similar therapeutic effects because of their common molecular structure or targeting pattern. Both hypotheses were verified experimentally. In effect, the drug sensitivity prediction experiment based on the two-layer network mines the influence of drugs and cell lines on drug sensitivity, but the method does not consider the biological network characteristics of the drug similarity network and the cell line similarity network. Moreover, Zhang et al. fuse the networks as a heterogeneous network, which cannot fully exploit the topological structures of the networks. Others have applied the MRSF algorithm to similarity-network-based drug sensitivity prediction; it likewise fails to fully exploit the biological network characteristics and topological structures of the drugs and cell lines, and its results are not ideal.
Therefore, in the prior art, the process of fusing drug similarity and cell line similarity into drug sensitivity prediction cannot fully exploit the topological structure of the networks, so the prediction effect is poor and needs to be improved.
Disclosure of Invention
The invention aims to provide a drug sensitivity prediction method based on a cell line and drug similarity network, which mines the characteristics of the drug similarity network and the cell line similarity network and the association between them in depth, fuses them into drug sensitivity prediction, and improves the reliability of the drug sensitivity prediction result.
The invention provides a drug sensitivity prediction method based on a cell line and a drug similarity network, which comprises the following steps:
S1: constructing a drug similarity network, a cell line similarity network and a drug-cell line relation network;
wherein the drug similarity network comprises similarity values for any two drugs in the drug set, the cell line similarity network comprises similarity values for any two cell lines in the cell line set, and the drug-cell line relationship network comprises drug-cell lines of known sensitivity values and corresponding sensitivity values;
the drug-cell line of known sensitivity value indicates that the drug sensitivity data of the corresponding drug in the drug set to the corresponding cell line in the cell line set is known;
S2: obtaining the corresponding drug adjacency matrix, cell line adjacency matrix and drug-cell line relation initial matrix from the drug similarity network, the cell line similarity network and the drug-cell line relation network, respectively;
wherein, each element value in the drug adjacency matrix, the cell line adjacency matrix and the drug-cell line relation initial matrix is respectively determined according to the similarity value between two corresponding drugs, the similarity value between two corresponding cell lines and whether the sensitivity value of the corresponding drug-cell line is known or not;
the drug adjacency matrix is an N-row, N-column matrix, and the cell line adjacency matrix is an M-row, M-column matrix; the drug-cell line relation initial matrix is an N-row, M-column matrix or an M-row, N-column matrix, where N is the number of drugs in the drug set and M is the number of cell lines in the cell line set;
S3: taking the drug-cell line relation initial matrix as the initial value of the drug sensitivity prediction matrix of the drug-cell line, and updating the drug sensitivity prediction matrix with an unbalanced double random walk algorithm based on the drug adjacency matrix and the cell line adjacency matrix;
wherein each element of the drug sensitivity prediction matrix obtained when the unbalanced double random walk is finished is the predicted sensitivity value of the corresponding drug for the corresponding cell line; the drug sensitivity prediction matrix of the drug-cell line is an N-row, M-column matrix or an M-row, N-column matrix.
In the present invention, the association between a drug and a cell line is taken to be the drug sensitivity data of the drug for the cell line, expressed as a sensitivity value. Therefore, if the association data for a drug and a cell line is known in the data set, i.e. the drug sensitivity of the drug for that cell line is known, the drug-cell line pair is included in the drug-cell line relationship network and is regarded as a drug-cell line with known drug sensitivity.
1. The method addresses the case where the drug sensitivity data of drug-cell lines in existing data sets is inconsistent or inaccurate and needs to be obtained or updated. Because the drug similarity network and the cell line similarity network both influence the drug sensitivity of a drug-cell line, the two networks are applied in an unbalanced double random walk algorithm, and the drug-cell lines with known drug sensitivity are used as initial values that are updated to obtain the drug sensitivity data, i.e. the sensitivity value, between each drug and each cell line to be predicted.
2. Alternatively, when the data of the drug similarity network and the cell line similarity network is updated or changed, the invention updates and re-predicts the sensitivity data of the existing drug-cell lines accordingly.
The drug-cell line relation initial matrix and the drug sensitivity prediction matrix of the drug-cell line are both N-row, M-column matrices or both M-row, N-column matrices, i.e. the elements of the drug sensitivity prediction matrix correspond one to one to the elements of the drug-cell line relation initial matrix.
Further preferably, the unbalanced double random walk formula is as follows:
Figure GDA0003026005820000031
Figure GDA0003026005820000032
Figure GDA0003026005820000033
In the formulas, W_t and W_{t+1} denote the drug sensitivity prediction matrices of the drug-cell line after the t-th and (t+1)-th walks, respectively, with W_1 = R;
Figure GDA0003026005820000034
denote the left matrix and the right matrix corresponding to the (t+1)-th walk, respectively; D denotes the drug adjacency matrix, C the cell line adjacency matrix, and R the drug-cell line relation initial matrix; λ_left and λ_right control the weights of the random walker in the drug similarity network and the cell line similarity network, respectively, and are both positive; α is the restart parameter of the random walk, with value range [0, 1]; l_l and l_r are the preset numbers of walk steps for the drug adjacency matrix D and the cell line adjacency matrix C, respectively, and are both positive integers.
In the unbalanced double random walk, the drug sensitivity prediction matrix output once the preset numbers of walk steps are completed is the drug sensitivity prediction matrix of the drug-cell line sought by the invention. Note that l_l and l_r are set independently and may therefore differ. If they differ, the left or right matrix whose steps are completed first takes the value 0 in the following iterations, and the walk continues until both step counts are completed; the W_{t+1} obtained after all steps have been walked is the updated drug sensitivity prediction matrix of the drug-cell line. Here W_1 = R means that the initial value of each element of the drug sensitivity prediction matrix equals the value of the corresponding element of the drug-cell line relation initial matrix.
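The iteration described above can be sketched in Python. The update formulas themselves appear only as figures in this text, so the sketch rests on an assumed form that is common for bi-random walks: the left walk computes α·D·W_t + (1 − α)·R, the right walk computes α·W_t·C + (1 − α)·R, and W_{t+1} is their average weighted by λ_left and λ_right, with a side that has finished its steps contributing nothing. The names `matmul`, `restart` and `unbalanced_birw` and the toy matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def restart(walked, R, alpha):
    """Blend one walk step with the initial matrix R (restart term)."""
    return [[alpha * w + (1 - alpha) * r for w, r in zip(w_row, r_row)]
            for w_row, r_row in zip(walked, R)]


def unbalanced_birw(D, C, R, alpha=0.5, l_left=2, l_right=3,
                    lam_left=1.0, lam_right=1.0):
    """Assumed form of the unbalanced double random walk.

    D: N x N drug adjacency matrix
    C: M x M cell line adjacency matrix
    R: N x M drug-cell line relation initial matrix
    """
    W = [row[:] for row in R]  # W_1 = R
    for t in range(1, max(l_left, l_right) + 1):
        parts = []
        if t <= l_left:    # walk on the drug similarity network
            parts.append((lam_left, restart(matmul(D, W), R, alpha)))
        if t <= l_right:   # walk on the cell line similarity network
            parts.append((lam_right, restart(matmul(W, C), R, alpha)))
        total = sum(weight for weight, _ in parts)
        # Weighted average of the sides that are still walking (assumption).
        W = [[sum(weight * m[i][j] for weight, m in parts) / total
              for j in range(len(R[0]))] for i in range(len(R))]
    return W


# Toy example: 2 drugs, 3 cell lines, one known sensitivity per drug.
D = [[1.0, 0.6], [0.6, 1.0]]
C = [[1.0, 0.3, 0.0], [0.3, 1.0, 0.5], [0.0, 0.5, 1.0]]
R = [[0.8, 0.0, 0.0], [0.0, 0.0, 0.4]]
W = unbalanced_birw(D, C, R, alpha=0.4, l_left=2, l_right=3)
```

In this sketch the adjacency matrices are used as given; whether and how they are normalized before the walk is not stated in the text and is left out.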
Further preferably, the calculation formula of the elements of the drug adjacency matrix is as follows:
Figure GDA0003026005820000041
wherein d (i, j) is the value of the ith row and jth column element in the drug adjacency matrix, and Pccd (i, j) is the similarity value of two drugs corresponding to the element d (i, j) in the drug similarity network;
the element calculation formula of the cell line adjacency matrix is as follows:
Figure GDA0003026005820000042
wherein c (i, j) is the value of the element in row i and column j in the cell line adjacency matrix, and Pccc (i, j) is the similarity value of the two cell lines in the cell line similarity network corresponding to the element c (i, j);
the element calculation formula of the drug-cell line relation initial matrix is as follows:
Figure GDA0003026005820000043
wherein r (i, j) is the value of the ith row and jth column element in the initial matrix of the drug-cell line relationship, and Pcce (i, j) is the corresponding sensitivity value of the drug and cell line corresponding to the element r (i, j) in the drug-cell line relationship network.
When calculating the values of the elements of the drug-cell line relation initial matrix, if the drug-cell line relation corresponding to an element is already in the drug-cell line relationship network, the sensitivity data of that drug-cell line is known, i.e. the sensitivity value is known.
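As an illustration of how the initial matrix can be assembled, the following Python sketch fills an N-row, M-column matrix from the known drug-cell line pairs. The element formula is only available as a figure here, so the sketch assumes that a known pair takes its sensitivity value and an unknown pair takes 0; the function name and the drug and cell line labels are hypothetical.

```python
def build_initial_matrix(drugs, cell_lines, known):
    """Build the drug-cell line relation initial matrix R.

    drugs, cell_lines: ordered lists of identifiers (N drugs, M cell lines)
    known: dict mapping (drug, cell_line) -> known sensitivity value
    Unknown pairs are set to 0.0 (assumption, see the lead-in).
    """
    return [[known.get((d, c), 0.0) for c in cell_lines] for d in drugs]


# Hypothetical labels and values, for illustration only.
drugs = ["drug_A", "drug_B"]
cell_lines = ["cell_1", "cell_2", "cell_3"]
known = {("drug_A", "cell_2"): 0.8, ("drug_B", "cell_3"): 1.5}
R = build_initial_matrix(drugs, cell_lines, known)
```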
Further preferably, the process of constructing the drug similarity network in S1 is as follows:
first, obtaining the quantitative descriptor values of the 1D & 2D structure of each drug in the drug set;
then, calculating a Pearson correlation coefficient between every two drugs in the drug set based on a Pearson correlation coefficient formula and descriptor quantitative values of the drugs to obtain a drug similarity network;
wherein a pearson correlation coefficient between the two drugs is equal to a similarity value between the corresponding two drugs;
the calculation formula of the pearson correlation coefficient between the two drugs is as follows:
Figure GDA0003026005820000044
where r_{a,b} denotes the Pearson correlation coefficient between the two drugs a and b, X_{i(a)} denotes the quantitative value of the i-th descriptor of drug a,
Figure GDA0003026005820000045
denotes the mean of the descriptor quantitative values of drug a, and S_{Xa} denotes the variance of the descriptor quantitative values of drug a; Y_{i(b)} denotes the quantitative value of the i-th descriptor of drug b,
Figure GDA0003026005820000051
denotes the mean of the descriptor quantitative values of drug b, S_{Yb} denotes the variance of the descriptor quantitative values of drug b, and N is the number of descriptors.
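A minimal Python sketch of this pairwise calculation follows. It assumes the standard sample form of the Pearson coefficient, in which each deviation from the mean is divided by the sample standard deviation and the sum is divided by N − 1; the function names and descriptor vectors are hypothetical, and real descriptor values would come from the descriptor software.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Sample Pearson correlation of two equal-length descriptor vectors."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)  # sample standard deviations (assumption)
    return sum((a - mx) / sx * (b - my) / sy for a, b in zip(x, y)) / (n - 1)


def drug_similarity_network(descriptors):
    """Map every ordered drug pair to its Pearson similarity value."""
    names = list(descriptors)
    return {(a, b): pearson(descriptors[a], descriptors[b])
            for a in names for b in names}


# Hypothetical descriptor values for three drugs.
descriptors = {
    "drug_A": [1.0, 2.0, 3.0, 4.0],
    "drug_B": [1.0, 3.0, 2.0, 4.0],
    "drug_C": [4.0, 3.0, 2.0, 1.0],
}
network = drug_similarity_network(descriptors)
```

A descriptor vector with no variation has zero standard deviation and would raise a division error in this sketch, so such vectors need separate handling.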
Further preferably, before the drug adjacency matrix is obtained according to the drug similarity network in S2, the method further includes correcting the similarity value between each two drugs in the drug similarity network by using logistic regression;
wherein the formula of the logistic regression is as follows:
Figure GDA0003026005820000052
where L(x) is the corrected similarity value between the two drugs, x is the similarity value between the two drugs before correction, e is the base of the natural logarithm, and c_1 and d_1 are adjustment parameters.
Setting L(0) = 0.0001 gives d_1 = log(9999), and the value of c_1 is tuned by cross-validation to reach an optimal or target value. The logistic regression pushes smaller similarity values between drugs closer to 0 and amplifies larger ones.
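The correction can be sketched in Python as follows. The formula itself is a figure in this text; the sketch assumes the form L(x) = 1 / (1 + e^(c·x + d)), which is the form for which L(0) = 0.0001 yields d = log(9999). The value c = -15 is a hypothetical placeholder for a parameter that the text says is tuned by cross-validation.

```python
import math

D_PARAM = math.log(9999)  # follows from L(0) = 0.0001


def logistic_correct(x, c=-15.0, d=D_PARAM):
    """Assumed logistic correction of a similarity value x.

    c is a hypothetical placeholder; the text tunes it by cross-validation.
    """
    return 1.0 / (1.0 + math.exp(c * x + d))


low = logistic_correct(0.2)   # small similarity is pushed toward 0
high = logistic_correct(0.9)  # large similarity is pushed toward 1
```

With a negative c the function is increasing in x, so the ordering of similarity values is preserved while weak similarities are suppressed.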
Further preferably, the construction process of the cell line similarity network in S1 is as follows:
first, acquiring the cell line gene expression profile data obtained by experimentally testing each cell line with gene probes;
wherein the cell line gene expression profile data comprises the expression value obtained by testing each cell line with each gene probe, and one gene probe yields one expression value per cell line;
then, calculating the variance of each gene probe from its expression values across all cell lines, and selecting, for each cell line, the expression values of the n gene probes with the largest variance;
finally, calculating the Pearson correlation coefficient of the gene expression profiles of every two cell lines in the cell line set, based on the Pearson correlation coefficient formula and the n expression values selected for each cell line;
wherein the Pearson correlation coefficient of the gene expression profiles of two cell lines is equal to the similarity value between the two cell lines;
the Pearson correlation coefficient of the gene expression profiles of two cell lines is calculated as follows:
Figure GDA0003026005820000053
where r_{c,d} denotes the Pearson correlation coefficient of the gene expression profiles of the two cell lines c and d, X_{i(c)} denotes the i-th expression value of cell line c,
Figure GDA0003026005820000054
denotes the mean of the expression values of cell line c, and S_{Xc} denotes the variance of the expression values of cell line c; Y_{i(d)} denotes the i-th expression value of cell line d,
Figure GDA0003026005820000055
denotes the mean of the expression values of cell line d, and S_{Yd} denotes the variance of the expression values of cell line d.
The value of n ranges from 5% to 25% of the total number of gene probes. The cell line gene expression profile data of the invention consists of the values measured by the gene probes through experimental reaction in the cell lines, i.e. the expression values of the invention.
Further preferably, before the obtaining of the cell line adjacency matrix according to the cell line similarity network in S2, the method further includes correcting the similarity value between every two cell lines in the cell line similarity network by using logistic regression;
wherein the formula of the logistic regression is as follows:
Figure GDA0003026005820000061
where L(y) is the corrected similarity value between the two cell lines, y is the similarity value between the two cell lines before correction, e is the base of the natural logarithm, and c_2 and d_2 are adjustment parameters.
Setting L(0) = 0.0001 gives d_2 = log(9999), and the value of c_2 is tuned by cross-validation to reach an optimal or target value. The logistic regression pushes smaller similarity values between cell lines closer to 0 and amplifies larger ones.
Advantageous effects
1. The drug sensitivity prediction method based on the cell line and drug similarity network applies the drug similarity network and the cell line similarity network directly to drug sensitivity prediction and calculates the drug sensitivity prediction matrix of the drug-cell line with an unbalanced double random walk algorithm. This is the first application of the unbalanced random walk algorithm to drug sensitivity prediction. The algorithm diffuses rapidly through a network and can be applied to networks with different topological structures, which makes it a highly advantageous tool for handling biological networks with specific structures and network-based biological computing problems, and improves the reliability of the drug sensitivity prediction results. Experiments also verify that, compared with the integrated heterogeneous network drug sensitivity prediction method of Zhang et al. and the MRSF algorithm, the unbalanced random walk algorithm makes fuller use of the biological network information of drugs and cell lines and gives more accurate predictions. While remaining simple and practical, the method improves the accuracy of drug sensitivity prediction and provides reference and practical value for researchers carrying out experimental analysis and further research on drug sensitivity.
2. According to the method, certain genes which have larger influence on the cell line in the cell line gene spectrum are selected through variance calculation to construct a more accurate cell line similarity network, so that the noise in the cell line similarity network is reduced, and the reliability of a prediction result is further improved; and the logistic regression algorithm is adopted to reduce the noise in the cell line similarity network and the drug similarity network respectively, so that the effect of the drug similarity network and the cell line similarity network as the characteristics of the biological network in drug sensitivity research is fully considered, and the drug sensitivity can be predicted more accurately.
Drawings
FIG. 1 is a flow chart of a method for predicting drug sensitivity based on cell lines and drug similarity networks provided by the present invention;
FIG. 2 shows the RMSE values for each drug obtained by the three methods on the CCLE dataset using leave-one-out cross-validation (a) and ten-fold cross-validation (b).
Detailed Description
The present invention will be further described with reference to the following examples.
The biological data sets used in this example are the Cancer Cell Line Encyclopedia (CCLE) and the Genomics of Drug Sensitivity in Cancer (GDSC); the specific data of the two data sets are described in Table 1 below.
The CCLE dataset consists of large-scale genomic data, including gene expression profiles, mutation status and copy number variation of 1,036 human cancer cell lines, and eight-point dose response curves for 24 chemical compounds across 504 cell lines. Gene expression profiles and drug sensitivity data (measured by the area under the dose response curve) can be downloaded from the CCLE website (http://www.broadinstitute.org/CCLE). Among the 504 cell lines, 491 cancer cell lines have both drug sensitivity measurements and gene expression profile data. The CCLE dataset contains 24 drugs; the corresponding SDF files can be found in the PubChem database for 23 of them, while one compound, LBW242, has no SDF file, so in the drug similarity network the similarity of this compound to all other compounds is 0.
GDSC data can improve cancer treatment by finding therapeutic biomarkers that can be used to identify the patients most likely to respond to anticancer drugs. The raw GDSC dataset available here contains data for 140 drugs, but only 139 have related compounds in the PubChem database, so the number of drugs in this experiment is 139. By cross-comparing the cell lines related to these 139 compounds, 789 cell lines usable in this experiment were found. Extraction from the original data set finally yields 64,814 drug-cell lines, and the drug-cell line relationship data in the invention represents drug sensitivity data. It is worth noting that drug sensitivity data can be measured in different ways (e.g. IC50 value, activity area value and AUC value), and different method comparisons use different measures, so the RMSE values obtained cannot be compared on a uniform standard.
TABLE 1 Experimental data in CCLE and GDSC data sets
Figure GDA0003026005820000071
The invention predicts drug sensitivity based on an unbalanced random walk technique. In this embodiment, two similarity networks, a drug similarity network (DSN) and a cell line similarity network (CSN), are first constructed through gene screening and logistic regression, and the drug-cell line relation data in the data set is extracted as drug sensitivity data to construct the drug-cell line relationship network. Once the three networks (DSN, CSN and drug-cell line relationship network) are constructed, they are fed into the unbalanced double random walk algorithm (birdsp algorithm) proposed herein, which computes the predicted drug sensitivity data. The drug sensitivity prediction method based on the cell line and drug similarity network provided in this embodiment comprises the following specific steps:
Step 1: constructing a drug similarity network DSN, a cell line similarity network CSN and a drug-cell line relation network;
1. Drug similarity network DSN
First, the compound name of each drug in the drug set is determined, and the corresponding SDF file describing the drug's chemical structure is retrieved from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) by compound name. The 1D & 2D structure is extracted from each drug's SDF file with the PaDEL software (http://www.yapcwsoft.com/dd/padeldescriptor/), which also computes the quantitative value of each descriptor of each drug; the descriptors quantitatively describe the 1D & 2D structure of the compound. Finally, the Pearson correlation coefficient between the descriptors of every two compounds is calculated, giving the drug similarity network formed by the pairwise similarity relations between drugs.
The pearson correlation coefficient between each two drugs is calculated as follows:
Figure GDA0003026005820000081
where r_{a,b} denotes the Pearson correlation coefficient between the two drugs a and b, X_{i(a)} denotes the quantitative value of the i-th descriptor of drug a,
Figure GDA0003026005820000082
denotes the mean of the descriptor quantitative values of drug a, and S_{Xa} denotes the variance of the descriptor quantitative values of drug a; Y_{i(b)} denotes the quantitative value of the i-th descriptor of drug b,
Figure GDA0003026005820000083
denotes the mean of the descriptor quantitative values of drug b, S_{Yb} denotes the variance of the descriptor quantitative values of drug b, and N is the number of descriptors.
From the formula, the coefficient r_{a,b} has the value range [-1, 1]. When r_{a,b} is close to 0 the two variables are uncorrelated, and when r_{a,b} is close to 1 or -1 they are strongly correlated.
2. Cell line similarity network CSN
First, the cell line gene expression profile data obtained by experimentally testing each cell line with the gene probes in the data set is acquired. For example, in the data sets of the two organizations, CCLE and GDSC, the cell line gene expression profile data were obtained by testing each cell line with 18,988 and 22,277 gene probes, respectively. Each value in the gene expression profile data is measured by a gene probe through an experimental reaction in a cell line; the invention calls this value an expression value. The gene expression profile data therefore contains the expression value obtained by testing each cell line with each gene probe, with one gene probe yielding one expression value per cell line. For example, when the 491 cell lines in the CCLE data set are tested with 18,988 gene probes, each gene probe yields one expression value per cell line, so each gene probe corresponds to 491 expression values. A gene expression profile describes the type and abundance of the genes expressed by a particular cell or tissue in a particular state.
Then, the variance of each gene probe is calculated from its expression values across all cell lines, and the expression values of the n gene probes with the largest variance are selected for each cell line. For example, in the CCLE data set above each gene probe corresponds to 491 expression values, from which its variance can be calculated. In this example the 1000 gene probes with the largest variance are selected, i.e. n = 1000, so each cell line corresponds to 1000 expression values.
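The probe selection step can be sketched in Python as below. The sketch ranks probes by the population variance of their expression values across cell lines and keeps the top n; the text does not say whether population or sample variance is used, and the choice does not change the ranking. The function names, probe identifiers and expression values are hypothetical.

```python
from statistics import pvariance


def select_top_variance_probes(expr, n):
    """Return the ids of the n probes with the largest variance.

    expr maps probe id -> list of expression values, one per cell line.
    """
    ranked = sorted(expr, key=lambda probe: pvariance(expr[probe]),
                    reverse=True)
    return ranked[:n]


def cell_line_profiles(expr, probes):
    """Return one expression vector per cell line over the chosen probes."""
    return [list(column) for column in zip(*(expr[p] for p in probes))]


# Hypothetical data: 3 probes measured on 3 cell lines.
expr = {
    "probe_1": [1.0, 1.0, 1.0],
    "probe_2": [0.0, 5.0, 10.0],
    "probe_3": [1.0, 2.0, 3.0],
}
top = select_top_variance_probes(expr, n=2)
profiles = cell_line_profiles(expr, top)
```

The resulting per-cell-line vectors are the inputs to the pairwise Pearson calculation that produces the cell line similarity values.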
Finally, the Pearson correlation coefficient of the gene expression profiles of every two cell lines in the cell line set is calculated, based on the Pearson correlation coefficient formula and the n expression values selected for each cell line; the Pearson correlation coefficient of the gene expression profiles of two cell lines is equal to the similarity value between the two cell lines.
The Pearson correlation coefficient of the gene expression profiles of every two cell lines is calculated as follows:
Figure GDA0003026005820000091
where $r_{c,d}$ denotes the Pearson correlation coefficient of the gene profiles of two cell lines c and d; $X_{i(c)}$ denotes an expression value of cell line c, $\bar{X}_c$ the mean of the expression values of cell line c, and $S_{X_c}$ the variance of the expression values of cell line c; $Y_{i(d)}$ denotes an expression value of cell line d, $\bar{Y}_d$ the mean of the expression values of cell line d, and $S_{Y_d}$ the variance of the expression values of cell line d.
From the formula, the coefficient $r_{c,d}$ takes values in [-1, 1]. When $r_{c,d}$ is close to 0, the two variables are uncorrelated; when $r_{c,d}$ is close to 1 or -1, they are strongly correlated.
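As a rough illustration of the similarity calculation, the sketch below builds a cell line similarity matrix with NumPy's `corrcoef` and cross-checks one entry against a standardised-product form of the Pearson coefficient. The helper names are hypothetical, and the standardised form (using sample standard deviations) is an assumption, since the formula itself is only available as an image here.

```python
import numpy as np

def pearson_similarity(profiles):
    """Pairwise Pearson correlation between rows.

    profiles: array of shape (num_cell_lines, n), one row of n expression
    values per cell line. Returns a symmetric (num_cell_lines x num_cell_lines)
    matrix with values in [-1, 1].
    """
    return np.corrcoef(profiles)

def pearson_pair(x, y):
    """Pearson coefficient as a mean product of standardised values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std(ddof=1)   # standardise with sample std. dev.
    zy = (y - y.mean()) / y.std(ddof=1)
    return float(np.sum(zx * zy) / (len(x) - 1))

profiles = np.array([
    [1.0, 2.0, 3.0, 4.0],   # cell line 0
    [2.0, 4.0, 6.0, 8.0],   # cell line 1: same trend as cell line 0
    [4.0, 3.0, 2.0, 1.0],   # cell line 2: opposite trend
])
sim = pearson_similarity(profiles)
```

Here cell lines 0 and 1 are perfectly positively correlated and cell lines 0 and 2 perfectly negatively correlated, which matches the two extremes of the value range described above.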
3. Relating to drug-cell line relationship networks
The drug-cell line relationship network comprises the drug-cell line pairs with known sensitivity values and the corresponding sensitivity values. A drug-cell line pair with a known sensitivity value is one for which the drug sensitivity data of the corresponding drug in the drug set on the corresponding cell line in the cell line set are known. Whether the drug sensitivity data are known is determined by whether the data set contains data for the corresponding drug-cell line pair. Drug sensitivity data, such as AUC or IC50 values, are obtained from direct experiments and are already present in data sets such as CCLE, GDSC and other related databases.
Step 2: Denoise the drug similarity network DSN and the cell line similarity network CSN using logistic regression.
The similarity values in the drug similarity network DSN and the cell line similarity network CSN are all calculated with the Pearson correlation coefficient formula. From that formula, however, the Pearson correlation coefficient is the cosine of the angle between the two vectors obtained by centering the values of the two variables on their means. That is, it measures the similarity of two drugs from a purely mathematical point of view. Drugs, however, carry biological meaning, so the Pearson correlation coefficient calculated in this way, i.e., the drug similarity coefficient, cannot fully reflect whether two drugs are actually similar and may even differ considerably from their actual similarity. The similarity values therefore need to be corrected, which ultimately improves the reliability of the prediction results.
1. Correction of similarity values between every two drugs in a drug similarity network
The similarity value between every two drugs in the drug similarity network is corrected using logistic regression. The formula of the logistic regression is as follows:
Figure GDA0003026005820000101
where L(x) is the corrected similarity value between two drugs, x is the similarity value between the two drugs before correction, e is the base of the natural logarithm, and $c_1$ and $d_1$ are adjustment parameters. The magnitude of the drug similarity value can be controlled by adjusting the parameters $c_1$ and $d_1$. In this embodiment, L(0) is set to 0.0001, so $d_1$ has the value log(9999), and the value of $c_1$ is tuned by cross-validation to obtain an optimal or target value. With logistic regression, smaller similarity values between drugs are pushed closer to 0 and larger similarity values are amplified. Through this procedure, the drug similarity value x is converted into a new similarity value L(x).
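A small Python sketch may help to see the effect of the correction. It assumes the functional form L(x) = 1 / (1 + e^(c·x + d)), which is consistent with the statement that L(0) = 0.0001 gives d = log(9999) but is not confirmed by the text, since the formula is only available as an image. The value `c_example = -15.0` is purely hypothetical; the embodiment tunes this parameter by cross-validation.

```python
import math

D_PARAM = math.log(9999)   # follows from L(0) = 0.0001 under the assumed form

def logistic_correct(x, c, d=D_PARAM):
    """Assumed correction L(x) = 1 / (1 + exp(c * x + d))."""
    return 1.0 / (1.0 + math.exp(c * x + d))

c_example = -15.0          # hypothetical value, for illustration only

low = logistic_correct(0.3, c_example)    # small similarity, pushed toward 0
high = logistic_correct(0.9, c_example)   # large similarity, amplified
at_zero = logistic_correct(0.0, c_example)
```

With a negative c, similarity values below a threshold are compressed toward 0 and values above it are moved toward 1, which is the behaviour described for the correction step. The same function would apply to cell line similarity values with their own parameters.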
2. Correction of similarity values between every two cell lines in a cell line similarity network
The similarity value between every two cell lines in the cell line similarity network is corrected using logistic regression. The formula of the logistic regression in this case is as follows:
Figure GDA0003026005820000102
where L(y) is the corrected similarity value between two cell lines, y is the similarity value between the two cell lines before correction, e is the base of the natural logarithm, and $c_2$ and $d_2$ are adjustment parameters. In the same way, the magnitude of the cell line similarity value can be controlled by adjusting the parameters $c_2$ and $d_2$. In this embodiment, L(0) is set to 0.0001, so $d_2$ has the value log(9999), and the value of $c_2$ is tuned by cross-validation to obtain an optimal or target value. With logistic regression, smaller similarity values between cell lines are pushed closer to 0 and larger similarity values are amplified. Through this procedure, the cell line similarity value y is converted into a new similarity value L(y).
Step 3: Obtain the corresponding drug adjacency matrix D (N × N), cell line adjacency matrix C (M × M) and drug-cell line relation initial matrix R (N × M).
N is the number of drugs in the drug set, and M is the number of cell lines in the cell line set. The drug adjacency matrix D is the adjacency matrix of the drug similarity network DSN, the cell line adjacency matrix C is the adjacency matrix of the cell line similarity network CSN, and the drug-cell line relation initial matrix R is the matrix of known drug-cell line associations constructed from the drug-cell line relationship network.
The values of the elements of the drug adjacency matrix D, the cell line adjacency matrix C and the drug-cell line relation initial matrix R are calculated as follows.
The element calculation formula of the drug adjacency matrix is as follows:
Figure GDA0003026005820000111
where d(i, j) is the value of the element in row i and column j of the drug adjacency matrix, and Pccd(i, j) is the similarity value, in the drug similarity network, of the two drugs corresponding to element d(i, j). In this embodiment, Pccd(i, j) is the similarity value after denoising, i.e., after logistic regression has been applied. In other feasible embodiments in which no logistic regression denoising is performed, Pccd(i, j) is the similarity value calculated directly with the Pearson correlation coefficient formula.
The formula for calculating the elements of the cell line adjacency matrix is as follows:
Figure GDA0003026005820000112
where c(i, j) is the value of the element in row i and column j of the cell line adjacency matrix, and Pccc(i, j) is the similarity value, in the cell line similarity network, of the two cell lines corresponding to element c(i, j). Similarly, in this embodiment, Pccc(i, j) is the similarity value after denoising, i.e., after logistic regression has been applied. In other feasible embodiments in which no logistic regression denoising is performed, Pccc(i, j) is the similarity value calculated directly with the Pearson correlation coefficient formula.
The element calculation formula of the drug-cell line relation initial matrix is as follows:
Figure GDA0003026005820000113
wherein r (i, j) is the value of the ith row and jth column element in the initial matrix of the drug-cell line relationship, and Pcce (i, j) is the corresponding sensitivity value of the drug and cell line corresponding to the element r (i, j) in the drug-cell line relationship network. That is, if a given drug has sensitivity data for a cell line, its element r (i, j) is the corresponding sensitivity value, otherwise r (i, j) is 0.
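The construction of the initial matrix R can be sketched as follows. The dictionary-based input format and the function name `build_initial_matrix` are assumptions chosen for illustration; the description only states that known pairs take their sensitivity value and all other entries are 0.

```python
import numpy as np

def build_initial_matrix(known, num_drugs, num_cell_lines):
    """Build the N x M drug-cell line initial matrix R.

    known: dict mapping (drug_index, cell_line_index) -> known sensitivity
    value (e.g. an AUC or IC50 value). Pairs absent from the dict are unknown
    and are left at 0.
    """
    R = np.zeros((num_drugs, num_cell_lines))
    for (i, j), value in known.items():
        R[i, j] = value
    return R

# Toy example: 2 drugs, 3 cell lines, 3 known sensitivity values.
known = {(0, 0): 0.42, (0, 2): 0.10, (1, 1): 0.77}
R = build_initial_matrix(known, num_drugs=2, num_cell_lines=3)
```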
Step 4: Obtain the drug sensitivity prediction matrix W (N × M) of the drug-cell line using an unbalanced double random walk algorithm based on the drug adjacency matrix D (N × N), the cell line adjacency matrix C (M × M) and the drug-cell line relation initial matrix R (N × M).
After walking with the unbalanced double random walk formula, each element of the resulting drug sensitivity prediction matrix of the drug-cell line is the predicted sensitivity value of the corresponding drug on the corresponding cell line. In this example, the drug sensitivity prediction matrix is an N-row, M-column matrix, and the value of its element w(i, j) is the predicted drug sensitivity of drug i on cell line j. The initial value of W is the matrix R.
The unbalanced double random walk formula is as follows:
Figure GDA0003026005820000121
Figure GDA0003026005820000122
Figure GDA0003026005820000123
In the formulas, $W_t$ and $W_{t+1}$ denote the drug sensitivity prediction matrices of the drug-cell line after the t-th and (t+1)-th walks, respectively, with $W_1 = R$;
Figure GDA0003026005820000124
respectively denote the left matrix and the right matrix corresponding to the (t+1)-th walk; D denotes the drug adjacency matrix, C the cell line adjacency matrix, and R the drug-cell line relation initial matrix; $\lambda_{left}$ and $\lambda_{right}$ are weights controlling the random walker in the drug similarity network and the cell line similarity network, respectively, and both are positive numbers; α is the restart parameter of the random walk, with value range [0, 1]. Because the matrix R participates in the walk, the whole process can be regulated by changing the value of α. $l_l$ and $l_r$ denote the preset numbers of walk steps for the drug adjacency matrix D and the cell line adjacency matrix C, respectively, and both are positive integers.
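The walk can be sketched in Python under explicit assumptions. Because the update formulas are only available as images here, the specific form used below is an assumption modelled on common bi-random walk schemes, not the formula of the invention: a left walk αDW + (1-α)R while t ≤ l_l, a right walk αWC + (1-α)R while t ≤ l_r, and a weighted average of the active sides using λ_left and λ_right. The function name and default parameter values are likewise hypothetical.

```python
import numpy as np

def unbalanced_birw(D, C, R, alpha=0.3, l_left=2, l_right=2,
                    lam_left=1.0, lam_right=1.0):
    """Illustrative unbalanced bi-random walk (assumed update rule).

    D: (N x N) drug adjacency matrix, C: (M x M) cell line adjacency matrix,
    R: (N x M) initial matrix. Returns the (N x M) prediction matrix W.
    """
    W = R.copy()                                  # the walk starts from R
    for t in range(1, max(l_left, l_right) + 1):
        parts, weights = [], []
        if t <= l_left:                           # walk on the drug network
            parts.append(alpha * D @ W + (1 - alpha) * R)
            weights.append(lam_left)
        if t <= l_right:                          # walk on the cell line network
            parts.append(alpha * W @ C + (1 - alpha) * R)
            weights.append(lam_right)
        # Weighted average of whichever sides are still walking.
        W = sum(w * p for w, p in zip(weights, parts)) / sum(weights)
    return W

# Toy example: drug 1 has no known data but is similar to drug 0.
D = np.array([[0.5, 0.5], [0.5, 0.5]])
C = np.eye(3)
R = np.array([[0.8, 0.0, 0.2], [0.0, 0.0, 0.0]])
W = unbalanced_birw(D, C, R)
```

In the toy example, the row of the drug without known data receives non-zero predictions through its similarity to the other drug. Using different values for `l_left` and `l_right` is what makes the walk unbalanced: one network stops contributing earlier than the other.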
Verification and simulation:
To evaluate the effectiveness of the proposed method, leave-one-out validation and ten-fold cross-validation were used on the two data sets CCLE and GDSC to compare the proposed method with two other methods, Zhang's and MRSF, in terms of the Pearson correlation coefficient and the root mean square error (RMSE) between the predicted and true values, so as to observe and compare the effectiveness of the random-walk-based drug sensitivity prediction method provided by the invention.
a. Verification of algorithm performance based on Pearson correlation coefficient
The performance of an algorithm is evaluated by calculating the Pearson correlation coefficient between the predicted and true values of the drug sensitivity data; the larger the Pearson correlation coefficient, the better the performance of the algorithm. Notably, an algorithm is considered meaningless when the value of the Pearson correlation coefficient is less than 0.6. Three algorithms were applied to the CCLE and GDSC data sets: BiRWSP (the method of the invention), Zhang's and MRSF, each tested by leave-one-out validation and ten-fold cross-validation.
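The two validation schemes can be illustrated with a short sketch that hides known entries of R fold by fold. The function name `kfold_masks`, the random shuffling and the convention that non-zero entries are the known ones are assumptions for illustration; the description does not specify how folds were formed.

```python
import numpy as np

def kfold_masks(R, k=10, seed=0):
    """Yield (R_train, held_out) pairs for k-fold cross-validation.

    Known entries are taken to be the non-zero entries of R. In each fold the
    held-out entries are set to 0 in R_train; held_out is a (rows, cols) tuple
    of their positions, used to compare predictions with the true values.
    """
    rows, cols = np.nonzero(R)
    order = np.random.default_rng(seed).permutation(len(rows))
    for fold in np.array_split(order, k):
        R_train = R.copy()
        R_train[rows[fold], cols[fold]] = 0.0     # hide this fold
        yield R_train, (rows[fold], cols[fold])

# Toy example: 12 known entries split into 4 folds of 3.
R_demo = np.arange(1, 13, dtype=float).reshape(3, 4)
folds = list(kfold_masks(R_demo, k=4))
```

Leave-one-out validation corresponds to setting k to the number of known entries, so that each fold hides exactly one drug-cell line pair.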
The second to fourth rows of Table 2 give the Pearson correlation values of the BiRWSP, Zhang's and MRSF algorithms on the CCLE data set, and the last three rows give the Pearson correlation values of the same algorithms on the GDSC data set. For each of the three algorithms, the average, minimum and maximum of the Pearson correlation values are listed for comparison. The average Pearson value of the BiRWSP algorithm on the CCLE data set is 0.9082, which is 1.6% and 12.6% higher than those of the two other algorithms (Zhang's and MRSF), indicating that the BiRWSP algorithm performs best. Of all the compared algorithms, Zhang's performs the worst, which shows that the two-layer drug-cell line network still contains a great deal of information that can be mined for drug sensitivity prediction. On the GDSC data set, where the data are larger and less information is known than on the CCLE data set, the values obtained by the three algorithms are substantially the same.
Table 2 Pearson correlation coefficients between true and predicted values of the three algorithms under leave-one-out validation
Figure GDA0003026005820000131
The second to fourth rows of Table 3 give the Pearson correlation values of the BiRWSP, Zhang's and MRSF algorithms on the CCLE data set under ten-fold cross-validation, and the last three rows give the Pearson correlation values of the same algorithms on the GDSC data set. As in Table 2, the average, minimum and maximum of the Pearson correlation values of the three algorithms are listed for comparison. The average Pearson value of the BiRWSP algorithm on the CCLE data set is 0.9082, which is 12.6% and 1.6% higher than those of Zhang's and MRSF, respectively. On the GDSC data set, the average Pearson value of the BiRWSP algorithm is 0.8723, which is 18.92% and 1.83% higher than those of Zhang's and MRSF, respectively. In both cases the BiRWSP algorithm performs best. Of all the compared algorithms, Zhang's performs the worst, which shows that the two-layer drug-cell line network still contains a great deal of information that can be mined for drug sensitivity prediction.
Table 3 Pearson correlation coefficients between true and predicted values of the three algorithms under ten-fold cross-validation
Figure GDA0003026005820000132
b. Verification of algorithm performance based on RMSE
As can be seen from the definition and equation of RMSE, smaller value of RMSE means smaller difference between the predicted value and the true value, i.e. better prediction effect.
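The two evaluation metrics used in this section can be written compactly as below. This is a minimal sketch with hypothetical helper names, using the standard definitions of RMSE and the Pearson correlation coefficient.

```python
import numpy as np

def rmse(pred, true):
    """Root mean square error between predicted and true values."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def pearson(pred, true):
    """Pearson correlation coefficient between predicted and true values."""
    return float(np.corrcoef(pred, true)[0, 1])

# Toy example: one of three predictions is off by 2.
err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])      # sqrt(4 / 3)
corr = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # perfectly correlated
```

A lower RMSE and a higher Pearson coefficient both indicate better agreement between prediction and measurement, which is why the two metrics are reported side by side.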
We evaluate our algorithm from the perspective of RMSE. Zhang's method uses leave-one-out validation on both data sets to evaluate the quality of drug sensitivity prediction, while the MRSF algorithm uses ten-fold cross-validation on both data sets. Here, drug sensitivity data were calculated on both the CCLE and GDSC data sets using both validation methods. Because 1000 genes are screened before the cell line similarity network is constructed, and logistic regression is applied to the drug similarity network and the cell line similarity network to correct the similarity values, leave-one-out analysis was carried out on both the processed data and the unprocessed data to keep the comparison as fair as possible. The results are shown in Table 4 below.
TABLE 4 RMSE data leave-one-out validation comparison of BiRWSP algorithm on two data sets
Figure GDA0003026005820000141
In the table, ul_us denotes data subjected to neither gene screening nor logistic regression, where gene screening means calculating the variance of each gene probe and selecting the data of the n gene probes with the largest variance to calculate the cell line similarity values; ul_s denotes data not subjected to gene screening but subjected to logistic regression; l_us denotes data subjected to gene screening but not to logistic regression; and l_s denotes data subjected to both gene screening and logistic regression. Table 4 shows that the RMSE of the BiRWSP algorithm is smallest for the data subjected to both gene screening and logistic regression, so these data are used in the subsequent experiments.
Table 5 Leave-one-out validation comparison of the RMSE of the three algorithms on the two data sets
Figure GDA0003026005820000142
The second to fourth rows of Table 5 give the RMSE values between the predicted and observed values of the BiRWSP, Zhang's and MRSF algorithms on the CCLE data set, and the last three rows give the RMSE values between the predicted and observed values of the same algorithms on the GDSC data set. As the table shows, under leave-one-out validation on the CCLE data set the BiRWSP algorithm performs best in terms of both the average and the maximum RMSE, its average being 0.7206 and 0.0090 lower than those of the Zhang's and MRSF algorithms, respectively. On the GDSC data set, the BiRWSP algorithm also has the advantage. Zhang's algorithm has the largest RMSE, which again shows that many potential relationships remain to be inferred when similarity is used to predict drug sensitivity.
The second to fourth rows of Table 6 give the RMSE values between the predicted and observed values of the BiRWSP, Zhang's and MRSF algorithms on the CCLE data set, and the last three rows give the RMSE values between the predicted and observed values of the same algorithms on the GDSC data set. As the table shows, under ten-fold cross-validation the BiRWSP algorithm performs best in terms of both the average RMSE, which is 0.4652 and 0.0537 lower than those of the Zhang's and MRSF algorithms, respectively, and the maximum RMSE, which is 0.2197 and 0.0244 lower than those of the Zhang's and MRSF algorithms, respectively. The BiRWSP algorithm therefore also has the advantage under ten-fold cross-validation. Zhang's algorithm has the largest RMSE, which again shows that many potential relationships remain to be inferred when similarity is used to predict drug sensitivity.
Table 6 Ten-fold cross-validation comparison of the RMSE between predicted and true values of the three algorithms
Figure GDA0003026005820000151
Figure 2 shows the RMSE values calculated for each drug on the CCLE data set by the three methods under leave-one-out validation (top) and ten-fold cross-validation (bottom). Under leave-one-out validation, the BiRWSP algorithm has a smaller RMSE than the other two methods for 18 of the 24 drugs, and it also has the advantage in average RMSE. Under ten-fold cross-validation, the BiRWSP algorithm has a lower RMSE than the other two methods for 20 of the 24 drugs, and again has the advantage in average RMSE. BiRWSP is therefore more advantageous in predicting sensitivity to each individual drug.
It should be emphasized that the examples described herein are illustrative and not restrictive, and thus the invention is not to be limited to the examples described herein, but rather to other embodiments that may be devised by those skilled in the art based on the teachings herein, and that various modifications, alterations, and substitutions are possible without departing from the spirit and scope of the present invention.

Claims (7)

1. A drug sensitivity prediction method based on cell line and drug similarity networks, characterized in that the method comprises the following steps:
S1: constructing a drug similarity network, a cell line similarity network and a drug-cell line relation network;
wherein the drug similarity network comprises similarity values for any two drugs in the drug set, the cell line similarity network comprises similarity values for any two cell lines in the cell line set, and the drug-cell line relationship network comprises drug-cell lines of known sensitivity values and corresponding sensitivity values;
the drug-cell line of known sensitivity value indicates that the drug sensitivity data of the corresponding drug in the drug set to the corresponding cell line in the cell line set is known;
S2: respectively obtaining a corresponding drug adjacency matrix, a cell line adjacency matrix and a drug-cell line relation initial matrix according to the drug similarity network, the cell line similarity network and the drug-cell line relation network;
wherein, each element value in the drug adjacency matrix, the cell line adjacency matrix and the drug-cell line relation initial matrix is respectively determined according to the similarity value between two corresponding drugs, the similarity value between two corresponding cell lines and whether the sensitivity value of the corresponding drug-cell line is known or not;
the drug adjacency matrix is an N-row, N-column matrix, and the cell line adjacency matrix is an M-row, M-column matrix; the drug-cell line relation initial matrix is an N-row, M-column matrix or an M-row, N-column matrix, N is the number of drugs in the drug set, and M is the number of cell lines in the cell line set;
S3: taking the drug-cell line relation initial matrix as the initial value of a drug sensitivity prediction matrix of the drug-cell line, and updating the drug sensitivity prediction matrix of the drug-cell line by adopting an unbalanced double random walk algorithm based on the drug adjacency matrix and the cell line adjacency matrix;
wherein each element of the drug sensitivity prediction matrix of the drug-cell line, as updated by walking with the unbalanced double random walk algorithm, is the predicted sensitivity value of the corresponding drug on the corresponding cell line; the drug sensitivity prediction matrix of the drug-cell line is an N-row, M-column matrix or an M-row, N-column matrix.
2. The method of claim 1, wherein: the unbalanced double random walk formula is as follows:
Figure FDA0003026005810000011
Figure FDA0003026005810000012
Figure FDA0003026005810000013
in the formulas, $W_t$ and $W_{t+1}$ denote the drug sensitivity prediction matrices of the drug-cell line after the t-th and (t+1)-th walks, respectively, with $W_1 = R$;
Figure FDA0003026005810000014
respectively denote the left matrix and the right matrix corresponding to the (t+1)-th walk; D denotes the drug adjacency matrix, C the cell line adjacency matrix, and R the drug-cell line relation initial matrix; $\lambda_{left}$ and $\lambda_{right}$ are weights controlling the random walker in the drug similarity network and the cell line similarity network, respectively, and both are positive numbers; α is the restart parameter of the random walk, with value range [0, 1]; $l_l$ and $l_r$ denote the preset numbers of walk steps for the drug adjacency matrix D and the cell line adjacency matrix C, respectively, and both are positive integers.
3. The method of claim 1, wherein: the element calculation formula of the drug adjacency matrix is as follows:
Figure FDA0003026005810000021
wherein d (i, j) is the value of the ith row and jth column element in the drug adjacency matrix, and Pccd (i, j) is the similarity value of two drugs corresponding to the element d (i, j) in the drug similarity network;
the element calculation formula of the cell line adjacency matrix is as follows:
Figure FDA0003026005810000022
wherein c (i, j) is the value of the element in row i and column j in the cell line adjacency matrix, and Pccc (i, j) is the similarity value of the two cell lines in the cell line similarity network corresponding to the element c (i, j);
the element calculation formula of the drug-cell line relation initial matrix is as follows:
Figure FDA0003026005810000023
wherein r (i, j) is the value of the ith row and jth column element in the initial matrix of the drug-cell line relationship, and Pcce (i, j) is the corresponding sensitivity value of the drug and cell line corresponding to the element r (i, j) in the drug-cell line relationship network.
4. The method of claim 1, wherein: the construction process of the drug similarity network in S1 is as follows:
firstly, obtaining quantitative values of the 1D & 2D structure descriptors of each drug in the drug set;
then, calculating a Pearson correlation coefficient between every two drugs in the drug set based on a Pearson correlation coefficient formula and descriptor quantitative values of the drugs to obtain a drug similarity network;
wherein a pearson correlation coefficient between the two drugs is equal to a similarity value between the corresponding two drugs;
the calculation formula of the pearson correlation coefficient between the two drugs is as follows:
Figure FDA0003026005810000024
wherein $r_{a,b}$ denotes the Pearson correlation coefficient between two drugs a and b; $X_{i(a)}$ denotes the quantitative value of the i-th descriptor of drug a, $\bar{X}_a$ the mean of the quantitative values of the descriptors of drug a, and $S_{X_a}$ the variance of the quantitative values of the descriptors of drug a; $Y_{i(b)}$ denotes the quantitative value of the i-th descriptor of drug b, $\bar{Y}_b$ the mean of the quantitative values of the descriptors of drug b, and $S_{Y_b}$ the variance of the quantitative values of the descriptors of drug b; N is the number of descriptors.
5. The method of claim 4, wherein: before the drug adjacency matrix is obtained according to the drug similarity network in S2, the similarity value between every two drugs in the drug similarity network is corrected by adopting logistic regression;
wherein the formula of the logistic regression is as follows:
Figure FDA0003026005810000032
wherein L(x) is the corrected similarity value between two drugs, x is the similarity value between the two drugs before correction, e is the base of the natural logarithm, and $c_1$ and $d_1$ are adjustment parameters.
6. The method of claim 1, wherein: the construction process of the cell line similarity network in S1 is as follows:
firstly, acquiring cell line gene profile data obtained by experimentally testing each cell line with gene probes;
wherein the cell line gene profile data comprise the expression value obtained by experimentally testing each cell line with each gene probe, and one gene probe corresponds to one expression value per cell line;
then, calculating the variance of each gene probe based on its expression values across all cell lines, and selecting, for each cell line, the expression values obtained with the n gene probes having the largest variance;
finally, calculating the Pearson correlation coefficient of the gene profiles of every two cell lines in the cell line set based on a Pearson correlation coefficient formula and the n expression values obtained for each cell line with the n gene probes;
wherein the Pearson's correlation coefficient for the gene profile between the two cell lines is equal to the similarity value between the corresponding two cell lines;
the pearson correlation coefficient of the gene profile between the two cell lines was calculated as follows:
Figure FDA0003026005810000033
wherein $r_{c,d}$ denotes the Pearson correlation coefficient of the gene profiles of two cell lines c and d; $X_{i(c)}$ denotes an expression value of cell line c, $\bar{X}_c$ the mean of the expression values of cell line c, and $S_{X_c}$ the variance of the expression values of cell line c; $Y_{i(d)}$ denotes an expression value of cell line d, $\bar{Y}_d$ the mean of the expression values of cell line d, and $S_{Y_d}$ the variance of the expression values of cell line d.
7. The method of claim 6, wherein: before obtaining the cell line adjacency matrix according to the cell line similarity network in the S2, correcting the similarity value between every two cell lines in the cell line similarity network by using logistic regression;
wherein the formula of the logistic regression is as follows:
Figure FDA0003026005810000041
wherein L(y) is the corrected similarity value between two cell lines, y is the similarity value between the two cell lines before correction, e is the base of the natural logarithm, and $c_2$ and $d_2$ are adjustment parameters.
CN201810578523.2A 2018-06-07 2018-06-07 Drug sensitivity prediction method based on cell line and drug similarity network Active CN108830040B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810578523.2A CN108830040B (en) 2018-06-07 2018-06-07 Drug sensitivity prediction method based on cell line and drug similarity network

Publications (2)

Publication Number Publication Date
CN108830040A CN108830040A (en) 2018-11-16
CN108830040B true CN108830040B (en) 2021-06-15

Family

ID=64144194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810578523.2A Active CN108830040B (en) 2018-06-07 2018-06-07 Drug sensitivity prediction method based on cell line and drug similarity network

Country Status (1)

Country Link
CN (1) CN108830040B (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221114

Address after: A1002, Tian'an Innovation and Technology Plaza, Chegong Temple, Shatou Street, Futian District, Shenzhen City, Guangdong Province 518000

Patentee after: Shenzhen Zaozhidao Technology Co., Ltd.

Address before: No. 932 Lushan Road, Yuelu District, Changsha City, Hunan Province 410083

Patentee before: Central South University
