CN117275608A - Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs - Google Patents


Info

Publication number
CN117275608A
Authority
CN
China
Prior art keywords
drug
medicine
data
cell line
features
Prior art date
Legal status
Granted
Application number
CN202311155808.2A
Other languages
Chinese (zh)
Other versions
CN117275608B (en
Inventor
杨波
郭越
曹戟
何俏军
胡海涛
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202311155808.2A
Publication of CN117275608A
Application granted
Publication of CN117275608B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Medical Informatics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Software Systems (AREA)
  • Toxicology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses an interpretable method and device for predicting anticancer drug synergy based on cooperative attention, comprising the following steps: obtaining multi-omics data of cell lines, at least 2 kinds of drug data, and combination index data of drug combinations on different cell lines, and constructing training samples; constructing structural features of each drug data and embedded features of the multi-omics data; constructing a drug combination prediction model comprising a drug-cell line associated feature extraction module and a drug-drug associated feature extraction module corresponding to each drug data, and a drug combination index prediction module; and performing parameter optimization on the drug combination prediction model using the training samples. Based on a cooperative attention mechanism, the method and device achieve interpretable encoding and prediction of drug-drug interactions and drug-cell line interactions, thereby achieving more efficient and intelligent prediction of tumor drug combinations.

Description

Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs
Technical Field
The invention belongs to the technical field of drug combination prediction and evaluation, and in particular relates to an interpretable method and device for predicting anticancer drug synergy based on cooperative attention.
Background
Cancer is a major refractory disease worldwide, and drug therapy is one of its principal treatments. The rapid development of antitumor drugs has given patients more medication options and has also promoted combination therapy. Drug combination has proven to be an effective treatment strategy that can improve efficacy and reduce toxic side effects. However, discovering drug combinations still depends mainly on clinical experience and experimental screening, which is inefficient and extremely costly. Moreover, because cancers are highly heterogeneous and drug actions are diverse, human experience alone cannot account for all the interactions between drugs. An efficient and intelligent strategy for discovering drug combinations therefore has important scientific significance and clinical value.
In recent years, medical big data and deep learning have offered new opportunities to solve this problem. On the one hand, high-throughput experimental techniques such as sequencing and CRISPR have produced a series of cancer-related databases. For example, the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA) hold multi-omics data for large numbers of cancer cell lines and cancer patients and provide data resources for the molecular characterization of tumors. The drug combination field also has several large screening databases, such as DrugComb, DrugCombDB and SYNERGxDB, which record nearly a million combination measurements for thousands of drugs on more than three hundred cell lines and provide a data basis for applying artificial intelligence to combination problems.
On the other hand, deep learning is a powerful artificial intelligence technique that has shown great potential in fields such as image recognition, natural language processing and data analysis. In oncology it has been widely used in research on tumor diagnosis, drug development and personalized medicine. At present, deep learning methods are mostly used to predict the response to single drugs, and studies on drug combination prediction are relatively few. Two kinds of model construction are mainly used. The first extracts drug features and cell line features in two separate branches and predicts synergy after simple feature concatenation, as in the patent with publication number CN111223577A; its prediction accuracy is insufficient because important information such as drug-drug interactions and drug-target interactions is not considered. The second encodes genes and drug molecules as separate nodes of a heterogeneous graph and predicts drug synergy by extracting associations between nodes, as in the patent with publication number CN114420309A; because such methods rely heavily on prior knowledge of known drug-drug and drug-target associations, their inference ability is limited when predicting for entirely new drugs or entirely new cell lines.
Disclosure of Invention
In view of the above, the present invention aims to provide an interpretable method and device for predicting anticancer drug synergy based on cooperative attention. The method captures the chemical information of each drug through substructure encoding, integrates multi-omics data to extract the individual characteristics of each cell line, and uses a cooperative attention mechanism to encode and predict drug-drug interactions and drug-cell line interactions in an interpretable way, thereby achieving more efficient and intelligent prediction of tumor drug combinations.
To achieve the above object, an embodiment provides an interpretable method for predicting anticancer drug synergy based on cooperative attention, comprising the following steps:
obtaining multi-omics data of cell lines, at least 2 kinds of drug data, and combination index data of drug combinations on different cell lines;
constructing structural features of each drug data and embedded features of the multi-omics data;
constructing a drug combination prediction model, wherein the drug combination prediction model comprises a drug-cell line associated feature extraction module and a drug-drug associated feature extraction module corresponding to each drug data, and a drug combination index prediction module; each drug-cell line associated feature extraction module is used for performing associated feature extraction, based on cooperative attention, on the structural features of the corresponding drug data and the embedded features of the multi-omics data to obtain drug-specific cell line features and cell line-specific drug features, and the drug-specific cell line features corresponding to all the drug data are spliced to obtain final cell line features; each drug-drug associated feature extraction module is used for performing associated feature extraction, based on cooperative attention, on the structural features of the corresponding drug data and the structural features of the other drug data to obtain drug features fused with other drug information, and the cell line-specific drug features and the drug features fused with other drug information are spliced as the final drug features of each drug data; the drug combination index prediction module is used for predicting the drug combination index based on the final drug features of all the drug data and the final cell line features;
taking the multi-omics data of the cell lines and the drug data as sample data, taking the combination index data of the drug combinations on different cell lines as truth labels, and performing parameter optimization on the drug combination prediction model;
and performing drug combination prediction using the parameter-optimized drug combination prediction model.
Preferably, in the drug combination prediction model, the input of the drug-drug associated feature extraction module corresponding to each drug data is connected to the output of the drug-cell line associated feature extraction module corresponding to that drug data; in this case, each drug-drug associated feature extraction module performs associated feature extraction, based on cooperative attention, on the cell line-specific drug features of that drug data and the structural features of the other drug data, to obtain drug features fusing cell line specificity and other drug information as the final drug features.
Preferably, in the drug combination prediction model, the input of the drug-cell line associated feature extraction module corresponding to each drug data is connected to the output of the drug-drug associated feature extraction module corresponding to that drug data; in this case, each drug-cell line associated feature extraction module performs associated feature extraction, based on cooperative attention, on the drug features fused with other drug information of that drug data and the embedded features of the multi-omics data, to obtain multi-drug-specific cell line features and cell line-specific multi-drug features, the latter serving as the final drug features, and the multi-drug-specific cell line features corresponding to all the drug data are spliced to obtain the final cell line features.
Preferably, each drug-cell line associated feature extraction module comprises a first multi-head cooperative attention network and a third feedforward neural network. In the first multi-head cooperative attention network, a Query matrix is generated from a first input matrix, and a Key matrix and a Value matrix are generated from a second input matrix; the Query matrix is multiplied by the transpose of the Key matrix to obtain an attention matrix, and the attention matrix is used as weights and multiplied with the Value matrix to obtain an output matrix; the output matrix of the first multi-head cooperative attention network passes through residual connection and layer normalization and is then input to the third feedforward neural network for nonlinear transformation to enhance the features;
when the first input matrix is the structural features of the drug data and the second input matrix is the embedded features of the multi-omics data, the output matrix is the drug-specific cell line features;
when the first input matrix is the embedded features of the multi-omics data and the second input matrix is the structural features of the drug data, the output matrix is the cell line-specific drug features;
when the first input matrix is the drug features fused with other drug information and the second input matrix is the embedded features of the multi-omics data, the output matrix is the multi-drug-specific cell line features;
when the first input matrix is the embedded features of the multi-omics data and the second input matrix is the drug features fused with other drug information, the output matrix is the cell line-specific multi-drug features.
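The Query/Key/Value flow described above can be sketched for a single attention head as follows. This is a minimal illustration, not the patented implementation: the softmax and the 1/sqrt(d) scaling are assumptions borrowed from standard scaled dot-product attention (the text only states the matrix product and the weighting), the projection matrices `w_q`, `w_k`, `w_v` and the toy inputs are hypothetical, and the feedforward network that follows is omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(row, eps=1e-5):
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]

def co_attention(first, second, w_q, w_k, w_v):
    """Query comes from `first`; Key and Value come from `second`."""
    q = matmul(first, w_q)
    k = matmul(second, w_k)
    v = matmul(second, w_v)
    scale = math.sqrt(len(q[0]))            # assumption: scaled dot product
    scores = matmul(q, transpose(k))
    attn = [softmax([s / scale for s in row]) for row in scores]  # assumption: softmax
    out = matmul(attn, v)
    # Residual connection with the query-side input, then layer normalization.
    fused = [layer_norm([a + b for a, b in zip(x, o)]) for x, o in zip(first, out)]
    return fused, attn

# Toy inputs: 3 drug substructures and 4 gene positions, both embedded in 2 dimensions.
drug = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cell = [[0.5, 0.1], [0.2, 0.9], [0.3, 0.3], [1.0, 0.0]]
eye = [[1.0, 0.0], [0.0, 1.0]]              # identity projections, for illustration only
fused, attn = co_attention(drug, cell, eye, eye, eye)
```

Each row of `attn` holds the weights one substructure places on the gene positions, which is the kind of distribution the text relies on for interpretability. Swapping the two inputs gives the other direction of the module.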
Preferably, each drug-drug associated feature extraction module comprises a second multi-head cooperative attention network, a third residual connection and layer normalization, a fourth feedforward neural network, and a fourth residual connection and layer normalization. In the second multi-head cooperative attention network, a Query matrix is generated from a third input feature, and a Key matrix and a Value matrix are generated from a fourth input feature; the Query matrix is multiplied by the transpose of the Key matrix to obtain an attention matrix, and the attention matrix is used as weights and multiplied with the Value matrix to obtain an output matrix; the input and output of the second multi-head cooperative attention network are residually connected, as are the input and output of the fourth feedforward neural network; the output matrix of the second multi-head cooperative attention network is transformed in turn by the third residual connection and layer normalization, the fourth feedforward neural network, and the fourth residual connection and layer normalization;
when the third input feature is the structural features of one drug data and the fourth input feature is the structural features of the other drug data, the output matrix is the drug features fused with other drug information;
when the third input feature is the structural features of one drug data and the fourth input feature is the cell line-specific drug features of the other drug data, the output matrix is the drug features fusing cell line specificity and other drug information.
Preferably, constructing structural features of each drug data includes:
initializing the structural encoding of each drug data to obtain drug substructure encoding features as the structural features of the drug data;
or introducing a drug feature extraction module corresponding to each drug data to perform feature extraction on the drug substructure encoding features and taking the extraction result as the structural features of the drug data, wherein, in the drug feature extraction module, the drug substructure encoding features are added to a positional encoding and used as the input of a drug feature extraction unit, which outputs the extraction result; the drug feature extraction unit comprises a first multi-head self-attention network, a first residual connection and layer normalization, a first feedforward neural network, and a second residual connection and layer normalization connected in sequence, the input and output of the first multi-head self-attention network being residually connected and the input and output of the first feedforward neural network being residually connected.
Preferably, constructing embedded features of the multi-omics data includes:
introducing a first cell line feature extraction module to perform integrated feature extraction on the multi-omics matrix corresponding to the multi-omics data to obtain multi-omics integrated features as the embedded features, wherein the first cell line feature extraction module comprises multiple one-dimensional convolution layers, each convolution layer has m filters, each filter slides over and convolves the omics features of the previous layer, and a ReLU activation layer, a Dropout layer and a max pooling layer are introduced between the convolution layers.
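One convolution stage of this kind (sliding filters, ReLU, max pooling) can be sketched as below. This is an illustrative sketch only: treating each data type as an input channel over the gene axis is an assumption, the kernel weights and sizes are hypothetical, and Dropout is left out because it only acts during training.

```python
def conv1d(signal, kernels):
    """signal: list of channels, each a list of values over gene positions.
    kernels: list of filters; each filter holds one weight list per input channel."""
    width = len(kernels[0][0])
    length = len(signal[0]) - width + 1
    out = []
    for kernel in kernels:
        row = []
        for start in range(length):
            acc = 0.0
            for channel, weights in zip(signal, kernel):
                acc += sum(w * channel[start + i] for i, w in enumerate(weights))
            row.append(acc)
        out.append(row)
    return out

def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; a trailing incomplete window is dropped."""
    return [[max(row[i:i + size]) for i in range(0, len(row) - size + 1, size)]
            for row in fmap]

# Toy cell line: 2 data types (channels) measured on 6 genes.
omics = [[1.0, 2.0, 0.0, -1.0, 3.0, 1.0],
         [0.0, 1.0, 1.0, 0.0, -2.0, 2.0]]
# Two hypothetical filters of width 2 (m = 2 in the text's notation).
kernels = [[[1.0, -1.0], [0.0, 0.0]],
           [[0.5, 0.5], [0.5, 0.5]]]
pooled = max_pool(relu(conv1d(omics, kernels)))
```

Stacking several such stages shortens the gene axis step by step while each layer's m filters produce the feature channels consumed by the next layer.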
Preferably, constructing the embedded features of the multi-omics data further comprises:
introducing a second cell line feature extraction module to extract contextual dependencies from the multi-omics integrated features output by the first cell line feature extraction module to obtain context features as the embedded features, wherein the second cell line feature extraction module comprises a first layer normalization, a second multi-head self-attention network, a second layer normalization and a second feedforward neural network connected in sequence, the input of the first layer normalization being residually connected to the output of the second multi-head self-attention network, and the input of the second layer normalization being residually connected to the output of the second feedforward neural network.
Preferably, the drug combination index prediction module comprises multiple fully connected layers, which predict the drug combination index from the final drug features of all the drug data and the final cell line features.
To achieve the above object, an embodiment further provides an interpretable device for predicting anticancer drug synergy based on cooperative attention, including:
a data acquisition unit for acquiring multi-omics data of cell lines, at least 2 kinds of drug data, and combination index data of drug combinations on different cell lines;
a feature construction unit for constructing structural features of each drug data and embedded features of the multi-omics data;
a model construction unit for constructing a drug combination prediction model, the model comprising a drug-cell line associated feature extraction module and a drug-drug associated feature extraction module corresponding to each drug data, and a drug combination index prediction module; each drug-drug associated feature extraction module is used for performing associated feature extraction, based on cooperative attention, on the structural features of the corresponding drug data and the structural features of the other drug data to obtain drug features fused with other drug information, and the cell line-specific drug features and the drug features fused with other drug information are spliced as the final drug features of each drug data; the drug combination index prediction module is used for predicting the drug combination index based on the final drug features of all the drug data and the final cell line features;
a parameter optimization unit for performing parameter optimization on the drug combination prediction model, taking the multi-omics data of the cell lines and the drug data as sample data and the combination index data of the drug combinations on different cell lines as truth labels;
and a prediction unit for performing drug combination prediction using the parameter-optimized drug combination prediction model.
Compared with the prior art, the invention has at least the following beneficial effects:
On the basis of the structural features of each drug data and the embedded features of the multi-omics data, cooperative attention is introduced so that the model learns drug-cell line interactions and drug-drug interactions by itself, providing the drug combination prediction task with associated features beyond the drug and cell line features themselves. This yields more accurate drug combination prediction, and the attention distribution provides interpretability at the gene level and the substructure level.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of the interpretable anticancer drug synergy prediction method based on cooperative attention provided in an embodiment;
FIG. 2 is a flow chart of the construction of structural features of each drug data provided in the embodiment;
FIG. 3 is a flow chart of the construction of embedded features of the multi-omics data provided in the embodiment;
FIG. 4 is a schematic structural diagram of a drug combination prediction model provided in the embodiment;
FIG. 5 is a schematic diagram of another structure of the drug combination prediction model provided in the embodiment;
FIG. 6 is a schematic diagram of still another structure of the drug combination prediction model provided in the embodiment;
FIG. 7 is a workflow diagram of the drug-cell line associated feature extraction module provided in the embodiment;
FIG. 8 is a schematic structural diagram of the interpretable anticancer drug synergy prediction device based on cooperative attention provided in the embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.
Existing methods suffer from low prediction accuracy and poor generalization because they neglect to embed information such as drug-drug interactions and cannot effectively encode drug-cell line interactions. To solve these problems, the embodiment of the invention provides an interpretable method for predicting anticancer drug synergy based on cooperative attention. On the one hand, the multi-omics features of the cell line and the structural features of the drug molecules are fully fused and encoded; on the other hand, a cooperative attention mechanism is introduced to account for drug-drug interactions and drug-cell line interactions. This achieves more accurate drug combination prediction, and the attention distribution reveals key mechanistic clues that influence the combination effect, which makes the model interpretable.
As shown in FIG. 1, the interpretable anticancer drug synergy prediction method based on cooperative attention provided in the embodiment includes the following steps:
Step 1, obtaining multi-omics data of cell lines, at least 2 kinds of drug data, and combination index data of drug combinations on different cell lines, and constructing training samples.
In the embodiment, the combination data of drug combinations on different cell lines come from drug combination databases such as DrugComb, DrugCombDB, SYNERGxDB and DCDB, from experimental data disclosed in the literature, or from self-generated data. Each record contains basic information on at least two drugs, cell line information, and the inhibition rate of each drug on the corresponding cell line under different concentration gradients. From the inhibition rates, the combination index of the drugs on the corresponding cell line can be calculated; the drug combination index can be expressed by various indices, including the Loewe index, the Bliss index, the HSA index and the ZIP index. Taking a two-drug combination as an example, each record contains the basic information of drug A, the basic information of drug B, the cell line information, and the inhibition rates of drug A and drug B on the cell line under different concentration gradients, from which the combination index of drug A and drug B on that cell line can be calculated.
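As a worked illustration of how a combination index follows from inhibition rates, the sketch below computes two of the named indices in their common "excess over reference" form. It assumes inhibition rates expressed as fractions between 0 and 1, and it averages over the dose matrix, which is one common summary and not necessarily the convention used by the databases above; the function names and the toy numbers are hypothetical. The Loewe and ZIP indices need fitted dose-response curves and are not shown.

```python
def bliss_excess(e_a, e_b, e_ab):
    """Observed combined inhibition minus the Bliss independence expectation."""
    expected = e_a + e_b - e_a * e_b
    return e_ab - expected

def hsa_excess(e_a, e_b, e_ab):
    """Observed combined inhibition minus the highest single-agent inhibition."""
    return e_ab - max(e_a, e_b)

def mean_score(inhib_a, inhib_b, inhib_combo, score_fn):
    """Average a score over the dose matrix.
    inhib_combo[i][j] is the inhibition at dose i of drug A and dose j of drug B."""
    values = [score_fn(inhib_a[i], inhib_b[j], inhib_combo[i][j])
              for i in range(len(inhib_a))
              for j in range(len(inhib_b))]
    return sum(values) / len(values)

# Toy inhibition rates (fractions) at two doses of each drug.
inhib_a = [0.1, 0.3]
inhib_b = [0.2, 0.4]
inhib_combo = [[0.3, 0.5],
               [0.5, 0.7]]
bliss = mean_score(inhib_a, inhib_b, inhib_combo, bliss_excess)
hsa = mean_score(inhib_a, inhib_b, inhib_combo, hsa_excess)
```

A positive value means the combination inhibits more than the reference model predicts, which is read as synergy; a negative value is read as antagonism.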
After the drug combination data are acquired, missing values, outliers, imbalanced data distribution and duplicate values must be handled, specifically as follows. Missing values: if the drug A, drug B, cell line or combination index of a record is missing, or the existing data cannot support calculating the combination index, the record is deleted. Outliers: the numerical distribution of the combination index is checked first and records outside the theoretical range are deleted; the mean ± 3 standard deviations of the combination index are then used as the minimum and maximum bounds, and outlying values beyond a bound are assigned the bound value. Imbalanced distribution: the numbers of occurrences of each drug molecule and each cell line across all samples are counted, and drug molecules and cell lines that occur extremely rarely are removed. Duplicates: for repeated records of the same drug A, drug B and cell line, the mean combination index is used as the truth label.
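The cleaning steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the theoretical range `low`/`high` is a hypothetical placeholder, the population standard deviation is used for the 3-sigma bounds, drug order within a pair is treated as irrelevant when detecting duplicates, and the removal of rarely occurring drugs and cell lines is left out for brevity.

```python
from collections import defaultdict
from statistics import mean, pstdev

def clean_records(records, low=-100.0, high=100.0):
    """records: list of (drug_a, drug_b, cell_line, score); any field may be None.
    Returns {((drug_x, drug_y), cell_line): mean clipped score}."""
    # 1. Missing values: drop incomplete records.
    complete = [r for r in records if None not in r]
    # 2. Drop scores outside the (assumed) theoretical range.
    in_range = [r for r in complete if low <= r[3] <= high]
    # 3. Clip remaining outliers to mean +/- 3 standard deviations.
    scores = [r[3] for r in in_range]
    mu, sigma = mean(scores), pstdev(scores)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    clipped = [(a, b, c, min(max(s, lower), upper)) for a, b, c, s in in_range]
    # 4. Duplicates: average the score per (drug pair, cell line).
    groups = defaultdict(list)
    for a, b, c, s in clipped:
        groups[(tuple(sorted((a, b))), c)].append(s)  # assumption: A+B equals B+A
    return {key: mean(vals) for key, vals in groups.items()}

# Toy records with made-up names and scores.
records = [
    ("A", "B", "X", 10.0),
    ("B", "A", "X", 20.0),   # duplicate of the first pair, reversed order
    ("A", "C", "X", None),   # missing combination index
    ("A", "B", "Y", 500.0),  # outside the assumed theoretical range
]
labels = clean_records(records)
```

The function expects at least one record to survive the first two filters, since the mean of an empty list is undefined.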
In the embodiment, multi-omics data of different cell lines are obtained, including gene mutation data, gene expression data, gene copy number data, gene methylation data, proteomic data, gene effect data and the like, which can be obtained from the CCLE database. After the multi-omics data of the cell lines are obtained, cell lines with missing omics data are deleted first, and only genes that have all omics features are retained. Next, for each of the M kinds of omics data, the coefficient of variation of every gene across the cell lines is calculated, the hypervariable genes of each kind of omics data are determined from the coefficient of variation, and the final gene list is obtained by merging the top five hundred hypervariable genes of each omics type; the number of genes in this list is denoted N. Each kind of omics data is then restricted to the N genes and standardized; optional methods include standard deviation standardization, Z-score standardization, the maximization method, the minimization method and log function standardization. Finally, the M kinds of omics data of the N genes are integrated for each cell line to form the multi-omics matrix of that cell line, wherein the order of the N genes is kept consistent across cell lines.
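The gene selection and standardization steps can be sketched as below. It is a minimal sketch under assumptions: "merging" is read as a set union, the coefficient of variation is taken as the population standard deviation divided by the absolute mean, Z-score is used as the standardization method, and the data values are made up. The text uses the top five hundred genes per data type; `k` is left as a parameter.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    mu = mean(values)
    return pstdev(values) / abs(mu) if mu != 0 else 0.0

def top_variable_genes(table, k):
    """table: {gene: [value per cell line]} for one data type."""
    ranked = sorted(table, key=lambda g: coefficient_of_variation(table[g]),
                    reverse=True)
    return set(ranked[:k])

def build_gene_list(omics, k):
    """Union of the top-k most variable genes of every data type, in a fixed order."""
    selected = set()
    for table in omics.values():
        selected |= top_variable_genes(table, k)
    return sorted(selected)

def z_score(values):
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

# Toy data: 2 data types, 3 genes, 3 cell lines (values are invented).
omics = {
    "expression": {"TP53": [1, 1, 1], "KRAS": [1, 5, 9], "EGFR": [2, 3, 4]},
    "copy_number": {"TP53": [2, 2, 8], "KRAS": [2, 2, 2], "EGFR": [2, 3, 2]},
}
genes = build_gene_list(omics, k=1)
standardized = {name: {g: z_score(table[g]) for g in genes}
                for name, table in omics.items()}
```

Sorting the merged gene list once and reusing it for every cell line is one simple way to keep the gene order consistent, as the text requires.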
In the embodiment, each acquired drug data should contain at least the drug name and the drug SMILES string; the SMILES information can be retrieved from PubChem and converted into a standard SMILES string using the open-source software package RDKit.
In the embodiment, the multi-omics data of the cell line and the drug data of drug A and drug B are used as a training sample, and the combination index of drug A and drug B on that cell line is used as the truth label of the training sample.
Step 2, constructing structural features of each drug data and embedded features of the multi-omics data.
In the embodiment, the structural features of each drug data may be constructed by initialization encoding alone, or by additionally introducing a drug feature extraction module on top of the initialization encoding; the specific flow is shown in FIG. 2.
When initialization encoding is used, the standard SMILES string of the drug is converted into drug substructure encoding features, which serve as the structural features of the drug data. Specifically, as shown in FIG. 2, a substructure dictionary is first generated from the high-frequency substructure fragments, pharmacophore fragments, toxic group fragments and the like of drug molecules, obtained by statistics from drug databases such as ChEMBL, PubChem and DrugBank; each substructure corresponds to a specific Arabic numeral as its label. Next, a substructure search is performed on the drug molecule based on the substructure dictionary, and the substructure encoding of the drug molecule is formed from the substructures present in the drug, so that the drug molecule is finally represented as a string of Arabic numerals, each corresponding to one substructure in the dictionary. Each numeral is then converted into an embedding vector of fixed length m, giving an L × m drug substructure encoding feature (where L is the number of substructures).
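The dictionary lookup and embedding step can be illustrated with the toy sketch below. It deliberately simplifies: fragments are matched by plain substring search on the SMILES text, which is only a stand-in for a real chemical substructure search with a cheminformatics toolkit; the three-entry dictionary, the ordering of codes by match position, and the randomly initialized embedding table are all hypothetical.

```python
import random

def encode_substructures(smiles, dictionary):
    """Return the integer labels of dictionary fragments found in the SMILES text,
    ordered by where each fragment first occurs (an illustrative choice)."""
    hits = []
    for fragment, label in dictionary.items():
        position = smiles.find(fragment)  # stand-in for true substructure matching
        if position >= 0:
            hits.append((position, label))
    return [label for _, label in sorted(hits)]

def build_embedding_table(vocab_size, m, seed=0):
    """Randomly initialized table; in a model these rows would be learned."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(vocab_size)]

def embed(codes, table):
    """Map each integer code to its length-m embedding vector."""
    return [table[code] for code in codes]

# Hypothetical dictionary: fragment -> integer label (0 is reserved for padding).
dictionary = {"C(=O)O": 1, "c1ccccc1": 2, "N": 3}
aspirin = "CC(=O)Oc1ccccc1C(=O)O"

codes = encode_substructures(aspirin, dictionary)
table = build_embedding_table(vocab_size=4, m=8)
features = embed(codes, table)  # L rows of length m
```

Here the drug becomes the code sequence `codes`, and `features` is the corresponding L × m matrix with L equal to the number of matched fragments.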
When a drug feature extraction module corresponding to each drug data is introduced, the drug substructure coding features obtained by initialization coding are input into the drug feature extraction module, and the extraction result is used as the structural features of the drug data. As shown in fig. 2, in the drug feature extraction module, the drug substructure coding features are added to a position code and then used as the input of a drug feature extraction unit, which outputs the extraction result. The drug feature extraction unit comprises a first multi-head self-attention network, a first residual connection and layer normalization, a first feedforward neural network, and a second residual connection and layer normalization, connected in sequence. The first multi-head self-attention network allows the model to encode the dependencies among different substructures in different feature spaces and, through the attention computation, to focus on locally important features within the global features. The first feedforward neural network applies a further nonlinear transformation to the output of the attention mechanism, increasing the model's ability to model complex features in the substructure sequence. The input and output of the first multi-head self-attention network are joined by a residual connection, as are the input and output of the first feedforward neural network; the residual connections avoid network degradation, and the layer normalization smooths the features.
In an embodiment, when the embedded features of the multi-omics data are constructed, a first cell line feature extraction module may be introduced alone, or a second cell line feature extraction module may be introduced to perform further feature extraction on the output of the first cell line feature extraction module; the specific flow is shown in fig. 3.
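As a compact restatement of the drug feature extraction unit, the data flow can be written as below. The symbols E (substructure coding features), P (position code), H and Y are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
X_0 &= E + P \\
H   &= \mathrm{LayerNorm}\bigl(X_0 + \mathrm{MHSA}(X_0)\bigr) \\
Y   &= \mathrm{LayerNorm}\bigl(H + \mathrm{FFN}(H)\bigr)
\end{aligned}
```

Here MHSA stands for the first multi-head self-attention network and FFN for the first feedforward neural network; Y is the extraction result used as the structural features of the drug data.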
When the first cell line feature extraction module is introduced, as shown in fig. 3, the first cell line feature extraction module performs feature integration and extraction on the multi-omics matrix corresponding to the multi-omics data to obtain multi-omics integrated features, which serve as the embedded features. The first cell line feature extraction module comprises multiple one-dimensional convolution layers; each convolution layer has m filters, each filter slides over and convolves the omics features of the previous layer, and a ReLU activation layer, a Dropout layer and a max-pooling layer are introduced between the convolution layers. The ReLU activation function provides the model with nonlinear learning capability, the Dropout layer randomly discards some features to prevent the model from overfitting, and the max-pooling layer is used for feature dimension reduction and removal of redundant information. After the multiple convolution layers, the multi-omics integrated features of the cell line are obtained.
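The convolutional integration step can be sketched as follows. This is an illustrative reading under stated assumptions: the multi-omics matrix is laid out as one row per omics type and one column per gene, two convolution layers with kernel size 3 and pooling size 2 are used, weights are random, and Dropout is omitted because it acts as the identity at inference time. None of these sizes come from the original text.

```python
import random


def conv1d(x, kernels, biases):
    """Valid 1-D convolution. x: [C_in][N]; kernels: [C_out][C_in][k]."""
    n = len(x[0])
    k = len(kernels[0][0])
    out = []
    for kern, b in zip(kernels, biases):
        row = []
        for t in range(n - k + 1):
            s = b
            for c, channel in enumerate(x):
                for j in range(k):
                    s += kern[c][j] * channel[t + j]
            row.append(s)
        out.append(row)
    return out


def relu(x):
    """Element-wise ReLU on a 2-D list."""
    return [[max(0.0, v) for v in row] for row in x]


def max_pool1d(x, size):
    """Non-overlapping max pooling along the gene axis."""
    return [[max(row[i:i + size]) for i in range(0, len(row) - size + 1, size)]
            for row in x]


def init_kernels(c_out, c_in, k, rng):
    """Random filters; these would be trainable parameters in practice."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(c_in)]
            for _ in range(c_out)]


rng = random.Random(0)
omics = [[rng.random() for _ in range(20)] for _ in range(3)]  # 3 omics types x 20 genes
m = 4  # number of filters per convolution layer

x = omics
for c_in in (3, m):  # two stacked convolution layers
    kernels = init_kernels(m, c_in, 3, rng)
    x = max_pool1d(relu(conv1d(x, kernels, [0.0] * m)), 2)
integrated = x  # multi-omics integrated features: m rows, reduced gene axis
```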
When the second cell line feature extraction module is introduced, as shown in fig. 3, the second cell line feature extraction module extracts context dependencies from the multi-omics integrated features output by the first cell line feature extraction module to obtain context features, which serve as the embedded features. The second cell line feature extraction module comprises a first layer normalization, a second multi-head self-attention network, a second layer normalization and a second feedforward neural network, connected in sequence; the input of the first layer normalization and the output of the second multi-head self-attention network are joined by a residual connection, and the input of the second layer normalization and the output of the second feedforward neural network are joined by a residual connection. Specifically, the multi-omics integrated features first pass through the first layer normalization for feature smoothing; the second multi-head self-attention network then extracts the dependencies among genes and captures the important gene features that influence the individual differences of cell lines; the output of the second multi-head self-attention network is summed with the multi-omics integrated features through the residual connection; and the context features are then obtained by passing the result through the second layer normalization, the second feedforward neural network and the second residual connection in sequence.
Step 3, constructing a drug combination prediction model.
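In contrast to the drug feature extraction unit, the second cell line feature extraction module applies layer normalization before each sub-network. Written out, with X denoting the multi-omics integrated features and H, Y being notation introduced here for illustration:

```latex
\begin{aligned}
H &= X + \mathrm{MHSA}\bigl(\mathrm{LayerNorm}(X)\bigr) \\
Y &= H + \mathrm{FFN}\bigl(\mathrm{LayerNorm}(H)\bigr)
\end{aligned}
```

Here MHSA is the second multi-head self-attention network, FFN is the second feedforward neural network, and Y is the context feature used as the embedded feature.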
In an embodiment, the constructed drug combination prediction model includes a drug-cell line associated feature extraction module and a drug-drug associated feature extraction module corresponding to each drug data, and a drug combination index prediction module.
In one embodiment, for each drug data, one drug-cell line associated feature extraction module and one drug-drug associated feature extraction module operate in parallel. Specifically, each drug-cell line associated feature extraction module performs associated feature extraction, based on cooperative attention, on the structural features of the drug data and the embedded features of the multi-omics data to obtain drug-specific cell line features and cell line-specific drug features, and the drug-specific cell line features corresponding to all the drug data are spliced to obtain the final cell line features. Each drug-drug associated feature extraction module performs associated feature extraction, based on cooperative attention, on the structural features of the drug data and the structural features of the other drug data to obtain drug features fused with other drug information; the cell line-specific drug features and the drug features fused with other drug information are then spliced as the final drug features of each drug data. Fig. 4 illustrates one structure of the drug combination prediction model for two drug data.
In another embodiment, in the drug combination prediction model, the input of the drug-drug associated feature extraction module corresponding to each drug data is connected to the output of the drug-cell line associated feature extraction module corresponding to that drug data. In this case, the function of each drug-cell line associated feature extraction module is unchanged: it still performs associated feature extraction, based on cooperative attention, on the structural features of the drug data and the embedded features of the multi-omics data to obtain drug-specific cell line features and cell line-specific drug features, and the drug-specific cell line features corresponding to all the drug data are spliced to obtain the final cell line features. The function of each drug-drug associated feature extraction module changes: it performs associated feature extraction, based on cooperative attention, on the cell line-specific drug features of the drug data and the structural features of the other drug data to obtain drug features that fuse cell line specificity and other drug information, which serve as the final drug features. Fig. 5 illustrates another structure of the drug combination prediction model for two drug data.
In yet another embodiment, the input of the drug-cell line associated feature extraction module corresponding to each drug data is connected to the output of the drug-drug associated feature extraction module corresponding to that drug data. In this case, the function of each drug-drug associated feature extraction module is unchanged: it still performs associated feature extraction, based on cooperative attention, on the structural features of the drug data and the structural features of the other drug data to obtain drug features fused with other drug information. The function of each drug-cell line associated feature extraction module changes: it performs associated feature extraction, based on cooperative attention, on the drug features fused with other drug information corresponding to the drug data and the embedded features of the multi-omics data to obtain multi-drug-specific cell line features and cell line-specific multi-drug features, the latter serving as the final drug features, and the multi-drug-specific cell line features corresponding to all the drug data are spliced to obtain the final cell line features. Fig. 6 illustrates yet another structure of the drug combination prediction model for two drug data.
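The three connection modes can be compared side by side with a toy sketch. It rests on several assumptions that are not in the original text: `attend` is a parameter-free single-head stand-in for the trainable multi-head cooperative attention modules, splicing is done along the row (sequence) axis, all features share one width, and in the second mode the cell line-specific drug features are taken as the Query source. The mode names are hypothetical labels.

```python
import math


def attend(query_src, kv_src):
    """Toy co-attention: each row of query_src takes a softmax-weighted sum of kv_src rows."""
    out = []
    for q in query_src:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in kv_src]
        mx = max(scores)
        w = [math.exp(s - mx) for s in scores]
        z = sum(w)
        out.append([sum(wi / z * k[j] for wi, k in zip(w, kv_src)) for j in range(len(q))])
    return out


def final_features(drug_a, drug_b, cell, mode):
    """Return (final drug features per drug, final cell line features) for one mode."""
    final_drug, cell_parts = {}, []
    for name, (own, other) in {"a": (drug_a, drug_b), "b": (drug_b, drug_a)}.items():
        if mode == "parallel":              # structure of fig. 4
            cell_parts.append(attend(own, cell))       # drug-specific cell line features
            cell_specific_drug = attend(cell, own)     # cell line-specific drug features
            fused = attend(own, other)                 # fused with other drug information
            final_drug[name] = cell_specific_drug + fused
        elif mode == "cell_then_drug":      # structure of fig. 5
            cell_parts.append(attend(own, cell))
            final_drug[name] = attend(attend(cell, own), other)
        elif mode == "drug_then_cell":      # structure of fig. 6
            fused = attend(own, other)
            cell_parts.append(attend(fused, cell))     # multi-drug-specific cell line features
            final_drug[name] = attend(cell, fused)     # cell line-specific multi-drug features
        else:
            raise ValueError(mode)
    final_cell = [row for part in cell_parts for row in part]
    return final_drug, final_cell


drug_a = [[0.1 * (i + j) for j in range(4)] for i in range(3)]
drug_b = [[0.2 * (i - j) for j in range(4)] for i in range(2)]
cell = [[0.05 * (i * j) for j in range(4)] for i in range(5)]

shapes = {}
for mode in ("parallel", "cell_then_drug", "drug_then_cell"):
    drug_feats, cell_feat = final_features(drug_a, drug_b, cell, mode)
    shapes[mode] = (len(drug_feats["a"]), len(drug_feats["b"]), len(cell_feat))
```

The only difference between the modes is which feature is fed to which module, which is what the three embodiments vary.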
In the above three embodiments, each drug-cell line associated feature extraction module comprises a first multi-head cooperative attention network and a third feedforward neural network. As shown in fig. 7, in the first multi-head cooperative attention network a Query matrix is generated from a first input matrix, and a Key matrix and a Value matrix are generated from a second input matrix; the Query matrix is multiplied by the transpose of the Key matrix to obtain an attention matrix, and the attention matrix is then used as weights to compute a weighted sum of the Value matrix, giving an output matrix. The output matrix of the first multi-head cooperative attention network passes through a residual connection and layer normalization and is input into the third feedforward neural network for nonlinear transformation. The third feedforward neural network consists of two fully connected layers with a ReLU activation function, which provides the nonlinear transformation capability that enhances the features and the fitting capability of the model.
In the drug-cell line associated feature extraction module, the inputs and outputs differ according to how the drug-cell line associated feature extraction module and the drug-drug associated feature extraction module are connected. Specifically, when the first input matrix is the structural features of the drug data and the second input matrix is the embedded features of the multi-omics data, the output matrix is the drug-specific cell line features; when the first input matrix is the embedded features of the multi-omics data and the second input matrix is the structural features of the drug data, the output matrix is the cell line-specific drug features; when the first input matrix is the drug features fused with other drug information and the second input matrix is the embedded features of the multi-omics data, the output matrix is the multi-drug-specific cell line features; when the first input matrix is the embedded features of the multi-omics data and the second input matrix is the drug features fused with other drug information, the output matrix is the cell line-specific multi-drug features.
Taking the updating of the drug-specific cell line features by the drug-cell line associated feature extraction module as an example, the operation process is as follows. First, a Query matrix is generated from the structural features $X_d$ of the drug data (formula (1)), and Key and Value matrices are generated from the embedded features $X_c$ of the multi-omics data (formulas (2) and (3)):
$$Q_i = X_d W_i^Q \tag{1}$$
$$K_i = X_c W_i^K \tag{2}$$
$$V_i = X_c W_i^V \tag{3}$$
where $Q_i$, $K_i$ and $V_i$ are the Query matrix, Key matrix and Value matrix of the i-th head in the multi-head attention, and $W_i^Q$, $W_i^K$ and $W_i^V$ are three trainable parameter matrices. $Q_i$ is obtained by matrix multiplication of $X_d$ and $W_i^Q$, and $K_i$ and $V_i$ are obtained by matrix multiplication of $X_c$ with $W_i^K$ and $W_i^V$, respectively;
a single-head attention weight matrix of the drug over the cell line is then generated based on the following formula (4) and multiplied with $V_i$ to update the features, where $K_i^{\top}$ denotes the transpose of the matrix $K_i$, the product $Q_i K_i^{\top}$ is the dot multiplication operation that computes the attention weights of the Query matrix on the Value matrix, $d_k$ denotes the dimension of $K_i$, namely the feature dimension divided by the number of heads in the multi-head attention, softmax denotes the normalization operation, and $\mathrm{Attention}_{d2c}$ denotes the attention between the drug and the cell line:
$$\mathrm{Attention}_{d2c}(Q_i, K_i, V_i) = \mathrm{softmax}\!\left(\frac{Q_i K_i^{\top}}{\sqrt{d_k}}\right) V_i \tag{4}$$
the updated cell line characteristics are further obtained by integrating the results of multiple single-head attentions by the following equation (5):
$$\mathrm{MHAttn}_{d2c} = \mathrm{Concat}(\mathrm{head}_1, \cdots, \mathrm{head}_h)\, W^O \tag{5}$$
where $\mathrm{head}_i = \mathrm{Attention}_{d2c}(Q_i, K_i, V_i)$, $\mathrm{MHAttn}_{d2c}$ denotes the updated cell line features obtained after the multi-head attention operation of the drug over the cell line, h is the number of heads in the multi-head attention, $W^O$ is a trainable parameter matrix, and the Concat operation splices the outputs of all heads, which are then multiplied by $W^O$ to perform a linear transformation.
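Formulas (1) to (5) can be traced in a small pure-Python sketch. The assumptions are labeled here and in the comments: weights are random rather than trained, the drug and cell line features share one feature width, and the scaling uses the square root of d_k as in standard scaled dot-product attention. The helper names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Row-wise softmax with max subtraction for numerical stability."""
    out = []
    for row in a:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def co_attention(x_d, x_c, h, rng):
    """Multi-head co-attention: Query from x_d, Key and Value from x_c."""
    d_model = len(x_d[0])
    d_k = d_model // h  # per-head dimension
    heads, weights = [], []
    for _ in range(h):
        w_q, w_k, w_v = (rand_matrix(d_model, d_k, rng) for _ in range(3))
        q = matmul(x_d, w_q)  # formula (1)
        k = matmul(x_c, w_k)  # formula (2)
        v = matmul(x_c, w_v)  # formula (3)
        scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]
        attn = softmax_rows(scores)      # attention weight matrix, kept for inspection
        weights.append(attn)
        heads.append(matmul(attn, v))    # formula (4)
    concat = [sum((head[r] for head in heads), []) for r in range(len(x_d))]
    w_o = rand_matrix(h * d_k, d_model, rng)
    return matmul(concat, w_o), weights  # formula (5)


rng = random.Random(0)
x_d = rand_matrix(5, 8, rng)  # 5 drug substructures, width 8
x_c = rand_matrix(6, 8, rng)  # 6 cell line feature positions, width 8
updated, attn_weights = co_attention(x_d, x_c, h=2, rng=rng)
```

The returned `attn_weights` holds one row-normalized matrix per head, which is the kind of quantity the text later proposes to visualize.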
In the above three embodiments, each drug-drug associated feature extraction module comprises a second multi-head cooperative attention network, a third residual connection and layer normalization, a fourth feedforward neural network, and a fourth residual connection and layer normalization, connected in sequence. In the second multi-head cooperative attention network, a Query matrix is generated from a third input feature, and a Key matrix and a Value matrix are generated from a fourth input feature; the Query matrix is multiplied by the transpose of the Key matrix to obtain an attention matrix, and the attention matrix is used as weights to weight the Value matrix, giving an output matrix. The input and output of the second multi-head cooperative attention network are joined by a residual connection, as are the input and output of the fourth feedforward neural network. The output matrix of the second multi-head cooperative attention network is transformed in turn by the third residual connection and layer normalization, the fourth feedforward neural network, and the fourth residual connection and layer normalization.
In the drug-drug associated feature extraction module, the inputs and outputs differ according to how the drug-cell line associated feature extraction module and the drug-drug associated feature extraction module are connected. Specifically, when the third input feature is the structural features of one drug data and the fourth input feature is the structural features of the other drug data, the output matrix is the drug features fused with other drug information; when the third input feature is the structural features of one drug data and the fourth input feature is the cell line-specific drug features of the other drug data, the output matrix is the drug features that fuse cell line specificity and other drug information.
The attention weight matrices output by the drug-cell line associated feature extraction module and the drug-drug associated feature extraction module can be used for subsequent visual analysis of the influence of drug substructures and different genes on the drug combination effect, in order to further explore and explain the mechanism of the drug combination.
In the above three embodiments, the drug combination prediction model includes a drug combination index prediction module whose function is the same in each case: predicting the drug combination index based on the final drug features of all the drug data and the final cell line features. Specifically, the drug combination index prediction module comprises multiple fully connected layers, for example 2 fully connected layers, with a ReLU activation function and a Dropout layer between them. The final drug features of all the drug data and the final cell line features are spliced and input into the fully connected layers, which perform feature fusion and regression prediction on the spliced features and output the predicted drug combination index of a specific drug combination on a specific cell line.
In the above drug combination prediction model, the drug-cell line associated feature extraction module and the drug-drug associated feature extraction module are connected in different ways to obtain a deeper neural network model. The shallow layers of the network focus on extracting fine-grained information relevant to the drug combination prediction task, covering the substructure features within a drug, the associations among the omics features of different genes of the cell line, and the like. The deep layers gradually focus on the global picture and extract coarse-grained information relevant to the prediction task, covering the overall representation of the drugs, the fused omics representation of the cell line, and the global associations among all of these entities. Through multi-layer feature updating and feature fusion across layers, richer and more comprehensive drug features and cell line features suited to the drug combination prediction task can be obtained.
Step 4, performing parameter optimization on the drug combination prediction model using the training samples.
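A minimal sketch of the prediction head follows. It assumes that each final feature has already been reduced to a flat vector (the text does not say how), uses two fully connected layers with random weights, and omits Dropout because it is the identity at inference time. The layer sizes and helper names are illustrative only.

```python
import random


def linear(x, w, b):
    """Fully connected layer: x is a vector, w has shape n_in x n_out."""
    return [sum(xi * wij for xi, wij in zip(x, col)) + bj
            for col, bj in zip(zip(*w), b)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_linear(n_in, n_out, rng):
    """Random weights and zero bias; these would be trained in practice."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out


def predict_index(final_drug_feats, final_cell_feat, params):
    """Splice all final features, then apply FC -> ReLU -> FC to get one scalar."""
    x = [v for feat in final_drug_feats for v in feat] + list(final_cell_feat)
    (w1, b1), (w2, b2) = params
    hidden = relu(linear(x, w1, b1))
    return linear(hidden, w2, b2)[0]


rng = random.Random(0)
drug_a_feat = [rng.random() for _ in range(4)]
drug_b_feat = [rng.random() for _ in range(4)]
cell_feat = [rng.random() for _ in range(6)]
params = (init_linear(14, 8, rng), init_linear(8, 1, rng))
score = predict_index([drug_a_feat, drug_b_feat], cell_feat, params)
```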
In the embodiment, the multi-omics data of the cell lines and the drug data are used as training data, and the combination index data of the drug combinations on different cell lines are used as truth labels to perform parameter optimization on the drug combination prediction model; the mean square error between the predicted drug combination index and the corresponding truth label is used as the loss function to update the model parameters of the drug combination prediction model. It should be noted that, when the structural features of each drug data are constructed by the drug feature extraction module and the embedded features of the multi-omics data are constructed by the first cell line feature extraction module or the second cell line feature extraction module, the parameters of the drug feature extraction module, the first cell line feature extraction module and the second cell line feature extraction module are optimized together with those of the drug combination prediction model.
Step 5, performing drug combination prediction using the parameter-optimized drug combination prediction model.
In the embodiment, the model parameters trained in step 4 are fixed for predicting unknown drug combinations. At prediction time, the embedded features of the multi-omics data of the cell line to be predicted and the structural features of each drug data are input into the drug combination prediction model to obtain the final drug features of each drug data and the final cell line features of the multi-omics data, which are input into the drug combination index prediction module to compute the predicted drug combination index.
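For reference, the mean square error loss named above can be written as follows, where N (the number of training samples), $y_n$ (the truth label of sample n) and $\hat{y}_n$ (the predicted drug combination index) are notation introduced here:

```latex
\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N} \sum_{n=1}^{N} \bigl(\hat{y}_n - y_n\bigr)^2
```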
Previously disclosed drug combination prediction models mainly rely on a multi-layer perceptron to extract single-omics information as the cell line representation. The invention integrates the multi-omics information of the cell line with a one-dimensional convolution network, obtaining a global cell line representation that contains richer multi-omics information, and on this basis further adopts a self-attention network to capture the internal correlations among different genes of the cell line; this combines the local feature capture of the convolution network with the global feature encoding of the self-attention network and can fully characterize the individual differences of different cell lines. Secondly, a framework that encodes the drug by substructures and then extracts features is adopted, so that the model can perceive the important pharmacophore groups and toxic groups that affect drug combination, while the model structure generalizes well to entirely new drug molecules. In addition, the invention does not depend on sparse biological prior knowledge; a cooperative attention module is introduced for the first time, so that the model can learn drug-cell line interactions and drug-drug interactions by itself and provide association features, in addition to the drug and cell line features, for the drug combination prediction task, thereby achieving more accurate drug combination prediction while obtaining interpretability at the gene and substructure levels from the attention distribution.
Based on the same inventive concept, as shown in fig. 8, the embodiment further provides a cooperative attention-based interpretable anticancer drug cooperative prediction device 800, which includes a data acquisition unit 810, a feature construction unit 820, a model construction unit 830, a parameter optimization unit 840, and a prediction unit 850;
wherein the data acquisition unit 810 is used for acquiring multi-omics data, at least 2 kinds of drug data, and the combination index data of drug combinations on different cell lines; the feature construction unit 820 is used for constructing the structural features of each drug data and the embedded features of the multi-omics data; the model construction unit 830 is used for constructing a drug combination prediction model, which comprises a drug-cell line associated feature extraction module and a drug-drug associated feature extraction module corresponding to each drug data, and a drug combination index prediction module, wherein each drug-cell line associated feature extraction module is used for performing associated feature extraction on the structural features of each drug data and the embedded features of the multi-omics data based on cooperative attention to obtain drug-specific cell line features and cell line-specific drug features, and the drug-specific cell line features corresponding to all the drug data are spliced to obtain final cell line features; each drug-drug associated feature extraction module is used for performing associated feature extraction on the structural features of each drug data and the structural features of the other drug data based on cooperative attention to obtain drug features fused with other drug information, and the cell line-specific drug features and the drug features fused with other drug information are spliced as the final drug features of each drug data; the drug combination index prediction module is used for predicting the drug combination index based on the final drug features of all the drug data and the final cell line features; the parameter optimization unit 840 is used for performing parameter optimization on the drug combination prediction model by using the multi-omics data of the cell lines and the drug data as sample data and using the combination index data of the drug combinations on different cell lines as truth labels; the prediction unit 850 is used for performing drug combination prediction using the parameter-optimized drug combination prediction model.
It should be noted that, in the drug combination prediction device based on the cooperative attention mechanism provided in the above embodiment, the division into the functional units described above is only an example. According to specific needs, the above functions may be allocated to different functional units; that is, the internal structure of the terminal device or server may be divided into different functional modules to implement all or part of the functions described above. The present invention is not limited to the functional unit division shown in the above embodiments, which may be changed and adjusted according to actual circumstances. This flexible division can be configured according to actual requirements, ensuring the adaptability and extensibility of the drug combination prediction device.
The embodiment also provides an anticancer drug combination prediction device based on the cooperative attention mechanism, which comprises a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor realizes the steps of the anticancer drug combination prediction method based on the cooperative attention mechanism when executing the computer program.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. In the embodiments provided in the present application, the memory may be a local volatile memory such as RAM, a nonvolatile memory such as ROM, flash memory, a floppy disk or a mechanical hard disk, or remote cloud storage. The processor may be a Central Processing Unit (CPU), a Microprocessor (MPU), a Digital Signal Processor (DSP), or a Field Programmable Gate Array (FPGA).
The foregoing embodiments describe the technical solutions and advantages of the invention in detail. It should be understood that the foregoing is merely the presently preferred embodiments of the invention and is not intended to limit it; any modifications, additions, substitutions and equivalents made within the scope of the principles of the invention shall be included within the scope of protection of the invention.

Claims (10)

1. A cooperative attention-based method for cooperative prediction of interpretable anticancer drugs, comprising the steps of:
obtaining multi-omics data, at least 2 types of drug data, and combination index data of drug combinations on different cell lines;
constructing structural features of each drug data and embedded features of the multi-omics data;
constructing a drug combination prediction model, wherein the drug combination prediction model comprises a drug-cell line associated feature extraction module and a drug-drug associated feature extraction module corresponding to each drug data, and a drug combination index prediction module; each drug-cell line associated feature extraction module is used for performing associated feature extraction on the structural features of each drug data and the embedded features of the multi-omics data based on cooperative attention to obtain drug-specific cell line features and cell line-specific drug features, and the drug-specific cell line features corresponding to all the drug data are spliced to obtain final cell line features; each drug-drug associated feature extraction module is used for performing associated feature extraction on the structural features of each drug data and the structural features of the other drug data based on cooperative attention to obtain drug features fused with other drug information, and the cell line-specific drug features and the drug features fused with other drug information are spliced as the final drug features of each drug data; the drug combination index prediction module is used for predicting the drug combination index based on the final drug features of all the drug data and the final cell line features;
taking the multi-omics data of the cell lines and the drug data as sample data, taking the combination index data of the drug combinations on different cell lines as truth labels, and performing parameter optimization on the drug combination prediction model;
and performing drug combination prediction using the parameter-optimized drug combination prediction model.
2. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1, further comprising: in the drug combination prediction model, the input of the drug-drug associated feature extraction module corresponding to each drug data is connected to the output of the drug-cell line associated feature extraction module corresponding to that drug data; in this case, each drug-drug associated feature extraction module is used for performing associated feature extraction on the cell line-specific drug features corresponding to the drug data and the structural features of the other drug data based on cooperative attention to obtain drug features that fuse cell line specificity and other drug information as the final drug features.
3. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1, further comprising: in the drug combination prediction model, the input of the drug-cell line associated feature extraction module corresponding to each drug data is connected to the output of the drug-drug associated feature extraction module corresponding to that drug data; in this case, each drug-cell line associated feature extraction module is used for performing associated feature extraction on the drug features fused with other drug information corresponding to each drug data and the embedded features of the multi-omics data based on cooperative attention to obtain multi-drug-specific cell line features and cell line-specific multi-drug features, the latter serving as the final drug features, and the multi-drug-specific cell line features corresponding to all the drug data are spliced to obtain the final cell line features.
4. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1, 2 or 3, wherein each drug-cell line association feature extraction module comprises a first multi-head cooperative attention network and a third feedforward neural network; in the first multi-head cooperative attention network, a Query matrix is generated from a first input matrix, and a Key matrix and a Value matrix are generated from a second input matrix; an attention matrix is obtained by a dot-product operation between the Query matrix and the transpose of the Key matrix; the Value matrix is weighted by dot product with the attention matrix to obtain an output matrix; and the output matrix of the first multi-head cooperative attention network is passed through a residual connection and layer normalization and then input to the third feedforward neural network for nonlinear transformation so as to enhance the features;
when the first input matrix is the structural feature of the drug data and the second input matrix is the embedded feature of the multi-omics data, the output matrix is the drug-specific cell line feature;
when the first input matrix is the embedded feature of the multi-omics data and the second input matrix is the structural feature of the drug data, the output matrix is the cell line-specific drug feature;
when the first input matrix is the drug feature fused with other drug information and the second input matrix is the embedded feature of the multi-omics data, the output matrix is the multi-drug-specific cell line feature;
when the first input matrix is the embedded feature of the multi-omics data and the second input matrix is the drug feature fused with other drug information, the output matrix is the cell line-specific multi-drug feature.
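The Query/Key/Value roles in claim 4 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: it uses a single head, and the softmax, the scaling by the square root of the feature width, the toy shapes, and the names `co_attention`, `drug` and `omics` are all assumptions made for illustration (the claim itself names only the dot products).

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(first, second, w_q, w_k, w_v):
    """Single-head co-attention: Query from `first`, Key and Value from `second`."""
    q = first @ w_q                      # (n_first, d)
    k = second @ w_k                     # (n_second, d)
    v = second @ w_v                     # (n_second, d)
    # Assumption: scaled dot product followed by softmax over the second input.
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # (n_first, n_second)
    return attn @ v, attn

rng = np.random.default_rng(0)
d = 8
drug = rng.normal(size=(5, d))    # hypothetical: 5 drug substructure tokens
omics = rng.normal(size=(7, d))   # hypothetical: 7 multi-omics embedding positions
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

# First input = drug, second input = omics embedding.
drug_specific_cell, attn_dc = co_attention(drug, omics, w_q, w_k, w_v)
# Swapping the two inputs gives the opposite direction.
cell_specific_drug, attn_cd = co_attention(omics, drug, w_q, w_k, w_v)
```

The attention matrix is returned alongside the output because its rows show how strongly each position of the first input attends to each position of the second, which is the kind of weight one would inspect when interpreting such a model.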
5. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1 or 2, wherein each drug-drug association feature extraction module comprises a second multi-head cooperative attention network, a third residual connection and layer normalization layer, a fourth feedforward neural network, and a fourth residual connection and layer normalization layer; in the second multi-head cooperative attention network, a Query matrix is generated from a third input feature, and a Key matrix and a Value matrix are generated from a fourth input feature; an attention matrix is obtained by a dot-product operation between the Query matrix and the transpose of the Key matrix; the Value matrix is weighted by dot product with the attention matrix to obtain an output matrix; the input and output of the second multi-head cooperative attention network are joined by a residual connection, and the input and output of the fourth feedforward neural network are joined by a residual connection; and the output matrix of the second multi-head cooperative attention network is transformed in turn by the third residual connection and layer normalization layer, the fourth feedforward neural network, and the fourth residual connection and layer normalization layer;
when the third input feature is the structural feature of one drug data and the fourth input feature is the structural feature of the other drug data, the output matrix is the drug feature fused with other drug information;
when the third input feature is the structural feature of one drug data and the fourth input feature is the cell line-specific drug feature of the other drug data, the output matrix is the drug feature fusing the cell line-specific drug feature and the other drug information.
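The layer ordering in claim 5 (co-attention, residual connection with layer normalization, feedforward network, residual connection with layer normalization) can be sketched as follows. This is a simplified single-head NumPy illustration under assumptions: the softmax and scaling, the ReLU inside the feedforward network, the unparameterized layer normalization, the shared weights for both drugs, and every name and shape are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    """Layer normalization over the feature axis (no learned scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def drug_drug_block(drug_a, drug_b, p):
    """Fuse information from drug_b into drug_a (Query from a, Key/Value from b)."""
    q, k, v = drug_a @ p["w_q"], drug_b @ p["w_k"], drug_b @ p["w_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    # Residual connection around the attention network, then layer normalization.
    h = layer_norm(drug_a + attn @ v)
    # Feedforward network (ReLU is an assumption), residual connection, layer normalization.
    ffn = np.maximum(h @ p["w1"], 0.0) @ p["w2"]
    return layer_norm(h + ffn)

rng = np.random.default_rng(1)
d, hidden = 8, 16
params = {
    "w_q": rng.normal(size=(d, d)) * 0.1,
    "w_k": rng.normal(size=(d, d)) * 0.1,
    "w_v": rng.normal(size=(d, d)) * 0.1,
    "w1": rng.normal(size=(d, hidden)) * 0.1,
    "w2": rng.normal(size=(hidden, d)) * 0.1,
}
drug_a = rng.normal(size=(5, d))   # hypothetical structural features of drug A
drug_b = rng.normal(size=(6, d))   # hypothetical structural features of drug B

# Assumption: the same weights are reused for both directions to keep the sketch short.
fused_a = drug_drug_block(drug_a, drug_b, params)
fused_b = drug_drug_block(drug_b, drug_a, params)
```

The residual is taken against the Query-side input, so the output keeps one row per token of the drug being updated regardless of how many tokens the other drug has.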
6. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1, 2 or 3, wherein constructing the structural features of each drug data comprises:
initializing the substructure encoding of each drug data to obtain drug substructure encoding features as the structural features of the drug data;
or introducing a drug feature extraction module corresponding to each drug data to perform feature extraction on the drug substructure encoding, the extraction result serving as the structural feature of the drug data, wherein, in the drug feature extraction module, the drug substructure encoding features and a position encoding are added together as the input of a drug feature extraction unit, which outputs the extraction result after feature extraction; the drug feature extraction unit comprises a first multi-head self-attention network, a first residual connection and layer normalization layer, a first feedforward neural network, and a second residual connection and layer normalization layer connected in sequence; the input and output of the first multi-head self-attention network are joined by a residual connection, and the input and output of the first feedforward neural network are joined by a residual connection.
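As a rough illustration of the second option in claim 6, the sketch below adds a position encoding to substructure embeddings and passes the sum through one self-attention encoder layer. The sinusoidal form of the position encoding, the random embedding table, the vocabulary size, the substructure codes, and the single attention head are all assumptions; the claim does not specify them.

```python
import numpy as np

def sinusoidal_positions(length, dim):
    """Sinusoidal position encoding (an assumed choice of encoding)."""
    pos = np.arange(length)[:, None]                       # (length, 1)
    i = np.arange(dim)[None, :]                            # (1, dim)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def encoder_layer(x, p):
    """Self-attention, residual + layer norm, feedforward, residual + layer norm."""
    q, k, v = x @ p["w_q"], x @ p["w_k"], x @ p["w_v"]
    attn = softmax(q @ k.T / np.sqrt(x.shape[-1]))
    h = layer_norm(x + attn @ v)
    return layer_norm(h + np.maximum(h @ p["w1"], 0.0) @ p["w2"])

rng = np.random.default_rng(2)
vocab_size, dim, hidden = 50, 8, 16
embedding = rng.normal(size=(vocab_size, dim))   # hypothetical substructure embedding table
codes = np.array([3, 17, 42, 8])                 # hypothetical substructure codes of one drug

# Substructure encoding features plus position encoding form the encoder input.
x = embedding[codes] + sinusoidal_positions(len(codes), dim)
params = {
    "w_q": rng.normal(size=(dim, dim)) * 0.1,
    "w_k": rng.normal(size=(dim, dim)) * 0.1,
    "w_v": rng.normal(size=(dim, dim)) * 0.1,
    "w1": rng.normal(size=(dim, hidden)) * 0.1,
    "w2": rng.normal(size=(hidden, dim)) * 0.1,
}
structural_features = encoder_layer(x, params)
```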
7. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1, 2 or 3, wherein constructing the embedded features of the multi-omics data comprises:
introducing a first cell line feature extraction module to perform integrated feature extraction on the multi-omics matrix corresponding to the multi-omics data to obtain multi-omics integrated features as the embedded features, wherein the first cell line feature extraction module comprises multiple one-dimensional convolution layers, each convolution layer has m filters, each filter slides over and convolves the omics features of the previous layer, and a ReLU activation layer, a Dropout layer and a max pooling layer are introduced between the convolution layers.
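A minimal sketch of such a convolutional stack is given below. It assumes that each omics type is one input channel and that genes lie along the sequence axis; the kernel size, pooling size, dropout rate, and the placement of ReLU, Dropout and max pooling after every convolution (including the last) are illustrative choices, not values taken from the claim.

```python
import numpy as np

def conv1d(x, w, b):
    """Valid 1D cross-correlation. x: (c_in, length), w: (m, c_in, k), b: (m,)."""
    m, c_in, k = w.shape
    length_out = x.shape[1] - k + 1
    out = np.empty((m, length_out))
    for t in range(length_out):
        # Each of the m filters is applied to the current window of width k.
        out[:, t] = np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1])) + b
    return out

def max_pool1d(x, size):
    """Non-overlapping max pooling; trailing positions that do not fill a window are dropped."""
    usable = (x.shape[1] // size) * size
    return x[:, :usable].reshape(x.shape[0], -1, size).max(axis=-1)

def dropout(x, rate, rng=None):
    """Inverted dropout; acts as the identity when no generator is given (inference)."""
    if rng is None:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def omics_cnn(x, layers, pool=2, rate=0.1, rng=None):
    for w, b in layers:
        x = conv1d(x, w, b)
        x = np.maximum(x, 0.0)      # ReLU
        x = dropout(x, rate, rng)
        x = max_pool1d(x, pool)
    return x

rng = np.random.default_rng(3)
omics = rng.normal(size=(3, 100))   # hypothetical: 3 omics types x 100 genes
m, k = 4, 5                         # m filters per layer, kernel width k (assumed values)
layers = [
    (rng.normal(size=(m, 3, k)) * 0.1, np.zeros(m)),
    (rng.normal(size=(m, m, k)) * 0.1, np.zeros(m)),
]
embedded = omics_cnn(omics, layers)   # inference mode: dropout disabled
```

With these toy sizes the sequence axis shrinks from 100 to 96 after the first convolution, 48 after pooling, 44 after the second convolution, and 22 after the final pooling.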
8. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 7, wherein constructing the embedded features of the multi-omics data further comprises:
further introducing a second cell line feature extraction module to extract contextual dependencies from the multi-omics integrated features output by the first cell line feature extraction module, obtaining context features as the embedded features, wherein the second cell line feature extraction module comprises a first layer normalization, a second multi-head self-attention network, a second layer normalization and a second feedforward neural network connected in sequence; the input of the first layer normalization is residual-connected to the output of the second multi-head self-attention network, and the input of the second layer normalization is residual-connected to the output of the second feedforward neural network.
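The ordering in claim 8 places each layer normalization before its sub-network, with the residual taken from the un-normalized input. A minimal NumPy sketch of that arrangement follows; the number of heads, the output projection `w_o`, the ReLU in the feedforward network, the unparameterized layer normalization, and all shapes are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def multi_head_self_attention(x, p, n_heads):
    n, d = x.shape
    d_head = d // n_heads

    def split(t):
        # (n, d) -> (n_heads, n, d_head)
        return t.reshape(n, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ p["w_q"]), split(x @ p["w_k"]), split(x @ p["w_v"])
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))   # (n_heads, n, n)
    merged = (attn @ v).transpose(1, 0, 2).reshape(n, d)         # concatenate heads
    return merged @ p["w_o"]

def pre_ln_block(x, p, n_heads=2):
    # Input of the first layer norm is added to the attention output.
    h = x + multi_head_self_attention(layer_norm(x), p, n_heads)
    # Input of the second layer norm is added to the feedforward output.
    return h + np.maximum(layer_norm(h) @ p["w1"], 0.0) @ p["w2"]

rng = np.random.default_rng(4)
n, d, hidden = 22, 8, 16
params = {name: rng.normal(size=(d, d)) * 0.1 for name in ("w_q", "w_k", "w_v", "w_o")}
params["w1"] = rng.normal(size=(d, hidden)) * 0.1
params["w2"] = rng.normal(size=(hidden, d)) * 0.1

integrated = rng.normal(size=(n, d))   # hypothetical multi-omics integrated features
context = pre_ln_block(integrated, params)
```

Because both residual paths bypass the normalization, zeroing the output projections `w_o` and `w2` makes the block return its input unchanged, which is a convenient sanity check for this kind of layer.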
9. The cooperative attention-based interpretable anticancer drug cooperative prediction method according to claim 1, wherein the drug combination index prediction module comprises a plurality of fully connected layers, and the drug combination index is predicted by the plurality of fully connected layers based on the final drug features and final cell line features of all the drug data.
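A prediction head of this kind could look like the sketch below. The mean pooling over tokens before concatenation, the ReLU between layers, the layer widths, and all names are assumptions; the claim states only that several fully connected layers map the final drug and cell line features to the drug combination index.

```python
import numpy as np

def predict_combination_index(final_drug_feats, final_cell_feats, layers):
    """Map final drug and cell line features to one scalar with fully connected layers."""
    # Assumption: each feature matrix is mean-pooled over its rows before concatenation.
    pooled = [f.mean(axis=0) for f in final_drug_feats] + [final_cell_feats.mean(axis=0)]
    x = np.concatenate(pooled)
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)   # ReLU on hidden layers only
    return float(x[0])

rng = np.random.default_rng(5)
d = 8
drug_a = rng.normal(size=(5, d))       # hypothetical final features of drug A
drug_b = rng.normal(size=(6, d))       # hypothetical final features of drug B
cell = rng.normal(size=(7, 2 * d))     # hypothetical final cell line features

in_dim = d + d + 2 * d
layers = [
    (rng.normal(size=(in_dim, 16)) * 0.1, np.zeros(16)),
    (rng.normal(size=(16, 1)) * 0.1, np.zeros(1)),
]
score = predict_combination_index([drug_a, drug_b], cell, layers)
```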
10. A cooperative attention-based interpretable anticancer drug cooperative prediction apparatus, comprising:
a data acquisition unit for acquiring multi-omics data of cell lines, at least two kinds of drug data, and combination index data of drug combinations on different cell lines;
a feature construction unit for constructing the structural features of each drug data and the embedded features of the multi-omics data;
a model construction unit for constructing a drug combination prediction model, the model comprising a drug-cell line association feature extraction module, a drug-drug association feature extraction module and a drug combination index prediction module, wherein the drug-cell line association feature extraction module corresponds to each drug data; each drug-drug association feature extraction module performs cooperative attention-based association feature extraction on the structural features of each drug data and the structural features of the other drug data to obtain drug features fused with other drug information, and the cell line-specific drug features and the drug features fused with other drug information are concatenated to serve as the final drug features of each drug data; the drug combination index prediction module predicts the drug combination index based on the final drug features and final cell line features of all the drug data;
a parameter optimization unit for optimizing the parameters of the drug combination prediction model, taking the multi-omics data of the cell lines and the drug data as sample data and the combination index data of the drug combinations on different cell lines as ground-truth labels;
and a prediction unit for performing drug combination prediction using the drug combination prediction model after parameter optimization.
CN202311155808.2A 2023-09-08 2023-09-08 Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs Active CN117275608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311155808.2A CN117275608B (en) 2023-09-08 2023-09-08 Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311155808.2A CN117275608B (en) 2023-09-08 2023-09-08 Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs

Publications (2)

Publication Number Publication Date
CN117275608A (en) 2023-12-22
CN117275608B (en) 2024-04-26

Family

ID=89201825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311155808.2A Active CN117275608B (en) 2023-09-08 2023-09-08 Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs

Country Status (1)

Country Link
CN (1) CN117275608B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114765060A (en) * 2021-01-13 2022-07-19 四川大学 Multi-attention method for predicting drug target interaction
WO2022214036A1 (en) * 2021-04-09 2022-10-13 至本医疗科技(上海)有限公司 Method for predicting drug sensitivity state, device, and storage medium
CN113362895A (en) * 2021-06-15 2021-09-07 上海基绪康生物科技有限公司 Comprehensive analysis method for predicting anti-cancer drug response related gene
WO2023038501A1 (en) * 2021-09-10 2023-03-16 주식회사 아론티어 System for predicting drug responses by using convolutional neural network based on drug and cell line similarity matrix
CN114496303A (en) * 2022-01-06 2022-05-13 湖南大学 Anticancer drug screening method based on multichannel neural network
CN114530258A (en) * 2022-01-28 2022-05-24 华南理工大学 Deep learning drug interaction prediction method, device, medium and equipment
CN114841261A (en) * 2022-04-29 2022-08-02 华南理工大学 Increment width and deep learning drug response prediction method, medium, and apparatus
CN116313147A (en) * 2023-03-03 2023-06-23 河南大学 Knowledge graph attention network-based anticancer drug collaborative prediction method
CN116313148A (en) * 2023-03-07 2023-06-23 中南大学 Drug sensitivity prediction method, device, terminal equipment and medium
CN116312808A (en) * 2023-03-24 2023-06-23 东北农业大学 TransGAT-based drug-target interaction prediction method
CN116343928A (en) * 2023-03-29 2023-06-27 西安电子科技大学 Method, system, equipment and medium for predicting sensitivity of anticancer cells by fusing gene network relationship

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
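BIORAIN 药学生: "Review of machine learning methods for drug combination prediction", Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/407742376> *
TIANSHUO WANG et al.: "AttenSyn: An Attention-Based Deep Graph Neural Network for Anticancer Synergistic Drug Combination Prediction", JOURNAL OF CHEMICAL INFORMATION AND MODELING, 11 August 2023 (2023-08-11) *
何亚琼; 朱晓军: "Drug-target relationship prediction using a deep collaborative filtering algorithm", 计算机工程与设计 (Computer Engineering and Design), no. 08, 16 August 2020 (2020-08-16) *
杨晨雨: "Tumor drug sensitivity prediction based on multi-omics data", 生物工程学报 (Chinese Journal of Biotechnology), vol. 38, no. 6, 15 February 2022 (2022-02-15) *
陈希; 秦玉芳; 陈明; 张重阳: "Prediction of synergistic effects of drug combinations based on a multi-input neural network", 生物医学工程学杂志 (Journal of Biomedical Engineering), no. 04, 10 August 2020 (2020-08-10) *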

Also Published As

Publication number Publication date
CN117275608B (en) 2024-04-26

Similar Documents

Publication Publication Date Title
CN109670179A (en) Named entity recognition method for medical record text based on iterated dilated convolutional neural networks
CN113782089B (en) Drug sensitivity prediction method and device based on multi-omics data fusion
CN111968715B (en) Drug recommendation modeling method based on medical record data and drug interaction risk
Sriwong et al. Dermatological classification using deep learning of skin image and patient background knowledge
CN113571125A (en) Drug target interaction prediction method based on multilayer network and graph coding
CN116110509B (en) Method and device for predicting drug sensitivity based on histology consistency pretraining
CN114283878A (en) Method and apparatus for training matching model, predicting amino acid sequence and designing medicine
Wang et al. SSGraphCPI: A novel model for predicting compound-protein interactions based on deep learning
CN113160986A (en) Model construction method and system for predicting development of systemic inflammatory response syndrome
Tran et al. Advanced calibration of mortality prediction on cardiovascular disease using feature-based artificial neural network
Tsinganos et al. Real-time analysis of hand gesture recognition with temporal convolutional networks
Zhang et al. Uncovering the relationship between tissue-specific TF-DNA binding and chromatin features through a transformer-based model
CN117275608B (en) Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs
Liu et al. An automatic ECG signal quality assessment method based on resnet and self-attention
CN115206421B (en) Drug repositioning method, and repositioning model training method and device
Abut et al. Deep Neural Networks and Applications in Medical Research
CN116978464A (en) Data processing method, device, equipment and medium
CN115966315A (en) Method, equipment and storage medium for predicting anti-aging medicine
Diao et al. Implementation of lightweight convolutional neural networks via layer-wise differentiable compression
WO2022212337A1 (en) Graph database techniques for machine learning
KR20220167245A (en) Individual and Accession Specific Classification Variance and Marker Selection Method and System Using Artificial Intelligence
Lau et al. Drug repurposing for Leishmaniasis with Hyperbolic Graph Neural Networks
KR20200094490A (en) New molecular fingerprint of chemical compound inspired by NLP and method of quantitative prediction of activity based on its structure
Barber et al. Human exons and introns classification using pre-trained Resnet-50 and GoogleNet models and 13-layers CNN model
Carvalho Deep Modelling for Anticancer Drug Response Prediction with Therapeutic Data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant