CN113782089A - Drug sensitivity prediction method and device based on multigroup chemical data fusion - Google Patents
- Publication number
- CN113782089A (application CN202111349387.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- cell line
- drug
- genes
- drug sensitivity
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/50—Mutagenesis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/40—ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
Abstract
The invention discloses a drug sensitivity prediction method and device based on multi-omics data fusion, belonging to the field of drug sensitivity detection. A cell line graph characterization module integrates three kinds of omics information of an individual cell line, namely genomics data, proteomics data and metabolomics data, into a cell line multi-edge graph, so that both the omics information of the cell line and the potential relationships among gene expression products at each omics level are fully considered. A cell line graph feature extraction module then extracts features from the cell line multi-edge graph, capturing its node features and edge features as cell line features. Finally, a drug sensitivity prediction module predicts the half-maximal inhibitory concentration (IC50) of a drug from the cell line features and the drug features extracted by a drug feature extraction module. The prediction accuracy of drug sensitivity is thereby improved on the basis of jointly considering genomics, proteomics and metabolomics data.
Description
Technical Field
The invention belongs to the technical field of drug sensitivity detection and evaluation, and particularly relates to a drug sensitivity prediction method and device based on multi-omics data fusion.
Background
The treatment of cancer is a major problem that the whole world is working to solve, and the development of high-throughput sequencing and artificial intelligence technologies opens up new possibilities for precision cancer treatment. An important question for researchers and industry worldwide is how to use an individual's abundant biological information, together with efficient analysis methods such as deep learning, to automatically learn individual-specific characteristics and formulate a specific diagnosis and treatment plan for each individual, thereby achieving accurate diagnosis and accurate treatment. Many researchers have contributed to this problem, attempting to apply individual genomic data to personalized diagnosis and medication recommendation for patients. However, existing research still faces an important open problem: how to make full use of the complex and diverse omics data of each individual to achieve more accurate drug efficacy prediction and drug recommendation.
With the progress of genomics research, public datasets are increasingly applied in bioinformatics research, such as the Cancer Cell Line Encyclopedia (CCLE), Genomics of Drug Sensitivity in Cancer (GDSC) and The Cancer Genome Atlas (TCGA), as well as the proteomics dataset STRING for studying interactions between human genes/proteins and the metabolomics dataset (the GSEA dataset) for studying human pathway information. For example, patent application publication No. CN105005693A discloses a tumor cell drug sensitivity assessment method based on genetic material specificity, which uses only a tumor cell sample set to predict the half-maximal inhibitory concentration (IC50 value); and patent application publication No. CN112164474A discloses a drug sensitivity prediction method based on a self-expression model, which uses the GDSC dataset and the Cancer Cell Line Encyclopedia to predict the IC50 value.
These datasets continue to expand and develop, providing a rich sample basis for studying the occurrence, development, prognosis and outcome of diseases. However, the existing data are rarely fully exploited to solve the problems of drug sensitivity prediction and drug recommendation. For example, existing methods use only the individual genomics data provided in the CCLE and GDSC databases to predict the IC50 through genomic analysis, and such methods often ignore possible associations between an individual's genes at other omics levels. Therefore, although such methods have made some progress, the accuracy of their IC50 prediction is still insufficient. At present there is no good model that can sufficiently fuse an individual's multi-omics information so as to predict drug sensitivity (the half-maximal inhibitory concentration) more accurately.
Disclosure of Invention
In view of the above, the present invention aims to provide a drug sensitivity prediction method and device based on multi-omics data fusion, so as to solve the problem of poor drug sensitivity prediction accuracy caused by neglecting potential relationships between genes.
To achieve this purpose, the invention provides the following technical solution:
In a first aspect, an embodiment provides a drug sensitivity prediction method based on multi-omics data fusion, comprising the following steps:
acquiring multi-omics data of cell lines, drug data, and half-maximal inhibitory concentration (IC50) data of the drugs on the cell lines, wherein the multi-omics data of a cell line comprise genomics data, proteomics data and metabolomics data;
constructing a drug sensitivity prediction model comprising a cell line graph characterization module, a cell line graph feature extraction module, a drug feature extraction module and a drug sensitivity prediction module, wherein the cell line graph characterization module is used for encoding the multi-omics data of a cell line into a cell line multi-edge graph: the genes of each sample are taken as the nodes of the graph; the gene expression level, gene mutation status and copy number variation status of each gene are taken as its node features; and connecting edges between nodes are constructed from the correlation between genes determined from the genomics data, the protein interactions between genes determined from the proteomics data, and the metabolic pathway information between genes determined from the metabolomics data; the cell line graph feature extraction module is used for extracting cell line features from the cell line multi-edge graph; the drug feature extraction module is used for extracting drug features from the drug data; and the drug sensitivity prediction module is used for predicting the IC50 of the drug from the cell line features and the drug features;
performing parameter optimization on the drug sensitivity prediction model, taking the multi-omics data of the cell lines and the drug data as sample data and the IC50 data of the drugs on the cell lines as ground-truth labels;
and performing drug sensitivity prediction using the parameter-optimized drug sensitivity prediction model.
In one embodiment, in the cell line graph characterization module, the Pearson correlation coefficient between the gene expression data of two genes is calculated from the genomics data to determine the correlation between the genes, and when the Pearson correlation coefficient is greater than a set threshold, a connecting edge is constructed between the nodes corresponding to the two genes;
the interaction between two genes is obtained from the proteomics data as a protein interaction, a connecting edge is constructed between the nodes corresponding to two genes that have a protein interaction, and the interaction score is used as the weight of the connecting edge;
metabolic pathway information between genes is obtained from the metabolomics data, and when multiple genes appear together in a given metabolic pathway, a hyperedge is constructed among the nodes corresponding to those genes to serve as a connecting edge.
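The pathway-based hyperedges described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the gene and pathway names are placeholders, the function names are hypothetical, and the clique expansion into pairwise edges is an assumption added here for graph layers that only accept pairwise edges.

```python
from itertools import combinations


def build_hyperedges(pathways, genes):
    """Return one hyperedge (sorted list of node indices) per pathway
    that contains at least two genes present in the cell line graph."""
    index = {gene: i for i, gene in enumerate(genes)}
    hyperedges = {}
    for name, members in pathways.items():
        nodes = sorted(index[g] for g in members if g in index)
        if len(nodes) >= 2:  # a single gene cannot form a connecting edge
            hyperedges[name] = nodes
    return hyperedges


def expand_to_pairs(hyperedges):
    """Clique expansion (an assumption, not from the source): every pair
    of nodes that shares a hyperedge becomes a pairwise edge."""
    pairs = set()
    for nodes in hyperedges.values():
        pairs.update(combinations(nodes, 2))
    return pairs


# Placeholder gene and pathway names, for illustration only.
genes = ["g1", "g2", "g3", "g4"]
pathways = {
    "pathway_A": {"g2", "g3", "g4"},
    "pathway_B": {"g1", "g9"},  # g9 is not a node, so no hyperedge results
}
hyperedges = build_hyperedges(pathways, genes)
pairs = expand_to_pairs(hyperedges)
print(hyperedges, sorted(pairs))
```

Keeping the hyperedge as a node list preserves the "all genes in one pathway" relationship; the pairwise expansion discards that grouping and is only one possible way to feed the information to a standard graph convolution.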
In one embodiment, the cell line graph feature extraction module comprises a first graph neural network unit and gated recurrent units, wherein the first graph neural network unit consists of a plurality of graph convolutional layers, and each two adjacent graph convolutional layers are connected through a gated recurrent unit; the first graph neural network unit is used for extracting cell line features from the cell line multi-edge graph, and the gated recurrent unit is used for applying feature attention to the extracted cell line features.
In one embodiment, in each graph convolutional layer, a three-step feature aggregation is performed on the node features, comprising:
first-step feature aggregation: determining all first-type first-order neighbor nodes of the current node according to the first connecting edges, which are constructed from the correlation between genes, and performing feature aggregation according to formula (1);
wherein $h_i$ denotes the current node feature of the $i$-th current node, $h_j$ denotes the node feature of the $j$-th first-type first-order neighbor node, $w^{(1)}_{ij}$ denotes the weight of the first connecting edge between the $i$-th current node and the $j$-th first-type first-order neighbor node, $N^{(1)}_i$ denotes the number of first-type first-order neighbor nodes, and $h^{(1)}_i$ denotes the new node feature after the first step of feature aggregation;
second-step feature aggregation: determining all second-type first-order neighbor nodes of the current node according to the second connecting edges, which are constructed from the protein interactions between genes, and performing feature aggregation according to formula (2);
wherein $\tilde{h}^{(1)}_i$ denotes the new node feature obtained after the new node feature $h^{(1)}_i$ of the $i$-th current node has been attended to by the gated recurrent unit, $h_k$ denotes the node feature of the $k$-th second-type first-order neighbor node, $w^{(2)}_{ik}$ denotes the weight of the second connecting edge between the $i$-th current node and the $k$-th second-type first-order neighbor node, $N^{(2)}_i$ denotes the number of second-type first-order neighbor nodes, and $h^{(2)}_i$ denotes the new node feature after the second step of feature aggregation;
third-step feature aggregation: determining all third-type first-order neighbor nodes of the current node according to the third connecting edges, which are constructed from the metabolic pathway information between genes, and performing feature aggregation according to formula (3);
wherein $\tilde{h}^{(2)}_i$ denotes the new node feature obtained after the new node feature $h^{(2)}_i$ of the $i$-th current node has been attended to by the gated recurrent unit, $h_t$ denotes the node feature of the $t$-th third-type first-order neighbor node, $w^{(3)}_{it}$ denotes the weight of the third connecting edge between the $i$-th current node and the $t$-th third-type first-order neighbor node, $N^{(3)}_i$ denotes the number of third-type first-order neighbor nodes, and $h^{(3)}_i$ denotes the new node feature after the third step of feature aggregation;
the new node feature $h^{(3)}_i$ of the current node is passed through the gated recurrent unit, and the resulting attended new node feature $\tilde{h}^{(3)}_i$ is used as the current node feature of the next graph convolutional layer.
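Formulas (1) to (3) are not reproduced in this text, so their exact form cannot be confirmed. As a hedged illustration only, one form consistent with the quantities listed for each step (a current-node term plus a weighted neighbor sum normalized by the neighbor count) would be the following. Here $h_i$ is the current node feature, $h^{(s)}_i$ the feature after step $s$, $\tilde{h}^{(s)}_i$ that feature after the gating unit, $w^{(s)}_{ij}$ the weight of the step-$s$ connecting edge and $N^{(s)}_i$ the number of step-$s$ neighbors. The self term, the normalization and the absence of a learned weight matrix or activation are all assumptions.

```latex
\begin{align}
h^{(1)}_i &= h_i + \frac{1}{N^{(1)}_i} \sum_{j=1}^{N^{(1)}_i} w^{(1)}_{ij}\, h_j \tag{1} \\
h^{(2)}_i &= \tilde{h}^{(1)}_i + \frac{1}{N^{(2)}_i} \sum_{k=1}^{N^{(2)}_i} w^{(2)}_{ik}\, h_k \tag{2} \\
h^{(3)}_i &= \tilde{h}^{(2)}_i + \frac{1}{N^{(3)}_i} \sum_{t=1}^{N^{(3)}_i} w^{(3)}_{it}\, h_t \tag{3}
\end{align}
```

Under this reading, each step refines the same node using a different edge type, and the gating unit decides how much of each step's output is carried into the next.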
In one embodiment, the drug feature extraction module comprises a conversion unit and a second graph neural network unit, wherein the conversion unit is used for converting the drug data into a drug molecular graph, and the second graph neural network unit is used for extracting the drug features from the input drug molecular graph.
In one embodiment, the conversion unit encodes the drug data into a drug molecular graph using the open-source library RDKit; the second graph neural network unit is constructed on the graph isomorphism principle.
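A graph neural network built on the graph isomorphism principle is commonly realized as a graph isomorphism network (GIN) layer, in which each atom sums its neighbors' features with its own and passes the result through a small MLP. The sketch below assumes the drug molecular graph is already available as an adjacency matrix and an atom feature matrix (for example, produced upstream by RDKit); the layer sizes, the random weights, the sum-pooling readout and the function names are illustrative assumptions, not details from the source.

```python
import numpy as np


def gin_layer(h, adj, w1, w2, eps=0.0):
    """One GIN-style update: MLP((1 + eps) * h_v + sum of neighbor features)."""
    agg = (1.0 + eps) * h + adj @ h       # sum aggregation over bonded atoms
    hidden = np.maximum(agg @ w1, 0.0)    # first linear layer + ReLU
    return np.maximum(hidden @ w2, 0.0)   # second linear layer + ReLU


def readout(h):
    """Sum-pool atom features into a single drug feature vector."""
    return h.sum(axis=0)


# Toy 3-atom chain (atom 0 - atom 1 - atom 2) with one-hot atom features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
h0 = np.eye(3)

rng = np.random.default_rng(0)            # hypothetical, untrained weights
w1 = rng.normal(size=(3, 8))
w2 = rng.normal(size=(8, 4))
drug_feature = readout(gin_layer(h0, adj, w1, w2))
print(drug_feature.shape)
```

Sum aggregation is used rather than mean or max because it is the choice that lets a GIN distinguish neighborhoods with different atom multiplicities.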
In one embodiment, the drug sensitivity prediction module comprises a plurality of fully connected layers, which perform feature fusion and regression on the concatenation of the input cell line features and drug features to predict the IC50 of the drug.
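A minimal sketch of such a prediction head is shown below: the two feature vectors are concatenated and passed through fully connected layers, with the last layer emitting a single regression value. The feature dimensions, the ReLU activation, the number of layers and the function name `predict_ic50` are assumptions chosen for illustration.

```python
import numpy as np


def predict_ic50(cell_feat, drug_feat, layers):
    """Concatenate cell line and drug features, then apply fully connected
    layers; ReLU follows every layer except the final regression layer."""
    x = np.concatenate([cell_feat, drug_feat])
    for i, (weight, bias) in enumerate(layers):
        x = x @ weight + bias
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return float(x[0])


rng = np.random.default_rng(0)            # hypothetical, untrained weights
layers = [
    (rng.normal(size=(10, 16)), np.zeros(16)),   # fusion layer
    (rng.normal(size=(16, 1)), np.array([0.5])), # regression layer
]
# With all-zero inputs and a zero first bias, only the final bias remains.
pred = predict_ic50(np.zeros(6), np.zeros(4), layers)
print(pred)
```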
In one embodiment, after the multi-omics data of the cell lines, the drug data and the IC50 data of the drugs on the cell lines are acquired, outliers and missing values are removed from the data, and the processed data are used to construct the training samples.
In one embodiment, when the drug sensitivity prediction model is optimized, the model parameters are updated using the mean square error between the predicted IC50 values and the corresponding ground-truth labels as the loss function.
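The loss can be written out directly. The sketch below computes the mean square error and its gradient with respect to the predictions; the function names and the toy values are illustrative, and the optimizer that would consume the gradient is not specified in the source.

```python
def mse_loss(predicted, target):
    """Mean square error between predicted and ground-truth IC50 values."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def mse_grad(predicted, target):
    """Gradient of the MSE with respect to each prediction: 2 * (p - t) / n."""
    n = len(predicted)
    return [2.0 * (p - t) / n for p, t in zip(predicted, target)]


loss = mse_loss([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])   # (0 + 1 + 4) / 3
grad = mse_grad([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
print(loss, grad)
```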
In a second aspect, an embodiment provides a drug sensitivity prediction device based on multi-omics data fusion, comprising:
a data acquisition unit, used for acquiring multi-omics data of cell lines, drug data, and IC50 data of the drugs on the cell lines, wherein the multi-omics data of a cell line comprise genomics data, proteomics data and metabolomics data;
a model construction unit, used for constructing a drug sensitivity prediction model comprising a cell line graph characterization module, a cell line graph feature extraction module, a drug feature extraction module and a drug sensitivity prediction module, wherein the cell line graph characterization module is used for encoding the multi-omics data of a cell line into a cell line multi-edge graph: the genes of each sample are taken as nodes; the gene expression level, gene mutation status and copy number variation status of each gene are taken as its node features; and connecting edges between nodes are constructed from the correlation between genes determined from the genomics data, the protein interactions between genes determined from the proteomics data, and the metabolic pathway information between genes determined from the metabolomics data; the cell line graph feature extraction module is used for extracting cell line features from the cell line multi-edge graph; the drug feature extraction module is used for extracting drug features from the drug data; and the drug sensitivity prediction module is used for predicting the IC50 of the drug from the cell line features and the drug features;
an optimization learning unit, used for performing parameter optimization on the drug sensitivity prediction model, taking the multi-omics data of the cell lines and the drug data as sample data and the IC50 data of the drugs on the cell lines as ground-truth labels;
and a prediction unit, used for predicting drug sensitivity using the parameter-optimized drug sensitivity prediction model.
In a third aspect, an embodiment provides a drug sensitivity prediction device based on multi-omics data fusion, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the drug sensitivity prediction method based on multi-omics data fusion of the first aspect.
Compared with the prior art, the invention has at least the following beneficial effects:
The cell line graph characterization module integrates three kinds of omics information of an individual cell line, namely genomics data, proteomics data and metabolomics data, into a cell line multi-edge graph, so that both the omics information of the cell line and the potential relationships among gene expression products at each omics level are fully considered. The cell line graph feature extraction module then extracts features from the cell line multi-edge graph, capturing its node features and edge features as cell line features. Finally, the drug sensitivity prediction module predicts the IC50 of a drug from the cell line features and the drug features extracted by the drug feature extraction module, so that the prediction accuracy of drug sensitivity is improved on the basis of jointly considering genomics, proteomics and metabolomics data.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the drug sensitivity prediction method based on multi-omics data fusion provided by an embodiment;
FIG. 2 is a schematic structural diagram of the drug sensitivity prediction model provided by an embodiment;
FIG. 3 is a schematic diagram of the construction of the cell line multi-edge graph in the cell line graph characterization module provided by an embodiment;
FIG. 4 is a schematic diagram of feature extraction in the cell line graph feature extraction module provided by an embodiment;
FIG. 5 is a schematic structural diagram of the drug sensitivity prediction device based on multi-omics data fusion provided by an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Existing drug sensitivity prediction models have low accuracy and struggle to predict drug sensitivity precisely, because they consider neither the complex characteristics of an individual's multi-omics information nor the various potential relationships that may exist between genes at the multi-omics level. To address this, the embodiments provide a drug sensitivity prediction method and device based on multi-omics data fusion. Multi-omics information such as genomics, proteomics and metabolomics, together with the potential relationships that may exist between genes, is taken into account, and data features are extracted by combining a graph neural network unit with a gated recurrent unit, thereby improving the prediction accuracy of the drug sensitivity prediction model.
FIG. 1 is a flow chart of the drug sensitivity prediction method based on multi-omics data fusion provided by the embodiment. As shown in FIG. 1, the method comprises the following steps:
S110: acquiring multi-omics data of cell lines, drug data, and IC50 data of the drugs on the cell lines, and constructing training samples.
For each cell line sample, the corresponding multi-omics data, drug data, and IC50 data of the drug on the cell line can be obtained. The multi-omics data comprise genomics data, proteomics data and metabolomics data. Drug data refers to the name of the drug acting on the cell line, from which the drug's molecular formula can be obtained. The IC50 data characterize the cell line's resistance to the drug: the smaller the IC50, the more sensitive the cell line is to the drug. For all genes contained in the cell line, the genomics data comprise the gene expression level, gene mutation status and copy number variation status; the proteomics data reflect protein-protein interactions (PPIs) between genes at the protein level; and the metabolomics data reflect the correspondence between genes at the metabolic pathway level, i.e., whether multiple genes are present on the same metabolic pathway.
In an embodiment, the data may be acquired from multiple datasets, for example: the CCLE dataset records genomics data of cell lines, including gene expression level, copy number variation status and gene mutation status; the STRING dataset records interactions between human genes/proteins; the GSEA dataset records human metabolic pathway information; and the GDSC dataset records the IC50 value of a cell line for a given drug. Drug data in these datasets are generally given by name, so to facilitate extraction of the drug molecular graph, the drug's molecular formula also needs to be obtained from a database (such as the PubChem database).
The acquired data must be organized into samples for optimizing the parameters of the drug sensitivity prediction model. Specifically, the cell line information, drug data and IC50 data are extracted from each record of the GDSC dataset, and the multi-omics data corresponding to each cell line are then obtained from the CCLE, STRING and GSEA datasets. The data corresponding to each record form one training sample: the multi-omics data of the cell line and the drug data serve as sample data, and the IC50 data of the drug on the cell line serve as the ground-truth label.
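The per-record assembly described above amounts to a join keyed on cell line and drug. The sketch below illustrates this with plain dictionaries; the field names (`cell_line`, `drug`, `ic50`), the lookup tables and the function name are hypothetical and do not correspond to the actual GDSC, CCLE or PubChem file layouts.

```python
def build_samples(gdsc_records, omics_by_cell_line, structure_by_drug):
    """Build one training sample per GDSC-style record.

    Records whose cell line or drug cannot be resolved are skipped.
    """
    samples = []
    for record in gdsc_records:
        omics = omics_by_cell_line.get(record["cell_line"])
        structure = structure_by_drug.get(record["drug"])
        if omics is None or structure is None:
            continue
        samples.append({
            "omics": omics,           # sample data: multi-omics of the cell line
            "drug": structure,        # sample data: drug structure
            "label": record["ic50"],  # ground-truth label
        })
    return samples


# Placeholder identifiers and values, for illustration only.
gdsc_records = [
    {"cell_line": "line_1", "drug": "drug_A", "ic50": 0.8},
    {"cell_line": "line_2", "drug": "drug_A", "ic50": 2.3},
    {"cell_line": "line_9", "drug": "drug_A", "ic50": 1.1},  # no omics data
]
omics_by_cell_line = {"line_1": {"expr": [1.0]}, "line_2": {"expr": [0.2]}}
structure_by_drug = {"drug_A": "structure_of_drug_A"}
samples = build_samples(gdsc_records, omics_by_cell_line, structure_by_drug)
print(len(samples))
```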
In one possible embodiment, in order to improve the quality of the training samples and thus the training effect of the model, after the multi-omics data of the cell lines, the drug data and the IC50 data of the drugs on the cell lines are acquired, outliers and missing values are removed from the data, and the processed data are used to construct the training samples.
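The source does not state which outlier criterion is used. As one possible sketch, the code below drops records with missing labels and then removes values lying more than `k` median absolute deviations from the median; the MAD rule, the default `k`, the `ic50` field name and the function name are all assumptions.

```python
import math
import statistics


def drop_missing_and_outliers(records, key="ic50", k=5.0):
    """Remove records whose value is missing (None or NaN), then remove
    records farther than k median absolute deviations from the median."""
    present = [r for r in records
               if r.get(key) is not None and not math.isnan(r[key])]
    if not present:
        return []
    values = [r[key] for r in present]
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # If MAD is zero, more than half the values are identical; keep all.
    return [r for r in present if mad == 0 or abs(r[key] - med) <= k * mad]


raw = [{"ic50": v} for v in [1.0, 1.2, 0.9, None, 1.1, 50.0, float("nan")]]
clean = drop_missing_and_outliers(raw)
print([r["ic50"] for r in clean])
```

A median-based rule is used here because a mean and standard deviation are themselves distorted by the very outliers being removed, especially when the sample is small.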
S120: constructing a drug sensitivity prediction model.
FIG. 2 is a schematic structural diagram of the drug sensitivity prediction model provided in the embodiment. As shown in FIG. 2, the drug sensitivity prediction model comprises a cell line graph characterization module, a cell line graph feature extraction module, a drug feature extraction module and a drug sensitivity prediction module. The cell line graph characterization module is used for encoding the multi-omics data of the cell line into a cell line multi-edge graph, realizing a graph characterization of the cell line based on the fusion of the multi-omics data; the cell line graph feature extraction module is used for extracting cell line features from the cell line multi-edge graph; the drug feature extraction module is used for extracting drug features from the drug data; and the drug sensitivity prediction module is used for predicting the IC50 of the drug from the cell line features and the drug features.
FIG. 3 is a schematic diagram of the construction of the cell line multi-edge graph in the cell line graph characterization module provided in the embodiment. As shown in FIG. 3, the nodes and node features are constructed first. Specifically, the genes of each sample are taken as the nodes of the cell line multi-edge graph, and three node features are constructed from the genomics data of each gene: the gene expression level, the gene mutation status and the copy number variation status. The gene mutation status indicates whether a gene mutation has occurred, and the copy number variation status indicates whether a copy number variation is present.
And then constructing connection edge information between the nodes, specifically, constructing connection edges between the nodes according to the correlation between genes determined according to the genomics data, the protein interaction between the genes determined according to the proteomics data and the metabolic pathway information between the genes determined according to the metabonomics data.
When edges are constructed according to the correlation between genes, the Pearson correlation coefficient between the gene expression data of two genes is calculated to determine their correlation. When the Pearson correlation coefficient is greater than a set threshold, an edge is constructed between the nodes corresponding to the two genes and its weight is set to 1; where no edge exists, the weight is 0.
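The thresholding rule above can be illustrated with a minimal sketch. The gene names, expression values and threshold of 0.8 are hypothetical, and the assumption that each gene's expression profile is a vector with one value per cell line is not stated in the description; only the rule itself (coefficient above the threshold gives an edge of weight 1) comes from the text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_edges(expression, threshold):
    """Return {(gene_a, gene_b): 1} for each pair whose coefficient exceeds threshold.

    Pairs absent from the result have no edge, i.e. weight 0.
    """
    edges = {}
    for a, b in combinations(sorted(expression), 2):
        if pearson(expression[a], expression[b]) > threshold:
            edges[(a, b)] = 1
    return edges


# Hypothetical expression profiles: one value per cell line for each gene.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0],
    "GENE_C": [4.0, 1.0, 3.5, 0.5],
}
edges = correlation_edges(expression, threshold=0.8)
```

Only GENE_A and GENE_B rise and fall together in this toy data, so they are the only pair that receives an edge.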
When edges are constructed according to the protein interaction between genes, the interaction between two genes is obtained from the proteomics data and taken as the protein interaction. An edge is constructed between the nodes corresponding to two genes that have a protein interaction, and the interaction score is used as the edge weight.
When edges are constructed according to the metabolic pathway information between genes, the metabolic pathway information is obtained from the metabolomics data; when several genes appear together in a metabolic pathway, a hyperedge is constructed among the nodes corresponding to those genes. In the embodiment, the edge between any two genes and its weight are obtained in two steps, hyperedge expansion (Clique Expansion) and edge merging (Edge Merging). Specifically, a hyperedge formed by several genes is first expanded into a fully connected graph in which the nodes corresponding to those genes are connected pairwise, each pair by an ordinary edge. After all hyperedges have been expanded, several edges may exist between the nodes corresponding to two genes; these edges are merged, and the number of edges between the two nodes is used as the weight of the merged edge.
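The two steps can be sketched as follows. The pathway contents and the function name `expand_and_merge` are hypothetical; the logic follows the description, in which each pathway is a hyperedge that is expanded into pairwise edges, and parallel edges are then merged into one edge whose weight is their count.

```python
from collections import Counter
from itertools import combinations


def expand_and_merge(hyperedges):
    """Clique-expand each hyperedge, then merge parallel edges by counting them."""
    weights = Counter()
    for genes in hyperedges:
        # Clique expansion: connect every pair of genes in the pathway.
        for a, b in combinations(sorted(set(genes)), 2):
            # Edge merging: the weight is the number of pathways the pair shares.
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical pathways, each listed as the genes that appear in it.
pathways = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_B", "GENE_C"],
    ["GENE_C", "GENE_D"],
]
pathway_edges = expand_and_merge(pathways)
```

GENE_B and GENE_C appear together in two pathways, so their merged edge has weight 2, while every other pair has weight 1.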
FIG. 4 is a schematic diagram of feature extraction in the cell line graph feature extraction module provided in the embodiment. As shown in FIG. 4, in view of the particular structure of the constructed cell line multi-edge graph, the cell line graph feature extraction module comprises a first graph neural network unit and a gated recurrent unit. The first graph neural network unit comprises a plurality of graph convolution layers, for example 8 graph convolution layers, and extracts node features from the cell line multi-edge graph as cell line features. The gated recurrent unit connects two adjacent graph convolution layers and applies feature attention by giving different attention to the extracted node features: the node features extracted by one graph convolution layer pass through the gated recurrent unit before serving as the input to feature extraction in the next graph convolution layer, so that effective features receive high attention during feature extraction.
As shown in FIG. 4, each graph convolution layer performs three steps of feature aggregation on the node features, one for each type of edge in the cell line multi-edge graph, specifically:
in the first step of feature aggregation, all first first-order neighbor nodes of the current node are determined according to the first edges constructed from the correlation between genes, and feature aggregation is performed by the following formula (1);
wherein the quantities in formula (1) denote, respectively: the current node feature of the i-th current node; the node feature of the j-th first first-order neighbor node; the weight of the first edge between the i-th current node and the j-th first first-order neighbor node; the number of first first-order neighbor nodes; and the new node feature after the first step of feature aggregation;
in the second step of feature aggregation, all second first-order neighbor nodes of the current node are determined according to the second edges constructed from the protein interaction between genes, and feature aggregation is performed by the following formula (2);
wherein the quantities in formula (2) denote, respectively: the new node feature of the i-th current node after attention by the node gating unit; the node feature of the k-th second first-order neighbor node; the weight of the second edge between the i-th current node and the k-th second first-order neighbor node; the number of second first-order neighbor nodes; and the new node feature after the second step of feature aggregation;
in the third step of feature aggregation, all third first-order neighbor nodes of the current node are determined according to the third edges constructed from the metabolic pathway information between genes, and feature aggregation is performed by the following formula (3);
wherein the quantities in formula (3) denote, respectively: the new node feature of the i-th current node after attention by the node gating unit; the node feature of the t-th third first-order neighbor node; the weight of the third edge between the i-th current node and the t-th third first-order neighbor node; the number of third first-order neighbor nodes; and the new node feature after the third step of feature aggregation;
the new node feature of the current node after the third step, once attended to by the node gating unit, is used as the current node feature of the next graph convolution layer.
The three steps aggregate, in turn, the node features of the first-order neighbor nodes defined by the three types of edges, so that each graph convolution layer aggregates once the node features of all first-order neighbor nodes reachable through the three types of edges. After each step, feature attention is applied at the node level by the node gating unit, so that the node features aggregated through the different types of edges are given appropriately different weights.
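The order of operations in one graph convolution layer can be sketched as below. This is an illustration under stated assumptions, not the patented formulas: the exact form of formulas (1) to (3) is not available in this text, so the sketch assumes each step adds to a node's feature the weighted mean of its neighbors' features, and it replaces the gated recurrent unit with a simplified single update gate. All names (`aggregate`, `gate`, `graph_conv_layer`, `w_z`, `u_z`) and the toy graph are hypothetical.

```python
from math import exp


def aggregate(features, edges):
    """Assumed step: h_i + (1 / N_i) * sum_j w_ij * h_j over first-order neighbors."""
    out = {}
    for node, h in features.items():
        neighbors = edges.get(node, {})
        if not neighbors:
            out[node] = list(h)  # no neighbor on this edge type: feature unchanged
            continue
        total = [0.0] * len(h)
        for nb, w in neighbors.items():
            for d in range(len(h)):
                total[d] += w * features[nb][d]
        out[node] = [h[d] + total[d] / len(neighbors) for d in range(len(h))]
    return out


def gate(old, new, w_z=1.0, u_z=1.0):
    """Simplified update gate standing in for the gated recurrent unit."""
    gated = {}
    for node in old:
        row = []
        for o, n in zip(old[node], new[node]):
            z = 1.0 / (1.0 + exp(-(w_z * n + u_z * o)))  # gate value in (0, 1)
            row.append((1.0 - z) * o + z * n)  # blend of old and new feature
        gated[node] = row
    return gated


def graph_conv_layer(features, edge_sets):
    """Run the three aggregation steps in order, gating after each one."""
    h = features
    for edges in edge_sets:  # correlation, protein interaction, metabolic pathway
        h = gate(h, aggregate(h, edges))
    return h


# Toy graph: adjacency given as {node: {neighbor: edge weight}} per edge type.
features = {"g1": [1.0, 0.0], "g2": [0.0, 1.0], "g3": [1.0, 1.0]}
correlation = {"g1": {"g2": 1.0}, "g2": {"g1": 1.0}}
interaction = {"g2": {"g3": 0.9}, "g3": {"g2": 0.9}}
pathway = {"g1": {"g3": 2.0}, "g3": {"g1": 2.0}}
updated = graph_conv_layer(features, [correlation, interaction, pathway])
```

Stacking several such layers, with the output of one used as the input features of the next, corresponds to the plurality of graph convolution layers described above.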
As shown in FIG. 2, the drug feature extraction module comprises a conversion unit and a second graph neural network unit; the conversion unit converts the drug data into a drug molecular graph, and the second graph neural network unit extracts drug features from the input drug molecular graph. In one possible embodiment, the conversion unit encodes the drug data into a drug molecular graph using the open-source library RDKit, and the second graph neural network unit, constructed on the graph isomorphism principle, then performs feature extraction on the drug molecular graph to obtain the drug features.
The second graph neural network unit constructed on the graph isomorphism principle comprises a plurality of Graph Isomorphism Network (GIN) structures. Each GIN structure comprises a GIN convolutional layer (GINConv), a batch normalization layer (BN) and a ReLU activation layer (ReLU), and each GAT module comprises a GAT convolutional layer (GATConv), a batch normalization layer (BN) and a ReLU activation layer (ReLU).
As shown in FIG. 2, the drug sensitivity prediction module comprises a plurality of fully connected layers, for example 3 fully connected layers. The cell line features and the drug features are concatenated and input to the drug sensitivity prediction module, which performs feature fusion and regression prediction on the concatenated features through the fully connected layers and outputs the predicted half-maximal inhibitory concentration (IC50) of the drug-cell line pair.
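A minimal sketch of such a prediction head follows. The layer sizes, the hand-picked weights and the use of ReLU between layers are assumptions made for illustration; the description specifies only that concatenated features pass through several fully connected layers to produce a regression output.

```python
def linear(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU activation (assumed; not specified in the description)."""
    return [max(0.0, v) for v in x]


def predict_ic50(cell_features, drug_features, layers):
    """Concatenate both feature vectors and pass them through the layers."""
    x = list(cell_features) + list(drug_features)
    for weights, bias in layers[:-1]:
        x = relu(linear(x, weights, bias))
    weights, bias = layers[-1]  # final layer: plain regression output
    return linear(x, weights, bias)[0]


# Hypothetical 3-layer head: 3 inputs -> 2 -> 1 -> 1.
layers = [
    ([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
    ([[2.0]], [-1.0]),
]
predicted = predict_ic50([1.0, 2.0], [3.0], layers)
```

In a trained model the weights would be learned rather than fixed by hand, and the feature vectors would come from the two feature extraction modules.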
S130, performing parameter optimization on the drug sensitivity prediction model using the training samples.
In the embodiment, the parameters of the drug sensitivity prediction model are optimized using the multi-omics data of the cell line and the drug data as sample data and the half-maximal inhibitory concentration (IC50) data of the drug on the cell line as the ground-truth label. Specifically, the multi-omics data of the cell line are input into the cell line graph characterization module, the resulting cell line multi-edge graph is input into the cell line graph feature extraction module, and the cell line features are obtained through information characterization and feature extraction. The drug data are input into the drug feature extraction module, and the drug features are obtained through information characterization and feature extraction. The cell line features and the drug features are input into the drug sensitivity prediction module, which outputs the predicted IC50 value; the model parameters of the drug sensitivity prediction model are then updated using the mean square error between the predicted IC50 value and the corresponding ground-truth label as the loss function.
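The loss can be written out in a few lines. The predicted and ground-truth values below are hypothetical, and the `rmse` helper is included only because the root of the mean square error is a common way to report test-set error for this kind of regression.

```python
from math import sqrt


def mse(predicted, target):
    """Mean square error between predicted and ground-truth IC50 values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def rmse(predicted, target):
    """Root mean square error, the square root of the training loss."""
    return sqrt(mse(predicted, target))


# Hypothetical predictions and labels for three drug-cell line pairs.
predicted = [1.0, 2.5, 4.0]
target = [1.5, 2.5, 3.0]
loss = mse(predicted, target)
```

During training, an optimizer would adjust the model parameters to reduce this value over the training samples.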
S140, performing drug sensitivity prediction using the drug sensitivity prediction model after parameter optimization.
In application, the multi-omics data of a cell line are input into the drug sensitivity prediction model; the cell line graph characterization module encodes the multi-omics data into a cell line multi-edge graph, which is input into the cell line graph feature extraction module, and the cell line features are obtained through information characterization and feature extraction. The drug data are input into the drug feature extraction module, and the drug features are obtained through information characterization and feature extraction. The cell line features and the drug features are input into the drug sensitivity prediction module, which outputs the predicted half-maximal inhibitory concentration (IC50) value. For example, when the drug sensitivity prediction model was trained and tested on 564 pan-cancer cell lines and 170 drugs, the RMSE on the test set was 0.7943, markedly better than existing models of this kind.
FIG. 5 is a schematic structural diagram of a drug sensitivity prediction device based on multi-omics data fusion provided by an embodiment. As shown in FIG. 5, an embodiment provides a drug sensitivity prediction device 500, comprising:
a data acquisition unit 510, configured to acquire multi-omics data of a cell line, drug data, and half-maximal inhibitory concentration (IC50) data of the drug on the cell line, wherein the multi-omics data comprise genomics data, proteomics data and metabolomics data;
a model construction unit 520, configured to construct a drug sensitivity prediction model comprising a cell line graph characterization module, a cell line graph feature extraction module, a drug feature extraction module and a drug sensitivity prediction module, wherein the cell line graph characterization module is used for encoding the multi-omics data of the cell line into a cell line multi-edge graph, in which the genes of each sample are used as nodes, the gene expression level, gene mutation status and copy number variation status corresponding to each gene are used as node features, and the edges between the nodes are constructed according to the correlation between genes determined from the genomics data, the protein interaction between genes determined from the proteomics data, and the metabolic pathway information between genes determined from the metabolomics data; the cell line graph feature extraction module is used for extracting cell line features from the cell line multi-edge graph; the drug feature extraction module is used for extracting drug features from the drug data; and the drug sensitivity prediction module is used for predicting the IC50 of the drug from the cell line features and the drug features;
an optimization learning unit 530, configured to perform parameter optimization on the drug sensitivity prediction model using the multi-omics data of the cell line and the drug data as sample data and the IC50 data of the drug on the cell line as the ground-truth label;
and a prediction unit 540, configured to perform drug sensitivity prediction using the drug sensitivity prediction model after parameter optimization.
It should be noted that, when the drug sensitivity prediction device based on multi-omics data fusion provided in the above embodiment performs drug sensitivity prediction, the division into the functional units described above is only an example. The functions may be assigned to different functional units as needed; that is, the internal structure of the terminal or server may be divided into different functional units to perform all or part of the functions described above. In addition, the drug sensitivity prediction device based on multi-omics data fusion and the drug sensitivity prediction method based on multi-omics data fusion provided in the above embodiments belong to the same concept; the specific implementation of the device is described in detail in the method embodiment and is not repeated here.
The embodiment also provides a drug sensitivity prediction device based on multi-omics data fusion, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the drug sensitivity prediction method based on multi-omics data fusion when executing the computer program.
Those skilled in the art will understand that all or part of the processes of the methods of the above embodiments can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. In the embodiments provided in the present application, the memory may be a local volatile memory such as RAM, a non-volatile memory such as ROM, FLASH, a floppy disk or a mechanical hard disk, or remote cloud storage. The processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP) or a field-programmable gate array (FPGA).
The above embodiments are intended to illustrate the technical solutions and advantages of the present invention. It should be understood that they are only the most preferred embodiments of the present invention and are not intended to limit it; any modifications, additions and equivalent substitutions made within the scope of the principles of the present invention shall fall within its scope of protection.
Claims (10)
1. A drug sensitivity prediction method based on multi-omics data fusion, characterized by comprising the following steps:
acquiring multi-omics data of a cell line, drug data, and half-maximal inhibitory concentration (IC50) data of the drug on the cell line, wherein the multi-omics data of the cell line comprise genomics data, proteomics data and metabolomics data;
constructing a drug sensitivity prediction model comprising a cell line graph characterization module, a cell line graph feature extraction module, a drug feature extraction module and a drug sensitivity prediction module, wherein the cell line graph characterization module is used for encoding the multi-omics data of the cell line into a cell line multi-edge graph, in which the genes of each sample are used as the nodes of the cell line multi-edge graph, the gene expression level, gene mutation status and copy number variation status corresponding to each gene are used as the node features, and the edges between the nodes are constructed according to the correlation between genes determined from the genomics data, the protein interaction between genes determined from the proteomics data, and the metabolic pathway information between genes determined from the metabolomics data; the cell line graph feature extraction module is used for extracting cell line features from the cell line multi-edge graph; the drug feature extraction module is used for extracting drug features from the drug data; and the drug sensitivity prediction module is used for predicting the IC50 of the drug from the cell line features and the drug features;
performing parameter optimization on the drug sensitivity prediction model using the multi-omics data of the cell line and the drug data as sample data and the IC50 data of the drug on the cell line as the ground-truth label;
and performing drug sensitivity prediction using the drug sensitivity prediction model after parameter optimization.
2. The drug sensitivity prediction method based on multi-omics data fusion according to claim 1, wherein in the cell line graph characterization module, the Pearson correlation coefficient between the gene expression data of two genes is calculated from the genomics data to determine the correlation between the genes, and when the Pearson correlation coefficient is greater than a set threshold, an edge is constructed between the nodes corresponding to the two genes;
the interaction between two genes is obtained from the proteomics data and taken as the protein interaction, an edge is constructed between the nodes corresponding to two genes that have a protein interaction, and the interaction score is used as the weight of the edge;
the metabolic pathway information between genes is obtained from the metabolomics data, and when several genes appear together in a metabolic pathway, a hyperedge is constructed among the nodes corresponding to those genes to serve as an edge.
3. The drug sensitivity prediction method based on multi-omics data fusion according to claim 1, wherein the cell line graph feature extraction module comprises a first graph neural network unit and a gated recurrent unit, the first graph neural network unit is composed of a plurality of graph convolution layers, and two adjacent graph convolution layers are connected through the gated recurrent unit, wherein the first graph neural network unit is used for extracting cell line features from the cell line multi-edge graph, and the gated recurrent unit is used for applying feature attention to the extracted cell line features.
4. The drug sensitivity prediction method based on multi-omics data fusion according to claim 3, wherein in each graph convolution layer, three steps of feature aggregation are performed on the node features, comprising:
in the first step of feature aggregation, all first first-order neighbor nodes of the current node are determined according to the first edges constructed from the correlation between genes, and feature aggregation is performed by the following formula (1);
wherein the quantities in formula (1) denote, respectively: the current node feature of the i-th current node; the node feature of the j-th first first-order neighbor node; the weight of the first edge between the i-th current node and the j-th first first-order neighbor node; the number of first first-order neighbor nodes; and the new node feature after the first step of feature aggregation;
in the second step of feature aggregation, all second first-order neighbor nodes of the current node are determined according to the second edges constructed from the protein interaction between genes, and feature aggregation is performed by the following formula (2);
wherein the quantities in formula (2) denote, respectively: the new node feature of the i-th current node after attention by the node gating unit; the node feature of the k-th second first-order neighbor node; the weight of the second edge between the i-th current node and the k-th second first-order neighbor node; the number of second first-order neighbor nodes; and the new node feature after the second step of feature aggregation;
in the third step of feature aggregation, all third first-order neighbor nodes of the current node are determined according to the third edges constructed from the metabolic pathway information between genes, and feature aggregation is performed by the following formula (3);
wherein the quantities in formula (3) denote, respectively: the new node feature of the i-th current node after attention by the node gating unit; the node feature of the t-th third first-order neighbor node; the weight of the third edge between the i-th current node and the t-th third first-order neighbor node; the number of third first-order neighbor nodes; and the new node feature after the third step of feature aggregation.
5. The drug sensitivity prediction method based on multi-omics data fusion according to claim 1, wherein the drug feature extraction module comprises a conversion unit and a second graph neural network unit, wherein the conversion unit is used for converting the drug data into a drug molecular graph, and the second graph neural network unit is used for extracting drug features from the input drug molecular graph.
6. The drug sensitivity prediction method based on multi-omics data fusion according to claim 5, wherein the conversion unit encodes the drug data into a drug molecular graph using the open-source library RDKit; and the second graph neural network unit is constructed on the graph isomorphism principle.
7. The drug sensitivity prediction method based on multi-omics data fusion according to claim 1, wherein the drug sensitivity prediction module comprises a plurality of fully connected layers for performing feature fusion and regression prediction on the concatenation of the input cell line features and drug features to obtain the half-maximal inhibitory concentration (IC50) of the drug.
8. The drug sensitivity prediction method based on multi-omics data fusion according to claim 1, characterized in that after the multi-omics data of the cell line, the drug data and the half-maximal inhibitory concentration (IC50) data of the drug on the cell line are acquired, outliers and missing values are removed from the data, and the processed data are used to construct the training samples;
and when the parameters of the drug sensitivity prediction model are optimized, the model parameters are updated using the mean square error between the predicted IC50 value and the corresponding ground-truth label as the loss function.
9. A drug sensitivity prediction device based on multi-omics data fusion, comprising:
a data acquisition unit, configured to acquire multi-omics data of a cell line, drug data, and half-maximal inhibitory concentration (IC50) data of the drug on the cell line, wherein the multi-omics data of the cell line comprise genomics data, proteomics data and metabolomics data;
a model construction unit, configured to construct a drug sensitivity prediction model comprising a cell line graph characterization module, a cell line graph feature extraction module, a drug feature extraction module and a drug sensitivity prediction module, wherein the cell line graph characterization module is used for encoding the multi-omics data of the cell line into a cell line multi-edge graph, in which the genes of each sample are used as nodes, the gene expression level, gene mutation status and copy number variation status corresponding to each gene are used as node features, and the edges between the nodes are constructed according to the correlation between genes determined from the genomics data, the protein interaction between genes determined from the proteomics data, and the metabolic pathway information between genes determined from the metabolomics data; the cell line graph feature extraction module is used for extracting cell line features from the cell line multi-edge graph; the drug feature extraction module is used for extracting drug features from the drug data; and the drug sensitivity prediction module is used for predicting the IC50 of the drug from the cell line features and the drug features;
an optimization learning unit, configured to perform parameter optimization on the drug sensitivity prediction model using the multi-omics data of the cell line and the drug data as sample data and the IC50 data of the drug on the cell line as the ground-truth label;
and a prediction unit, configured to perform drug sensitivity prediction using the drug sensitivity prediction model after parameter optimization.
10. A drug sensitivity prediction device based on multi-omics data fusion, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the drug sensitivity prediction method based on multi-omics data fusion of any one of claims 1 to 8 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111349387.8A CN113782089B (en) | 2021-11-15 | 2021-11-15 | Drug sensitivity prediction method and device based on multigroup chemical data fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113782089A (en) | 2021-12-10 |
CN113782089B (en) | 2022-02-18 |
Family ID: 78873903
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111349387.8A Active CN113782089B (en) | 2021-11-15 | 2021-11-15 | Drug sensitivity prediction method and device based on multigroup chemical data fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113782089B (en) |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN113782089B (en) | Drug sensitivity prediction method and device based on multigroup chemical data fusion | |
CN112863696B (en) | Drug sensitivity prediction method and device based on transfer learning and graph neural network | |
CN114255886B (en) | Multi-group similarity guide-based drug sensitivity prediction method and device | |
Zeng et al. | Review of statistical learning methods in integrated omics studies (an integrated information science) | |
WO2022111385A1 (en) | Graph neural network-based clinical omics data processing method and apparatus, device, and medium | |
Wang et al. | FastGGM: an efficient algorithm for the inference of Gaussian graphical model in biological networks | |
CN112784913B (en) | MiRNA-disease association prediction method and device based on fusion of multi-view information of graphic neural network | |
CN112037912A (en) | Triage model training method, device and equipment based on medical knowledge map | |
US20220130541A1 (en) | Disease-gene prioritization method and system | |
CN105653846A (en) | Integrated similarity measurement and bi-directional random walk based pharmaceutical relocation method | |
CN115798598B (en) | Hypergraph-based miRNA-disease association prediction model and method | |
CN116110509B (en) | Method and device for predicting drug sensitivity based on histology consistency pretraining | |
D’Agaro | Artificial intelligence used in genome analysis studies | |
Yang et al. | Machine learning methods for exploring sequence determinants of 3D genome organization | |
WO2024164739A1 (en) | Graph network construction method and apparatus, electronic device, and storage medium | |
Bansal et al. | A review on machine learning aided multi-omics data integration techniques for healthcare | |
CN117079804A (en) | Method and system for constructing digestive system tumor clinical result prediction model | |
CN116978464A (en) | Data processing method, device, equipment and medium | |
CN115410642A (en) | Biological relation network information modeling method and system | |
Wang et al. | Network clustering analysis using mixture exponential-family random graph models and its application in genetic interaction data | |
Wang et al. | Prediction of the disease causal genes based on heterogeneous network and multi-feature combination method | |
CN114999630A (en) | Liver transplantation recipient prognosis prediction device based on multi-source data fusion | |
Li et al. | iEnhance: a multi-scale spatial projection encoding network for enhancing chromatin interaction data resolution | |
Gentry et al. | Missingness adapted group informed clustered (MAGIC)-LASSO: A novel paradigm for prediction in data with widespread non-random missingness | |
CN116994652B (en) | Information prediction method and device based on neural network and electronic equipment | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||