CN112308326B - Biological network link prediction method based on meta-path and bidirectional encoder - Google Patents


Publication number
CN112308326B
Authority
CN
China
Prior art keywords
drug
protein
disease
network
mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011226195.3A
Other languages
Chinese (zh)
Other versions
CN112308326A (en)
Inventor
彭绍亮
王小奇
李非
辛彬
肖霞
王红
张兴龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202011226195.3A
Publication of CN112308326A
Application granted
Publication of CN112308326B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention belongs to the field of computer science and discloses a biological network link prediction method based on meta-paths and a bidirectional encoder. First, a multi-source heterogeneous drug information network is constructed, and multiple semantic paths (meta-paths) are designed for sequence sampling to form a large-scale semantic information base. Second, a deep Transformer encoder is combined with a masked language model to design a deep bidirectional encoding representation model that extracts a low-dimensional characterization vector for each node. Finally, inductive matrix completion is used to predict biological links such as disease-protein associations, protein-drug interactions and drug-side effect associations, thereby further completing a disease-target-drug-side effect technical framework for drug research and development.

Description

Biological network link prediction method based on meta-path and bidirectional encoder
Technical Field
The invention belongs to the field of computer science, relates to the application of artificial intelligence technology, and particularly relates to a biological network link prediction method based on a meta-path and a bidirectional encoder.
Background
Given a set of biomedical entities and their known interactions, predicting other potential interactions (links) between the entities is one of the most important tasks in the biomedical field. More and more researchers are therefore using computational techniques to predict potential interactions in various biomedical networks.
Traditional methods in the biomedical field have invested considerable effort in developing biologically relevant features such as chemical substructures, gene ontologies and topological similarities, and use supervised learning methods and semi-supervised graph inference models to predict potential interactions. These methods rest primarily on the similarity assumption, i.e., entities with similar biological or structural characteristics are likely to have similar associations. However, prediction methods based on biological features typically face two problems: (1) Extracting biological features is very costly, and some features are difficult to obtain at all. Entities without features can be deleted during preprocessing, but the resulting data set is usually small and loses important information, so the approach is impractical in real applications. (2) Biological features may not represent biomedical entities accurately enough, so a stable and accurate model may not be obtainable.
Network representation methods, which attempt to automatically learn low-dimensional vectors for network nodes, are expected to solve both problems and are widely used in biological link prediction. For example, matrix factorization techniques have been used to predict drug-disease associations, and some researchers have proposed a manifold-regularized matrix factorization technique that improves drug-drug interaction prediction by incorporating Laplacian regularization to learn a better drug representation. Network representation methods based on random walks and on deep neural networks have also been proposed. However, existing methods either consider only the structural features between network nodes and ignore the semantic information between network entities, or capture only short structures and meta-paths and cannot deeply mine the structural and semantic relations between network nodes.
Disclosure of Invention
To overcome the above shortcomings, the invention provides a biological network link prediction method based on meta-paths and a bidirectional encoder. First, a multi-source heterogeneous drug information network is constructed, and multiple meta-paths are designed for sequence sampling to form a large-scale semantic information base. Second, a deep Transformer encoder is combined with a masked language model to design a deep bidirectional encoding representation model that extracts a low-dimensional characterization vector for each node. Finally, inductive matrix completion is used to predict biological links such as disease-protein associations, protein-drug interactions and drug-side effect associations, thereby further completing a disease-target-drug-side effect technical framework for drug research and development.
The technical scheme adopted by the invention is as follows:
A biological network link prediction method based on meta-paths and a bidirectional encoder includes the following steps:
1) Initializing parameters, including: the network sequence length l, the node degree threshold deg, the characterization vector dimension dim, the number of Transformer encoder layers n, the mask sequence ratio k ∈ (0, 1) of the language model, the probability p ∈ (0, 1) that a mask sequence is replaced by the special character [MASK], and the probability p' ∈ (0, 1-p) that a mask sequence is replaced by another sequence in the semantic text;
2) Constructing a drug information network and meta-paths;
3) Numbering all nodes in the network as x_i ∈ {x_i | i = 1, 2, ..., num}, where num is the total number of nodes, and, for each node x_i, sampling sequences in turn according to the meta-paths of step 2);
4) Inputting all semantic sequences into a deep bidirectional Transformer encoder for representation learning to obtain a low-dimensional characterization vector of each node, wherein every Transformer layer comprises the same multi-head self-attention mechanism and a fully connected network;
5) Judging whether the maximum number of training iterations has been reached; if so, outputting the characterization vector of each node
Figure GDA0003919908580000021
and going to step 6); otherwise, returning to step 4);
6) Predicting disease-protein associations using the inductive matrix completion method;
7) Predicting target-drug interactions using the inductive matrix completion method, in the same way as the disease-protein association prediction of step 6);
8) Predicting drug-side effect associations using the inductive matrix completion method, in the same way as the disease-protein association prediction of step 6).
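The parameter ranges of step 1) can be checked before any training starts. The following is a minimal sketch only: the class name, field names and example values are hypothetical and are not taken from the patent, which states only the symbols and their ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionParams:
    """Parameters named in step 1); the field names are illustrative."""
    seq_len: int    # l: network sequence length
    deg: int        # node degree threshold
    dim: int        # characterization vector dimension
    n_layers: int   # n: number of Transformer encoder layers
    k: float        # mask ratio, must lie in (0, 1)
    p: float        # probability of replacement by [MASK], in (0, 1)
    p_prime: float  # probability of replacement by another sequence, in (0, 1 - p)

    def __post_init__(self):
        if min(self.seq_len, self.dim, self.n_layers) < 1 or self.deg < 0:
            raise ValueError("l, dim and n must be positive and deg non-negative")
        if not 0.0 < self.k < 1.0:
            raise ValueError("k must lie in (0, 1)")
        if not 0.0 < self.p < 1.0:
            raise ValueError("p must lie in (0, 1)")
        if not 0.0 < self.p_prime < 1.0 - self.p:
            raise ValueError("p' must lie in (0, 1 - p)")


# Example values are placeholders, not values stated in the patent.
params = PredictionParams(seq_len=64, deg=2, dim=128, n_layers=4,
                          k=0.15, p=0.8, p_prime=0.1)
```

The constraint p' < 1 - p leaves a non-zero probability 1 - p - p' that a selected mask sequence is kept unchanged.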
As a further improvement of the present invention, the step 2) is realized by the following steps:
2.1 Constructing, from the DrugBank, UniProt, HPRD, SIDER, CTD, NDFRT and STRING public databases, a drug information network comprising 4 node types (drugs, targets, diseases and side effects) and 7 edge types, in which nodes with a degree less than deg are deleted, wherein the 7 edge types are drug-drug interactions, drug-protein interactions, drug-disease associations, drug-side effect associations, protein-disease associations, drug-drug structural similarities and protein-protein sequence similarities;
2.2 According to different biological pathways and pharmaceutical mechanisms, 23 kinds of meta-paths are constructed, which are respectively: drug-protein, drug-protein-drug, drug-protein, drug-protein-disease, drug-protein-drug, drug-protein-disease, drug-protein-drug-protein, drug-protein-drug-disease, drug-protein-drug-side effect, drug-protein-disease-protein, drug-protein-disease-drug, protein-drug, protein-drug-protein, protein-drug-disease, protein-drug-side effect, protein-drug-protein, protein-drug-disease, protein-drug-side effect, protein-drug-protein-disease, protein-drug-disease-protein, protein-drug-side effect-drug;
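The sampling of step 3), which follows the meta-paths of step 2.2, can be pictured as a type-constrained random walk. The sketch below is an illustration under assumptions: the function name, the toy network and the uniform choice among matching neighbours are not specified by the patent.

```python
import random


def sample_along_meta_path(neighbors, node_type, start, meta_path, rng):
    """Return one node sequence that follows `meta_path` from `start`,
    or None if the walk cannot be completed."""
    if node_type[start] != meta_path[0]:
        return None
    walk = [start]
    for wanted_type in meta_path[1:]:
        # Only neighbours of the required next node type are eligible.
        candidates = [v for v in neighbors.get(walk[-1], ())
                      if node_type[v] == wanted_type]
        if not candidates:
            return None
        walk.append(rng.choice(candidates))
    return walk


# Toy network; node names are made up for illustration.
node_type = {"D1": "drug", "D2": "drug", "P1": "protein", "Dis1": "disease"}
neighbors = {
    "D1": ["P1"],
    "D2": ["P1"],
    "P1": ["D1", "D2", "Dis1"],
    "Dis1": ["P1"],
}
rng = random.Random(0)
walk = sample_along_meta_path(neighbors, node_type, "D1",
                              ("drug", "protein", "disease"), rng)
print(" ".join(walk))  # prints "D1 P1 Dis1"
```

Joining each walk with spaces yields one "sentence" per sample, which matches the space-based word division that step 4.1 applies to the semantic sequences.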
As a further improvement of the present invention, the step 4) is realized by the following steps:
4.1 Dividing all semantic sequences into words, which includes removing special characters and redundant characters and splitting on spaces, and then processing the semantic sequences with the masked language model: mask sequences are randomly selected from all semantic sequences according to the mask ratio k, and a random number rand ∈ [0, 1] is generated for each mask sequence; if rand < p, the sequence is replaced with [MASK], where p ∈ (0, 1) is the probability that a mask sequence is replaced by [MASK]; if p ≤ rand < p + p', a sequence is randomly selected from the semantic sequences to replace the mask sequence, where p' ∈ (0, 1-p) is the probability that a mask sequence is replaced by another sequence; if p + p' ≤ rand < 1, the mask sequence remains unchanged;
4.2 Superimposing the initial token vector and the position vector of each node to obtain
Figure GDA0003919908580000031
inputting this vector into the multi-head attention mechanism for learning to obtain the vector
Figure GDA0003919908580000032
and applying a residual connection and normalization to obtain
Figure GDA0003919908580000033
Next, learning further with a fully connected feed-forward network, which is likewise followed by a residual connection and a normalization operation; finally, the low-dimensional characterization vector of the node is obtained.
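The three-way replacement rule of step 4.1 can be sketched as follows. This is an illustration under assumptions: each element of a sampled sequence (a node identifier) is treated as the unit that gets masked, the number of selected positions is rounded from k times the sequence length with a minimum of one, and the function and variable names are hypothetical.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, k, p, p_prime, rng):
    """Apply the rule of step 4.1 to one sequence.

    Returns the corrupted sequence and the sorted selected positions.
    """
    n_select = max(1, round(k * len(tokens)))  # assumed rounding rule
    positions = rng.sample(range(len(tokens)), n_select)
    corrupted = list(tokens)
    for pos in positions:
        rand = rng.random()
        if rand < p:
            corrupted[pos] = MASK               # replaced by [MASK]
        elif rand < p + p_prime:
            corrupted[pos] = rng.choice(vocab)  # replaced by another token
        # otherwise (p + p' <= rand < 1) the token stays unchanged
    return corrupted, sorted(positions)


tokens = "D1 P1 Dis1 P2 D2 SE1".split()
vocab = sorted(set(tokens))
corrupted, positions = mask_tokens(tokens, vocab, k=0.3, p=0.8,
                                   p_prime=0.1, rng=random.Random(0))
```

The selected positions are returned as well, because in masked language model training the loss is usually computed only at those positions, including the ones left unchanged.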
As a further improvement of the present invention, said step 6) is realized by the following steps:
6.1 Counting the number Ninter of disease-protein associations in the network, randomly selecting the same number Ninter of negative samples from the disease-protein association network, mixing the positive samples and the negative samples together, and performing 10-fold cross-validation;
6.2 Reconstructing the heterogeneous network based on the inductive matrix completion model, with the network association information of the test set removed, wherein the specific operation is as follows: by the formula
Figure GDA0003919908580000034
node link prediction is converted into an optimization problem, where r denotes one of the 7 types of network edges, P_r is the adjacency matrix of the single network of type r, Z_r is the low-rank matrix to be solved for that single network, and V_u and V_w are feature vectors of nodes in the single network; the 7 types of network edges are: drug-drug interactions, drug-protein interactions, drug-disease associations, drug-side effect associations, protein-disease associations, drug-drug structural similarities and protein-protein sequence similarities;
6.3 Calculating the disease-target association scores in the test set from the trained low-rank matrix corresponding to the disease-protein association.
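The sample construction of step 6.1 can be sketched as below. The assumptions are labeled in the code: negatives are drawn uniformly from disease-protein pairs with no known association, and the folds are formed by striding over the shuffled list. The patent does not state either detail, and all names and the toy data are hypothetical.

```python
import random


def balanced_samples(positive_pairs, diseases, proteins, rng):
    """Mix the known associations with an equal number of unlinked pairs."""
    known = set(positive_pairs)
    if len(diseases) * len(proteins) - len(known) < len(known):
        raise ValueError("not enough unlinked pairs for balanced sampling")
    negatives = set()
    while len(negatives) < len(known):
        pair = (rng.choice(diseases), rng.choice(proteins))
        if pair not in known:  # assumed: unobserved pairs act as negatives
            negatives.add(pair)
    samples = [(d, p, 1) for d, p in sorted(known)]
    samples += [(d, p, 0) for d, p in sorted(negatives)]
    rng.shuffle(samples)
    return samples


def ten_fold_splits(samples, n_folds=10):
    """Yield (train, test) lists; every sample is tested exactly once."""
    folds = [samples[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


# Toy data: 20 distinct made-up associations among 10 diseases and 10 proteins.
diseases = [f"dis{i}" for i in range(10)]
proteins = [f"prot{i}" for i in range(10)]
positives = [(diseases[i % 10], proteins[(3 * i + i // 10) % 10])
             for i in range(20)]
samples = balanced_samples(positives, diseases, proteins, random.Random(0))
splits = list(ten_fold_splits(samples))
```

For each split, the associations in the test fold would be removed from the network before the low-rank matrix is fitted, as step 6.2 requires.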
Compared with the prior art, the invention has the beneficial effects that:
The invention integrates the structural relations between network nodes with semantic relations such as biological pathways and pharmacological mechanisms by constructing meta-paths of different types and lengths. A multi-head attention mechanism captures the dependencies between network nodes at different distances, balancing local and global information. A masked language model integrates the context of the semantic sequences, which further improves the network representation capability. In addition, using an inductive matrix completion model for link prediction addresses the cold-start problem faced by sparse networks.
Drawings
FIG. 1 is a flow chart of a method for predicting bio-network links based on meta-paths and bi-directional encoders;
FIG. 2 shows prediction results of the bio-network link prediction method based on meta-path and bi-directional encoder.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a bio-network link prediction method based on meta-path and bi-directional encoder according to an embodiment of the present invention.
With reference to FIG. 1, a biological network link prediction method based on meta-paths and a bidirectional encoder includes the following steps:
1) Initializing parameters, including: the network sequence length l, the node degree threshold deg, the characterization vector dimension dim, the number of Transformer encoder layers n, the mask sequence ratio k ∈ (0, 1) of the language model, the probability p ∈ (0, 1) that a mask sequence is replaced by the special character [MASK], and the probability p' ∈ (0, 1-p) that a mask sequence is replaced by another sequence in the semantic text;
2) Constructing a drug information network and meta-paths;
3) Numbering all nodes in the network as x_i ∈ {x_i | i = 1, 2, ..., num}, where num is the total number of nodes, and, for each node x_i, sampling sequences in turn according to the meta-paths of step 2);
4) Inputting all semantic sequences into a deep bidirectional Transformer encoder for representation learning to obtain a low-dimensional characterization vector of each node, wherein every Transformer layer comprises the same multi-head self-attention mechanism and a fully connected network;
5) Judging whether the maximum number of training iterations has been reached; if so, outputting the characterization vector of each node
Figure GDA0003919908580000041
and going to step 6); otherwise, returning to step 4);
6) Predicting disease-protein associations using the inductive matrix completion method;
7) Predicting target-drug interactions using the inductive matrix completion method, in the same way as the disease-protein association prediction of step 6);
8) Predicting drug-side effect associations using the inductive matrix completion method, in the same way as the disease-protein association prediction of step 6).
As a further improvement of the present invention, the step 2) is realized by the following steps:
2.1 Constructing, from the DrugBank, UniProt, HPRD, SIDER, CTD, NDFRT and STRING public databases, a drug information network comprising 4 node types (drugs, targets, diseases and side effects) and 7 edge types, in which nodes with a degree less than deg are deleted, wherein the 7 edge types are drug-drug interactions, drug-protein interactions, drug-disease associations, drug-side effect associations, protein-disease associations, drug-drug structural similarities and protein-protein sequence similarities;
2.2 23 kinds of meta-paths are constructed according to different biological pathways and pharmaceutical mechanisms, and respectively are as follows: drug-protein, drug-protein-drug, drug-protein, drug-protein-disease, drug-protein-drug, drug-protein-disease, drug-protein-drug-protein, drug-protein-drug-disease, drug-protein-drug-side effect, drug-protein-disease-protein, drug-protein-disease-drug, protein-drug, protein-drug-protein, protein-drug-disease, protein-drug-side effect, protein-drug-protein, protein-drug-disease, protein-drug-side effect, protein-drug-protein-disease, protein-drug-disease-protein, protein-drug-side effect-drug;
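The network construction of step 2.1 can be pictured as building an adjacency structure from typed edge records and then applying the degree threshold deg. The sketch below rests on one reading of the text, namely that nodes whose degree is below deg are dropped in a single pass. The function name and the toy edges are hypothetical, and no real database records are used.

```python
from collections import defaultdict


def build_network(edges, min_degree):
    """Build an undirected adjacency map and drop low-degree nodes.

    `edges` holds (node_a, node_b, edge_type) triples. Filtering is done
    once; nodes that fall below the threshold because a neighbour was
    removed are kept (an assumption of this sketch).
    """
    adjacency = defaultdict(set)
    for a, b, _edge_type in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    kept = {node for node, nbrs in adjacency.items() if len(nbrs) >= min_degree}
    return {node: {n for n in adjacency[node] if n in kept} for node in kept}


# Made-up records standing in for entries from the public databases.
edges = [
    ("D1", "P1", "drug-protein interaction"),
    ("D1", "P2", "drug-protein interaction"),
    ("D2", "P1", "drug-protein interaction"),
    ("P1", "Dis1", "protein-disease association"),
]
network = build_network(edges, min_degree=2)
print(sorted(network))  # prints ['D1', 'P1']
```

An iterative variant that repeats the filter until no node falls below the threshold would be equally consistent with the wording of the step.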
As a further improvement of the present invention, the step 4) is realized by the following steps:
4.1 Dividing all semantic sequences into words, which includes removing special characters and redundant characters and splitting on spaces, and then processing the semantic sequences with the masked language model: mask sequences are randomly selected from all semantic sequences according to the mask ratio k, and a random number rand ∈ [0, 1] is generated for each mask sequence; if rand < p, the sequence is replaced with [MASK], where p ∈ (0, 1) is the probability that a mask sequence is replaced by [MASK]; if p ≤ rand < p + p', a sequence is randomly selected from the semantic sequences to replace the mask sequence, where p' ∈ (0, 1-p) is the probability that a mask sequence is replaced by another sequence; if p + p' ≤ rand < 1, the mask sequence remains unchanged;
4.2 Superimposing the initial token vector and the position vector of each node to obtain
Figure GDA0003919908580000051
inputting this vector into the multi-head attention mechanism for learning to obtain the vector
Figure GDA0003919908580000052
and applying a residual connection and normalization to obtain
Figure GDA0003919908580000053
Next, learning further with a fully connected feed-forward network, which is likewise followed by a residual connection and a normalization operation; finally, the low-dimensional characterization vector of the node is obtained.
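The data flow of step 4.2 corresponds to one standard Transformer encoder layer. The NumPy sketch below shows only the forward pass with random, untrained weights. The formulas in the patent are available only as images, so the scaled dot-product attention, the ReLU feed-forward network and the layer normalization without learned scale and shift are assumptions, and all names are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each row to zero mean and (near) unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def encoder_layer(x, w, n_heads):
    """x: (seq_len, dim) sum of token and position vectors."""
    seq_len, dim = x.shape
    head_dim = dim // n_heads

    def split(m):  # (seq_len, dim) -> (n_heads, seq_len, head_dim)
        return m.reshape(seq_len, n_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(x @ w["q"]), split(x @ w["k"]), split(x @ w["v"])
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    heads = softmax(scores) @ v                      # multi-head attention
    attn = heads.transpose(1, 0, 2).reshape(seq_len, dim) @ w["o"]
    h = layer_norm(x + attn)                         # residual + normalization
    ffn = np.maximum(0.0, h @ w["ff1"]) @ w["ff2"]   # fully connected network
    return layer_norm(h + ffn)                       # residual + normalization


rng = np.random.default_rng(0)
dim, n_heads, vocab_size, max_len = 8, 2, 10, 16
token_emb = rng.normal(size=(vocab_size, dim))  # initial token vectors
pos_emb = rng.normal(size=(max_len, dim))       # position vectors
w = {name: rng.normal(scale=0.1, size=(dim, dim)) for name in "qkvo"}
w["ff1"] = rng.normal(scale=0.1, size=(dim, 4 * dim))
w["ff2"] = rng.normal(scale=0.1, size=(4 * dim, dim))

ids = np.array([3, 7, 1, 7, 4])                 # one sampled sequence
x = token_emb[ids] + pos_emb[: len(ids)]        # superimposed input
out = encoder_layer(x, w, n_heads)              # shape (5, 8)
```

Stacking n such layers and training them with the masked language model objective would produce the node vectors of step 5); a practical implementation would normally rely on a deep learning framework rather than plain NumPy.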
As a further improvement of the present invention, the step 6) is realized by the following steps:
6.1 Counting the number Ninter of disease-protein associations in the network, randomly selecting the same number Ninter of negative samples from the disease-protein association network, mixing the positive samples and the negative samples together, and performing 10-fold cross-validation;
6.2 Reconstructing the heterogeneous network based on the inductive matrix completion model, with the network association information of the test set removed, wherein the specific operation is as follows: by the formula
Figure GDA0003919908580000061
node link prediction is converted into an optimization problem, where r denotes one of the 7 types of network edges, P_r is the adjacency matrix of the single network of type r, Z_r is the low-rank matrix to be solved for that single network, and V_u and V_w are feature vectors of nodes in the single network; the 7 types of network edges are: drug-drug interactions, drug-protein interactions, drug-disease associations, drug-side effect associations, protein-disease associations, drug-drug structural similarities and protein-protein sequence similarities;
6.3 Calculating the disease-target association scores in the test set from the trained low-rank matrix corresponding to the disease-protein association.
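Steps 6.2 and 6.3 can be illustrated for a single edge type. The patent's objective is available only as a formula image, so the sketch below assumes the commonly used inductive matrix completion form, a squared error between the adjacency matrix P and the bilinear reconstruction U Z Wᵀ with a Frobenius penalty on Z, and solves it by plain gradient descent. The rank constraint on Z is omitted, and the function names, the regularization weight and the toy data are hypothetical.

```python
import numpy as np


def masked_loss(P, U, W, Z, observed):
    """Squared reconstruction error over the observed (training) entries."""
    return 0.5 * np.sum((observed * (U @ Z @ W.T - P)) ** 2)


def fit_imc(P, U, W, observed, lam=0.1, steps=500):
    """Minimize masked_loss + 0.5 * lam * ||Z||_F^2 by gradient descent."""
    Z = np.zeros((U.shape[1], W.shape[1]))
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    lr = 1.0 / (np.linalg.norm(U, 2) ** 2 * np.linalg.norm(W, 2) ** 2 + lam)
    for _ in range(steps):
        residual = observed * (U @ Z @ W.T - P)
        Z -= lr * (U.T @ residual @ W + lam * Z)
    return Z


def association_score(Z, u, w):
    """Bilinear score for one (disease, protein) pair of feature vectors."""
    return float(u @ Z @ w)


rng = np.random.default_rng(0)
U = rng.normal(size=(6, 3))            # stand-in disease characterization vectors
W = rng.normal(size=(5, 3))            # stand-in protein characterization vectors
P = (rng.random((6, 5)) < 0.3) * 1.0   # toy disease-protein adjacency matrix
observed = np.ones_like(P)
observed[0, 0] = 0.0                   # hold out one test pair
Z = fit_imc(P, U, W, observed)
score = association_score(Z, U[0], W[0])
```

Because the score depends only on the feature vectors, a pair can be scored even if one of its nodes has no known association in the training network, which is how inductive matrix completion relates to the cold-start problem mentioned above.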
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (2)

1. A biological network link prediction method based on meta-path and bidirectional encoder is characterized by comprising the following steps:
1) Initializing parameters, including: the network sequence length l, the node degree threshold deg, the characterization vector dimension dim, the number of Transformer encoder layers n, the mask sequence ratio k ∈ (0, 1) of the language model, the probability p ∈ (0, 1) that a mask sequence is replaced by the special character [MASK], and the probability p' ∈ (0, 1-p) that a mask sequence is replaced by another sequence in the semantic text;
2) Constructing a drug information network and meta-paths, realized by the following steps:
2.1 Constructing, from the DrugBank, UniProt, HPRD, SIDER, CTD, NDFRT and STRING public databases, a drug information network comprising 4 node types (drugs, targets, diseases and side effects) and 7 edge types, in which nodes with a degree less than deg are deleted, wherein the 7 edge types are drug-drug interactions, drug-protein interactions, drug-disease associations, drug-side effect associations, protein-disease associations, drug-drug structural similarities and protein-protein sequence similarities;
2.2 23 kinds of meta-paths are constructed according to different biological pathways and pharmaceutical mechanisms, and respectively are as follows: drug-protein, drug-protein-drug, drug-protein, drug-protein-disease, drug-protein-drug, drug-protein-disease, drug-protein-drug-protein, drug-protein-drug-disease, drug-protein-drug-side effect, drug-protein-disease-protein, drug-protein-disease-drug, protein-drug, protein-drug-protein, protein-drug-disease, protein-drug-side effect, protein-drug-protein, protein-drug-disease, protein-drug-side effect, protein-drug-protein-disease, protein-drug-disease-protein, protein-drug-side effect-drug;
3) Numbering all nodes in the network as x_i ∈ {x_i | i = 1, 2, ..., num}, where num is the total number of nodes, and, for each node x_i, sampling sequences in turn according to the meta-paths of step 2);
4) Inputting all semantic sequences into a deep bidirectional Transformer encoder for representation learning to obtain a low-dimensional characterization vector of each node, wherein every Transformer layer comprises the same multi-head self-attention mechanism and a fully connected network;
5) Judging whether the maximum number of training iterations has been reached; if so, outputting the characterization vector of each node
Figure FDA0003919908570000011
and going to step 6); otherwise, returning to step 4);
6) Predicting disease-protein associations using the inductive matrix completion method, realized by the following steps:
6.1 Counting the number Ninter of disease-protein associations in the network, randomly selecting the same number Ninter of negative samples from the disease-protein association network, mixing the positive samples and the negative samples together, and performing 10-fold cross-validation;
6.2 Reconstructing the heterogeneous network based on the inductive matrix completion model, with the network association information of the test set removed, wherein the specific operation is as follows: by the formula
Figure FDA0003919908570000012
node link prediction is converted into an optimization problem, where r denotes one of the 7 types of network edges, P_r is the adjacency matrix of the single network of type r, Z_r is the low-rank matrix to be solved for that single network, and V_u and V_w are feature vectors of nodes in the single network; the 7 types of network edges are: drug-drug interactions, drug-protein interactions, drug-disease associations, drug-side effect associations, protein-disease associations, drug-drug structural similarities and protein-protein sequence similarities;
6.3 Calculating the disease-target association scores in the test set from the trained low-rank matrix corresponding to the disease-protein association;
7) Predicting target-drug interactions using the inductive matrix completion method, in the same way as the disease-protein association prediction of step 6);
8) Predicting drug-side effect associations using the inductive matrix completion method, in the same way as the disease-protein association prediction of step 6).
2. The bio-network link prediction method based on meta-path and bi-directional encoder as claimed in claim 1, wherein the step 4) is implemented by the steps of:
4.1 Dividing all semantic sequences into words, which includes removing special characters and redundant characters and splitting on spaces, and then processing the semantic sequences with the masked language model: mask sequences are randomly selected from all semantic sequences according to the mask ratio k, and a random number rand ∈ [0, 1] is generated for each mask sequence; if rand < p, the sequence is replaced with [MASK], where p ∈ (0, 1) is the probability that a mask sequence is replaced by [MASK]; if p ≤ rand < p + p', a sequence is randomly selected from the semantic sequences to replace the mask sequence, where p' ∈ (0, 1-p) is the probability that a mask sequence is replaced by another sequence; if p + p' ≤ rand < 1, the mask sequence remains unchanged;
4.2 Superimposing the initial token vector and the position vector of each node to obtain
Figure FDA0003919908570000021
inputting this vector into the multi-head attention mechanism for learning to obtain the vector
Figure FDA0003919908570000022
and applying a residual connection and normalization to obtain
Figure FDA0003919908570000023
Next, learning further with a fully connected feed-forward network, which is likewise followed by a residual connection and a normalization operation; finally, the low-dimensional characterization vector of the node is obtained.
CN202011226195.3A 2020-11-05 2020-11-05 Biological network link prediction method based on meta-path and bidirectional encoder Active CN112308326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011226195.3A CN112308326B (en) 2020-11-05 2020-11-05 Biological network link prediction method based on meta-path and bidirectional encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011226195.3A CN112308326B (en) 2020-11-05 2020-11-05 Biological network link prediction method based on meta-path and bidirectional encoder

Publications (2)

Publication Number Publication Date
CN112308326A (en) 2021-02-02
CN112308326B (en) 2022-12-13

Family

ID=74326187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011226195.3A Active CN112308326B (en) 2020-11-05 2020-11-05 Biological network link prediction method based on meta-path and bidirectional encoder

Country Status (1)

Country Link
CN (1) CN112308326B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113327644A (en) * 2021-04-09 2021-08-31 中山大学 Medicine-target interaction prediction method based on deep embedding learning of graph and sequence
CN113160894B (en) * 2021-04-23 2023-10-24 平安科技(深圳)有限公司 Method, device, equipment and storage medium for predicting interaction between medicine and target
CN113223655B (en) * 2021-05-07 2023-05-12 西安电子科技大学 Drug-disease association prediction method based on variation self-encoder
CN113611356B (en) * 2021-07-29 2023-04-07 湖南大学 Drug relocation prediction method based on self-supervision graph representation learning
CN116504331A (en) * 2023-04-28 2023-07-28 东北林业大学 Frequency score prediction method for drug side effects based on multiple modes and multiple tasks

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7998708B2 (en) * 2006-03-24 2011-08-16 Handylab, Inc. Microfluidic system for amplifying and detecting polynucleotides in parallel
CN102298674B (en) * 2010-06-25 2014-03-26 清华大学 Method for determining medicament target and/or medicament function based on protein network
US20170281784A1 (en) * 2016-04-05 2017-10-05 Arvinas, Inc. Protein-protein interaction inducing technology
CN109783618B (en) * 2018-12-11 2021-01-19 北京大学 Attention mechanism neural network-based drug entity relationship extraction method and system
CN111667884B (en) * 2020-06-12 2022-09-09 天津大学 Convolutional neural network model for predicting protein interactions using protein primary sequences based on attention mechanism
CN111785320B (en) * 2020-06-28 2024-02-06 西安电子科技大学 Drug target interaction prediction method based on multi-layer network representation learning
CN111814460B (en) * 2020-07-06 2021-02-09 四川大学 External knowledge-based drug interaction relation extraction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant