CN114822718B - Human oral bioavailability prediction method based on graph neural network - Google Patents

Human oral bioavailability prediction method based on graph neural network

Info

Publication number
CN114822718B
Authority
CN
China
Prior art keywords
neural network
atomic
information
atoms
chemical bond
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210306054.5A
Other languages
Chinese (zh)
Other versions
CN114822718A (en)
Inventor
杨云
于明浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU
Priority to CN202210306054.5A
Publication of CN114822718A
Application granted
Publication of CN114822718B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50: Molecular design, e.g. of drugs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70: Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a human oral bioavailability prediction method based on a graph neural network, in the technical field of molecular chemistry prediction, comprising an initial atom and chemical bond feature extraction module and a graph neural network module. Because the graph neural network operates on a molecular graph, molecular structure information is first converted into a molecular graph: initial features of atoms and chemical bonds are defined for the graph neural network, and an atomic adjacency matrix representing the topology of the molecule is constructed from the atomic structure information. The graph neural network module comprises two steps, message passing and readout, where message passing is performed multiple times to generate hidden representations of the atoms and chemical bonds. Using a graph neural network avoids the extraction of molecular descriptors and reduces workload, and the chemical bond message absorption mechanism lets the chemical bonds help the model learn a better molecular representation, improving the interpretability of the graph neural network.

Description

Human oral bioavailability prediction method based on graph neural network
Technical Field
The invention relates to the technical field of molecular chemistry prediction, in particular to a human oral bioavailability prediction method based on a graph neural network.
Background
Human oral bioavailability is one of the most important pharmacokinetic properties in oral drug development. Eliminating candidate drugs with low human oral bioavailability early in discovery and development reduces wasted resources. At present, the human oral bioavailability of a candidate drug is usually predicted by combining molecular descriptors, either computed by a specific method or defined by experts, with a machine learning algorithm. Predefined molecular descriptors increase the workload, and because they are based on past drug development experience, they bring no new insight to the development of new drugs and carry an unavoidable experience bias. With the development of deep learning, graph neural networks have been widely applied to molecular property prediction. A graph neural network does not require molecular descriptors to be extracted: given simple atom and chemical bond features, it learns the hidden representation of a molecule automatically and uses it to predict molecular properties. Constructing a human oral bioavailability prediction model with a graph neural network therefore assists new drug development and promotes the application of artificial intelligence in drug discovery.
Because human oral bioavailability prediction has high theoretical and practical value, and can significantly reduce the resources wasted on candidate drugs whose human oral bioavailability is too low, researchers at home and abroad have continued to propose new methods for predicting this property. Falcón-Cano et al. [1] integrated multiple machine learning models and extracted a variety of 0D-2D molecular descriptors to construct a human oral bioavailability prediction model; their experimental results show that the method has a certain advantage in prediction accuracy, and it is representative of the traditional prediction methods. Predicting human oral bioavailability with a graph neural network belongs to the field of molecular property prediction. Gilmer et al. [3] proposed a message passing neural network model, in which the convolution operation of the graph neural network is built on message passing between atoms, and which greatly outperformed traditional methods in quantum chemical property prediction.
The prior art has the following defects:
(1) Human oral bioavailability prediction model
Existing models for predicting human oral bioavailability use molecular descriptors as the molecular representation. These descriptors fall into two groups: predefined descriptors and descriptors based on a specific calculation method. Predefined molecular descriptors are developed by pharmacologists from past drug development experience. The compounds synthesized so far occupy only a small part of chemical space, so relying on past experience unavoidably leads to experience bias and misjudgment. For descriptors based on a specific calculation method, researchers often do not know how relevant a descriptor is to the task, which limits prediction performance for a particular property such as human oral bioavailability. Using a graph neural network to automatically extract a molecular representation that is highly correlated with human oral bioavailability may help predict this property more accurately.
(2) Molecular property prediction model based on graph neural network
At present, the forward propagation of graph neural networks that predict molecular properties does not take into account the intrinsic nature of a chemical bond, which represents the electron cloud around a pair of atoms. When the state of the atoms changes, the state of the chemical bond should change as well. However, most models do not update the chemical bonds during message passing, and in those that do, the interaction between atoms and chemical bonds is insufficient. Improving this interaction and updating the chemical bonds in a manner consistent with chemical knowledge may help improve the performance of graph neural networks in molecular property prediction.
Based on the above, the present invention designs a human oral bioavailability prediction method based on a graph neural network to solve the above problems.
Disclosure of Invention
The invention aims to provide a human oral bioavailability prediction method based on a graph neural network, so as to solve the problems in the background technology.
In order to achieve the above purpose, the present invention provides the following technical solution: a human oral bioavailability prediction method based on a graph neural network, comprising an initial atom and chemical bond feature extraction module and a graph neural network module;
the initial atom and chemical bond feature extraction module is used for converting molecular structure information into a molecular graph, defining initial features of atoms and chemical bonds for the graph neural network, and constructing, from the atomic structure information, an atomic adjacency matrix that represents the topology of the molecule;
the forward propagation of the graph neural network comprises two steps, message passing and readout, wherein message passing is performed multiple times to generate hidden representations of the atoms and chemical bonds, the readout operation turns these hidden representations into a hidden representation of the molecule, and a fully connected network then makes a prediction from the molecular hidden representation to obtain the prediction result;
S1: message passing, which comprises three stages: atomic message passing, chemical bond message absorption, and scaled self-attention;
in the atomic message passing stage, each atom in the molecular graph absorbs the information of the atoms and chemical bonds connected to it, according to the following formula (formula not reproduced):
wherein the weight matrices in the formula are learnable matrices; $d_t$ and $c_t$ are the dimensions of the atomic state vector and the chemical bond state vector, respectively, in the t-th update; $d_{t+1}$ is the dimension of the atomic state vector in the (t+1)-th update; $\sigma(\cdot)$ is the ReLU nonlinear activation function; this process updates the information of the central atom i using the information of its neighboring atoms and of the chemical bonds connected to it;
in the chemical bond message absorption stage, each chemical bond absorbs the information of the two atoms it connects in order to update its own information, according to the following formula (formula not reproduced):
wherein the weight matrices in the formula are learnable matrices, and the concatenation term splices the state vectors of the two atoms connected by the chemical bond $e_{ij}$;
through atomic message passing and chemical bond message absorption, the information of each atom flows to the atoms and chemical bonds connected to it, and each chemical bond absorbs the information of the surrounding atoms; after multiple updates, molecular information has flowed through all atoms and chemical bonds, so that every atom and chemical bond carries the topological information of its neighborhood;
in the scaled self-attention stage, the model focuses on the atom and chemical bond features, according to the following formula (formula not reproduced):
wherein $V_{t+1}$ and $E_{t+1}$ are the state matrices of the atoms and chemical bonds after atomic message passing and chemical bond message absorption have been completed in the t-th update; the weight matrices in the formula are learnable matrices; the product in the formula is the Hadamard product of matrices; $W_{va1}$ embeds the information in the atomic state matrix into a high-dimensional space; after activation, $W_{va2}$ extracts the information, and the SoftMax(·) function converts the values into attention weights, giving the atomic attention weight vector; if the attention weight vector were combined directly with the atomic state matrix by Hadamard product, the values of all features, even the important ones, would be reduced, and the size of the reduction depends on $d_{t+1}$: the larger $d_{t+1}$ is, the more the feature values shrink; the attention weight vector is therefore amplified $d_{t+1}$ times, so that its average value becomes 1 and no longer depends on the feature vector length $d_{t+1}$, which makes the model easier to train;
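The scaling argument above can be checked with a short worked derivation. This is a sketch that assumes the SoftMax is taken over the $d_{t+1}$ feature entries; the symbols $s$, $a$ and $\tilde{a}$ are introduced here for illustration only and are not taken from the original formulas.

```latex
\begin{aligned}
a &= \mathrm{SoftMax}(s), \qquad
a_k = \frac{e^{s_k}}{\sum_{m=1}^{d_{t+1}} e^{s_m}}
\;\Rightarrow\; \sum_{k=1}^{d_{t+1}} a_k = 1
\;\Rightarrow\; \frac{1}{d_{t+1}} \sum_{k=1}^{d_{t+1}} a_k = \frac{1}{d_{t+1}}, \\
\tilde{a} &= d_{t+1}\, a
\;\Rightarrow\; \frac{1}{d_{t+1}} \sum_{k=1}^{d_{t+1}} \tilde{a}_k = 1 .
\end{aligned}
```

For example, with $d_{t+1} = 64$ and uniform weights, every unscaled weight is $1/64$, so the Hadamard product would shrink each feature 64-fold; after scaling, every weight is 1 and the features pass through unchanged.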
s2: readout, in the readout phase, atoms and chemical bonds are processed simultaneously using multiple readout functions to obtain a better molecular hidden representation, according to:
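$$v_{all} = \mathrm{Set2Set}(V_T) \,\|\, \mathrm{Mean}(V_T) \,\|\, \mathrm{Max}(V_T) \tag{8}$$
$$e_{all} = \mathrm{Set2Set}(E_T) \,\|\, \mathrm{Mean}(E_T) \,\|\, \mathrm{Max}(E_T) \tag{9}$$
$$z = v_{all} \,\|\, e_{all} \tag{10}$$
wherein Mean(·) and Max(·) are global average pooling and global max pooling, respectively, and $\|$ denotes concatenation.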
Preferably, the extracted initial atom features include atom type, atomic number, aromaticity, and hybridization mode, which serve as the atom representation; the extracted initial chemical bond features include bond type, whether the bond is covalent, and stereoisomerism type, which serve as the chemical bond representation.
Preferably, in S1, two embedding matrices embed the information of atoms and of chemical bonds, respectively, into a hidden space of dimension h; a dimension-reduction matrix converts the information in the hidden space into the dimension required by the next graph neural network layer; and a further matrix collects the information of atom i itself.
Preferably, in S1, an embedding matrix embeds the information of the two atoms into a hidden space of dimension h; a second matrix collects the information of the chemical bond $e_{ij}$ itself and likewise embeds it into the hidden space; and a dimension-reduction matrix converts the information in the hidden space into the dimension required by the chemical bonds in the next graph neural network layer.
Preferably, in S1, the average value of the attention weight vector before amplification is $1/d_{t+1}$.
Preferably, in S1, the chemical bond state matrix is processed in the same way as the atomic state matrix.
Preferably, in S2, the results of the multiple readout functions are concatenated, so that the resulting overall atom representation $v_{all}$ and overall chemical bond representation $e_{all}$ better represent the overall state.
Preferably, in S2, $v_{all}$ and $e_{all}$ are concatenated to obtain the hidden representation z of the molecule, and a fully connected layer f(·) makes the prediction to obtain the prediction result.
Compared with the prior art, the invention has the beneficial effects that:
1. the method proposes chemical bond message absorption, so that the graph neural network can adaptively fuse features from the important layers according to the molecular structure information, filter out noise, and improve its molecular representation capability; it proposes a scaled self-attention mechanism, so that the model can attend to the features strongly related to human oral bioavailability while preventing those features from being shrunk too much, further improving the molecular representation capability; the method is highly interpretable, can identify molecular substructures highly related to human oral bioavailability, and offers new drug research and development insights from an artificial intelligence perspective beyond the human one;
2. using a graph neural network avoids the extraction of molecular descriptors and reduces workload, and the chemical bond message absorption mechanism lets the chemical bonds help the model learn a better molecular representation, improving the interpretability of the graph neural network.
Of course, it is not necessary for any one product to practice the invention to achieve all of the advantages set forth above at the same time.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the neural network module of the present invention;
FIG. 3 is a schematic diagram of atomic message passing and chemical bond message absorption in accordance with the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to FIGS. 1 to 3, the present invention provides the following technical solution: a human oral bioavailability prediction method based on a graph neural network, comprising an initial atom and chemical bond feature extraction module and a graph neural network module;
the initial atom and chemical bond feature extraction module is used for converting molecular structure information into a molecular graph and defining initial features of atoms and chemical bonds for the graph neural network: the extracted initial atom features include atom type, atomic number, aromaticity, and hybridization mode, which serve as the atom representation; the extracted initial chemical bond features include bond type, whether the bond is covalent, and stereoisomerism type, which serve as the chemical bond representation; an atomic adjacency matrix representing the topology of the molecule is constructed from the atomic structure information;
the forward propagation of the graph neural network comprises two steps, message passing and readout, wherein message passing is performed multiple times to generate hidden representations of the atoms and chemical bonds, the readout operation turns these hidden representations into a hidden representation of the molecule, and a fully connected network then makes a prediction from the molecular hidden representation to obtain the prediction result;
S1: message passing, which comprises three stages: atomic message passing, chemical bond message absorption, and scaled self-attention;
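As an illustration of the feature extraction step above, the following minimal sketch builds a molecular graph for a toy molecule in plain Python. The vocabularies `ATOM_TYPES` and `BOND_TYPES`, the helper names, and the reduced feature set (atom type and aromaticity; bond type only) are assumptions made for illustration, not the encoding used by the invention.

```python
# Assumed, deliberately tiny vocabularies (illustration only).
ATOM_TYPES = ["C", "N", "O"]
BOND_TYPES = ["single", "double", "triple", "aromatic"]


def one_hot(value, vocabulary):
    """Return a one-hot list marking the position of value in vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def build_molecular_graph(atoms, bonds):
    """Convert a molecule into initial features plus an adjacency matrix.

    atoms: list of (symbol, is_aromatic) tuples, one per atom.
    bonds: list of (i, j, bond_type) tuples with atom indices i and j.
    Returns (atom_features, bond_features, adjacency).
    """
    n = len(atoms)
    atom_features = [
        one_hot(symbol, ATOM_TYPES) + [1.0 if aromatic else 0.0]
        for symbol, aromatic in atoms
    ]
    adjacency = [[0] * n for _ in range(n)]
    bond_features = {}
    for i, j, bond_type in bonds:
        # Bonds are undirected, so the adjacency matrix is symmetric.
        adjacency[i][j] = adjacency[j][i] = 1
        bond_features[(min(i, j), max(i, j))] = one_hot(bond_type, BOND_TYPES)
    return atom_features, bond_features, adjacency


# Heavy atoms of ethanol (C-C-O), hydrogens omitted.
atoms = [("C", False), ("C", False), ("O", False)]
bonds = [(0, 1, "single"), (1, 2, "single")]
atom_features, bond_features, adjacency = build_molecular_graph(atoms, bonds)
```

In practice a cheminformatics toolkit would supply the atoms and bonds from a structure file or SMILES string; the hand-written lists here only keep the sketch self-contained.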
in the atomic message passing stage, each atom in the molecular graph absorbs the information of the atoms and chemical bonds connected to it, according to the following formula (formula not reproduced):
wherein the weight matrices in the formula are learnable matrices; $d_t$ and $c_t$ are the dimensions of the atomic state vector and the chemical bond state vector, respectively, in the t-th update; $d_{t+1}$ is the dimension of the atomic state vector in the (t+1)-th update; $\sigma(\cdot)$ is the ReLU nonlinear activation function; two embedding matrices embed the information of atoms and of chemical bonds, respectively, into a hidden space of dimension h; a dimension-reduction matrix converts the information in the hidden space into the dimension required by the next graph neural network layer; a further matrix collects the information of atom i itself; this process updates the information of the central atom i using the information of its neighboring atoms and of the chemical bonds connected to it; FIG. 3(a) shows the atomic message passing process;
in the chemical bond message absorption stage, each chemical bond absorbs the information of the two atoms it connects in order to update its own information, according to the following formula (formula not reproduced):
wherein the weight matrices in the formula are learnable matrices; the concatenation term splices the state vectors of the two atoms connected by the chemical bond $e_{ij}$; an embedding matrix embeds the information of the two atoms into a hidden space of dimension h; a second matrix collects the information of the chemical bond $e_{ij}$ itself and likewise embeds it into the hidden space; a dimension-reduction matrix converts the information in the hidden space into the dimension required by the chemical bonds in the next graph neural network layer; FIG. 3(b) shows the chemical bond message absorption process;
through atomic message passing and chemical bond message absorption, the information of each atom flows to the atoms and chemical bonds connected to it, and each chemical bond absorbs the information of the surrounding atoms; after multiple updates, molecular information has flowed through all atoms and chemical bonds, so that every atom and chemical bond carries the topological information of its neighborhood;
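The two update stages described above can be sketched as follows. Because the formulas themselves are not reproduced in this text, the exact functional form below is an assumption inferred from the prose: neighbors and bonds are embedded into a hidden space of dimension h, summed, reduced, and combined with the atom's own state, and bonds are updated from the spliced states of their two atoms. All matrix names (`W_atom`, `W_bond`, `W_self`, `W_down`, `W_pair`, `W_own`), the random weights, and the choice to feed the already updated atom states into the bond update are hypothetical.

```python
import random


def relu(vec):
    """Element-wise ReLU, the activation sigma named in the text."""
    return [max(0.0, x) for x in vec]


def matvec(matrix, vec):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def rand_matrix(rows, cols, rng):
    """Random stand-in for a learnable matrix."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def atom_message_passing(v, e, neighbors, W_atom, W_bond, W_self, W_down):
    """Assumed atom update: embed each neighbor atom and its bond into the
    hidden space, sum over neighbors, reduce, and add the atom's own state."""
    new_v = []
    for i, v_i in enumerate(v):
        hidden = [0.0] * len(W_atom)
        for j in neighbors[i]:
            e_ij = e[(min(i, j), max(i, j))]
            message = relu(add(matvec(W_atom, v[j]), matvec(W_bond, e_ij)))
            hidden = add(hidden, message)
        new_v.append(relu(add(matvec(W_self, v_i), matvec(W_down, hidden))))
    return new_v


def bond_message_absorption(v, e, W_pair, W_own, W_down):
    """Assumed bond update: splice the two atom states, embed them together
    with the bond's own state into the hidden space, then reduce."""
    new_e = {}
    for (i, j), e_ij in e.items():
        pair = v[i] + v[j]  # list concatenation of the two atom state vectors
        hidden = relu(add(matvec(W_pair, pair), matvec(W_own, e_ij)))
        new_e[(i, j)] = relu(matvec(W_down, hidden))
    return new_e


# Toy path graph 0-1-2 with d_t = d_{t+1} = 3, c_t = c_{t+1} = 2, h = 4.
rng = random.Random(0)
d, c, h = 3, 2, 4
v = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
e = {(0, 1): [1.0, 0.0], (1, 2): [1.0, 0.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}

v_next = atom_message_passing(
    v, e, neighbors,
    rand_matrix(h, d, rng), rand_matrix(h, c, rng),
    rand_matrix(d, d, rng), rand_matrix(d, h, rng))
e_next = bond_message_absorption(
    v_next, e,
    rand_matrix(h, 2 * d, rng), rand_matrix(h, c, rng), rand_matrix(c, h, rng))
```

Repeating the two calls several times lets information travel further along the graph, which matches the description that molecular information flows through all atoms and chemical bonds after multiple updates.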
in the scaled self-attention stage, the model focuses on the atom and chemical bond features, according to the following formula (formula not reproduced):
wherein $V_{t+1}$ and $E_{t+1}$ are the state matrices of the atoms and chemical bonds after atomic message passing and chemical bond message absorption have been completed in the t-th update; the weight matrices in the formula are learnable matrices; the product in the formula is the Hadamard product of matrices; $W_{va1}$ embeds the information in the atomic state matrix into a high-dimensional space; after activation, $W_{va2}$ extracts the information, and the SoftMax(·) function converts the values into attention weights, giving the atomic attention weight vector, whose average value at this point is $1/d_{t+1}$; if the attention weight vector were combined directly with the atomic state matrix by Hadamard product, the values of all features, even the important ones, would be reduced, and the size of the reduction depends on $d_{t+1}$: the larger $d_{t+1}$ is, the more the feature values shrink; the attention weight vector is therefore amplified $d_{t+1}$ times, so that its average value becomes 1 and no longer depends on the feature vector length $d_{t+1}$; this prevents the feature values from shrinking too much when attention is applied and makes the model easier to train; the chemical bond state matrix is processed in the same way;
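A minimal sketch of the scaled self-attention stage for a single atom state vector is given below. It assumes the SoftMax is taken over the $d_{t+1}$ feature entries of each atom; the hidden width `k`, the random weights, and the function names are hypothetical, and only `W_va1` and `W_va2` are names taken from the text.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax; the outputs sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def scaled_self_attention(x, W_va1, W_va2):
    """Return (attended features, raw weights, scaled weights) for one atom."""
    d = len(x)
    hidden = [max(0.0, s) for s in matvec(W_va1, x)]  # embed, then ReLU
    weights = softmax(matvec(W_va2, hidden))          # mean is 1/d
    scaled = [d * w for w in weights]                 # mean is 1
    attended = [a * xi for a, xi in zip(scaled, x)]   # Hadamard product
    return attended, weights, scaled


rng = random.Random(0)
d, k = 4, 8  # feature length and assumed hidden width
W_va1 = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
W_va2 = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(d)]
x = [0.5, 1.0, 2.0, 4.0]
attended, weights, scaled = scaled_self_attention(x, W_va1, W_va2)
```

Without the factor `d`, the weights average `1/d`, so a longer feature vector would be damped more strongly; with it, the weights average 1 regardless of `d`.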
s2: readout, in the readout phase, atoms and chemical bonds are processed simultaneously using multiple readout functions to obtain a better molecular hidden representation, according to:
$$v_{all} = \mathrm{Set2Set}(V_T) \,\|\, \mathrm{Mean}(V_T) \,\|\, \mathrm{Max}(V_T) \tag{8}$$
$$e_{all} = \mathrm{Set2Set}(E_T) \,\|\, \mathrm{Mean}(E_T) \,\|\, \mathrm{Max}(E_T) \tag{9}$$
$$z = v_{all} \,\|\, e_{all} \tag{10}$$
wherein Mean(·) and Max(·) are global average pooling and global max pooling, respectively, and $\|$ denotes concatenation; the results of the multiple readout functions are concatenated, so that the resulting overall atom representation $v_{all}$ and overall chemical bond representation $e_{all}$ better represent the overall state; $v_{all}$ and $e_{all}$ are then concatenated to obtain the hidden representation z of the molecule, and a fully connected layer f(·) makes the prediction to obtain the prediction result.
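The readout and prediction steps can be sketched in a few lines. This simplified version concatenates only the Mean and Max pooling results; the Set2Set term of equations (8) and (9) is deliberately left out, because it is a learned recurrent aggregator that does not fit a short example. The function names, the toy state matrices, and the weights of the linear layer standing in for f(·) are hypothetical.

```python
def mean_pool(rows):
    """Global average pooling over the rows (atoms or bonds)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def max_pool(rows):
    """Global max pooling over the rows (atoms or bonds)."""
    return [max(col) for col in zip(*rows)]


def readout(rows):
    """Concatenate Mean and Max pooling (Set2Set omitted in this sketch)."""
    return mean_pool(rows) + max_pool(rows)


def fully_connected(z, weights, bias):
    """A single linear output unit standing in for the layer f."""
    return sum(w * x for w, x in zip(weights, z)) + bias


# Toy final state matrices: 3 atoms and 2 bonds with 2 features each.
V_T = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
E_T = [[0.5, 1.5], [1.5, 0.5]]

v_all = readout(V_T)
e_all = readout(E_T)
z = v_all + e_all  # concatenation, as in equation (10)
prediction = fully_connected(z, [0.1] * len(z), 0.0)
```

Combining several pooling functions gives the final layer both the average and the extreme feature values of the molecule, which is the motivation the text gives for using multiple readout functions.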
The method proposes chemical bond message absorption, so that the graph neural network can adaptively fuse features from the important layers according to the molecular structure information, filter out noise, and improve its molecular representation capability. It proposes a scaled self-attention mechanism, so that the model can attend to the features strongly related to human oral bioavailability while preventing those features from being shrunk too much, further improving the molecular representation capability. The method is highly interpretable, can identify molecular substructures highly related to human oral bioavailability, and offers new drug research and development insights from an artificial intelligence perspective beyond the human one. Using a graph neural network avoids the extraction of molecular descriptors and reduces workload, and the chemical bond message absorption mechanism lets the chemical bonds help the model learn a better molecular representation, improving the interpretability of the graph neural network.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.

Claims (8)

1. A human oral bioavailability prediction method based on a graph neural network, characterized by comprising an initial atom and chemical bond feature extraction module and a graph neural network module;
the initial atom and chemical bond feature extraction module is used for converting molecular structure information into a molecular graph, defining initial features of atoms and chemical bonds for the graph neural network, and constructing, from the atomic structure information, an atomic adjacency matrix that represents the topology of the molecule;
the forward propagation of the graph neural network comprises two steps, message passing and readout, wherein message passing is performed multiple times to generate hidden representations of the atoms and chemical bonds, the readout operation turns these hidden representations into a hidden representation of the molecule, and a fully connected network then makes a prediction from the molecular hidden representation to obtain the prediction result;
S1: message passing, which comprises three stages: atomic message passing, chemical bond message absorption, and scaled self-attention;
in the atomic message passing stage, each atom in the molecular graph absorbs the information of the atoms and chemical bonds connected to it, according to the following formula (formula not reproduced):
wherein the weight matrices in the formula are learnable matrices; $d_t$ and $c_t$ are the dimensions of the atomic state vector and the chemical bond state vector, respectively, in the t-th update; $d_{t+1}$ is the dimension of the atomic state vector in the (t+1)-th update; $\sigma(\cdot)$ is the ReLU nonlinear activation function; this process updates the information of the central atom i using the information of its neighboring atoms and of the chemical bonds connected to it;
in the chemical bond message absorption stage, each chemical bond absorbs the information of the two atoms it connects in order to update its own information, according to the following formula (formula not reproduced):
wherein the weight matrices in the formula are learnable matrices, and the concatenation term splices the state vectors of the two atoms connected by the chemical bond $e_{ij}$;
through atomic message passing and chemical bond message absorption, the information of each atom flows to the atoms and chemical bonds connected to it, and each chemical bond absorbs the information of the surrounding atoms; after multiple updates, molecular information has flowed through all atoms and chemical bonds, so that every atom and chemical bond carries the topological information of its neighborhood;
in the scaled self-attention stage, the model focuses on the atom and chemical bond features, according to the following formula (formula not reproduced):
wherein $V_{t+1}$ and $E_{t+1}$ are the state matrices of the atoms and chemical bonds after atomic message passing and chemical bond message absorption have been completed in the t-th update; the weight matrices in the formula are learnable matrices; the product in the formula is the Hadamard product of matrices; $W_{va1}$ embeds the information in the atomic state matrix into a high-dimensional space; after activation, $W_{va2}$ extracts the information, and the SoftMax(·) function converts the values into attention weights, giving the atomic attention weight vector; if the attention weight vector were combined directly with the atomic state matrix by Hadamard product, the values of all features, even the important ones, would be reduced, and the size of the reduction depends on $d_{t+1}$: the larger $d_{t+1}$ is, the more the feature values shrink; the attention weight vector is therefore amplified $d_{t+1}$ times, so that its average value becomes 1 and no longer depends on the feature vector length $d_{t+1}$, which makes the model easier to train;
S2: readout; in the readout phase, the atoms and the chemical bonds are each processed with multiple readout functions to obtain a better hidden representation of the molecule, according to:
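$$v_{all} = \mathrm{Set2Set}(V_T) \,\Vert\, \mathrm{Mean}(V_T) \,\Vert\, \mathrm{Max}(V_T) \tag{8}$$
$$e_{all} = \mathrm{Set2Set}(E_T) \,\Vert\, \mathrm{Mean}(E_T) \,\Vert\, \mathrm{Max}(E_T) \tag{9}$$
$$z = v_{all} \,\Vert\, e_{all} \tag{10}$$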
wherein Mean(·) and Max(·) are global average pooling and global max pooling, respectively.
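The scaled self-attention step of S1 can be illustrated with a minimal sketch. The exact formula is not reproduced in the text above, so the sketch assumes a per-atom form: the state vector is embedded by `W_va1`, activated (tanh is an assumption), projected back by `W_va2`, passed through softmax, and the resulting weights are multiplied by the feature length d before the Hadamard product. The function names, matrix sizes, and random initial values are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_feature_attention(v, W_va1, W_va2):
    """Return (scaled attention weights, re-weighted state vector) for one atom."""
    hidden = [math.tanh(h) for h in matvec(W_va1, v)]  # embed, then activate (tanh assumed)
    weights = softmax(matvec(W_va2, hidden))           # one weight per feature, sums to 1
    d = len(v)
    scaled = [d * w for w in weights]                  # amplify by d so the mean weight is 1
    return scaled, [s * x for s, x in zip(scaled, v)]  # Hadamard product with the state


rng = random.Random(0)
d, k = 4, 6  # feature length and hidden width (hypothetical sizes)
W_va1 = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
W_va2 = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(d)]
atom_state = [0.5, -1.0, 2.0, 0.1]

scaled_weights, new_state = scaled_feature_attention(atom_state, W_va1, W_va2)
mean_weight = sum(scaled_weights) / len(scaled_weights)
```

Because softmax over d features sums to 1, the unscaled weights average 1/d; multiplying by d brings the average back to 1 regardless of d, which is the property the claim relies on.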
2. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: the extracted initial atom features include atom type, atomic number, aromaticity, and hybridization mode, which together serve as the atom representation; and the extracted initial chemical bond features include bond type, whether the bond is covalent, and stereoisomer type, which together serve as the chemical bond representation.
3. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: in S1, the embedding matrices are used to embed the information of atoms and chemical bonds, respectively, into a hidden space of dimension h; the dimension-reduction matrix transforms the information in the hidden space into the dimension required by the next layer of the graph neural network; and a further matrix collects the information of atom i itself.
4. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: in S1, an embedding matrix is used to embed the information of the two atoms into a hidden space of dimension h; a further matrix collects the information of the chemical bond $e_{ij}$ itself and likewise embeds it into the hidden space; and the dimension-reduction matrix converts the information in the hidden space into the dimension required by the chemical bonds in the next layer of the neural network.
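The roles of the three matrices in claim 4 can be sketched as follows. The update formula itself is not reproduced in the text, so this is only one plausible reading: the concatenated atom states and the bond state are each embedded into the hidden space, summed, activated (ReLU is an assumption), and then reduced to the bond dimension of the next layer. The names `W_atoms`, `W_bond`, `W_down` and all sizes are hypothetical.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(xs):
    """Element-wise ReLU (activation choice is an assumption)."""
    return [max(0.0, x) for x in xs]


def absorb_bond_message(v_i, v_j, e_ij, W_atoms, W_bond, W_down):
    """Update bond e_ij from the states of the two atoms it connects."""
    pair = v_i + v_j                          # concatenate the two atom state vectors
    from_atoms = matvec(W_atoms, pair)        # embed atom-pair information, dimension h
    from_bond = matvec(W_bond, e_ij)          # embed the bond's own information, dimension h
    hidden = relu([a + b for a, b in zip(from_atoms, from_bond)])
    return matvec(W_down, hidden)             # reduce to the next layer's bond dimension


rng = random.Random(1)
atom_dim, bond_dim, h, next_bond_dim = 3, 2, 5, 4  # hypothetical sizes


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


W_atoms = rand_matrix(h, 2 * atom_dim)
W_bond = rand_matrix(h, bond_dim)
W_down = rand_matrix(next_bond_dim, h)

e_next = absorb_bond_message([0.2, -0.4, 1.0], [0.7, 0.1, -0.3], [1.0, 0.0],
                             W_atoms, W_bond, W_down)
e_zero = absorb_bond_message([0.0] * 3, [0.0] * 3, [0.0] * 2, W_atoms, W_bond, W_down)
```

Note that plain concatenation makes the update depend on the order of the two atoms; whether the original method symmetrizes this is not stated in the text.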
5. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: in S1, the average value of the attention weight vectors is
6. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: in S1, the chemical bond state matrix is processed in the same manner as the atom state matrix.
7. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: in S2, the results obtained by the multiple readout functions are concatenated to obtain the overall atom representation $v_{all}$ and the overall chemical bond representation $e_{all}$, which better represent the overall state of the atoms and chemical bonds.
8. The method for predicting human oral bioavailability based on the graph neural network of claim 1, wherein: in S2, $v_{all}$ and $e_{all}$ are concatenated to obtain the hidden representation z of the molecule, and a fully connected layer f(·) is applied to z to obtain the prediction result.
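The readout and prediction steps of S2 (formulas (8) to (10) and claim 8) can be sketched roughly as below. Set2Set is a learned, recurrent readout and is left out here for brevity, so this sketch concatenates only the Mean and Max pooling results; the single linear layer standing in for f(·), its weights, and the toy state matrices are all hypothetical.

```python
def mean_pool(M):
    """Global average pooling over the rows (atoms or bonds) of state matrix M."""
    return [sum(col) / len(M) for col in zip(*M)]


def max_pool(M):
    """Global max pooling over the rows of state matrix M."""
    return [max(col) for col in zip(*M)]


def readout(M):
    """Concatenate pooled views of M (Set2Set omitted in this sketch)."""
    return mean_pool(M) + max_pool(M)


def predict(V, E, weights, bias):
    """Build z = v_all || e_all and apply one linear layer standing in for f."""
    z = readout(V) + readout(E)
    return z, sum(w * x for w, x in zip(weights, z)) + bias


V_T = [[1.0, 2.0], [3.0, 0.0]]   # two atoms, two features each (toy values)
E_T = [[0.5, 1.5, -1.0]]         # one bond, three features (toy values)
weights = [0.1] * 10             # hypothetical layer weights, one per entry of z
z, y = predict(V_T, E_T, weights, bias=0.0)
```

With these toy inputs, `z` has 2 × 2 atom entries followed by 2 × 3 bond entries; in a full model the Set2Set output would be concatenated in front of each pooled pair, and f(·) would be trained on labelled bioavailability data.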
CN202210306054.5A 2022-03-25 2022-03-25 Human oral bioavailability prediction method based on graph neural network Active CN114822718B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210306054.5A CN114822718B (en) 2022-03-25 2022-03-25 Human oral bioavailability prediction method based on graph neural network

Publications (2)

Publication Number Publication Date
CN114822718A (en) 2022-07-29
CN114822718B (en) 2024-04-09

Family

ID=82531176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210306054.5A Active CN114822718B (en) 2022-03-25 2022-03-25 Human oral bioavailability prediction method based on graph neural network

Country Status (1)

Country Link
CN (1) CN114822718B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115966266B (en) * 2023-01-06 2023-11-17 东南大学 Anti-tumor molecule strengthening method based on graph neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113140267A (en) * 2021-03-25 2021-07-20 北京化工大学 Directional molecule generation method based on graph neural network
CN113241128A (en) * 2021-04-29 2021-08-10 天津大学 Molecular property prediction method based on molecular space position coding attention neural network model
CN113299354A (en) * 2021-05-14 2021-08-24 中山大学 Small molecule representation learning method based on Transformer and enhanced interactive MPNN neural network
WO2022022173A1 (en) * 2020-07-30 2022-02-03 腾讯科技(深圳)有限公司 Drug molecular property determining method and device, and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Information cascade prediction model based on hierarchical attention; 张志扬; 张凤荔; 陈学勤; 王瑞锦; Computer Science (计算机科学); 2020-06-15 (06); full text *

Also Published As

Publication number Publication date
CN114822718A (en) 2022-07-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant