CN113707236A - Method, device and equipment for predicting properties of small drug molecules based on graph neural network

Method, device and equipment for predicting properties of small drug molecules based on graph neural network

Info

Publication number
CN113707236A
Authority
CN
China
Prior art keywords
feature vector
molecular
neural network
graph
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111005476.0A
Other languages
Chinese (zh)
Other versions
CN113707236B (en
Inventor
王俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202111005476.0A priority Critical patent/CN113707236B/en
Publication of CN113707236A publication Critical patent/CN113707236A/en
Priority to PCT/CN2022/071440 priority patent/WO2023029352A1/en
Application granted granted Critical
Publication of CN113707236B publication Critical patent/CN113707236B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Chemical & Material Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a method, a device and equipment for predicting properties of small drug molecules based on a graph neural network, relates to the technical field of artificial intelligence, and can solve the technical problems of low efficiency and low accuracy in conventional methods for predicting the properties of small drug molecules. The method comprises the following steps: generating a molecular graph structure according to the chemical molecular structure of a target drug small molecule, and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule; determining a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using a target graph neural network model; and constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into a trained property prediction model to obtain a property prediction result of the target drug small molecule. The method and the device are suitable for intelligent prediction of the properties of small drug molecules based on artificial intelligence technology.

Description

Method, device and equipment for predicting properties of small drug molecules based on graph neural network
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a method, a device and equipment for predicting properties of small drug molecules based on a graph neural network.
Background
Polycyclic small-molecule drugs have been a hotspot in medicinal chemistry for nearly 20 years, because polycyclic structures significantly improve pharmacological activity, selectivity and potency while reconciling the microscopic structure required for target binding with the macroscopic properties required by pharmacokinetics. Most polycyclic small-molecule drugs in clinical use are cyclic peptides or lactones consisting of 12-20 atoms; they are mainly used to treat diseases such as infection, inflammation and tumors, and are available in dosage forms such as oral preparations and injections. Property prediction for polycyclic drug small molecules is important for further realizing the potential of deep learning in drug discovery.
The conventional method for predicting the properties of polycyclic drug small molecules relies on feature engineering, i.e., generating problem-specific molecular descriptors such as molecular fingerprints and descriptors derived from quantum chemistry, physical chemistry and differential topology, and then performing quantitative structure-activity relationship or quantitative structure-property relationship (QSAR/QSPR) modeling with the molecular descriptors (including 1D, 2D, 3D and higher-dimensional descriptors, such as physicochemical properties like molecular weight) as input features.
However, the predictive performance of such descriptor-based property modeling depends strongly on manually created features or predefined descriptors. Feature engineering is time-consuming and labor-intensive, and the internal structural information of these structures is not considered, or not fully used, during feature extraction, so the property prediction accuracy for complex polycyclic drug small molecules is low.
Disclosure of Invention
In view of this, the present application provides a method, an apparatus, and a device for predicting properties of a small drug molecule based on a graph neural network, which can be used to solve the technical problems of low efficiency and low accuracy of predicting properties of a small drug molecule at present.
According to one aspect of the application, a method for predicting the property of a drug small molecule based on a graph neural network is provided, and the method comprises the following steps:
generating a molecular graph structure according to the chemical molecular structure of the target drug small molecule, and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule;
determining a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using a target graph neural network model;
and constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into a trained property prediction model to obtain a property prediction result of the target drug small molecule.
According to another aspect of the present application, there is provided a device for predicting a property of a drug small molecule based on a graph neural network, the device comprising:
the generation module is used for generating a molecular graph structure according to the chemical molecular structure of the target drug small molecule and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule;
the first determining module is used for determining a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using a target graph neural network model;
and the second determining module is used for constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into a trained property prediction model to obtain a property prediction result of the target drug small molecule.
According to yet another aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above graph neural network-based drug small molecule property prediction method.
According to yet another aspect of the present application, there is provided a computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, the processor implementing the method for predicting a property of a small molecule of a drug based on a graph neural network as described above when executing the program.
Compared with current descriptor-based approaches to drug small molecule property prediction, the method, device and equipment provided by the application first generate the molecular graph structure and the molecular subgraph structure of the target drug small molecule, then input the molecular graph structure and the molecular subgraph structure separately into the target graph neural network model to obtain the first feature vector and the second feature vector of the target drug small molecule; a third feature vector is then constructed from the first feature vector and the second feature vector and input into the trained property prediction model to determine the property prediction result of the target drug small molecule. For drug molecules containing multiple rings, the technical solution considers the molecular graph structure while also introducing functional-group structure information at a scale between atoms and molecules, namely the subgraph, and combines a graph neural network model with this multi-scale molecular representation to predict the properties of such molecules. Key graph representation information of the drug molecules can thus be learned efficiently and general structural rules across different graph data can be captured, giving better fitting capability on the property prediction task and better prediction performance than traditional molecular fingerprints, descriptors and the like, which in turn helps ensure the property prediction accuracy for drug molecules containing multiple rings.
The foregoing description is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the description, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are described below.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application to the disclosed embodiment. In the drawings:
FIG. 1 is a schematic flow chart of a method for predicting a property of a drug small molecule based on a graph neural network provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of another method for predicting a property of a drug small molecule based on a graph neural network provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of the principle of drug small molecule property prediction based on a graph neural network provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a device for predicting a property of a drug small molecule based on a graph neural network provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of another device for predicting a property of a drug small molecule based on a graph neural network provided in an embodiment of the present application.
Detailed Description
The method and the device can predict the properties of drug small molecules based on artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Aiming at the technical problems of low efficiency and low accuracy of the prediction of the properties of the drug small molecules at present, the application provides a method for predicting the properties of the drug small molecules based on a graph neural network, as shown in fig. 1, the method comprises the following steps:
101. Generating a molecular graph structure according to the chemical molecular structure of the target drug small molecule, and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule.
Here, the target drug small molecule is a polycyclic small molecule whose properties are to be predicted. The functional group intermediate structure is a structure at a scale between the atomic scale and the molecular scale, obtained by representing the benzene rings in the chemical molecular structure of the polycyclic small molecule as functional groups; it comprises the original functional groups in the chemical molecular structure and the new functional groups converted from the benzene rings. A functional group is an atom or group of atoms that determines the chemical properties of an organic compound, and sits at an intermediate scale between the atomic and molecular scales. Common functional groups include hydroxyl, carboxyl, ether linkages, aldehyde groups, carbonyl groups, and the like. Organic chemical reactions mainly take place at functional groups, and the functional group intermediate structure follows corresponding rules. Functional groups play a decisive role in the properties of organic compounds: -X, -OH, -CHO, -COOH, -NO2, -SO3H, -NH2 and RCO- determine the chemical properties of halogenated hydrocarbons, alcohols or phenols, aldehydes, carboxylic acids, nitro compounds or nitrites, sulfonic acids, amines and amides, respectively.
Accurate prediction of drug properties is critical to drug development. Because traditional experimental methods are limited by throughput and cost, developing effective machine learning methods for drug property prediction is of great interest. At present, subgraph information at the intermediate scale of polycyclic drug small molecules has not been sufficiently mined, which limits model prediction accuracy. Therefore, to predict the properties of drug molecules accurately and so accelerate drug research and development, the application proposes predicting the properties of polycyclic drug small molecules by combining multi-scale molecular representations on the basis of a graph neural network model. Experimental results show that the model has better prediction capability than the traditional fully connected neural network and single-scale prediction models (i.e., models without subgraph-scale information extraction), and has the potential to outperform these methods.
In a specific application scenario, before the step of this embodiment is executed, the chemical molecular structure of the target drug small molecule may be extracted in advance, and the functional group intermediate structure is then generated from the chemical molecular structure. A benzene ring can be regarded as a special functional group, but if learning starts from the atomic scale, a graph neural network model may need several layers of propagation before it captures the functional information of the benzene ring; it is simpler to tell the model directly that the structure is a benzene ring. A unit level at the functional group scale can therefore be added so that intermediate-scale information is extracted explicitly and directly, which better supports representation learning and property prediction for molecules. Accordingly, in this embodiment, when the functional group intermediate structure is generated from the chemical molecular structure, each benzene ring in the chemical molecular structure can be replaced by a functional group, yielding a functional group intermediate structure built from the original functional groups in the chemical molecular structure and the new functional groups converted from the benzene rings.
For this embodiment, each atom in a drug small molecule can be represented as a node in the molecular graph structure, and the interaction between atoms is represented by an edge between nodes. Nodes can carry different information to express different atomic symbols, and edges can carry different information to express different types of interaction, so that the chemical molecular structure is represented in a computer by a molecular graph structure. Likewise, a molecular subgraph structure may be generated from the functional group intermediate structure of the target drug small molecule, in which each node represents a functional group and the interaction between functional groups is represented by edges between the nodes. Adding a unit level at the functional group scale to extract intermediate-scale information explicitly and directly better supports representation learning and property prediction for molecules.
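The two graph views described above can be illustrated with a small data structure. The following is only a minimal sketch: the class name `LabelledGraph`, the choice of benzoic acid as the example molecule, and the bond labels are illustrative assumptions and do not come from the application.

```python
from dataclasses import dataclass, field


@dataclass
class LabelledGraph:
    """Undirected graph with labelled nodes and labelled edges (hypothetical helper)."""
    nodes: list                                   # atom symbols or functional-group names
    edges: dict = field(default_factory=dict)     # (i, j) with i < j -> edge label

    def connect(self, i, j, label):
        self.edges[(min(i, j), max(i, j))] = label

    def adjacency(self):
        n = len(self.nodes)
        adj = [[0] * n for _ in range(n)]
        for i, j in self.edges:
            adj[i][j] = adj[j][i] = 1
        return adj


# Atom-scale molecular graph of benzoic acid (heavy atoms only):
# nodes 0-5 are ring carbons, 6 is the carboxyl carbon, 7 and 8 are oxygens.
mol_graph = LabelledGraph(nodes=["C"] * 7 + ["O", "O"])
for k in range(6):
    mol_graph.connect(k, (k + 1) % 6, "aromatic")
mol_graph.connect(0, 6, "single")
mol_graph.connect(6, 7, "double")
mol_graph.connect(6, 8, "single")

# Functional-group-scale subgraph: the benzene ring becomes a single node.
sub_graph = LabelledGraph(nodes=["benzene ring", "carboxyl"])
sub_graph.connect(0, 1, "single")
```

In this sketch the nine-node atom graph and the two-node subgraph describe the same molecule at two scales, which is the pairing the embodiment feeds to the graph neural network.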
The executing entity may be a device for predicting the properties of drug small molecules, which can be deployed on a client or a server. It first generates the molecular graph structure and the molecular subgraph structure of the target drug small molecule, then inputs them separately into the target graph neural network model to obtain the first feature vector and the second feature vector of the target drug small molecule, and finally constructs a third feature vector from the first feature vector and the second feature vector and inputs it into the trained property prediction model to determine the property prediction result of the target drug small molecule.
102. Determining a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using the target graph neural network model.
For this embodiment, a graph neural network (GNN) can be applied to extract the first feature vector and the second feature vector of the target drug small molecule. Graph neural networks are a class of deep learning models that operate on graph-structured data. The input to a graph neural network is typically a graph structure, and its final output generally depends on the specific task. Taking graph property prediction as an example, a graph neural network learns a hidden vector representation of each node in a graph from the graph structure and the input node attributes. The goal is for each representation to be expressive enough to capture the information around its node; a vector representation of the whole graph can then be obtained by, for example, average pooling.
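To make the node-representation and average-pooling steps concrete, here is a minimal sketch in plain Python. It assumes a single round of mean aggregation with no learned weights, which is a deliberate simplification rather than the architecture used in the application; the function names are hypothetical.

```python
def message_passing_layer(adj, feats):
    """One round of mean aggregation over each node and its neighbours (no learned weights)."""
    n = len(adj)
    dim = len(feats[0])
    updated = []
    for i in range(n):
        group = [i] + [j for j in range(n) if adj[i][j]]
        updated.append([sum(feats[j][d] for j in group) / len(group) for d in range(dim)])
    return updated


def mean_pool(feats):
    """Average the node vectors to obtain one vector for the whole graph."""
    n = len(feats)
    return [sum(f[d] for f in feats) / n for d in range(len(feats[0]))]


# Three-node path graph with two-dimensional one-hot node attributes.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

hidden = message_passing_layer(adj, feats)   # node hidden vectors
graph_vec = mean_pool(hidden)                # whole-graph vector
```

A trained model would interleave such aggregation with learned linear maps and nonlinearities; the sketch only shows how node-level vectors turn into a graph-level vector.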
Before the graph neural network is applied, it needs to be pre-trained for the task scenario. In general, if sufficient data and labels are available, the graph neural network can be pre-trained by supervised learning. In practice, however, there is often a large amount of data but only a small number of labels, and labeling data takes a great deal of effort; simply discarding the unlabeled data would be a waste. The unlabeled data can instead be given created labels. Although these labels differ from the final labels of the learning task, the model can still learn from them. For example, suppose the graph neural network is to classify nodes on a graph but few nodes are labeled. Another task can be designed, such as using the graph neural network to predict the degree of each node; degree information is easily obtained by counting, and through such learning the graph neural network is expected to learn the local information of each node in the graph structure, which helps the final node classification task. In this example, the node's class label is the label that is ultimately to be predicted, and the node's degree is the created label. Predicting node degree with a graph neural network yields: 1) node embeddings suited to node degree prediction; and 2) a graph neural network weight matrix suited to the node degree prediction task. The node embeddings can then be fed into a classifier that is trained on the labeled data; alternatively, training can continue on the graph neural network directly with the labeled data, adjusting the weight matrix to obtain a model suited to the node classification task.
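The created labels in the node-degree example are cheap to compute, as the following minimal sketch shows. The function name `degree_pseudo_labels` and the toy graph are illustrative assumptions.

```python
def degree_pseudo_labels(adj):
    """Return the degree of every node, usable as a created label for pre-training."""
    return [sum(row) for row in adj]


# Star graph: node 0 is bonded to nodes 1, 2 and 3.
adj = [[0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 0],
       [1, 0, 0, 0]]

pseudo_labels = degree_pseudo_labels(adj)
# (node index, created label) pairs that a pre-training loop could consume.
training_pairs = list(enumerate(pseudo_labels))
```

No manual annotation is involved: the labels are derived from the adjacency matrix alone, which is what makes this kind of pre-training task applicable to large unlabeled graph data sets.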
Accordingly, in the application, in order to learn molecule-level representations through the graph network, pseudo labels can be created from large-scale unlabeled data and used as supervision signals for training the model, so that latent features and information in the data are learned effectively. Therefore, before the steps of this embodiment are executed, a preferred implementation may specifically include: acquiring an unlabeled graph data set and a first labeled graph data set, wherein the training task of the first labeled graph data set is different from the preset property prediction task; training a preset graph neural network model with the unlabeled graph data set as training samples and adjusting its parameters to obtain a first graph neural network model; training the first graph neural network model with the first labeled graph data set as training samples and adjusting its parameters to obtain a second graph neural network model; and training the second graph neural network model with a second labeled graph data set corresponding to the preset property prediction task as training samples and adjusting its parameters to obtain the target graph neural network model. Because the first graph neural network model is trained on the first labeled graph data set before the graph neural network for the target scenario is obtained, the resulting second graph neural network model has already learned how to perform basic processing and analysis of graph data in labeled form.
Then, when the second labeled graph data set is used as training samples to train the second graph neural network model, the model can process and analyze the second labeled graph data quickly, which further improves training efficiency and the quality of the resulting graph neural network.
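The ordering of the three training stages can be sketched as sequential fine-tuning, where each stage starts from the parameters left by the previous one. The code below is only a schematic: `run_stages` and the three `fit_*` functions are hypothetical stand-ins that nudge a single toy weight, not real training loops.

```python
def run_stages(params, stages):
    """Apply training stages in order; each stage starts from the previous stage's parameters."""
    log = []
    for name, fit in stages:
        params = fit(params)
        log.append(name)
    return params, log


# Toy stand-ins for real training loops (assumptions for illustration only).
def fit_unlabeled(p):
    return {**p, "w": p["w"] + 1.0}    # stage 1: self-supervised on unlabeled graphs


def fit_auxiliary(p):
    return {**p, "w": p["w"] * 0.5}    # stage 2: first labeled set, different task


def fit_property(p):
    return {**p, "w": p["w"] - 0.1}    # stage 3: second labeled set, property task


stages = [
    ("first graph neural network model", fit_unlabeled),
    ("second graph neural network model", fit_auxiliary),
    ("target graph neural network model", fit_property),
]
target_params, log = run_stages({"w": 0.0}, stages)
```

The point of the sketch is the data flow: the model is never re-initialized between stages, so knowledge from the unlabeled and auxiliary data carries into the property prediction stage.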
When the graph neural network model is trained on the training samples and its parameters are adjusted, about 15% of the nodes, or of the connections between nodes, are randomly masked in the adjacency matrix of the graph data to disturb the integrity of the original graph (for example, if the original graph data has 20 atomic nodes, about 15% of them are randomly masked and the adjacency matrix is perturbed accordingly). This constructs a learning target: by learning to predict the masked nodes or connections, the model learns a compact representation of the nodes of the graph data. If the model predicts the masked nodes or node attributes well, it has learned basic knowledge about the data and can achieve better performance on subsequent tasks.
Correspondingly, for this embodiment, after the target graph neural network model is obtained through training, the molecular graph structure of the target drug small molecule can be input into the target graph neural network model to obtain the first feature vector at the molecular scale; in addition, the molecular subgraph structure generated from the functional group intermediate structure of the target drug small molecule can be input into the target graph neural network model to obtain the second feature vector at the intermediate scale, so that the first feature vector and the second feature vector together form the multi-scale molecular representation of the target drug small molecule.
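The masking step can be sketched as follows for the 20-node example mentioned above. This is a minimal sketch under stated assumptions: masking is implemented by zeroing the rows and columns of the chosen nodes, the ring-shaped toy graph and the function name `mask_nodes` are hypothetical, and a fixed seed is used only to make the example reproducible.

```python
import random


def mask_nodes(adj, ratio=0.15, seed=0):
    """Randomly pick about `ratio` of the nodes and zero their rows and columns."""
    n = len(adj)
    k = max(1, round(n * ratio))
    rng = random.Random(seed)
    masked = sorted(rng.sample(range(n), k))
    hidden = set(masked)
    perturbed = [
        [0 if (i in hidden or j in hidden) else adj[i][j] for j in range(n)]
        for i in range(n)
    ]
    return masked, perturbed


# Toy graph with 20 atomic nodes arranged in a ring.
n = 20
ring = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]

masked, perturbed = mask_nodes(ring)
# The pre-training target is then to recover the masked nodes (or their
# connections) from the perturbed graph.
```

With 20 nodes and a 15% ratio, three nodes are masked; the original adjacency matrix is left untouched so that it can serve as the prediction target.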
103. Constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into the trained property prediction model to obtain a property prediction result of the target drug small molecule.
The property prediction model may be any existing machine learning model, for example, a linear regression model, a decision tree model, a neural network model, a support vector machine model or a hidden Markov model, and is not specifically limited in this application. The property prediction result may include one or more of target binding prediction, activity prediction, toxicity prediction, efficacy prediction, water solubility prediction, adverse reaction prediction, prediction of the treatment effect for a certain disease, and the like; the type of property prediction may be set according to the actual application scenario and is not specifically limited in this scheme. Before the steps of this embodiment are performed, the property prediction model needs to be trained in advance with labeled samples, so that the trained property prediction model can be used to predict the properties of the target drug small molecule.
For this embodiment, after the first feature vector of the target drug small molecule at the molecular scale and the second feature vector at the intermediate scale are obtained in step 102, the first feature vector and the second feature vector may be fused, and the third feature vector obtained by fusion is input into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
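As an illustration of the fusion step, the sketch below assumes that fusion means simple concatenation and that the property prediction model is a linear regressor. Both choices are assumptions made for the example; this passage does not fix the fusion operator, and the weights shown are made up rather than trained.

```python
def fuse(first_vec, second_vec):
    """Build the third feature vector by concatenation (assumed fusion operator)."""
    return list(first_vec) + list(second_vec)


def linear_predict(vec, weights, bias=0.0):
    """Toy linear property prediction model: weighted sum of features plus a bias."""
    return sum(w * x for w, x in zip(weights, vec)) + bias


first_vec = [0.2, 0.8]     # molecular-scale feature vector (illustrative values)
second_vec = [0.5]         # functional-group-scale feature vector (illustrative values)

third_vec = fuse(first_vec, second_vec)
prediction = linear_predict(third_vec, weights=[1.0, -1.0, 2.0], bias=0.1)
```

Other fusion operators, such as element-wise sums or learned attention over the two vectors, would slot into `fuse` without changing the rest of the pipeline.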
With the method for predicting the properties of drug small molecules based on a graph neural network in this embodiment, the molecular graph structure and the molecular subgraph structure of the target drug small molecule are generated first, and then input separately into the target graph neural network model to obtain the first feature vector and the second feature vector of the target drug small molecule; a third feature vector is then constructed from the first feature vector and the second feature vector and input into the trained property prediction model to determine the property prediction result of the target drug small molecule. For drug molecules containing multiple rings, the technical solution considers the molecular graph structure while also introducing functional-group structure information at a scale between atoms and molecules, namely the subgraph, and combines a graph neural network model with this multi-scale molecular representation to predict the properties of such molecules. Key graph representation information of the drug molecules can thus be learned efficiently and general structural rules across different graph data can be captured, giving better fitting capability on the property prediction task and better prediction performance than traditional molecular fingerprints, descriptors and the like, which in turn helps ensure the property prediction accuracy for drug molecules containing multiple rings.
Further, as a refinement and an extension of the specific implementation of the above embodiment, in order to fully illustrate the implementation process in this embodiment, another method for predicting the property of a small molecule of a drug based on a graph neural network is provided, as shown in fig. 2, the method includes:
201. Generating a molecular graph structure according to the chemical molecular structure of the target drug small molecule, and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule.
As a preferred mode, step 201 of this embodiment may specifically include: acquiring the chemical molecular structure of the target drug small molecule, determining the atoms in the chemical molecular structure as nodes in the molecular graph structure, determining the atom connection relations in the chemical molecular structure as edges in the molecular graph structure, and generating the molecular graph structure of the target drug small molecule; acquiring the functional group intermediate structure of the target drug small molecule, determining the functional groups in the functional group intermediate structure as nodes in the molecular subgraph structure, determining the functional group connection relations in the functional group intermediate structure as edges in the molecular subgraph structure, and generating the molecular subgraph structure of the target drug small molecule at the intermediate scale.
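The two graph constructions in step 201 can be illustrated with a minimal sketch. The helper `build_graph` and the phenol example (heavy atoms only, hydrogens omitted) are assumptions chosen for illustration and do not come from the application; the application does not specify how functional groups are detected, so the group-level nodes and links are written out by hand here.

```python
def build_graph(nodes, edges):
    """Return an undirected adjacency list {node_index: sorted neighbour indices}."""
    adjacency = {i: set() for i in range(len(nodes))}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return {i: sorted(neigh) for i, neigh in adjacency.items()}


# Atomic scale: ring carbons are nodes 0..5, the hydroxyl oxygen is node 6.
atoms = ["C", "C", "C", "C", "C", "C", "O"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
molecular_graph = build_graph(atoms, bonds)

# Functional group scale: the benzene ring collapses into one node,
# the hydroxyl group is a second node, and one edge links them.
groups = ["benzene_ring", "hydroxyl"]
group_links = [(0, 1)]
molecular_subgraph = build_graph(groups, group_links)
```

The same helper serves both scales because only the meaning of a node changes: an atom in the molecular graph, a functional group in the subgraph.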
202. Inputting the molecular graph structure and the adjacency matrix and attribute information carried in the molecular graph structure into a target graph neural network model, and acquiring node hidden vectors of all nodes in the molecular graph structure.
For the present embodiment, before executing the steps of the present embodiment, the graph neural network model needs to be pre-trained, and then the pre-trained target graph neural network model is used to determine the node hidden vectors of each node in the molecular graph structure. The specific pre-training process is the same as the pre-training process in step 102 of the embodiment, and is not described again.
The adjacency matrix is an n × n matrix representing the node connection relations, in which an element is 1 if the corresponding pair of nodes is connected and 0 otherwise, and n is the number of nodes contained in the target small molecule. The attribute information may include node initial feature vectors and edge initial feature vectors. The node initial feature vector is generated according to a first preset vector generation rule, which may be as shown in Table 1; the node initial feature vector may be a 27-bit feature vector formed by concatenating 6 bits for the number of chemical bonds, 5 bits for the formal charge, 4 bits for the atomic chirality, 5 bits for the number of bonded hydrogen atoms, 5 bits for the atomic orbitals, 1 bit for aromaticity, and 1 bit for atomic mass. The edge initial feature vector is generated according to a second preset vector generation rule, which may be as shown in Table 2; the edge initial feature vector may be a 12-bit feature vector formed by concatenating 4 bits for the chemical bond type, 1 bit for conjugation, 1 bit for whether the bond is in a ring, and 6 bits for stereochemistry.
TABLE 1
Figure BDA0003236912930000091
Figure BDA0003236912930000101
TABLE 2
Figure BDA0003236912930000102
For this embodiment, the molecular graph structure, together with the adjacency matrix and attribute information it carries, may be input into the target graph neural network model, and the node hidden vector of each node in the molecular graph structure may be obtained through iterative learning of the target graph neural network model.
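As a minimal sketch of these inputs, the code below builds the n × n adjacency matrix and assembles a 27-bit node vector with the field widths stated above (6 + 5 + 4 + 5 + 5 + 1 + 1). The field names, the category indices passed in, and the scaling of the atomic mass are assumptions for illustration; the actual category-to-bit mapping is defined by Table 1, which is not reproduced in the text.

```python
def adjacency_matrix(n, edges):
    """Build a symmetric n x n 0/1 matrix from a list of undirected edges."""
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        matrix[a][b] = 1
        matrix[b][a] = 1
    return matrix


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


# Field widths follow the 27-bit layout described in the text;
# the field names themselves are hypothetical.
NODE_FIELD_SIZES = {
    "bond_count": 6,
    "formal_charge": 5,
    "chirality": 4,
    "hydrogen_count": 5,
    "orbital": 5,
}


def node_features(category_indices, is_aromatic, scaled_mass):
    """Concatenate one-hot fields, then the aromaticity flag and a mass value."""
    vec = []
    for field, size in NODE_FIELD_SIZES.items():
        vec.extend(one_hot(category_indices[field], size))
    vec.append(1 if is_aromatic else 0)
    vec.append(scaled_mass)
    return vec


A = adjacency_matrix(3, [(0, 1), (1, 2)])
carbon = node_features(
    {"bond_count": 3, "formal_charge": 2, "chirality": 0,
     "hydrogen_count": 1, "orbital": 2},
    is_aromatic=True,
    scaled_mass=0.12,  # assumed scaling of the atomic mass
)
```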
In particular, the main learning process of the target graph neural network model is to iteratively aggregate and update the neighbor information of the nodes in the graph data. In one iteration, each node updates its own information by aggregating the features of its neighboring nodes with its own features from the previous layer, and a nonlinear transformation is usually applied to the aggregated information. By stacking multiple network layers, each node can acquire information from neighbor nodes within the corresponding number of hops.
Learning in the graph neural network model can be understood in terms of node message passing, which involves two phases: a message passing phase and a readout phase. The message passing phase is a forward propagation phase that runs for T steps; in each step, messages are obtained through a message function M_t and the nodes are updated through an update function U_t.
The message function M_t and the update function U_t are characterized by the formulas:
$$m_v^{t+1} = \sum_{w \in N(v)} M_t\left(h_v^{t}, h_w^{t}, e_{vw}\right)$$
$$h_v^{t+1} = U_t\left(h_v^{t}, m_v^{t+1}\right)$$
where h_v^t denotes the hidden vector of node v at step t, N(v) denotes the set of neighbor nodes of v, and e_vw denotes the feature vector of the edge from node v to node w.
The readout phase computes a feature vector representing the whole graph, implemented by a function R characterized by the formula:
$$\hat{y} = R\left(\left\{h_v^{T} \mid v \in G\right\}\right)$$
where T denotes the total number of time steps. The functions M_t, U_t and R may use different model settings, such as a graph convolutional network (GCN), a graph attention network (GAT), and the like.
The central idea of how the target graph neural network model learns the molecular representation can be understood as follows: if initial feature vectors are used to express the different nodes and the different edges, a final, stable feature vector for each node can be found through iterative message propagation. After a fixed number of steps, for example T steps, the feature vector of each node tends toward an equilibrium and no longer changes appreciably. Compared with the original node feature vector, the final stable feature vector of each node therefore also contains information about its neighboring nodes and the entire graph (for example, if certain atomic nodes in a chemical molecule contribute most to a certain property of the molecule, this is reflected more specifically in their final feature vectors).
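The iteration described above can be sketched with fixed, untrained functions. This is only a toy instance under stated assumptions: the message function scales a neighbor's hidden vector by a scalar edge feature, the update function applies tanh to the sum of the old state and the aggregated message, and the name `message_passing` is hypothetical. A real model would learn the parameters of both functions.

```python
import math


def message_passing(hidden, edge_features, neighbours, steps):
    """Run `steps` synchronous rounds of a toy message passing scheme.

    hidden:        {node: list of floats}
    edge_features: {(v, w): float}, one entry per direction
    neighbours:    {node: list of neighbour nodes}
    """
    for _ in range(steps):
        updated = {}
        for v, h_v in hidden.items():
            # Message step: sum each neighbour's hidden vector,
            # scaled by the feature of the edge between v and that neighbour.
            message = [0.0] * len(h_v)
            for w in neighbours[v]:
                weight = edge_features[(v, w)]
                for k, value in enumerate(hidden[w]):
                    message[k] += weight * value
            # Update step: combine the previous state with the message.
            updated[v] = [math.tanh(a + b) for a, b in zip(h_v, message)]
        hidden = updated
    return hidden


# Path graph 0 - 1 - 2; only node 0 starts with a non-zero state.
neighbours = {0: [1], 1: [0, 2], 2: [1]}
edge_features = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 1.0, (2, 1): 1.0}
start = {0: [1.0], 1: [0.0], 2: [0.0]}

after_one = message_passing(start, edge_features, neighbours, steps=1)
after_two = message_passing(start, edge_features, neighbours, steps=2)
```

After one step node 2 is still zero because node 0 is two hops away; after two steps the information has reached it, which matches the statement that stacking layers widens the neighborhood a node can see.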
203. Generating a first feature vector of the target drug small molecule by using the node hidden vectors of all the nodes.
For this embodiment, after the node hidden vector of each node in the molecular graph structure has been determined in step 202, a vector representation of the whole molecular graph structure can be obtained from the node hidden vectors (for example, a molecule-level representation of the whole compound is extracted from the features of the atomic nodes and the chemical bond information on the edges connecting the atoms). As a preferred mode, step 203 of this embodiment may specifically include: calculating the average of the node hidden vectors, and determining this average as the first feature vector of the target drug small molecule; or extracting, from the node hidden vectors, the first node hidden vector having the maximum hidden vector value, and determining the first node hidden vector as the first feature vector.
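The two readout options can be sketched as follows. The text does not define what "maximum hidden vector value" means for a vector, so the second function assumes it means the node vector with the largest Euclidean norm; element-wise max pooling would be another plausible reading. Both function names are hypothetical.

```python
def mean_readout(node_vectors):
    """Average the node hidden vectors component by component."""
    n = len(node_vectors)
    return [sum(column) / n for column in zip(*node_vectors)]


def max_node_readout(node_vectors):
    """Select the single node vector with the largest squared Euclidean norm.

    Assumed interpretation of "maximum hidden vector value".
    """
    return max(node_vectors, key=lambda vec: sum(x * x for x in vec))


hidden_vectors = [[1.0, 2.0], [3.0, 4.0]]
first_feature_mean = mean_readout(hidden_vectors)
first_feature_max = max_node_readout(hidden_vectors)
```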
204. Inputting the molecular subgraph structure into the target graph neural network model, and determining a second feature vector of the target drug small molecule.
For this embodiment, the graph neural network model likewise needs to be pre-trained before the steps of this embodiment are executed, and the pre-trained target graph neural network model is then used to determine the node hidden vectors of each node in the molecular subgraph structure. The specific pre-training process is the same as the pre-training process in step 102 of the embodiment and is not described again. It should be noted that the present application does not limit the execution order of the process of determining the first feature vector in steps 202 and 203 and the process of determining the second feature vector in step 204 of this embodiment; either process may be executed first.
205. Constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into the trained property prediction model to obtain a property prediction result of the target drug small molecule.
In a specific application scenario, before the steps of this embodiment are executed, the method further includes: training a preset property prediction model using, as training samples, sample feature vectors matched with the preset property prediction task corresponding to the target drug small molecule; and calculating a loss function of the property prediction model, and determining that training of the property prediction model is complete when the loss function is smaller than a preset threshold. The loss function represents the prediction error of the prediction result of the property prediction model relative to the labeled result of the sample. The preset threshold lies between 0 and 1 and represents the training precision of the property prediction model: because training stops only once the loss falls below the threshold, the closer the preset threshold is to 0, the higher the training precision. The specific value of the preset threshold may be set according to the actual application scenario and is not specifically limited herein. The property prediction model may be any existing machine learning model, for example a linear regression model, a decision tree model, a neural network model, a support vector machine model, or a hidden Markov model, and may be selected according to actual application requirements, which is not specifically limited in this application.
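The stopping rule can be sketched with the simplest model from the list, a linear regression fitted by gradient descent. The single scalar input, the mean squared error loss, the learning rate and the function name `train_until_threshold` are all assumptions for illustration; in the application the input would be the fused feature vector and the model and loss could differ.

```python
def train_until_threshold(samples, threshold=1e-4, learning_rate=0.1,
                          max_epochs=10000):
    """Fit y = w * x + b by gradient descent on mean squared error.

    Training is declared complete as soon as the loss drops below
    `threshold`; the final flag reports whether that happened.
    """
    w, b = 0.0, 0.0
    n = len(samples)
    loss = float("inf")
    for _ in range(max_epochs):
        loss = sum((w * x + b - y) ** 2 for x, y in samples) / n
        if loss < threshold:
            return w, b, loss, True
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss, False


# Toy labelled samples drawn from y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b, final_loss, converged = train_until_threshold(samples)
```

A smaller threshold forces more epochs and a closer fit, which is why the threshold acts as a precision setting.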
Correspondingly, as a preferred mode, step 205 of this embodiment may specifically include: performing vector splicing on the first feature vector and the second feature vector according to a preset vector splicing rule to obtain the third feature vector; and inputting the third feature vector as an input feature into the trained property prediction model to obtain the property prediction result of the target drug small molecule. The preset vector splicing rule may include: splicing the first feature vector after the second feature vector to obtain the third feature vector; or splicing the second feature vector after the first feature vector to obtain the third feature vector; or adding the first feature vector and the second feature vector to obtain the third feature vector; and the like.
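The three splicing rules listed above can be sketched in a few lines. The function name `splice` and the rule labels are hypothetical; the requirement that both vectors have the same length for the additive rule is an assumption that follows from element-wise addition.

```python
def splice(first, second, rule="first_then_second"):
    """Build the third feature vector from the first and second vectors."""
    if rule == "first_then_second":
        return list(first) + list(second)
    if rule == "second_then_first":
        return list(second) + list(first)
    if rule == "add":
        if len(first) != len(second):
            raise ValueError("element-wise addition needs equal lengths")
        return [a + b for a, b in zip(first, second)]
    raise ValueError(f"unknown splicing rule: {rule}")


first_vec = [1.0, 2.0]
second_vec = [10.0, 20.0]
concatenated = splice(first_vec, second_vec)
reversed_order = splice(first_vec, second_vec, rule="second_then_first")
summed = splice(first_vec, second_vec, rule="add")
```

Concatenation keeps the two scales in separate coordinates and doubles the input width of the prediction model, whereas addition keeps the width unchanged but mixes the two scales.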
For the present application, the process of predicting the properties of small drug molecules based on a graph neural network may refer to the schematic diagram shown in fig. 3. For the same polycyclic target drug small molecule, a molecular graph structure can be generated at the atomic scale according to the chemical molecular structure of the small molecule, and a molecular subgraph structure can be generated at the functional group scale according to the functional group intermediate structure of the target drug small molecule. The functional group intermediate structure is a structure between the atomic scale and the molecular scale, obtained by expressing the benzene rings in the chemical molecular structure of the polycyclic drug small molecule as functional groups; it comprises the original functional groups in the chemical molecular structure and the new functional groups converted from the benzene rings. A first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure are then determined by using the target graph neural network (GNN). Finally, a multi-scale third feature vector is constructed from the first feature vector and the second feature vector and input into the trained property prediction model, so that the property prediction result of the target drug small molecule can be determined.
With the method for predicting the properties of small drug molecules based on a graph neural network, a molecular graph structure and a molecular subgraph structure of the target drug small molecule are generated first. Both structures are then input into the target graph neural network model to determine the first feature vector and the second feature vector of the target drug small molecule, respectively. A third feature vector is then constructed from the first feature vector and the second feature vector and input into the trained property prediction model to determine the property prediction result of the target drug small molecule. For drug molecules containing multiple rings, this technical scheme considers the molecular graph structure and additionally introduces functional group structure information at a scale between atoms and molecules, namely the subgraph. Combined with the graph neural network model, this multi-scale molecular representation allows the key graph representation information of the drug molecule to be learned efficiently and general structural rules across different graph data to be captured. This gives better fitting capability on the property prediction task and better prediction performance than traditional molecular fingerprints, descriptors and the like, which in turn helps ensure the accuracy of property prediction for drug molecules containing multiple rings.
In addition, because the graph neural network learns from the topological structure information of the molecule, good accuracy can be obtained with a relatively small amount of labeled data. The original approach, which relied on manual parameter tuning by machine learning engineers and experts, is replaced by one suited to large-scale, reproducible, intelligent industrial deployment, which improves the efficiency of property prediction for drug small molecules and reduces prediction cost.
Further, as a specific implementation of the method shown in fig. 1 and fig. 2, an embodiment of the present application provides a device for predicting a property of a small molecule of a drug based on a graph neural network, as shown in fig. 4, the device includes: a generation module 31, a first determination module 32, a second determination module 33;
a generation module 31, configured to generate a molecular graph structure according to the chemical molecular structure of the target drug small molecule and to generate a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule;
a first determining module 32, configured to determine a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using the target graph neural network model;
and the second determining module 33 is configured to construct a third feature vector according to the first feature vector and the second feature vector, and input the third feature vector into the trained property prediction model to obtain a property prediction result of the target drug small molecule.
In a specific application scenario, the generation module 31 may be specifically configured to: acquire the chemical molecular structure of the target drug small molecule, determine the atoms in the chemical molecular structure as nodes in the molecular graph structure, determine the atom connection relations in the chemical molecular structure as edges in the molecular graph structure, and generate the molecular graph structure of the target drug small molecule; and acquire the functional group intermediate structure of the target drug small molecule, determine the functional groups in the functional group intermediate structure as nodes in the molecular subgraph structure, determine the functional group connection relations in the functional group intermediate structure as edges in the molecular subgraph structure, and generate the molecular subgraph structure of the target drug small molecule at the intermediate scale.
Accordingly, in order to obtain the target graph neural network model through pre-training of the graph neural network model, as shown in fig. 5, the apparatus further includes: an acquisition module 34, a first training module 35, a second training module 36, and a third training module 37;
an obtaining module 34, configured to obtain an unlabeled graph data set and a first labeled graph data set, where a training task of the first labeled graph data set is different from a preset property prediction task;
the first training module 35 is configured to train a preset graph neural network model by using an unlabeled graph data set as a training sample, and adjust parameters of the graph neural network model to obtain a first graph neural network model;
the second training module 36 may be configured to use the first labeled graph data set as a training sample, train the first graph neural network model, and adjust parameters of the first graph neural network model to obtain a second graph neural network model;
the third training module 37 is configured to use a second labeled graph data set corresponding to the preset property prediction task as a training sample, and adjust parameters of the second graph neural network model by training the second graph neural network model to obtain the target graph neural network model.
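The order in which the three training modules hand parameters to one another can be sketched as a simple pipeline. Everything here is a stand-in: `run_training_stages` and `toy_train` are hypothetical names, the "model" is a single weight, and the toy update rule only shows that each stage starts from the parameters left by the previous one. It does not reproduce any actual self-supervised or supervised objective.

```python
def run_training_stages(initial_params, stages):
    """Apply each training stage in order.

    Every stage receives the parameters produced by the previous stage,
    mirroring: preset model -> first model -> second model -> target model.
    """
    params = dict(initial_params)
    completed = []
    for stage_name, train_fn, dataset in stages:
        params = train_fn(params, dataset)
        completed.append(stage_name)
    return params, completed


def toy_train(params, dataset):
    # Hypothetical stand-in for real training:
    # move the single weight halfway toward the mean of the dataset.
    target = sum(dataset) / len(dataset)
    return {"weight": params["weight"] + 0.5 * (target - params["weight"])}


stages = [
    ("unlabeled graph data set", toy_train, [1.0, 1.0]),
    ("first labeled graph data set (different task)", toy_train, [2.0, 2.0]),
    ("second labeled graph data set (property task)", toy_train, [4.0, 4.0]),
]
target_params, completed = run_training_stages({"weight": 0.0}, stages)
```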
In a specific application scenario, the molecular graph structure carries an adjacency matrix and attribute information, wherein the attribute information comprises a node initial feature vector and an edge initial feature vector, both determined according to a preset vector generation rule. Correspondingly, the first determining module 32 is specifically configured to: input the molecular graph structure, the adjacency matrix and the attribute information into the target graph neural network model to obtain the node hidden vector of each node in the molecular graph structure; generate the first feature vector of the target drug small molecule by using the node hidden vectors of all the nodes; and input the molecular subgraph structure into the target graph neural network model to determine the second feature vector of the target drug small molecule.
Correspondingly, when the node hidden vectors of the nodes are used to generate the first feature vector of the target drug small molecule, the first determining module 32 is specifically configured to: calculate the average of the node hidden vectors and determine this average as the first feature vector of the target drug small molecule; or extract, from the node hidden vectors, the first node hidden vector having the maximum hidden vector value, and determine the first node hidden vector as the first feature vector.
In a specific application scenario, the second determining module 33 is specifically configured to: perform vector splicing on the first feature vector and the second feature vector according to a preset vector splicing rule to obtain the third feature vector; and input the third feature vector as an input feature into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
Accordingly, in order to train the property prediction model in advance, as shown in fig. 5, the apparatus further includes: a fourth training module 38, a calculation module 39;
the fourth training module 38 is configured to train a preset property prediction model by using the sample feature vector matched with the preset property prediction task corresponding to the target drug small molecule as a training sample;
and the calculating module 39 is configured to calculate a loss function of the property prediction model, and when the loss function is smaller than a preset threshold, it is determined that the property prediction model is trained completely.
It should be noted that other corresponding descriptions of the functional units related to the device for predicting properties of a small drug molecule based on a graph neural network provided in this embodiment may refer to the corresponding descriptions in fig. 1 to fig. 2, and are not repeated herein.
Based on the method shown in fig. 1 to fig. 2, correspondingly, the present embodiment further provides a storage medium, which may be volatile or nonvolatile, and has computer readable instructions stored thereon, and when the computer readable instructions are executed by a processor, the method for predicting the property of the drug small molecule based on the graph neural network shown in fig. 1 to fig. 2 is implemented.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, or the like), and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, or the like) to execute the method of the embodiments of the present application.
Based on the method shown in fig. 1 to fig. 2 and the virtual device embodiments shown in fig. 4 and fig. 5, in order to achieve the above object, the present embodiment further provides a computer device, where the computer device includes a storage medium and a processor; a storage medium for storing a computer program; a processor for executing a computer program to implement the method for predicting the property of the small molecule of the drug based on the graph neural network as shown in fig. 1 to 2.
Optionally, the computer device may further include a user interface, a network interface, a camera, radio frequency (RF) circuitry, a sensor, audio circuitry, a Wi-Fi module, and so forth. The user interface may include a display screen and an input unit such as a keyboard, and may optionally also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Wi-Fi interface), and the like.
It will be understood by those skilled in the art that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine some components, or arrange the components differently.
The storage medium may further include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the computer device described above, supporting the operation of information handling programs and other software and/or programs. The network communication module is used for realizing communication among components in the storage medium and communication with other hardware and software in the information processing entity device.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus a necessary general hardware platform, and can also be implemented by hardware.
By applying the technical scheme of the present application, compared with the prior art, a molecular graph structure and a molecular subgraph structure of the target drug small molecule are generated first. Both structures are then input into the target graph neural network model to determine the first feature vector and the second feature vector of the target drug small molecule, respectively. A third feature vector is then constructed from the first feature vector and the second feature vector and input into the trained property prediction model to determine the property prediction result of the target drug small molecule. For drug molecules containing multiple rings, this technical scheme considers the molecular graph structure and additionally introduces functional group structure information at a scale between atoms and molecules, namely the subgraph. Combined with the graph neural network model, this multi-scale molecular representation allows the key graph representation information of the drug molecule to be learned efficiently and general structural rules across different graph data to be captured. This gives better fitting capability on the property prediction task and better prediction performance than traditional molecular fingerprints, descriptors and the like, which in turn helps ensure the accuracy of property prediction for drug molecules containing multiple rings.
In addition, because the graph neural network learns from the topological structure information of the molecule, good accuracy can be obtained with a relatively small amount of labeled data. The original approach, which relied on manual parameter tuning by machine learning engineers and experts, is replaced by one suited to large-scale, reproducible, intelligent industrial deployment, which improves the efficiency of property prediction for drug small molecules and reduces prediction cost.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present application. Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The serial numbers used in the present application are for description purposes only and do not indicate the superiority or inferiority of the implementation scenarios. The above disclosure covers only a few specific implementation scenarios of the present application; the present application is not limited thereto, and any variations that can be conceived by those skilled in the art are intended to fall within the scope of the present application.

Claims (10)

1. A method for predicting the property of a small drug molecule based on a graph neural network is characterized by comprising the following steps:
generating a molecular graph structure according to the chemical molecular structure of the target drug small molecule, and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule;
determining a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using a target graph neural network model;
and constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into a trained property prediction model to obtain a property prediction result of the target drug small molecule.
2. The method of claim 1, wherein generating a molecular diagram structure from the chemical molecular structure of the target drug small molecule and generating a molecular sub-diagram structure from the functional group intermediate structure of the target drug small molecule comprises:
acquiring a chemical molecular structure of a target drug small molecule, determining atoms in the chemical molecular structure as nodes in the molecular diagram structure, determining atom connection relations in the chemical molecular structure as edges in the molecular diagram structure, and generating the molecular diagram structure of the target drug small molecule;
acquiring a functional group intermediate structure of the target small molecule, determining functional groups in the functional group intermediate structure as nodes in the molecular subgraph structure, determining a functional group connection relation in the functional group intermediate structure as edges in the molecular subgraph structure, and generating a molecular subgraph structure of the target drug small molecule relative to the intermediate dimension.
3. The method of claim 1, wherein prior to determining the first feature vector corresponding to the molecular graph structure and the second feature vector corresponding to the molecular subgraph structure using the target graph neural network model, the method further comprises:
acquiring an unlabeled graph data set and a first labeled graph data set, wherein a training task of the first labeled graph data set is different from a preset property prediction task;
taking the unlabeled graph data set as a training sample, training a preset graph neural network model, and adjusting parameters of the graph neural network model to obtain a first graph neural network model;
taking the first labeled graph data set as a training sample, training the first graph neural network model, and adjusting parameters of the first graph neural network model to obtain a second graph neural network model;
and taking a second labeled graph data set corresponding to the preset property prediction task as a training sample, training the second graph neural network model, and adjusting parameters of the second graph neural network model to obtain a target graph neural network model.
4. The method according to claim 1, wherein the molecular graph structure carries an adjacency matrix and attribute information, and the attribute information includes a node initial feature vector and an edge initial feature vector, wherein the node initial feature vector and the edge initial feature vector are determined according to a preset vector generation rule;
the determining a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure by using the target graph neural network model includes:
inputting the molecular graph structure, the adjacency matrix and the attribute information into the target graph neural network model to obtain node implicit vectors of all nodes in the molecular graph structure;
generating a first feature vector of the target drug small molecule by using the node implicit vectors of all the nodes;
inputting the molecular subgraph structure into the target graph neural network model, and determining a second feature vector of the target drug small molecule.
5. The method of claim 4, wherein the generating a first feature vector of the target drug small molecule using the node hidden vectors of the respective nodes comprises:
calculating the average of the node hidden vectors, and determining the average as the first feature vector of the target drug small molecule; or
extracting, from the node hidden vectors, a first node hidden vector having the maximum hidden vector value, and determining the first node hidden vector as the first feature vector.
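The two alternatives in claim 5 can be sketched as two readout functions over the per-node vectors. The claim does not say how "maximum hidden vector value" is measured; ranking nodes by Euclidean norm below is an assumption, and the function names and sample values are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def mean_readout(hidden: List[Vector]) -> Vector:
    """Element-wise average of all node vectors."""
    n = len(hidden)
    return [sum(column) / n for column in zip(*hidden)]


def max_readout(hidden: List[Vector]) -> Vector:
    """Node vector with the largest Euclidean norm (assumed ranking rule)."""
    return max(hidden, key=lambda v: math.sqrt(sum(x * x for x in v)))


node_hidden = [[3.0, 0.0], [4.0, 1.0], [2.0, 1.0]]
graph_vector_mean = mean_readout(node_hidden)
graph_vector_max = max_readout(node_hidden)
```

Both readouts return a vector with the same dimension as a single node vector, so either can serve as a fixed-size molecule-level representation regardless of the number of atoms.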
6. The method of claim 1, wherein the constructing a third feature vector according to the first feature vector and the second feature vector, and inputting the third feature vector into a trained property prediction model to obtain a property prediction result of the target drug small molecule comprises:
performing vector splicing processing on the first feature vector and the second feature vector according to a preset vector splicing rule to obtain a third feature vector;
and inputting the third feature vector as an input feature into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
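A small sketch of the splicing step, assuming the "preset vector splicing rule" is plain concatenation with the first feature vector placed before the second. The logistic scorer, its weights, and the names `splice` and `predict` are hypothetical stand-ins for the trained property prediction model.

```python
import math
from typing import List, Sequence


def splice(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Concatenate the whole-graph vector and the subgraph vector."""
    return list(first) + list(second)


def predict(third: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Score the spliced vector with a logistic model (stand-in predictor)."""
    if len(weights) != len(third):
        raise ValueError("weights must match the spliced vector length")
    z = sum(w * x for w, x in zip(weights, third)) + bias
    return 1.0 / (1.0 + math.exp(-z))


first_vector = [3.0, 0.5]          # from the molecular graph structure
second_vector = [1.0]              # from the molecular subgraph structure
third_vector = splice(first_vector, second_vector)

# Placeholder parameters; a real model would learn these during training.
probability = predict(third_vector, weights=[0.0, 0.0, 0.0], bias=0.0)
```

Because the predictor's input size is the sum of the two vector lengths, the order and dimensions used at training time have to match those used at prediction time.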
7. The method of claim 1, wherein, before constructing the third feature vector from the first feature vector and the second feature vector and inputting the third feature vector into the trained property prediction model to obtain the property prediction result of the target drug small molecule, the method further comprises:
training a preset property prediction model using, as training samples, sample feature vectors that match the preset property prediction task corresponding to the target drug small molecule;
and calculating a loss function of the property prediction model, and determining that training of the property prediction model is complete when the loss function is smaller than a preset threshold.
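The stopping rule in claim 7 can be sketched as a training loop that exits once the loss drops below a preset threshold. The linear model, mean squared error loss, full-batch gradient descent, and every name and value below are assumptions; the claim names neither the model family nor the loss.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (sample feature vector, label)


def train_until_threshold(samples: List[Sample], threshold: float = 1e-3,
                          lr: float = 0.1, max_epochs: int = 10_000):
    """Train a linear model; stop as soon as the MSE loss is below `threshold`.

    Returns (weights, bias, last_loss, converged).
    """
    dim = len(samples[0][0])
    weights = [0.0] * dim
    bias = 0.0
    loss = float("inf")
    for _ in range(max_epochs):
        # Prediction error of the current parameters on every sample.
        errors = [sum(w * x for w, x in zip(weights, feats)) + bias - label
                  for feats, label in samples]
        loss = sum(e * e for e in errors) / len(samples)
        if loss < threshold:
            return weights, bias, loss, True  # training is judged complete
        # Full-batch gradient step on the mean squared error.
        for j in range(dim):
            grad_j = 2.0 * sum(e * feats[j]
                               for e, (feats, _) in zip(errors, samples)) / len(samples)
            weights[j] -= lr * grad_j
        bias -= lr * 2.0 * sum(errors) / len(samples)
    return weights, bias, loss, False


# Hypothetical sample feature vectors with labels y = x0 + 2 * x1.
samples = [([0.0, 0.0], 0.0), ([1.0, 0.0], 1.0),
           ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0)]
weights, bias, loss, converged = train_until_threshold(samples)
```

The `max_epochs` cap is an addition not found in the claim; it keeps the loop from running forever when the threshold cannot be reached.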
8. A device for predicting properties of small drug molecules based on a graph neural network, comprising:
a generation module, configured to generate a molecular graph structure from the chemical molecular structure of a target drug small molecule and to generate a molecular subgraph structure from the functional group intermediate structure of the target drug small molecule;
a first determining module, configured to determine, using a target graph neural network model, a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure;
and a second determining module, configured to construct a third feature vector from the first feature vector and the second feature vector, and to input the third feature vector into a trained property prediction model to obtain a property prediction result of the target drug small molecule.
9. A storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the graph neural network-based drug small molecule property prediction method of any one of claims 1 to 7.
10. A computer device comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor implements the graph neural network-based drug small molecule property prediction method of any one of claims 1 to 7 when executing the program.
CN202111005476.0A 2021-08-30 2021-08-30 Drug small molecule property prediction method, device and equipment based on graph neural network Active CN113707236B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111005476.0A CN113707236B (en) 2021-08-30 2021-08-30 Drug small molecule property prediction method, device and equipment based on graph neural network
PCT/CN2022/071440 WO2023029352A1 (en) 2021-08-30 2022-01-11 Drug small molecule property prediction method and apparatus based on graph neural network, and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111005476.0A CN113707236B (en) 2021-08-30 2021-08-30 Drug small molecule property prediction method, device and equipment based on graph neural network

Publications (2)

Publication Number Publication Date
CN113707236A true CN113707236A (en) 2021-11-26
CN113707236B CN113707236B (en) 2024-05-14

Family

ID=78656947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111005476.0A Active CN113707236B (en) 2021-08-30 2021-08-30 Drug small molecule property prediction method, device and equipment based on graph neural network

Country Status (2)

Country Link
CN (1) CN113707236B (en)
WO (1) WO2023029352A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358202A (en) * 2022-01-11 2022-04-15 平安科技(深圳)有限公司 Information pushing method and device based on drug molecule image classification
CN114386694A (en) * 2022-01-11 2022-04-22 平安科技(深圳)有限公司 Drug molecule property prediction method, device and equipment based on comparative learning
CN114496302A (en) * 2021-12-29 2022-05-13 深圳云天励飞技术股份有限公司 Method for predicting pharmaceutical indications and related device
CN115274008A (en) * 2022-08-08 2022-11-01 苏州创腾软件有限公司 Molecular property prediction method and system based on graph neural network
WO2023029352A1 (en) * 2021-08-30 2023-03-09 平安科技(深圳)有限公司 Drug small molecule property prediction method and apparatus based on graph neural network, and device
CN116189809A (en) * 2023-01-06 2023-05-30 东南大学 Drug molecule important node prediction method based on challenge resistance
WO2023115343A1 (en) * 2021-12-21 2023-06-29 深圳晶泰科技有限公司 Data processing method and apparatus, model training method and free energy prediction method
CN116705195A (en) * 2023-06-07 2023-09-05 之江实验室 Method and device for predicting pharmaceutical properties of graph neural network based on vector quantization
CN118072861A (en) * 2024-04-17 2024-05-24 烟台国工智能科技有限公司 Molecular optimization method, device and medium based on multi-mode feature fusion

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117612633B (en) * 2024-01-23 2024-04-09 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Drug molecular property prediction method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190272468A1 (en) * 2018-03-05 2019-09-05 The Board Of Trustees Of The Leland Stanford Junior University Systems and Methods for Spatial Graph Convolutions with Applications to Drug Discovery and Molecular Simulation
CN111428848A (en) * 2019-09-05 2020-07-17 中国海洋大学 Molecular intelligent design method based on self-encoder and 3-order graph convolution
CN111816252A (en) * 2020-07-21 2020-10-23 腾讯科技(深圳)有限公司 Drug screening method and device and electronic equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200168302A1 (en) * 2017-07-20 2020-05-28 The University Of North Carolina At Chapel Hill Methods, systems and non-transitory computer readable media for automated design of molecules with desired properties using artificial intelligence
US10923214B2 (en) * 2017-09-07 2021-02-16 Accutar Biotechnology Inc. Neural network for predicting drug property
CN109033738B (en) * 2018-07-09 2022-01-11 湖南大学 Deep learning-based drug activity prediction method
CN111933225B (en) * 2020-09-27 2021-01-05 平安科技(深圳)有限公司 Drug classification method and device, terminal equipment and storage medium
CN113011282A (en) * 2021-02-26 2021-06-22 腾讯科技(深圳)有限公司 Graph data processing method and device, electronic equipment and computer storage medium
CN113241128B (en) * 2021-04-29 2022-05-13 天津大学 Molecular property prediction method based on molecular space position coding attention neural network model
CN113299354B (en) * 2021-05-14 2023-06-30 中山大学 Small molecule representation learning method based on Transformer and enhanced interactive MPNN neural network
CN113707236B (en) * 2021-08-30 2024-05-14 平安科技(深圳)有限公司 Drug small molecule property prediction method, device and equipment based on graph neural network

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023029352A1 (en) * 2021-08-30 2023-03-09 平安科技(深圳)有限公司 Drug small molecule property prediction method and apparatus based on graph neural network, and device
WO2023115343A1 (en) * 2021-12-21 2023-06-29 深圳晶泰科技有限公司 Data processing method and apparatus, model training method and free energy prediction method
CN114496302A (en) * 2021-12-29 2022-05-13 深圳云天励飞技术股份有限公司 Method for predicting pharmaceutical indications and related device
WO2023134063A1 (en) * 2022-01-11 2023-07-20 平安科技(深圳)有限公司 Comparative learning-based method, apparatus, and device for predicting properties of drug molecule
CN114386694A (en) * 2022-01-11 2022-04-22 平安科技(深圳)有限公司 Drug molecule property prediction method, device and equipment based on comparative learning
CN114358202A (en) * 2022-01-11 2022-04-15 平安科技(深圳)有限公司 Information pushing method and device based on drug molecule image classification
CN114386694B (en) * 2022-01-11 2024-02-23 平安科技(深圳)有限公司 Drug molecular property prediction method, device and equipment based on contrast learning
CN115274008A (en) * 2022-08-08 2022-11-01 苏州创腾软件有限公司 Molecular property prediction method and system based on graph neural network
CN116189809B (en) * 2023-01-06 2024-01-09 东南大学 Drug molecule important node prediction method based on challenge resistance
CN116189809A (en) * 2023-01-06 2023-05-30 东南大学 Drug molecule important node prediction method based on challenge resistance
CN116705195A (en) * 2023-06-07 2023-09-05 之江实验室 Method and device for predicting pharmaceutical properties of graph neural network based on vector quantization
CN116705195B (en) * 2023-06-07 2024-03-26 之江实验室 Method and device for predicting pharmaceutical properties of graph neural network based on vector quantization
CN118072861A (en) * 2024-04-17 2024-05-24 烟台国工智能科技有限公司 Molecular optimization method, device and medium based on multi-mode feature fusion

Also Published As

Publication number Publication date
CN113707236B (en) 2024-05-14
WO2023029352A1 (en) 2023-03-09

Similar Documents

Publication Publication Date Title
CN113707236B (en) Drug small molecule property prediction method, device and equipment based on graph neural network
CN113707235B (en) Drug micromolecule property prediction method, device and equipment based on self-supervision learning
Wang et al. Exploiting ontology graph for predicting sparsely annotated gene function
CN111524557B (en) Inverse synthesis prediction method, device, equipment and storage medium based on artificial intelligence
CN114386694B (en) Drug molecular property prediction method, device and equipment based on contrast learning
CN112905801B (en) Stroke prediction method, system, equipment and storage medium based on event map
US20230043540A1 (en) Method for predicting retrosynthesis of a compound molecule and related apparatus
CN111627494B (en) Protein property prediction method and device based on multidimensional features and computing equipment
CN113140254A (en) Meta-learning drug-target interaction prediction system and prediction method
Sarkar et al. An algorithm for DNA read alignment on quantum accelerators
CN113127667A (en) Image processing method and device, and image classification method and device
CN114613450A (en) Method and device for predicting property of drug molecule, storage medium and computer equipment
CN116206688A (en) Multi-mode information fusion model and method for DTA prediction
Tang et al. Single-cell multimodal prediction via transformers
Wang et al. Sparse imbalanced drug-target interaction prediction via heterogeneous data augmentation and node similarity
Peng et al. Pocket-specific 3d molecule generation by fragment-based autoregressive diffusion models
CN115458044A (en) Medicine and medicine interaction prediction method based on biological network global structure
Oliver et al. Approximate network motif mining via graph learning
CN114417982A (en) Model training method, terminal device and computer readable storage medium
CN113326877A (en) Model training method, data processing method, device, apparatus, storage medium, and program
KR20210027668A (en) A system of predicting compound activity for target protein using Fourier descriptor and artificial neural network
Liu et al. Efficient prediction of peptide self-assembly through sequential and graphical encoding
CN115794196B (en) Method, device, equipment and storage medium for identifying key software of edge X
Trivodaliev et al. Deep Learning the Protein Function in Protein Interaction Networks
Bao et al. ILSES: Identification lysine succinylation-sites with ensemble classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant