WO2023029352A1 - Method and apparatus for predicting properties of small drug molecules based on a graph neural network, and device - Google Patents

Method and apparatus for predicting properties of small drug molecules based on a graph neural network, and device

Info

Publication number
WO2023029352A1
Authority
WO
WIPO (PCT)
Prior art keywords
graph
molecular
feature vector
neural network
small molecule
Prior art date
Application number
PCT/CN2022/071440
Other languages
English (en)
Chinese (zh)
Inventor
王俊
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2023029352A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions

Definitions

  • The present application relates to the technical field of artificial intelligence, and in particular to a method, device and equipment for predicting properties of small drug molecules based on a graph neural network.
  • Polycyclic small-molecule drugs have been a hot spot in medicinal chemistry over the past 20 years, because the polycyclic structure significantly improves pharmacological activity, selectivity and druggability, and satisfies both the microstructural requirements of binding to the target and the macroscopic requirements of pharmacokinetic properties.
  • Most clinical polycyclic small-molecule drugs are cyclic peptides or lactones composed of 12 to 20 atoms. They are mainly used to treat diseases such as infection, inflammation and tumors, and are available in oral and injectable dosage forms.
  • Property prediction for polycyclic small-molecule drugs is crucial to further realizing the potential of deep learning in drug discovery.
  • The present application provides a method, device and equipment for predicting properties of small drug molecules based on a graph neural network, which can be used to solve the current technical problems of low efficiency and low accuracy in predicting properties of small drug molecules.
  • a method for predicting properties of small drug molecules based on a graph neural network comprising:
  • a device for predicting properties of small molecules of drugs based on a graph neural network comprising:
  • a generating module configured to generate a molecular graph structure according to the chemical molecular structure of the target drug small molecule, and generate a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule;
  • a first determining module configured to determine, by using the target graph neural network model, the first feature vector corresponding to the molecular graph structure and the second feature vector corresponding to the molecular subgraph structure;
  • a second determining module configured to construct a third feature vector according to the first feature vector and the second feature vector, and input the third feature vector into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
  • a computer-readable storage medium on which computer-readable instructions are stored; when the instructions are executed by a processor, the above-mentioned method for predicting properties of small drug molecules based on a graph neural network is implemented.
  • a computer device including a memory, a processor, and computer-readable instructions stored in the memory and operable on the processor.
  • When the processor executes the instructions, the above-mentioned method for predicting properties of small drug molecules based on a graph neural network is implemented.
  • The present application provides a method, device and equipment for predicting properties of small drug molecules based on a graph neural network.
  • The molecular graph structure and molecular subgraph structure of the target drug small molecule are generated first and then input into the target graph neural network model, which determines the first feature vector and the second feature vector of the target drug small molecule respectively. The first feature vector and the second feature vector are then used to construct the third feature vector, and the third feature vector is input into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
  • Fig. 1 shows a schematic flow chart of a method for predicting properties of small drug molecules based on a graph neural network provided by an embodiment of the present application;
  • Fig. 2 shows a schematic flow chart of another method for predicting properties of small drug molecules based on a graph neural network provided by an embodiment of the present application;
  • Fig. 3 shows a schematic diagram of the principle of predicting properties of small drug molecules based on a graph neural network provided by an embodiment of the present application;
  • Fig. 4 shows a schematic structural diagram of a device for predicting properties of small drug molecules based on a graph neural network provided by an embodiment of the present application;
  • Fig. 5 shows a schematic structural diagram of another device for predicting properties of small drug molecules based on a graph neural network provided by an embodiment of the present application.
  • Artificial intelligence (AI) uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • this application provides a method for predicting properties of small molecules of drugs based on graph neural network, as shown in Figure 1, the method includes:
  • The target drug small molecule is a polycyclic small molecule whose properties are to be predicted and analyzed.
  • The functional group intermediate structure is a structure between the atomic scale and the molecular scale, obtained by expressing the benzene rings in the chemical molecular structure of the polycyclic small molecule as functional groups.
  • The functional group intermediate structure contains the original functional groups in the chemical molecular structure and the new functional groups obtained by converting the benzene rings.
  • a functional group is an intermediate scale between the atomic and molecular scales, and is an atom or atomic group that determines the chemical properties of an organic compound. Common functional groups include hydroxyl, carboxyl, ether linkage, aldehyde, carbonyl, etc.
  • This application proposes to use a graph neural network model and integrate multi-scale molecular representations to predict the properties of polycyclic small-molecule drugs.
  • The experimental results show that the model has good predictive ability and has the potential to outperform both a traditional fully connected neural network and a single-scale prediction model (one without subgraph-scale information extraction).
  • The chemical molecular structure of the target drug small molecule can be extracted in advance, and the functional group intermediate structure can then be generated from the chemical molecular structure.
  • The benzene ring is a special functional group. If the model learns only from the atomic scale, the graph neural network may need several layers before it captures the functional information of the benzene ring; it is more direct to tell the model explicitly that the structure is a benzene ring. A functional-group-scale unit level can therefore be added to explicitly and directly extract information at the intermediate scale, so as to better realize representation learning and property prediction for molecules.
  • The benzene rings in the chemical molecular structure can be replaced with functional groups, yielding a functional group intermediate structure that includes the original functional groups in the chemical molecular structure and the new functional groups obtained by converting the benzene rings.
  • Each atom in the drug small molecule can be represented as a node in the molecular graph structure, and the forces between atoms (for example, chemical bonds) are represented by the edges between nodes.
  • Nodes can carry different information to express different atom types, and edges can carry different information to express different kinds of interaction.
  • In this way, the chemical molecular structure of a molecule can be expressed in the computer as a molecular graph structure.
  • the molecular subgraph structure can be further generated according to the intermediate structure of the functional group of the target drug small molecule.
  • In the molecular subgraph structure, each node represents a functional group, and the interactions between functional groups are represented by the edges between nodes. Adding a functional-group-scale unit level to explicitly and directly extract information at the intermediate scale allows better representation learning and property prediction for molecules.
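The construction of the molecular subgraph can be illustrated with a minimal sketch. It assumes the ring membership is already known (the application does not say how benzene rings are detected), and the function name `build_subgraph` and the phenol example are hypothetical illustrations rather than the applicant's implementation.

```python
def build_subgraph(num_atoms, bonds, groups):
    """Collapse groups of atoms into functional-group nodes.

    bonds:  list of (i, j) atom-index pairs (edges of the molecular graph).
    groups: list of sets of atom indices; each set becomes one subgraph node.
            Atoms not covered by any group each become their own node.
    """
    atom_to_node = {}
    for node_id, members in enumerate(groups):
        for atom in members:
            atom_to_node[atom] = node_id
    next_id = len(groups)
    for atom in range(num_atoms):
        if atom not in atom_to_node:
            atom_to_node[atom] = next_id
            next_id += 1

    # Keep only bonds that connect two different functional-group nodes.
    edges = set()
    for i, j in bonds:
        a, b = atom_to_node[i], atom_to_node[j]
        if a != b:
            edges.add((min(a, b), max(a, b)))
    return next_id, sorted(edges), atom_to_node


# Phenol heavy atoms (assumed indexing): ring carbons 0-5, hydroxyl oxygen 6.
phenol_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
num_nodes, sub_edges, mapping = build_subgraph(
    7, phenol_bonds, [{0, 1, 2, 3, 4, 5}, {6}]
)
print(num_nodes, sub_edges)
```

Here the seven-atom molecular graph collapses to two functional-group nodes (the benzene ring and the hydroxyl group) joined by one edge.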
  • The executor of this application can be a device for predicting the properties of small drug molecules, which can be configured on the client side or the server side. It can generate the molecular graph structure and molecular subgraph structure of the target drug small molecule in advance, and then input the molecular graph structure and the molecular subgraph structure into the target graph neural network model to determine the first feature vector and the second feature vector of the target drug small molecule respectively. Finally, it constructs the third feature vector from the first feature vector and the second feature vector, and inputs the third feature vector into the trained property prediction model to determine the property prediction result of the target drug small molecule.
  • A graph neural network (Graph Neural Network, GNN) is a neural network that operates on graph-structured data.
  • The input of a graph neural network is usually a graph structure, and its final output generally depends on the specific task.
  • The graph neural network learns an implicit vector representation of each node in the graph from the graph structure and the input node attributes. The goal is for the vector representation to carry enough expressive information to support information extraction for each node. Finally, a vector representation of the entire graph can be obtained through methods such as average pooling.
  • Before applying the graph neural network, it is necessary to pre-train it in combination with the task scenario. In general, if there are sufficient data and labels, the graph neural network can be pre-trained by supervised learning. In practice, however, there is often a large amount of data but only a small number of labels, and labeling data takes considerable effort. It would be a pity to discard the unlabeled data directly, so labels can be created for these unlabeled data. These created labels differ from the final labels of the learning task; otherwise there would be no need for model learning.
  • The graph neural network can learn the local information of each node in the graph structure, and this information is helpful for the final node classification task.
  • The label of the node is the final label to be predicted, and the degree of the node is the created label.
  • By using the graph neural network to predict node degrees, we obtain: 1) node embeddings suitable for node degree prediction; 2) a weight matrix of the graph neural network suitable for the node degree prediction task. The node embeddings can then be connected to a classifier and trained on the labeled data for classification; alternatively, training can continue directly on the graph neural network with the labeled data, adjusting the weight matrix to obtain a model suitable for the node classification task.
  • Pseudo-labels can be created from large-scale unlabeled data and used as supervisory signals for supervised learning of the model, so that the latent features and information in the data are learned effectively.
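The node-degree pseudo-label described above can be sketched in a few lines. This is only an illustration of how such a label could be derived from an adjacency matrix; the function name `degree_pseudo_labels` and the example graph are assumptions, not part of the application.

```python
def degree_pseudo_labels(adjacency):
    """Return the degree of every node, to be used as a created (pseudo) label.

    adjacency: square 0/1 matrix given as a list of lists.
    """
    return [sum(row) for row in adjacency]


# Hypothetical 4-node graph with edges 0-1, 0-2 and 2-3.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
pseudo_labels = degree_pseudo_labels(adjacency)
print(pseudo_labels)
```

No manual annotation is needed: the labels come from the graph structure itself, which is what makes them usable on unlabeled data.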
  • The steps of the embodiment may specifically include: obtaining an unlabeled graph data set and a first labeled graph data set, where the training task of the first labeled graph data set is different from the preset property prediction task; using the unlabeled graph data set as training samples, training a preset graph neural network model and adjusting its parameters to obtain a first graph neural network model; using the first labeled graph data set as training samples, training the first graph neural network model and adjusting its parameters to obtain a second graph neural network model; and using the second labeled graph data set corresponding to the preset property prediction task as training samples, training the second graph neural network model and adjusting its parameters to obtain the target graph neural network model.
  • The first labeled graph data set is used as training samples to train the first graph neural network model and adjust its parameters.
  • The resulting second graph neural network model has learned, from labeled graph data, basic rules for processing and analyzing graph data.
  • The second graph neural network model can therefore process and analyze the second labeled graph data quickly, which further improves model training efficiency and the quality of the trained graph neural network.
  • When the above-mentioned graph neural network model is trained on the training samples and its parameters are adjusted, about 15% of the nodes or node connections can be randomly masked in the adjacency matrix of the graph data to disturb the integrity of the original graph (for example, if the original molecular graph has 20 atom nodes, about 15%, that is 3 nodes, are randomly masked, and the adjacency matrix is perturbed accordingly).
  • The model learns a compact representation of the nodes of the graph data by learning to predict the masked nodes or node connections. If the model can predict the masked nodes or node attributes well, it has learned basic knowledge about the data and can achieve better performance on subsequent tasks.
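A minimal sketch of the masking step, assuming that "masking a node" means zeroing its row and column in the adjacency matrix (the application does not specify the exact perturbation). The function name `mask_nodes`, the fixed seed and the 20-node ring graph are hypothetical.

```python
import random


def mask_nodes(adjacency, mask_ratio=0.15, seed=0):
    """Randomly mask about mask_ratio of the nodes.

    Masking is modelled here (an assumption) as removing every connection of
    the chosen nodes, i.e. zeroing their rows and columns in a copy of the
    adjacency matrix.
    """
    n = len(adjacency)
    num_masked = max(1, round(mask_ratio * n))
    rng = random.Random(seed)
    masked = sorted(rng.sample(range(n), num_masked))

    perturbed = [row[:] for row in adjacency]  # leave the input untouched
    for v in masked:
        for u in range(n):
            perturbed[v][u] = 0
            perturbed[u][v] = 0
    return masked, perturbed


# Hypothetical 20-node ring graph standing in for a 20-atom molecular graph.
n = 20
ring = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
masked, perturbed = mask_nodes(ring)
print(len(masked))
```

With 20 nodes and a 15% ratio, three nodes are masked, matching the example in the text; a pre-training objective would then ask the model to recover the hidden nodes or connections.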
  • The molecular graph structure of the target drug small molecule can be input into the target graph neural network model to obtain the first feature vector at the molecular scale. In addition, the functional group intermediate structure of the target drug small molecule is input into the target graph neural network model to obtain the second feature vector at the intermediate scale, so that the first feature vector and the second feature vector can be used to construct the multi-scale molecular representation of the target drug small molecule.
  • The property prediction model can be any existing machine learning model, such as a linear regression model, decision tree model, neural network model, support vector machine model or hidden Markov model, which is not specifically limited in this application.
  • the property prediction results may specifically include one or more of target binding property predictions, activity predictions, toxicity predictions, efficacy predictions, water solubility predictions, adverse reaction predictions, and therapeutic effect predictions for a certain disease.
  • The property prediction type can be set according to the actual prediction scenario, which is not specifically limited in this solution. It should be noted that before performing the steps in this embodiment, the property prediction model needs to be trained in advance using labeled samples, so that the trained property prediction model can be used to predict the properties of the target drug small molecule.
  • The third feature vector obtained by fusion is input into the trained property prediction model to determine the property prediction result of the target drug small molecule.
  • The molecular graph structure and molecular subgraph structure of the target drug small molecule can be generated first and then input into the target graph neural network model, which determines the first feature vector and the second feature vector of the target drug small molecule respectively. The first feature vector and the second feature vector are then used to construct the third feature vector, and the third feature vector is input into the trained property prediction model to determine the property prediction result of the target drug small molecule.
  • Step 201 of the embodiment may specifically include: obtaining the chemical molecular structure of the target drug small molecule, determining the atoms in the chemical molecular structure as nodes in the molecular graph structure and the atomic connection relationships as edges in the molecular graph structure, and generating the molecular graph structure of the target drug small molecule; obtaining the functional group intermediate structure of the target drug small molecule, determining the functional groups in the functional group intermediate structure as nodes in the molecular subgraph structure and the functional group connection relationships as edges in the molecular subgraph structure, and generating the molecular subgraph structure of the target drug small molecule at the intermediate scale.
  • the specific pre-training process is the same as the pre-training process in step 102 of the embodiment, and will not be described again.
  • The adjacency matrix is an n × n matrix representing the connection relationships between nodes.
  • Elements for connected node pairs are 1, elements for unconnected node pairs are 0, and n is the number of nodes contained in the target drug small molecule. The attribute information may include node initial feature vectors of the atoms and edge initial feature vectors.
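As a small illustration of the adjacency matrix described above, the sketch below builds an n × n 0/1 matrix from an edge list. It assumes an undirected graph (a symmetric matrix); the function name `adjacency_matrix` and the three-atom example are hypothetical.

```python
def adjacency_matrix(num_nodes, edges):
    """Build an n x n 0/1 adjacency matrix from (i, j) edge pairs.

    The graph is treated as undirected (an assumption), so the matrix is
    symmetric.
    """
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


# Ethanol heavy atoms (assumed indexing): C0 - C1 - O2.
A = adjacency_matrix(3, [(0, 1), (1, 2)])
for row in A:
    print(row)
```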
  • The node initial feature vector is generated according to the first preset vector generation rule, which can be seen in Table 1. The node initial feature vector can be a 27-bit feature vector composed of 6 bits for the number of chemical bonds + 5 bits for formal charge + 4 bits for atomic chirality + 5 bits for the number of bound hydrogen atoms + 5 bits for atomic orbital hybridization + 1 bit for aromaticity + 1 bit for atomic mass.
  • The edge initial feature vector is generated according to the second preset vector generation rule, which can be seen in Table 2. The edge initial feature vector can be a 12-bit feature vector composed of 4 bits for chemical bond type + 1 bit for conjugation + 1 bit for ring membership + 6 bits for stereochemistry.
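The bit layout above can be sketched as concatenated one-hot segments. Only the segment widths (27 and 12 bits in total) come from the text; Tables 1 and 2 are not reproduced here, so every category vocabulary below, the mass scaling, and the function names are assumptions made for illustration.

```python
def one_hot(value, vocabulary):
    """One-hot encode value over vocabulary (all zeros if value is unknown)."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


# Assumed vocabularies; only their sizes are taken from the text.
BOND_COUNTS = [0, 1, 2, 3, 4, 5]                         # 6 bits
FORMAL_CHARGES = [-2, -1, 0, 1, 2]                       # 5 bits
CHIRALITY = ["none", "R", "S", "other"]                  # 4 bits
HYDROGEN_COUNTS = [0, 1, 2, 3, 4]                        # 5 bits
HYBRIDIZATION = ["sp", "sp2", "sp3", "sp3d", "sp3d2"]    # 5 bits
BOND_TYPES = ["single", "double", "triple", "aromatic"]  # 4 bits
STEREO = ["none", "any", "Z", "E", "cis", "trans"]       # 6 bits


def node_features(bond_count, charge, chirality, num_h, hybridization, aromatic, mass):
    """27-dimensional node vector: 6 + 5 + 4 + 5 + 5 + 1 + 1."""
    return (
        one_hot(bond_count, BOND_COUNTS)
        + one_hot(charge, FORMAL_CHARGES)
        + one_hot(chirality, CHIRALITY)
        + one_hot(num_h, HYDROGEN_COUNTS)
        + one_hot(hybridization, HYBRIDIZATION)
        + [1.0 if aromatic else 0.0]
        + [mass / 100.0]  # assumed scaling of the atomic mass
    )


def edge_features(bond_type, conjugated, in_ring, stereo):
    """12-dimensional edge vector: 4 + 1 + 1 + 6."""
    return (
        one_hot(bond_type, BOND_TYPES)
        + [1.0 if conjugated else 0.0]
        + [1.0 if in_ring else 0.0]
        + one_hot(stereo, STEREO)
    )


aromatic_carbon = node_features(3, 0, "none", 1, "sp2", True, 12.011)
ring_bond = edge_features("aromatic", True, True, "none")
print(len(aromatic_carbon), len(ring_bond))
```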
  • the molecular graph structure and the adjacency matrix and attribute information carried in the molecular graph structure can be input into the target graph neural network model, and the node hidden vectors of each node in the molecular graph structure can be obtained by using the iterative learning of the target graph neural network model .
  • the main process of target graph neural network model learning is to iteratively aggregate and update the neighbor information of nodes in the graph data.
  • Each node updates its own information by aggregating the features of its neighbor nodes and its own features from the previous layer, and a nonlinear transformation is usually applied to the aggregated information.
  • After k rounds of aggregation, each node can obtain the information of neighbor nodes within k hops.
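The aggregate-and-update loop described above can be illustrated with a deliberately simplified sketch: sum aggregation over neighbors plus the node's own vector, followed by a ReLU. It has no learned weights and is not the target graph neural network model of the application; the function names and the three-node path graph are assumptions.

```python
def relu(vector):
    """Element-wise nonlinear transformation."""
    return [max(0.0, x) for x in vector]


def message_passing_step(adjacency, hidden):
    """One round: each node sums its own vector and its neighbors' vectors."""
    n = len(adjacency)
    dim = len(hidden[0])
    updated = []
    for v in range(n):
        aggregated = list(hidden[v])  # the node's own previous-layer features
        for w in range(n):
            if adjacency[v][w]:
                for d in range(dim):
                    aggregated[d] += hidden[w][d]
        updated.append(relu(aggregated))
    return updated


# Path graph 0 - 1 - 2; only node 0 starts with a non-zero feature.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
hidden = [[1.0], [0.0], [0.0]]
after_one = message_passing_step(adjacency, hidden)
after_two = message_passing_step(adjacency, after_one)
print(after_one, after_two)
```

After one round node 2 is still zero, and after two rounds it carries information that originated at node 0, two hops away, which is the hop-count behavior the text describes.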
  • The message-passing stage is the forward propagation stage; it runs cyclically for T steps, obtaining messages through the message function M_t and updating nodes through the update function U_t.
  • e_vw represents the feature vector of the edge from node v to node w.
  • A feature vector is calculated to represent the entire graph, which is implemented using the function R.
  • The formula of the function R is described as:
  • The functions M_t, U_t and R can use different model settings, such as the graph convolutional network (Graph Convolutional Network, GCN) or the graph attention network (Graph Attention Network, GAT).
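The formulas themselves did not survive in this text. For orientation only, the definitions above (message function M_t, update function U_t, readout function R, edge feature e_vw, T steps) match the commonly used message-passing form below; the symbols h_v^t (hidden vector of node v at step t), m_v^{t+1} (aggregated message), N(v) (neighbors of v) and ŷ (graph-level output) are assumptions and may differ from the applicant's notation.

```latex
\begin{aligned}
m_v^{t+1} &= \sum_{w \in N(v)} M_t\left(h_v^{t},\, h_w^{t},\, e_{vw}\right) \\
h_v^{t+1} &= U_t\left(h_v^{t},\, m_v^{t+1}\right), \qquad t = 0, \dots, T-1 \\
\hat{y} &= R\left(\{\, h_v^{T} \mid v \in G \,\}\right)
\end{aligned}
```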
  • The central idea of molecular representation learning with the target graph neural network model can be understood as follows: if initial feature vectors are used to express the different nodes and edges, the final stable feature vector of each node can be found through iterative message propagation. After a fixed number of steps, for example T steps, the feature vector of each node reaches a certain equilibrium and no longer changes.
  • The final feature vector of each node also contains the information of its neighbor nodes and of the whole graph (for example, if an atom node in a chemical molecule contributes the most to a certain property of the molecule, it will have a correspondingly more specific expression in the final feature vector).
  • Step 203 of the embodiment may specifically include: calculating the average of the node hidden vectors and determining this average as the first feature vector of the target drug small molecule; or, extracting from the node hidden vectors the first node hidden vector corresponding to the largest hidden vector value, and determining the first node hidden vector as the first feature vector.
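Both readout options can be sketched briefly. The text does not define how the "largest hidden vector value" is measured, so the sketch assumes the L2 norm; that choice, the function names and the sample vectors are all hypothetical.

```python
import math


def mean_readout(node_vectors):
    """Average the node hidden vectors dimension by dimension."""
    n = len(node_vectors)
    return [sum(column) / n for column in zip(*node_vectors)]


def max_node_readout(node_vectors):
    """Pick the node hidden vector with the largest L2 norm (assumed criterion)."""
    return max(node_vectors, key=lambda v: math.sqrt(sum(x * x for x in v)))


hidden_vectors = [[1.0, 2.0], [3.0, 4.0], [2.0, 0.0]]
first_by_mean = mean_readout(hidden_vectors)
first_by_max = max_node_readout(hidden_vectors)
print(first_by_mean, first_by_max)
```

Mean pooling keeps a contribution from every node, while the second option keeps only the single most prominent node vector.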
  • The specific pre-training process is the same as the pre-training process in step 102 of the embodiment, and will not be described again. It should be noted that this application does not limit the execution order of the determination of the first feature vector in steps 202-203 of the embodiment and the determination of the second feature vector in this step; either determination process can be executed first.
  • the steps of the embodiment specifically include: using the sample feature vectors that match the preset property prediction tasks corresponding to the small molecule of the target drug as training samples, and training the preset property prediction model ; Calculate the loss function of the property prediction model, and when the loss function is smaller than the preset threshold, it is determined that the training of the property prediction model is completed.
  • the loss function is used to represent the prediction error of the prediction result of the property prediction model relative to the sample labeling result.
  • the preset threshold value is between 0 and 1, which is used to represent the training accuracy of the property prediction model. The closer the preset threshold is to 1.
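The stopping rule described above (training is considered complete once the loss falls below a preset threshold) can be sketched with a toy model. The linear model, mean squared error loss, learning rate and data below are stand-ins chosen for brevity and are not the property prediction model of the application.

```python
def train_until_threshold(samples, threshold=0.01, learning_rate=0.05, max_epochs=1000):
    """Fit y = weight * x + bias by gradient descent on the mean squared error.

    Training stops as soon as the loss is smaller than the preset threshold.
    """
    weight, bias = 0.0, 0.0
    loss = float("inf")
    n = len(samples)
    for _ in range(max_epochs):
        errors = [(weight * x + bias) - y for x, y in samples]
        loss = sum(e * e for e in errors) / n
        if loss < threshold:
            break  # training is considered complete
        grad_w = 2.0 * sum(e * x for e, (x, _) in zip(errors, samples)) / n
        grad_b = 2.0 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias, loss


# Toy labeled samples following y = 2x + 1 (hypothetical data).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias, final_loss = train_until_threshold(samples)
print(final_loss < 0.01)
```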
  • The property prediction model can be any existing machine learning model, such as a linear regression model, decision tree model, neural network model, support vector machine model or hidden Markov model. It can be selected according to actual application requirements and is not specifically limited in this application.
  • step 205 of the embodiment may specifically include: performing vector splicing processing on the first feature vector and the second feature vector according to a preset vector splicing rule to obtain a third feature vector;
  • the third feature vector is used as an input feature and input into the trained property prediction model to obtain the property prediction result of the small molecule of the target drug.
  • The preset vector splicing rule may include: splicing the first feature vector after the second feature vector to obtain the third feature vector; or splicing the second feature vector after the first feature vector to obtain the third feature vector; or adding the first feature vector and the second feature vector to obtain the third feature vector, and so on.
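The three splicing rules listed above can be written as a small helper. The function name `splice` and the rule names are hypothetical labels, and the addition rule is assumed to be element-wise, which requires vectors of equal length.

```python
def splice(first, second, rule="first_then_second"):
    """Combine two feature vectors into a third one according to a rule."""
    if rule == "first_then_second":
        return first + second  # second vector appended after the first
    if rule == "second_then_first":
        return second + first  # first vector appended after the second
    if rule == "add":
        if len(first) != len(second):
            raise ValueError("element-wise addition needs equal-length vectors")
        return [a + b for a, b in zip(first, second)]
    raise ValueError(f"unknown splicing rule: {rule}")


first_vector = [0.1, 0.2]    # e.g. molecular-graph scale
second_vector = [1.0, 2.0]   # e.g. functional-group scale
concatenated = splice(first_vector, second_vector)
reversed_order = splice(first_vector, second_vector, rule="second_then_first")
summed = splice(first_vector, second_vector, rule="add")
print(concatenated, reversed_order, summed)
```

Concatenation keeps the two scales in separate dimensions, whereas addition keeps the dimension fixed but mixes them; which is preferable is left open by the text.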
  • the process of predicting the properties of small drug molecules based on the graph neural network can be referred to the schematic diagram of the principle of predicting the properties of small molecules of drugs based on the graph neural network shown in Figure 3.
  • The molecular subgraph structure is generated according to the functional group intermediate structure of the target drug small molecule. The functional group intermediate structure is a structure between the atomic scale and the molecular scale, obtained by expressing the benzene rings in the corresponding chemical molecular structure as functional groups; it includes the original functional groups in the chemical molecular structure and the new functional groups obtained by converting the benzene rings. The target graph neural network model (Graph Neural Networks, GNN) is then used to determine the first feature vector corresponding to the molecular graph structure and the second feature vector corresponding to the molecular subgraph structure. Finally, a third feature vector expressed at multiple scales is constructed from the first feature vector and the second feature vector, and the third feature vector is input into the trained property prediction model to determine the property prediction result of the target drug small molecule.
  • The molecular graph structure and molecular subgraph structure of the target drug small molecule can be generated first and then input into the target graph neural network model, which determines the first feature vector and the second feature vector of the target drug small molecule respectively. The first feature vector and the second feature vector are then used to construct a third feature vector, and the third feature vector is input into the trained property prediction model to determine the property prediction result of the target drug small molecule.
  • The embodiment of the present application provides a device for predicting properties of small drug molecules based on a graph neural network. As shown in Figure 4, the device includes: a generation module 31, a first determination module 32 and a second determination module 33;
  • The generation module 31 can be used to generate a molecular graph structure according to the chemical molecular structure of the target drug small molecule, and to generate a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule;
  • The first determination module 32 can be used to determine, by using the target graph neural network model, the first feature vector corresponding to the molecular graph structure and the second feature vector corresponding to the molecular subgraph structure;
  • The second determination module 33 can be used to construct a third feature vector according to the first feature vector and the second feature vector, and to input the third feature vector into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
  • The generation module 31 can specifically be used to obtain the chemical molecular structure of the target drug small molecule, determine the atoms in the chemical molecular structure as nodes in the molecular graph structure and the atomic connection relationships as edges in the molecular graph structure, and generate the molecular graph structure of the target drug small molecule; and to obtain the functional group intermediate structure of the target drug small molecule, determine the functional groups in the functional group intermediate structure as nodes in the molecular subgraph structure and the functional group connection relationships as edges in the molecular subgraph structure, and generate the molecular subgraph structure of the target drug small molecule at the intermediate scale.
  • the acquisition module 34 can be used to acquire an unlabeled graph data set and a first labeled graph data set, and the training task of the first labeled graph data set is different from the preset property prediction task;
  • the first training module 35 can be used to train a preset graph neural network model with the unlabeled graph data set as training samples, adjusting the parameters of the graph neural network model to obtain a first graph neural network model;
  • the second training module 36 can be used to train the first graph neural network model with the first labeled graph data set as training samples, adjusting the parameters of the first graph neural network model to obtain a second graph neural network model;
  • the third training module 37 can be used to train the second graph neural network model with the second labeled graph data set corresponding to the preset property prediction task as training samples, adjusting the parameters of the second graph neural network model to obtain the target graph neural network model.
  • the molecular graph structure carries an adjacency matrix and attribute information; the attribute information includes the initial feature vectors of the nodes and the initial feature vectors of the edges, which are determined according to a preset vector generation rule;
  • the first determination module 32 can be specifically used to input the molecular graph structure, the adjacency matrix and the attribute information into the target graph neural network model to obtain the node hidden vector of each node in the molecular graph structure; generate the first feature vector of the target drug small molecule from the node hidden vectors of the nodes; and input the molecular subgraph structure into the target graph neural network model to determine the second feature vector of the target drug small molecule.
  • the first determination module 32 can specifically be used to calculate the average of the node hidden vectors and determine this hidden vector average as the first feature vector of the target drug small molecule; or to extract, from the node hidden vectors, the first node hidden vector corresponding to the largest hidden vector value, and determine the first node hidden vector as the first feature vector.
  • the second determination module 33 can specifically be used to perform vector splicing (concatenation) on the first feature vector and the second feature vector according to a preset vector splicing rule to obtain a third feature vector; the third feature vector is then input, as the input feature, into the trained property prediction model to obtain the property prediction result of the target drug small molecule.
  • in order to train the property prediction model in advance, as shown in FIG. 5, the device also includes a fourth training module 38 and a calculation module 39;
  • the fourth training module 38 can be used to use the sample feature vector matching the preset property prediction task corresponding to the small molecule of the target drug as a training sample to train a preset property prediction model;
  • the calculation module 39 can be used to calculate the loss function of the property prediction model, and when the loss function is smaller than the preset threshold, it is determined that the training of the property prediction model is completed.
  • this embodiment also provides a computer-readable storage medium.
  • the computer-readable storage medium can be volatile or non-volatile and stores computer-readable instructions; when the computer-readable instructions are executed by a processor, they implement the method for predicting properties of drug small molecules based on the graph neural network as shown in Figs. 1 to 2.
  • the technical solution of this application can be embodied in the form of a software product, which can be stored in a computer-readable storage medium (which can be a CD-ROM, a USB flash drive, a portable hard disk, etc.) and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods of the various implementation scenarios of the present application.
  • this embodiment also provides a computer device, which includes a memory and a processor: the memory stores computer-readable instructions, and the processor executes the computer-readable instructions to implement the method for predicting properties of drug small molecules based on graph neural networks as shown in Figures 1 to 2 above.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (Radio Frequency, RF) circuit, a sensor, an audio circuit, a WI-FI module, and the like.
  • the user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), and the like, and optional user interfaces may also include a USB interface, a card reader interface, and the like.
  • the network interface may include a standard wired interface, a wireless interface (such as a WI-FI interface), and the like.
  • the computer device structure described in this embodiment does not limit the physical device, which may include more or fewer components, combine some components, or arrange the components differently.
  • the computer-readable storage medium may also include an operating system and a network communication module.
  • the operating system is a program that manages the hardware and software resources of the above-mentioned computer equipment, and supports the operation of information processing programs and other software and/or programs.
  • the network communication module is used to realize the communication among various components in the computer-readable storage medium, and communicate with other hardware and software in the information processing entity device.
  • this application can first generate the molecular graph structure and the molecular subgraph structure of the target drug small molecule, and then input the molecular graph structure and the molecular subgraph structure separately into the target graph neural network model to determine the first feature vector and the second feature vector of the target drug small molecule; the first feature vector and the second feature vector are then used to construct the third feature vector, and the third feature vector is input into the trained property prediction model to determine the property prediction result of the target drug small molecule.
  • the accompanying drawing is only a schematic diagram of a preferred implementation scenario, and the modules or processes in the accompanying drawings are not necessarily required for implementing the present application.
  • the modules of the devices in the implementation scenario can be distributed among those devices as described for the implementation scenario, or can, with corresponding changes, be located in one or more devices different from those of the implementation scenario.
  • the modules of the above implementation scenarios can be combined into one module, or can be further split into multiple sub-modules.
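
The flow summarised above (two graph scales, one feature vector per scale, then splicing) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the one-hot initial features, the three-atom toy molecule, and in particular `message_pass`, which is a simple neighbour-averaging stand-in and not the target graph neural network model of this application. The `max_readout` function reflects only one possible reading of "the node hidden vector corresponding to the largest hidden vector value".

```python
from typing import List, Tuple

Vector = List[float]


def build_adjacency(num_nodes: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Undirected adjacency list; nodes are atoms (or functional groups)."""
    adj: List[List[int]] = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def message_pass(features: List[Vector], adj: List[List[int]], steps: int = 2) -> List[Vector]:
    """Hypothetical stand-in for the graph neural network: at each step a node's
    hidden vector becomes the average of its own vector and its neighbours'."""
    hidden = [list(f) for f in features]
    for _ in range(steps):
        updated = []
        for node, neighbours in enumerate(adj):
            group = [hidden[node]] + [hidden[n] for n in neighbours]
            updated.append([sum(col) / len(group) for col in zip(*group)])
        hidden = updated
    return hidden


def mean_readout(hidden: List[Vector]) -> Vector:
    """Average of all node hidden vectors (the 'hidden vector average value')."""
    return [sum(col) / len(hidden) for col in zip(*hidden)]


def max_readout(hidden: List[Vector]) -> Vector:
    """Assumed reading of the alternative: return the node hidden vector
    that contains the largest single component."""
    return max(hidden, key=max)


def third_feature_vector(first: Vector, second: Vector) -> Vector:
    """Vector splicing: concatenate the two feature vectors."""
    return list(first) + list(second)


# Toy atom-level graph: C-C-O chain, one-hot features [is_carbon, is_oxygen].
atom_adj = build_adjacency(3, [(0, 1), (1, 2)])
atom_hidden = message_pass([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], atom_adj)
first_vec = mean_readout(atom_hidden)

# Toy functional-group subgraph: two groups joined by one edge.
group_adj = build_adjacency(2, [(0, 1)])
group_hidden = message_pass([[1.0, 0.0], [0.0, 1.0]], group_adj)
second_vec = mean_readout(group_hidden)

# The spliced vector would be the input to the trained property prediction model.
third_vec = third_feature_vector(first_vec, second_vec)
```

In a real system the stand-in would be replaced by the trained target graph neural network model, and `third_vec` would be passed to the property prediction model; the sketch only shows how the atom-scale and functional-group-scale representations are produced separately and then combined.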

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a graph neural network-based method and apparatus for predicting the properties of a drug small molecule, and a device, which relate to the technical field of artificial intelligence and can solve the technical problems of low efficiency and low accuracy in current drug small molecule property prediction. The method comprises: generating a molecular graph structure according to the chemical molecular structure of a target drug small molecule, and generating a molecular subgraph structure according to the functional group intermediate structure of the target drug small molecule (101); using a target graph neural network model to determine a first feature vector corresponding to the molecular graph structure and a second feature vector corresponding to the molecular subgraph structure (102); and constructing a third feature vector according to the first feature vector and the second feature vector, inputting the third feature vector into a trained property prediction model, and obtaining a property prediction result for the target drug small molecule (103). The method enables intelligent, artificial intelligence-based prediction of the properties of drug small molecules.
PCT/CN2022/071440 2021-08-30 2022-01-11 Procédé de prédiction des propriétés d'une petite molécule médicamenteuse et appareil reposant sur un réseau de neurones graphiques, et dispositif WO2023029352A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111005476.0 2021-08-30
CN202111005476.0A CN113707236B (zh) 2021-08-30 2021-08-30 基于图神经网络的药物小分子性质预测方法、装置及设备

Publications (1)

Publication Number Publication Date
WO2023029352A1 true WO2023029352A1 (fr) 2023-03-09

Family

ID=78656947

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071440 WO2023029352A1 (fr) 2021-08-30 2022-01-11 Procédé de prédiction des propriétés d'une petite molécule médicamenteuse et appareil reposant sur un réseau de neurones graphiques, et dispositif

Country Status (2)

Country Link
CN (1) CN113707236B (fr)
WO (1) WO2023029352A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117612633A (zh) * 2024-01-23 2024-02-27 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) 一种药物分子性质预测方法

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113707236B (zh) * 2021-08-30 2024-05-14 平安科技(深圳)有限公司 基于图神经网络的药物小分子性质预测方法、装置及设备
WO2023115343A1 (fr) * 2021-12-21 2023-06-29 深圳晶泰科技有限公司 Procédé et appareil de traitement de données, procédé d'apprentissage de modèle et procédé de prédiction d'énergie libre
CN114496302A (zh) * 2021-12-29 2022-05-13 深圳云天励飞技术股份有限公司 药物适应症的预测方法及相关设备
CN114358202A (zh) * 2022-01-11 2022-04-15 平安科技(深圳)有限公司 基于药物分子图像分类的信息推送方法及装置
CN114386694B (zh) * 2022-01-11 2024-02-23 平安科技(深圳)有限公司 基于对比学习的药物分子性质预测方法、装置及设备
CN115274008A (zh) * 2022-08-08 2022-11-01 苏州创腾软件有限公司 基于图神经网络的分子性质预测方法和系统
CN116189809B (zh) * 2023-01-06 2024-01-09 东南大学 一种基于对抗攻击的药物分子重要节点预测方法
CN116705195B (zh) * 2023-06-07 2024-03-26 之江实验室 基于矢量量化的图神经网络的药物性质预测方法和装置
CN118072861B (zh) * 2024-04-17 2024-07-23 烟台国工智能科技有限公司 一种基于多模态特征融合的分子优化方法、设备及介质

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033738A (zh) * 2018-07-09 2018-12-18 湖南大学 一种基于深度学习的药物活性预测方法
US20190108320A1 (en) * 2017-09-07 2019-04-11 Accutar Biotechnology Inc. Neural network for predicting drug property
US20200168302A1 (en) * 2017-07-20 2020-05-28 The University Of North Carolina At Chapel Hill Methods, systems and non-transitory computer readable media for automated design of molecules with desired properties using artificial intelligence
CN111428848A (zh) * 2019-09-05 2020-07-17 中国海洋大学 基于自编码器和3阶图卷积的分子智能设计方法
CN111816252A (zh) * 2020-07-21 2020-10-23 腾讯科技(深圳)有限公司 一种药物筛选方法、装置及电子设备
CN111933225A (zh) * 2020-09-27 2020-11-13 平安科技(深圳)有限公司 药物分类方法、装置、终端设备以及存储介质
CN113011282A (zh) * 2021-02-26 2021-06-22 腾讯科技(深圳)有限公司 图数据处理方法、装置、电子设备及计算机存储介质
CN113241128A (zh) * 2021-04-29 2021-08-10 天津大学 基于分子空间位置编码注意力神经网络模型的分子性质预测方法
CN113299354A (zh) * 2021-05-14 2021-08-24 中山大学 基于Transformer和增强交互型MPNN神经网络的小分子表示学习方法
CN113707236A (zh) * 2021-08-30 2021-11-26 平安科技(深圳)有限公司 基于图神经网络的药物小分子性质预测方法、装置及设备

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019173401A1 (fr) * 2018-03-05 2019-09-12 The Board Of Trustees Of The Leland Stanford Junior University Systèmes et procédés pour convolutions graphiques spatiales ayant des applications dans la découverte de médicaments et la simulation moléculaire

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117612633A (zh) * 2024-01-23 2024-02-27 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) 一种药物分子性质预测方法
CN117612633B (zh) * 2024-01-23 2024-04-09 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) 一种药物分子性质预测方法

Also Published As

Publication number Publication date
CN113707236B (zh) 2024-05-14
CN113707236A (zh) 2021-11-26

Similar Documents

Publication Publication Date Title
WO2023029352A1 (fr) Procédé de prédiction des propriétés d'une petite molécule médicamenteuse et appareil reposant sur un réseau de neurones graphiques, et dispositif
WO2023029351A1 (fr) Procédé, appareil et dispositif basés sur un apprentissage auto-supervisé pour prédire des propriétés de petites molécules de médicament
JP7247258B2 (ja) コンピュータシステム、方法及びプログラム
WO2023134063A1 (fr) Procédé, appareil et dispositif basés sur l'apprentissage comparatif pour prédire des propriétés d'une molécule de médicament
WO2022222231A1 (fr) Procédé et appareil de prédiction d'interaction médicament-cible, dispositif, et support de stockage
Wang et al. Exploiting ontology graph for predicting sparsely annotated gene function
WO2021165887A1 (fr) Architecture d'autocodeur contradictoire pour des procédés de modèles de graphe à séquence
WO2022188653A1 (fr) Procédé et appareil de traitement de saut d'échafaudage moléculaire, support, dispositif électronique et produit-programme d'ordinateur
CN113168568A (zh) 用于具有深度特征化的主动迁移学习的系统和方法
CN111627494A (zh) 基于多维特征的蛋白质性质预测方法、装置和计算设备
WO2022267752A1 (fr) Procédé et appareil de traitement de composé basés sur l'intelligence artificielle, dispositif, support de stockage et produit-programme informatique
CN113470741A (zh) 药物靶标关系预测方法、装置、计算机设备及存储介质
WO2023168810A1 (fr) Procédé et appareil de prédiction des propriétés d'une molécule de médicament, support d'enregistrement et dispositif informatique
Bahra et al. A hybrid user mobility prediction approach for handover management in mobile networks
Joshi et al. Artificial intelligence for autonomous molecular design: A perspective
Shi et al. A review of machine learning-based methods for predicting drug–target interactions
Peng et al. Pocket-specific 3d molecule generation by fragment-based autoregressive diffusion models
CN116721713A (zh) 一种面向化学结构式识别的数据集构建方法和装置
CN116705192A (zh) 基于深度学习的药物虚拟筛选方法及装置
Wang et al. Sparse imbalanced drug-target interaction prediction via heterogeneous data augmentation and node similarity
Abbou et al. Logistic matrix factorisation and generative adversarial neural network-based method for predicting drug-target interactions
Shi et al. A Review on Predicting Drug Target Interactions Based on Machine Learning
Trivodaliev et al. Deep Learning the Protein Function in Protein Interaction Networks
WO2023226310A1 (fr) Procédé et appareil d'optimisation de molécule
US20240233883A1 (en) Generative machine learning on textual queries relating to molecules

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22862503

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22862503

Country of ref document: EP

Kind code of ref document: A1