CN115966266B - Anti-tumor molecule strengthening method based on graph neural network - Google Patents

Publication number
CN115966266B
Authority
CN
China
Prior art keywords
molecules
tumor
molecular
molecule
graph
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310015687.5A
Other languages
Chinese (zh)
Other versions
CN115966266A (en)
Inventor
施喻甜
王贝伦
金桥
王玟雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Application filed by Southeast University
Priority to CN202310015687.5A
Publication of CN115966266A
Application granted
Publication of CN115966266B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Image Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an anti-tumor molecule reinforcement learning method based on a graph neural network, comprising the following steps. Step 1: classify molecules into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database. Step 2: input the resulting graph into the proposed anti-tumor molecule strengthening model, learn the implicit representation of the graph from the differing properties of positive and negative molecules, and obtain the local structural features of anti-tumor molecules for subsequent generation and optimization of molecules. Step 3: apply constraints and perform target optimization to preserve the drug properties of the molecules during strengthening. Step 4: substitute the obtained local molecular structure into an anti-tumor molecule to modify and optimize its structure. Step 5: judge synthesizability with an existing molecular-property rationality checking tool, output the reasonable molecules, and further reverse-optimize the unreasonable ones. Step 6: obtain reasonable novel anti-tumor molecules and end the task.

Description

Anti-tumor molecule strengthening method based on graph neural network
Technical Field
The invention relates to a drug molecule strengthening technique based on a graph neural network, and belongs to the technical fields of anti-tumor drug molecule chemistry research and graph neural network reinforcement learning.
Background
The closest prior art and its problems are described below.
Optimizing drug molecules and developing new drugs are important for the treatment of tumors. The goal of drug molecule optimization is to strengthen the desired biological effects in a particular direction while preserving biopharmaceutical acceptability. The current problem is the high cost, in time and money, of developing a highly effective drug. In current pharmaceutical research, traditional strategies rely on screening existing compound libraries. However, because the structural diversity of existing compounds is limited and the library molecules have already been screened many times by research institutions, discovering and innovating highly potent drug molecules has become increasingly challenging.
With the development of machine learning, it plays a growing role in drug discovery; compared with traditional screening strategies, methods that combine machine learning with molecular research handle these tasks more effectively and scale better.
At present, practitioners in the field have developed de novo molecular design techniques: atom-based, fragment-based, and reaction-based molecular design methods that generate new, highly effective molecules from scratch through analysis and computation. The design idea is to abandon traditional coarse direct screening, balance the global and the local, explore key features, and construct target molecules from these clues. The drawback is that the molecular architecture must be generated from scratch, with substantial manual effort; explicit design goals and standardized design principles are also needed for the results to be interpretable. Moreover, because de novo molecular design is fragment-based, the quality of the resulting molecules depends heavily on the earlier processing. In summary, the outstanding challenges of current de novo design methods are high computational cost, low interpretability, and strong dependence of the results on pre-selection.
Disclosure of Invention
Technical problem: the invention provides an anti-tumor drug molecule reinforcement learning method based on a graph neural network. A graph neural network model for anti-tumor molecule strengthening is constructed, and feature extraction and molecular property classification are carried out on drug molecules; key features are extracted by modifying features and observing the change in the classification result. Based on the analysis of these key features, the original molecular structure is chemically strengthened and modified, finally yielding new molecules with stronger target characteristics. This improves the efficiency and interpretability of molecule generation and reduces the difficulty, development cycle, and cost of developing anti-tumor molecular drugs.
Technical solution: to achieve the purpose of the invention, the following technical solution is adopted: an anti-tumor molecule strengthening method based on a graph neural network, comprising the following steps:
Step 1: Molecules are classified into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database. Each input molecule is described as an undirected graph (graph matrix) in which nodes and edges correspond to atoms and chemical bonds, respectively. A graph neural network (GNN) model is constructed and pre-trained, with the chemical stability and drug properties of the generated molecules as loss functions.
Step 2: The obtained graph is input into the proposed anti-tumor molecule strengthening model, and the implicit representation of the graph is learned from the differing properties of positive and negative molecules to obtain the local structural features of anti-tumor molecules, which are used for subsequent generation and optimization of molecules.
Step 3: Constraints are applied and target optimization is performed to preserve the drug properties of the molecules during strengthening.
Step 4: The obtained local molecular structure is substituted into the anti-tumor molecule to modify and optimize its structure.
Step 5: Synthesizability is judged with an existing molecular-property rationality checking tool, and the reasonable molecules are output. Unreasonable molecules are further reverse-optimized.
Step 6: Reasonable novel anti-tumor molecules are obtained and the task ends.
Further, in step (1), the molecules are classified into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database. Each input molecule is described as an undirected graph (graph matrix) in which nodes and edges correspond to atoms and chemical bonds, respectively. The graph neural network (GNN) model is constructed and pre-trained as follows:
(101) The input molecule is represented as a graph $G = (V, E, X)$, where $G$ denotes the input molecule; $V$ denotes the atoms of the molecule, in $1 \times n$ one-hot encoding format; $E$ denotes the chemical bonds of the molecule, as an $n \times n$ adjacency matrix; and $X$ denotes the features of each atom of the molecule, as an $n \times n$ matrix. Each input molecule is thus described as an undirected graph (graph matrix), namely $G$.
(102) The molecules are classified into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database, denoted $G^+$ and $G^-$ respectively.
(103) A graph neural network (GNN) model is constructed whose input is the original undirected molecular graph $G$ and whose output is a binary classification probability matrix $P$; the GNN model is pre-trained with the chemical stability and drug properties of the generated molecules as loss functions.
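The graph representation in (101) can be illustrated with a minimal sketch. The toy molecule, the element vocabulary, and the helper name `build_graph` are assumptions made for illustration and do not come from the patent; in particular, the sketch encodes atom features as one-hot element vectors, which is only one possible choice for $X$.

```python
# Toy molecule: heavy atoms of ethanol, C-C-O (hypothetical example).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]      # undirected bonds as atom-index pairs
elements = ["C", "N", "O"]    # assumed element vocabulary for one-hot features


def build_graph(atoms, bonds, elements):
    """Return (V, E, X): atom indices, adjacency matrix, one-hot atom features."""
    n = len(atoms)
    V = list(range(n))
    E = [[0] * n for _ in range(n)]
    for i, j in bonds:
        E[i][j] = 1
        E[j][i] = 1           # undirected graph: the adjacency matrix is symmetric
    X = [[1 if atom == e else 0 for e in elements] for atom in atoms]
    return V, E, X


V, E, X = build_graph(atoms, bonds, elements)
print(E)  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(X)  # [[1, 0, 0], [1, 0, 0], [0, 0, 1]]
```

A real pipeline would more likely derive the atoms and bonds from a cheminformatics toolkit than from hand-written lists.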
Further, in step (2), the obtained graph is input into the proposed anti-tumor molecule strengthening model, and the implicit representation of the graph is learned from the differing properties of positive and negative molecules to obtain the local structural features of anti-tumor molecules, which are used for subsequent generation and optimization of molecules. The method comprises the following steps:
(201) The obtained molecule $G$ is input into a feature extraction model $f: G \to F \in \mathbb{R}^{n \times h}$ to extract the implicit feature $F$ of the molecule.
(202) The feature extraction model $f: G \to F \in \mathbb{R}^{n \times h}$ comprises two parts:
(1) A message-passing part (its formula is not reproduced in this text), which computes the features of each atom at layer $l$ from the node itself and its neighbor nodes $N(v)$; a separate symbol denotes the $j$-th row of a matrix, UPDATE is the update function of each layer, AGG is the aggregation function, and READOUT is the readout function. After $l$ iterations it yields the feature $h_G$ of the molecule.
(2) Taking $h_G$ as input, an MLP model produces the implicit feature $F$ of the molecule.
(203) The obtained implicit feature $F$ is input into a classification prediction network $c: F \to P_{out} \in \mathbb{R}^{n \times 2}$ to obtain the classification probability result matrix $P_u$.
(204) A local transformation is applied to the molecular structure, the feature extraction model is applied to obtain the transformed implicit feature $F'$, and $F'$ is input into the classification prediction network to obtain the classification probability result matrix $P_n$.
(205) The probability results obtained in steps (203) and (204) are input into the probability fluctuation function PFF, whose value is passed to the MEAS (measurement) function for analysis:
$$\mathrm{PFF} = \lVert P_u - P_n \rVert$$
$$S_{out} = \mathrm{MEAS}\{\mathrm{PFF}(P_u, P_n)\}$$
The local structural features $S_{out}$ whose modification causes a large probability fluctuation are extracted and taken as the output.
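The roles of AGG, UPDATE, and READOUT named in (202) can be sketched in a few lines. The concrete choices below (sum aggregation, averaging as the update, mean readout, no learned weights) are assumptions for illustration only; the patent does not specify them, and a trained model would use learned parameters.

```python
def message_passing(E, H, num_layers=2):
    """Run num_layers rounds of neighbor aggregation, then read out a graph vector.

    AGG     = element-wise sum over neighbor features (assumed choice)
    UPDATE  = average of a node's own feature and the aggregated message (assumed)
    READOUT = element-wise mean over all nodes (assumed)
    """
    n, d = len(H), len(H[0])
    for _ in range(num_layers):
        new_H = []
        for v in range(n):
            agg = [0.0] * d
            for u in range(n):
                if E[v][u]:                      # u is a neighbor of v
                    agg = [a + h for a, h in zip(agg, H[u])]
            new_H.append([0.5 * (own + msg) for own, msg in zip(H[v], agg)])
        H = new_H
    h_G = [sum(H[v][k] for v in range(n)) / n for k in range(d)]
    return H, h_G


E = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]            # adjacency of a 3-atom chain
H0 = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]        # hypothetical initial atom features
H, h_G = message_passing(E, H0, num_layers=1)
print(H)  # [[1.0, 0.0], [1.0, 0.5], [0.5, 0.5]]
```

The graph-level vector `h_G` is what would then be fed to the MLP in part (2).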
Further, in step (3), constraints are applied and target optimization is performed to preserve the drug properties of the molecules during strengthening, as follows:
(301) When the anti-tumor molecule strengthening model is trained, the QED value is used as a constraint in the MEAS function:
$$\mathrm{MEAS}: S_{out} = \mathrm{RF}\{\mathrm{PFF}(P_u, P_n) + \gamma\,\mathrm{QED}\}$$
where QED is calculated with RDKit. The RF function maps the QED-constrained value of the probability fluctuation function onto the molecular structure to obtain a local molecular structure, thereby ensuring the drug-likeness of the generated local features.
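A minimal sketch of the QED-constrained score follows. The fragment names, the PFF and QED values, the weight `GAMMA`, and the use of a simple arg-max as a stand-in for RF are all assumptions; the patent does not give a value for $\gamma$ or define RF in detail. In practice the QED values could be obtained from RDKit's `rdkit.Chem.QED.qed`, which is not called here so that the sketch stays dependency-free.

```python
GAMMA = 0.5  # assumed weighting factor gamma; the source does not give a value

# Hypothetical candidates: fragment name -> (PFF value, QED value in [0, 1]).
candidates = {
    "fragment_a": (0.40, 0.30),
    "fragment_b": (0.35, 0.80),
    "fragment_c": (0.10, 0.90),
}


def meas_score(pff, qed, gamma=GAMMA):
    """QED-constrained score: PFF + gamma * QED."""
    return pff + gamma * qed


def select_fragment(candidates, gamma=GAMMA):
    """Stand-in for RF: return the fragment with the highest constrained score."""
    return max(candidates, key=lambda name: meas_score(*candidates[name], gamma=gamma))


best = select_fragment(candidates)
print(best)  # fragment_b
```

With `gamma=0.0` the same call returns `fragment_a`, which shows how the QED term shifts the choice toward more drug-like fragments.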
Further, in step (4), the obtained local molecular structure is substituted into the anti-tumor molecule to modify and optimize its structure, as follows:
(401) An existing positive molecule is input as a graph $G = (V, E, X)$, where $G$ denotes the input molecule, $V$ denotes the atoms of the molecule, $E$ denotes the chemical bonds of the molecule, and $X$ denotes the features of each atom of the molecule, so that the input molecule is obtained in graph form.
(402) The molecule is strengthened using the molecular local structural features obtained from the trained model: an automatic iterative optimization method repeatedly searches for the most probable atom or chemical bond to attach, thereby modifying the structure of the molecule and applying the previously obtained local structural features to it, so that a molecule with better anti-tumor performance is constructed step by step.
Further, in step (5), synthesizability is judged with an existing molecular-property rationality checking tool and the reasonable molecules are output; unreasonable molecules are further reverse-optimized, as follows:
(501) A molecular-property rationality checking tool $p: G = (V, E, X) \to V \in \mathbb{R}$ is established to analyze the rationality of the molecular properties and evaluate chemical feasibility.
(502) The model $p$ mainly comprises a state-action module A (the Actor) and an evaluation module Q. At any time step $t$, the input of module A is a state and its output is an action, which is a tensor defined in the feature representation space of all initial reactants. Module Q computes the optimal value of the state through the Q network; the Actor uses this optimal value to iteratively update the parameters of the policy function, and thereby selects actions and obtains feedback and a new state. The environment takes the state, the best reaction template, and the action as inputs and computes whether the episode has ended. Finally, the evaluation value $V$ is output through a Softmax layer to represent the feasibility score.
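The final Softmax layer in (502) can be sketched as follows. The two-class layout (index 0 for infeasible, index 1 for feasible) and the function names are assumptions for illustration; the patent only states that a Softmax output represents the feasibility score.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feasibility_score(logits):
    """Assumed convention: index 0 = infeasible, index 1 = feasible."""
    return softmax(logits)[1]


score = feasibility_score([0.0, 0.0])
print(score)  # 0.5
```

Equal logits give a score of 0.5; a larger second logit pushes the score toward 1.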
Further, in step (6), reasonable novel anti-tumor molecules are obtained and the task ends, as follows:
(601) According to claim 6, under the feedback-penalty optimization, a generated molecule that clearly cannot be made into a practically usable drug is penalized: the anti-tumor molecule strengthening model must produce molecules that not only have strong anti-tumor characteristics but can also be manufactured and exist stably in practice. This highlights the need for multi-objective rewards and penalties when reinforcement learning is used for anti-tumor molecule strengthening.
(602) The variable definitions are restated:
$x_v \in \mathbb{R}^{D_V}$: feature vector of node $v$
$h_v \in \mathbb{R}^{D_V}$: state vector of node $v$
$x_{v_1,v_2} \in \mathbb{R}^{D_E}$: feature vector of edge $(v_1, v_2)$
During node computation, a node state update function must be defined so that the node states stabilize under iteration. For a node $v$, the transformation of its state vector is expressed by the state transition function of the node as:
(603) The back-propagation process is emphasized here:
In this step, the gradient of each parameter is obtained by back-propagation, and the parameters are then optimized by gradient descent. The iterative update is:
$$\Delta w_{ij} = \eta\,\delta_j\,x_i$$
Then, using back-propagation through the automatic differentiation of the PyTorch framework, the molecular structure is iteratively optimized so that the generated strengthened molecule meets multiple requirements such as anti-tumor activity and molecular rationality.
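The weight update in (603) can be illustrated with a minimal sketch. The helper name `delta_rule_step`, the numeric values, and the sign convention (the error term $\delta_j$ is assumed to already point in the descent direction) are assumptions; in the patent's setting the gradients would come from PyTorch's automatic differentiation rather than from hand-written code.

```python
def delta_rule_step(W, x, delta, eta):
    """Return W + dW, where dW[i][j] = eta * delta[j] * x[i].

    W     -- weight matrix indexed as W[i][j] (input i -> output j)
    x     -- input activations x_i
    delta -- per-output error terms delta_j (assumed to include the descent sign)
    eta   -- learning rate
    """
    return [
        [W[i][j] + eta * delta[j] * x[i] for j in range(len(delta))]
        for i in range(len(x))
    ]


W = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]
delta = [0.5, -1.0]
W_new = delta_rule_step(W, x, delta, eta=0.1)
# W_new is approximately [[0.05, -0.1], [0.1, -0.2]]
```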
Beneficial effects: compared with the prior art, the technical solution of the invention has the following beneficial technical effects:
1. The features of drug molecules are learned with a graph neural network, and given molecules are strengthened using the graph neural network together with reinforcement learning. The work of strengthening molecules is thereby handed over to a machine, which greatly improves efficiency compared with traditional manual strengthening.
2. Compared with methods that generate molecules from scratch according to a given molecular model, strengthening existing molecules based on a graph neural network and reinforcement learning improves both efficiency and accuracy.
3. The molecules generated by the method have higher interpretability and are easier to visualize, which helps in understanding the characteristics of the strengthened drug molecules and the reasons for the strengthening, and offers greater convenience to drug researchers.
4. Multiple constraints and property checks are applied during generation, ensuring the chemical rationality and drug properties of the generated molecules.
Drawings
FIG. 1 is a flow chart of the steps for implementing the present invention;
FIG. 2 is a diagram of an anti-tumor molecular reinforcement model framework;
FIG. 3 is an example molecular structural diagram;
FIG. 4 is a graph showing the results of an example molecular optimization structure.
Detailed Description
In order to enhance the understanding of the present invention, the present embodiment will be described in detail with reference to the accompanying drawings.
Example: the technical solution of the invention is described in detail below, taking molecules from the ChEMBL database as an example, with reference to the drawings.
An antitumor molecule strengthening method based on a graph neural network, which comprises the following steps:
Step (1): Molecules are classified into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database. Each input molecule is described as an undirected graph (graph matrix) in which nodes and edges correspond to atoms and chemical bonds, respectively. The graph neural network (GNN) model is constructed and pre-trained, with the chemical stability and drug properties of the generated molecules as loss functions, as follows:
(101) The input molecule is represented as a graph $G = (V, E, X)$, where $G$ denotes the input molecule; $V$ denotes the atoms of the molecule, in $1 \times n$ one-hot encoding format; $E$ denotes the chemical bonds of the molecule, as an $n \times n$ adjacency matrix; and $X$ denotes the features of each atom of the molecule, as an $n \times n$ matrix. Each input molecule is thus described as an undirected graph (graph matrix), namely $G$.
(102) The molecules are classified into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database, denoted $G^+$ and $G^-$ respectively.
(103) A graph neural network (GNN) model is constructed with the chemical stability and drug properties of the generated molecules as loss functions. The model structure is shown in FIG. 2; its input is the original undirected molecular graph $G$ and its output is a binary classification probability matrix $P$. The GNN model is pre-trained.
Step (2): The obtained graph is input into the proposed anti-tumor molecule strengthening model, and the implicit representation of the graph is learned from the differing properties of positive and negative molecules to obtain the local structural features of anti-tumor molecules, which are used for subsequent generation and optimization of molecules. The method comprises the following steps:
(201) The obtained molecule $G$ is input into a feature extraction model $f: G \to F \in \mathbb{R}^{n \times h}$ to extract the implicit feature $F$ of the molecule.
(202) The feature extraction model $f: G \to F \in \mathbb{R}^{n \times h}$ comprises two parts:
(1) A message-passing part (its formula is not reproduced in this text), which computes the features of each atom at layer $l$ from the node itself and its neighbor nodes $N(v)$; a separate symbol denotes the $j$-th row of a matrix, UPDATE is the update function of each layer, AGG is the aggregation function, and READOUT is the readout function. After $l$ iterations it yields the feature $h_G$ of the molecule.
(2) Taking $h_G$ as input, an MLP model produces the implicit feature $F$ of the molecule.
(203) The obtained implicit feature $F$ is input into a classification prediction network $c: F \to P_{out} \in \mathbb{R}^{n \times 2}$ to obtain the classification probability result matrix $P_u$.
(204) A local transformation is applied to the molecular structure, the feature extraction model is applied to obtain the transformed implicit feature $F'$, and $F'$ is input into the classification prediction network to obtain the classification probability result matrix $P_n$.
(205) The probability results obtained in steps (203) and (204) are input into the probability fluctuation function PFF, whose value is passed to the MEAS (measurement) function for analysis:
$$\mathrm{PFF} = \lVert P_u - P_n \rVert$$
$$S_{out} = \mathrm{MEAS}\{\mathrm{PFF}(P_u, P_n)\}$$
The local structural features $S_{out}$ whose modification causes a large probability fluctuation are extracted and taken as the output.
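The ranking of local structures by probability fluctuation in (205) can be sketched as below. The structure names and probability values are hypothetical, each probability result is reduced to a single two-entry row rather than a full $n \times 2$ matrix, and the Euclidean norm is an assumed choice because the patent does not state which norm is meant.

```python
import math


def pff(p_u, p_n):
    """Probability fluctuation: Euclidean norm of P_u - P_n (norm type assumed)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_u, p_n)))


# Hypothetical class probabilities [negative, positive] for the unmodified molecule.
p_unmodified = [0.2, 0.8]
# Hypothetical probabilities after perturbing each candidate local structure.
p_modified = {
    "ring": [0.7, 0.3],
    "hydroxyl": [0.25, 0.75],
    "amide": [0.4, 0.6],
}

fluctuation = {name: pff(p_unmodified, p) for name, p in p_modified.items()}
ranked = sorted(fluctuation, key=fluctuation.get, reverse=True)
print(ranked)  # ['ring', 'amide', 'hydroxyl']
```

Structures at the head of `ranked` are the ones whose modification changes the prediction most, which is the selection criterion described for $S_{out}$.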
Step (3): Constraints are applied and target optimization is performed to preserve the drug properties of the molecules during strengthening, as follows:
(301) When the anti-tumor molecule strengthening model is trained, the QED value is used as a constraint in the MEAS function:
$$\mathrm{MEAS}: S_{out} = \mathrm{RF}\{\mathrm{PFF}(P_u, P_n) + \gamma\,\mathrm{QED}\}$$
where QED is calculated with RDKit. The RF function maps the QED-constrained value of the probability fluctuation function onto the molecular structure to obtain a local molecular structure, thereby ensuring the drug-likeness of the generated local features.
Step (4): The obtained local molecular structure is substituted into the anti-tumor molecule to modify and optimize its structure, as follows:
(401) An existing positive molecule is input as a graph $G = (V, E, X)$, where $G$ denotes the input molecule, $V$ denotes the atoms of the molecule, $E$ denotes the chemical bonds of the molecule, and $X$ denotes the features of each atom of the molecule, so that the input molecule is obtained in graph form.
(402) The molecule is strengthened using the molecular local structural features obtained from the trained model: an automatic iterative optimization method repeatedly searches for the most probable atom or chemical bond to attach, thereby modifying the structure of the molecule and applying the previously obtained local structural features to it, so that a molecule with better anti-tumor performance is constructed step by step.
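The automatic iterative optimization in (402) resembles a greedy search, which the following sketch illustrates. Everything concrete here is hypothetical: a molecule is reduced to a set of fragment labels, the scoring function is a toy weighted sum, and greedy hill climbing is a swapped-in technique rather than the patent's stated optimizer.

```python
def strengthen(molecule, candidate_edits, score, max_steps=10):
    """Greedy loop: apply the edit that most improves score until none helps.

    molecule        -- iterable of fragment labels (toy stand-in for a graph)
    candidate_edits -- fragment labels that may be attached
    score           -- callable returning a float; higher is better
    """
    current = frozenset(molecule)
    for _ in range(max_steps):
        options = [current | {edit} for edit in candidate_edits if edit not in current]
        if not options:
            break
        best = max(options, key=score)
        if score(best) <= score(current):
            break                      # no remaining edit improves the score
        current = best
    return current


# Hypothetical per-fragment contributions to a predicted anti-tumor score.
weights = {"core": 1.0, "fluoro": 0.3, "nitro": -0.2, "amide": 0.1}


def toy_score(mol):
    return sum(weights[f] for f in mol)


result = strengthen({"core"}, ["fluoro", "nitro", "amide"], toy_score)
print(sorted(result))  # ['amide', 'core', 'fluoro']
```

The loop stops once the only remaining edit (`nitro`) would lower the score, mirroring the idea of building a better molecule step by step.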
Step (5): Synthesizability is judged with an existing molecular-property rationality checking tool and the reasonable molecules are output; unreasonable molecules are further reverse-optimized, as follows:
(501) A molecular-property rationality checking tool $p: G = (V, E, X) \to V \in \mathbb{R}$ is established to analyze the rationality of the molecular properties and evaluate chemical feasibility.
(502) The model $p$ mainly comprises a state-action module A (the Actor) and an evaluation module Q. At any time step $t$, the input of module A is a state and its output is an action, which is a tensor defined in the feature representation space of all initial reactants. Module Q computes the optimal value of the state through the Q network; the Actor uses this optimal value to iteratively update the parameters of the policy function, and thereby selects actions and obtains feedback and a new state. The environment takes the state, the best reaction template, and the action as inputs and computes whether the episode has ended. Finally, the evaluation value $V$ is output through a Softmax layer to represent the feasibility score.
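The overall control flow of step (5), accepting reasonable molecules and sending unreasonable ones back for further optimization, can be sketched as a simple loop. The threshold, the round limit, and the toy feasibility and refinement functions are assumptions introduced for illustration; they stand in for the checking tool $p$ and the reverse-optimization step, whose internals the patent describes only at a high level.

```python
def screen(molecules, feasibility, refine, threshold=0.5, max_rounds=3):
    """Accept molecules whose feasibility meets the threshold; refine the rest.

    feasibility -- callable returning a score in [0, 1] (stand-in for the tool p)
    refine      -- callable returning a modified molecule (stand-in for the
                   reverse-optimization step)
    """
    accepted, rejected = [], []
    for mol in molecules:
        for _ in range(max_rounds):
            if feasibility(mol) >= threshold:
                break
            mol = refine(mol)
        (accepted if feasibility(mol) >= threshold else rejected).append(mol)
    return accepted, rejected


# Toy stand-ins: a "molecule" is an integer count of problematic groups.
def toy_feasibility(mol):
    return 1.0 / (1 + mol)


def toy_refine(mol):
    return max(mol - 1, 0)


accepted, rejected = screen([0, 2, 9], toy_feasibility, toy_refine)
print(accepted, rejected)  # [0, 1] [6]
```

The second input becomes acceptable after one refinement, while the third is still below the threshold when the round limit is reached.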
Step (6): Reasonable novel anti-tumor molecules are obtained and the task ends, as follows:
(601) According to claim 6, under the feedback-penalty optimization, a generated molecule that clearly cannot be made into a practically usable drug is penalized: the anti-tumor molecule strengthening model must produce molecules that not only have strong anti-tumor characteristics but can also be manufactured and exist stably in practice. This highlights the need for multi-objective rewards and penalties when reinforcement learning is used for anti-tumor molecule strengthening.
(602) The variable definitions are restated:
$x_v \in \mathbb{R}^{D_V}$: feature vector of node $v$
$h_v \in \mathbb{R}^{D_V}$: state vector of node $v$
$x_{v_1,v_2} \in \mathbb{R}^{D_E}$: feature vector of edge $(v_1, v_2)$
During node computation, a node state update function must be defined so that the node states stabilize under iteration. For a node $v$, the transformation of its state vector is expressed by the state transition function of the node as:
(603) The back-propagation process is emphasized here:
In this step, the gradient of each parameter is obtained by back-propagation, and the parameters are then optimized by gradient descent. The iterative update is:
$$\Delta w_{ij} = \eta\,\delta_j\,x_i$$
Then, using back-propagation through the automatic differentiation of the PyTorch framework, the molecular structure is iteratively optimized so that the generated strengthened molecule meets multiple requirements such as anti-tumor activity and molecular rationality. The optimization of one molecule from the ChEMBL database is shown in FIG. 4.
It should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and equivalent changes or substitutions made on the basis of the above-mentioned technical solutions fall within the scope of the present invention as defined in the claims.

Claims (5)

1. An anti-tumor molecule reinforcement learning method based on a graph neural network is characterized by comprising the following steps of:
step 1: dividing the molecules into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in a database, describing each input molecule as an undirected graph, namely a graph matrix, in which nodes and edges correspond to atoms and chemical bonds respectively, constructing a graph neural network (GNN) model, taking the chemical stability and drug properties of the generated molecules as loss functions, and pre-training on a node classification task,
step 2: inputting the obtained graph into a proposed anti-tumor molecule strengthening model, learning the implicit expression of the graph according to different properties of positive molecules and negative molecules, obtaining the local structural characteristics of the anti-tumor molecules for one-step generation and optimization of the molecules,
step 3: constraint is applied to perform target optimization, so as to ensure the drug properties of molecules in the molecular strengthening process,
step 4: substituting the obtained local molecular structure into the anti-tumor molecule to modify and optimize the structure,
step 5: the existing molecular property reasonable detection tool is used for judging the synthesizability, outputting reasonable molecules, and aiming at unreasonable molecules, further reversely optimizing,
step 6: obtaining reasonable novel anti-tumor molecules and ending the task;
wherein in step 1, the molecules are divided into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database, each input molecule is described as an undirected graph, namely a graph matrix, in which nodes and edges correspond to atoms and chemical bonds respectively, and the graph neural network (GNN) model is constructed and pre-trained, specifically as follows:
(101) representing the input molecule as a graph $G = (V, E, X)$, wherein $G$ denotes the input molecule, $V$ denotes the atoms of the molecule in $1 \times n$ one-hot encoding format, $E$ denotes the chemical bonds of the molecule as an $n \times n$ adjacency matrix, and $X$ denotes the features of each atom of the molecule as an $n \times n$ matrix, each input molecule being described as an undirected graph, namely a graph matrix, namely $G$;
(102) dividing the molecules into anti-tumor (positive) and non-anti-tumor (negative) categories according to the molecular labels in the database, denoted $G^+$ and $G^-$ respectively;
(103) constructing a graph neural network (GNN) model whose input is the original undirected molecular graph $G$ and whose output is a binary classification probability matrix $P$, and pre-training the GNN model;
in step 2, the obtained graph is input into the proposed anti-tumor molecule strengthening model, and the implicit representation of the graph is learned from the differing properties of positive and negative molecules to obtain the local structural features of anti-tumor molecules, which are used for subsequent generation and optimization of molecules, as follows:
(201) Inputting the obtained molecule G into a feature extraction model
f:G→F∈R n×h
The implicit character F of the molecule is extracted,
(202) Feature extraction fG.fwdarw.F.epsilon.R n×h The model comprises (1)
Wherein the method comprises the steps ofCharacterization of the individual molecules obtained in layer 1>For matrix->The j-th row content, N (v) is the neighbor node of v, UPDATE is the UPDATE function of each layer, AGG is the aggregation function, READOUT is the read-out function, and the characteristic (2) of the molecule is obtained after l iterations to obtain h G As input, the molecules are obtained by an MLP modelIs a function of the implicit characteristic F of (a),
(203) Inputting the obtained implicit characteristic F into a classification prediction network
c:F→P out ∈R n×2
Obtaining a binary probability result matrix P u
(204) Carrying out partial transformation on the obtained molecular structure, carrying out feature extraction model to obtain transformed implicit feature F ', inputting the obtained implicit feature F' into a classification prediction network to obtain a classification probability result matrix P n
(205) Inputting the probability results obtained in steps (203) and (204) into the probability fluctuation function PFF (fluctuation probability function), and passing the PFF result into the measurement function MEAS for analysis:
$\mathrm{PFF}(P_u, P_n) = \lVert P_u - P_n \rVert$
$S_{out} = \mathrm{MEAS}\{\mathrm{PFF}(P_u, P_n)\}$,
thereby extracting, as output, the molecular local structural features $S_{out}$ whose modification causes a large fluctuation in the predicted probability.
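Steps (201) to (205) can be illustrated with a small, self-contained sketch. It is not the patented model: the aggregation (neighbor mean), update (average of own state and aggregate), readout (node mean), logistic classifier, Euclidean norm, and the atom-masking "partial transformation" are all assumptions chosen for brevity, and the names `extract_features`, `classify`, `pff`, `mask_atom`, and `rank_atoms` are hypothetical. The graph and feature values are made up.

```python
import math

def extract_features(adj, X, layers=2):
    """Toy f: message passing followed by a mean READOUT, returning h_G."""
    H = [row[:] for row in X]                      # initial node states = rows of X
    for _ in range(layers):
        nxt = []
        for v, h_v in enumerate(H):
            nbrs = adj[v]
            # AGG (assumed): mean of the neighbor states in N(v)
            agg = [sum(H[u][k] for u in nbrs) / len(nbrs) if nbrs else 0.0
                   for k in range(len(h_v))]
            # UPDATE (assumed): average of the node's own state and AGG
            nxt.append([0.5 * (a + b) for a, b in zip(h_v, agg)])
        H = nxt
    # READOUT (assumed): mean over all nodes
    return [sum(h[k] for h in H) / len(H) for k in range(len(H[0]))]

def classify(h_G, w, b=0.0):
    """Toy c: logistic score turned into a [p_negative, p_positive] row."""
    z = sum(wi * hi for wi, hi in zip(w, h_G)) + b
    p_pos = 1.0 / (1.0 + math.exp(-z))
    return [1.0 - p_pos, p_pos]

def pff(p_u, p_n):
    """Euclidean norm of P_u - P_n (the choice of norm is an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_u, p_n)))

def mask_atom(X, i):
    """Partial transformation (assumed): zero out the features of atom i."""
    return [[0.0] * len(row) if j == i else row[:] for j, row in enumerate(X)]

def rank_atoms(adj, X, w):
    """MEAS stand-in: score each atom by the fluctuation its masking causes."""
    p_u = classify(extract_features(adj, X), w)
    scores = [pff(p_u, classify(extract_features(adj, mask_atom(X, i)), w))
              for i in range(len(X))]
    order = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
    return scores, order

# Hypothetical 3-atom chain 0-1-2 with made-up 2-dimensional atom features.
adj = [[1], [0, 2], [1]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
scores, order = rank_atoms(adj, X, w=[2.0, -1.0])
print(scores, order)
```

Atoms near the front of `order` are those whose removal changes the predicted probability the most, which is the sense in which a local structure is treated as important in this sketch.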
2. The anti-tumor molecule strengthening method based on a graph neural network according to claim 1, wherein in step (3) a constraint is applied to the optimization target so as to preserve the drug-likeness of the molecules while they are strengthened, comprising the following step:
(301) When training the anti-tumor molecule strengthening model, constraining the MEAS function with the QED (quantitative estimate of drug-likeness) value:
$\mathrm{MEAS}: S_{out} = \mathrm{RF}\{\mathrm{PFF}(P_u, P_n) + \gamma\,\mathrm{QED}\}$
where QED is calculated with RDKit, and the RF function maps the QED-constrained probability fluctuation value back onto the molecular structure to obtain a local molecular structure, thereby ensuring the drug-likeness of the generated local features.
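The effect of the weight γ in step (301) can be shown with a minimal sketch. The function names, fragment labels, and numbers below are illustrative assumptions, not values from the patent; selecting the highest-scoring candidate stands in for the RF mapping. In practice the QED value of a molecule could be obtained from RDKit via `rdkit.Chem.QED.qed(mol)`, but here it is passed in as a plain number so the example runs without RDKit.

```python
def constrained_score(pff_value, qed_value, gamma=0.5):
    """PFF + gamma * QED, the quantity inside RF{...} in step (301)."""
    if not 0.0 <= qed_value <= 1.0:
        raise ValueError("QED is defined on the interval [0, 1]")
    return pff_value + gamma * qed_value

def select_fragment(candidates, gamma=0.5):
    """RF stand-in (assumed): return the label of the best-scoring candidate.

    candidates: list of (label, pff_value, qed_value) tuples.
    """
    best = max(candidates, key=lambda c: constrained_score(c[1], c[2], gamma))
    return best[0]

# Made-up candidates: frag_a fluctuates more, frag_b is more drug-like.
candidates = [("frag_a", 0.30, 0.20), ("frag_b", 0.25, 0.90)]
print(select_fragment(candidates, gamma=0.0))  # fluctuation only
print(select_fragment(candidates, gamma=0.5))  # fluctuation plus QED constraint
```

With γ = 0 the choice depends on probability fluctuation alone; raising γ shifts the choice toward fragments with higher drug-likeness.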
3. The anti-tumor molecule strengthening method based on a graph neural network according to claim 1, wherein in step (4) the obtained local molecular structure is substituted into the structure of an anti-tumor molecule for modification and optimization, comprising the following steps:
(401) Inputting an existing positive molecule as a graph G = (V, E, X), where G denotes the input molecule, V the atoms of the molecule, E the chemical bonds, and X the features of each atom, so that the input molecule is obtained in graph form,
(402) Strengthening the molecule with the molecular local structural features obtained from the trained anti-tumor molecule strengthening model: an automatic iterative optimization method repeatedly searches for the most probable atom or chemical bond to attach, thereby modifying the structure of the molecule and applying the previously obtained local structural features to it, so that a molecule with better anti-tumor performance is built up step by step.
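The graph form G = (V, E, X) of step (401) and the iterative search of step (402) can be sketched as a greedy loop. This is an illustration under stated assumptions: `MolGraph`, `candidate_graphs`, `strengthen`, and `toy_score` are hypothetical names, the only edit considered is adding one bond, chemical valence is ignored, and `toy_score` is a placeholder for the trained model's score.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class MolGraph:
    atoms: tuple                     # V: element symbol of each atom
    features: tuple                  # X: one feature tuple per atom
    bonds: frozenset = frozenset()   # E: undirected bonds stored as (i, j), i < j

def candidate_graphs(g):
    """Yield every graph reachable by adding one missing bond (toy edit set)."""
    for i, j in combinations(range(len(g.atoms)), 2):
        if (i, j) not in g.bonds:
            yield MolGraph(g.atoms, g.features, g.bonds | {(i, j)})

def strengthen(g, score, max_steps=10):
    """Greedy loop: apply the best-scoring single edit until nothing improves."""
    best_score = score(g)
    for _ in range(max_steps):
        improved = None
        for trial in candidate_graphs(g):
            s = score(trial)
            if s > best_score:
                improved, best_score = trial, s
        if improved is None:
            break
        g = improved
    return g, best_score

def toy_score(g):
    """Hypothetical stand-in for the trained model: count C-N bonds."""
    return sum(1 for i, j in g.bonds if {g.atoms[i], g.atoms[j]} == {"C", "N"})

start = MolGraph(atoms=("C", "N", "O"),
                 features=((6,), (7,), (8,)),
                 bonds=frozenset({(0, 2)}))
final, final_score = strengthen(start, toy_score)
print(sorted(final.bonds), final_score)
```

The loop stops as soon as no single edit raises the score, so it finds a local optimum only; a real system would need a richer edit set and validity checks.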
4. The anti-tumor molecule strengthening method based on a graph neural network according to claim 1, wherein in step (5) an existing molecular property rationality detection tool is used to judge synthesizability; reasonable molecules are output, and unreasonable molecules are further optimized in reverse, as follows:
(501) Establishing an existing molecular property rationality detection tool $p: G = (V, E, X) \to V \in \mathbb{R}$, which analyzes the rationality of molecular properties and evaluates chemical feasibility,
(502) The model p mainly comprises a state-action module A and a reward module Q. At any time step t, the input of module A is a state and its output is an action, the action being a tensor defined in the feature representation space of all initial reactants; module Q computes the optimal value of the state through a Q network. The actor (module A) iteratively updates the parameters of the policy function using this optimal value and then selects an action, obtaining feedback and a new state; the environment takes the state, the optimal reaction template, and the action as references to compute whether the episode has ended; finally, an evaluation value V representing the feasibility score is output through a Softmax layer.
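The interplay between an action-selecting module and a value-estimating module in step (502) can be illustrated with a much simpler substitute: tabular Q-learning over discrete states and actions. This is not the claimed architecture, where actions are tensors in a continuous feature space and Q is a neural network. All names and numbers below are assumptions, and applying a softmax to the Q-values only loosely mirrors the Softmax output layer.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(q_table, state):
    """Actor stand-in: pick the action with the highest Q-value."""
    row = q_table[state]
    return max(range(len(row)), key=lambda a: row[a])

def q_update(q_table, state, action, reward, next_state, alpha=0.1, discount=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + discount * max Q(s', .)."""
    target = reward + discount * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])

# Made-up table with two states and two actions.
q = [[0.0, 0.0], [0.0, 1.0]]
q_update(q, state=0, action=1, reward=1.0, next_state=1)
action = select_action(q, state=0)
feasibility = softmax(q[0])[action]   # normalized score of the chosen action
print(q[0], action, feasibility)
```

The value estimate drives the action choice, and the feedback from each step updates the value estimate, which is the loop the claim describes at a higher level.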
5. The anti-tumor molecule strengthening method based on a graph neural network according to claim 1, wherein an existing molecular property rationality detection tool is used to judge synthesizability, reasonable molecules are output, and unreasonable molecules are further optimized in reverse, characterized in that in step (6) a reasonable novel anti-tumor molecule is obtained, comprising the following steps:
(601) In the optimization with feedback-criterion penalties, some generated molecules clearly cannot become practical, usable drugs. This shows that the anti-tumor molecule strengthening model can generate molecules with strong anti-tumor characteristics but cannot guarantee that they can be manufactured or remain stable in reality, which highlights the need for multi-objective penalties when strengthening anti-tumor molecules with reinforcement learning,
(602) Restating the variable definitions:
$X_v \in \mathbb{R}^{D_V}$: the feature vector of node v,
$h_v \in \mathbb{R}^{D_V}$: the state vector of node v,
$x_{v_1,v_2} \in \mathbb{R}^{D_E}$: the feature vector of edge $(v_1, v_2)$,
During node operation, a node state update function must be defined so that the node states stabilize under iteration; for node v, the transition of its state vector under the state transition function can be expressed as:
(603) Restating the back-propagation process:
In this step, the gradients of the parameters are obtained by back-propagation and the parameters are then optimized by gradient descent; the iterative process is as follows:
……
Then, using the automatic back-propagation provided by the PyTorch framework, the molecular structure is iteratively optimized so that the generated strengthened molecule model satisfies both the anti-tumor requirement and the molecular rationality requirement.
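The combination of a multi-objective penalty (601) with gradient-descent iteration (603) can be shown on a one-parameter toy problem. The loss below is an assumption made purely for illustration: the term (θ − 3)² stands in for the anti-tumor objective and λ(θ − 1)² for the rationality penalty. The gradient is written out by hand so the example needs no dependencies; under PyTorch the same gradient would come from calling `loss.backward()`.

```python
def penalized_loss_grad(theta, lam):
    """Gradient of L(theta) = (theta - 3)^2 + lam * (theta - 1)^2."""
    return 2.0 * (theta - 3.0) + 2.0 * lam * (theta - 1.0)

def gradient_descent(theta0, lam, lr=0.1, steps=100):
    """Plain gradient descent: theta <- theta - lr * dL/dtheta."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * penalized_loss_grad(theta, lam)
    return theta

# The analytic minimizer is (3 + lam) / (1 + lam): the penalty pulls the
# optimum away from 3 (first objective alone) toward 1 (penalty alone).
theta_unpenalized = gradient_descent(0.0, lam=0.0)
theta_penalized = gradient_descent(0.0, lam=1.0)
print(theta_unpenalized, theta_penalized)
```

Increasing λ trades performance on the first objective for compliance with the second, which is the role the penalty plays in the claim.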
Application number: CN202310015687.5A; priority date: 2023-01-06; filing date: 2023-01-06; title: Anti-tumor molecule strengthening method based on graph neural network; status: Active; publication: CN115966266B (en)


Publications (2)

Publication Number    Publication Date
CN115966266A (en)     2023-04-14
CN115966266B (en)     2023-11-17

Family

ID=87357842


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118280482B (en) * 2024-06-04 2024-08-23 浙江大学 Method and system for predicting antioxidant molecules based on deep learning






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant