
CN114627980A - Chemical inverse synthesis analysis method and system

Info

Publication number: CN114627980A
Application number: CN202210335947.2A
Authority: CN
Inventors: 董昊; 郭家盛; 余晨宁
Original assignee: Nanjing Nanxin Medical Technology Research Institute Co ltd; Nanjing Xinrui Biotechnology Co ltd; Nanjing University
Current assignee: Nanjing Nanxin Medical Technology Research Institute Co ltd; Nanjing Xinrui Biotechnology Co ltd; Nanjing University
Priority date: 2022-03-31
Filing date: 2022-03-31
Publication date: 2022-06-14

Abstract

The invention relates to a chemical inverse synthesis (retrosynthesis) analysis method and system in the field of compound analysis. The method comprises the following steps: inputting a training compound into the synthesis tree and searching with a Monte Carlo search tree to obtain training data for the inverse synthetic analysis network; training the inverse synthetic analysis network with the training data to obtain a trained inverse synthetic analysis network; and acquiring a compound to be synthesized and inputting it into the trained inverse synthetic analysis network to obtain a synthetic route for that compound. The invention makes more flexible use of chemical reaction data from various sources, reduces the requirement for external data, and still obtains synthetic routes of relatively good quality.

Description

Chemical inverse synthesis analysis method and system

Technical Field

The invention relates to the field of compound analysis, in particular to a chemical inverse synthesis analysis method and a chemical inverse synthesis analysis system.

Background

Computer-aided synthetic route planning programs (CASP programs) are a class of programs that automatically design synthetic routes for a given organic compound by combining computer algorithms with chemical databases. The core technology in such programs is the method for calculating synthetic routes for a given compound, according to user requirements, from known chemical reaction template (reaction templates) data, building block (building blocks) data, and other data or information relevant to synthetic route design. Early CASP programs used very simple search and ranking algorithms and therefore did not produce results close to the level of a chemist. In recent research and practice, synthetic route design using artificial intelligence methods has become the mainstream approach to developing CASP programs. In such algorithms, however, the ranking component is very sensitive to the external data used, so it is often difficult to obtain results that are highly consistent with expert judgment or laboratory results. Algorithm design has therefore focused more on search capability, so that a sufficient number of synthetic routes can be generated to serve as a reference for manual design. In addition, because artificial intelligence approaches are data-driven, high-quality chemical reaction data is the decisive factor in improving their performance. Such data remains scarce and expensive, so there is a need for a synthetic route design solution that can use chemical reaction data from various sources more flexibly, reduce the need for external data, and still obtain synthetic routes of relatively good quality.

Disclosure of Invention

The invention aims to provide a chemical inverse synthesis analysis method and a chemical inverse synthesis analysis system, so that chemical reaction data from various sources can be more flexibly utilized, the requirement on external data is reduced, and a synthetic route with relatively good quality can be obtained.

In order to achieve the purpose, the invention provides the following scheme:

a method of chemical inverse synthesis analysis, comprising:

inputting a training compound into the synthesis tree, and searching by using a Monte Carlo search tree to obtain training data of the inverse synthetic analysis network; the training data comprises a base molecule of a training compound and a synthetic route for the training compound starting from the base molecule; the Monte Carlo search tree is updated according to the action value estimation and the state value estimation;

training the inverse synthetic analysis network by using the training data to obtain a trained inverse synthetic analysis network;

and acquiring a compound to be synthesized, and inputting the compound to be synthesized into the trained inverse synthetic analysis network to obtain a synthetic route of the compound to be synthesized.

Optionally, inputting a training compound into the synthesis tree and searching by using the Monte Carlo search tree to obtain training data of the inverse synthetic analysis network specifically includes:

taking the training compound as a root node of a synthesis tree, and determining, by utilizing a Monte Carlo search tree, the reverse synthesis template probability of the training compound and the reverse synthesis template probability of the intermediates obtained by decomposing the training compound;

calculating a state marker of the root node;

judging whether the state flag is -1 or not to obtain a first judgment result; if the first judgment result is yes, the construction of the synthetic tree fails, all nodes except the nodes marked as 1 are reset to -1, and training data are determined according to leaf nodes and edges of the synthetic tree;

if the first judgment result is negative, judging whether the node depth in the synthetic tree reaches a first set node depth or not to obtain a second judgment result;

if the second judgment result is negative, judging whether a node with a state mark of 0 exists in the synthetic tree or not to obtain a third judgment result; if the third judgment result is yes, traversing all leaf nodes with a state mark of 0 in the synthetic tree, and adding child nodes to those leaf nodes according to the reverse synthesis template probability of the training compound and the reverse synthesis template probability of the intermediates obtained by decomposing the training compound; determining the state flag of each child node and returning to the step of judging whether the state flag is -1 or not to obtain a first judgment result;

if the third judgment result is negative, the synthetic tree is successfully constructed, all the node marks are reset to be 1, and training data are determined according to leaf nodes and edges of the synthetic tree;

if the second judgment result is yes, judging whether a node with a state mark of 0 exists in the composition tree or not, and obtaining a fourth judgment result; if the fourth judgment result shows that the node is not the leaf node of the synthetic tree, the synthetic tree is successfully constructed, all the node marks are reset to be 1, and training data are determined according to the leaf nodes and the edges of the synthetic tree;

if the fourth judgment result is negative, the construction of the synthetic tree fails, all nodes except the nodes marked as 1 are reset to -1, and training data are determined according to leaf nodes and edges of the synthetic tree.

Optionally, the expression of the loss function of the inverse synthetic analysis network is:

L = π^T ln p + (z - v)² + λ||ω||²

wherein L is the loss function, ω denotes the parameters of the inverse synthetic analysis network, ||ω||² is the sum of the squares of all parameters, λ is the L2 regularization parameter, π^T is the (transposed) Monte Carlo strategy, p is the neural network strategy, v is the node value estimate, and z is the state marker.

Optionally, the expression of the inverse synthesis template probability is:

wherein π(a|s₀) is the reverse synthesis template probability, s₀ is the root node of the Monte Carlo search tree, a is the reverse synthesis template, N(s₀, a) is the total number of visits to the edge storing the reverse synthesis template, and τ is the temperature coefficient.

A chemical inverse synthesis analysis system, comprising:

the training data acquisition module is used for inputting a training compound into the synthesis tree and searching by utilizing the Monte Carlo search tree to obtain training data of the inverse synthetic analysis network; the training data comprises a base molecule of a training compound and a synthetic route for the training compound starting from the base molecule; the Monte Carlo search tree is updated according to the action value estimation and the state value estimation;

the training module is used for training the inverse synthetic analysis network by using the training data to obtain a trained inverse synthetic analysis network;

and the inverse synthetic analysis module is used for obtaining a compound to be synthesized and inputting the compound to be synthesized into the trained inverse synthetic analysis network to obtain a synthetic route of the compound to be synthesized.

Optionally, the training data obtaining module specifically includes:

the reverse synthesis template probability determination module is used for taking the training compound as the root node of the synthesis tree and determining, by using the Monte Carlo search tree, the reverse synthesis template probability of the training compound and the reverse synthesis template probability of the intermediates obtained by decomposing the training compound;

the state mark determining module is used for calculating the state mark of the root node;

the first judgment module is used for judging whether the state mark is -1 or not to obtain a first judgment result; if the first judgment result is yes, the construction of the synthetic tree fails, all nodes except the nodes marked as 1 are reset to -1, and training data are determined according to leaf nodes and edges of the synthetic tree;

a second judging module, configured to, if the first judging result is negative, judge whether there is a node depth in the composition tree that reaches a first set node depth, to obtain a second judging result;

a third determining module, configured to determine whether a node with a state flag of 0 exists in the composition tree if the second determination result is negative, to obtain a third determination result;

an adding and returning module, configured to, if the third determination result is yes, traverse all leaf nodes with a state flag of 0 in the synthetic tree, and add child nodes to those leaf nodes according to the reverse synthesis template probability of the training compound and the reverse synthesis template probability of the intermediates obtained by decomposing the training compound; determining the state flag of each child node and returning to the step of judging whether the state flag is -1 or not to obtain a first judgment result;

a first resetting module, configured to, if the third determination result is negative, successfully construct the synthetic tree, reset all the node flags as 1, and determine training data according to leaf nodes and edges of the synthetic tree;

a fourth determining module, configured to determine whether a node with a status flag of 0 exists in the composition tree if the second determination result is yes, so as to obtain a fourth determination result; if the fourth judgment result shows that the node is not the leaf node of the synthetic tree, the synthetic tree is successfully constructed, all the node marks are reset to be 1, and training data are determined according to the leaf nodes and the edges of the synthetic tree;

and the second resetting module is used for, if the fourth judgment result is negative, resetting the state marks of all nodes except the nodes marked as 1 to -1 and determining training data according to leaf nodes and edges of the synthetic tree.

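Optionally, the expression of the loss function of the inverse synthetic analysis network is:

L = π^T ln p + (z - v)² + λ||ω||²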

wherein L is the loss function, ω denotes the parameters of the inverse synthetic analysis network, ||ω||² is the sum of the squares of all parameters, λ is the L2 regularization parameter, π^T is the (transposed) Monte Carlo strategy, p is the neural network strategy, v is the node value estimate, and z is the state marker.

Optionally, the expression of the inverse synthesis template probability is:

wherein π(a|s₀) is the reverse synthesis template probability, s₀ is the root node of the Monte Carlo search tree, a is the reverse synthesis template, N(s₀, a) is the total number of visits to the edge storing the reverse synthesis template, and τ is the temperature coefficient.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

A training compound is input into a synthesis tree, and a search is performed by using a Monte Carlo search tree to obtain training data of an inverse synthetic analysis network, the training data including synthetic routes of the training molecules; the inverse synthetic analysis network is trained by using the training data to obtain a trained inverse synthetic analysis network; and a compound to be synthesized is acquired and input into the trained inverse synthetic analysis network to obtain a synthetic route of the compound to be synthesized. In this way, chemical reaction data from various sources can be used more flexibly, the requirement on external data is reduced, and a synthetic route of relatively good quality can still be obtained.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a flow chart of a chemical reverse synthesis analysis method provided by the present invention;

FIG. 2 is a schematic diagram of a synthesis tree provided by the present invention;

FIG. 3 is a flow chart of inverse synthetic analysis network training;

FIG. 4 is a diagram of the inverse synthetic analysis network architecture.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, the present invention is described in detail with reference to the accompanying drawings and the detailed description thereof.

As shown in FIG. 1, the chemical inverse synthesis analysis method provided by the present invention includes:

Step 101: inputting a training compound into the synthesis tree, and searching by using a Monte Carlo search tree to obtain training data of the inverse synthetic analysis network; the training data comprises a base molecule of a training compound and a synthetic route for the training compound starting from the base molecule; the Monte Carlo search tree is updated according to the action value estimates and the state value estimates.

Step 102: training the inverse synthetic analysis network by using the training data to obtain the trained inverse synthetic analysis network.

Step 103: acquiring a compound to be synthesized, and inputting the compound to be synthesized into the trained inverse synthetic analysis network to obtain a synthetic route of the compound to be synthesized.

Step 101, specifically comprising:

Taking the training compound as a root node of the synthesis tree, and determining, by utilizing a Monte Carlo search tree, the reverse synthesis template probability of the training compound and the reverse synthesis template probability of the intermediates obtained by decomposing the training compound.

Calculating the state mark of the root node.

Judging whether the state flag is -1 or not to obtain a first judgment result; if the first judgment result is yes, the construction of the synthetic tree fails, all nodes except the nodes marked as 1 are reset to -1, and training data are determined according to leaf nodes and edges of the synthetic tree.

If the first judgment result is negative, judging whether the node depth in the synthesis tree reaches the first set node depth or not, and obtaining a second judgment result.

If the second judgment result is negative, judging whether a node with a state mark of 0 exists in the synthetic tree or not to obtain a third judgment result; if the third judgment result is yes, traversing all leaf nodes with a state mark of 0 in the synthetic tree, and adding child nodes to those leaf nodes according to the reverse synthesis template probability of the training compound and the reverse synthesis template probability of the intermediates obtained by decomposing the training compound; and determining the state mark of each child node and returning to the step of judging whether the state mark is -1 or not to obtain a first judgment result.

If the third judgment result is negative, the synthetic tree is successfully constructed, all the node marks are reset to be 1, and training data are determined according to leaf nodes and edges of the synthetic tree.

If the fourth judgment result is negative, the construction of the synthetic tree fails, all nodes except the nodes marked as 1 are reset to -1, and training data are determined according to leaf nodes and edges of the synthetic tree.

In practical application, the expression of the loss function of the inverse synthetic analysis network is as follows:

L = π^T ln p + (z - v)² + λ||ω||²

In practical application, the expression of the inverse synthesis template probability is as follows:

The invention provides a chemical inverse synthesis analysis system, comprising:

the training data acquisition module is used for inputting a training compound into the synthesis tree and searching by utilizing the Monte Carlo search tree to obtain training data of the inverse synthetic analysis network; the training data comprises a base molecule of a training compound and a synthetic route for the training compound starting from the base molecule; the Monte Carlo search tree is updated according to the action value estimate and the state value estimate.

The training module is used for training the inverse synthetic analysis network by using the training data to obtain the trained inverse synthetic analysis network.

The inverse synthetic analysis module is used for acquiring a compound to be synthesized and inputting the compound to be synthesized into the trained inverse synthetic analysis network to obtain a basic molecule and a basic molecule synthetic route.

The training data acquisition module specifically comprises:

a state flag determining module, configured to calculate a state flag of the root node;

a first resetting module, configured to, if the third determination result is negative, successfully construct the composition tree, reset all the node flags as 1, and determine training data according to leaf nodes and edges of the composition tree;

and the second resetting module is used for, if the fourth judgment result is negative, resetting the state marks of all nodes except the nodes marked as 1 to -1 and determining training data according to leaf nodes and edges of the synthetic tree.

Wherein, the expression of the loss function of the inverse synthetic analysis network is:

L = π^T ln p + (z - v)² + λ||ω||²

wherein L is the loss function, ω denotes the parameters of the inverse synthetic analysis network, ||ω||² is the sum of the squares of all parameters, λ is the L2 regularization parameter, π^T is the (transposed) Monte Carlo strategy, p is the neural network strategy, v is the node value estimate, and z is the state marker.
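As an illustration, the loss expression can be evaluated for a single training sample as in the following minimal sketch. The function name `retro_loss`, the argument names, the `eps` guard against ln 0 and the `policy_sign` switch are assumptions of this sketch and do not come from the invention. `policy_sign` defaults to +1 to match the expression as printed; AlphaZero-style losses usually give the π^T ln p term a negative sign (`policy_sign=-1`) so that minimizing L pulls p towards π.

```python
import math


def retro_loss(pi, p, z, v, weights, lam, policy_sign=1.0):
    """Evaluate L = s * pi^T ln p + (z - v)^2 + lam * ||w||^2 for one sample.

    pi, p    : sequences of template probabilities (MCTS and network strategy)
    z, v     : state marker and node value estimate
    weights  : flat sequence of network parameters (hypothetical stand-in)
    lam      : L2 regularization parameter
    policy_sign : +1 reproduces the printed expression (assumption: the sign
                  is kept exactly as it appears in the text)
    """
    eps = 1e-12  # avoids log(0); an assumption of this sketch
    policy_term = policy_sign * sum(
        pi_a * math.log(p_a + eps) for pi_a, p_a in zip(pi, p)
    )
    value_term = (z - v) ** 2
    l2_term = lam * sum(w * w for w in weights)
    return policy_term + value_term + l2_term
```

In a PyTorch model the same three terms would be computed on tensors; the plain-Python form is used here only to make each term explicit.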

Wherein, the expression of the reverse synthesis template probability is as follows:

wherein π(a|s₀) is the reverse synthesis template probability, s₀ is the root node of the Monte Carlo search tree, a is the reverse synthesis template, N(s₀, a) is the total number of visits to the edge storing the reverse synthesis template, and τ is the temperature coefficient.
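The symbols listed above (visit counts N(s₀, a) at the root and a temperature coefficient τ) are consistent with the visit-count distribution used in AlphaZero-style MCTS, π(a|s₀) = N(s₀, a)^(1/τ) / Σ_b N(s₀, b)^(1/τ). The following minimal sketch assumes that form; the function name `template_probabilities` and the dictionary representation of the visit counts are hypothetical.

```python
def template_probabilities(visit_counts, tau=1.0):
    """Turn root-edge visit counts into template probabilities.

    visit_counts : dict mapping a template identifier a to N(s0, a)
    tau          : temperature coefficient (must be positive)

    Assumed form: pi(a|s0) = N(s0,a)^(1/tau) / sum_b N(s0,b)^(1/tau).
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    weights = {a: float(n) ** (1.0 / tau) for a, n in visit_counts.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no edge has been visited")
    # Dividing by the total makes the probabilities sum to 1.
    return {a: w / total for a, w in weights.items()}
```

Under this assumed form, a smaller τ concentrates probability on the most-visited template, while τ = 1 reproduces the raw visit frequencies.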

There are two modes of computation in software: a model building mode and a model application mode. The model building mode builds a model which can be used for designing a synthetic route in the software by a reinforcement learning method through a basic molecule database, a chemical reaction template classification database, other necessary parameters and the like given by a user; the model application mode uses the model obtained by the model construction mode to carry out synthesis route design on the compound to be synthesized given by the user.

The basic molecule database is a database composed of commercial compounds or easily available compounds obtained from a certain source, and can also be a database composed of a series of compounds which are customized by a user, have simple structures and are convenient to synthesize or purchase.

The base molecule is a molecule in the base molecule database, or a molecule satisfying the conditions listed in (V).

The target compound database is a database which is formed by extracting products of all reactions from a chemical reaction database acquired from an open source channel by software.

The target compound is a compound molecule in the target compound database, which is provided to the software during model iteration for calculation of the synthetic route. The calculated synthetic routes are converted into training data for use in model training.

Chemical informatics tools refer to open source software tools capable of processing standardized chemical information, chemical codes, chemical databases, including but not limited to RDKit, Indigo, Openbabel, etc.

The chemical reaction template is a SMARTS expression which is obtained by extracting from a plurality of similar chemical reaction SMARTS expressions through a chemical informatics tool and codes a type of chemical reaction. The chemical reaction template represents some chemical reactions similar in structural transformation, and has a portion encoding a structural feature of a reactant and a portion encoding a structural feature of a product, which are referred to as a reactant template and a product template, respectively. Both templates are extracted from the reactants and products of the type of chemical reaction represented by the chemical reaction template. If a reactant template in a chemical reaction template is extracted from a reactant (product) of an original chemical reaction and a product template is extracted from a product (reactant) of the original chemical reaction, then such a chemical reaction template is referred to as a forward (reverse) synthesis template. By means of a chemical informatics tool, a chemical reaction template can be used to convert a SMILES expression of a chemical molecule matching a reactant template in the chemical reaction template into a SMILES expression of another molecule, i.e. to perform such a chemical reaction abstractly in silico. The match of the chemical reaction template and the chemical molecule is determined by the chemical informatics tool, which in fact determines whether there is a substructure in the molecular structure that matches the reactant template in the chemical reaction template. Applying matching templates to molecules always results in new molecules, while applying non-matching templates to molecules always results in nulls.

The chemical reaction template database is a database composed of SMARTS expressions for coding chemical reaction templates and can be obtained by user definition. The software uses a chemical informatics tool to extract a simple chemical reaction template database from a chemical reaction database acquired from an open source channel, and can be used when a user does not provide the database. It should be noted that there is always a blank template in the database of chemical reaction templates, and applying this template to any molecule does not result in any result (result is blank), which is the case when all the chemical reaction templates in the database cannot be matched with a molecule. Hereinafter, unless otherwise specified, the chemical reaction template database is composed of reverse synthesis templates by default.

The chemical reaction template classification database is the collective name for the 3 new chemical reaction template databases obtained after the chemical reaction template database has been classified to a certain degree. How these databases are obtained and applied is described in (IV).

The inverse synthesis strategy is, for a chemical molecule and the chemical reaction template database in use (consisting of inverse synthesis templates), the probability with which the molecule applies each inverse synthesis template in the database. This probability is normalized over the chemical reaction template database, i.e. for any molecule the probabilities of applying the templates in the database sum to 1. In the present invention there are two types of inverse synthesis strategy, which differ in their origin and in where they are used. The inverse synthesis strategy calculated by the neural network is called the neural network strategy and is denoted by the symbol p; the strategy derived from MCTS is called the MCTS strategy and is denoted by the symbol π.

The MC search tree (in full, Monte Carlo search tree) is a tree data structure built by the MCTS process; one is generated whenever the inverse synthesis strategy of a single compound is computed. A tree data structure comprises nodes and edges. Nodes are connected by directed edges, i.e. each edge has a starting node and a terminating node. The starting node of an edge is called the parent node of its terminating node, and the terminating node is called a child node of the starting node. In the context of the present invention, the terminating node of an edge need not be unique: an edge may have several terminating nodes, all of which are considered child nodes of the starting node. A tree data structure has exactly one root node, which is not a child of any other node. A leaf node is a node that is not the parent of any other node. For any node other than the root node, there is one and only one edge that has this node as a terminating node. A node is denoted by the symbol s, an edge by the symbol a, and (s, a) denotes the edge a starting from node s. The depth d(s) of a node is defined as the number of edges between the root node and that node, and the depth of the root node is defined as 0; the depth d(a) of an edge is defined as the depth of its starting node plus 1, so that an edge starting at the root node has depth 1, and so on. The nodes and edges of the tree can also serve as abstract data structures for storing different types of information.
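The tree structure described above, including edges with several terminating nodes and the depth conventions, can be sketched as follows. The class names `Node` and `Edge`, the field names and the helper `add_edge` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality avoids recursing through links
class Node:
    label: str
    depth: int = 0                      # d(s); the root has depth 0
    in_edge: Optional["Edge"] = None    # the single edge ending at this node
    out_edges: List["Edge"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf node is not the parent of any other node.
        return not self.out_edges


@dataclass(eq=False)
class Edge:
    label: str
    start: Node
    depth: int                          # d(a) = d(start) + 1
    ends: List[Node] = field(default_factory=list)  # one or more child nodes


def add_edge(start: Node, edge_label: str, child_labels: List[str]) -> Edge:
    """Attach an edge to `start` with one child node per entry in child_labels."""
    edge = Edge(label=edge_label, start=start, depth=start.depth + 1)
    for child_label in child_labels:
        # A child lies one edge further from the root than its parent.
        edge.ends.append(Node(label=child_label, depth=edge.depth, in_edge=edge))
    start.out_edges.append(edge)
    return edge
```

Allowing several terminating nodes per edge reflects that one reverse synthesis template may split a molecule into more than one precursor.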

A termination state is a special type of leaf node in the MC search tree that satisfies one of the following three conditions: 1. its depth reaches the maximum value maxMCTSDepth, i.e. the maximum search depth of the MCTS in (I); 2. the chemical molecule stored in it is a base molecule; 3. the chemical molecule stored in it is not a base molecule and does not match any chemical reaction template.

In the MC search tree according to the present invention, the information stored in a node includes: 1. the depth of the node d(s) and the SMILES expression S of the chemical molecule corresponding to the node; 2. the neural network strategy p(S) of molecule S, calculated by the neural network; 3. the node value estimate v, which represents the difficulty of synthesizing the molecule at the node and is either calculated by the neural network with the chemical molecule at the node as input (non-termination state) or determined by the termination state conditions (termination state). In a termination state, the value v stored by the node is an exact value rather than an estimate, and no model calculation is used: if the node satisfies termination state condition 3, v = -1; if it satisfies termination state condition 2, v = 1; otherwise, if it satisfies termination state condition 1, v = -1; 4. N(s), the total number of times the node has been visited so far. The information stored in an edge includes: 1. the SMARTS expression A of the chemical reaction template corresponding to the edge; 2. N(s, a), the total number of times the edge a starting from node s has been visited so far; 3. Q(s, a), the value estimate of the edge a starting from node s.
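The rule for the exact value v of a termination state can be sketched as a small function that checks the three conditions in the order given above. The function name `terminal_value` and its boolean arguments are hypothetical; returning `None` for a non-termination state (where the neural network would supply v) is an assumption of this sketch.

```python
def terminal_value(depth, is_base_molecule, has_matching_template, max_mcts_depth):
    """Return the exact v of a termination state, or None otherwise.

    depth                 : d(s) of the leaf node
    is_base_molecule      : True if the stored molecule is a base molecule
    has_matching_template : True if at least one reaction template matches
    max_mcts_depth        : the maxMCTSDepth parameter
    """
    # Condition 3: not a base molecule and no template matches.
    if not is_base_molecule and not has_matching_template:
        return -1.0
    # Condition 2: the stored molecule is a base molecule.
    if is_base_molecule:
        return 1.0
    # Condition 1: the maximum search depth has been reached.
    if depth >= max_mcts_depth:
        return -1.0
    # Non-termination state: v comes from the model instead.
    return None
```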

The model is a neural network model built with PyTorch in the software. Given an input chemical molecule, the model calculates the neural network strategy and the value estimate of the corresponding node in the MC search tree (the molecule stored by the node is exactly the molecule input to the model).

MCTS (in full, Monte Carlo tree search) is a search algorithm. Each search runs a number of simulations, builds the MC search tree as the simulations proceed, and finally gives a solution to the search problem from the information stored in the search tree. In the present invention, each simulation comprises three phases: a selection phase, an expansion and evaluation phase, and an update phase. The specific scheme of each phase is described in (I).

A synthesis tree is a tree data structure that the software constructs for a compound to be synthesized. To distinguish it from the MC search tree, nodes in the synthesis tree are denoted by m, edges by r, and (m, r) denotes the edge r starting from node m. The main difference between the synthesis tree and the MC search tree is the information stored by each node and edge. Each node of the synthesis tree stores: the depth of the node d(m); the SMILES expression M of the chemical molecule corresponding to the node; and the state marker z of the node. Each edge of the synthesis tree stores: the depth of the edge d(r); and the template R of the chemical reaction performed by the chemical molecule at the starting node of the edge (this is a reverse synthesis template). The state marker z of a synthesis tree node is determined by the following rules: if the chemical molecule corresponding to the node is a base molecule, z = 1; if the chemical molecule corresponding to the node appears in the synthesis tree for the second or a subsequent time, z = -2 and the software will not expand the node; if the chemical molecule corresponding to the node is not a base molecule but cannot execute the template selected by the software, or the template selected by the software is the empty template, z = -1; in all other cases z = 0.
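The state marker rules can be written as a short function, applying the rules in the order in which they are listed. This is a minimal sketch: the function name `state_marker`, its arguments and the constant `REPEATED` are hypothetical, and the value -2 for a repeated molecule is an assumption about the intended sign.

```python
REPEATED = -2  # marker for a molecule already in the tree (sign is assumed)


def state_marker(smiles, base_molecules, seen, selected_template_applies):
    """Return the state marker z of a synthesis tree node.

    smiles                    : SMILES expression M stored in the node
    base_molecules            : set of base molecule SMILES
    seen                      : set of SMILES already present in the tree
    selected_template_applies : False if the selected template is empty
                                or cannot be executed on the molecule
    """
    if smiles in base_molecules:
        return 1           # base molecule: nothing left to synthesize
    if smiles in seen:
        return REPEATED    # repeated molecule: the node is not expanded
    if not selected_template_applies:
        return -1          # dead end: no usable template
    return 0               # still open: can be expanded further
```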

When the software designs a synthetic route for a given chemical molecule, the result is either success or failure. Success is judged as follows: within the maximum depth allowed for the synthesis tree, the software builds a synthesis tree in which all molecules stored on the leaf nodes are basic molecules. Every other outcome is judged a failure.

The success rate is the proportion of molecules which can generate a complete synthetic route when a certain number of molecules are subjected to synthetic route design by software.

The "+=" operator is defined as follows: if x is a variable, x += 1 is equivalent to x = x + 1, i.e., the current value of x is incremented by 1 and the result is assigned back to x.

(I) Method for calculating the inverse synthesis strategy of a compound by combining MCTS with a neural network

Input data: the compound to be searched, the model, the chemical reaction template database, the basic molecule database, and the chemical reaction template classification database.

Parameters: total number of MCTS simulations MCTSTimes, temperature coefficient temp (written as τ in the formulas), maximum MCTS search depth maxMCTSDepth, and maximum time limit for a single simulation timeBudget.

Output: the MCTS strategy π for the compound to be searched.

Description of the method:

In the selection phase, the algorithm constructs a path from the root node to leaf nodes. A path is itself a tree-like data structure and is a subset of the nodes and edges of the MC search tree. Besides satisfying the definition of a tree, the nodes and edges of a path additionally satisfy the following: each node is the starting node of at most one edge; and a leaf node of the path must be a leaf node of the MC search tree constructed so far, unless it does not belong to the current MC search tree. The path is constructed as follows:

(1) Add the root node $s_0$ to the path.

(2) If this is the first round of simulation, the MC search tree contains only the root node and the path is already complete (it has no edges, since the MC search tree has none yet). Otherwise, use the neural network policy $p(S_0)$ stored on the root node, where $S_0$ is the chemical molecule on the root node (the compound to be searched), to calculate the chemical reaction template $A_1$ of the edge $(s_0, a_1)$ starting from $s_0$:

$A_1 = \arg\max_{A \in RS} \{ q(s_0, A) + U(s_0, A) \}$

In the formula, RS is the chemical reaction template database. In principle the maximum is taken over all templates in RS that match $S_0$, although the method in (IV) greatly reduces the number of templates matched to the molecule $S_0$ stored on $s_0$. A is a reverse synthesis template in the database, and $p(S_0, A)$ is the probability, under the neural network policy $p(S_0)$, that template A is applied. Here and below, the subscripts of S, s, a and A denote the depth of the corresponding node or edge. $c_{puct}$ is a constant in the formula that controls the balance between exploration and exploitation in the MCTS algorithm; in the invention $c_{puct} = 1$ and it remains unchanged. $U(s_0, A)$ is the improvement to the value estimate of the edge $(s_0, a)$ that stores template A, based on the application of template A to the molecule on $s_0$ in the current MC search tree. $q(s_0, A)$ is the current value estimate of the edge $(s_0, a)$ that stores template A. If the q value cannot be obtained from the existing MC search tree, the edge is being accessed for the first time and $q(s_0, A) = 0$ is used. Likewise, for an edge accessed for the first time, $N(s_0, a)$ cannot be obtained from the MC search tree and $N(s_0, a) = 0$ is used. Once the template $A_1$ that maximizes the expression in braces has been found, the corresponding edge $(s_0, a_1)$ is added to the path.

(3) Apply $A_1$ to the molecule $S_0$ to obtain a set of molecules, create nodes corresponding one-to-one to the molecules in the set, and add these nodes to the path as child nodes of $s_0$.

(4) Starting from d = 1, execute the following operations ① to ⑤ in a loop:

① From all leaf nodes of depth d on the current path, select those that belong to the MC search tree but are not leaf nodes of the MC search tree, forming a set.

② Traverse the nodes of this set, starting from k = 1. For the k-th node in the set, written here as $s_d^{k}$ with stored molecule $S_d^{k}$, calculate the chemical reaction template $A_{d+1}^{k}$ of the edge $(s_d^{k}, a_{d+1}^{k})$ starting from this node, using a formula analogous to that of step (2):

$A_{d+1}^{k} = \arg\max_{A \in RS} \{ q(s_d^{k}, A) + U(s_d^{k}, A) \}$

where $p(S_d^{k}, A)$ is the probability that the reverse synthesis template A is applied under the neural network policy $p(S_d^{k})$; $U(s_d^{k}, A)$ is the improvement to the value estimate of the edge that stores template A, based on the application of template A to the molecule on $s_d^{k}$ in the current MC search tree; $N(s_d^{k})$ is the total number of times the node $s_d^{k}$ has been visited so far (not including this visit); $N(s_d^{k}, a)$ is the corresponding total for the edge $(s_d^{k}, a)$; and $q(s_d^{k}, A)$ is the current value estimate of the edge that stores template A. Note that an edge accessed for the first time may still be encountered here, in which case $q = 0$ and $N = 0$ are again used for that edge. In the formula the maximum is in principle taken over all templates in RS that match $S_d^{k}$, although the method in (IV) greatly reduces the number of templates matched to the molecule stored on the node.

③ Add the edge $(s_d^{k}, a_{d+1}^{k})$ that stores template $A_{d+1}^{k}$ to the path, apply $A_{d+1}^{k}$ to $S_d^{k}$ to obtain a set of molecules, construct nodes corresponding one-to-one to the molecules in the set, and add them to the path.

④ Set k += 1 and return to ② to process the next node, until all nodes in the set have been processed; then exit this inner loop.

⑤ Set d += 1, then return to ① and recompute the set, until all leaf nodes on the path are leaf nodes that already existed in the MC search tree; then exit the loop.
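The template choice made at each node during the selection phase can be illustrated as follows. The text does not reproduce the expression for U, so this sketch assumes the PUCT form commonly used in AlphaZero-style MCTS, $U = c_{puct}\, p \sqrt{N(s)} / (1 + N(s, a))$; that form, the function name `puct_select` and the dictionary-based storage are assumptions made for illustration only.

```python
import math

C_PUCT = 1.0  # value of c_puct stated in the text


def puct_select(prior, q, n_edge, n_node, c_puct=C_PUCT):
    """Return the template maximising q(s, A) + U(s, A).

    prior  -- dict: template -> p(S, A) from the neural network policy
    q      -- dict: template -> current edge value estimate
    n_edge -- dict: template -> N(s, a), visit count of the edge
    n_node -- N(s), visit count of the node (not including this visit)
    Edges missing from q or n_edge are first accesses and default to 0.
    """
    def score(template):
        # ASSUMED form of U (standard PUCT); not given explicitly in the text.
        u = c_puct * prior[template] * math.sqrt(n_node) / (1 + n_edge.get(template, 0))
        return q.get(template, 0.0) + u

    return max(prior, key=score)
```

With a prior of 0.6 for template "A" and 0.4 for "B" and no visits recorded, "A" is chosen; after "A" has been visited several times with a low value estimate, the exploration term shifts the choice to "B".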

In the expansion and evaluation phase, the algorithm analyzes the information stored on every leaf node of the path and computes the node value estimates. For each leaf node, it first determines whether the node is in a termination state. If so, the exact value of v is calculated according to the condition that the termination state satisfies. If not, it determines whether the leaf node is being visited for the first time. For a leaf node visited for the first time, the model is used to compute its neural network policy p and node value estimate v. For a leaf node that is not being visited for the first time, the chemical reaction template corresponding to the highest-probability component of the already computed p is selected and applied to the molecule stored on the leaf node; the corresponding new edge and child nodes are added to the leaf node, expanding the MC search tree, and the v values of the child nodes are evaluated and stored on them. For child nodes in a termination state, the exact value of v is calculated according to the condition that the termination state satisfies; for child nodes not in a termination state, the model is used to compute the node value estimate and, additionally, the neural network policy p, and both p and v are stored on the node. The newly added nodes are added to the path at the same time; they in fact become new leaf nodes of the path, but are not analyzed further. Once all leaf nodes have been analyzed, the update phase begins.

If, during this analysis, the total time of the current simulation reaches the single-simulation time limit timeBudget, the analysis of the remaining leaf nodes stops immediately, the update phase begins at once, and the node value estimates of the remaining leaf nodes are forced to v = 0 (without even considering termination states).

In the update phase, the node value estimates of all leaf nodes on the path take part in the update. These leaf nodes include leaf nodes in a termination state, leaf nodes visited for the first time, and nodes newly added to the path. If the expansion and evaluation phase reached the single-simulation time limit, the nodes whose value estimates were set to 0 also take part in the update (they are in fact leaf nodes of the path as well). Based on the node value estimates v, the algorithm updates the edge value estimate q(s, a) of every edge visited in this simulation, and updates the total visit counts N(s) and N(s, a) of all nodes and edges on the path selected in this simulation. The update proceeds as follows:

the number of accesses to each node and edge is updated first. Total number of accesses for first-accessed node s and first-accessed edge (s, a):

N(s)＝0，N(s，a)＝1

total number of accesses for node s that is not the first access and edge (s, a) that is the first access:

N(s)+＝1，N(s，a)+＝1

For the update of the edge value estimates, one of two modes can be selected: mode 1 is called avg and mode 2 is called min. The two modes differ in how the update quantity G of the edge value estimate is calculated. The calculation of G starts from the node value estimates v of all leaf nodes obtained above and iterates from child nodes toward parent nodes: first the update quantity of each edge connecting a leaf node to its parent node on the path is calculated, then that of the edge connecting this parent node to its own parent, and so on, until the update quantities of all edges connected to the root node have been calculated. For an edge (s, a) directly connected to leaf nodes, the update quantity G(s, a) in mode 1 is calculated as:

if it is

G (s, a) ═ 1 if

The calculation method of G (s, a) according to mode 2 is:

if it is

G (s, a) ═ 1 if

Here s' denotes the leaf nodes that are the terminating nodes of the edge (s, a), v(s') denotes their node value estimates, and n(s', a, s) denotes the total number of terminating nodes of the edge (s, a); this number is not necessarily 1, since a compound does not necessarily have only one synthesis precursor.

For an edge (s, a) that is not directly connected to a leaf node, the update quantity G(s, a) depends on the update quantities G(s', a') of the edges (s', a') that start from the terminating nodes s' of the edge. In mode 1 it is calculated as:

if it is

G (s, a) ═ 1 if

The calculation method of G (s, a) according to mode 2 is:

if it is

G (s, a) ═ 1 if

After calculating the edge value estimation update quantity G (s, a) of all the edges on the path, updating the edge value estimation q (s, a) according to the following formula:

When this formula is applied to an edge accessed for the first time, the q(s, a) on the right-hand side is not yet defined; in this case it is taken to be 0.

This completes one simulation in MCTS. Once the algorithm has performed the specified number of simulations (MCTSTimes), the simulation process ends and the solution is derived from the information stored in the constructed MC search tree. Specifically, for every edge $(s_0, a)$ connected to the root node $s_0$ of the MC search tree, the total visit count $N(s_0, a)$ is obtained, and the probability that a given reverse synthesis template A is applied to the compound to be searched, $S_0$, is calculated as:

$\pi(A \mid S_0) = \dfrac{N(s_0, A)^{1/\tau}}{\sum_{b} N(s_0, b)^{1/\tau}}$

where $N(s_0, A)$ in the numerator is the total visit count of the edge that stores template A, and the denominator is the sum, over all edges connected to the root node whose visit count is not 0, of the visit counts raised to the power $1/\tau$ (an edge with a visit count of 0 obviously has no effect on the result). τ is called the temperature coefficient; it controls how strongly the probabilities of the chemical reaction templates differ in the MCTS strategy. The larger the temperature coefficient, the closer the probabilities of the templates are to one another and the closer the MCTS strategy is to a random strategy that applies every template with equal probability; the smaller the temperature coefficient, the closer the MCTS strategy is to a greedy strategy in which only the most frequently visited edge has a probability approaching 1. After the probability of every template in the chemical reaction template database has been calculated, these probabilities are taken as components to form the final MCTS strategy π.
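The effect of the temperature coefficient can be checked with a short sketch. It assumes the strategy is obtained by raising root visit counts to the power 1/τ and normalising, which is the reading consistent with the behaviour described for large and small τ; the function name `mcts_policy` and the dictionary input are hypothetical.

```python
def mcts_policy(visit_counts, tau=1.0):
    """Turn root visit counts into the MCTS strategy pi.

    visit_counts -- dict: template -> N(s0, a)
    tau          -- temperature coefficient, must be positive
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    n_max = max(visit_counts.values(), default=0)
    if n_max <= 0:
        raise ValueError("no edge has been visited")
    # Dividing by the largest count first avoids overflow for small tau
    # and does not change the normalised result.
    weights = {a: (n / n_max) ** (1.0 / tau) for a, n in visit_counts.items() if n > 0}
    total = sum(weights.values())
    return {a: weights.get(a, 0.0) / total for a in visit_counts}
```

For visit counts of 3, 1 and 0, τ = 1 gives probabilities 0.75, 0.25 and 0, while τ = 0.5 sharpens them to 0.9, 0.1 and 0, moving the strategy toward the greedy limit.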

(II) Method for designing the optimal synthetic route of a compound to be synthesized, or multiple synthetic routes of the same compound, by constructing a synthesis tree of the compound using the above method for calculating the inverse synthesis strategy

Input data: the compound to be synthesized, the model, the chemical reaction template database, and the basic molecule database.

Input parameters: maximum synthetic route length depth; maximum proportion of compounds that cannot be decomposed, unavailable_ratio_threshold.

Description of the method:

The software designs the synthetic route of a compound to be synthesized by constructing a synthesis tree of the compound and then deriving a specific synthetic route from the tree.

1. Designing an optimal route

First, add the root node $m_0$ to the synthesis tree; the molecule stored on $m_0$ is the compound to be synthesized, $M_0$. Here and below, the subscript of m denotes the depth of the node, and the subscript of M denotes the depth of the node m on which M is stored. Use the method in (I) to calculate the MCTS strategy $\pi(M_0)$ of the compound to be synthesized. Add the edge $(m_0, r_1)$ to the synthesis tree and store on it the reverse reaction template $R_1$ corresponding to the highest-probability component of $\pi(M_0)$. Here and below, the subscript of r denotes the depth of the edge, and the subscript of R denotes the depth of the edge r on which the template is stored. Apply $R_1$ to $M_0$ to obtain the set of molecules

$\{ M_1^{1}, M_1^{2}, \dots, M_1^{n_1} \}$

where the superscript of each M is the index of the molecule among those produced by applying $R_1$, and $n_1$ is the total number of molecules produced by applying $R_1$ to $M_0$ (the maximum value of the index); the subscript of n denotes the depth of the edge r on which the template R is stored. Finally, the nodes $m_1^{1}, \dots, m_1^{n_1}$ are added to the synthesis tree as child nodes of $m_0$, the molecule stored on the i-th node $m_1^{i}$ being $M_1^{i}$, and the state labels z of these nodes are calculated. If z = -1 appears among these state labels, construction of the synthesis tree stops immediately and the synthesis design for the target compound is judged to have failed; if z = -1 does not appear, the following loop is entered:

(1) Initially d = 1. Check whether the depth of any node in the synthesis tree has reached the maximum value $d_{max}$ (the maximum synthetic route length depth); if so, exit the loop. Then check whether the synthesis tree contains a leaf node with z = 0; if not, exit the loop and stop constructing the synthesis tree.

(2) If the loop has not been exited, traverse all leaf nodes of the current synthesis tree with z = 0 and depth d. For the k-th such leaf node, written here as $m_d^{k}$, perform the following steps:

① Calculate the MCTS strategy $\pi(M_d^{k})$ of the stored molecule $M_d^{k}$, then add to this leaf node the edge $(m_d^{k}, r_{d+1}^{k})$ and store on it the reverse reaction template $R_{d+1}^{k}$ corresponding to the highest-probability component of $\pi(M_d^{k})$.

② Apply the reverse reaction template $R_{d+1}^{k}$ to the molecule $M_d^{k}$ to obtain a set of molecules $\{ M_{d+1}^{i_{d+1,k}} \}$, where $i_{d+1,k}$ is the index of each molecule in the set obtained by applying the template $R_{d+1}^{k}$ to $M_d^{k}$. The total number of molecules in the set is $n_{d+1,k}$, which is also the maximum value of $i_{d+1,k}$.

③ Add to the node $m_d^{k}$ its $n_{d+1,k}$ child nodes (these are the terminating nodes of the edge $(m_d^{k}, r_{d+1}^{k})$); the molecule stored on the $i_{d+1,k}$-th child node is $M_{d+1}^{i_{d+1,k}}$. Then calculate the state labels z of these child nodes. If z = -1 appears among them, exit the loop, stop constructing the synthesis tree, and judge the synthesis design to have failed.

④ Return to ① and process the next node, until all leaf nodes with z = 0 in the current synthesis tree have been traversed.

(3) Set d += 1, then return to (1), until an exit condition is met and the loop is left.

After the loop has been exited, the software has the constructed synthesis tree. From the synthesis tree the software determines whether the synthesis design succeeded, using the criterion given in the definitions above. If the design succeeded, the software outputs, in a certain format, the molecules M stored on the nodes and the reverse synthesis templates R stored on the edges of the synthesis tree as the synthetic route of the compound to be synthesized. If the design failed, the software outputs the same information in the same format, but the result does not represent a complete synthetic route.
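The construction loop can be condensed into a level-by-level sketch. This is a simplified illustration under several assumptions: `best_template` stands in for "take the highest-probability component of the MCTS strategy", `apply_template` stands in for a cheminformatics call, dead ends are detected when a node is expanded rather than when it is created, and the label value 2 for repeated molecules is assumed. None of these names come from the source.

```python
def build_synthesis_tree(target, best_template, apply_template, is_basic, max_depth):
    """Greedy construction of a synthesis tree (illustrative sketch).

    best_template(smiles)     -> template id, or None for the empty template
    apply_template(t, smiles) -> list of precursor SMILES
    is_basic(smiles)          -> True for basic molecules
    Returns (nodes, edges, success).
    """
    nodes = [{"smiles": target, "depth": 0, "z": 1 if is_basic(target) else 0}]
    edges = []                      # (parent index, template, child indices)
    seen = {target}
    frontier = [0] if nodes[0]["z"] == 0 else []
    depth = 0
    while frontier and depth < max_depth:
        next_frontier = []
        for parent in frontier:
            smiles = nodes[parent]["smiles"]
            template = best_template(smiles)
            if template is None:    # empty template: the design fails
                nodes[parent]["z"] = -1
                return nodes, edges, False
            children = []
            for child in apply_template(template, smiles):
                if is_basic(child):
                    z = 1
                elif child in seen:
                    z = 2           # ASSUMED value for a repeated molecule
                else:
                    z = 0
                seen.add(child)
                nodes.append({"smiles": child, "depth": depth + 1, "z": z})
                children.append(len(nodes) - 1)
                if z == 0:
                    next_frontier.append(len(nodes) - 1)
            edges.append((parent, template, children))
        frontier = next_frontier
        depth += 1
    parents = {edge[0] for edge in edges}
    leaves = [i for i in range(len(nodes)) if i not in parents]
    # Success: every leaf stores a basic molecule.
    return nodes, edges, all(nodes[i]["z"] == 1 for i in leaves)
```

With a toy rule set in which "ABCD" splits into "AB" and "CD", and "AB" splits into "A" and "B", the design succeeds when "A", "B" and "CD" are basic molecules and the depth limit is at least 2, and fails when the limit is 1.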

A constructed synthesis tree describing a successful synthetic route design is shown in Fig. 2. Note that, for ease of drawing, the figure assumes that each application of a reverse synthesis template produces 2 molecules; in general the synthesis tree is more complex. In the figure, the symbol for basic molecules is changed to B to distinguish them from non-basic intermediates. Here d does not denote a depth but the maximum synthetic route length depth.

The software uses this route design method in the model training stage to design the optimal synthetic route of each target compound, and in the model application stage to design the optimal synthetic route of the compound to be synthesized supplied by the user.

2. Designing multiple different routes

In the model application stage, the software can also design multiple synthetic routes for a compound to be synthesized supplied by the user. This method is largely identical to the method for designing an optimal synthetic route described above and differs only in how R is selected from π. Note that this method is not used in the model training stage. The method comprises the following steps:

First, add the root node $m_0$ to the synthesis tree; the molecule stored on $m_0$ is the compound to be synthesized, $M_0$. Here and below, the subscript of m denotes the depth of the node, and the subscript of M denotes the depth of the node m on which M is stored. Use the method in (I) to calculate the MCTS strategy $\pi(M_0)$ of the compound to be synthesized. Add the edge $(m_0, r_1)$ to the synthesis tree, randomly select one of the K highest-probability components of $\pi(M_0)$, and store the corresponding reverse reaction template $R_1$ on the edge $(m_0, r_1)$. Here and below, the subscript of r denotes the depth of the edge, and the subscript of R denotes the depth of the edge r on which the template is stored. Apply $R_1$ to $M_0$ to obtain a set of molecules, and add the corresponding nodes to the synthesis tree as child nodes of $m_0$, the molecule stored on the i-th node $m_1^{i}$ being $M_1^{i}$. Then calculate the state labels z of these nodes. If the ratio of the number of nodes with z = -1 to the number of all nodes in the current synthesis tree exceeds unavailable_ratio_threshold, construction of the synthesis tree stops immediately and the synthesis design for the target compound is judged to have failed; otherwise the following loop is entered:

(1) Initially d = 1. Check whether the depth of any node in the synthesis tree has reached the maximum value $d_{max}$; if so, exit the loop. Then check whether the synthesis tree contains a leaf node with z = 0; if not, exit the loop and stop constructing the synthesis tree.

(2) If the loop has not been exited, carry out the following steps for the k-th leaf node with z = 0 and depth d, written here as $m_d^{k}$:

① Calculate the MCTS strategy $\pi(M_d^{k})$ of the stored molecule $M_d^{k}$, then add to this leaf node the edge $(m_d^{k}, r_{d+1}^{k})$. Randomly select one of the K highest-probability components of $\pi(M_d^{k})$ and store the corresponding reverse reaction template $R_{d+1}^{k}$ on the edge $(m_d^{k}, r_{d+1}^{k})$.

② Apply the reverse reaction template $R_{d+1}^{k}$ to the molecule $M_d^{k}$ to obtain a set of molecules $\{ M_{d+1}^{i_{d+1,k}} \}$, where $i_{d+1,k}$ is the index of each molecule in the set obtained by applying the template $R_{d+1}^{k}$ to $M_d^{k}$. The total number of molecules in the set is $n_{d+1,k}$, which is also the maximum value of $i_{d+1,k}$.

③ Add to the node $m_d^{k}$ its $n_{d+1,k}$ child nodes (these are the terminating nodes of the edge $(m_d^{k}, r_{d+1}^{k})$); the molecule stored on the $i_{d+1,k}$-th child node is $M_{d+1}^{i_{d+1,k}}$. Then calculate the state labels z of these child nodes. If the ratio of the number of nodes with z = -1 to the number of all nodes in the current synthesis tree exceeds unavailable_ratio_threshold, exit all loops, stop constructing the synthesis tree, and judge the synthesis design to have failed.

④ Return to ① and process the next node, until all leaf nodes with z = 0 in the current synthesis tree have been traversed.

(3) Set d += 1, then return to (1), until an exit condition is met and the loop is left.

After the loop has been exited, the software has the constructed synthesis tree. From the synthesis tree the software determines whether the synthesis design succeeded. If the design succeeded, the software outputs, in a certain format, the molecules M stored on the nodes and the reverse synthesis templates R stored on the edges of the synthesis tree as the synthetic route of the compound to be synthesized. If the design failed, the software outputs the same information in the same format, but the result does not represent a complete synthetic route.

Because the reverse synthesis template applied to each molecule in the synthesis tree is chosen with some randomness, running this route design several times for the same compound to be synthesized yields different synthetic routes for that compound.
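The only step that differs from the optimal-route design, the random choice among the K most probable templates, might look like this. The sketch assumes a uniform choice among the top K components, which the text does not specify; `sample_top_k` and its arguments are hypothetical names.

```python
import random


def sample_top_k(policy, k, rng=random):
    """Pick one template at random from the k highest-probability components.

    policy -- dict: template -> probability in the MCTS strategy pi
    k      -- number of top components to choose from (k >= 1)
    rng    -- random-number source; pass random.Random(seed) for repeatability
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(policy, key=policy.get, reverse=True)
    # ASSUMPTION: uniform choice among the top k; the text only says "randomly".
    return rng.choice(ranked[:k])
```

With K = 1 the function reduces to the greedy choice used for the optimal route; larger K trades route quality for diversity across repeated runs.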

(III) Reinforcement learning method used in the model construction mode, to obtain the model with which the software designs compound synthetic routes

Parameters: total number of model iteration rounds; number of target compounds used to generate data in each round, N_train; number of target compounds used in each model evaluation, N_test; maximum number of rounds of historical training data retained, numItersForTrainExamplesHistory. N_train = 40, N_test = 20, numItersForTrainExamplesHistory = 10. The total number of model iteration rounds is set by the user.

In the model construction mode, software starts from a model with randomly initialized parameters, and executes a plurality of model iteration rounds to finally obtain a model which can be used for designing a compound synthesis route in the software. Each iteration turn is divided into three stages of data generation, model training and model evaluation. A flow chart of the model building mode is shown in fig. 3.

The architecture of the model is described first, as shown in FIG. 4. For an input molecule, a cheminformatics tool first computes its 1024-dimensional ECFP4 molecular fingerprint. The fingerprint is fed into a first fully connected layer with ReLU activation, followed by batch normalization and a Dropout layer with p = 0.3 (this p is not the p in FIG. 4). The result enters a second fully connected layer with 512 neurons and ReLU activation, again followed by batch normalization and a Dropout layer with p = 0.3. Finally, a LogSoftmax layer with 512 neurons computes the logarithm ln p of the neural network policy of the input molecule, and a separate Tanh layer with 512 neurons computes its node value estimate v(s). p is a vector whose dimension equals the total number of chemical reaction templates plus 1, the last dimension corresponding to the so-called empty template; ln p is obtained by taking the logarithm of each component of p; v is a real number between -1 and 1 (the range of the Tanh output). Note that the neural network policy is obtained by exponentiating the ln p output by the network. The logarithm serves only as an intermediate quantity in model training and is used purely for numerical stability. To present the function of the model more clearly and to keep the description simple, everywhere outside this paragraph the model is still regarded as outputting the neural network policy p.

In the data generation stage, the software randomly draws a number of target compounds (a built-in software parameter, denoted N_train) from the target compound database, designs an optimal synthetic route for each with the method in (II), generates the synthesis tree of each target compound, and records whether the synthesis design corresponding to the tree succeeded. If the synthetic route corresponding to a synthesis tree was designed successfully, the state labels z of all nodes in that tree are reset to 1; otherwise they are all reset to z = -1. In the 1st model iteration round, the historical best success rate is the proportion of successful designs among all target compounds for which routes were designed. In the k-th model iteration round (k ≥ 2), the update of the historical best success rate depends on the outcome of the model evaluation in the previous round. If the model trained in the previous round was accepted in that round's model evaluation, the historical best success rate is updated to the weighted average of the success rate of the target compound route designs in the current data generation stage and the success rate of the target compound route designs in the previous model evaluation, namely:

where acc_best,k denotes the historical best success rate of the k-th model iteration; acc_train,k denotes the success rate of all target compound route designs in the data generation stage of the k-th iteration; acc_eval,k-1 denotes the success rate of all target compound route designs in the model evaluation stage of the (k-1)-th iteration; and N_test is the total number of target compound synthetic routes to be designed in the model evaluation stage. If the model trained in the previous round was not accepted in the previous model evaluation, the historical best success rate is updated to the weighted average of the previous round's historical best success rate and the success rate of all target compound route designs in the current round's data generation stage, namely:

where acc_best,k-1 is the historical best success rate of the previous iteration round, and N_cur is the number of consecutive model evaluations, counted since the last evaluation that accepted a trained model, in which the trained model was not accepted. Each time a model evaluation accepts the trained model, N_cur is reset to 0; each time it does not, N_cur += 1.

To obtain the training data set of the model, the molecular SMILES expression M, the MCTS strategy π and the node state label z stored on each node of each synthesis tree are combined into a tuple [M, π, z]. These tuples constitute the training data set of the model: M is converted with a cheminformatics tool into a 1024-dimensional ECFP4 molecular fingerprint and used as the input of the neural network, π is the label for the neural network policy p output by the model, and z is the label for the node value estimate v output by the model. The training data set generated in each model iteration round is retained until the number of model iteration rounds exceeds numItersForTrainExamplesHistory. From then on, only the training data sets generated in the most recent numItersForTrainExamplesHistory rounds, including the current round, are retained.
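The retention rule for historical training data behaves like a fixed-length queue of rounds. The following sketch shows one way to express it with the standard library; the class name `TrainingHistory` and its methods are hypothetical, and each example is assumed to be an (M, π, z) tuple as described above.

```python
from collections import deque


class TrainingHistory:
    """Keep the training examples of the most recent rounds only."""

    def __init__(self, max_rounds=10):
        # max_rounds plays the role of the retention parameter (10 in the text).
        self.rounds = deque(maxlen=max_rounds)

    def add_round(self, examples):
        """Store the (M, pi, z) tuples of one iteration round.

        When the deque is full, the oldest round is dropped automatically.
        """
        self.rounds.append(list(examples))

    def dataset(self):
        """Return all retained examples, oldest round first."""
        return [example for rnd in self.rounds for example in rnd]
```

Storing whole rounds rather than individual examples makes it straightforward to discard the oldest round in one step once the limit is exceeded.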

In the model training stage, the software optimizes the model parameters on the entire training data set currently stored by minimizing the following loss function:

$L = -\pi^{T} \ln p + (z - v)^2 + \lambda \lVert \omega \rVert^2$

where ω denotes the parameters of the model, $\lVert \omega \rVert^2$ is the sum of the squares of all parameters, and λ is the L2 regularization parameter. In the model training stage, the MCTS strategy is used to improve the neural network policy, and the success or failure of the synthesis designs is used to improve the node value estimates in the MC search tree; the method of constructing the model is therefore a reinforcement learning method.
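A per-example evaluation of the loss can be written in a few lines. This sketch assumes the usual cross-entropy sign convention, in which the policy term enters as $-\pi^{T} \ln p$ so that minimizing the loss pulls p toward π; the function name `training_loss` and the plain-list representation of vectors are illustrative choices, not part of the source.

```python
import math


def training_loss(pi, log_p, z, v, weights, lam):
    """Loss for one training example (illustrative sketch).

    pi      -- MCTS strategy (target probabilities)
    log_p   -- ln p output by the network, same length as pi
    z, v    -- state label and predicted node value
    weights -- flat list of model parameters
    lam     -- L2 regularization parameter
    """
    # ASSUMED sign: cross-entropy, i.e. the negative of pi^T ln p.
    policy_term = -sum(t * lp for t, lp in zip(pi, log_p))
    value_term = (z - v) ** 2
    l2_term = lam * sum(w * w for w in weights)
    return policy_term + value_term + l2_term
```

With this convention, a prediction that puts more probability on the template favoured by π yields a smaller loss, which is the behaviour training is meant to reward.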

In the model evaluation stage, the software draws a further number of target compounds from the target compound database (the parameter N_test mentioned above, built into the software) and designs optimal synthetic routes for them with the method in (II). If the resulting success rate exceeds the historical best success rate of this round, acc_best,k, by a given margin (a small quantity ε that is a built-in software parameter, ε = 0.05), the model trained in this round is accepted: it replaces the model used for data generation in this round and is used for target compound route design in the data generation of the next round. Otherwise the trained model is discarded, and the model used in this round continues to be used for target compound route design in the data generation of the next round.
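The acceptance test and the bookkeeping of N_cur can be combined in one small function. This is a sketch under the assumption that "exceeds by a margin" means a strict inequality; the function name `evaluate_round` and its return convention are hypothetical.

```python
EPSILON = 0.05  # margin stated in the text


def evaluate_round(acc_eval, acc_best, n_cur, epsilon=EPSILON):
    """Decide whether the newly trained model is accepted.

    acc_eval -- success rate of the trained model in this evaluation
    acc_best -- historical best success rate of this round
    n_cur    -- consecutive evaluations so far without an accepted model
    Returns (accepted, updated n_cur).
    """
    # ASSUMPTION: strict comparison against acc_best + epsilon.
    accepted = acc_eval > acc_best + epsilon
    return accepted, (0 if accepted else n_cur + 1)
```

For a historical best of 0.60, an evaluation success rate of 0.70 is accepted and resets the counter, whereas 0.62 is rejected and increments it.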

After the software has completed all the model iterations set by the total number of model iteration rounds, it stops running, and the user obtains a model with which the software can design compound synthetic routes.

(IV) method for extracting chemical reaction template classification data from chemical reaction template data and using the data to accelerate MCTS process in step (I)

The method divides templates into 3 types according to the number of product templates on the right side of a chemical reaction template: 1. containing a single product template; 2. contains 2 product templates; 3. containing 3 or more product templates. When a chemically reactive template is applied to a molecule, there may be a match between the substructure of multiple different positions in the molecule and the template. The semiochemical tools are operated without distinguishing between substructures at different positions in the molecule, but with one operation for each matching substructure. If a chemical reaction template contains 2 product templates and the matching molecule contains 3 different positions of the matched substructures, then applying the template to the molecule will yield 6 new molecules. If there are more matches, the number of new molecules grows rapidly and the MC search tree is too complex, resulting in a slow search. Therefore, in the development and evaluation stage (i), for all templates matching a certain molecule, it is necessary to remove the chemical reaction template matching too many sites and generating too many templates according to the number of matched sites in the template and the molecule and the number of generating templates of the template itself. For the 1 st template, at most 3 sites in the molecule are required to match the template; for the 2 nd type template, at most two sites in the molecule are required to be matched, and when one site in the two sites is required to apply a rule, the reaction corresponds to one intramolecular reaction, otherwise, only one site is allowed to be matched; no restriction is made with respect to class 3, since the number of such templates is very small. 
With this restriction, each node of the MC search tree almost always has at most two child nodes (as long as no class 3 template is applied), which greatly reduces the complexity of the MC search tree, reduces the total number of nodes and edges, and speeds up the search. The method classifies the chemical reaction template database using regular expressions to obtain sets of chemical reaction templates containing different numbers of product templates, and provides these sets for use in (I).
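The classification and filtering described in (IV) can be sketched as follows. This is a minimal illustration, not the implementation used by the software: it assumes templates are written as reaction strings of the form `reactant>>product1.product2`, that `.` occurs only as a separator between product templates, and that the caller already knows how many sites matched. The names `count_product_templates`, `classify_template`, `keep_template`, and the `intramolecular` flag are hypothetical; the text does not specify how the intramolecular case is detected.

```python
import re


def count_product_templates(template: str) -> int:
    """Count the product templates on the right side of '>>'."""
    _, arrow, rhs = template.partition(">>")
    if not arrow or not rhs:
        raise ValueError("expected a template of the form 'lhs>>rhs'")
    # Assumption: '.' appears only as a separator between product templates.
    return len([part for part in re.split(r"\.", rhs) if part])


def classify_template(template: str) -> int:
    """Return 1, 2, or 3, where class 3 means three or more product templates."""
    return min(count_product_templates(template), 3)


def keep_template(template: str, n_matched_sites: int,
                  intramolecular: bool = False) -> bool:
    """Decide whether a matching template is kept during expansion."""
    template_class = classify_template(template)
    if template_class == 1:
        return n_matched_sites <= 3
    if template_class == 2:
        # Two sites are tolerated only in the intramolecular case
        # (hypothetical flag supplied by the caller).
        return n_matched_sites <= (2 if intramolecular else 1)
    return True  # class 3: no restriction


amide = "[C:1](=[O:2])[N:3]>>[C:1](=[O:2])O.[N:3]"
print(classify_template(amide), keep_template(amide, 2))
```

Classifying once, ahead of the search, means the per-node filter reduces to a class lookup and a comparison against the number of matched sites.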

(V) To expand the coverage of basic molecules, the method defines additional criteria beyond membership in the basic molecule database; a molecule that satisfies these criteria but does not belong to the basic molecule database is still treated as a basic molecule. The criteria are: 1. if the molecule contains no more than 6 C atoms, it is regarded as a simple organic compound and is automatically judged to be a basic molecule; 2. if the molecule contains any metal atom other than Li, Mg, Cu, Zn, Sn, Pb, or Bi, it is judged directly to be a basic molecule without counting the number of C atoms.
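The criteria in (V) can be expressed as a short predicate. The sketch below is illustrative only: it assumes the element counts of a molecule have already been obtained (for example from a cheminformatics toolkit), and the function name `is_basic_molecule`, its parameters, and the `METALS` set (an illustrative, non-exhaustive list) are assumptions rather than part of the described software.

```python
# Illustrative, non-exhaustive set of metal symbols (an assumption).
METALS = {
    "Li", "Na", "K", "Rb", "Cs", "Be", "Mg", "Ca", "Sr", "Ba",
    "Sc", "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Y", "Zr", "Nb", "Mo", "Ru", "Rh", "Pd", "Ag", "Cd",
    "Hf", "Ta", "W", "Re", "Os", "Ir", "Pt", "Au", "Hg",
    "Al", "Ga", "In", "Sn", "Tl", "Pb", "Bi",
}
# Metals that do NOT trigger criterion 2, as listed in the text.
EXEMPT_METALS = {"Li", "Mg", "Cu", "Zn", "Sn", "Pb", "Bi"}
MAX_CARBONS = 6


def is_basic_molecule(smiles: str, element_counts: dict, basic_db: set) -> bool:
    """Return True if the molecule counts as a basic molecule."""
    if smiles in basic_db:
        return True
    # Criterion 2: any metal outside the exempt list, regardless of C count.
    for element, count in element_counts.items():
        if count > 0 and element in METALS and element not in EXEMPT_METALS:
            return True
    # Criterion 1: no more than 6 carbon atoms.
    return element_counts.get("C", 0) <= MAX_CARBONS


print(is_basic_molecule("CCCCCC", {"C": 6, "H": 14}, set()))
```

Checking the database first keeps the common case cheap; the two extra criteria are consulted only for molecules that are missing from it.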

A CASP program developed according to this scheme can, in principle, use chemical reaction template data from any source and requires only a small number of chemical reaction templates, while still maintaining a high calculation success rate. This is because the scheme is based on a reinforcement learning algorithm. In method (I), a suitably designed policy function and return function (namely, the model neural network) effectively guide the MCTS search process. In method (III), all synthetic route information generated by the program itself can be used for learning, so the model neural network parameters are continuously improved, which in turn feeds back into the MCTS process of (I) and improves the accuracy of the search. The two methods complement and reinforce each other: reinforcement learning strengthens adaptability to the data, lowers the data requirement, and maintains the calculation success rate.

A CASP program developed according to this scheme can design a completely new synthetic route for a target compound that has never been reported in the literature or in patents, provide guidance for designing the synthesis of new compounds, or design a new synthetic route for a compound that already has known synthetic routes. This is because the scheme only lets the model learn information about the chemical reaction templates and does not add information about the chemical reactions from which the templates were generated to the learning process. The model therefore does not design a synthetic route by following chemical reactions in the existing literature or patents, but uses only the information learned in training. This follows directly from the use of a reinforcement learning method.

A CASP program developed according to this scheme can design different synthetic routes for the same target compound. This is because method (II) 2 above samples randomly from the inverse synthesis strategy, so that as few repeated routes as possible are generated.
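Random sampling from an inverse synthesis strategy can be illustrated with a short sketch. It assumes the strategy is available as a mapping from template identifiers to probabilities; the name `sample_template` and the template identifiers are hypothetical, and the actual sampling procedure of method (II) 2 may differ.

```python
import random


def sample_template(policy: dict, rng: random.Random) -> str:
    """Draw one template identifier with probability proportional to its weight."""
    templates = list(policy)
    weights = [policy[t] for t in templates]
    return rng.choices(templates, weights=weights, k=1)[0]


# Hypothetical strategy over three templates for one intermediate molecule.
policy = {"template_a": 0.7, "template_b": 0.3, "template_c": 0.0}
rng = random.Random(0)
print([sample_template(policy, rng) for _ in range(5)])
```

Because each decomposition step draws from the distribution instead of always taking the most probable template, repeated runs on the same target can follow different routes, while templates with zero probability are never chosen.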

The invention separates the MCTS process from the construction of the synthetic route, and lays the groundwork for using the synthetic route design results to improve the performance of the neural network in MCTS. The separation means that the construction of the MC search tree and the construction of the synthesis tree are independent of each other (parts 1 and 2 of (II)). Only in this way can one complete MCTS calculation be carried out for each intermediate molecule (non-basic molecule) in the synthesis tree (or synthetic route), so that the inverse synthesis strategy computed for each intermediate molecule is as accurate as possible. If the MC search tree and the synthesis tree were merged into one, not every molecule in the tree would go through the complete MCTS process: such molecules are not the root node of the tree, and the path taken in each simulation does not necessarily contain them, so the number of simulations passing through them can be less, or even far less, than the total number of MCTS simulations, and the simulation counts are very uneven across nodes (some receive many, some few). In that case, an accurate inverse synthesis strategy cannot be obtained for each intermediate molecule in the synthetic route.
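The separation can be pictured as an outer loop that builds the synthesis tree and calls a complete, independent search for every intermediate molecule. The sketch below is a simplified illustration under stated assumptions: `run_mcts` and `is_basic` are hypothetical callables supplied by the caller, the search is assumed to return the precursors of its root molecule, and state flags and failure handling are not modeled.

```python
from collections import deque


def build_synthesis_tree(target, run_mcts, is_basic, n_simulations):
    """Grow a synthesis tree, running one full search per intermediate molecule."""
    tree = {}  # molecule -> list of precursor molecules
    queue = deque([target])
    while queue:
        molecule = queue.popleft()
        if molecule in tree or is_basic(molecule):
            continue
        # Each intermediate is the ROOT of its own search tree and
        # receives the full simulation budget.
        precursors = run_mcts(root=molecule, n_simulations=n_simulations)
        tree[molecule] = list(precursors)
        queue.extend(precursors)
    return tree


# Toy stand-in for the search: a fixed lookup table (hypothetical data).
toy_routes = {"T": ["A", "B"], "A": ["C", "D"]}
tree = build_synthesis_tree(
    "T",
    run_mcts=lambda root, n_simulations: toy_routes[root],
    is_basic=lambda m: m in {"B", "C", "D"},
    n_simulations=100,
)
print(tree)
```

In this arrangement every non-basic molecule is searched with the same budget, which is the property the merged-tree alternative cannot guarantee.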

The invention applies the neural network model to the MCTS algorithm in a new way: the model guides the MCTS search process and replaces the highly random simulation stage. This addresses the chemical synthesis design problem, greatly reduces the randomness of synthesis design, and makes the optimal route for a compound reproducible. If node value estimates were computed by simulation rather than by the neural network, they would require random sampling, which easily makes the MCTS search too random; a suitable synthetic route might then not be found, and a large number of parallel calculations would be needed to obtain the expected result. With a neural network, the node value estimates are deterministic rather than generated by a stochastic process, so the results are highly reproducible and typically only one calculation is needed.

The method extracts training data from the synthetic routes designed by the software and uses it to improve the performance of the neural network used by MCTS, thereby strengthening the search capability of the MCTS algorithm and improving both the success rate and the quality of the synthetic routes designed by the software.

The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant details can be found in the description of the method.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

1. A method of chemical inverse synthesis analysis, comprising:

2. The chemical inverse synthetic analysis method of claim 1, wherein the inputting of the training compound into the synthetic tree and the searching using the monte carlo search tree to obtain the training data of the inverse synthetic analysis network specifically comprises:

calculating a state marker of the root node;

if the second judgment result is no, judging whether a node with a state flag of 0 exists in the synthesis tree to obtain a third judgment result; if the third judgment result is yes, traversing all leaf nodes with a state flag of 0 in the synthesis tree, and adding child nodes to the leaf nodes with a state flag of 0 according to the inverse synthesis template probability of the training compound and the inverse synthesis template probability of the intermediates obtained by decomposing the training compound; determining the state flags of the child nodes and returning to the step of judging whether the state flag is -1 to obtain a first judgment result;

if the second judgment result is yes, judging whether a node with a state flag of 0 exists in the synthesis tree to obtain a fourth judgment result; if the fourth judgment result is no, the synthesis tree is successfully constructed, all node flags are reset to 1, and the training data are determined according to the leaf nodes and edges of the synthesis tree;

3. The chemical inverse synthesis analysis method according to claim 1, wherein the loss function of the inverse synthesis analysis network is expressed by:

L = π^T ln p + (z - v)² + λ||ω||²

4. The chemical inverse synthesis analysis method of claim 2, wherein the expression of the inverse synthesis template probability is:

5. A chemical inverse synthesis analysis system, comprising:

6. The chemical inverse synthesis analysis system according to claim 5, wherein the training data acquisition module specifically includes:

7. The chemical inverse synthesis analysis system of claim 5, wherein the loss function of the inverse synthesis analysis network is expressed as:

L = π^T ln p + (z - v)² + λ||ω||²

8. The system of claim 6, wherein the inverse synthesis template probability is expressed as: