WO2022034913A1 - Estimation device, training device, graph generation method, and network generation method
- Publication number
- WO2022034913A1 (PCT/JP2021/029717)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- node
- tree
- graph
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/80—Data visualisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
Definitions
- This disclosure relates to an estimation device, a training device, a graph generation method, and a network generation method.
- Methods of generating molecular structures with a trained generative model are being researched. These methods train the generative model using molecular structures as training data, learning the mapping between a molecular structure and its corresponding fixed-length latent vector. At generation time, the latent vector is varied randomly, or searched with a specific property as the target, and the molecular structure corresponding to the latent vector is inferred.
- The way the molecular structure is represented has a great influence on the performance of molecular structure generation.
- With a character-string representation, the string output by the generative model may not correspond to a valid molecular structure, which may degrade performance.
- With the graph format, the representation always corresponds to a valid molecular structure, but it is difficult to efficiently learn substructures that frequently appear in organic compounds (for example, the phenyl group).
- An estimation device that estimates the graph structure of a compound at high speed, and a training device that trains the trained model of the estimation device, are realized.
- The estimation device comprises one or more memories and one or more processors.
- The one or more processors obtain tree information, including node information and edge information, from the latent representation and generate a graph from the tree information.
- The tree information includes the connection information of the nodes.
- A figure showing the conversion in training according to one embodiment; a figure showing an example of the site information according to one embodiment.
- Tree decomposition is a method of mapping a graph representation of a molecular structure to a tree representation.
- "Singleton" indicates a node corresponding to an atom that becomes a branch point when the graph representation of a compound molecule is decomposed into a tree.
- A branch point of the graph becomes a singleton node if the branch point does not belong to a ring structure.
- "Bond" is a node representing two covalently bonded atoms. However, a covalent bond belonging to a ring structure is handled as part of a "ring" node, described next.
- A "ring" is a node corresponding to a ring structure when the graph representation of a compound molecule is decomposed into a tree.
- Typical examples of compounds represented as a ring include, but are not limited to, benzene, pyridine, and pyrimidine, as well as cyclobutadiene, cyclopentadiene, pyrrole, cyclooctatetraene, and cyclooctane. Any ring may be used.
- Site information is information indicating how the nodes obtained by tree decomposition were connected in the original molecular structure. General tree decomposition is irreversible, but in the embodiments of the present disclosure this site information ensures reversibility, so that the tree representation obtained by tree decomposition can be converted back to the graph representation. A tree representation to which this site information is added is called a tree representation with site information.
- "Node" and "edge" mainly refer to nodes and edges in the tree representation. Around the tree decomposition, however, they are to be read as nodes and edges of the graph representation or of the tree representation, as appropriate.
- The data represented by the graph may be referred to as the first type data.
- The data represented by the tree with site information may be referred to as the second type data.
- The latent representation may be referred to as the third type data.
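To make the second type data concrete, the following is a minimal sketch of a container for a tree with site information. The class and field names (`TreeNode`, `SiteEdge`, `SiteTree`, `connect`) are hypothetical and are not taken from the disclosure; the sketch only assumes what the text states, namely that nodes are singletons, bonds, or rings and that each directed edge carries site information and, where needed, site direction information.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    kind: str   # "singleton", "bond", or "ring"
    atoms: list  # element symbols, indexed by site number within the node


@dataclass
class SiteEdge:
    src: int            # index of the source tree node
    dst: int            # index of the destination tree node
    site: int           # site number inside the source node
    direction: int = 0  # site direction: 0, +1, or -1


@dataclass
class SiteTree:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def connect(self, a, site_a, b, site_b, dir_a=0, dir_b=0):
        # Store one directed edge per direction, each with its own site data.
        self.edges.append(SiteEdge(a, b, site_a, dir_a))
        self.edges.append(SiteEdge(b, a, site_b, dir_b))
```

Storing one directed edge per direction mirrors the description of site information being attached separately to edge A→B and edge B→A.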
- FIG. 1 is a block diagram showing an example of the configuration of the estimation device according to the embodiment.
- the estimation device 1 includes an input unit 100, a storage unit 102, a search unit 104, a decoding unit 106, a restoration unit 108, and an output unit 110.
- the estimation device 1 generates and outputs a chemical formula of the compound based on the information input from the input unit 100 or the latent state automatically generated by the estimation device 1.
- the input unit 100 accepts input from the outside.
- the data received by the input unit 100 is stored in the storage unit 102 if necessary.
- the input unit 100 receives a request from a user or receives information that is a seed for a search in the search unit 104.
- The storage unit 102 stores information necessary for the operation of the estimation device 1. It may also store intermediate and final data generated during the processing of the estimation device 1. For example, when the information processing by software in the various operations of the estimation device 1 is concretely realized using hardware resources, the storage unit 102 stores the program for this software. It may also store the hyperparameters and parameters that make up the trained neural network model.
- The search unit 104 acquires the latent representation (third type data) for generating a compound.
- This latent variable may be obtained using random values.
- The search unit 104 may acquire a latent variable obtained by, for example, adding a random value to, or multiplying by a random value, an input latent variable or a latent variable based on an already acquired compound.
- The search unit 104 may optimize the latent representation so that a better result is obtained, based on the result decoded by the decoding unit 106 or the result restored by the restoration unit 108.
- a method based on metaheuristics such as PSO (Particle Swarm Optimization) may be used.
- the decoding unit 106 decodes the latent variable searched by the search unit 104 using the trained neural network model.
- the decoding unit 106 acquires a tree representation with site information corresponding to the latent variable by decoding the latent variable.
- The decoding unit 106 constructs a decoder neural network (hereinafter, decoder NN 120) using, for example, various parameters stored in the storage unit 102. The decoding unit 106 then acquires the data of the tree representation with site information (second type) by inputting the latent variable into the decoder NN 120.
- This decoder NN 120 is, for example, a model optimized by an autoencoder training method.
- The decoding unit 106 converts the data from the third type to the second type by using the decoder NN 120, which outputs second type data when third type data is input.
- This neural network model is optimized by combining an encoder that outputs a latent variable when a tree representation with site information is input with a decoder that outputs a tree representation with site information when a latent variable is input.
- The dimension of the latent variable input to the decoder NN 120 may be lower than the dimension of the information contained in the output tree representation with site information.
- the restoration unit 108 outputs the chemical structural formula of the compound from the tree representation with site information output by the decoding unit 106.
- the restoration unit 108 converts the tree representation with site information into the data of the chemical structural formula representation (graph representation, first type) by executing the reverse operation of the tree decomposition. That is, the restoration unit 108 converts the data from the second type to the first type.
- This inverse transformation to the graph representation is called assembly. The assembly method is described in more detail in the description of the training device below.
- the decoding unit 106 and the restoration unit 108 may perform an operation of determining the acquired result.
- the decoding unit 106 may evaluate the output tree representation with site information by an evaluation function and select whether to output or search again based on the score value.
- the restoration unit 108 may evaluate the output chemical structural formula information by an evaluation function and select whether to output or search again based on the score value.
- the search unit 104 executes a re-search when it is determined by at least one of the decoding unit 106 and the restoration unit 108 to perform a re-search. It should be noted that this determination may not be executed by the decoding unit 106 or the restoration unit 108, but may be determined by the search unit 104 based on the output of the decoding unit 106 or the restoration unit 108.
- the search unit 104 may search for a plurality of latent variables at the same timing. For example, when PSO is used as a search method, the search is executed in parallel for latent variables that are multiple particles.
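As an illustration of a PSO-based search over several latent variables at once, the following is a minimal sketch of standard particle swarm optimization. The function name `pso_search`, the `score` callable, and the coefficient values are assumptions made for this example; in the device, the score would come from evaluating the decoded tree or the restored graph.

```python
import random


def pso_search(score, dim, n_particles=8, n_steps=20,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize score(z) over latent vectors z with a basic particle swarm."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_score = [score(x) for x in xs]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward the personal and global bests.
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            s = score(xs[i])
            if s > pbest_score[i]:
                pbest[i], pbest_score[i] = xs[i][:], s
                if s > gbest_score:
                    gbest, gbest_score = xs[i][:], s
    return gbest, gbest_score
```

Each particle is one candidate latent variable, so the inner loop over particles is the part that could be run in parallel.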
- the decoding unit 106 may also generate information on a plurality of trees from a plurality of latent variables in parallel.
- the restoration unit 108 may also generate information on a plurality of chemical structural formulas from information on a plurality of trees in parallel.
- Here, "information on a tree" means tree information that includes the node information and the edge information; such information may be referred to simply as tree information.
- the output unit 110 outputs appropriate information.
- the information output by the output unit 110 is a chemical structural formula.
- the output unit 110 outputs information on the chemical structural formula to the outside.
- The output unit 110 may output the information on the chemical structural formula to the storage unit 102.
- Output is thus a concept that includes storing in the internal storage unit 102 as well as outputting to the outside.
- The output unit 110 outputs the information when the search has been completed once.
- When the output information is optimized by the search unit 104, only the final data may be output, or data from a plurality of steps, including intermediate data, may be output together with the final data.
- FIG. 2 is a chart showing the concept of the estimation process of the estimation device 1. The processing flow of the estimation device 1 will be described with reference to FIG. 2.
- The search unit 104 searches for the latent variable (third type data) z' (S100).
- This latent representation may be acquired as random values, or a latent variable close to the latent representation of a previously generated compound may be acquired.
- In the latter case, the structural formula (graph representation) of the compound is converted into a latent representation z via tree decomposition and an encoder. Then, by adding a small value to this latent representation z or multiplying it by a value close to 1, the latent representation z' that is the target of the search is acquired.
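The perturbation of z into z' can be sketched as follows. The function name `perturb_latent`, the Gaussian noise, and the default `scale` are assumptions for illustration; the text only says that a small value is added or that z is multiplied by a value close to 1.

```python
import random


def perturb_latent(z, scale=0.01, mode="add", rng=None):
    """Return z', a latent vector close to z."""
    rng = rng or random.Random()
    if mode == "add":
        # Add a small random value to every component.
        return [v + rng.gauss(0.0, scale) for v in z]
    if mode == "mul":
        # Multiply every component by a value close to 1.
        return [v * (1.0 + rng.gauss(0.0, scale)) for v in z]
    raise ValueError("mode must be 'add' or 'mul'")
```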
- The decoding unit 106 inputs the latent representation z' generated by the search unit 104 into the decoder NN 120 and acquires the data of the tree representation with site information (second type) (S102).
- In the figure, the individual molecular formulas are the nodes of the tree representation with site information, and the arrows between nodes indicate the site information.
- the restoration unit 108 acquires the graph representation x'(first type data) of the compound by assembling using the tree representation with site information generated by the decoding unit 106 (S104).
- The search unit 104 may acquire a new latent representation (third type data) using the data generated by at least one of the decoding unit 106 and the restoration unit 108, and repeat the search (S106).
- the estimation device 1 can generate information on the compound x'.
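The flow S100 to S106 can be summarized as a loop. This is only a sketch: `search`, `decode`, `assemble`, and `evaluate` are hypothetical callables standing in for the search unit 104, the decoder NN 120, the restoration unit 108, and an evaluation function, and the stopping rule based on `threshold` and `max_iters` is an assumption.

```python
def estimate(search, decode, assemble, evaluate, threshold, max_iters=100):
    """Search latent space until a restored graph scores well enough."""
    z = search(None)              # S100: initial latent variable (third type)
    graph = None
    for _ in range(max_iters):
        tree = decode(z)          # S102: tree with site information (second type)
        graph = assemble(tree)    # S104: graph representation (first type)
        if evaluate(graph) >= threshold:
            break
        z = search(z)             # S106: search again from the current result
    return graph
```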
- FIG. 3 is a block diagram showing the configuration of the training device according to the embodiment.
- The training device 2 includes an input unit 200, a storage unit 202, a decomposition unit 204, an encoding unit 206, a decoding unit 208, a restoration unit 210, an update unit 212, and an output unit 214.
- the input unit 200 receives the input of data necessary for the operation of the training device 2.
- the training device 2 acquires, for example, data of a graph representation (first type) of a compound as training data via an input unit 200. Further, input of hyperparameters of the neural network model to be optimized may be accepted.
- the storage unit 202 stores data necessary for the operation of the training device 2.
- the training data acquired by the training device 2 via the input unit 200 may be stored in the storage unit 202. Since other operations are almost the same as those of the storage unit 102 of the estimation device 1, detailed description thereof will be omitted.
- The decomposition unit 204 converts the first type data acquired via the input unit 200 into data of the tree representation with site information (second type) by executing tree decomposition. As another example, the decomposition unit 204 may acquire the second type data based on the first type data stored in the storage unit 202. That is, the decomposition unit 204 converts the data from the first type to the second type.
- The encoding unit 206 inputs the second type data generated by the decomposition unit 204 to the encoder NN 220 and acquires the data of the latent representation (third type). For example, the encoding unit 206 forms the encoder NN 220 based on various parameters of the model stored in the storage unit 202, and generates the third type data from the second type data. That is, the encoding unit 206 converts the data from the second type to the third type.
- the decoding unit 208 inputs the third type data generated by the encoding unit 206 to the decoder NN 222 to generate the second type data.
- the decoder NN 222 and the decoding unit 208 correspond to the decoder NN 120 and the decoding unit 106 of the estimation device 1 described above, respectively. That is, the decoding unit 208 converts the data from the third type to the second type.
- the restoration unit 210 acquires the first type data by assembling the second type data output by the decoding unit 208.
- the restoration unit 210 corresponds to the restoration unit 108 of the estimation device 1 described above. That is, the restoration unit 210 converts the data from the second type to the first type.
- The update unit 212 optimizes the encoder NN 220 and the decoder NN 222 based on the first type data of the compound output by the restoration unit 210.
- Any of the various methods used for autoencoder optimization may be used for this optimization.
- For example, the VAE (Variational Autoencoder) method, which handles latent variables as probability distributions, may be used.
- The update unit 212 may update the model based on the second type data output by the decoding unit 208 instead of the first type data.
- the optimization is executed by comparing the data obtained by converting the first type data of the training data into the second type and the data output by the decoding unit 208.
- the update unit 212 updates the parameters of the encoder and the decoder using at least one of the data of the first type and the second type. Desirably, the update unit 212 performs parameter update using the second type of data.
- the output unit 214 outputs various parameters of the encoder and decoder optimized by updating the parameters by the update unit 212. Similar to the estimation device 1, the output unit 214 may output the acquired data to the outside or may output it to the storage unit 202.
- the training device 2 after training may be used as the estimation device 1.
- FIG. 4 is a chart showing the concept of the training process of the training device 2. The processing flow of the training device 2 will be described with reference to FIG. 4.
- the training device 2 acquires the compound data x via the input unit 200.
- This data may be, for example, the first type, that is, data in a graph format.
- the decomposition unit 204 decomposes the compound data of the first type into trees and acquires the data of the second type, that is, the data of the tree representation with the site information (S200).
- the encoding unit 206 inputs the second type data generated by the decomposition unit 204 to the encoder NN 220 to generate the latent expression z which is the third type data (S202).
- the decoding unit 208 inputs the latent expression z, which is the third type data generated by the encoding unit 206, to the decoder NN 222, and acquires the second type data (S204).
- the restoration unit 210 assembles from the second type data generated by the decoding unit 208 to acquire a graph representation of the compound which is the first type data (S206).
- The update unit 212 updates the parameters of the encoder NN 220 and the decoder NN 222 based on the data restored by the restoration unit 210 (S208).
- the parameters may be updated based on the second type data generated by the decoding unit 208 instead of the first type data restored by the restoration unit 210.
- the processing of S206 is not essential.
- the training device 2 can generate at least information about the decoder NN222.
- the output unit 214 may output not only the decoder NN222 but also the parameters of the encoder NN220. In this case, the parameters of the encoder NN 220 can be used for further learning in the future.
- the decoder NN222 optimized by the training device 2 can be used as the decoder NN120 in the estimation device 1.
- The estimation device 1 can perform inference that generates first type data by designating a single point in the third type data space.
- The decomposition unit 204 and the restoration unit 210 of the training device 2, and the restoration unit 108 of the estimation device 1, perform conversion by tree decomposition and assembly as described below.
- the tree-decomposed node will be one of the above-mentioned singleton, bond, or ring nodes. There are four types of connections between nodes: bond-bond, bond-singleton (or singleton-bond), ring-bond (or bond-ring), and ring-ring.
- Bond-Bond is a case where two bonds are connected to each other.
- Bond-singleton is a case where one bond connects to the singleton which is the branch point.
- Ring-bond is a case where one bond is connected to the ring.
- Ring-ring is a case where two rings are directly connected. In this case, there are cases where they are condensed (one bond is shared and connected) and cases where they are spiro-bonded (only one atom is shared and connected).
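The four connection types, with the two ring-ring variants, can be expressed as a small classifier. This is a sketch only: the function name `connection_type` and the `shared_atoms` argument (the number of atoms two rings have in common) are hypothetical.

```python
def connection_type(kind_a, kind_b, shared_atoms=None):
    """Classify the connection between two tree nodes by their kinds."""
    pair = frozenset((kind_a, kind_b))
    if pair == {"bond"}:
        return "bond-bond"
    if pair == {"bond", "singleton"}:
        return "bond-singleton"
    if pair == {"bond", "ring"}:
        return "ring-bond"
    if pair == {"ring"}:
        if shared_atoms == 2:
            return "ring-ring (condensed)"  # one bond is shared
        if shared_atoms == 1:
            return "ring-ring (spiro)"      # only one atom is shared
        raise ValueError("ring-ring needs shared_atoms of 1 or 2")
    raise ValueError(f"unsupported connection: {kind_a}-{kind_b}")
```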
- connection information of the nodes indicating the relationship in which the nodes are connected is added to the tree information as information about the site.
- This connection information includes, for example, information on the position where the node connects and information on the direction in which the node connects, as will be described in detail later.
- For a bond-bond connection, the decomposition unit 204 stores, as site information, which atom was shared between the nodes. By using this site information, the restoration units 108 and 210 can restore the graph representation.
- For a bond-singleton connection, the restoration units 108 and 210 can acquire the graph representation without additional information.
- For a ring-bond connection, the decomposition unit 204 stores, as site information, the position in the ring, that is, the atom, to which the bond was connected.
- The restoration units 108 and 210 can uniquely determine the connection position of the bond on the ring based on this site information.
- For a condensed ring-ring connection, the decomposition unit 204 stores, as site information, which bond was shared. In addition, the direction in which the bond was connected is recorded as site direction information. By using this site information and site direction information, the restoration units 108 and 210 can restore the graph representation.
- The site information is defined in advance so that each atom in the group of atoms represented by a node (two covalently bonded atoms for a bond node, three or more atoms belonging to a ring structure for a ring node) can be identified uniquely within the node. For a bond node, for example, numerical values such as 0 and 1 are given to the atoms contained in the node as site information. For a ring node, for example, numbers are assigned so as to pass through all atoms clockwise from the reference atom.
- The reference atom, whose site information is 0, may be chosen, for example, as the first atom when the element names are sorted in dictionary order by ASCII code. As another example, it may be chosen based on the atomic number.
- Any atom may be set to 0.
- Alternatively, the numbering may be chosen so that the resulting sequence comes first in dictionary order by ASCII code, or so that the sequence of atomic numbers is the smallest.
- As the atomic number, a 1- to 2-digit number or an extended 3-digit number may be used.
- Site information may be assigned so that the resulting 18-digit numerical value is the smallest.
- The method is not limited to the above; any method may be used as long as the same site information is assigned to bond nodes or ring nodes having the same configuration.
- As long as site information is assigned by the same method, any of the candidate atoms may be used as the reference in a node with symmetry, such as a benzene ring, where dictionary order or the like makes no difference.
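One way to make the numbering of a ring node reproducible is to pick the starting atom whose clockwise reading is smallest in dictionary order. The sketch below assumes the ring is given as a list of element symbols in clockwise order; the function name `canonical_ring_sites` is hypothetical, and this is just one of the admissible rules described above, not necessarily the one used in the figures.

```python
def canonical_ring_sites(atoms):
    """Return (atoms renumbered from the reference atom, original start index).

    The reference atom is the start of the clockwise rotation that is
    smallest in dictionary order; ties are broken by the first candidate.
    """
    n = len(atoms)
    rotations = [tuple(atoms[(s + i) % n] for i in range(n)) for s in range(n)]
    start = min(range(n), key=lambda s: rotations[s])
    return list(rotations[start]), start
```

Because the result depends only on the cyclic sequence, two ring nodes with the same configuration receive the same site numbers regardless of where the input list happened to start.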
- Site information may be added to all tree nodes in the implementation.
- the site information may be set to any value or 0.
- it is assumed that the site information is added to all the nodes.
- the bond-singleton connection may be configured without adding site information.
- FIG. 5 is a diagram showing an example of site information.
- FIG. 5 shows the site information regarding the bond-bond connection, but the bond-singleton case can be processed in the same manner.
- circles indicate nodes and arrows indicate directed edges with features.
- Node A and node B are both nodes representing C-N, and these nodes are connected to each other. Assume that the carbon atom (C) in each node is numbered 0 and the nitrogen atom (N) is numbered 1. As shown in FIG. 5, this tree representation admits two possible molecular structures, CH3NHCH3 and NH2CH2NH2. When the information 0 is attached to edge A→B and the information 0 is attached to edge B→A, the graph representation (compound) obtained by assembling this tree representation is uniquely determined as NH2CH2NH2.
- site information is not essential, but it may be added in terms of implementation.
- Restoration units 108 and 210 restore based on this site information.
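The FIG. 5 example can be reproduced with a small sketch that merges two bond nodes at the atom named by the site information. The function name `merge_bonds` and the list-of-symbols representation are assumptions for illustration; hydrogens are left implicit.

```python
def merge_bonds(node_a, site_a, node_b, site_b):
    """Assemble two bond nodes that share one atom into a three-atom chain.

    Each node is a two-element list of element symbols indexed by site
    number; site_a and site_b name the shared atom in each node.
    """
    if node_a[site_a] != node_b[site_b]:
        raise ValueError("site information points at different elements")
    shared = node_a[site_a]
    # The non-shared atom of each bond ends up on either side of the shared one.
    return [node_a[1 - site_a], shared, node_b[1 - site_b]]


# Site 0 on both edges: the carbon is shared, giving N-C-N (NH2CH2NH2).
print(merge_bonds(["C", "N"], 0, ["C", "N"], 0))
# Site 1 on both edges: the nitrogen is shared, giving C-N-C (CH3NHCH3).
print(merge_bonds(["C", "N"], 1, ["C", "N"], 1))
```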
- FIG. 6 is a diagram showing an example of site information, and shows site information regarding a ring-bond connection.
- Node A is a ring and node B is a bond.
- The "2" for node A refers to a numbering that starts from a predetermined atom in the ring of the graph of node A and passes through all the atoms. As shown in FIG. 6, numbers 0, 1, ..., 4 are assigned in order starting from S. Node B is numbered 0, 1 as in FIG. 5.
- From the information of the tree with site information, the restoration units 108 and 210 can uniquely restore the graph on the right, rather than the graph on the left, in the lower part of the figure.
- For example, the restoration units 108 and 210 connect the atom at position 2 of node A, given by the edge from node A to node B, with the atom at position 0 of node B, given by the edge from node B to node A.
- By adding the positions of the connecting atoms in the ring as site information in this way, the restoration units 108 and 210 can uniquely restore the atomic graph from the tree information.
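A ring-bond assembly step can be sketched in the same style. The function name `attach_bond_to_ring`, the edge-list output, and the element symbols in the usage line are hypothetical; only the rule "the atom at the ring site is the same atom as the one at the bond site" comes from the text.

```python
def attach_bond_to_ring(ring, ring_site, bond, bond_site):
    """Attach a bond node to a ring node at the atoms named by the sites.

    ring: element symbols indexed by site number around the ring.
    bond: two element symbols indexed by site number.
    Returns (atoms, edges) of the assembled graph as index pairs.
    """
    if ring[ring_site] != bond[bond_site]:
        raise ValueError("site information points at different elements")
    n = len(ring)
    atoms = list(ring) + [bond[1 - bond_site]]
    edges = [(i, (i + 1) % n) for i in range(n)]  # bonds around the ring
    edges.append((ring_site, n))                  # the attached substituent
    return atoms, edges


atoms, edges = attach_bond_to_ring(["S", "C", "C", "C", "C"], 2, ["C", "N"], 0)
```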
- FIG. 7 is a diagram showing an example of site information, and shows site information regarding ring-ring condensed connection.
- Both node A and node B are rings.
- node A is a 6-membered ring aromatic compound
- node B is a 5-membered ring aromatic compound.
- Node A is numbered from 0 to 5 in order from a certain side in the atomic graph.
- node B is assigned a number from 0 to 4.
- the site information number is added to the edge of the atomic graph instead of the node of the atomic graph.
- the edge number of the atomic graph and the connecting direction are added as the site information of the node.
- The number 0 and the direction +1 are added as the site information of edge A→B.
- The number 3 and the direction +1 are added as the site information of edge B→A.
- the direction means, for example, whether to connect in the order in which the numbers are added or in the reverse order.
- the direction is given in the clockwise direction, but is shown as an example, and may be counterclockwise as long as it can be uniquely specified.
- the restoration units 108 and 210 can uniquely restore the connection state on the lower right side of the figure.
- When the site information of edge A→B is 0 (+1), the site information of edge B→A may be set to 1 (-1) to indicate the same connection.
- FIG. 8 is a diagram showing an example of site information, and shows site information regarding a ring-ring spiro connection. Both node A and node B are rings and have the same configuration as in FIG. 7.
- the site direction information is set to 0.
- The restoration units 108 and 210 may first refer to the site direction information when restoring a graph from a tree containing a ring-ring connection. If the site direction information is 0, as shown in FIG. 8, the restoration units 108 and 210 determine that the site information indicates an atom number, connect the specified atoms to each other, and restore the graph information. If the site direction information is ±1, on the other hand, the graph information is restored by connecting the edges as in the case shown in FIG. 7.
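The dispatch on site direction information can be sketched as below. The function name `shared_atoms` is hypothetical, and the sketch assumes that ring bond number k joins atoms k and k+1 (modulo the ring size), which the text does not state explicitly.

```python
def shared_atoms(ring_size, site, direction):
    """Return the ring atoms that take part in a ring-ring connection.

    direction == 0: spiro connection, site is an atom number.
    direction == +1 or -1: condensed connection, site is a bond number and
    the sign gives the order in which the bond's two atoms are matched.
    """
    if direction == 0:
        return (site,)
    if direction == 1:
        return (site, (site + 1) % ring_size)
    if direction == -1:
        return ((site + 1) % ring_size, site)
    raise ValueError("direction must be 0, +1, or -1")
```

For a condensed connection, the ordered pairs returned for the two rings would be matched element by element when the rings are fused.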
- the training method of the neural network model (encoder NN, decoder NN) is also changed from the usual machine learning method.
- TreeGRU (Tree Gated Recurrent Unit) is used to train on the above information with sites, but TreeLSTM (Tree Long Short-Term Memory), for example, can also be used.
- TreeGRU uses, for example, the method shown in W. Jin et al., "Junction Tree Variational Autoencoder for Molecular Graph Generation," arXiv:1802.04364v4, March 29, 2019.
- This TreeGRU can realize autoencoder optimization by VAE. These are given only as examples; any appropriate network construction method and optimization method can be applied.
- the training can be expressed by the following equation.
- EdgeTreeGRU () in equation (1) is a GRU designed to input and output a tree representation with site information as a message.
- x is a feature vector indicating, for example, the type of node, and is represented by, for example, a one-hot vector.
- e is a vector indicating edge information, that is, a feature of the site information (including site direction information), and is represented by, for example, a one-hot vector.
- h is a message vector between nodes. By defining feature vectors that include the site information in this way, and by treating the message vector between nodes as the hidden vector of the GRU, the processing is executed in the same way as in a GRU.
- σ() in each equation denotes the sigmoid function, and ⊙ denotes the element-wise product.
- W and U represent weights, and b represents a bias term.
- The message vector h is calculated according to equations (1) to (6).
- The site information is concatenated, as shown in equations (2), (4), and (5), so that it is included in the message vector of the GRU.
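Equations (1) to (6) are not reproduced in this text. For orientation only, the tree GRU message update of the cited Jin et al. paper has the following form, written here with this document's symbols (x for node features, h for messages, W and U for weights, b for bias). N(i) for the neighbours of node i and the tilde for the candidate message are notation introduced for this sketch. This is not the EdgeTreeGRU of the disclosure: according to the text, the site feature e is additionally concatenated into the message computation, and exactly where it enters cannot be recovered from this text.

```latex
\begin{align}
s_{ij} &= \sum_{k \in N(i) \setminus j} h_{ki} \\
z_{ij} &= \sigma\left(W^{z} x_i + U^{z} s_{ij} + b^{z}\right) \\
r_{ki} &= \sigma\left(W^{r} x_i + U^{r} h_{ki} + b^{r}\right) \\
\tilde{h}_{ij} &= \tanh\Bigl(W x_i + U \sum_{k \in N(i) \setminus j} r_{ki} \odot h_{ki}\Bigr) \\
h_{ij} &= (1 - z_{ij}) \odot s_{ij} + z_{ij} \odot \tilde{h}_{ij}
\end{align}
```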
- The encoding unit 206 forms the encoder NN 220 by using the EdgeTreeGRU () of the above equation (1) as a network representing the GRU.
- the decoding unit 208 also forms the decoder NN 222 using EdgeTreeGRU ().
- the decoding unit 208 decodes the site information as well as decodes the graph information based on the following equation.
- u and W represent weights.
- The activation function in equation (7) is ReLU.
- Based on equation (7), the decoding unit 208 calculates the probability that the node has further child nodes from the output z of the previous step, the feature x of the node, and the received message h.
- Q in Eq. (8) indicates the characteristics of the node when a child node is created.
- the decoding unit 208 infers the site information and the site direction information in addition to the inference of the node information of the tree described above.
- the decoding unit 208 calculates an intermediate variable based on the equation (9) using the output of the previous step and the input message.
- the decoding unit 208 acquires the site information from the result and weight of the equation (9) based on the equation (10), and the site direction from the result and the weight of the equation (9) based on the equation (11). Infer information.
- the decoding unit 208 acquires the information of the tree with the site from the latent variable.
- the above is the operation of the encoding unit 206 and the decoding unit 208 as an autoencoder, but it is also possible to use only the decoding NN.
- the decoding unit 106 acquires the information of the tree with the site based on the operations of the equations (7) to (11) for the latent variable.
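Since equations (7) to (11) themselves are not reproduced here, the following is a hedged sketch of the roles the text assigns to them: a child-node probability computed from z, x, and h through a ReLU layer, and a shared intermediate variable feeding two separate heads for the site and the site direction. The exact functional forms, the use of softmax for the two heads, the three direction classes (-1, 0, +1), and the function names are assumptions.

```python
import math

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    top = max(v)                              # subtract the max for stability
    exps = [math.exp(a - top) for a in v]
    total = sum(exps)
    return [a / total for a in exps]

def child_probability(z, x, h, W, u):
    """Role of equation (7), assumed form: sigmoid(u . ReLU(W [z; x; h]))."""
    hidden = relu(matvec(W, z + x + h))       # list concatenation [z; x; h]
    score = sum(a * b for a, b in zip(u, hidden))
    return 1.0 / (1.0 + math.exp(-score))

def site_heads(z, h, W, U_site, U_dir):
    """Roles of equations (9) to (11), assumed form.

    A shared intermediate variable is computed from the previous output z and
    the input message h; one head scores the site index and the other scores
    the site direction.
    """
    mid = relu(matvec(W, z + h))
    return softmax(matvec(U_site, mid)), softmax(matvec(U_dir, mid))
```

Sharing the intermediate variable between the two heads mirrors the description that both the site information and the site direction information are derived from the result of equation (9), each with its own weight.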
- The update unit 212 calculates an evaluation value (loss) between the training data and the tree information encoded and decoded based on the above equations, and, based on this evaluation value, updates the weights U, W, and u of each equation and, in some cases, the bias b.
- The loss is expressed, for example, by the following equation.
- p_hat, q_hat, s_hat, and d_hat represent the ground truth values for the predicted values p, q, s, and d, respectively.
- w_s and w_d are hyperparameters for adjusting the balance between the site information and the site direction information, and need to be set appropriately.
- Each L is a loss function that is set appropriately for each variable.
- In contrast to equation (12), which is a loss function for a general TLSTM, a loss function modified to use the site-attached information is defined.
- The update unit 212 updates the network parameters so as to minimize the cross-entropy loss shown in equation (13). By repeating this operation, the update unit 212 optimizes the encoder NN 220 and the decoder NN 222 (decoder NN 120). For example, q_hat can be obtained by having the decomposition unit 204 decompose the atomic graph acquired as training data into a tree with sites.
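The loss equation is not reproduced in this text. As a sketch of the structure described above (one loss term per predicted variable, with w_s and w_d weighting the site and site direction terms), one possible form using cross-entropy for every term is shown below. The dictionary keys, the assumption that every decoding step contributes all four terms, and the function names are illustrative only.

```python
import math

EPS = 1e-12  # guards against log(0)

def cross_entropy(probs, target):
    """Negative log-probability assigned to the ground-truth class index."""
    return -math.log(max(probs[target], EPS))

def binary_cross_entropy(p, target):
    """Binary cross-entropy for a single probability p and a 0/1 target."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def site_tree_loss(steps, w_s=1.0, w_d=1.0):
    """Sum the per-step losses over (prediction, ground truth) pairs.

    p: probability of a further child node, q: node label distribution,
    s: site distribution, d: site direction distribution.
    """
    total = 0.0
    for pred, truth in steps:
        total += binary_cross_entropy(pred["p"], truth["p"])
        total += cross_entropy(pred["q"], truth["q"])
        total += w_s * cross_entropy(pred["s"], truth["s"])
        total += w_d * cross_entropy(pred["d"], truth["d"])
    return total
```

Raising w_s or w_d makes errors in the site or site direction predictions count more heavily relative to the tree topology and node label terms, which is the balance the hyperparameters are said to control.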
- Optimization by the update unit 212 is executed using a general method.
- For example, the Teacher Forcing method, in which correct-answer data is input to the next step of the GRU, may be used.
- Other methods such as Scheduled Sampling and Professor Forcing can also be used.
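The difference between Teacher Forcing and Scheduled Sampling can be illustrated with a toy decoding loop. This is a sketch only: `step_fn` stands in for one step of the decoder, and the name `decode_sequence` and the `teacher_ratio` parameter are hypothetical. Professor Forcing, which relies on an additional discriminator, is not covered.

```python
import random

def decode_sequence(step_fn, start, targets, teacher_ratio=1.0, rng=None):
    """Run a step-wise decoder over a ground-truth sequence.

    teacher_ratio = 1.0 : pure Teacher Forcing (always feed the ground truth)
    teacher_ratio = 0.0 : free running (always feed the model's own output)
    in between          : a simple form of Scheduled Sampling
    """
    rng = rng or random.Random(0)
    inputs, outputs = [], []
    current = start
    for target in targets:
        inputs.append(current)
        prediction = step_fn(current)
        outputs.append(prediction)
        # Choose what the next step sees as its input.
        current = target if rng.random() < teacher_ratio else prediction
    return inputs, outputs

# Toy "model" that always overshoots by one; the true sequence counts up by 1.
overshoot = lambda value: value + 2
forced_inputs, _ = decode_sequence(overshoot, 0, [1, 2, 3], teacher_ratio=1.0)
free_inputs, _ = decode_sequence(overshoot, 0, [1, 2, 3], teacher_ratio=0.0)
```

With Teacher Forcing the inputs stay on the ground-truth sequence (0, 1, 2) even though the model is wrong at every step, whereas in free-running mode the errors accumulate in the inputs (0, 2, 4).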
- The site information has been described with reference to FIGS. 5 to 8; next, how this site information is encoded will be described.
- The decomposition unit 204 adds the site information and the site direction information to the nodes that become bonds, singletons, and rings at the timing of tree decomposition.
- Site information may be given in one direction.
- The decomposition unit 204 encodes one piece of site information on one edge. There is a method of encoding the site information of the departure side of the edge and a method of encoding the site information of the arrival side of the edge.
- In the departure-side method, the site information is encoded by adding (site of node A, site direction) as an edge feature to the edge from node A to node B.
- In FIG. 6, the site information given to the edge from node A to node B is encoded as (2, 0), and the site information given to the edge from node B to node A is encoded as (0, 0).
- In FIG. 7, the site information given to the edge from node A to node B is (0, +1), and the site information given to the edge from node B to node A is (3, +1). Alternatively, the site information given to the edge from node B to node A may be (1, -1).
- In FIG. 8, the site information given to the edge from node A to node B is (0, 0), and the site information given to the edge from node B to node A is (3, 0).
- the site direction is 0
- In the arrival-side method, the site information is encoded by adding (site of node B, site direction) as an edge feature to the edge from node A to node B.
- In FIG. 6, the site information given to the edge from node A to node B is encoded as (0, 0), and the site information given to the edge from node B to node A is encoded as (2, 0).
- In FIG. 7, the site information given to the edge from node A to node B is (3, +1), and the site information given to the edge from node B to node A is (0, +1).
- Site information can be added using the information as seen from the node itself.
- The other site information is not essential.
- When the connecting atoms are the same atom, that atom may be identified in the other node from one piece of site information, and the result may be used as the site information. In this way, the site information may be omitted depending on the situation.
- Both pieces of site information may be given as the edge feature. That is, both pieces of site information may be encoded on one directed edge.
- In FIG. 6, the site information given to the edge from node A to node B is encoded as (2, 0, 0), and the site information given to the edge from node B to node A is encoded as (0, 2, 0).
- In FIG. 7, the site information given to the edge from node A to node B is encoded as (0, 3, +1), and the site information given to the edge from node B to node A is encoded as (3, 0, +1).
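The three encodings described above (departure side, arrival side, and both) can be compared side by side with a small helper. The function name `encode_edge_sites` and its `mode` argument are hypothetical, and the sketch assumes the same site direction value is stored on both directed edges, as in the quoted examples; the alternative (1, -1) encoding mentioned for FIG. 7 is not covered.

```python
def encode_edge_sites(site_a, site_b, direction, mode="departure"):
    """Return (feature on edge A->B, feature on edge B->A).

    site_a, site_b : site index on node A and on node B for this connection
    direction      : site direction (0, +1, or -1)
    mode           : "departure", "arrival", or "both"
    """
    if mode == "departure":
        # Each directed edge carries the site of the node it leaves.
        return (site_a, direction), (site_b, direction)
    if mode == "arrival":
        # Each directed edge carries the site of the node it reaches.
        return (site_b, direction), (site_a, direction)
    if mode == "both":
        # Each directed edge carries (own site, other site, direction).
        return (site_a, site_b, direction), (site_b, site_a, direction)
    raise ValueError(f"unknown mode: {mode}")

# Values quoted for FIG. 6 (site 2 on node A, site 0 on node B, direction 0).
assert encode_edge_sites(2, 0, 0, "departure") == ((2, 0), (0, 0))
assert encode_edge_sites(2, 0, 0, "arrival") == ((0, 0), (2, 0))
assert encode_edge_sites(2, 0, 0, "both") == ((2, 0, 0), (0, 2, 0))

# Values quoted for FIG. 7 (site 0 on node A, site 3 on node B, direction +1).
assert encode_edge_sites(0, 3, +1, "departure") == ((0, 1), (3, 1))
assert encode_edge_sites(0, 3, +1, "both") == ((0, 3, 1), (3, 0, 1))
```

The "both" mode makes each directed edge self-sufficient at the cost of a longer edge feature, whereas the one-sided modes rely on the reverse edge to supply the other node's site.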
- The site direction may be calculated, for example, from the numbers assigned to the nodes of the graph.
- Let the adjacent atomic nodes of node B be b_l and b_k (l and k are the atomic node numbers).
- the site direction is (i, l, +1).
- the site direction is -1.
- the site information at the edge of node A to node B is (i, l, -1).
- When one of l and k is 0 and the other is the maximum value of the atomic node number, the opposite holds.
- By acquiring the site direction information in this way, the uniquely decomposed tree information can be restored to the graph. Note that the above method of assigning the site direction is described as an example; any method of assignment may be used as long as the site direction information allows unique conversion into a graph at bonds that are fused in a ring.
- In training, it is desirable to use bidirectional site information, but this is not a limitation. This is thought to be because, in decoding, the presence of bidirectional site information allows the information on which node connects to a given node to be read from both nodes, rather than being given only by the information from the other node.
- The restoration units 108 and 210 reconstruct the atomic graph based on the information of the tree with sites output by the decoding unit. For example, the nodes may be restored in order from one node of the graph by an autoregressive method.
- Because the site information exists in addition to the tree information, the inference from each node to the next node can be uniquely determined.
- The inference of the node may be performed by applying the reverse of the operation for adding the site information described above.
- FIG. 9 is a diagram illustrating the reconstruction of the atomic graph.
- The generation of the tree structure may be executed by the same method as an autoregressive model such as an RNN (Recurrent Neural Network).
- The restoration unit generates node 1 from the acquired latent vector as a predetermined starting node using the neural network model of the decoding unit.
- The restoration unit then autoregressively generates node 2 based on the latent vector and the information of node 1.
- A neural network model having an autoregressive configuration, as typified by an RNN, may be used to generate node 2.
- The restoration unit repeats this operation until all the nodes (for example, the nodes up to node N) are generated.
- In this way, the tree structure, and hence the molecular structure, can be obtained.
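The node-by-node generation described above has the shape of an ordinary autoregressive loop. In the sketch below, `next_node` is a hypothetical stand-in for the decoder network: it receives the latent vector and the nodes generated so far and returns the next node, or `None` when no further node is needed. The names and the stopping convention are assumptions.

```python
def generate_nodes(latent, next_node, max_nodes=50):
    """Generate nodes one at a time, each conditioned on the latent vector
    and on all previously generated nodes."""
    nodes = []
    while len(nodes) < max_nodes:
        node = next_node(latent, nodes)   # node 1 depends only on the latent
        if node is None:                  # the model signals completion
            break
        nodes.append(node)
    return nodes

# Toy stand-in for the decoder: replays labels stored in the "latent" list.
def toy_next_node(latent, nodes):
    return latent[len(nodes)] if len(nodes) < len(latent) else None

tree_nodes = generate_nodes(["ring", "bond", "singleton"], toy_next_node)
```

The `max_nodes` cap is a practical safeguard so that a model that never signals completion still terminates.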
- As described above, according to the present embodiment, adding the site information at the timing of decomposing the atomic graph into a tree makes it possible to uniquely restore the atomic graph from the tree information.
- In other words, tree decomposition with sites is proposed as a method of achieving unique restoration.
- An autoencoder can be used as a method for inferring the information of this site-attached tree decomposition from latent variables.
- The trained inference models described above may be understood as a concept that includes, for example, a model that has been trained as described and then distilled by a general method.
- Each device in the above-described embodiment may be configured by hardware, or may be configured by information processing of software (a program) executed by a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or the like.
- When configured by information processing of software, the software that realizes at least a part of the functions of each device in the above-described embodiment may be stored in a non-transitory storage medium (non-transitory computer-readable medium) such as a flexible disk, a CD-ROM (Compact Disc-Read Only Memory), or a USB (Universal Serial Bus) memory, and loaded into a computer to execute the information processing of the software. Further, the software may be downloaded via a communication network. Further, information processing may be executed by hardware by implementing the software in a circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
- the type of storage medium that stores the software is not limited.
- the storage medium is not limited to a removable one such as a magnetic disk or an optical disk, and may be a fixed type storage medium such as a hard disk or a memory. Further, the storage medium may be provided inside the computer or may be provided outside the computer.
- FIG. 10 is a block diagram showing an example of the hardware configuration of each device (estimation device 1 or training device 2) in the above-described embodiment.
- Each device includes a processor 71, a main storage device 72 (memory), an auxiliary storage device 73 (memory), a network interface 74, and a device interface 75, and may be realized as a computer 7 in which these are connected via a bus 76.
- The computer 7 in FIG. 10 has one of each component, but may have a plurality of the same component. Also, although one computer 7 is shown in FIG. 10, the software may be installed on a plurality of computers, and each of the plurality of computers may execute the same or a different part of the processing of the software. In this case, it may be a form of distributed computing in which each computer communicates via the network interface 74 or the like to execute the processing. That is, each device (estimation device 1 or training device 2) in the above-described embodiment may be configured as a system that realizes its functions by having one or a plurality of computers execute instructions stored in one or a plurality of storage devices. Further, the information transmitted from a terminal may be processed by one or a plurality of computers provided on the cloud, and the processing result may be transmitted to the terminal.
- Various operations of each device (estimation device 1 or training device 2) in the above-described embodiment may be executed in parallel by using one or a plurality of processors or by using a plurality of computers via a network. Further, various operations may be distributed to a plurality of arithmetic cores in the processor and executed in parallel. In addition, some or all of the processes, means, etc. of the present disclosure may be executed by at least one of a processor and a storage device provided on the cloud capable of communicating with the computer 7 via a network. As described above, each device in the above-described embodiment may be in the form of parallel computing by one or a plurality of computers.
- the processor 71 may be an electronic circuit (processing circuit, Processing circuitry, CPU, GPU, FPGA, ASIC, etc.) including a computer control device and an arithmetic unit. Further, the processor 71 may be a semiconductor device or the like including a dedicated processing circuit. The processor 71 is not limited to an electronic circuit using an electronic logic element, and may be realized by an optical circuit using an optical logic element. Further, the processor 71 may include an arithmetic function based on quantum computing.
- the processor 71 can perform arithmetic processing based on data and software (programs) input from each device or the like of the internal configuration of the computer 7, and output the arithmetic result or control signal to each device or the like.
- the processor 71 may control each component constituting the computer 7 by executing an OS (Operating System) of the computer 7, an application, or the like.
- Each device (estimation device 1 or training device 2) in the above-described embodiment may be realized by one or a plurality of processors 71.
- The processor 71 may refer to one or more electronic circuits arranged on one chip, or may refer to one or more electronic circuits arranged on two or more chips or two or more devices. When a plurality of electronic circuits are used, each electronic circuit may communicate by wire or wirelessly.
- the main storage device 72 is a storage device that stores instructions executed by the processor 71, various data, and the like, and the information stored in the main storage device 72 is read out by the processor 71.
- the auxiliary storage device 73 is a storage device other than the main storage device 72. It should be noted that these storage devices mean arbitrary electronic components capable of storing electronic information, and may be semiconductor memories. The semiconductor memory may be either a volatile memory or a non-volatile memory.
- The storage device for storing various data in each device (estimation device 1 or training device 2) in the above-described embodiment may be realized by the main storage device 72 or the auxiliary storage device 73, or may be realized by a built-in memory incorporated in the processor 71.
- the storage units 102 and 202 in the above-described embodiment may be realized by the main storage device 72 or the auxiliary storage device 73.
- A plurality of processors may be connected (coupled) to one storage device (memory), or a single processor may be connected.
- A plurality of storage devices (memories) may be connected (coupled) to one processor.
- When each device (estimation device 1 or training device 2) in the above-described embodiment is composed of at least one storage device (memory) and a plurality of processors connected (coupled) to the at least one storage device (memory), a configuration in which at least one of the plurality of processors is connected (coupled) to the at least one storage device (memory) may be included.
- This configuration may be realized by storage devices (memories) and processors included in a plurality of computers.
- Further, a configuration in which the storage device (memory) is integrated with the processor (for example, a cache memory including an L1 cache and an L2 cache) may be included.
- the network interface 74 is an interface for connecting to the communication network 8 wirelessly or by wire. As the network interface 74, an appropriate interface such as one conforming to an existing communication standard may be used. The network interface 74 may exchange information with the external device 9A connected via the communication network 8.
- The communication network 8 may be any one of a WAN (Wide Area Network), a LAN (Local Area Network), a PAN (Personal Area Network), or the like, or a combination thereof, as long as information is exchanged between the computer 7 and the external device 9A.
- An example of a WAN is the Internet
- an example of a LAN is IEEE802.11 or Ethernet (registered trademark)
- an example of a PAN is Bluetooth (registered trademark) or NFC (Near Field Communication).
- the device interface 75 is an interface such as USB that directly connects to the external device 9B.
- the external device 9A is a device connected to the computer 7 via a network.
- the external device 9B is a device that is directly connected to the computer 7.
- the external device 9A or the external device 9B may be an input device as an example.
- the input device is, for example, a device such as a camera, a microphone, a motion capture, various sensors, a keyboard, a mouse, or a touch panel, and gives the acquired information to the computer 7. Further, it may be a personal computer, a tablet terminal, or a device having an input unit such as a smartphone, a memory, and a processor.
- the external device 9A or the external device 9B may be an output device as an example.
- The output device may be, for example, a display device such as an LCD (Liquid Crystal Display), a CRT (Cathode Ray Tube), a PDP (Plasma Display Panel), or an organic EL (Electro Luminescence) panel, or may be a speaker or the like that outputs audio. Further, it may be a personal computer, a tablet terminal, or a device having an output unit such as a smartphone, a memory, and a processor.
- the external device 9A or the external device 9B may be a storage device (memory).
- the external device 9A may be a network storage or the like, and the external device 9B may be a storage such as an HDD.
- the external device 9A or the external device 9B may be a device having some functions of the components of each device (estimation device 1 or training device 2) in the above-described embodiment. That is, the computer 7 may transmit or receive a part or all of the processing result of the external device 9A or the external device 9B.
- When the expression "at least one of a, b, and c" or "at least one of a, b, or c" (including similar expressions) is used, it includes any of a, b, c, a-b, a-c, b-c, or a-b-c. It may also include multiple instances of any element, such as a-a, a-b-b, and a-a-b-b-c-c. It also includes adding elements other than the listed elements (a, b, and c), such as having d, as in a-b-c-d.
- The terms "connected" and "coupled" are intended as terms that include direct connection/coupling, indirect connection/coupling, electrical connection/coupling, communicative connection/coupling, operative connection/coupling, physical connection/coupling, and the like.
- The terms should be interpreted as appropriate according to the context in which they are used, but any form of connection/coupling that is not intentionally or naturally excluded should be interpreted, without limitation, as being included in the terms.
- The physical structure of the element A has a configuration capable of executing the operation B, and this may include that a permanent or temporary setting (setting/configuration) of the element A is set (configured/set) to actually execute the operation B.
- For example, when the element A is a general-purpose processor, it suffices that the processor has a hardware configuration capable of executing the operation B and is configured to actually execute the operation B by setting a permanent or temporary program (instructions).
- When the element A is a dedicated processor, a dedicated arithmetic circuit, or the like, it suffices that the circuit structure of the processor is implemented so as to actually execute the operation B, regardless of whether or not control instructions and data are actually attached.
- When a plurality of pieces of hardware perform a predetermined process, the respective pieces of hardware may cooperate to perform the predetermined process, or some of the hardware may perform all of the predetermined process. Further, some hardware may perform a part of the predetermined process, and other hardware may perform the rest of the predetermined process.
- When expressions such as "one or more hardware performs the first process and the one or more hardware performs the second process" are used, the hardware that performs the first process and the hardware that performs the second process may be the same or different. That is, it suffices that the hardware that performs the first process and the hardware that performs the second process are included in the one or more hardware.
- the hardware may include an electronic circuit, a device including an electronic circuit, or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Chemical & Material Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022542877A JP7823891B2 (ja) | 2020-08-14 | 2021-08-12 | 推定装置、訓練装置、グラフ生成方法及びネットワーク生成方法 |
| US18/167,948 US20230196075A1 (en) | 2020-08-14 | 2023-02-13 | Inferring device, training device and inferring method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020137071 | 2020-08-14 | ||
| JP2020-137071 | 2020-08-14 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/167,948 Continuation US20230196075A1 (en) | 2020-08-14 | 2023-02-13 | Inferring device, training device and inferring method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022034913A1 (ja) | 2022-02-17 |
Family
ID=80248027
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2021/029717 Ceased WO2022034913A1 (ja) | 2020-08-14 | 2021-08-12 | 推定装置、訓練装置、グラフ生成方法及びネットワーク生成方法 |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20230196075A1 |
| JP (1) | JP7823891B2 |
| WO (1) | WO2022034913A1 |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20230267159A1 (en) * | 2022-02-18 | 2023-08-24 | Microsoft Technology Licensing, Llc | Input-output searching |
| US12587274B2 (en) | 2023-03-28 | 2026-03-24 | Quantum Generative Materials Llc | Satellite optimization management system based on natural language input and artificial intelligence |
| US12368503B2 (en) | 2023-12-27 | 2025-07-22 | Quantum Generative Materials Llc | Intent-based satellite transmit management based on preexisting historical location and machine learning |
| US12603701B2 (en) | 2023-12-27 | 2026-04-14 | Quantum Generative Materials Llc | Distributed satellite constellation management and control system |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018078735A1 (ja) * | 2016-10-26 | 2018-05-03 | 富士通株式会社 | 情報処理装置、情報処理方法および情報処理プログラム |
-
2021
- 2021-08-12 WO PCT/JP2021/029717 patent/WO2022034913A1/ja not_active Ceased
- 2021-08-12 JP JP2022542877A patent/JP7823891B2/ja active Active
-
2023
- 2023-02-13 US US18/167,948 patent/US20230196075A1/en active Pending
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018078735A1 (ja) * | 2016-10-26 | 2018-05-03 | 富士通株式会社 | 情報処理装置、情報処理方法および情報処理プログラム |
Non-Patent Citations (1)
| Title |
|---|
| WENGONG JIN; REGINA BARZILAY; TOMMI JAAKKOLA: "Junction Tree Variational Autoencoder for Molecular Graph Generation", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 12 February 2018 (2018-02-12), 201 Olin Library Cornell University Ithaca, NY 14853 , XP080856473 * |
Also Published As
| Publication number | Publication date |
|---|---|
| US20230196075A1 (en) | 2023-06-22 |
| JP7823891B2 (ja) | 2026-03-04 |
| JPWO2022034913A1 | 2022-02-17 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2022034913A1 (ja) | 推定装置、訓練装置、グラフ生成方法及びネットワーク生成方法 | |
| Davies | Why is the physical world so comprehensible | |
| CN111428848B (zh) | 基于自编码器和3阶图卷积的分子智能设计方法 | |
| CN112068798B (zh) | 一种实现网络节点重要性排序的方法及装置 | |
| CN110023963A (zh) | 使用神经网络处理文本序列 | |
| CN109155003A (zh) | 生成神经网络 | |
| US20200134471A1 (en) | Method for Generating Neural Network and Electronic Device | |
| CN112073221A (zh) | 一种实现网络节点排序的方法及装置 | |
| CN110138595A (zh) | 动态加权网络的时间链路预测方法、装置、设备及介质 | |
| US20240079099A1 (en) | Inferring device, training device, inferring method, method of generating reinforcement learning model and method of generating molecular structure | |
| CN109376857A (zh) | 一种融合结构和属性信息的多模态深度网络嵌入方法 | |
| WO2018160342A1 (en) | Small majorana fermion codes | |
| CN114038516A (zh) | 一种基于变分自编码器的分子生成与优化 | |
| CN115831261A (zh) | 基于多任务预训练逆强化学习的三维空间分子生成方法和装置 | |
| CN116681139A (zh) | 一种量子态制备方法及相关装置 | |
| CN117290477A (zh) | 一种基于二次检索增强的生成式建筑知识问答方法 | |
| CN114297398A (zh) | 基于神经网络的知识图谱实体链接方法、装置及电子设备 | |
| CN116485501B (zh) | 一种基于图嵌入与注意力机制的图神经网络会话推荐方法 | |
| JP2025076078A (ja) | 暗号化装置、最適化システム、暗号化方法および暗号化プログラム | |
| CN114970334B (zh) | 基于深度强化学习的作战体系设计方法及相关设备 | |
| WO2022163629A1 (ja) | 推定装置、訓練装置、推定方法、生成方法及びプログラム | |
| Patterson | Structured and decorated cospans from the viewpoint of double category theory | |
| WO2022145388A1 (ja) | 推定装置、訓練装置、グラフ生成方法、分子構造、ネットワーク生成方法及びネットワーク | |
| CN110222839A (zh) | 一种网络表示学习的方法、装置及存储介质 | |
| CN118690864B (zh) | 基于模式树的量子线路模式匹配方法 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21855992 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2022542877 Country of ref document: JP Kind code of ref document: A |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 21855992 Country of ref document: EP Kind code of ref document: A1 |