WO2022260179A1 - Training device, training method, program, and inference device - Google Patents

Info

Publication number
WO2022260179A1
WO2022260179A1 PCT/JP2022/023523 JP2022023523W WO2022260179A1 WO 2022260179 A1 WO2022260179 A1 WO 2022260179A1 JP 2022023523 W JP2022023523 W JP 2022023523W WO 2022260179 A1 WO2022260179 A1 WO 2022260179A1
Authority
WO
WIPO (PCT)
Prior art keywords
atomic structure
energy
training
model
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/023523
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
幾 品川
聡 高本
伊織 倉田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Preferred Networks Inc
Original Assignee
Preferred Networks Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Preferred Networks Inc
Priority to JP2023524992A (patent JP7392203B2, ja)
Publication of WO2022260179A1 (ja)
Priority to US18/534,130 (patent US20240127121A1, en)
Anticipated expiration
Legal status: Ceased (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C60/00 Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00 Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Definitions

  • The present disclosure relates to a training device, a training method, a program, and an inference device.
  • Adsorption energy is the amount by which the energy is stabilized when a molecule adsorbs onto a surface. Specifically, it is calculated as the difference between the energy of the surface with the molecule, E(slab + molecule), and the sum of the energy of the surface alone, E(slab), and the energy of the isolated molecule, E(molecule).
  • NNP: Neural Network Potential
  • The adsorption energy is a small value, for example about 1 eV.
  • This value is small relative to the energy prediction error for the entire system. In error backpropagation, it can therefore be difficult to distinguish the adsorption energy within the error, and consequently difficult to train the model on the adsorption energy.
  • An inference device that infers energy and a training device that trains the NNP model used in the inference device are provided.
  • A training device comprises one or more memories and one or more processors.
  • The one or more processors input a first atomic structure, which includes a surface and an adsorbed molecule close to the surface, into a training target model, and obtain a first error based on the energy of the first atomic structure output from the training target model and the correct value of the energy of the first atomic structure. They also input a fourth atomic structure, which includes a cluster and an adsorbed molecule close to the cluster, into the training target model, and obtain a fourth error based on the energy of the fourth atomic structure output from the training target model and the correct value of the energy of the fourth atomic structure. They then update the parameters of the training target model based on the first error and the fourth error.
  • The surface and the cluster contain the same atoms.
  • FIG. 5 is a diagram schematically showing an inference device according to one embodiment.
  • FIG. 6 is a flowchart showing processing of the inference device according to one embodiment.
  • FIG. 7 is a diagram schematically showing a training device according to one embodiment.
  • A diagram showing an example of a molecule and a cluster.
  • FIG. 11 is a flowchart showing processing of the training device according to one embodiment.
  • Energy is an extensive quantity; for example, the energy of two H2O molecules is twice the energy of one H2O molecule. The energy is also roughly proportional to the number of atoms.
  • A stable state is one in which the energy of the whole system is lower, and in principle atoms move so that the energy decreases. States with higher energy are therefore less likely to occur than states with lower energy.
  • The force is obtained by differentiating the energy with respect to the coordinates (with a negative sign). Therefore, once the energy is obtained, the force acting on each atom can be obtained.
  • By moving the atoms according to these forces, a stable state (or metastable state) can be obtained.
  • The interatomic potential is a function that gives the energy from the arrangement of atoms, and is also called a force field.
  • This function is generally constructed by hand. It corresponds to the governing equation of an MD (Molecular Dynamics) simulation.
  • Combined with computational science methods, it can be used to calculate various physical properties. When the number of atoms is N, the energy is obtained from the atomic structure specified by N three-dimensional coordinates and N pieces of element information. Differentiating this energy with respect to the three-dimensional coordinates gives the forces acting on the atoms as N three-dimensional vectors.
  • An NNP (Neural Network Potential) represents the interatomic potential with a neural network model.
  • The neural network model may be, but is not limited to, a graph neural network or a graph convolutional neural network that can handle graph information. Because the function that gives the energy from the atomic structure is built by machine learning, this is a regression problem. With the NNP neural network model, the force on each atom can be obtained by forward-propagating the atomic structure and then backpropagating the energy.
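  • The relationship between energy and force can be sketched as follows. This is a minimal illustration, not the NNP of the disclosure: `toy_energy` is a hypothetical Lennard-Jones pair potential in reduced units standing in for the neural network model, and the forces are taken by central finite differences rather than by backpropagation.

```python
import numpy as np

def toy_energy(coords: np.ndarray) -> float:
    """Hypothetical stand-in for an NNP: Lennard-Jones pair energy (reduced units)."""
    n = len(coords)
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(coords[i] - coords[j])
            energy += 4.0 * (r ** -12 - r ** -6)
    return energy

def forces(energy_fn, coords: np.ndarray, h: float = 1e-5) -> np.ndarray:
    """Force = -dE/dx for each of the N x 3 coordinates (central differences)."""
    f = np.zeros_like(coords)
    for idx in np.ndindex(coords.shape):
        plus, minus = coords.copy(), coords.copy()
        plus[idx] += h
        minus[idx] -= h
        f[idx] = -(energy_fn(plus) - energy_fn(minus)) / (2.0 * h)
    return f

# Two atoms 1.5 apart: beyond the energy minimum, so they attract each other.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
f = forces(toy_energy, coords)
```

  • The result has one three-dimensional force vector per atom, matching the "N three-dimensional" description above. In an actual NNP the same derivative would come from automatic differentiation of the network.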
  • DFT (Density Functional Theory) calculates the state of the electrons, represented by the wave function ψ(x), from the arrangement of atoms (atomic structure).
  • Energy (or force) can also be obtained from the electronic state.
  • In DFT, N three-dimensional coordinates and N pieces of element information are converted into an electronic state, and the energy and force are obtained from this electronic state. DFT is used to create interatomic potentials, but its computational cost is high. If the computational cost is ignored, DFT can be used in place of the interatomic potential. Training data for the neural network model of the NNP can therefore be generated by DFT calculation.
  • The adsorption energy is the change in energy when a molecule or the like adsorbs onto a solid surface, and corresponds to the strength of the interaction between the solid surface and the molecule.
  • The adsorption energy ΔE_adsorp is given by the following formula, where E(molecule) is the energy of the molecule, E(slab) is the energy of the surface, and E(molecule + slab) is the energy of the molecule and the surface together: $\Delta E_{\mathrm{adsorp}} = E(\mathrm{molecule}) + E(\mathrm{slab}) - E(\mathrm{molecule + slab})$ (1)
  • In the literature the adsorption energy is sometimes expressed as a negative value and sometimes as an absolute value. In the present disclosure it is defined by formula (1) and in principle takes a positive value. A negative value suggests that a more stable adsorption state exists, so the result may be reflected in training, or the energy may be inferred again after the atomic structure is optimized.
  • The atomic structure used for DFT, for example the atomic structure input to the NNP, may be described with periodic boundary conditions.
  • For example, the structure of a crystal is described by applying periodic boundary conditions to the atomic structure of a unit cell, which is the repeating unit.
  • FIG. 1 schematically shows a unit cell of Pt (platinum) as an example.
  • The dotted cube indicates the unit cell region, and the solid spheres indicate Pt atoms. Because Pt has a face-centered cubic lattice, the stable structure is arranged as shown in FIG. 1. In the drawing, parts of the spheres protrude from the cell region, but only the positions of the Pt atoms matter, so this has no effect on the calculation.
  • sx, sy, and sz indicate the lengths of the axes of the unit cell.
  • Here sx, sy, and sz match the size of the Pt unit cell, but they are not limited to this. They are set appropriately so that the DFT calculation and the inference by the NNP can be performed.
  • FIG. 2 shows an atomic structure in which the unit cells of FIG. 1 are combined, for example 3 × 3 × 3.
  • This figure shows a cut-out portion of the Pt atoms when unit cells are arranged under periodic boundary conditions.
  • The size and shape of a unit cell can be expressed by the length of each axis and the angles between the axes.
  • The lengths of the axes of the unit cell in FIG. 1 are expressed as the three-dimensional quantity (sx, sy, sz).
  • The angles between the axes are expressed as the three-dimensional quantity (π/2, π/2, π/2).
  • The axis lengths and the angles between the axes are determined according to the crystal structure to be represented.
  • When the number of atoms is n, the atomic structure is described as n × (coordinates (three-dimensional), element (one-dimensional)) together with the periodic boundary condition (angles (three-dimensional), lengths (three-dimensional)).
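  • One possible in-memory layout for this description is sketched below. The class name `AtomicStructure`, its field names, and the lattice constant of roughly 3.92 Å used for Pt are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class AtomicStructure:
    """Hypothetical container: n x (element, coordinates) plus the periodic cell."""
    elements: np.ndarray  # shape (n,), atomic numbers
    coords: np.ndarray    # shape (n, 3), Cartesian coordinates
    lengths: np.ndarray   # shape (3,), axis lengths (sx, sy, sz)
    angles: np.ndarray    # shape (3,), angles between the axes in radians

    def flat_size(self) -> int:
        """n x (element (1) + coordinates (3)) + angles (3) + lengths (3)."""
        return len(self.elements) * (1 + 3) + 3 + 3

a = 3.92  # approximate Pt lattice constant in angstroms (illustrative value)
fcc_fractional = np.array(
    [[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]
)
pt_cell = AtomicStructure(
    elements=np.full(4, 78),          # Pt has atomic number 78
    coords=a * fcc_fractional,        # four atoms of the fcc unit cell
    lengths=np.array([a, a, a]),
    angles=np.full(3, np.pi / 2),     # cubic cell: all angles are pi/2
)
```

  • For the four-atom fcc unit cell this gives 4 × 4 + 6 = 22 numbers, which is the size of the flattened description.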
  • By specifying a free boundary condition, a state without repetition can also be described.
  • The atomic structure of the cluster, described later, may be given this free boundary condition. Alternatively, to reduce the computational complexity of the NNP, it may be given a periodic boundary condition whose unit cell is large enough to eliminate the influence of other molecules and the like. In the present disclosure, training and inference are in principle performed with periodic boundary conditions, but as noted above the boundary condition is not limited to this; any boundary condition may be used as long as each state can be described appropriately.
  • FIG. 3 shows one way (the slab model) of representing a surface with periodic boundary conditions.
  • In the unit cell, for example, a structure composed of a group of unit lattices as described above is placed in the lower part, and the upper part is left as vacuum. This yields a model that can appropriately represent the surface of a metal, crystal, or the like.
  • FIG. 4 shows the atomic structure obtained by repeating the structure of FIG. 3 2 × 2.
  • As shown in FIG. 4, describing the structure of FIG. 3 with periodic boundary conditions represents an atomic structure in which a material with a surface and a vacuum layer alternate.
  • By making the vacuum layer sufficiently thick (e.g., 15 Å or more), the atomic structure can be described such that a molecule placed near the surface (e.g., within 5 Å) is not affected by the other surface structures.
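  • The geometry of the slab model can be sketched as follows. The helper names `add_vacuum` and `gap_to_next_image` and all coordinates are hypothetical; the sketch only shows how a vacuum layer extends the cell along z and how far an adsorbed molecule then sits from the periodic image of the slab above it.

```python
import numpy as np

def add_vacuum(slab_coords, sx, sy, vacuum=15.0):
    """Return cell lengths (sx, sy, sz) with sz = slab thickness + vacuum layer."""
    z = np.asarray(slab_coords, dtype=float)[:, 2]
    thickness = z.max() - z.min()
    return np.array([sx, sy, thickness + vacuum])

def gap_to_next_image(slab_coords, molecule_coords, sz):
    """Distance from the top of the molecule to the bottom of the slab image above."""
    slab_z = np.asarray(slab_coords, dtype=float)[:, 2]
    mol_z = np.asarray(molecule_coords, dtype=float)[:, 2]
    return (slab_z.min() + sz) - mol_z.max()

# Made-up three-layer slab and a diatomic molecule 2 angstroms above its top layer.
slab = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0], [0.0, 0.0, 4.0]]
molecule = [[1.0, 1.0, 6.0], [1.0, 1.0, 6.74]]

cell = add_vacuum(slab, sx=4.0, sy=4.0)           # sz = 4.0 + 15.0
gap = gap_to_next_image(slab, molecule, cell[2])  # remaining vacuum above the molecule
```

  • With these made-up numbers the molecule is 2 Å from the surface below it and more than 12 Å from the slab image above, so only the nearby surface interacts with it appreciably.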
  • FIG. 5 is a block diagram schematically showing the inference device 1 according to one embodiment.
  • The inference device 1 includes an input unit 100, a storage unit 102, a structure optimization unit 104, an inference unit 106, an energy calculation unit 108, and an output unit 110.
  • The inference device 1 infers, for example, the adsorption energy using the trained model NN1.
  • The trained model NN1 is a neural network model used for the NNP described above, and outputs an energy when an atomic structure is input. As described above, the input is, for example, data of dimension (number of atoms in the atomic structure) × (element (1) + coordinates (3)) + periodic boundary condition (angles (3) + lengths (3)).
  • The trained model NN1 may be a kind of graph neural network model capable of processing graphs, or any other neural network model.
  • The trained model NN1 may be a model that performs inference with the boundary condition fixed to a periodic boundary condition.
  • The trained model NN1 is formed as a neural network model that obtains, from an atomic structure, physical property values given by quantum chemical calculation.
  • It may be a model that outputs the result of a first-principles calculation as the quantum chemical calculation.
  • It may further be a model that obtains the result of a DFT calculation as the first-principles calculation.
  • In the present disclosure the first-principles calculation is described as a DFT calculation, but it may instead use the Hartree-Fock method, the Møller-Plesset method, or the like.
  • The trained model NN1 may be a model constituting an NNP that outputs the potential energy when an atomic structure is input.
  • In the following, the trained model NN1 is described as a model that infers, from the atomic structure, the energy given by DFT calculation.
  • The input unit 100 provides an interface for receiving inputs such as the data needed by the inference device 1 to infer the adsorption energy.
  • Via the input unit 100, the inference device 1 receives data on the atomic structures of the surface of the metal, crystal, or the like and of the adsorbed molecule for which the adsorption energy is to be obtained.
  • The atomic structures are described as explained above.
  • The inference device 1 may receive surface structure data and molecular structure data, or may receive surface structure data, molecular structure data, and data containing both the surface structure and the molecular structure.
  • The storage unit 102 stores data needed for the operation of the inference device 1. For example, data input to the inference device 1 via the input unit 100 may be stored in the storage unit 102.
  • Although the storage unit 102 is included in the inference device 1 in FIG. 5, at least part of it may be implemented in external storage, a file server, or the like. In that case, data may be input via the input unit 100 at the time it is needed.
  • In the following, the atomic structure that includes both the adsorbed molecule and the surface is referred to as the first atomic structure, the atomic structure of the adsorbed molecule as the second atomic structure, and the atomic structure of the surface as the third atomic structure.
  • The structure optimization unit 104 optimizes, from the input surface structure and molecular structure data, the first atomic structure, that is, an appropriate atomic structure in which the molecule is adsorbed on the surface. For example, when data of a first atomic structure containing both the surface structure and the molecular structure are input, it optimizes this atomic structure to obtain the atomic structure of the state in which the molecule is adsorbed on the surface, that is, a stable state.
  • When the first atomic structure is not input to the inference device 1 and only the second atomic structure (the adsorbed molecule) and the third atomic structure (the surface) are input, the structure optimization unit 104 generates the first atomic structure from the second and third atomic structures and then optimizes it.
  • For example, the structure optimization unit 104 generates first atomic structure data in which the adsorbed molecule is close to the surface, and optimizes these data.
  • Here, "close" may mean, for example, that the nearest atoms of the adsorbed molecule and the surface are within a predetermined distance (for example, 5 Å) of each other, or within a shorter distance.
  • For example, the structure optimization unit 104 inputs the first atomic structure into the trained model NN1 to obtain the potential energy, and then backpropagates the obtained energy through the trained model NN1 to obtain the force acting on each atom. Based on these forces, it updates the first atomic structure to the structure in which the adsorbed molecule has been moved.
  • The structure optimization unit 104 may update the first atomic structure repeatedly. The update is repeated until an appropriate termination condition is met, for example until the position of the adsorbed molecule stops changing, until the change in its position falls to or below a predetermined threshold, until the force falls to or below a predetermined threshold, or until a predetermined number of updates have been performed.
  • The first atomic structure updated and optimized by the structure optimization unit 104 can be an atomic structure in a stable or metastable state. The inference device 1 infers the adsorption energy based on this first atomic structure.
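  • The update loop described above can be sketched as a simple steepest-descent relaxation. This is only an illustration under stated assumptions: `toy_forces` is a hypothetical spring between two atoms standing in for forces obtained from the trained model NN1, and the step size, force threshold, and step limit are arbitrary choices.

```python
import numpy as np

def toy_forces(coords, k=1.0, r0=2.0):
    """Hypothetical stand-in for NNP forces: a spring between atoms 0 and 1."""
    bond = coords[1] - coords[0]
    r = np.linalg.norm(bond)
    f1 = -k * (r - r0) * bond / r   # pulls atom 1 toward distance r0 from atom 0
    return np.array([-f1, f1])

def relax(force_fn, coords, movable, step=0.2, f_tol=1e-3, max_steps=500):
    """Move the selected atoms along the forces until a termination condition holds."""
    coords = np.array(coords, dtype=float)
    n_steps = 0
    while n_steps < max_steps:                   # stop after a fixed number of updates
        f = force_fn(coords)
        if np.abs(f[movable]).max() <= f_tol:    # stop when the force is small enough
            break
        coords[movable] += step * f[movable]     # only the adsorbed atoms are moved
        n_steps += 1
    return coords, n_steps

# Atom 0 plays the surface (held fixed); atom 1 plays the adsorbed molecule.
relaxed, n_steps = relax(toy_forces, [[0.0, 0.0, 0.0], [0.0, 0.0, 4.0]], movable=[1])
```

  • Here only the adsorbed atom is moved, following the description of updating the position of the adsorbed molecule; a practical optimizer could also relax surface atoms or use a more elaborate update rule.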
  • The structure optimization unit 104 is not an essential component when the adsorbed molecule is already properly adsorbed on the surface, that is, when a first atomic structure guaranteed to be in a stable or metastable state is input.
  • The inference unit 106 uses the trained model NN1 to infer the energies of the second atomic structure, the third atomic structure, and the optimized first atomic structure.
  • In formula (1), E(molecule) corresponds to the output when the second atomic structure is input to the trained model NN1, E(slab) to the output when the third atomic structure is input, and E(molecule + slab) to the output when the first atomic structure is input.
  • That is, the inference unit 106 inputs the first, second, and third atomic structures into the trained model NN1 and obtains their energies.
  • The energy calculation unit 108 calculates the adsorption energy using formula (1) from the energy values obtained by the inference unit 106.
  • To keep the calculation conditions consistent, the energy calculation unit 108 preferably uses energies obtained with the trained model NN1 for each of the first, second, and third atomic structures.
  • When the energies of the second and third atomic structures are obtained from elsewhere, such as a database, the inference unit 106 need only infer the energy of the first atomic structure.
  • The output unit 110 outputs the adsorption energy obtained by the energy calculation unit 108 to the outside or to the storage unit 102 as appropriate.
  • FIG. 6 is a flowchart showing an example of the processing of the inference device 1.
  • The inference device 1 acquires, via the input unit 100, the atomic structures of the molecule and the surface for which the adsorption energy is to be obtained (S100). As described above, it may acquire the second and third atomic structures, and may additionally acquire the first atomic structure. Apart from the periodic boundary conditions, these atomic structures may be input in graph form.
  • The structure optimization unit 104 optimizes the first atomic structure based on the input second and third atomic structures (S102). More specifically, it defines a first atomic structure in which the molecule described by the second atomic structure is close to the surface described by the third atomic structure, and optimizes it using the trained model NN1. If the first atomic structure was acquired in S100, that structure may be optimized using the trained model NN1. If a first atomic structure guaranteed to be in a stable state is input, S102 can be omitted.
  • The inference unit 106 inputs the updated first atomic structure, the second atomic structure, and the third atomic structure into the trained model NN1 and obtains E(molecule + slab), E(molecule), and E(slab), respectively (S104).
  • The energy calculation unit 108 obtains the adsorption energy from the energies obtained in S104 according to formula (1) (S106).
  • The inference device 1 outputs the adsorption energy from the output unit 110 and ends the process (S108).
  • When the energies of the second and third atomic structures are looked up in a database or the like, the inference unit 106 infers at least the energy of the first atomic structure in S104. In S106, the energy calculation unit 108 may then calculate the adsorption energy from the energy of the first atomic structure obtained by the inference unit 106 and the energies of the second and third atomic structures obtained from the database or the like.
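  • The flow from S102 to S106, including the optional database lookup, can be sketched as follows. The sketch assumes that formula (1) has the form ΔE = E(molecule) + E(slab) − E(molecule + slab), which is consistent with the adsorption energy being positive in principle. The function name, the use of string labels as placeholder structures, and the energy values are all hypothetical.

```python
from typing import Callable, Mapping, Optional

def infer_adsorption_energy(
    model: Callable[[str], float],
    first: str,
    second: str,
    third: str,
    optimize: Optional[Callable[[str], str]] = None,
    database: Optional[Mapping[str, float]] = None,
) -> float:
    """Return E(molecule) + E(slab) - E(molecule + slab) for placeholder structures."""
    database = database or {}
    if optimize is not None:          # S102: optional structure optimization
        first = optimize(first)
    e_first = model(first)            # S104: the first structure is always inferred
    # The second and third energies come from the database when available.
    e_second = database[second] if second in database else model(second)
    e_third = database[third] if third in database else model(third)
    return e_second + e_third - e_first   # S106: assumed form of formula (1)

# Made-up energies in eV; a dictionary lookup plays the role of the trained model.
toy_energies = {"H2": -6.8, "Pt_slab": -580.0, "H2_on_Pt": -587.5}
toy_model = toy_energies.__getitem__

e_ads = infer_adsorption_energy(toy_model, "H2_on_Pt", "H2", "Pt_slab")
e_ads_db = infer_adsorption_energy(
    toy_model, "H2_on_Pt", "H2", "Pt_slab", database={"Pt_slab": -580.1}
)
```

  • Mixing model outputs with database values changes the result slightly in this toy case (about 0.7 eV versus 0.6 eV), which illustrates why using the same model for all three energies keeps the conditions consistent.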
  • (Training device) Next, a training device that trains the trained model NN1 used for energy inference in the inference device 1 is described.
  • FIG. 7 is a block diagram schematically showing a training device according to one embodiment.
  • The training device 2 includes an input unit 200, a storage unit 202, a training unit 204, and an output unit 206.
  • The training device 2 is a device for training the trained model NN1 used in the inference device 1 described above, and trains the training target model NN2 using a machine learning technique.
  • The training target model NN2 is a neural network model used in the NNP. Its basic configuration is the same as that of the trained model NN1 described above, so the details are omitted.
  • The input unit 200 accepts input of data to the training device 2.
  • The training device 2 acquires the data needed for training via the input unit 200.
  • The storage unit 202 stores data needed for the operation of the training device 2. Data input from the input unit 200 may be stored in the storage unit 202.
  • The training unit 204 trains the training target model NN2.
  • The training target model NN2 is mainly trained as a model that infers the energy of the entire atomic structure, including the adsorption energy, when the first atomic structure is input.
  • The output unit 206 outputs the parameters and the like of the training target model NN2 trained by the training unit 204 to the outside or to the storage unit 202.
  • The training unit 204 trains the training target model NN2 to infer, for an atomic structure, the result of a quantum chemical calculation, for example a first-principles calculation, in particular a DFT calculation.
  • The training unit 204 optimizes the parameters of the training target model NN2 by, for example, supervised learning.
  • As described above, the adsorption energy can be equal to or smaller than the error used for backpropagation in training, which makes it unlikely to be learned properly by machine learning.
  • For this reason, the atomic structure of a cluster made up of several to several tens of atoms, fewer than the number of atoms in the atomic structure of the surface, is used as training data. The cluster may be an energetically stable portion cut out of the atomic structure of the surface.
  • The data used for training by the training unit 204 are described below.
  • The data may be generated by a data generation device or obtained from a database or the like.
  • In the data generation device, a DFT calculation is performed on an atomic structure to calculate its energy, and the pair of atomic structure and energy is used as a data set.
  • The force may also be calculated in the data generation device and used as teacher data. The following describes the case of using the energy, but the same applies to the force unless otherwise noted.
  • FIG. 8 shows one type of data used for training in this embodiment. Dotted lines indicate the unit cell. FIG. 8 shows a state in which an H2 molecule is adsorbed on (close to) the Pt surface. Large spheres indicate Pt atoms and small spheres indicate H atoms.
  • The first atomic structure, the atomic structure of the H2 molecule and the Pt surface shown here, is defined as follows.
  • For example, an atomic structure is defined by adding the molecular structure to the surface structure in the unit cell shown in FIG. 2, and this is the first atomic structure.
  • The shortest distance between an H atom of the H2 molecule and a Pt atom of the Pt surface may be, for example, 4 Å or more and 5 Å or less.
  • An energy value is obtained by performing a DFT calculation on this first atomic structure.
  • The first atomic structure need not be optimized to a stable state.
  • Preferably, a data set is prepared for first atomic structures optimized to a stable or metastable state. More preferably, data sets are prepared for first atomic structures with various positions and orientations of H2 relative to the Pt surface.
  • FIG. 9 shows another type of data used for training in this embodiment.
  • FIG. 9 shows a cluster and a molecule that are close to each other.
  • A DFT calculation is performed on the fourth atomic structure, the atomic structure consisting of the H2 molecule and the Pt cluster, to obtain an energy value.
  • The shortest distance between an H atom of the H2 molecule and the nearest Pt atom may be 5 Å or less.
  • The data set of fourth atomic structures and energy values is used as training data.
  • For example, the DFT calculation is performed with periodic boundary conditions on a unit cell containing four Pt atoms and one H2 molecule.
  • The DFT calculation may use a free boundary condition, but if the input of the training target model NN2 used for the NNP is fixed to periodic boundary conditions, it is preferable to calculate with periodic boundary conditions.
  • FIG. 10 shows another example of a combination of a cluster and a molecule.
  • For example, 14 Pt atoms cut out of the face-centered cubic structure may be defined with an H2 molecule in close proximity.
  • In summary, the first atomic structure is the atomic structure of a state in which the adsorbed molecule (e.g., an H2 molecule) is close to the solid surface (e.g., a Pt solid surface), that is, an adsorbed state, and the fourth atomic structure is the atomic structure of a state in which the adsorbed molecule is close to the cluster (e.g., a Pt cluster). The solid surface of the first atomic structure and the cluster of the fourth atomic structure contain the same atoms (e.g., Pt atoms).
  • The training unit 204 calculates, as the first error, the difference between the output of the training target model NN2 for the first atomic structure and the result of the DFT calculation.
  • The training unit 204 backpropagates this first error to update the parameters of the training target model NN2.
  • The training unit 204 likewise calculates, as the fourth error, the difference between the output of the training target model NN2 for the fourth atomic structure and the result of the DFT calculation.
  • The training unit 204 backpropagates this fourth error to update the parameters of the training target model NN2.
  • By using the training data set for the first atomic structure and the training data set for the fourth atomic structure as training data without distinguishing between them, a neural network model that can make inferences for both the first and fourth atomic structures can be trained.
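  • Training on the two data sets without distinguishing between them can be sketched as follows. To keep the example self-contained, a linear model on a hypothetical three-component descriptor replaces the neural network, plain stochastic gradient descent replaces whatever optimizer is actually used, and the reference energies are generated from made-up coefficients rather than DFT.

```python
import numpy as np

def train(dataset, n_features, lr=1e-4, epochs=200, seed=0):
    """Stochastic gradient descent on the squared energy error over a mixed data set."""
    rng = np.random.default_rng(seed)
    w = np.zeros(n_features)
    for _ in range(epochs):
        for i in rng.permutation(len(dataset)):   # surface and cluster samples are shuffled together
            x, e_true = dataset[i]
            error = w @ x - e_true                # first or fourth error, treated alike
            w -= lr * error * x                   # gradient step on 0.5 * error**2
    return w

def mse(w, dataset):
    """Mean squared energy error of the linear model over the data set."""
    return float(np.mean([(w @ x - e) ** 2 for x, e in dataset]))

# Hypothetical descriptor: (number of Pt atoms, number of H atoms, number of Pt-H contacts).
w_ref = np.array([-6.0, -3.4, -0.35])             # made-up coefficients in eV

def make_sample(n_pt, contacts):
    x = np.array([n_pt, 2.0, contacts])
    return x, float(w_ref @ x)                    # stands in for a DFT energy

surface_data = [make_sample(36.0, c) for c in (0.0, 1.0, 2.0)]  # first atomic structures
cluster_data = [make_sample(4.0, c) for c in (0.0, 1.0, 2.0)]   # fourth atomic structures
dataset = surface_data + cluster_data

w = train(dataset, n_features=3)
```

  • The loop never checks whether a sample is a surface or a cluster structure; both kinds contribute an error of the same form and update the same parameters.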
  • Because the local structure is the same, the magnitude of the molecule-surface interaction in the first atomic structure and that of the molecule-cluster interaction in the fourth atomic structure do not differ much. The number of atoms, however, differs greatly between the two structures, so the magnitude of the interaction per atom differs greatly.
  • The neural network model learns this interaction from the energy. Because the interaction per atom is large in the fourth atomic structure, training on it keeps the adsorption energy from being buried in the error used for training.
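  • As a rough worked example of this per-atom effect, assume an interaction of about 1 eV (the order of magnitude given earlier for the adsorption energy), a hypothetical slab cell of 36 Pt atoms plus one H2 molecule, and a cluster of 4 Pt atoms plus one H2 molecule. The slab size is an illustrative assumption, not a value from the disclosure.

```latex
\frac{1\ \mathrm{eV}}{36 + 2} \approx 0.026\ \mathrm{eV/atom}
\qquad\text{versus}\qquad
\frac{1\ \mathrm{eV}}{4 + 2} \approx 0.17\ \mathrm{eV/atom}
```

  • Under these assumptions the same interaction is roughly six times larger per atom in the cluster structure than in the surface structure.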
  • the training device 2 trains the training target model NN2 using a dataset for multiple first atomic structures and a dataset for multiple fourth atomic structures.
  • By training in this way, the energy values for the surface and the molecule can be learned appropriately, and by using cluster and molecule data as training data, the adsorption energy can be reproduced with higher accuracy.
  • The first and fourth errors may also be defined differently from the above so that training on the energy per atom is performed more accurately.
  • the training unit 204 may divide the difference between the result of inputting the first atomic structure into the training target model NN2 and the DFT calculation result by the number of atoms included in the atomic structure, and use this value as the first error.
  • Similarly, the training unit 204 may divide the difference between the result of inputting the fourth atomic structure into the training target model NN2 and the DFT calculation result by the number of atoms included in the atomic structure, and use this value as the fourth error.
  • This makes it possible to infer, for the surface, an energy that reflects the influence of the surface and, for surface or cluster atoms in close proximity to the molecule, an energy that reflects the influence of the adsorption energy.
  • the difference between the output of the training target model NN2 and the DFT calculation result may be divided by the square of the number of atoms included in the atomic structure.
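  • The three error definitions described above (raw difference, difference divided by the number of atoms, and difference divided by the square of the number of atoms) can be sketched as one function. The function name and the `power` parameter are assumptions introduced for this illustration.

```python
def energy_error(predicted_ev, dft_ev, n_atoms, power=1):
    """Difference between the model output and the DFT energy,
    divided by n_atoms ** power.

    power=0: raw difference
    power=1: difference per atom
    power=2: difference divided by the square of the atom count
    """
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    if power not in (0, 1, 2):
        raise ValueError("power must be 0, 1, or 2")
    return (predicted_ev - dft_ev) / (n_atoms ** power)


# Assumed example values: a 10-atom structure with a 1 eV total error.
raw = energy_error(-9.0, -10.0, n_atoms=10, power=0)
per_atom = energy_error(-9.0, -10.0, n_atoms=10, power=1)
per_atom_squared = energy_error(-9.0, -10.0, n_atoms=10, power=2)
```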
  • FIG. 11 is a flow chart showing the processing of the training device 2 according to this embodiment.
  • the training device 2 acquires a training data set via the input unit 200 (S200).
  • the training dataset comprises the dataset for the first atomic structure and the dataset for the fourth atomic structure, as described above.
  • the training unit 204 uses the acquired training data set to train the training target model NN2 based on any appropriate machine learning method (S202).
  • the training device 2 outputs necessary data such as parameters related to the trained model NN2, and ends the process (S204).
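  • The flow S200, S202, S204 can be sketched as three functions. This is only an illustration under stated assumptions: the toy model below (a sum of per-element contributions) stands in for the training target model NN2, the reference energies are made-up numbers rather than DFT results, and plain gradient descent stands in for "any appropriate machine learning method".

```python
def acquire_dataset():
    """S200: return (element counts, reference energy in eV) pairs.
    The numbers are illustrative, not DFT results."""
    return [
        ({"Pt": 4, "O": 1}, -29.0),
        ({"Pt": 2, "O": 2}, -22.0),
        ({"Pt": 8, "O": 1}, -53.0),
        ({"O": 2}, -10.0),
    ]


def predict(params, counts):
    """Toy stand-in for NN2: energy is a sum of per-element contributions."""
    return sum(params[element] * n for element, n in counts.items())


def train(dataset, epochs=200, learning_rate=0.5):
    """S202: minimise the squared per-atom energy error by gradient descent."""
    params = {element: 0.0 for counts, _ in dataset for element in counts}
    for _ in range(epochs):
        for counts, reference in dataset:
            n_atoms = sum(counts.values())
            error = (predict(params, counts) - reference) / n_atoms
            for element, n in counts.items():
                # Gradient of error**2 with respect to params[element].
                params[element] -= learning_rate * 2.0 * error * n / n_atoms
    return params


def output_parameters(params):
    """S204: return the trained parameters in a stable, serialisable order."""
    return dict(sorted(params.items()))


trained = output_parameters(train(acquire_dataset()))
```

  • The illustrative energies were generated from -6.0 eV per Pt atom and -5.0 eV per O atom, so the trained parameters should approach those values.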
  • the inference device can infer highly accurate adsorption energy.
  • the adsorption energy can be inferred with high accuracy, but it is also possible to further improve this accuracy.
  • In addition to the data sets for the first atomic structure and the fourth atomic structure, the training device 2 can also use a data set for the second atomic structure, which contains only the adsorbed molecule.
  • FIG. 12 is a diagram showing an example of an atomic structure having only molecules.
  • an atomic structure in which only molecules are set in a unit cell is defined as a second atomic structure. Then, the energy of this second atomic structure is obtained by DFT calculation and added to the data set.
  • the training unit 204 uses this second atomic structure and the data set of the DFT calculation results to calculate the second error in the same manner as above, and trains the training target model NN2 based on this second error. It is desirable to train the data set for the second atomic structure without distinguishing it from the data set for the first atomic structure and the data set for the fourth atomic structure.
  • A data set may also be prepared with an atomic structure as shown in FIG. 2 as the third atomic structure, which relates to the surface.
  • the preparation of the data set is the same as above, so the details are omitted.
  • the training unit 204 calculates the third error by comparing the result of inputting the third atomic structure to the training target model NN2 and the result of the DFT calculation using the data set related to the third atomic structure.
  • The training target model NN2 may be trained based on the third error along with the first error and the fourth error.
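  • One reason molecule-only and surface-only energies matter is that an adsorption energy is commonly computed as the energy of the combined system minus the energies of its separated parts. The sketch below assumes that common definition; the function name and the numeric values are illustrative and are not taken from the disclosure.

```python
def adsorption_energy(e_combined_ev, e_substrate_ev, e_molecule_ev):
    """Common definition: E_ads = E(substrate + molecule) - E(substrate) - E(molecule).
    A negative value indicates that adsorption lowers the total energy."""
    return e_combined_ev - e_substrate_ev - e_molecule_ev


# Assumed energies, as if inferred for the first, third, and second atomic structures.
e_ads = adsorption_energy(e_combined_ev=-101.5, e_substrate_ev=-90.0, e_molecule_ev=-10.0)
```

  • Under this definition, an error in any of the three inferred energies propagates directly into the adsorption energy, which is consistent with training on all three kinds of structure.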
  • FIG. 13 is a diagram showing an example of the atomic structure of a cluster.
  • the training unit 204 may perform training based on the fifth atomic structure, which is the atomic structure of the cluster, in addition to the first atomic structure and the fourth atomic structure. For this, a data set for the fifth atomic structure may be prepared.
  • the training unit 204 calculates the fifth error by comparing the result of inputting the fifth atomic structure to the training target model NN2 and the result of the DFT calculation using the data set related to the fifth atomic structure.
  • The training target model NN2 may be trained based on the fifth error along with the first error and the fourth error.
  • The training device 2 performs training based on data sets of at least the first atomic structure and the fourth atomic structure. Training may also be performed by adding, to the data set for the first atomic structure and the data set for the fourth atomic structure, a data set for at least one of the second atomic structure, the third atomic structure, or the fifth atomic structure described above. The second atomic structure, the third atomic structure, and the fifth atomic structure can be incorporated into the training in any combination. Of course, data sets for all of the first to fifth atomic structures may be prepared, and the training device 2 may train the training target model NN2 using these data sets.
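  • The rule that the first and fourth data sets are always used, with the second, third, and fifth added in any combination, can be sketched as a small selection helper. The names `select_training_records`, `REQUIRED_KINDS`, and `OPTIONAL_KINDS` and the placeholder records are assumptions for illustration.

```python
REQUIRED_KINDS = {"first", "fourth"}
OPTIONAL_KINDS = {"second", "third", "fifth"}


def select_training_records(datasets, optional_kinds=()):
    """Return records for the first and fourth structures plus any chosen optional kinds."""
    extras = set(optional_kinds)
    unknown = extras - OPTIONAL_KINDS
    if unknown:
        raise ValueError(f"unknown optional kinds: {sorted(unknown)}")
    chosen = REQUIRED_KINDS | extras
    missing = chosen - set(datasets)
    if missing:
        raise KeyError(f"missing datasets: {sorted(missing)}")
    # Records are pooled so that training does not distinguish their origin.
    return [record for kind in sorted(chosen) for record in datasets[kind]]


# Placeholder record names; real records would hold structures and DFT energies.
datasets = {
    "first": ["surface+molecule A", "surface+molecule B"],
    "second": ["molecule only"],
    "third": ["surface only"],
    "fourth": ["cluster+molecule A"],
    "fifth": ["cluster only"],
}
minimal = select_training_records(datasets)
extended = select_training_records(datasets, optional_kinds=("second", "fifth"))
```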
  • A data set for an atomic structure in which the surface and the molecule are sufficiently separated, and for an atomic structure in which the cluster and the molecule are sufficiently separated, may be prepared and used as a training data set together with the first atomic structure and the fourth atomic structure.
  • The sufficient distance may be, for example, 10 Å or more between the surface or cluster atoms and the molecule atoms. Note that in this case the molecule is also placed well away from the opposite face of the unit cell in which the surface or cluster resides.
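  • A separation check of this kind can be sketched as a minimum pairwise distance test. The function names and coordinates are assumptions for illustration, and this simplified version ignores periodic images of the unit cell, which a real check would also need to consider.

```python
import math


def min_distance(group_a, group_b):
    """Smallest Euclidean distance between any atom in group_a and any atom in group_b."""
    return min(math.dist(a, b) for a in group_a for b in group_b)


def is_sufficiently_separated(substrate_atoms, molecule_atoms, threshold_angstrom=10.0):
    """True when every substrate-molecule atom pair is at least the threshold apart.
    Periodic images are not considered in this simplified sketch."""
    return min_distance(substrate_atoms, molecule_atoms) >= threshold_angstrom


# Assumed Cartesian coordinates in angstroms.
substrate = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
far_molecule = [(0.0, 0.0, 12.0), (0.0, 0.0, 13.1)]
near_molecule = [(0.0, 0.0, 3.0), (0.0, 0.0, 4.1)]

far_ok = is_sufficiently_separated(substrate, far_molecule)
near_ok = is_sufficiently_separated(substrate, near_molecule)
```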
  • With such data, the training device 2 can train on surfaces and molecules, or clusters and molecules, that are sufficiently separated. Since there is no adsorption energy in these cases, training can clearly distinguish them from cases where the surface and the molecule, or the cluster and the molecule, are arranged so close that they have significant adsorption energy. For this reason, it is possible to form a trained model NN1 that can make accurate inferences both for cases where the surface or cluster and the molecule are merely included in the same atomic structure and for cases where the structure is stabilized by adsorption energy.
  • the atomic structure of the surface described in the present embodiment described above is based on the atomic structure of a solid surface, but the "surface” is not limited to a solid surface.
  • A "surface" is the boundary where one homogeneous solid or liquid phase meets another homogeneous gaseous phase or a vacuum, and the concept includes liquid surfaces such as the surface of a liquid metal generated by simulating high temperature conditions. The training target model may then be trained using, as the first atomic structure, an atomic structure in which the surface of the liquid metal and the adsorbed molecule are close to each other.
  • All of the above trained models may be concepts that include, for example, models that have been trained as described and further distilled by a general method.
  • Each device (inference device 1 or training device 2) in the above-described embodiment may be configured by hardware, or may be configured by information processing of software (a program) executed by a CPU (Central Processing Unit), GPU (Graphics Processing Unit), or the like.
  • When a device is configured by software information processing, software that realizes at least a part of the functions of each device in the above-described embodiments may be stored in a non-transitory storage medium (non-transitory computer-readable medium) such as a flexible disk, CD-ROM (Compact Disc-Read Only Memory), or USB (Universal Serial Bus) memory, and read into a computer to execute the software information processing.
  • the software may be downloaded via a communication network.
  • information processing may be performed by hardware by implementing software in a circuit such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array).
  • the type of storage medium that stores the software is not limited.
  • the storage medium is not limited to a detachable one such as a magnetic disk or an optical disk, and may be a fixed storage medium such as a hard disk or memory. Also, the storage medium may be provided inside the computer, or may be provided outside the computer.
  • FIG. 14 is a block diagram showing an example of the hardware configuration of each device (inference device 1 or training device 2) in the above-described embodiment.
  • Each device includes, for example, a processor 71, a main storage device 72 (memory), an auxiliary storage device 73 (memory), a network interface 74, and a device interface 75, and may be implemented as a computer 7 in which these components are connected via a bus 76.
  • The computer 7 in FIG. 14 has one of each component, but may have a plurality of the same components. Also, although one computer 7 is shown in FIG. 14, the processing may be executed by a plurality of computers. In this case, it may be in the form of distributed computing in which each computer communicates via the network interface 74 or the like to execute processing.
  • That is, each device (inference device 1 or training device 2) in the above-described embodiment may be configured as a system that realizes functions by one or more computers executing instructions stored in one or more storage devices.
  • the information transmitted from the terminal may be processed by one or more computers provided on the cloud, and the processing result may be transmitted to the terminal.
  • Various operations of each device (inference device 1 or training device 2) in the above-described embodiments may be executed in parallel using one or more processors or using multiple computers via a network. Also, various operations may be distributed to a plurality of operation cores in the processor and executed in parallel. Also, part or all of the processing, means, etc. of the present disclosure may be executed by at least one of a processor and a storage device provided on a cloud capable of communicating with the computer 7 via a network. Thus, each device in the above-described embodiments may be in the form of parallel computing by one or more computers.
  • the processor 71 may be an electronic circuit (processing circuit, processing circuitry, CPU, GPU, FPGA, ASIC, etc.) including a computer control device and arithmetic device. Also, the processor 71 may be a semiconductor device or the like including a dedicated processing circuit. The processor 71 is not limited to an electronic circuit using electronic logic elements, and may be realized by an optical circuit using optical logic elements. Also, the processor 71 may include arithmetic functions based on quantum computing.
  • the processor 71 can perform arithmetic processing based on the data and software (programs) input from each device, etc. of the internal configuration of the computer 7, and output the arithmetic result and control signal to each device, etc.
  • the processor 71 may control each component of the computer 7 by executing the OS (Operating System) of the computer 7, applications, and the like.
  • Each device (inference device 1 or training device 2) in the above-described embodiments may be realized by one or more processors 71.
  • The processor 71 may refer to one or more electronic circuits arranged on one chip, or to one or more electronic circuits arranged on two or more chips or two or more devices. When multiple electronic circuits are used, each electronic circuit may communicate by wire or wirelessly.
  • the main storage device 72 is a storage device that stores instructions and various data to be executed by the processor 71 , and the information stored in the main storage device 72 is read by the processor 71 .
  • Auxiliary storage device 73 is a storage device other than main storage device 72 . These storage devices mean any electronic components capable of storing electronic information, and may be semiconductor memories. The semiconductor memory may be either volatile memory or non-volatile memory.
  • A storage device for storing various data in each device (inference device 1 or training device 2) in the above-described embodiments may be realized by the main storage device 72 or the auxiliary storage device 73, or may be realized by an internal memory built into the processor 71.
  • the storage unit 102 in the above-described embodiment may be realized by the main storage device 72 or the auxiliary storage device 73.
  • A plurality of processors may be connected (coupled) to one storage device (memory), or a single processor may be connected.
  • a plurality of storage devices (memories) may be connected (coupled) to one processor.
  • When each device (inference device 1 or training device 2) in the above-described embodiments is composed of at least one storage device (memory) and a plurality of processors connected (coupled) to this at least one storage device (memory), it may include a configuration in which at least one of the plurality of processors is connected (coupled) to the at least one storage device (memory).
  • this configuration may be realized by storage devices (memory) and processors included in a plurality of computers.
  • A configuration in which a storage device (memory) is integrated with a processor (for example, a cache memory including an L1 cache and an L2 cache) may be included.
  • the network interface 74 is an interface for connecting to the communication network 8 wirelessly or by wire. As for the network interface 74, an appropriate interface such as one conforming to existing communication standards may be used. The network interface 74 may exchange information with the external device 9A connected via the communication network 8.
  • The communication network 8 may be any one of WAN (Wide Area Network), LAN (Local Area Network), PAN (Personal Area Network), etc., or a combination thereof, as long as information can be exchanged between the computer 7 and the external device 9A. Examples of WAN include the Internet, examples of LAN include IEEE802.11 and Ethernet (registered trademark), and examples of PAN include Bluetooth (registered trademark) and NFC (Near Field Communication).
  • the device interface 75 is an interface such as USB that directly connects with the external device 9B.
  • the external device 9A is a device connected to the computer 7 via a network.
  • External device 9B is a device that is directly connected to computer 7 .
  • the external device 9A or the external device 9B may be an input device.
  • the input device is, for example, a device such as a camera, microphone, motion capture, various sensors, a keyboard, a mouse, or a touch panel, and provides the computer 7 with acquired information.
  • a device such as a personal computer, a tablet terminal, or a smartphone including an input unit, a memory, and a processor may be used.
  • the external device 9A or the external device 9B may be, for example, an output device.
  • the output device may be, for example, a display device such as LCD (Liquid Crystal Display), CRT (Cathode Ray Tube), PDP (Plasma Display Panel), or organic EL (Electro Luminescence) panel.
  • a speaker or the like for output may be used.
  • a device such as a personal computer, a tablet terminal, or a smartphone including an output unit, a memory, and a processor may be used.
  • the external device 9A or the external device 9B may be a storage device (memory).
  • the external device 9A may be a network storage or the like, and the external device 9B may be a storage such as an HDD.
  • the external device 9A or the external device 9B may be a device having the functions of some of the components of each device (inference device 1 or training device 2) in the above-described embodiments. That is, the computer 7 may transmit or receive part or all of the processing results of the external device 9A or the external device 9B.
  • The expression "at least one (one) of a, b and c" or "at least one (one) of a, b or c" includes any of a, b, c, a-b, a-c, b-c, or a-b-c. Also, multiple instances of any element may be included, such as a-a, a-b-b, a-a-b-b-c-c, and so on. It also includes the addition of other elements than the listed elements (a, b and c), such as having d such as a-b-c-d.
  • When the terms "connected" and "coupled" are used, they are intended as terms that include direct connection/coupling, indirect connection/coupling, electrical connection/coupling, communicative connection/coupling, operative connection/coupling, physical connection/coupling, and the like.
  • The terms should be interpreted appropriately according to the context in which they are used, but any form of connection/coupling that is not intentionally or naturally excluded should be interpreted non-restrictively as being included in the terms.
  • When element A is described as being configured to perform operation B, this may include both that the physical structure of element A has a configuration capable of performing operation B and that a permanent or temporary setting/configuration of element A is configured/set to actually perform operation B.
  • For example, when element A is a general-purpose processor, it is sufficient that the processor has a hardware configuration capable of executing operation B and is configured to actually execute operation B by setting a permanent or temporary program (instructions).
  • When element A is a dedicated processor, a dedicated arithmetic circuit, or the like, it is sufficient that the circuit structure of the processor is implemented so as to actually execute operation B, regardless of whether control instructions and data are actually attached.
  • Terms referring to optimization include finding a global optimum, finding an approximation of a global optimum, finding a local optimum, and finding an approximation of a local optimum, and should be interpreted appropriately depending on the context in which the term is used. They also include stochastically or heuristically approximating these optimum values.
  • Each piece of hardware may work together to perform the predetermined processing, or a part of the hardware may perform all of the predetermined processing. Also, some hardware may perform a part of the predetermined processing, and another hardware may perform the rest of the predetermined processing.
  • the hardware that performs the first process and the hardware that performs the second process may be the same or different. In other words, the hardware that performs the first process and the hardware that performs the second process may be included in the one or more pieces of hardware.
  • hardware may include an electronic circuit or a device including an electronic circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
PCT/JP2022/023523 2021-06-11 2022-06-10 Training device, training method, program, and inference device Ceased WO2022260179A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2023524992A JP7392203B2 (ja) 2021-06-11 2022-06-10 Training device, training method, program, and inference device
US18/534,130 US20240127121A1 (en) 2021-06-11 2023-12-08 Training device, method, non-transitory computer readable medium, and inferring device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-098304 2021-06-11
JP2021098304 2021-06-11

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/534,130 Continuation US20240127121A1 (en) 2021-06-11 2023-12-08 Training device, method, non-transitory computer readable medium, and inferring device

Publications (1)

Publication Number Publication Date
WO2022260179A1 (ja) 2022-12-15

Family

ID=84424563

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/023523 Ceased WO2022260179A1 (ja) 2021-06-11 2022-06-10 Training device, training method, program, and inference device

Country Status (3)

Country Link
US (1) US20240127121A1
JP (1) JP7392203B2
WO (1) WO2022260179A1

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118974837A (zh) * 2022-03-31 2024-11-15 松下知识产权经营株式会社 Information processing method, information processing system, and program
US12587274B2 (en) 2023-03-28 2026-03-24 Quantum Generative Materials Llc Satellite optimization management system based on natural language input and artificial intelligence
US12368503B2 (en) 2023-12-27 2025-07-22 Quantum Generative Materials Llc Intent-based satellite transmit management based on preexisting historical location and machine learning
US12603701B2 (en) 2023-12-27 2026-04-14 Quantum Generative Materials Llc Distributed satellite constellation management and control system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0589074A (ja) * 1991-09-30 1993-04-09 Fujitsu Ltd Secondary structure prediction device
JP2003303313A (ja) * 1996-12-19 2003-10-24 Fujitsu Ltd Particle simulation system and storage medium
WO2021054402A1 (ja) * 2019-09-20 2021-03-25 株式会社 Preferred Networks Estimation device, training device, estimation method, and training method

Also Published As

Publication number Publication date
JPWO2022260179A1 2022-12-15
JP7392203B2 (ja) 2023-12-05
US20240127121A1 (en) 2024-04-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22820357

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023524992

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22820357

Country of ref document: EP

Kind code of ref document: A1