WO2022260177A1 - Estimation device, training device, estimation method, training method, program, and non-transitory computer-readable medium - Google Patents

Estimation device, training device, estimation method, training method, program, and non-transitory computer-readable medium

Info

Publication number
WO2022260177A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
model
training
energy
atoms
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/023520
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
大資 本木
幾 品川
聡 高本
広紀 入口
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Preferred Networks Inc
Original Assignee
Preferred Networks Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Preferred Networks Inc
Priority to JP2023527952A (patent JP7457877B2)
Publication of WO2022260177A1
Priority to US18/534,252 (publication US20240111998A1)
Anticipated expiration (legal status)
Priority to JP2024040534A (publication JP2024075646A)
Ceased (legal status, current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00: Subject matter not provided for in other groups of this subclass
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70: Machine learning, data mining or chemometrics

Definitions

  • the present disclosure relates to an estimation device, a training device, an estimation method, a training method, a program, and a non-transitory computer-readable medium.
  • NNP: Neural Network Potential
  • a neural network model training device that improves the accuracy of NNP is provided.
  • a training device comprises one or more memories and one or more processors.
  • the one or more processors train a model that outputs at least energy information when atomic information is input, using training data including material information and training data including two-body potential information.
  • FIG. 2 is a flowchart showing processing of the training device according to one embodiment.
  • FIG. 3 is a diagram schematically showing the flow of data in data generation, model generation, and inference according to one embodiment.
  • FIG. 4 is a graph showing a two-body potential inferred by the NNP generated by the device according to one embodiment.
  • FIG. 5 is a block diagram of a training device according to one embodiment.
  • FIG. 6 is a flowchart showing processing of the training device according to one embodiment.
  • FIG. 7 is a block diagram showing an estimation device according to one embodiment.
  • FIG. 8 is a flowchart showing processing of the estimation device according to one embodiment.
  • FIG. 9 is a diagram schematically showing the flow of data in data generation, model generation, and inference according to one embodiment.
  • A block diagram showing a hardware implementation example of the training device and the estimation device according to one embodiment.
  • the interatomic potential (the potential energy of interatomic interaction) is a function that gives the energy from the arrangement of atoms. This function is generally a hand-designed function, and it corresponds to the governing equation of an MD (Molecular Dynamics) simulation. A non-limiting example of an interatomic potential is the Lennard-Jones potential.
  • an NNP (Neural Network Potential) is an interatomic potential represented by a neural network.
  • the neural network may be, for example, a GNN (Graph Neural Network) that handles graphs.
  • the NNP (model NN) of this specification consists of an input layer that receives information on each atom in the atomic system whose energy is to be estimated, a hidden layer that performs calculations based on the input information, and an output layer that outputs the energy of the atomic system.
  • Neural network training is performed using a training dataset.
  • the training data set includes, for each of a plurality of atomic systems, information on each atom in the atomic system and the correct energy value of the atomic system.
  • the correct value of the energy of an atomic system is, for example, the energy value calculated for that atomic system by first-principles calculation (for example, a calculation based on DFT (density functional theory), the HF (Hartree-Fock) method, or the MP (Møller-Plesset) method).
  • Neural network training involves, for each atomic system included in the training data set, inputting information about each atom in the atomic system into the neural network, calculating the error between the estimated energy that is output and its correct value, and updating the weight parameters of the neural network by error backpropagation based on those errors.
  • NNP may output secondary information such as the charge of each atom in the input atomic system.
  • in this case, the neural network's training data set contains the correct values of the charge, and the neural network is trained by error backpropagation based on the energy error and the charge error.
  • the NNP may have a function of calculating the derivative of the estimated energy with respect to the position of an atom as the force applied to that atom.
  • the NNP may also output the force applied to each atom of the input atomic system.
  • in this case, the training data set for the neural network contains the correct values of the force applied to each atom, and the neural network is trained by error backpropagation based on the errors in energy and/or charge and the errors in the force applied to each atom.
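As a non-limiting illustration, the relationship between energy, force, and training error described above can be written compactly. This is a sketch under stated assumptions: the symbols $E_\theta$, $\mathbf{r}_i$, $q_i$, the starred reference values, and the weights $\lambda_E, \lambda_q, \lambda_F$ do not appear in the source, and both the squared-error form and the conventional minus sign in the force are assumptions.

```latex
\mathbf{F}_i = -\frac{\partial E_\theta}{\partial \mathbf{r}_i},
\qquad
\mathcal{L}(\theta) =
  \lambda_E \left(E_\theta - E^{*}\right)^2
  + \lambda_q \sum_i \left(q_i - q_i^{*}\right)^2
  + \lambda_F \sum_i \left\lVert \mathbf{F}_i - \mathbf{F}_i^{*} \right\rVert^2
```

Setting $\lambda_q = 0$ or $\lambda_F = 0$ would correspond to training without the charge or force terms, respectively.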
  • the atom information input to the model NN used in the NNP is, for example, information including the type and position of each atom.
  • information about atoms may be referred to as atomic information.
  • the information on the positions of atoms includes, for example, information that directly indicates the positions of atoms by coordinates, information that directly or indirectly indicates relative positions between atoms, and the like.
  • the information is represented, for example, by interatomic distances, angles, dihedral angles, and the like.
  • the atom information may be information directly indicating the position or information calculated from the position information.
  • the information on atoms may also include information on charges and information on bonds.
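As a non-limiting illustration of position information derived from coordinates, the following Python sketch computes an interatomic distance and the angle formed by three atoms. The function names and the three-atom geometry are hypothetical and are not taken from the source.

```python
import math

def distance(a, b):
    """Euclidean distance between two atomic positions."""
    return math.dist(a, b)

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical three-atom geometry (coordinates are made up for illustration).
atoms = [("O", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0)), ("H", (0.0, 1.0, 0.0))]
r_oh = distance(atoms[0][1], atoms[1][1])
theta_hoh = angle_deg(atoms[1][1], atoms[0][1], atoms[2][1])
```

Either the raw coordinates or derived values such as these could serve as the position part of the atomic information.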
  • the model NN outputs information about energy, for example.
  • Information about energy is, for example, the energy itself or information calculated based on the energy.
  • Information calculated based on energy includes, for example, the force on each atom, stress (the stress of the entire system), the virial of each atom, and the virial of the entire system.
  • NNP may output information that can be calculated using NNP, such as charge for each atom, in addition to information on energy.
  • a two-body potential curve shows the relationship between the distance and energy of two atoms when only two atoms exist in the system.
  • the two-body potential function attempts to express the total energy of the system by summing two-body potential curves. In general, it is difficult to accurately reproduce energy values and the like using two-body potential functions alone.
  • the Lennard-Jones potential mentioned above is also a two-body potential function.
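As a non-limiting illustration of a two-body potential function, the following Python sketch evaluates the standard 12-6 Lennard-Jones form and sums it over all atom pairs. The parameter values (`epsilon`, `sigma` in reduced units) and the function names are assumptions for illustration only.

```python
import itertools
import math

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def pairwise_energy(positions, pair_potential=lennard_jones):
    """Total energy as the sum of pair energies over all atom pairs."""
    return sum(
        pair_potential(math.dist(a, b))
        for a, b in itertools.combinations(positions, 2)
    )

# Equilateral triangle with side equal to the pair minimum 2**(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
triangle = [
    (0.0, 0.0, 0.0),
    (r_min, 0.0, 0.0),
    (r_min / 2.0, r_min * math.sqrt(3.0) / 2.0, 0.0),
]
e_triangle = pairwise_energy(triangle)
```

Because the total is a plain sum of pair terms, effects that depend on three or more atoms at once are not captured, which is one reason such functions alone struggle to reproduce reference energies.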
  • FIG. 1 is a block diagram showing an example of a training device according to this embodiment.
  • the training device 1 includes an input unit 100, a storage unit 102, a training unit 104, and an output unit 106.
  • the training device 1 is a device that trains the model NN in any suitable machine learning technique based on input data.
  • the input unit 100 accepts input of data in the training device 1.
  • the input unit 100 has, for example, an input interface.
  • the input data is data used for training the model NN.
  • the data used for training may include, for example, teacher data relating to input/output of the model NN, verification data used for validation, and the like.
  • the input unit 100 may accept input of data such as hyperparameters and initial parameters.
  • the storage unit 102 stores data necessary for the operation of the training device 1. For example, data input via the input unit 100 may be stored in this storage unit 102.
  • the storage unit 102 is included in the training device 1 in FIG. 1, but at least part of the storage unit 102 may be implemented in an external storage, file server, or the like. In this case, the data may be input via the input section 100 at the timing when the data is required.
  • the training unit 104 executes training of the model NN.
  • the training unit 104, for example, forward propagates data input via the input unit 100 through the model NN, compares the output data with the teacher data to calculate an error, backpropagates the error, and updates the parameters that make up the model NN as appropriate based on information such as gradients.
  • the output unit 106 outputs data such as parameters optimized through training by the training unit 104 to the outside or to the storage unit 102.
  • the output of parameters and the like may be a concept that includes a process of storing data in the storage unit 102 of the training device 1 in addition to the output to the outside.
  • a model NN is a neural network model used in NNP, for example, a model that infers interatomic interactions so as to reproduce the results of quantum chemical calculations.
  • the model NN is a neural network model that outputs energy and force when atomic information on molecules, compounds such as crystals, and the like, together with information such as the environment, is input.
  • forces can be obtained by backpropagating the energy value, that is, as the derivative of the energy with respect to the atomic positions.
  • the atom information input to the model NN used in the NNP is, for example, data including information on the type and position of each atom.
  • information about atoms may be referred to as atomic information.
  • the atom information includes, for example, the type of atom, information on the position of the atom, and the like.
  • Information about the positions of atoms includes, for example, information indicating coordinates of atoms, information indicating relative positions between atoms directly or indirectly, and the like.
  • the information is represented, for example, by interatomic distances, angles, dihedral angles, and the like.
  • as information on the positions of atoms, for example, the distance between two atoms and the angle formed by three atoms are calculated from the coordinates of the atoms.
  • the information about the position of the atom may be information directly indicating the position or information calculated from the position.
  • information on charges and information on bonds may also be included.
  • the model NN is constructed as any neural network model suitable for performing NNP inference.
  • This configuration may include, for example, a convolutional layer and a fully connected layer, or may include a layer or a convolutional layer capable of inputting/outputting graph information.
  • the model NN may be, for example, a model in which information such as molecules is input as graph information, or a model in which tree information converted from a graph is input.
  • This model NN can be used, as non-limiting examples, to infer the binding energy between proteins and compounds or the reaction rates of catalysts. It can also be used for inference in other processing that uses energies, forces, and the like between compounds.
  • the model NN is used to infer the energies, forces, and the like of compounds, proteins, and so on, and these values can then be used, for example with the MD method, to infer the binding energy between a protein and a compound, the reaction rate of a catalyst, and the like.
  • FIG. 2 is a flowchart showing an example of processing of the training device 1 according to this embodiment.
  • the training device 1 receives data necessary for training via the input unit 100 (S100). This data may be stored in the storage unit 102 as needed.
  • the data are data necessary for training, such as training data as teacher data, hyperparameter data and initial parameter data for forming a neural network model, and verification data for executing validation. This training data will be described in detail later.
  • the training unit 104 executes training using the input data (S102).
  • the training unit 104 first forms a neural network model (model NN) based on hyperparameters and the like, inputs input data among training data to the model NN, and executes forward propagation processing. After the forward propagation process is completed, the backward propagation process is executed based on the error between the data output from the model NN and the output data (teacher data) of the training data.
  • the training unit 104 updates the parameters of each layer of the model NN based on the gradient information acquired by this backpropagation. This series of processes is repeated until the termination condition is satisfied.
  • This training process is performed by any suitable general machine learning technique.
  • the output unit 106 outputs the optimized parameters and ends the process (S104).
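As a non-limiting illustration of the loop in S100 to S104 (forward propagation, error against teacher data, gradient computation, parameter update, termination check), the following Python sketch trains a deliberately tiny stand-in model. The one-weight, one-bias linear model, the mean-squared-error loss, the learning rate, and the termination thresholds are all assumptions; the source leaves the model architecture and the learning method open.

```python
def train(samples, lr=0.05, max_epochs=2000, tol=1e-10):
    """Fit pred = w * x + b by full-batch gradient descent."""
    w, b = 0.0, 0.0  # initial parameters (assumed)
    loss = float("inf")
    for _ in range(max_epochs):
        grad_w = grad_b = loss = 0.0
        for x, target in samples:
            pred = w * x + b          # forward propagation
            err = pred - target       # error against the teacher data
            loss += err * err
            grad_w += 2.0 * err * x   # gradient of the squared error
            grad_b += 2.0 * err
        n = len(samples)
        loss /= n
        if loss < tol:                # termination condition
            break
        w -= lr * grad_w / n          # parameter update
        b -= lr * grad_b / n
    return w, b, loss

# Made-up teacher data generated from target = 2 * x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b, final_loss = train(samples)
```

In practice the gradients of a neural network would come from automatic differentiation rather than the hand-written expressions used here.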
  • data on compounds such as molecules and crystals are used for training neural network models in NNP.
  • data obtained using simulations based on physical laws such as first-principles calculations such as DFT (Density Functional Theory) can be used.
  • the energy and force of this compound or the like are obtained by DFT, and a combination of this data and input data is used as training data.
  • this training data may be obtained from a database or the like that stores results that have already been calculated, in addition to being obtained by calculation such as DFT.
  • the training device 1 uses quantities such as the energy and force between two atoms (two elements) for training the model NN. These quantities follow the two-body potential curve.
  • the training device 1 uses, as training data, a data set according to a two-body potential curve, which is information obtained by calculating interaction energies and forces of two elements at various distances by an arbitrary method.
  • This training data may be generated in advance by a training data generation device different from the training device 1.
  • the training data generation device obtains the data necessary for training by simulation based on the laws of physics, such as first-principles calculation (for example, DFT calculation): for each of the two atoms, it sets the element for which data is to be acquired and runs the simulation while changing the interatomic distance. That is, the training data generation device sets two elements of the same type or of different types, sets the distance between them to various values, and obtains the energy and force at each distance by first-principles calculation to generate training data. In this way, a data set that follows the two-body potential curve is generated and used for training.
  • alternatively, the two-body potential may be approximated as a function, and the potential for a given pair of elements and a given distance may be obtained from this function.
  • in this case, although the accuracy may be inferior to the calculation result of DFT or the like, the result can be obtained at a higher speed. Therefore, it is also possible to reduce the arithmetic processing time for training data generation.
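As a non-limiting illustration of generating a data set that follows a two-body potential curve, the following Python sketch scans the distance between two atoms and records the energy and force at each distance. An analytic Lennard-Jones function stands in for the first-principles calculation here; that substitution, the reduced-unit parameters, and the record layout are assumptions for illustration.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy (stand-in for a DFT energy)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force -dE/dr; a positive value means repulsion."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r

def generate_pair_dataset(element_a, element_b, r_start, r_stop, n_points,
                          energy_fn=lj_energy, force_fn=lj_force):
    """Scan the interatomic distance and record energy and force."""
    step = (r_stop - r_start) / (n_points - 1)
    records = []
    for k in range(n_points):
        r = r_start + k * step
        records.append({
            "elements": (element_a, element_b),
            "distance": r,
            "energy": energy_fn(r),
            "force": force_fn(r),
        })
    return records

# Hypothetical scan for a pair of identical atoms in reduced units.
dataset = generate_pair_dataset("H", "H", 0.9, 2.0, 12)
```

Replacing `energy_fn` and `force_fn` with calls into a first-principles code would give the slower but more accurate variant described above.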
  • FIG. 3 is a diagram schematically showing the state of data generation and training according to this embodiment.
  • a dashed line represents a data generation device, a dotted line represents a training device, and a solid line represents an estimation device.
  • the data generator may use, for example, DFT as the first-principles calculation.
  • Any software that executes DFT, such as VASP (registered trademark) or Gaussian (registered trademark), can be used with arbitrary parameters.
  • this may be the same software, with the same parameters, as the software used to acquire information on molecules, crystals, and the like.
  • data may be generated with any such combination of software and parameters.
  • in the comparative example, the data generation device obtains energy and force information from various states such as molecules and crystals using first-principles calculations and the like. Using this information, the training device trains the model. The estimation device performs inference using the trained model. Data used for training in this way are based on molecules, crystals, and the like; data based on two atoms (data following a two-body potential curve) are not used for training.
  • the data generation device acquires energy and force information from various states such as molecules and crystals using first-principles calculation.
  • Information such as the interaction energy between two atoms following the two-body potential curve is obtained from the information on the distance between two atoms using first-principles calculations.
  • the training device executes model training using both the information and energies of molecules, crystals, and the like and the information and energies of two atoms.
  • the estimator then performs inference using this model. In this way, the training uses diatomic-based data as well as information on molecules, crystals, etc.
  • the data generation device does not have to be provided in the system, in which case data that already exists in a database or the like may be used.
  • FIG. 4 is a graph showing energy estimation results generated in this way by the estimation device according to the present embodiment and energy estimation results by an estimation device (comparative example) trained without using two-body potentials.
  • FIG. 4 is a graph comparing inference results between a model NN trained by the training device 1 according to the present embodiment and a comparative example for the case where the two atoms are two hydrogen atoms.
  • the solid line represents the energy between two atoms calculated in DFT under the condition ωB97XD/6-31G(d).
  • Plots indicated by circles are NNP calculations using the model NN trained in this embodiment.
  • the plot indicated by x is the result of NNP calculation using a model trained without using the two-body potential as a comparative example.
  • the bond length of a hydrogen molecule is about 0.74 Å, but in the comparative example there is also a local stability point near 1.4 Å. For this reason, in MD simulations and the like, a hydrogen molecule with a bond length of about 1.4 Å, which is inherently unstable, appears, and results that deviate from reality are obtained. In contrast, according to the present embodiment, inference of the two-body potential can be realized with higher precision than in the comparative example.
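As a non-limiting illustration of how a spurious local stability point might be detected in an inferred two-body potential curve, the following Python sketch lists the distances at which a sampled curve has a strict local minimum. The function name is hypothetical, and the sample energies are made-up numbers chosen only to produce two minima; they are not results from the source.

```python
def local_minima(distances, energies):
    """Return distances where the sampled energy is a strict local minimum."""
    minima = []
    for k in range(1, len(energies) - 1):
        if energies[k] < energies[k - 1] and energies[k] < energies[k + 1]:
            minima.append(distances[k])
    return minima

# Made-up samples: one minimum near 0.74 and a second, unwanted one near 1.4.
distances = [0.5, 0.74, 1.0, 1.4, 2.0]
energies = [3.0, 1.0, 2.0, 1.5, 2.5]
stable_points = local_minima(distances, energies)
```

A curve for a simple diatomic pair would be expected to yield a single entry; additional entries suggest artifacts of the kind seen in the comparative example.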
  • the above is a two-body potential, but this can be rephrased as a potential between at least two atoms. That is, if potentials between atoms of three or more bodies can be calculated appropriately, the potentials between atoms of three or more bodies may be input as a data set used for training. In such cases, training a model NN that can be fitted by a potential function between many bodies can be realized.
  • the training data includes data of two-body potential, but the embodiments of the present disclosure are not limited to this.
  • FIG. 5 is a block diagram showing an example of a training device according to the second embodiment.
  • the training device 1 includes an input unit 100, a storage unit 102, a first training unit 108, a second training unit 110, and an output unit 106. Components with the same reference numerals as in the first embodiment perform the same operations.
  • the first training unit 108 optimizes the first model NN1 by machine learning
  • the second training unit 110 optimizes the second model NN2 by machine learning.
  • the first model NN1 is a neural network model that outputs a two-body potential when the types and distances of two atoms are input.
  • the second model NN2 is a neural network model that, when information such as molecules and crystals is input, outputs information such as energy excluding energy related to the two-body potential between constituent atoms.
  • the first training unit 108 trains the first model NN1 using a data set, input via the input unit 100, of the element types and distance of two atoms and the energy following the two-body potential curve. The first model NN1 is trained by any machine learning method suitable for it. The first training unit 108 completes training of the first model NN1 in advance, before training of the second model NN2 is executed.
  • the second training unit 110 executes training of the second model NN2 in a state where training of the first model NN1 is completed. As with the first training unit 108, the second model NN2 is trained by any machine learning technique suitable for it.
  • the first model NN1 and the second model NN2 are trained at different timings as described above, but this is not the only option.
  • the training device 1 may, for example, train the first model NN1 and the second model NN2 at the same timing.
  • the training device 1 may, for example, input data regarding two atoms in the training data set into the first model NN1, and input data regarding molecules, crystals, etc. in the training data set into the second model NN2.
  • the first model NN1 is trained based on the output of the first model NN1 and the two-body potential, and, along with this, the second model NN2 may be trained based on the sum of the output of the first model NN1 and the output of the second model NN2 and on the training data obtained by first-principles calculation or the like.
  • the first model NN1 is not an essential configuration.
  • the first model NN1 may be replaced with a function that obtains the two-body potential based on the Lennard-Jones function, or other functions and models that can appropriately calculate the two-body potential may be used. This also applies to the estimation device 2 according to this embodiment.
  • FIG. 6 is a flowchart showing processing according to the second embodiment.
  • the training device 1 acquires training data via the input unit 100 (S200).
  • the training data consists of data linking the state of two atoms (the type of each element and the distance between the two atoms) to the two-body potential (including energy and force), as required for training the first model NN1, and data linking information on molecules, crystals, and the like to information including energy and force.
  • the first training unit 108 uses the training data on the two-body potential to train the first model NN1 (S202).
  • the first model NN1 is a neural network model suitable for inputting and outputting information on two-body potentials.
  • the first training unit 108 trains the first model NN1 using a machine learning method suitable for training the first model NN1. Termination conditions and the like can also be arbitrarily determined. For example, the first training unit 108 inputs the information of two atoms to the first model NN1, forward propagates it, and backpropagates the error between the output result and the information of the two-body potential of the two atoms to update the parameters.
  • the second training unit 110 executes the training of the second model NN2 (S204).
  • the second training unit 110 trains the second model NN2 using the optimized output data of the first model NN1 and training data such as molecules and crystals.
  • the second model NN2 is trained so that the sum of the energy output from the first model NN1 and the energy output when information on molecules, crystals, and the like is input to the second model NN2 matches the value of the teacher data (the result of first-principles calculation or the like).
  • the first training unit 108 extracts information on combinations of two atoms from the information on molecules and the like, and acquires the two-body potentials for those atom pairs by forward propagation through the first model NN1. Then, the second training unit 110 inputs the information on molecules and the like to the second model NN2 and forward propagates it. The second training unit 110 computes the error by taking into account, based on the potential function, the information such as energy output from the second model NN2 and the information such as energy output from the first model NN1, and trains the second model NN2 by backpropagating this error.
  • the second training unit 110 may train the second model NN2 using, as teacher data, the difference between the quantity such as energy calculated by first-principles calculation and the quantity such as energy between pairs of atoms constituting the molecule or the like output by the first model NN1.
  • the first model NN1 may calculate the two-body potentials for those pairs of atoms that lie within a predetermined distance, among all combinations of two atoms constituting the molecule or the like, and the sum of these energies and the like may be subtracted from the energy and the like calculated by first-principles calculation and used as teacher data.
  • the predetermined distance can be, for example, a distance over which the two-body potential has an effect, depending on the element types of the two atoms.
  • the quantities such as energy between two atoms constituting the molecule or the like, output from the first model NN1, may be substituted into an arbitrary potential function, and the data obtained by removing the energy due to the two-body potential may be used as teacher data.
  • the two-body potential may be calculated by extracting combinations of two atoms within a predetermined distance from the configuration of the molecule or the like.
  • the second training unit 110 executes training of the second model NN2 based on the potential function, using the energy of the molecule or the like from which the effect of the two-body potential has been removed as teacher data.
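As a non-limiting illustration of building teacher data for the second model NN2, the following Python sketch subtracts the sum of pair energies, taken over atom pairs within a cutoff distance, from a reference energy. The callable `pair_model` stands in for a trained first model NN1; its form, the cutoff value, and the toy structure are assumptions for illustration.

```python
import itertools
import math

def pair_energy_sum(atoms, pair_model, cutoff):
    """Sum pair energies over all atom pairs no farther apart than cutoff."""
    total = 0.0
    for (el_a, pos_a), (el_b, pos_b) in itertools.combinations(atoms, 2):
        r = math.dist(pos_a, pos_b)
        if r <= cutoff:
            total += pair_model(el_a, el_b, r)
    return total

def residual_target(atoms, reference_energy, pair_model, cutoff):
    """Teacher value for NN2: reference energy minus the two-body part."""
    return reference_energy - pair_energy_sum(atoms, pair_model, cutoff)

def toy_pair_model(el_a, el_b, r):
    """Hypothetical stand-in for NN1; ignores element types."""
    return -1.0 / r

# Made-up collinear structure and reference energy.
atoms = [("H", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0))]
target = residual_target(atoms, reference_energy=-5.0,
                         pair_model=toy_pair_model, cutoff=1.5)
```

With this cutoff only the first two atoms form a pair, so the residual is the reference energy with that single pair contribution removed.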
  • the training device 1 trains the first model NN1 for two-body potentials and the second model NN2 for compound potentials such as molecules and crystals.
  • the first model NN1 may be trained in advance on another training device.
  • the training device 1 may train the second model NN2 based on the output result of the first model NN1 without including the first training unit 108.
  • the training device 1 may train the first model NN1 and the second model NN2 in parallel.
  • the processes of S202 and S204 may be executed at the same timing.
  • in the above description, the training of the second model NN2 does not use data on two-body potentials, but the training is not limited to this.
  • the second training unit 110 may use training data on the two-body potential in training the second model NN2.
  • the second training unit 110 may train the second model NN2 so that the energy (and force) becomes 0 when data on two-body potentials, that is, data on two atoms is input.
  • FIG. 7 is a block diagram showing an example of an estimation device according to this embodiment.
  • the estimation device 2 includes an input unit 200, a storage unit 202, an inference unit 204, a calculation unit 206, and an output unit 208.
  • when information about compounds such as molecules and crystals is input, the estimation device 2 estimates and outputs physical quantities such as energy.
  • the estimation device 2 receives input of data necessary for inference via the input unit 200.
  • the input data may be temporarily stored in the storage unit 202, for example. Since the specific operation of the input unit 200 is the same as that of the input unit 100 in the training device 1, detailed description will be omitted.
  • the storage unit 202 stores data necessary for estimation processing in the estimation device 2. Since the operation of this storage unit 202 is also the same as that of the storage unit 102 in the training device 1, detailed description will be omitted.
  • the inference unit 204 uses the first model NN1 and the second model NN2 trained in the training device 1 to infer quantities such as energy from input information such as two atoms and molecules.
  • the inference unit 204 appropriately forward-propagates the input information to the first model NN1 and the second model NN2 for inference.
  • when information on two atoms is input, the inference unit 204 inputs the information on the two atoms into the first model NN1, executes a two-body potential estimation process, and outputs the estimation result of the two-body potential to the calculation unit 206.
  • in this case, the second model NN2 need not be used.
  • when information on three or more atoms, for example, information on molecules, crystals, and the like, is input, the inference unit 204 inputs information on the pairs of atoms that form two-body potentials to the first model NN1 and inputs the information on the molecules, crystals, and the like to the second model NN2. As during the training described above, pairs of atoms within a predetermined distance are extracted from the information on the molecule or the like, the information on these pairs is input to the first model NN1, the information on the molecule or the like itself is input to the second model NN2, and each is forward propagated. The information output from each model is output to the calculation unit 206.
  • the inference unit 204 may additionally input the extracted two-atom information.
  • the calculation unit 206 obtains information on the energy, force, and the like of the whole system based on the information on the two-body potential output from the first model NN1 and the information on the potential of the molecule or the like output from the second model NN2.
  • the calculation unit 206 computes the total energy and the like using the same method as that adopted for the training data of the second model NN2 during training.
  • the calculation unit 206 calculates the sum of the output result of the first model NN1 and the output result of the second model NN2 and outputs it as a quantity such as energy.
  • the calculation unit 206 combines the output from the first model NN1 and the output from the second model NN2 based on the potential function and outputs the result of this operation.
  • the output unit 208 outputs the result calculated by the calculation unit 206 as an estimation result.
  • FIG. 8 is a flowchart showing the processing of the estimation device 2 according to this embodiment.
  • the estimation device 2 acquires data on the molecular structure and the like to be estimated via the input unit 200 (S300).
  • the inference unit 204 extracts data on two atoms from the input data such as the molecular structure (S302). This process may be, for example, a process of extracting all diatomic combinations, or a process of extracting diatomic combinations within a predetermined distance.
  • the inference unit 204 executes potential inference by forward propagating data to the first model NN1 and the second model NN2 (S304).
  • the inference unit 204 inputs data on the two extracted atoms into the first model NN1, inputs data on molecules and the like into the second model NN2, and propagates them forward.
  • the inference unit 204 inputs data to the second model NN2 according to the definition designed during training of the second model NN2.
  • When the outputs from the first model NN1 and the second model NN2 are acquired, the calculation unit 206 appropriately synthesizes the outputs (S306). Similar to the process of S304, the calculation unit 206 executes the synthesis calculation based on the method defined during training.
  • the output unit 208 outputs the result of synthesis by the calculation unit 206 as information such as energy and force (S308).
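  • The flow of S300 to S308 can be sketched roughly as follows, assuming the simple-sum synthesis. The function names `extract_pairs` and `estimate_energy`, the `cutoff` parameter, and the stand-in models `toy_nn1` and `toy_nn2` are hypothetical and are not taken from the embodiment; the stand-ins are not physically meaningful and exist only so that the sketch runs.

```python
import itertools
import math


def extract_pairs(positions, cutoff=None):
    """S302: list index pairs (i, j) with i < j.

    With cutoff=None every diatomic combination is returned; otherwise only
    pairs whose distance is within the cutoff are kept.
    """
    pairs = []
    for i, j in itertools.combinations(range(len(positions)), 2):
        if cutoff is None or math.dist(positions[i], positions[j]) <= cutoff:
            pairs.append((i, j))
    return pairs


def estimate_energy(elements, positions, nn1, nn2, cutoff=None):
    """S304 and S306: forward-propagate both models and combine by a sum."""
    # NN1 receives the element types and 3D coordinates of each extracted pair.
    pair_energy = sum(
        nn1((elements[i], elements[j]), (positions[i], positions[j]))
        for i, j in extract_pairs(positions, cutoff)
    )
    # NN2 receives the structure itself (all element types and coordinates).
    rest_energy = nn2(elements, positions)
    return pair_energy + rest_energy


# Hypothetical stand-ins for the trained models (assumptions, not real potentials).
def toy_nn1(pair_elements, pair_positions):
    return 1.0 / math.dist(pair_positions[0], pair_positions[1])


def toy_nn2(elements, positions):
    return -0.5 * len(elements)


elements = ["H", "O", "H"]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
energy = estimate_energy(elements, positions, toy_nn1, toy_nn2, cutoff=1.5)
```

  • If a combining operation based on a potential function other than a simple sum was used during training, the final line of `estimate_energy` would have to be replaced by that same operation.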
  • FIG. 9 is a diagram showing the flow of data in the data generation device, training device, and estimation device in this embodiment. As in FIG. 3, the dashed line indicates the data generation device, the dotted line indicates the training device, and the solid line indicates the estimation device.
  • the data generation device acquires information such as energy and force from the information of two atoms based on first-principles calculations or potential curves. In addition, the data generation device acquires information such as energy and force from information such as molecules and crystals based on first-principles calculations and the like.
  • the training device 1 trains the first model using diatomic information and diatomic potential information as training data.
  • the training device 1 also uses information on molecules, crystals, etc., and output data of first-principles calculation as training data, and trains the second model using this training data and the output of the first model.
  • the estimating device 2 infers information such as energy that can be expressed by two-body potential and energy that cannot be expressed by two-body potential using the first model and second model from information about substances such as molecules and crystals. Then, the inference results are combined to acquire and output values such as energy.
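  • For the simple-sum case, the decomposition used by the estimation device can be written as below. The symbols are notation introduced here for illustration only: $Z_i$ is the element type of atom $i$, $\mathbf{r}_i$ its 3D coordinates, $N$ the number of atoms, and $r_c$ the predetermined distance used when extracting two atoms.

```latex
E_{\mathrm{total}}
  = \sum_{\substack{i<j \\ \lVert \mathbf{r}_i - \mathbf{r}_j \rVert \le r_c}}
      \mathrm{NN1}\left(Z_i, Z_j, \mathbf{r}_i, \mathbf{r}_j\right)
  + \mathrm{NN2}\left(\{(Z_k, \mathbf{r}_k)\}_{k=1}^{N}\right)
```

  • The first term is the portion expressible by the two-body potential, and the second term is the portion that cannot be expressed by it.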
  • As described above, according to the present embodiment, training of a neural network model that appropriately reflects information such as two-body potentials and the energy of structures such as molecules and of their surrounding conditions can be executed. Further, according to the estimation device of the present embodiment, the energy and the like of a structure such as a molecule, or of its environment, can be inferred by separating it, based on the potential function, into a two-body potential and a potential related to the structure of the molecule or the like, and the inference results can be properly synthesized.
  • The first model NN1 covers the portion that can be represented by the two-body potential.
  • The second model NN2 covers the portion that cannot be represented by the two-body potential.
  • the model NN in Fig. 1, the first model NN1, and the second model NN2 in Fig. 5 to be trained are models composed of a series of calculation definitions and a set of parameters. These models may be, by way of non-limiting example, any neural network model that can be adequately solved. In the case of neural network models, these models constitute the NNP in the estimator.
  • The input to a model that infers at least a two-body potential consists of the element type of each of the two atoms and the pair of their 3D coordinates.
  • The training data is generated by first arbitrarily selecting the types of elements that make up the two atoms and, in this state, obtaining quantities such as energy from a potential curve or a first-principles calculation while changing the distance between the two atoms. Then, by appropriately changing the types of elements that make up the two atoms, potentials between various elements and at various distances are obtained. For example, the two-body potential at various distances may be obtained for all element combinations.
  • This data set is referred to as the first data set.
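  • The generation of the first data set can be sketched as a loop over element pairs and interatomic distances. In this sketch, `reference_pair_energy` is a hypothetical placeholder for the potential curve or first-principles calculation; it uses a Lennard-Jones form with made-up parameters purely so that the code runs, and `build_first_dataset` is likewise an illustrative name.

```python
import itertools


def reference_pair_energy(elem_a, elem_b, distance):
    """Hypothetical placeholder for a potential curve or first-principles result.

    Assumption: a Lennard-Jones form with made-up parameters that ignores the
    element types. A real data generation device would not do this.
    """
    epsilon, sigma = 1.0, 1.0
    ratio = (sigma / distance) ** 6
    return 4.0 * epsilon * (ratio * ratio - ratio)


def build_first_dataset(element_types, distances):
    """Collect (element pair, coordinate pair, energy) records."""
    dataset = []
    # Select the element types of the two atoms (same-element pairs included).
    for elem_a, elem_b in itertools.combinations_with_replacement(element_types, 2):
        # Change the distance between the two atoms while the elements are fixed.
        for d in distances:
            coords = ((0.0, 0.0, 0.0), (d, 0.0, 0.0))
            energy = reference_pair_energy(elem_a, elem_b, d)
            dataset.append(((elem_a, elem_b), coords, energy))
    return dataset


first_dataset = build_first_dataset(["H", "C", "O"], [0.8 + 0.1 * k for k in range(5)])
```

  • Each record pairs the model input (element types and 3D coordinates of two atoms) with the quantity to be learned.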
  • For models that infer at least quantities that cannot be represented by two-body potentials, such as the model NN and the second model NN2, the input is data on the structure of a molecule, a crystal, or the like, for example, a set of the element type and the 3D coordinates of each atom that makes up the molecule.
  • the training data is composed of a set of structural information of these molecules, etc., and quantities such as energies obtained using first-principles calculations such as DFT for these molecules. This data set is referred to as the second data set.
  • the training device 1 trains the model NN using the combined data of the first data set and the second data set as training data.
  • the training device 1 trains the first model NN1 using the first data set as training data, and trains the second model NN2 using the input/output data of the first model NN1 and the second data set.
  • The training device 1 uses, as teacher data, for example, the energy obtained by subtracting, from the energy or the like of the input molecule or the like in the second data set, the sum of the output values obtained by inputting each combination of two atoms in that molecule or the like into the first model NN1.
  • calculations based on potential functions may be performed instead of simple sums.
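  • For the simple-sum case, the teacher data for the second model NN2 can be sketched as a residual. The names `residual_target`, `toy_nn1`, and `second_dataset`, the `cutoff` parameter, and the numerical values are hypothetical and only illustrate the subtraction; `toy_nn1` stands in for a trained first model.

```python
import itertools
import math


def residual_target(elements, positions, total_energy, nn1, cutoff=None):
    """Energy of the structure minus the sum of NN1 outputs over atom pairs."""
    pair_sum = 0.0
    for i, j in itertools.combinations(range(len(positions)), 2):
        # Optionally skip pairs farther apart than the predetermined distance.
        if cutoff is not None and math.dist(positions[i], positions[j]) > cutoff:
            continue
        pair_sum += nn1((elements[i], elements[j]), (positions[i], positions[j]))
    return total_energy - pair_sum


# Hypothetical stand-in for the trained first model NN1.
def toy_nn1(pair_elements, pair_positions):
    return -1.0 / math.dist(pair_positions[0], pair_positions[1])


# One made-up record of the second data set: (elements, coordinates, energy).
second_dataset = [
    (["H", "H", "O"], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)], -3.0),
]
nn2_targets = [
    residual_target(elems, coords, energy, toy_nn1)
    for elems, coords, energy in second_dataset
]
```

  • The second model NN2 would then be trained to map each structure to its entry in `nn2_targets`, so that NN1 and NN2 together reproduce the energy in the second data set.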
  • As described above, it becomes possible to appropriately train one model in the first embodiment and two models in the second embodiment.
  • All of the above trained models may be concepts that include, for example, models that have been trained as described and further distilled by a general method.
  • Each device in the above-described embodiments may be configured by hardware, or may be configured by information processing of software (a program) executed by a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or the like.
  • When configured by software information processing, the software that realizes at least a part of the functions of each device in the above-described embodiments may be stored on a non-transitory storage medium (non-transitory computer-readable medium) such as a flexible disk, a CD-ROM (Compact Disc-Read Only Memory), or a USB (Universal Serial Bus) memory, and read into a computer to execute the software information processing.
  • the software may be downloaded via a communication network.
  • information processing may be performed by hardware by implementing software in a circuit such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array).
  • the type of storage medium that stores software is not limited.
  • the storage medium is not limited to a detachable one such as a magnetic disk or an optical disk, and may be a fixed storage medium such as a hard disk or memory. Also, the storage medium may be provided inside the computer, or may be provided outside the computer.
  • FIG. 10 is a block diagram showing an example of the hardware configuration of each device (training device 1 or estimation device 2) in the above-described embodiment.
  • Each device includes, for example, a processor 71, a main storage device 72 (memory), an auxiliary storage device 73 (memory), a network interface 74, and a device interface 75, and may be implemented as a computer 7 in which these are connected via a bus 76.
  • The computer 7 in FIG. 10 has one of each component, but may have a plurality of the same components. Also, although one computer 7 is shown in FIG. 10, a plurality of computers may be used. In this case, the processing may take the form of distributed computing in which each computer communicates via the network interface 74 or the like to execute processing.
  • Each device (training device 1 or estimation device 2) in the above-described embodiments may be configured as a system that realizes functions by one or more computers executing instructions stored in one or more storage devices. Alternatively, the information transmitted from a terminal may be processed by one or more computers provided on the cloud, and the processing result may be transmitted to the terminal.
  • The operations of each device in the above-described embodiments may be executed in parallel using one or more processors or using multiple computers via a network. Also, various operations may be distributed to a plurality of operation cores in the processor and executed in parallel. Also, part or all of the processing, means, etc. of the present disclosure may be executed by at least one of a processor and a storage device provided on a cloud capable of communicating with the computer 7 via a network. Thus, each device in the above-described embodiments may be in the form of parallel computing by one or more computers.
  • the processor 71 may be an electronic circuit (processing circuit, processing circuitry, CPU, GPU, FPGA, ASIC, etc.) including a computer control device and arithmetic device. Also, the processor 71 may be a semiconductor device or the like including a dedicated processing circuit. The processor 71 is not limited to an electronic circuit using electronic logic elements, and may be realized by an optical circuit using optical logic elements. Also, the processor 71 may include arithmetic functions based on quantum computing.
  • the processor 71 can perform arithmetic processing based on the data and software (programs) input from each device, etc. of the internal configuration of the computer 7, and output the arithmetic result and control signal to each device, etc.
  • the processor 71 may control each component of the computer 7 by executing the OS (Operating System) of the computer 7, applications, and the like.
  • Each device (the training device 1 and/or the estimation device 2) in the above-described embodiments may be realized by one or more processors 71.
  • The processor 71 may refer to one or more electronic circuits arranged on one chip, or may refer to one or more electronic circuits arranged on two or more chips or two or more devices. When multiple electronic circuits are used, each electronic circuit may communicate by wire or wirelessly.
  • the main storage device 72 is a storage device that stores instructions and various data to be executed by the processor 71 , and the information stored in the main storage device 72 is read by the processor 71 .
  • Auxiliary storage device 73 is a storage device other than main storage device 72 . These storage devices mean any electronic components capable of storing electronic information, and may be semiconductor memories. The semiconductor memory may be either volatile memory or non-volatile memory.
  • A storage device for storing various data in each device (training device 1 or estimation device 2) in the above-described embodiments may be realized by the main storage device 72 or the auxiliary storage device 73, or may be realized by an internal memory built into the processor 71.
  • the storage unit 102 in the above-described embodiment may be realized by the main storage device 72 or the auxiliary storage device 73.
  • A plurality of processors may be connected (coupled) to one storage device (memory), or a single processor may be connected.
  • a plurality of storage devices (memories) may be connected (coupled) to one processor.
  • When each device (training device 1 or estimation device 2) in the above-described embodiments is composed of at least one storage device (memory) and a plurality of processors connected (coupled) to this at least one storage device (memory), it may include a configuration in which at least one of the plurality of processors is connected (coupled) to the at least one storage device (memory).
  • this configuration may be implemented by storage devices (memories) and processors included in a plurality of computers.
  • A configuration in which a storage device (memory) is integrated with a processor (for example, a cache memory including an L1 cache and an L2 cache) may also be included.
  • the network interface 74 is an interface for connecting to the communication network 8 wirelessly or by wire. As for the network interface 74, an appropriate interface such as one conforming to existing communication standards may be used. The network interface 74 may exchange information with the external device 9A connected via the communication network 8.
  • The communication network 8 may be any one of a WAN (Wide Area Network), a LAN (Local Area Network), a PAN (Personal Area Network), etc., or a combination thereof, as long as information can be exchanged between the computer 7 and the external device 9A. Examples of WAN include the Internet, examples of LAN include IEEE802.11 and Ethernet (registered trademark), and examples of PAN include Bluetooth (registered trademark) and NFC (Near Field Communication).
  • the device interface 75 is an interface such as USB that directly connects with the external device 9B.
  • the external device 9A is a device connected to the computer 7 via a network.
  • External device 9B is a device that is directly connected to computer 7 .
  • the external device 9A or the external device 9B may be an input device.
  • the input device is, for example, a device such as a camera, microphone, motion capture, various sensors, a keyboard, a mouse, or a touch panel, and provides the computer 7 with acquired information.
  • a device such as a personal computer, a tablet terminal, or a smartphone including an input unit, a memory, and a processor may be used.
  • the external device 9A or the external device 9B may be, for example, an output device.
  • the output device may be, for example, a display device such as LCD (Liquid Crystal Display), CRT (Cathode Ray Tube), PDP (Plasma Display Panel), or organic EL (Electro Luminescence) panel.
  • a speaker or the like for output may be used.
  • a device such as a personal computer, a tablet terminal, or a smartphone including an output unit, a memory, and a processor may be used.
  • the external device 9A or the external device 9B may be a storage device (memory).
  • the external device 9A may be a network storage or the like, and the external device 9B may be a storage such as an HDD.
  • the external device 9A or the external device 9B may be a device having the functions of some of the components of each device (the training device 1 or the estimation device 2) in the above-described embodiments. That is, the computer 7 may transmit or receive part or all of the processing results of the external device 9A or the external device 9B.
  • The expression "at least one (one) of a, b and c" or "at least one (one) of a, b or c" includes any of a, b, c, a-b, a-c, b-c, or a-b-c. Also, multiple instances of any element may be included, such as a-a, a-b-b, a-a-b-b-c-c, and so on. It also includes the addition of elements other than the listed elements (a, b and c), such as having d as in a-b-c-d.
  • When the terms "connected" and "coupled" are used, they are intended as terms that include any of direct connection/coupling, indirect connection/coupling, electrical connection/coupling, communicative connection/coupling, operative connection/coupling, physical connection/coupling, and the like.
  • The terms should be interpreted appropriately according to the context in which they are used, but any form of connection/coupling that is not intentionally or naturally excluded should be interpreted as included in the terms, without limitation.
  • The expression that element A is configured to perform operation B may include that the physical structure of element A has a configuration capable of performing operation B, and that a permanent or temporary setting/configuration of element A is configured/set to actually perform operation B.
  • For example, when element A is a general-purpose processor, it is sufficient that the processor has a hardware configuration capable of executing operation B and is configured to actually execute operation B by setting a permanent or temporary program (instructions).
  • Further, when element A is a dedicated processor, a dedicated arithmetic circuit, or the like, it is sufficient that the circuit structure of the processor is implemented so as to actually execute operation B, regardless of whether or not control instructions and data are actually attached.
  • Terms relating to optimization include finding a global optimum, finding an approximation of a global optimum, finding a local optimum, and finding an approximation of a local optimum, and should be interpreted appropriately depending on the context in which the term is used. They also include stochastically or heuristically approximating these optimum values.
  • When multiple pieces of hardware perform predetermined processing, the pieces of hardware may work together to perform the predetermined processing, or some of the hardware may perform all of the predetermined processing. Also, some hardware may perform a part of the predetermined processing, and other hardware may perform the rest of the predetermined processing.
  • the hardware that performs the first process and the hardware that performs the second process may be the same or different. In other words, the hardware that performs the first process and the hardware that performs the second process may be included in the one or more pieces of hardware.
  • hardware may include an electronic circuit or a device including an electronic circuit.

PCT/JP2022/023520 2021-06-11 2022-06-10 Estimation device, training device, estimation method, training method, program, and non-transitory computer-readable medium Ceased WO2022260177A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2023527952A JP7457877B2 (ja) 2021-06-11 2022-06-10 Estimation device, training device, method, and program
US18/534,252 US20240111998A1 (en) 2021-06-11 2023-12-08 Inferring device, inferring method, and training device
JP2024040534A JP2024075646A (ja) 2021-06-11 2024-03-14 Estimation device, training device, estimation method, training method, program, and non-transitory computer-readable medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-098325 2021-06-11
JP2021098325 2021-06-11

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/534,252 Continuation US20240111998A1 (en) 2021-06-11 2023-12-08 Inferring device, inferring method, and training device

Publications (1)

Publication Number Publication Date
WO2022260177A1 (ja) 2022-12-15

Family

ID=84424617

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/023520 Ceased WO2022260177A1 (ja) 2021-06-11 2022-06-10 Estimation device, training device, estimation method, training method, program, and non-transitory computer-readable medium

Country Status (3)

Country Link
US (1) US20240111998A1
JP (2) JP7457877B2
WO (1) WO2022260177A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2026074898A1 (ja) * 2024-10-01 2026-04-09 富士通株式会社 Information processing program, information processing method, and information processing device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2026046431A (ja) * 2024-09-02 2026-03-13 富士通株式会社 Information processing program, information processing method, and information processing device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
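JP2020030638A (ja) * 2018-08-23 2020-02-27 パナソニックIpマネジメント株式会社 Material information output method, material information output device, material information output system, and program
JP2020166706A (ja) * 2019-03-29 2020-10-08 株式会社クロスアビリティ Crystal form prediction device, crystal form prediction method, neural network manufacturing method, and program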

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11455439B2 (en) * 2018-11-28 2022-09-27 Robert Bosch Gmbh Neural network force field computational algorithms for molecular dynamics computer simulations
KR102260838B1 (ko) * 2019-04-16 2021-06-07 한국과학기술연구원 Method for calculating the surface energy of a crystal using an atomic artificial neural network

Also Published As

Publication number Publication date
JP2024075646A (ja) 2024-06-04
JPWO2022260177A1 2022-12-15
JP7457877B2 (ja) 2024-03-28
US20240111998A1 (en) 2024-04-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22820355; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2023527952; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22820355; Country of ref document: EP; Kind code of ref document: A1)