CN116959600A - Molecular state prediction method, device and storage medium - Google Patents

Molecular state prediction method, device and storage medium

Info

Publication number
CN116959600A
Authority
CN
China
Prior art keywords
molecule
atom
vector
time
system state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310913924.XA
Other languages
Chinese (zh)
Inventor
荣钰
李佳
刘阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202310913924.XA
Publication of CN116959600A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 - Identification of molecular entities, parts thereof or of chemical compositions
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 - Machine learning, data mining or chemometrics
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C60/00 - Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a molecular state prediction method, a molecular state prediction apparatus, a molecular state prediction device, and a computer-readable storage medium. The molecular state prediction method comprises the following steps: receiving a system state of a molecule at a first time, wherein the molecule comprises one or more atoms, the system state at the first time comprising a position vector, a velocity vector, and a feature vector of each atom in the molecule at the first time, and a relationship vector characterizing a physical relationship between atoms in the molecule; based on the system state at the first moment, predicting an acceleration vector of each atom in the molecule at the first moment by using a graph neural network model; generating a system state of the molecule at a second moment based on the acceleration vector of each atom in the molecule at the first moment and the system state at the first moment; and performing a plurality of iterative computations to generate a system state of the molecule at a predetermined time.

Description

Molecular state prediction method, device and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to a molecular state prediction method, a molecular state prediction apparatus, a molecular state prediction device, and a computer-readable storage medium.
Background
Molecular dynamics is a cutting-edge technology at the intersection of mathematics, physics, chemistry, biology and other disciplines; based on the principles of Newtonian mechanics and statistical mechanics, it studies the motion and interaction of molecules under different conditions through numerical simulation. Conventional molecular dynamics simulation relies on a computer to construct a molecular model, takes classical mechanics, quantum mechanics and statistical mechanics as its theoretical basis, and solves the model numerically to obtain physicochemical data of the molecular system, thereby simulating and studying the structure and properties of the molecular system; it has been widely applied in chemistry and chemical engineering, materials science and engineering, physics, biomedicine and other scientific and technical fields. However, conventional molecular dynamics simulation requires an accurate molecular model to be built, the calculation of the molecular force field is very time-consuming, and the fitting of simulation parameters is very complex; in particular, when simulating at large biological scales, the amount of computation is enormous.
With the rapid development of computer technology, machine learning plays an increasingly important role in many fields and provides new ideas for molecular dynamics simulation. Machine learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects; it studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence, and it is applied throughout the various fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Disclosure of Invention
The present disclosure provides a molecular state prediction method, a molecular state prediction apparatus, a molecular state prediction device, and a computer-readable storage medium.
According to an aspect of the embodiments of the present disclosure, there is provided a molecular state prediction method, including: receiving a system state of a molecule at a first time, the molecule comprising one or more atoms, the system state at the first time comprising a position vector, a velocity vector, and a feature vector of each atom in the molecule at the first time, and a relationship vector characterizing an inter-atomic physical relationship in the molecule; predicting an acceleration vector of each atom in the molecule at the first moment by using a graph neural network model based on the system state at the first moment; generating a system state of the molecule at a second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time; and performing a plurality of iterative computations to generate a system state of the molecule at a predetermined time.
According to an example of an embodiment of the present disclosure, wherein the feature vector comprises a vector characterizing a non-geometric feature of each atom in the molecule, the non-geometric feature comprising one or more of atomic number, atomic mass, historical trajectory information, and the relationship vector comprises an edge vector characterizing whether a distance between any two atoms in the molecule is less than a predetermined threshold.
According to an example of an embodiment of the present disclosure, wherein performing a plurality of iterative computations to generate a system state of the molecule at a predetermined time comprises: predicting an acceleration vector of each atom in the molecule at the second moment by using a graph neural network model based on the system state at the second moment; and generating a system state of the molecule at a third time based on the acceleration vector of each atom in the molecule at the second time and the system state at the second time.
According to an example of an embodiment of the present disclosure, wherein predicting the acceleration vector of each atom in the molecule at the first time instant using a graph neural network model based on the system state at the first time instant comprises, for each atom in the molecule: calculating a message vector between the atom and another atom in the molecule, the message vector characterizing an interaction between the atom and the other atom; an acceleration vector of the atom at the first time is determined based on the message vectors between the atom and all other atoms in the molecule.
According to an example of an embodiment of the present disclosure, wherein calculating a message vector between the atom and another atom in the molecule comprises: a message vector between the atom and another atom in the molecule is calculated based on the position vector and the feature vector of the atom, the position vector and the feature vector of the other atom, and an edge vector between the atom and the other atom.
According to an example of an embodiment of the present disclosure, the molecular state prediction method further includes: based on the message vector between the atom and all other atoms in the molecule and the feature vector of the atom at the first time, a feature vector of the atom at a second time is generated.
According to an example of an embodiment of the present disclosure, wherein generating the system state of the molecule at the second time based on the acceleration vector of each atom in the molecule at the first time and the system state of the first time comprises, for each atom in the molecule: generating a velocity vector of the atom at the second moment based on the acceleration vector and the velocity vector of the atom at the first moment; and generating a position vector of the atom at the second time based on the velocity vector of the atom at the second time and the position vector of the atom at the first time.
According to an example of an embodiment of the present disclosure, wherein generating the system state of the molecule at the second time based on the acceleration vector of each atom in the molecule at the first time and the system state of the first time comprises, for each atom in the molecule: generating a velocity vector of the atom at the second moment based on the acceleration vector and the velocity vector of the atom at the first moment; and generating a position vector of the atom at the second moment based on the velocity vector and the position vector of the atom at the first moment.
According to another aspect of the embodiments of the present disclosure, there is provided a training method of a graph neural network model for molecular state prediction, including: receiving a system state of a molecule at a first training moment, the molecule comprising one or more atoms, the system state at the first training moment comprising a position vector, a velocity vector and a feature vector of each atom in the molecule at the first training moment and a relationship vector characterizing a physical relationship between atoms in the molecule; predicting an acceleration vector of each atom in the molecule at the first training moment by using a graph neural network model based on the system state at the first training moment; generating a system state of the molecule at a second training time based on the acceleration vector of each atom in the molecule at the first training time and the system state of the first training time; performing a plurality of iterative computations to generate a predicted system state of the molecule at a predetermined training time; and training the graph neural network model based on the predicted system state and the real system state of the molecule at the preset training moment.
According to an example of an embodiment of the present disclosure, wherein training the graph neural network model based on the predicted system state and the actual system state of the molecule at the predetermined training time comprises: calculating a loss function based on the predicted system state and the actual system state of the molecule at the predetermined training moment; the graph neural network model is trained by minimizing the loss function.
According to an example of embodiment of the present disclosure, the real system state is obtained based on a system state of the molecule at the first training moment using a molecular dynamics numerical model.
According to another aspect of the embodiments of the present disclosure, there is provided a molecular state prediction apparatus, the apparatus including: an input unit configured to receive a system state of a molecule at a first time, the molecule comprising one or more atoms, the system state at the first time comprising a position vector, a velocity vector, and a feature vector of each atom in the molecule at the first time, and a relationship vector characterizing a physical relationship between atoms in the molecule; a processing unit configured to predict an acceleration vector of each atom in the molecule at the first time using a graph neural network model based on the system state at the first time, generate a system state of the molecule at a second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time, and perform a plurality of iterative computations to generate the system state of the molecule at a predetermined time; and an output unit configured to output a system state of the molecule at the predetermined timing.
According to another aspect of the embodiments of the present disclosure, there is provided a molecular state prediction apparatus including: one or more processors; and one or more memories, wherein the memories have stored therein computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the methods described in the various aspects above.
According to another aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer-readable instructions, which when executed by a processor, cause the processor to perform a method according to any of the above aspects of the present disclosure.
According to another aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer readable instructions which, when executed by a processor, cause the processor to perform a method as in any of the above aspects of the present disclosure.
The molecular state prediction method according to the embodiments of the present disclosure is based on a graph neural network and a numerical algorithm, and combines the geometric structure of the molecule with physical prior knowledge (such as classical mechanics, e.g. Newton's laws) to provide an efficient molecular dynamics model. Because the dynamic process is described in the form of an ordinary differential equation (rather than by the graph neural network itself), the modeling of the acceleration of the molecular system is decoupled from the modeling of the dynamic process; this reduces the number of model parameters, lowers the risk of overfitting, improves the flexibility of the model, and improves its generalization. In addition, the interpretability of the ordinary differential equation better reflects the physical laws of molecular dynamics and also improves the interpretability of the model.
Drawings
The above and other objects, features and advantages of the presently disclosed embodiments will become more apparent from the more detailed description of the presently disclosed embodiments when taken in conjunction with the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 illustrates an exemplary scenario diagram of a molecular state prediction system according to an embodiment of the present disclosure.
FIG. 2 illustrates a flow chart of a molecular state prediction method according to an embodiment of the present disclosure.
Fig. 3 shows a schematic diagram of computing an atomic acceleration vector and further updating an atomic velocity vector and a position vector according to an embodiment of the present disclosure.
Fig. 4 shows an overall flow of a molecular state prediction method according to an embodiment of the present disclosure.
Fig. 5 shows a flowchart of a method of training a graph neural network model for molecular state prediction, according to an embodiment of the present disclosure.
Fig. 6 shows a schematic structural diagram of a molecular state prediction apparatus according to an embodiment of the present disclosure.
Fig. 7 illustrates a schematic diagram of an architecture of an exemplary computing device, according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings of the embodiments of the present disclosure. It will be apparent that the described embodiments are only some, not all, of the embodiments of the present disclosure. Based on the embodiments in the present disclosure, all other embodiments obtained by a person of ordinary skill in the art without inventive effort fall within the scope of protection of the present disclosure.
As used in this disclosure and in the claims, the terms "a", "an" and "the" do not denote the singular and may include the plural, unless the context clearly indicates otherwise. The terms "first", "second" and the like used in this disclosure do not denote any order, quantity or importance, but are merely used to distinguish one element from another. Likewise, words such as "comprising" or "including" mean that the element or item preceding the word encompasses the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connected" or "coupled" are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect.
Further, flowcharts are used in this disclosure to describe the operations performed by the system according to embodiments of the present disclosure. It should be understood that the preceding or following operations are not necessarily performed precisely in order; rather, the various steps may be processed in reverse order or simultaneously, other operations may be added to these processes, or one or more steps may be removed from them.
The key to molecular dynamics simulation is modeling the complex interactions between the many atoms in the system. Existing deep-learning-based molecular dynamics simulation mainly uses a graph neural network to model the trajectory of the whole molecular system, for example by adopting an equivariant graph neural network. Equivariant graph neural networks, which attempt to encode physical symmetries into the graph neural network so as to ensure translational/rotational/reflective equivariance between geometric inputs and outputs, have become one of the primary approaches to modeling molecular dynamics systems.
Most prior equivariant graph neural network studies regard the dynamic process as a series of discrete states, including the position, velocity and force of each object; then, given an input state, the model maps that state directly to the next state at a fixed time interval. That is, existing molecular dynamics simulation based on equivariant graph neural networks simulates the static and dynamic processes of the molecular system in an integrated way, and can be regarded as a black box that receives an input molecular state and outputs the molecular state at a predetermined time. Such methods can therefore have the following disadvantages: (1) model overfitting: because a large number of parameters are needed to fit a complex data distribution, the overfitting problem easily occurs; (2) lack of interpretability: it is difficult to explain the internal mechanism of the model and the modeling process; (3) the structure of existing graph-neural-network-based models is usually fixed, is difficult to combine with physical prior knowledge, and cannot adapt to different physical systems or make predictions under different conditions; (4) lack of spatio-temporal coupling: in molecular dynamics simulation the positions and velocities of the molecules evolve over time, so the coupling between space and time must be considered simultaneously, whereas existing graph-neural-network-based models often blur the modeling of time and space, so that the prediction results cannot be explained and the generalization of the model is affected.
In view of the above problems, the present disclosure provides a method for predicting a molecular state, which is based on a graph neural network and a neural ordinary differential equation, combines geometric structure and physical prior knowledge of a molecule, and can enhance model interpretability, improve model flexibility and increase generalization performance of a model.
FIG. 1 illustrates an exemplary scenario diagram of a molecular state prediction system according to an embodiment of the present disclosure. As shown in fig. 1, the molecular state prediction system 100 may include a user terminal 110, a network 120, a server 130, and a database 140.
The user terminal 110 may be, for example, a computer 110-1, a mobile phone 110-2 as shown in fig. 1. It will be appreciated that in fact, user terminal 110 may be any other type of electronic device capable of performing data processing, which may include, but is not limited to, a fixed terminal such as a desktop computer, smart television, etc., a mobile terminal such as a smart phone, tablet, portable computer, handheld device, etc., or any combination thereof, to which embodiments of the present disclosure are not particularly limited.
The user terminal 110 according to an embodiment of the present disclosure may be configured to receive an initial system state of a molecule and predict the state of the molecule system at a predetermined time using the molecular state prediction method provided by the present disclosure. In some embodiments, the molecular state prediction methods provided by the present disclosure may be performed using a processing unit of the user terminal 110. In some implementations, the user terminal 110 may perform the molecular state prediction methods provided by the present disclosure using an application built into the user terminal. In other implementations, the user terminal 110 may perform the molecular state prediction methods provided by the present disclosure by invoking an application program stored external to the user terminal.
In other embodiments, the user terminal 110 transmits the received initial system state of the molecule to be processed to the server 130 via the network 120, and the server 130 performs the molecular state prediction method. In some implementations, the server 130 may perform the molecular state prediction method using an application built into the server. In other implementations, the server 130 may perform the molecular state prediction method by invoking an application stored external to the server.
Network 120 may be a single network or a combination of at least two different networks. For example, network 120 may include, but is not limited to, one or a combination of several of a local area network, a wide area network, a public network, a private network, and the like. The server 130 may be an independent server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, location services, basic cloud computing services such as big data and an artificial intelligence platform, which is not particularly limited in the embodiments of the present disclosure.
Database 140 may refer broadly to a device having a storage function. The database 140 is mainly used to store the various data utilized, generated and output in the operation of the user terminal 110 and the server 130. Database 140 may be local or remote. The database 140 may include various memories, such as random access memory (RAM), read-only memory (ROM), and the like. The storage devices mentioned above are merely examples, and the storage devices that the system may use are not limited to these. Database 140 may be interconnected or in communication with the server 130 or a portion thereof via the network 120, directly interconnected or in communication with the server 130, or a combination of the two.
A molecular state prediction method according to an embodiment of the present disclosure is described below with reference to fig. 2. Fig. 2 shows a flow chart of a molecular state prediction method 200 according to an embodiment of the present disclosure.
In step S210, the system state of the molecule at the first time is received. In embodiments of the present disclosure, a molecule is an entity of constituent atoms joined together in a certain bonding order and spatial arrangement, and may include one or more atoms. The molecule may be a monatomic molecule, such as a helium molecule (He) or an argon molecule (Ar); a diatomic molecule, such as an oxygen molecule (O₂) or a carbon monoxide molecule (CO); or a polyatomic molecule, such as a water molecule (H₂O), a carbon dioxide molecule (CO₂), or a more complex molecule composed of more atoms, such as a high-molecular-weight compound. The geometry of the molecule and the number of atoms it includes are not particularly limited in the embodiments of the present disclosure. In examples of embodiments of the present disclosure, a molecule having N atoms may be represented as $P = \{P_i\}$, $i = 1 \ldots N$, where N is a positive integer greater than or equal to 1.
In embodiments of the present disclosure, the system state of the molecule at the first time characterizes the state of each atom in the molecule at the first time. In particular, the system state of the molecule at the first time may include a position vector, a velocity vector and a feature vector of each atom in the molecule at the first time, and a relationship vector characterizing the physical relationship between atoms in the molecule. For example, the system state of the molecule at the first time $t$ may be expressed as $(G, q^t, \dot{q}^t, h^t)$, where the relationship vector $G = \{P, E\}$ represents the physical relationship between the atoms in the molecule $P$; the edge vector $E$ indicates whether an edge exists between any two atoms in the molecule, or in other words, whether there is a physical interaction between them that needs to be considered. For example, if the distance between two atoms is less than a predetermined threshold, or alternatively the physical interaction between the two atoms is greater than a predetermined threshold, an edge may be considered to exist between the two atoms; otherwise, no edge is considered to exist between them. Here, the predetermined threshold used to determine the edge vector may be determined based on empirical parameters from physical prior knowledge, for example based on known molecular structures, molecular force fields and the like, which is not particularly limited in the embodiments of the present disclosure. The position vector $q$ represents the position of each atom in the molecule at the first time $t$. The velocity vector $\dot{q}$, the first derivative of $q$, represents the velocity of each atom in the molecule at the first time $t$. The feature vector $h$ represents the non-geometric features of each atom in the molecule, such as atomic number, atomic mass, historical trajectory information (e.g., historical velocity, historical position, etc.), and so on.
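As an illustrative aid only (not part of the patent disclosure), the following Python sketch shows one possible in-memory representation of the system state $(G, q^t, \dot{q}^t, h^t)$ and of the distance-threshold edge construction described above; the class name, field names, feature dimension and the use of NumPy are assumptions made purely for illustration.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class MolecularSystemState:
    """Illustrative container for the system state of a molecule at one time instant."""
    q: np.ndarray       # (N, 3) position vector of each atom
    q_dot: np.ndarray   # (N, 3) velocity vector of each atom
    h: np.ndarray       # (N, F) non-geometric feature vector of each atom
    edges: list = field(default_factory=list)  # (i, j) index pairs forming the edge set E

def build_edges(q: np.ndarray, threshold: float) -> list:
    """Create an edge between any two atoms whose distance is below the predetermined threshold."""
    n = q.shape[0]
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and np.linalg.norm(q[i] - q[j]) < threshold]
```

In this sketch the edge list plays the role of the edge vector $E$ of the relationship vector $G$, and the threshold would be chosen from physical prior knowledge as described above.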
In step S220, based on the system state at the first time, an acceleration vector of each atom in the molecule at the first time is predicted using the graph neural network model. In the disclosed embodiments, the second derivative of the atomic positions of the molecule (i.e., the acceleration) may be modeled using a graph neural network model; more specifically, a graph neural network model may be constructed to predict the acceleration of each atom in the molecule based on the input molecular system state. The graph neural network model according to embodiments of the present disclosure may be constructed, for example, based on an equivariant graph neural network, a geometrically constrained graph neural network, or any other suitable graph neural network, which is not particularly limited in the embodiments of the present disclosure. The constructed graph neural network model can be represented, for example, by the following equation (1):

$\ddot{q} = f(q, h)$  (1)

where $\ddot{q}$, the second derivative of the atomic position vector $q$, is the acceleration vector, and $f$ denotes a graph neural network that, based on the input molecular system state, maps the position vector $q$ and the feature vector $h$ of the atoms to acceleration vectors.
In examples of embodiments of the present disclosure, a message-passing graph neural network may be employed to model and solve for the acceleration vectors of the atoms. A graph neural network is a deep learning method based on a graph structure, where the graph structure is a data structure composed of nodes and edges, and message passing is a way of aggregating information from neighboring nodes to update the information of a central node. In examples of embodiments of the present disclosure, the information of neighboring atoms may be used to calculate the acceleration vector of a central atom. Specifically, for each atom in the molecule, a message vector between that atom and another atom in the molecule may be calculated, where the message vector characterizes the interaction between the two atoms. For example, at the first time $t$, the message vector $m_{ij}$ between the $i$-th atom and the $j$-th atom in the molecule can be calculated by the following equation (2):

$m_{ij} = \phi_e(q_i, q_j, h_i, h_j, a_{ij})$  (2)

where $q_i$ and $h_i$ are the position vector and feature vector of the $i$-th atom; $q_j$ and $h_j$ are the position vector and feature vector of the $j$-th atom; $a_{ij}$ denotes the edge vector between them; and $\phi_e$ denotes a multi-layer perceptron for generating message vectors, which generates the message vector between the $i$-th atom and the $j$-th atom based on the position vector and feature vector of the $i$-th atom, the position vector and feature vector of the $j$-th atom, and the edge vector between the two atoms.
As mentioned above, if the distance between the $i$-th atom and the $j$-th atom is less than the predetermined threshold, an edge can be considered to exist between the two atoms; $a_{ij}$ then denotes the edge vector between them, which may be, for example, the distance between the two atoms, and the message vector between the two atoms is calculated based on this edge vector. Otherwise, if the distance between the $i$-th atom and the $j$-th atom is greater than the predetermined threshold, it may be considered that no edge exists between them; in other words, the interaction between the two atoms is negligible, and in this case the message vector between them may not be calculated, or may be regarded as 0.
The process of generating a message vector between two atoms can be regarded as the process by which the interaction between them produces acceleration. For the $i$-th atom, computing its final acceleration vector requires aggregating the message vectors between it and all other atoms in the molecule. That is, the acceleration vector of an atom at the first time may be determined based on the message vectors between that atom and all other atoms in the molecule. For example, the acceleration vector of the $i$-th atom at the first time $t$ can be calculated by the following equation (3):

$\ddot{q}_i^t = \phi_q\left(\sum_{j \in N_i} m_{ij}\right)$  (3)

where $N$ is the number of atoms in the molecule; $N_i$ denotes the set of all atoms in the molecule other than the $i$-th atom, that is, $j = 1 \ldots i-1, i+1 \ldots N$; and $\phi_q$ denotes a multi-layer perceptron for aggregating message vectors.
Furthermore, in order to increase the expressive power of the model, the feature vector of each atom may be updated in each iteration of the model (i.e., each time the acceleration vectors are calculated). In particular, the feature vector of an atom at the second time may be generated based on the message vectors between that atom and all other atoms in the molecule and the feature vector of that atom at the first time. For example, the atomic feature vector updated through one iteration may be generated by the following equation (4):

$h_i^{t+\Delta t} = \phi_h\left(h_i^t, \sum_{j \in N_i} m_{ij}\right)$  (4)

where $h_i^t$ denotes the feature vector of the atom at the first time $t$; $h_i^{t+\Delta t}$ denotes the feature vector of the atom at the second time $t+\Delta t$, with $\Delta t$ denoting the time interval of each iteration; and $\phi_h$ denotes a multi-layer perceptron for updating the atomic feature vectors. The feature vector of the atom at the second time, updated by equation (4) above, can be used to calculate the acceleration vector of the atom at the second time in the next iterative computation.
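Purely for illustration (and not as the patented implementation), the following sketch shows one way equations (2)-(4) could be realized in Python; the three multi-layer perceptrons $\phi_e$, $\phi_q$ and $\phi_h$ are passed in as generic callables, and the message dimension msg_dim is an assumed parameter.

```python
import numpy as np

def predict_accelerations(q, h, edges, a, phi_e, phi_q, phi_h, msg_dim):
    """One message-passing round of the graph neural network (equations (2)-(4)), as a sketch.

    q: (N, 3) positions at time t; h: (N, F) feature vectors at time t.
    edges: list of (i, j) pairs; a: dict mapping (i, j) to the edge vector a_ij.
    phi_e, phi_q, phi_h: callables standing in for the three multi-layer perceptrons.
    Returns the acceleration vectors (N, 3) and the updated feature vectors (N, F).
    """
    n = q.shape[0]
    agg = np.zeros((n, msg_dim))
    for (i, j) in edges:
        # Message vector between atom i and its neighbour j (equation (2)), summed over N_i.
        agg[i] += phi_e(q[i], q[j], h[i], h[j], a[(i, j)])
    # Aggregated messages mapped to one acceleration vector per atom (equation (3)).
    q_ddot = np.stack([phi_q(agg[i]) for i in range(n)])
    # Feature vector of each atom updated for the next iteration (equation (4)).
    h_next = np.stack([phi_h(h[i], agg[i]) for i in range(n)])
    return q_ddot, h_next
```

In a practical model the callables would be trained neural networks; here they are left abstract so that the sketch only mirrors the structure of equations (2)-(4).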
For each atom in the molecule, its acceleration vector at the first time can thus be calculated using equations (2)-(3) above, and its feature vector can be updated using equation (4) above. At this point, the acceleration vector of each atom in the molecule at the first time has been predicted using the graph neural network model based on the system state of the molecule at the first time.
Next, in step S230, the system state of the molecule at the second time may be generated based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time. In the disclosed embodiments, the system state of the molecule at the second time is generated by combining the atomic accelerations predicted by the graph neural network model with a numerical algorithm that updates the atomic velocities and positions. Numerical algorithms are conventional methods for solving ordinary differential equations; when simulating physical motion processes, common numerical algorithms include, for example, the Euler method, the semi-implicit Euler method, Verlet integration, the Runge-Kutta method, and the like. Embodiments of the present disclosure may employ any of these numerical algorithms to calculate the velocity vectors and position vectors of the atoms based on the atomic acceleration vectors. According to an example of an embodiment of the present disclosure, the numerical algorithm for calculating the velocity vector and position vector of an atom may be implemented in the form of a neural ordinary differential equation and, together with the graph neural network for predicting atomic accelerations, constitute a molecular dynamics model capable of predicting molecular states.
According to one example of an embodiment of the present disclosure, for each atom in the molecule, the velocity vector of the atom at the second time may be generated based on the acceleration vector and the velocity vector of the atom at the first time, and the position vector of the atom at the second time may be generated based on the velocity vector and the position vector of the atom at the first time. This process calculates the velocity vector and position vector of the atom based on the Euler method, and can be expressed, for example, by the following equations (5)-(6):

$\dot{q}_i^{t+\Delta t} = \dot{q}_i^t + \ddot{q}_i^t \Delta t$  (5)

$q_i^{t+\Delta t} = q_i^t + \dot{q}_i^t \Delta t$  (6)

where $\dot{q}_i^t$ denotes the velocity vector of the $i$-th atom at the first time $t$; $\ddot{q}_i^t$ denotes the acceleration vector of the $i$-th atom at the first time $t$, which may be obtained, for example, through step S220 described above; $\dot{q}_i^{t+\Delta t}$ denotes the velocity vector of the $i$-th atom at the second time $t+\Delta t$; $q_i^t$ denotes the position vector of the $i$-th atom at the first time $t$; and $q_i^{t+\Delta t}$ denotes the position vector of the $i$-th atom at the second time $t+\Delta t$.
According to another example of an embodiment of the present disclosure, for each atom in the molecule, the velocity vector of the atom at the second time may be generated based on the acceleration vector and the velocity vector of the atom at the first time, and the position vector of the atom at the second time may be generated based on the velocity vector of the atom at the second time and the position vector of the atom at the first time. This process calculates the velocity vector and position vector of the atom based on the semi-implicit Euler method, and can be expressed, for example, by equation (5) together with the following equation (7):

$q_i^{t+\Delta t} = q_i^t + \dot{q}_i^{t+\Delta t} \Delta t$  (7)

The process of updating the state of the molecular system based on the Euler method and the semi-implicit Euler method has been described above by way of example, but it should be understood that embodiments of the present disclosure are not limited thereto, and any other suitable numerical algorithm may be employed in step S230 to update the state of the molecular system.
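As a minimal sketch only (the function names and the array-based representation are assumptions), the two update rules described above can be written as follows; either function plays the role of the numerical algorithm in step S230.

```python
def euler_update(q, q_dot, q_ddot, dt):
    """Explicit Euler step, equations (5) and (6)."""
    q_dot_next = q_dot + q_ddot * dt   # equation (5)
    q_next = q + q_dot * dt            # equation (6): uses the velocity at time t
    return q_next, q_dot_next

def semi_implicit_euler_update(q, q_dot, q_ddot, dt):
    """Semi-implicit Euler step, equations (5) and (7)."""
    q_dot_next = q_dot + q_ddot * dt   # equation (5)
    q_next = q + q_dot_next * dt       # equation (7): uses the updated velocity at time t + dt
    return q_next, q_dot_next
```

The only difference between the two variants is which velocity is used to advance the positions, which is exactly the difference between equation (6) and equation (7).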
Through steps S210 to S230 above, based on the system state of the molecule at the first time, the system state of the molecule at the second time, including the position vector, velocity vector and feature vector of the molecule at the second time, is generated using the graph neural network model in combination with a numerical algorithm; this may be referred to as completing one iterative computation, and the process may be represented, for example, by FIG. 3. FIG. 3 shows a schematic diagram of computing atomic acceleration vectors and further updating atomic velocity vectors and position vectors according to an embodiment of the present disclosure. In the example of FIG. 3, a molecule comprising three atoms is used for illustration, but it should be understood that this is by way of example only and not limitation, and that a molecule according to embodiments of the present disclosure may comprise a greater or smaller number of atoms.
As shown in FIG. 3, at the first time $t$, the position vector, velocity vector and feature vector of the first atom in the molecule may be expressed as $q_1^t$, $\dot{q}_1^t$ and $h_1$, respectively; the position vector, velocity vector and feature vector of the second atom as $q_2^t$, $\dot{q}_2^t$ and $h_2$; and the position vector, velocity vector and feature vector of the third atom as $q_3^t$, $\dot{q}_3^t$ and $h_3$. At the first time $t$, the acceleration vectors of the first to third atoms in the molecule at the first time may be calculated through step S220 described above. Thereafter, the velocity vector of each atom can be updated, for example by equation (5) above, from the calculated acceleration vector at the first time to the velocity vector at the second time, i.e., to $\dot{q}_1^{t+\Delta t}$, $\dot{q}_2^{t+\Delta t}$ and $\dot{q}_3^{t+\Delta t}$. Finally, the position vector of each atom can be updated, for example by equation (6) or (7) above, to the position vector at the second time, i.e., to $q_1^{t+\Delta t}$, $q_2^{t+\Delta t}$ and $q_3^{t+\Delta t}$.
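Purely as a usage illustration of the sketches given earlier (it reuses build_edges, predict_accelerations and semi_implicit_euler_update from those sketches, and all numeric values and stand-in perceptrons are arbitrary assumptions), a single iteration for a three-atom molecule like the one in FIG. 3 could look as follows.

```python
import numpy as np

# Toy three-atom molecule; all values are arbitrary illustrative numbers.
q = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])        # positions q_1, q_2, q_3 at time t
q_dot = np.zeros((3, 3))               # velocities at time t
h = np.ones((3, 4))                    # feature vectors h_1, h_2, h_3 (dimension assumed)

edges = build_edges(q, threshold=1.5)
a = {(i, j): np.linalg.norm(q[i] - q[j]) for (i, j) in edges}   # edge vectors as distances

# Untrained stand-ins for the multi-layer perceptrons, only so that the example runs.
phi_e = lambda qi, qj, hi, hj, aij: np.concatenate([qi - qj, hi - hj, [aij]])  # 8-dim message
phi_q = lambda m: m[:3]                # maps an aggregated message to a 3-dim acceleration
phi_h = lambda hi, m: hi + m[3:7]      # updates the 4-dim feature vector

q_ddot, h_next = predict_accelerations(q, h, edges, a, phi_e, phi_q, phi_h, msg_dim=8)
q_next, q_dot_next = semi_implicit_euler_update(q, q_dot, q_ddot, dt=0.01)
```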
Thereafter, in step S240, the iterative computation may continue to be performed multiple times to generate the system state of the molecule at the predetermined time. Specifically, the acceleration vector of each atom in the molecule at the second time may be predicted using the graph neural network model based on the system state at the second time obtained in step S230; the system state of the molecule at a third time is then generated based on the acceleration vector of each atom in the molecule at the second time and the system state at the second time; and this process is repeated until the system state of the molecule at the predetermined time is generated. For example, for a predetermined time $t' = t + k\Delta t$, $k$ iterative computations may be performed, i.e., steps S220 and S230 described above are performed $k$ times, where the molecular system state output by each iteration is taken as the input to the graph neural network model in the next iteration, until the system state of the molecule at the predetermined time is generated.
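A minimal sketch of the iterative computation of step S240 is given below, reusing the helper functions sketched above; binding the perceptrons into predict_fn (for example with functools.partial) and recomputing the edge vectors as the current inter-atomic distances are assumptions made for illustration.

```python
import numpy as np
from functools import partial

def rollout(q, q_dot, h, edges, dt, k, predict_fn, integrate_fn):
    """Perform k iterative computations to reach the predetermined time t' = t + k*dt."""
    for _ in range(k):
        # Edge vectors taken as the current inter-atomic distances (an assumption).
        a = {(i, j): np.linalg.norm(q[i] - q[j]) for (i, j) in edges}
        q_ddot, h = predict_fn(q, h, edges, a)          # graph neural network: predict accelerations
        q, q_dot = integrate_fn(q, q_dot, q_ddot, dt)   # numerical algorithm: update velocity and position
    return q, q_dot, h

# Example wiring, reusing the toy objects from the previous sketch:
# predict_fn = partial(predict_accelerations, phi_e=phi_e, phi_q=phi_q, phi_h=phi_h, msg_dim=8)
# q_k, q_dot_k, h_k = rollout(q, q_dot, h, edges, dt=0.01, k=10,
#                             predict_fn=predict_fn, integrate_fn=semi_implicit_euler_update)
```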
In the molecular state prediction method according to the embodiments of the present disclosure, based on an initial system state of the molecule, the acceleration vector of each atom in the molecule at the current time is predicted by the graph neural network model, the velocity vector and position vector of each atom at the next time are then updated by a numerical algorithm, and through multiple iterative computations the system state of the molecule at the predetermined time is finally output. The overall flow of the method may be represented, for example, as in FIG. 4. FIG. 4 shows the overall flow of a molecular state prediction method according to an embodiment of the present disclosure. As shown in FIG. 4, the molecular state prediction method according to the embodiment of the present disclosure receives an input molecular system state, uses the graph neural network to calculate accelerations, updates the system state in combination with a numerical algorithm, and finally outputs the system state of the molecule at any predetermined time.
A training method of a graph neural network model for molecular state prediction according to an embodiment of the present disclosure is described below with reference to fig. 5. Fig. 5 illustrates a flowchart of a method 500 of training a graph neural network model for molecular state prediction, according to an embodiment of the present disclosure. Since the details of steps S510 to S540 of the training method 500 are similar to those of steps S210 to S240 of the molecular state prediction method 200 described with reference to fig. 2, a repetitive description of part of the contents is omitted here for the sake of brevity.
As shown in fig. 5, in step S510, a system state of a molecule at a first training time is received, wherein the molecule includes one or more atoms, and the system state at the first training time includes a position vector, a velocity vector, and a feature vector of each atom in the molecule at the first training time, and a relationship vector characterizing a physical relationship among atoms in the molecule. As described above, the feature vector includes a vector representing a non-geometric feature of each atom in the molecule, such as an atomic number, an atomic mass, historical trajectory information (e.g., a historical speed, a historical position, etc.), and so forth; the relationship vector may then include an edge vector that characterizes whether the distance between any two atoms in the molecule is less than a predetermined threshold.
In step S520, based on the system state at the first training time, an acceleration vector of each atom in the molecule at the first training time is predicted using the graph neural network model. In particular, a message vector between an atom and another atom in the molecule can be calculated, wherein the message vector characterizes interactions between the atom and the other atom; the acceleration vector of the atom at the first training moment is then determined based on the message vectors between the atom and all other atoms in the molecule. For example, a message vector between an atom and another atom in a molecule may be calculated based on the position vector and the feature vector of the atom, the position vector and the feature vector of another atom, and the edge vector between the atom and another atom according to the above equation (2), and an acceleration vector of the atom may be calculated according to the above equation (3). And, a feature vector of an atom at a second training time may be generated based on a message vector between the atom and all other atoms in the molecule and a feature vector of the atom at the first training time, e.g., according to equation (4) above.
In step S530, the system state of the molecule at the second training time is generated based on the acceleration vector of each atom in the molecule at the first training time and the system state at the first training time. For example, any numerical algorithm such as the Euler method, the semi-implicit Euler method, Verlet integration, the Runge-Kutta method, and the like may be employed to calculate the velocity vectors and position vectors of the atoms based on the atomic acceleration vectors and the system state at the first training time, which is not particularly limited in the embodiments of the present disclosure. For example, the velocity vector and position vector of an atom may be calculated based on the atomic acceleration vector and the system state at the first training time according to equations (5)-(6), or (5) and (7), above.
In step S540, multiple iterative computations are performed to generate the predicted system state of the molecule at the predetermined training time. Specifically, the acceleration vector of each atom in the molecule at the second training time may be predicted using the graph neural network model based on the system state at the second training time obtained in step S530; the system state of the molecule at a third training time is then generated based on the acceleration vector of each atom in the molecule at the second training time and the system state of the molecule at the second training time; and this process is repeated until the system state of the molecule at the predetermined training time is generated. For example, for a predetermined training time $t' = t + k\Delta t$, $k$ iterative computations may be performed, i.e., steps S520 and S530 described above are performed $k$ times, where the molecular system state output by each iteration is taken as the input to the graph neural network model in the next iteration, until the system state of the molecule at the predetermined training time is generated.
Thereafter, in step S550, the graph neural network model is trained based on the predicted system state and the real system state of the molecule at the predetermined training time. The real system state of the molecule may be obtained using a classical molecular dynamics numerical model based on the system state of the molecule at the first training time, or may be obtained from an existing experimental database, etc.; the method of obtaining the real system state is not particularly limited in the embodiments of the present disclosure. Specifically, a loss function may be calculated based on the predicted system state and the real system state of the molecule at the predetermined training time, and the graph neural network model is trained by minimizing the loss function. For example, the loss function $L$ at time $t+\Delta t$ can be calculated by the following equation (8):

$L = \sum_{i \in D} \left\| \hat{q}_i^{\,t+\Delta t} - q_i^{\,t+\Delta t} \right\|^2$  (8)

where $\hat{q}_i^{\,t+\Delta t}$ and $q_i^{\,t+\Delta t}$ are, respectively, the atomic positions in the predicted molecular system state and the atomic positions in the real system state at time $t+\Delta t$, and $D$ is the training dataset comprising all atoms in the molecule used for training. A loss function $L$ between the predicted system state and the real system state of the molecule at any predetermined time can be determined in this way, and the parameters in the model, such as the parameters of each multi-layer perceptron, are optimized by minimizing the loss function $L$, so as to obtain a trained graph neural network model. In addition, the parameters of the neural ordinary differential equation used to update the atomic velocity vectors and position vectors may be optimized jointly during training.
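As a final illustrative sketch, assuming the squared-error form of equation (8) above and NumPy arrays of predicted and real atomic positions, the training loss could be computed as follows; the optimization of the perceptron parameters by gradient descent would in practice be handled by an automatic-differentiation framework, which is omitted here.

```python
import numpy as np

def position_loss(q_pred: np.ndarray, q_true: np.ndarray) -> float:
    """Equation (8): squared error between the predicted and the real atomic positions
    at time t + dt, summed over the atoms of the training set D."""
    return float(np.sum((q_pred - q_true) ** 2))
```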
The molecular state prediction method according to the embodiments of the present disclosure is based on a graph neural network and a numerical algorithm, and combines the geometric structure of the molecule with physical prior knowledge (such as classical mechanics, e.g. Newton's laws) to provide an efficient molecular dynamics model. Because the dynamic process is described in the form of an ordinary differential equation (rather than by the graph neural network itself), the modeling of the acceleration of the molecular system is decoupled from the modeling of the dynamic process; this reduces the number of model parameters, lowers the risk of overfitting, improves the flexibility of the model, and improves its generalization. In addition, the interpretability of the ordinary differential equation better reflects the physical laws of molecular dynamics and also improves the interpretability of the model. The molecular state prediction method according to the embodiments of the present disclosure can effectively model the spatial and dynamic information of molecules and predict their structure and properties, and can therefore play an important role in optimizing drug design, predicting material performance to optimize material design, studying biomolecular structures, predicting the energy storage and transport performance of energy materials, and so on.
A molecular state prediction apparatus according to an embodiment of the present disclosure is described below with reference to fig. 6. Fig. 6 shows a schematic structural diagram of a molecular state prediction apparatus 600 according to an embodiment of the present disclosure. As shown in fig. 6, the molecular state prediction apparatus 600 may include an input unit 610, a processing unit 620, and an output unit 630. In addition to these three units, the molecular state prediction apparatus 600 may include other suitable units or components, but since these units or components are not relevant to the present disclosure, a detailed description of the functions thereof is omitted herein. Further, details of the functions of the molecular state prediction apparatus 600 are similar to those of the molecular state prediction method 200 described with reference to fig. 2, and thus, for brevity, a repetitive description of a part of the contents is omitted here.
The input unit 610 is configured to receive the system state of the molecule at the first time. In embodiments of the present disclosure, a molecule is an entity of constituent atoms joined together in a certain bonding order and spatial arrangement, and may include one or more atoms. The molecule may be a monatomic molecule, such as a helium molecule (He) or an argon molecule (Ar); a diatomic molecule, such as an oxygen molecule (O₂) or a carbon monoxide molecule (CO); or a polyatomic molecule, such as a water molecule (H₂O), a carbon dioxide molecule (CO₂), or a more complex molecule composed of more atoms, such as a high-molecular-weight compound. The geometry of the molecule and the number of atoms it includes are not particularly limited in the embodiments of the present disclosure. In examples of embodiments of the present disclosure, a molecule having N atoms may be represented as $P = \{P_i\}$, $i = 1 \ldots N$, where N is a positive integer greater than or equal to 1.
In embodiments of the present disclosure, the system state of the molecule at the first time characterizes the state of each atom in the molecule at the first time. In particular, the system state of the molecule at the first time may include a position vector, a velocity vector and a feature vector of each atom in the molecule at the first time, and a relationship vector characterizing the physical relationship between atoms in the molecule. For example, the system state of the molecule at the first time $t$ may be expressed as $(G, q^t, \dot{q}^t, h^t)$, where the relationship vector $G = \{P, E\}$ represents the physical relationship between the atoms in the molecule $P$; the edge vector $E$ indicates whether an edge exists between any two atoms in the molecule, or in other words, whether there is a physical interaction between them that needs to be considered. For example, if the distance between two atoms is less than a predetermined threshold, or alternatively the physical interaction between the two atoms is greater than a predetermined threshold, an edge may be considered to exist between the two atoms; otherwise, no edge is considered to exist between them. Here, the predetermined threshold used to determine the edge vector may be determined based on empirical parameters from physical prior knowledge, for example based on known molecular structures, molecular force fields and the like, which is not particularly limited in the embodiments of the present disclosure. The position vector $q$ represents the position of each atom in the molecule at the first time $t$. The velocity vector $\dot{q}$, the first derivative of $q$, represents the velocity of each atom in the molecule at the first time $t$. The feature vector $h$ represents the non-geometric features of each atom in the molecule, such as atomic number, atomic mass, historical trajectory information (e.g., historical velocity, historical position, etc.), and so on.
The processing unit 620 may include, for example, a graph neural network unit 6210, a numerical update unit 6220, and an iteration unit 6230. The graph neural network unit 6210 is configured to predict the acceleration vector of each atom in the molecule at the first time using the graph neural network model based on the system state at the first time. In the disclosed embodiments, the second derivative of the atomic positions of the molecule (i.e., the acceleration) may be modeled using a graph neural network model; more specifically, a graph neural network model may be constructed to predict the acceleration of each atom in the molecule based on the input molecular system state. The graph neural network model according to embodiments of the present disclosure may be constructed, for example, based on an equivariant graph neural network, a geometrically constrained graph neural network, or any other suitable graph neural network, which is not particularly limited in the embodiments of the present disclosure. The constructed graph neural network model can be represented, for example, by equation (1) above.
In examples of embodiments of the present disclosure, a message-passing graph neural network may be employed to model and solve for the acceleration vectors of the atoms. A graph neural network is a deep learning method based on a graph structure, where the graph structure is a data structure composed of nodes and edges, and message passing is a way of aggregating information from neighboring nodes to update the information of a central node. In examples of embodiments of the present disclosure, the information of neighboring atoms may then be used to calculate the acceleration vector of a central atom. In particular, the graph neural network unit 6210 may be configured to calculate, for each atom in the molecule, a message vector between that atom and another atom in the molecule, where the message vector characterizes the interaction between the two atoms. For example, the message vector between an atom and another atom in the molecule can be calculated by equation (2) above.
The process of generating a message vector between two atoms can be regarded as modeling the acceleration produced by the interaction between them. For the i-th atom, computing its final acceleration vector requires aggregating the message vectors between it and all other atoms in the molecule. That is, the acceleration vector of an atom at the first time may be determined based on the message vectors between the atom and all other atoms in the molecule. For example, the acceleration vector of the i-th atom at the first time t can be calculated by the above equation (3).
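Equation (3) is likewise not reproduced here; the sketch below shows one common way to turn aggregated messages into a per-atom acceleration vector, by weighting the relative position vectors with a scalar read out from each message (the readout phi_a and this weighting scheme are assumptions).

```python
import torch
import torch.nn as nn

n_atoms, msg_dim = 5, 16
phi_a = nn.Sequential(nn.Linear(msg_dim, msg_dim), nn.SiLU(), nn.Linear(msg_dim, 1))

q = torch.randn(n_atoms, 3)
m = torch.randn(n_atoms, n_atoms, msg_dim)      # m[i, j]: message between atom i and atom j

rel = q[:, None, :] - q[None, :, :]             # relative positions q_i - q_j, shape (n, n, 3)
a = (phi_a(m) * rel).sum(dim=1)                 # acceleration vector of each atom, shape (n, 3)
```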
Furthermore, in order to increase the expressive power of the model, the feature vector of each atom may be updated in each iteration of the model (i.e., each time the acceleration vector is calculated). In particular, the feature vector of an atom at a second time may be generated based on the message vectors between the atom and all other atoms in the molecule and the feature vector of the atom at the first time. For example, the atomic feature vector updated through one iteration may be generated by the above equation (4).
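A hedged sketch of such a feature update, again with the network phi_h and the residual form assumed rather than taken from equation (4):

```python
import torch
import torch.nn as nn

n_atoms, feat_dim, msg_dim = 5, 8, 16
phi_h = nn.Sequential(nn.Linear(feat_dim + msg_dim, feat_dim), nn.SiLU(), nn.Linear(feat_dim, feat_dim))

h = torch.randn(n_atoms, feat_dim)              # feature vectors at the first time
m = torch.randn(n_atoms, n_atoms, msg_dim)      # messages between each atom and all other atoms

m_agg = m.sum(dim=1)                            # aggregate incoming messages per atom
h_next = h + phi_h(torch.cat([h, m_agg], dim=-1))   # feature vectors at the second time
```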
The numerical update unit 6220 of the processing unit 620 may be configured to generate a system state of the molecule at a second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time. In embodiments of the present disclosure, the system state of the molecule at the second time is generated by combining the atomic accelerations predicted by the graph neural network model with a numerical algorithm that updates the atomic velocities and positions. Numerical algorithms are conventional methods for solving ordinary differential equations; when simulating physical motion processes, common numerical algorithms include, for example, the Euler method, the semi-implicit Euler method, the Verlet integration method, the Runge-Kutta method, and the like. Embodiments of the present disclosure may employ any of these numerical algorithms to calculate the velocity vectors and position vectors of the atoms based on the atomic acceleration vectors. According to an example of an embodiment of the present disclosure, the numerical algorithm for calculating the velocity vector and position vector of an atom may be implemented in the form of a neural ordinary differential equation, and together with the graph neural network for predicting the acceleration of the atoms constitutes a molecular dynamics model capable of predicting the molecular state.
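As an editorial illustration of this neural ordinary differential equation view, the second-order dynamics can be rewritten as a first-order system whose right-hand side contains the learned acceleration f_θ; the numerical algorithms listed above are then different ways of integrating this system over one step Δt:

```latex
\frac{\mathrm{d}}{\mathrm{d}t}
\begin{pmatrix} q(t) \\ \dot{q}(t) \end{pmatrix}
=
\begin{pmatrix} \dot{q}(t) \\ f_{\theta}\!\bigl(q(t),\, \dot{q}(t),\, h,\, G\bigr) \end{pmatrix}
```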
According to one example of an embodiment of the present disclosure, the numerical value updating unit 6220 may be configured to generate, for each atom in the molecule, a velocity vector of the atom at a second time based on the acceleration vector and the velocity vector of the atom at the first time; and to generate a position vector of the atom at the second time based on the velocity vector and the position vector of the atom at the first time. This process calculates the velocity vector and position vector of the atom using the Euler method and can be represented by, for example, the above equations (5)-(6).
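Equations (5)-(6) are not reproduced in this text; based on the wording above, the explicit Euler update plausibly takes the following form, with Δt the integration step size:

```latex
\dot{q}_{t+\Delta t} = \dot{q}_{t} + \ddot{q}_{t}\,\Delta t,
\qquad
q_{t+\Delta t} = q_{t} + \dot{q}_{t}\,\Delta t
```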
According to another example of an embodiment of the present disclosure, the numerical value updating unit 6220 may be configured to generate, for each atom in the molecule, a velocity vector of the atom at a second time based on the acceleration vector and the velocity vector of the atom at the first time; and to generate a position vector of the atom at the second time based on the velocity vector of the atom at the second time and the position vector of the atom at the first time. This process calculates the velocity vector and position vector of the atom using the semi-implicit Euler method and can be represented by, for example, equations (5) and (7) above.
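A minimal code sketch of the two update schemes described above; the function names and the step size dt are assumptions for illustration:

```python
import torch

def euler_step(q, v, a, dt):
    """Explicit Euler: the position update uses the velocity at the first time."""
    v_next = v + a * dt
    q_next = q + v * dt
    return q_next, v_next

def semi_implicit_euler_step(q, v, a, dt):
    """Semi-implicit Euler: the position update uses the velocity at the second time."""
    v_next = v + a * dt
    q_next = q + v_next * dt
    return q_next, v_next

q, v, a = torch.randn(5, 3), torch.randn(5, 3), torch.randn(5, 3)
q_next, v_next = semi_implicit_euler_step(q, v, a, dt=1e-3)
```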
Through the above process, based on the system state of the molecule at the first time, the graph neural network model is used in combination with the numerical algorithm to generate the system state of the molecule at the second time, including the position vector, the velocity vector, and the feature vector of each atom at the second time; this constitutes one iterative calculation. The iteration unit 6230 of the processing unit 620 may continue to perform multiple iterative calculations to generate the system state of the molecule at a predetermined time. Specifically, the acceleration vector of each atom in the molecule at the second time can be predicted using the graph neural network model based on the obtained system state at the second time; the system state of the molecule at a third time is then generated based on the acceleration vector of each atom at the second time and the system state at the second time; and this process is repeated until the system state of the molecule at the predetermined time is generated. For example, for a predetermined time t' = t + kΔt, the iterative calculation may be performed k times, that is, the above-described process is performed k times, where the molecular system state output by the numerical value updating unit 6220 in each iteration is taken as the input of the graph neural network unit 6210 in the next iteration, so as to generate the system state of the molecule at the predetermined time.
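The iteration can be sketched as the loop below, in which a stand-in acceleration function takes the place of the graph neural network unit 6210 and the semi-implicit Euler update takes the place of the numerical value updating unit 6220; all names and the toy acceleration are illustrative assumptions:

```python
import torch

def predict_acceleration(q, v, h):
    """Stand-in for the graph neural network model (illustrative restoring force only)."""
    return -q

def rollout(q, v, h, dt, k):
    """Repeat the predict-then-integrate step k times to reach t' = t + k * dt."""
    for _ in range(k):
        a = predict_acceleration(q, v, h)   # graph neural network unit: acceleration at the current time
        v = v + a * dt                      # numerical value updating unit: velocity update
        q = q + v * dt                      # semi-implicit position update
    return q, v

q0, v0, h0 = torch.randn(5, 3), torch.randn(5, 3), torch.randn(5, 8)
q_k, v_k = rollout(q0, v0, h0, dt=1e-3, k=10)
```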
The molecular state prediction device according to the embodiments of the disclosure is based on a graph neural network and a numerical algorithm, and combines the geometric structure of molecules with physical prior knowledge (such as classical mechanics, e.g., Newton's laws) to provide an efficient molecular dynamics model. Because the dynamic process is described in the form of an ordinary differential equation (rather than by the graph neural network itself), the modeling of the acceleration of the molecular system is decoupled from the modeling of the dynamic process, which reduces the number of model parameters, lowers the risk of overfitting, and improves the flexibility and generalization of the model. In addition, the interpretability of the ordinary differential equation better reflects the physical laws of molecular dynamics and can also improve the interpretability of the model. The molecular state prediction method according to the embodiments of the present disclosure can effectively model the spatial and dynamic information of molecules and predict their structures and properties, and can therefore play an important role in, for example, optimizing drug design, predicting material properties to optimize material design, studying biomolecular structures, and predicting the energy storage and transport properties of energy materials.
Furthermore, devices (e.g., molecular state prediction devices) according to embodiments of the present disclosure may also be implemented by way of the architecture of the exemplary computing device shown in FIG. 7. FIG. 7 illustrates a schematic diagram of an architecture of an exemplary computing device according to an embodiment of the present disclosure. As shown in FIG. 7, computing device 700 may include a bus 710, one or more CPUs 720, a Read Only Memory (ROM) 730, a Random Access Memory (RAM) 740, a communication port 750 connected to a network, an input/output component 760, a hard disk 770, and the like. A storage device in computing device 700, such as the ROM 730 or the hard disk 770, may store various data or files used in computer processing and/or communication, as well as program instructions executed by the CPU. Computing device 700 may also include a user interface 780. Of course, the architecture shown in FIG. 7 is merely exemplary, and one or more components of the computing device shown in FIG. 7 may be omitted as appropriate when implementing different devices. The apparatus according to the embodiments of the present disclosure may be configured to perform the molecular state prediction method according to the above-described embodiments of the present disclosure, or to implement the molecular state prediction device according to the above-described embodiments of the present disclosure.
Embodiments of the present disclosure may also be implemented as a computer-readable storage medium. A computer-readable storage medium according to embodiments of the present disclosure has computer-readable instructions stored thereon. When the computer-readable instructions are executed by a processor, the molecular state prediction method according to the embodiments of the present disclosure described with reference to the above figures may be performed. Computer-readable storage media include, but are not limited to, volatile memory and/or nonvolatile memory, for example. Volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. Nonvolatile memory may include, for example, Read Only Memory (ROM), hard disks, flash memory, and the like.
According to an embodiment of the present disclosure, there is also provided a computer program product or a computer program comprising computer readable instructions stored in a computer readable storage medium. The processor of the computer device may read the computer readable instructions from the computer readable storage medium, and the processor executes the computer readable instructions, so that the computer device performs the molecular state prediction method described in the above embodiments.
Program portions of the technology may be considered to be "products" or "articles of manufacture" in the form of executable code and/or associated data, embodied in or carried by a computer-readable medium. A tangible, persistent storage medium may include any memory or storage used by a computer, processor, or similar device or related module, such as various semiconductor memories, tape drives, or disk drives capable of providing storage functionality for software.
All or a portion of the software may sometimes be communicated over a network, such as the Internet or another communication network. Such communication may load the software from one computer device or processor to another, for example from a server or host computer of the molecular state prediction system onto the hardware platform of a computing environment, onto another computing environment implementing the system, or onto a system with similar functions related to providing the information needed for molecular state prediction. Accordingly, another medium capable of carrying the software elements, such as optical waves, electrical waves, or electromagnetic waves propagating through cables, optical cables, air, and the like, may also be used as a physical connection between local devices. The physical media used for such carrier waves, such as electrical cables, wireless links, or optical cables, may also be considered media that carry the software. Unless limited to a tangible "storage" medium, other terms used herein to refer to a computer or machine "readable medium" mean any medium that participates in providing instructions to a processor for execution.
Those skilled in the art will appreciate that various modifications and improvements can be made to the disclosure. For example, the various devices or components described above may be implemented in hardware, or may be implemented in software, firmware, or a combination of some or all of the three.
Furthermore, those skilled in the art will appreciate that the various aspects of the present disclosure are illustrated and described in the context of a number of patentable categories or circumstances, including any novel and useful procedure, machine, product, or material, or any novel and useful modification thereof. Accordingly, aspects of the present disclosure may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present disclosure may take the form of a computer program product comprising computer-readable program code embodied in one or more computer-readable media.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
While the present disclosure has been described in detail above, it will be apparent to those skilled in the art that the present disclosure is not limited to the embodiments described in this specification. The present disclosure can be implemented with modifications and variations without departing from the spirit and scope of the disclosure as defined by the appended claims. Accordingly, the description herein is for the purpose of illustration and is not intended to limit the present disclosure in any way.

Claims (16)

1. A method of molecular state prediction, comprising:
receiving a system state of a molecule at a first time, the molecule comprising one or more atoms, the system state at the first time characterizing a state of each atom in the molecule at the first time;
predicting an acceleration vector of each atom in the molecule at the first time using a graph neural network model based on the system state at the first time;
generating a system state of the molecule at a second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time; and
performing a plurality of iterative computations to generate a system state of the molecule at a predetermined time.
2. The method of claim 1, wherein the system state at the first time comprises a position vector, a velocity vector, and a feature vector for each atom in the molecule at the first time, and a relationship vector characterizing the physical relationship between atoms in the molecule.
3. The method of claim 2, wherein the feature vector comprises a vector characterizing a non-geometric feature of each atom in the molecule, the non-geometric feature comprising one or more of atomic number, atomic mass, historical trajectory information, and the relationship vector comprises an edge vector characterizing whether a distance between any two atoms in the molecule is less than a predetermined threshold.
4. The method of claim 1, wherein performing a plurality of iterative computations to generate a system state of the molecule at a predetermined time comprises:
predicting an acceleration vector of each atom in the molecule at the second time using the graph neural network model based on the system state at the second time; and
generating a system state of the molecule at a third time based on the acceleration vector of each atom in the molecule at the second time and the system state at the second time.
5. The method of claim 1, wherein predicting an acceleration vector of each atom in the molecule at the first time using a graph neural network model based on the system state at the first time comprises, for each atom in the molecule:
calculating a message vector between the atom and another atom in the molecule, the message vector characterizing an interaction between the atom and the other atom;
determining an acceleration vector of the atom at the first time based on the message vectors between the atom and all other atoms in the molecule.
6. The method of claim 5, wherein calculating a message vector between the atom and another atom in the molecule comprises:
calculating a message vector between the atom and another atom in the molecule based on the position vector and the feature vector of the atom, the position vector and the feature vector of the other atom, and an edge vector between the atom and the other atom.
7. The method of claim 5, further comprising:
generating a feature vector of the atom at a second time based on the message vectors between the atom and all other atoms in the molecule and the feature vector of the atom at the first time.
8. The method of claim 1, wherein generating the system state of the molecule at the second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time comprises, for each atom in the molecule:
generating a velocity vector of the atom at the second time based on the acceleration vector and the velocity vector of the atom at the first time; and
generating a position vector of the atom at the second time based on the velocity vector of the atom at the second time and the position vector of the atom at the first time.
9. The method of claim 1, wherein generating the system state of the molecule at the second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time comprises, for each atom in the molecule:
generating a velocity vector of the atom at the second time based on the acceleration vector and the velocity vector of the atom at the first time; and
generating a position vector of the atom at the second time based on the velocity vector and the position vector of the atom at the first time.
10. A method of training a graph neural network model for molecular state prediction, comprising:
receiving a system state of a molecule at a first training time, the molecule comprising one or more atoms, the system state at the first training time characterizing a state of each atom in the molecule at the first training time;
predicting an acceleration vector of each atom in the molecule at the first training time using a graph neural network model based on the system state at the first training time;
generating a system state of the molecule at a second training time based on the acceleration vector of each atom in the molecule at the first training time and the system state at the first training time;
performing a plurality of iterative computations to generate a predicted system state of the molecule at a predetermined training time;
and training the graph neural network model based on the predicted system state and the real system state of the molecule at the predetermined training time.
11. The method of claim 10, wherein training the graph neural network model based on the predicted system state and the real system state of the molecule at the predetermined training time comprises:
calculating a loss function based on the predicted system state and the real system state of the molecule at the predetermined training time; and
training the graph neural network model by minimizing the loss function.
12. The method of claim 11, wherein the real system state is obtained based on the system state of the molecule at the first training time using a molecular dynamics numerical model.
13. A molecular state prediction apparatus, the apparatus comprising:
an input unit configured to receive a system state of a molecule at a first time, the molecule comprising one or more atoms, the system state at the first time characterizing a state of each atom in the molecule at the first time;
a processing unit configured to predict an acceleration vector of each atom in the molecule at the first time using a graph neural network model based on the system state at the first time, generate a system state of the molecule at a second time based on the acceleration vector of each atom in the molecule at the first time and the system state at the first time, and perform a plurality of iterative computations to generate the system state of the molecule at a predetermined time; and
an output unit configured to output the system state of the molecule at the predetermined time.
14. A molecular state prediction apparatus comprising:
one or more processors; and
one or more memories having stored therein computer readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-12.
15. A computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-12.
16. A computer program product comprising computer readable instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-12.
CN202310913924.XA 2023-07-21 2023-07-21 Molecular state prediction method, device and storage medium Pending CN116959600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310913924.XA CN116959600A (en) 2023-07-21 2023-07-21 Molecular state prediction method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310913924.XA CN116959600A (en) 2023-07-21 2023-07-21 Molecular state prediction method, device and storage medium

Publications (1)

Publication Number Publication Date
CN116959600A true CN116959600A (en) 2023-10-27

Family

ID=88444108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310913924.XA Pending CN116959600A (en) 2023-07-21 2023-07-21 Molecular state prediction method, device and storage medium

Country Status (1)

Country Link
CN (1) CN116959600A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117954023A (en) * 2024-03-26 2024-04-30 西安慧算智能科技有限公司 Material structure prediction method and device based on multi-step deep learning model
CN117954023B (en) * 2024-03-26 2024-06-11 西安慧算智能科技有限公司 Material structure prediction method and device based on multi-step deep learning model

Similar Documents

Publication Publication Date Title
Champion et al. Data-driven discovery of coordinates and governing equations
CN110651280B (en) Projection neural network
Chapman et al. Accelerated mesh sampling for the hyper reduction of nonlinear computational models
WO2023097929A1 (en) Knowledge graph recommendation method and system based on improved kgat model
Shu Big data analytics: six techniques
US20190130273A1 (en) Sequence-to-sequence prediction using a neural network model
Biloš et al. Neural flows: Efficient alternative to neural ODEs
Li et al. Dynamic structure embedded online multiple-output regression for streaming data
Chen et al. A dual‐weighted trust‐region adaptive POD 4‐D Var applied to a finite‐volume shallow water equations model on the sphere
CN112905801A (en) Event map-based travel prediction method, system, device and storage medium
CN116959600A (en) Molecular state prediction method, device and storage medium
WO2020191001A1 (en) Real-world network link analysis and prediction using extended probailistic maxtrix factorization models with labeled nodes
Mizera et al. ASSA-PBN: A toolbox for probabilistic Boolean networks
Martino et al. Multivariate hidden Markov models for disease progression
Huang et al. HomPINNs: Homotopy physics-informed neural networks for learning multiple solutions of nonlinear elliptic differential equations
Pentland et al. GParareal: a time-parallel ODE solver using Gaussian process emulation
Hu et al. Energetic variational neural network discretizations to gradient flows
Lu et al. An efficient bayesian method for advancing the application of deep learning in earth science
Armstrong et al. Accurate compression of tabulated chemistry models with partition of unity networks
Mendoza et al. Reverse engineering of grns: An evolutionary approach based on the tsallis entropy
CN116070437A (en) Lithium battery surface temperature modeling method, device, computer equipment and storage medium
CN116109449A (en) Data processing method and related equipment
US20220414067A1 (en) Autoregressive graph generation machine learning models
Liu POI recommendation model using multi-head attention in location-based social network big data
CN114692808A (en) Method and system for determining graph neural network propagation model

Legal Events

Date Code Title Description
PB01 Publication