CN115394364A - Atomic fingerprint computing method and device for atomic dynamics Monte Carlo simulation


Info

Publication number
CN115394364A
Authority
CN
China
Prior art keywords
vacancy
atom
fingerprint
lattice point
atomic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210816365.6A
Other languages
Chinese (zh)
Other versions
CN115394364B (en)
Inventor
宋海峰
商红慧
陈欣
林蓉芬
王丽芳
高兴誉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INSTITUTE OF APPLIED PHYSICS AND COMPUTATIONAL MATHEMATICS
Original Assignee
INSTITUTE OF APPLIED PHYSICS AND COMPUTATIONAL MATHEMATICS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INSTITUTE OF APPLIED PHYSICS AND COMPUTATIONAL MATHEMATICS filed Critical INSTITUTE OF APPLIED PHYSICS AND COMPUTATIONAL MATHEMATICS
Priority to CN202210816365.6A priority Critical patent/CN115394364B/en
Publication of CN115394364A publication Critical patent/CN115394364A/en
Application granted granted Critical
Publication of CN115394364B publication Critical patent/CN115394364B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00: Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored program computers
    • G06F15/80: Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The method is applied to a heterogeneous multi-core processor, can distribute atomic fingerprint operation to a plurality of slave cores for execution, quickly obtains the result of a fingerprint function by adopting a mode of predetermining an atomic fingerprint table, and can realize high-efficiency AKMC atomic fingerprint calculation by combining the relative coordinate, the neighbor relation and the atom type vector of a vacancy system.

Description

Atomic fingerprint computing method and device for atomic dynamics Monte Carlo simulation
Technical Field
The invention relates to the technical field of atomic fingerprint calculation, in particular to an atomic fingerprint calculation method and device for atomic dynamics Monte Carlo simulation.
Background
An atomic fingerprint (atomic feature) is a vector description of the environment in which an atom is located, the environment being the relationship between an atom and its neighboring atoms, usually measured by distance. Atomic fingerprints are typically computed using a manually defined fingerprint function, i.e. a descriptor function.
Current atomic fingerprint calculation schemes are mainly used for molecular dynamics simulation, structure relaxation and the like. In these research tasks, atoms are free to move, and therefore the distance $r_{ij}$ between them is continuous (in principle it can take any value, although $r_{ij}$ cannot be too small because of the repulsive force). The fingerprint function therefore has to be computed in real time at every time step. The computation cost of the fingerprint function is extremely large, and its share of the total computation cost of a time step can reach 50-80% (depending on the definition of the fingerprint function).
In lattice atomic kinetic Monte Carlo (AKMC) simulation, atoms are always located on lattice points and the lattice parameter is constant. The interatomic distance can only take a few specific values (determined by the lattice type), so the distribution of interatomic distances is discrete. Computing the fingerprint in real time in this case incurs significant and avoidable computational overhead.
In addition, existing fingerprint computing methods are designed mainly for a Central Processing Unit (CPU) and are typically serial. If such a method is run directly on a heterogeneous multi-core processor without optimization, only the master core can be called and the computing power of the slave cores cannot be utilized. The main computing power of a heterogeneous multi-core processor is concentrated in the slave cores, and if the slave cores cannot be fully utilized, the performance of the heterogeneous multi-core processor cannot be exploited.
Disclosure of Invention
In order to solve the above problems, the present invention provides an atomic fingerprint calculation method for atomic dynamics Monte Carlo simulation, which is applied to a heterogeneous multi-core processor, where the heterogeneous multi-core processor includes a master core and slave cores, and the method includes: allocating a plurality of lattice points of a vacancy system to each slave core; each slave core respectively stores the relative coordinates, the neighbor relation, the atom type vector and the atomic fingerprint table of the vacancy system; the relative coordinates are the coordinates of lattice points in the vacancy system relative to a central vacancy, the neighbor relation comprises distances between the lattice points in the vacancy system and their neighbor lattice points, the atom type vector comprises the atom type of each lattice point in the vacancy system, and the atomic fingerprint table comprises atomic fingerprints calculated based on the relative coordinates, the neighbor relation and a preset fingerprint function; the slave core queries the atomic fingerprint table according to the relative coordinates, the neighbor relation and the atom type vector to obtain the atomic fingerprint in the initial state; exchanging the atom type of a target lattice point with the atom type of the central vacancy in the atom type vector, and exchanging the neighbor relation of the target lattice point with the neighbor relation of the central vacancy, to simulate the central vacancy transitioning to the target lattice point; and the slave core queries the atomic fingerprint table, according to the neighbor relation and the atom type vector after the simulated transition of the central vacancy to the target lattice point, to calculate the atomic fingerprint after the simulated vacancy transition.
Optionally, the allocating a plurality of lattice points of the vacancy system to each slave core includes: equally distributing a plurality of grid points of the vacancy system to each slave core according to the number of the slave cores; each slave core stores a neighbor relation of the assigned grid point.
Optionally, after the querying the atomic fingerprint table to obtain an atomic fingerprint after the simulated vacancy transition, the method further includes: exchanging the central vacancy in the atom type vector with an atom type of the target lattice point, and exchanging a neighbor relation of the central vacancy with a neighbor relation of the target lattice point to restore to an initial state of the vacancy system.
Optionally, the method further comprises: pre-calculating all the atomic fingerprints corresponding to each lattice point in the vacancy system, and storing all the atomic fingerprints corresponding to each lattice point in the atomic fingerprint table; the atomic fingerprint table is stored in matrix form, the matrix comprises N rows and M columns, N is the number of layers of neighbor lattice points of the central vacancy within the truncation radius, and M is the number of fingerprint functions. This layout makes it easier to achieve high-performance summation with Single Instruction Multiple Data (SIMD) instructions.
Optionally, the neighbor relations store neighbor relations of each lattice point in the vacancy system in a three-dimensional array form, and the neighbor relations of any lattice point include sequence numbers of neighbor lattice points of the lattice point and distances of neighbor lattice points of the lattice point.
Optionally, the vacancy system includes a central vacancy, a plurality of nearest neighbor lattice points of the central vacancy, and all of the neighbor atoms of the plurality of nearest neighbor lattice points within a truncation radius.
Optionally, the atom type vector is a one-dimensional vector, the column index of the one-dimensional vector is the serial number of each lattice point in the vacancy system, and the value stored in each column of the one-dimensional vector is the atom type of the corresponding lattice point.
The invention provides an atomic fingerprint computing device for atomic dynamics Monte Carlo simulation, which is applied to a heterogeneous multi-core processor, wherein the heterogeneous multi-core processor comprises a master core and slave cores, and the device comprises: an allocation module for allocating a plurality of lattice points of the vacancy system to each slave core; each slave core respectively stores the relative coordinates, the neighbor relation, the atom type vector and the atomic fingerprint table of the vacancy system; the relative coordinates are the coordinates of lattice points in the vacancy system relative to a central vacancy, the neighbor relation comprises distances between the lattice points in the vacancy system and their neighbor lattice points, the atom type vector comprises the atom type of each lattice point in the vacancy system, and the atomic fingerprint table comprises atomic fingerprints calculated based on the relative coordinates, the neighbor relation and a preset fingerprint function; an initial state calculation module by which the slave core queries the atomic fingerprint table according to the relative coordinates, the neighbor relation and the atom type vector to obtain the atomic fingerprint of the initial state; a transition simulation module for exchanging the atom type of a target lattice point with the atom type of the central vacancy in the atom type vector, and exchanging the neighbor relation of the target lattice point with the neighbor relation of the central vacancy, to simulate the central vacancy transitioning to the target lattice point; and a transition state calculation module for querying the atomic fingerprint table, according to the neighbor relation and the atom type vector after the simulated transition of the central vacancy to the target lattice point, to calculate the atomic fingerprint after the simulated vacancy transition.
Optionally, the allocation module is specifically configured to: equally distributing a plurality of grid points of the vacancy system to each slave core according to the number of the slave cores; each slave core stores the neighbor relation of the allocated grid point.
Optionally, the apparatus further comprises a transition recovery module configured to: exchanging the central vacancy in the atom type vector with an atom type of the target lattice point, and exchanging a neighbor relation of the central vacancy with a neighbor relation of the target lattice point to restore to an initial state of the vacancy system.
The atomic dynamics Monte Carlo simulated atomic fingerprint calculation method provided by the embodiment of the invention is applied to a heterogeneous multi-core processor, can distribute atomic fingerprint operation to a plurality of slave cores for execution, quickly obtains the result of a fingerprint function by adopting a mode of predetermining an atomic fingerprint table, and can realize high-efficiency AKMC atomic fingerprint calculation by combining the relative coordinate, the neighbor relation and the atomic type vector of a vacancy system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic flowchart of an atomic fingerprint calculation method for atomic dynamics Monte Carlo simulation according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an encoding algorithm of a vacancy system provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a process of atomic fingerprint calculation according to an embodiment of the present invention;
Fig. 4 is a comparison graph of simulation durations for various processors using different methods for fingerprint calculation according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an atomic fingerprint calculation apparatus for atomic dynamics Monte Carlo simulation according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In atomic kinetic Monte Carlo (AKMC) simulation, the probability $\Gamma_i^k$ that a vacancy i transitions to its first neighbor k is determined by the energy difference before and after the transition, as follows:
$$\Gamma_i^k = \Gamma_0 \exp\left(-\frac{E_a^0 + \left(E_i^k - E_i\right)/2}{k_B T}\right)$$
where $\Gamma_0$ is the attempt frequency, usually $6 \times 10^{12}\ \mathrm{s}^{-1}$, $k_B$ is the Boltzmann constant, $T$ is the absolute temperature, $E_a^0$ is the reference energy, $E_i$ is the system energy before the transition of vacancy i, and $E_i^k$ is the system energy after the vacancy transitions to atom k. The number of first neighbors differs between lattices: any lattice site has 8 first neighbors in a Body-Centered Cubic (BCC) crystal, 12 in a Face-Centered Cubic (FCC) crystal, and 6 in a Simple Cubic (SC) crystal. For any vacancy, $1 + N_k$ energy calculations must be completed, where 1 corresponds to the initial state and $N_k$ corresponds to the $N_k$ possible post-transition states.
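The rate evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the commonly used activation-energy form (reference energy plus half the energy difference), and the function names, the eV/K unit choice and the example energies are all hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def transition_rate(e_before, e_after, e_ref, temperature, gamma0=6.0e12):
    """Rate of one vacancy hop; energies in eV, temperature in K."""
    # Assumed form: activation energy = reference + half the energy difference.
    e_act = e_ref + 0.5 * (e_after - e_before)
    return gamma0 * math.exp(-e_act / (K_B_EV * temperature))

def all_rates(e_before, e_after_list, e_ref, temperature):
    """One initial-state energy plus N_k final-state energies give N_k rates."""
    return [transition_rate(e_before, e, e_ref, temperature) for e in e_after_list]

# Hypothetical energies for three candidate hops of one vacancy.
rates = all_rates(0.0, [0.0, 0.1, -0.1], e_ref=0.6, temperature=600.0)
```

Under this assumed form, a hop that lowers the system energy receives a higher rate than one that raises it, which is why all 1 + N_k energies are needed before an event can be selected.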
To calculate the energy of an arbitrary state, interaction potentials that describe the interaction between atoms, also called force fields, are needed. Existing AKMC programs mainly use empirical potentials based on physical models, such as the embedded-atom potential. Traditional empirical potentials have limited accuracy. In recent years, machine learning force fields based on machine learning models, in particular Neural Network Potentials (NNPs), have been increasingly used to construct interatomic potentials because of their high accuracy.
The specific process of an atomic neural network potential is as follows: first, the atomic fingerprints are calculated from the atomic coordinate information R; then, the atomic fingerprint $G_i$ is passed as direct input to the atomic neural network $S_i$ (a multilayer neural network) to obtain the corresponding atomic energy $E_i$. The total energy of the system is the sum of the atomic energies. In the whole process, the calculation of the atomic fingerprint is a critical and time-consuming step, usually accounting for 30%-50% of the total calculation time of the force field.
An atomic fingerprint is a vector description of the environment in which the atom is located. The environment means the relationship of an atom to its neighbors, usually measured by distance. In a molecule or a solid, for an arbitrary atom i, the atoms within a virtual sphere of radius $r_{cut}$ centered on the atom are referred to as the neighbor atoms of atom i. Here $r_{cut}$, also known as the cutoff radius (or truncation radius), is typically 5-7 angstroms. The number of atoms within the cutoff radius is the number of neighbor atoms.
An atomic fingerprint is usually calculated using a manually defined fingerprint function, i.e. a descriptor function. The following radial symmetry function is one of the commonly used fingerprint functions at present:
$$G_i = \sum_{j} e^{-\eta \left(r_{ij} - R_s\right)^2} f_c(r_{ij})$$
where $f_c$ is the cutoff function with cutoff radius $r_c$, $\eta$ and $R_s$ are manually defined hyper-parameters, and $r_{ij}$ is the distance of atom i from its neighbor atom j. When the interatomic distance is greater than the cutoff radius, the value of the fingerprint function is zero. If there are K different fingerprint functions (i.e., K sets of different hyper-parameters), then each atom can describe its atomic environment by a vector of length K, i.e., the fingerprint vector of that atom. When there are N atoms in the system, a matrix G of N rows and K columns may be used to store all fingerprint information, where G(i, k) is the fingerprint information of atom i calculated based on the k-th fingerprint function.
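As an illustration of the N-by-K fingerprint matrix described above, the following Python sketch evaluates a radial symmetry function for a handful of atoms. The cosine form of the cutoff function is an assumption (the source only states that the value is zero beyond the cutoff radius), and the function names, hyper-parameter values and neighbor distances are hypothetical.

```python
import math

def cutoff(r, r_c):
    """Assumed cosine cutoff: smooth, and exactly zero for r > r_c."""
    if r > r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_fingerprint(distances, eta, r_s, r_c):
    """Sum of Gaussian terms over the neighbor distances of one atom."""
    return sum(math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c) for r in distances)

def fingerprint_matrix(neighbor_distances, params, r_c):
    """G[i][k]: fingerprint of atom i under the k-th (eta, r_s) pair."""
    return [[radial_fingerprint(d, eta, r_s, r_c) for (eta, r_s) in params]
            for d in neighbor_distances]

params = [(0.5, 0.0), (1.0, 2.0)]      # K = 2 hypothetical hyper-parameter sets
neighbors = [[2.5, 2.9], [2.5, 7.0]]   # N = 2 atoms; 7.0 lies beyond r_c
G = fingerprint_matrix(neighbors, params, r_c=6.0)
```

Every entry costs one exponential and one cosine per neighbor, which is the per-time-step expense that dominates when distances are continuous.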
Heterogeneous multi-core processors are a common type of processor in computers. They adopt a typical master-slave architecture, containing several master cores and a larger number of compute cores (also referred to as slave cores); the number of slave cores is typically 4 or more. The master core and the slave cores can simultaneously access the main memory. The main memory is usually large but has limited bandwidth, and the overhead of a slave core accessing the main memory directly (via DMA) is very high. Each slave core has its own Local Device Memory (LDM). Access from a slave core to its local memory is very fast, but an LDM typically holds only hundreds of kB.
The main computing power of the heterogeneous multi-core processor is concentrated on the slave core, and if the slave core cannot be fully utilized, the performance of the heterogeneous multi-core processor cannot be exerted.
Current fingerprint calculation schemes are mainly used for molecular dynamics simulation, structural relaxation and the like. In these research tasks, atoms are free to move, and therefore the distance $r_{ij}$ between them is continuous (in principle it can take any value, although $r_{ij}$ cannot be too small because of the repulsive force). The fingerprint function therefore has to be computed at every time step. The computation cost of the fingerprint function is extremely large, and its share of the total computation cost of a time step can reach 50-80% (depending on the definition of the fingerprint function).
However, in AKMC, the atoms are always located on lattice points and the lattice parameter is unchanged. Thus, the interatomic distance can only take a few specific values (determined by the lattice type), and the distribution of interatomic distances is discrete. Taking the BCC crystal as an example, the interatomic distance r can only take values such as $\frac{\sqrt{3}}{2}a_0$, $a_0$, $\sqrt{2}a_0$, etc., where $a_0$ is the lattice constant of the BCC crystal. Because the lattice constant and the cutoff radius of the interatomic interaction are both fixed, the number of possible values of r below the cutoff radius is finite.
Therefore, it is unnecessary to calculate the fingerprint corresponding to r in real time for AKMC, and the embodiment of the present invention proposes that the result of the fingerprint function can be quickly obtained based on the tabulation method.
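To make the discreteness concrete, the sketch below enumerates the distinct neighbor distances of a BCC lattice point inside a cutoff radius. It relies on the fact that BCC sites, expressed in units of half the lattice constant, have integer coordinates that are all even or all odd. The function name and the numeric values of the lattice constant and cutoff are illustrative assumptions.

```python
import math

def bcc_shell_distances(a0, r_cut):
    """Distinct neighbor distances of a BCC lattice point within r_cut."""
    n = int(2 * r_cut / a0) + 1            # search range in units of a0/2
    squared = set()
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                same_parity = (i % 2 == j % 2 == k % 2)   # BCC site condition
                d2 = i * i + j * j + k * k
                if same_parity and 0 < d2 * (a0 / 2) ** 2 <= r_cut ** 2:
                    squared.add(d2)
    return [0.5 * a0 * math.sqrt(d2) for d2 in sorted(squared)]

# Illustrative values: a0 in angstroms, cutoff inside the 5-7 angstrom range.
shells = bcc_shell_distances(a0=2.87, r_cut=6.5)
```

The returned list is short no matter how many atoms the simulation contains, so each fingerprint function only ever needs to be evaluated at these few distances.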
In addition, existing fingerprint computing methods are mainly designed for general-purpose CPUs (Intel, AMD, IBM and the like) and are typically serial. If such a method is run directly on a heterogeneous multi-core processor without optimization, it can only call the master core and cannot utilize the computing capacity of the slave cores. The method provided by the embodiment of the invention makes full use of the slave cores in fingerprint calculation and thereby avoids this problem.
The method provided by the embodiment of the invention is designed for lattice AKMC simulation and is suitable for solid systems such as body-centered cubic, face-centered cubic and simple cubic crystals.
Vacancy transition (a vacancy exchanging position with one of its N neighboring atoms) is the fundamental event of lattice AKMC simulation. AKMC generally only considers transitions of vacancies with their first neighbor atoms. Any lattice site has 8 first neighbors in BCC, 12 in FCC and 6 in SC.
The probability of vacancy transition is determined by the energy difference before and after the transition, the total energy of the system is the sum of atomic energy, the atomic energy is determined by the atomic fingerprint, and if the fingerprint is not changed, the atomic energy is not changed. As mentioned above, there is a truncation distance in the fingerprint calculation, so that when an atom is far from the transition point, the atom fingerprint does not change due to the transition, and the atom energy does not change. Therefore, vacancy transitions in AKMC only affect atoms around the transition point. The transition probability of a vacancy in AKMC requires a total of 1+N energy calculations, 1 for the initial state and N for the N possible final states of the transition, where N is the number of the first neighbors of the vacancy.
On this basis, taking the vacancy as the center, the vacancy, the N nearest neighbor atoms of the vacancy, and all neighbor atoms of those N nearest neighbor atoms within the truncation radius are grouped together to form a vacancy system. A vacancy transition changes the energy of atoms inside the vacancy system, while atoms outside the vacancy system are not affected by it. There may be multiple vacancy systems within the simulation region, and the vacancy systems may overlap.
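The construction of a vacancy system can be sketched as follows for a BCC lattice, working in integer coordinates in units of half the lattice parameter. This is only a reading of the paragraph above under stated assumptions: the function names are hypothetical, the cutoff is passed as a squared distance in those integer units, and periodic boundaries are ignored.

```python
def bcc_sites_within(center, r2_max):
    """BCC sites (units of a/2) whose squared distance to center is <= r2_max."""
    n = int(r2_max ** 0.5) + 1
    cx, cy, cz = center
    sites = set()
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                # Offsets between BCC sites are all even or all odd.
                if i % 2 == j % 2 == k % 2 and i * i + j * j + k * k <= r2_max:
                    sites.add((cx + i, cy + j, cz + k))
    return sites

def build_vacancy_system(r2_cut):
    """Central vacancy, its first neighbors, and their neighbors within the cutoff."""
    vacancy = (0, 0, 0)
    first_neighbors = bcc_sites_within(vacancy, 3) - {vacancy}
    system = {vacancy} | first_neighbors
    for site in first_neighbors:
        system |= bcc_sites_within(site, r2_cut)
    return vacancy, sorted(first_neighbors), sorted(system)

# r2_cut = 4 keeps neighbors up to the second shell (distance a) of each site.
vacancy, first_neighbors, system = build_vacancy_system(r2_cut=4)
```

Because the resulting point set depends only on the lattice type and the cutoff, it can be generated once and reused for every vacancy in the simulation.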
In lattice AKMC simulation, all atoms are located on lattice points, and all lattice points of BCC, FCC and SC systems are spatially equivalent. Equivalence here means that, within the truncation radius, the total number of neighbors of any lattice point, the numbers of its 1st, 2nd, 3rd, ... layer neighbors, and the distances from those neighbors to the central lattice point are all identical. Moreover, the spatial coordinates of lattice points in BCC, FCC and SC systems are multiples of $a_0/2$, where $a_0$ is the lattice parameter, so the spatial coordinates of the lattice points can be represented by relative coordinates; for example, the relative coordinate (1,1,1) corresponds to the real coordinate $(a_0/2, a_0/2, a_0/2)$. In addition, the distance from a lattice point i to its j-th layer neighbors depends only on the lattice parameter $a_0$ and is fixed. For example, the distances from a lattice point in BCC to its 0th to 5th layer neighbors are 0, $\frac{\sqrt{3}}{2}a_0$, $a_0$, $\sqrt{2}a_0$, $\frac{\sqrt{11}}{2}a_0$ and $\sqrt{3}a_0$. Based on the above properties, the spatial distribution of the vacancy system can be described by two arrays that depend only on the lattice parameter $a_0$ and the lattice type (BCC, FCC, SC).
Fig. 1 shows a schematic flowchart of an atomic fingerprint calculation method for atomic dynamics monte carlo simulation according to an embodiment of the present invention, which is applied to a heterogeneous multi-core processor, where the heterogeneous multi-core processor includes a master core and a plurality of slave cores, and the method includes the following steps:
and S102, distributing a plurality of grid points of the vacancy system to each slave core.
The grid points of the vacancy system are distributed on each slave core as uniformly as possible, so that the computing power of each slave core is effectively utilized. Each slave core respectively stores relative coordinates, neighbor relations, atom type vectors and atom fingerprint tables of the vacancy system.
Wherein, the relative coordinates refer to the coordinates of the lattice points in the vacancy system relative to the central vacancy. The vacancy system includes a central vacancy, a plurality of nearest neighbor lattice points for the central vacancy, and all nearest neighbor atoms for the plurality of nearest neighbor lattice points within a truncation radius.
The neighbor relation includes the distances between lattice points and their neighbor lattice points in the vacancy system. Illustratively, the neighbor relation of each lattice point in the vacancy system is stored in the form of a three-dimensional array. The neighbor relation of any lattice point comprises the serial numbers of its neighbor lattice points and the distances between the lattice point and those neighbors. The distance may be expressed as the neighbor layer number relative to the lattice point; if the lattice structure is known, the actual distance between the lattice point and its neighbor can be determined from the layer number.
The atom type vector includes the atom type of each lattice point in the vacancy system, and each vacancy system corresponds to one atom type vector. Following the allocation of the lattice points of the vacancy system to the slave cores described above, each slave core may store only the atom types of the lattice points allocated to it. Illustratively, the atom type vector takes the form of a one-dimensional vector, the column index of the one-dimensional vector is the serial number of each lattice point in the vacancy system, and each column stores the atom type of the corresponding lattice point. For example, a vacancy is represented by 0, a first atom type by 1, a second atom type by 2, and so on.
The atomic fingerprint table includes atomic fingerprints calculated based on the relative coordinates, the neighbor relation and a preset fingerprint function, and may include the atomic fingerprints of all lattice points of the vacancy system. Accordingly, all the atomic fingerprints corresponding to each lattice point in the vacancy system can be calculated in advance and stored in the atomic fingerprint table. Illustratively, the atomic fingerprint table is stored in matrix form; the matrix includes N rows and M columns, where N is the number of layers of neighbor lattice points of the central vacancy within the truncation radius and M is the number of fingerprint functions, and it stores all atomic fingerprints corresponding to the neighbor lattice points.
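A table of this shape can be sketched as below: row n holds the value of each of the M fingerprint functions at the n-th layer distance, so a query reduces to summing rows. The Gaussian-times-cosine term, the function names, and the simplification that every listed neighbor contributes regardless of its element are assumptions made for illustration only.

```python
import math

def build_fingerprint_table(layer_distances, params, r_c):
    """table[n][m]: m-th fingerprint function evaluated at the n-th layer distance."""
    def term(r, eta, r_s):
        if r > r_c:
            return 0.0
        f_c = 0.5 * (math.cos(math.pi * r / r_c) + 1.0)   # assumed cutoff form
        return math.exp(-eta * (r - r_s) ** 2) * f_c
    return [[term(r, eta, r_s) for (eta, r_s) in params] for r in layer_distances]

def lookup_fingerprint(neighbor_layers, table):
    """Sum table rows for the given neighbor layers; no exp/cos at query time."""
    m = len(table[0])
    total = [0.0] * m
    for layer in neighbor_layers:
        row = table[layer]
        for col in range(m):
            total[col] += row[col]
    return total

a0 = 2.87                                   # illustrative lattice constant
layers = [0.0, math.sqrt(3) / 2 * a0, a0, math.sqrt(2) * a0]   # BCC layers 0-3
params = [(0.5, 0.0), (1.0, 2.0)]           # M = 2 hypothetical functions
table = build_fingerprint_table(layers, params, r_c=6.0)

# A BCC site with all 8 first-layer and 6 second-layer neighbors occupied.
fp = lookup_fingerprint([1] * 8 + [2] * 6, table)
```

Row 0 (distance zero) is kept only so that the row index equals the layer number; neighbors always have layer 1 or higher. Storing the M values of a layer contiguously is what lets the inner loop be vectorized.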
Illustratively, a plurality of lattice points of the vacancy system may be equally distributed to each slave core according to the number of the slave cores; each slave core stores the neighbor relation of the assigned lattice point. In particular, the grid points may be automatically assigned to the various slave cores by the master core. Assuming a vacancy system with 256 grid points, a heterogeneous multi-core processor includes 64 slave cores, and each slave core may be assigned 4 grid points. For example, grid points 1-4 are assigned to the 1 st slave core, grid points 5-8 are assigned to the 2 nd slave core, grid points 9-12 are assigned to the 3 rd slave core, and so on.
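The equal allocation in this example can be sketched as a contiguous block partition. The function name is hypothetical, and the handling of a remainder (giving one extra point to the first few cores) is an assumption, since the source only describes the evenly divisible case.

```python
def assign_lattice_points(num_points, num_cores):
    """Split point IDs 1..num_points into contiguous, near-equal blocks."""
    base, extra = divmod(num_points, num_cores)
    blocks, start = [], 1
    for core in range(num_cores):
        size = base + (1 if core < extra else 0)   # spread any remainder
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks

# The case from the text: 256 lattice points on 64 slave cores.
blocks = assign_lattice_points(256, 64)
```

With these inputs, `blocks[0]` holds lattice points 1-4, `blocks[1]` holds 5-8 and `blocks[2]` holds 9-12, matching the assignment described above.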
S104: the slave core queries the atomic fingerprint table according to the relative coordinates, the neighbor relation and the atom type vector to obtain the atomic fingerprint in the initial state.
S106: exchange the atom type of the target lattice point and the atom type of the central vacancy in the atom type vector, and exchange the neighbor relation of the target lattice point and the neighbor relation of the central vacancy, so as to simulate the transition of the central vacancy to the target lattice point.
When simulating the transition of the central vacancy to a nearest neighbor lattice point in the vacancy system, the central vacancy and the atom at the nearest neighbor lattice point are exchanged (if the nearest neighbor lattice point is also a vacancy, its simulation is skipped directly because nothing changes). That is, the atom types of the target lattice point and the central vacancy are exchanged in the atom type vector to obtain the atom type vector after the simulated transition. After the central vacancy transitions to the nearest neighbor lattice point, its neighbor relation changes with its position, so the original neighbor relation needs to be replaced by the neighbor relation of the target lattice point.
S108: the slave core queries the atomic fingerprint table, according to the neighbor relation and the atom type vector after the simulated transition of the central vacancy to the target lattice point, to calculate the atomic fingerprint after the simulated vacancy transition.
After the simulated transition is completed, the atomic fingerprint after the simulated transition can be obtained by querying the atomic fingerprint table.
Further, the initial state needs to be restored before other transitions are simulated. Based on this, the above method further comprises the following step: exchanging the atom type of the central vacancy with the atom type of the target lattice point in the atom type vector, and exchanging the neighbor relation of the central vacancy with the neighbor relation of the target lattice point, so as to restore the initial state of the vacancy system. In other words, the initial state is restored by reversing the simulated transition.
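The swap, query and restore cycle of steps S106 and S108 can be sketched as follows. The `query` callback stands in for the fingerprint table lookup and is purely hypothetical, as are the function names and the toy data; type 0 denotes a vacancy, following the convention in the text.

```python
def swap_sites(types, neighbors, a, b):
    """Exchange atom types and neighbor-relation entries of sites a and b in place."""
    types[a], types[b] = types[b], types[a]
    neighbors[a], neighbors[b] = neighbors[b], neighbors[a]

def fingerprints_after_hops(types, neighbors, center, targets, query):
    """Try each hop, record query(types, neighbors), then undo the hop."""
    results = {}
    for target in targets:
        if types[target] == 0:        # target is also a vacancy: nothing changes
            continue
        swap_sites(types, neighbors, center, target)
        results[target] = query(types, neighbors)
        swap_sites(types, neighbors, target, center)   # restore initial state
    return results

# Toy data: site 0 is the central vacancy, site 3 is another vacancy.
types = [0, 1, 2, 0]
neighbors = [["n0"], ["n1"], ["n2"], ["n3"]]
snapshots = fingerprints_after_hops(
    types, neighbors, center=0, targets=[1, 2, 3],
    query=lambda t, n: tuple(t),      # placeholder for the table lookup
)
```

Restoring by a second swap avoids copying the whole vacancy system for each of the N candidate transitions, which matters when the data must fit in a slave core's local memory.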
The atomic fingerprint calculation method for atomic dynamics Monte Carlo simulation provided by the embodiment of the invention is applied to a heterogeneous multi-core processor. It distributes the atomic fingerprint operations across a plurality of slave cores, obtains the results of the fingerprint functions quickly from a predetermined atomic fingerprint table, and, by combining the relative coordinates, the neighbor relation, and the atom type vector of the vacancy system, achieves efficient AKMC atomic fingerprint calculation.
FIG. 2 shows a schematic diagram of the encoding of a vacancy system (Vacancy System) according to an embodiment of the invention. In FIG. 2, (a) is a visualization of the vacancy system. (b) shows the relative coordinates (CET) of each lattice point in the vacancy system, stored as a matrix in which the column index is the ID of the lattice point and the relative coordinates of the central vacancy are fixed at (0, 0, 0). (c) shows the neighbor relation (NET) of the lattice points, stored as a three-dimensional array: NET[i] represents the neighbor relation of the lattice point numbered i in the vacancy system and is an [N_local, 2] matrix. The first column of this matrix is the ID of the neighbor (corresponding to CET), and the second column indicates that the neighbor is a j-th layer neighbor of lattice point i; the corresponding real distance can be obtained directly by table lookup from the layer number j and the lattice characteristics.
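To make the CET/NET encoding concrete, the following sketch builds both structures for a deliberately tiny vacancy system. It rests on several assumptions not stated in the text: a BCC lattice with coordinates in units of a/2, only the central vacancy plus its eight first neighbors, only two neighbor layers, and ragged Python lists instead of the fixed-size [N_local, 2] matrices. The names `build_net` and `LAYER_OF_D2` are hypothetical.

```python
from itertools import product

# CET: relative integer coordinates (units of a/2, BCC assumed); index = lattice point ID.
# ID 0 is the central vacancy at (0, 0, 0), IDs 1..8 are its first neighbors.
cet = [(0, 0, 0)] + list(product((-1, 1), repeat=3))

# Squared distance in units of (a/2)^2 -> neighbor layer number (two layers only).
LAYER_OF_D2 = {3: 1, 4: 2}

def build_net(cet):
    """NET[i] is a list of (neighbor ID, layer number) rows for lattice point i."""
    net = []
    for i, (xi, yi, zi) in enumerate(cet):
        rows = []
        for j, (xj, yj, zj) in enumerate(cet):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            if i != j and d2 in LAYER_OF_D2:
                rows.append((j, LAYER_OF_D2[d2]))
        net.append(rows)
    return net

net = build_net(cet)
```

Only the layer number is stored; the real distance is recovered later from the layer number and the lattice type, which is what allows small integers to be used throughout.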
For any vacancy with three-dimensional Cartesian coordinates (x, y, z), the Cartesian coordinates are first converted into (i, j, k) coordinates, and (i, j, k) is then added to each column of the CET, which yields the coordinates of all lattice points of the vacancy system. From these coordinates, the type of each lattice point can be obtained, that is, the atom type vector (VET) of the vacancy system. Each vacancy corresponds to one VET vector, which may have the same number of rows or columns as the CET matrix, as illustrated in FIG. 2. In FIG. 2, (d) takes Fe-Cu as an example, where 0 represents a vacancy, 1 represents Fe, and 2 represents Cu.
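The construction of a VET from the CET can be sketched as below. The sketch assumes the vacancy position has already been converted to integer (i, j, k) coordinates, that the global lattice is available as a dictionary from (i, j, k) to type code, and that periodic boundaries can be ignored; `build_vet` and `lattice_types` are hypothetical names.

```python
def build_vet(cet, vacancy_ijk, lattice_types):
    """Shift every relative coordinate by the vacancy's (i, j, k) and look up its type."""
    i0, j0, k0 = vacancy_ijk
    return [lattice_types[(i0 + di, j0 + dj, k0 + dk)] for di, dj, dk in cet]

# Relative coordinates of three lattice points; the central vacancy comes first.
cet = [(0, 0, 0), (1, 1, 1), (-1, -1, -1)]

# Toy global lattice: 0 = vacancy, 1 = Fe, 2 = Cu.
lattice_types = {(4, 4, 4): 0, (5, 5, 5): 1, (3, 3, 3): 2}

vet = build_vet(cet, (4, 4, 4), lattice_types)
```

The same `cet` is reused for every vacancy; only the offset (i, j, k) and the resulting type vector differ from one vacancy system to the next.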
N_local, as marked in FIG. 2, refers to the first nearest neighbor lattice points including the central vacancy, which are in a completely ordered arrangement; N_region refers to the neighbor lattice points within the truncation radius of the first nearest neighbor lattice points including the central vacancy; and N_all refers to the neighbor lattice points of the lattice points within the truncation radius. VET must cover the atoms within N_all, whereas NET only needs to cover the lattice points within N_region: the energies of the lattice points within N_region have to be calculated, which requires their neighbors, while the energies of those more distant neighbors themselves are not needed.
The advantage of this encoding is that CET and NET depend only on the lattice parameter a and the lattice type, so they need not be built separately for each vacancy system. Their values can be represented by 16-bit integers, and about 100 kB of memory is sufficient for both CET and NET. The length of the VET depends mainly on the truncation radius and generally does not exceed 1000; since an atom type can be represented by an 8-bit integer, the VET occupies very little memory (about 1 kB).
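The memory figures can be checked with simple arithmetic. The sketch below assumes 1 kB = 1000 bytes and treats the stated bounds (at most about 1000 VET entries, about 100 kB for CET and NET) as given; the helper name `table_bytes` is hypothetical.

```python
def table_bytes(num_values, bits_per_value):
    """Bytes needed to store `num_values` integers of the given width."""
    return num_values * bits_per_value // 8

# VET: at most about 1000 entries of 8 bits each.
vet_bytes = table_bytes(1000, 8)

# CET + NET: a budget of about 100 kB at 16 bits per value corresponds to
# this many stored integers.
cet_net_values = 100_000 * 8 // 16
```

Under these assumptions one VET takes about 1000 bytes, and the 100 kB budget holds on the order of 50,000 16-bit values.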
In addition, because the distance from a lattice point to its neighbors in each layer is fixed, the fingerprints can be pre-computed and stored in matrix form. Suppose there are M different sets of hyper-parameters (M different fingerprint functions) and at most N layers of neighbors within the truncation radius; the matrix then has N rows and M columns, where each row corresponds to a neighbor layer and each column corresponds to a fingerprint function.
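A minimal sketch of such a pre-computed N × M table follows. The text does not specify the fingerprint function, so the Gaussian radial term with a cosine cutoff used here is purely an assumed stand-in; the layer distances (BCC, in units of the lattice parameter), the hyper-parameters `etas`, and the cutoff `r_cut` are illustrative values.

```python
import math

def build_fingerprint_table(layer_distances, etas, r_cut):
    """Return an N x M table: row = neighbor layer, column = hyper-parameter set."""
    table = []
    for r in layer_distances:
        # Smooth cutoff that goes to zero at r_cut (assumed functional form).
        cutoff = 0.5 * (math.cos(math.pi * r / r_cut) + 1.0) if r < r_cut else 0.0
        table.append([math.exp(-eta * r * r) * cutoff for eta in etas])
    return table

a = 1.0  # lattice parameter in arbitrary units
layer_distances = [math.sqrt(3) / 2 * a, a, math.sqrt(2) * a]  # first three BCC layers
etas = [0.5, 1.0]                                              # M = 2 hyper-parameter sets
table = build_fingerprint_table(layer_distances, etas, r_cut=1.5 * a)
```

At run time a slave core then only needs `table[layer - 1][m]` for each neighbor, so no exponential or cosine is evaluated inside the simulation loop.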
On this basis, the fingerprint computation can be moved to the slave cores.
FIG. 3 shows a flow diagram of the atomic fingerprint computation. As shown in FIG. 3, the atoms in the vacancy system are first distributed to the slave cores as evenly as possible, according to the number of slave cores, by one DMA operation (Distribute Atoms). The LDM of each slave core stores one NET, one CET, one fingerprint table, and n VETs, where n is the number of atoms the slave core is responsible for.
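The "as evenly as possible" distribution can be sketched as a simple block partition. This is only an illustration of the load balancing; the DMA transfer itself is not modeled, and the function name `distribute` is hypothetical.

```python
def distribute(num_atoms, num_slaves):
    """Return one (start, stop) index range per slave core; sizes differ by at most 1."""
    base, extra = divmod(num_atoms, num_slaves)
    ranges, start = [], 0
    for core in range(num_slaves):
        stop = start + base + (1 if core < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

ranges = distribute(1000, 64)
sizes = [stop - start for start, stop in ranges]
```

With 1000 atoms and 64 slave cores, the first 40 cores receive 16 atoms each and the remaining 24 receive 15.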
Then, according to the neighbor relation and the lookup table, the atomic fingerprint calculation of the initial state (Initial State) is completed, and the result is also stored in the LDM. Next, the vacancy transitions are simulated on the slave core (Simulate Vacancy Hop): for the k-th transition, the 1st component of VET (which is always 0, i.e. the central vacancy) is exchanged with the (k+1)-th component, and at the same time the 1st matrix of NET is exchanged with the (k+1)-th matrix; the atomic fingerprint corresponding to state k is then calculated (Calculate Fingerprints for States) and stored in the LDM, after which the (k+1)-th components of VET and NET are exchanged with their respective 1st components again (equivalent to restoring the initial state). FIG. 3 shows VET (initial), VET (final 1), VET (final 2) … VET (final), together with the corresponding calculated atomic fingerprints (Feature). After all computations are completed, the results are written back to main memory by one DMA operation. The whole process requires only 2 DMA operations.
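The per-core loop over candidate transitions can be sketched as follows. The sketch uses 0-based indexing, so index k corresponds to the (k+1)-th component in the description; the fingerprint evaluation is passed in as a callback and is replaced here by a stand-in (`snapshot`) that merely records the state it was given. Skipping targets that are themselves vacancies follows the earlier description of step S106. All names are hypothetical.

```python
VACANCY = 0  # assumed type code for a vacancy

def evaluate_all_hops(vet, net, num_hops, compute):
    """Evaluate `compute` for the initial state and for each simulated hop."""
    results = {0: compute(vet, net)}            # state 0 = initial state
    for k in range(1, num_hops + 1):
        if vet[k] == VACANCY:                   # target is a vacancy: nothing changes
            continue
        vet[0], vet[k] = vet[k], vet[0]         # simulate the k-th hop
        net[0], net[k] = net[k], net[0]
        results[k] = compute(vet, net)          # fingerprint of state k
        vet[0], vet[k] = vet[k], vet[0]         # restore the initial state
        net[0], net[k] = net[k], net[0]
    return results

def snapshot(vet, net):
    """Stand-in for the real fingerprint evaluation: records the state only."""
    return tuple(vet), tuple(net)

vet = [0, 1, 2, 0]
net = ["NET0", "NET1", "NET2", "NET3"]
results = evaluate_all_hops(vet, net, 3, snapshot)
```

Because every hop is undone before the next one starts, a single copy of VET and NET per vacancy system is enough, which suits the limited local memory of a slave core.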
With the method provided by the embodiment of the invention, efficient AKMC atomic fingerprint calculation can be achieved on a heterogeneous multi-core processor. Compared with using the master core only, the speedup can reach 95% × N_slaves or higher, where N_slaves denotes the number of slave cores per master core. FIG. 4 compares the simulation times of fingerprint calculation on different processors: the horizontal axis shows X86, SW, and SW (opt), and the vertical axis shows the simulation time (Simulation Times). The truncation radius of the left panel is [Figure BDA0003742579040000121] and that of the right panel is [Figure BDA0003742579040000122].
As shown in FIG. 4, each colored bar shows the overall time consumed by the energy calculation, broken down into fingerprint calculation, energy calculation, and other. SW (opt) denotes parallel computation on the slave cores using the acceleration method of the embodiment of the invention, SW denotes computation on the master core only, and X86 denotes a high-performance AMD CPU. Taking the Sunway (Shenwei) supercomputer as an example, each master core has 64 slave cores, and the measured speedup is about 62 (64 × 97%). As FIG. 4 shows, even though the master core differs somewhat in performance from the AMD CPU, it can far exceed the AMD CPU once the slave cores are fully utilized.
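The relation between the quoted speedup and the quoted percentage is plain arithmetic, shown here only as a check on the numbers in the text; the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, n_slaves):
    """Fraction of the ideal n_slaves-fold speedup that is actually achieved."""
    return speedup / n_slaves

n_slaves = 64
efficiency = parallel_efficiency(62, n_slaves)  # measured speedup of about 62
predicted_speedup = 0.97 * n_slaves             # 97% of the ideal 64-fold speedup
```

A speedup of 62 on 64 slave cores corresponds to an efficiency of 62/64, which rounds to 97%.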
The embodiment of the invention also provides an atomic fingerprint computing device for atomic dynamics Monte Carlo simulation, which is applied to a heterogeneous multi-core processor, wherein the heterogeneous multi-core processor comprises a master core and a slave core. FIG. 5 is a schematic structural diagram of an atomic fingerprint computing device for atomic dynamics Monte Carlo simulation according to an embodiment of the present invention; the device includes:
an allocating module 501, configured to allocate multiple lattice points of a vacancy system to each slave core; each slave core respectively stores the relative coordinate, the neighbor relation, the atom type vector and the atom fingerprint table of the vacancy system; the relative coordinates are coordinates of grid points in the vacancy system relative to a central vacancy, the neighbor relation comprises distances between the grid points in the vacancy system and neighbor grid points, the atom type vector comprises an atom type of each grid point in the vacancy system, and the atom fingerprint table comprises atom fingerprints calculated based on the relative coordinates, the neighbor relation and a preset fingerprint function;
an initial state calculation module 502, configured to query the atom fingerprint table by the slave core according to the relative coordinate, the neighbor relation, and the atom type vector to obtain an atom fingerprint in an initial state;
a transition simulation module 503, configured to swap an atom type of the central vacancy with a target lattice point in the atom type vector, and swap a neighbor relation of the target lattice point with a neighbor relation of the central vacancy, so as to simulate a transition of the central vacancy to the target lattice point;
and a transition state calculation module 504, configured to query the atom fingerprint table to calculate an atom fingerprint after the simulated vacancy transition according to the atom type vector and the neighbor relation after the simulated vacancy transition to the target lattice point.
Optionally, the allocation module is specifically configured to: equally distributing a plurality of grid points of the vacancy system to each slave core according to the number of the slave cores; each slave core stores the neighbor relation of the allocated grid point.
Optionally, the apparatus further comprises a transition recovery module configured to: exchanging the central vacancy in the atom type vector with an atom type of the target lattice point, and exchanging a neighbor relationship of the central vacancy with a neighbor relationship of the target lattice point, to restore to an initial state of the vacancy system.
Optionally, the apparatus further comprises a fingerprint calculation module configured to: pre-calculate all the atom fingerprints corresponding to each lattice point in the vacancy system, and store all the atom fingerprints corresponding to each lattice point in the atom fingerprint table; the atom fingerprint table is stored in matrix form, the matrix comprises N rows and M columns, N is the number of layers of neighbor lattice points of the central vacancy within the truncation radius, and M is the number of fingerprint functions.
Optionally, the neighbor relations store neighbor relations of each lattice point in the vacancy system in a three-dimensional array form, and the neighbor relations of any lattice point include sequence numbers of neighbor lattice points of the lattice point and distances of neighbor lattice points of the lattice point.
Optionally, the vacancy system includes a central vacancy, a plurality of nearest neighbor lattice points for the central vacancy, all of the nearest neighbor atoms of the plurality of nearest neighbor lattice points within a truncation radius.
Optionally, the atom type vector is a one-dimensional vector, the column index of the one-dimensional vector is the serial number of each lattice point in the vacancy system, and the value stored in each column of the one-dimensional vector is the atom type of the corresponding lattice point.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing a control device. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments; the storage medium may be a memory, a magnetic disk, an optical disk, or the like.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The above description of the present invention is intended to be illustrative. The present invention is not limited to the above-described embodiments, and various changes and modifications may be made without departing from the spirit and scope of the present invention, and these changes and modifications fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. An atomic fingerprint calculation method for atomic dynamics Monte Carlo simulation, applied to a heterogeneous multi-core processor, wherein the heterogeneous multi-core processor comprises a master core and a slave core, the method comprising the following steps:
allocating a plurality of grid points of a vacancy system to each slave core; each slave core respectively stores the relative coordinate, the neighbor relation, the atom type vector and the atom fingerprint table of the vacancy system; the relative coordinates are coordinates of grid points in the vacancy system relative to a central vacancy, the neighbor relation comprises distances between the grid points in the vacancy system and neighbor grid points, the atom type vector comprises an atom type of each grid point in the vacancy system, and the atom fingerprint table comprises atom fingerprints calculated based on the relative coordinates, the neighbor relation and a preset fingerprint function;
the slave core queries the atom fingerprint table according to the relative coordinate, the neighbor relation and the atom type vector to obtain an atom fingerprint in an initial state;
exchanging a target lattice point in the atom type vector with the atom type of the central vacancy, and exchanging a neighbor relation of the target lattice point with a neighbor relation of the central vacancy, to simulate the central vacancy transitioning to the target lattice point;
and the slave core queries the atom fingerprint table, according to the neighbor relation and the atom type vector after the simulated transition of the central vacancy to the target lattice point, to calculate the atom fingerprint after the simulated vacancy transition.
2. The method of claim 1, wherein assigning the plurality of grid points of the vacancy system to each slave core comprises:
equally distributing a plurality of grid points of the vacancy system to each slave core according to the number of the slave cores; each slave core stores the neighbor relation of the allocated grid point.
3. The method of claim 1, wherein after said querying said atom fingerprint table to obtain the atom fingerprint after the simulated vacancy transition, said method further comprises:
exchanging the central vacancy in the atom type vector with an atom type of the target lattice point, and exchanging a neighbor relation of the central vacancy with a neighbor relation of the target lattice point to restore to an initial state of the vacancy system.
4. The method of claim 1, further comprising:
pre-calculating all the atom fingerprints corresponding to each lattice point in the vacancy system, and storing all the atom fingerprints corresponding to each lattice point in the atom fingerprint table;
the atomic fingerprint table is stored in a matrix form, the matrix comprises N rows and M columns, N is the number of layers of adjacent lattice points of the central vacancy in the truncation radius, and M is the number of fingerprint functions.
5. The method of claim 1, wherein the neighbor relations store neighbor relations of each lattice point in the vacancy system in a three-dimensional array, and wherein the neighbor relations of any lattice point include a sequence number of a neighbor lattice point of the lattice point and a distance of the neighbor lattice point of the lattice point.
6. The method of claim 1, wherein the vacancy system includes a central vacancy, a plurality of nearest neighbor lattice points for the central vacancy, all neighboring atoms of the plurality of nearest neighbor lattice points that are within a truncation radius.
7. The method of claim 1, wherein the atom type vector is a one-dimensional vector, the column index of the one-dimensional vector is the sequence number of each lattice point in the vacancy system, and each column of the one-dimensional vector stores the atom type of the corresponding lattice point.
8. An atomic fingerprint computing device for atomic dynamics Monte Carlo simulation, applied to a heterogeneous multi-core processor, wherein the heterogeneous multi-core processor comprises a master core and a slave core, the device comprising:
the allocation module is used for allocating a plurality of grid points of the vacancy system to each slave core; each slave core respectively stores the relative coordinate, the neighbor relation, the atom type vector and the atom fingerprint table of the vacancy system; the relative coordinates are coordinates of grid points in the vacancy system relative to a central vacancy, the neighbor relation comprises distances between the grid points in the vacancy system and neighbor grid points, the atom type vector comprises an atom type of each grid point in the vacancy system, and the atom fingerprint table comprises atom fingerprints calculated based on the relative coordinates, the neighbor relation and a preset fingerprint function;
the initial state calculation module is used for querying the atom fingerprint table by the slave core according to the relative coordinate, the neighbor relation and the atom type vector to obtain an atom fingerprint of an initial state;
a transition simulation module for exchanging a target lattice point in the atom type vector with an atom type of the central vacancy, and exchanging a neighbor relation of the target lattice point with a neighbor relation of the central vacancy, to simulate the central vacancy transitioning to the target lattice point;
and the transition state calculation module is used for inquiring the atomic fingerprint table to calculate and obtain the atomic fingerprint after the simulated vacancy transition according to the neighbor relation and the atomic type vector after the simulated vacancy transition to the target lattice point.
9. The apparatus of claim 8, wherein the assignment module is specifically configured to:
equally distributing a plurality of grid points of the vacancy system to each slave core according to the number of the slave cores; each slave core stores the neighbor relation of the allocated grid point.
10. The apparatus of claim 8, further comprising a transition recovery module to:
exchanging the central vacancy in the atom type vector with an atom type of the target lattice point, and exchanging a neighbor relation of the central vacancy with a neighbor relation of the target lattice point to restore to an initial state of the vacancy system.
CN202210816365.6A 2022-07-12 2022-07-12 Atomic fingerprint calculation method and device for atomic dynamics Monte Carlo simulation Active CN115394364B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210816365.6A CN115394364B (en) 2022-07-12 2022-07-12 Atomic fingerprint calculation method and device for atomic dynamics Monte Carlo simulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210816365.6A CN115394364B (en) 2022-07-12 2022-07-12 Atomic fingerprint calculation method and device for atomic dynamics Monte Carlo simulation

Publications (2)

Publication Number Publication Date
CN115394364A true CN115394364A (en) 2022-11-25
CN115394364B CN115394364B (en) 2024-02-02

Family

ID=84115828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210816365.6A Active CN115394364B (en) 2022-07-12 2022-07-12 Atomic fingerprint calculation method and device for atomic dynamics Monte Carlo simulation

Country Status (1)

Country Link
CN (1) CN115394364B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110320176A1 (en) * 2009-12-18 2011-12-29 Georgia Inst of Tech, Office of Technology Licensing Screening metal organic framework materials
CN107239352A (en) * 2017-05-31 2017-10-10 北京科技大学 The communication optimization method and its system of a kind of dynamics Monte Carlo Parallel Simulation
CN108959709A (en) * 2018-06-04 2018-12-07 中国科学院合肥物质科学研究院 Grain boundary structure searching method based on defect property and multi-scale Simulation
CN110120248A (en) * 2019-04-08 2019-08-13 中国科学院合肥物质科学研究院 The method that simulation nanocrystalline metal accumulates damage of offing normal
CN110459269A (en) * 2019-08-07 2019-11-15 中国原子能科学研究院 A kind of multi-scale coupling analogy method of nuclear reactor material irradiation damage
CN110570910A (en) * 2019-08-19 2019-12-13 华中科技大学 Method and system for reducing dislocation density of growing gallium nitride film
CN111444134A (en) * 2020-03-24 2020-07-24 山东大学 Parallel PME (pulse-modulated emission) accelerated optimization method and system of molecular dynamics simulation software
CN111814315A (en) * 2020-06-17 2020-10-23 中国科学院合肥物质科学研究院 Method for calculating dynamic property of defect cluster in metal material
CN113223641A (en) * 2021-04-28 2021-08-06 中国科学院合肥物质科学研究院 Simulation method for jumping out of super potential valley by vacancy clusters

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIANJIANG LI: "Crystal-KMC: parallel software for lattice dynamics monte carlo simulation of metal materials", TSINGHUA SCIENCE AND TECHNOLOGY, vol. 23, no. 4 *
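吴子若;程鑫彬;王占山;: "Simulation of Cu thin film growth by the kinetic lattice Monte Carlo method", Acta Photonica Sinica, no. 01 *
尚子豪: "Optimization and application of the atomic kinetic Monte Carlo program OpenKMC in the study of defect damage in reactor pressure vessel steel", Computer Engineering & Science *
王栋;商红慧;张云泉;李琨;贺新福;贾丽霞;: "Application of the atomic kinetic Monte Carlo program MISA-KMC in the study of irradiation damage in reactor pressure vessel steel", Computer Science, no. 04 *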

Also Published As

Publication number Publication date
CN115394364B (en) 2024-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant