CN112861333B - OpenMP and MPI-based method and device for calculating effect of electromagnetic waves and plasma - Google Patents

Info

Publication number
CN112861333B
Authority
CN
China
Prior art keywords
time
mpi
calculation
openmp
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110124410.7A
Other languages
Chinese (zh)
Other versions
CN112861333A (en
Inventor
何凌磊
陈靓
元光
闫玉波
郝书吉
满莉
李清亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
China Research Institute of Radio Wave Propagation CRIRP
Original Assignee
Ocean University of China
China Research Institute of Radio Wave Propagation CRIRP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China, China Research Institute of Radio Wave Propagation CRIRP filed Critical Ocean University of China
Priority to CN202110124410.7A priority Critical patent/CN112861333B/en
Publication of CN112861333A publication Critical patent/CN112861333A/en
Application granted granted Critical
Publication of CN112861333B publication Critical patent/CN112861333B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation

Abstract

The invention provides a method and a device, based on OpenMP and MPI, for calculating the interaction between electromagnetic waves and plasma. The calculation method comprises the following steps: step Y, establishing a solution model of the interaction between the electromagnetic waves and the plasma and discretizing it; step X, performing adaptive calculation on the discretized solution model to determine the number of OpenMP threads and the number of MPI processes; step U, solving the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction between the electromagnetic waves and the plasma. The invention uses OpenMP to parallelize the data computation module, which reduces the parallel overhead caused by the communication operations that MPI parallelization requires and saves the time needed for numerical simulation. During time iteration, the data storage module is parallelized with MPI, which improves the running efficiency of the whole program.

Description

OpenMP and MPI-based method and device for calculating effect of electromagnetic waves and plasma
Technical Field
The invention relates to the technical field of analog simulation, in particular to a computing method and device for electromagnetic wave and plasma action based on OpenMP and MPI.
Background
When a high-frequency electromagnetic wave interacts nonlinearly with plasma, both high-frequency and low-frequency components are present. In numerical simulation, the spatial scales to be resolved span from the order of km down to the order of cm, which tightly constrains the grid spacing (grid size), and the corresponding time step determined by the Courant-Friedrichs-Lewy condition is on the order of $10^{-9}$ to $10^{-10}$ s. When the nonlinear interaction between the high-frequency electromagnetic wave and the plasma is simulated serially at full scale and over the full space, the total number of loop iterations reaches the order of $10^{6}$ to $10^{7}$ even for a physical phenomenon lasting only milliseconds. Obtaining a simulation result therefore consumes a large amount of computer resources, and serial operation can no longer meet current demand.
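The orders of magnitude quoted above can be checked with a short calculation. The following is only a sketch: it assumes a one-dimensional Courant-Friedrichs-Lewy bound of the form dt <= dx / c, and the grid spacings (0.3 m and 0.03 m) and function names are hypothetical values chosen for illustration, not figures taken from the invention.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def cfl_time_step(dx, courant=1.0):
    """Largest time step allowed by a 1D CFL bound: dt <= courant * dx / c."""
    return courant * dx / C


def total_iterations(t_end, dt):
    """Number of time steps needed to cover a physical duration t_end."""
    return math.ceil(t_end / dt)


# Hypothetical grid spacings (metres), chosen only to illustrate the scaling.
for dx in (0.3, 0.03):
    dt = cfl_time_step(dx)
    steps = total_iterations(1e-3, dt)  # simulate 1 ms of physical time
    print(f"dx = {dx} m, dt = {dt:.2e} s, steps for 1 ms = {steps:.2e}")
```

Under these assumptions a millisecond of simulated time costs roughly a million to ten million iterations, which is the load that motivates parallelization.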
Disclosure of Invention
The invention provides a method and a device for calculating the interaction between electromagnetic waves and plasmas based on OpenMP and MPI, and aims to solve the technical problem of how to perform efficient simulation calculation on the interaction between the electromagnetic waves and the plasmas.
The method for calculating the interaction between electromagnetic waves and plasma based on OpenMP and MPI comprises the following steps:
step Y, establishing a solution model of the interaction between the electromagnetic waves and the plasma and discretizing it;
step X, performing adaptive calculation on the discretized solution model to determine the number of OpenMP threads and the number of MPI processes;
step U, solving the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction between the electromagnetic waves and the plasma.
According to the calculation method of the embodiment of the invention, data computation is parallelized with OpenMP while data storage is parallelized with MPI, so that a program framework based on hybrid OpenMP and MPI programming is built for simulating the interaction between electromagnetic waves and plasma. The invention effectively shortens the computation time of full-scale, full-space simulation of this interaction; hybrid OpenMP + MPI programming reduces parallel overhead while also reducing the risk of memory access conflicts; and because the program adaptively calculates and allocates the numbers of threads and processes, it is more portable, suits computer platforms with a variety of hardware parameters, and exploits the hardware potential of the platform as far as possible.
According to some embodiments of the invention, the discretizing of the solution model in step Y comprises:
based on the finite-difference time-domain method, performing space-time discretization on the solution model according to the simulation environment and preset requirements on the stability and convergence of the solution model, to determine the time step and the space step of the solution model.
In some embodiments of the present invention, a total number of time steps is determined according to the time step and a preset time range, and in step U, when the discrete solution model is solved, iterative computation is performed on the solution model by using the total number of time steps as a loop iteration end condition.
According to some embodiments of the invention, the step X comprises:
step X1, running the discretized solution model serially for a preset number of loop iterations;
step X2, based on the calculation result of the step X1, acquiring a plurality of time parameters by using an MPI timing function;
and step X3, calculating and determining the thread number of the OpenMP and the process number of the MPI according to the preset relation between the hardware information of the computer platform and the time parameter information.
In some embodiments of the invention, in the step X2, the plurality of time parameters include: the average time of the serial computing part, the average time of the parallel computing part, the average time of data updating and storage, and the time of communication among processes;
the hardware information of the computer platform in the step X3 includes: memory size and CPU core count.
The computing device based on the electromagnetic wave and plasma action of OpenMP and MPI comprises:
a modeling module, used for establishing a solution model of the interaction between the electromagnetic waves and the plasma and discretizing it;
a parallel parameter determination module, used for performing adaptive calculation on the discretized solution model to determine the number of OpenMP threads and the number of MPI processes;
a calculation module, used for solving the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction between the electromagnetic waves and the plasma.
According to the computing device of the embodiment of the invention, OpenMP is used to parallelize the finite-difference time-domain numerical computation module, which reduces the parallel overhead caused by the communication operations that MPI parallelization requires and effectively saves the time needed for numerical simulation. During time iteration, the data storage module is parallelized with MPI, which effectively improves the running efficiency of the whole program; and because storage is handled by separate processes, each process has its own memory and variables, which avoids conflicts.
According to some embodiments of the invention, the computing device further comprises:
a discretization module, used for performing space-time discretization on the solution model based on the finite-difference time-domain method, according to the simulation environment and preset requirements on the stability and convergence of the solution model, so as to determine the time step and the space step of the solution model.
In some embodiments of the present invention, the discretization module determines a total time step number according to the time step length and a preset time range, and the calculation module performs iterative calculation on the solution model by using the total time step number as a loop iteration end condition when solving the solution model after discretization.
According to some embodiments of the invention, the parallel parameter determination module comprises:
an adaptive calculation module, used for running the discretized solution model serially for a preset number of loop iterations;
the parameter acquisition module is used for acquiring a plurality of time parameters by using an MPI timing function based on the calculation result of the self-adaptive calculation module;
and the parameter calculation module is used for calculating and determining the thread number of the OpenMP and the process number of the MPI according to the preset relationship between the hardware information of the computer platform and the plurality of time parameter information.
In some embodiments of the invention, the plurality of time parameters comprise: the average time of the serial computing part, the average time of the parallel computing part, the average time of data updating and storage, and the time of communication among processes;
the hardware information of the computer platform comprises: memory size and CPU core count.
Drawings
FIG. 1 is a flowchart of a method for calculating the electromagnetic wave and plasma interaction based on OpenMP and MPI according to an embodiment of the present invention;
FIG. 2 is a block diagram of a computing framework for electromagnetic wave and plasma interaction based on OpenMP and MPI according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a computing device based on the electromagnetic wave and plasma interaction of OpenMP and MPI according to an embodiment of the present invention;
FIG. 4 is a graph of the simulated electromagnetic field, plasma velocity and plasma density at 5.0035 ms, obtained with the calculation method of electromagnetic wave and plasma interaction based on OpenMP and MPI according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating data accuracy comparison of a calculation method of electromagnetic wave and plasma action based on OpenMP and MPI according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an acceleration ratio of a parallel processor system program run to a single processor system program run according to an embodiment of the present invention.
Reference numerals:
computing device 100,
modeling module 10, parallel parameter determination module 20, adaptive calculation module 210, parameter acquisition module 220, parameter calculation module 230, calculation module 30, and discretization module 40.
Detailed Description
To further explain the technical means and effects of the present invention adopted to achieve the intended purpose, the present invention will be described in detail with reference to the accompanying drawings and preferred embodiments.
Large-scale parallel computing has become the mainstream direction of scientific computing, and a variety of parallel technologies have developed along with it. MPI (Message Passing Interface) is an application programming interface (API) for parallel computing. It was established in the single-core era for commercial multi-computer (multi-processor) clusters of hundreds of CPUs, and in the multi-core era it can be used on multi-CPU clusters as well as on multi-core CPUs. OpenMP (Open Multi-Processing) is a shared-memory standard: a parallel programming interface (API) designed for writing parallel programs on multi-core CPUs, and particularly suited to them.
With the development of large-scale integrated circuit technology, a single cluster node can carry more and more CPU cores. On shared memory no message-passing operations are needed, so OpenMP has lower parallel overhead, and in this situation hybrid OpenMP + MPI programming executes much faster than traditional pure MPI programming. In addition, OpenMP threads can make use of the capacity left when some MPI processes are idle while others are busy. Therefore, in view of the characteristics of full-scale simulation of the interaction between electromagnetic waves and plasma, the invention provides a parallel scheme based on hybrid OpenMP + MPI programming.
The invention provides a computing method and device for electromagnetic wave and plasma action based on OpenMP and MPI.
It should be noted that the execution sequence of the step flow involved in the method of the present invention is not necessarily related to the step numbers. For example, step Y is not performed after step X.
As shown in fig. 1 and fig. 2, a method for calculating an electromagnetic wave and plasma interaction based on OpenMP and MPI according to an embodiment of the present invention includes:
step Y, establishing a solution model of the interaction between the electromagnetic waves and the plasma and discretizing it;
step X, performing adaptive calculation on the discretized solution model to determine the number of OpenMP threads and the number of MPI processes;
step U, solving the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction between the electromagnetic waves and the plasma.
According to the calculation method of the embodiment of the invention, data computation is parallelized with OpenMP while data storage is parallelized with MPI, so that a program framework based on hybrid OpenMP and MPI programming is built for simulating the interaction between electromagnetic waves and plasma. The invention effectively shortens the computation time of full-scale, full-space simulation of this interaction; hybrid OpenMP + MPI programming reduces parallel overhead while also reducing the risk of memory access conflicts; and because the program adaptively calculates and allocates the numbers of threads and processes, it is more portable, suits computer platforms with a variety of hardware parameters, and exploits the hardware potential of the platform as far as possible.
According to some embodiments of the invention, discretizing the solution model in step Y comprises:
based on the finite-difference time-domain method, performing space-time discretization on the solution model according to the simulation environment and preset requirements on the stability and convergence of the solution model, to determine the time step and the space step of the solution model.
For example, a classical Yee structure may be adopted, with space-time discretization carried out in a leapfrog (alternating) scheme. Considering the interaction between plasma and electromagnetic waves in the ionosphere, the time step and the space step should satisfy the Courant-Friedrichs-Lewy condition. In addition, when the finite-difference time-domain method is used to solve the model, replacing the second derivative in the wave equation with a finite difference introduces some numerical dispersion. Therefore, to reduce the numerical dispersion caused by the finite-difference approximation, the space step is required to satisfy
Figure BDA0002923145830000061
The updated result needs to be subjected to a discrete Fourier transform to obtain a spectrogram. The frequency resolution of the discrete Fourier transform is determined by
Figure BDA0002923145830000062
where N is the number of samples and Δ is the sampling time interval; the maximum frequency of the discrete Fourier transform is determined by
Figure BDA0002923145830000063
To resolve ion acoustic waves on the order of kHz in the spectrogram, the sampling time interval of the finite-difference time-domain simulation can be determined from the maximum frequency and the frequency resolution, which gives the storage period.
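How a storage period might follow from the spectral requirements can be sketched as below. Because the equation images are not reproduced in this text, the sketch assumes the standard discrete Fourier transform relations, namely a frequency resolution of 1/(NΔ) and a maximum (Nyquist) frequency of 1/(2Δ); the function name, the example values and the small floating-point guards are illustrative assumptions rather than part of the described method.

```python
import math


def sampling_plan(dt, f_max, df):
    """Return (storage_period, delta, n_samples) for a simulation time step dt.

    Assumes f_max = 1 / (2 * delta) and df = 1 / (n_samples * delta).
    """
    delta_limit = 1.0 / (2.0 * f_max)  # largest sampling interval that still reaches f_max
    # Time steps between stored snapshots; 1e-9 guards against float round-off.
    storage_period = max(1, math.floor(delta_limit / dt + 1e-9))
    delta = storage_period * dt  # sampling interval actually realised
    # Number of stored samples needed to resolve df; 1e-9 guards against round-off.
    n_samples = math.ceil(1.0 / (df * delta) - 1e-9)
    return storage_period, delta, n_samples


# Hypothetical values: 1 ns time step, 10 MHz maximum frequency, 1 kHz resolution.
period, delta, n_samples = sampling_plan(dt=1e-9, f_max=10e6, df=1e3)
print(period, delta, n_samples)
```

Storing only every `storage_period`-th step keeps the output volume manageable while still resolving kHz-scale features, provided enough samples are accumulated.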
In some embodiments of the present invention, the total time step number is determined according to the time step length and the preset time range, and in step U, when the discrete solution model is solved, the total time step number is used as a loop iteration ending condition to perform iterative computation on the solution model.
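A leapfrog update loop terminated by the total number of time steps can be illustrated with a deliberately simplified case. The sketch below is a one-dimensional vacuum Yee scheme in normalized units with no plasma terms, reflecting walls instead of absorbing boundaries, and a hypothetical Gaussian source; it is not the solution model of the invention, only an illustration of the alternating E/H update and of the loop end condition.

```python
import math


def run_fdtd_1d(n_cells=200, total_steps=300, courant=0.5):
    """1D vacuum Yee/leapfrog update in normalized units (c = 1, dx = 1).

    `courant` is c * dt / dx and must not exceed 1 for stability.
    """
    ez = [0.0] * n_cells        # electric field on integer grid points
    hy = [0.0] * (n_cells - 1)  # magnetic field on half-integer grid points
    for n in range(total_steps):  # total number of time steps ends the loop
        # First half of the leapfrog: advance H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Second half: advance E from the spatial difference of H.
        # The two end points are never updated, so they act as reflecting walls.
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Hypothetical soft Gaussian source at the centre of the grid.
        ez[n_cells // 2] += math.exp(-((n - 30) / 10.0) ** 2)
    return ez, hy


ez, hy = run_fdtd_1d()
print(max(abs(v) for v in ez))
```

In a full model the inner loops over grid points are the part that lends itself to thread-level parallelization, since each point is updated only from neighbouring values of the other field.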
Carrying out overall framework design on a program through the established numerical solution model, the discrete scheme, the numerical stability condition and the storage condition, wherein the calculation steps of the solution model are as follows:
step 1, initializing a program, and declaring numerical simulation related variables;
step 2, setting initial conditions, namely setting a space range, a time range, absorption boundary conditions and the like according to the specific conditions of the simulated area and giving initial values of related physical parameters;
step 3, obtaining the total time step number through the time range set in the step 2 and the obtained time step length, and performing iterative operation of the numerical calculation module and the numerical storage module by taking the total time step number as a cycle iteration ending condition;
step 4, performing the program's adaptive calculation and allocation to determine the number of processes to be opened by MPI and the number of threads that can be opened by OpenMP;
step 5, initializing a message passing interface MPI;
step 6, the numerical calculation module mainly comprises the steps of solving numerical iteration update of each variable and adopting OpenMP to carry out parallelization operation;
step 7, the numerical value storage module mainly stores the solved variable updating results one by one and adopts MPI to perform parallelization operation;
step 8, after the loop iterations over the total number of time steps are completed, terminating the MPI parallel environment.
according to some embodiments of the invention, step X comprises:
step X1, running the discretized solution model serially for a preset number of loop iterations;
step X2, based on the calculation result of the step X1, acquiring a plurality of time parameters by using an MPI timing function;
and step X3, calculating and determining the thread number of the OpenMP and the process number of the MPI according to the preset relationship between the hardware information of the computer platform and the plurality of time parameter information.
In some embodiments of the invention, in step X2, the plurality of time parameters comprises: the average time of the serial computing part, the average time of the parallel computing part, the average time of data updating and storage, and the time of communication among processes;
the hardware information of the computer platform in step X3 includes: memory size and CPU core count.
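The measurement stage can be sketched as follows. This is an illustration only: `time.perf_counter` stands in for an MPI wall-clock timer (the MPI standard provides `MPI_Wtime` for this purpose), the three workload functions are hypothetical placeholders for the real program parts, and inter-process communication time and memory size are not measured because neither can be obtained portably without MPI or platform-specific calls.

```python
import os
import time


def average_time(func, repeats):
    """Average wall-clock seconds per call of func over `repeats` calls."""
    start = time.perf_counter()  # stand-in for an MPI wall-clock timer
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


# Hypothetical placeholders for the three timed parts of the program.
def parallelizable_part():
    sum(i * i for i in range(2000))


def serial_part():
    sum(range(200))


def storage_part():
    "".join(str(i) for i in range(500))


timings = {
    "parallel_part": average_time(parallelizable_part, 100),
    "serial_part": average_time(serial_part, 100),
    "storage": average_time(storage_part, 100),
}
# os.cpu_count() reports logical CPUs, which may exceed the physical core count.
cpu_cores = os.cpu_count() or 1
print(timings, cpu_cores)
```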
For example, during the adaptive test the serial program is first run for 1000 loop iterations. An MPI timing function is used to obtain, for one storage period, the average time of the part of the numerical calculation module that can be executed in parallel, the average time of the serial part of the numerical calculation module, and the average time of the numerical storage module, denoted p_T, u_T and s_T respectively, together with the communication time c_T between the processes of the current program. A hardware query is then performed to obtain the hardware information of the current computer platform: memory size and number of CPU cores. The number of processes opened by MPI and the number of threads opened by OpenMP are then adaptively calculated and allocated according to the following three principles:
number of threads / number of processes > ((u_T + c_T) × number of threads + p_T) × 1.2 / s_T;
total number of threads + total number of processes < number of CPU cores × 2;
memory occupied by the calculation module variables + memory occupied by the storage module variables × total number of processes < hardware memory.
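The three principles can be expressed as a feasibility check over candidate thread and process counts. The sketch below transcribes the inequalities literally as written; the function names, the enumeration over candidates and all numeric values are hypothetical, and the text does not state how a single pair is chosen among the feasible ones. The third principle is read here as calculation memory plus (storage memory times number of processes), which is an assumption about operator precedence.

```python
def satisfies_principles(threads, procs, p_T, u_T, s_T, c_T,
                         cpu_cores, calc_mem, store_mem, hw_mem):
    """Check one (threads, procs) candidate against the three stated principles."""
    # Principle 1: timing relation, transcribed literally from the text.
    timing_ok = threads / procs > ((u_T + c_T) * threads + p_T) * 1.2 / s_T
    # Principle 2: threads plus processes stay below twice the CPU core count.
    cores_ok = threads + procs < cpu_cores * 2
    # Principle 3: total memory footprint stays below the hardware memory.
    memory_ok = calc_mem + store_mem * procs < hw_mem
    return timing_ok and cores_ok and memory_ok


def feasible_pairs(max_threads, max_procs, **params):
    """Enumerate every (threads, procs) pair that passes all three checks."""
    return [(t, p)
            for t in range(1, max_threads + 1)
            for p in range(1, max_procs + 1)
            if satisfies_principles(t, p, **params)]


# Hypothetical measurements (seconds) and hardware figures (GB).
params = dict(p_T=0.8, u_T=0.01, s_T=4.0, c_T=0.001,
              cpu_cores=8, calc_mem=2.0, store_mem=1.0, hw_mem=16.0)
print(satisfies_principles(6, 3, **params))
print(len(feasible_pairs(16, 16, **params)))
```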
It should be noted that the numerical calculation module updates each solution variable and is parallelized with OpenMP. The spatial range of the simulation region is divided into as many sub-regions as the total number of threads opened by OpenMP. Because OpenMP threads share memory, no communication is needed between adjacent threads, which reduces parallel overhead.
The numerical storage module is parallelized with MPI. The total number of processes required, n + 1, is determined by the adaptive allocation. Process 0 serves as the main process and performs program initialization, numerical iterative computation, forwarding of numerical results and similar operations, while processes 1 to n receive and store data. Process 0 iterates over the total number of time steps and, after a certain number of iterations, forwards the updated variables that need to be stored to the corresponding process for storage; once the forwarding is complete, process 0 continues its iterative computation. Each of processes 1 to n stores the data it receives from process 0, then waits for the next data from process 0 and stores that in turn, repeating until the time iteration is finished.
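The division of labour between process 0 and processes 1 to n can be illustrated with a small stand-in. The sketch below uses Python threads and queues purely to mimic the send and receive pattern; it is not MPI code, the round-robin choice of the receiving worker is an assumption (the text says only "the corresponding process"), and the field update and the storage operation are placeholders.

```python
import queue
import threading


def run_pipeline(total_steps, storage_period, n_storage):
    """Mimic process 0 (compute and forward) and processes 1..n (receive and store)."""
    # Index r here plays the role of storage process r + 1.
    inboxes = [queue.Queue() for _ in range(n_storage)]
    stored = [[] for _ in range(n_storage)]

    def storage_worker(rank):
        while True:
            item = inboxes[rank].get()  # wait for data from the main worker
            if item is None:            # sentinel: time iteration finished
                break
            stored[rank].append(item)   # placeholder for writing to disk

    workers = [threading.Thread(target=storage_worker, args=(r,))
               for r in range(n_storage)]
    for w in workers:
        w.start()

    field = 0.0
    snapshot = 0
    for step in range(1, total_steps + 1):
        field += 1.0                    # placeholder for the numerical update
        if step % storage_period == 0:
            # Round-robin forwarding (assumption); the main loop then continues
            # computing without waiting for the storage to finish.
            inboxes[snapshot % n_storage].put((step, field))
            snapshot += 1

    for box in inboxes:
        box.put(None)                   # tell every storage worker to stop
    for w in workers:
        w.join()
    return stored


print(run_pipeline(total_steps=10, storage_period=2, n_storage=2))
```

The point of the pattern is that the computing side hands off a snapshot and immediately resumes iterating, so storage latency overlaps with computation instead of stalling it.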
With reference to fig. 1 and 3, a computing device 100 for the interaction of electromagnetic waves and plasma based on OpenMP and MPI according to an embodiment of the present invention includes: a modeling module 10, a parallel parameter determination module 20 and a calculation module 30.
Specifically, the modeling module 10 is configured to establish a solution model of the interaction between the electromagnetic wave and the plasma and discretize it;
the parallel parameter determination module 20 is configured to perform adaptive calculation on the discretized solution model, and determine a thread number of OpenMP and a process number of MPI;
the calculation module 30 is configured to solve the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction between the electromagnetic wave and the plasma.
According to the computing device 100 of the embodiment of the invention, OpenMP is used to parallelize the finite-difference time-domain numerical computation module 30, which reduces the parallel overhead caused by the communication operations that MPI parallelization requires and effectively saves the time needed for numerical simulation. During time iteration, the data storage module is parallelized with MPI, which effectively improves the running efficiency of the whole program; and because storage is handled by separate processes, each process has its own memory and variables, which avoids conflicts.
According to some embodiments of the present invention, the computing apparatus 100 further includes a discretization module 40, configured to perform space-time discretization on the solution model based on a finite difference time domain method according to preset requirements of the simulation environment and stability and convergence of the solution model, so as to determine a time step and a space step of the solution model. The discretization method of the discretization module 40 for solving the model is similar to the foregoing and will not be described herein.
In some embodiments of the present invention, the discretization module 40 determines a total time step number according to the time step and the preset time range, and the calculation module performs iterative calculation on the solution model by using the total time step number as a loop iteration ending condition when the solution model after discretization is solved. The iterative solution process for solving the model is described above, and is not described herein again.
According to some embodiments of the invention, the parallel parameter determination module 20 comprises: an adaptation calculation module 210, a parameter acquisition module 220, and a parameter calculation module 230.
The adaptive calculation module 210 is configured to run the discretized solution model serially for a preset number of loop iterations;
the parameter obtaining module 220 obtains a plurality of time parameters by using an MPI timing function based on the calculation result of the adaptive calculation module 210;
the parameter calculating module 230 calculates and determines the thread number of the OpenMP and the process number of the MPI according to the preset relationship between the hardware information of the computer platform and the plurality of time parameter information.
In some embodiments of the invention, the plurality of time parameters comprises: the average time of the serial computing part, the average time of the parallel computing part, the average time of data updating and storage, and the time of communication among processes;
the hardware information of the computer platform comprises: memory size and CPU core count.
The method for the adaptive module to obtain the number of threads of OpenMP and the number of processes of MPI is described in the foregoing description and is not repeated herein.
The following describes a method and an apparatus for calculating electromagnetic wave and plasma interaction based on OpenMP and MPI according to the present invention in a specific embodiment with reference to the accompanying drawings. It is to be understood that the following description is only exemplary in nature and should not be taken as a specific limitation on the invention.
The parallel method for simulating the interaction of electromagnetic waves and plasma based on the OpenMP + MPI technology comprises the following execution steps:
step 1, establishing a solving model;
step 2, based on the classical Yee structure of the finite-difference time-domain method, performing space-time discretization in a leapfrog (alternating) scheme;
step 3, considering the interaction between plasma and electromagnetic waves in the ionosphere, making the time step and the space step satisfy the Courant-Friedrichs-Lewy condition;
step 4, considering that when the finite-difference time-domain method is used to solve the model, replacing the second derivative in the wave equation with a finite difference introduces some numerical dispersion; therefore, to reduce the numerical dispersion caused by the finite-difference approximation, the space step is required to satisfy
Figure BDA0002923145830000101
step 5, performing a discrete Fourier transform on the updated result to obtain a spectrogram, wherein the frequency resolution of the discrete Fourier transform is determined by
Figure BDA0002923145830000102
where N is the number of samples and Δ is the sampling time interval, and the maximum frequency of the discrete Fourier transform is determined by
Figure BDA0002923145830000111
To resolve ion acoustic waves on the order of kHz in the spectrogram, the sampling time interval of the finite-difference time-domain simulation can be determined from the maximum frequency and the frequency resolution, which gives the storage period.
Step 6, initializing the program, declaring the variables used in the numerical simulation, initializing the Message Passing Interface (MPI), and determining the total number of processes and the number (rank) of each process;
Step 7, setting the initial conditions, namely setting the spatial range, the time range, the absorbing boundary conditions and the like according to the specifics of the simulated region, and assigning initial values to the relevant physical parameters;
Step 8, obtaining the total number of time steps from the time range set in step 7 and the time step computed in steps 3 and 4 so as to satisfy the numerical stability and dispersion conditions, and iterating the numerical calculation module and the numerical storage module with the total number of time steps as the loop condition;
Step 9, first running 1000 iterations of the serial program and using the MPI timing function to obtain, for one storage period, the average time of the part of the numerical calculation module that can be executed in parallel, the average time of the serial part of the numerical calculation module, and the average time of the numerical storage module, denoted p_T, u_T and s_T respectively, together with the inter-process communication time c_T of the current program; then performing a hardware query to obtain the hardware information of the current computer platform: memory size and number of CPU cores. The number of processes started by MPI and the number of threads started by OpenMP are then allocated adaptively according to the following three rules:
thread count / process count > ((u_T + c_T) × thread count + p_T) × 1.2 / s_T;
total number of threads + total number of processes < number of CPU cores × 2;
memory occupied by the calculation-module variables + memory occupied by the storage-module variables × total number of processes < hardware memory.
Step 10, the numerical calculation module mainly updates the value of each solved variable and is parallelized with OpenMP;
Step 11, the numerical storage module mainly stores the updated values of the solved variables one by one and is parallelized with MPI;
Step 12, completing the loop over the total number of time steps and finalizing the MPI parallel environment.
In step 10, the variable updates of the numerical calculation module are parallelized with OpenMP. A reasonable number of threads is set according to the hardware resources of the computer and the processes started for the storage operation, and the finite-difference time-domain computational region is then divided into as many sub-regions as there are threads. Because OpenMP threads share memory, no communication is needed between adjacent threads, which reduces parallel overhead.
In step 11, the numerical storage module is parallelized with MPI. The total number of required processes, n + 1, is determined from the storage requirement, the time required for a single numerical calculation, the time required for a single storage operation, and the hardware resources of the computer. Process 0 acts as the main process and performs program initialization, the iterative numerical calculation, forwarding of the calculation results and similar operations, while processes 1 to n receive and store data.
Process 0 iterates over the total number of time steps and, after a certain number of iterations, forwards the updated variables to be stored to the corresponding process for storage; once the forwarding is complete, process 0 continues the iterative calculation. Processes 1 to n store the data received from process 0, then wait for the next data from process 0 and perform the next storage operation, repeating until the time iteration ends.
To verify the correctness and effectiveness of the invention, this embodiment uses typical EISCAT ionospheric experimental conditions to demonstrate feasibility, and obtains numerical simulation results for the electric field, magnetic field, plasma density and plasma velocity during the nonlinear interaction of electromagnetic waves with the ionosphere on millisecond time scales over the full-scale space.
A solution model was established as follows:
Figure BDA0002923145830000121
Figure BDA0002923145830000122
Figure BDA0002923145830000123
Figure BDA0002923145830000124
the magnetic field disturbance H has components in both x and y directions
Figure BDA0002923145830000125
The electric field disturbance E and the plasma velocity U have components along the x, y and z directions, respectively
Figure BDA0002923145830000131
Figure BDA0002923145830000132
are the unit vectors in the x, y and z directions, respectively. The subscript α indicates the plasma species, electrons e or oxygen ions O+; the subscript i denotes oxygen ions O+ and the subscript e denotes electrons. N_α is the plasma number density and T_α is the plasma temperature. μ_0 is the vacuum permeability, ε_0 is the vacuum permittivity, and k_B is the Boltzmann constant. q_e = -e and
Figure BDA0002923145830000133
denote the respective charges, m_e is the electron mass,
Figure BDA0002923145830000134
is the mass of the oxygen ion, and ν_α is the collision frequency. B_t = B_0 + B denotes the sum of the geomagnetic field and the perturbation magnetic field.
Based on the Yee cell of the classical finite-difference time-domain method, the plasma velocity U_α and the plasma density N_α are both placed at the grid positions of E, and space-time discretization is performed. The time-marching scheme uses leapfrog alternating discretization, with the following update cycle:
Figure BDA0002923145830000135
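As a rough illustration of leapfrog time stepping on a staggered Yee grid, the sketch below advances only E and H in a one-dimensional vacuum. It is not the update cycle of the figure above: the plasma velocity and density updates are omitted, and the grid size, the Gaussian initial pulse, the fixed end points and all function and variable names are assumptions made for the example.

```python
import math

C0 = 299_792_458.0            # speed of light in vacuum, m/s
MU0 = 4.0e-7 * math.pi        # vacuum permeability, H/m
EPS0 = 1.0 / (MU0 * C0 ** 2)  # vacuum permittivity, F/m


def leapfrog_vacuum(n_cells=200, n_steps=100, dz=4.0):
    """Advance a 1D vacuum Yee grid with leapfrog time stepping.

    E lives on integer grid points and H on the half points between them.
    Plasma terms are deliberately left out (illustrative simplification).
    """
    dt = dz / (2.0 * C0)  # time step of the form dt = dz/(2c)
    # Hypothetical initial condition: Gaussian E pulse in the middle, H = 0.
    e = [math.exp(-((i - n_cells / 2) ** 2) / (2.0 * 10.0 ** 2))
         for i in range(n_cells)]
    h = [0.0] * (n_cells - 1)
    ch = dt / (MU0 * dz)
    ce = dt / (EPS0 * dz)
    for _ in range(n_steps):
        # First half of the cycle: update H from the spatial difference of E.
        for i in range(n_cells - 1):
            h[i] -= ch * (e[i + 1] - e[i])
        # Second half: update interior E from the spatial difference of H.
        # The two end points of E are never updated (held fixed).
        for i in range(1, n_cells - 1):
            e[i] -= ce * (h[i] - h[i - 1])
    return e, h


e, h = leapfrog_vacuum()
peak = max(abs(v) for v in e)
print(f"peak |E| after 100 steps: {peak:.3f}")
```

With H initially zero, the pulse splits into two counter-propagating halves, so the peak should settle near half the initial amplitude; a peak that grows without bound would indicate a violated stability condition.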
Considering the interaction of plasma and electromagnetic waves in the ionosphere, the Courant-Friedrichs-Lewy condition must be satisfied:
Figure BDA0002923145830000136
Meanwhile, the numerical dispersion introduced by replacing the second derivative in the wave equation with a finite difference must be considered. To reduce the numerical dispersion caused by this finite-difference approximation, the normalized space step must satisfy:
Figure BDA0002923145830000137
Combining the above numerical stability conditions with the transmitted wavelength of the EISCAT heater, this embodiment sets the space step to dz = λ/30 = 4 m and the time step to dt = dz/(2c) = 6.67 × 10^-9 s. The EISCAT heating example simulated in this embodiment requires 899377 iterations.
This embodiment simulates the EISCAT heating experiment; the physical parameters are shown in the table below. The finite-difference time-domain simulation region extends from an altitude of 200 km, below the bottom of the F2 layer, to 340 km, at the reflection height of the pump wave. The heating pump wave takes 0.667 ms to travel from the transmitter to the edge of the simulation region at 200 km, so the simulated time window is 0.667-6.667 ms.
Figure BDA0002923145830000141
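The step sizes and iteration count quoted for this embodiment can be cross-checked with a few lines of arithmetic. The sketch below assumes that the pump wave travels the 200 km at the vacuum speed of light and that the iteration count is the number of whole time steps in the 6 ms window; the variable names are illustrative.

```python
C0 = 299_792_458.0              # speed of light in vacuum, m/s

dz = 4.0                        # space step of the embodiment, m
dt = dz / (2.0 * C0)            # time step dt = dz/(2c), s
t_arrival = 200e3 / C0          # travel time over 200 km (vacuum speed assumed), s
t_window = 6.667e-3 - 0.667e-3  # simulated window 0.667-6.667 ms, s
n_steps = int(t_window / dt)    # whole time steps that fit in the window

print(f"dt        = {dt:.3e} s")
print(f"t_arrival = {t_arrival * 1e3:.3f} ms")
print(f"n_steps   = {n_steps}")
```

Under these assumptions the script gives a time step of about 6.67 × 10^-9 s, an arrival time of about 0.667 ms and 899377 iterations, in agreement with the values stated in the text.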
The heating pump wave frequency is f_0 = 4.03 MHz, so the coarsest admissible time resolution is 1/(2f_0) = 1.24 × 10^-7 s. The storage period is one store every 3 time steps.
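The choice of storage period can be checked against the sampling requirements. The sketch below assumes the standard discrete Fourier transform relations, frequency resolution 1/(NΔ) and maximum frequency 1/(2Δ), which are taken here as a stand-in for the formulas shown only as figures in this document; the variable names are illustrative.

```python
C0 = 299_792_458.0   # speed of light in vacuum, m/s
f0 = 4.03e6          # heating pump wave frequency, Hz
dt = 4.0 / (2.0 * C0)  # time step dt = dz/(2c) with dz = 4 m, s
store_every = 3      # storage period in time steps
n_steps = 899377     # total iterations of the embodiment

nyquist_interval = 1.0 / (2.0 * f0)      # coarsest sampling interval for f0, s
sample_interval = store_every * dt       # actual interval between stored samples, s
n_samples = n_steps // store_every       # number of stored samples
freq_resolution = 1.0 / (n_samples * sample_interval)  # assumed 1/(N*Delta), Hz
max_frequency = 1.0 / (2.0 * sample_interval)          # assumed 1/(2*Delta), Hz

print(f"Nyquist interval : {nyquist_interval:.3e} s")
print(f"sample interval  : {sample_interval:.3e} s")
print(f"freq. resolution : {freq_resolution:.1f} Hz")
print(f"max. frequency   : {max_frequency / 1e6:.2f} MHz")
```

Under these assumptions, storing every 3 time steps samples well below the 1.24 × 10^-7 s limit, and the frequency resolution over the full run is finer than 1 kHz, which is what is needed to resolve ion acoustic waves of kHz order.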
The program is initialized and the variables of the numerical simulation are declared, for example the 13 time-varying physical quantities to be solved: the electric field (E_x, E_y, E_z), the magnetic field (H_x, H_y), the electron number density N_e, the ion number density N_i, the electron velocity (U_ex, U_ey, U_ez) and the ion velocity
Figure BDA0002923145830000142
as well as background physical quantities such as the background magnetic field B_0 and the plasma density profile N_0(z).
Initial values of the relevant physical parameters are assigned according to the physical parameters of the EISCAT heating experiment, and the time and space ranges, background parameters, absorbing boundary conditions and the like of the simulation region are set.
The program adaptively allocates processes and threads. First, 1000 iterations of the serial program are run, and the MPI timing function is used to obtain, for one storage period, the average time of the part of the numerical calculation module that can be executed in parallel, the average time of the serial part of the numerical calculation module, and the average time of the numerical storage module, denoted p_T, u_T and s_T respectively, together with the inter-process communication time c_T of the current program. A hardware query is then performed to obtain the hardware information of the current computer platform: memory size and number of CPU cores. The number of processes started by MPI and the number of threads started by OpenMP are allocated adaptively according to the following three rules:
thread count / process count > ((u_T + c_T) × thread count + p_T) × 1.2 / s_T;
total number of threads + total number of processes < number of CPU cores × 2;
memory occupied by the calculation-module variables + memory occupied by the storage-module variables × total number of processes < hardware memory.
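The three rules above can be written down as a simple admissibility check. The sketch below transcribes the inequalities literally as they are stated in this text; the function name, the argument names and all numeric values in the usage example are hypothetical and are not measurements from the embodiment. How the program searches for the best admissible pair is not described here and is not modeled.

```python
def allocation_is_admissible(threads, processes, p_T, u_T, s_T, c_T,
                             cpu_cores, calc_mem, store_mem, hw_mem):
    """Return True if a (threads, processes) pair satisfies all three rules.

    Times are in seconds per storage period, memory in any consistent unit.
    """
    # Rule 1: thread/process ratio versus the timing ratio, with the 1.2 factor.
    rule1 = threads / processes > ((u_T + c_T) * threads + p_T) * 1.2 / s_T
    # Rule 2: threads plus processes must stay below twice the CPU core count.
    rule2 = threads + processes < cpu_cores * 2
    # Rule 3: calculation memory plus per-process storage memory must fit in RAM.
    rule3 = calc_mem + store_mem * processes < hw_mem
    return rule1 and rule2 and rule3


# Hypothetical timings (s) and memory sizes (GB), chosen only for illustration.
timings = dict(p_T=0.010, u_T=0.0001, s_T=0.5, c_T=0.0001)
memory = dict(calc_mem=1.0, store_mem=0.5, hw_mem=64.0)

ok_48_cores = allocation_is_admissible(24, 47, cpu_cores=48, **timings, **memory)
ok_32_cores = allocation_is_admissible(24, 47, cpu_cores=32, **timings, **memory)
print(ok_48_cores, ok_32_cores)
```

In this made-up example the pair of 24 threads and 47 processes passes on a 48-core machine but fails rule 2 on a 32-core machine, since 24 + 47 is not less than 64.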
In this embodiment, the space step is 4 m and the simulated spatial range is 140 km, so the z direction is divided into 35000 grid cells. The adaptive allocation yields 47 processes and 24 threads in total.
The message passing interface MPI is initialized, and the number of processes obtained by the adaptive allocation is started.
After the above setup is complete, the program enters the iteration loop. As shown in FIG. 2, the numerical storage module is parallelized with MPI. The total number of processes, n + 1, is obtained from the adaptive allocation; process 0 acts as the main process and performs program initialization, forwarding of the numerical calculation results and similar operations, while processes 1 to n receive and store data.
Further, process 0 iterates over the total number of time steps and, after a certain number of iterations, forwards the variables to be stored to the corresponding process for storage. Once the forwarding is complete, process 0 continues the iterative calculation. Processes 1 to n store the data received from process 0, then wait for the next data from process 0 and perform the next storage operation, repeating until the time iteration ends.
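The division of labour between process 0 and the storage processes can be sketched as a schedule of which process receives each stored snapshot. The text only says that data are forwarded to "the corresponding process"; the round-robin assignment below is an assumption made for illustration, and no MPI calls are made (in a real program the forwarding would be an MPI send on process 0 matched by a receive on the storage process).

```python
def storage_schedule(total_steps, store_every, n_storage):
    """List (time_step, destination_process) pairs for every storage event.

    Process 0 computes; processes 1..n_storage store. Destinations are
    assigned round-robin, which is an assumption, not the patent's rule.
    """
    schedule = []
    for step in range(1, total_steps + 1):
        # The numerical update of all fields would run here on process 0.
        if step % store_every == 0:
            cycle = step // store_every - 1   # zero-based storage period index
            dest = 1 + cycle % n_storage      # ranks 1..n_storage in turn
            schedule.append((step, dest))
    return schedule


schedule = storage_schedule(total_steps=12, store_every=3, n_storage=2)
print(schedule)  # [(3, 1), (6, 2), (9, 1), (12, 2)]
```

With such a rotation, process 0 can hand off a snapshot and resume computing while a storage process writes it, provided that process has finished its previous write by the time its turn comes round again.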
The numerical calculation module contained in process 0 updates
Figure BDA0002923145830000151
in sequence. The numerical calculation module is parallelized with OpenMP: the number of sub-regions into which the spatial region is divided equals the number of threads started by OpenMP, and the number of threads is obtained by the program's adaptive allocation. Because OpenMP threads share memory, no communication is needed between adjacent threads, which reduces parallel overhead.
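The division of the spatial region into one sub-region per thread can be illustrated with the numbers of this embodiment, 35000 grid cells and 24 threads. The sketch below assumes contiguous blocks of nearly equal size; the text does not state how cells are assigned to threads, and the function name is hypothetical.

```python
def split_cells(n_cells, n_threads):
    """Split n_cells grid cells into n_threads contiguous half-open ranges.

    The first (n_cells % n_threads) ranges receive one extra cell so that
    block sizes differ by at most one.
    """
    base, extra = divmod(n_cells, n_threads)
    bounds = []
    start = 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


bounds = split_cells(35000, 24)
sizes = [stop - start for start, stop in bounds]
print(len(bounds), min(sizes), max(sizes))
```

Each thread would then update only the cells in its own range; since all threads read the same shared arrays, neighbouring values at sub-region edges are available without explicit message passing.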
After the loop over the total number of time steps is complete, the MPI parallel environment is finalized.
FIG. 4 shows the components E_x, E_y, E_z, ΔN_i and N_0(z) at time step 750000 of the simulation, i.e. at 5.0035 ms. E_z shows a pronounced swelling at the O-wave reflection height; E_z is also nonzero at the X-wave reflection height, but far less pronounced. The generation of the E_z electrostatic wave in turn strongly perturbs the electrons and ions at the O-wave reflection point, which appears as a change in the z-direction velocity of the electrons and ions and hence as a change in their number densities.
FIG. 5 compares the results of the method of the invention with data from MATLAB simulation software. The E_x component at a randomly selected time step, step 450000, is extracted over 220-221.2 km for comparison; the simulation results of the invention essentially coincide with those of the MATLAB simulation software, and the error is within an acceptable range.
A serial program and parallel programs with different numbers of threads and processes, all performing the same task, were run on workstations with the same hardware environment for 1000 storage periods each, i.e. 3000 time steps, and the average execution time per storage period was obtained. FIG. 6 shows the ratio of the average execution time per storage period on a single-processor system to that on a parallel-processor system, i.e. the speedup, which measures the performance and effectiveness of the parallelization. The figure shows that the speedup increases markedly as more threads and processes are started and peaks at 43 threads; this speedup is very close to that achieved with the 47 processes and 24 threads given by the adaptive allocation, which supports the feasibility of adaptively allocating threads and processes. When many tasks are computed, the adaptive function of the program greatly reduces the time needed for manual allocation of threads and processes and improves the portability of the program.
In summary, the OpenMP- and MPI-based method for calculating the interaction of electromagnetic waves and plasma parallelizes the data calculation with OpenMP and the data storage with MPI, building a program framework based on hybrid OpenMP + MPI programming for simulating the interaction between electromagnetic waves and plasma. The invention effectively reduces the computation time of full-scale, full-space simulations of this interaction; hybrid OpenMP + MPI programming reduces parallel overhead while also reducing the risk of memory access conflicts; and because the program adaptively allocates the numbers of threads and processes, it is more portable, can run on computer platforms with various hardware parameters, and exploits the hardware potential of the platform as far as possible.
While the present invention has been described in connection with the preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (4)

1. A method for calculating the interaction of electromagnetic waves and plasma based on OpenMP and MPI, characterized by comprising the following steps:
step Y, establishing a solution model of the interaction of electromagnetic waves and plasma and discretizing it;
step X, performing adaptive calculation on the discretized solution model to determine the number of OpenMP threads and the number of MPI processes;
step U, solving the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction of electromagnetic waves and plasma;
in step Y, discretizing the solution model includes:
based on the finite-difference time-domain method, performing space-time discretization of the solution model according to the simulation environment and preset requirements on the stability and convergence of the solution model, to determine the time step and the space step of the solution model;
determining the total number of time steps from the time step and a preset time range, wherein in step U, when the discretized solution model is solved, the solution model is iterated with the total number of time steps as the loop termination condition;
step X comprises the following steps:
step X1, serially performing a preset number of loop iterations on the discretized solution model;
step X2, based on the calculation of step X1, obtaining a plurality of time parameters using the MPI timing function;
and step X3, calculating and determining the number of OpenMP threads and the number of MPI processes according to a preset relationship between the hardware information of the computer platform and the plurality of time parameters.
2. The OpenMP and MPI-based method for calculating the interaction of electromagnetic waves and plasma of claim 1, wherein in step X2 the plurality of time parameters includes: the average time of the serial computing part, the average time of the parallelizable computing part, the average time of data update and storage, and the inter-process communication time;
the hardware information of the computer platform in step X3 includes: memory size and number of CPU cores.
3. A device for calculating the interaction of electromagnetic waves and plasma based on OpenMP and MPI, comprising:
a modeling module, used for establishing a solution model of the interaction of electromagnetic waves and plasma and discretizing it;
a parallel parameter determination module, used for performing adaptive calculation on the discretized solution model to determine the number of OpenMP threads and the number of MPI processes;
a calculation module, used for solving the discretized solution model, including: performing parallel computation with the number of OpenMP threads obtained by the adaptive calculation, and updating and storing data in parallel with the number of MPI processes obtained by the adaptive calculation, to obtain a simulation result of the interaction of electromagnetic waves and plasma;
the computing device further comprises:
a discretization module, used for performing space-time discretization of the solution model based on the finite-difference time-domain method according to the simulation environment and preset requirements on the stability and convergence of the solution model, to determine the time step and the space step of the solution model;
the discretization module determines the total number of time steps from the time step and a preset time range, and the calculation module, when solving the discretized solution model, iterates the solution model with the total number of time steps as the loop termination condition;
the parallel parameter determination module comprises:
an adaptive calculation module, used for serially performing a preset number of loop iterations on the discretized solution model;
a parameter acquisition module, used for obtaining a plurality of time parameters using the MPI timing function based on the calculation of the adaptive calculation module;
and a parameter calculation module, used for calculating and determining the number of OpenMP threads and the number of MPI processes according to a preset relationship between the hardware information of the computer platform and the time parameters.
4. The OpenMP and MPI-based device for calculating the interaction of electromagnetic waves and plasma of claim 3, wherein the plurality of time parameters includes: the average time of the serial computing part, the average time of the parallelizable computing part, the average time of data update and storage, and the inter-process communication time;
the hardware information of the computer platform comprises: memory size and number of CPU cores.
CN202110124410.7A 2021-01-29 2021-01-29 OpenMP and MPI-based method and device for calculating effect of electromagnetic waves and plasma Active CN112861333B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110124410.7A CN112861333B (en) 2021-01-29 2021-01-29 OpenMP and MPI-based method and device for calculating effect of electromagnetic waves and plasma

Publications (2)

Publication Number Publication Date
CN112861333A CN112861333A (en) 2021-05-28
CN112861333B true CN112861333B (en) 2022-11-15

Family

ID=75985990

Country Status (1)

Country Link
CN (1) CN112861333B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant