CN110543711B

CN110543711B - Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation

Info

Publication number: CN110543711B
Application number: CN201910788607.3A
Authority: CN
Inventors: 刘天才; 赵民富; 王先梦; 卢旭; 胡长军; 吕玉凤; 王学松; 张佳; 杨宏伟; 祁琳; 杨文�
Original assignee: China Institute of Atomic Energy
Current assignee: China Institute of Atomic Energy
Priority date: 2019-08-26
Filing date: 2019-08-26
Publication date: 2021-07-20
Anticipated expiration: 2039-08-26
Also published as: CN110543711A

Abstract

The invention relates to a parallel implementation and optimization method for numerical reactor thermal hydraulic subchannel simulation. The method performs the thermal hydraulic subchannel simulation on a hybrid heterogeneous CPU + GPU architecture. In the solid heat conduction part, the heat transfer coefficient solution is ported to the GPU, and the traversal of subchannels and axial nodes in that solution is further parallelized with OpenMP. In the part that establishes and solves the momentum equation, the calculation of the transverse velocity and the axial velocity of the axial nodes is ported to the GPU; the GPU function that solves both velocities is split into two functions, and two OpenMP threads are added so that the axial velocity and the transverse velocity of one axial layer are solved simultaneously. The invention improves the parallel efficiency of thermal hydraulic subchannel simulation software and its utilization of CPU/GPU computer hardware.

Description

Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation

Technical Field

The invention belongs to the field of nuclear reactor thermal hydraulic simulation, and in particular relates to a parallel implementation and optimization method for numerical reactor thermal hydraulic subchannel simulation.

Background

In the thermal hydraulic simulation of a nuclear reactor, the subchannel analysis method treats the region between fuel rods as a subchannel, builds a model of that subchannel, and simulates and analyzes states within it such as temperature and flow velocity. The performance of a nuclear reactor is largely limited by its thermal hydraulic design. To improve the thermal hydraulic performance of the reactor core, the core thermal hydraulic analysis must calculate the pressure, flow and enthalpy distributions in each subchannel as accurately as possible, so that the burnout ratio, which significantly constrains water reactor design, and the gas content at the outlet can be calculated more accurately.

Subchannel analysis is widely used for core thermal hydraulic calculations in engineering design and safety analysis. To date, a large number of subchannel programs for reactor thermal hydraulic calculation exist at home and abroad, but they are based on a single CPU architecture and do not fully exploit the computing advantages of current CPU/GPU hybrid architecture processors.

Hybrid heterogeneous computing systems based on CPUs and GPUs are a research hotspot in high-performance computing at home and abroad, and applications built on them can achieve good performance. The CPU + GPU heterogeneous cooperative computing architecture is shown in fig. 1 and can be divided into three levels of parallelism: parallelism between nodes, heterogeneous parallelism between the CPU and the GPU within a node, and parallelism within a device (the CPU or the GPU).

Fig. 2 shows the difference between the CPU and GPU architectures: the GPU has more ALUs (arithmetic logic units) than the CPU, but its program control capability is weaker. These differences make the GPU more efficient than the CPU for tasks that are computation-intensive and have simple control logic.

Disclosure of Invention

The invention aims to provide a parallel implementation and optimization method for thermal hydraulic subchannel calculation on a CPU/GPU architecture, so as to improve the parallel efficiency of thermal hydraulic subchannel simulation software and its utilization of CPU/GPU computer hardware.

The technical scheme of the invention is as follows: a parallel implementation and optimization method for numerical reactor thermal hydraulic subchannel simulation performs the thermal hydraulic subchannel simulation on a hybrid heterogeneous CPU + GPU architecture. In the solid heat conduction part, the heat transfer coefficient solution is ported to the GPU, and the traversal of subchannels and axial nodes in that solution is further parallelized through OpenMP. In the part that establishes and solves the momentum equation, the calculation of the transverse velocity and the axial velocity of the axial nodes is ported to the GPU; the GPU function that solves both velocities is split into two functions, and two OpenMP threads are added so that the axial velocity and the transverse velocity of one axial layer are solved simultaneously.

Further, in the method described above, the heat transfer coefficient solution in the solid heat conduction part is ported to the GPU as follows:

1) copying the arrays that store the subchannel information, the gap information and the axial node information required by the calculation from CPU memory to GPU memory, for use by the GPU;

2) dividing the large-scale calculation into small-scale calculations that share the same calculation flow;

3) calling the function that calculates the heat transfer coefficient on the GPU, and mapping the resulting small-scale calculation fragments onto the computing units of the GPU for solving;

4) transferring the solution results from device memory back to CPU memory, and continuing with the subsequent program calculation.

Further, in the method described above, the heat transfer coefficient solution is further parallelized through OpenMP as follows: n threads are used to solve this part, the original calculation scale is divided equally into n parts, and each thread handles 1/n of the original computation.

Further, in the method described above, the calculation of the transverse velocity and the axial velocity of the axial nodes is ported to the GPU as follows:

(1) transferring the gap information and the subchannel information contained in axial layer j from CPU memory to device memory;

(2) calling the GPU function and traversing the gaps and the subchannels of the axial layer respectively, to obtain the transverse velocity and the axial velocity of the nodes of layer j and layer j + 1;

(3) transferring the calculated transverse velocity and axial velocity from device memory back to CPU memory.

Further, in the method described above, when the pressure coefficient matrix is solved, the solver library functions called during the calculation are ported to the GPU, and the GPU versions of those functions are then called to optimize the solution of the pressure coefficient matrix.

The invention has the following beneficial effects:

1. The invention implements the thermal hydraulic subchannel model on a CPU/GPU architecture: the computation-intensive parts that accounted for a large share of the run time in the original CPU-only software are ported to the GPU, which computes faster, thereby improving program performance and shortening the total simulation time.

2. The method further optimizes the program with OpenMP, applying multithreading to the parts that have no data dependency, in the parallel manner shown in fig. 4, which further increases the parallelism and efficiency of the program.

3. Compared with previous implementations for a single CPU architecture, this implementation makes full use of the hardware resources of widely used machines such as clusters, so that the GPU, which has greater computing power, is no longer idle during the calculation.

Drawings

FIG. 1 is a schematic diagram of a conventional CPU + GPU heterogeneous cooperative computing architecture;

FIG. 2 is a diagram showing a structural comparison between a CPU and a GPU;

FIG. 3 is a schematic diagram of the calculation flow of a conventional thermal hydraulic subchannel model;

FIG. 4 is a diagram illustrating a parallel manner of an OpenMP optimized program;

FIG. 5 is a schematic diagram of a solving process of a thermal hydraulic sub-channel program oriented to a CPU/GPU architecture.

Detailed Description

The invention is described in detail below with reference to the figures and examples.

The core of the thermal hydraulic calculation is to solve four basic equations for each subchannel: conservation of mass, conservation of energy, conservation of axial momentum and conservation of transverse momentum.

Fig. 3 shows the calculation flow of a conventional thermal hydraulic subchannel model based on a single CPU architecture. In this flow, the program execution time is mainly spent on establishing and solving the four basic equations above and on solving the solid heat conduction part.

The solid heat conduction part accounts for about 30% of the solution time of the existing program. It comprises solving the heat transfer coefficient between the solid and the liquid and calculating the heat distribution in the solid with the heat conduction formula. Solving the heat transfer coefficient requires traversing all subchannels and axial nodes, and the solution domains are relatively independent, with no dependency between them, so this calculation is well suited to being decomposed into task fragments and ported to the GPU. Because the solution domains do not affect one another, the traversal of subchannels and axial nodes can also be parallelized through OpenMP, which further increases the parallelism of this part.

Establishing and solving the momentum equation means calculating the transverse momentum and the axial momentum of each grid cell from the initially assumed pressure field and from the void fraction and density of the gas phase, the liquid phase and the droplet phase. The transverse solution is carried out first and the axial solution second; for axial node j in subchannel i, the momenta of the gas, liquid and droplet phases are solved simultaneously within one time step by Gaussian elimination. The transverse velocity and the axial velocity of each node are obtained by traversing the axial nodes. The control logic is simple and the program repeatedly calls the same solution procedure for different nodes, so this calculation is also well suited to the GPU. In addition, whereas the original procedure calculates the transverse direction and then the axial direction at each node, with OpenMP the transverse and axial solutions of each solution domain are handled by two separate threads, which further optimizes this part.
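As a rough bound on what accelerating this part can achieve, Amdahl's law can be applied to the figure of about 30% quoted above. The symbols p (fraction of run time that is accelerated), s (speedup of that fraction) and S (overall speedup) are introduced here for illustration and do not appear in the original; the estimate assumes that the whole solid heat conduction part is accelerated and that the rest of the program is unchanged.

```latex
S = \frac{1}{(1 - p) + \dfrac{p}{s}}, \qquad
p \approx 0.3 \;\Rightarrow\; S < \frac{1}{1 - 0.3} \approx 1.43, \qquad
s = 10 \;\Rightarrow\; S = \frac{1}{0.7 + 0.03} \approx 1.37
```

Under these assumptions, larger overall gains have to come from the other ported parts, such as the momentum equation and the pressure coefficient matrix.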

In the solid heat conduction part, the heat transfer coefficient solution is ported to the GPU as follows:

1) Copy the arrays that store the subchannel information, the gap information and the axial node information required by the calculation from CPU memory to GPU memory, for use by the GPU.

2) Divide the calculation: the large-scale calculation is split into small-scale calculations that share the same calculation flow. Solving the heat transfer coefficient requires traversing the axial grid, and the traversal of each subchannel's axial grid has no data dependency, so the axial grid layers are assigned to different threads according to the configured number of threads. For example, with 125 axial grid layers (numbered 2 to 126) and 4 threads, thread 1 solves layers 2 to 32, thread 2 solves layers 33 to 63, thread 3 solves layers 64 to 94, and thread 4 solves layers 95 to 126.

3) Call the function that calculates the heat transfer coefficient on the GPU, and map the resulting small-scale calculation fragments onto the computing units of the GPU for solving.
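The layer split quoted in step 2) can be reproduced with a short Python sketch. The function name `partition_layers` and the rule of giving the remainder to the last thread are assumptions chosen to match the quoted example; the patent does not state how a remainder is assigned in general.

```python
def partition_layers(first, last, n_threads):
    """Split layers first..last (inclusive) into n_threads contiguous ranges.

    Every thread gets floor(count / n_threads) layers and the last thread
    also takes the remainder (assumed rule, chosen to match the example).
    """
    count = last - first + 1
    if n_threads < 1 or n_threads > count:
        raise ValueError("need 1 <= n_threads <= number of layers")
    base = count // n_threads
    ranges = []
    start = first
    for t in range(n_threads):
        # The last thread runs to the final layer; the others take `base` layers.
        end = last if t == n_threads - 1 else start + base - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# 125 axial layers numbered 2..126, solved by 4 threads
print(partition_layers(2, 126, 4))
# [(2, 32), (33, 63), (64, 94), (95, 126)]
```

Threads 1 to 3 each receive 31 layers and thread 4 receives 32, so the load is only approximately equal when the layer count is not a multiple of the thread count.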

OpenMP optimization is also applied to this part: n threads are used to solve it, the original calculation scale is divided equally into n parts, and each thread handles 1/n of the original computation. The pseudo code is as follows:

In the loop that traverses the fuel rods, n CPU threads execute the code segment, and each thread traverses number_of_rows/n fuel rods; the code then divides the workload and distributes it evenly to the GPU for solving according to a set value.
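The decomposition described above can be illustrated in Python. This is a sketch and not the patent's pseudo code: a thread pool stands in for the OpenMP threads, the per-node calculation runs on the CPU where the patent would hand each share to the GPU, and the Dittus-Boelter correlation is only a placeholder, since the patent does not say which heat transfer correlation is used. All function names and input values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def heat_transfer_coefficient(reynolds, prandtl, conductivity, diameter):
    """Placeholder correlation (assumption): Nu = 0.023 Re^0.8 Pr^0.4."""
    nusselt = 0.023 * reynolds ** 0.8 * prandtl ** 0.4
    return nusselt * conductivity / diameter


def solve_rod_chunk(rods):
    """Work of one thread: every axial node of every rod in its share."""
    return [[heat_transfer_coefficient(*node) for node in rod] for rod in rods]


def solve_all_rods(rods, n_threads):
    """Split the rods into about n_threads equal shares, one per thread."""
    chunk = max(1, -(-len(rods) // n_threads))  # ceiling division
    chunks = [rods[i:i + chunk] for i in range(0, len(rods), chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = list(pool.map(solve_rod_chunk, chunks))  # input order is kept
    return [rod_result for part in parts for rod_result in part]


# 8 rods with 3 axial nodes each: (Re, Pr, k in W/m/K, D in m), made-up values
rods = [[(5.0e4 + 1.0e3 * r, 1.0, 0.6, 0.01)] * 3 for r in range(8)]
parallel = solve_all_rods(rods, n_threads=4)
serial = solve_rod_chunk(rods)
print(parallel == serial)
```

Because no node depends on another, the threaded result matches the serial one. Python threads do not speed up CPU-bound work of this kind, so the sketch shows only how the work is divided, not the performance gain.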

The calculation flow of the establishment and solution part of the momentum equation is roughly as follows:

1) Loop over the axial layers j within a section, with j running from 1 to the number of axial nodes plus one, and at each step calculate the axial velocity and the transverse velocity of the nodes of the current axial layer j and of layer j + 1;

2) Within each axial layer, traverse all gaps and, for each gap, simultaneously calculate the transverse momentum of the gas phase, the liquid phase and the droplet phase by Gaussian elimination;

3) Traverse each subchannel in the current axial layer j and, for each subchannel, simultaneously calculate the axial momentum of the gas phase, the liquid phase and the droplet phase on layer j by Gaussian elimination.
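Steps 2) and 3) each solve a small coupled system for the three phases. The sketch below shows Gaussian elimination on a 3 x 3 system in Python. The coefficient matrix and right-hand side are made-up numbers, because the patent does not give the momentum coefficients, and partial pivoting is added here for numerical safety; the patent only names Gaussian elimination.

```python
def gaussian_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the column from all rows below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


# Hypothetical coupling between gas, liquid and droplet momentum unknowns
coefficients = [[2.0, 1.0, 0.0],
                [1.0, 3.0, 1.0],
                [0.0, 1.0, 4.0]]
rhs = [4.0, 10.0, 14.0]
gas, liquid, droplet = gaussian_solve(coefficients, rhs)
print(gas, liquid, droplet)
```

Each gap or subchannel yields one such small system, and the systems are independent of one another within a layer, which is why the same routine can be applied to many of them in parallel.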

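In this calculation, the computation of the transverse velocity and the axial velocity of the axial nodes is ported to the GPU. The implementation is as follows:

(1) Transfer the gap information and the subchannel information contained in axial layer j from CPU memory to device memory;

(2) Call the GPU function and traverse the gaps and the subchannels of the axial layer respectively, to obtain the transverse velocity and the axial velocity of the nodes of layer j and layer j + 1;

(3) Transfer the calculated transverse velocity and axial velocity from device memory back to CPU memory.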

As a further optimization, the GPU function that solves both the transverse velocity and the axial velocity is split into two functions, and two OpenMP threads are added so that the axial velocity and the transverse velocity of the axial layer are solved simultaneously. The pseudo code is as follows:

In addition, solving the pressure coefficient matrix generally requires calling an existing mathematical solver library (such as PETSc). This part is optimized by porting the solver library functions called during the calculation to the GPU and then calling the GPU versions of those functions to solve the pressure coefficient matrix.
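The two-thread arrangement can be illustrated in Python. This sketch is not the patent's pseudo code: two pool workers stand in for the two OpenMP threads, the solvers run on the CPU, and both solver bodies are hypothetical stand-ins (a linear pressure-drop model for the gaps and mass flux divided by density for the subchannels). It assumes the transverse and axial solutions of one layer have no data dependency on each other.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_transverse(gaps):
    """Hypothetical stand-in: one transverse velocity per gap."""
    return [gap["dp"] / gap["resistance"] for gap in gaps]


def solve_axial(channels):
    """Hypothetical stand-in: axial velocity = mass flux / density."""
    return [ch["mass_flux"] / ch["density"] for ch in channels]


def solve_layer(gaps, channels):
    """Run the two split solvers for one axial layer on two threads."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        transverse = pool.submit(solve_transverse, gaps)
        axial = pool.submit(solve_axial, channels)
        # result() blocks until the corresponding solver has finished.
        return transverse.result(), axial.result()


# Made-up data for one axial layer with two gaps and two subchannels
gaps = [{"dp": 10.0, "resistance": 5.0}, {"dp": 3.0, "resistance": 6.0}]
channels = [{"mass_flux": 3000.0, "density": 750.0},
            {"mass_flux": 2800.0, "density": 700.0}]
print(solve_layer(gaps, channels))
```

Splitting the combined function in two is what makes this possible: each thread owns one solver and writes to its own output, so no synchronization is needed until both results are collected.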

The parallel solving flow of each solving domain after the optimization is completed is shown in fig. 5.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include such modifications and variations.

Claims

1. A parallel implementation and optimization method for numerical reactor thermal hydraulic subchannel simulation, characterized in that: the method performs the thermal hydraulic subchannel simulation on a hybrid heterogeneous CPU + GPU architecture; in the solid heat conduction part, the heat transfer coefficient solution is ported to the GPU, and the traversal of subchannels and axial nodes in that solution is further parallelized through OpenMP; in the part that establishes and solves the momentum equation, the calculation of the transverse velocity and the axial velocity of the axial nodes is ported to the GPU, the GPU function that solves both velocities is split into two functions, and two OpenMP threads are added so that the axial velocity and the transverse velocity of one axial layer are solved simultaneously;

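the heat transfer coefficient solution in the solid heat conduction part is ported to the GPU as follows:

1) copying the arrays that store the subchannel information, the gap information and the axial node information required by the calculation from CPU memory to GPU memory, for use by the GPU;

2) dividing the large-scale calculation into small-scale calculations that share the same calculation flow;

3) calling the function that calculates the heat transfer coefficient on the GPU, and mapping the resulting small-scale calculation fragments onto the computing units of the GPU for solving;

4) transferring the solution results from device memory back to CPU memory, and continuing with the subsequent program calculation;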

the heat transfer coefficient solution is further parallelized through OpenMP as follows: n threads are used to solve this part, the original calculation scale is divided equally into n parts, and each thread handles 1/n of the original computation;

the calculation of the transverse velocity and the axial velocity of the axial nodes is ported to the GPU as follows:

(1) transferring the gap information and the subchannel information contained in axial layer j from CPU memory to device memory;

(2) calling the GPU function and traversing the gaps and the subchannels of the axial layer respectively, to obtain the transverse velocity and the axial velocity of the nodes of layer j and layer j + 1;

(3) transferring the calculated transverse velocity and axial velocity from device memory back to CPU memory;

when the pressure coefficient matrix is solved, the solver library functions called during the calculation are ported to the GPU, and the GPU versions of those functions are then called to optimize the solution of the pressure coefficient matrix.