CN110543711B - Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation - Google Patents


Info

Publication number
CN110543711B
Authority
CN
China
Prior art keywords
gpu
axial
solving
calculation
speed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910788607.3A
Other languages
Chinese (zh)
Other versions
CN110543711A (en
Inventor
刘天才
赵民富
王先梦
卢旭
胡长军
吕玉凤
王学松
张佳
杨宏伟
祁琳
杨文�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Institute of Atomic Energy
Original Assignee
China Institute of Atomic Energy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Institute of Atomic Energy
Priority to CN201910788607.3A
Publication of CN110543711A
Application granted
Publication of CN110543711B
Active
Anticipated expiration

Abstract

The invention relates to a parallel implementation and optimization method for numerical reactor thermal-hydraulic subchannel simulation. The subchannel simulation is carried out on a hybrid heterogeneous CPU + GPU architecture. The heat transfer coefficient calculation within the solid heat conduction part is ported to the GPU, and the traversal of subchannels and axial nodes in that calculation is further parallelized with OpenMP. In the part that establishes and solves the momentum equations, the calculation of the transverse and axial velocities of the axial nodes is ported to the GPU; the GPU function that solves both velocities is split into two functions, and two OpenMP threads solve the axial velocity and the transverse velocity of one axial layer simultaneously. The invention improves the parallel efficiency of thermal-hydraulic subchannel simulation software and its utilization of CPU/GPU computer hardware.

Description

Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation
Technical Field
The invention belongs to the field of nuclear reactor thermal-hydraulic simulation, and particularly relates to a parallel implementation and optimization method for numerical reactor thermal-hydraulic subchannel simulation.
Background
In the thermal-hydraulic simulation of a nuclear reactor, the subchannel analysis method treats the region between fuel rods as a subchannel, builds a model of each subchannel, and simulates and analyzes states such as temperature and flow velocity within it. The performance of a nuclear reactor is limited to a large extent by its thermal-hydraulic design. To improve the thermal-hydraulic performance of the core, the core thermal-hydraulic analysis must calculate the pressure, flow and enthalpy distributions in each subchannel as accurately as possible, so that the burnout ratio, which places a significant limit on water reactor design, and the outlet vapor content are calculated more accurately.
Subchannel analysis is widely used in engineering design and safety analysis for core thermal-hydraulic calculations. To date, many subchannel programs for reactor thermal-hydraulic calculation exist in China and abroad, but they are based on a CPU-only architecture and do not make full use of the computing power of current hybrid CPU/GPU processors.
Hybrid heterogeneous computing systems based on CPUs and GPUs are a research focus in high-performance computing in China and abroad, and applications built on them can achieve good performance. The CPU + GPU heterogeneous cooperative computing architecture is shown in Fig. 1 and can be divided into three levels of parallelism: parallelism between nodes, heterogeneous parallelism between the CPU and the GPU within a node, and parallelism within a device (the CPU or the GPU).
Fig. 2 shows the difference between the CPU and GPU architectures: the GPU has more arithmetic logic units (ALUs) than the CPU, but its program control capability is weaker. These differences make the GPU more efficient than the CPU for tasks that are compute-intensive and have simple control logic.
Disclosure of Invention
The invention aims to provide a parallel implementation and optimization method for thermal-hydraulic subchannel calculation on a CPU/GPU architecture, so as to improve the parallel efficiency of thermal-hydraulic subchannel simulation software and its utilization of CPU/GPU computer hardware.
The technical scheme of the invention is as follows: a parallel implementation and optimization method for numerical reactor thermal-hydraulic subchannel simulation, in which the subchannel simulation calculation is implemented on a hybrid heterogeneous CPU + GPU architecture. The heat transfer coefficient calculation within the solid heat conduction part is ported to the GPU, and the traversal of subchannels and axial nodes in that calculation is further parallelized with OpenMP. In the part that establishes and solves the momentum equations, the calculation of the transverse and axial velocities of the axial nodes is ported to the GPU; the GPU function that solves both velocities is split into two functions, and two OpenMP threads solve the axial velocity and the transverse velocity of one axial layer simultaneously.
Further, in the parallel implementation and optimization method described above, the heat transfer coefficient calculation within the solid heat conduction part is ported to the GPU as follows:
1) copying the arrays required for the calculation, which store the subchannel information, the gap information and the axial node information, from CPU memory to GPU memory for use by the GPU;
2) dividing the large-scale calculation into small-scale calculations that share the same calculation flow;
3) calling the function that calculates the heat transfer coefficient on the GPU, and mapping the small-scale calculation fragments onto the computing units of the GPU for solution;
4) transferring the results from device memory to CPU memory for the subsequent program calculation.
Further, in the method described above, the heat transfer coefficient calculation is further parallelized with OpenMP by using n threads for this part: the original calculation is divided equally into n parts, and each thread takes on 1/n of the original workload.
Further, in the method described above, the calculation of the transverse and axial velocities of the axial nodes is ported to the GPU as follows:
(1) transferring the gap information and the subchannel information of axial layer j from CPU memory to device memory;
(2) calling a GPU function that traverses the gaps and the subchannels of the axial layer, respectively, to obtain the transverse and axial velocities of the nodes at layers j and j+1;
(3) transferring the calculated transverse and axial velocities from device memory back to CPU memory.
Further, in the method described above, when the pressure coefficient matrix is solved, the solver library functions called during the calculation are ported to the GPU, and the GPU versions of those functions are then called to optimize the solution of the pressure coefficient matrix.
The invention has the following beneficial effects:
1. The invention implements the thermal-hydraulic subchannel model on a CPU/GPU architecture: the compute-intensive parts of the software that account for a large share of the run time, previously implemented on the CPU, are ported to the faster GPU, which improves program performance and shortens the total simulation time.
2. The method further optimizes the program with OpenMP by applying multithreading to parts that have no data dependencies (the parallel scheme is shown in Fig. 4), which further increases the parallelism and efficiency of the program.
3. Compared with the previous CPU-only implementation, this implementation makes full use of the hardware resources of widely used machines such as clusters, so that the GPU, which has greater computing power, no longer sits idle during the calculation.
Drawings
FIG. 1 is a schematic diagram of a conventional CPU + GPU heterogeneous cooperative computing architecture;
FIG. 2 is a diagram showing a structural comparison between a CPU and a GPU;
FIG. 3 is a schematic diagram of the calculation flow of a conventional thermal-hydraulic subchannel model;
FIG. 4 is a diagram illustrating a parallel manner of an OpenMP optimized program;
FIG. 5 is a schematic diagram of a solving process of a thermal hydraulic sub-channel program oriented to a CPU/GPU architecture.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
The core of the thermal-hydraulic calculation is solving four basic equations for each subchannel: conservation of mass, conservation of energy, and conservation of axial and transverse momentum.
Fig. 3 shows the calculation flow of a conventional thermal-hydraulic subchannel model based on a CPU-only architecture. In this flow, the execution time is spent mainly on establishing and solving the four basic equations above and on solving the solid heat conduction part.
Solving the solid heat conduction part takes about 30% of the solution time of the existing program. It comprises solving the heat transfer coefficient between the solid and the liquid, and calculating the heat distribution within the solid using the heat conduction equation. The heat transfer coefficient calculation must traverse all subchannels and axial nodes, and the solution domains are mutually independent with no dependencies between them, so this part is well suited to being decomposed into task fragments and ported to the GPU. Because the solution domains do not affect one another, the traversal of subchannels and axial nodes can also be parallelized with OpenMP, which further increases the parallelism of this part.
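The independence of the solution domains can be illustrated with a minimal Python sketch. The patent does not state which heat transfer correlation the program uses; the Dittus-Boelter correlation below is only an assumed stand-in, and the function and parameter names are hypothetical. A thread pool plays the role that OpenMP threads or GPU compute units play in the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def heat_transfer_coefficient(re, pr, k, d_h):
    """Assumed stand-in correlation (Dittus-Boelter, heating):
    Nu = 0.023 * Re^0.8 * Pr^0.4, h = Nu * k / D_h."""
    nusselt = 0.023 * re ** 0.8 * pr ** 0.4
    return nusselt * k / d_h


def solve_all_nodes(nodes, n_threads=4):
    """Evaluate h for every (subchannel, axial node) pair.

    `nodes` is a list of (re, pr, k, d_h) tuples, one per pair. Each
    evaluation reads only its own tuple, so the pairs can be processed
    in any order or concurrently without changing the result.
    """
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map preserves input order in its output.
        return list(pool.map(lambda p: heat_transfer_coefficient(*p), nodes))
```

Because no evaluation depends on another, the parallel result is identical to a plain serial loop over the same list.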
Establishing and solving the momentum equations means calculating the transverse and axial momentum of each mesh cell from the initially assumed pressure field and from the volume fractions and densities of the gas, liquid and droplet phases. The transverse solution is performed first, followed by the axial solution; the momenta of the gas, liquid and droplet phases at axial node j in subchannel i are solved simultaneously by Gaussian elimination within one time step. The transverse and axial velocities of each node are obtained by traversing the axial nodes. The control logic is simple and the program repeatedly calls the same solution procedure for different nodes, so this calculation is also well suited to the GPU. In addition, whereas the original calculation computes the transverse direction and then the axial direction for each node, with OpenMP each solution domain is handled by two threads, one for the transverse solution and one for the axial solution, which further optimizes this part.
In the solution of the solid heat conduction part, the heat transfer coefficient calculation is ported to the GPU as follows:
1) The arrays required for the calculation, which store the subchannel information, the gap information and the axial node information, are copied from CPU memory to GPU memory for use by the GPU.
2) The calculation is partitioned: the large-scale calculation is divided into small-scale calculations that share the same calculation flow. The heat transfer coefficient calculation traverses the axial mesh, and the traversal of each subchannel's axial mesh has no data dependencies, so the axial mesh layers are distributed among threads according to the configured thread count. For example, with 125 axial mesh layers (layer numbers 2 to 126) and 4 threads, thread 1 solves layers 2 to 32, thread 2 solves layers 33 to 63, thread 3 solves layers 64 to 94, and thread 4 solves layers 95 to 126.
3) The function that calculates the heat transfer coefficient is called on the GPU, and the small-scale calculation fragments are mapped onto the computing units of the GPU for solution.
4) The results are transferred from device memory to CPU memory for the subsequent program calculation.
In addition, OpenMP optimization is applied here: n threads solve this part, the original calculation is divided equally into n parts, and each thread takes on 1/n of the original workload. The pseudocode is as follows:
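The layer partition in step 2) can be reproduced with a short Python sketch. The function name is hypothetical, and giving the leftover layers to the last thread is an assumption inferred from the 125-layer example, where thread 4 receives 32 layers and the others 31.

```python
def partition_layers(first_layer, last_layer, n_threads):
    """Split the inclusive layer range [first_layer, last_layer] into
    n_threads contiguous ranges. Every thread gets total // n_threads
    layers; the last thread also takes the remainder."""
    total = last_layer - first_layer + 1
    if n_threads < 1 or n_threads > total:
        raise ValueError("n_threads must be between 1 and the layer count")
    base = total // n_threads
    ranges = []
    start = first_layer
    for t in range(n_threads):
        # The final thread runs to the last layer, absorbing the remainder.
        end = last_layer if t == n_threads - 1 else start + base - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Example from the text: layers 2 to 126 on 4 threads.
print(partition_layers(2, 126, 4))
```

For the values in the text this yields the ranges 2 to 32, 33 to 63, 64 to 94 and 95 to 126.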
[Figure: pseudocode for the OpenMP-parallelized heat transfer coefficient calculation; image not reproduced]
In the loop over fuel rods, n CPU threads execute the code segment, and each thread traverses number_of_rows/n fuel rods; the code then partitions the workload and distributes it evenly to the GPU for solution according to a configured value.
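The original pseudocode is not available in this text. The two-level split described above (rods divided among n CPU threads, then each thread's share divided into fixed-size pieces for the GPU) might look like the following Python sketch. All names are hypothetical, a thread pool stands in for OpenMP threads, and `fake_gpu_kernel` is a placeholder for the device function, not the patent's actual kernel.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, n):
    """Split items into n contiguous slices whose sizes differ by at most one."""
    base, extra = divmod(len(items), n)
    slices, start = [], 0
    for t in range(n):
        size = base + (1 if t < extra else 0)
        slices.append(items[start:start + size])
        start += size
    return slices


def gpu_chunks(rods, chunk_size):
    """Cut one thread's rods into pieces of at most chunk_size (the set value)."""
    return [rods[i:i + chunk_size] for i in range(0, len(rods), chunk_size)]


def fake_gpu_kernel(chunk):
    # Placeholder for the device-side heat transfer calculation.
    return [rod * 2.0 for rod in chunk]


def solve_rods(rods, n_threads, chunk_size):
    def worker(my_rods):
        results = []
        for chunk in gpu_chunks(my_rods, chunk_size):
            results.extend(fake_gpu_kernel(chunk))
        return results

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = list(pool.map(worker, split_evenly(rods, n_threads)))
    # Concatenate per-thread results in thread order.
    return [value for part in parts for value in part]
```

Keeping the slices contiguous means the concatenated result has the same ordering as a serial loop over the rods.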
The calculation flow for establishing and solving the momentum equations is roughly as follows:
1) looping over the axial layers j within a section, from j = 1 to the number of axial nodes plus one, and at each step calculating the axial and transverse velocities of the nodes at the current axial layers j and j+1;
2) traversing all gaps in each axial layer, and for each gap solving the transverse momenta of the gas, liquid and droplet phases simultaneously by Gaussian elimination;
3) traversing each subchannel in the current axial layer j, and for each subchannel solving the axial momenta of the gas, liquid and droplet phases at layer j simultaneously by Gaussian elimination.
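The simultaneous three-phase solve in steps 2) and 3) amounts to solving a small linear system per gap or subchannel. The patent does not give the coefficients, so the sketch below only shows generic Gaussian elimination with partial pivoting in Python; the 3x3 example matrix is made up for illustration and carries no physical meaning.

```python
def gauss_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    `a` is a list of n rows of n numbers, `b` a list of n numbers.
    For the three-phase case n = 3 (gas, liquid, droplet).
    """
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off growth.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Illustrative 3x3 system only; the unknowns would be the phase momenta.
solution = gauss_solve([[4, 1, 0], [1, 3, 1], [0, 1, 2]], [6, 10, 8])
```

Each gap or subchannel has its own small system, which is why the same routine can be invoked repeatedly for different nodes.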
In this calculation, the transverse and axial velocities of the axial nodes are computed on the GPU. The implementation is as follows:
(1) transferring the gap information and the subchannel information of axial layer j from CPU memory to device memory;
(2) calling a GPU function that traverses the gaps and the subchannels of the axial layer, respectively, to obtain the transverse and axial velocities of the nodes at layers j and j+1;
(3) transferring the calculated transverse and axial velocities from device memory back to CPU memory.
As a further optimization, the GPU function that solves both the transverse and axial velocities is split into two functions, and two OpenMP threads solve the axial velocity and the transverse velocity of the axial layer simultaneously. The pseudocode is as follows:
[Figure: pseudocode for solving the axial and transverse velocities with two OpenMP threads; image not reproduced]
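The original pseudocode is not available in this text. A minimal Python sketch of the two-thread scheme is given below, assuming the transverse solve reads only gap data and the axial solve reads only subchannel data for the layer. The function names and dictionary fields are hypothetical, a two-worker thread pool stands in for the two OpenMP threads, and the velocity formula is a placeholder rather than the patent's momentum solve.

```python
from concurrent.futures import ThreadPoolExecutor


def transverse_velocities(gaps):
    # Placeholder for the GPU function that traverses the gaps of a layer.
    return [g["momentum"] / g["density"] for g in gaps]


def axial_velocities(subchannels):
    # Placeholder for the GPU function that traverses the subchannels of a layer.
    return [s["momentum"] / s["density"] for s in subchannels]


def solve_layer(gaps, subchannels):
    """Run the two split functions concurrently for one axial layer."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        transverse = pool.submit(transverse_velocities, gaps)
        axial = pool.submit(axial_velocities, subchannels)
        # Both results are needed before the next axial layer is processed.
        return transverse.result(), axial.result()
```

Splitting the original combined function in two is what makes this possible: each half touches a separate set of inputs and outputs, so neither thread waits on the other until both results are collected.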
In addition, solving the pressure coefficient matrix generally requires calling an existing mathematical solver library (such as PETSc). To optimize this part, the solver library functions called during the calculation are ported to the GPU, and the GPU version of the library is then called to accelerate the solution of the pressure coefficient matrix.
The parallel solution flow of each solution domain after optimization is shown in Fig. 5.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include such modifications and variations.

Claims (1)

1. A parallel implementation and optimization method for numerical reactor thermal-hydraulic subchannel simulation, characterized in that: the thermal-hydraulic subchannel simulation calculation is implemented on a hybrid heterogeneous CPU + GPU architecture; the heat transfer coefficient calculation within the solid heat conduction part is ported to the GPU, and the heat transfer coefficient calculation is further parallelized with OpenMP when traversing the subchannels and the axial nodes; in the part that establishes and solves the momentum equations, the calculation of the transverse and axial velocities of the axial nodes is ported to the GPU, the GPU function that solves the transverse and axial velocities is split into two functions, and two OpenMP threads solve the axial velocity and the transverse velocity of one axial layer simultaneously;
the heat transfer coefficient calculation within the solid heat conduction part is ported to the GPU as follows:
1) copying the arrays required for the calculation, which store the subchannel information, the gap information and the axial node information, from CPU memory to GPU memory for use by the GPU;
2) dividing the large-scale calculation into small-scale calculations that share the same calculation flow;
3) calling the function that calculates the heat transfer coefficient on the GPU, and mapping the small-scale calculation fragments onto the computing units of the GPU for solution;
4) transferring the results from device memory to CPU memory for the subsequent program calculation;
the heat transfer coefficient calculation is further parallelized with OpenMP by using n threads for this calculation, the original calculation being divided equally into n parts and each thread taking on 1/n of the original workload;
the calculation of the transverse and axial velocities of the axial nodes is ported to the GPU as follows:
(1) transferring the gap information and the subchannel information of axial layer j from CPU memory to device memory;
(2) calling a GPU function that traverses the gaps and the subchannels of the axial layer, respectively, to obtain the transverse and axial velocities of the nodes at layers j and j+1;
(3) transferring the calculated transverse and axial velocities from device memory back to CPU memory;
when the pressure coefficient matrix is solved, the solver library functions called during the calculation are ported to the GPU, and the solver library functions ported to the GPU are then called to optimize the solution of the pressure coefficient matrix.
CN201910788607.3A 2019-08-26 2019-08-26 Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation Active CN110543711B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910788607.3A CN110543711B (en) 2019-08-26 2019-08-26 Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910788607.3A CN110543711B (en) 2019-08-26 2019-08-26 Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation

Publications (2)

Publication Number Publication Date
CN110543711A (en) 2019-12-06
CN110543711B (en) 2021-07-20

Family

ID=68712021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910788607.3A Active CN110543711B (en) 2019-08-26 2019-08-26 Parallel implementation and optimization method for numerical reactor thermal hydraulic sub-channel simulation

Country Status (1)

Country Link
CN (1) CN110543711B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008070B (en) * 2019-12-10 2023-05-12 北京科技大学 Parallel task division method and system for full reactor core sub-channels of fast neutron reactor
CN113360187B (en) * 2021-04-22 2022-11-04 电子科技大学 Three-dimensional Kriging algorithm cooperative acceleration method based on CUDA and OpenMP

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103713938A (en) * 2013-12-17 2014-04-09 江苏名通信息科技有限公司 Multi-graphics-processing-unit (GPU) cooperative computing method based on Open MP under virtual environment
CN105911532A (en) * 2016-06-29 2016-08-31 北京化工大学 Synthetic aperture radar echo parallel simulation method based on depth cooperation
CN106874113A (en) * 2017-01-19 2017-06-20 国电南瑞科技股份有限公司 A kind of many GPU heterogeneous schemas static security analysis computational methods of CPU+
CN109903870A (en) * 2019-03-15 2019-06-18 西安交通大学 A kind of across dimension coupled simulation method of Nuclear Power System

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102053945B (en) * 2009-11-09 2012-11-21 中国科学院过程工程研究所 Concurrent computational system for multi-scale discrete simulation
WO2012112302A2 (en) * 2011-02-17 2012-08-23 Siemens Aktiengesellschaft Parallel processing in human-machine interface applications
US10594555B2 (en) * 2016-12-16 2020-03-17 Intelligent Platforms, Llc Cloud-enabled testing of control systems

Also Published As

Publication number Publication date
CN110543711A (en) 2019-12-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant