CN103064819A - Method for utilizing Many Integrated Core (MIC) to rapidly achieve lattice Boltzmann parallel acceleration - Google Patents


Info

Publication number
CN103064819A
Authority
CN
China
Prior art keywords
mic
distribution function
grid
lattice
cpu
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012104120747A
Other languages
Chinese (zh)
Inventor
张广勇
张清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-10-25
Filing date
2012-10-25
Publication date
2013-04-24
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN2012104120747A
Publication of CN103064819A
Legal status: Pending


Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for using a Many Integrated Core (MIC) card to rapidly achieve parallel acceleration of the lattice Boltzmann method. In the method, the CPU end sets parameters such as the computational domain, reference length, inflow velocity, density and Reynolds number according to the physical problem, and sets the kernel thread count according to the number of cores of the MIC card. The MIC end computes the equilibrium distribution functions of all lattice points in every direction from the macroscopic parameters (density, velocity, Reynolds number, viscosity coefficient, etc.) and uses them as the initial field of the computation, solves the discrete equation and treats the boundaries in parallel, and passes the result of the final iteration back to the CPU end. The fast computation of the MIC end is used for the migration and collision calculations of the lattice Boltzmann method, and the iterative process is accelerated through the cooperative operation of the CPU end and the MIC end.

Description

A method for rapidly achieving parallel acceleration of the lattice Boltzmann method using MIC
Technical field
The present invention relates to the fields of high-performance computing and computational fluid dynamics, and specifically to a method for rapidly accelerating the lattice Boltzmann method using Intel's MIC.
Background art
The lattice Boltzmann method (LBM) has developed into an effective numerical simulation method over the past 20 years. It is a mesoscopic method that lies between microscopic molecular dynamics and macroscopic methods based on the continuum hypothesis. Unlike traditional fluid simulation methods, it is based on molecular kinetic theory: it tracks the transport of particle distribution functions and then takes moments of the distribution functions to obtain macroscopic averaged quantities. This kinetic-theory basis makes the lattice Boltzmann method more effective for simulating many complex flows, such as flow in porous media, suspension flow, multiphase flow and multicomponent flow. The lattice Boltzmann method is inherently parallel and has advantages such as simple boundary treatment and ease of implementation.
The basic process of solving a physical problem with the LBM method is shown in Fig. 1. For a specific physical problem, the following steps are carried out first:
(1) based on various simplifying assumptions, build the physical model, determine the computational domain, initial conditions, boundary conditions, etc., and select the lattice Boltzmann model appropriate to the physical problem;
(2) divide the grid; assume the grid size is NX*NY;
(3) according to the chosen lattice Boltzmann model, select the governing equation and discretize it. For example, when the standard lattice Boltzmann method is used to simulate isothermal incompressible flow, the discretized governing equation is the LBGK equation.
These first 3 steps are carried out before the numerical simulation. The numerical simulation stage then begins:
(4) according to the physical problem, specify the macroscopic parameters (density, velocity, viscosity coefficient, etc.) on all lattice points, and from them compute the equilibrium distribution functions in all directions on all lattice points, which serve as the initial field of the computation;
(5) solve the discretized governing equation, for example by using the migration-collision rule to solve the LBGK equation;
(6) according to the boundary conditions, apply the boundary treatment scheme at the corresponding boundary lattice points;
(7) based on the rules by which the chosen lattice Boltzmann model defines macroscopic quantities, compute the macroscopic parameters on each lattice point;
(8) judge whether the computation has converged;
(9) if the computation has converged, output the result; otherwise return to step 4 and continue solving until convergence.
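Step (8) does not say how convergence is judged. The following is a minimal sketch, assuming one common criterion: the relative change of the velocity field between two successive checks falls below a tolerance. The function name `lbm_has_converged`, its parameters and the tolerance are illustrative and do not come from the patent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when the relative change of the velocity
 * field between two checks is below tol, else 0.
 * ux/uy are the current velocity components, ux_old/uy_old the previous
 * ones, n is the number of lattice points (NX*NY). */
int lbm_has_converged(const float *ux, const float *uy,
                      const float *ux_old, const float *uy_old,
                      size_t n, double tol)
{
    double diff = 0.0, norm = 0.0;
    assert(n > 0 && tol > 0.0);
    for (size_t k = 0; k < n; k++) {
        double dx = (double)ux[k] - ux_old[k];
        double dy = (double)uy[k] - uy_old[k];
        diff += dx * dx + dy * dy;
        norm += (double)ux[k] * ux[k] + (double)uy[k] * uy[k];
    }
    if (norm == 0.0)            /* fluid at rest: converged only if nothing changed */
        return diff == 0.0;
    /* compare squared quantities so that no square root is needed */
    return diff <= tol * tol * norm;
}
```

In a driver loop this check would typically be evaluated every few hundred iterations rather than every step, since it needs a copy of the previous velocity field.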
The widely used single-relaxation-time (BGK) lattice Boltzmann model is based on the following evolution equation:
$$f_i(\mathbf{x} + \mathbf{e}_i \delta t,\ t + \delta t) - f_i(\mathbf{x}, t) = -\frac{1}{\tau}\left[f_i(\mathbf{x}, t) - f_i^{eq}(\mathbf{x}, t)\right]$$
Here $f_i(\mathbf{x}, t)$ is the particle distribution function, which represents the probability that, at time $t$ and position $\mathbf{x}$, there is a particle moving with microscopic velocity $\mathbf{e}_i$. The relaxation time $\tau$ represents the rate at which local equilibrium is reached and is related to the kinematic viscosity of the fluid. The equilibrium distribution function $f_i^{eq}$ is the low-Mach-number approximation of the Maxwell-Boltzmann distribution and depends on the density $\rho$ and flow velocity $\mathbf{u}$ of the fluid. The relation between them is given by:
$$f_i^{eq} = w_i \rho \left[1 + 3(\mathbf{e}_i \cdot \mathbf{u}) + \frac{9}{2}(\mathbf{e}_i \cdot \mathbf{u})^2 - \frac{3}{2}\,\mathbf{u} \cdot \mathbf{u}\right]$$
where, in the D2Q9 model (in lattice units):
$$w_0 = \frac{4}{9},\quad w_{1,\dots,4} = \frac{1}{9},\quad w_{5,\dots,8} = \frac{1}{36}$$
The fluid density and velocity can then be calculated from the particle distribution functions as follows:
$$\rho = \sum_i f_i,\qquad \rho\,\mathbf{u} = \sum_i \mathbf{e}_i f_i$$
The discrete velocities $\mathbf{e}_i$ and the number N of particle distribution functions depend on the chosen lattice Boltzmann model. In the D2Q9 model $\mathbf{e}_i$ has 9 components, and the number of corresponding particle distribution functions is also 9; see Fig. 2.
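The D2Q9 equilibrium distribution function described above can be sketched in C as follows. This is an illustration under stated assumptions rather than the patent's own code: it uses lattice units, the standard D2Q9 weights 4/9, 1/9 and 1/36, and a direction order (rest, E, N, W, S, NE, NW, SW, SE) chosen to match the array names that appear in the kernel later in this document. The names `d2q9_equilibrium`, `D2Q9_EX`, `D2Q9_EY` and `D2Q9_W` are hypothetical.

```c
#include <assert.h>

/* D2Q9 lattice velocities and weights in lattice units.
 * Assumed direction order: rest, E, N, W, S, NE, NW, SW, SE. */
static const int   D2Q9_EX[9] = {0, 1, 0, -1,  0, 1, -1, -1,  1};
static const int   D2Q9_EY[9] = {0, 0, 1,  0, -1, 1,  1, -1, -1};
static const float D2Q9_W[9]  = {4.0f / 9.0f,
                                 1.0f / 9.0f,  1.0f / 9.0f,  1.0f / 9.0f,  1.0f / 9.0f,
                                 1.0f / 36.0f, 1.0f / 36.0f, 1.0f / 36.0f, 1.0f / 36.0f};

/* Fill feq[0..8] with the equilibrium distribution for density rho and
 * velocity (ux, uy). */
void d2q9_equilibrium(float rho, float ux, float uy, float feq[9])
{
    assert(rho > 0.0f);
    const float usq = 1.5f * (ux * ux + uy * uy);      /* (3/2) u.u */
    for (int i = 0; i < 9; i++) {
        /* eu = 3 (e_i . u), so 0.5 * eu * eu = (9/2) (e_i . u)^2 */
        const float eu = 3.0f * (D2Q9_EX[i] * ux + D2Q9_EY[i] * uy);
        feq[i] = D2Q9_W[i] * rho * (1.0f + eu + 0.5f * eu * eu - usq);
    }
}
```

Summing the nine values returns the density, and summing them weighted by the lattice velocities returns the momentum, which is a quick way to check an implementation.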
MIC (Many Integrated Core) is a many-core processor released by Intel. Compared with a general-purpose multi-core Xeon, the MIC many-core architecture has smaller cores and hardware threads; the density of computing resources in a many-core processor is higher, on-chip communication overhead is significantly reduced, and the larger budget of transistors and energy can handle more complex parallel applications. Intel MIC products are x86-based many-core processors built from heavyweight cores, with more than 50 cores, a 512-bit vector width, and double-precision performance exceeding 1 TFlops.
OpenMP is a set of compiler directives for multithreaded programming on shared-memory parallel systems. The MIC platform supports the same OpenMP programming model, which shortens the development cycle of MIC parallel programs and gives good support for traditional parallel programming languages. The OpenMP parallel programming model can therefore be used to implement high-performance parallel application software on the MIC platform quickly and obtain a rapid performance improvement.
The LBM method requires a large amount of computation. Take the computation of a square cavity as an example: assume the grid size is 1024*1024 and 10000 iterations are performed, with every grid point carrying out one migration and one collision calculation per iteration. On a quad-core Intel Xeon X5450 clocked at 3.00 GHz this computation takes several hours, and larger grids with more iterations take several days, which seriously limits the usefulness of the LBM method. At present, LBM is often run on large x86 server clusters. The principle is to partition the computational load and assign it to the nodes, have each node compute separately and exchange data after every iteration step, and finally gather the results for output. Because the peak floating-point capability of CPUs is low and the network transfer overhead is high, this approach consumes a great deal of time, electric power and maintenance cost. Moreover, as the required turnaround time of fluid simulation becomes shorter and the required accuracy higher, PC server clusters keep growing, and they face great challenges in system construction cost, data center space, power consumption and cooling, power limits, manageability, ease of programming, scalability, and management and maintenance cost.
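To give a sense of scale, the workload of the square cavity example can be written out. Counting one migration-and-collision update per lattice point per iteration:

$$1024 \times 1024 \times 10000 = 1.048576 \times 10^{10}\ \text{lattice-point updates}$$

Assuming the D2Q9 model used elsewhere in this document, each update handles 9 distribution values, so roughly $9 \times 1.048576 \times 10^{10} \approx 9.4 \times 10^{10}$ distribution values are read and written over the whole run.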
It can be seen that, to meet the demands of fluid simulation, a method is needed that improves the computational performance of LBM while reducing machine room construction cost and management, operation and maintenance cost. MIC can address these problems well.
Summary of the invention
The object of the invention is to provide a method for rapidly achieving parallel acceleration of the lattice Boltzmann method using MIC, which improves processing performance and efficiency by having the CPU and the MIC compute cooperatively, thereby meeting the demands of fluid simulation and reducing machine room construction cost and management, operation and maintenance cost.
The object of the invention is achieved as follows. The initialization of the basic parameters is carried out at the CPU end, while the time-consuming and highly parallel parts (equilibrium distribution function calculation, macroscopic quantity statistics, solving the discrete equation, and boundary treatment) are designed for parallel execution with OpenMP so that they run in parallel at the MIC end. The CPU and the MIC compute cooperatively, and the lattice Boltzmann method is thereby accelerated.
The method involves a CPU end and a MIC end, wherein:
The CPU end divides the grid according to the physical problem, specifies the macroscopic parameters (density, velocity, reference length, Reynolds number and viscosity coefficient) on all lattice points of the grid, sets the thread execution configuration of the kernel, starts the parallel computation at the MIC end, and receives the iterative computation result of the MIC end to obtain the final fluid state;
The MIC end, according to the thread execution configuration, uses the corresponding threads in parallel and, starting from the equilibrium distribution functions in all directions on all lattice points, applies migration and collision and then boundary treatment in turn to obtain the distribution functions of the next time layer;
The specific steps are as follows:
1) The CPU end divides the grid, sets the initial values of the macroscopic parameters on all lattice points of the grid, and sets the number of threads that execute the iterative computation in parallel according to the number of cores of the MIC card, specifically:
The flow field computational domain is divided into a grid according to the requirements of the physical problem. The grid size is NX*NY, where NX is the number of lattice points in the x direction and NY the number in the y direction, so the number of nodes on the grid is N=NX*NY. The number of threads that execute the migration-collision calculation in parallel is set according to the number of cores of the MIC card used: if the MIC card has M cores, the thread count of the migration-collision calculation is T=4*M;
2) The MIC end uses a kernel that computes distribution functions to calculate, from the initial macroscopic parameters, the equilibrium distribution functions in all directions on all lattice points, and then reaches the converged state of the flow field through repeated iterations of migration and collision and boundary treatment;
3) For the MIC end, the initial macroscopic parameters of the flow field and the thread execution configuration are specified and transferred into the memory of the MIC end, and the final result is read from the MIC memory after the MIC end has finished computing;
4) The MIC end, according to the thread execution configuration, uses the corresponding threads in parallel to obtain the distribution functions of the next time layer of the flow field from those of the current time layer through migration-collision and boundary treatment, specifically:
5) The MIC end uses T threads in parallel to apply the migration-collision and boundary treatment algorithm to the N lattice points of the fluid grid, starting from the initial distribution function F_i^(0) or the distribution function F_i^(k) computed in the previous step, and thereby obtains the distribution function F_i^(k+1) of the lattice points at the next time layer, where i takes the b+1 values from 0 to b, denoting the distribution functions of the b+1 directions on a lattice point, and k is an integer greater than or equal to 1;
6) The CPU end controls the number of iterations and passes the final result of the MIC end back to the CPU end, wherein the CPU end controls ITR iterations, i.e. the kernel is called ITR times, ITR being the number of iterations performed in the fluid simulation;
7) After ITR iterations have been carried out on the lattice points, the MIC end uses a kernel to compute in parallel the macroscopic flow field parameters (velocity, density and stream function) from the final distribution function F_i^(ITR) of the lattice points;
8) The CPU end writes the velocity field, density field and stream function computed by the MIC end back to the storage module as the result.
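Step 7) can be illustrated with a short sketch that recovers density and velocity from the final distribution functions, using the moment sums given in the background section. The storage layout (nine arrays indexed by lattice point), the direction order and the name `lbm_macroscopic` are assumptions made for this example; the stream function is left out because the text does not say how it is computed.

```c
#include <assert.h>

/* Hypothetical helper for step 7): compute density and velocity at every
 * lattice point from the nine D2Q9 distribution arrays.
 * f[d][k] is direction d at lattice point k, with the assumed direction
 * order rest, E, N, W, S, NE, NW, SW, SE. n is NX*NY. */
void lbm_macroscopic(const float *const f[9], int n,
                     float *rho, float *ux, float *uy)
{
    static const int ex[9] = {0, 1, 0, -1,  0, 1, -1, -1,  1};
    static const int ey[9] = {0, 0, 1,  0, -1, 1,  1, -1, -1};
    assert(n > 0);
    /* every lattice point is independent, so the loop is data parallel */
    #pragma omp parallel for
    for (int k = 0; k < n; k++) {
        float r = 0.0f, mx = 0.0f, my = 0.0f;
        for (int d = 0; d < 9; d++) {
            r  += f[d][k];              /* density: sum of all directions */
            mx += ex[d] * f[d][k];      /* x momentum */
            my += ey[d] * f[d][k];      /* y momentum */
        }
        rho[k] = r;
        ux[k]  = mx / r;
        uy[k]  = my / r;
    }
}
```

Compiled without OpenMP support, the pragma is ignored and the loop simply runs serially, which is convenient for checking results on the CPU end.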
The beneficial effects of the invention are as follows. Testing shows that migration and collision together with boundary treatment are the performance bottleneck of the LBM algorithm, and the data in this part are fully independent, which makes it well suited to multithreaded parallel computation on the MIC; the initialization of parameters and the output of results, which take little time, remain at the CPU end, and the CPU and the MIC compute cooperatively. In testing, overall performance improved by a factor of 54, so that one MIC compute node now delivers the computational performance of more than 54 of the original CPU cores. This not only meets the demands of fluid simulation but also greatly reduces power consumption and lowers machine room construction cost and management, operation and maintenance cost. The method is also simple to implement and has a low development cost.
Description of drawings
Fig. 1 is the basic flowchart of solving a simulation with the LBM method;
Fig. 2 is a diagram of the D2Q9 model;
Fig. 3 is a flowchart of an embodiment of the method of accelerating LBM using MIC;
Fig. 4 is a schematic diagram of the migration process.
Detailed description of the embodiments
The method of the invention is explained in detail below with reference to the accompanying drawings.
To make the purpose, technical solution and advantages of the invention clearer, the invention is described in detail below with reference to the drawings and an example. The method is divided into the following steps:
1) locate the performance bottleneck of the lattice Boltzmann method;
When LBM is used for fluid simulation, the most time-consuming part of the computation is solving the discrete equation and the boundary treatment. This process takes up most of the time of the whole simulation, while the other parts take almost none. The iterative process of solving the discrete equation and treating the boundaries is therefore the performance bottleneck of LBM;
2) parallelism analysis;
Analysis of the serial algorithm for solving the discrete equation and treating the boundaries in LBM shows that the migration, collision, macroscopic quantity statistics, equilibrium distribution function calculation and boundary treatment of each grid point are all data parallel;
3) parallel design of solving the discrete equation and the boundary treatment;
a) The discrete equation can be solved by the migration-collision process. In the macroscopic quantity statistics, the equilibrium distribution function calculation and the collision process there is no dependence between the calculations of different grid points, so each thread on the MIC can be made responsible for one row of grid points of the grid, and the calculation of each row can be further accelerated with the vectorization capability of the MIC. The migration of the distribution functions involves only the lattice points surrounding a given lattice point and can likewise be done by a single thread reading the relevant distribution functions from global memory;
b) In the LBM algorithm the boundaries need special treatment (non-equilibrium extrapolation, bounce-back). The calculations at different boundary lattice points also have no data dependence, so OpenMP threads can be made responsible for the boundary lattice points;
c) Design of the OpenMP thread model: the kernel thread count is set according to the number of MIC cores;
d) Write the MIC kernel code for solving the discrete equation and the boundary treatment.
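The boundary treatment in item b) can be sketched for the simplest case, bounce-back on a stationary wall. This is an illustration under assumptions, not the patent's kernel: the nine-array layout, the direction order, the list of wall indices and the name `lbm_bounce_back` are all hypothetical, and non-equilibrium extrapolation is not shown.

```c
#include <assert.h>

/* Hypothetical helper: full-way bounce-back on a stationary wall.
 * f[d][k] is direction d (assumed order: rest, E, N, W, S, NE, NW, SW, SE)
 * at lattice point k; nodes[] lists the indices of the wall points and
 * must not contain duplicates. Each distribution is exchanged with the one
 * in the opposite direction, which sends the particles that hit the wall
 * back where they came from. */
void lbm_bounce_back(float *const f[9], const int *nodes, int n_nodes, int threads)
{
    assert(n_nodes >= 0 && threads > 0);
    /* wall points do not depend on each other, so they are split across threads */
    #pragma omp parallel for num_threads(threads)
    for (int b = 0; b < n_nodes; b++) {
        const int k = nodes[b];
        float t;
        t = f[1][k]; f[1][k] = f[3][k]; f[3][k] = t;   /* E  <-> W  */
        t = f[2][k]; f[2][k] = f[4][k]; f[4][k] = t;   /* N  <-> S  */
        t = f[5][k]; f[5][k] = f[7][k]; f[7][k] = t;   /* NE <-> SW */
        t = f[6][k]; f[6][k] = f[8][k]; f[8][k] = t;   /* NW <-> SE */
    }
}
```

Passing the wall points as an index list keeps the loop body free of branches; a moving wall such as the lid of a cavity would need an extra momentum term that is not included here.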
Example
The object of the invention is to accelerate the lattice Boltzmann method and improve its processing performance by having the CPU and the MIC compute cooperatively, thereby meeting the demands of fluid simulation and reducing machine room construction cost and management, operation and maintenance cost. In this method the initialization is carried out at the CPU end, while the time-consuming and highly parallel solution of the discrete equation and the boundary treatment are designed for parallel execution with OpenMP so that they run in parallel at the MIC end. The CPU and the MIC compute cooperatively and the lattice Boltzmann method is thereby accelerated. As shown in Fig. 3, the specific steps and implementation are as follows:
(1) according to the physical problem, the macroscopic parameters (density, velocity, viscosity coefficient, etc.) on the computational domain are specified at the CPU end and passed to the MIC end;
(2) define the data structures and storage layout of the MIC end, which store the equilibrium distribution functions in all directions at each lattice point and the macroscopic parameters such as velocity and density at each lattice point; from the macroscopic parameters passed by the CPU end, compute the equilibrium distribution functions in all directions on all lattice points, which serve as the initial field of the computation;
(3) design the migration-collision kernel. The thread count is T=4*M, where M is the number of cores of the MIC card. Each thread in the kernel computes the migration and collision of one row of grid points, and #pragma ivdep is used to vectorize the inner loop of the kernel. The migration process is shown in Fig. 4, and the kernel is sketched as follows:
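/* fX0: distribution functions of the previous time layer; fX1: the next time layer. */
/* T is the thread count and tau the relaxation time; D2Q9 weights 4/9, 1/9, 1/36 are assumed. */
int i, j, k;
#pragma omp parallel for private(i, j, k) num_threads(T)
for (i = 1; i < NY - 1; i++) {
#pragma ivdep /* vectorize the inner loop */
    for (j = 1; j < NX - 1; j++) {
        k = i * NX + j; /* k is the index of the lattice point */
        /* migration: read each direction from its upstream neighbor */
        const float fr = fr0[k];
        const float fe = fe0[k - 1];
        const float fn = fn0[k - NX];
        const float fw = fw0[k + 1];
        const float fs = fs0[k + NX];
        const float fne = fne0[k - NX - 1];
        const float fnw = fnw0[k - NX + 1];
        const float fsw = fsw0[k + NX + 1];
        const float fse = fse0[k + NX - 1];
        /* collision process: macroscopic quantities from the migrated distribution functions */
        const float rho = fr + fe + fn + fw + fs + fne + fnw + fsw + fse;
        const float ux = (fe + fne + fse - fw - fnw - fsw) / rho;
        const float uy = (fn + fne + fnw - fs - fsw - fse) / rho;
        const float usq = 1.5f * (ux * ux + uy * uy);
        const float w1 = rho / 9.0f, w2 = rho / 36.0f;
        /* relax every direction toward its equilibrium distribution function */
        fr1[k] = fr - (fr - 4.0f * w1 * (1.0f - usq)) / tau;
        fe1[k] = fe - (fe - w1 * (1.0f + 3.0f * ux + 4.5f * ux * ux - usq)) / tau;
        fn1[k] = fn - (fn - w1 * (1.0f + 3.0f * uy + 4.5f * uy * uy - usq)) / tau;
        fw1[k] = fw - (fw - w1 * (1.0f - 3.0f * ux + 4.5f * ux * ux - usq)) / tau;
        fs1[k] = fs - (fs - w1 * (1.0f - 3.0f * uy + 4.5f * uy * uy - usq)) / tau;
        fne1[k] = fne - (fne - w2 * (1.0f + 3.0f * (ux + uy) + 4.5f * (ux + uy) * (ux + uy) - usq)) / tau;
        fnw1[k] = fnw - (fnw - w2 * (1.0f + 3.0f * (uy - ux) + 4.5f * (uy - ux) * (uy - ux) - usq)) / tau;
        fsw1[k] = fsw - (fsw - w2 * (1.0f - 3.0f * (ux + uy) + 4.5f * (ux + uy) * (ux + uy) - usq)) / tau;
        fse1[k] = fse - (fse - w2 * (1.0f + 3.0f * (ux - uy) + 4.5f * (ux - uy) * (ux - uy) - usq)) / tau;
    }
}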
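Because the kernel reads the arrays of the previous time layer and writes those of the next one, the two sets of arrays swap roles after every iteration. A minimal sketch of the surrounding iteration loop follows, assuming a nine-array layout and a step function with a hypothetical signature; `lbm_run`, `lbm_step_fn` and `lbm_demo_step` are illustrative names, and the stand-in step only exists so that the loop can be exercised without the real kernel.

```c
#include <assert.h>

/* Hypothetical signature of one time step: reads the previous time layer
 * from src and writes the next one to dst (9 arrays each), using the given
 * number of threads. */
typedef void (*lbm_step_fn)(float *const src[9], float *const dst[9],
                            int nx, int ny, int threads);

/* Run itr time steps with two buffers that swap roles after every step.
 * mic_cores is M in the text; the thread count follows T = 4*M.
 * Returns the buffer that holds the final time layer. */
float **lbm_run(float *a[9], float *b[9], int nx, int ny,
                int itr, int mic_cores, lbm_step_fn step)
{
    assert(itr >= 0 && mic_cores > 0 && step != 0);
    const int threads = 4 * mic_cores;
    float **src = a, **dst = b;
    for (int it = 0; it < itr; it++) {
        step(src, dst, nx, ny, threads);
        float **tmp = src;      /* the new layer becomes the old one */
        src = dst;
        dst = tmp;
    }
    return src;
}

/* Stand-in step used only to exercise the loop: copies the layer and adds 1
 * to the rest direction at point 0, so the number of steps can be counted. */
void lbm_demo_step(float *const src[9], float *const dst[9],
                   int nx, int ny, int threads)
{
    (void)threads;
    for (int d = 0; d < 9; d++)
        for (int k = 0; k < nx * ny; k++)
            dst[d][k] = src[d][k];
    dst[0][0] += 1.0f;
}
```

Swapping pointers instead of copying arrays keeps the cost of advancing a time layer independent of the grid size, which matters when the loop is called ITR times.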
(4) treat the boundaries at the MIC end. Methods such as bounce-back or non-equilibrium extrapolation can be used, and T threads are likewise assigned to the calculation of the boundary nodes;
(5) judge whether the iteration has finished; if so, output the result, otherwise continue iterating;
(6) the MIC end computes macroscopic parameters such as velocity, density and stream function in parallel from the distribution functions and passes the result to the CPU end; the CPU end outputs the result;
(7) performance test
a) test environment and test data
The test environment comprises the hardware environment, the software environment and the software under test, where the software under test comprises the serial LBM algorithm running on the CPU and the parallel LBM algorithm running on the MIC. Lid-driven cavity flow was chosen as the test case; the input comprises the grid size and some other input parameters. The details of the test environment and test data are shown in Table 1;
Table 1
b) performance results
To ensure stable performance results, the above test case was run 10 times with data type float. The CPU version of the LBM algorithm averaged 10324 seconds over 10 runs on a single CPU, and the MIC version averaged 191 seconds over 10 runs of the same case on a single MIC, so the MIC version is 10324/191=54 times as fast as the CPU version.
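Written out, the speedup quoted above is the ratio of the two average run times:

$$S = \frac{t_{\mathrm{CPU}}}{t_{\mathrm{MIC}}} = \frac{10324\ \mathrm{s}}{191\ \mathrm{s}} \approx 54.05$$

which the text rounds to 54.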
Apart from the technical features described in the specification, everything else is known to those skilled in the art.

Claims (1)

1. A method for rapidly achieving parallel acceleration of the lattice Boltzmann method using MIC, characterized in that it involves a CPU end and a MIC end, wherein:
The CPU end divides the grid according to the physical problem, specifies the macroscopic parameters (density, velocity, reference length, Reynolds number and viscosity coefficient) on all lattice points of the grid, sets the thread execution configuration of the kernel, starts the parallel computation at the MIC end, and receives the iterative computation result of the MIC end to obtain the final fluid state;
The MIC end, according to the thread execution configuration, uses the corresponding threads in parallel and, starting from the equilibrium distribution functions in all directions on all lattice points, applies migration and collision and then boundary treatment in turn to obtain the distribution functions of the next time layer;
The specific steps are as follows:
1) The CPU end divides the grid, sets the initial values of the macroscopic parameters on all lattice points of the grid, and sets the number of threads that execute the iterative computation in parallel according to the number of cores of the MIC card, specifically:
The flow field computational domain is divided into a grid according to the requirements of the physical problem. The grid size is NX*NY, where NX is the number of lattice points in the x direction and NY the number in the y direction, so the number of nodes on the grid is N=NX*NY. The number of threads that execute the migration-collision calculation in parallel is set according to the number of cores of the MIC card used: if the MIC card has M cores, the thread count of the migration-collision calculation is T=4*M;
2) The MIC end uses a kernel that computes distribution functions to calculate, from the initial macroscopic parameters, the equilibrium distribution functions in all directions on all lattice points, and then reaches the converged state of the flow field through repeated iterations of migration and collision and boundary treatment;
3) For the MIC end, the initial macroscopic parameters of the flow field and the thread execution configuration are specified and transferred into the memory of the MIC end, and the final result is read from the MIC memory after the MIC end has finished computing;
4) The MIC end, according to the thread execution configuration, uses the corresponding threads in parallel to obtain the distribution functions of the next time layer of the flow field from those of the current time layer through migration-collision and boundary treatment, specifically:
5) The MIC end uses T threads in parallel to apply the migration-collision and boundary treatment algorithm to the N lattice points of the fluid grid, starting from the initial distribution function F_i^(0) or the distribution function F_i^(k) computed in the previous step, and thereby obtains the distribution function F_i^(k+1) of the lattice points at the next time layer, where i takes the b+1 values from 0 to b, denoting the distribution functions of the b+1 directions on a lattice point, and k is an integer greater than or equal to 1;
6) The CPU end controls the number of iterations and passes the final result of the MIC end back to the CPU end, wherein the CPU end controls ITR iterations, i.e. the kernel is called ITR times, ITR being the number of iterations performed in the fluid simulation;
7) After ITR iterations have been carried out on the lattice points, the MIC end uses a kernel to compute in parallel the macroscopic flow field parameters (velocity, density and stream function) from the final distribution function F_i^(ITR) of the lattice points;
8) The CPU end writes the velocity field, density field and stream function computed by the MIC end back to the storage module as the result.
Application number CN2012104120747A; priority date 2012-10-25; filing date 2012-10-25; title: Method for utilizing Many Integrated Core (MIC) to rapidly achieve lattice Boltzmann parallel acceleration; status: Pending; publication: CN103064819A (en)


Publications (1)

Publication Number Publication Date
CN103064819A (en) 2013-04-24


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324531A (en) * 2013-06-09 2013-09-25 浪潮电子信息产业股份有限公司 Large eddy simulation method based on Boltzmann theory central processing unit (CPU)/ many integrated core (MIC) cooperative computing
CN103345580A (en) * 2013-07-02 2013-10-09 上海大学 Parallel CFD method based on lattice Boltzmann method
CN103530132A (en) * 2013-10-29 2014-01-22 浪潮电子信息产业股份有限公司 Method for transplanting CPU (central processing unit) serial programs to MIC (microphone) platform
CN103729180A (en) * 2013-12-25 2014-04-16 浪潮电子信息产业股份有限公司 Method for quickly developing CUDA (compute unified device architecture) parallel programs
CN103884343A (en) * 2014-02-26 2014-06-25 海华电子企业(中国)有限公司 Microwave integrated circuit (MIC) coprocessor-based whole-network shortest path planning parallelization method
CN109408867A (en) * 2018-09-12 2019-03-01 西安交通大学 A kind of explicit R-K time stepping method accelerated method based on MIC coprocessor
CN110187975A (en) * 2019-06-04 2019-08-30 成都申威科技有限责任公司 A kind of processor node distribution calculation method, storage medium and terminal device based on LBM
CN111105341A (en) * 2019-12-16 2020-05-05 上海大学 Framework method for solving computational fluid dynamics with low power consumption and high operational performance
CN112100099A (en) * 2020-09-28 2020-12-18 湖南长城银河科技有限公司 Lattice boltzmann optimization method for multi-core vector processor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102681972A (en) * 2012-04-28 2012-09-19 浪潮电子信息产业股份有限公司 Method for accelerating lattice-Boltzmann by utilizing graphic processing units (GPUs)
CN102945295A (en) * 2012-10-15 2013-02-27 浪潮(北京)电子信息产业有限公司 Parallel acceleration method and system of lattice Boltzmann method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102681972A (en) * 2012-04-28 2012-09-19 浪潮电子信息产业股份有限公司 Method for accelerating lattice-Boltzmann by utilizing graphic processing units (GPUs)
CN102945295A (en) * 2012-10-15 2013-02-27 浪潮(北京)电子信息产业有限公司 Parallel acceleration method and system of lattice Boltzmann method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
曾凡松 等: "LBM在流体渗透率计算中的应用与优化", 《计算机工程与应用》 *
黄昌盛 等: "基于CUDA的格子Boltzmann方法:算法设计与程序优化", 《科学通报》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324531A (en) * 2013-06-09 2013-09-25 浪潮电子信息产业股份有限公司 Large eddy simulation method based on Boltzmann theory central processing unit (CPU)/ many integrated core (MIC) cooperative computing
CN103345580B (en) * 2013-07-02 2016-04-27 上海大学 Based on the parallel CFD method of lattice Boltzmann method
CN103345580A (en) * 2013-07-02 2013-10-09 上海大学 Parallel CFD method based on lattice Boltzmann method
CN103530132A (en) * 2013-10-29 2014-01-22 浪潮电子信息产业股份有限公司 Method for transplanting CPU (central processing unit) serial programs to MIC (microphone) platform
CN103729180A (en) * 2013-12-25 2014-04-16 浪潮电子信息产业股份有限公司 Method for quickly developing CUDA (compute unified device architecture) parallel programs
CN103884343B (en) * 2014-02-26 2017-01-11 Haihua Electronics Enterprise (China) Co Ltd Many integrated core (MIC) coprocessor-based whole-network shortest path planning parallelization method
CN103884343A (en) * 2014-02-26 2014-06-25 Haihua Electronics Enterprise (China) Co Ltd Many integrated core (MIC) coprocessor-based whole-network shortest path planning parallelization method
CN109408867A (en) * 2018-09-12 2019-03-01 Xi'an Jiaotong University Explicit R-K time-stepping acceleration method based on MIC coprocessor
CN109408867B (en) * 2018-09-12 2021-04-20 Xi'an Jiaotong University Explicit R-K time-stepping acceleration method based on MIC coprocessor
CN110187975A (en) * 2019-06-04 2019-08-30 Chengdu Shenwei Technology Co Ltd Processor node distribution calculation method, storage medium and terminal device based on LBM
CN110187975B (en) * 2019-06-04 2020-08-18 Chengdu Shenwei Technology Co Ltd Multi-core processor resource allocation calculation method, storage medium and terminal equipment
CN111105341A (en) * 2019-12-16 2020-05-05 Shanghai University Framework method for solving computational fluid dynamics with low power consumption and high operational performance
CN112100099A (en) * 2020-09-28 2020-12-18 Hunan Greatwall Galaxy Technology Co Ltd Lattice Boltzmann optimization method for multi-core vector processor

Similar Documents

Publication Publication Date Title
CN103064819A (en) Method for utilizing many integrated core (MIC) to rapidly achieve lattice Boltzmann parallel acceleration
CN102945295B Parallel acceleration method and system of lattice Boltzmann method
CN102681972A (en) Method for accelerating lattice-Boltzmann by utilizing graphic processing units (GPUs)
Fryer et al. SNSPH: a parallel three-dimensional smoothed particle radiation hydrodynamics code
Zabelok et al. Adaptive kinetic-fluid solvers for heterogeneous computing architectures
CN103324531A (en) Large eddy simulation method based on Boltzmann theory central processing unit (CPU)/many integrated core (MIC) cooperative computing
Liu et al. Sunwaylb: Enabling extreme-scale lattice boltzmann method based computing fluid dynamics simulations on sunway taihulight
Xiang et al. GPU acceleration of CFD algorithm: HSMAC and SIMPLE
CN103778098A (en) Large eddy simulation system and method for realizing cooperative computing based on lattice-Boltzmann theory
Onodera et al. Acceleration of wind simulation using locally mesh-refined lattice boltzmann method on gpu-rich supercomputers
Abreu et al. PIC codes in new processors: A full relativistic PIC code in CUDA-enabled hardware with direct visualization
Shankar et al. GRaM-X: a new GPU-accelerated dynamical spacetime GRMHD code for Exascale computing with the Einstein Toolkit
Oliapuram et al. Realtime forest animation in wind
Amador et al. CUDA-based linear solvers for stable fluids
Norman et al. A holistic algorithmic approach to improving accuracy, robustness, and computational efficiency for atmospheric dynamics
Ho et al. Multi-agent simulation on multiple GPUs
Iwasawa et al. Global simulation of planetary rings on Sunway TaihuLight
Yang et al. Physically-based tree animation and leaf deformation using CUDA in real-time
Huang et al. Parallel Performance and Optimization of the Lattice Boltzmann Method Software Palabos Using CUDA
Sishtla et al. Multi-GPU acceleration of the iPIC3D implicit particle-in-cell code
Frisch et al. Adaptive multi-grid methods for parallel CFD applications
Spurzem et al. Astrophysical particle simulations with custom GPU clusters
Shukla et al. Multi-science applications with single codebase-GAMER-for massively parallel architectures
Khani et al. A D3Q19 lattice Boltzmann solver on a GPU using constant-time circular array shifting
Fietz Performance Optimization of Parallel Lattice Boltzmann Fluid Flow Simulations on Complex Geometries

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130424