CN106529063A

CN106529063A - CFD technology-based fluid system and design method thereof

Info

Publication number: CN106529063A
Application number: CN201611029542.7A
Authority: CN
Inventors: 周顺明; 斯特凡·普鲁布斯丁
Original assignee: Yixing Bada Fluid Technology Co Ltd
Current assignee: Yixing Bada Fluid Technology Co Ltd
Priority date: 2016-11-14
Filing date: 2016-11-14
Publication date: 2017-03-22

Abstract

The invention relates to a CFD technology-based fluid system and a design method thereof. The CFD technology-based fluid system comprises a CPU module, a GPU module, a flash memory module, an SSD solid-state disk module, a signal input end and a signal output end, wherein the CPU module is connected with the GPU module and the flash memory module separately; the SSD solid-state disk module is connected with the flash memory module; and the signal input end and the signal output end are connected with the flash memory module.

Description

CFD technology-based fluid system and design method thereof

Technical field

The present invention relates to an application of computational fluid dynamics, and in particular to a fluid system based on CFD technology and a design method thereof.

Background art

CFD stands for Computational Fluid Dynamics, a branch of fluid mechanics. CFD is the product of combining modern fluid mechanics, numerical mathematics and computer science, and is a vigorous interdisciplinary science. Using the computer as its tool, it applies various discretized mathematical methods to carry out numerical experiments, computer simulation and analysis on all kinds of fluid mechanics problems in order to solve practical problems.

Computational fluid dynamics, together with the related fields of numerical heat transfer and computational combustion, rests on numerically solving the coupled nonlinear differential equations for mass, energy, species, momentum and user-defined scalars. The solution predicts the details of flow, heat transfer, mass transfer, combustion and similar processes, and has become a powerful tool for the optimization and quantitative scale-up design of process equipment. The basic features of CFD are numerical simulation and computer experiment: starting from basic physical laws, it largely replaces costly fluid dynamics experimental facilities and has had a great impact on scientific research and engineering. It is currently a very active research field worldwide and a core technology for studying heat transfer, mass transfer, momentum transfer, combustion, multiphase flow and chemical reaction. It is widely used in aerospace design, automotive design, the biomedical industry, chemical processing, turbine design, semiconductor design, HVAC&R and many other engineering fields; plate-fin heat exchanger design is one of the important application areas of CFD.

CFD has developed rapidly over the last 20 years. Besides the solid material basis provided by the computer hardware industry, the main reason is that both analytical and experimental methods are severely limited: because of the complexity of a problem, an analytical solution may be impossible and an experimental determination may be unaffordable, whereas CFD is low in cost and can simulate fairly complex or fairly idealized processes. CFD software that has been validated can widen the scope of experimental research and reduce the amount of costly experimental work. Simulating a phenomenon numerically on a computer under given parameters is equivalent to performing a numerical experiment, and there have been cases in history where a new phenomenon was first found by CFD simulation and later confirmed by experiment. CFD software generally offers a range of optimized physical models, such as steady and unsteady flow, laminar flow, turbulent flow, incompressible and compressible flow, heat transfer and chemical reaction. For the flow characteristics of each physical problem there is a suitable numerical solution method, and the user can choose an explicit or implicit difference scheme to obtain the best calculation speed, stability and accuracy. Data can easily be exchanged between CFD packages, and unified pre-processing and post-processing tools are used; this spares researchers repetitive, inefficient work on computational methods, programming and pre- and post-processing, so that they can devote their attention to the physical problem itself.

CFD software generally consists of three major modules: a pre-processor, a solver and a post-processor, each of which plays its own distinct role.

Existing CFD-based fluid scheme designs combine conventional fluid parameters with traditional fluid mechanics calculation methods, and only a few integrate discrete data distribution and data learning functions. As a result, the resulting scheme designs have poor applicability.

Summary of the invention

In view of the above deficiencies of the prior art, the present invention provides a CFD technology-based fluid system and a design method thereof that offer a learning function and optimized database storage and make the overall system scheme more widely applicable.

To achieve the above technical purpose, the technical solution of the present invention is: a fluid scheme design architecture based on CFD technology, comprising a CPU module, a GPU module, a flash memory module and an SSD solid-state disk module. The CPU module is connected with the GPU module and the flash memory module respectively, and the SSD solid-state disk module is connected with the flash memory module. The architecture further comprises a signal input end and a signal output end, both of which are connected with the flash memory module.

To achieve the above technical purpose, the technical solution of the present invention is also: a fluid design method based on CFD technology, comprising the following steps:

Step 1: Initialize the parameters and allocate CPU and GPU memory; the parameters are initialized and the CPU and GPU memory is allocated according to the particle density $\rho$ and particle viscosity $\nu$ of the application scenario and the chosen Reynolds number $Re$; the initial parameters include the particle velocity $u = 0.1$ in the spatial velocity distribution model, the characteristic length $L = 1$, the relaxation-time parameter $\alpha_0 = 2$, the particle density distribution function $f = \{f_i = 0\}$, $i = 0, \dots, q-1$, and the equilibrium distribution function $f^{eq} = \{f_i^{eq}\}$, $i = 0, \dots, q-1$, with

$$f_i^{eq} = \rho\,\omega_i \prod_{j=1}^{d} \left(2 - \sqrt{1 + u_j^2}\right) \left(\frac{2u_j + \sqrt{1 + 3u_j^2}}{1 - u_j}\right)^{v_{ij}/v}$$

where $q$ is the number of directions of the spatial velocity distribution model, $d$ is the spatial dimension, $u_j = 0$ is the initial velocity in the $j$-th dimension, $\omega_i$ is the Gauss-Hermite integration weight with $0 < \omega_i < 1$, and $v_{ij}$ takes the value 0, +1 or -1; the direction-dimension sizes of the velocity array, the density distribution array and the equilibrium distribution array of the spatial velocity distribution model are taken as direction counts, and the arrays are defined as floating-point arrays; all direction counts are summed, and the sum multiplied by the byte length of a floating-point element gives the number of bytes to allocate in CPU and GPU memory, that is, the CPU and the GPU are allocated the same amount of memory;

Step 2: Transfer the velocity array, the density distribution array and the equilibrium distribution array of the spatial velocity distribution model stored in CPU memory to the global memory of the GPU;

Step 3: Allocate shared memory; the shared memory is allocated according to the CPU or GPU memory size allocated in step 1;

Step 4: Implement three functions in CUDA, the nVIDIA GPU programming language; 1) the H-α solver function: this function is defined as a device function with the qualifier __device__; it is called on the GPU device and runs on the device, and is used to adjust and calculate the relaxation-time parameter $\alpha$; the first-order Taylor expansion formula is used: $\alpha = \alpha^* - F(\alpha^*)/F'(\alpha^*)$, where $F(\alpha) = H(f + \alpha\Delta) - H(f)$, $\alpha^*$ is the previously calculated $\alpha$, the initial value of $\alpha$ is $\alpha_0$, $f_i(x, t)$ is abbreviated as $f_i$, $x$ and $t$ denote the position of the particle in space and the iteration time respectively, $f_i(x, t)$ is the particle probability value at position $x$ at iteration time $t$, and $F'(\alpha^*)$ is the derivative of $F$ with respect to $\alpha$ evaluated at $\alpha^*$; 2) the propagation-collision kernel function: this function is defined as a device-side function with the qualifier __global__; it is called from the host CPU and runs on the GPU device; 3) the boundary-processing kernel function: this function is defined as a device-side function with the qualifier __global__; it is called from the host CPU and runs on the GPU device; the qualifiers specify the type of each function;

Step 5: According to the set accuracy requirement and the set number of iterations, call the three functions implemented in step 4 in turn in each iteration to calculate the density function values $f_i$ and the equilibrium density function values, until the difference between two successive results meets the accuracy requirement or the number of iterations is reached, at which point the iteration ends;

Step 6: After the iteration ends, transfer the density function values and equilibrium density function values calculated in step 5 and stored in GPU global memory to CPU memory, and release the GPU memory;

Step 7: From the density function values and equilibrium density function values, obtain the stream-function contour plot using the ParaView software, release the host memory, and complete the parallel flow computation.

As can be seen from the above description, the present invention has the following advantages:

1) Implementing ELBM with the Compute Unified Device Architecture (CUDA) programming framework on the GPU of an nVIDIA graphics card shortens the simulation time by one third compared with simulation on the CPU;

2) By comparing three ways of solving the relaxation-time parameter in the entropic lattice Boltzmann model, the invention finds that directly approximating the parameter $\alpha$ is more efficient than the iterative method, reducing the time by 31.7% on average.

In summary, the present invention makes full use of the hardware resources of the system and demonstrates in practice the parallel computation of the entropic lattice Boltzmann model, thereby significantly improving the efficiency of the whole fluid calculation. Furthermore, by using parallel flow computation with the entropic lattice Boltzmann model, the invention meets the requirements of high stability and high accuracy and overcomes the problem that calculating the relaxation-time parameter in ELBM increases the amount of computation and is time-consuming.

Brief description of the drawings

Fig. 1 is a flow chart of the CFD technology-based fluid system and design method thereof;

Fig. 2 shows the two-dimensional particle velocity-space distribution model D2Q9;

Fig. 3 compares the calculation times of the three solver methods;

Fig. 4 compares the stream-function contours at Reynolds number 1000, where (a) is the stream-function contour plot after ELBM simulation of the present invention on the GPU, (b) is the stream-function contour plot after the ELBM simulation of reference [1] on the CPU, and (c) is the stream-function contour plot after the LBM simulation of reference [1] on the CPU.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

On the basis of the standard CFD-based fluid design method, the present invention chooses a suitable equilibrium distribution function so that the entropy of the system is maximized in the statistical sense. The discrete H entropy function is commonly defined as $H(f) = \sum_{i=0}^{q-1} f_i \ln(f_i/\omega_i)$, where $\omega_i$ is the Gauss-Hermite integration weight. For the entropic lattice Boltzmann equation, the relaxation-time parameter $\gamma$ must be adjusted so that it satisfies the H-theorem, that is, so that the H function satisfies the extremum condition. The monotonicity constraint on the H function consists of two processes: 1) a step in which the one-particle distribution function departs from the equilibrium state while the H function remains constant; 2) a dissipation step, which causes the entropy function to increase. In ELBM the relaxation-time parameter is defined as $\gamma = \alpha\beta$, where $\alpha$ is the adjustable parameter and $\beta$ is defined in terms of $\tau$, the relaxation time of standard LBM.
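By way of illustration only, the discrete H function can be evaluated as in the following minimal Python sketch. It assumes the commonly used form $H(f) = \sum_i f_i \ln(f_i/\omega_i)$ and the D2Q9 weights of the embodiment; the names `W`, `h_function`, `f_rest` and `f_perturbed` are hypothetical and do not come from the original disclosure.

```python
import math

# D2Q9 Gauss-Hermite weights (assumed ordering: rest, 4 axis, 4 diagonal directions)
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def h_function(f, w=W):
    """Discrete H function sum_i f_i * ln(f_i / w_i); every f_i must be positive."""
    return sum(fi * math.log(fi / wi) for fi, wi in zip(f, w))


# A population equal to the weights (unit density, fluid at rest) ...
f_rest = list(W)
# ... and a perturbed population with the same total mass (illustrative values).
f_perturbed = [w * (1 + e) for w, e in zip(W, [0, 0.2, -0.1, -0.1, 0, 0, 0, 0, 0])]

print(h_function(f_rest), h_function(f_perturbed))
```

In this sketch the rest state yields the smallest H value among populations of equal mass, and the perturbed state yields a larger one.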

Implementation of the parallel flow computation:

1. Description of the simulation scenario

Provided that the entropic CFD-based fluid design method is satisfied, the parallel algorithm aims to minimize the run time of the fluid simulation. Its focus is to optimize the fluid computation program by moving the compute-intensive parts to the GPU, where they are executed by a larger number of threads. Unlike a multi-threaded parallel implementation of the entropic CFD-based fluid design method on the CPU, the particle collision and propagation process on the GPU can compute the density function, the equilibrium distribution function and the parameter on hundreds or even thousands of times more threads than on the CPU, and the time cost of communication between the CPU and the GPU is fixed.

First, the relevant concepts of the simulation scenario are defined as follows. For the two-dimensional lid-driven cavity model, the spatial velocity distribution model D2Q9 is used, as shown in Fig. 2. The number of lattice cells of the two-dimensional regular square in the x direction is nx and in the y direction is ny; in the three-dimensional case, the number of lattice cells in the z direction, nz, is added. Because we simulate the parallel performance of the entropic CFD-based fluid design method for different grid sizes, nx and ny are kept configurable in the actual simulation for ease of operation. The 9 velocity direction vectors $e_i$ ($0 \le i < 9$) of a particle are defined as follows:

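$$e_i = \begin{cases} (0, 0), & i = 0 \\ \left(\cos\frac{(i-1)\pi}{2},\ \sin\frac{(i-1)\pi}{2}\right), & i = 1, \dots, 4 \\ \sqrt{2}\left(\cos\left(\frac{\pi}{4} + \frac{(i-5)\pi}{2}\right),\ \sin\left(\frac{\pi}{4} + \frac{(i-5)\pi}{2}\right)\right), & i = 5, \dots, 8 \end{cases}$$

Meanwhile, the Gauss-Hermite integration weights $\omega_i$ ($0 \le i < 9$) are defined as:

$$\omega_0 = \frac{4}{9}, \quad \omega_1 = \omega_2 = \omega_3 = \omega_4 = \frac{1}{9}, \quad \omega_5 = \omega_6 = \omega_7 = \omega_8 = \frac{1}{36}.$$

For the two-dimensional lattice coordinates (x, y), 0 ≤ x ≤ nx-1 and 0 ≤ y ≤ ny-1. The particle velocity array for the grid is u[i][j][k] (0 ≤ i < nx, 0 ≤ j < ny, 0 ≤ k < 9), and the particle density distribution function and equilibrium distribution function arrays for the grid are f[i][j][k] and feq[i][j][k] respectively (0 ≤ i < nx, 0 ≤ j < ny, 0 ≤ k < 9).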
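The velocity vectors and weights defined above can be generated programmatically. The following Python sketch is illustrative only; the names `d2q9_velocities` and `D2Q9_WEIGHTS` are hypothetical, and rounding to integers is an added convenience so that each component is exactly 0, +1 or -1.

```python
import math


def d2q9_velocities():
    """Return the 9 D2Q9 direction vectors e_i as integer pairs."""
    vels = [(0.0, 0.0)]  # i = 0: rest particle
    for i in range(1, 5):  # i = 1..4: axis directions
        a = (i - 1) * math.pi / 2
        vels.append((math.cos(a), math.sin(a)))
    for i in range(5, 9):  # i = 5..8: diagonal directions, length sqrt(2)
        a = math.pi / 4 + (i - 5) * math.pi / 2
        vels.append((math.sqrt(2) * math.cos(a), math.sqrt(2) * math.sin(a)))
    # Round away floating-point noise from the trigonometric functions.
    return [(int(round(x)), int(round(y))) for x, y in vels]


# Gauss-Hermite integration weights in the same index order as the velocities.
D2Q9_WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

for i, (e, w) in enumerate(zip(d2q9_velocities(), D2Q9_WEIGHTS)):
    print(i, e, w)
```

The weights sum to one, and each direction index i has the same weight as the opposite direction.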

The fluid density and viscosity are $\rho$ and $\upsilon$ respectively, the characteristic length is $L$, the lid velocity is $u$, and the Reynolds number is defined as $Re = uL/\upsilon$. The relaxation parameter is $\gamma = \alpha\beta$, where $\alpha$ is the adjustable parameter and $\beta$ is defined in terms of the time-step interval $\delta t$, whose value is 1, and $\tau$, the relaxation time of standard LBM.
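As a small worked example, and assuming the usual definition $Re = uL/\upsilon$, the viscosity implied by a chosen Reynolds number can be computed as follows. The function name `viscosity_from_reynolds` is hypothetical; the values u = 0.1 and L = 1 are the initial parameters of the embodiment, and Re = 1000 is the case shown in Fig. 4.

```python
def viscosity_from_reynolds(u, length, reynolds):
    """Viscosity implied by Re = u * L / viscosity (assumed definition)."""
    if reynolds <= 0:
        raise ValueError("Reynolds number must be positive")
    return u * length / reynolds


# Lid velocity 0.1, characteristic length 1, Reynolds number 1000.
nu = viscosity_from_reynolds(0.1, 1.0, 1000.0)
print(nu)
```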

2. Algorithm design

For the standard CFD-based fluid design method on the GPU, the particle collision and propagation processes on different lattice nodes are controlled and computed by different threads; specifically, each thread ID on the GPU corresponds to a different lattice node, and the post-collision density distribution function and equilibrium distribution function of the particles are computed concurrently. For the parallel computation of the entropic CFD-based fluid design method, the relaxation-time parameter can also be calculated and adjusted by GPU threads. Adjusting the relaxation-time parameter is known to amount to solving a nonlinear equation for each particle. It can be solved by three methods: Newton iteration, first-order Taylor expansion and second-order Taylor expansion. From the Boltzmann equation for the particles, together with analysis and numerical computation, the specific formulas are obtained:

1) The Newton iteration formula is:

$$\alpha_{n+1} = \alpha_n - \frac{F(\alpha_n)}{F'(\alpha_n)}$$

Here $F(\alpha) = H(f + \alpha(f^{eq} - f)) - H(f) = H(f + \alpha\Delta) - H(f)$, and $F'(\alpha)$ is the derivative of $F$ with respect to the parameter $\alpha$;

2) The first-order Taylor expansion formula is:

$$\alpha = \alpha^* - \frac{F(\alpha^*)}{F'(\alpha^*)}$$

where $F(\alpha) = H(f + \alpha\Delta) - H(f)$, $\Delta = f^{eq} - f$, $\alpha^*$ is the previously calculated $\alpha$, and the initial value of $\alpha$ is $\alpha_0$, namely 2;

3) The second-order Taylor expansion formula is:

$$\alpha = \alpha^* + \frac{-F_1(\alpha^*) + \sqrt{F_1(\alpha^*)^2 - 4F_2(\alpha^*)F(\alpha^*)}}{2F_2(\alpha^*)}$$

where $F_1(\alpha) = H'(f + \alpha\Delta)\Delta$, $\alpha^*$ is the previously calculated $\alpha$, and the initial value of $\alpha$ is $\alpha_0$, namely 2.
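The three solution methods above can be sketched on the CPU in Python as follows. This is an illustrative sketch under stated assumptions, not the GPU implementation of the invention: it assumes $H(f) = \sum_i f_i \ln(f_i/\omega_i)$ with D2Q9 weights, and it assumes $F_2 = F''/2$, which the text does not define. All function names are hypothetical.

```python
import math

W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4  # D2Q9 weights


def h_function(f):
    # Assumed discrete H function: sum_i f_i * ln(f_i / w_i)
    return sum(fi * math.log(fi / wi) for fi, wi in zip(f, W))


def F(alpha, f, feq):
    # F(alpha) = H(f + alpha * Delta) - H(f), with Delta = feq - f
    return h_function([fi + alpha * (fe - fi) for fi, fe in zip(f, feq)]) - h_function(f)


def dF(alpha, f, feq):
    # F1: first derivative of F with respect to alpha
    return sum((fe - fi) * (math.log((fi + alpha * (fe - fi)) / wi) + 1.0)
               for fi, fe, wi in zip(f, feq, W))


def d2F(alpha, f, feq):
    # Second derivative of F with respect to alpha
    return sum((fe - fi) ** 2 / (fi + alpha * (fe - fi)) for fi, fe in zip(f, feq))


def solve_newton(f, feq, alpha0=2.0, tol=1e-12, max_iter=50):
    """Newton iteration: repeat alpha <- alpha - F/F' until the step is tiny."""
    alpha = alpha0
    for _ in range(max_iter):
        step = F(alpha, f, feq) / dF(alpha, f, feq)
        alpha -= step
        if abs(step) < tol:
            break
    return alpha


def solve_first_order(f, feq, alpha_prev=2.0):
    """First-order Taylor expansion: a single correction from the previous alpha."""
    return alpha_prev - F(alpha_prev, f, feq) / dF(alpha_prev, f, feq)


def solve_second_order(f, feq, alpha_prev=2.0):
    """Second-order Taylor expansion; F2 = F''/2 is an assumption of this sketch."""
    f0 = F(alpha_prev, f, feq)
    f1 = dF(alpha_prev, f, feq)
    f2 = 0.5 * d2F(alpha_prev, f, feq)
    return alpha_prev + (-f1 + math.sqrt(f1 * f1 - 4.0 * f2 * f0)) / (2.0 * f2)


# Illustrative populations: equilibrium at rest and a mass-conserving perturbation.
feq = list(W)
f = [w * (1 + e) for w, e in zip(W, [0, 0.2, -0.1, -0.1, 0, 0, 0, 0, 0])]
print(solve_newton(f, feq), solve_first_order(f, feq), solve_second_order(f, feq))
```

For this mildly perturbed population all three values lie slightly below 2, and the two single-step expansions land close to the converged Newton result without iterating.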

A parallel flow computation method based on the entropic CFD-based fluid design method comprises the following steps:

Step 1: Initialize the parameters and allocate CPU and GPU memory;

The parameters are initialized and the CPU and GPU memory is allocated according to the particle density $\rho$ and particle viscosity $\nu$ of the application scenario and the chosen Reynolds number $Re$;

The initial parameters include the particle velocity $u = 0.1$ in the spatial velocity distribution model, the characteristic length $L = 1$, the relaxation-time parameter $\alpha_0 = 2$, the particle density distribution function $f = \{f_i = 0\}$, $i = 0, \dots, q-1$, and the equilibrium distribution function $f^{eq} = \{f_i^{eq}\}$, $i = 0, \dots, q-1$, with

$$f_i^{eq} = \rho\,\omega_i \prod_{j=1}^{d} \left(2 - \sqrt{1 + u_j^2}\right) \left(\frac{2u_j + \sqrt{1 + 3u_j^2}}{1 - u_j}\right)^{v_{ij}/v}$$

where $q$ is the number of directions of the spatial velocity distribution model, $d$ is the spatial dimension, $u_j = 0$ is the initial velocity in the $j$-th dimension, $\omega_i$ is the Gauss-Hermite integration weight with $0 < \omega_i < 1$, and $v_{ij}$ takes the value 0, +1 or -1;

The direction-dimension sizes of the velocity array, the density distribution array and the equilibrium distribution array of the spatial velocity distribution model are taken as direction counts, and the arrays are defined as floating-point arrays. All direction counts are summed, and the sum multiplied by the byte length of a floating-point element gives the number of bytes to allocate in CPU and GPU memory; that is, the CPU and the GPU are allocated the same amount of memory;
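The memory-size rule of step 1 can be illustrated with a short Python sketch. It rests on one reading of the text, namely that the three arrays u, f and feq each hold one 4-byte floating-point value per direction per lattice node; the function name `allocation_bytes` and its parameters are hypothetical.

```python
def allocation_bytes(nx, ny, directions=(9, 9, 9), float_bytes=4):
    """Bytes needed for the velocity, density and equilibrium arrays.

    directions: direction-dimension size of each array (assumed 9 for D2Q9).
    float_bytes: byte length of one floating-point element (assumed 4).
    """
    total_directions = sum(directions)  # accumulate all direction counts
    return nx * ny * total_directions * float_bytes


# The same size would be requested on the host and on the device.
size = allocation_bytes(256, 256)
print(size, "bytes, about", round(size / 2**20, 2), "MiB")
```

On the host this size would be passed to malloc and on the device to cudaMalloc, the two allocation functions named later in the description.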

Step 2: Transfer the velocity array, the density distribution array and the equilibrium distribution array of the spatial velocity distribution model stored in CPU memory to the global memory of the GPU;

Step 3: Allocate shared memory;

The shared memory is allocated according to the CPU or GPU memory size allocated in step 1;

Step 4: Implement three functions in CUDA, the nVIDIA GPU programming language;

1) The H-α solver function;

This function is defined as a device function with the qualifier __device__; it is called on the GPU device and runs on the device, and is used to adjust and calculate the relaxation-time parameter $\alpha$;

The first-order Taylor expansion formula is used:

$$\alpha = \alpha^* - \frac{F(\alpha^*)}{F'(\alpha^*)}$$

where $F(\alpha) = H(f + \alpha\Delta) - H(f)$, $\alpha^*$ is the previously calculated $\alpha$, the initial value of $\alpha$ is $\alpha_0$, $f_i(x, t)$ is abbreviated as $f_i$, $x$ and $t$ denote the position of the particle in space and the iteration time respectively, $f_i(x, t)$ is the particle probability value at position $x$ at iteration time $t$, and $F'(\alpha^*)$ is the derivative of $F$ with respect to $\alpha$ evaluated at $\alpha^*$;

2) The propagation-collision kernel function;

This function is defined as a device-side function with the qualifier __global__; it is called from the host CPU and runs on the GPU device;

In each iteration, the particle velocity, the particle density function value and the particle equilibrium distribution function value at the current iteration time step are calculated from the collision and propagation expressions of the standard LBM model and the relaxation-time parameter obtained after each adjustment;

3) The boundary-processing kernel function;

The bounce-back scheme is used: a particle travelling from the fluid towards the boundary is reflected by the boundary and returns along its original path with the same speed it had before hitting the wall;

The qualifiers specify the type of each function;
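The propagation-collision and boundary-processing functions of step 4 can be sketched serially in Python as below. The sketch assumes a relaxation of the form $f_i \leftarrow f_i + \alpha\beta(f_i^{eq} - f_i)$ followed by a shift along $e_i$, uses periodic wrap-around at the grid edges purely to stay self-contained, and takes the equilibrium values as input rather than computing them. The names `collide_and_stream`, `bounce_back`, `E` and `OPPOSITE` are hypothetical.

```python
# D2Q9 direction vectors and, for each direction, the index of its opposite.
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
OPPOSITE = [0, 3, 4, 1, 2, 7, 8, 5, 6]


def collide_and_stream(f, feq, alpha, beta):
    """One collision step with gamma = alpha * beta, then propagation.

    f and feq are nested lists indexed [x][y][k]. Periodic wrap-around is an
    assumption of this sketch, not the boundary treatment of the invention.
    """
    nx, ny = len(f), len(f[0])
    out = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            for k, (ex, ey) in enumerate(E):
                post = f[x][y][k] + alpha * beta * (feq[x][y][k] - f[x][y][k])
                out[(x + ex) % nx][(y + ey) % ny][k] = post  # move along e_k
    return out


def bounce_back(cell):
    """Bounce-back at a wall node: every population reverses its direction."""
    return [cell[OPPOSITE[k]] for k in range(9)]


# A single population moving in direction 1 on a 4 x 3 grid (illustrative).
grid = [[[0.0] * 9 for _ in range(3)] for _ in range(4)]
grid[0][0][1] = 1.0
moved = collide_and_stream(grid, grid, alpha=2.0, beta=0.5)
print(moved[1][0][1], bounce_back(list(range(9))))
```

On the GPU each (x, y) node of the double loop would typically be handled by its own thread, which matches the thread-per-node mapping described in the algorithm design.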

Step 5: According to the set accuracy requirement and the set number of iterations, call the three functions implemented in step 4 in turn in each iteration to calculate the density function values and the equilibrium density function values, until the difference between two successive results meets the accuracy requirement or the number of iterations is reached, at which point the iteration ends;
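The stopping rule of step 5 can be sketched as a generic Python loop. This is only an illustration: `step` stands in for the three functions of step 4 applied in sequence, the state is a flat list of values, and measuring the difference as the largest absolute change is an assumption. The name `iterate_until_converged` is hypothetical.

```python
def iterate_until_converged(state, step, tol, max_iters):
    """Apply step repeatedly until successive results differ by less than tol.

    Returns (final_state, iterations_used, converged_flag).
    """
    for it in range(1, max_iters + 1):
        new_state = step(state)
        # Largest absolute change between two successive results.
        diff = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if diff < tol:
            return state, it, True
    return state, max_iters, False


# Toy update that halves every value, standing in for one simulation iteration.
final, used, ok = iterate_until_converged([1.0], lambda s: [0.5 * x for x in s],
                                          tol=1e-3, max_iters=100)
print(final, used, ok)
```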

The H-α solver function in step 4 may also be implemented using the second-order Taylor expansion method;

The second-order Taylor expansion formula is:

$$\alpha = \alpha^* + \frac{-F_1(\alpha^*) + \sqrt{F_1(\alpha^*)^2 - 4F_2(\alpha^*)F(\alpha^*)}}{2F_2(\alpha^*)}$$

where $F_1(\alpha) = H'(f + \alpha\Delta)\Delta$, $\alpha^*$ is the previously calculated $\alpha$, and the initial value of $\alpha$ is 2.

In step 1, the CPU and GPU memory is allocated with the malloc function and the cudaMalloc function respectively.

When the parallel computation of the entropic CFD-based fluid design method runs, the host side mainly calls kernels on the device side to calculate the physical quantities of the particles; the kernels comprise collision and propagation, boundary processing and the parameter solver. The performance of the whole parallel computation is evaluated by the speed-up ratio and by the number of lattice updates per second, which are calculated as follows:

Parallel speed-up ratio = serial algorithm run time / parallel algorithm run time;

Lattice updates per second (MLUPS) = (product of the grid sizes of all dimensions × number of iterations × $10^{-6}$) / parallel computation time;
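The two performance indicators can be computed as in the following Python sketch. The function names `speedup` and `mlups` are hypothetical, and the timings in the usage lines are made-up illustrative inputs, not measurements from the invention.

```python
from math import prod


def speedup(serial_seconds, parallel_seconds):
    """Parallel speed-up ratio: serial run time divided by parallel run time."""
    return serial_seconds / parallel_seconds


def mlups(grid_shape, iterations, parallel_seconds):
    """Million lattice updates per second.

    grid_shape: number of lattice nodes in each dimension, e.g. (nx, ny).
    """
    return prod(grid_shape) * iterations * 1e-6 / parallel_seconds


# Illustrative inputs only: a 256 x 256 grid, 1000 iterations, 1 s of GPU time.
print(speedup(30.0, 10.0))
print(mlups((256, 256), 1000, 1.0))
```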

We solved the relaxation-time parameter with three different parallel approaches. The resulting parallel speed-up ratios for Newton iteration (N.R.), first-order Taylor expansion (FOE) and second-order Taylor expansion (SOE) are 3.08, 3.16 and 3.18 respectively. The percentage of the total computation time spent on solving the relaxation-time parameter for the three approaches is shown in Fig. 3.

For the other performance indicator, lattice updates per second, actual computations with different grid sizes and different thread counts show that, for a given nVIDIA graphics card, the choice of grid size and thread count strongly affects this indicator. In particular, on an nVIDIA G210 graphics card with a grid of 256×256 and 256 threads, we obtained 57.6 MLUPS. This shows that the run-time performance of the program is greatly improved and that the hardware resources of the graphics card are fully utilized.

The present invention and its embodiments have been described above. The description is not restrictive, and what is shown in the drawings is only one of the embodiments of the invention, to which the actual structure is not limited. In general, if a person of ordinary skill in the art, inspired by this disclosure, designs structures and embodiments similar to this technical solution without creative effort and without departing from the spirit of the invention, they shall fall within the protection scope of the present invention.

Claims

1. A fluid system based on CFD technology, comprising a CPU module, a GPU module, a flash memory module and an SSD solid-state disk module, wherein the CPU module is connected with the GPU module and the flash memory module respectively, and the SSD solid-state disk module is connected with the flash memory module; the system further comprises a signal input end and a signal output end, both of which are connected with the flash memory module.

2. A fluid design method based on CFD technology, comprising the following steps:

Step 1: Initialize the parameters and allocate CPU and GPU memory; the parameters are initialized and the CPU and GPU memory is allocated according to the particle density $\rho$ and particle viscosity $\nu$ of the application scenario and the chosen Reynolds number $Re$; the initial parameters include the particle velocity $u = 0.1$ in the spatial velocity distribution model, the characteristic length $L = 1$, the relaxation-time parameter $\alpha_0 = 2$, the particle density distribution function $f = \{f_i = 0\}$, $i = 0, \dots, q-1$, and the equilibrium distribution function $f^{eq} = \{f_i^{eq}\}$, $i = 0, \dots, q-1$, with

$$f_i^{eq} = \rho\,\omega_i \prod_{j=1}^{d} \left(2 - \sqrt{1 + u_j^2}\right) \left(\frac{2u_j + \sqrt{1 + 3u_j^2}}{1 - u_j}\right)^{v_{ij}/v}$$

where $q$ is the number of directions of the spatial velocity distribution model, $d$ is the spatial dimension, $u_j = 0$ is the initial velocity in the $j$-th dimension, $\omega_i$ is the Gauss-Hermite integration weight with $0 < \omega_i < 1$, and $v_{ij}$ takes the value 0, +1 or -1; the direction-dimension sizes of the velocity array, the density distribution array and the equilibrium distribution array of the spatial velocity distribution model are taken as direction counts, and the arrays are defined as floating-point arrays; all direction counts are summed, and the sum multiplied by the byte length of a floating-point element gives the number of bytes to allocate in CPU and GPU memory, that is, the CPU and the GPU are allocated the same amount of memory;

Step 3: Allocate shared memory; the shared memory is allocated according to the CPU or GPU memory size allocated in step 1;

Step 4: Implement three functions in CUDA, the nVIDIA GPU programming language; 1) the H-α solver function: this function is defined as a device function with the qualifier __device__; it is called on the GPU device and runs on the device, and is used to adjust and calculate the relaxation-time parameter $\alpha$; the first-order Taylor expansion formula is used: $\alpha = \alpha^* - F(\alpha^*)/F'(\alpha^*)$, where $F(\alpha) = H(f + \alpha\Delta) - H(f)$, $\alpha^*$ is the previously calculated $\alpha$, the initial value of $\alpha$ is $\alpha_0$, $f_i(x, t)$ is abbreviated as $f_i$, $x$ and $t$ denote the position of the particle in space and the iteration time respectively, $f_i(x, t)$ is the particle probability value at position $x$ at iteration time $t$, and $F'(\alpha^*)$ is the derivative of $F$ with respect to $\alpha$ evaluated at $\alpha^*$; 2) the propagation-collision kernel function: this function is defined as a device-side function with the qualifier __global__; it is called from the host CPU and runs on the GPU device; 3) the boundary-processing kernel function: this function is defined as a device-side function with the qualifier __global__; it is called from the host CPU and runs on the GPU device; the qualifiers specify the type of each function;

Step 5: According to the set accuracy requirement and the set number of iterations, call the three functions implemented in step 4 in turn in each iteration to calculate the density function values $f_i$ and the equilibrium density function values, until the difference between two successive results meets the accuracy requirement or the number of iterations is reached, at which point the iteration ends;

Step 6: After the iteration ends, transfer the density function values and equilibrium density function values calculated in step 5 and stored in GPU global memory to CPU memory, and release the GPU memory;

Step 7: From the density function values and equilibrium density function values, obtain the stream-function contour plot using the ParaView software, release the host memory, and complete the parallel flow computation.