CN114969857A - Structural design optimization method, system, computer equipment and storage medium - Google Patents

Publication number
CN114969857A
Authority
CN
China
Prior art keywords
equipment
calculation
unit
gpu
structural design
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110211739.7A
Other languages
Chinese (zh)
Inventor
王宇杰
崔向阳
蔡勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202110211739.7A
Publication of CN114969857A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/23 Design optimisation, verification or simulation using finite element methods [FEM] or finite difference methods [FDM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Complex Calculations (AREA)

Abstract

The application relates to a structural design optimization method, system, computer device and storage medium. The method comprises the following steps: identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and elements of the model, recording the elements to which each node belongs, and storing the relationship between nodes and elements; grouping the elements of the model by an element coloring method, and distributing computing tasks according to the number of GPU devices and their computing capability; performing symbolic assembly from the element stiffness matrices and the mapping relation, then performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix; and solving the linear system on the GPU devices, calculating sensitivities, summing by thread reduction to obtain the objective function value, and optimizing the structural design according to the objective function value. The method can effectively increase the problem size and the computational efficiency of parallel computing on GPU devices.

Description

Structural design optimization method, system, computer equipment and storage medium
Technical Field
The present application relates to the field of engineering design, and in particular, to a structural design optimization method, system, computer device, and storage medium.
Background
Structural design optimization originated at the beginning of the twentieth century and attracted wide attention from researchers by the middle of the century. Combining numerical analysis with computer-aided analysis, it is now widely applied to structural design in engineering practice. Size optimization is mainly used in the detailed design stage and improves structural performance by changing attributes such as length, width and cross-sectional area. Shape optimization is mainly used in the basic design stage and improves performance by changing the shape of the geometric boundary. Both are applied late in the design process, as local optimization techniques after the material layout has been fixed, and therefore cannot fundamentally improve structural performance. Topology optimization, by contrast, is mainly applied in the conceptual design stage: for given loads and boundary conditions, it seeks the material distribution that optimizes a specified performance index under given constraints. Compared with the other two methods, topology optimization has a well-developed theoretical basis and a large design space; it allows engineers to propose novel conceptual designs and to reduce manufacturing cost while improving structural performance.
Topology optimization theory has advanced significantly in recent years, but its demand for computational resources remains a major problem. On the one hand, each iteration of topology optimization involves a finite element solve, which is the main computational bottleneck. On the other hand, Sigmund points out that a coarse finite element mesh introduces an artificial length-scale constraint, which leads to many truss-like structures in the final result. Topology optimization therefore usually requires a very fine mesh to obtain a high-quality structure and to reduce post-processing effort in actual manufacturing. For three-dimensional (3D) problems in particular, a fine mesh greatly increases the number of degrees of freedom and hence the computation time.
Disclosure of Invention
In view of the above, it is necessary to provide a structural design optimization method, system, computer device and storage medium for solving the above technical problems.
In a first aspect, an embodiment of the present invention provides a structural design optimization method, the method comprising:
identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and elements of the model, recording the elements to which each node belongs, and storing the relationship between nodes and elements;
grouping the elements of the finite element calculation model by an element coloring method, and distributing computing tasks according to the number of GPU devices and their computing capability;
performing symbolic assembly from the element stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
solving the linear system on the GPU devices, with data communication between the GPU devices in P2P mode, calculating sensitivities, summing by thread reduction to obtain the objective function value, and completing the structural design optimization according to the objective function value.
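The first two steps can be illustrated with a small CPU-side sketch. It is only an illustration under stated assumptions: the function names `node_to_elements` and `color_elements`, the greedy coloring strategy and the four-element test mesh are hypothetical and are not taken from the application, which does not specify how the coloring is computed.

```python
from collections import defaultdict

def node_to_elements(elements):
    """Map each node id to the list of element ids that contain it."""
    owners = defaultdict(list)
    for eid, nodes in enumerate(elements):
        for n in nodes:
            owners[n].append(eid)
    return owners

def color_elements(elements):
    """Greedy coloring (an assumed strategy): elements that share a node
    never receive the same color."""
    owners = node_to_elements(elements)
    colors = [-1] * len(elements)
    for eid, nodes in enumerate(elements):
        # Colors already taken by neighbors, i.e. elements sharing a node.
        used = {colors[other] for n in nodes for other in owners[n]
                if colors[other] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[eid] = c
    return colors

# Hypothetical mesh: 2 x 2 quadrilaterals on a 3 x 3 node grid.
mesh = [(0, 1, 4, 3), (1, 2, 5, 4), (3, 4, 7, 6), (4, 5, 8, 7)]
colors = color_elements(mesh)
```

All four elements of this mesh touch node 4, so each ends up in its own color group; on larger meshes, each color group contains many elements that can be assembled concurrently without write conflicts.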
Further, solving the linear system comprises:
partitioning the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems that share the same columns but cover different rows;
solving the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
synchronizing and exchanging data between the GPU devices to obtain the complete displacement vector.
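A row-wise partition of this kind can be sketched as follows. This is a hedged illustration only: the helper names `split_rows` and `csr_row_block`, the capability weights `[3, 1]` and the 4 x 4 test matrix are assumptions made for the example, and the CSR layout is taken from the storage format named later in the description.

```python
def split_rows(n_rows, weights):
    """Return contiguous [start, end) row ranges proportional to weights
    (weights stand in for the relative computing capability of each GPU)."""
    total = sum(weights)
    bounds = [0]
    acc = 0
    for w in weights[:-1]:
        acc += w
        bounds.append(round(n_rows * acc / total))
    bounds.append(n_rows)
    return [(bounds[i], bounds[i + 1]) for i in range(len(weights))]

def csr_row_block(indptr, indices, data, start, end):
    """Extract rows [start, end) of a CSR matrix.
    Column indices stay global, so every block spans all columns."""
    lo, hi = indptr[start], indptr[end]
    return [p - lo for p in indptr[start:end + 1]], indices[lo:hi], data[lo:hi]

# Hypothetical 4 x 4 tridiagonal stiffness matrix in CSR form.
K_indptr = [0, 2, 5, 8, 10]
K_indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
K_data = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]

ranges = split_rows(4, [3, 1])   # the first device is assumed 3x faster
blocks = [csr_row_block(K_indptr, K_indices, K_data, s, e) for s, e in ranges]
```

Because each block keeps global column indices, a device needs the full vector to multiply with its rows, which is why the devices must exchange data during the solve.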
Further, the sensitivity calculation comprises:
dividing the finite element calculation model into a plurality of sub-models according to the number of GPU devices and their computing capability, each device being responsible for the computing task of one sub-model;
performing the GPU-accelerated sensitivity calculation in an element-wise parallel mode, with one thread responsible for the computing task of one element;
calculating the sensitivity value and the compliance objective value of each element, and finally summing by thread reduction to obtain the objective function value.
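The per-element work can be sketched as below. The application does not state the material interpolation, so this example assumes the common SIMP scheme, in which the compliance contribution of element e is rho_e^p * u_e^T k0 u_e and its sensitivity is -p * rho_e^(p-1) * u_e^T k0 u_e. The names `element_energy` and `element_task`, the penalty p = 3 and the two-element data are hypothetical.

```python
def element_energy(k0, u_e):
    """Strain-energy-like term u_e^T k0 u_e for one element."""
    n = len(u_e)
    return sum(u_e[i] * k0[i][j] * u_e[j] for i in range(n) for j in range(n))

def element_task(rho, k0, u_e, p=3.0):
    """Work of one 'thread': compliance contribution and its sensitivity,
    assuming SIMP interpolation with penalty p."""
    energy = element_energy(k0, u_e)
    return rho ** p * energy, -p * rho ** (p - 1) * energy

k0 = [[1.0, -1.0], [-1.0, 1.0]]           # hypothetical 2-DOF bar element
densities = [1.0, 0.5]
displacements = [[0.0, 1.0], [1.0, 3.0]]  # element displacement vectors

results = [element_task(r, k0, u) for r, u in zip(densities, displacements)]
compliance = sum(c for c, _ in results)   # stands in for the thread reduction
sensitivities = [s for _, s in results]
```

Each call to `element_task` reads only data of its own element, which is what makes the one-thread-per-element mapping free of write conflicts.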
Furthermore, the structural design optimization method also comprises sensitivity filtering and density updating;
the sensitivity filtering stores the filtered data in a newly created array and is computed in parallel on the GPU devices, with one thread responsible for one element;
the density updating uses hybrid CPU-GPU programming to update the element densities: the GPU devices compute the density update, with each thread computing the new density of one element, and the volume fraction is obtained by a reduction sum; the CPU evaluates the convergence of the density update and, in each iteration, adjusts the lower threshold and the upper threshold according to the optimality criteria update scheme.
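One way to read the division of labor in the density update is sketched below. It assumes the widely used optimality criteria update with bisection on a Lagrange multiplier, where the lower and upper thresholds are the bisection bounds; the application does not give the formula, so the update rule, the move limit 0.2, the minimum density and the names `update_density` and `oc_step` are all assumptions.

```python
import math

def update_density(x, dc, lam, move=0.2, x_min=1e-3):
    """Per-element update (one GPU thread per element in the text).
    Assumed rule: x_new = x * sqrt(-dc / lam), limited by a move bound."""
    new = []
    for xe, dce in zip(x, dc):
        cand = xe * math.sqrt(max(-dce, 0.0) / lam)
        cand = min(xe + move, max(xe - move, cand))
        new.append(min(1.0, max(x_min, cand)))
    return new

def oc_step(x, dc, vol_frac):
    """Host-side loop: bisect the multiplier until the volume constraint holds."""
    lower, upper = 0.0, 1e9
    while (upper - lower) / (upper + lower) > 1e-6:
        lam = 0.5 * (lower + upper)
        x_new = update_density(x, dc, lam)
        # The mean density plays the role of the reduced volume fraction.
        if sum(x_new) / len(x_new) > vol_frac:
            lower = lam     # too much material: raise the lower threshold
        else:
            upper = lam     # too little material: reduce the upper threshold
    return x_new

x_new = oc_step([0.5, 0.5, 0.5, 0.5], [-4.0, -3.0, -2.0, -1.0], vol_frac=0.5)
```

In this reading, `update_density` corresponds to the GPU kernel and the `while` loop in `oc_step` to the convergence check performed by the CPU.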
In another aspect, an embodiment of the present invention further provides a structural design optimization system, comprising:
a model reading module, configured to identify, through the CPU of a host, a finite element calculation model loaded in the host, read the nodes and elements of the model, record the elements to which each node belongs, and store the relationship between nodes and elements;
an element coloring module, configured to group the elements of the finite element calculation model by an element coloring method and to distribute computing tasks according to the number of GPU devices and their computing capability;
a stiffness matrix assembling module, configured to perform symbolic assembly from the element stiffness matrices and the mapping relation, and to perform numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
and an objective solving module, configured to solve the linear system on the GPU devices, with data communication between the GPU devices in P2P mode, calculate sensitivities, sum by thread reduction to obtain the objective function value, and optimize the structural design according to the objective function value.
Further, the objective solving module includes a linear system solving unit configured to:
partition the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems that share the same columns but cover different rows;
solve the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and synchronize and exchange data between the GPU devices to obtain the complete displacement vector.
Further, the objective solving module also includes a sensitivity calculating unit configured to:
divide the finite element calculation model into a plurality of sub-models according to the number of GPU devices and their computing capability, each device being responsible for the computing task of one sub-model;
perform the GPU-accelerated sensitivity calculation in an element-wise parallel mode, with one thread responsible for the computing task of one element;
and calculate the sensitivity value and the compliance objective value of each element, and finally sum by thread reduction to obtain the objective function value.
Furthermore, the structural design optimization system also comprises a sensitivity filtering module and a density updating module;
the sensitivity filtering module is configured to store the filtered data in a newly created array and to compute in parallel on the GPU devices, with one thread responsible for one element;
the density updating module uses hybrid CPU-GPU programming to update the element densities: the GPU devices compute the density update, with each thread computing the new density of one element, and the volume fraction is obtained by a reduction sum; the CPU evaluates the convergence of the density update and, in each iteration, adjusts the lower threshold and the upper threshold according to the optimality criteria update scheme.
An embodiment of the present invention further provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and elements of the model, recording the elements to which each node belongs, and storing the relationship between nodes and elements;
grouping the elements of the finite element calculation model by an element coloring method, and distributing computing tasks according to the number of GPU devices and their computing capability;
performing symbolic assembly from the element stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
solving the linear system on the GPU devices, with data communication between the GPU devices in P2P mode, calculating sensitivities, summing by thread reduction to obtain the objective function value, and completing the structural design optimization according to the objective function value.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and elements of the model, recording the elements to which each node belongs, and storing the relationship between nodes and elements;
grouping the elements of the finite element calculation model by an element coloring method, and distributing computing tasks according to the number of GPU devices and their computing capability;
performing symbolic assembly from the element stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
solving the linear system on the GPU devices, with data communication between the GPU devices in P2P mode, calculating sensitivities, summing by thread reduction to obtain the objective function value, and completing the structural design optimization according to the objective function value.
The beneficial effects of the application are as follows: using GPU devices greatly accelerates the computation; a speedup of more than one hundred times can be achieved on multiple GPU devices, the computation time of the main finite element analysis tasks is greatly reduced, and the problem size and computational efficiency of GPU parallel computing are improved. In addition, the entire topology optimization process runs on the GPU devices, which avoids the time overhead of data communication between the host and the devices; combined with asynchronous transfers, data communication between devices is hidden as far as possible, improving overall computational efficiency. Furthermore, the topology optimization solver of the application has a low computational cost, enables fast, high-precision topology optimization of structures with tens of millions of degrees of freedom on an ordinary personal computer, is simple to use, and is highly portable across different devices.
Drawings
FIG. 1 is a schematic flow chart diagram of a method for structural design optimization in one embodiment;
FIG. 2 is a flow diagram illustrating a solution to a linear system in one embodiment;
FIG. 3 is a schematic flow chart of a sensitivity calculation method in one embodiment;
FIG. 4 is a block diagram of the structural design optimization system in one embodiment;
FIG. 5 is a block diagram of the objective solving module in one embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Topology optimization is a structural optimization method: a mathematical approach that optimizes the distribution of material within a given region for given load cases, constraints and performance metrics. The idea of automobile lightweighting is to remove as much redundant material as possible, while guaranteeing the basic performance of the body structure, so as to reduce structural weight and improve material utilization; this coincides with the core idea of structural topology optimization. In other words, topology optimization aims to use as little material as possible while optimizing a given performance measure, by arranging the material sensibly within a given region under given constraints. Traditional automobile design usually relies on engineering experience or trial and error and consumes a great deal of time and effort; an effective structural optimization technique can greatly shorten the design cycle and accurately guide lightweight automobile design.
To overcome the computational problems described in the Background, parallel computing techniques may be used to handle large-scale computation. Parallelization can be carried out on conventional architectures such as multi-core CPUs (central processing units) or on novel many-core architectures. The best choice on current computers, however, is heterogeneous computing, in which the CPU works together with a newer architecture that heavily exploits the parallelism of the application. GPUs can provide higher peak performance, energy efficiency and cost efficiency than traditional CPUs. GPU computing avoids the high hardware cost and the complex operation and maintenance of traditional parallel computing, and therefore quickly drew the attention of researchers and engineers; after nearly ten years of development, relatively mature programming languages and programming architectures have formed. Using multiple GPUs further increases the problem size and the computational efficiency of GPU parallel computing. Adopting multi-GPU parallel computing is therefore of great significance for improving the computational efficiency of large-scale structural topology optimization.
In one embodiment, as shown in fig. 1, there is provided a structural design optimization method, including the steps of:
step 101, identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and elements of the model, recording the elements to which each node belongs, and storing the relationship between nodes and elements;
step 102, grouping the elements of the finite element calculation model by an element coloring method, and distributing computing tasks according to the number of GPU devices and their computing capability;
step 103, performing symbolic assembly from the element stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
and step 104, solving the linear system on the GPU devices, with data communication between the GPU devices in P2P mode, calculating sensitivities, summing by thread reduction to obtain the objective function value, and completing the structural design optimization according to the objective function value.
Specifically, in this embodiment, the finite element calculation model data is loaded into host memory. For the overall model, the host CPU identifies the nodes and elements in the model data, records the elements to which each node belongs, and stores the membership relationship between nodes and elements. The elements of the model are then grouped by an element coloring method: elements of the same color share no node, so they can be assembled in parallel without write conflicts. Suitable computing tasks are then assigned to each device according to the number of GPUs and their computing capability, so that waiting between devices is minimized. The efficient stiffness matrix assembly algorithm of this embodiment is divided into two parts, symbolic assembly and numerical assembly. First, symbolic assembly is performed from the element stiffness matrices and the mapping relation; because the non-zero pattern of the global stiffness matrix is the same throughout the topology optimization, this part is performed only in the first iteration. Numerical assembly is then performed according to the index array to obtain a sparse representation of the global stiffness matrix, which is stored in compressed sparse row (CSR) format.
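The split between symbolic and numerical assembly can be sketched on the CPU as follows. This is a minimal sketch under assumptions: the function names `symbolic_assembly` and `numeric_assembly`, the layout of the index array (one CSR position per local matrix entry) and the two-element bar example are hypothetical, since the application does not disclose these details.

```python
def symbolic_assembly(elem_dofs, n_dofs):
    """Run once: build the CSR pattern and, for every local entry (e, i, j),
    the position in the CSR data array where it must be accumulated."""
    cols = [set() for _ in range(n_dofs)]
    for dofs in elem_dofs:
        for r in dofs:
            cols[r].update(dofs)
    indptr, indices, position = [0], [], {}
    for r in range(n_dofs):
        for c in sorted(cols[r]):
            position[(r, c)] = len(indices)
            indices.append(c)
        indptr.append(len(indices))
    index_array = [[[position[(r, c)] for c in dofs] for r in dofs]
                   for dofs in elem_dofs]
    return indptr, indices, index_array

def numeric_assembly(elem_mats, index_array, nnz):
    """Run every iteration: scatter-add element matrices through the index array.
    On a GPU, elements of one color could be processed concurrently."""
    data = [0.0] * nnz
    for ke, idx in zip(elem_mats, index_array):
        for i, row in enumerate(idx):
            for j, pos in enumerate(row):
                data[pos] += ke[i][j]
    return data

# Hypothetical example: two 1D bar elements sharing DOF 1.
elem_dofs = [(0, 1), (1, 2)]
ke = [[1.0, -1.0], [-1.0, 1.0]]
indptr, indices, index_array = symbolic_assembly(elem_dofs, 3)
data = numeric_assembly([ke, ke], index_array, len(indices))
```

Only `numeric_assembly` needs to be repeated when the densities change, because the sparsity pattern and the index array stay fixed across iterations.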
Using GPU devices greatly accelerates the computation; a speedup of more than one hundred times can be achieved on multiple GPU devices, the computation time of the main finite element analysis tasks is greatly reduced, and the problem size and computational efficiency of GPU parallel computing are improved. In addition, the entire topology optimization process runs on the GPU devices, which avoids the time overhead of data communication between the host and the devices; combined with asynchronous transfers, data communication between devices is hidden as far as possible, improving overall computational efficiency. Furthermore, the topology optimization solver of the application has a low computational cost, enables fast, high-precision topology optimization of structures with tens of millions of degrees of freedom on an ordinary personal computer, is simple to use, and is highly portable across different devices.
In one embodiment, as shown in FIG. 2, the process of solving the linear system includes:
step 201, partitioning the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems that share the same columns but cover different rows;
step 202, solving the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and step 203, synchronizing and exchanging data between the GPU devices to obtain the complete displacement vector.
Specifically, the data of the global stiffness matrix is first partitioned: the full-rank linear system is divided into a plurality of subsystems that share the same columns but cover different rows, and one GPU device is responsible for the computation of each subsystem. The current system is then solved rapidly, according to the right-hand-side load vector, with an independently developed, efficient GPU-based iterative linear solver; during the solve, the devices communicate with each other in P2P mode. Finally, the devices synchronize and exchange data to obtain the complete displacement vector.
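How a row-partitioned system can be solved iteratively is sketched below. The application does not name its solver, so plain conjugate gradient is used here purely as an assumed stand-in; the per-block matrix-vector product plays the role of the per-device work, and concatenating the partial results stands in for the P2P exchange. The names `block_matvec` and `conjugate_gradient` and the 3 x 3 system are hypothetical.

```python
def block_matvec(blocks, x):
    """Each block holds some rows of A as lists of (column, value) pairs.
    One block corresponds to the rows owned by one device."""
    y = []
    for block in blocks:
        for row in block:
            y.append(sum(v * x[c] for c, v in row))
    return y  # concatenation stands in for the inter-device exchange

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def conjugate_gradient(blocks, f, tol=1e-10, max_iter=100):
    """Unpreconditioned CG for a symmetric positive definite system A u = f."""
    u = [0.0] * len(f)
    r = list(f)
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = block_matvec(blocks, p)
        alpha = rs / dot(p, ap)
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return u

# Hypothetical system: rows 0-1 on one device, row 2 on another.
blocks = [
    [[(0, 2.0), (1, -1.0)], [(0, -1.0), (1, 2.0), (2, -1.0)]],
    [[(1, -1.0), (2, 2.0)]],
]
u = conjugate_gradient(blocks, [1.0, 0.0, 1.0])
```

The only step that touches data owned by other devices is the matrix-vector product, so that is where the communication cost of such a scheme would concentrate.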
In one embodiment, as shown in FIG. 3, the sensitivity calculation method includes:
step 301, dividing the finite element calculation model into a plurality of sub-models according to the number of GPU devices and their computing capability, each device being responsible for the computing task of one sub-model;
step 302, performing the GPU-accelerated sensitivity calculation in an element-wise parallel mode, with one thread responsible for the computing task of one element;
and step 303, calculating the sensitivity value and the compliance objective value of each element, and finally summing by thread reduction to obtain the objective function value.
Specifically, in this embodiment, the sensitivity calculation first divides the model into a plurality of sub-models according to the number of GPUs and their computing capability, and each device is responsible for the computing task of one sub-model. Second, the GPU-accelerated sensitivity calculation uses an element-wise parallel mode, that is, one thread is responsible for the computing task of one element; the index array is stored in registers and the shared data is stored in constant memory. The sensitivity value and the compliance objective value of each element are then calculated, and the objective function value is finally obtained by summing through thread reduction.
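The reduction step can be pictured with a sequential emulation of a pairwise (tree) sum, the access pattern typically used for block-level reductions on GPUs. The function name `tree_reduce_sum` is hypothetical, and the inner loop merely simulates threads that would run concurrently on a device.

```python
def tree_reduce_sum(values):
    """Pairwise (tree) summation of per-element contributions."""
    buf = list(values)
    n = len(buf)
    while n > 1:
        half = (n + 1) // 2
        # Each tid below is independent and would be one thread on a GPU.
        for tid in range(n // 2):
            buf[tid] += buf[tid + half]
        n = half
    return buf[0] if buf else 0.0

total = tree_reduce_sum([1.0, 2.0, 3.0, 4.0, 5.0])
```

Each pass halves the number of active entries, so n contributions are combined in about log2(n) passes rather than n sequential additions.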
In one embodiment, the structural design optimization further includes sensitivity filtering and density updating. The sensitivity filtering stores the filtered data in a newly created array and is computed in parallel on the GPU devices, with one thread responsible for one element. The density updating uses hybrid CPU-GPU programming to update the element densities: the GPU devices compute the density update, with each thread computing the new density of one element, and the volume fraction is obtained by a reduction sum; the CPU evaluates the convergence of the density update and, in each iteration, adjusts the lower threshold and the upper threshold according to the optimality criteria update scheme.
Specifically, topology optimization sometimes suffers from numerical instabilities such as the checkerboard phenomenon, and filtering the sensitivities is currently the most common way to avoid it. The GPU-parallel sensitivity filtering also uses one thread per element; it only requires a new array to store the filtered data. The new array is needed because the element sensitivities are updated during the computation and the threads do not progress at the same pace: if the original array were updated in place, some threads could read values that other threads had already overwritten, producing erroneous results because the threads are not synchronized. In this embodiment, hybrid CPU-GPU programming is used to update the element densities: the density update is computed on the GPU, with each thread computing the new density of one element using the TFE method, and the volume fraction is then obtained by a reduction sum. The CPU evaluates the convergence of the density update and, in each iteration, adjusts the lower threshold and the upper threshold according to the optimality criteria update scheme.
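The reason for writing the filtered values to a separate array can be seen in a small sketch. The filter formula is not given in the application; this example assumes the common distance-weighted sensitivity filter with linearly decaying weights, and the name `filter_sensitivity`, the 1D element centers and the radius are hypothetical.

```python
def filter_sensitivity(x, dc, centers, r_min):
    """Return a NEW array; dc is only read, so the order in which the
    per-element 'threads' run cannot change the result."""
    dc_new = [0.0] * len(dc)
    for e in range(len(dc)):          # one thread per element on a GPU
        num = 0.0
        den = 0.0
        for f in range(len(dc)):
            # Assumed weight: linear decay with distance, zero beyond r_min.
            w = max(0.0, r_min - abs(centers[e] - centers[f]))
            num += w * x[f] * dc[f]
            den += w
        dc_new[e] = num / (x[e] * den)   # assumes x[e] > 0 (minimum density)
    return dc_new

x = [1.0, 1.0, 1.0]
dc = [-1.0, -4.0, -1.0]
dc_filtered = filter_sensitivity(x, dc, centers=[0.0, 1.0, 2.0], r_min=1.5)
```

If the loop wrote into `dc` directly, the value computed for element 1 would depend on whether element 0 had already been overwritten, which is exactly the ordering hazard described above.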
It should be understood that, although the steps in the flowcharts above are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; the sub-steps or stages are not necessarily performed sequentially and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a structural design optimization system, comprising:
a model reading module 401, configured to identify a finite element computation model loaded in a host through a CPU device of the host, read nodes and units in the finite element computation model, record a unit to which each node belongs, and store a relationship between the nodes and the units;
a unit coloring module 402, configured to group units in the finite element computation model according to a unit coloring method, and allocate computation tasks according to the number of GPU devices in combination with computation power;
a stiffness matrix assembling module 403, configured to perform symbol assembly on the unit stiffness matrix and the mapping relationship, and perform numerical assembly according to the index array to obtain sparse representation of the total stiffness matrix;
and the object solving module 404 is configured to solve the linear system through GPU devices, perform data communication between the GPU devices in a P2P manner, perform sensitivity calculation, sum up according to thread specifications to obtain an objective function value, and complete optimization of the structural design according to the objective function value.
In one embodiment, as shown in FIG. 5, the objective solving module 404 includes a linear system solving unit 4041, configured to:
partition the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems in different columns and rows;
solve the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and synchronize and exchange data among the GPU devices to obtain the complete displacement vector.
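As an illustrative sketch only, the partition, solve, and exchange steps can be mimicked on a single host. It assumes a contiguous row-block partition and substitutes plain Jacobi iteration for the GPU iterative solver, which the text does not name; the devices are simulated by a sequential loop, the matrix is stored densely for brevity, and all function names are hypothetical.

```python
def partition_rows(n_rows, n_devices):
    """Split row indices into contiguous blocks, one block per device."""
    base, extra = divmod(n_rows, n_devices)
    blocks, start = [], 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks

def jacobi_block(K, f, u, rows):
    """One Jacobi sweep restricted to the rows owned by one device."""
    return [(f[i] - sum(K[i][j] * u[j] for j in range(len(u)) if j != i)) / K[i][i]
            for i in rows]

def solve_partitioned(K, f, n_devices=2, tol=1e-10, max_iter=1000):
    u = [0.0] * len(f)
    blocks = partition_rows(len(f), n_devices)
    for _ in range(max_iter):
        # Each simulated device updates its own rows from the previous iterate.
        parts = [jacobi_block(K, f, u, rows) for rows in blocks]
        # Data exchange: concatenate the partial results into the full vector.
        u_new = [x for part in parts for x in part]
        if max(abs(a - b) for a, b in zip(u_new, u)) < tol:
            return u_new
        u = u_new
    return u

# Small diagonally dominant test system (hypothetical data).
K = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
f = [2.0, 4.0, 10.0]
u = solve_partitioned(K, f)
```

On real hardware the concatenation step would correspond to the synchronized exchange between GPU devices, after which every device again holds the complete displacement vector for the next iteration.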
In one embodiment, as shown in FIG. 5, the objective solving module 404 further includes a sensitivity calculation unit 4042, configured to:
divide the finite element calculation model into a plurality of sub-models according to the number and computing capability of the GPU devices, each device being responsible for the calculation task of one sub-model;
perform the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, with one thread responsible for the calculation task of one unit;
and calculate the sensitivity value and the compliance objective value of each unit, and finally sum them by thread reduction to obtain the objective function value.
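The per-unit calculation can be illustrated with a short sketch. It assumes a SIMP-type density interpolation with penalization exponent p, under which unit e contributes rho_e^p * u_e^T k0 u_e to the compliance and has sensitivity -p * rho_e^(p-1) * u_e^T k0 u_e; the text of this section does not state the interpolation scheme, so this choice, the 1D bar stiffness `K0`, and the names `unit_kernel` and `sensitivity_pass` are assumptions made for illustration.

```python
P = 3.0  # assumed SIMP penalization exponent (not given in the source)
K0 = [[1.0, -1.0], [-1.0, 1.0]]  # stiffness of a full-density 1D bar unit

def unit_kernel(density, u_local):
    """Work of one thread: compliance term and sensitivity of a single unit."""
    energy = sum(u_local[a] * K0[a][b] * u_local[b]
                 for a in range(2) for b in range(2))
    compliance = density ** P * energy
    sensitivity = -P * density ** (P - 1.0) * energy
    return compliance, sensitivity

def sensitivity_pass(densities, units, u):
    """Run the kernel once per unit, then sum the compliance terms."""
    results = [unit_kernel(rho, [u[n] for n in nodes])
               for rho, nodes in zip(densities, units)]
    sensitivities = [s for _, s in results]
    objective = sum(c for c, _ in results)  # stands in for the thread reduction
    return objective, sensitivities

# Hypothetical two-unit bar with nodal displacements u.
units = [(0, 1), (1, 2)]
densities = [1.0, 0.5]
u = [0.0, 1.0, 3.0]
objective, sens = sensitivity_pass(densities, units, u)
```

Because each unit reads only its own density and nodal displacements and writes only its own result, the kernel has no write conflicts and maps naturally onto one thread per unit.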
In one embodiment, the structural design optimization system further comprises a sensitivity filtering module and a density updating module;
the sensitivity filtering module is configured to store the filtered data in a newly created array and to perform the filtering in parallel on the GPU devices, with one thread responsible for one unit;
the density updating module is configured to update the unit densities by mixed CPU-GPU programming: the density update itself is computed on the GPU devices, where each thread computes the new density of one unit and the volume fraction is obtained by a reduction sum; the convergence check is performed on the CPU device, which in each iteration judges whether the density update has converged and adjusts the lower and upper thresholds according to the optimality criteria update scheme.
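The division of labor in the density update can be sketched as follows. The sketch assumes the widely used optimality criteria update with a move limit and a bisection on the Lagrange multiplier, with the lower and upper thresholds taken to be the bisection bounds; the move limit, minimum density, initial bounds, and function names are hypothetical values chosen for illustration, and the device and host parts are both run sequentially here.

```python
import math

def update_one_unit(x, dc, lam, move=0.2, x_min=1e-3):
    """Work of one thread: new density of a single unit for multiplier lam."""
    candidate = x * math.sqrt(-dc / lam)
    return max(x_min, max(x - move, min(1.0, min(x + move, candidate))))

def oc_update(x, dc, vol_frac, tol=1e-4):
    lower, upper = 0.0, 1e5  # bisection bounds on the Lagrange multiplier
    x_new = list(x)
    while (upper - lower) / (upper + lower) > tol:
        lam = 0.5 * (lower + upper)
        # Device part: one new density per unit, then a reduction sum.
        x_new = [update_one_unit(xi, dci, lam) for xi, dci in zip(x, dc)]
        volume = sum(x_new) / len(x_new)
        # Host part: compare with the target volume fraction, move one bound.
        if volume > vol_frac:
            lower = lam
        else:
            upper = lam
    return x_new

# Hypothetical densities and (negative) filtered sensitivities.
x = [0.5, 0.5, 0.5, 0.5]
dc = [-4.0, -2.0, -1.0, -0.5]
x_new = oc_update(x, dc, vol_frac=0.5)
```

Only a single scalar, the summed volume, has to travel from device to host in each bisection step, which is one plausible reason for leaving the convergence decision on the CPU.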
For the specific definition of the structural design optimization system, reference may be made to the above definition of the structural design optimization method, which is not repeated here. The modules of the structural design optimization system described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
FIG. 6 is a diagram illustrating the internal structure of a computer device in one embodiment. As shown in FIG. 6, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the structural design optimization method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the structural design optimization method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a key, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art will appreciate that the structure shown in FIG. 6 is a block diagram of only the part of the structure relevant to the present application and does not limit the computer device to which the present application is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
identifying, through a CPU device of a host, a finite element calculation model loaded in the host, reading the nodes and units in the finite element calculation model, recording the units to which each node belongs, and storing the relationship between the nodes and the units;
grouping the units in the finite element calculation model according to a unit coloring method, and allocating calculation tasks according to the number of GPU devices and their computing capability;
performing symbolic assembly on the unit stiffness matrices and the mapping relationship, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
solving the linear system through the GPU devices, with data communicated between the GPU devices in a P2P manner, performing the sensitivity calculation, summing by thread reduction to obtain an objective function value, and completing the structural design optimization according to the objective function value.
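The two-phase assembly can be illustrated with a small sketch. It assumes compressed sparse row (CSR) storage and one degree of freedom per node, neither of which is specified here; the index array is taken to map every entry of every unit stiffness matrix to its slot in the CSR value array, and the names `symbolic_assembly` and `numerical_assembly` are hypothetical.

```python
def symbolic_assembly(units, n_nodes):
    """Build the CSR pattern and the index array from connectivity alone."""
    cols = [set() for _ in range(n_nodes)]
    for nodes in units:
        for a in nodes:
            cols[a].update(nodes)
    row_ptr, col_idx = [0], []
    for row in range(n_nodes):
        col_idx.extend(sorted(cols[row]))
        row_ptr.append(len(col_idx))
    # Position of every (row, column) pair inside the CSR value array.
    slot = {(r, col_idx[k]): k
            for r in range(n_nodes)
            for k in range(row_ptr[r], row_ptr[r + 1])}
    index = [[[slot[(a, b)] for b in nodes] for a in nodes] for nodes in units]
    return row_ptr, col_idx, index

def numerical_assembly(unit_matrices, index, nnz):
    """Scatter-add unit stiffness entries into the CSR value array."""
    values = [0.0] * nnz
    for ke, idx in zip(unit_matrices, index):
        for a, row in enumerate(ke):
            for b, entry in enumerate(row):
                values[idx[a][b]] += entry
    return values

# Hypothetical two-unit 1D bar: nodes 0-1-2, identical unit stiffness.
units = [(0, 1), (1, 2)]
ke = [[1.0, -1.0], [-1.0, 1.0]]
row_ptr, col_idx, index = symbolic_assembly(units, n_nodes=3)
values = numerical_assembly([ke, ke], index, nnz=len(col_idx))
```

Since the mesh connectivity does not change between optimization iterations, the symbolic phase would only need to run once in such a scheme, while the numerical phase is repeated whenever the unit stiffness values change.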
In one embodiment, the processor, when executing the computer program, further performs the steps of:
partitioning the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems in the same column and different rows;
solving the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and synchronizing and exchanging data among the GPU devices to obtain the complete displacement vector.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
dividing the finite element calculation model into a plurality of sub-models according to the number and computing capability of the GPU devices, each device being responsible for the calculation task of one sub-model;
performing the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, with one thread responsible for the calculation task of one unit;
and calculating the sensitivity value and the compliance objective value of each unit, and finally summing them by thread reduction to obtain the objective function value.
The sensitivity filtering stores the filtered data in a newly created array and is performed in parallel on the GPU devices, with one thread responsible for one unit;
the density updating uses mixed CPU-GPU programming to update the unit densities: the density update itself is computed on the GPU devices, where each thread computes the new density of one unit and the volume fraction is obtained by a reduction sum; the convergence check is performed on the CPU device, which in each iteration judges whether the density update has converged and adjusts the lower and upper thresholds according to the optimality criteria update scheme.
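The role of the separate output array in the filtering step can be shown with a brief sketch. It assumes a distance-weighted sensitivity filter with linearly decaying weights inside a radius `r_min`, which is a common choice but is not specified in this section; unit centers are 1D for brevity, densities are assumed strictly positive, and all names and values are hypothetical.

```python
def filter_sensitivities(x, dc, centers, r_min):
    """Return a NEW array of filtered sensitivities; the inputs are not modified."""
    filtered = [0.0] * len(dc)      # separate output buffer
    for e in range(len(dc)):        # each iteration mirrors one thread / one unit
        weight_sum = 0.0
        acc = 0.0
        for f in range(len(dc)):
            # Linearly decaying weight, zero outside the filter radius.
            w = max(0.0, r_min - abs(centers[e] - centers[f]))
            weight_sum += w
            acc += w * x[f] * dc[f]
        filtered[e] = acc / (x[e] * weight_sum)
    return filtered

# Hypothetical three-unit row with unit spacing.
x = [1.0, 1.0, 1.0]
dc = [-3.0, 0.0, -3.0]
centers = [0.5, 1.5, 2.5]
filtered = filter_sensitivities(x, dc, centers, r_min=1.5)
```

Every thread reads the unfiltered values of its neighbours, so writing the results in place would let some threads read values that others had already overwritten; a separate output array avoids that hazard.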
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
identifying, through a CPU device of a host, a finite element calculation model loaded in the host, reading the nodes and units in the finite element calculation model, recording the units to which each node belongs, and storing the relationship between the nodes and the units;
grouping the units in the finite element calculation model according to a unit coloring method, and allocating calculation tasks according to the number of GPU devices and their computing capability;
performing symbolic assembly on the unit stiffness matrices and the mapping relationship, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
solving the linear system through the GPU devices, with data communicated between the GPU devices in a P2P manner, performing the sensitivity calculation, summing by thread reduction to obtain an objective function value, and completing the structural design optimization according to the objective function value.
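One way the allocation of calculation tasks by device count and computing capability could look is sketched below. It assumes each GPU device is described by a positive integer weight reflecting its relative capability and that units are split proportionally using the largest-remainder rule; the weights, the rule, and the function names are illustrative assumptions, not details taken from this application.

```python
def allocate_units(n_units, weights):
    """Split n_units among devices in proportion to integer capability weights."""
    total = sum(weights)
    counts = [n_units * w // total for w in weights]
    remainders = [n_units * w % total for w in weights]
    # Hand out the units lost to rounding, largest remainder first.
    leftovers = n_units - sum(counts)
    order = sorted(range(len(weights)), key=lambda d: remainders[d], reverse=True)
    for d in order[:leftovers]:
        counts[d] += 1
    return counts

def unit_ranges(counts):
    """Turn per-device counts into contiguous ranges of unit ids."""
    ranges, start = [], 0
    for c in counts:
        ranges.append(range(start, start + c))
        start += c
    return ranges

# Hypothetical setup: one device twice as capable as the other two.
counts = allocate_units(1000, [2, 1, 1])
ranges = unit_ranges(counts)
```

With equal weights the split degenerates to an even division, so the same routine covers both homogeneous and heterogeneous multi-GPU hosts.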
In one embodiment, the computer program, when executed by the processor, further performs the steps of:
partitioning the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems in different columns and rows;
solving the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and synchronizing and exchanging data among the GPU devices to obtain the complete displacement vector.
In one embodiment, the computer program, when executed by the processor, further performs the steps of:
dividing the finite element calculation model into a plurality of sub-models according to the number and computing capability of the GPU devices, each device being responsible for the calculation task of one sub-model;
performing the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, with one thread responsible for the calculation task of one unit;
and calculating the sensitivity value and the compliance objective value of each unit, and finally summing them by thread reduction to obtain the objective function value.
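The final summation across threads is commonly realized on a GPU as a pairwise tree reduction. A minimal sequential sketch of that pattern is given below, on the assumption that this is the kind of summation intended; the function name `tree_reduce_sum` is hypothetical, and the inner loop stands in for threads that would run concurrently with a synchronization between strides.

```python
def tree_reduce_sum(values):
    """Pairwise (tree) reduction, mirroring a shared-memory GPU reduction."""
    buf = list(values)
    n = len(buf)
    # Pad to a power of two so every step halves the active range cleanly.
    size = 1
    while size < n:
        size *= 2
    buf.extend([0.0] * (size - n))
    stride = size // 2
    while stride > 0:
        for tid in range(stride):  # each tid would be one thread on a GPU
            buf[tid] += buf[tid + stride]
        stride //= 2
    return buf[0]

# Hypothetical per-unit compliance contributions.
total = tree_reduce_sum([0.5, 1.0, 1.5])
```

The number of steps grows with the logarithm of the number of units rather than linearly, which is why this pattern is preferred over a serial accumulation when one thread holds one unit's value.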
The sensitivity filtering stores the filtered data in a newly created array and is performed in parallel on the GPU devices, with one thread responsible for one unit;
the density updating uses mixed CPU-GPU programming to update the unit densities: the density update itself is computed on the GPU devices, where each thread computes the new density of one unit and the volume fraction is obtained by a reduction sum; the convergence check is performed on the CPU device, which in each iteration judges whether the density update has converged and adjusts the lower and upper thresholds according to the optimality criteria update scheme.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application; their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A structural design optimization method, characterized by comprising the following steps:
identifying, through a CPU device of a host, a finite element calculation model loaded in the host, reading the nodes and units in the finite element calculation model, recording the units to which each node belongs, and storing the relationship between the nodes and the units;
grouping the units in the finite element calculation model according to a unit coloring method, and allocating calculation tasks according to the number of GPU devices and their computing capability;
performing symbolic assembly on the unit stiffness matrices and the mapping relationship, and performing numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
solving the linear system through the GPU devices, with data communicated between the GPU devices in a P2P manner, performing the sensitivity calculation, summing by thread reduction to obtain an objective function value, and completing the structural design optimization according to the objective function value.
2. The structural design optimization method of claim 1, wherein solving the linear system comprises:
partitioning the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems in different columns and rows;
solving the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and synchronizing and exchanging data among the GPU devices to obtain the complete displacement vector.
3. The structural design optimization method of claim 1, wherein the sensitivity calculation comprises:
dividing the finite element calculation model into a plurality of sub-models according to the number and computing capability of the GPU devices, each device being responsible for the calculation task of one sub-model;
performing the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, with one thread responsible for the calculation task of one unit;
and calculating the sensitivity value and the compliance objective value of each unit, and finally summing them by thread reduction to obtain the objective function value.
4. The structural design optimization method of claim 1, further comprising sensitivity filtering and density updating;
the sensitivity filtering stores the filtered data in a newly created array and is performed in parallel on the GPU devices, with one thread responsible for one unit;
the density updating uses mixed CPU-GPU programming to update the unit densities: the density update itself is computed on the GPU devices, where each thread computes the new density of one unit and the volume fraction is obtained by a reduction sum; the convergence check is performed on the CPU device, which in each iteration judges whether the density update has converged and adjusts the lower and upper thresholds according to the optimality criteria update scheme.
5. A structural design optimization system, comprising:
a model reading module, configured to identify, through a CPU device of a host, a finite element calculation model loaded in the host, read the nodes and units in the finite element calculation model, record the units to which each node belongs, and store the relationship between the nodes and the units;
a unit coloring module, configured to group the units in the finite element calculation model according to a unit coloring method, and allocate calculation tasks according to the number of GPU devices and their computing capability;
a stiffness matrix assembling module, configured to perform symbolic assembly on the unit stiffness matrices and the mapping relationship, and perform numerical assembly according to the index array to obtain a sparse representation of the global stiffness matrix;
and an objective solving module, configured to solve the linear system through the GPU devices, with data communicated between the GPU devices in a P2P manner, perform the sensitivity calculation, sum by thread reduction to obtain an objective function value, and complete the structural design optimization according to the objective function value.
6. The structural design optimization system of claim 5, wherein the objective solving module comprises a linear system solving unit configured to:
partition the data of the global stiffness matrix of the system, dividing the full-rank linear system into a plurality of subsystems in different columns and rows;
solve the current subsystem with an efficient GPU-based iterative linear solver according to the right-hand-side load vector;
and synchronize and exchange data among the GPU devices to obtain the complete displacement vector.
7. The structural design optimization system of claim 5, wherein the objective solving module further comprises a sensitivity calculation unit configured to:
divide the finite element calculation model into a plurality of sub-models according to the number and computing capability of the GPU devices, each device being responsible for the calculation task of one sub-model;
perform the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, with one thread responsible for the calculation task of one unit;
and calculate the sensitivity value and the compliance objective value of each unit, and finally sum them by thread reduction to obtain the objective function value.
8. The structural design optimization system of claim 5, further comprising a sensitivity filtering module and a density updating module;
the sensitivity filtering module is configured to store the filtered data in a newly created array and to perform the filtering in parallel on the GPU devices, with one thread responsible for one unit;
the density updating module is configured to update the unit densities by mixed CPU-GPU programming: the density update itself is computed on the GPU devices, where each thread computes the new density of one unit and the volume fraction is obtained by a reduction sum; the convergence check is performed on the CPU device, which in each iteration judges whether the density update has converged and adjusts the lower and upper thresholds according to the optimality criteria update scheme.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 4 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.
CN202110211739.7A 2021-02-25 2021-02-25 Structural design optimization method, system, computer equipment and storage medium Pending CN114969857A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110211739.7A CN114969857A (en) 2021-02-25 2021-02-25 Structural design optimization method, system, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114969857A true CN114969857A (en) 2022-08-30

Family

ID=82972659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110211739.7A Pending CN114969857A (en) 2021-02-25 2021-02-25 Structural design optimization method, system, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114969857A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115495312A (en) * 2022-09-27 2022-12-20 北京百度网讯科技有限公司 Service request processing method and device
CN115495312B (en) * 2022-09-27 2023-07-18 北京百度网讯科技有限公司 Service request processing method and device

Similar Documents

Publication Publication Date Title
CN110533183B (en) Task placement method for heterogeneous network perception in pipeline distributed deep learning
CN107273094B (en) Data structure suitable for HPCG optimization on ' Shenwei ' Taihu light ' and efficient implementation method thereof
Huang et al. A heterogeneous PIM hardware-software co-design for energy-efficient graph processing
CN110516316B (en) GPU acceleration method for solving Euler equation by interrupted Galerkin method
CN111639054B (en) Data coupling method, system and medium for ocean mode and data assimilation
Haghi et al. FP-AMG: FPGA-based acceleration framework for algebraic multigrid solvers
Solano-Quinde et al. Unstructured grid applications on GPU: performance analysis and improvement
Geng et al. A survey: Handling irregularities in neural network acceleration with fpgas
Mostafazadeh Davani et al. Unsteady Navier-Stokes computations on GPU architectures
Chen et al. Rubik: A hierarchical architecture for efficient graph learning
CN114969857A (en) Structural design optimization method, system, computer equipment and storage medium
Zhou et al. Gcnear: A hybrid architecture for efficient gcn training with near-memory processing
CN111597602A (en) High-rise building structure efficient analysis method based on AMGPCG algorithm
JP5790270B2 (en) Structural analysis system, structural analysis program, and structural analysis method
CN116167304B (en) Reservoir value based on Shenwei architecture simulation GMRES optimization method and system
CN109522127B (en) Fluid machinery simulation program heterogeneous acceleration method based on GPU
CN115906684A (en) Hydrodynamics multi-grid solver parallel optimization method for Shenwei architecture
CN116303219A (en) Grid file acquisition method and device and electronic equipment
CN113900808A (en) MPI parallel data structure based on arbitrary polyhedron unstructured grid
CN113010316B (en) Multi-target group intelligent algorithm parallel optimization method based on cloud computing
Kuźnik et al. Graph grammar-based multi-frontal parallel direct solver for two-dimensional isogeometric analysis
Qu et al. Cheetah: An accurate assessment mechanism and a high-throughput acceleration architecture oriented toward resource efficiency
CN113065035A (en) Single-machine out-of-core attribute graph calculation method
Chen et al. A latency-hiding algorithm for abms on parallel/distributed computing environment
Wang et al. An expansion planning approach for intelligent grids with speculative parallelism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination