CN114969857A

CN114969857A - Structural design optimization method, system, computer equipment and storage medium

Info

Publication number: CN114969857A
Application number: CN202110211739.7A
Authority: CN
Inventors: 王宇杰; 崔向阳; 蔡勇
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2021-02-25
Filing date: 2021-02-25
Publication date: 2022-08-30

Abstract

The application relates to a structural design optimization method, a structural design optimization system, computer equipment and a storage medium. The method comprises the following steps: identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and units in the finite element calculation model, recording the units to which each node belongs, and storing the relationship between the nodes and the units; grouping the units in the finite element calculation model according to a unit coloring method, and distributing calculation tasks according to the number of GPU devices in combination with their computing capacity; performing symbolic assembly from the unit stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the overall stiffness matrix; and solving the linear system through the GPU devices, calculating the sensitivities, summing by thread reduction to obtain an objective function value, and optimizing the structural design according to the objective function value. The method and the system can effectively improve the computational scale and the computational efficiency of GPU parallel computing.

Description

Structural design optimization method, system, computer equipment and storage medium

Technical Field

The present application relates to the field of engineering design, and in particular, to a structural design optimization method, system, computer device, and storage medium.

Background

Structural design optimization originated at the beginning of the twentieth century and attracted wide attention from researchers in the middle of that century. It combines numerical analysis with computer-aided analysis and is widely applied to structural design in practical engineering. Size optimization is mainly used in the detailed design stage and improves structural performance by changing attributes such as the length, width and cross-sectional area of the structure. Shape optimization is mainly used in the basic design stage and improves structural performance by changing the geometric boundary shape of the structure. Both kinds of optimization are mainly applied in the later stages of structural design; they are local optimization techniques used after the material layout has been determined, and cannot fundamentally improve the performance of the structure. Topology optimization, by contrast, is mainly applied in the conceptual design stage: under given loads and boundary conditions, it seeks the material distribution that optimizes a specified performance index of the structure subject to certain constraints. Compared with the other two methods, topology optimization has a well-developed theoretical basis and a wide design space; engineers can use it to propose novel conceptual designs that improve structural performance while reducing manufacturing cost.

Topology optimization theory has advanced significantly over the past few years, but its demand for computational resources remains a major problem. On the one hand, each iteration of topology optimization involves a finite element solution, which is the major computational bottleneck. On the other hand, Sigmund points out that a coarse finite element mesh leads to artificial length scale constraints, resulting in many truss-like structures in the final optimization result. The topology optimization process therefore typically requires a very fine mesh model to obtain a high-quality structure and to reduce post-processing effort in actual manufacturing. For three-dimensional (3D) problems in particular, a fine finite element mesh greatly increases the number of degrees of freedom and therefore the computation time.

Disclosure of Invention

In view of the above, it is necessary to provide a structural design optimization method, system, computer device and storage medium for solving the above technical problems.

In a first aspect, an embodiment of the present invention provides a structural design optimization method, where the method includes:

identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and units in the finite element calculation model, recording the units to which each node belongs, and storing the relationship between the nodes and the units;

grouping the units in the finite element calculation model according to a unit coloring method, and distributing calculation tasks according to the number of GPU devices in combination with their computing capacity;

performing symbolic assembly from the unit stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the overall stiffness matrix;

solving the linear system through the GPU devices, with data communication between the GPU devices carried out in P2P mode, calculating the sensitivities, summing by thread reduction to obtain an objective function value, and completing the structural design optimization according to the objective function value.

Further, the solving the linear system comprises:

carrying out data partitioning on the total stiffness matrix of the system, and dividing the full-rank linear system into a plurality of subsystems in the same column and different rows;

solving the current subsystem from the right-hand-side load vector by adopting an efficient GPU-based iterative linear solver;

and synchronizing and exchanging data between the GPU devices to obtain the complete displacement vector.

Further, the sensitivity calculation includes:

dividing the finite element calculation model into a plurality of sub models according to the number and the calculation capacity of GPU equipment, wherein each equipment is responsible for the calculation task of one sub model;

performing the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, in which one thread is responsible for the calculation task of one unit;

and calculating the sensitivity value and the compliance objective value of each unit, and finally summing by thread reduction to obtain the objective function value.

Furthermore, the structural design optimization method also comprises sensitivity filtering and density updating;

the sensitivity filtering stores the filtered data in a newly established array and is computed in parallel on the GPU devices, with one thread responsible for one unit;

the density updating adopts mixed CPU and GPU programming to update the unit density: the density updating task of the units is computed by the GPU devices, each thread computes the new density of one unit, and the volume fraction is obtained by a reduction sum; the convergence of the density updating process is evaluated and judged by the CPU, which in each iteration changes the lower threshold and the upper threshold according to the optimality criteria updating scheme.

On the other hand, an embodiment of the present invention further provides a structural design optimization system, including:

the model reading module is used for identifying the finite element calculation model loaded in the host through the CPU equipment of the host, reading the nodes and the units in the finite element calculation model, recording the unit to which each node belongs and storing the relationship between the nodes and the units;

the unit coloring module is used for grouping the units in the finite element calculation model according to a unit coloring method and distributing calculation tasks according to the number of GPU equipment and the calculation capacity;

the stiffness matrix assembling module is used for performing symbolic assembly from the unit stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the overall stiffness matrix;

and the object solving module is used for solving the linear system through the GPU devices, carrying out data communication among the GPU devices in P2P mode, calculating the sensitivities, summing by thread reduction to obtain an objective function value, and optimizing the structural design according to the objective function value.

Further, the object solving module includes a linear system solving unit configured to:

carrying out data partition on a system total stiffness matrix, and dividing a full-rank matrix linear system into a plurality of subsystems in the same column and different rows;

Further, the object solving module further includes a sensitivity calculating unit, and the sensitivity calculating unit is configured to:

Furthermore, the structural design optimization system also comprises a sensitivity filtering module and a density updating module;

the sensitivity filtering module is used for storing filtered data by establishing a new array and performing parallel computation by adopting a mode that one thread is responsible for one unit through GPU equipment;

the density updating module adopts mixed CPU and GPU programming to update the unit density: the density updating task of the units is computed by the GPU devices, each thread computes the new density of one unit, and the volume fraction is obtained by a reduction sum; the convergence of the density updating process is evaluated and judged by the CPU, which in each iteration changes the lower threshold and the upper threshold according to the optimality criteria updating scheme.

The embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and when the processor executes the computer program, the following steps are implemented:

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:

The beneficial effects of this application are as follows: using GPU devices greatly accelerates the calculation process, a speedup of more than one hundred times can be achieved on multiple GPU devices, the calculation time of the main finite element analysis tasks is greatly reduced, and the computational scale and efficiency of GPU parallel computing are improved. In addition, the whole topology optimization process is computed entirely on the GPU devices, which avoids the time overhead of data communication between the host and the devices; data communication between the devices is hidden as far as possible by combining it with asynchronous transmission, which improves the overall computational efficiency. Furthermore, the topology optimization solver of this application has a low computational cost, enables fast, high-precision topology optimization of structures with tens of millions of degrees of freedom on an ordinary personal computer, is simple to use, and is highly portable across different devices.

Drawings

FIG. 1 is a schematic flow chart of a structural design optimization method in one embodiment;

FIG. 2 is a flow diagram illustrating a solution to a linear system in one embodiment;

FIG. 3 is a schematic flow chart of a sensitivity calculation method in one embodiment;

FIG. 4 is a block diagram of the structure of the structural design optimization system in one embodiment;

FIG. 5 is a block diagram of an object solving module in one embodiment;

FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

Topology optimization is a structural optimization method: a mathematical method that optimizes the distribution of material within a given region according to given load cases, constraints and performance indices. The idea behind automobile lightweighting is to remove as much redundant material as possible, while guaranteeing the basic performance of the body structural parts, so as to reduce structural weight and improve material utilization, which coincides with the core idea of structural topology optimization. In other words, topology optimization aims to use as little material as possible while optimizing a given performance measure, by rationally arranging the material within a given region under certain constraints. Traditional automobile design usually relies on engineering experience or trial and error, which consumes a great deal of time and effort; an effective structural optimization technique can greatly shorten the design cycle and accurately guide the lightweight design of the automobile.

To overcome the computational difficulties described above, parallel computing techniques may be employed for large-scale calculations. Parallelization can be performed on conventional computer architectures, such as multi-core CPUs (central processing units), and on newer many-core architectures. At present, however, the best choice is heterogeneous computing, in which the CPU works together with a newer architecture, such as the GPU, that relies heavily on the parallelism of the application. GPUs can provide higher peak performance, energy efficiency and cost efficiency than traditional symmetric CPUs. GPU computing avoids the high hardware cost and the complex use and maintenance of traditional parallel computing methods; it has therefore quickly attracted the attention of researchers and engineers and, after nearly 10 years of development, has produced relatively mature programming languages and programming architectures. Using multiple GPUs can further improve the computational scale and efficiency of GPU parallel computing. Adopting multi-GPU parallel computing is therefore of great significance for improving the computational efficiency of large-scale structural topology optimization.

In one embodiment, as shown in fig. 1, there is provided a structural design optimization method, including the steps of:

step 101, identifying, through the CPU of a host, a finite element calculation model loaded in the host, reading the nodes and units in the finite element calculation model, recording the units to which each node belongs, and storing the relationship between the nodes and the units;

step 102, grouping the units in the finite element calculation model according to a unit coloring method, and distributing calculation tasks according to the number of GPU devices and their computing capacity;

step 103, performing symbolic assembly from the unit stiffness matrices and the mapping relation, and performing numerical assembly according to the index array to obtain a sparse representation of the overall stiffness matrix;

and step 104, solving the linear system through the GPU devices, with data communication between the GPU devices carried out in P2P mode, calculating the sensitivities, summing by thread reduction to obtain an objective function value, and completing the structural design optimization according to the objective function value.

Specifically, in this embodiment, the finite element calculation model data is loaded into the host memory. For the overall model, the host CPU identifies the nodes and units in the finite element calculation model data, records the units to which each node belongs, and stores the membership relationship between the nodes and the units. The units in the model are grouped by a unit coloring method; units of the same color share no common node, so they can be assembled in parallel without write conflicts. Appropriate calculation tasks are then distributed to each device according to the number of GPUs and their computing capacity, so that waiting among the devices is minimized. The efficient stiffness matrix assembly algorithm provided by this embodiment is divided into two parts, symbolic assembly and numerical assembly. First, symbolic assembly is performed from the unit stiffness matrices and the mapping relation; because the non-zero pattern of the overall stiffness matrix is the same throughout the topology optimization, this part is performed only in the first iteration. Numerical assembly is then performed according to the index array to obtain a sparse representation of the overall stiffness matrix, which is stored in compressed sparse row (CSR) format.
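The node-to-unit bookkeeping, the unit coloring and the capacity-based task distribution described above can be illustrated with a minimal serial Python sketch. It is not the implementation of this application: the greedy coloring rule, the proportional splitting rule and the names `node_to_units`, `color_units` and `split_by_capacity` are assumptions chosen only for illustration.

```python
from collections import defaultdict


def node_to_units(units):
    """Record, for every node, the list of units that the node belongs to."""
    owner = defaultdict(list)
    for u, nodes in enumerate(units):
        for n in nodes:
            owner[n].append(u)
    return owner


def color_units(units):
    """Greedy coloring (an assumed rule): units sharing a node get different colors."""
    owner = node_to_units(units)
    colors = [-1] * len(units)
    for u, nodes in enumerate(units):
        # Colors already taken by units that share at least one node with unit u.
        used = {colors[v] for n in nodes for v in owner[n] if colors[v] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[u] = c
    return colors


def split_by_capacity(n_items, capacities):
    """Give each device a contiguous range sized in proportion to its capacity."""
    total = sum(capacities)
    bounds, start, acc = [], 0, 0
    for i, cap in enumerate(capacities):
        acc += cap
        last = i == len(capacities) - 1
        end = n_items if last else n_items * acc // total
        bounds.append((start, end))
        start = end
    return bounds


# A strip of four quadrilateral units on a 5 x 2 node lattice (hypothetical data).
units = [(0, 1, 6, 5), (1, 2, 7, 6), (2, 3, 8, 7), (3, 4, 9, 8)]
colors = color_units(units)

groups = defaultdict(list)
for u, c in enumerate(colors):
    groups[c].append(u)

print(colors)        # color of each unit
print(dict(groups))  # units that may be assembled concurrently
print(split_by_capacity(len(units), [1, 3]))  # unit ranges for two devices
```

Units in the same color group touch disjoint sets of nodes, so one thread per unit can add its contributions to the overall matrix without atomic operations; the groups are processed one after another.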

In this embodiment, using GPU devices greatly accelerates the calculation process, a speedup of more than one hundred times can be achieved on multiple GPU devices, the calculation time of the main finite element analysis tasks is greatly reduced, and the computational scale and efficiency of GPU parallel computing are improved. In addition, the whole topology optimization process is computed entirely on the GPU devices, which avoids the time overhead of data communication between the host and the devices; data communication between the devices is hidden as far as possible by combining it with asynchronous transmission, which improves the overall computational efficiency. Furthermore, the topology optimization solver of this application has a low computational cost, enables fast, high-precision topology optimization of structures with tens of millions of degrees of freedom on an ordinary personal computer, is simple to use, and is highly portable across different devices.
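The two-part stiffness matrix assembly of this embodiment (symbolic assembly once, numerical assembly in every iteration) can be sketched as follows. The sketch is a serial illustration under stated assumptions: one degree of freedom per node, two-node bar units, and the hypothetical helper names `symbolic_assembly`, `numeric_assembly` and `bar_matrix`; the data layout actually used by this application is not specified beyond the CSR format and the index array.

```python
def symbolic_assembly(units, n_dofs):
    """Build the CSR non-zero pattern once, plus an index array into its value slots."""
    cols = [set() for _ in range(n_dofs)]
    for dofs in units:
        for i in dofs:
            cols[i].update(dofs)
    row_ptr, col_idx = [0], []
    for row in cols:
        col_idx.extend(sorted(row))
        row_ptr.append(len(col_idx))
    # Position of every (row, column) pair inside the CSR value array.
    slot = {}
    for r in range(n_dofs):
        for p in range(row_ptr[r], row_ptr[r + 1]):
            slot[(r, col_idx[p])] = p
    # index[u] lists the CSR slots of unit u's matrix entries in row-major order.
    index = [[slot[(i, j)] for i in dofs for j in dofs] for dofs in units]
    return row_ptr, col_idx, index


def numeric_assembly(index, unit_matrices, n_values):
    """Scatter-add the unit stiffness matrices into the CSR value array."""
    values = [0.0] * n_values
    for slots, ke in zip(index, unit_matrices):
        flat = [v for row in ke for v in row]
        for p, v in zip(slots, flat):
            values[p] += v
    return values


def bar_matrix(k):
    """Stiffness matrix of a two-node bar unit with stiffness k (assumed unit type)."""
    return [[k, -k], [-k, k]]


# Chain of three bar units over four degrees of freedom (hypothetical data).
units = [(0, 1), (1, 2), (2, 3)]
row_ptr, col_idx, index = symbolic_assembly(units, 4)

# The pattern and the index array are reused; only the values change per iteration.
values_a = numeric_assembly(index, [bar_matrix(1.0)] * 3, len(col_idx))
values_b = numeric_assembly(index, [bar_matrix(k) for k in (1.0, 2.0, 3.0)], len(col_idx))
print(row_ptr, col_idx)
print(values_a)
print(values_b)
```

Because the index array already says where every unit entry lands, the numerical phase needs no searching in the sparse structure, which is why repeating only that phase in later iterations is cheap.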

In one embodiment, as shown in FIG. 2, the process of solving a linear system includes:

step 201, performing data partitioning on the total stiffness matrix of the system, and dividing the full-rank linear system into a plurality of subsystems in the same column and different rows;

step 202, solving the current subsystem from the right-hand-side load vector by adopting an efficient GPU-based iterative linear solver;

and step 203, synchronizing and exchanging data between the GPU devices to obtain the complete displacement vector.

Specifically, data partitioning is first performed on the total stiffness matrix of the system: the full-rank linear system is divided into a plurality of subsystems in the same column and different rows, and each subsystem is handled by one GPU device. The current system is then solved rapidly from the right-hand-side load vector by an independently developed, efficient GPU-based iterative linear solver, during which the devices communicate with one another in P2P mode. Finally, the devices synchronize and exchange data to obtain the complete displacement vector.
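The row-wise partitioning of the linear system can be illustrated with the following serial Python sketch. The solver of this application is described only as an efficient GPU iterative linear solver; the conjugate gradient method used here is an assumption, as are the names `split_rows`, `local_matvec`, `distributed_matvec` and `conjugate_gradient`. Each row block stands in for one GPU device, and concatenating the partial results stands in for the synchronized P2P exchange.

```python
def split_rows(n_rows, n_devices):
    """Split the rows into contiguous blocks, one block per device."""
    step = -(-n_rows // n_devices)  # ceiling division
    return [(s, min(s + step, n_rows)) for s in range(0, n_rows, step)]


def local_matvec(row_ptr, col_idx, values, x, rows):
    """Product of one row block (all columns) of a CSR matrix with the full vector x."""
    start, end = rows
    return [
        sum(values[p] * x[col_idx[p]] for p in range(row_ptr[r], row_ptr[r + 1]))
        for r in range(start, end)
    ]


def distributed_matvec(csr, x, blocks):
    """Each block plays the role of one device; extending y mimics the data exchange."""
    y = []
    for rows in blocks:
        y.extend(local_matvec(*csr, x, rows))
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(csr, b, blocks, tol=1e-10, max_iter=1000):
    """Plain conjugate gradient (assumed solver) for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = distributed_matvec(csr, p, blocks)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Three-bar chain fixed at one end, unit stiffness, unit load at the free end.
csr = ([0, 2, 5, 7], [0, 1, 0, 1, 2, 1, 2], [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 1.0])
load = [0.0, 0.0, 1.0]
blocks = split_rows(3, 2)  # two simulated devices
displacement = conjugate_gradient(csr, load, blocks)
print(blocks, displacement)
```

With a row-block layout every device needs the whole current vector for its part of the matrix-vector product but produces only its own rows of the result, so the only data exchanged per iteration are vector segments and a few scalars.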

In one embodiment, as shown in fig. 3, the sensitivity calculation method includes:

step 301, dividing the finite element calculation model into a plurality of sub-models according to the number of GPU devices and the calculation capacity, wherein each device is responsible for the calculation task of one sub-model;

step 302, performing the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, in which one thread is responsible for the calculation task of one unit;

and step 303, calculating the sensitivity value and the compliance objective value of each unit, and finally summing by thread reduction to obtain the objective function value.

Specifically, in this embodiment, the model is first divided into a plurality of sub-models according to the number of GPUs and their computing capacity, and each device is responsible for the calculation task of one sub-model. Second, the GPU-accelerated sensitivity calculation is parallelized by unit, that is, one thread is responsible for the calculation task of one unit; the index array is stored in registers and the shared data is stored in constant memory. The sensitivity value and the compliance objective value of each unit are then calculated, and the objective function value is finally obtained by summation through thread reduction.
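The per-unit sensitivity calculation followed by a reduction can be sketched as below. The text does not state the material interpolation; the sketch assumes the widely used SIMP form, in which the compliance of a unit is its density raised to a penalty power times the strain energy term of the unit. The names `unit_kernel` and `tree_reduce` and the penalty value 3 are illustrative assumptions, and the loop over units stands in for one GPU thread per unit.

```python
def unit_kernel(e, units, density, u, k0, penal):
    """Work of one thread: compliance and sensitivity of unit e (SIMP form assumed)."""
    ue = [u[d] for d in units[e]]
    n = len(ue)
    energy = sum(ue[i] * k0[i][j] * ue[j] for i in range(n) for j in range(n))
    compliance = density[e] ** penal * energy
    sensitivity = -penal * density[e] ** (penal - 1) * energy
    return compliance, sensitivity


def tree_reduce(values):
    """Pairwise summation, mirroring the halving pattern of a thread reduction."""
    data = list(values)
    while len(data) > 1:
        if len(data) % 2:
            data.append(0.0)  # pad so the array can be halved
        half = len(data) // 2
        data = [data[i] + data[i + half] for i in range(half)]
    return data[0] if data else 0.0


# Three bar units with a given displacement vector (hypothetical data).
units = [(0, 1), (1, 2), (2, 3)]
k0 = [[1, -1], [-1, 1]]          # unit stiffness matrix at full density
density = [1.0, 0.5, 1.0]
u = [0.0, 1.0, 2.0, 3.0]

results = [unit_kernel(e, units, density, u, k0, penal=3) for e in range(len(units))]
sensitivities = [s for _, s in results]
objective = tree_reduce(c for c, _ in results)
print(sensitivities, objective)
```

Each unit reads only its own density and displacements and writes only its own outputs, so no synchronization is needed until the final reduction that produces the objective function value.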

In one embodiment, the structural design optimization further includes sensitivity filtering and density updating. The sensitivity filtering stores the filtered data in a newly established array and is computed in parallel on the GPU devices, with one thread responsible for one unit. The density updating adopts mixed CPU and GPU programming to update the unit density: the density updating task of the units is computed by the GPU devices, each thread computes the new density of one unit, and the volume fraction is obtained by a reduction sum. The convergence of the density updating process is evaluated and judged by the CPU, which in each iteration changes the lower threshold and the upper threshold according to the optimality criteria updating scheme.

Specifically, topology optimization sometimes suffers from numerical instabilities such as the checkerboard phenomenon, and sensitivity filtering is currently the most commonly used way to avoid it. The GPU parallel computation of sensitivity filtering also assigns one thread to one unit, and it suffices to establish a new array to store the filtered data: the unit sensitivities are updated during the computation and the threads do not progress at the same rate, so updating the original array directly could produce erroneous results because the threads are not synchronized. In this embodiment, the unit density is updated by mixed CPU and GPU programming. The density updating task of the units is computed by the GPU, where each thread computes the new density of one unit using a TFE method, and the volume fraction is then obtained by a reduction sum. The CPU evaluates and judges the convergence of the density updating process; in each iteration it changes the lower threshold and the upper threshold according to the optimality criteria updating scheme.
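The filtering into a new array and the threshold-based density update can be illustrated with the serial sketch below. It does not reproduce the TFE method mentioned above, which is not detailed in the text. Instead it assumes the classical distance-weighted sensitivity filter and the classical optimality criteria update with bisection on a multiplier, whose lower and upper bounds play the role of the two thresholds; the names `filter_sensitivity` and `oc_update`, the one-dimensional unit layout and all parameter values are illustrative assumptions.

```python
def filter_sensitivity(density, sens, rmin):
    """Distance-weighted filter (assumed form) writing into a new array."""
    n = len(sens)
    filtered = [0.0] * n  # new array; `sens` is only read, never overwritten
    for e in range(n):    # stands in for one GPU thread per unit
        weight_sum, acc = 0.0, 0.0
        for f in range(n):
            w = max(0.0, rmin - abs(e - f))  # units lie on a line, one unit apart
            weight_sum += w
            acc += w * density[f] * sens[f]
        filtered[e] = acc / (density[e] * weight_sum)
    return filtered


def oc_update(density, sens, vol_frac, move=0.2, tol=1e-8):
    """Optimality criteria update (assumed form) with bisection on the multiplier."""
    lower, upper = 0.0, 1e9  # lower and upper thresholds of the bisection
    new = list(density)
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        # Device side: each thread would compute the new density of one unit.
        new = [
            max(0.001, max(x - move, min(1.0, min(x + move, x * max(0.0, -dc / mid) ** 0.5))))
            for x, dc in zip(density, sens)
        ]
        # Host side: compare the reduced volume fraction and move one threshold.
        if sum(new) / len(new) > vol_frac:
            lower = mid
        else:
            upper = mid
    return new


density = [0.5, 0.5, 0.5, 0.5]
raw = [-4.0, -3.0, -2.0, -1.0]
smooth = filter_sensitivity(density, raw, rmin=1.5)
updated = oc_update(density, raw, vol_frac=0.5)
print(smooth)
print(updated, sum(updated) / len(updated))
```

Writing the filtered values to a separate array means every unit reads the unmodified sensitivities of its neighbours regardless of the order in which the threads run, which is the reason given above for not updating the original array in place.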

It should be understood that, although the steps in the above-described flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in the above-described flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 4, there is provided a structural design optimization system, comprising:

a model reading module 401, configured to identify a finite element computation model loaded in a host through a CPU device of the host, read nodes and units in the finite element computation model, record a unit to which each node belongs, and store a relationship between the nodes and the units;

a unit coloring module 402, configured to group units in the finite element computation model according to a unit coloring method, and allocate computation tasks according to the number of GPU devices in combination with computation power;

a stiffness matrix assembling module 403, configured to perform symbolic assembly from the unit stiffness matrices and the mapping relation, and perform numerical assembly according to the index array to obtain a sparse representation of the overall stiffness matrix;

and an object solving module 404, configured to solve the linear system through the GPU devices, perform data communication between the GPU devices in P2P mode, calculate the sensitivities, sum by thread reduction to obtain an objective function value, and complete the structural design optimization according to the objective function value.

In one embodiment, as shown in FIG. 5, the object solving module 404 includes a linear system solving unit 4041 configured to:

In one embodiment, as shown in fig. 5, the object solving module 404 further includes a sensitivity calculating unit 4042, where the sensitivity calculating unit 4042 is configured to:

In one embodiment, the structural design optimization system further comprises a sensitivity filtering module and a density updating module;

the density updating module adopts mixed CPU and GPU programming to update the unit density: the density updating task of the units is computed by the GPU devices, each thread computes the new density of one unit, and the volume fraction is obtained by a reduction sum; the convergence of the density updating process is evaluated and judged by the CPU, which in each iteration changes the lower threshold and the upper threshold according to the optimality criteria updating scheme.

For the specific definition of the structural design optimization system, reference may be made to the above definition of the structural design optimization method, which is not described herein again. The various modules in the structural design optimization system described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

FIG. 6 is a diagram illustrating the internal structure of a computer device in one embodiment. As shown in FIG. 6, the computer device includes a processor, a memory, a network interface, an input device and a display screen connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the structural design optimization method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the structural design optimization method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a trackball or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.

It will be appreciated by those skilled in the art that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:

In one embodiment, the processor, when executing the computer program, further performs the steps of:

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above.

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is specific and detailed, but not to be understood as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A structural design optimization method is characterized by comprising the following steps:

2. The structural design optimization method of claim 1, wherein solving a linear system comprises:

3. The structural design optimization method of claim 1, wherein the sensitivity calculation comprises:

4. The structural design optimization method of claim 1, further comprising sensitivity filtering and density updating;

5. A structural design optimization system, comprising:

6. The structural design optimization system of claim 5, wherein the object solving module comprises a linear system solving unit configured to:

7. The structural design optimization system of claim 5, wherein the object solving module further comprises a sensitivity calculating unit configured to:

performing the GPU-accelerated sensitivity calculation in a unit-wise parallel mode, in which one thread is responsible for the calculation task of one unit;

8. The structural design optimization system of claim 5, further comprising a sensitivity filtering module and a density updating module;

the sensitivity filtering module is used for establishing a new array to store the filtered data and performing parallel computation by adopting a mode that one thread is responsible for one unit through GPU equipment;

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 4 are implemented when the computer program is executed by the processor.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.