WO2024031525A1 - Method and apparatus for bi-level physics-informed neural networks for pde constrained optimization - Google Patents
- Publication number
- WO2024031525A1 (PCT/CN2022/111730)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/13—Differential equations
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- aspects of the present disclosure relate generally to artificial intelligence, and more particularly, to a method and an apparatus provided for bi-level physics-informed neural networks for Partial Differential Equation constrained optimization.
- Partial Differential Equation (PDE) constrained optimization aims at optimizing the performance of a physical system constrained by PDEs with desired properties. It is an important task in many areas of science and engineering, with a wide range of applications including shape optimization problem such as the design on shapes of aircraft wings in aerodynamics, parameters optimization of channels in heat transfer, parameters optimization of flow control problem, etc.
- FEM solvers are usually expensive for high-dimensional problems with a large search space or mesh size, because the computational cost of FEMs grows quadratically to cubically w.r.t. the mesh size.
- DeepONet learns a mapping from control (decision) variables to solutions of PDEs and further replaces the PDE constraints with the operator network. But these methods require pretraining a large operator network, which is non-trivial and inefficient. Moreover, performance might deteriorate if the optimal solution is out of the training distribution.
- Physics-informed Neural Networks (PINNs) solve learning tasks while respecting the properties of physical laws. This is achieved by informing a loss function about the mathematical equations that govern the physical system.
- the general procedure for solving a differential equation with PINNs involves finding the parameters of a network that minimize a loss function involving the mismatch between output and data, as well as residuals of the boundary and initial conditions, the PDE equations, and any other physical constraints required.
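The loss-based procedure in the preceding bullet can be sketched on a toy 1-D Poisson problem. The sine basis, collocation grid, and learning rate below are illustrative assumptions, not part of the disclosure; a full PINN would use a neural network and automatic differentiation instead of an analytic trial basis.

```python
import numpy as np

# Toy PINN-style training: solve -u''(x) = pi^2 sin(pi x) on [0, 1] with
# u(0) = u(1) = 0 (exact solution: sin(pi x)). The trial solution
# u_w(x) = sum_j w_j sin(j pi x) satisfies the boundary conditions by
# construction, so the loss is just the mean squared PDE residual.
m = 3                                  # number of basis functions / weights
x = np.linspace(0.0, 1.0, 101)[1:-1]   # interior collocation points
d2phi = np.stack([-(j * np.pi) ** 2 * np.sin(j * np.pi * x)
                  for j in range(1, m + 1)])      # second derivatives of basis
f = np.pi ** 2 * np.sin(np.pi * x)     # source term

def pde_loss(w):
    residual = -(w @ d2phi) - f        # -u_w''(x) - f(x) at collocation points
    return np.mean(residual ** 2)

w = np.zeros(m)
lr = 2e-4                              # stable for the stiffest (j = 3) mode
for _ in range(3000):                  # plain gradient descent on the PDE loss
    residual = -(w @ d2phi) - f
    grad = 2.0 * (-d2phi) @ residual / x.size
    w -= lr * grad
# After training, w[0] approaches 1 and the other weights stay near 0,
# recovering the exact solution sin(pi x).
```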
- Bi-level Physics-informed Neural networks with Broyden’s hypergradients (BPN) are disclosed for solving PDECO problems.
- the optimization of the targets and the PDE constraints is decoupled, thereby naturally addressing the challenge of loss balancing in regularization-based methods.
- an iterative method is disclosed that optimizes PINNs with PDE constraints in the inner loop while optimizing the control variables for the objective function in the outer loop using hypergradients.
- Computing hypergradients in bi-level optimization for control variables is challenging if the inner loop optimization is complicated. Therefore, a method for calculating the hypergradients based on implicit differentiation using Broyden’s method is also disclosed.
- a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target is decoupled.
- the method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target, respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in the last iteration are used for updating the control variables in the next iteration.
- the method comprises training the initialized PINNs under the initialized control variables for a certain number of epochs as a warm-up.
- updating the weights of the PINNs further comprises updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration, for a certain number of epochs, under the condition that the updated control variables are fixed.
- the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
- the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product, wherein J denotes the objective function, ℰ denotes the PDE losses, w denotes the weights of the PINNs, and w* denotes the updated weights of the PINNs in the last iteration.
- the inverse vector-Hessian product is computed by finding a root z* of a linear equation, which further comprises iteratively approximating the root z* using a low-rank Broyden’s method.
- iteratively approximating the root z* using the low-rank Broyden’s method further comprises setting a maximum rank of B_i based on a memory limit, wherein the maximum rank is less than a maximum number of iterations of the low-rank Broyden’s method.
- iteratively approximating the root z* using the low-rank Broyden’s method further comprises setting a maximum number of iterations of the low-rank Broyden’s method, and running the iterations until the maximum number of iterations is reached or the error is within a threshold.
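The equations referenced in the bullets above were rendered as images in the source. Assuming the standard implicit-function-theorem form (with J the objective, ℰ the PDE losses, w the PINN weights, and θ the control variables, as defined in the surrounding text), the hypergradient and the linear system defining z* can be sketched as:

```latex
% Hedged reconstruction of the elided equations: the standard IFT
% hypergradient, not the patent's exact figures.
\frac{\mathrm{d}J}{\mathrm{d}\theta}
  = \frac{\partial J}{\partial \theta}
  - \underbrace{\frac{\partial J}{\partial w^*}
      \left(\frac{\partial^2 \mathcal{E}}{\partial w\,\partial w^{\top}}\right)^{-1}}_{
      (z^*)^{\top}\ \text{(inverse vector-Hessian product)}}
    \;\frac{\partial^2 \mathcal{E}}{\partial w\,\partial \theta^{\top}}
    \Bigg|_{w=w^*},
\qquad
(z^*)^{\top}\,\frac{\partial^2 \mathcal{E}}{\partial w\,\partial w^{\top}}
  = \frac{\partial J}{\partial w^*}.
```

The second relation is the linear equation whose root z* the low-rank Broyden iteration approximates, avoiding any explicit Hessian inverse.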
- Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.
- Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
- Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
- Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure.
- Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure.
- Partial Differential Equation (PDE) constrained optimization aims at optimizing the performance of a physical system constrained by PDEs with desired properties.
- the objective or target of PDECO would typically be parameterized by a set of control variables, and the state variables of the physical system would correspond to the solutions of PDE constraints.
- Some existing methods, such as hPINN, treat PDE constraints as regularization terms and optimize the control variables and state variables simultaneously.
- Such methods use the penalty method and the Lagrangian method to adjust the weights of multipliers.
- Some of them adopt the same formulation but use a line search to find the largest weight for which the PDE error remains within a certain range.
- the key limitation of these approaches is that the weights of multipliers are determined heuristically which might be sub-optimal and unstable.
- Another class of methods train an operator network from control variables to solutions of PDEs or objective functions.
- Several approaches use mesh-based methods and predict states on all mesh points from control variables at the same time.
- PI-DeepONet adopts the architecture of DeepONet and trains the network using physics-informed losses, which are also called PDE losses.
- let Y, U, V be three Banach spaces.
- the solution fields of PDEs are called state variables, i.e., y, and the functions or variables that can be controlled are called control variables, i.e., u, where Y ad and U ad are the admissible spaces.
- the PDECO can be formulated as:
- the state variables denote the state of the physical system, which may typically be flow velocity and pressure, etc.
- the control variables may be parameters which can parameterize the flow distribution.
- the state variables may also be flow velocity and pressure, but the control variables are parameters of the structures to be optimized.
- solutions y and control variables u are respectively parameterized by y_w and u_θ, with w being the weights of the PINNs; λ_i are hyper-parameters balancing these terms of the optimization targets.
- the λ_i are hard to set, and the results are sensitive to them, due to the complex nature of the regularization terms (PDE constraints).
- a large λ_i makes it difficult to optimize the objective, while a small λ_i can result in a nonphysical solution y_w.
- the optimal λ may also vary with the different phases of training.
- Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.
- an initial set of weights of PINNs w and control variables θ are input to block 101.
- an objective function J, which denotes the target to be optimized, can be calculated, with its governing PDE constraints and the boundary/initial conditions of the physical system.
- the solutions of PDEs can be represented by state variables y, and the control variables of the objective function can be represented by u, which are respectively parameterized by w and θ.
- the objective function and the regularization terms corresponding to the PDE constraints are combined with coefficients λ_i as in Eq. (3). Then the control variables and state variables are optimized simultaneously by minimizing the combined objective function, and the optimized w and θ are output.
- Fig. 1 is merely shown as an example, and other examples would be possible.
- Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
- An objective function J, which denotes the target to be optimized, is given, with its governing PDE constraints and the boundary/initial conditions of the physical system.
- the solutions of PDEs can be represented by state variables y, and the control variables of the objective function can be represented by u, which are respectively parameterized by w and θ.
- an initial set of weights of PINNs w and control variables θ are input to block 201.
- θ is optimized in the upper level by minimizing the objective with respect to θ given a fixed value of w in one iteration. Then the optimized θ is transferred to block 202 for optimizing w.
- the PINNs are trained in the lower level using only the PDE losses ℰ, as in Eq. (5).
- the optimized w is transferred back to block 201 for optimizing θ in the next iteration.
- the optimized w and θ are output.
- Fig. 2 is merely shown as an example, and other examples would be possible.
- Bi-level optimization is widely used in various machine learning tasks, e.g., neural architecture search, meta learning and hyperparameters optimization.
- One of the key challenges is to compute hypergradients with respect to the inner loop optimization.
- Some previous methods use unrolled optimization or truncated unrolled optimization, which differentiates through the optimization process. However, this is not scalable if the inner loop optimization is a complicated process.
- Some other methods compute the hypergradient based on the implicit function theorem. This requires computing an inverse Hessian-vector product (inverse-HVP). Some previous approaches propose using a Neumann series to approximate hypergradients; other works use the conjugate gradient method. The quality of the approximation for implicit differentiation is crucial for the accuracy of the hypergradient computation.
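The Neumann-series alternative mentioned above can be sketched as follows. This is a hedged NumPy illustration, not the patent's method: the small matrix H stands in for the Hessian, and the series H⁻¹v = Σₖ (I − H)ᵏ v converges only when H's eigenvalues lie in (0, 2) (prior work typically rescales the Hessian to ensure this).

```python
import numpy as np

# Neumann-series approximation of an inverse-Hessian-vector product.
# Only Hessian-vector products are needed; no explicit inverse is formed.
def neumann_inverse_hvp(hvp, v, n_terms=60):
    """Approximate H^{-1} v given a Hessian-vector-product function hvp,
    assuming the eigenvalues of H lie in (0, 2)."""
    term = v.copy()
    total = v.copy()
    for _ in range(n_terms):
        term = term - hvp(term)    # term <- (I - H) term
        total += term              # accumulate the series
    return total

# Small illustrative stand-in for the Hessian (eigenvalues near 1).
H = np.array([[1.0, 0.2],
              [0.1, 0.9]])
v = np.array([1.0, -1.0])
approx = neumann_inverse_hvp(lambda x: H @ x, v)   # approx ~= H^{-1} v
```

For this H the spectral radius of I − H is 0.2, so 60 terms are far more than enough; for an ill-conditioned Hessian the series converges slowly, which motivates the Broyden approach disclosed here.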
- the gradient of the objective function J with respect to the parameters of the control variables θ is calculated, which is called the hypergradient.
- the hypergradient is computed based on Implicit Function Theorem (IFT) differentiation, along which a highly complex inverse Hessian-Jacobian product needs to be calculated.
- Broyden’s method, which provides an efficient approximation at a superlinear convergence speed, is proposed to be used, and is discussed in detail below.
- the upper objective depends on the optimal w* of the lower-level optimization, as discussed with Eq. (4) and Fig. 2.
- Δz_{i+1} = z_{i+1} - z_i
- Δg_{i+1} = g_{i+1} - g_i
- α is a step size.
- α could be set to 1, and other choices are allowed.
- the Broyden iterations are run until the maximum number of iterations is reached or the error is within a threshold. It is noted that the memory limit K should be less than the maximum number of iterations in order to approximate the inverse of the Hessian by a rank-K matrix. When the Broyden iterations are run more than K times, only the K latest u_k and v_k are preserved due to the memory limit.
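The iteration described above can be sketched as follows. This is a hedged NumPy illustration, not the patent's exact algorithm: the inverse-Jacobian approximation is stored as B_i = I + Σₖ uₖ vₖᵀ (the initialization B₀ = I and the small dense stand-in for the Hessian are assumptions), the step size α is set to 1 as the text allows, and only the K latest (uₖ, vₖ) pairs are kept, matching the memory limit described above.

```python
import numpy as np

# Low-rank "good" Broyden iteration for the root z* of g(z) = H z - v.
def broyden_root(matvec, v, max_iter=100, tol=1e-10, K=30, alpha=1.0):
    us, vs = [], []                       # low-rank factors of B_i - I

    def B(x):                             # apply B_i = I + sum_k u_k v_k^T
        out = x.copy()
        for u_k, v_k in zip(us, vs):
            out += u_k * (v_k @ x)
        return out

    def BT(x):                            # apply B_i^T
        out = x.copy()
        for u_k, v_k in zip(us, vs):
            out += v_k * (u_k @ x)
        return out

    z = np.zeros_like(v)
    g = matvec(z) - v
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:       # error within threshold: stop early
            break
        dz = -alpha * B(g)                # quasi-Newton step
        z_new = z + dz
        g_new = matvec(z_new) - v
        dg = g_new - g
        Bdg = B(dg)
        denom = dz @ Bdg
        if abs(denom) > 1e-14:            # rank-one update of B_i
            u_new = (dz - Bdg) / denom
            v_new = BT(dz)                # uses B_i before the update
            us.append(u_new)
            vs.append(v_new)
            us, vs = us[-K:], vs[-K:]     # memory limit: keep K latest pairs
        z, g = z_new, g_new
    return z

# Small dense stand-in for the Hessian and the vector on the right-hand side.
H = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
v = np.array([1.0, 2.0, 3.0])
z_star = broyden_root(lambda x: H @ x, v)   # z_star ~= H^{-1} v
```

Because g is linear here, the iteration terminates in a handful of steps; for an actual PINN Hessian, `matvec` would be a Hessian-vector product computed by automatic differentiation, and K bounds the memory cost.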
- Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
- an initial set of weights of PINNs w_0 and control variables θ_0 can be input to block 301.
- the initial PINNs may be trained for a certain number of epochs N_w under the control variables θ_0 as a warm-up, and then the pre-trained PINNs and their weights can be input to block 301.
- an objective function with a similar form as Eq. (1) is calculated, with its governing PDE constraints and the boundary/initial conditions of the physical system.
- the solutions of the PDE constraints can be represented by state variables y, and the control variables of the objective function can be represented by u, which are respectively parameterized by w and θ.
- a gradient of the objective function with respect to θ is to be calculated.
- the hypergradient of θ is calculated using IFT differentiation, and further the inverse vector-Hessian product is calculated based on Broyden’s method as in Eq. (9) and (14), which uses a low-rank approximation for acceleration.
- a high-dimensional matrix is replaced with a low-rank vector-matrix product.
- the optimized θ is transferred to block 302 for finetuning the PINNs using only the PDE losses as in Eq. (3).
- the PINNs are trained with gradient descent of the PDE losses under the condition that the optimized θ is fixed.
- the PINNs can be trained for a certain number of epochs N_f to obtain an optimal w*.
- BPN: Bi-level Physics-informed Neural networks with Broyden’s hypergradients.
- the disclosed BPN first introduces the idea of solving a transposed linear system in Eq. (9) to reduce the computational cost. Moreover, the disclosed bi-level optimization is a more general framework compared with constrained optimization in existing approaches; it allows more flexible design choices, such as replacing FEM solvers with PINNs. Additionally, Eq. (9) solves a system in the parameter space of neural networks, whereas some traditional approaches use equations corresponding to a real field defined on an admissible space or on meshes.
- Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure. As described below, some or all illustrated features may be omitted in a particular implementation within the scope of the present disclosure, and some illustrated features may not be required for implementation of all embodiments. Further, some of the blocks may be performed in parallel or in a different order. In some examples, the method may be carried out by any suitable apparatus or means for carrying out the functions or algorithms described below.
- the PDECO problem aims at optimizing the performance of a physical system constrained by PDEs with desired properties, which could be a variety of problems in science and engineering, to name a few, flow control problem, shape optimization problem, drag minimization, etc.
- the shape optimization problem or drag minimization is aimed at designing structures (the shapes, sizes, and distribution of given materials) , for example of an airfoil, with high performance for systems characterized by PDEs.
- the method described with Fig. 4, combined with other aspects herein, can be applied to all PDECO problems with appropriate PDE constraints, and is not limited to the examples below.
- Navier-Stokes equations are among the most important equations in fluid mechanics, aerodynamics and applied mathematics, and are notoriously difficult to solve due to their high non-linearity.
- the NS equations are solved in a pipeline, and the aim is to find the best inlet flow distribution f (y) that makes the outlet flow as uniform as possible.
- Two inlets and two outlets and several walls could be defined for this domain, such as:
- This problem could be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
- the backstep flow is a classic example that might exhibit turbulent flow. In this problem, the aim is also to find the best inlet flow distribution f (y) that makes the outlet flow symmetric and uniform.
- the inlet is the left side of the area and the outlet is the right side of the area,
- the velocity fields of the outlet are to be optimized
- the target velocity field is specified, and the inlet velocity f (y) is initialized as 8y (0.5-y) .
- This problem could be also solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
- the shape of the obstacle is an ellipse parameterized by a parameter a ∈ [0.5, 2.5] .
- the goal is to minimize the following objective.
- This problem could be also solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
- the method begins at block 401 with initializing weights w_0 of the PINNs and the control variables θ_0, wherein the solutions of the PDE constraints are parameterized by the weights w of the PINNs.
- the PINNs are initialized with random parameters w_0, and the control variables θ_0 are initialized with a guess.
- the method proceeds to block 402, with calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively.
- the PDE losses ℰ related to the PDE constraints can be calculated as in Eq. (5) .
- the PDE constraints could be the boundary and/or initial conditions of the physical systems.
- the objective function related to the optimization target can be calculated with similar form as Eq. (1) .
- block 403 is an optional block.
- the warm-up has considerable influence on the convergence speed but minor influence on the final performance; therefore the number of epochs can be chosen based on the implementation and is not limited.
- the method proceeds to block 404, with updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed.
- the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
- the hypergradient of the objective function with respect to the control variables can be calculated as Eq. (8) by Theorem 1.
- the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product, wherein J denotes the objective function, ℰ denotes the PDE losses, w denotes the weights of the PINNs, and w* denotes the updated weights of the PINNs in the last iteration.
- the inverse vector-Hessian product is computed by finding a root z* of a linear equation, which further comprises iteratively approximating the root z* using a low-rank Broyden’s method.
- iteratively approximating the root z* further comprises setting a maximum rank of B_i based on a memory limit, wherein the maximum rank is less than the maximum number of iterations of the low-rank Broyden’s method.
- iteratively approximating the root z* further comprises setting a maximum number of iterations of the low-rank Broyden’s method, and running the iterations until the maximum number of iterations is reached or the error is within a threshold.
- the method proceeds to block 405, with updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed.
- the PINNs are trained with the updated control variables for a certain number of epochs.
- the optimal weights of the PINNs are transferred back to block 404 for the next iteration. As can also be seen from Fig. 4, the operations of blocks 404 and 405 form an outer loop of optimization of the control variables, while the operation in block 405 itself is an inner loop optimization for finetuning the PINNs, which is also illustrated and described with Fig. 3. Besides, in an aspect, the Broyden’s method iterations occur within block 404.
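The outer/inner loop structure of blocks 404 and 405 can be sketched on a toy problem. The quadratic losses below are illustrative stand-ins, not the disclosed PDE losses: E(w, θ) = (w − θ)² plays the role of the inner PDE loss and J(w) = (w − 1)² the outer objective, so the inner minimizer is w*(θ) = θ and the true optimum is θ = w = 1.

```python
# Toy bi-level loop: outer updates of the control variable theta using the
# IFT hypergradient (block 404), inner gradient-descent "finetuning" of the
# stand-in PINN weight w on the PDE loss with theta fixed (block 405).

def dE_dw(w, theta):                 # gradient of inner loss E = (w - theta)^2
    return 2.0 * (w - theta)

def dJ_dw(w):                        # gradient of outer objective J = (w - 1)^2
    return 2.0 * (w - 1.0)

def hypergradient(w, theta):
    # IFT: dJ/dtheta = (dJ/dw*) * dw*/dtheta, where for this toy problem
    # dw*/dtheta = -(d2E/dw2)^{-1} (d2E/dw dtheta) = -(2)^{-1} * (-2) = 1.
    return dJ_dw(w) * 1.0

theta, w = -3.0, -3.0
lr_outer, lr_inner, n_inner = 0.1, 0.2, 50
for _ in range(100):
    # block 404: update the control variable with the PINN weight fixed
    theta -= lr_outer * hypergradient(w, theta)
    # block 405: finetune the "PINN" weight on the PDE loss, theta fixed;
    # the resulting w is reused in the next outer iteration
    for _ in range(n_inner):
        w -= lr_inner * dE_dw(w, theta)
# Both theta and w converge to 1, the constrained optimum.
```

In the disclosed method the scalar derivatives above are replaced by PINN training for the inner loop and the Broyden-based inverse vector-Hessian product for the hypergradient; the alternating structure is the same.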
- the optimized control variables and PINNs are output at block 407.
- Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure.
- the computing system may comprise at least one processor 510.
- the computing system may further comprise at least one storage device 520. It should be appreciated that the storage device 520 may store computer-executable instructions that, when executed, cause the processor 510 to perform any operations according to the embodiments of the present disclosure as described in connection with FIGs. 1-4.
- the embodiments of the present disclosure may be embodied in a computer-readable medium such as non-transitory computer-readable medium.
- the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs), wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target is decoupled.
- the method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target, respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in the last iteration are used for updating the control variables in the next iteration.
- the embodiments of the present disclosure may be embodied in a computer-readable medium such as non-transitory computer-readable medium.
- the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the described methods herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprising: outputting the updated control variables denoting the best inlet flow distribution after convergence.
- the non-transitory computer-readable medium may further comprise instructions that, when executed, cause one or more processors to perform any operations according to the embodiments of the present disclosure as described in connection with Figs. 1-4.
- modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
Abstract
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target is decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target, respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in the last iteration are used for updating the control variables in the next iteration.
Description
Aspects of the present disclosure relate generally to artificial intelligence, and more particularly, to a method and an apparatus for bi-level physics-informed neural networks for Partial Differential Equation constrained optimization.
Partial Differential Equation (PDE) constrained optimization (PDECO) aims at optimizing the performance of a physical system constrained by PDEs with desired properties. It is an important task in many areas of science and engineering, with a wide range of applications including shape optimization problem such as the design on shapes of aircraft wings in aerodynamics, parameters optimization of channels in heat transfer, parameters optimization of flow control problem, etc.
For solving PDECO problems, traditional numerical methods like adjoint methods based on finite element methods (FEMs) have been studied for decades. However, FEM solvers are usually expensive for high-dimensional problems with a large search space or mesh size, because the computational cost of FEMs grows quadratically to cubically w.r.t. the mesh size.
To mitigate this problem, neural networks like DeepONet have recently been proposed as surrogate models of FEMs. DeepONet learns a mapping from control (decision) variables to solutions of PDEs and further replaces the PDE constraints with the operator network. But these methods require pretraining a large operator network, which is non-trivial and inefficient. Moreover, performance might deteriorate if the optimal solution is out of the training distribution.
Physics-informed Neural Networks (PINNs) solve learning tasks while respecting the properties of physical laws. This is achieved by informing a loss function about the mathematical equations that govern the physical system. The general procedure for solving a differential equation with PINNs involves finding the parameters of a network that minimize a loss function involving the mismatch between output and data, as well as residuals of the boundary and initial conditions, the PDE equations, and any other physical constraints required.
Thus, another line of neural methods proposes using a single PINN to solve the PDECO problem instead of training a large operator network. It uses the method of Lagrangian multipliers to treat the PDE constraints as regularization terms, so that the objective and the PDE loss can be optimized simultaneously. However, such methods introduce a trade-off between the optimization targets and the regularization terms (i.e., the PDE losses) which is crucial for the performance. It is generally non-trivial to set proper weights for balancing these terms due to the lack of theoretical guidance.
Therefore, existing methods are insufficient to handle those PDE constraints that have a complicated or nonlinear dependency on optimization targets. It is imperative to develop an effective strategy for dealing with the PDE constraints for solving PDECO problems.
SUMMARY
The following presents a simplified summary of one or more aspects to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
Generally, a novel bi-level optimization framework named Bi-level Physics-informed Neural networks with Broyden’s hypergradients (BPN) for solving PDECO problems is disclosed. The optimization of the targets and the PDE constraints is decoupled, thereby naturally addressing the challenge of loss balancing in regularization-based methods. To solve the bi-level optimization problem, an iterative method is disclosed that optimizes the PINNs with the PDE constraints in the inner loop while optimizing the control variables for the objective function in the outer loop using hypergradients. Computing hypergradients in bi-level optimization for the control variables is challenging if the inner-loop optimization is complicated. Therefore, a method for calculating the hypergradients based on implicit differentiation using Broyden’s method is also disclosed.
According to an aspect of the disclosure, a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration while the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration while the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
According to a further aspect, the method comprises training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up.
According to a further aspect, updating the weights of the PINNs further comprises updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration, while the updated control variables are fixed, for a certain number of epochs.
According to a further aspect, the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
According to a further aspect, the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product

z* = ∂J/∂w · (∂²ε/∂w∂w)⁻¹, evaluated at w = w*,

wherein J denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and w* denotes the updated weights of the PINNs in the last iteration.
According to a further aspect, the inverse vector-Hessian product is computed by finding a root z* of a linear equation

g_w(z) = zᵀ (∂²ε/∂w∂w) − ∂J/∂w = 0,

which further comprises iteratively approximating the root z* using a low-rank Broyden’s method.
According to a further aspect, the iterative approximation of the root z* using the low-rank Broyden’s method comprises the following operations in each iteration: approximating the inverse of the Hessian matrix (∂²ε/∂w∂w)⁻¹ as B_i = Σ_{j=1…k} u_j v_jᵀ, wherein k is the rank of B_i; updating z, u and v according to

z_{i+1} = z_i − α·B_i g_w(z_i),
u_{i+1} = (Δz_{i+1} − B_i Δg_{i+1}) / (Δz_{i+1}ᵀ B_i Δg_{i+1}),
v_{i+1} = B_i Δz_{i+1},

wherein α is a step size; and updating the inverse of the Hessian matrix by B_{i+1} = B_i + u_{i+1} v_{i+1}ᵀ.
According to a further aspect, the iterative approximation of the root z* using the low-rank Broyden’s method further comprises setting a maximum rank of B_i based on a memory limit, wherein the maximum rank is less than a maximum number of iterations of the low-rank Broyden’s method.
According to a further aspect, the iterative approximation of the root z* using the low-rank Broyden’s method further comprises setting a maximum number of iterations of the low-rank Broyden’s method; and running the iterations of the low-rank Broyden’s method until the maximum number of iterations is reached or an error is within a threshold.
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods disclosed herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprises outputting the updated control variables denoting the best inlet flow distribution after convergence.
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization shape optimization problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods disclosed herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, comprises outputting the updated control variables denoting the best shape after convergence.
The disclosed aspects will be described in connection with the appended drawings that are provided to illustrate and not to limit the disclosed aspects.
Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.
Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure.
Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure.
The present disclosure will now be discussed with reference to several example implementations. It is to be understood that these implementations are discussed only for enabling those skilled in the art to better understand and thus implement the embodiments of the present disclosure, rather than suggesting any limitations on the scope of the present disclosure.
Various embodiments will be described in detail with reference to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. References made to examples and embodiments are for illustrative purposes, and are not intended to limit the scope of the disclosure.
Partial Differential Equation (PDE) constrained optimization (PDECO) aims at optimizing the performance of a physical system constrained by PDEs with desired properties. The objective or target of PDECO would typically be parameterized by a set of control variables, and the state variables of the physical system would correspond to the solutions of PDE constraints.
Surrogate modeling is an important class of methods for PDECO. Physics-informed neural networks (PINNs) are powerful and flexible surrogates for representing the solutions of PDEs. Some existing methods, such as hPINN, treat the PDE constraints as regularization terms and optimize the control variables and state variables simultaneously. Such methods use the penalty method and the Lagrangian method to adjust the weights of the multipliers. Some of them adopt the same formulation but use a line search to find the largest weight for which the PDE error stays within a certain range. The key limitation of these approaches is that the weights of the multipliers are determined heuristically, which might be sub-optimal and unstable.
Another class of methods trains an operator network from the control variables to the solutions of PDEs or the objective functions. Several approaches use mesh-based methods and predict the states on all mesh points from the control variables at the same time. As another example, PI-DeepONet adopts the architecture of DeepONet and trains the network using physics-informed losses, which are also called PDE losses.
However, all the existing methods may produce unsatisfactory results if the optimal solution is out of the training distribution. To better illustrate the disclosed approach, the PDECO problem is first formulated as below.
Let Y, U, V be three Banach spaces. The solution fields of the PDEs are called state variables, i.e., y ∈ Y_ad ⊆ Y, and the functions or variables that can be controlled are called control variables, i.e., u ∈ U_ad ⊆ U, where Y_ad and U_ad are admissible spaces. The PDECO problem can be formulated as:

min_{y, u} J(y, u), subject to e(y, u) = 0,   (1)

wherein J: Y × U → ℝ is the objective function and e: Y × U → V are the PDE constraints. Usually, the PDE system e(y, u) = 0 contains multiple equations and boundary/initial conditions:

F_i(y, u) = 0 in Ω, i = 1, 2, …; B_j(y, u) = 0 on ∂Ω, j = 1, 2, ….   (2)
As an example, consider a flow control problem aimed at finding the best inlet flow distribution that makes the outlet flow as uniform as possible. The state variables denote the state of the physical system, which may typically be the flow velocity and the pressure, etc., and the control variables may be parameters which parameterize the flow distribution. As another example, for a shape optimization problem that seeks the best shape minimizing the drag forces from the flow, the state variables may also be the flow velocity and the pressure, but the control variables are the parameters of the structures to be optimized.
As mentioned above, existing methods based on regularization (e.g., the penalty method) solve the PDECO problem by minimizing the following objective:

L(w, θ) = J(y_w, u_θ) + Σ_i λ_i ε_i(y_w, u_θ),   (3)

wherein the solutions y and the control variables u are parameterized by y_w and u_θ respectively, with w being the weights of the PINNs and θ the parameters of the control variables, and λ_i are hyper-parameters balancing these terms of the optimization target. One main difficulty is that the λ_i are hard to set and the results are sensitive to them, due to the complex nature of the regularization terms (PDE constraints). In general, a large λ_i makes it difficult to optimize the objective J, while a small λ_i can result in a nonphysical solution y_w. Besides, the optimal λ may also vary over the different phases of training.
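The sensitivity to λ can be seen even on a tiny example. The sketch below assumes a one-dimensional toy problem chosen purely for illustration (state y, control u, "PDE" constraint y − u = 0, objective (y − 1)² + 0.1·u², none of which are from the disclosure) and solves the stationarity conditions of the penalized loss for several λ:

```python
import numpy as np

# Assumed toy: L(y, u) = (y - 1)**2 + 0.1*u**2 + lam*(y - u)**2.
# Setting dL/dy = 0 and dL/du = 0 gives a 2x2 linear system in (y, u).
for lam in [0.01, 1.0, 100.0]:
    K = np.array([[2 + 2 * lam, -2 * lam],
                  [-2 * lam, 0.2 + 2 * lam]])
    y, u = np.linalg.solve(K, np.array([2.0, 0.0]))
    print(f"lam={lam:6}: y={y:.3f}  u={u:.3f}  |y-u|={abs(y - u):.4f}")
# small lam: the constraint is badly violated (nonphysical y);
# large lam: feasibility improves but the constraint term dominates the objective
```

The constraint violation |y − u| shrinks roughly like 1/λ here, while for small λ the "solution" y ignores the constraint entirely, mirroring the trade-off described above.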
Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.
At the beginning, an initial set of PINN weights w and control variables θ are input to block 101. At block 101, an objective function J(y, u), which denotes the target to be optimized, is calculated together with its governing PDE constraints F_i(y, u) = 0 and the boundary/initial conditions B_j(y, u) = 0 of the physical system. As described above, the solutions of the PDEs are represented by the state variables y, and the control variables of the objective function J are represented by u, which are parameterized by w and θ respectively.

At block 102, the objective function J and the regularization terms corresponding to the PDE constraints are combined into L(w, θ) with coefficients λ_i as in Eq. (3). Then the control variables and the state variables are optimized simultaneously by minimizing the combined objective function L. Finally, the optimized w and θ are output.
Fig. 1 is merely shown as an example, and other examples would be possible.
To resolve the above challenges of regularization-based methods, an approach that interprets PDECO as a bi-level optimization problem is disclosed herein, which consequently enables a new solver. Specifically, the following bi-level optimization problem is solved:

min_θ J(w*, θ), s.t. w* = arg min_w ε(w, θ).   (4)

In the outer loop, only J is minimized with respect to θ given the optimal value of w*, and the PDE losses are then optimized using PINNs in the inner loop with θ fixed. The objective ε of the inner-loop sub-problem is the total PDE loss:

ε(w, θ) = Σ_i ε_i(y_w, u_θ).   (5)

By transforming the problem in Eq. (1) into a bi-level optimization problem as above, the optimization of the PDEs’ state variables and the control variables is decoupled, which removes the difficulty of setting proper hyper-parameters λ_i in Eq. (3).
To solve the proposed bi-level optimization problem in Eq. (4) , the inner loops and outer loops are designed to be executed iteratively. Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
An objective function J, which denotes the target to be optimized, is calculated together with its governing PDE constraints F_i(y, u) = 0 and the boundary/initial conditions B_j(y, u) = 0 of the physical system. As described above, the solutions of the PDEs are represented by the state variables y, and the control variables of the objective function J are represented by u, which are parameterized by w and θ respectively.

As shown in Fig. 2, an initial set of PINN weights w and control variables θ are input to block 201. At block 201, J is calculated and θ is optimized in the upper level by minimizing J with respect to θ given a fixed value of w in one iteration. The optimized θ is then transferred to block 202 for optimizing w.

At block 202, the PINNs are trained in the lower level using the PDE losses ε only, as in Eq. (5). After that, the optimized w is transferred back to block 201 for optimizing θ in the next iteration. After a certain number of iterations and/or convergence, the optimized w and θ are output. By decoupling the optimization of the state variables and the control variables, the shortcomings of the regularization-based methods described above can be overcome.
Fig. 2 is merely shown as an example, and other examples would be possible.
Bi-level optimization is widely used in various machine learning tasks, e.g., neural architecture search, meta learning and hyperparameter optimization. One of the key challenges is to compute hypergradients with respect to the inner-loop optimization. Some previous methods use unrolled optimization or truncated unrolled optimization, which differentiates through the optimization process. However, this is not scalable if the inner-loop optimization is a complicated process. Some other methods compute the hypergradient based on the implicit function theorem, which requires computing an inverse Hessian-vector product (inverse-HVP). It has been proposed to use the Neumann series to approximate the hypergradients in some previous approaches. Some works also use the conjugate gradient method. The approximation used for implicit differentiation is crucial for the accuracy of the hypergradient computation.
To optimize the control variables in the outer loop, or in the upper level as discussed with Eq. (4) and Fig. 2, the gradient of J with respect to the parameters of the control variables θ is calculated, which is called the hypergradient. The hypergradient is computed based on Implicit Function Theorem Differentiation, along which line a highly complex inverse Hessian-Jacobian product needs to be calculated. To address this issue, Broyden’s method, which provides an efficient approximation at a superlinear convergence speed, is proposed to be used, as discussed in detail below.
The upper objective J depends on the optimal w* of the lower-level optimization as discussed with Eq. (4) and Fig. 2, i.e.,

J = J(w*(θ), θ).   (6)

Thus, the Jacobian of w* with respect to θ needs to be considered when calculating the hypergradients. Since w* minimizes the lower-level problem, ∂w*/∂θ can be derived by applying the Cauchy Implicit Function Theorem as follows.

Theorem 1: If for some (w′, θ′) the lower-level optimization is solved, i.e., ∂ε/∂w |_(w′, θ′) = 0, and the Hessian ∂²ε/∂w∂w is invertible, then there exists a function w* = w*(θ) surrounding (w′, θ′) s.t. ∂ε/∂w |_(w*(θ′), θ′) = 0, and we have:

∂w*/∂θ = −(∂²ε/∂w∂w)⁻¹ · (∂²ε/∂w∂θ).   (7)
By Theorem 1, the hypergradients can be calculated analytically as:

dJ/dθ = ∂J/∂θ − ∂J/∂w · (∂²ε/∂w∂w)⁻¹ · (∂²ε/∂w∂θ).   (8)

However, computing the inverse of the Hessian matrix ∂²ε/∂w∂w is intractable for the parameters of neural networks. To handle this challenge, z* = ∂J/∂w · (∂²ε/∂w∂w)⁻¹, which is also called the inverse vector-Hessian product, can be computed first. As mentioned before, some previous works use the Neumann series to approximate the inverse of the Hessian matrix. However, this approach yields a coarse and imprecise estimation of the hypergradients.
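Whichever solver is used for this product, the primitive it needs is multiplying a vector by the Hessian of ε without ever materializing the Hessian. A minimal numpy sketch on an assumed quadratic toy loss (the 6×6 matrix, random seed and step size are illustrative choices, not part of the disclosure; in practice automatic differentiation provides the same product):

```python
import numpy as np

# Assumed quadratic toy loss eps(w) = 0.5*w^T H w - b^T w, whose gradient is
# H w - b; only the gradient function is exposed, never H itself.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
H = A.T @ A                      # the (hidden) Hessian, used below only to check
b = rng.normal(size=6)

def grad_eps(w):
    return H @ w - b

def hessian_vector_product(grad_fn, w, z, h=1e-5):
    # central difference of the gradient: two gradient evaluations per call,
    # so the n x n Hessian is never formed explicitly
    return (grad_fn(w + h * z) - grad_fn(w - h * z)) / (2.0 * h)

w = rng.normal(size=6)
z = rng.normal(size=6)
err = np.max(np.abs(hessian_vector_product(grad_eps, w, z) - H @ z))
print(err)   # tiny: the product matches H @ z
```

Because the Hessian of ε is symmetric, the vector-Hessian product zᵀH is simply the transpose of the Hessian-vector product Hz, so this primitive suffices for root-finding on Eq. (9).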
Therefore, a more efficient and effective approach to compute the inverse vector-Hessian product, which enjoys a superlinear convergence speed, is disclosed herein. Computing z* is equivalent to finding the root of the following linear equation:

g_w(z) = zᵀ (∂²ε/∂w∂w) − ∂J/∂w = 0.   (9)

It is noted that for each evaluation of g_w(z), only two Jacobian-vector products need to be computed, so no instance of the Hessian matrix needs to be created as in prior arts, resulting in a low computational cost. Specifically, a low-rank Broyden’s method is used to iteratively approximate the solution z*. In each iteration, firstly the inverse of the Hessian ∂²ε/∂w∂w is approximated as:

B_i = Σ_{j=1…k} u_j v_jᵀ,   (10)

wherein k is the rank of B_i. Then z, u and v are updated according to the following rules:

z_{i+1} = z_i − α·B_i g_w(z_i),   (11)
u_{i+1} = (Δz_{i+1} − B_i Δg_{i+1}) / (Δz_{i+1}ᵀ B_i Δg_{i+1}),   (12)
v_{i+1} = B_i Δz_{i+1},   (13)

wherein Δz_{i+1} = z_{i+1} − z_i, Δg_{i+1} = g_{i+1} − g_i, and α is a step size. In one example, α can be set to 1, and other examples are allowed.
In summary, the inverse of the Hessian is updated by:

B_{i+1} = B_i + u_{i+1} v_{i+1}ᵀ.   (14)

After m iterations, z_m is used as the approximation of z*.
Based on Eq. (10), it can be seen that B_i is expressed via the column vectors u_k and v_k, meaning that instead of storing a whole high-dimensional matrix B_i, it is only needed to record the vectors u_k and v_k, k = 1…K, each having the dimension of w, where K is a tunable parameter depending on the memory limit. This yields a matrix B_i having a rank of at most K, which is far less than the dimension of the Hessian, in order to achieve a low computational cost.
The Broyden iterations are run until the maximum number of iterations is reached or the error is within a threshold. It is noted that the maximum memory limit K should be less than the maximum number of iterations, so that the inverse of the Hessian is approximated by a rank-K matrix. When the Broyden iterations are run more than K times, only the K latest u_k and v_k are preserved due to the memory limit.
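The low-rank update scheme above can be sketched in Python. This is a generic low-rank ("good") Broyden solver for a linear system, not the patented implementation itself: the initialization B_0 = I, the transpose-form rank-1 update, and the small test matrix standing in for the (huge, implicit) Hessian are all assumptions of this sketch.

```python
import numpy as np

def broyden_solve(matvec, rhs, max_iter=60, max_rank=50, alpha=1.0, tol=1e-10):
    """Low-rank Broyden root-finder for the linear system matvec(z) = rhs.

    The inverse-Jacobian estimate is kept implicitly as B = I + sum_k u_k v_k^T,
    so only up to 2*max_rank vectors are stored, never an n x n matrix."""
    us, vs = [], []

    def B(x):                                  # apply B_i to x
        y = x.copy()
        for u, v in zip(us, vs):
            y = y + u * (v @ x)
        return y

    def BT(x):                                 # apply B_i^T to x
        y = x.copy()
        for u, v in zip(us, vs):
            y = y + v * (u @ x)
        return y

    z = np.zeros_like(rhs)
    g = matvec(z) - rhs
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:            # error within threshold: stop
            break
        z_new = z - alpha * B(g)               # z_{i+1} = z_i - alpha * B_i g_i
        g_new = matvec(z_new) - rhs
        dz, dg = z_new - z, g_new - g
        v = BT(dz)                             # rank-1 update direction
        denom = v @ dg                         # dz^T B_i dg
        if abs(denom) > 1e-14:
            us.append((dz - B(dg)) / denom)    # u_{i+1}
            vs.append(v)
            if len(us) > max_rank:             # memory limit: keep K latest
                us.pop(0); vs.pop(0)
        z, g = z_new, g_new
    return z

# small SPD test system standing in for the implicit Hessian system of Eq. (9)
A = np.array([[4., 1, 0, 0, 0],
              [1, 5, 1, 0, 0],
              [0, 1, 6, 1, 0],
              [0, 0, 1, 5, 1],
              [0, 0, 0, 1, 4]])
rhs = np.arange(1.0, 6.0)
z = broyden_solve(lambda x: A @ x, rhs)
print(np.linalg.norm(A @ z - rhs))   # near zero
```

For a linear system, Broyden's method with full steps terminates in a finite number of iterations, so the residual here drops to roundoff level well before `max_iter`; the `max_rank` cap trades that exactness for bounded memory, exactly the trade-off the ablation tables above explore.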
The influence of different numbers of iterations and memory limits of Broyden’s method is investigated through experiments on the Heat 2d problem and the NS backstep problem. The results on the Heat 2d equation are shown in Table 1. It can be seen that, roughly, the performance improves (the objective function decreases) as the maximum memory limit and the maximum number of iterations increase. In this problem, a rank of more than 16 for approximating the inverse Hessian is sufficient, and there is no significant gain from using more than 32 iterations.
Table 1-Ablation study on Heat2d problem. Lower score means better performance.
The results on the NS backstep problem are shown in Table 2. The trend is similar to the Heat 2d problem, and roughly the performance improves as the number of Broyden iterations increases. However, it can further be seen that this problem is much more difficult, and the performance deteriorates if only 8 memory steps are used, even when many iterations are used.
Table 2-Ablation study on NS backstep problem. Lower score means better performance.
Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
Similar to Fig. 2, an initial set of PINN weights w_0 and control variables θ_0 can be input to block 301. As an alternative, the initial PINNs may first be trained for a certain number of epochs N_w under the control variables θ_0 as a warm-up, and then the pre-trained PINNs and their weights can be input to block 301.
At block 301, an objective function J of a similar form as in Eq. (1) is calculated, with its governing PDE constraints F_i(y, u) = 0 and the boundary/initial conditions B_j(y, u) = 0 of the physical system. As described above, the solutions of the PDE constraints are represented by the state variables y, and the control variables of the objective function J are represented by u, which are parameterized by w and θ respectively.

In order to optimize θ by gradient descent in the outer loop with w fixed, the gradient of J with respect to θ is calculated. As discussed above and shown in block 301, the hypergradient of θ is calculated using IFT Differentiation, and the inverse vector-Hessian product is further calculated based on Broyden’s method as in Eq. (9) and (14), which uses a low-rank approximation for acceleration. As shown in block 301 with solid blocks, a high-dimensional matrix is replaced with a low-rank vector product.

After θ is optimized in the iteration, the optimized θ is transferred to block 302 for fine-tuning the PINNs using only the PDE losses, as in Eq. (5). As shown in block 302, the PINNs are trained with gradient descent of the PDE losses while the optimized θ is held fixed. In an example, the PINNs can be trained for a certain number of epochs N_f to obtain an optimal w*.

After w is optimized in the iteration, the optimized w* is transferred back to block 301 for optimizing θ in the next iteration. These two steps are executed iteratively until convergence.
The influence of the warm-up epochs N_w and of different fine-tuning epochs N_f for the inner loop of the PINNs is also investigated through experiments. The Heat 2d problem is chosen and the results are shown in Table 3 and Table 4. It can be seen from the tables that these hyperparameters have a considerable influence on the convergence speed. However, their impact on the final performance is minor within a broad range. Thus, BPN is robust to the choice of these two parameters on the Heat 2d problem, but a more efficient and effective result can be achieved by choosing moderate N_w and/or N_f.

Table 3 – Ablation study on the influence of fine-tuning epochs of PINNs

Table 4 – Ablation study on the influence of warm-up epochs of PINNs
Based on the description and examples above, the disclosed method as a whole is named Bi-level Physics-informed Neural networks with Broyden’s hypergradients (BPN); however, one or more of the aspects can be implemented alone or in combination, with the corresponding effects. The pseudo code of BPN is outlined in Algorithm 1.
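The alternation between the outer hypergradient step and the inner PINN fine-tuning can be illustrated on a scalar toy problem. Everything here is an assumed stand-in chosen for the demo (the residual a·w − θ, the quadratic objective, both learning rates): a single scalar w plays the role of the PINN weights, and the hypergradient is the scalar specialization of Eq. (8).

```python
import numpy as np

# Toy bi-level PDECO: "PDE" residual a*w - theta = 0 (w: state/weight stand-in,
# theta: control), objective J = 0.5*(w - 1)**2.
# eps = 0.5*(a*w - theta)**2, so d2eps/dw2 = a**2 and d2eps/dwdtheta = -a,
# and Eq. (8) gives dJ/dtheta = 0 - (dJ/dw) * (1/a**2) * (-a).
a = 2.0                  # assumed PDE coefficient; bi-level optimum: w = 1, theta = a
w, theta = 0.0, 0.0

for outer in range(100):
    # outer step: hypergradient descent on theta with w fixed (block 201/301)
    dJ_dw = w - 1.0
    hypergrad = 0.0 - dJ_dw * (1.0 / a**2) * (-a)
    theta -= 0.5 * hypergrad                        # first learning rate
    # inner loop: fine-tune the "PINN" on the PDE loss with theta fixed (block 202/302)
    for _ in range(50):
        w -= 0.2 * a * (a * w - theta)              # second learning rate

print(round(theta, 3), round(w, 3))   # → 2.0 1.0
```

The control converges to θ = a (so the state satisfying the constraint reaches the target w = 1), without any λ-style balancing between the objective and the PDE loss.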
The disclosed BPN first introduces the idea of solving a transposed linear system, Eq. (9), to reduce the computational cost. Moreover, the disclosed bi-level optimization is a more general framework than the constrained optimization in existing approaches, and it allows more flexible design choices, such as replacing FEM solvers with PINNs. Additionally, Eq. (9) solves a system in the parameter space of neural networks, whereas some traditional approaches use equations corresponding to a real field defined on admissible spaces or meshes.
Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure. As described below, some or all illustrated features may be omitted in a particular implementation within the scope of the present disclosure, and some illustrated features may not be required for implementation of all embodiments. Further, some of the blocks may be performed parallel or in a different order. In some examples, the method may be carried out by any suitable apparatus or means for carrying out the functions or algorithm described below.
The PDECO problem aims at optimizing the performance of a physical system constrained by PDEs with desired properties, which covers a variety of problems in science and engineering, to name a few, the flow control problem, the shape optimization problem, drag minimization, etc. The shape optimization problem, or drag minimization, is aimed at designing structures (the shapes, sizes, and distribution of given materials), for example of an airfoil, with high performance for systems characterized by PDEs. The method described with Fig. 4, in combination with the other aspects herein, can be applied to all PDECO problems with appropriate PDE constraints, and is not limited to the examples below.
For example, Navier-Stokes equations are among the most important equations in fluid mechanics, aerodynamics and applied mathematics, and they are notoriously difficult to solve due to their high non-linearity. In this problem, the NS equations are solved in a pipeline, and the aim is to find the best inlet flow distribution f(y) that makes the outlet flow as uniform as possible. The flow velocity field is u = (u, v) and the pressure field is p, and they are defined on a rectangular domain Ω, for example Ω = [0, 1.5] × [0, 1.0]. Two inlets, two outlets and several walls can be defined for this domain.

The whole problem, which is a concrete expression of Eq. (1) and (2), minimizes the mismatch between the outlet velocity and a target profile, subject to the NS equations and the boundary conditions. The target function is a parabolic function, and the velocity field on the second inlet and outlet is v_2(x) = 18(x − 0.5)(1 − x). The Reynolds number can be set to 100 in this problem, and f(y) can be initialized the same as the target function, i.e., f(y) = 4y(1 − y). This problem can be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
As another example, the backstep flow is a classic example that may exhibit turbulent flow. In this problem, the aim is also to find the best inlet flow distribution f(y) that makes the outlet flow symmetric and uniform. The geometry of the backstep can be viewed as a union of two rectangles, such as Ω = [0, 1] × [0, 0.5] ∪ [1, 2] × [0, 1]. The inlet is the left side of the area and the outlet is the right side of the area, and the velocity field at the outlet is to be optimized towards a target velocity field. The inlet velocity f(y) is initialized as 8y(0.5 − y). This problem can also be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
For another class of problems, shape optimization, consider for example drag minimization over an obstacle governed by the NS equations, which is a shape optimization task that seeks the best shape of the obstacle minimizing the drag forces from the flow. The inlet is the left side of the area and the outlet is the right side of the area. The flow field is defined on a domain Ω = [0, 8]² \ Ω°, where Ω° is the obstacle. The shape of the obstacle is an ellipse parameterized by a parameter a ∈ [0.5, 2.5]. The goal is to minimize the drag objective. This problem can also be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to the shape parameter given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
Now the disclosed method, which is generally applicable for solving PDECO problems, is illustrated with Fig. 4. The method begins at block 401 with initializing the weights w_0 of the PINNs and the control variables θ_0, wherein the solutions of the PDE constraints are parameterized by the weights w of the PINNs. As an example, the PINNs are initialized with random parameters w_0, and the control variables θ_0 are initialized with a guess.
The method proceeds to block 402, with calculating the PDE losses related to the PDE constraints and an objective function related to the optimization target, respectively. As an example, the PDE losses ε related to the PDE constraints can be calculated as in Eq. (5). Referring to the several PDECO problems mentioned above, the PDE constraints can include the boundary and/or initial conditions of the physical systems. Also, as an example, the objective function J related to the optimization target can be calculated in a similar form as Eq. (1).
The method proceeds to block 403, with training the initialized PINNs under the initialized control variables for a certain number of epochs as a warm-up; as shown with the dashed line, block 403 is an optional block. The warm-up has a considerable influence on the convergence speed but only a minor influence on the final performance; therefore, the number of epochs can be chosen based on the implementation and is not limited.
The method proceeds to block 404, with updating the control variables by a first learning rate with gradient descent of the objective function in one iteration while the weights of the PINNs are fixed.
In an example, the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation. For example, the hypergradient of the objective function with respect to the control variables can be calculated as Eq. (8) by Theorem 1.
As a further example, the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product z* = ∂J/∂w · (∂²ε/∂w∂w)⁻¹ evaluated at w = w*, wherein J denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and w* denotes the updated weights of the PINNs in the last iteration.
As a further example, the inverse vector-Hessian product is computed by finding a root z* of the linear equation g_w(z) = zᵀ (∂²ε/∂w∂w) − ∂J/∂w = 0, which further comprises iteratively approximating the root z* using a low-rank Broyden’s method. Iteratively approximating the root z* comprises, in each iteration of the low-rank Broyden’s method: approximating the inverse of the Hessian matrix (∂²ε/∂w∂w)⁻¹ as B_i = Σ_{j=1…k} u_j v_jᵀ, wherein k is the rank of B_i; updating z, u and v according to z_{i+1} = z_i − α·B_i g_w(z_i), u_{i+1} = (Δz_{i+1} − B_i Δg_{i+1}) / (Δz_{i+1}ᵀ B_i Δg_{i+1}) and v_{i+1} = B_i Δz_{i+1}, wherein α is a step size; and updating the inverse of the Hessian matrix by B_{i+1} = B_i + u_{i+1} v_{i+1}ᵀ.
As a further example, iteratively approximating the root z* further comprises setting a maximum rank of B_i based on a memory limit, wherein the maximum rank is less than a maximum number of iterations of the low-rank Broyden’s method.
As a further example, iteratively approximating the root z* further comprises setting a maximum number of iterations of the low-rank Broyden’s method; and running the iterations of the low-rank Broyden’s method until the maximum number of iterations is reached or an error is within a threshold.
After the control variables are updated, the method proceeds to block 405, with updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed. As an example, the PINNs are trained with the updated control variables for a certain number of epochs.
If the PINNs are optimized but convergence is not determined at block 406, the optimal weights of the PINNs are transferred back to block 404 for the next iteration. As can also be seen from Fig. 4, the operations of blocks 404 and 405 form an outer loop that optimizes the control variables, while the operation of block 405 itself is an inner loop that finetunes the PINNs, as also illustrated and described with Fig. 3. Besides, in an aspect, the Broyden's method iterations take place within block 404.
After the PINNs are optimized and determined to be converged at block 406, the optimized control variables and PINNs are output at block 407.
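The overall flow of blocks 404 to 407 can be sketched end to end on a deliberately tiny stand-in problem. All operators, learning rates, epoch counts and the target t below are invented for illustration: the outer step updates the control variables by the hypergradient with the weights fixed, the inner loop trains the stand-in "PINN" weights on the PDE loss with the controls fixed, and the two updates alternate until convergence.

```python
import numpy as np

# Toy stand-ins: eps(w, phi) = 0.5*||w - phi||^2 plays the PDE loss, so the
# inner minimizer is w*(phi) = phi; J(w) = 0.5*||w - t||^2 plays the objective.
t = np.array([1.0, -2.0, 0.5])           # hypothetical optimization target

phi = np.zeros(3)                        # initialize control variables
w = np.zeros(3)                          # initialize "PINN" weights
lr_outer, lr_inner = 0.5, 0.3            # first / second learning rates

for _ in range(200):                     # bi-level loop (blocks 404-406)
    # Block 404 analogue: hypergradient step on phi with w fixed. By the IFT,
    # dJ/dphi = -(dJ/dw) @ H^{-1} @ (d2eps/dw dphi) = (w - t) here, since the
    # inner Hessian is I and the mixed derivative is -I.
    hypergrad = w - t
    phi = phi - lr_outer * hypergrad

    # Block 405 analogue: inner loop finetunes w on the PDE loss, phi fixed,
    # for "a certain number of epochs".
    for _ in range(10):
        w = w - lr_inner * (w - phi)     # gradient of 0.5*||w - phi||^2

    # Block 406 analogue: convergence check.
    if np.linalg.norm(w - t) < 1e-8:
        break

print(phi)                               # block 407 analogue: optimized controls
```

Because the two updates are decoupled, neither loop needs the other's learning rate or loss; only the converged weights w* flow into the next outer step, which is the structure the bi-level formulation exploits.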
Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure. The computing system may comprise at least one processor 510. The computing system may further comprise at least one storage device 520. It should be appreciated that the storage device 520 may store computer-executable instructions that, when executed, cause the processor 510 to perform any operations according to the embodiments of the present disclosure as described in connection with FIGs. 1-4.
The embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs), wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
The embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the described methods herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, the method comprising: outputting the updated control variables denoting the best inlet flow distribution after convergence.
The embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained shape optimization problem in a physical system with physics-informed neural networks (PINNs) with at least one of the described methods herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, the method comprising: outputting the updated control variables denoting the best shape after convergence.
The non-transitory computer-readable medium may further comprise instructions that, when executed, cause one or more processors to perform any operations according to the embodiments of the present disclosure as described in connection with Figs. 1-4.
It should be appreciated that all the operations in the methods described above are merely exemplary, and the present disclosure is not limited to any operations in the methods or sequence orders of these operations, and should cover all other equivalents under the same or similar concepts.
It should also be appreciated that all the modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalents to the elements of the various aspects described throughout the present disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims.
Claims (14)
- A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs), wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled, the method comprising: initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
- The computer implemented method of claim 1, further comprising: training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up.
- The computer implemented method of claim 1, the updating the weights of the PINNs further comprising: updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed, for a certain number of epochs.
- The computer implemented method of claim 1, wherein the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
- The computer implemented method of claim 4, wherein the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product z* = (∂J/∂w)·(∂²ε/∂w∂wᵀ)⁻¹ evaluated at w = w*, wherein J denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and w* denotes the updated weights of the PINNs in the last iteration.
- The computer implemented method of claim 5, wherein the inverse vector-Hessian product is computed by finding a root z* of a linear equation g(z) = zᵀ·(∂²ε/∂w∂wᵀ)|w=w* − (∂J/∂w)|w=w* = 0, further comprising iteratively approximating the root z* using a low rank Broyden's method.
- The computer implemented method of claim 6, wherein the iteratively approximating the root z* using the low rank Broyden's method further comprises, in each iteration: approximating the inversion of the Hessian matrix as B_i = −I + Σ_{j=1}^{k} u_j·v_jᵀ, wherein k is the rank of B_i; updating z, u and v according to z_{i+1} = z_i − α·B_i·g(z_i), u_{i+1} = (Δz_{i+1} − B_i·Δg_{i+1})/(Δz_{i+1}ᵀ·B_i·Δg_{i+1}) and v_{i+1} = B_i·Δz_{i+1}, wherein α is a step size; and updating the inversion of the Hessian matrix by B_{i+1} = B_i + u_{i+1}·v_{i+1}ᵀ.
- The computer implemented method of claim 7, the iteratively approximating the root z* using the low rank Broyden's method further comprising: setting a maximum of the rank of B_i based on a memory limit, wherein the maximum of the rank is less than a maximum number of iterations of the low rank Broyden's method.
- The computer implemented method of claim 7, the iteratively approximating the root z* using the low rank Broyden's method further comprising: setting a maximum number of iterations of the low rank Broyden's method; running the iterations of the low rank Broyden's method until the maximum number of iterations is reached or an error is within a threshold.
- A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with the method of one of claims 1-9, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprising: outputting the updated control variables denoting the best inlet flow distribution after convergence.
- A computer implemented method for solving a Partial Differential Equation (PDE) constrained shape optimization problem in a physical system with physics-informed neural networks (PINNs) with the method of one of claims 1-9, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, comprising: outputting the updated control variables denoting the best shape after convergence.
- A computer system, comprising: one or more processors; and one or more storage devices storing computer-executable instructions that, when executed, cause the one or more processors to perform the operations of the method of one of claims 1-9.
- One or more computer readable storage media storing computer-executable instructions that, when executed, cause one or more processors to perform the operations of the method of one of claims 1-9.
- A computer program product comprising computer-executable instructions that, when executed, cause one or more processors to perform the operations of the method of one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/111730 WO2024031525A1 (en) | 2022-08-11 | 2022-08-11 | Method and apparatus for bi-level physics-informed neural networks for pde constrained optimization |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024031525A1 true WO2024031525A1 (en) | 2024-02-15 |
Family
ID=89850337
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210365616A1 (en) * | 2020-05-20 | 2021-11-25 | Bull Sas | Simulation by substitution of a model by physical laws with an automatic learning model |
CN114118405A (en) * | 2021-10-26 | 2022-03-01 | 中国人民解放军军事科学院国防科技创新研究院 | Loss function self-adaptive balancing method of neural network embedded with physical knowledge |
CN114239698A (en) * | 2021-11-26 | 2022-03-25 | 中国空间技术研究院 | Data processing method, device and equipment |
CN114611678A (en) * | 2022-03-21 | 2022-06-10 | 上海壁仞智能科技有限公司 | Training method and device, data processing method, electronic device and storage medium |
Non-Patent Citations (4)
Title |
---|
BROYDEN C. G.: "A class of methods for solving nonlinear simultaneous equations", MATHEMATICS OF COMPUTATION, AMERICAN MATHEMATICAL SOCIETY, US, vol. 19, no. 92, 1 January 1965 (1965-01-01), US , pages 577 - 593, XP093137646, ISSN: 0025-5718, DOI: 10.1090/S0025-5718-1965-0198670-6 * |
LORRAINE JONATHAN, VICOL PAUL, DUVENAUD DAVID: "Optimizing Millions of Hyperparameters by Implicit Differentiation", ARXIV (CORNELL UNIVERSITY), CORNELL UNIVERSITY LIBRARY, ARXIV.ORG, ITHACA, 6 November 2019 (2019-11-06), Ithaca, XP093137637, [retrieved on 20240305], DOI: 10.48550/arxiv.1911.02590 * |
PSAROS APOSTOLOS F, KAWAGUCHI KENJI, KARNIADAKIS GEORGE EM: "Meta-learning PINN loss functions", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, ARXIV.ORG, ITHACA, 12 July 2021 (2021-07-12), Ithaca, XP093137634, [retrieved on 20240305], DOI: 10.48550/arxiv.2107.05544 * |
RODOMANOV ANTON, NESTEROV YURII: "Greedy Quasi-Newton Methods with Explicit Superlinear Convergence", SIAM JOURNAL ON OPTIMIZATION, THE SOCIETY, PHILADELPHIA, PA,, US, vol. 31, no. 1, 1 January 2021 (2021-01-01), US , pages 785 - 811, XP093137649, ISSN: 1052-6234, DOI: 10.1137/20M1320651 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22954492; Country of ref document: EP; Kind code of ref document: A1 |