WO2024031525A1 - Method and apparatus for bi-level physics-informed neural networks for pde constrained optimization - Google Patents

Method and apparatus for bi-level physics-informed neural networks for pde constrained optimization

Info

Publication number
WO2024031525A1
Authority
WO
WIPO (PCT)
Prior art keywords
pinns
pde
control variables
weights
updating
Prior art date
Application number
PCT/CN2022/111730
Other languages
French (fr)
Inventor
Jun Zhu
Zhongkai HAO
Chengyang YING
Hang SU
Jian Song
Ze CHENG
Original Assignee
Robert Bosch Gmbh
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Robert Bosch Gmbh, Tsinghua University filed Critical Robert Bosch Gmbh
Priority to PCT/CN2022/111730 priority Critical patent/WO2024031525A1/en
Publication of WO2024031525A1 publication Critical patent/WO2024031525A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/11Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G06F17/13Differential equations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • a flow control problem, such as the pipe flow problem described below, could be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to the inlet distribution f given the optimal solutions of the PDE constraints, i.e., as a bi-level optimization problem.
  • backstep flow is a classic example that might exhibit turbulent flow. In this problem, the aim is likewise to find the best inlet flow distribution f (y) , here to make the outlet flow symmetric and uniform.
  • the inlet is the left side of the area and the outlet is the right side of the area.
  • the velocity fields of the outlet are to be optimized
  • the target velocity field is given, and the inlet velocity f (y) is initialized as 8y (0.5-y) .
  • This problem could also be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
  • the shape of the obstacle is an ellipse parameterized by a parameter a ∈ [0.5, 2.5] .
  • the goal is to minimize the following objective.
  • This problem could also be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to the control variables given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.
  • the method begins at block 401, with initializing weights $w_0$ of the PINNs $u_w$ and the control variables $\theta_0$, wherein the solutions of the PDE constraints are parameterized by the weights w of the PINNs.
  • the PINNs are initialized with random parameters $w_0$
  • the control variables $\theta_0$ are initialized with a guess.
  • the method proceeds to block 402, with calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively.
  • the PDE losses ⁇ related to the PDE constraints can be calculated as Eq. (5) .
  • the PDE constraints could be the boundary and/or initial conditions of the physical systems.
  • the objective function related to the optimization target can be calculated with similar form as Eq. (1) .
  • block 403 is an optional block, in which the initialized PINNs are trained under the initialized control variables for a certain number of epochs as warm up.
  • the warm up would have considerable influence on the convergence speed but minor influence on the final performance; therefore, the number of epochs could be chosen based on the implementation and is not limited.
  • the method proceeds to block 404, with updating the control variables by a first learning rate with gradient descent of the objective function in one iteration, under the condition that the weights of the PINNs are fixed.
  • the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
  • the hypergradient of the objective function with respect to the control variables can be calculated as Eq. (8) by Theorem 1.
  • the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as $z^{*} = \frac{\partial \mathcal{J}}{\partial w}\big|_{w^*} \big( \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}\big|_{w^*} \big)^{-1}$, wherein $\mathcal{J}$ denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and $w^*$ denotes the updated weights of the PINNs in the last iteration.
  • the inverse vector-Hessian product is computed by finding a root $z^*$ of a linear equation $g_{w^*}(z) = z^\top \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}\big|_{w^*} - \frac{\partial \mathcal{J}}{\partial w}\big|_{w^*} = 0$, which further comprises iteratively approximating the root $z^*$ using a low rank Broyden's method.
  • iteratively approximating the root $z^*$ further comprises setting a maximum rank of $B_i$ based on a memory limit, wherein the maximum rank is less than a maximum number of iterations of the low rank Broyden's method.
  • iteratively approximating the root $z^*$ further comprises setting a maximum number of iterations of the low rank Broyden's method, and running the iterations of the low rank Broyden's method until the maximum number of iterations is reached or an error is within a threshold.
  • the method proceeds to block 405, with updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed.
  • the PINNs are trained with the updated control variables for a certain number of epochs.
  • the optimal weights of the PINNs are transferred back to block 404 for the next iteration. As can also be seen from Fig. 4, the operations of blocks 404 and 405 form an outer loop that optimizes the control variables, while the operation in block 405 itself is an inner loop optimization for finetuning the PINNs, as also illustrated and described with Fig. 3. Besides, in an aspect, the Broyden's method iterations take place within block 404.
  • the optimized control variables and PINNs are output at block 407.
  • Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure.
  • the computing system may comprise at least one processor 510.
  • the computing system may further comprise at least one storage device 520. It should be appreciated that the storage device 520 may store computer-executable instructions that, when executed, cause the processor 510 to perform any operations according to the embodiments of the present disclosure as described in connection with FIGs. 1-4.
  • the embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium.
  • the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) , wherein the optimization of state variables corresponding to solutions of the PDE constraints and of control variables corresponding to an optimization target is decoupled.
  • the method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration, under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
  • the embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium.
  • the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the described methods herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprising: outputting the updated control variables denoting the best inlet flow distribution after convergence.
  • the embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium.
  • the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization shape optimization problem in a physical system with physics-informed neural networks (PINNs) with at least one of the described methods herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, comprising: outputting the updated control variables denoting the best shape after convergence.
  • the non-transitory computer-readable medium may further comprise instructions that, when executed, cause one or more processors to perform any operations according to the embodiments of the present disclosure as described in connection with Figs. 1-4.
  • modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Feedback Control In General (AREA)

Abstract

A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of the PDE constraints and of control variables corresponding to an optimization target is decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration, under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.

Description

METHOD AND APPARATUS FOR BI-LEVEL PHYSICS-INFORMED NEURAL NETWORKS FOR PDE CONSTRAINED OPTIMIZATION

FIELD
Aspects of the present disclosure relate generally to artificial intelligence, and more particularly, to a method and an apparatus for bi-level physics-informed neural networks for Partial Differential Equation constrained optimization.
BACKGROUND
Partial Differential Equation (PDE) constrained optimization (PDECO) aims at optimizing the performance of a physical system constrained by PDEs with desired properties. It is an important task in many areas of science and engineering, with a wide range of applications including shape optimization problems such as the design of the shapes of aircraft wings in aerodynamics, parameter optimization of channels in heat transfer, parameter optimization in flow control problems, etc.
For solving PDECO problems, traditional numerical methods like adjoint methods based on finite element methods (FEMs) have been studied for decades. However, FEM solvers are usually expensive for high dimensional problems with a large search space or mesh size, because the computational cost of FEMs grows quadratically to cubically w.r.t. the mesh size.
To mitigate this problem, neural networks like DeepONet have recently been proposed as surrogate models of FEMs. DeepONet learns a mapping from control (decision) variables to solutions of PDEs and further replaces the PDE constraints with the operator network. However, these methods require pretraining a large operator network, which is non-trivial and inefficient. Moreover, their performance might deteriorate if the optimal solution is out of the training distribution.
Physics-informed Neural Networks (PINNs) solve learning tasks while respecting the properties of physical laws. This is achieved by informing a loss function about the mathematical equations that govern the physical system. The general procedure for solving a differential equation with PINNs involves finding the parameters of a network that minimize a loss function involving the mismatch of output and data, as well as residuals of the boundary and initial conditions, the PDE equations, and any other physical constraints required.
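As a concrete illustration of this loss construction, consider the following minimal sketch for a 1-D Poisson problem; the equation, network size and collocation sampling are illustrative assumptions, not taken from the patent:

```python
import torch

# PINN sketch for u''(x) = f(x) on [0, 1] with u(0) = u(1) = 0.
net = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))
f = lambda x: -(torch.pi ** 2) * torch.sin(torch.pi * x)  # assumed source term

x = torch.rand(128, 1, requires_grad=True)                # collocation points
u = net(x)
du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]    # u'(x)
d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]  # u''(x)

pde_residual = (d2u - f(x)).pow(2).mean()                 # PDE equation loss
xb = torch.tensor([[0.0], [1.0]])
bc_loss = net(xb).pow(2).mean()                           # boundary condition loss
loss = pde_residual + bc_loss    # minimized w.r.t. the network weights
```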
Thus, another line of neural methods proposes to use a single PINN to solve the PDECO problem instead of training a large operator network. It uses the method of Lagrangian multipliers to treat the PDE constraints as regularization terms, which allows optimizing the objective and the PDE loss simultaneously. However, such methods introduce a trade-off between the optimization targets and the regularization terms (i.e., the PDE losses) which is crucial for the performance. It is generally non-trivial to set proper weights for balancing these terms due to the lack of theoretical guidance.
Therefore, existing methods are insufficient to handle those PDE constraints that have a complicated or nonlinear dependency on optimization targets. It is imperative to develop an effective strategy for dealing with the PDE constraints for solving PDECO problems.
SUMMARY
The following presents a simplified summary of one or more aspects to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
Generally, a novel bi-level optimization framework named Bi-level Physics-informed Neural networks with Broyden's hypergradients (BPN) for solving PDECO problems is disclosed. The optimization of the targets and of the PDE constraints is decoupled, thereby naturally addressing the challenge of loss balancing in regularization-based methods. To solve the bi-level optimization problem, an iterative method is disclosed that optimizes PINNs with the PDE constraints in the inner loop, while optimizing the control variables for the objective function in the outer loop using hypergradients. Computing hypergradients in bi-level optimization for the control variables is challenging if the inner loop optimization is complicated. Therefore, a method for calculating the hypergradients based on implicit differentiation using Broyden's method is also disclosed.
According to an aspect of the disclosure, a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of the PDE constraints and of control variables corresponding to an optimization target is decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration, under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
According to a further aspect, the method comprises training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up.
According to a further aspect, the updating the weights of the PINNs further comprises updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed, for a certain number of epochs.
According to a further aspect, the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
According to a further aspect, the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as

$$z^{*} = \frac{\partial \mathcal{J}}{\partial w}\Bigg|_{w^*} \left( \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}\Bigg|_{w^*} \right)^{-1}$$

wherein $\mathcal{J}$ denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and $w^*$ denotes the updated weights of the PINNs in the last iteration.
According to a further aspect, the inverse vector-Hessian product is computed by finding a root $z^*$ of a linear equation

$$g_{w^*}(z) = z^\top \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}\Bigg|_{w^*} - \frac{\partial \mathcal{J}}{\partial w}\Bigg|_{w^*} = 0,$$

which further comprises iteratively approximating the root $z^*$ using a low rank Broyden's method.
According to a further aspect, the iteratively approximating the root $z^*$ using the low rank Broyden's method further comprises the following operations in each iteration: approximating the inverse of the Hessian matrix $\frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}$ as

$$B_i = -I + \sum_{k=1}^{i} u_k v_k^\top,$$

wherein k indexes the rank-one updates of $B_i$; updating z, u and v according to

$$z_{i+1} = z_i - \alpha \cdot B_i g(z_i), \qquad u_{i+1} = \frac{\Delta z_{i+1} - B_i \Delta g_{i+1}}{\Delta z_{i+1}^\top B_i \Delta g_{i+1}}, \qquad v_{i+1} = B_i^\top \Delta z_{i+1},$$

wherein α is a step size; and updating the inverse of the Hessian matrix by

$$B_{i+1} = B_i + u_{i+1} v_{i+1}^\top.$$
According to a further aspect, the iteratively approximating the root $z^*$ using the low rank Broyden's method further comprises setting a maximum rank of $B_i$ based on a memory limit, wherein the maximum rank is less than a maximum number of iterations of the low rank Broyden's method.
According to a further aspect, the iteratively approximating the root $z^*$ using the low rank Broyden's method further comprises setting a maximum number of iterations of the low rank Broyden's method, and running the iterations of the low rank Broyden's method until the maximum number of iterations is reached or an error is within a threshold.
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods disclosed herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprises outputting the updated control variables denoting the best inlet flow distribution after convergence.
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization shape optimization problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods disclosed herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, comprises outputting the updated control variables denoting the best shape after convergence.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosed aspects will be described in connection with the appended drawings that are provided to illustrate and not to limit the disclosed aspects.
Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.
Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure.
Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure.
DETAILED DESCRIPTION
The present disclosure will now be discussed with reference to several example implementations. It is to be understood that these implementations are discussed only for enabling those skilled in the art to better understand and thus implement the embodiments of the present disclosure, rather than suggesting any limitations on the scope of the present disclosure.
Various embodiments will be described in detail with reference to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. References made to examples and embodiments are for illustrative purposes, and are not intended to limit the scope of the disclosure.
Partial Differential Equation (PDE) constrained optimization (PDECO) aims at optimizing the performance of a physical system constrained by PDEs with desired properties. The objective or target of PDECO would typically be parameterized by a set of control variables, and the state variables of the physical system would correspond to the solutions of PDE constraints.
Surrogate modeling is an important class of methods for PDECO. Physics-informed neural networks (PINNs) are powerful and flexible surrogates to represent the solutions of PDEs. Some existing methods, such as hPINN, treat the PDE constraints as regularization terms and optimize the control variables and state variables simultaneously. Such methods use the penalty method and the Lagrangian method to adjust the weights of the multipliers. Some of them adopt the same formulation but use a line search to find the largest weight for which the PDE error stays within a certain range. The key limitation of these approaches is that the weights of the multipliers are determined heuristically, which might be sub-optimal and unstable.
Another class of methods trains an operator network mapping control variables to solutions of PDEs or to objective functions. Several approaches use mesh-based methods and predict the states on all mesh points from the control variables at the same time. As another example, PI-DeepONet adopts the architecture of DeepONet and trains the network using physics-informed losses, also called PDE losses.
However, all the existing methods produce unsatisfactory results if the optimal solution is out of the training distribution. To better illustrate the proposed approach, the PDECO problem is first formulated as below.
Let Y, U, V be three Banach spaces. The solution fields of the PDEs are called state variables, i.e., $y \in Y_{ad} \subseteq Y$, and the functions or variables that can be controlled are called control variables, i.e., $u \in U_{ad} \subseteq U$, where $Y_{ad}$ and $U_{ad}$ are admissible spaces. The PDECO problem can be formulated as:

$$\min_{y \in Y_{ad},\, u \in U_{ad}} \mathcal{J}(y, u) \quad \text{s.t.} \quad e(y, u) = 0, \qquad (1)$$

wherein $\mathcal{J}: Y \times U \to \mathbb{R}$ is the objective function and $e: Y \times U \to V$ are the PDE constraints. Usually, the PDE system e (y, u) = 0 contains multiple equations and boundary/initial conditions as:

$$\mathcal{F}(y, u) = 0, \qquad \mathcal{B}(y, u) = 0, \qquad (2)$$

wherein $\mathcal{F}$ is the differential operator representing the PDEs and $\mathcal{B}$ represents the boundary/initial conditions.
As an example, a flow control problem is aimed at finding the best inlet flow distribution that makes the outlet flow as uniform as possible. The state variables denote the state of the physical system, which may typically be flow velocity and pressure, etc., and the control variables may be parameters which parameterize the inlet flow distribution. As another example, for a shape optimization problem to find the best shape that minimizes the drag forces from the flow, the state variables may also be flow velocity and pressure, but the control variables are parameters of the structures to be optimized.
As mentioned above, existing methods based on regularization (e.g. the penalty method) solve the PDECO problem by minimizing the following objective:

$$\min_{w, \theta}\ \mathcal{L}(w, \theta) = \mathcal{J}(y_w, u_\theta) + \sum_i \lambda_i \| e_i(y_w, u_\theta) \|^2, \qquad (3)$$

wherein the solutions y and the control variables u are respectively parameterized by $y_w$ and $u_\theta$, with w being the weights of the PINNs, and $\lambda_i$ are hyper-parameters balancing these terms of the optimization targets. One main difficulty is that the $\lambda_i$ are hard to set and the results are sensitive to them, due to the complex nature of the regularization terms (PDE constraints) . In general, a large $\lambda_i$ makes it difficult to optimize the objective $\mathcal{J}$, while a small $\lambda_i$ can result in a nonphysical solution of $y_w$. Besides, the optimal λ may also vary with the different phases of training.
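For illustration, a penalty-method loss of the form of Eq. (3) can be sketched as follows; this is a minimal sketch in which the callables objective and pde_residuals are assumed to be user-defined, and the fixed lambdas list is exactly the hyper-parameter that is hard to tune:

```python
import torch

def penalty_loss(pinn, theta, objective, pde_residuals, lambdas):
    """Eq. (3): L(w, theta) = J(y_w, u_theta) + sum_i lambda_i * ||e_i||^2."""
    loss = objective(pinn, theta)
    for lam, e_i in zip(lambdas, pde_residuals):
        # Each e_i returns the residual of one PDE / boundary constraint.
        loss = loss + lam * e_i(pinn, theta).pow(2).mean()
    return loss  # state and control variables are then updated jointly
```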
Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.
At the beginning, an initial set of weights of the PINNs w and control variables θ are input to block 101. At block 101, an objective function which denotes the target to be optimized can be calculated as $\mathcal{J}(y, u)$, with its governing PDE constraints as $\mathcal{F}(y, u) = 0$ and $\mathcal{B}(y, u) = 0$, which represent the PDEs and the boundary/initial conditions of the physical system. As described above, the solutions of the PDEs can be represented by the state variables y, and the control variables of the objective function $\mathcal{J}$ can be represented by u, which are respectively parameterized by w and θ.

At block 102, the objective function $\mathcal{J}$ and the regularization terms corresponding to the PDE constraints are combined into $\mathcal{L}(w, \theta)$ with coefficients $\lambda_i$ as in Eq. (3) . Then the control variables and the state variables are optimized simultaneously by minimizing the combined objective function $\mathcal{L}(w, \theta)$. Finally, the optimized w and θ are output.
Fig. 1 is merely shown as an example, and other examples would be possible.
To resolve the above challenges of regularization-based methods, an approach that interprets PDECO as a bi-level optimization problem is disclosed herein, which can facilitate a new solver consequentially. Specifically, the following bi-level optimization problem is solved:

$$\min_{\theta}\ \mathcal{J}(w^*, \theta) \quad \text{s.t.} \quad w^* = \arg\min_w \varepsilon(w, \theta). \qquad (4)$$

In the outer loop, only $\mathcal{J}$ is minimized with respect to θ given the optimal value $w^*$, and then the PDE losses are optimized using PINNs in the inner loop with θ fixed. The objective ε of the inner loop sub-problem is the sum of the PDE residual and boundary/initial condition losses:

$$\varepsilon(w, \theta) = \sum_i \| e_i(y_w, u_\theta) \|^2. \qquad (5)$$

By transforming the problem in Eq. (1) into a bi-level optimization problem as above, the optimization of the PDEs' state variables and control variables can be decoupled, which relieves the headache of setting a proper hyper-parameter $\lambda_i$ in Eq. (3) .
To solve the proposed bi-level optimization problem in Eq. (4) , the inner loops and outer loops are designed to be executed iteratively. Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
An objective function which denotes the target to be optimized is $\mathcal{J}(y, u)$, with its governing PDE constraints as $\mathcal{F}(y, u) = 0$ and $\mathcal{B}(y, u) = 0$, which represent the PDEs and the boundary/initial conditions of the physical system. As described above, the solutions of the PDEs can be represented by the state variables y, and the control variables of the objective function $\mathcal{J}$ can be represented by u, which are respectively parameterized by w and θ.
As shown in Fig. 2, an initial set of weights of the PINNs w and control variables θ are input to block 201. At block 201, $\mathcal{J}$ is calculated, and θ is optimized in the upper level by minimizing $\mathcal{J}$ with respect to θ given a fixed value of w in one iteration. Then the optimized θ is transferred to block 202 for optimizing w.
At block 202, the PINNs are trained in the lower level using the PDE losses ε only, as in Eq. (5) . After that, the optimized w is transferred back to block 201 for optimizing θ in a next iteration. Then, after a certain number of iterations and/or upon convergence, the optimized w and θ are output. By decoupling the optimization of the state variables and the control variables, the shortcomings of the regularization-based methods described above can be overcome.
Fig. 2 is merely shown as an example, and other examples would be possible.
Bi-level optimization is widely used in various machine learning tasks, e.g., neural architecture search, meta learning and hyperparameter optimization. One of the key challenges is to compute hypergradients with respect to the inner loop optimization. Some previous methods use unrolled optimization or truncated unrolled optimization, which differentiates through the optimization process. However, this is not scalable if the inner loop optimization is a complicated process. Some other methods compute the hypergradient based on the implicit function theorem. This requires computing an inverse Hessian-vector product (inverse-HVP) . It has been proposed to use the Neumann series to approximate hypergradients in some previous approaches. Some works also use the conjugate gradient method. The approximation used for implicit differentiation is crucial for the accuracy of the hypergradient computation.
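For context, the truncated Neumann-series approximation of the inverse-HVP mentioned above can be sketched as follows; this is a minimal sketch in which hvp is an assumed user-supplied Hessian-vector product callable, and the series only converges when the step eta satisfies ||I - eta*H|| < 1:

```python
import torch

def neumann_inverse_hvp(hvp, v, eta=0.01, m=50):
    # Approximates H^{-1} v via H^{-1} = eta * sum_{j>=0} (I - eta*H)^j,
    # truncated after m terms; coarser than Broyden's method but matrix-free.
    p = v.clone()   # current term (I - eta*H)^j v
    z = v.clone()   # running sum of the series
    for _ in range(m):
        p = p - eta * hvp(p)
        z = z + p
    return eta * z
```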
To optimize the control variables in the outer loop, or in the upper level as discussed with Eq. (4) and Fig. 2, the gradient of $\mathcal{J}$ with respect to the parameters of the control variables θ is calculated, which is called the hypergradient. The hypergradient is computed based on Implicit Function Theorem Differentiation, along which line a highly complex inverse Hessian-Jacobian product needs to be calculated. To address this issue, Broyden's method, which provides an efficient approximation at a superlinear convergence speed, is proposed to be used, as discussed in detail below.
The upper objective $\mathcal{J}$ depends on the optimal $w^*$ of the lower level optimization as discussed with Eq. (4) and Fig. 2, i.e.,

$$\frac{d \mathcal{J}(w^*(\theta), \theta)}{d \theta} = \frac{\partial \mathcal{J}}{\partial \theta} + \frac{\partial \mathcal{J}}{\partial w^*} \frac{\partial w^*(\theta)}{\partial \theta}. \qquad (6)$$

Thus, the Jacobian of $w^*$ with respect to θ needs to be considered when calculating the hypergradients. Since $w^*$ minimizes the lower level problem, $\frac{\partial w^*(\theta)}{\partial \theta}$ can be derived by applying the Cauchy Implicit Function Theorem as follows.

Theorem 1: if for some (w′, θ′) the lower level optimization is solved, i.e., $\frac{\partial \varepsilon}{\partial w}\big|_{(w', \theta')} = 0$, and the Hessian $\frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}\big|_{(w', \theta')}$ is invertible, then there exists a function $w^* = w^*(\theta)$ surrounding (w′, θ′) s.t. $\frac{\partial \varepsilon}{\partial w}\big|_{(w^*(\theta), \theta)} = 0$, and we have:

$$\frac{\partial w^*(\theta)}{\partial \theta} = -\left( \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top} \right)^{-1} \frac{\partial^2 \varepsilon}{\partial w\, \partial \theta} \Bigg|_{(w^*(\theta), \theta)}. \qquad (7)$$

By Theorem 1, the hypergradients can be calculated analytically as:

$$\frac{d \mathcal{J}}{d \theta} = \frac{\partial \mathcal{J}}{\partial \theta} - \frac{\partial \mathcal{J}}{\partial w^*} \left( \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top} \right)^{-1} \frac{\partial^2 \varepsilon}{\partial w\, \partial \theta} \Bigg|_{(w^*(\theta), \theta)}. \qquad (8)$$
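A toy bilevel problem whose inner solution is available in closed form can be used to sanity-check Eq. (8) against direct differentiation; this is a minimal sketch in which the quadratic losses are illustrative assumptions chosen so that every quantity in Eq. (8) is known exactly:

```python
import torch

# eps(w, theta) = 0.5 ||w - A theta||^2  =>  w*(theta) = A theta
# J(w, theta)   = 0.5 ||w||^2 + 0.5 ||theta||^2
A = torch.randn(4, 3)
theta = torch.randn(3)
w_star = A @ theta

dJ_dtheta = theta                # partial J / partial theta
dJ_dw = w_star                   # partial J / partial w at w*
H = torch.eye(4)                 # second derivative of eps in w
M = -A                           # mixed second derivative of eps in w, theta

# Eq. (8): dJ/dtheta - (dJ/dw*) H^{-1} (d^2 eps / dw dtheta)
hyper = dJ_dtheta - dJ_dw @ torch.linalg.solve(H, M)

# Direct gradient of J(w*(theta), theta) = 0.5||A theta||^2 + 0.5||theta||^2
direct = A.T @ (A @ theta) + theta
assert torch.allclose(hyper, direct, atol=1e-5)
```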
However, computing the inverse of the Hessian matrix $\frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}$ is intractable for the parameters of neural networks. To handle this challenge, one can first compute

$$z^* = \frac{\partial \mathcal{J}}{\partial w^*} \left( \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top} \right)^{-1},$$

which is also called the inverse vector-Hessian product. As mentioned before, some previous works use the Neumann series to approximate the inverse of the Hessian matrix. However, this approach is a coarse and imprecise estimation of the hypergradients.
Therefore, a more efficient and effective approach to compute the inverse vector-Hessian product, which enjoys a superlinear convergence speed, is disclosed herein. Computing $z^*$ amounts to finding the root of the following linear equation:

$$g_{w^*}(z) = z^\top \frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}\Bigg|_{w^*} - \frac{\partial \mathcal{J}}{\partial w}\Bigg|_{w^*} = 0. \qquad (9)$$
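In an automatic differentiation framework, $g_{w^*}(z)$ can be evaluated without ever materializing the Hessian, by applying a vector-Jacobian product to the gradient of ε. The following minimal sketch uses a toy quadratic loss standing in for the PINN loss ε; the 5-dimensional setup is an illustrative assumption:

```python
import torch

w = torch.randn(5, requires_grad=True)
A = torch.randn(5, 5)
eps = 0.5 * w @ (A @ A.T + torch.eye(5)) @ w   # stands in for eps(w, theta)
v = torch.randn(5)                             # stands in for dJ/dw at w*

grad_eps = torch.autograd.grad(eps, w, create_graph=True)[0]

def g(z):
    # z^T H computed as (H z)^T via one more backward pass (H is symmetric);
    # the Hessian itself is never formed.
    hz = torch.autograd.grad(grad_eps, w, grad_outputs=z, retain_graph=True)[0]
    return hz - v
```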
It is noted that each evaluation of $g_{w^*}(z)$ only needs two Jacobian-vector products, and does not need to create an instance of the Hessian matrix as in prior arts, resulting in a low computational cost. Specifically, a low rank Broyden's method is used to iteratively approximate the solution $z^*$. In each iteration, first the inverse of $\frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}$ is approximated as:

$$B_i = -I + \sum_{k=1}^{i} u_k v_k^\top. \qquad (10)$$
Then z, u and v are updated according to the following rules:

$$z_{i+1} = z_i - \alpha \cdot B_i g(z_i), \qquad (11)$$

$$u_{i+1} = \frac{\Delta z_{i+1} - B_i \Delta g_{i+1}}{\Delta z_{i+1}^\top B_i \Delta g_{i+1}}, \qquad (12)$$

$$v_{i+1} = B_i^\top \Delta z_{i+1}, \qquad (13)$$

wherein $\Delta z_{i+1} = z_{i+1} - z_i$, $\Delta g_{i+1} = g(z_{i+1}) - g(z_i)$, and α is a step size. In one example, α could be set to 1, and other examples are allowed.

In summary, the inverse of the Hessian is updated by:

$$B_{i+1} = B_i + u_{i+1} v_{i+1}^\top. \qquad (14)$$

After m iterations, $z_m$ is used as the approximation of $z^*$.
Based on Eq. (10) , it can be seen that $B_i$ is expressed via the vectors $u_k$ and $v_k$, meaning that instead of storing the whole high-dimensional matrix $B_i$, it is only needed to record $u_k$ and $v_k$ for k = 1…K, each a vector of the same dimension as w, where K is a tunable parameter depending on the memory limit. The sum $\sum_k u_k v_k^\top$ thus yields a matrix $B_i$ with a rank of at most K, which is far less than the dimension of the Hessian $\frac{\partial^2 \varepsilon}{\partial w\, \partial w^\top}$, so that a low computational cost is achieved.
The Broyden iterations are run until the maximum number of iterations is reached or the error is within a threshold. It is noted that the maximum memory limit K should be less than the maximum number of iterations in order to approximate the inverse of the Hessian with a rank-K matrix. When the Broyden iterations are run more than K times, only the K latest $u_k$ and $v_k$ are preserved due to the memory limit.
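Putting Eqs. (9) - (14) together, a limited-memory Broyden root finder can be sketched as follows; this is a minimal sketch under the update rules above, in which g is assumed to be the residual defined in Eq. (9) and $B_i$ is applied matrix-free through its rank-one factors:

```python
import torch

def broyden_solve(g, z0, max_iter=32, K=16, alpha=1.0, tol=1e-6):
    us, vs = [], []                        # rank-one factors: B = -I + sum u v^T

    def B(x):                              # B @ x, never materializing B
        y = -x
        for u, v in zip(us, vs):
            y = y + u * (v @ x)
        return y

    def BT(x):                             # B^T @ x = -x + sum v (u @ x)
        y = -x
        for u, v in zip(us, vs):
            y = y + v * (u @ x)
        return y

    z, gz = z0, g(z0)
    for _ in range(max_iter):
        if gz.norm() < tol:                # error within threshold: stop early
            break
        z_new = z - alpha * B(gz)          # Eq. (11)
        g_new = g(z_new)
        dz, dg = z_new - z, g_new - gz
        Bdg = B(dg)
        u = (dz - Bdg) / (dz @ Bdg)        # Eq. (12)
        v = BT(dz)                         # Eq. (13)
        us, vs = (us + [u])[-K:], (vs + [v])[-K:]  # keep the K latest factors
        z, gz = z_new, g_new
    return z
```

Used together with the g (z) sketch above, the returned z approximates the inverse vector-Hessian product $z^*$ needed for the hypergradient in Eq. (8) .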
The influence of different numbers of iterations and memory limits of Broyden's method is investigated through experiments on the Heat 2d problem and the NS backstep problem. The results for the Heat 2d equation are shown in Table 1. It can be seen that, roughly, the performance improves (the objective function decreases) with an increase of the maximum memory limit and of the maximum number of iterations. In this problem, a rank of 16 or more for approximating the inverse Hessian is able to handle the problem, and there is no significant gain if more than 32 iterations are used.

[Table 1 (not reproduced) : Ablation study on the Heat2d problem over the maximum number of Broyden iterations and the memory limit. Lower score means better performance.]
The results for the NS backstep problem are shown in Table 2. The trend is similar to the Heat 2d problem, and roughly the performance improves with an increase of the number of Broyden iterations. However, it can further be seen that this problem is much more difficult, and the performance deteriorates if only 8 memory steps are used, even when many iterations are used.

[Table 2 (not reproduced) : Ablation study on the NS backstep problem. Lower score means better performance.]
Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.
Similar to Fig. 2, an initial set of weights of the PINNs $w_0$ and control variables $\theta_0$ can be input to block 301. As an alternative, the initial PINNs may be trained for a certain number of epochs $N_w$ under the control variables $\theta_0$ as warm up, and then the pre-trained PINNs and their weights can be input to block 301.
At block 301, an objective function $\mathcal{J}(y, u)$ of a similar form to Eq. (1) is calculated, with its governing PDE constraints as $\mathcal{F}(y, u) = 0$ and $\mathcal{B}(y, u) = 0$, which represent the PDEs and the boundary/initial conditions of the physical system. As described above, the solutions of the PDE constraints can be represented by the state variables y, and the control variables of the objective function $\mathcal{J}$ can be represented by u, which are respectively parameterized by w and θ.
In order to optimize θ by gradient descent in the outer loop with fixed w, a gradient of $\mathcal{J}$ with respect to θ is to be calculated. As discussed above and shown in block 301, the hypergradient with respect to θ is calculated using IFT Differentiation, and the inverse vector-Hessian product therein is calculated based on Broyden's method as in Eqs. (9) and (14) , which uses a low rank approximation for acceleration. As shown in block 301 with solid blocks, a high-dimensional matrix is replaced with a low rank vector product.
After θ is optimized in the iteration, the optimized θ is transferred to block 302 for finetuning the PINNs using the PDE losses (the regularization terms in Eq. (3) ) only, i.e., the inner objective ε of Eq. (5) . As shown in block 302, the PINNs are trained with gradient descent of the PDE losses under the condition that the optimized θ is fixed. In an example, the PINNs can be trained for a certain number of epochs $N_f$ to obtain an optimal $w^*$.
After w is optimized in the iteration, the optimized $w^*$ is transferred back to block 301 for optimizing θ in the next iteration. These two steps are executed iteratively until convergence.
The influence of the number of warm-up epochs N w and of different fine-tuning epochs N f for the inner loop of the PINNs is also investigated through experiments. The Heat 2d problem is chosen, and the results are shown in Table 3 and Table 4. The tables show that these hyperparameters have considerable influence on the convergence speed, but their impact on the final performance is minor within a broad range. Thus BPN is robust to the choice of these two parameters on the Heat 2d problem, although a more efficient result can be achieved by choosing moderate N w and/or N f.
Table 3 – Ablation study on the influence of the finetuning epochs of the PINNs (table reproduced as an image in the original publication).
Table 4 – Ablation study on the influence of the warm-up epochs of the PINNs (table reproduced as an image in the original publication).
Based on the preceding description and examples, the disclosed method as a whole is named Bi-level Physics-informed Neural networks with Broyden’s hypergradients (BPN); however, each of the aspects can be implemented on its own or combined with others, with its corresponding effects. The pseudo code of BPN is outlined in Algorithm 1.
[Algorithm 1 – pseudo code of BPN, reproduced as an image in the original publication.]
The disclosed BPN first introduces the idea of solving a transposed linear system, Eq. (9), to reduce the computational cost. Moreover, the disclosed bi-level optimization is a more general framework than the constrained optimization of existing approaches: it allows more flexible design choices, such as replacing FEM solvers with PINNs. Additionally, Eq. (9) solves a system in the parameter space of neural networks, whereas some traditional approaches use equations corresponding to a real field defined on an admissible space or on meshes.
Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure. As described below, some or all illustrated features may be omitted in a particular implementation within the scope of the present disclosure, and some illustrated features may not be required for implementation of all embodiments. Further, some of the blocks may be performed in parallel or in a different order. In some examples, the method may be carried out by any suitable apparatus or means for carrying out the functions or algorithm described below.
The PDECO problem aims at optimizing the performance of a physical system constrained by PDEs with desired properties. This covers a variety of problems in science and engineering, to name a few: flow control, shape optimization, drag minimization, etc. Shape optimization and drag minimization aim at designing structures (the shapes, sizes, and distribution of given materials), for example an airfoil, with high performance for systems characterized by PDEs. The method described with Fig. 4, combined with the other aspects herein, can be applied to any PDECO problem with appropriate PDE constraints and is not limited to the examples below.
For example, the Navier-Stokes (NS) equations are among the most important equations in fluid mechanics, aerodynamics and applied mathematics, and they are notoriously difficult to solve due to their high non-linearity. In this problem, the NS equations are solved in a pipe, and the aim is to find the best inlet flow distribution f (y) that makes the outlet flow as uniform as possible. The flow velocity field is u = (u, v) and the pressure field is p; they are defined on a rectangular domain Ω, for example Ω = [0, 1.5] × [0, 1.0]. Two inlets, two outlets and several walls can be defined for this domain, such as:
[Definitions of the two inlets, two outlets and wall segments of Ω, reproduced as an image in the original publication.]
The whole problem is then as follows, as a concrete expression of Eq. (1) and (2):
min over f:  J (u) = ∫ over the outlets of ‖u − u_d‖² ds
s.t.  (u·∇) u + ∇p − (1/Re) Δu = 0  and  ∇·u = 0  in Ω,
with f (y) prescribed on the controlled inlet, fixed profiles on the remaining inlet and outlet, and no-slip conditions (u = 0) on the walls.
The target function is a parabolic profile u_d (y) = 4y (1-y) , and the velocity field on the second inlet and outlet is v 2 (x) = 18 (x-0.5) (1-x) . The Reynolds number is set to 100 in this problem, and f (y) is initialized to the same parabolic form as the target function, f (y) = 4y (1-y) . This problem can be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as the bi-level optimization problem described below.
As another example, backstep flow is a classic flow that may exhibit turbulence. In this problem, the aim is again to find the best inlet flow distribution f (y) that makes the outlet flow symmetric and uniform. The geometry of the backstep can be viewed as a union of two rectangles, such as Ω = [0, 1] × [0, 0.5] ∪ [1, 2] × [0, 1] . The inlet is the left side of the area and the outlet is the right side of the area,
Γ_in = { (x, y) : x = 0, 0 ≤ y ≤ 0.5 } ,  Γ_out = { (x, y) : x = 2, 0 ≤ y ≤ 1 } .
The velocity field at the outlet is to be optimized toward a target profile, with the objective
J (u) = ∫ over Γ_out of ‖u − u_d‖² ds.
The target velocity field u_d is a fixed profile on the outlet (its defining equation is reproduced as an image in the original publication), and the inlet velocity f (y) is initialized as 8y (0.5-y) . This problem can likewise be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as the bi-level optimization problem described below.
Another class of problem is shape optimization, for example drag minimization over an obstacle governed by the NS equations: a shape optimization task that seeks the shape of the obstacle minimizing the drag forces from the flow. The inlet is the left side of the area and the outlet is the right side of the area, shown as:
Γ_in = { (x, y) : x = 0, 0 ≤ y ≤ 8 } ,  Γ_out = { (x, y) : x = 8, 0 ≤ y ≤ 8 } .
The flow field is defined on a domain Ω = [0, 8] ² \ Ω° , where Ω° is the obstacle. The shape of the obstacle is an ellipse parameterized by a parameter a ∈ [0.5, 2.5] . The goal is to minimize the following objective:
[Drag-minimization objective, reproduced as an image in the original publication.]
This problem can also be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to the shape parameter a given the optimal solutions of the PDE constraints, as the bi-level optimization problem described below.
The disclosed method, which is generally applicable to solving PDECO problems, is now illustrated with Fig. 4. The method begins at block 401 with initializing the weights w 0 of the PINNs u w and the control variables θ 0 , wherein the solutions of the PDE constraints are parameterized by the weights w of the PINNs. As an example, the PINNs are initialized with random parameters w 0, and the control variables θ 0 are initialized with a guess.
The method proceeds to block 402, with calculating the PDE losses related to the PDE constraints and an objective function related to the optimization target, respectively. As an example, the PDE losses ε related to the PDE constraints can be calculated as in Eq. (5). Referring to the several PDECO problems mentioned above, the PDE constraints may include the boundary and/or initial conditions of the physical systems. Also as an example, the objective function J related to the optimization target can be calculated in a form similar to Eq. (1).
The method proceeds to block 403, with training the initialized PINNs under the initialized control variables for a certain number of epochs as a warm up; as shown with the dashed line, block 403 is optional. The warm up has considerable influence on the convergence speed but minor influence on the final performance, so the number of epochs can be chosen per implementation and is not limited.
The method proceeds to block 404, with updating the control variables by a first learning rate with gradient descent of the objective function in one iteration, while the weights of the PINNs are held fixed.
In an example, the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem (IFT) differentiation. For example, the hypergradient of the objective function with respect to the control variables can be calculated as in Eq. (8) by Theorem 1.
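In the notation used here (J the objective, ε the PDE losses, w * the converged PINN weights), the standard IFT hypergradient takes the following form, a reconstruction consistent with the description of Theorem 1 above; the exact form of Eq. (8) should be checked against the original publication:

```latex
\frac{\mathrm{d}\mathcal{J}}{\mathrm{d}\theta}
  = \frac{\partial \mathcal{J}}{\partial \theta}
  - \left.\frac{\partial \mathcal{J}}{\partial w}\right|_{w^{*}}
    \left[\left.\frac{\partial^{2}\varepsilon}{\partial w\,\partial w^{\top}}\right|_{w^{*}}\right]^{-1}
    \left.\frac{\partial^{2}\varepsilon}{\partial w\,\partial \theta^{\top}}\right|_{w^{*}}
```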
As a further example, the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as
z* = (∂J/∂w)|_{w*} [∂²ε/(∂w ∂wᵀ)]⁻¹|_{w*},
wherein J denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and w * denotes the updated weights of the PINNs in the last iteration.
As a further example, the inverse vector-Hessian product is computed by finding the root z* of the linear equation
zᵀ (∂²ε/(∂w ∂wᵀ))|_{w*} = (∂J/∂w)|_{w*},
which further comprises iteratively approximating the root z* using a low rank Broyden’s method. Each iteration of the low rank Broyden’s method comprises: approximating the inverse of the Hessian matrix [∂²ε/(∂w ∂wᵀ)]⁻¹ as
B_i = −I + Σ_{k=1..i} u_k v_kᵀ,
wherein k indexes the stored rank-one updates, whose number bounds the rank of the correction to B_i; updating z, u and v according to
z_{i+1} = z_i − α·B_i g_i (z),
u_{i+1} = (Δz_{i+1} − B_i Δg_{i+1}) / (Δz_{i+1}ᵀ B_i Δg_{i+1}),
v_{i+1} = B_iᵀ Δz_{i+1},
wherein α is a step size and g_i (z) denotes the residual of the linear equation above; and updating the inverse of the Hessian matrix by
B_{i+1} = B_i + u_{i+1} v_{i+1}ᵀ.
As a further example, iteratively approximating the root z* further comprises setting a maximum rank of B_i based on a memory limit, wherein the maximum rank is less than the maximum number of iterations of the low rank Broyden’s method.
As a further example, iteratively approximating the root z* further comprises setting a maximum number of iterations of the low rank Broyden’s method, and running the iterations until the maximum number of iterations is reached or the error falls within a threshold.
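Putting the steps above together, a matrix-free sketch of the memory-limited Broyden solver might look as follows; it reuses the bookkeeping idea from the earlier snippet. Here `hvp` is an assumed callable computing Hessian-vector products of the PDE loss (e.g., via double backward), and the defaults are illustrative, not prescribed by the disclosure.

```python
import numpy as np

def broyden_ivhp(hvp, b, n, K=32, max_iter=64, tol=1e-6, alpha=1.0):
    """Find z with H z = b given only Hessian-vector products v -> H v.

    Sketch of the memory-limited Broyden iteration described above:
    B_i = -I + sum_k u_k v_k^T approximates H^{-1}, and at most the K
    latest (u_k, v_k) pairs are kept, so the correction has rank <= K.
    """
    us, vs = [], []

    def B(g):                               # apply B_i to a vector
        out = -g.copy()
        for u, v in zip(us, vs):
            out += u * (v @ g)
        return out

    def BT(g):                              # apply B_i^T to a vector
        out = -g.copy()
        for u, v in zip(us, vs):
            out += v * (u @ g)
        return out

    z = np.zeros(n)
    g = hvp(z) - b                          # residual of the linear system
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:         # stop once within the threshold
            break
        dz = -alpha * B(g)                  # z_{i+1} = z_i - alpha B_i g_i(z)
        z = z + dz
        g_new = hvp(z) - b
        dg = g_new - g
        Bdg = B(dg)
        denom = dz @ Bdg
        if abs(denom) > 1e-12:              # good-Broyden rank-one update
            us.append((dz - Bdg) / denom)   # u_{i+1}
            vs.append(BT(dz))               # v_{i+1} = B_i^T dz_{i+1}
            if len(us) > K:                 # memory limit: K latest pairs only
                us.pop(0)
                vs.pop(0)
        g = g_new
    return z
```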
After the control variables are updated, the method proceeds to block 405, with updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration, while the updated control variables are held fixed. As an example, the PINNs are trained with the updated control variables for a certain number of epochs.
If the PINNs are optimized but convergence is not determined at block 406, the optimal weights of the PINNs are transferred back to block 404 for the next iteration. As can also be seen from Fig. 4, the operations of blocks 404 and 405 form the outer loop optimizing the control variables, and the operation in block 405 is itself the inner-loop optimization finetuning the PINNs, as also illustrated and described with Fig. 3. In one aspect, the Broyden’s method iterations occur within block 404.
Once the PINNs are optimized and convergence is determined at block 406, the optimized control variables and PINNs are output at block 407.
Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure. The computing system may comprise at least one processor 510. The computing system may further comprise at least one storage device 520. It should be appreciated that the storage device 520 may store computer-executable instructions that, when executed, cause the processor 510 to perform any operations according to the embodiments of the present disclosure as described in connection with FIGs. 1-4.
The embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs), wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of the PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target, respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
The embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods described herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best inlet flow distribution, parameterized by the control variables, that makes an outlet flow uniform, the method comprising: outputting the updated control variables denoting the best inlet flow distribution after convergence.
The embodiments of the present disclosure may be embodied in a computer-readable medium such as a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained shape optimization problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods described herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object, parameterized by the control variables, that minimizes drag forces or pressure from the flow, the method comprising: outputting the updated control variables denoting the best shape after convergence.
The non-transitory computer-readable medium may further comprise instructions that, when executed, cause one or more processors to perform any operations according to the embodiments of the present disclosure as described in connection with Figs. 1-4.
It should be appreciated that all the operations in the methods described above are merely exemplary, and the present disclosure is not limited to any operations in the methods or sequence orders of these operations, and should cover all other equivalents under the same or similar concepts.
It should also be appreciated that all the modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein  may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalents to the elements of the various aspects described throughout the present disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims.

Claims (14)

  1. A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs), wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled, the method comprising:
    initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs;
    calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively;
    updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition that the weights of the PINNs are fixed;
    updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition that the updated control variables are fixed; and
    updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
  2. The computer implemented method of claim 1, further comprising:
    training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up.
  3. The computer implemented method of claim 1, wherein the updating the weights of the PINNs further comprises:
    updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration, under the condition that the updated control variables are fixed, for a certain number of epochs.
  4. The computer implemented method of claim 1, wherein the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
  5. The computer implemented method of claim 4, wherein the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as
    z* = (∂J/∂w)|_{w*} [∂²ε/(∂w ∂wᵀ)]⁻¹|_{w*},
    wherein J denotes the objective function, ε denotes the PDE losses, w denotes the weights of the PINNs, and w * denotes the updated weights of the PINNs in the last iteration.
  6. The computer implemented method of claim 5, wherein the inverse vector-Hessian product is computed by finding a root z* of the linear equation
    zᵀ (∂²ε/(∂w ∂wᵀ))|_{w*} = (∂J/∂w)|_{w*},
    further comprising:
    iteratively approximating the root z* using a low rank Broyden’s method.
  7. The computer implemented method of claim 6, wherein the iteratively approximating the root z* using the low rank Broyden’s method further comprises, in each iteration:
    approximating the inverse of the Hessian matrix [∂²ε/(∂w ∂wᵀ)]⁻¹ as
    B_i = −I + Σ_{k=1..i} u_k v_kᵀ,
    wherein k indexes the stored rank-one updates, whose number bounds the rank of the correction to B_i;
    updating z, u and v according to
    z_{i+1} = z_i − α·B_i g_i (z),
    u_{i+1} = (Δz_{i+1} − B_i Δg_{i+1}) / (Δz_{i+1}ᵀ B_i Δg_{i+1}),
    and v_{i+1} = B_iᵀ Δz_{i+1}, wherein α is a step size; and
    updating the inverse of the Hessian matrix [∂²ε/(∂w ∂wᵀ)]⁻¹ by
    B_{i+1} = B_i + u_{i+1} v_{i+1}ᵀ.
  8. The computer implemented method of claim 7, the iteratively approximating the root z * using the low rank Broyden’s method further comprising:
    setting a maximum of the rank of B i based on a memory limit, wherein the maximum of the rank is less than a maximum number of iterations of the low rank Broyden’s method.
  9. The computer implemented method of claim 7, the iteratively approximating the root z * using the low rank Broyden’s method further comprising:
    setting a maximum number of iterations of the low rank Broyden’s method;
    running the iterations of the low rank Broyden’s method until the maximum number of iterations is reached or an error is within a threshold.
  10. A computer implemented method for solving a Partial Differential Equation (PDE) constrained flow control problem in a physical system with physics-informed neural networks (PINNs) with the method of one of claims 1-9, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution, parameterized by the control variables, that makes an outlet flow uniform, the method comprising:
    outputting the updated control variables denoting the best inlet flow distribution after convergence.
  11. A computer implemented method for solving a Partial Differential Equation (PDE) constrained shape optimization problem in a physical system with physics-informed neural networks (PINNs) with the method of one of claims 1-9, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object, parameterized by the control variables, that minimizes drag forces or pressure from the flow, the method comprising:
    outputting the updated control variables denoting the best shape after convergence.
  12. A computer system, comprising:
    one or more processors; and
    one or more storage devices storing computer-executable instructions that, when executed, cause the one or more processors to perform the operations of the method of one of claims 1-9.
  13. One or more computer readable storage media storing computer-executable instructions that, when executed, cause one or more processors to perform the operations of the method of one of claims 1-9.
  14. A computer program product comprising computer-executable instructions that, when executed, cause one or more processors to perform the operations of the method of one of claims 1-9.