WO2024031525A1

WO2024031525A1 - Method and apparatus for bi-level physics-informed neural networks for pde constrained optimization

Info

Publication number: WO2024031525A1
Application number: PCT/CN2022/111730
Authority: WO
Inventors: Jun Zhu; Zhongkai HAO; Chengyang YING; Hang SU; Jian Song; Ze CHENG
Original assignee: Robert Bosch Gmbh; Tsinghua University
Priority date: 2022-08-11
Filing date: 2022-08-11
Publication date: 2024-02-15

Abstract

A computer implemented method for solving Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition of the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.

Description

METHOD AND APPARATUS FOR BI-LEVEL PHYSICS-INFORMED NEURAL NETWORKS FOR PDE CONSTRAINED OPTIMIZATION

FIELD

Aspects of the present disclosure relate generally to artificial intelligence, and more particularly, to a method and an apparatus provided for bi-level physics-informed neural networks for Partial Differential Equation constrained optimization.

BACKGROUND

Partial Differential Equation (PDE) constrained optimization (PDECO) aims at optimizing the performance of a physical system constrained by PDEs with desired properties. It is an important task in many areas of science and engineering, with a wide range of applications including shape optimization problem such as the design on shapes of aircraft wings in aerodynamics, parameters optimization of channels in heat transfer, parameters optimization of flow control problem, etc.

For solving PDECO problems, traditional numerical methods like adjoint methods based on finite element methods (FEMs) have been studied for decades. However, FEM solvers are usually expensive for high dimensional problems with a large search space or mesh size, because the computational cost of FEMs grows quadratically to cubically w.r. t mesh sizes.

To mitigate this problem, neural networks like DeepONet have been proposed as surrogate models of FEMs recently. DeepONet learns a mapping from control (decision) variables to solutions of PDEs and further replaces PDE constraints with the operator network. But these methods require pretraining a large operator network which is non-trivial and inefficient. Moreover, its performance might deteriorate if the optimal solution is out of training distribution.

Physics-informed Neural Networks (PINNs) solve learning tasks while respecting the properties of physical laws. This is achieved by informing a loss function about the mathematical equations that govern the physical system. The general procedure for solving a differential equation with a PINNs involves finding the parameters of a network that minimize a loss function involving the mismatch of output and data, as well as residuals of the boundary and initial conditions, PDE equations, and any other physical constraints required.

Thus, another scope of neural methods is proposed to use a single PINNs to solve the PDECO problem instead of training a large operator network. It uses the method of Lagrangian multipliers to treat the PDE constraints as regularization terms, which can optimize the objective and PDE loss simultaneously. However, such methods introduce a trade-off between optimization targets and regularization terms (i.e., PDE losses) which is crucial for the performance. It is generally non-trivial to set proper weights for balancing these terms due to the lack of theoretical guidance.

Therefore, existing methods are insufficient to handle those PDE constraints that have a complicated or nonlinear dependency on optimization targets. It is imperative to develop an effective strategy for dealing with the PDE constraints for solving PDECO problems.

SUMMARY

The following presents a simplified summary of one or more aspects to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.

Generally, a novel bi-level optimization framework named Bi-level Physics-informed Neural networks with Broyden’s hypergradients (BPN) for solving PDECO problems are disclosed. The optimization of the targets and PDE constraints are decoupled, thereby naturally addressing the challenge of loss balancing in regularization-based methods. To solve the bi-level optimization problem, an iterative method that optimizes PINNs with PDE constraints in the inner loop while optimizes the control variables for objective functions in the outer using hypergradients is disclosed. Computing hypergradients in bi-level optimization for control variables is challenging if the inner loop optimization is complicated. Therefore, a method for calculating the hypergradients based on implicit differentiation using Broyden’s method is also disclosed.

According to an aspect of the disclosure, a computer implemented method for solving Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) is disclosed, wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition of the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.

According to a further aspect, the method comprises training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up.

According to a further aspect, the updating the weights of the PINNs further comprises updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed for a certain number of epochs.

According to a further aspect, the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.

According to a further aspect, the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as

wherein

denotes the objective function, ε denotes the PDE losses and w denotes the weights of the PINNs, w ^* denotes the updated weights of the PINNs in the last iteration.

According to a further aspect, the inverse vector-Hessian product is computed by finding a root z ^* of a linear equation

which further comprises iteratively approximating the root z ^*using a low rank Broyden’s method.

According to a further aspect, the iteratively approximating the root z ^*using the low rank Broyden’s method further comprises following operations in each iteration: approximating the inversion of Hessian matrix

as

wherein k is rank of B _i; updating z, u and v according to z _i+1=z _i-α·B _ig _i (z) ,

and v _i+1=B _iΔz _i+1, α is a step size; updating the inversion of the Hessian matrix

by

According to a further aspect, the iteratively approximating the root z ^*using the low rank Broyden’s method further comprises setting a maximum of the rank of B _i based on a memory limit, wherein the maximum of the rank is less than a maximum number of iterations of the low rank Broyden’s method.

According to a further aspect, the iteratively approximating the root z ^*using the low rank Broyden’s method further comprises setting a maximum number of iterations of the low rank Broyden’s method; running the iterations of the low rank Broyden’s method until the maximum number of iterations is reached or an error is within a threshold.

A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods disclosed herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprises outputting the updated control variables denoting the best inlet flow distribution after convergence.

A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization shape optimization problem in a physical system with physics-informed neural networks (PINNs) with at least one of the methods disclosed herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, comprises outputting the updated control variables denoting the best shape after convergence.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosed aspects will be described in connection with the appended drawings that are provided to illustrate and not to limit the disclosed aspects.

Fig. 1 illustrates an exemplary existing framework of a PDECO problem, in accordance with various aspects of the present disclosure.

Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.

Fig. 3 illustrates details of an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.

Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure.

Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure.

DETAILED DESCRIPTION

The present disclosure will now be discussed with reference to several example implementations. It is to be understood that these implementations are discussed only for enabling those skilled in the art to better understand and thus implement the embodiments of the present disclosure, rather than suggesting any limitations on the scope of the present disclosure.

Various embodiments will be described in detail with reference to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. References made to examples and embodiments are for illustrative purposes, and are not intended to limit the scope of the disclosure.

Partial Differential Equation (PDE) constrained optimization (PDECO) aims at optimizing the performance of a physical system constrained by PDEs with desired properties. The objective or target of PDECO would typically be parameterized by a set of control variables, and the state variables of the physical system would correspond to the solutions of PDE constraints.

Surrogate modeling is an important class of methods for PDECO. Physics-informed neural networks (PINNs) are powerful and flexible surrogates to represent the solutions of PDEs. Some existing methods, such as hPINN, treat PDE constraints as regularization terms and optimizes the control variables and state variables simultaneously. Such methods use the penalty method and the Lagrangian method to adjust the weights of multipliers. Some of them adopt the same formulation but use a line search to find the largest weight when the PDE error is within a certain range. The key limitation of these approaches is that the weights of multipliers are determined heuristically which might be sub-optimal and unstable.

Another class of methods train an operator network from control variables to solutions of PDEs or objective functions. Several approaches use mesh-based methods and predict states on all mesh points from control variables at the same time. As another example, PI-DeepONet adopts the architecture of DeepONet and trains the network using physics-informed losses, which is also called PDE losses.

However, all the existing methods produce unsatisfactory results if the optimal solution is out of the distribution. To better illustrate our proposed approach, a PDECO problem is firstly formulated as below.

Let Y, U, V be three Banach spaces. The solution fields of PDEs are called state variables, i.e.,

and functions or variables can be controlled are control variables, i.e.,

where Y _ad and U _ad are admissible spaces. The PDECO can be formulated as:

wherein

is the objective function and e: Y×U→V are PDE constraints. Usually, the PDE system of e (y, u) =0 contains multiple equations and boundary/initial conditions as:

wherein

is the differential operator representing PDEs and

represents boundary/initial conditions.

As an example, for a flow control problem, which is aimed at finding a best inlet flow distribution to make the outlet flow as uniform as possible. The state variables denote the state of the physical system, which may typically be flow velocity and pressure, etc., and the control variables may be parameters which can parameterize the flow distribution. As another example, for a shape optimization problem to find a best shape that minimized the drag forces from the flow, the state variables may also be flow velocity and pressure, but the control variables are parameters of the structures to be optimized.

As mentioned above, existing methods based on regularization (e.g. the penalty method) solve the PDECO problem by minimizing the following objective:

wherein the solutions y and control variables u are respectively parameterized by

and

with w being the weights of PINNs.

are hyper-parameters balancing theses terms of the optimization targets. One main difficulty is that λ _i are hard to set and are sensitive to the results due to the complex nature of regularization terms (PDE constraints) . In general, large λ _i makes it difficult to optimize the objective

while small λ _i can result in a nonphysical solution of y _w. Besides, the optimal λ may also vary with the different phases of training.

At the beginning, an initial set of weights of PINNs w and control variables θ are input to block 101. At block 101, an objective function which denotes the target to be optimized can be calculated as

with its governing PDE constraints as

and

which represent the boundary/initial conditions of the physical system. As described above, the solutions of PDEs can be represented by state variables y , and control variables of the objective function

can be represented by u, which are respectively parameterized by w and θ.

At block 102, the objective function

and the regularization terms corresponding to PDE constraints are combined into

with efficient λ _i as in Eq. (3) . Then the control variables and state variables are optimized simultaneously by minimizing the combined objective function

Then the optimized w and θ are output.

Fig. 1 is merely shown as an example, and other examples would be possible.

To resolve the above challenges of regularization-based methods, an approach that interprets PDECO as a bi-level optimization problem is disclosed herein, which can facilitate a new solver consequentially. Specifically, the following bi-level optimization problem is solved:

In the outer loop, only

is minimized with respect to θ given the optimal value of w ^*, and then PDE losses are optimized using PINNs in the inner loop with the fixed θ. The objective ε of the inner loop sub-problem is:

By transforming the problem in Eq. (1) into a bi-level optimization problem as above, the optimization of PDEs’ state variables and control variables can be decoupled, which relieves the headache of setting a proper hyper-parameter of λ _i in Eq. (3) .

To solve the proposed bi-level optimization problem in Eq. (4) , the inner loops and outer loops are designed to be executed iteratively. Fig. 2 illustrates an exemplary framework for solving PDECO problems, in accordance with various aspects of the present disclosure.

An objective function which denotes the target to be optimized is

with its governing PDE constraints as

and

which represent the boundary/initial conditions of the physical system. As described above, the solutions of PDEs can be represented by state variables y, and control variables of the objective function

can be represented by u, which are respectively parameterized by w and θ.

As shown in Fig. 2, an initial set of weights of PINNs w and control variables θ are input to block 201. At block 201,

is calculated and θ is optimized in the upper level with minimizing

with respect to θ given a fixed value of w in one iteration. Then the optimized θ would be transferred to block 202 for optimizing w.

At block 202, the PINNs are trained in the lower level using PDE losses ε only as Eq. (5) . After that, the optimized w would be transferred back to block 201 for optimizing θ in a next iteration. Then after a certain times of iteration and/or convergence, the optimized w and θ are output. By decoupling the optimization of state variables and control variables, the shortcomings of regularization-based methods described above could be overcome.

Fig. 2 is merely shown as an example, and other examples would be possible.

Bi-level optimization is widely used in various machine learning tasks, e.g., neural architecture search, meta learning and hyperparameters optimization. One of the key challenges is to compute hypergradients with respect to the inner loop optimization. Some previous methods use unrolled optimization or truncated unrolled optimization which is to differentiate the optimization process. However, this is not scalable if the inner loop optimization is a complicated process. Some other methods compute the hypergradient based on implicit function theorem. This requires computing of the inverse hessian-vector product (inverse-HVP) . It is proposed to use neumann series to approximate hypergradients in some previous approaches. Some works also use the conjugated gradient method. The approximation for implicit differentiation is crucial for the accuracy of hypergradients computation.

To optimize control variables in the outer loop or in the upper level as discussed with Eq. (4) and Fig. 2, the gradient of

with respect to parameters of control variables θis calculated, which is called hypergradient. The hypergradient is computed based on Implicit Function Theorem Differentiation, along which line a highly complex inverse Hessian-Jacobian product needs to be calculated. To address this issue, Broyden’s method which provides an efficient approximation at a superlinear convergence speed is proposed to be used, which would be discussed in detail below.

The upper objective

depends on the optimal w ^*of the lower level optimization as discussed with Eq. (4) and Fig. 2, i.e.,

Thus, it is needed to consider the Jacobian of w ^*with respect to θ when calculating the hypergradients. Since w ^*minimizes the lower level problem,

can be derived by applying Cauchy Implicit Function Theorem as,

Theorem 1: if for some (w′, θ′) , the lower level optimization is solved, i.e.

and

is invertible, then there exists a function w ^*=w ^* (θ) surrounding (w′, θ′) s. t.

and we have:

By Theorem 1, the hypergradients could be calculated analytically as:

However, computing the inverse of Hessian matrix

is intractable for parameters of neural networks. To handle this challenge, we can first compute

which is also called the inverse vector-Hessian product. Like mentioned before, some previous works use the Neumann series to approximate the inverse of the Hessian matrix. However, this approach is a coarse and imprecise estimation of the hypergradients.

Therefore, a more efficient and effective approach to compute the inverse vector-Hessian product which enjoys superlinear convergence speed is disclosed herein. Computing z ^*equals to find the root for the following linear equation as:

It is noted that for each evaluation of g _w (z) , it is only needed to compute two Jacobian-vector products, which does not need to create an instance of Hessian matrix as prior arts, resulting in a low computational cost. Specifically, a low rank Broyden’s method is used to iteratively approximated the solution z ^*. In each iteration, firstly the inverse of

is approximated as:

Then z , u and v are updated according to the following rules:

z _i+1=z _i-α·B _ig _i (z) (11)

v _i+1=B _iΔz _i+1 (13)

Wherein Δz _i+1=z _i+1-z _i, Δg _i+1=g _i+1-g _i, and α is a step size. In one example, α could be set to 1, and other examples are allowed.

In summary, the inversion of the Hessian could be updated by:

After m iterations, z _m is used as the approximation of z ^*.

Based on Eq. (10) , it can be seen that B _i is expressed via column vector u _k and v _k, meaning that instead of storing a whole high-dimensional matrix of

it only needs to record u _k and v _k, both of which are vectors having a dimension of k, where k=1…K and K is a tunable parameter depending on the memory limit.

would thus result in a matrix B _i having a rank of k, which would be far less than the dimension of

in order to achieve a low computational cost.

The Broyden iterations are run until the maximum iteration is reached or the error is within a threshold. It is noted that the maximum memory limit should be less than the maximum number of iterations in order to approximate the inversion of the Hessian to a K-rank matrix. When the Broyden iterations are run more than K times, only the K latest u _k and v _k are preserved due to the memory limit.

The influence of different iterations and memory limit of Broyden’s method is investigated through experiments on Heat 2d problem and NS backstep problem. The results of Heat 2d equation are shown in Table 1. It could be seen that roughly the performance improves (the objective function decreases) with the increase of maximum memory limit and maximum iterations. In this problem, the rank for approximating the inverse Hessian more than 16 is able to handle this problem. There is no significant gain if more than 32 iterations are used.

Table 1-Ablation study on Heat2d problem. Lower score means better performance.

The results of NS backstep problem are shown in Table 2. The trend is similar to Heat 2d problem and roughly the performance improves with the increase of Broyden iterations. However, it can further be seen that this problem is much more difficult and the performance deteriorates if only 8 memory steps are used even though many iterations are used.

Table 2-Ablation study on NS backstep problem. Lower score means better performance.

Similar to Fig. 2, an initial set of weights of PINNs w ₀ and control variables θ ₀ can be input to block 301. As an alternative, the initial PINNs may be trained a certain number of epochs N _w under the control variables θ ₀ as warm up, and then the pre-trained PINNs and its weights can be input to block 301.

At block 301, an objective function

with similar formation as Eq. (1) is calculated, with its governing PDE constraints as

and

which represent the boundary/initial conditions of the physical system. As described above, the solutions of PDE constraints can be represented by state variables y, and control variables of the objective function

can be represented by u, which are respectively parameterized by w and θ.

In order to optimize θ by gradient descent in the outer loop with fixed w, a gradient of

with respect to θ is to be calculated. As discussed above and shown in block 301, the hypergradients of θ is calculated using IFT Differentiation, and further the inverse vector-Hessian product is calculated based on Broyden’s method as in Eq. (9) and (14) which uses low rank approximation for acceleration. As shown in block 301 with solid blocks, a high-dimensional matrix is replaced with low rank vector matrix product.

After θ is optimized in the iteration, the optimized θ is transferred to block 302 for finetuning PINNs using PDE losses as in Eq. (3) only. As shown in block 302, PINNs are trained with gradient descent of the PDE losses under the condition of the optimized θ fixed. In an example, the PINNs can be trained for a certain number of epochs N _f to obtain an optimal w ^*.

After w is optimized in the iteration, the optimized w ^*is transferred back to block 301 for optimizing θ in the next iteration. These two steps are iteratively executed until convergence.

The influence of warm up epochs N _w and different fine-tuning epochs N _f for the inner loop of PINNs are also investigated through experiments. Heat 2d problem are chosen and the results are shown in Table 3 and Table 4. It can be seen from the tables that these hyperparameters have considerable influence on the convergence speed. However, their impact on the final performance is minor within a broad range. Thus, BPN is robust to the choice of these two parameters on Heat 2d problem, but a more efficient and effective result could be achieved via choosing moderate N _w and/or N _f.

Table 3 –Ablation study on the influence of finetuning epochs of PINNs

Table 4 –Ablation study on the influence of warmup epochs of PINNs

Based on the description and examples before, the disclosed method as a whole is named Bi-level Physics-informed Neural networks with Broyden’s hypergradients (BPN) , however one or more of the aspects can be implemented solely or combined with its corresponding effects. The pseudo code of BPN is outlined in Algorithm 1.

The disclosed BPN firstly possesses an idea of solving a transposed linear system in Eq. (9) to reduce the computational cost. Moreover, the disclosed bi-level optimization is a more general framework compared with constrained optimization in existing approaches. It allows more flexible design choices like replacing FEM solvers with PINNs. Additionally, the Eq. (9) solves a system in the parameter space of neural networks but some traditional approaches use equations correspond to a real field defined on admissible space or meshes.

Fig. 4 illustrates an exemplary flow chart for solving PDECO problems, in accordance with various aspects of the present disclosure. As described below, some or all illustrated features may be omitted in a particular implementation within the scope of the present disclosure, and some illustrated features may not be required for implementation of all embodiments. Further, some of the blocks may be performed parallel or in a different order. In some examples, the method may be carried out by any suitable apparatus or means for carrying out the functions or algorithm described below.

The PDECO problem aims at optimizing the performance of a physical system constrained by PDEs with desired properties, which could be a variety of problems in science and engineering, to name a few, flow control problem, shape optimization problem, drag minimization, etc. The shape optimization problem or drag minimization is aimed at designing structures (the shapes, sizes, and distribution of given materials) , for example of an airfoil, with high performance for systems characterized by PDEs. The method described with Fig. 4 in combined with other aspects herein can be applied to all the PDECO problems with appropriate PDE constraints, not limited by examples below.

For example, Naiver-Strokes equations are one of the most important equations in fluid mechanics, aerodynamics and applied mathematics which are notoriously difficult to solve due to the high non-linearity. In this problem, NS equations are solved in a pipeline and it aims to find the best inlet flow distribution f (y) to make the outlet flow as uniform as possible. The flow velocity field is u= (u, v) and the pressure field is p and they are defined on a rectangle domain Ω, for example Ω= [0, 1.5] × [0, 1.0] . Two inlets and two outlets and several walls could be defined for this domain, such as:

The whole problem would be as follows, which could be concrete expression of Eq. (1) and (2) :

The target function is a parabolic function

and the velocity field on the second inlet and outlet isv ₂ (x) =18 (x-0.5) (1-x) . The Reynold number can be set to 100 in this problem, and f (y) can be initialized the same with the target function as f (y) =4y (1-y) . This problem could be solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.

As another example, backstep flow is a classic example that might exhibits turbulent flow. In this problem, it also aimed to find the best inlet flow distribution f (y) to make the outlet flow symmetric and uniform. The geometry of backstep could be viewed as a union of two rectangles, such as Ω= [0, 1] × [0, 0.5] ∪ [1, 2] × [0, 1] . The inlet is the left side of the area and the outlet is the right side of the area,

The velocity fields of the outlet are to be optimized,

The target velocity field is

and the inlet velocity f (y) is initialized as 8y (0.5-y) . This problem could be also solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.

For other class of problem of shape optimization, for example, drag minimization over an obstacle of NS equations, which is a shape optimization task that is to find the best shape of the obstacle that minimizes the drag forces from the flow. The inlet is the left side of the area and the outlet is the right side of the area, shown as:

The flow field is defined on a domain Ω = [0, 8] ²\Ω° and Ω° is the obstacle. The shape of the obstacle is a ellipse parameterized by a parameter a ∈ [0.5, 2.5] . The goal is to minimize the following objective.

This problem could be also solved with the disclosed method by solving the PDE constraints with PINNs and minimizing J with respect to f given the optimal solutions of the PDE constraints, as a bi-level optimization problem to be described below.

Now the disclosed method which is generally applicable for solving PDECO problems is illustrated with Fig. 4. The method begins at block 401, with initializing weights w ₀ of the PINNs u _w and the control variables θ ₀ , wherein the solutions of PDE constraints are parameterized by the weights w of the PINNs. As an example, the PINNs is initialized with random parameters w ₀, and the control variables θ ₀ are initialized with a guess.

The method proceeds to block 402, with calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively. As an example, the PDE losses ε related to the PDE constraints can be calculated as Eq. (5) . Referring to several PDECO problems mentioned above, the PDE constraints could be the boundary and/or initial conditions of the physical systems. Also, as an example, the objective function

related to the optimization target can be calculated with similar form as Eq. (1) .

The method proceeds to block 403, with training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up, as shown with dash line, block 403 is an optional block. The warm up would have considerable influence on the convergence speed but minor influence on the final performance, therefore the number of epochs could be chosen based on implementation and not limited.

The method proceeds to block 404, with updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition of the weights of the PINNs are fixed.

In an example, the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation. For example, the hypergradient of the objective function with respect to the control variables can be calculated as Eq. (8) by Theorem 1.

As a further example, the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as

wherein

As a further example, the inverse vector-Hessian product is computed by finding a root z ^* of a linear equation

further comprises iteratively approximating the root z ^* using a low rank Broyden’s method. To iteratively approximating the root z ^*, in each iteration of the low rank Broyden’s method, further comprises approximating the inversion of Hessian matrix

as

wherein k is rank of B _i; updating z , u and v according to z _i+1=z _i-α·B _ig _i (z) ,

by

As a further example, to iteratively approximating the root z ^*, further comprises setting a maximum of the rank of B _i based on a memory limit, wherein the maximum of the rank is less than a maximum number of iterations of the low rank Broyden’s method.

As a further example, to iteratively approximating the root z ^*, further comprises setting a maximum number of iterations of the low rank Broyden’s method; running the iterations of the low rank Broyden’s method until the maximum number of iterations is reached or an error is within a threshold.

After the control variables are updated, the method proceeds to block 405, with updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed. As an example, the PINNs are trained with the updated control variables for a certain number of epochs.

After the PINNs are optimized and not converged determined at block 406, the optimal weights of the PINNs are transferred back to block 404 for next iteration. It could also be seen from Fig. 4, the operation of

block

404 and 405 form an outer loop of optimization of control variables, and the operation in block 405 itself is an inner loop optimization for finetuning PINNs, which are also illustrated and described with Fig. 3. Besides, the Broyden’s method iterations exist in block 404 in an aspect.

After the PINNs are optimized and determined to be converged at block 406, the optimized control variables and PINNs are output at block 407.

Fig. 5 illustrates an exemplary computing system, in accordance with various aspects of the present disclosure. The computing system may comprise at least one processor 510. The computing system may further comprise at least one storage device 520. It should be appreciated that the storage device 520 may store computer-executable instructions that, when executed, cause the processor 510 to perform any operations according to the embodiments of the present disclosure as described in connection with FIGs. 1-4.

The embodiments of the present disclosure may be embodied in a computer-readable medium such as non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) , wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled. The method comprises initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs; calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively; updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition of the weights of the PINNs are fixed; updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed; and updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.

The embodiments of the present disclosure may be embodied in a computer-readable medium such as non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with at least one of the described methods herein, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprising: outputting the updated control variables denoting the best inlet flow distribution after convergence.

The embodiments of the present disclosure may be embodied in a computer-readable medium such as non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform a

The non-transitory computer-readable medium may further comprise instructions that, when executed, cause one or more processors to perform any operations according to the embodiments of the present disclosure as described in connection with Figs. 1-4.

It should be appreciated that all the operations in the methods described above are merely exemplary, and the present disclosure is not limited to any operations in the methods or sequence orders of these operations, and should cover all other equivalents under the same or similar concepts.

It should also be appreciated that all the modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.

The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalents to the elements of the various aspects described throughout the present disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims.

Claims

A computer implemented method for solving Partial Differential Equation (PDE) constrained optimization problem with physics-informed neural networks (PINNs) , wherein the optimization of state variables corresponding to solutions of PDE constraints and control variables corresponding to an optimization target are decoupled, the method comprising:

initializing weights of the PINNs and the control variables, wherein the solutions of PDE constraints are parameterized by the weights of the PINNs;

calculating PDE losses related to the PDE constraints and an objective function related to the optimization target respectively;

updating the control variables by a first learning rate with gradient descent of the objective function in one iteration under the condition of the weights of the PINNs are fixed;

updating the weights of the PINNs by a second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed; and

updating the control variables and the weights of the PINNs iteratively until convergence, wherein the updated weights of the PINNs in a last iteration are used for updating the control variables in a next iteration.
The computer implemented method of claim 1, further comprising:

training the initialized PINNs under the initialized control variables for a certain number of epochs as warm up.
The computer implemented method of claim 1, the updating the weights of the PINNs further comprising:

updating the weights of the PINNs by the second learning rate with gradient descent of the PDE losses in the same iteration under the condition of the updated control variables are fixed for a certain number of epochs.
The computer implemented method of claim 1, wherein the gradient descent of the objective function is calculated as a hypergradient of the objective function with respect to the control variables based on Implicit Function Theorem Differentiation.
The computer implemented method of claim 4, wherein the hypergradient of the objective function with respect to the control variables is calculated based at least on computing an inverse vector-Hessian product as
wherein
denotes the objective function, ε denotes the PDE losses and w denotes the weights of the PINNs, w ^* denotes the updated weights of the PINNs in the last iteration.
The computer implemented method of claim 5, wherein the inverse vector-Hessian product is computed by finding a root z ^* of a linear equation

further comprising:

iteratively approximating the root z ^* using a low rank Broyden’s method.
The computer implemented method of claim 6, wherein the iteratively approximating the root z ^* using the low rank Broyden’s method further comprising in each iteration:

approximating the inversion of Hessian matrix
as

wherein k is rank of B _i;

updating z , u and v according to z _i+1=z _i-α·B _ig _i (z) ,

and v _i+1=B _iΔz _i+1, α is a step size;

updating the inversion of the Hessian matrix
by
The computer implemented method of claim 7, the iteratively approximating the root z ^* using the low rank Broyden’s method further comprising:

setting a maximum of the rank of B _i based on a memory limit, wherein the maximum of the rank is less than a maximum number of iterations of the low rank Broyden’s method.
The computer implemented method of claim 7, the iteratively approximating the root z ^* using the low rank Broyden’s method further comprising:

setting a maximum number of iterations of the low rank Broyden’s method;

running the iterations of the low rank Broyden’s method until the maximum number of iterations is reached or an error is within a threshold.
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization flow control problem in a physical system with physics-informed neural networks (PINNs) with method of one of claims 1-9, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein an optimization target is to find a best inlet flow distribution which is parameterized by the control variables to make an outlet flow uniform, comprising:

outputting the updated control variables denoting the best inlet flow distribution after convergence.
A computer implemented method for solving a Partial Differential Equation (PDE) constrained optimization shape optimization problem in a physical system with physics-informed neural networks (PINNs) with method of one of claims 1-9, wherein the physical system is characterized by the PDE constraints, the state variables of the PDE constraints include at least one of flow velocity or pressure, and wherein the optimization target is to find a best shape of an object which is parameterized by the control variables that minimizes drag forces or pressure from the flow, comprising:

outputting the updated control variables denoting the best shape after convergence.
A computer system, comprising:

one or more processors; and

one or more storage devices storing computer-executable instructions that, when executed, cause the one or more processors to perform the operations of the method of one of claims 1-9.
One or more computer readable storage media storing computer-executable instructions that, when executed, cause one or more processors to perform the operations of the method of one of claims 1-9.
A computer program product comprising computer-executable instructions that, when executed, cause one or more processors to perform the operations of the method of one of claims 1-9.