CN112784205A - Partial differential equation data processing method, system, storage medium, device and application - Google Patents
- Publication number
- CN112784205A (application CN202110131994.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/13—Differential equations
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Operations Research (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Algebra (AREA)
- Complex Calculations (AREA)
Abstract
The invention belongs to the technical field of learning, from data, the partial differential equations implied by a system, and discloses a partial differential equation data processing method, system, storage medium, device and application. An attention mechanism is used to sample the spatio-temporal data; a candidate library of differential terms is constructed using prior knowledge and a basis-function representation; a deep network composed of multiple single-layer regression networks learns the coefficient of each differential term in the candidate library; regularization is added to the loss function, and interference terms are pruned by sparse regression to obtain the final partial differential equation. The method makes full use of the spatial and temporal information of the observed data, reduces the amount of data the model requires, and improves the accuracy of the learned partial differential equation while preserving efficiency. It addresses the shortcomings of the traditional approach of learning a system's implicit partial differential equation through manual experience and experimental verification: low efficiency, susceptibility to human subjective factors, and difficulty in uncovering the mechanism behind a complex system.
Description
Technical Field
The invention belongs to the technical field of learning, from data, the partial differential equations implied by a system, and particularly relates to a partial differential equation data processing method, system, storage medium, device and application.
Background
At present, differential equations, in particular partial differential equations, play an important role in many disciplines and can be used to describe the physical laws behind a given system. Traditionally, partial differential equations have been derived mathematically or physically from basic principles, from the Schrödinger equation of quantum mechanics to molecular dynamics models, and from the Boltzmann equation to the Navier-Stokes equations. However, the mechanisms behind many complex systems in modern applications (such as many problems in multiphase flow, neuroscience, finance and bioscience) are generally not well understood, and the partial differential equations of these systems are usually taken from empirical formulas. With the rapid development of sensors, computing power and data storage technologies over the last decade, large amounts of data can now easily be collected, stored and processed. Such data provide new opportunities for discovering (possibly new) physical laws. Therefore, building a model that learns partial differential equations from data, in order to approximate observed complex dynamical data, is of great value for analyzing and understanding the underlying mechanisms of complex systems in modern applications.
Early attempts to summarize the mechanism implicit in a system often relied on human experience, or on proposing a hypothesis and then verifying it experimentally. This approach is strongly influenced by human subjectivity and is inefficient, and it is difficult to obtain the implicit partial differential equation of a complex or newly emerging system by means of empirical formulas. As computing power and data storage advanced, solutions that summarize a system's implicit partial differential equation with computer technology therefore gradually emerged. Josh Bongard and Michael Schmidt made preliminary attempts, in 2007 and 2009 respectively, at learning a system's implicit partial differential equations under data driving; their main idea was to compare numerical derivatives of the input data with analytical derivatives of candidate functions, and to determine the nonlinear dynamical system using symbolic regression and evolutionary algorithms. In 2017, Emmanuel de Bézenac applied the sparse identification of nonlinear dynamics method (SINDy) to model sea surface temperature with a partial differential equation; the main idea of SINDy is first to construct a sufficiently large candidate library of partial differential equation terms and then to learn the implicit partial differential equation from the sea surface temperature data in order to predict it. In 2018, Maziar Raissi proposed a model that learns unknown parameters on the premise that the nonlinear response-function form of the partial differential equation is known; his main idea was to introduce a regularization between two consecutive time steps through a Gaussian process. In general, methods that learn partial differential equations from data focus on representing the observed data with a relatively simple model and obtaining an analytical form of the partial differential equations.
The existing methods for learning, from data, the partial differential equation implied by a system have several problems and drawbacks. (1) The traditional approach of learning a system's implicit partial differential equation through manual experience and experimental verification is inefficient, is easily influenced by human subjective factors, and has difficulty uncovering the mechanism behind a complex system. (2) Existing data-driven methods for learning the implicit partial differential equation are heavily restricted: the terms of the partial differential equation that can be learned are limited, and the learning effect is easily degraded by noise. (3) Existing deep-learning methods for learning the implicit partial differential equation need a large amount of sample data, while for some systems it is difficult to acquire enough data for training; moreover, the hardware cost of deep learning is high and its operating efficiency is low, which limits the use of these methods.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) The traditional approach of learning a system's implicit partial differential equation through manual experience and experimental verification is inefficient, is easily influenced by human subjective factors, and has difficulty uncovering the mechanism behind a complex system, so the partial differential equation finally obtained has an excessive error and an unstable effect.
(2) Existing data-driven methods for learning the implicit partial differential equation are heavily restricted: the terms of the partial differential equation that can be learned are limited, so the learned equation does not conform to the analytic form of the true equation; and the learning effect is easily degraded by noise, making the error of the final partial differential equation too large.
(3) Existing deep-learning methods for learning the implicit partial differential equation need a large amount of sample data, while for some systems it is difficult to acquire enough data for training; moreover, the hardware cost of deep learning is high and its operating efficiency is low, which limits the use of these methods.
The difficulty in solving the above problems and defects lies in how to learn the partial differential equation implied by a system from a small amount of data, and how to improve learning efficiency and accuracy.
The significance of solving the above problems and defects is as follows: exploring the partial differential equation implied by a system through a data-driven method can save a large amount of labor cost, help people understand the operating rules of complex systems, and advance the field of learning partial differential equations from data.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a partial differential equation data processing method, a system, a storage medium, equipment and application.
The invention is realized in such a way that a partial differential equation data processing method comprises the following steps:
sampling time space data by using an attention mechanism to improve learning precision and initial value robustness;
constructing a differential term alternative library by using prior knowledge and a basis function representation method to obtain a reasonable and complete alternative library;
forming corresponding coefficients of differential terms in a deep network learning alternative library by a plurality of single-layer regression networks to finish primary partial differential equation learning;
regularization is added into the loss function, and interference terms are reduced through sparse regression to obtain a final partial differential equation.
The method uses an attention mechanism to sample the spatio-temporal data. Because some systems governed by partial differential equations (such as image denoising and image segmentation tasks based on partial differential equations) care more about the final value of the partial differential equation, introducing an attention mechanism during time sampling improves accuracy and initial-value robustness, making the method more effective in tasks concerned with the final-value result. A candidate library of differential terms is constructed using prior knowledge and a basis-function representation, so that the library is complete and reasonable and the amount of computation is reduced. A deep network composed of multiple single-layer regression networks then learns the coefficient corresponding to each candidate term, completing the preliminary partial differential equation learning. Finally, a regularization term is added to the loss function so that, by sparse regression, the network becomes a partial differential equation learner and irrelevant interference terms are pruned.
Further, the data sampling of the partial differential equation data processing method specifically includes: first, spatial sampling of the input time-series data, which uses random sampling to select part of the points in the input space as observation objects; then temporal sampling, which uses uniform sampling to take new sample values at regular intervals at the observation points chosen by the spatial sampling. An attention mechanism is introduced during the temporal sampling: the weight of the later data is increased by controlling the time sampling rate.
further, the network parameters of the partial differential equation data processing method during data sampling are set as follows: boundary width: 5; spatial sampling rate: 2 percent; total time sampling rate: 25 percent; number of time-sampled equal segments: 5; attention mechanism correction: the sampling rate of each section is increased by 5%; segment time sampling rate: 15%, 20%, 25%, 30%, 35%.
Further, the partial differential equation data processing method constructs a candidate library of differential terms. An equation containing partial derivatives (or partial differentials) of an unknown function is called a partial differential equation; its general form is expressed as

u_t(t, x, y) = F(x, y, u, u_x, u_y, u_xx, u_xy, u_yy, ...), (x, y) ∈ R², t ∈ [0, T],

where t represents time and x and y represent space; the left-hand side u_t is the time derivative of the partial differential equation, and x, y, u_x, u_y, u_xx, u_xy, u_yy, ... are the candidate differential terms. The partial differential equation is expressed in the form

ω_t = Θ(ω, u, v)ξ,

where Θ is a matrix whose columns are the candidate differential terms built from ω, u and v, and ξ is the vector of coefficients of those terms; that is, a candidate library of differential terms is constructed, and the coefficient ξ of each differential term is then learned to obtain the final partial differential equation. The candidate library is constructed by a basis-function representation, containing basic differential terms such as x, y, u_x, u_y, u_xx, u_xy and u_yy; for a specific system, complex differential terms closely related to that system can be added to the library according to prior knowledge. The library should not be too small, or the accuracy of the final learned partial differential equation suffers; nor too large, or operating efficiency suffers.
Candidate-library compression is then performed: after the original differential-term candidate library is obtained, the library matrix Θ is compressed, according to the data sampling result of the previous step, into a smaller matrix to reduce the amount of computation, and the compressed new matrix is taken as the final differential-term candidate library.
the candidate item constructing method selected when constructing the differential item candidate library comprises a forward difference method, a backward difference method, a central difference method and a Chebyshev polynomial interpolation method; when the subsequently obtained partial differential equation needs to be solved iteratively, a Chebyshev polynomial interpolation method is adopted to construct a differential term candidate library.
Further, the differential-term coefficient learning of the partial differential equation data processing method: a deep network is constructed to obtain the coefficient of each differential term in the candidate library. The deep network is a stack of single-layer regression networks, so as to improve the accuracy of the training result layer by layer; each single-layer regression network obtains the differential-term coefficients ξ from the compressed candidate library by regression, minimizing the squared-residual loss

L(ξ) = ||Θ(ω, u, v)ξ − ω_t||₂²,

where Θξ is the estimate of ω_t; the coefficients obtained from the i-th layer are ξ_i, and the coefficients output by the final single-layer regression network are taken as the coefficients ξ of the partial differential equation.
The regression mode of the network is least-squares regression, and the lstsq routine in NumPy is used to carry it out.
Further, the regularization of the partial differential equation data processing method: the loss function is modified to

L(ξ) = ||Θ(ω, u, v)ξ − ω_t||₂² + λ||ξ||₁,

where λ is the regularization coefficient; sparse regression drops interference terms with small coefficients from the partial differential equation, so the final result is closer to the true partial differential equation.
During regularization, L1 regularization is selected for the sparse regression, and the network's regularization coefficient λ is changed gradually from 10⁻⁶ to 10⁻⁵ as the number of network layers increases; the resulting differential terms and the coefficient of each term form the partial differential equation finally learned by the model.
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
sampling the time-space data by using an attention mechanism;
constructing a differential term alternative library by using prior knowledge and a basis function representation method;
forming a corresponding coefficient of a differential term in a deep network learning alternative library by a plurality of single-layer regression networks;
regularization is added into the loss function, and interference terms are reduced through sparse regression to obtain a final partial differential equation.
It is another object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
sampling the time-space data by using an attention mechanism;
constructing a differential term alternative library by using prior knowledge and a basis function representation method;
forming a corresponding coefficient of a differential term in a deep network learning alternative library by a plurality of single-layer regression networks;
regularization is added into the loss function, and interference terms are reduced through sparse regression to obtain a final partial differential equation.
Another object of the present invention is to provide a partial differential equation data processing system implementing the partial differential equation data processing method, the partial differential equation data processing system including:
the data sampling module is used for sampling data firstly, and the data sampling is carried out in two aspects of space and time respectively;
the differential term alternative library construction module is used for constructing an alternative library and selecting differential terms to construct a differential term alternative library by utilizing the prior knowledge and a basis function representation method;
the differential term coefficient learning module is used for learning the coefficient corresponding to the candidate item in the candidate library by utilizing a depth network formed by a plurality of single-layer regression networks;
and the regularization screening differential term module is used for adding a regularization term into the loss function to enable the network to become a partial differential equation learner by using a sparse regression method so as to reduce irrelevant interference terms, and the finally obtained partial differential equation term and the coefficient corresponding to each differential term form a partial differential equation finally learned by the network.
Another object of the present invention is to provide a data learning terminal, which is used for implementing the partial differential equation data processing method.
Combining all the above technical schemes, the advantages and positive effects of the invention are as follows. The data are sampled first, in both space and time. Because some systems governed by partial differential equations (such as image denoising and image segmentation tasks based on partial differential equations) care more about the final value of the equation, an attention mechanism is introduced during time sampling to improve accuracy and initial-value robustness, making the method more effective in tasks concerned with the final-value result. A candidate library is then constructed, selecting reasonable differential terms using prior knowledge, basis-function representation and other mathematical methods. A deep network composed of multiple single-layer regression networks then learns the coefficients corresponding to the candidate terms. Finally, a regularization term is added to the loss function so that, by sparse regression, the network becomes a partial differential equation learner and irrelevant interference terms are pruned; the resulting differential terms and the coefficient of each term form the partial differential equation finally learned by the network.
Compared with other existing methods for learning partial differential equations from a system, the method can fully utilize space and time information of observation data, reduce data quantity required by a model, and improve the accuracy of the learned partial differential equations while ensuring efficiency. The invention designs a deep differential equation network to learn the partial differential equation implied by the system, and is used for solving the problems that the traditional method for learning the partial differential equation implied by the system through artificial experience and experimental verification has low efficiency, is easily influenced by human subjective factors, and is difficult to find a mechanism behind a complex system. The deep differential equation network can learn a plurality of differential terms of the partial differential equation, and has stronger initial value robustness on the task of processing more sensitive final value data. The deep differential equation network needs small data quantity, low hardware requirement and high operation efficiency.
Table 1 the method of the present invention learns the effect of the classical partial differential equation:
table 2 the method of the present invention learns the effect of the image processing partial differential equation:
average running time of the algorithm: 5.018 s.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained from the drawings without creative efforts.
Fig. 1 is a flowchart of a partial differential equation data processing method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a partial differential equation data processing system according to an embodiment of the present invention.
Fig. 3 is a flowchart of an implementation of a partial differential equation data processing method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In view of the problems in the prior art, the present invention provides a partial differential equation data processing method, system, storage medium, device and application, and the present invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the partial differential equation data processing method provided by the present invention includes the following steps:
s101: sampling the time-space data by using an attention mechanism;
s102: constructing a differential term alternative library by using prior knowledge and a basis function representation method;
s103: forming a corresponding coefficient of a differential term in a deep network learning alternative library by a plurality of single-layer regression networks;
s104: regularization is added into the loss function, and interference terms are reduced through sparse regression to obtain a final partial differential equation.
Those skilled in the art can also implement the partial differential equation data processing method provided by the present invention with other steps; the method shown in fig. 1 is only a specific example.
As shown in fig. 2, the partial differential equation data processing system provided by the present invention includes:
the data sampling module is used for sampling data firstly, the data sampling is respectively carried out in two aspects of space and time, and because some systems controlled by partial differential equations (such as image denoising and image segmentation tasks based on partial differential equations) are more concerned about the final values of the partial differential equations, an attention mechanism is introduced during the time sampling to improve the precision and the initial value robustness, so that the data sampling module is more effective in the task of concerning the final value results of the partial differential equations;
the differential term alternative library construction module is used for constructing an alternative library, and reasonable differential terms are selected to construct the differential term alternative library by using mathematical methods such as priori knowledge, basis function representation and the like;
the differential term coefficient learning module is used for learning the coefficient corresponding to the candidate item in the candidate library by utilizing a depth network formed by a plurality of single-layer regression networks;
and the regularization screening differential term module is used for adding a regularization term into the loss function to enable the network to become a partial differential equation learner by using a sparse regression method so as to reduce irrelevant interference terms, and the finally obtained partial differential equation term and the coefficient corresponding to each differential term form a partial differential equation finally learned by the network.
The technical solution of the present invention is further described below with reference to the accompanying drawings.
As shown in fig. 3, the partial differential equation data processing method provided by the present invention includes the following steps:
the method comprises the following steps: and (6) sampling data. Firstly, performing spatial sampling on input time sequence data, wherein random sampling is used for spatial sampling, and partial points in an input space are selected as observation objects; and then time sampling is carried out, uniform sampling is used for time sampling, new sample values at intervals are obtained on the basis of observation points selected by space sampling, an attention mechanism is introduced during time sampling, and the weight of the data of the second half part is increased by controlling the time sampling rate so as to increase the precision and the initial value robustness of the model.
The attention mechanism affects the time sampling of the input as follows: the total time sampling rate is 25% of all time sequences, all input time sequences are equally divided into five parts, the time sampling rate of the first part is 15%, the time sampling rate of the second part is 20%, and the like, and the attention mechanism correction value of each section of sampling rate is 5%.
The network parameters during data sampling are set as follows: boundary width: 5; spatial sampling rate: 2%; total time sampling rate: 25%; number of equal time segments: 5; attention-mechanism correction: each segment's sampling rate increases by 5%; per-segment time sampling rates: 15%, 20%, 25%, 30%, 35%.
Step two: constructing the differential-term candidate library. An equation containing partial derivatives (or partial differentials) of an unknown function is called a partial differential equation, the general form of which can be expressed as:
u_t(t,x,y) = F(x, y, u, u_x, u_y, u_xx, u_xy, u_yy, ...), (x,y) ∈ R², t ∈ [0,T];
where t represents time, x and y represent space, u_t is the left-hand side of the partial differential equation, and x, y, u_x, u_y, u_xx, u_xy, u_yy, ... are the candidate differential terms. In the present invention the partial differential equation is expressed in the form:
ω_t = Θ(ω, u, v)ξ;
where Θ represents a matrix composed of the various differential terms ω, u and v, and ξ represents the coefficients of the differential terms. That is, a candidate library of differential terms is constructed, and the coefficient ξ corresponding to each differential term is then learned to obtain the final partial differential equation. The candidate library is constructed based on a basis-function representation: basic differential terms such as x, y, u_x, u_y, u_xx, u_xy, u_yy are constructed, and, when a specific system is targeted, complex differential terms closely related to that system can be added to the candidate library according to prior knowledge. The candidate library should not be too small, otherwise the accuracy of the finally learned partial differential equation is affected; nor should it be too large, otherwise the operating efficiency is affected.
Candidate library compression follows. After the original differential-term candidate library is obtained, the candidate library matrix is compressed according to the data sampling result of the previous step to reduce the amount of computation:
C′ω_t = C′Θ(ω, u, v)ξ;
where C′ represents the matrix compression, and the compressed new matrix is taken as the final differential-term candidate library.
When the subsequently obtained partial differential equation needs to be solved iteratively, the solution accuracy is highest if the numerical method used for solving is consistent with the construction method; in this specific embodiment the Chebyshev polynomial interpolation method is adopted to construct the differential-term candidate library.
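A minimal sketch of building and compressing such a candidate library, assuming a central-difference construction (one of the options the method names alongside Chebyshev interpolation); the particular set of terms and all names here are illustrative, not taken from the patent:

```python
import numpy as np

def build_library(u, dx, dy):
    """Assemble a candidate library Theta for one snapshot u(x, y):
    one row per grid point, one column per differential term,
    derivatives taken by central differences via np.gradient."""
    u_y, u_x = np.gradient(u, dy, dx)       # first derivatives
    u_yy, u_yx = np.gradient(u_y, dy, dx)
    u_xy, u_xx = np.gradient(u_x, dy, dx)   # u_xy ≈ u_yx
    ny, nx = u.shape
    y, x = np.mgrid[0:ny, 0:nx]
    terms = [np.ones_like(u), x * dx, y * dy, u,
             u_x, u_y, u_xx, u_xy, u_yy]
    return np.stack([t.ravel() for t in terms], axis=1)

def compress(theta, sample_idx):
    """C' @ Theta: keep only the rows at the sampled grid points."""
    return theta[sample_idx]
```

Compression is just row selection: the sampling step picks which grid points survive, so the regression below works on a much smaller matrix.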
Step three: learning the differential-term coefficients. A deep network is constructed to obtain the coefficient corresponding to each differential term in the candidate library. The deep network is stacked from several single-layer regression networks so that the precision of the network training result improves layer by layer. A single regression network operates as follows: the coefficients ξ of the differential terms are obtained from the compressed differential-term candidate library by regression, the loss function to be minimised being:
L = ||C′ω̂_t − C′Θ(ω, u, v)ξ||₂²;
where ω̂_t denotes the estimated value. The coefficient vector obtained from the i-th layer is ξ_i, and the coefficients finally output by the last single-layer regression network are taken as the coefficients ξ of the partial differential equation.
The regression mode of the network is least-squares regression, and the lstsq routine in NumPy is used to perform it.
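The stacked single-layer regression can be sketched with NumPy's lstsq as follows. The residual-refitting scheme shown here is one plausible way for stacked layers to refine the coefficients layer by layer; it is our assumption, not the patent's exact procedure:

```python
import numpy as np

def learn_coefficients(theta, omega_t, n_layers=3):
    """Stacked single-layer least-squares regression: each layer fits
    the residual left by the previous layers, and the corrections are
    accumulated into the coefficient vector xi."""
    xi = np.zeros(theta.shape[1])
    residual = omega_t.copy()
    for _ in range(n_layers):
        # One single-layer regression network: least squares on residual.
        delta, *_ = np.linalg.lstsq(theta, residual, rcond=None)
        xi += delta
        residual = omega_t - theta @ xi   # loss: ||omega_t - Theta xi||^2
    return xi
```

With a well-conditioned library this recovers the least-squares solution; the layering matters once regularization (step four) perturbs each layer's fit.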
Step four: regularization. When constructing the differential-term candidate library, an over-complete library is generally constructed so as to be able to represent complex partial differential equations; it therefore contains interference terms that do not appear in the real partial differential equation. To eliminate these interferences and bring the learned partial differential equation closer to the real equation, the invention introduces a regularization term and modifies the loss function to:
L = ||C′ω̂_t − C′Θ(ω, u, v)ξ||₂² + λ||ξ||₁;
where λ represents the regularization coefficient. The partial differential equation obtained by sparse regression thus omits the interference terms with small coefficients, so that the final result is closer to the real partial differential equation.
During regularization, L1 regularization is selected for the sparse regression. Since the regularization coefficient should not be set too large at the start of deep-network training, otherwise terms that belong to the real partial differential equation may be discarded in the first layers of the network because their coefficients are still small, the regularization coefficient λ of the network is increased gradually from 10⁻⁶ to 10⁻⁵ as the number of network layers grows. The finally obtained differential terms and the coefficient corresponding to each term form the partial differential equation finally learned by the model.
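The L1-regularized sparse regression with a growing λ can be sketched as follows. ISTA (proximal gradient descent) is our choice of L1 solver, and the final small-coefficient cutoff illustrates "omitting interference terms with small coefficients"; none of these implementation details come from the patent itself:

```python
import numpy as np

def ista_lasso(theta, omega_t, lam, xi0=None, n_iter=2000):
    """Minimise ||Theta xi - omega_t||_2^2 + lam * ||xi||_1 by
    proximal gradient descent (ISTA) with soft thresholding."""
    step = 1.0 / (2 * np.linalg.norm(theta, 2) ** 2)  # 1 / Lipschitz
    xi = np.zeros(theta.shape[1]) if xi0 is None else xi0.copy()
    for _ in range(n_iter):
        grad = 2 * theta.T @ (theta @ xi - omega_t)
        z = xi - step * grad
        xi = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return xi

def sparse_learn(theta, omega_t, n_layers=5, tol=1e-3):
    """Layer-by-layer sparse regression: lambda is stepped from 1e-6
    up to 1e-5, each layer warm-starting from the previous one; small
    surviving coefficients are dropped as interference terms."""
    xi = None
    for lam in np.geomspace(1e-6, 1e-5, n_layers):
        xi = ista_lasso(theta, omega_t, lam, xi0=xi)
    xi[np.abs(xi) < tol] = 0.0
    return xi
```

Starting with a small λ keeps genuine terms with small coefficients alive in the early layers, matching the schedule described above.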
The technical effects of the present invention will be described in detail with reference to simulations.
1. The simulation experiments were completed in the Python language on a PC with an Intel(R) Core(TM) i7-9700 CPU @ 3.60 GHz, 16.00 GB RAM and the Ubuntu 16.04 operating system.
2. The experimental data of the invention are data generated by various classical partial differential equations and real noise-containing data generated when partial differential equations are used for image processing. The classical partial differential equations comprise the Burgers, KdV, NLS and Navier-Stokes equations; the image-processing data comprise image denoising data from the thermal diffusion equation and level-set image segmentation data from the GAC model. The results of the experiments are shown in Tables 1 and 2.
Table 1 the method of the present invention learns the effect of the classical partial differential equation:
table 2 the method of the present invention learns the effect of the image processing partial differential equation:
average running time of the algorithm: 5.018 s.
According to these results, the deep differential-equation network can learn partial differential equations with a rich variety of differential terms, achieves high accuracy and high efficiency with a comparatively small amount of data, and still learns well when processing real noise-contaminated data generated during partial-differential-equation-based image processing. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD- or DVD-ROM, programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The apparatus and its modules may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; or by software executed by various types of processors; or by a combination of hardware circuits and software, e.g., firmware.
The above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection; any modification, equivalent replacement or improvement made within the spirit and principles of the invention is intended to be covered by the appended claims.
Claims (10)
1. A partial differential equation data processing method, characterized by comprising:
sampling the time-space data by using an attention mechanism;
constructing a differential term alternative library by using prior knowledge and a basis function representation method;
learning, by a deep network formed from a plurality of single-layer regression networks, the coefficient corresponding to each differential term in the candidate library;
adding regularization to the loss function and reducing interference terms through sparse regression to obtain the final partial differential equation.
2. The partial differential equation data processing method according to claim 1, wherein the data sampling of the partial differential equation data processing method specifically comprises: first performing spatial sampling on the input time-series data, wherein random sampling is used and a subset of points in the input space is selected as observation objects; then performing temporal sampling, wherein uniform sampling is used and new sample values are taken at intervals at the observation points selected by spatial sampling, an attention mechanism is introduced during temporal sampling, and the weight of the later half of the data is increased by controlling the time sampling rate.
3. The partial differential equation data processing method according to claim 2, wherein the network parameters for data sampling of the partial differential equation data processing method are set as follows: boundary width: 5; spatial sampling rate: 2%; total time sampling rate: 25%; number of equal time segments: 5; attention-mechanism correction: the sampling rate of each successive segment is increased by 5%; per-segment time sampling rates: 15%, 20%, 25%, 30%, 35%.
4. The partial differential equation data processing method according to claim 1, wherein the partial differential equation data processing method constructs a differential-term candidate library: an equation containing partial derivatives or partial differentials of an unknown function is called a partial differential equation, the general form of which is expressed as:
u_t(t,x,y) = F(x, y, u, u_x, u_y, u_xx, u_xy, u_yy, ...), (x,y) ∈ R², t ∈ [0,T];
wherein t represents time, x and y represent space, u_t is the left-hand side of the partial differential equation, and x, y, u_x, u_y, u_xx, u_xy, u_yy, ... are the candidate differential terms; the partial differential equation is expressed in the form:
ω_t = Θ(ω, u, v)ξ;
wherein Θ represents a matrix composed of the various differential terms ω, u and v, and ξ represents the coefficients of the differential terms; that is, a candidate library of differential terms is constructed, and the coefficient ξ corresponding to each differential term is then learned to obtain the final partial differential equation; the candidate library is constructed based on a basis-function representation, basic differential terms such as x, y, u_x, u_y, u_xx, u_xy, u_yy being constructed, and, when a specific system is targeted, complex differential terms closely related to that system being added to the candidate library according to prior knowledge; the candidate library should not be too small, otherwise the accuracy of the finally learned partial differential equation is affected, nor too large, otherwise the operating efficiency is affected;
compressing the candidate library: after the original differential-term candidate library is obtained, the candidate library matrix is compressed according to the data sampling result of the previous step to reduce the amount of computation:
C′ωt=C′Θ(ω,u,v)ξ;
wherein C' represents matrix compression, and the new compressed matrix is used as a final differential item alternative library;
the candidate construction methods available when constructing the differential-term candidate library comprise the forward difference, backward difference, central difference and Chebyshev polynomial interpolation methods; when the subsequently obtained partial differential equation needs to be solved iteratively, the Chebyshev polynomial interpolation method is adopted to construct the differential-term candidate library.
5. The partial differential equation data processing method according to claim 1, wherein the differential-term coefficient learning of the partial differential equation data processing method comprises: constructing a deep network to obtain the coefficient corresponding to each differential term in the candidate library, the deep network being stacked from several single-layer regression networks so that the precision of the network training result improves layer by layer, a single regression network operating as follows: the coefficients ξ of the differential terms are obtained from the compressed differential-term candidate library by regression, the loss function to be minimised being:
L = ||C′ω̂_t − C′Θ(ω, u, v)ξ||₂²;
wherein ω̂_t denotes the estimated value, the coefficient vector obtained from the i-th layer is ξ_i, and the coefficients finally output by the last single-layer regression network are taken as the coefficients ξ of the partial differential equation;
the regression mode of the network selects least square regression, and an lstsq tool in Numpy is selected to complete the task of least square regression.
6. The partial differential equation data processing method according to claim 1, wherein the regularization of the partial differential equation data processing method comprises: the loss function is modified to:
L = ||C′ω̂_t − C′Θ(ω, u, v)ξ||₂² + λ||ξ||₁;
wherein λ represents the regularization coefficient; interference terms with small coefficients are omitted from the partial differential equation obtained by sparse regression, so that the final result is closer to the real partial differential equation;
during the regularization, L1 regularization is selected for sparse regression, and the regularization coefficient lambda of the network is from 10 to 10 as the number of network layers is increased-6To 10-5And gradually changing, and the obtained partial differential equation term and the coefficient corresponding to each differential term form a partial differential equation finally learned by the model.
7. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
sampling the time-space data by using an attention mechanism;
constructing a differential term alternative library by using prior knowledge and a basis function representation method;
learning, by a deep network formed from a plurality of single-layer regression networks, the coefficient corresponding to each differential term in the candidate library;
adding regularization to the loss function and reducing interference terms through sparse regression to obtain the final partial differential equation.
8. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
sampling the time-space data by using an attention mechanism;
constructing a differential term alternative library by using prior knowledge and a basis function representation method;
learning, by a deep network formed from a plurality of single-layer regression networks, the coefficient corresponding to each differential term in the candidate library;
adding regularization to the loss function and reducing interference terms through sparse regression to obtain the final partial differential equation.
9. A partial differential equation data processing system for implementing the partial differential equation data processing method according to any one of claims 1 to 6, the partial differential equation data processing system comprising:
the data sampling module is used for first sampling the data, the sampling being carried out in space and in time respectively;
the differential-term candidate library construction module is used for constructing the candidate library, selecting differential terms by using prior knowledge and a basis-function representation method;
the differential-term coefficient learning module is used for learning the coefficients corresponding to the candidates in the candidate library by using a deep network formed from a plurality of single-layer regression networks;
and the regularized differential-term screening module is used for adding a regularization term to the loss function so that, by a sparse regression method, the network becomes a partial differential equation learner with irrelevant interference terms reduced; the finally obtained differential terms and the coefficient corresponding to each term form the partial differential equation finally learned by the network.
10. A data learning terminal, characterized in that the data learning terminal is used for realizing the partial differential equation data processing method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110131994.0A CN112784205A (en) | 2021-01-30 | 2021-01-30 | Partial differential equation data processing method, system, storage medium, device and application |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112784205A true CN112784205A (en) | 2021-05-11 |
Family
ID=75760069
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110131994.0A Pending CN112784205A (en) | 2021-01-30 | 2021-01-30 | Partial differential equation data processing method, system, storage medium, device and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112784205A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113641100A (en) * | 2021-07-14 | 2021-11-12 | 苏州国科医工科技发展(集团)有限公司 | Universal identification method for unknown nonlinear system |
CN115344819A (en) * | 2022-08-16 | 2022-11-15 | 哈尔滨工业大学 | State equation-based explicit Euler method symbolic network ordinary differential equation identification method |
WO2023051236A1 (en) * | 2021-09-28 | 2023-04-06 | 华为技术有限公司 | Method for solving partial differential equation, and device related thereto |
CN116010760A (en) * | 2023-02-21 | 2023-04-25 | 北京航空航天大学 | Dynamic system equation discovery method for double-layer optimization driving |
CN116504341A (en) * | 2022-05-20 | 2023-07-28 | 大连理工大学 | Sequential singular value filtering method for data-driven identification partial differential equation |
WO2023165500A1 (en) * | 2022-03-04 | 2023-09-07 | 本源量子计算科技(合肥)股份有限公司 | Processing method and apparatus for data processing task, storage medium, and electronic device |
CN116738128A (en) * | 2022-03-04 | 2023-09-12 | 本源量子计算科技(合肥)股份有限公司 | Method and device for solving time-containing partial differential equation by utilizing quantum circuit |
CN118279310A (en) * | 2024-06-03 | 2024-07-02 | 中国科学院空天信息创新研究院 | Remote sensing image anomaly detection method and device based on differential modeling |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105814489A (en) * | 2013-09-09 | 2016-07-27 | Asml荷兰有限公司 | Methods and apparatus for calculating electromagnetic scattering properties of a structure and for reconstruction of approximate structures |
US20170293825A1 (en) * | 2016-04-08 | 2017-10-12 | Wuhan University | Method and system for reconstructing super-resolution image |
US20200311878A1 (en) * | 2019-04-01 | 2020-10-01 | Canon Medical Systems Corporation | Apparatus and method for image reconstruction using feature-aware deep learning |
CN112270650A (en) * | 2020-10-12 | 2021-01-26 | 西南大学 | Image processing method, system, medium, and apparatus based on sparse autoencoder |
Non-Patent Citations (2)
Title |
---|
杜渺勇; 施垚; 周浩; 韩丹夫: "Image denoising algorithm based on partial differential equations and machine learning", Journal of Hangzhou Normal University (Natural Science Edition), no. 02, 19 March 2020 (2020-03-19) *
王晨希; 王浩; 王权; 任海萍: "Discussion on the cybersecurity of medical artificial intelligence products", China Medical Devices, no. 12, 10 December 2018 (2018-12-10) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112784205A (en) | Partial differential equation data processing method, system, storage medium, device and application | |
Liu et al. | Cope with diverse data structures in multi-fidelity modeling: a Gaussian process method | |
US20240273336A1 (en) | Neural Architecture Search with Factorized Hierarchical Search Space | |
Gardner et al. | Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration | |
CN108399201B (en) | Web user access path prediction method based on recurrent neural network | |
Couckuyt et al. | Fast calculation of multiobjective probability of improvement and expected improvement criteria for Pareto optimization | |
Giannakoglou et al. | Aerodynamic shape design using evolutionary algorithms and new gradient-assisted metamodels | |
CN110321603A (en) | A kind of depth calculation model for Fault Diagnosis of Aircraft Engine Gas Path | |
Owoyele et al. | Efficient bifurcation and tabulation of multi-dimensional combustion manifolds using deep mixture of experts: An a priori study | |
Ghorbani et al. | A hybrid artificial neural network and genetic algorithm for predicting viscosity of Iranian crude oils | |
CN114298851A (en) | Network user social behavior analysis method and device based on graph sign learning and storage medium | |
JP6086612B2 (en) | Method and apparatus for converting chemical reaction mechanism | |
Souza et al. | Variable and time-lag selection using empirical data | |
CN117216525A (en) | Sparse graph attention soft measurement modeling method based on CNN-LKA | |
CN115600105A (en) | Water body missing data interpolation method and device based on MIC-LSTM | |
Zhang et al. | A non-intrusive neural network model order reduction algorithm for parameterized parabolic PDEs | |
Choubineh et al. | An innovative application of deep learning in multiscale modeling of subsurface fluid flow: Reconstructing the basis functions of the mixed GMsFEM | |
Ghosh et al. | Deep learning enabled surrogate model of complex food processes for rapid prediction | |
CN114154615A (en) | Neural architecture searching method and device based on hardware performance | |
Williams et al. | Novel tool for selecting surrogate modeling techniques for surface approximation | |
CN113723471B (en) | Nanoparticle concentration and particle size estimation method and device | |
Chang et al. | Singular layer physics informed neural network method for plane parallel flows | |
CN115423076A (en) | Directed hypergraph chain prediction method based on two-step framework | |
Wang et al. | Separable Gaussian neural networks for high-dimensional nonlinear stochastic systems | |
CN111523647B (en) | Network model training method and device, feature selection model, method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||