WO2023036164A1 - A model training method based on a physics-informed neural network, and related apparatus - Google Patents
- Publication number
- WO2023036164A1 (PCT/CN2022/117447)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/065—Analogue means
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Definitions
- The present application relates to the field of computer technology, and in particular to a physics-informed neural network-based model training method and related apparatus.
- Electromagnetic simulation is the main technology for the design, optimization, and analysis of various antennas and antenna arrays. Through electromagnetic simulation, some performance indicators of the simulated antenna can be calculated, such as return loss, antenna energy efficiency, etc., which can guide the design or optimization of the antenna.
- In traditional electromagnetic simulation, the performance indicators are typically computed by first meshing the simulation domain of the antenna and then solving Maxwell's equations on the discrete grid to obtain the full electromagnetic field for subsequent optimization analysis.
- Statistics show that meshing alone usually takes tens of minutes to several hours, and for grids with tens of millions of cells, solving the governing equations takes another 4 to 8 hours, which is prohibitively time-consuming.
- This application provides a model training method based on physics-informed neural networks (PINNs), which is used to improve the accuracy of model training.
- the present application also provides corresponding devices, computer equipment, computer-readable storage media, computer program products, and the like.
- The first aspect of the present application provides a model training method based on physics-informed neural networks (PINNs). The physics-informed neural network includes a first neural network and partial differential equations, and the first neural network includes at least two residual network channels. The method includes: obtaining multiple sampling point data from the simulation domain of the antenna, where the multiple sampling point data include sampling point data of the active area, sampling point data of the passive area, boundary data of the simulation domain, and initial data of the simulation domain, and the simulation domain includes an active area and a passive area; inputting each training sample into each residual network channel of the first neural network as the product of the training sample and the coefficient corresponding to that residual network channel, where each training sample includes one sampling point datum and the hidden vector corresponding to the simulation domain, and the coefficients corresponding to the residual network channels differ; processing the data input into each residual network channel through the first neural network to obtain an output data set, where the output data set includes active output data, passive output data, boundary output data, and initial output data; processing the output data set through the partial differential equations to obtain a total loss function, where the total loss function is related to the active loss function, the passive loss function, the boundary loss function, and the initial loss function; updating the parameters in the first neural network according to the total loss function to obtain a second neural network; and, taking the second neural network as the first neural network, iteratively executing the above training process until the second neural network reaches the convergence condition, to obtain the target physics-informed neural network model for the electromagnetic simulation of the antenna.
- PINNs add physical equations as constraints to the neural network so that the training results satisfy the physical laws.
- This constraint is imposed by adding the residual of the physical equation before and after each iteration to the loss function of the neural network, so that the physical equation also "participates" in the training process.
- The neural network therefore optimizes not only the loss function of the network itself during training iterations but also the residual of the physical equation at each iteration, so that the final training result satisfies the physical laws.
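The loss construction described above can be sketched on a toy 1-D problem. The one-parameter trial solution and the equation u''(x) + π²u(x) = 0 below are illustrative stand-ins, not the patent's Maxwell setting:

```python
import numpy as np

# Toy PINN-style loss: a data term plus a physics term that penalizes the
# residual of the governing equation at collocation (sampling) points.
# "Network" u(x) = a*sin(pi*x); equation u'' + pi^2 * u = 0.

def u(x, a):
    return a * np.sin(np.pi * x)

def pde_residual(x, a, h=1e-3):
    # second derivative by central finite differences (stand-in for autograd)
    u_xx = (u(x + h, a) - 2 * u(x, a) + u(x - h, a)) / h**2
    return u_xx + np.pi**2 * u(x, a)

def total_loss(a, x_data, y_data, x_colloc):
    data_loss = np.mean((u(x_data, a) - y_data) ** 2)
    physics_loss = np.mean(pde_residual(x_colloc, a) ** 2)
    return data_loss + physics_loss          # physics "participates" in training

x_data = np.array([0.25, 0.5])
y_data = np.sin(np.pi * x_data)              # observations of the true solution
x_colloc = np.linspace(0.1, 0.9, 9)          # collocation points

loss_true = total_loss(1.0, x_data, y_data, x_colloc)   # correct amplitude
loss_off = total_loss(1.5, x_data, y_data, x_colloc)    # perturbed amplitude
```

With the correct amplitude both terms are near zero; perturbing the amplitude raises the data term, so minimizing the total loss drives the parameters toward a solution consistent with both data and physics.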
- the first neural network is used to represent the neural network before one iteration
- the second neural network is used to represent the neural network after one iteration.
- The first neural network includes multiple residual network channels, where "multiple" in this application means two or more. Each residual network channel can transform input data into output data in electromagnetic form.
- the partial differential equation may be a point source Maxwell equation.
- the simulated domain of the antenna refers to the simulated coverage area of the antenna's electromagnetic waves.
- the antenna can be understood as an antenna of a terminal, or an antenna of a network device.
- the antennas of different terminals or network devices are usually different, so the simulation domains of different antennas are also different.
- the simulation domain includes the active area, the passive area and the boundary.
- After an excitation source is added to the antenna array, the active area refers to the near-source region that contains the excitation source and is affected by it.
- The boundary refers to the edge of the simulation domain, and the passive area refers to the region of the simulation domain other than the active area and the boundary.
- The boundary of the simulation domain is usually a reflective (rebound) boundary or an absorbing boundary, and different boundary types strongly influence the results of electromagnetic simulation.
- the simulation domain of the antenna may include multiple simulation domains of different antennas, and the hidden vectors corresponding to each simulation domain may be different.
- the sampling point data refers to the data corresponding to the sampling point.
- There are four types of sampling point data: sampling point data of the active area, sampling point data of the passive area, boundary data of the simulation domain, and initial data of the simulation domain.
- the sampling point data is usually four-dimensional, including the three-dimensional space coordinates of the sampling point and the one-dimensional time information of the sampling point.
- a training sample refers to sample data used for training a model.
- the training samples include not only sampling point data, but also hidden vector Z corresponding to the simulation domain.
- the hidden vector Z is used to represent the parameter settings of different electromagnetic simulation scenarios.
- the hidden vector Z adopts a low-dimensional vector, and the commonly used dimension selection can be 16, 32, 64, 128, etc.
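As a hypothetical sketch of how a training sample is assembled, the 4-D sampling point (three space coordinates plus time) can be concatenated with the low-dimensional hidden vector Z shared by the whole simulation domain. The dimension 16 and all names here are illustrative, not specified by the patent:

```python
import numpy as np

latent_dim = 16
rng = np.random.default_rng(0)

z = rng.normal(size=latent_dim)          # hidden vector Z of one simulation domain
points = rng.uniform(size=(8, 4))        # 8 sampling points: (x, y, z, t)

# every sampling point of the domain carries the same hidden vector Z
samples = np.concatenate([points, np.tile(z, (len(points), 1))], axis=1)
```

Each row of `samples` is one training sample: four coordinate entries followed by the 16-dimensional hidden vector.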
- Corresponding to the four types of sampling point data, there are four types of training samples: training samples containing sampling point data of the active area, training samples containing sampling point data of the passive area, training samples containing boundary data of the simulation domain, and training samples containing initial data of the simulation domain.
- Each type of training sample is input into each residual network channel; each residual network channel produces output data of that type, and the outputs of all residual network channels are then summed to obtain one output datum for the input. There are therefore also four types of output data: active output data, passive output data, boundary output data, and initial output data. In addition, the coefficients of the residual network channels differ, so the same training sample is differentiated across channels, thereby improving model training accuracy.
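An illustrative sketch of this multi-channel forward pass: the same training sample is multiplied by a different coefficient per residual network channel (1, 2, 4, 8 here, i.e. exponentially increasing), pushed through each channel, and the channel outputs are summed into one output datum. The tiny `sin(x @ w)` "channel" is a stand-in for a full residual network with a sinusoidal periodic activation; all sizes and names are assumptions:

```python
import numpy as np

coeffs = [1.0, 2.0, 4.0, 8.0]                          # one coefficient per channel
rng = np.random.default_rng(1)
weights = [rng.normal(size=(20, 6)) for _ in coeffs]   # per-channel parameters

def channel_forward(x, w):
    return np.sin(x @ w)                 # stand-in for one residual network channel

def forward(sample):
    # different input scalings let different channels capture different frequencies
    outputs = [channel_forward(c * sample, w) for c, w in zip(coeffs, weights)]
    return np.sum(outputs, axis=0)       # summed output, e.g. E- and H-field values

sample = rng.normal(size=20)             # 4-D sampling point + 16-D hidden vector
y = forward(sample)
```

Scaling the input rather than the output means each channel effectively sees the sample at a different frequency, which is the stated motivation for the differing coefficients.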
- the parameters in the first neural network may be updated using a gradient descent method.
- the target PINNs model is relative to the initial PINNs model before starting model training.
- The parameters in the first neural network of the initial PINNs model are usually far from their converged values.
- During training, the training samples are constantly updated, and the parameters in the first neural network are updated until the convergence condition is reached, yielding the second neural network.
- the parameters in the second neural network can be understood as fixed, and the entire model at this time is called the target PINNs model.
- The first neural network of the PINNs includes multiple residual network channels whose corresponding coefficients differ, so that in the model training stage one datum can be expanded into multiple data by multiplication with the different coefficients, and signals of different frequencies can be captured through the multiple residual network channels, thereby improving the accuracy of model training.
- The active region is the area in the simulation domain centered on the point source corresponding to the excitation source and having the first length as its radius; the first length is related to the first parameter of the continuous probability density function, the continuous probability density function tends to the Dirac function, and the point-source function is the product of the continuous probability density function and the signal of the excitation source. The passive area is the area in the simulation domain other than the active area and the boundary.
- In the point-source function J(x,t) = δ(x−x₀)·g(t), J(x,t) denotes the function of the point source, δ(x−x₀) the Dirac function, g(t) the signal of the excitation source, and x₀ the position of the excitation source.
- This point-source function represents an excitation source signal of the form g(t) applied at x₀ in the simulation domain.
- The Dirac function δ(x−x₀) is replaced by a continuous probability density function δ_σ(x) that is close to the Dirac function, which can be expressed as δ(x−x₀) ≈ δ_σ(x).
- Here δ_σ(x) represents an abstract typical distribution, whose specific form may be a Gaussian distribution, a Cauchy distribution, or an exponential distribution.
- Replacing the Dirac function with the continuous probability density function δ_σ(x) that approaches it overcomes the bottleneck that PINNs cannot handle point-source problems.
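As a hedged illustration of this smoothing (the Gaussian form and the σ values are choices for the sketch, not mandated by the patent), the smoothed point source can be written as:

```python
import numpy as np

def delta_sigma(x, x0, sigma=0.05):
    # continuous probability density approaching the Dirac function as sigma -> 0
    return np.exp(-((x - x0) ** 2) / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def point_source(x, t, x0, g):
    # J(x, t) ~ delta_sigma(x - x0) * g(t), now differentiable everywhere
    return delta_sigma(x, x0) * g(t)

g = lambda t: np.sin(2 * np.pi * t)      # example excitation signal g(t)
x = np.linspace(-1.0, 1.0, 4001)
dx = x[1] - x[0]

mass = delta_sigma(x, 0.0).sum() * dx    # should integrate to ~1, like delta
peak_narrow = delta_sigma(0.0, 0.0, sigma=0.01)
peak_wide = delta_sigma(0.0, 0.0, sigma=0.05)
```

The density keeps unit mass while staying smooth, and shrinking σ sharpens the peak toward the Dirac limit, which is exactly why PINNs can differentiate through it while the true delta is intractable.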
- The active output data is the sum of the output data of each residual network channel when the training sample contains sampling point data of the active area; the passive output data is the sum of the output data of each residual network channel when the training sample contains sampling point data of the passive area; the boundary output data is the sum of the output data of each residual network channel when the training sample contains boundary data; and the initial output data is the sum of the output data of each residual network channel when the training sample contains initial data.
- The output data of the residual network channels can be summed directly, or each channel's output data can first be multiplied by a coefficient and then summed.
- Summing the output data of the multiple residual network channels before taking partial derivatives improves the accuracy of model training.
- Each residual network channel includes a sinusoidal periodic activation function, which converts the data in the channel into electric field parameters and magnetic field parameters as the output data of that residual network channel.
- Each residual network channel can include a residual network and a sinusoidal periodic activation function. The residual network optimizes the first neural network model and improves its performance, while the sinusoidal periodic activation function yields the electric field data and magnetic field data. This combination of residual network and sinusoidal periodic activation function can effectively improve the accuracy of the model.
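An illustrative residual block with a sinusoidal periodic activation (in the spirit of SIREN-style networks; the sizes, initialization, and names below are assumptions, not the patent's exact architecture):

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 16
W = rng.normal(scale=1.0 / np.sqrt(dim), size=(dim, dim))
b = rng.normal(size=dim)

def residual_block(x):
    # skip connection (x + ...) eases optimization of deep networks;
    # sin() provides the sinusoidal periodic activation
    return x + np.sin(x @ W + b)

x = rng.normal(size=dim)
y = residual_block(x)
```

Because sin is bounded, the block perturbs its input by at most 1 per component, keeping the forward pass well-conditioned while the periodicity suits oscillatory field solutions.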
- the coefficient corresponding to each residual network channel increases exponentially.
- The coefficient corresponding to each of the multiple residual network channels increases exponentially. For example, with four residual network channels the coefficients can be 1, 2, 4, and 8. This exponential increase quickly differentiates copies of the same data, thereby improving the accuracy of model training.
- Processing the output data set through the partial differential equations to obtain the total loss function includes: each time, taking one output datum in the output data set as the known quantity of the partial differential equations and evaluating the partial differential equations to obtain a loss function corresponding to that output datum; and accumulating the loss functions corresponding to all the output data in the output data set according to a preset relationship to obtain the total loss function.
- The preset relationship includes learnable parameters and hyperparameters. The learnable parameters corresponding to the different loss functions that make up the total loss function differ; the learnable parameters are updated along with the parameters in the first neural network, and the hyperparameters are used to assist in weighting the loss functions associated with the learnable parameters.
- When updating the parameters in the first neural network according to the total loss function, the method further includes updating the hidden vectors of the simulation domain and the learnable parameters in the preset relationship.
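A sketch of accumulating the four loss terms into the total loss. Each term is weighted by a learnable parameter and a fixed hyperparameter; the log-variance form below is one common choice for such learnable weighting and is an assumption here, not necessarily the patent's exact preset relationship:

```python
import numpy as np

def total_loss(losses, log_sigmas, hyper):
    # log_sigmas are learnable and updated together with the network parameters;
    # hyper are fixed hyperparameters assisting the weighting
    total = 0.0
    for L, s, h in zip(losses, log_sigmas, hyper):
        total += h * (np.exp(-s) * L + s)   # the +s term penalizes inflating sigma
    return total

losses = [0.5, 0.2, 0.1, 0.3]       # active, passive, boundary, initial losses
log_sigmas = [0.0, 0.0, 0.0, 0.0]   # learnable weights (initialized to zero)
hyper = [1.0, 1.0, 1.0, 1.0]

t = total_loss(losses, log_sigmas, hyper)
```

With the learnable parameters at zero the weighting is neutral and the total is just the sum of the terms; during training each `log_sigma` adapts to balance its term against the others.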
- The second aspect of the present application provides a method for incremental learning. The method includes: obtaining multiple sampling point data from the simulation domain of the antenna to be optimized, where the multiple sampling point data include sampling point data of the active area, sampling point data of the passive area, boundary data of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the active area and the passive area; inputting multiple sample data into the target physics-informed neural network, where each sample datum includes one sampling point datum and the first hidden vector of the simulation domain, and the target physics-informed neural network is the target model trained according to the first aspect above or any possible implementation manner of the first aspect; obtaining, through the target physics-informed neural network, the output data corresponding to each sample datum; keeping the parameters in the target physics-informed neural network unchanged and adjusting the first hidden vector of the simulation domain according to the output data to obtain a second hidden vector; and, taking the second hidden vector as the first hidden vector, iteratively performing the adjustment of the first hidden vector with different sample data until the output data meet the preset requirements of the antenna.
- The parameters in the target physics-informed neural network are frozen, and the hidden vector of the simulation domain of the antenna to be optimized is repeatedly adjusted based on the output data of the target physics-informed neural network until a matching hidden vector is obtained. This method can learn hidden vectors quickly and improves the speed of obtaining hidden vectors in new electromagnetic simulation scenarios.
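A sketch of this incremental-learning loop: the trained network's parameters are frozen and only the hidden vector z of the new simulation domain is adjusted to reduce the output error. The linear map below is a toy stand-in for the target PINNs model; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 4))             # frozen parameters of the trained model
W0 = W.copy()                            # kept only to verify W never changes

def network(z):
    return z @ W                         # stand-in for the target PINNs model

z_true = rng.normal(size=16)
target = network(z_true)                 # output required by the antenna to be optimized

z = np.zeros(16)                         # first hidden vector (initial guess)
lr = 0.01
for _ in range(2000):
    err = network(z) - target
    z -= lr * (2 * err @ W.T)            # gradient step on z only; W stays frozen

final_err = np.sum((network(z) - target) ** 2)
```

Only the low-dimensional vector z is optimized, which is why adapting to a new electromagnetic simulation scenario is much faster than retraining the full network.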
- the third aspect of the present application provides an electromagnetic simulation method, which includes using the target physical information neural network model trained in the first aspect or any possible implementation of the first aspect to simulate the antenna to obtain the The electromagnetic field distribution of the antenna.
- the fourth aspect of the present application provides a model training device based on a physical information neural network, which has the function of implementing the method of the first aspect or any possible implementation manner of the first aspect.
- This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
- the hardware or software includes one or more modules corresponding to the above functions, for example: an acquisition unit and one or more processing units.
- the fifth aspect of the present application provides a device for incremental learning, which has the function of realizing the method of the second aspect above.
- This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
- the hardware or software includes one or more modules corresponding to the above functions, for example: an acquisition unit and one or more processing units.
- the sixth aspect of the present application provides an electromagnetic simulation device, which has the function of implementing the method of the third aspect above.
- This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
- the hardware or software includes one or more modules corresponding to the above functions, for example: one or more processing units.
- A seventh aspect of the present application provides a computer device, which includes at least one processor, a memory, an input/output (I/O) interface, and computer-executable instructions stored in the memory and operable on the processor; when the computer-executable instructions are executed by the processor, the processor executes the method of the first aspect above or any possible implementation manner of the first aspect.
- An eighth aspect of the present application provides a computer device, which includes at least one processor, a memory, an input/output (I/O) interface, and computer-executable instructions stored in the memory and operable on the processor; when the computer-executable instructions are executed by the processor, the processor executes the method of the second aspect above.
- A ninth aspect of the present application provides a computer device, which includes at least one processor, a memory, an input/output (I/O) interface, and computer-executable instructions stored in the memory and operable on the processor; when the computer-executable instructions are executed by the processor, the processor executes the method of the third aspect above.
- The tenth aspect of the present application provides a computer-readable storage medium storing one or more computer-executable instructions; when they are executed, the processor executes the method of the first aspect above or any possible implementation manner of the first aspect.
- The eleventh aspect of the present application provides a computer-readable storage medium storing one or more computer-executable instructions; when they are executed, the processor executes the method of the second aspect above.
- The twelfth aspect of the present application provides a computer-readable storage medium storing one or more computer-executable instructions; when they are executed, the processor executes the method of the third aspect above.
- The thirteenth aspect of the present application provides a computer program product storing one or more computer-executable instructions; when they are executed, the processor executes the method of the first aspect above or any possible implementation manner of the first aspect.
- The fourteenth aspect of the present application provides a computer program product storing one or more computer-executable instructions; when they are executed, the processor executes the method of the second aspect above.
- The fifteenth aspect of the present application provides a computer program product storing one or more computer-executable instructions; when they are executed, the processor executes the method of the third aspect above.
- a sixteenth aspect of the present application provides a chip system, where the chip system includes at least one processor, and the at least one processor is configured to implement the functions involved in the above-mentioned first aspect or any possible implementation manner of the first aspect.
- the system-on-a-chip may also include a memory, which is used to store necessary program instructions and data of the device for processing the artificial intelligence model.
- the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
- a seventeenth aspect of the present application provides a chip system, the chip system includes at least one processor, and the at least one processor is configured to implement the functions involved in the second aspect above.
- the system-on-a-chip may further include a memory, which is used to store necessary program instructions and data of the device for data processing based on the artificial intelligence model.
- the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
- The eighteenth aspect of the present application provides a chip system, where the chip system includes at least one processor, and the at least one processor is configured to implement the functions involved in the third aspect above.
- the system-on-a-chip may further include a memory, which is used to store necessary program instructions and data of the device for data processing based on the artificial intelligence model.
- the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
- Fig. 1 is a schematic structural diagram of the physical information neural network model provided by the embodiment of the present application.
- Fig. 2 is a schematic diagram of model training provided by the embodiment of the present application.
- FIG. 3 is a schematic diagram of a simulation domain of an antenna provided by an embodiment of the present application.
- Fig. 4 is a schematic diagram of an embodiment of the model training method provided by the embodiment of the present application.
- Fig. 5 is a schematic diagram of an example of the model training method provided by the embodiment of the present application.
- Fig. 6 is a schematic diagram of an example of the point-source Maxwell's equations provided by the embodiment of the present application.
- Fig. 7 is a schematic diagram of an embodiment of the incremental learning method provided by the embodiment of the present application.
- Fig. 8 is a schematic diagram of another embodiment of an incremental learning method provided by the embodiment of the present application.
- Fig. 9 is a comparison diagram of experimental effects provided by the embodiment of the present application.
- Fig. 10 is a schematic diagram of an embodiment of the electromagnetic simulation provided by the embodiment of the present application.
- Fig. 11 is a schematic diagram of an embodiment of the model training device provided by the embodiment of the present application.
- Fig. 12 is a schematic diagram of an embodiment of an incremental learning device provided by an embodiment of the present application.
- FIG. 13 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- The embodiment of the present application provides a model training method based on physics-informed neural networks (PINNs), which is used to improve the accuracy of model training, thereby improving the accuracy of electromagnetic simulation.
- The present application also provides corresponding devices, computer equipment, computer-readable storage media, computer program products, and the like. Each is described in detail below.
- Antennas can be optimized through electromagnetic simulations.
- Artificial intelligence (AI) technology can be used to pre-train a neural network model, use the neural network model to complete the electromagnetic simulation process, and determine simulation results such as the electromagnetic field distribution and performance indicators of the antenna to be optimized; the antenna is then optimized according to the simulation results.
- PINNs add physical equations as constraints to the neural network so that the training results satisfy the physical laws. This constraint is imposed by adding the residual of the physical equation before and after each iteration to the loss function of the neural network, so that the physical equation also "participates" in the training process. In this way, the neural network optimizes not only the loss function of the network itself during training iterations but also the residual of the physical equation at each iteration, so that the final training result satisfies the physical laws.
- The embodiments of the present application provide the following aspects: 1. a PINNs model with a new structure; 2. training of the new-structure PINNs model based on the simulation domain of the antenna to obtain the target PINNs model; 3. incremental learning with the target PINNs model to obtain the hidden vector of a new electromagnetic simulation scenario; 4. electromagnetic simulation with the target PINNs model to obtain the electromagnetic field data of each point in the antenna simulation domain.
- the process of model training, the process of incremental learning and the process of electromagnetic simulation can all be carried out on a computer device, and the computer device can be a server, a terminal device or a virtual machine (virtual machine, VM).
- Terminal equipment, also called user equipment (UE), is a device with a wireless transceiver function. It can be deployed on land (indoor or outdoor, handheld or vehicle-mounted), on water (such as ships), or in the air (such as aircraft, balloons, and satellites).
- The terminal may be a mobile phone, a tablet computer (pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal, an augmented reality (AR) terminal, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical, a wireless terminal in smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, etc.
- the VM may be a virtualized device that is divided in a virtualized manner on the hardware resources of the physical machine.
- the PINNs model of the novel structure provided in the embodiment of the present application can be understood by referring to FIG. 1 .
- The PINNs model of the novel structure provided in the embodiment of the present application can comprise a first neural network and partial differential equations (PDEs), and this first neural network comprises at least two residual network channels. For example, the first neural network shown in FIG. 1 includes n residual network channels: residual network channel 1, residual network channel 2, ..., residual network channel n.
- the partial differential equations can be point source Maxwell equations.
- Each residual network channel can include a residual network and a sinusoidal periodic activation function.
- the residual network can optimize the first neural network to improve its performance, and the sinusoidal periodic activation function is used to convert the data in each residual network channel into electric field parameters and magnetic field parameters as the output data of each residual network channel.
- This combination of residual network and sinusoidal periodic activation function can effectively improve the accuracy of the model.
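As a rough sketch, one such channel could look like the following NumPy snippet; the layer width, the weight shapes, and the exact placement of the skip connection are illustrative assumptions, not the architecture disclosed in the figures.

```python
import numpy as np

def residual_channel(x, W1, b1, W2, b2):
    """One residual network channel (hypothetical sketch): a small
    fully connected block with a sinusoidal periodic activation,
    wrapped in a skip connection."""
    h = np.sin(x @ W1 + b1)   # sinusoidal periodic activation
    y = h @ W2 + b2
    return x + y              # residual (skip) connection

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)
x = rng.normal(size=(4, 8))
out = residual_channel(x, W1, b1, W2, b2)
```

The skip connection preserves the input signal while the sinusoidal layer captures oscillatory field behaviour, which is why this combination suits electromagnetic quantities.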
- the PINNs model of the new structure is trained in the antenna-based simulation domain to obtain the target PINNs model.
- the model training process provided by the embodiment of the present application can be understood by referring to FIG. 2 .
- the computer device inputs training samples into the PINNs model, processes the training samples through the first neural network to obtain output data, processes the output data through the partial differential equations to obtain the loss function, and then updates the parameters in the first neural network through the loss function; the computer device executes this training process iteratively until the convergence condition is reached, and the target PINNs model is obtained.
- the training samples used for training the PINNs model in the embodiment of the present application are from the simulation domain of the antenna, and the simulation domain of the antenna refers to the simulated coverage area of the electromagnetic wave of the antenna.
- the antenna can be understood as an antenna of a terminal, or an antenna of a network device.
- the antennas of different terminals or network devices are usually different, so the simulation domains of different antennas are also different.
- the antenna in the embodiment of the present application may be an antenna powered by a pulse excitation source.
- the simulated domain of the antenna includes the active area, the passive area and the boundary.
- the structure of the antenna may be a butterfly structure 100 as shown in FIG. 3 , and the antenna of the butterfly structure includes two opposite triangular structures.
- the area covered by the simulated electromagnetic wave of the antenna can be understood as the simulation domain 101 of the butterfly antenna.
- the near-source area including the point source 102 is the active area 103
- the area in the simulation domain 101 except the active area 103 and the boundary of the simulation domain 101 is the passive area 104 .
- the active area is the area in the simulation domain centered on the point source corresponding to the excitation source, with the first length as the radius; the first length is related to the first parameter in the continuous probability density function, the continuous probability density function approaches the Dirac function, and the function of the point source is the product of the continuous probability density function and the signal of the excitation source;
- the passive area is the area in the simulation domain other than the active area and the boundary; equivalently, with the boundary removed, it is the area inside the simulation domain other than the active area.
- the simulation domain with the boundary removed can be represented by Ω
- the active region can be represented by Ω₀
- Ω₀ ⊂ Ω.
- x₀ represents the center of the point source corresponding to the excitation source
- x represents a point within the radius of the first length
- α represents the first parameter in the continuous probability density function.
- the value of α can be set according to requirements, usually 1/100 to 1/200 of the length of the simulation domain; the time range and space range of the simulation domain can be determined according to the antenna.
- J(x,t) represents the function of the point source
- δ(x−x₀) represents the Dirac function
- g(t) represents the signal of the excitation source
- x₀ represents the position of the excitation source.
- the function of this point source, J(x,t) = δ(x−x₀)·g(t), represents an excitation source signal of the form g(t) applied at x₀ in the simulation domain.
- the continuous probability density function δ_α(x) is used to replace δ(x−x₀); since the continuous probability density function approaches the Dirac function, this can be expressed as δ(x−x₀) ≈ δ_α(x)
- δ_α(x) represents an abstract typical distribution; the specific form may be a Gaussian distribution, a Cauchy distribution or an exponential distribution. The forms of several distributions can be understood by referring to Table 1 below.
- using a continuous probability density function δ_α(x) that approaches the Dirac function in place of the Dirac function overcomes the bottleneck that PINNs cannot handle point-source problems.
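Taking the Gaussian form from Table 1 as an example, the smoothed point source can be sketched as follows; the function names and the one-dimensional setting are illustrative assumptions.

```python
import numpy as np

def gaussian_delta(x, x0, alpha):
    """Gaussian probability density of width alpha centred at x0;
    as alpha -> 0 it approaches the Dirac function delta(x - x0)."""
    return np.exp(-0.5 * ((x - x0) / alpha) ** 2) / (alpha * np.sqrt(2.0 * np.pi))

def point_source(x, t, x0, alpha, g):
    """Smoothed point-source term: delta_alpha(x - x0) * g(t)."""
    return gaussian_delta(x, x0, alpha) * g(t)

# like the Dirac function, the smoothed density integrates to 1
xs = np.linspace(-1.0, 1.0, 20001)
dx = xs[1] - xs[0]
mass = gaussian_delta(xs, 0.0, 0.05).sum() * dx
```

Because the smoothed density is differentiable everywhere, it can appear directly in the PDE residual that the network is trained against, which is exactly what the Dirac function prevents.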
- an embodiment of the PINNs-based model training method provided by the embodiment of the present application can be understood with reference to FIG. 4 , as shown in FIG. 4 , an embodiment of the PINNs-based model training method provided by the embodiment of the present application may include:
- the computer device acquires multiple sampling point data from the simulation domain of the antenna.
- the plurality of sampling point data includes sampling point data of the active area, sampling point data of the passive area, boundary data of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the active area and the passive area.
- there are four types of sampling point data: sampling point data in the active area, sampling point data in the passive area, boundary data of the simulation domain, and initial data of the simulation domain.
- the boundaries of the simulation domain are usually rebound (reflecting) boundaries or absorbing boundaries; different boundary types have a great influence on the results of electromagnetic simulation.
- the sampling point data is usually four-dimensional, including the three-dimensional space coordinates of the sampling point and the one-dimensional time information of the sampling point.
- the sampling point data in the active area can be expressed as U SRC , the sampling point data in the passive area as U NO_SRC , the boundary data of the simulation domain as U BC , and the initial data of the simulation domain as U IC .
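A hypothetical sketch of drawing the four sampling groups from a cubic space-time domain follows; uniform sampling, the cube shape, and the particular boundary/initial projections are all assumptions for illustration.

```python
import numpy as np

def sample_domain(n, L=1.0, t_max=1.0, x0=(0.5, 0.5, 0.5), r_src=0.1, seed=0):
    """Draw n four-dimensional sampling points (x, y, z, t) and split
    them into the four groups U_SRC, U_NO_SRC, U_BC, U_IC."""
    rng = np.random.default_rng(seed)
    xyz = rng.uniform(0.0, L, size=(n, 3))          # 3-D space coordinates
    t = rng.uniform(0.0, t_max, size=(n, 1))        # 1-D time information
    data = np.hstack([xyz, t])
    dist = np.linalg.norm(xyz - np.asarray(x0), axis=1)
    u_src = data[dist <= r_src]                     # active (near-source) region
    u_no_src = data[dist > r_src]                   # passive region
    u_bc = data.copy(); u_bc[:, 0] = 0.0            # boundary samples: x = 0 face
    u_ic = data.copy(); u_ic[:, 3] = 0.0            # initial samples: t = 0
    return u_src, u_no_src, u_bc, u_ic

u_src, u_no_src, u_bc, u_ic = sample_domain(200)
```

In practice the boundary set would cover all faces of the domain and the sampling density would be tuned per region; this sketch only shows the four-way split.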
- the computer device inputs the product of each training sample among the multiple training samples and the corresponding coefficient of each residual network channel to each residual network channel of the first neural network.
- each training sample includes a sampling point data and a hidden vector corresponding to the simulation domain.
- Training samples refer to the sample data used to train the PINNs model.
- the training samples include not only sampling point data, but also hidden vector Z corresponding to the simulation domain.
- the training sample containing U SRC can be denoted (Z, U SRC )
- the training sample containing U NO_SRC can be denoted (Z, U NO_SRC )
- the implicit vector Z is used to represent the parameter settings of different electromagnetic simulation scenarios.
- the hidden vector Z adopts a low-dimensional vector, and the commonly used dimension selection can be 16, 32, 64, 128, etc.
- the coefficients corresponding to the residual network channels are different. As shown in Figure 5, there are n residual network channels in the first neural network, from residual network channel 1 to residual network channel n; the coefficient corresponding to residual network channel 1 is a 1 , the coefficient corresponding to residual network channel 2 is a 2 , ..., and the coefficient corresponding to residual network channel n is a n . These n coefficients can also be expressed as the set {a 1 , a 2 , ..., a n }. In this way, when the training sample is X, the inputs of the residual network channels can be expressed as {a 1 X, a 2 X, ..., a n X}.
- the X may be any one of X SRC , X NO_SRC , X BC and X IC mentioned above.
- the training samples come from multiple electromagnetic simulation scenarios, that is, from a variety of different antenna simulation domains, so each different simulation domain has a corresponding hidden vector; if there are N different simulation domains, the N latent vectors can be expressed as {Z 1 , ..., Z N }.
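The per-channel inputs {a 1 X, ..., a n X} can be formed as below; the exponentially increasing choice a_i = 2^(i−1) follows the later statement that the channel coefficients increase exponentially, and the sample shape is an assumption.

```python
import numpy as np

def channel_inputs(X, coeffs):
    """Scale one training sample X by each channel coefficient,
    producing the per-channel inputs {a_1 X, a_2 X, ..., a_n X}."""
    return [a * np.asarray(X) for a in coeffs]

# exponentially increasing coefficients, e.g. a_i = 2**(i-1)
coeffs = [2.0 ** i for i in range(4)]
inputs = channel_inputs(np.ones(4), coeffs)
```

Scaling the same sample by different factors exposes each channel to a different effective frequency band of the input, which is how one data item is expanded into multiple.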
- the computer device processes the data input into each residual network channel through the first neural network to obtain an output data set.
- the output data set includes active output data, passive output data, boundary output data and initial output data.
- the active output data is the sum of the output data of each residual network channel when one of the training samples contains sampling point data of the active area
- the passive output data is the sum of the output data of each residual network channel when one of the training samples contains sampling point data of the passive area
- the boundary output data is the sum of the output data of each residual network channel when one of the training samples contains boundary data
- the initial output data is the sum of the output data of each residual network channel when one of the training samples contains the initial data.
- This output data set can be expressed as ⁇ Y SRC , Y NO_SRC , Y BC , Y IC ⁇ .
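Summing the per-channel outputs into a single six-dimensional field vector, as described above, can be sketched as follows (the channel count and output shape are illustrative).

```python
import numpy as np

def model_output(channel_outputs):
    """Sum the per-channel outputs into one six-dimensional field
    vector and split it into electric and magnetic components:
    (Ex, Ey, Ez) and (Hx, Hy, Hz)."""
    y = np.sum(channel_outputs, axis=0)
    return y[:3], y[3:]

E, H = model_output(np.ones((3, 6)))   # 3 channels, 6-dim output each
```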
- the computer device processes the output data set through partial differential equations to obtain an overall loss function.
- the total loss function is obtained according to the active loss function, passive loss function, boundary loss function and initial loss function.
- the active loss function refers to the loss function obtained by the active output data
- the passive loss function refers to the loss function obtained by the passive output data
- the boundary loss function refers to the loss function obtained by the boundary output data
- the initial loss function refers to the loss function obtained from the initial output data.
- the active loss function can be expressed by L SRC
- the passive loss function can be expressed by L NO_SRC
- the boundary loss function can be expressed by L BC
- the initial loss function can be expressed by L IC .
- the process of obtaining the total loss function may be: each time an output data in the output data set is used as a known quantity of the partial differential equation, and the partial differential equation is operated to obtain a loss function corresponding to the output data;
- the loss function corresponding to each output data in the output data set is accumulated according to the preset relationship to obtain the total loss function.
- the preset relationship includes learnable parameters, and the learnable parameters corresponding to different loss functions are different.
- the partial differential equation can be a point source Maxwell equation
- the output data Y is usually six-dimensional, including three-dimensional electric field data and three-dimensional magnetic field data; the electric field data and magnetic field data in the output data Y are shown in Figure 6.
- E in Figure 6 represents the electric field
- H represents the magnetic field
- x, y, and z in the table represent the three spatial dimensions respectively.
- the total loss function can be accumulated according to the preset relationship.
- the preset relationship includes learnable parameters and hyperparameters.
- the learnable parameters corresponding to different loss functions related to the total loss function are different.
- the learnable parameters change as the parameters in the first neural network are updated, and the hyperparameter is used to assist in weighting the loss function corresponding to each learnable parameter.
- the preset relationship can be expressed as a weighted combination of the four loss functions, with the following notation:
- L total represents the total loss function
- L i represents the four types of loss functions
- ε is a hyperparameter
- the value of this hyperparameter can be 0.01.
- this is just an example of the hyperparameter value.
- the dynamically weighted loss function is implemented through hyperparameters and learnable parameters, balancing the weights of the various loss functions, which can accelerate convergence in the neural network training process.
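The following is a hypothetical form of such a dynamically weighted total loss: normalised learnable weights plus the hyperparameter ε. The exact preset relationship in the disclosure may differ; this only illustrates the balancing idea.

```python
import numpy as np

def total_loss(losses, lam, eps=0.01):
    """Hypothetical dynamically weighted total loss: each of the four
    loss terms L_i is scaled by a learnable, softmax-normalised weight
    derived from lam, offset by the hyperparameter eps so that no term
    is ever weighted to zero."""
    losses = np.asarray(losses, dtype=float)
    w = np.exp(lam) / np.exp(lam).sum()   # learnable, normalised weights
    return float(np.sum((w + eps) * losses))

# four equal weights (lam = 0) over L_SRC, L_NO_SRC, L_BC, L_IC
L = total_loss([1.0, 2.0, 3.0, 4.0], np.zeros(4))
```

With lam all zero each weight is 0.25, so L = (0.25 + 0.01) × (1 + 2 + 3 + 4) = 2.6; during training lam is updated together with the network parameters.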
- the computer device updates parameters in the first neural network according to the total loss function to obtain a second neural network.
- the first neural network is used to represent the neural network before one iteration
- the second neural network is used to represent the neural network after one iteration
- the hidden vector Z of the simulation domain and the learnable parameter λ in the above preset relationship can also be updated; that is, the network parameters θ, Z, and λ can all be updated according to L total
- gradient descent can be used to update θ, Z, and λ: the current-iteration values of θ, Z, and λ are adjusted along the negative gradient to obtain new θ, Z, and λ, and the next iteration process begins.
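A generic gradient-descent update over θ, Z and λ can be sketched as follows; the learning rate and the plain SGD rule are assumptions (the disclosure does not fix a particular optimiser here).

```python
def gradient_step(params, grads, lr=1e-3):
    """One gradient-descent update, applied jointly to the network
    parameters theta, the latent vector Z and the learnable loss
    weights lambda: each value moves along its negative gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

theta, Z, lam = 1.0, 0.5, 0.0
theta, Z, lam = gradient_step([theta, Z, lam], [2.0, 1.0, -1.0], lr=0.5)
```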
- the second neural network is used as the first neural network, and the above training process is iteratively executed until the second neural network reaches the convergence condition, so as to obtain the target physical information neural network model.
- the term target PINNs model is relative to the initial PINNs model before model training starts; the parameters in the first neural network of the initial PINNs model usually deviate more from their converged values.
- the parameters in the first neural network are updated until the convergence condition is reached, and the second neural network is obtained.
- the parameters in the second neural network can be understood as fixed, and the entire model at this time is called the target PINNs model.
- the first neural network of the PINNs model includes multiple residual network channels, and the coefficient corresponding to each residual network channel is different; in this way, in the model training stage, multiplying the same data by different coefficients expands one piece of data into multiple, and signals of different frequencies can be captured through the multiple residual network channels, thereby improving the accuracy of model training.
- an embodiment of the incremental learning provided by the embodiment of the present application includes:
- the computer device acquires multiple sampling point data from a simulation domain of the antenna to be optimized.
- the plurality of sampling point data includes sampling point data of the active area, sampling point data of the passive area, boundary data of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the active area and the passive area.
- the sampling point data in this embodiment of the present application can be understood by referring to the sampling point data in step 201 above; however, here the sampling point data comes from the simulation domain of the antenna to be optimized, that is, from the simulation domain of a new electromagnetic simulation scene.
- the computer device inputs a plurality of sample data to the target physical information neural network, wherein each sample data includes a sample point data and the first hidden vector of the simulation domain.
- the target physical information neural network is the target physical information neural network obtained by the PINNs-based model training method.
- the computer device obtains the output data corresponding to each sample data through the target physical information neural network.
- the computer equipment controls the parameters in the physical information neural network to remain unchanged, and adjusts the first hidden vector in the simulation domain according to the output data to obtain the second hidden vector.
- the adjustment of the hidden vector in the embodiment of the present application may be performed in a gradient descent manner.
- taking the second latent vector as the first latent vector, the above adjustment of the first latent vector is executed iteratively with different sample data until the output data meets the preset requirements of the antenna to be optimized, so as to obtain the second latent vector matching the simulation domain.
- the first hidden vector may be understood as a hidden vector before iteration
- the second hidden vector may be understood as a hidden vector after iteration
- the parameters in the target physical information neural network are frozen, and the hidden vector of the simulation domain of the antenna to be optimized is repeatedly adjusted through the output data of the target physical information neural network until the hidden vector matching the simulation domain is obtained.
- this method can quickly learn hidden vectors and improve the acquisition speed of hidden vectors in new electromagnetic simulation scenarios.
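The freeze-and-adjust loop can be sketched as below; the finite-difference gradient and the squared-error objective are stand-ins for illustration (in practice the gradient would be obtained by backpropagating through the frozen network).

```python
import numpy as np

def learn_latent(forward, z0, target, steps=300, lr=0.1, h=1e-5):
    """Incremental-learning sketch: the network `forward` is frozen and
    only the latent vector z is updated by gradient descent on a
    squared error, estimated here with finite differences."""
    z = np.asarray(z0, dtype=float)
    for _ in range(steps):
        base = np.sum((forward(z) - target) ** 2)
        grad = np.zeros_like(z)
        for i in range(z.size):
            zp = z.copy()
            zp[i] += h
            grad[i] = (np.sum((forward(zp) - target) ** 2) - base) / h
        z = z - lr * grad   # the network weights stay untouched
    return z

frozen_net = lambda z: z          # stand-in for the frozen PINNs model
z_fit = learn_latent(frozen_net, np.zeros(2), np.array([1.0, -2.0]))
```

Because only the low-dimensional vector z is optimised while all network weights stay fixed, this search is far cheaper than retraining, which is the source of the speed-up claimed below.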
- the developers have done related experiments.
- the time taken to obtain the hidden vector Z of a new electromagnetic simulation scene using the incremental learning scheme provided by this application was compared with that of the original method; as can be seen from the time comparison in Figure 9, at a 5% error level, the scheme of this application needs only 200 seconds to obtain the hidden vector Z of the new electromagnetic simulation scene, while the original method takes 3337 seconds, so the solution of this application greatly improves the speed.
- the target PINNs model can be stored in the form of a model file, and the computer equipment used for electromagnetic simulation (such as terminal equipment, a server or a VM, etc.) needs to use the target PINNs model
- the computer equipment used for electromagnetic simulation can actively load the model file of the target PINNs model, or the model file storing the target PINNs model can be actively sent to the computer equipment used for electromagnetic simulation, which installs the model file of the target PINNs model.
- the target PINNs model can be used for electromagnetic simulation.
- the simulation results can be the schematic diagram of electromagnetic field distribution shown in Figure 10, or some performance indicators of the simulated antenna, such as: electromagnetic field data of each point in the antenna simulation domain.
- the electromagnetic field data includes electric field data and magnetic field data, such as electric field strength and magnetic field strength; in this way, the antenna can be optimally designed based on the results of the electromagnetic simulation.
- the electromagnetic simulation solution provided in the embodiment of the present application uses the target PINNs model of the multi-residual network channel to perform the electromagnetic simulation process, which greatly improves the accuracy of the electromagnetic simulation.
- the above describes the model training method based on the physical information neural network and the method of incremental learning.
- the following describes the physical-information-neural-network-based model training device 40 provided by the embodiment of the present application with reference to FIG. 11; the model training device 40 includes:
- the acquisition unit 401 is configured to acquire multiple sampling point data from the simulation domain of the antenna; the multiple sampling point data include sampling point data of the active area, sampling point data of the passive area, boundary data of the simulation domain, and initial data of the simulation domain; the simulation domain includes the active area and the passive area.
- the function of the acquiring unit 401 can be understood by referring to step 201 in the above method embodiment.
- the first processing unit 402 is configured to input the product of each training sample in multiple training samples and the corresponding coefficient of each residual network channel to each residual network channel of the first neural network, and each training sample includes an acquisition unit 401 The acquired data of a sampling point and the hidden vector corresponding to the simulation domain, the coefficients corresponding to each residual network channel are different.
- the function of the first processing unit 402 can be understood by referring to step 202 in the above method embodiment.
- the second processing unit 403 is configured to process, through the first neural network, the data input into each residual network channel by the first processing unit 402 to obtain an output data set, wherein the output data set includes active output data, passive output data, boundary output data, and initial output data.
- the function of the second processing unit 403 can be understood by referring to step 203 in the above method embodiment.
- the third processing unit 404 is configured to process the output data set through a partial differential equation to obtain a total loss function, and the total loss function is obtained according to the active loss function, the passive loss function, the boundary loss function and the initial loss function.
- the function of the third processing unit 404 can be understood by referring to step 204 in the above method embodiment.
- the fourth processing unit 405 is configured to update the parameters in the first neural network according to the total loss function to obtain the second neural network.
- the function of the fourth processing unit 405 can be understood by referring to step 205 in the above method embodiment.
- the second neural network is used as the first neural network, and the above training process is iteratively executed until the second neural network reaches the convergence condition, so as to obtain the target physical information neural network model.
- the first neural network of the PINNs model includes multiple residual network channels, and the coefficient corresponding to each residual network channel is different; in this way, in the model training stage, multiplying the same data by different coefficients expands one piece of data into multiple, and signals of different frequencies can be captured through the multiple residual network channels, thereby improving the accuracy of model training.
- the active region is an area in the simulation domain centered on the point source corresponding to the excitation source and having a radius of the first length; the first length is related to the first parameter in the continuous probability density function, the continuous probability density function approaches the Dirac function, and the function of the point source is the product of the continuous probability density function and the signal of the excitation source; the passive area is the area in the simulation domain other than the active area and the boundary.
- the active output data is the sum of the output data of each residual network channel when one of the training samples contains sampling point data in the active area
- the passive output data is the The sum of the output data of each residual network channel when a training sample contains sampling point data in the passive area
- the boundary output data is the sum of the output data of each residual network channel when one of the training samples contains boundary data
- the initial output data is the sum of the output data of each residual network channel when one of the training samples contains the initial data.
- each residual network channel includes a sinusoidal periodic activation function; the sinusoidal periodic activation function is used to convert the data in each residual network channel into electric field parameters and magnetic field parameters as the output of each residual network channel data.
- the coefficient corresponding to each residual network channel increases exponentially.
- the third processing unit 404 is configured to use one output data in the output data set each time as a known quantity of the partial differential equation, and perform operations on the partial differential equation to obtain a loss function corresponding to the output data;
- the loss function corresponding to each output data in the data set is accumulated according to the preset relationship to obtain the total loss function.
- the preset relationship includes learnable parameters and hyperparameters; different loss functions related to the total loss function correspond to different learnable parameters
- the learnable parameters are updated as the parameters in the first neural network are updated
- the hyperparameters are used to assist in weighting the loss function corresponding to each learnable parameter.
- the fourth processing unit 405 is also configured to update the hidden vector of the simulation domain and the learnable parameters in the preset relationship.
- the preset relationship includes learnable parameters, and different loss functions correspond to different learnable parameters.
- the physical information neural network-based model training device 40 described above can be understood by referring to the corresponding descriptions in the foregoing method embodiments, and will not be repeated here.
- an embodiment of the incremental learning device 50 provided by the embodiment of the present application includes:
- the acquiring unit 501 is configured to acquire multiple sampling point data from the simulation domain of the antenna to be optimized, the multiple sampling point data includes sampling point data of the active area, sampling point data of the passive area, and boundary data of the simulation domain , and the initial data of the simulation domain, which includes active and passive regions.
- the obtaining unit 501 may execute step 301 in the foregoing method embodiments.
- the first processing unit 502 is configured to input a plurality of sample data into the target physical information neural network, wherein each sample data includes one sampling point data and the first hidden vector of the simulation domain; the target physical information neural network is the one obtained by the physical-information-neural-network-based model training method.
- the first processing unit 502 may execute step 302 in the foregoing method embodiment.
- the second processing unit 503 is configured to obtain output data corresponding to each sample data through the target physical information neural network.
- the second processing unit 503 may execute step 303 in the above method embodiment.
- the third processing unit 504 is configured to control the parameters in the physical information neural network to remain unchanged, and adjust the first hidden vector in the simulation domain according to the output data to obtain the second hidden vector.
- the third processing unit 504 may execute step 304 in the above method embodiment.
- taking the second latent vector as the first latent vector, the above adjustment of the first latent vector is executed iteratively with different sample data until the output data meets the preset requirements of the antenna to be optimized, so as to obtain the second latent vector matching the simulation domain.
- the parameters in the target physical information neural network are frozen, and the hidden vector of the simulation domain of the antenna to be optimized is repeatedly adjusted through the output data of the target physical information neural network until the hidden vector matching the simulation domain is obtained.
- this method can quickly learn hidden vectors and improve the acquisition speed of hidden vectors in new electromagnetic simulation scenarios.
- An embodiment of the present application provides an electromagnetic simulation device; the electromagnetic simulation device is installed with the above target physical information neural network model and can simulate the antenna through the target physical information neural network model to obtain the electromagnetic field distribution of the antenna's simulation domain.
- FIG. 13 is a schematic diagram of a possible logical structure of the computer device 60 provided by the embodiment of the present application.
- the computer equipment 60 may be a model training device based on a physical information neural network, or an incremental learning device or an electromagnetic simulation device.
- the computer device 60 includes: a processor 601 , a communication interface 602 , a memory 603 and a bus 604 .
- the processor 601 , the communication interface 602 and the memory 603 are connected to each other through a bus 604 .
- the processor 601 is used to control and manage the actions of the computer device 60.
- for example, the processor 601 is used to execute the processes in the method embodiments shown in FIG. 1 to FIG. 9, and the communication interface 602 is used to support the communication of the computer device 60.
- the memory 603 is used for storing program codes and data of the computer device 60 .
- the processor 601 may be a central processing unit, a general processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
- the processor 601 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
- the bus 604 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus or the like.
- a computer-readable storage medium stores computer-executable instructions; when the processor of a device executes the computer-executable instructions, the device executes the above model training method based on the physical information neural network, the above incremental learning method, or the above electromagnetic simulation method.
- a computer program product includes computer-executable instructions stored in a computer-readable storage medium; when the processor of the device executes the computer-executable instructions , the device executes the above-mentioned model training method based on the physical information neural network, the incremental learning method, or the above-mentioned electromagnetic simulation method.
- in another embodiment, a chip system is also provided; the chip system includes a processor, and the processor is used to implement the above model training method based on the physical information neural network, the above incremental learning method, or to perform the above electromagnetic simulation method.
- the system-on-a-chip may further include a memory, which is used for storing necessary program instructions and data of the device for inter-process communication.
- the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
- the disclosed systems, devices and methods may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of units is only a logical function division; in actual implementation there may be other division methods, multiple units or components may be combined or integrated into another system, and some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
- the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- the technical solution of the embodiment of the present application is essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , including several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods in the various embodiments of the embodiments of the present application.
- the aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disc and other media that can store program codes. .
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This application discloses a model training method based on physics-informed neural networks (PINNs). The PINNs include a first neural network and a partial differential equation, and the first neural network includes at least two residual network channels. During model training, the at least two residual network channels process the training samples, and each residual network channel has a different coefficient. Multiplying the same training sample by different coefficients expands one data point into multiple data points, and the multiple residual network channels can capture signals of different frequencies, thereby improving the accuracy of model training.
Description
This application claims priority to Chinese patent application No. 202111069844.8, filed with the China National Intellectual Property Administration on September 13, 2021, and entitled "Model training method based on physics-informed neural network and related apparatus", the entire content of which is incorporated herein by reference.

This application relates to the field of computer technology, and in particular to a model training method based on a physics-informed neural network and a related apparatus.

Electromagnetic simulation is the main technique for designing, optimizing, and analyzing antennas and antenna arrays. Through electromagnetic simulation, performance indicators of the simulated antenna, such as return loss and antenna energy efficiency, can be computed to guide antenna design or optimization.

A traditional way to compute such performance indicators is to first mesh the antenna's simulation domain and then solve Maxwell's equations on the discrete mesh, obtaining the full electromagnetic field for subsequent optimization analysis. Statistics show that discrete meshing typically takes tens of minutes to several hours, and for meshes on the order of tens of millions of cells, solving the governing equations takes 4 to 8 hours; this approach is too time-consuming.

There are also schemes that compute electromagnetic simulation performance indicators with physics-informed neural network (PINN) models, but the accuracy of the indicators computed by currently trained PINN models is low, which is unfavorable for antenna optimization.
Summary of the Invention

This application provides a model training method based on physics-informed neural networks (PINNs) for improving the accuracy of model training. This application also provides corresponding apparatuses, computer devices, computer-readable storage media, and computer program products.
A first aspect of this application provides a model training method based on physics-informed neural networks (PINNs). The physics-informed neural network includes a first neural network and a partial differential equation, and the first neural network includes at least two residual network channels. The method includes: obtaining multiple sampling-point data from the simulation domain of an antenna, where the multiple sampling-point data include sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the source region and the source-free region; inputting, into each residual network channel of the first neural network, the product of each of multiple training samples and the coefficient corresponding to that residual network channel, where each training sample includes one sampling-point data and a latent vector corresponding to the simulation domain, and each residual network channel has a different coefficient; processing the data input into each residual network channel through the first neural network to obtain an output data set, where the output data set includes source output data, source-free output data, boundary output data, and initial output data; processing the output data set through the partial differential equation to obtain a total loss function, where the total loss function is related to a source loss function, a source-free loss function, a boundary loss function, and an initial loss function; updating the parameters in the first neural network according to the total loss function to obtain a second neural network; and using the second neural network as the first neural network and iteratively performing the above training process until the second neural network reaches a convergence condition, to obtain a target physics-informed neural network model for electromagnetic simulation of the antenna.
In this application, a PINN adds physical equations as constraints to the neural network so that the training results satisfy physical laws. This constraint is imposed by adding the residuals of the physical equations before and after each iteration to the loss function of the neural network, so that the physical equations also "participate" in the training process. During training iterations the neural network thus optimizes not only its own loss function but also the residual of the physical equations at each iteration, so that the final trained result satisfies physical laws.

In this application, the first neural network denotes the neural network before one iteration, and the second neural network denotes the neural network after one iteration. The first neural network includes multiple residual network channels; in this application, "multiple" means two or more. Each residual network channel can convert input data into output data in electromagnetic form.

In this application, the partial differential equation may be the point-source Maxwell equations.

In this application, the simulation domain of the antenna refers to the simulated coverage region of the antenna's electromagnetic waves. The antenna may be the antenna of a terminal or of a network device. The antennas of different terminals or network devices are usually different, so the simulation domains of different antennas also differ.

In this application, the simulation domain includes a source region, a source-free region, and a boundary. The source region refers to the near-source region, including the excitation source itself, that is affected by an excitation source added to the antenna array in simulation; the boundary refers to the edge of the simulation domain; and the source-free region refers to the region of the simulation domain other than the source region and the boundary. The boundary of the simulation domain is usually a reflecting boundary or an absorbing boundary, and different types of boundaries greatly affect the results of electromagnetic simulation.

In this application, the simulation domain of the antenna may include the respective simulation domains of multiple different antennas, and the latent vector corresponding to each simulation domain may differ.

In this application, sampling-point data refers to the data corresponding to a sampling point. There are four types: sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain. The initial data usually refers to the electric field data and magnetic field data of the simulation domain in its initial state (usually t = 0 in the time dimension), which are usually zero. Sampling-point data is usually four-dimensional, comprising the three-dimensional spatial coordinates of the sampling point and its one-dimensional time information, and can be written as U = (x, y, z, t).

In this application, a training sample refers to sample data used to train the model. A training sample includes not only sampling-point data but also the latent vector Z corresponding to the simulation domain, and can be written as X = (Z, U).

In this application, the latent vector Z characterizes the parameter settings of different electromagnetic simulation scenarios. In this application, Z is a low-dimensional vector; commonly chosen dimensions are 16, 32, 64, 128, and so on.

In this application, since there are four types of sampling-point data, there are also four types of training samples: samples containing sampling-point data of the source region, samples containing sampling-point data of the source-free region, samples containing boundary data of the simulation domain, and samples containing initial data of the simulation domain.
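As a minimal illustration of the sample layout X = (Z, U) described above, here is a Python sketch; the 16-dimensional latent vector, point counts, and helper name are assumptions for illustration, not taken from the application:

```python
import numpy as np

def make_samples(points, z):
    """Attach the simulation-domain latent vector Z to each 4-D
    sampling point U = (x, y, z, t), giving samples X = (Z, U)."""
    z_rep = np.tile(z, (points.shape[0], 1))        # one copy of Z per point
    return np.concatenate([z_rep, points], axis=1)  # shape (n, dim(Z) + 4)

rng = np.random.default_rng(0)
z = rng.normal(size=16)              # 16-dimensional latent vector (assumed)
u_src = rng.uniform(size=(8, 4))     # 8 source-region points (x, y, z, t)
x_src = make_samples(u_src, z)
print(x_src.shape)                   # (8, 20)
```

The same helper would be applied to source-free, boundary, and initial points to build the other three sample types.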
In this application, each type of training sample is input one by one into each residual network channel; each channel produces an output for that type, and the outputs of all channels are then aggregated to obtain one output for that input. Therefore, there are also four types of output data: source output data, source-free output data, boundary output data, and initial output data. In addition, each residual network channel has a different coefficient, which differentiates the same training sample and thereby improves training accuracy.

In this application, since there are four types of training samples, there are four types of output data and four types of loss functions; the total loss function obtained from the four types of loss functions is used to update the parameters in the first neural network to obtain the second neural network.

In this application, the parameters in the first neural network may be updated by gradient descent.

In this application, the target PINN model is defined relative to the initial PINN model before training begins. The parameters in the first neural network of the initial PINN are usually large. During training, the parameters of the first neural network are continually updated with training samples until a convergence condition is reached and the second neural network is obtained; at that point the parameters of the second neural network can be regarded as fixed, and the whole model is called the target PINN model.

As described in the first aspect above, because the first neural network of the PINN includes multiple residual network channels and each channel has a different coefficient, the same training sample can be multiplied by different coefficients during training, expanding one data point into multiple, and the multiple residual network channels can capture signals of different frequencies, thereby improving the accuracy of model training.
In a possible implementation of the first aspect, the source region is the region of the simulation domain centered on the point source corresponding to the excitation source, with a radius of a first length. The first length is related to a first parameter of a continuous probability density function that approaches the Dirac function, and the point-source function is the product of the continuous probability density function and the signal of the excitation source. The source-free region is the region of the simulation domain other than the source region and the boundary.

In this application, the excitation source is treated as a point source, whose function can be written as J(x, t) = η_α(x) g(t). Compared with the existing point-source function J(x, t) = δ(x − x_0) g(t), the Dirac function δ(x − x_0) is replaced by the continuous probability density function η_α(x). Here J(x, t) denotes the point-source function, δ(x − x_0) the Dirac function, g(t) the signal of the excitation source, and x_0 the position of the excitation source. The point-source function represents applying an excitation signal of the form g(t) at position x_0 of the simulation domain.

In this application, the continuous probability density function η_α(x), which approaches the Dirac function, replaces δ(x − x_0); this can be written as δ(x − x_0) ~ η_α(x). η_α(x) denotes an abstracted typical distribution whose concrete form may be Gaussian, Cauchy, or exponential.

In this possible implementation, replacing the Dirac function with the continuous probability density function η_α(x) that approaches it overcomes the bottleneck that PINNs cannot handle point-source problems.
In a possible implementation of the first aspect, the source output data is the sum of the output data of every residual network channel when one of the multiple training samples contains sampling-point data of the source region; the source-free output data is the corresponding sum when a sample contains sampling-point data of the source-free region; the boundary output data is the corresponding sum when a sample contains boundary data; and the initial output data is the corresponding sum when a sample contains initial data.

In one possible implementation, the data output by each residual network channel may be multiplied by coefficients and then summed.

In this possible implementation, the channel outputs may be added directly, or each channel's output may first be multiplied by coefficients and then summed. Summing the outputs of multiple residual network channels before applying the partial differential operator improves the accuracy of model training.
In a possible implementation of the first aspect, each residual network channel includes a sinusoidal periodic activation function, which converts the data in the channel into electric field parameters and magnetic field parameters as the channel's output data.

In this possible implementation, each residual network channel may include a residual network and a sinusoidal periodic activation function. The residual network can optimize the first neural network model and improve its performance, and the sinusoidal periodic activation function yields the electric field data and magnetic field data. Combining the residual network with the sinusoidal periodic activation function effectively improves the accuracy of the model.

In a possible implementation of the first aspect, the coefficients corresponding to the residual network channels increase exponentially.

In this possible implementation, the coefficient of each of the multiple residual network channels increases exponentially; for example, with four residual network channels the coefficients may be 1, 2, 4, and 8. This exponential growth quickly differentiates the same data, improving the accuracy of model training.
In a possible implementation of the first aspect, processing the output data set through the partial differential equation to obtain the total loss function includes: each time taking one output datum from the output data set as a known quantity of the partial differential equation and evaluating the equation to obtain the loss function corresponding to that output datum; and accumulating the loss functions corresponding to each output datum in the output data set according to a preset relationship to obtain the total loss function.

In a possible implementation of the first aspect, the preset relationship includes learnable parameters and a hyperparameter. Different loss functions related to the total loss function correspond to different learnable parameters; the learnable parameters are updated along with the parameters of the first neural network, and the hyperparameter assists the learnable parameters in weighting the corresponding loss functions.

In a possible implementation of the first aspect, when the parameters in the first neural network are updated according to the total loss function, the method further includes: updating the latent vector of the simulation domain and the learnable parameters in the preset relationship.
A second aspect of this application provides an incremental learning method, including: obtaining multiple sampling-point data from the simulation domain of an antenna to be optimized, where the multiple sampling-point data include sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the source region and the source-free region; inputting multiple sample data into a target physics-informed neural network, where each sample datum includes one sampling-point data and a first latent vector of the simulation domain, and the target physics-informed neural network is the target model trained according to the first aspect or any possible implementation of the first aspect; obtaining, through the target physics-informed neural network, the output data corresponding to each sample datum; keeping the parameters in the target physics-informed neural network unchanged and adjusting the first latent vector of the simulation domain according to the output data to obtain a second latent vector; and using the second latent vector as the first latent vector and iteratively performing the above adjustment with different sample data until the output data meet the preset requirements of the antenna to be optimized, to obtain a second latent vector matching the simulation domain.

In the second aspect, during incremental learning the parameters in the target physics-informed neural network are frozen, and the latent vector of the simulation domain of the antenna to be optimized is adjusted repeatedly according to the network's output data until a latent vector matching the simulation domain is obtained. This approach learns the latent vector quickly and speeds up obtaining latent vectors for new electromagnetic simulation scenarios.
A third aspect of this application provides an electromagnetic simulation method, including simulating an antenna with the target physics-informed neural network model trained according to the first aspect or any possible implementation of the first aspect, to obtain the electromagnetic field distribution of the antenna.

A fourth aspect of this application provides a model training apparatus based on a physics-informed neural network. The apparatus has the function of implementing the method of the first aspect or any possible implementation of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, for example, an obtaining unit and one or more processing units.

A fifth aspect of this application provides an incremental learning apparatus. The apparatus has the function of implementing the method of the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, for example, an obtaining unit and one or more processing units.

A sixth aspect of this application provides an electromagnetic simulation apparatus. The apparatus has the function of implementing the method of the third aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, for example, one or more processing units.
A seventh aspect of this application provides a computer device including at least one processor, a memory, an input/output (I/O) interface, and computer-executable instructions stored in the memory and runnable on the processor; when the instructions are executed by the processor, the processor performs the method of the first aspect or any possible implementation of the first aspect.

An eighth aspect provides a computer device including at least one processor, a memory, an I/O interface, and computer-executable instructions stored in the memory and runnable on the processor; when the instructions are executed by the processor, the processor performs the method of the second aspect.

A ninth aspect provides a computer device including at least one processor, a memory, an I/O interface, and computer-executable instructions stored in the memory and runnable on the processor; when the instructions are executed by the processor, the processor performs the method of the third aspect.

A tenth aspect provides a computer-readable storage medium storing one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method of the first aspect or any possible implementation of the first aspect.

An eleventh aspect provides a computer-readable storage medium storing one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method of the second aspect.

A twelfth aspect provides a computer-readable storage medium storing one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method of the third aspect.

A thirteenth aspect provides a computer program product storing one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method of the first aspect or any possible implementation of the first aspect.

A fourteenth aspect provides a computer program product storing one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method of the second aspect.

A fifteenth aspect provides a computer program product storing one or more computer-executable instructions; when the instructions are executed by a processor, the processor performs the method of the third aspect.
A sixteenth aspect of this application provides a chip system including at least one processor configured to implement the functions involved in the first aspect or any possible implementation of the first aspect. In one possible design, the chip system may further include a memory for storing the program instructions and data necessary for the apparatus that processes the artificial intelligence model. The chip system may consist of chips, or may include chips and other discrete devices.

A seventeenth aspect provides a chip system including at least one processor configured to implement the functions involved in the second aspect. In one possible design, the chip system may further include a memory for storing the program instructions and data necessary for the apparatus for data processing based on the artificial intelligence model. The chip system may consist of chips, or may include chips and other discrete devices.

An eighteenth aspect provides a chip system including at least one processor configured to implement the functions involved in the third aspect. In one possible design, the chip system may further include a memory for storing the program instructions and data necessary for the apparatus for data processing based on the artificial intelligence model. The chip system may consist of chips, or may include chips and other discrete devices.
FIG. 1 is a schematic structural diagram of a physics-informed neural network model according to an embodiment of this application;

FIG. 2 is a schematic diagram of model training according to an embodiment of this application;

FIG. 3 is a schematic diagram of the simulation domain of an antenna according to an embodiment of this application;

FIG. 4 is a schematic diagram of an embodiment of the model training method according to an embodiment of this application;

FIG. 5 is a schematic diagram of an example of the model training method according to an embodiment of this application;

FIG. 6 is a schematic diagram of an example of the point-source Maxwell equations according to an embodiment of this application;

FIG. 7 is a schematic diagram of an embodiment of the incremental learning method according to an embodiment of this application;

FIG. 8 is a schematic diagram of another embodiment of an incremental learning method according to an embodiment of this application;

FIG. 9 is a comparison chart of experimental results according to an embodiment of this application;

FIG. 10 is a schematic diagram of an embodiment of electromagnetic simulation according to an embodiment of this application;

FIG. 11 is a schematic diagram of an embodiment of the model training apparatus according to an embodiment of this application;

FIG. 12 is a schematic diagram of an embodiment of the incremental learning apparatus according to an embodiment of this application;

FIG. 13 is a schematic structural diagram of a computer device according to an embodiment of this application.
The embodiments of this application are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of this application. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.

The terms "first", "second", and the like in the specification, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described herein. Moreover, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such process, method, product, or device.

The embodiments of this application provide a model training method based on physics-informed neural networks (PINNs) for improving the accuracy of model training and hence the accuracy of electromagnetic simulation. This application also provides corresponding apparatuses, computer devices, computer-readable storage media, and computer program products. These are described in detail below.
An antenna can be optimized through electromagnetic simulation. A neural network model can be pre-trained using artificial intelligence (AI) technology and then used to perform the electromagnetic simulation, determining simulation results such as the electromagnetic field distribution and performance indicators of the antenna to be optimized, which in turn guide antenna optimization.

Because electromagnetic field distributions have strong physical characteristics, neural network models for electromagnetic simulation are mostly PINN models. A PINN adds physical equations as constraints to the neural network so that the training results satisfy physical laws; this constraint is imposed by adding the residuals of the physical equations before and after each iteration to the network's loss function, so that the physical equations "participate" in training. During training iterations the network thus optimizes not only its own loss function but also the per-iteration residual of the physical equations, so that the final result satisfies physical laws.

To better use PINN models for electromagnetic simulation, the embodiments of this application provide the following: first, a PINN model with a new structure; second, training this PINN model on an antenna's simulation domain to obtain a target PINN model; third, incremental learning with the target PINN model to obtain the latent vector of a new electromagnetic simulation scenario; fourth, electromagnetic simulation with the target PINN model to obtain the electromagnetic field data at each point of the antenna's simulation domain. The model training, incremental learning, and electromagnetic simulation processes can all be performed on a computer device, which may be a server, a terminal device, or a virtual machine (VM).

A terminal device (also called user equipment (UE)) is a device with wireless transceiver functions that can be deployed on land (indoor or outdoor, handheld or vehicle-mounted), on water (such as a ship), or in the air (such as on an aircraft, balloon, or satellite). The terminal may be a mobile phone, a tablet (pad), a computer with wireless transceiver functions, a virtual reality (VR) terminal, an augmented reality (AR) terminal, or a wireless terminal in industrial control, self-driving, remote medical, smart grid, transportation safety, smart city, smart home, and the like.

A VM may be a virtualized device partitioned from the hardware resources of a physical machine through virtualization.

The contents involved in the embodiments of this application are introduced in turn below.
1. PINN model with a new structure.

The PINN model with a new structure provided in the embodiments can be understood with reference to FIG. 1. As shown in FIG. 1, it may include a first neural network and a partial differential equation (PDE). The first neural network includes at least two residual network channels; in FIG. 1 the first neural network includes n residual network channels: residual network channel 1, residual network channel 2, ..., residual network channel n. The partial differential equation may be the point-source Maxwell equations.

Each residual network channel has a corresponding coefficient, and the coefficients of the n channels may increase exponentially. For example, with n = 4 there are four channels whose coefficients may be 1, 2, 4, and 8; with n = 5 there are five channels whose coefficients may be 1, 2, 4, 8, and 16.

Each residual network channel may include a residual network and a sinusoidal periodic activation function, which can be written as x → φ_i(x) = x + sin(W_i x + b_i), where x denotes the residual and sin(W_i x + b_i) denotes the sinusoidal periodic activation function.

In the embodiments of this application, the residual network can optimize the first neural network and improve its performance, and the sinusoidal periodic activation function converts the data in each residual network channel into electric field parameters and magnetic field parameters as the channel's output data. Combining the residual network with the sinusoidal periodic activation function effectively improves the accuracy of the model.
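A minimal numerical sketch of one residual block with the sinusoidal periodic activation x → x + sin(Wx + b); the layer width and weight initialization are assumptions for illustration, not taken from the application:

```python
import numpy as np

class SineResidualBlock:
    """One residual block: phi_i(x) = x + sin(W_i x + b_i)."""
    def __init__(self, dim, rng):
        # Scaled-normal initialization is an illustrative assumption.
        self.w = rng.normal(scale=1.0 / np.sqrt(dim), size=(dim, dim))
        self.b = rng.normal(scale=0.1, size=dim)

    def __call__(self, x):
        return x + np.sin(x @ self.w + self.b)   # skip connection + sine activation

rng = np.random.default_rng(0)
block = SineResidualBlock(dim=20, rng=rng)
x = rng.normal(size=(8, 20))                     # batch of 8 inputs
y = block(x)
print(y.shape)                                   # (8, 20)
```

Note that the sine term is bounded, so each block perturbs its input by at most 1 per coordinate, which keeps the stacked channels well behaved.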
2. Training the new-structure PINN model on the antenna's simulation domain to obtain the target PINN model.

The model training process provided in the embodiments can be understood with reference to FIG. 2. As shown in FIG. 2, training samples are input into the PINN model; the first neural network processes them to obtain output data; the partial differential equation processes the output data to obtain a loss function; and the loss function is used to update the parameters in the first neural network. The computer device performs this training process iteratively until a convergence condition is reached, obtaining the target PINN model.

The training samples used to train the PINN model come from the antenna's simulation domain, which refers to the simulated coverage region of the antenna's electromagnetic waves. The antenna may be that of a terminal or of a network device; since antennas differ across terminals and network devices, their simulation domains also differ.

The antenna in the embodiments may be one fed by a pulsed excitation source, so its simulation domain includes a source region, a source-free region, and a boundary. The antenna may have the bowtie structure 100 shown in FIG. 3, consisting of two opposed triangular structures. The region covered by the simulated electromagnetic waves of this antenna can be understood as the simulation domain 101 of the bowtie antenna. As shown in FIG. 3, the source can be added at the middle position between the two triangles; the excitation source can be understood as a point source 102, the near-source region containing the point source 102 is the source region 103, and the region of the simulation domain 101 other than the source region 103 and the boundary of the simulation domain 101 is the source-free region 104.

In other words: the source region is the region of the simulation domain centered on the point source corresponding to the excitation source, with a radius of a first length; the first length is related to a first parameter of a continuous probability density function that approaches the Dirac function; the point-source function is the product of the continuous probability density function and the signal of the excitation source; and the source-free region is the region of the simulation domain other than the source region and the boundary, that is, the interior of the simulation domain excluding the boundary and the source region.
In the embodiments, the simulation domain excluding the boundary can be denoted Ω, the source region Ω_0, and the source-free region Ω_1, so that Ω_0 = {(x_0 + x) ∈ Ω, ||x|| ≤ 3α} and Ω_1 = Ω − Ω_0, where x_0 denotes the center of the point source corresponding to the excitation source, x a displacement within the first length, and α the first parameter of the continuous probability density function. In the embodiments, the value of α can be set as required and is usually 1/100 to 1/200 of the simulation-domain length; the time range and spatial range of the simulation domain can both be determined by the antenna.
In the embodiments, the excitation source is treated as a point source whose function can be written as J(x, t) = η_α(x) g(t). Compared with the existing point-source function J(x, t) = δ(x − x_0) g(t), the Dirac function δ(x − x_0) is replaced by the continuous probability density function η_α(x). Here J(x, t) denotes the point-source function, δ(x − x_0) the Dirac function, g(t) the signal of the excitation source, and x_0 the position of the excitation source. The point-source function represents applying an excitation signal of the form g(t) at position x_0 of the simulation domain.

In the embodiments, the continuous probability density function η_α(x), which approaches the Dirac function, replaces δ(x − x_0); this can be written as δ(x − x_0) ~ η_α(x). η_α(x) denotes an abstracted typical distribution whose concrete form may be Gaussian, Cauchy, or exponential. These distribution forms can be understood with reference to Table 1.

Table 1: (Gaussian, Cauchy, and exponential forms of η_α(x))

In the embodiments, replacing the Dirac function with the continuous probability density function η_α(x) that approaches it overcomes the bottleneck that PINNs cannot handle point-source problems.
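To illustrate the smoothing idea, a Gaussian form of η_α(x) can stand in for the Dirac delta; the concrete density below and the 1-D grid are illustrative assumptions, not the application's exact choice:

```python
import numpy as np

def eta_gauss(x, x0, alpha):
    """Gaussian probability density; approaches the Dirac delta as alpha -> 0."""
    return np.exp(-((x - x0) ** 2) / (2 * alpha ** 2)) / (alpha * np.sqrt(2 * np.pi))

x = np.linspace(-1.0, 1.0, 20001)            # 1-D domain of length 2
alpha = 0.01                                 # ~1/100 of the domain length, as in the text
density = eta_gauss(x, x0=0.0, alpha=alpha)

# Like the delta, the smoothed source should carry unit mass.
mass = float(np.sum(density) * (x[1] - x[0]))
print(round(mass, 3))                        # 1.0
```

Because the density is finite and smooth at x_0, its product with g(t) can be evaluated at ordinary collocation points, which is what makes the point source tractable for a PINN.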
An embodiment of the PINN-based model training method provided in the embodiments of this application can be understood with reference to FIG. 4. As shown in FIG. 4, it may include the following steps:

201. The computer device obtains multiple sampling-point data from the antenna's simulation domain.

The multiple sampling-point data include sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain; the simulation domain includes the source region and the source-free region.

In the embodiments, there are four types of sampling-point data: source-region data, source-free-region data, boundary data, and initial data. The boundary of the simulation domain is usually a reflecting or absorbing boundary, and different boundary types greatly affect the simulation results. The initial data usually refers to the electric and magnetic field data of the simulation domain in its initial state (usually t = 0 in the time dimension), which are usually zero. Sampling-point data is usually four-dimensional, comprising the point's three-dimensional spatial coordinates and one-dimensional time information, written as U = (x, y, z, t). By type, the source-region sampling-point data can be denoted U_SRC, the source-free-region data U_NO_SRC, the boundary data U_BC, and the initial data U_IC.
202. The computer device inputs, into each residual network channel of the first neural network, the product of each of the multiple training samples and that channel's coefficient.

Each training sample includes one sampling-point data and the latent vector corresponding to the simulation domain.

A training sample is sample data used to train the PINN model; it includes not only sampling-point data but also the latent vector Z corresponding to the simulation domain, and can be written as X = (Z, U). By type, the sample containing U_SRC can be written X_SRC = (Z, U_SRC), the sample containing U_NO_SRC as X_NO_SRC = (Z, U_NO_SRC), the sample containing U_BC as X_BC = (Z, U_BC), and the sample containing U_IC as X_IC = (Z, U_IC).
The latent vector Z characterizes the parameter settings of different electromagnetic simulation scenarios. In the embodiments, Z is a low-dimensional vector; commonly chosen dimensions are 16, 32, 64, 128, and so on.

Each residual network channel has a different coefficient. As shown in FIG. 5, the first neural network has n residual network channels, from channel 1 to channel n, where channel 1 has coefficient a_1, channel 2 has coefficient a_2, ..., and channel n has coefficient a_n; these coefficients can also be written as the set {a_1, a_2, ..., a_n}. Thus, for a training sample X, the inputs to the channels can be written {a_1 X, a_2 X, ..., a_n X}, where X may be any of X_SRC, X_NO_SRC, X_BC, and X_IC.
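A minimal sketch of this per-channel input scaling, together with the later summation of channel outputs Y = Y_1 + ... + Y_n; the identity "channels" are placeholders for real residual networks, and the coefficients follow the 1, 2, 4, 8 example from the text:

```python
import numpy as np

coeffs = [2 ** i for i in range(4)]          # exponentially increasing: 1, 2, 4, 8

def run_channels(x, channels, coeffs):
    """Feed a_i * X to channel i, then sum the channel outputs."""
    outputs = [ch(a * x) for ch, a in zip(channels, coeffs)]
    return sum(outputs)

# Placeholder identity channels just to show the data flow.
channels = [lambda v: v for _ in coeffs]
x = np.ones(3)
y = run_channels(x, channels, coeffs)
print(y)                                     # [15. 15. 15.], since 1+2+4+8 = 15
```

With real residual networks in place of the identities, each channel sees the same sample at a different scale, which is what lets the model capture signals of different frequencies.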
If the training samples come from multiple electromagnetic simulation scenarios, that is, from the simulation domains of multiple different antennas, then each distinct simulation domain has a corresponding latent vector; with N different simulation domains, the N latent vectors can be written {Z_1, ..., Z_N}. With N simulation domains, the training samples from the i-th domain can be written {X_i,SRC = (Z_i, U_i,SRC), X_i,NO_SRC = (Z_i, U_i,NO_SRC), X_i,IC = (Z_i, U_i,IC), X_i,BC = (Z_i, U_i,BC)}.
203. The computer device processes the data input into each residual network channel through the first neural network to obtain an output data set.

The output data set includes source output data, source-free output data, boundary output data, and initial output data.

Optionally, in the embodiments, the source output data is the sum of every channel's output data when one of the training samples contains source-region sampling-point data; the source-free output data is the corresponding sum for a sample containing source-free-region data; the boundary output data for a sample containing boundary data; and the initial output data for a sample containing initial data.

The embodiments are not limited to directly summing the channel outputs; each channel's output may also be multiplied by coefficients and then summed.

The output data set can be written {Y_SRC, Y_NO_SRC, Y_BC, Y_IC}. Each Y can be obtained by multiplying each channel's coefficient by the corresponding type of X and then summing the channel outputs, written Y = Y_1 + Y_2 + ... + Y_n, where Y_1 denotes the output data of residual network channel 1 and Y_n that of residual network channel n.
204. The computer device processes the output data set through the partial differential equation to obtain the total loss function.

The total loss function is obtained from the source loss function, the source-free loss function, the boundary loss function, and the initial loss function.

The source loss function is the loss obtained from the source output data, the source-free loss from the source-free output data, the boundary loss from the boundary output data, and the initial loss from the initial output data; they can be denoted L_SRC, L_NO_SRC, L_BC, and L_IC, respectively.

Optionally, obtaining the total loss function may proceed as follows: each time, take one output datum from the output data set as a known quantity of the partial differential equation and evaluate the equation to obtain the loss function corresponding to that output datum; then accumulate the loss function of each output datum according to a preset relationship to obtain the total loss function. The preset relationship includes learnable parameters, and different loss functions correspond to different learnable parameters.

The partial differential equation may be the point-source Maxwell equations. The output data Y is usually six-dimensional, comprising three-dimensional electric field data and three-dimensional magnetic field data. As shown in FIG. 6, the electric and magnetic field data in Y are substituted as known quantities into the point-source Maxwell equations of FIG. 6 and the corresponding loss function is computed. In FIG. 6, E denotes the electric field, H the magnetic field, and the subscripts x, y, z the three spatial dimensions.
The total loss function may be accumulated according to a preset relationship that includes learnable parameters and a hyperparameter. Different loss functions related to the total loss correspond to different learnable parameters; the learnable parameters are updated along with the parameters of the first neural network, and the hyperparameter assists the learnable parameters in weighting the corresponding loss functions.

This preset relationship can be expressed as a weighted accumulation of the loss terms, where L_total denotes the total loss function, L_i (i = 1, 2, 3, 4) denotes the four types of loss functions, ε is a hyperparameter whose value may be, for example, 0.01 (this is only an example; this application does not limit the specific value of the hyperparameter), and λ_i are learnable parameters.

In the embodiments, dynamic weighting of the loss functions via the hyperparameter and the learnable parameters balances the weights of the loss terms and can accelerate convergence during neural network training.
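The exact preset relationship is not reproduced above, so the code below is only one plausible realization of dynamic loss weighting with learnable λ_i and hyperparameter ε; squaring λ_i to keep the weights positive is an assumption, not the application's formula:

```python
import numpy as np

def total_loss(losses, lam, eps=0.01):
    """One plausible dynamic weighting of the four loss terms:
    L_total = sum_i (lam_i**2 + eps) * L_i, with lam learnable."""
    lam = np.asarray(lam, dtype=float)
    return float(np.sum((lam ** 2 + eps) * np.asarray(losses, dtype=float)))

losses = [0.5, 0.2, 0.1, 0.3]      # L_SRC, L_NO_SRC, L_BC, L_IC (made-up values)
lam = np.ones(4)                   # learnable, updated together with theta and Z
print(total_loss(losses, lam))
```

Here ε keeps every loss term contributing even when its λ_i is driven toward zero, matching the stated role of the hyperparameter as an assistant to the learnable weights.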
205. The computer device updates the parameters in the first neural network according to the total loss function to obtain the second neural network.

In this application, the first neural network denotes the network before one iteration and the second neural network the network after one iteration.

In addition, when the parameters θ in the first neural network are updated, the latent vector Z of the simulation domain and the learnable parameters λ in the preset relationship may also be updated; that is, θ, Z, and λ can all be updated according to L_total.

In the embodiments, θ, Z, and λ may be updated by gradient descent: they are adjusted from their values in the current iteration to obtain new θ, Z, and λ, and the next iteration begins.
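A sketch of one gradient-descent step jointly updating θ, Z, and λ; the dictionary layout, shapes, and learning rate are illustrative assumptions:

```python
import numpy as np

def gd_step(params, grads, lr=0.1):
    """One joint gradient-descent update of theta, Z and lambda."""
    return {name: value - lr * grads[name] for name, value in params.items()}

params = {"theta": np.ones(4), "Z": np.zeros(2), "lam": np.ones(4)}
grads = {"theta": np.full(4, 2.0), "Z": np.full(2, -1.0), "lam": np.zeros(4)}
new = gd_step(params, grads)
print(new["theta"])                # [0.8 0.8 0.8 0.8]
```

In practice the gradients would come from differentiating L_total, so that the network weights, the latent vector, and the loss weights all move together each iteration.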
The second neural network is then used as the first neural network, and the above training process is performed iteratively until the second neural network reaches the convergence condition, yielding the target physics-informed neural network model.

In the embodiments, the target PINN model is defined relative to the initial PINN model before training. The parameters of the initial PINN's first neural network are usually large; during training they are continually updated with training samples until the convergence condition is reached and the second neural network is obtained, at which point its parameters can be regarded as fixed and the whole model is called the target PINN model.

In the embodiments, because the first neural network of the PINN includes multiple residual network channels with different coefficients, the same training sample can be multiplied by different coefficients during training, expanding one data point into multiple, and the multiple channels can capture signals of different frequencies, improving the accuracy of model training.
3. Incremental learning with the target PINN model.

As shown in FIG. 7, an embodiment of the incremental learning provided in the embodiments includes:

301. The computer device obtains multiple sampling-point data from the simulation domain of the antenna to be optimized.

The multiple sampling-point data include source-region sampling-point data, source-free-region sampling-point data, boundary data of the simulation domain, and initial data of the simulation domain; the simulation domain includes the source region and the source-free region.

The sampling-point data here can be understood with reference to step 201, except that it comes from the simulation domain of the antenna to be optimized, that is, from the simulation domain of a new electromagnetic simulation scenario.

302. The computer device inputs multiple sample data into the target physics-informed neural network, where each sample datum includes one sampling-point data and the first latent vector of the simulation domain.

The target physics-informed neural network is the one obtained by the PINN-based model training method.

303. The computer device obtains, through the target physics-informed neural network, the output data corresponding to each sample datum.

304. The computer device keeps the parameters in the target physics-informed neural network unchanged and adjusts the first latent vector of the simulation domain according to the output data to obtain a second latent vector.

In the embodiments, the latent vector may be adjusted by gradient descent.

The second latent vector is used as the first latent vector, and the above adjustment is performed iteratively with different sample data until the output data meet the preset requirements of the antenna to be optimized, yielding a second latent vector matching the simulation domain.

In the embodiments, the first latent vector can be understood as the latent vector before an iteration and the second latent vector as the one after.

In the embodiments, during incremental learning the parameters in the target physics-informed neural network are frozen, and the latent vector of the simulation domain of the antenna to be optimized is adjusted repeatedly according to the network's output data until a latent vector matching the simulation domain is obtained. This learns the latent vector quickly and speeds up obtaining latent vectors for new electromagnetic simulation scenarios.
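A toy sketch of this freeze-the-network, fit-the-latent loop; the quadratic surrogate loss stands in for the real mismatch between PINN outputs and the preset requirements, and is purely an assumption for illustration:

```python
import numpy as np

def fit_latent(loss_grad, z0, lr=0.1, steps=100):
    """Keep the network parameters frozen; update only the latent vector Z."""
    z = np.asarray(z0, dtype=float)
    for _ in range(steps):
        z = z - lr * loss_grad(z)        # gradient step on Z only
    return z

z_target = np.array([0.3, -0.7])         # hypothetical best-matching latent
grad = lambda z: 2.0 * (z - z_target)    # gradient of ||z - z_target||^2
z = fit_latent(grad, z0=np.zeros(2))
print(np.round(z, 3))                    # converges toward z_target
```

Because only the low-dimensional Z is optimized while θ stays fixed, the search space is tiny compared with retraining the network, which is why the latent vector for a new scenario can be recovered quickly.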
The incremental learning process can be understood with the example of FIG. 8. As shown in FIG. 8, for a new electromagnetic simulation scenario, the already trained target PINN model can be used, keeping its parameters θ unchanged in every iteration. The inputs {X_new,SRC = (Z_new, U_new,SRC), X_new,NO_SRC = (Z_new, U_new,NO_SRC), X_new,IC = (Z_new, U_new,IC), X_new,BC = (Z_new, U_new,BC)} are fed into the target PINN model, and the Z in the input data X is adjusted according to the model's output data Y until Y meets the preset requirements, yielding the latent vector Z matching the new electromagnetic simulation scenario.

Developers ran experiments on the incremental learning scheme. FIG. 9 compares the time needed to obtain the latent vector Z of a new electromagnetic simulation scenario using the incremental learning scheme of this application versus the original method. As FIG. 9 shows, at a 5% error level the scheme of this application needs only 200 seconds to obtain Z for the new scenario, while the original method needs 3337 seconds; the scheme of this application is much faster.
4. Electromagnetic simulation with the target PINN model to obtain the electromagnetic field data at each point of the antenna's simulation domain.

After the target PINN model is trained through the above process, it can be stored as a model file. When a computer device used for electromagnetic simulation (such as a terminal device, a server, or a VM) needs the target PINN model, it may actively load the model file, or the device storing the model file may actively send it to the simulation device for installation.

As shown in FIG. 10, once the target PINN model is installed on the computer device, it can be used for electromagnetic simulation. The simulation result may be the electromagnetic field distribution diagram shown in FIG. 10, or performance indicators of the simulated antenna, such as the electromagnetic field data at each point of the antenna's simulation domain. In the embodiments, the electromagnetic field data include electric field data and magnetic field data, such as electric and magnetic field strengths. The antenna design can then be optimized based on the simulation results.

The electromagnetic simulation scheme provided in the embodiments performs the simulation with the multi-residual-channel target PINN model, greatly improving simulation accuracy.
Having described the PINN-based model training method and the incremental learning method, the model training apparatus 40 based on a physics-informed neural network provided in the embodiments is introduced below with reference to FIG. 11. The model training apparatus 40 includes:

an obtaining unit 401, configured to obtain multiple sampling-point data from the antenna's simulation domain, where the multiple sampling-point data include source-region sampling-point data, source-free-region sampling-point data, boundary data of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the source region and the source-free region. The function of the obtaining unit 401 can be understood with reference to step 201 of the method embodiment.

a first processing unit 402, configured to input, into each residual network channel of the first neural network, the product of each of multiple training samples and that channel's coefficient, where each training sample includes one sampling-point data obtained by the obtaining unit 401 and the latent vector corresponding to the simulation domain, and each channel's coefficient differs. The function of the first processing unit 402 can be understood with reference to step 202 of the method embodiment.

a second processing unit 403, configured to process, through the first neural network, the data that the first processing unit 402 input into each residual network channel, to obtain an output data set including source output data, source-free output data, boundary output data, and initial output data. The function of the second processing unit 403 can be understood with reference to step 203 of the method embodiment.

a third processing unit 404, configured to process the output data set through the partial differential equation to obtain a total loss function derived from the source, source-free, boundary, and initial loss functions. The function of the third processing unit 404 can be understood with reference to step 204 of the method embodiment.

a fourth processing unit 405, configured to update the parameters in the first neural network according to the total loss function to obtain the second neural network. The function of the fourth processing unit 405 can be understood with reference to step 205 of the method embodiment.

The second neural network is used as the first neural network, and the training process is performed iteratively until the second neural network reaches the convergence condition, yielding the target physics-informed neural network model.

In the embodiments, because the first neural network of the PINN includes multiple residual network channels with different coefficients, the same training sample can be multiplied by different coefficients during training, expanding one data point into multiple, and the multiple channels can capture signals of different frequencies, improving the accuracy of model training.
Optionally, the source region is the region of the simulation domain centered on the point source corresponding to the excitation source with a radius of a first length; the first length is related to a first parameter of a continuous probability density function that approaches the Dirac function; the point-source function is the product of the continuous probability density function and the excitation-source signal; and the source-free region is the region of the simulation domain other than the source region and the boundary.

Optionally, the source output data is the sum of every channel's output when a training sample contains source-region sampling-point data; the source-free output data is the corresponding sum for source-free-region data; the boundary output data for boundary data; and the initial output data for initial data.

Optionally, each residual network channel includes a sinusoidal periodic activation function, which converts the data in each channel into electric field parameters and magnetic field parameters as the channel's output data.

Optionally, the coefficients corresponding to the residual network channels increase exponentially.

Optionally, the third processing unit 404 is configured to each time take one output datum from the output data set as a known quantity of the partial differential equation and evaluate the equation to obtain the loss function corresponding to that output datum, and to accumulate the loss function of each output datum according to a preset relationship to obtain the total loss function.

The preset relationship includes learnable parameters and a hyperparameter; different loss functions related to the total loss correspond to different learnable parameters; the learnable parameters are updated along with the parameters of the first neural network; and the hyperparameter assists the learnable parameters in weighting the corresponding loss functions.

Optionally, the fourth processing unit 405 is further configured to update the latent vector of the simulation domain and the learnable parameters in the preset relationship.

Optionally, the preset relationship includes learnable parameters, and different loss functions correspond to different learnable parameters.

The model training apparatus 40 described above can be understood with reference to the corresponding descriptions in the foregoing method embodiments, which are not repeated here.
As shown in FIG. 12, an embodiment of the incremental learning apparatus 50 provided in the embodiments includes:

an obtaining unit 501, configured to obtain multiple sampling-point data from the simulation domain of the antenna to be optimized, where the multiple sampling-point data include source-region sampling-point data, source-free-region sampling-point data, boundary data of the simulation domain, and initial data of the simulation domain, and the simulation domain includes the source region and the source-free region. The obtaining unit 501 may perform step 301 of the method embodiment.

a first processing unit 502, configured to input multiple sample data into the target physics-informed neural network, where each sample datum includes one sampling-point data and the first latent vector of the simulation domain, and the target physics-informed neural network is the one obtained by the PINN-based model training method. The first processing unit 502 may perform step 302 of the method embodiment.

a second processing unit 503, configured to obtain, through the target physics-informed neural network, the output data corresponding to each sample datum. The second processing unit 503 may perform step 303 of the method embodiment.

a third processing unit 504, configured to keep the parameters in the target physics-informed neural network unchanged and adjust the first latent vector of the simulation domain according to the output data to obtain the second latent vector. The third processing unit 504 may perform step 304 of the method embodiment.

The second latent vector is used as the first latent vector, and the adjustment is performed iteratively with different sample data until the output data meet the preset requirements of the antenna to be optimized, yielding a second latent vector matching the simulation domain.

In the embodiments, during incremental learning the parameters in the target physics-informed neural network are frozen, and the latent vector of the simulation domain of the antenna to be optimized is adjusted repeatedly according to the network's output data until a latent vector matching the simulation domain is obtained. This learns the latent vector quickly and speeds up obtaining latent vectors for new electromagnetic simulation scenarios.

An embodiment of this application provides an electromagnetic simulation apparatus on which the above target physics-informed neural network model is installed; the apparatus can simulate an antenna through the model to obtain the electromagnetic field distribution of the antenna's simulation domain.
FIG. 13 is a schematic diagram of a possible logical structure of a computer device 60 provided in an embodiment of this application. The computer device 60 may be the model training apparatus based on a physics-informed neural network, the incremental learning apparatus, or the electromagnetic simulation apparatus. The computer device 60 includes a processor 601, a communication interface 602, a memory 603, and a bus 604; the processor 601, the communication interface 602, and the memory 603 are interconnected through the bus 604. In the embodiments, the processor 601 controls and manages the actions of the computer device 60, for example, performing the processes of the method embodiments of FIG. 1 to FIG. 9; the communication interface 602 supports communication of the computer device 60; and the memory 603 stores the program code and data of the computer device 60.

The processor 601 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various example logical blocks, modules, and circuits described in this disclosure. The processor 601 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or of a digital signal processor and a microprocessor. The bus 604 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 13, but this does not mean there is only one bus or one type of bus.
In another embodiment of this application, a computer-readable storage medium is further provided, storing computer-executable instructions; when a processor of a device executes these instructions, the device performs the model training method based on the physics-informed neural network, the incremental learning method, or the electromagnetic simulation method described above.

In another embodiment of this application, a computer program product is further provided, including computer-executable instructions stored in a computer-readable storage medium; when a processor of a device executes these instructions, the device performs the model training method based on the physics-informed neural network, the incremental learning method, or the electromagnetic simulation method described above.

In another embodiment of this application, a chip system is further provided, including a processor configured to implement the model training method based on the physics-informed neural network, the incremental learning method, or the electromagnetic simulation method described above. In one possible design, the chip system may further include a memory for storing the program instructions and data necessary for inter-process communication of the apparatus. The chip system may consist of chips, or may include chips and other discrete devices.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the embodiments of this application.

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are only illustrative; the division into units is only a logical functional division, and in actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Moreover, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect coupling or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or take other forms.

A unit described as a separate component may or may not be physically separate, and a component shown as a unit may or may not be a physical unit; it may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of this application, or the part that contributes to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of this application. The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The above are only specific implementations of the embodiments of this application, but the protection scope of the embodiments of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in the embodiments of this application shall be covered by the protection scope of the embodiments of this application. Therefore, the protection scope of the embodiments of this application shall be subject to the protection scope of the claims.
Claims (17)
- A model training method based on physics-informed neural networks (PINNs), wherein the physics-informed neural network includes a first neural network and a partial differential equation, and the first neural network includes at least two residual network channels, the method comprising: obtaining multiple sampling-point data from the simulation domain of an antenna, the multiple sampling-point data including sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain, the simulation domain including the source region and the source-free region; inputting, into each residual network channel of the first neural network, the product of each of multiple training samples and the coefficient corresponding to that residual network channel, each training sample including one sampling-point data and a latent vector corresponding to the simulation domain, and each residual network channel having a different coefficient; processing the data input into each residual network channel through the first neural network to obtain an output data set, the output data set including source output data, source-free output data, boundary output data, and initial output data; processing the output data set through the partial differential equation to obtain a total loss function related to a source loss function, a source-free loss function, a boundary loss function, and an initial loss function; updating the parameters in the first neural network according to the total loss function to obtain a second neural network; and using the second neural network as the first neural network and iteratively performing the above training process until the second neural network reaches a convergence condition, to obtain a target physics-informed neural network model for electromagnetic simulation of the antenna.
- The model training method according to claim 1, wherein the source region is the region of the simulation domain centered on the point source corresponding to the excitation source, with a radius of a first length, the first length being related to a first parameter of a continuous probability density function that approaches the Dirac function; and the source-free region is the region of the simulation domain other than the source region and the boundary of the simulation domain.
- The model training method according to claim 1 or 2, wherein the source output data is the sum of the output data of every residual network channel when one of the multiple training samples contains sampling-point data of the source region; the source-free output data is the sum of the output data of every residual network channel when one of the multiple training samples contains sampling-point data of the source-free region; the boundary output data is the sum of the output data of every residual network channel when one of the multiple training samples contains data of the boundary of the simulation domain; and the initial output data is the sum of the output data of every residual network channel when one of the multiple training samples contains the initial data.
- The model training method according to any one of claims 1 to 3, wherein each residual network channel includes a sinusoidal periodic activation function, the sinusoidal periodic activation function being configured to convert the data in each residual network channel into electric field parameters and magnetic field parameters as the output data of each residual network channel.
- The model training method according to any one of claims 1 to 4, wherein the coefficients corresponding to the residual network channels increase exponentially.
- The model training method according to any one of claims 1 to 5, wherein processing the output data set through the partial differential equation to obtain the total loss function comprises: each time taking one output datum from the output data set as a known quantity of the partial differential equation and evaluating the partial differential equation to obtain the loss function corresponding to that output datum; and accumulating the loss functions corresponding to each output datum in the output data set according to a preset relationship to obtain the total loss function.
- The model training method according to claim 6, wherein the preset relationship includes learnable parameters and a hyperparameter, different loss functions related to the total loss function correspond to different learnable parameters, the learnable parameters are updated along with the parameters in the first neural network, and the hyperparameter assists the learnable parameters in weighting the corresponding loss functions.
- The model training method according to claim 7, wherein when the parameters in the first neural network are updated according to the total loss function, the method further comprises: updating the latent vector of the simulation domain and the learnable parameters in the preset relationship.
- An incremental learning method, comprising: obtaining multiple sampling-point data from the simulation domain of an antenna to be optimized, the multiple sampling-point data including sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain, the simulation domain including the source region and the source-free region; inputting multiple sample data into a target physics-informed neural network, each sample datum including one sampling-point data and a first latent vector of the simulation domain, the target physics-informed neural network being the one obtained by the model training method of any one of claims 1 to 8; obtaining, through the target physics-informed neural network, the output data corresponding to each sample datum; keeping the parameters in the target physics-informed neural network unchanged and adjusting the first latent vector of the simulation domain according to the output data to obtain a second latent vector; and using the second latent vector as the first latent vector and iteratively performing the above adjustment with different sample data until the output data meet the preset requirements of the antenna to be optimized, to obtain a second latent vector matching the simulation domain.
- A model training apparatus based on a physics-informed neural network, wherein the physics-informed neural network includes a first neural network and a partial differential equation, and the first neural network includes at least two residual network channels, the model training apparatus comprising: an obtaining unit, configured to obtain multiple sampling-point data from the simulation domain of an antenna, the multiple sampling-point data including sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain, the simulation domain including the source region and the source-free region; a first processing unit, configured to input, into each residual network channel of the first neural network, the product of each of multiple training samples and the coefficient corresponding to that residual network channel, each training sample including one sampling-point data and a latent vector corresponding to the simulation domain, and each residual network channel having a different coefficient; a second processing unit, configured to process the data input into each residual network channel through the first neural network to obtain an output data set, the output data set including source output data, source-free output data, boundary output data, and initial output data; a third processing unit, configured to process the output data set through the partial differential equation to obtain a total loss function related to a source loss function, a source-free loss function, a boundary loss function, and an initial loss function; and a fourth processing unit, configured to update the parameters in the first neural network according to the total loss function to obtain a second neural network; wherein the second neural network is used as the first neural network and the above training process is performed iteratively until the second neural network reaches a convergence condition, to obtain a target physics-informed neural network model for electromagnetic simulation of the antenna.
- The model training apparatus according to claim 10, wherein the third processing unit is configured to each time take one output datum from the output data set as a known quantity of the partial differential equation and evaluate the partial differential equation to obtain the loss function corresponding to that output datum, and to accumulate the loss functions corresponding to each output datum in the output data set according to a preset relationship to obtain the total loss function.
- The model training apparatus according to claim 10, wherein the fourth processing unit is further configured to update the latent vector of the simulation domain and the learnable parameters in the preset relationship.
- An incremental learning apparatus, comprising: an obtaining unit, configured to obtain multiple sampling-point data from the simulation domain of an antenna to be optimized, the multiple sampling-point data including sampling-point data of the source region, sampling-point data of the source-free region, data of the boundary of the simulation domain, and initial data of the simulation domain, the simulation domain including the source region and the source-free region; a first processing unit, configured to input multiple sample data into a target physics-informed neural network, each sample datum including one sampling-point data and a first latent vector of the simulation domain, the target physics-informed neural network being the one obtained by the model training method of any one of claims 1 to 8; a second processing unit, configured to obtain, through the target physics-informed neural network, the output data corresponding to each sample datum; and a third processing unit, configured to keep the parameters in the target physics-informed neural network unchanged and adjust the first latent vector of the simulation domain according to the output data to obtain a second latent vector; wherein the second latent vector is used as the first latent vector and the above adjustment is performed iteratively with different sample data until the output data meet the preset requirements of the antenna to be optimized, to obtain a second latent vector matching the simulation domain.
- A computing device, comprising one or more processors and a computer-readable storage medium storing a computer program; wherein when the computer program is executed by the one or more processors, the method according to any one of claims 1 to 8 or the method according to claim 9 is implemented.
- A chip system, comprising one or more processors, wherein the one or more processors are invoked to perform the method according to any one of claims 1 to 8 or the method according to claim 9.
- A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by one or more processors, the method according to any one of claims 1 to 8 or the method according to claim 9 is implemented.
- A computer program product, comprising a computer program which, when executed by one or more processors, implements the method according to any one of claims 1 to 8 or the method according to claim 9.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111069844.8 | 2021-09-13 | ||
CN202111069844.8A CN115809695A (zh) | 2021-09-13 | 2021-09-13 | Model training method based on physics-informed neural network and related apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023036164A1 true WO2023036164A1 (zh) | 2023-03-16 |
Family
ID=85481142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/117447 WO2023036164A1 (zh) | 2021-09-13 | 2022-09-07 | 一种基于物理信息神经网络的模型训练方法及相关装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115809695A (zh) |
WO (1) | WO2023036164A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117540489A (zh) * | 2023-11-13 | 2024-02-09 | 重庆大学 | 一种基于多任务学习的翼型气动数据计算方法及系统 |
CN117574618A (zh) * | 2023-11-08 | 2024-02-20 | 中国人民解放军陆军工程大学 | 用于电流仿真训练的实时检测方法 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110401964A (zh) * | 2019-08-06 | 2019-11-01 | 北京邮电大学 | 一种面向用户为中心网络基于深度学习的功率控制方法 |
CN112468203A (zh) * | 2020-11-19 | 2021-03-09 | 杭州勒贝格智能系统股份有限公司 | 深度迭代神经网络用低秩csi反馈方法、存储介质及设备 |
CN112488924A (zh) * | 2020-12-21 | 2021-03-12 | 深圳大学 | 一种图像超分辨率模型训练方法、重建方法及装置 |
CN112925012A (zh) * | 2021-01-26 | 2021-06-08 | 中国矿业大学(北京) | 地震全波形反演方法及装置 |
US20210237767A1 (en) * | 2020-02-03 | 2021-08-05 | Robert Bosch Gmbh | Training a generator neural network using a discriminator with localized distinguishing information |
- 2021-09-13: Chinese application CN202111069844.8A filed (publication CN115809695A), status: active, pending
- 2022-09-07: PCT application PCT/CN2022/117447 filed (publication WO2023036164A1), status: active, application filing
Also Published As
Publication number | Publication date |
---|---|
CN115809695A (zh) | 2023-03-17 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22866627; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 22866627; Country of ref document: EP; Kind code of ref document: A1