CN117313560B - Multi-objective optimization method for IGBT module packaging based on machine learning - Google Patents

Multi-objective optimization method for IGBT module packaging based on machine learning

Info

Publication number: CN117313560B
Application number: CN202311617795.6A
Authority: CN (China)
Prior art keywords: network, neural network, state, output, value
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN117313560A
Inventors: 王佳宁, 孙菲双, 王睿源
Current Assignee: Hefei University of Technology
Original Assignee: Hefei University of Technology
Priority date / filing date: 2023-11-30; application filed by Hefei University of Technology
Publication of CN117313560A; application granted; publication of CN117313560B

Classifications

    • G06F30/27: Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G06F30/23: Design optimisation, verification or simulation using finite element methods [FEM] or finite difference methods [FDM]
    • G06N20/00: Machine learning
    • G06N3/045: Neural network architectures; combinations of networks
    • G06N3/084: Learning methods; backpropagation, e.g. using gradient descent
    • G06F2111/06: Details relating to CAD techniques; multi-objective optimisation, e.g. Pareto optimisation using simulated annealing [SA], ant colony algorithms or genetic algorithms [GA]
    • G06F2119/14: Force analysis or force optimisation, e.g. static or dynamic forces


Abstract

The invention provides a machine-learning-based multi-objective optimization method for IGBT module packaging, belonging to the technical field of power electronics. The method comprises the steps of: establishing a three-objective optimization model with a neural network; determining a state set, an action set and a reward function; performing offline learning with the machine-learning DDPG algorithm to obtain an optimal policy; and substituting the optimal policy into the three-objective optimization model for application. The optimization method enables the IGBT module to achieve optimization of the solder layer stress, stray inductance and chip junction temperature under any state and any weight coefficients. The invention can handle the high-dimensional design variables of complex IGBT modules, avoids the limitation that traditional manual design cannot optimize them, and, by adopting the DDPG algorithm, avoids the time-consuming optimization-solving process that a traditional genetic algorithm must repeat whenever the design requirements change, thereby greatly saving computing resources and improving the design efficiency of the IGBT module.

Description

Multi-objective optimization method for IGBT module packaging based on machine learning
Technical Field
The invention belongs to the technical field of power electronics, and particularly relates to a multi-objective optimization method for IGBT module packaging.
Background
As one of the main failure-prone elements in a photovoltaic inverter, the IGBT module directly affects the safe operation of the system when it fails. Inside the IGBT module, large stray inductance causes voltage overshoot and oscillation, reducing reliability. Meanwhile, the IGBT module in a photovoltaic inverter is influenced year-round by illumination intensity and ambient temperature, so the junction temperature inside the device fluctuates easily. Because the thermal expansion coefficients of the internal materials are mismatched, junction temperature fluctuations cause extrusion deformation between the device layers, producing thermal stress. Such cyclic thermal stress accelerates the life consumption of the device and ultimately leads to thermal fatigue failure.
Therefore, optimizing the IGBT module package for low stray inductance, low junction temperature and low stress has great engineering application value and research significance. The main drawbacks of the traditional manual package optimization design method are that the mathematical model of the power module and the optimization process require repeated manual iteration and improvement, so the overall design cycle is long and the time cost is high. Many researchers have proposed different solutions to this:
the simulation design research of the electric-thermal-force multiple physical fields of the multi-chip parallel SiC MOSFET module [ D ] (university of Tianjin's university's treatise, 2022.DOI: 10.27356/D).
cnki.gtjdu.2020.002346), a response surface optimization method is adopted to perform structural optimization based on the thermal performance of the power module. However, this solution has the following drawbacks:
1) Owing to limitations of the finite element software, the optimization targets include only thermal and mechanical performance indexes, and the influence of the parasitic stray inductance is not considered;
2) The current level in this method is a fixed value; when the system design requirement changes, iterative optimization must be performed again, which is time-consuming;
The paper "Automatic layout design for power module" [J] (IEEE Transactions on Power Electronics, 2013, 28(1): 481-487) proposes automatically designing module layouts with a genetic algorithm: layout position information, chip placement information, etc. are encoded as genes, and the layout design is iterated with the genetic algorithm. However, this solution has the following drawbacks:
If the number of parallel chips in the module is large, the excessive population size can cause premature convergence of the genetic algorithm and reduce the accuracy of the layout result;
In addition, the method lacks accurate mathematical models for the parasitic inductance, temperature, etc. of the module, and the influences of factors such as mutual inductance and thermal coupling are often ignored in the design process.
Disclosure of Invention
The technical problems to be solved by the invention are that existing multi-objective optimization methods for IGBT modules cannot provide the optimal solution under varying states and suffer from long computation time. Against these defects, the invention provides a machine-learning-based multi-objective optimization method for IGBT module packaging that uses the deep deterministic policy gradient (DDPG) algorithm of machine learning to rapidly provide the optimal solution under varying states, avoiding the limitation that metaheuristic algorithms must re-run the iterative optimization.
In order to achieve the aim of the invention, the invention provides a multi-objective optimization method for IGBT module packaging based on machine learning, wherein the IGBT module comprises an upper bridge arm chip, a lower bridge arm chip, a DBC substrate, a solder layer and a bonding wire; the DBC substrate comprises an upper copper layer, a ceramic layer and a lower copper layer, wherein the thicknesses of the upper copper layer and the lower copper layer are the same; the method comprises the following steps:
step 1, constructing a three-objective optimization model based on a neural network;
the IGBT module is recorded as a system, and the stress F of a solder layer, the stray inductance L and the junction temperature T of a chip of the system are used j Establishing a three-target optimization model based on a neural network as a target;
the input variables of the neural network are 6, and the neural network is divided into two groups, wherein the first group is a current level I, and the second group is an IGBT module size, and the neural network comprises: lateral distance d of the same bridge arm chip 1 Upper bridge armLateral distance d between chip and lower bridge arm chip 2 Longitudinal distance d between upper bridge arm chip and lower bridge arm chip 3 Copper layer thickness h 1 And ceramic layer thickness h 2
The output variables of the neural network are 3, and the output variables are respectively: solder layer stress F, stray inductance L and chip junction temperature T j
Step 2, determining the state set S, the action set A_0 and the reward function R according to the three-objective optimization model obtained in step 1, and calculating the average reward r̄;
Step 3, according to the state set S, the action set A_0 and the reward function R obtained in step 2, performing offline learning with the DDPG algorithm of machine learning to obtain the optimal policy π(s_y);
The DDPG algorithm comprises 4 neural networks, namely an online policy network, a target policy network, an online evaluation network and a target evaluation network; the neural network parameters of the online policy network are denoted θ^μ, those of the target policy network θ^μ', those of the online evaluation network θ^Q and those of the target evaluation network θ^Q';
The optimal policy π(s_y) is expressed as follows:
a_y = π(s_y)
where s_y is the input state value and a_y is the action value output by the optimal policy π(s_y) for that state; s_y = (I_y, F_y, L_y, T_jy), where I_y is the current level in any state of the state set S, and F_y, L_y and T_jy are the solder layer stress, stray inductance and chip junction temperature corresponding to the initial module size and the current level I_y; a_y = (d_1y, d_2y, d_3y, h_1y, h_2y), where (d_1y, d_2y, d_3y, h_1y, h_2y) is the optimal IGBT module size output by the optimal policy in the current-level-I_y state, corresponding to the lowest stray inductance, the lowest chip junction temperature and the lowest solder layer stress;
Step 4, substituting the optimal policy π(s_y) into the neural-network-based three-objective optimization model established in step 1; under any state in the state set S, the system can maximize the average reward r̄ by adopting the optimal policy π(s_y).
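As an illustration of step 4, the following minimal Python sketch (not part of the patent text; a PyTorch actor network is assumed, and the names apply_optimal_policy, actor and surrogate are hypothetical) shows how a trained policy maps a state s_y to the module size a_y and how the surrogate model then evaluates that size:

```python
import torch

def apply_optimal_policy(actor, surrogate, state):
    """state is s_y = (I, F, L, T_j); returns a_y = (d1, d2, d3, h1, h2)
    and the surrogate-predicted objectives (F, L, T_j) for that size."""
    with torch.no_grad():
        s = torch.as_tensor(state, dtype=torch.float32)
        a = actor(s)                                   # a_y = pi(s_y)
        # the surrogate expects (I, d1, ..., h2): reuse the current level I
        objectives = surrogate(torch.cat([s[:1], a]))  # -> (F, L, T_j)
    return a, objectives
```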
Preferably, the implementation process of step 1 is as follows:
step 1.1, determining input variables and output variables of a neural network;
the input variables of the neural network are 6, namely the current level I of the system and the transverse distance d of the same bridge arm chip 1 Lateral distance d between upper bridge arm chip and lower bridge arm chip 2 Longitudinal distance d between upper bridge arm chip and lower bridge arm chip 3 Copper layer thickness h 1 And ceramic layer thickness h 2 The method comprises the steps of carrying out a first treatment on the surface of the The output variables of the neural network are 3, namely the stress F of the solder layer, the stray inductance L and the junction temperature T of the chip j
Step 1.2, acquiring a sample data set required for constructing a neural network by using simulation software;
Sample data required for constructing the neural network are acquired with simulation software and a sample data set is established; the sample data set comprises E groups of sample data, each group comprising 6 neural network input data and the 3 corresponding neural network simulated output values, denoted the input T_F and the simulated output Γ_F respectively: T_F = (I_F, d_1F, d_2F, d_3F, h_1F, h_2F) and Γ_F = (F̂_F, L̂_F, T̂_jF), where F̂_F is the simulated solder layer stress output value, L̂_F is the simulated stray inductance output value and T̂_jF is the simulated chip junction temperature output value, with F = 1, 2, …, E;
The sample data set is divided into a training subset and a verification subset; the training subset comprises E1 groups of sample data and the verification subset comprises E2 groups of sample data, with E1 + E2 = E;
step 1.3, constructing a neural network A;
the method comprises the steps of constructing a neural network A, wherein the neural network 1 consists of an input layer, an output layer and an hidden layer, wherein the input layer contains 6 neurons, the hidden layer contains 11 neurons, and the output layer contains 3 neurons;
step 1.4, randomly extracting a group of input data from the training subset obtained in step 1.2, inputting the input data into the neural network A to obtain outputs corresponding to the input data, and respectively recording the outputs as solder layer stress network output values F F1 Stray inductance network output value L F1 And chip junction temperature network output value T jF1 Wherein f1=1, 2, E1;
step 1.5, carrying out parameter updating on the neural network A by adopting an error back propagation gradient descent algorithm to obtain an updated neural network B;
Step 1.6, respectively inputting the E2 groups of input data of the verification subset obtained in step 1.2 into the neural network B to obtain the corresponding E2 groups of outputs; any one group is denoted group F2 and comprises the solder layer stress network output value F_F2, the stray inductance network output value L_F2 and the chip junction temperature network output value T_jF2, where F2 = E1+1, E1+2, …, E;
Step 1.7, defining the root mean square error σ, whose expression is:
σ = sqrt( (1/(3·E2)) · Σ_{F2=E1+1}^{E} [ (F_F2 - F̂_F2)² + (L_F2 - L̂_F2)² + (T_jF2 - T̂_jF2)² ] )
The root mean square error σ is compared with the preset target error ε and the following judgment is made:
if σ < ε, the neural network model is constructed; otherwise, return to step 1.4;
and marking the constructed neural network model as a three-objective optimization model.
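For concreteness, the following minimal Python sketch (assumed PyTorch code, not part of the patent; the Tanh activation and the learning rate are assumptions) implements the 6-11-3 network of steps 1.3-1.7, with per-sample back-propagation updates and the validation RMSE stopping test σ < ε:

```python
import torch
import torch.nn as nn

# 6-11-3 network: inputs (I, d1, d2, d3, h1, h2) -> outputs (F, L, T_j)
surrogate = nn.Sequential(
    nn.Linear(6, 11), nn.Tanh(),  # hidden layer, 11 neurons (activation assumed)
    nn.Linear(11, 3),
)

def train_surrogate(x_tr, y_tr, x_val, y_val, eps=1e-3, lr=1e-2):
    opt = torch.optim.SGD(surrogate.parameters(), lr=lr)  # gradient descent (step 1.5)
    mse = nn.MSELoss()
    while True:
        i = torch.randint(len(x_tr), (1,))        # random training sample (step 1.4)
        opt.zero_grad()
        mse(surrogate(x_tr[i]), y_tr[i]).backward()
        opt.step()
        with torch.no_grad():                     # validation RMSE sigma (step 1.7)
            sigma = torch.sqrt(mse(surrogate(x_val), y_val))
        if sigma < eps:                           # stop when sigma < epsilon
            return surrogate
```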
Preferably, the implementation process of step 2 is as follows:
The state set S is defined as follows:
S = { s | s = (I, F, L, T_j) }
The action set A_0 is defined as follows:
A_0 = { a | a = (d_1, d_2, d_3, h_1, h_2) }
A certain time of the system is denoted t, and the time of the system termination state is denoted T, t = 1, 2, …, T. The state of the system at time t is denoted s_t and the action taken by the system at time t is denoted a_t; the specific expressions are as follows:
s_t = (I_t, F_t, L_t, T_jt),  a_t = (d_1t, d_2t, d_3t, h_1t, h_2t)
The state at time t+1, the time next to time t, is denoted s_{t+1}, and the action at time t+1 is denoted a_{t+1}; the specific expressions are as follows:
s_{t+1} = (I_{t+1}, F_{t+1}, L_{t+1}, T_j,t+1),  a_{t+1} = (d_1,t+1, d_2,t+1, d_3,t+1, h_1,t+1, h_2,t+1)
The reward function R represents the weighted sum of the reward values generated by all actions of the system from the current state to the termination state, expressed as follows:
R = Σ_{t=1}^{T} γ^(t-1) · r_t
where γ is the discount factor, representing the degree of influence of elapsed time on the reward value, and γ^(t-1) is the accumulated discount factor at time t; r_t is the single-step reward value obtained after the system takes action a_t in state s_t at time t, which weights the three optimization targets and penalizes constraint violations:
r_t = -(η_1·F_{t+1} + η_2·L_{t+1} + η_3·T_j,t+1) - ψ·c_t
where ψ is the penalty coefficient, c_t indicates whether the action violates the design constraints, η_1 is the first weight coefficient, η_2 is the second weight coefficient and η_3 is the third weight coefficient;
The mean of the single-step reward values r_t is recorded as the average reward r̄.
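A minimal Python sketch of the step 2 reward computation follows (not from the patent; the exact penalty form is an assumption based on the penalty coefficient ψ described above, and the violated flag is hypothetical):

```python
def single_step_reward(F, L, Tj, eta=(1.0, 1.0, 1.0), psi=1000.0, violated=False):
    """r_t: negative weighted sum of the three objectives, with a penalty of
    psi applied when the chosen module size violates a design constraint."""
    r = -(eta[0] * F + eta[1] * L + eta[2] * Tj)
    return r - psi if violated else r

def episode_return(rewards, gamma=0.9):
    """R = sum over t of gamma**(t-1) * r_t for one episode (t starts at 1)."""
    return sum(gamma ** (t - 1) * r for t, r in enumerate(rewards, start=1))
```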
Preferably, in step 3, the offline learning performed with the DDPG algorithm of machine learning to obtain the optimal policy π(s_y) is implemented as follows:
Step 3.1, initializing the neural network parameters θ^μ, θ^μ', θ^Q and θ^Q' of the online policy network, target policy network, online evaluation network and target evaluation network, letting θ^μ' = θ^μ and θ^Q' = θ^Q; initializing the capacity of the experience replay pool P as D;
The output of the online policy network is denoted a, a = μ(s|θ^μ), where a is the action value output by the online policy network, a corresponds to an individual in the action set A_0, and a = (d_1, d_2, d_3, h_1, h_2); s is the state value input to the online policy network, s corresponds to an individual in the state set S, and s = (I, F, L, T_j); μ is the policy derived from the neural network parameters θ^μ of the online policy network and the input state value s;
Step 3.2, inputting the state s_t of the system at time t into the online policy network to obtain its output μ_t(s_t|θ^μ_t), and adding noise δ_t to obtain the finally output action a_t; the specific expression is as follows:
a_t = μ_t(s_t|θ^μ_t) + δ_t
Step 3.3, the system executes action a_t based on the state s_t:
The three-objective optimization model is loaded into the machine learning algorithm and recorded as the environment model; I_t, d_1t, d_2t, d_3t, h_1t and h_2t are taken as the input variables of the environment model, and the output variables obtained are denoted F_{t+1}, L_{t+1} and T_j,t+1; a normal distribution function with I_t as its mean is constructed, a standard deviation is given, and random sampling yields I_{t+1};
The system transitions to the new state s_{t+1} = (I_{t+1}, F_{t+1}, L_{t+1}, T_j,t+1) and simultaneously obtains the single-step reward value r_t for executing action a_t; (s_t, a_t, r_t, s_{t+1}) is called a state transition sequence and is stored in the experience replay pool P; the system then enters the state s_{t+1} at the next time t+1;
Circularly executing the steps 3.2-3.3, recording the number of state transition sequences in the experience playback pool P as N, entering the step 3.4 if N=D, otherwise returning to the step 3.2;
Step 3.4, randomly extracting n state transition sequences from the experience replay pool P, where n < D, as the mini-batch data for training the online policy network and the online evaluation network; the k-th state transition sequence in the mini-batch is denoted (s_k, a_k, r_k, s_{k+1}), where n is the mini-batch sampling factor and k = 1, 2, …, n;
Step 3.5, according to the mini-batch data (s_k, a_k, r_k, s_{k+1}), k = 1, 2, …, n, obtained in step 3.4, calculating the cumulative reward y_k and the error function L(θ^Q); the specific expressions are as follows:
y_k = r_k + γ · Q'(s_{k+1}, μ'(s_{k+1}|θ^μ') | θ^Q')
L(θ^Q) = (1/n) · Σ_{k=1}^{n} ( y_k - Q(s_k, a_k|θ^Q) )²
where Q'(s_{k+1}, μ'(s_{k+1}|θ^μ')|θ^Q') is the scoring value output by the target evaluation network, μ'(s_{k+1}|θ^μ') is the action value output by the target policy network, and s_{k+1} is the state value input to the target evaluation network and the target policy network; Q(s_k, a_k|θ^Q) is the scoring value output by the online evaluation network, and s_k and a_k are the state value and the action value input to the online evaluation network;
Step 3.6, the online evaluation network updates θ^Q by minimizing the error function L(θ^Q), the online policy network updates θ^μ through the deterministic policy gradient ∇_{θ^μ}J, and the target evaluation network and the target policy network update θ^Q' and θ^μ' by the moving-average method; the specific expressions are as follows:
θ^Q ← θ^Q - α_Q · ∇_{θ^Q} L(θ^Q)
∇_{θ^μ}J = (1/n) · Σ_{k=1}^{n} ∇_a Q(s, a|θ^Q)|_{s=s_k, a=μ(s_k)} · ∇_{θ^μ} μ(s|θ^μ)|_{s=s_k}
θ^μ ← θ^μ + α_μ · ∇_{θ^μ}J
θ^Q' ← τ·θ^Q + (1-τ)·θ^Q'
θ^μ' ← τ·θ^μ + (1-τ)·θ^μ'
where ∇ is the partial derivative symbol: ∇_{θ^μ}J denotes the partial derivative of the policy objective J with respect to θ^μ; ∇_a Q(s, a|θ^Q)|_{s=s_k, a=μ(s_k)} denotes the partial derivative, with respect to the action value a, of the scoring value output by the online evaluation network when its input is s = s_k and a = μ(s_k); ∇_{θ^μ} μ(s|θ^μ)|_{s=s_k} denotes the partial derivative, with respect to θ^μ, of the action value output by the online policy network when its input is s = s_k; ∇_{θ^Q} L(θ^Q) denotes the partial derivative of the error function L(θ^Q) with respect to θ^Q; α_Q is the learning rate of the online evaluation network, α_μ is the learning rate of the online policy network, and τ is the moving-average update parameter, with 0 < α_Q < 1, 0 < α_μ < 1 and 0 < τ < 1; the left-hand sides of the update expressions are the updated neural network parameters of the online evaluation network, the online policy network, the target evaluation network and the target policy network, respectively;
Step 3.7, given a step index step, a maximum step number step_max, a training round index m and a maximum training round number M, with step = 1, 2, …, step_max and m = 1, 2, …, M: each completion of steps 3.4 to 3.6 completes the training process of one step; steps 3.4 to 3.6 are executed repeatedly, and when step_max steps have been completed, one round of training is completed; the next round of training starts again from steps 3.2 to 3.6, steps 3.2-3.6 are executed repeatedly, and when M rounds of training have been completed, the learning process of the DDPG algorithm ends;
The neural network parameters θ^μ, θ^μ', θ^Q and θ^Q' of the online policy network, target policy network, online evaluation network and target evaluation network are updated in the direction of maximizing the average reward r̄, finally yielding the optimal policy π(s_y).
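The following condensed Python sketch (assumed PyTorch code, not from the patent; the network objects and optimizers are hypothetical) shows one pass of steps 3.4-3.6: the online evaluation network is updated by minimizing L(θ^Q), the online policy network by ascending the deterministic policy gradient, and the target networks by the moving-average rule:

```python
import torch
import torch.nn.functional as nnf

def ddpg_update(actor, critic, actor_tgt, critic_tgt,
                actor_opt, critic_opt, batch, gamma=0.9, tau=0.005):
    s, a, r, s2 = batch  # tensors stacked from n sampled transition sequences
    # Cumulative reward y_k = r_k + gamma * Q'(s_{k+1}, mu'(s_{k+1}))
    with torch.no_grad():
        y = r + gamma * critic_tgt(s2, actor_tgt(s2))
    # Online evaluation network: minimize L(theta_Q) = mean((y_k - Q(s_k, a_k))^2)
    critic_opt.zero_grad()
    nnf.mse_loss(critic(s, a), y).backward()
    critic_opt.step()
    # Online policy network: ascend the deterministic policy gradient
    actor_opt.zero_grad()
    (-critic(s, actor(s)).mean()).backward()
    actor_opt.step()
    # Target networks: theta' <- tau * theta + (1 - tau) * theta'
    for tgt, src in ((critic_tgt, critic), (actor_tgt, actor)):
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1.0 - tau).add_(tau * p.data)
```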
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention uses a neural network to construct the nonlinear mapping between the module sizes and the optimization targets. Exploiting the ANN's ability to learn the distributed features of the data, the complex finite element simulation is converted into the matrix expression of the corresponding neural network, so the optimization targets of the IGBT module can be obtained quickly by inputting the corresponding module sizes. This avoids the time-consuming and complex finite element simulation and the difficult software interaction caused by multi-physics coupling, greatly reducing the required computing resources;
(2) The optimal policy π(s_y) provided by the invention directly yields the optimal design variable values under the design requirements of different IGBT module current levels, maximizing the average reward without repeating the complex and time-consuming optimization-solving process; it is simple, convenient and fast, and saves computing resources.
Drawings
Fig. 1 is a three-objective optimization model structure based on a neural network, which is built in an embodiment of the present invention.
FIG. 2 is a chart showing the convergence of the neural network training error with the number of training iterations according to the present invention.
FIG. 3 is a block diagram of a multi-objective optimization method of the present invention.
FIG. 4 is a flow chart of the multi-objective optimization method of the present invention.
FIG. 5 is a chart showing the convergence effect of average rewards in an embodiment of the invention.
Description of the embodiments
The present invention will be described in detail with reference to the accompanying drawings.
Fig. 1 is a structural diagram of the neural network of the three-objective optimization model established in an embodiment of the present invention. As can be seen, the input layer contains 6 neurons, the output layer contains 3 neurons and the hidden layer contains 11 neurons; given a set of inputs (I, d_1, d_2, d_3, h_1, h_2), a set of outputs (F, L, T_j) is obtained. In addition, Fig. 1 shows the connections of the 11 hidden-layer neurons to the 6 input-layer neurons and to the 3 output-layer neurons respectively.
Fig. 2 is the convergence graph of the root mean square error during neural network training in this example. After the data set and structure of the neural network are determined, the neural network is trained. After about 10000 iterations, the root mean square error is significantly reduced and the neural network training is complete.
Fig. 3 is a block diagram of the IGBT module multi-objective optimization method of the present invention, and fig. 4 is a flowchart of the IGBT module multi-objective optimization method of the present invention. As can be seen from fig. 3 and 4, the IGBT module multi-objective optimization method optimizes the IGBT module stress, stray inductance, and junction temperature based on machine learning.
The invention provides a machine-learning-based multi-objective optimization method for IGBT module packaging, where the IGBT module comprises an upper bridge arm chip, a lower bridge arm chip, a DBC substrate (direct bonded copper ceramic substrate), a solder layer and bonding wires; the DBC substrate comprises an upper copper layer, a ceramic layer and a lower copper layer, the upper and lower copper layers having the same thickness. Specifically, the method comprises the following steps:
step 1, constructing a three-objective optimization model based on a neural network;
the IGBT module is recorded as a system, and the stress F of a solder layer, the stray inductance L and the junction temperature T of a chip of the system are used j Establishing a three-target optimization model based on a neural network as a target;
the input variables of the neural network are 6, and the neural network is divided into two groups, wherein the first group is a current level I, and the second group is an IGBT module size, and the neural network comprises: lateral distance d of the same bridge arm chip 1 Lateral distance d between upper bridge arm chip and lower bridge arm chip 2 Longitudinal distance d between upper bridge arm chip and lower bridge arm chip 3 Copper layer thickness h 1 And ceramic layer thickness h 2
The output variables of the neural network are 3, and the output variables are respectively: stress F of solder layer, strayingInductance L and chip junction temperature T j
In this embodiment, the implementation procedure of step 1 is as follows:
step 1.1, determining input variables and output variables of a neural network;
The neural network has 6 input variables, namely the current level I of the system, the lateral distance d_1 between chips of the same bridge arm, the lateral distance d_2 between the upper bridge arm chips and the lower bridge arm chips, the longitudinal distance d_3 between the upper bridge arm chips and the lower bridge arm chips, the copper layer thickness h_1 and the ceramic layer thickness h_2; the neural network has 3 output variables, namely the solder layer stress F, the stray inductance L and the chip junction temperature T_j;
Step 1.2, acquiring a sample data set required for constructing a neural network by using simulation software;
Sample data required for constructing the neural network are acquired with simulation software and a sample data set is established; the sample data set comprises E groups of sample data, each group comprising 6 neural network input data and the 3 corresponding neural network simulated output values, denoted the input T_F and the simulated output Γ_F respectively: T_F = (I_F, d_1F, d_2F, d_3F, h_1F, h_2F) and Γ_F = (F̂_F, L̂_F, T̂_jF), where F̂_F is the simulated solder layer stress output value, L̂_F is the simulated stray inductance output value and T̂_jF is the simulated chip junction temperature output value, with F = 1, 2, …, E;
The sample data set is divided into a training subset and a verification subset; the training subset comprises E1 groups of sample data and the verification subset comprises E2 groups of sample data, with E1 + E2 = E;
step 1.3, constructing a neural network A;
The neural network A consists of an input layer, a hidden layer and an output layer; the input layer contains 6 neurons, the hidden layer contains 11 neurons, and the output layer contains 3 neurons;
Step 1.4, randomly extracting a group of input data, denoted group F1, from the training subset obtained in step 1.2 and inputting it into the neural network A to obtain the corresponding outputs, denoted the solder layer stress network output value F_F1, the stray inductance network output value L_F1 and the chip junction temperature network output value T_jF1, where F1 = 1, 2, …, E1;
step 1.5, carrying out parameter updating on the neural network A by adopting an error back propagation gradient descent algorithm to obtain an updated neural network B;
Step 1.6, respectively inputting the E2 groups of input data of the verification subset obtained in step 1.2 into the neural network B to obtain the corresponding E2 groups of outputs; any one group is denoted group F2 and comprises the solder layer stress network output value F_F2, the stray inductance network output value L_F2 and the chip junction temperature network output value T_jF2, where F2 = E1+1, E1+2, …, E;
Step 1.7, defining the root mean square error σ, whose expression is:
σ = sqrt( (1/(3·E2)) · Σ_{F2=E1+1}^{E} [ (F_F2 - F̂_F2)² + (L_F2 - L̂_F2)² + (T_jF2 - T̂_jF2)² ] )
The root mean square error σ is compared with the preset target error ε and the following judgment is made:
if σ < ε, the neural network model is constructed; otherwise, return to step 1.4;
and marking the constructed neural network model as a three-objective optimization model.
Specifically, in this example, the stray inductance of the IGBT module is first extracted with the Q3D finite element simulation software; the DC positive input terminal side is set as the source and the DC negative input terminal side as the sink. The module sizes of this half-bridge module, namely the lateral distance d_1 between chips of the same bridge arm, the lateral distance d_2 between the upper and lower bridge arm chips, the longitudinal distance d_3 between the upper and lower bridge arm chips, the copper layer thickness h_1 and the ceramic layer thickness h_2, are parameterized. d_1 ranges from 1 mm to 11 mm with a step of 2 mm; d_2 ranges from 5 mm to 20 mm with a step of 3 mm; d_3 ranges from 5 mm to 25 mm with a step of 5 mm; h_1 ranges from 0.1 mm to 0.2 mm with a step of 0.05 mm; h_2 ranges from 0.2 mm to 0.4 mm with a step of 0.1 mm. A total of 5⁴×3 = 1875 groups of data are obtained, recorded as data A.
In this example, the IGBT module is then simulated with electro-thermal-mechanical coupling in the COMSOL software to extract the chip junction temperature and solder layer stress. The module sizes and the current level of the half-bridge module are parameterized; the current level ranges from 230 A to 250 A with a step of 4, and the ranges and steps of the module sizes are the same as above. A total of 5⁵×3 = 9375 groups of data are obtained, recorded as data B.
In this example, the input data of data A and data B stand in an inclusion relationship and can therefore be integrated into one data set. The data are normalized and randomly shuffled to obtain the sample data set, which contains 5⁵×3 = 9375 groups of sample data, i.e. E = 9375.
In this example, the sample data set is divided into a training subset and a verification subset. Following the classical 80%/20% split, 80% of the data is used as the training subset and 20% as the verification subset, so E1 = 7500 and E2 = 1875.
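The parameter sweep and preprocessing described above can be sketched as follows (illustrative Python/NumPy, not from the patent; the exact grid points, the min-max normalization and the random seed are assumptions, so the resulting group counts depend on the sampling points actually used in the simulations):

```python
import itertools
import numpy as np

# Parameter grids per the stated ranges/steps (exact sampling points assumed)
grids = {
    "I":  np.arange(230, 250 + 1, 4),          # current level
    "d1": np.arange(1, 11 + 1, 2),             # mm
    "d2": np.arange(5, 20 + 1, 3),             # mm
    "d3": np.arange(5, 25 + 1, 5),             # mm
    "h1": np.arange(0.10, 0.20 + 1e-9, 0.05),  # mm
    "h2": np.arange(0.20, 0.40 + 1e-9, 0.10),  # mm
}
samples = np.array(list(itertools.product(*grids.values())))

# Normalize each column to [0, 1], shuffle, then split 80%/20%
lo, hi = samples.min(axis=0), samples.max(axis=0)
samples = (samples - lo) / (hi - lo)
rng = np.random.default_rng(seed=0)
rng.shuffle(samples)
split = int(0.8 * len(samples))
train_set, val_set = samples[:split], samples[split:]
```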
Step 2, determining the state set S, the action set A_0 and the reward function R according to the three-objective optimization model obtained in step 1, and calculating the average reward r̄;
In this embodiment, the implementation procedure of step 2 is as follows:
The state set S is defined as follows:
S = { s | s = (I, F, L, T_j) }
The action set A_0 is defined as follows:
A_0 = { a | a = (d_1, d_2, d_3, h_1, h_2) }
A certain time of the system is denoted t, and the time of the system termination state is denoted T, t = 1, 2, …, T. The state of the system at time t is denoted s_t and the action taken by the system at time t is denoted a_t; the specific expressions are as follows:
s_t = (I_t, F_t, L_t, T_jt),  a_t = (d_1t, d_2t, d_3t, h_1t, h_2t)
The state at time t+1, the time next to time t, is denoted s_{t+1}, and the action at time t+1 is denoted a_{t+1}; the specific expressions are as follows:
s_{t+1} = (I_{t+1}, F_{t+1}, L_{t+1}, T_j,t+1),  a_{t+1} = (d_1,t+1, d_2,t+1, d_3,t+1, h_1,t+1, h_2,t+1)
The reward function R represents the weighted sum of the reward values generated by all actions of the system from the current state to the termination state, expressed as follows:
R = Σ_{t=1}^{T} γ^(t-1) · r_t
where γ is the discount factor, representing the degree of influence of elapsed time on the reward value, and γ^(t-1) is the accumulated discount factor at time t; r_t is the single-step reward value obtained after the system takes action a_t in state s_t at time t, which weights the three optimization targets and penalizes constraint violations:
r_t = -(η_1·F_{t+1} + η_2·L_{t+1} + η_3·T_j,t+1) - ψ·c_t
where ψ is the penalty coefficient, c_t indicates whether the action violates the design constraints, η_1 is the first weight coefficient, η_2 is the second weight coefficient and η_3 is the third weight coefficient;
The mean of the single-step reward values r_t is recorded as the average reward r̄.
Specifically, ψ = 1000 and η_1 = η_2 = η_3 = 1 are taken, with γ = 0.9.
Step 3, according to the state set S, the action set A_0 and the reward function R obtained in step 2, performing offline learning with the DDPG algorithm of machine learning to obtain the optimal policy π(s_y);
The DDPG algorithm comprises 4 neural networks, namely an online policy network, a target policy network, an online evaluation network and a target evaluation network; the neural network parameters of the online policy network are denoted θ^μ, those of the target policy network θ^μ', those of the online evaluation network θ^Q and those of the target evaluation network θ^Q';
The optimal policy π(s_y) is expressed as follows:
a_y = π(s_y)
where s_y is the input state value and a_y is the action value output by the optimal policy π(s_y) for that state; s_y = (I_y, F_y, L_y, T_jy), where I_y is the current level in any state of the state set S, and F_y, L_y and T_jy are the solder layer stress, stray inductance and chip junction temperature corresponding to the initial module size and the current level I_y; a_y = (d_1y, d_2y, d_3y, h_1y, h_2y), where (d_1y, d_2y, d_3y, h_1y, h_2y) is the optimal IGBT module size output by the optimal policy in the current-level-I_y state, corresponding to the lowest stray inductance, the lowest chip junction temperature and the lowest solder layer stress.
Step 4, substituting the optimal policy π(s_y) into the neural-network-based three-objective optimization model established in step 1; under any state in the state set S, the system can maximize the average reward r̄ by adopting the optimal policy π(s_y).
In this embodiment, the specific implementation of step 3, performing offline learning with the DDPG algorithm of machine learning to obtain the optimal policy π(s_y), is as follows:
Step 3.1, initializing the neural network parameters θ^μ, θ^μ', θ^Q and θ^Q' of the online policy network, target policy network, online evaluation network and target evaluation network, letting θ^μ' = θ^μ and θ^Q' = θ^Q; initializing the capacity of the experience replay pool P as D;
The output of the online policy network is denoted a, a = μ(s|θ^μ), where a is the action value output by the online policy network, a corresponds to an individual in the action set A_0, and a = (d_1, d_2, d_3, h_1, h_2); s is the state value input to the online policy network, s corresponds to an individual in the state set S, and s = (I, F, L, T_j); μ is the policy derived from the neural network parameters θ^μ of the online policy network and the input state value s;
Step 3.2, inputting the state s_t of the system at time t into the online policy network to obtain its output μ_t(s_t|θ^μ_t), and adding noise δ_t to obtain the finally output action a_t; the specific expression is as follows:
a_t = μ_t(s_t|θ^μ_t) + δ_t
Step 3.3, the system executes action a_t based on the state s_t:
The three-objective optimization model is loaded into the machine learning algorithm and recorded as the environment model; I_t, d_1t, d_2t, d_3t, h_1t and h_2t are taken as the input variables of the environment model, and the output variables obtained are denoted F_{t+1}, L_{t+1} and T_j,t+1; a normal distribution function with I_t as its mean is constructed, a standard deviation is given, and random sampling yields I_{t+1};
The system transitions to the new state s_{t+1} = (I_{t+1}, F_{t+1}, L_{t+1}, T_j,t+1) and simultaneously obtains the single-step reward value r_t for executing action a_t; (s_t, a_t, r_t, s_{t+1}) is called a state transition sequence and is stored in the experience replay pool P; the system then enters the state s_{t+1} at the next time t+1;
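The environment transition of step 3.3 can be sketched as follows (illustrative Python; the surrogate call signature, the function name env_step and the standard deviation sigma_I are assumptions):

```python
import numpy as np
import torch

def env_step(surrogate, I_t, action, sigma_I=2.0, rng=None):
    """One environment transition: the surrogate gives (F, L, T_j) at t+1,
    and I_{t+1} is sampled from a normal distribution with mean I_t."""
    rng = rng or np.random.default_rng()
    d1, d2, d3, h1, h2 = action
    x = torch.tensor([I_t, d1, d2, d3, h1, h2], dtype=torch.float32)
    with torch.no_grad():
        F_next, L_next, Tj_next = surrogate(x).tolist()
    I_next = float(rng.normal(loc=I_t, scale=sigma_I))
    return (I_next, F_next, L_next, Tj_next)  # new state s_{t+1}
```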
Circularly executing the steps 3.2-3.3, recording the number of state transition sequences in the experience playback pool P as N, entering the step 3.4 if N=D, otherwise returning to the step 3.2;
step 3.4, randomly extracting n state transition sequences from the experience playback pool P, wherein n is less than D, taking the n state transition sequences as small batch data for training an online strategy network and an online evaluation network, and recording the kth state transition sequence in the small batch data as(s) k ,a k ,r k ,s k+1 ) N is a small batch sampling factor, k=1, 2,;
Step 3.5, according to the mini-batch data (s_k, a_k, r_k, s_{k+1}), k = 1, 2, …, n, obtained in step 3.4, calculating the cumulative reward y_k and the error function L(θ^Q); the specific expressions are as follows:
y_k = r_k + γ · Q'(s_{k+1}, μ'(s_{k+1}|θ^μ') | θ^Q')
L(θ^Q) = (1/n) · Σ_{k=1}^{n} ( y_k - Q(s_k, a_k|θ^Q) )²
where Q'(s_{k+1}, μ'(s_{k+1}|θ^μ')|θ^Q') is the scoring value output by the target evaluation network, μ'(s_{k+1}|θ^μ') is the action value output by the target policy network, and s_{k+1} is the state value input to the target evaluation network and the target policy network; Q(s_k, a_k|θ^Q) is the scoring value output by the online evaluation network, and s_k and a_k are the state value and the action value input to the online evaluation network;
Step 3.6, the online evaluation network updates θ^Q by minimizing the error function L(θ^Q), the online policy network updates θ^μ through the deterministic policy gradient ∇_{θ^μ}J, and the target evaluation network and the target policy network update θ^Q' and θ^μ' by the moving-average method; the specific expressions are as follows:
θ^Q ← θ^Q - α_Q · ∇_{θ^Q} L(θ^Q)
∇_{θ^μ}J = (1/n) · Σ_{k=1}^{n} ∇_a Q(s, a|θ^Q)|_{s=s_k, a=μ(s_k)} · ∇_{θ^μ} μ(s|θ^μ)|_{s=s_k}
θ^μ ← θ^μ + α_μ · ∇_{θ^μ}J
θ^Q' ← τ·θ^Q + (1-τ)·θ^Q'
θ^μ' ← τ·θ^μ + (1-τ)·θ^μ'
where ∇ is the partial derivative symbol: ∇_{θ^μ}J denotes the partial derivative of the policy objective J with respect to θ^μ; ∇_a Q(s, a|θ^Q)|_{s=s_k, a=μ(s_k)} denotes the partial derivative, with respect to the action value a, of the scoring value output by the online evaluation network when its input is s = s_k and a = μ(s_k); ∇_{θ^μ} μ(s|θ^μ)|_{s=s_k} denotes the partial derivative, with respect to θ^μ, of the action value output by the online policy network when its input is s = s_k; ∇_{θ^Q} L(θ^Q) denotes the partial derivative of the error function L(θ^Q) with respect to θ^Q; α_Q is the learning rate of the online evaluation network, α_μ is the learning rate of the online policy network, and τ is the moving-average update parameter, with 0 < α_Q < 1, 0 < α_μ < 1 and 0 < τ < 1; the left-hand sides of the update expressions are the updated neural network parameters of the online evaluation network, the online policy network, the target evaluation network and the target policy network, respectively;
Step 3.7, given a step index step, a maximum step number step_max, a training round index m and a maximum training round number M, with step = 1, 2, …, step_max and m = 1, 2, …, M: each completion of steps 3.4 to 3.6 completes the training process of one step; steps 3.4 to 3.6 are executed repeatedly, and when step_max steps have been completed, one round of training is completed; the next round of training starts again from steps 3.2 to 3.6, steps 3.2-3.6 are executed repeatedly, and when M rounds of training have been completed, the learning process of the DDPG algorithm ends;
The neural network parameters θ^μ, θ^μ', θ^Q and θ^Q' of the online policy network, target policy network, online evaluation network and target evaluation network are updated in the direction of maximizing the average reward r̄, finally yielding the optimal policy π(s_y).
In this embodiment, D is taken as 10000, M as 300, α_Q = α_μ = 0.001, and the mini-batch sampling factor n as 32.
FIG. 5 shows the convergence of the average reward in the embodiment of the present invention; the abscissa of Fig. 5 is the training round number m and the ordinate is the average reward r̄. As can be seen from Fig. 5, as the training round number m increases, the average reward r̄ first oscillates up and down, then gradually increases, and finally settles between -212 and -214; when m = 300, r̄ = -212.8. The training effect has reached its optimum: the neural network parameters θ^μ, θ^μ', θ^Q and θ^Q' of the online policy network, target policy network, online evaluation network and target evaluation network have been fully updated, and the optimal policy π(s_y) is obtained.

Claims (4)

1. A multi-objective optimization method for IGBT module packaging based on machine learning, wherein the IGBT module comprises an upper bridge arm chip, a lower bridge arm chip, a DBC substrate, a solder layer and a bonding wire; the DBC substrate comprises an upper copper layer, a ceramic layer and a lower copper layer, wherein the thicknesses of the upper copper layer and the lower copper layer are the same; the method is characterized by comprising the following steps of:
step 1, constructing a three-objective optimization model based on a neural network;
the IGBT module is recorded as a system, and the stress F of a solder layer, the stray inductance L and the junction temperature T of a chip of the system are used j Establishing a three-target optimization model based on a neural network as a target;
the input variables of the neural network are 6, and the neural network is divided into two groups, wherein the first group is a current level I, and the second group is an IGBT module size, and the neural network comprises: lateral distance d of the same bridge arm chip 1 Lateral distance d between upper bridge arm chip and lower bridge arm chip 2 Longitudinal distance d between upper bridge arm chip and lower bridge arm chip 3 Copper layer thickness h 1 And ceramic layer thickness h 2
The output variables of the neural network are 3, and the output variables are respectively: solder layer stress F, stray inductance L and chip junction temperature T j
Step 2, determining the state set S, the action set A_0 and the reward function R according to the three-objective optimization model obtained in step 1, and calculating the average reward r̄;
Step 3, according to the state set S, the action set A_0 and the reward function R obtained in step 2, performing offline learning with the DDPG algorithm of machine learning to obtain the optimal policy π(s_y);
The DDPG algorithm comprises 4 neural networks, namely an online policy network, a target policy network, an online evaluation network and a target evaluation network; the neural network parameters of the online policy network are denoted θ^μ, those of the target policy network θ^μ', those of the online evaluation network θ^Q and those of the target evaluation network θ^Q';
The optimal policy π(s_y) is expressed as follows:
a_y = π(s_y)
where s_y is the input state value and a_y is the action value output by the optimal policy π(s_y) for that state; s_y = (I_y, F_y, L_y, T_jy), where I_y is the current level in any state of the state set S, and F_y, L_y and T_jy are the solder layer stress, stray inductance and chip junction temperature corresponding to the initial module size and the current level I_y; a_y = (d_1y, d_2y, d_3y, h_1y, h_2y), where (d_1y, d_2y, d_3y, h_1y, h_2y) is the optimal IGBT module size output by the optimal policy in the current-level-I_y state, corresponding to the lowest stray inductance, the lowest chip junction temperature and the lowest solder layer stress;
Step 4, substituting the optimal policy π(s_y) into the neural-network-based three-objective optimization model established in step 1; under any state in the state set S, the system can maximize the average reward r̄ by adopting the optimal policy π(s_y).
2. The multi-objective optimization method of machine learning based IGBT module packaging of claim 1, wherein the implementation process of step 1 is as follows:
step 1.1, determining input variables and output variables of a neural network;
The neural network has 6 input variables, namely the current level I of the system, the lateral distance d_1 between chips of the same bridge arm, the lateral distance d_2 between the upper bridge arm chips and the lower bridge arm chips, the longitudinal distance d_3 between the upper bridge arm chips and the lower bridge arm chips, the copper layer thickness h_1 and the ceramic layer thickness h_2; the neural network has 3 output variables, namely the solder layer stress F, the stray inductance L and the chip junction temperature T_j;
Step 1.2, acquiring a sample data set required for constructing a neural network by using simulation software;
Sample data required for constructing the neural network are acquired with simulation software and a sample data set is established; the sample data set comprises E groups of sample data, each group comprising 6 neural network input data and the 3 corresponding neural network simulated output values, denoted the input T_F and the simulated output Γ_F respectively: T_F = (I_F, d_1F, d_2F, d_3F, h_1F, h_2F) and Γ_F = (F̂_F, L̂_F, T̂_jF), where F̂_F is the simulated solder layer stress output value, L̂_F is the simulated stray inductance output value and T̂_jF is the simulated chip junction temperature output value, with F = 1, 2, …, E;
The sample data set is divided into a training subset and a verification subset; the training subset comprises E1 groups of sample data and the verification subset comprises E2 groups of sample data, with E1 + E2 = E;
step 1.3, constructing a neural network A;
The neural network A consists of an input layer, a hidden layer and an output layer; the input layer contains 6 neurons, the hidden layer contains 11 neurons, and the output layer contains 3 neurons;
Step 1.4, randomly extracting a group of input data, denoted group F1, from the training subset obtained in step 1.2 and inputting it into the neural network A to obtain the corresponding outputs, denoted the solder layer stress network output value F_F1, the stray inductance network output value L_F1 and the chip junction temperature network output value T_jF1, where F1 = 1, 2, …, E1;
step 1.5, carrying out parameter updating on the neural network A by adopting an error back propagation gradient descent algorithm to obtain an updated neural network B;
Step 1.6, respectively inputting the E2 groups of input data of the verification subset obtained in step 1.2 into the neural network B to obtain the corresponding E2 groups of outputs; any one group is denoted group F2 and comprises the solder layer stress network output value F_F2, the stray inductance network output value L_F2 and the chip junction temperature network output value T_jF2, where F2 = E1+1, E1+2, …, E;
Step 1.7, defining the root mean square error σ, whose expression is:
σ = sqrt( (1/(3·E2)) · Σ_{F2=E1+1}^{E} [ (F_F2 - F̂_F2)² + (L_F2 - L̂_F2)² + (T_jF2 - T̂_jF2)² ] )
The root mean square error σ is compared with the preset target error ε and the following judgment is made:
if σ < ε, the neural network model is constructed; otherwise, return to step 1.4;
and marking the constructed neural network model as a three-objective optimization model.
3. The multi-objective optimization method of machine learning based IGBT module packaging of claim 1, wherein the implementation process of step 2 is as follows:
The state set S is defined as follows:
S = { s | s = (I, F, L, T_j) }
The action set A_0 is defined as follows:
A_0 = { a | a = (d_1, d_2, d_3, h_1, h_2) }
A certain time of the system is denoted t, and the time of the system termination state is denoted T, t = 1, 2, …, T. The state of the system at time t is denoted s_t and the action taken by the system at time t is denoted a_t; the specific expressions are as follows:
s_t = (I_t, F_t, L_t, T_jt),  a_t = (d_1t, d_2t, d_3t, h_1t, h_2t)
The state at time t+1, the time next to time t, is denoted s_{t+1}, and the action at time t+1 is denoted a_{t+1}; the specific expressions are as follows:
s_{t+1} = (I_{t+1}, F_{t+1}, L_{t+1}, T_j,t+1),  a_{t+1} = (d_1,t+1, d_2,t+1, d_3,t+1, h_1,t+1, h_2,t+1)
The reward function R represents the weighted sum of the reward values generated by all actions of the system from the current state to the termination state, expressed as follows:
R = Σ_{t=1}^{T} γ^(t-1) · r_t
where γ is the discount factor, representing the degree of influence of elapsed time on the reward value, and γ^(t-1) is the accumulated discount factor at time t; r_t is the single-step reward value obtained after the system takes action a_t in state s_t at time t, which weights the three optimization targets and penalizes constraint violations:
r_t = -(η_1·F_{t+1} + η_2·L_{t+1} + η_3·T_j,t+1) - ψ·c_t
where ψ is the penalty coefficient, c_t indicates whether the action violates the design constraints, η_1 is the first weight coefficient, η_2 is the second weight coefficient and η_3 is the third weight coefficient;
The mean of the single-step reward values r_t is recorded as the average reward r̄.
4. The multi-objective optimization method of machine-learning-based IGBT module packaging according to claim 1, wherein in step 3 the offline learning performed with the DDPG algorithm of machine learning to obtain the optimal policy π(s_y) is implemented as follows:
Step 3.1, initializing the neural network parameters θ^μ, θ^μ', θ^Q and θ^Q' of the online policy network, target policy network, online evaluation network and target evaluation network, letting θ^μ' = θ^μ and θ^Q' = θ^Q; initializing the capacity of the experience replay pool P as D;
The output of the online policy network is denoted a, a = μ(s|θ^μ), where a is the action value output by the online policy network, a corresponds to an individual in the action set A_0 in claim 1, and a = (d_1, d_2, d_3, h_1, h_2); s is the state value input to the online policy network, s corresponds to an individual in the state set S in claim 1, and s = (I, F, L, T_j); μ is the policy derived from the neural network parameters θ^μ of the online policy network and the input state value s;
Step 3.2, inputting the state s_t of the system at time t into the online policy network to obtain its output μ_t(s_t|θ^μ_t), and adding noise δ_t to obtain the finally output action a_t; the specific expression is as follows:
a_t = μ_t(s_t|θ^μ_t) + δ_t
Step 3.3, the system executes action a_t based on the state s_t:
The three-objective optimization model is loaded into the machine learning algorithm and recorded as the environment model; I_t, d_1t, d_2t, d_3t, h_1t and h_2t are taken as the input variables of the environment model, and the output variables obtained are denoted F_{t+1}, L_{t+1} and T_j,t+1; a normal distribution function with I_t as its mean is constructed, a standard deviation is given, and random sampling yields I_{t+1};
The system transitions to the new state s_{t+1} = (I_{t+1}, F_{t+1}, L_{t+1}, T_j,t+1) and simultaneously obtains the single-step reward value r_t for executing action a_t; (s_t, a_t, r_t, s_{t+1}) is called a state transition sequence and is stored in the experience replay pool P; the system then enters the state s_{t+1} at the next time t+1;
Circularly executing the steps 3.2-3.3, recording the number of state transition sequences in the experience playback pool P as N, entering the step 3.4 if N=D, otherwise returning to the step 3.2;
step 3.4, randomly extracting n state transition sequences from the experience playback pool P, wherein n is less than D, taking the n state transition sequences as small batch data for training an online strategy network and an online evaluation network, and recording the kth state transition sequence in the small batch data as(s) k ,a k ,r k ,s k+1 ) N is a small batch sampling factor, k=1, 2,;
Step 3.5, according to the mini-batch data (s_k, a_k, r_k, s_{k+1}), k = 1, 2, …, n, obtained in step 3.4, calculating the cumulative reward y_k and the error function L(θ^Q); the specific expressions are as follows:
y_k = r_k + γ · Q'(s_{k+1}, μ'(s_{k+1}|θ^μ') | θ^Q')
L(θ^Q) = (1/n) · Σ_{k=1}^{n} ( y_k - Q(s_k, a_k|θ^Q) )²
where Q'(s_{k+1}, μ'(s_{k+1}|θ^μ')|θ^Q') is the scoring value output by the target evaluation network, μ'(s_{k+1}|θ^μ') is the action value output by the target policy network, and s_{k+1} is the state value input to the target evaluation network and the target policy network; Q(s_k, a_k|θ^Q) is the scoring value output by the online evaluation network, and s_k and a_k are the state value and the action value input to the online evaluation network;
step 3.6, the online evaluation network updates θ_Q by minimizing the error function L(θ_Q); the online policy network updates θ_μ through the deterministic policy gradient ∇_{θ_μ}J; the target evaluation network and the target policy network update θ_Q' and θ_μ' by the moving-average method. The specific expressions are as follows:

θ_Q ← θ_Q − α_Q·∇_{θ_Q}L(θ_Q)

∇_{θ_μ}J ≈ (1/n)·Σ_{k=1}^{n} ∇_a Q(s, a|θ_Q)|_{s=s_k, a=μ(s_k)} · ∇_{θ_μ} μ(s|θ_μ)|_{s=s_k}

θ_μ ← θ_μ + α_μ·∇_{θ_μ}J

θ_Q' ← τ·θ_Q + (1 − τ)·θ_Q'

θ_μ' ← τ·θ_μ + (1 − τ)·θ_μ'

where ∇ is the partial derivative symbol; ∇_{θ_μ}J denotes the partial derivative of the policy objective J with respect to θ_μ; ∇_a Q(s, a|θ_Q)|_{s=s_k, a=μ(s_k)} denotes the partial derivative, with respect to the action value a, of the scoring value output by the online evaluation network when its input is s = s_k, a = μ(s_k); ∇_{θ_μ} μ(s|θ_μ)|_{s=s_k} denotes the partial derivative, with respect to θ_μ, of the action value output by the online policy network when its input is s = s_k; ∇_{θ_Q}L(θ_Q) denotes the partial derivative of the error function L(θ_Q) with respect to θ_Q;

α_Q is the learning rate of the online evaluation network, α_μ is the learning rate of the online policy network, and τ is the moving-average update parameter, with 0 < α_Q < 1, 0 < α_μ < 1, and 0 < τ < 1; the left-hand sides of the above expressions are, respectively, the updated neural network parameters of the online evaluation network, the online policy network, the target evaluation network, and the target policy network;
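One step-3.6 update as a sketch, reusing critic_loss from above; τ = 0.005 is an assumed value, and standard PyTorch optimizers stand in for the plain gradient steps written out in the expressions:

import torch

def ddpg_update(batch, mu, q, mu_targ, q_targ, opt_mu, opt_q, tau=0.005):
    # Online evaluation network: gradient step minimizing L(theta_Q)
    loss_q = critic_loss(batch, q, q_targ, mu_targ)
    opt_q.zero_grad(); loss_q.backward(); opt_q.step()
    # Online policy network: ascend Q(s, mu(s)) (deterministic policy gradient)
    s = batch[0]
    loss_mu = -q(s, mu(s)).mean()
    opt_mu.zero_grad(); loss_mu.backward(); opt_mu.step()
    # Target networks: moving-average (soft) update of theta_Q' and theta_mu'
    with torch.no_grad():
        for p, p_t in zip(q.parameters(), q_targ.parameters()):
            p_t.mul_(1 - tau).add_(tau * p)
        for p, p_t in zip(mu.parameters(), mu_targ.parameters()):
            p_t.mul_(1 - tau).add_(tau * p)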
step 3.7, a step counter step, a maximum step count step_max, a training round counter m, and a maximum training round count M are given, with step = 1, 2, ..., step_max and m = 1, 2, ..., M. Each time steps 3.4 to 3.6 are completed once, the training process of one step is completed; steps 3.4 to 3.6 are executed repeatedly, and when step_max steps are completed, the training process of one round is completed. The training process of the next round then starts again from steps 3.2 to 3.6; steps 3.2-3.6 are executed repeatedly, and when the training processes of M rounds are completed, the learning process of the DDPG algorithm ends;
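The overall loop assembled from the sketches above; s0, the optimizers, the default hyperparameter values, and the to_tensors helper are illustrative scaffolding not specified by the patent:

import numpy as np
import torch

def to_tensors(transitions):
    # Hypothetical helper: stack sampled tuples into float tensors
    s, a, r, s_next = zip(*transitions)
    f = lambda x: torch.as_tensor(np.array(x), dtype=torch.float32)
    return f(s), f(a), f(r).reshape(-1, 1), f(s_next)

def train(mu, q, mu_targ, q_targ, opt_mu, opt_q, surrogate, s0,
          D=10_000, n=64, step_max=200, M=100):
    pool = ReplayPool(D)
    s_t = s0
    # Steps 3.2-3.3: interact until the pool holds N = D transitions
    while len(pool) < D:
        a_t = select_action(mu, s_t)
        s_next, r_t = env_step(surrogate, s_t, a_t)
        pool.store(s_t, a_t, r_t, s_next)
        s_t = s_next
    # Steps 3.2-3.7: M rounds of step_max training steps each
    for m in range(M):
        for step in range(step_max):
            a_t = select_action(mu, s_t)
            s_next, r_t = env_step(surrogate, s_t, a_t)
            pool.store(s_t, a_t, r_t, s_next)
            s_t = s_next
            ddpg_update(to_tensors(pool.sample(n)), mu, q,
                        mu_targ, q_targ, opt_mu, opt_q)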
The neural network parameters θ_μ, θ_μ', θ_Q, θ_Q' of the online policy network, target policy network, online evaluation network, and target evaluation network are updated in the direction of maximizing the cumulative reward, finally yielding the optimal policy π(s_y).
CN202311617795.6A 2023-11-30 2023-11-30 Multi-objective optimization method for IGBT module packaging based on machine learning Active CN117313560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311617795.6A CN117313560B (en) 2023-11-30 2023-11-30 Multi-objective optimization method for IGBT module packaging based on machine learning

Publications (2)

Publication Number Publication Date
CN117313560A CN117313560A (en) 2023-12-29
CN117313560B true CN117313560B (en) 2024-02-09

Family

ID=89281586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311617795.6A Active CN117313560B (en) 2023-11-30 2023-11-30 Multi-objective optimization method for IGBT module packaging based on machine learning

Country Status (1)

Country Link
CN (1) CN117313560B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766137A (en) * 2019-10-18 2020-02-07 武汉大学 Power electronic circuit fault diagnosis method based on longicorn whisker optimized deep confidence network algorithm

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114172403A (en) * 2021-12-07 2022-03-11 合肥工业大学 Inverter efficiency optimization method based on deep reinforcement learning
DE102022108379A1 (en) * 2022-04-07 2023-10-12 Dr. Ing. H.C. F. Porsche Aktiengesellschaft Method, system and computer program product for the optimized construction and/or design of a technical component
CN115021325A (en) * 2022-06-22 2022-09-06 合肥工业大学 Photovoltaic inverter multi-objective optimization method based on DDPG algorithm
CN115765396A (en) * 2022-11-23 2023-03-07 天地(常州)自动化股份有限公司 Coordination optimization method for IGBT spike voltage suppression
CN117057229A (en) * 2023-08-10 2023-11-14 合肥工业大学 Multi-objective optimization method based on deep reinforcement learning power module
CN117057228A (en) * 2023-08-10 2023-11-14 合肥工业大学 Inverter multi-objective optimization method based on deep reinforcement learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Jianing Wang et al.; "Co-Reduction of Common Mode Noise and Loop Current of Three-Level Active Neutral Point Clamped Inverters"; IEEE; full text *
Wang Cunle et al.; "Review of thermal network parameter extraction for power IGBT modules"; 《电工电气》; full text *
Luo Xu; Wang Xuemei; Wu Haiping; "Selection of IGBT and switching frequency for electric vehicle converters based on multi-objective optimization"; Transactions of China Electrotechnical Society (No. 10); full text *

Similar Documents

Publication Publication Date Title
CN110084221B (en) Serialized human face key point detection method with relay supervision based on deep learning
CN110175386B (en) Method for predicting temperature of electrical equipment of transformer substation
WO2021109644A1 (en) Hybrid vehicle working condition prediction method based on meta-learning
CN112149316A (en) Aero-engine residual life prediction method based on improved CNN model
CN113988449B Wind power prediction method based on Transformer model
CN109754122A Numerical prediction method of BP neural network based on random forest feature extraction
CN110245390B (en) Automobile engine oil consumption prediction method based on RS-BP neural network
CN117057229A (en) Multi-objective optimization method based on deep reinforcement learning power module
CN113822418A (en) Wind power plant power prediction method, system, device and storage medium
CN113313306A (en) Elastic neural network load prediction method based on improved wolf optimization algorithm
CN114021483A Ultra-short-term wind power prediction method based on time domain characteristics and XGBoost
CN116484495A (en) Pneumatic data fusion modeling method based on test design
CN117313560B (en) Multi-objective optimization method for IGBT module packaging based on machine learning
CN112947080B (en) Scene parameter transformation-based intelligent decision model performance evaluation system
CN112731098B (en) Radio frequency low-noise discharge circuit fault diagnosis method, system, medium and application
CN110276478B (en) Short-term wind power prediction method based on segmented ant colony algorithm optimization SVM
Bi et al. Self-adaptive Teaching-learning-based Optimizer with Improved RBF and Sparse Autoencoder for Complex Optimization Problems
CN110489790B (en) IGBT junction temperature prediction method based on improved ABC-SVR
CN116488151A (en) Short-term wind power prediction method based on condition generation countermeasure network
CN114372640A (en) Wind power prediction method based on fluctuation sequence classification correction
CN112347704B (en) Efficient artificial neural network microwave device modeling method based on Bayesian theory
CN113111588B Gas turbine NO_X emission concentration prediction method and device
CN114091392A (en) Boolean satisfiability judgment method based on linear programming
CN113393051A Power distribution network investment decision method based on deep transfer learning
CN110543724A (en) Satellite structure performance prediction method for overall design

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant