CN114362519B

CN114362519B - Efficiency optimization method and system for two-phase staggered parallel DC-DC converter

Info

Publication number: CN114362519B
Application number: CN202111661765.6A
Authority: CN
Inventors: 尹泉; 缪佶桂; 刘洋
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2021-12-31
Filing date: 2021-12-31
Publication date: 2023-10-24
Anticipated expiration: 2041-12-31
Also published as: CN114362519A

Abstract

The invention discloses a method and a system for optimizing the efficiency of a two-phase interleaved parallel DC-DC converter, and belongs to the field of power electronics. The method comprises the following steps: acquiring efficiency data for each of the two phases of the converter under different operating conditions; training on the efficiency data with a support vector regression (SVR) algorithm to obtain an efficiency prediction model for the two phases under any operating condition; fitting the efficiency characteristic curve of each phase under any operating condition from the efficiency prediction model; formulating the overall efficiency optimization problem from the efficiency characteristic curves and converting it into an optimal current distribution problem; and using a Q-learning reinforcement learning strategy to find the optimal current distribution scheme. The method adapts to changes in the system's operating conditions, predicts efficiency under any operating condition, handles unknown or incomplete information about a dynamic environment, and gives the optimal current distribution scheme without changing the model. It thereby improves the applicability of the model, requires few parameters, is simple to understand and implement, and has industrial application value.

Description

Efficiency optimization method and system for two-phase staggered parallel DC-DC converter

Technical Field

The invention belongs to the field of power electronics, and particularly relates to a method and a system for optimizing efficiency of a two-phase staggered parallel DC-DC converter.

Background

The two-phase interleaved parallel DC-DC converter is widely used in high-current, high-power applications owing to its high power, high power density, high efficiency, and high reliability. A current-sharing control strategy is usually adopted to distribute the load current among the single-phase converters and ensure safe operation of the system. However, because the component parameters and circuit parasitic parameters differ from phase to phase, the efficiency characteristics of the phases also differ. If every phase carries the same current, the overall efficiency of the system is not guaranteed to be optimal, and energy may be wasted. Research on efficiency optimization methods for two-phase interleaved parallel DC-DC converters based on non-current-sharing control is therefore of great significance.

The traditional non-current-sharing current distribution scheme is obtained by establishing an accurate power loss model, fitting the efficiency characteristic curve of each single-phase converter, and optimizing the total system efficiency with an efficiency calculation method. Because the operating conditions of the converter are complex and variable (for example, the load or the voltage conversion ratio may change), the efficiency characteristics differ between operating conditions. When the operating condition changes, data must be collected again under the new condition to refit the efficiency characteristic curve, which is tedious and reduces the applicability of the model. In addition, when the system operates under complex conditions, solving for the optimal current distribution scheme by mathematical methods is generally computationally difficult. Various heuristic algorithms are currently used, including particle swarm, ant colony, simulated annealing, hill climbing, differential evolution, and genetic algorithms. However, these algorithms cannot adapt to dynamic changes in the external conditions, that is, the operating conditions, and must be reset when those conditions change. Moreover, the search performance of some of these algorithms depends heavily on the choice of initial values and optimization parameters, so they easily fall into local optima and produce large errors.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a method and a system for optimizing the efficiency of a two-phase interleaved parallel DC-DC converter, with the aim of solving the problem of large errors in existing non-current-sharing control methods for such converters.

In order to achieve the above object, according to one aspect of the present invention, there is provided a method for optimizing efficiency of a two-phase interleaved parallel DC-DC converter, comprising the steps of:

S1, acquiring efficiency data for each of the two phases of the two-phase interleaved parallel DC-DC converter under different operating conditions;

S2, training on the efficiency data of the two phases under different operating conditions with a support vector regression (SVR) algorithm to obtain an efficiency prediction model for the two phases of the converter under any operating condition;

S3, fitting the efficiency characteristic curves of the two phases of the converter under any operating condition from the efficiency prediction model;

S4, formulating the overall efficiency optimization problem of the converter from the efficiency characteristic curves, and converting it into an optimal current distribution problem;

S5, abstracting the optimal current distribution problem, and searching for the optimal current distribution scheme with a Q-learning reinforcement learning strategy.

Further, S1 includes the following steps:

S11, setting the operating condition parameters of the two-phase interleaved parallel DC-DC converter, including the switching frequency, the input voltage range, the output voltage range, and the output current range;

S12, within the set operating conditions, applying different input and output voltages and varying the load resistance while each phase of the converter operates independently, and calculating the efficiency of each phase at different output currents.

Further, S2 includes the following steps:

S21, randomly dividing the efficiency data under different operating conditions into a training set and a test set in a preset ratio;

S22, training an efficiency prediction model for each phase of the converter with the SVR algorithm and performing 10-fold cross-validation;

S23, calculating the prediction accuracy of the resulting efficiency prediction model and evaluating its rationality and accuracy.

Further, S3 includes the following steps:

S31, expressing the efficiency η_i of each phase of the two-phase interleaved parallel DC-DC converter as a function of the output current I_o:

η_i = f(I_o), i = 1, 2

S32, given a set of converter operating conditions, predicting from the efficiency prediction model how the efficiency of each phase varies with output current under those conditions, and then obtaining by polynomial fitting an accurate expression of the relation in S31 between the efficiency of each phase and the output current, namely the efficiency characteristic curve.

Further, S4 includes the following steps:

S41, deriving the overall efficiency of the two-phase interleaved parallel DC-DC converter and expressing it in terms of the efficiency of each phase:

η = I_load / (I_o1/η_1(I_o1) + I_o2/η_2(I_o2))

where I_oi is the output current of the i-th phase, η_i(I_oi) is the efficiency of the i-th phase, and I_load is the total load current, equal to the sum of the two phase output currents;

S42, substituting the efficiency characteristic curves of the phases obtained in S3 for the single-phase efficiencies in S41, so that the overall efficiency of the system is expressed solely in terms of the output current of each phase;

S43, determining the constraints of the current load distribution problem from the range limits on the output current of each phase and on the total load current, and converting overall efficiency maximization into the optimal current distribution problem:

max η(I_o1, I_o2)
subject to I_o1 + I_o2 = I_load, I_oimin ≤ I_oi ≤ I_oimax (i = 1, 2), I_loadmin ≤ I_load ≤ I_loadmax

where I_oimin and I_oimax are the minimum and maximum output current of the i-th phase, respectively, and I_loadmin and I_loadmax are the minimum and maximum load current, respectively.

Further, S5 includes the following steps:

S51, setting the state space λ_i of the Q-learning reinforcement learning strategy: λ_i = I_oi / I_load, where the value range of λ_i satisfies:

where θ is the quantization step of the state space and [·] is the rounding operation;

setting the action space A of the Q-learning reinforcement learning strategy and its objective function ψ [formulas not reproduced], where the objective function contains a penalty term formed from a quantity and its expected value, and r is the penalty factor;

setting the reward function r(s, a) of the Q-learning reinforcement learning strategy:

where ψ_min is the minimum value of the objective function over the history;

S52, simulating the Q-learning reinforcement learning strategy, running the ε-greedy strategy, and exploring continuously until the optimal current distribution scheme is found.

The beneficial effects are as follows: the method adapts to changes in the system's operating conditions, predicts efficiency under any operating condition, handles unknown or incomplete information about a dynamic environment, and gives the optimal current distribution scheme without changing the model. It thereby improves the applicability of the model, requires few parameters, is simple to understand and implement, and has industrial application value.

Another aspect of the invention provides a two-phase interleaved parallel DC-DC converter efficiency optimization system comprising: a computer readable storage medium and a processor;

the computer-readable storage medium is for storing executable instructions;

the processor is configured to read executable instructions stored in the computer readable storage medium and execute the method for optimizing efficiency of the two-phase interleaved parallel DC-DC converter.

By the above technical scheme conceived by the present invention, the following advantageous effects can be obtained compared with the prior art.

(1) The efficiency optimization method for a two-phase interleaved parallel DC-DC converter provided by the invention uses the SVR algorithm to train on efficiency data collected under different operating conditions, yielding a model that can predict the efficiency characteristic of the converter under any operating condition. When the operating condition changes, data need not be collected again to refit the efficiency characteristic curve, so the method is widely applicable;

(2) The method achieves efficiency optimization through non-current-sharing control, with the non-current-sharing current distribution scheme obtained by Q-learning optimization. This intelligent algorithm overcomes the computational difficulty that traditional mathematical solution methods face under complex operating conditions such as changes in load or voltage conversion ratio;

(3) Compared with a current-sharing control scheme, the method significantly improves the efficiency of the converter. It is simple to understand, requires few parameters, adapts to changes in the system's operating conditions, handles unknown or incomplete information about a dynamic environment, does not need to be reset when the environment changes, is easy to implement, and has industrial application value.

Drawings

FIG. 1 is a flow chart of a method for optimizing the efficiency of a two-phase interleaved parallel DC-DC converter according to the present invention;

FIG. 2 is a diagram of a two-phase interleaved parallel synchronous rectification Buck converter system;

FIG. 3 is a two-phase interleaved parallel synchronous rectification Buck type DC-DC converter topology;

FIG. 4 is a single phase synchronous rectification Buck circuit topology;

FIG. 5 is a graph of voltage and current waveforms before and after load ramp up;

FIG. 6 is a graph comparing overall efficiency of a current sharing mode and efficiency optimization mode system.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not interfere with each other.

In order to achieve the above objective, the present invention provides a method for optimizing efficiency of a two-phase interleaved parallel DC-DC converter, as shown in fig. 1, comprising the steps of:

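S1, acquiring efficiency data for each of the two phases of the two-phase interleaved parallel DC-DC converter under different operating conditions;

S2, training on the efficiency data of the two phases under different operating conditions with an SVR algorithm to obtain an efficiency prediction model for the two phases of the converter under any operating condition;

S3, fitting the efficiency characteristic curves of the two phases of the converter under any operating condition from the efficiency prediction model;

S4, formulating the overall efficiency optimization problem of the converter from the efficiency characteristic curves, and converting it into an optimal current distribution problem;

S5, abstracting the optimal current distribution problem, and searching for the optimal current distribution scheme with a Q-learning reinforcement learning strategy.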

In order to clearly describe the efficiency optimization method of the two-phase interleaved parallel DC-DC converter according to the present invention, the following description will describe the converter in fig. 2 as an embodiment:

specifically, S1 includes the steps of:

S11, Fig. 3 shows the structure of the two-phase interleaved parallel synchronous rectification Buck converter system. A simulation platform is built in Saber and the operating condition parameters of the converter are set: inductances L1 and L2 are 33 μH, the switching frequency is 200 kHz, the input voltage range is 30-48 V, the output voltage range is 12-24 V, and the output current ranges of the single-phase converters, IL1 and IL2, are 1-12 A;

S12, within the set operating condition range of the converter, different input and output voltages are applied and the load resistance is varied while each phase converter operates independently. The efficiency of each phase converter at different output currents is recorded; it is calculated from the average input power and average output power when the converter operates in steady state.

In this embodiment, efficiency data are collected for each single-phase converter at input voltages of 30, 33, 36, 39, 42, 45, and 48 V, output voltages of 12, 15, 18, 21, and 24 V, and output currents from 1 A to 12 A in 1 A steps. Each phase converter therefore has 7 × 5 × 12 = 420 sets of data.
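The sampling grid described above can be enumerated as in the following minimal sketch. The names `operating_points` and `efficiency` are illustrative and do not come from the original text; the efficiency helper simply divides average output power by average input power, as S12 describes.

```python
from itertools import product

# Operating points listed in this embodiment.
INPUT_VOLTAGES = [30, 33, 36, 39, 42, 45, 48]   # V
OUTPUT_VOLTAGES = [12, 15, 18, 21, 24]          # V
OUTPUT_CURRENTS = list(range(1, 13))            # A, 1 A steps from 1 A to 12 A


def operating_points():
    """Return every (V_in, V_out, I_o) combination to be simulated per phase."""
    return list(product(INPUT_VOLTAGES, OUTPUT_VOLTAGES, OUTPUT_CURRENTS))


def efficiency(p_in_avg, p_out_avg):
    """Steady-state efficiency from average input and output power (same units)."""
    return p_out_avg / p_in_avg


points = operating_points()
print(len(points))  # 7 * 5 * 12 = 420
```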

Specifically, S2 includes the steps of:

S21, the efficiency data of each phase are randomly divided into a training set and a test set in a ratio of 8:2. The training set is used to build the efficiency prediction model, and the test set is used to test the prediction accuracy of the model;

S22, an efficiency prediction model is trained for each phase converter with the SVR algorithm, and 10-fold cross-validation is performed. The basic principle of the SVR algorithm is to map the samples into a high-dimensional feature space and find an optimal hyperplane in that space that minimizes the total deviation of all sample data from it. Let φ(x) denote the feature vector obtained by mapping the input vector x; the model corresponding to the hyperplane in the feature space can then be expressed as:

f(x) = ω^T φ(x) + b

where ω is the normal vector, which determines the direction of the optimal hyperplane, and b is the displacement term, which determines the distance between the optimal hyperplane and the origin. The SVR problem can be formalized as:

where x_i and y_i are the i-th input vector and the corresponding output, respectively; ξ_i and ξ_i^* are slack variables; and C is the penalty parameter, set to 10000 in this embodiment. ε is the maximum tolerable deviation between the model output f(x) and the true output y: a loss is counted only when the absolute deviation exceeds ε. This is equivalent to constructing a band of width 2ε centered on f(x); training data that fall inside the band are considered correctly predicted. The SVR problem is converted into its dual problem:

where α_i and α_i^* are Lagrange multipliers. Solving the dual problem yields the SVR solution:

where κ(x_i, x) is the kernel function, which can be expressed as:

κ(x_i, x) = φ(x_i)^T φ(x)

The commonly used Gaussian kernel function is selected in this embodiment; its expression is:

where σ is the bandwidth of the Gaussian kernel. In this embodiment, σ is set to [value not reproduced].
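For reference, the textbook ε-insensitive SVR formulation that matches the symbols defined above is usually written as follows. This is the standard form under one common sign convention, with m denoting the number of training samples (a symbol introduced here for illustration). It is offered as a sketch and may differ in notation from the expressions in the original filing.

```latex
% Primal problem (standard epsilon-insensitive SVR)
\begin{aligned}
\min_{\omega,\, b,\, \xi,\, \xi^*} \quad
  & \tfrac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{m} \left( \xi_i + \xi_i^* \right) \\
\text{s.t.} \quad
  & y_i - f(x_i) \le \epsilon + \xi_i, \\
  & f(x_i) - y_i \le \epsilon + \xi_i^*, \\
  & \xi_i \ge 0,\ \xi_i^* \ge 0, \qquad i = 1, \dots, m
\end{aligned}

% Solution obtained from the dual, with 0 \le \alpha_i, \alpha_i^* \le C
f(x) = \sum_{i=1}^{m} \left( \alpha_i - \alpha_i^* \right) \kappa(x_i, x) + b

% Gaussian (RBF) kernel with bandwidth \sigma
\kappa(x_i, x) = \exp\!\left( -\frac{\lVert x_i - x \rVert^2}{2\sigma^2} \right)
```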

S23, the prediction accuracy of the resulting efficiency prediction model is calculated, and the rationality and accuracy of the model are evaluated. The model is evaluated using the mean absolute error (MAE) and the root mean square error (RMSE):

Model performance in training is assessed by 10-fold cross-validation, using the mean of the root mean square error as the score. The scores obtained by the efficiency prediction models of the two phases are 0.481 and 0.408; the errors are small and the model performance is good.

The trained models are then used to predict the efficiency for the 84 sets of test data, and the predictions are compared with the true values. The mean absolute error and root mean square error of the first-phase efficiency prediction model are 0.146 and 0.335, respectively, and those of the second-phase model are 0.264 and 0.379. The models have high prediction accuracy, which verifies their rationality and accuracy.
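The 8:2 split and the two error metrics can be sketched in plain Python as follows. The function names and the fixed seed are assumptions made for illustration; MAE and RMSE use their usual definitions (mean of absolute errors, and square root of the mean squared error). The SVR training itself is not shown.

```python
import math
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def mean_absolute_error(y_true, y_pred):
    """MAE: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """RMSE: square root of the average squared prediction error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# 420 sample indices per phase, as in this embodiment.
train, test = split_train_test(range(420))
print(len(train), len(test))  # 336 84
```

The 84 test samples mentioned above correspond to the 20% share of the 420 sets of data per phase.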

Specifically, Fig. 4 shows the single-phase synchronous rectification Buck circuit topology. In the figure, V_in is the input voltage; R_dson and R_sdson are the on-resistances of the main power switch and the synchronous rectifier, respectively; L_esr is the ESR of the energy storage inductor; C_iesr and C_oesr are the ESRs of the input filter capacitor and the output filter capacitor, respectively; I_L is the inductor current; R is the load resistance; I_o is the load current; and V_o is the output voltage.

In practical engineering applications, the ESR of the filter capacitors is small and the resulting loss is negligible. The loss of the energy storage inductor comprises copper loss and iron loss. The copper loss is mainly caused by the ESR of the energy storage inductor, and the iron loss is negligible. The loss of the energy storage inductor is expressed as:

P_L = I_L² × L_esr

The losses of the main power switch comprise conduction loss, switching loss, and output capacitance loss. The conduction loss is expressed as:

P_on = R_dson × I_ds(rms)²

where I_ds(rms) is the RMS value of the main power switch current, which can be expressed as:

where D is the duty cycle of the main power switch and ΔI_L is the inductor current ripple.

The turn-on loss of the main power switch can be expressed as:

where t_r is the time taken for the drain-source voltage to fall from 90% to 10% when the main power switch turns on, and f is the switching frequency.

The turn-off loss can be expressed as:

where t_f is the time taken for the drain-source voltage to rise from 10% to 90% when the main power switch turns off.

The power loss due to discharging the output capacitance can be expressed as:

where C_oss is the output capacitance of the main power switch.

In summary, the total loss of the main power switch can be expressed as:

P_sw = P_on + P_swon + P_swoff + P_co

losses of the synchronous rectifiers include conduction losses, switching losses, output capacitance losses, and parasitic body diode reverse recovery losses of the synchronous rectifiers. The conduction loss can be expressed as:

P _{syn_on} ＝R _sdson ×I _sds(rms) ²

wherein I is _sds(rms) Is the effective value of synchronous rectifier current and can be expressed as:

the turn-on loss of the synchronous rectifier can be expressed as:

wherein t is _sr The time it takes for the drain-source voltage to drop from 90% to 10% when the synchronous rectifier is on.

The turn-off loss can be expressed as:

wherein t is _sf The drain-source voltage is increased from 10% to 90% for the time when the synchronous rectifier is turned off.

The output capacitance loss can be expressed as:

wherein C is _{syn_oss} Is the output capacitance of the synchronous rectifier.

The parasitic body diode reverse recovery loss of the synchronous rectifier can be expressed as:

P _qrr ＝Q _rr ×V _r ×f

wherein V is _r To synchronize the reverse voltage applied when the rectifier is turned off, Q _rr The charge is restored for the reverse direction of the parasitic body diode.

In summary, the total loss of the synchronous rectifier can be expressed as:

P _syn ＝P _{syn_on} +P _{syn_swon} +P _{syn_swoff} +P _{syn_co} +P _qrr

Based on the above loss analysis, the efficiency of a single-phase synchronous rectification Buck converter can be expressed as:

where:

A = R_dson × D + R_sdson × (1 - D)

It can be seen that, if the input voltage, output voltage, and switching frequency of the converter are constant and stable, and the influence of the parasitic parameters in the circuit is left aside, the efficiency characteristic of a single-phase synchronous rectification Buck converter can be expressed as a function of the output current.
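The role of the coefficient A can be illustrated with a small numerical sketch of the two conduction losses. The RMS expression used here, sqrt(duty × (I_L² + ΔI_L²/12)), is the usual textbook result for a trapezoidal switch current with peak-to-peak ripple ΔI_L; it is an assumption, since the corresponding expressions are not reproduced in this text. All component values are hypothetical.

```python
import math


def switch_rms_current(i_l, delta_i_l, duty):
    """RMS current of a switch conducting a trapezoidal current for a fraction `duty`."""
    return math.sqrt(duty * (i_l ** 2 + delta_i_l ** 2 / 12.0))


def conduction_losses(i_l, delta_i_l, duty, r_dson, r_sdson):
    """Return (P_on, P_syn_on) for the main switch and the synchronous rectifier."""
    p_on = r_dson * switch_rms_current(i_l, delta_i_l, duty) ** 2
    p_syn_on = r_sdson * switch_rms_current(i_l, delta_i_l, 1.0 - duty) ** 2
    return p_on, p_syn_on


# Hypothetical values, chosen only for illustration.
duty, r_dson, r_sdson = 0.5, 0.010, 0.008      # -, ohm, ohm
i_l, delta_i_l = 6.0, 1.2                      # A, A (peak-to-peak ripple)

p_on, p_syn_on = conduction_losses(i_l, delta_i_l, duty, r_dson, r_sdson)
a = r_dson * duty + r_sdson * (1.0 - duty)     # the coefficient A from the text
print(p_on + p_syn_on, a * (i_l ** 2 + delta_i_l ** 2 / 12.0))
```

Under this assumption the two printed values agree: the combined conduction loss of the switch pair equals A multiplied by the squared RMS inductor current, which is why A appears as a single lumped resistance.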

Specifically, S3 includes the steps of:

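S31, the efficiency η_i of each phase of the converter is expressed as a function of the output current I_o:

η_i = f(I_o), i = 1, 2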

S32, a converter operating condition not contained in the training data is selected; in this embodiment, the input voltage is 42 V and the output voltage is 20 V. The model obtained by SVR training is used to predict how the efficiency of each single-phase converter varies with output current under this condition, and an accurate expression of the relation between each single-phase efficiency and the output current, namely the efficiency characteristic curve, is obtained by fifth-degree polynomial fitting:
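The fitting step can be sketched with NumPy as shown below. The efficiency values are synthetic placeholders, not the predictions or coefficients of this embodiment, and the name `efficiency_curve` is illustrative.

```python
import numpy as np

# Hypothetical predicted efficiencies (%) at 1 A ... 12 A for one operating condition.
currents = np.arange(1.0, 13.0)
predicted_eff = 96.0 - 0.9 / currents - 0.12 * currents

# Least-squares fit of a fifth-degree polynomial; coefficients are highest power first.
coeffs = np.polyfit(currents, predicted_eff, deg=5)


def efficiency_curve(i_o):
    """Evaluate the fitted fifth-degree efficiency characteristic at current i_o."""
    return np.polyval(coeffs, i_o)


max_error = np.max(np.abs(efficiency_curve(currents) - predicted_eff))
print(len(coeffs), max_error)
```

A fifth-degree polynomial has six coefficients per phase, so the two characteristic curves are fully described by twelve numbers.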

Specifically, S4 includes the steps of:

S41, the overall efficiency of the two-phase interleaved parallel synchronous rectification Buck converter can be expressed as:

η = P_out / P_in = (P_o1 + P_o2) / (P_i1 + P_i2)

where P_in and P_out are the input power and output power, respectively, and P_ii and P_oi are the input power and output power of the i-th phase, respectively.

Because the phases are connected in parallel and share the same output voltage, the output power of each phase is proportional to its output current, and P_ii = P_oi / η_i(I_oi). The overall efficiency can therefore be written as:

η = I_load / (I_o1/η_1(I_o1) + I_o2/η_2(I_o2))

where I_oi is the output current of the i-th phase, η_i(I_oi) is the efficiency of the i-th phase, and I_load is the load current, equal to the sum of the two phase output currents;

Substituting the efficiency characteristic curve of each phase converter obtained in S3 for the single-phase efficiency in S41, the overall efficiency of the system can be expressed in terms of the output currents of the two phase converters;
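The relation between overall efficiency and the per-phase efficiencies can be checked numerically with the short sketch below. It assumes the two phases share one output voltage, so that each phase's input power is proportional to I_oi / η_i; the function name and the constant efficiency curves are illustrative only.

```python
def overall_efficiency(i_o1, i_o2, eta1, eta2):
    """Overall efficiency of two parallel phases sharing one output voltage.

    eta1 and eta2 are callables returning the per-phase efficiency as a
    fraction (0..1) for a given phase output current in amperes.
    """
    pairs = ((i_o1, eta1), (i_o2, eta2))
    # Each active phase contributes I_oi / eta_i to the input side; an idle phase draws nothing.
    return (i_o1 + i_o2) / sum(i / eta(i) for i, eta in pairs if i > 0)


# Two identical 90 % phases give 90 % overall, whatever the split.
print(overall_efficiency(3.0, 3.0, lambda i: 0.9, lambda i: 0.9))
```

When only one phase carries current, the expression reduces to that phase's own efficiency.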

S42, the efficiency optimization problem is defined as maximizing the overall efficiency of the parallel converter system. As shown in S41, the overall efficiency is a function of the output current of each single-phase converter, so maximizing the overall efficiency is converted into a load distribution problem for the phase currents, which can be expressed as:

max η(I_o1, I_o2)

The total load current is limited to 1-24 A, and the output current of each single-phase converter is limited to 0-12 A. Combining these load current constraints with the above expression gives the constrained optimization problem:

max η(I_o1, I_o2)
subject to I_o1 + I_o2 = I_load, 0 ≤ I_oi ≤ 12 A (i = 1, 2), 1 A ≤ I_load ≤ 24 A
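As a point of comparison for the constrained problem, an exhaustive grid search over the current split can serve as a baseline. This is not the Q-learning strategy of S5; it is a brute-force sketch, and the efficiency curves `eta1` and `eta2` are hypothetical loss models invented for illustration.

```python
def eta1(i):
    """Hypothetical phase-1 efficiency (fraction) at a 20 V output, illustration only."""
    return 20.0 * i / (20.0 * i + 0.8 + 0.05 * i * i)


def eta2(i):
    """Hypothetical phase-2 efficiency (fraction) at a 20 V output, illustration only."""
    return 20.0 * i / (20.0 * i + 1.0 + 0.03 * i * i)


def overall_efficiency(i_o1, i_o2, f1, f2):
    """Overall efficiency of two parallel phases; an idle phase contributes no loss."""
    pairs = ((i_o1, f1), (i_o2, f2))
    return (i_o1 + i_o2) / sum(i / f(i) for i, f in pairs if i > 0)


def best_split(i_load, f1, f2, i_max=12.0, n_steps=1000):
    """Try every share h / n_steps for phase 1 and keep the most efficient feasible split."""
    best = None
    for h in range(n_steps + 1):
        i1 = h / n_steps * i_load
        i2 = i_load - i1
        if i1 > i_max or i2 > i_max:
            continue                      # violates a per-phase current limit
        eff = overall_efficiency(i1, i2, f1, f2)
        if best is None or eff > best[0]:
            best = (eff, i1, i2)
    return best


eff, i1, i2 = best_split(13.0, eta1, eta2)
print(eff, i1, i2)
```

Because the equal split (6.5 A per phase) is one of the candidates, the result of this search is never worse than current sharing for the assumed curves.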

Specifically, S5 includes the steps of:

S51, the optimal efficiency can be expressed as:

where λ_i = I_oi / I_load.

S52, abstracting the current distribution problem.

The ratio λ_i of each single-phase converter's output current to the total load current is defined as the state space of the Q-learning reinforcement learning strategy;

A current increment unit θ is defined. The output current of each single-phase converter may increase by one increment unit, decrease by one increment unit, or remain unchanged; these changes form the action space of the Q-learning optimization strategy. The action space of the converter system therefore contains 3 × 3 = 9 combinations of output current changes of the two phase converters;

A penalty term is defined [definition not reproduced], giving an objective function that contains the penalty term:

where r is the penalty factor, and the penalty term is formed from a quantity and its expected value [symbols and the value used in this embodiment not reproduced]. Combining the objective function and the constraints, the optimal current distribution problem can be expressed as:

where λ_imin and λ_imax are the minimum and maximum values of λ_i.

The reward function is designed according to the objective function. When a state transfers to the next state through an action: if Δψ is less than 0, the reward of the action is 1; if Δψ is greater than 0, the reward is -1; if Δψ is equal to 0, the reward is 0; and if ψ reaches the minimum over the history, a higher reward, for example 50, can be given.
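The reward rule just described can be written as a small function. The argument names are illustrative, and treating "reaches the minimum over the history" as a strict improvement on the best value seen so far is an assumption of this sketch.

```python
def reward(psi_next, psi_current, psi_min, bonus=50.0):
    """Reward for moving from the current state to the next one.

    psi_* are objective values (smaller is better); psi_min is the smallest
    objective value seen so far.
    """
    if psi_next < psi_min:
        return bonus            # new historical minimum (assumed strict)
    delta = psi_next - psi_current
    if delta < 0:
        return 1.0              # objective decreased
    if delta > 0:
        return -1.0             # objective increased
    return 0.0                  # objective unchanged


print(reward(0.97, 1.00, 0.95))  # 1.0
```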

The restriction that the current boundary conditions place on the state space of the Q-learning current distribution optimization algorithm is examined next. The range of λ_i is determined from the ranges of the converter system load current and the single-phase output current. The total load current is limited to 1-24 A, and the output current of each single-phase converter is limited to 0-12 A. When the load current is 1-12 A, 0 ≤ λ_i ≤ 1, i = 1, 2; when the load current is 13-24 A, λ_imin ≤ λ_i ≤ λ_imax, i = 1, 2, where λ_imin = (I_load - 12)/I_load and λ_imax = 12/I_load.

With the range of λ_i known, the current increment unit is set to θ = 1/1000, and the share of the load current carried by each phase converter, expressed on a scale of 1/θ, is:

H_i = [λ_i / θ]

where [·] is the rounding operation. From the condition λ_imin ≤ λ_i ≤ λ_imax it follows that [λ_imin/θ] ≤ H_i ≤ [λ_imax/θ]. The value range of H_i determines all possible current distribution schemes and thus constitutes the entire state space.

Taking a load current of 6 A as an example: since the load current is smaller than the maximum single-phase output current, the single-phase output current can take values from 0 to 6 A, with the two phase output currents summing to 6 A. With θ = 1/1000, there are 1000 distribution schemes.

Taking a load current of 13 A as an example: the load current is larger than the maximum single-phase output current. Because the single-phase output current ranges from 0 to 12 A and the two phase output currents sum to 13 A, the ratio of the single-phase output current to the total load current satisfies 1/13 ≤ λ_i ≤ 12/13, i = 1, 2. With θ = 1/1000, H_i ranges from 77 to 923, giving a total of 847 allocation schemes that constitute the entire state space.
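The 13 A example can be reproduced with the following sketch. The bound formulas are derived here from the 0-12 A per-phase limit and the requirement that the two currents sum to the load current; the function name is illustrative.

```python
def state_bounds(i_load, i_phase_max=12.0, theta=1 / 1000):
    """Return (H_min, H_max), the quantized range of one phase's share of the load."""
    lam_min = max(0.0, (i_load - i_phase_max) / i_load)   # the other phase is at its limit
    lam_max = min(1.0, i_phase_max / i_load)              # this phase is at its limit
    return round(lam_min / theta), round(lam_max / theta)


h_min, h_max = state_bounds(13.0)
print((h_min, h_max), h_max - h_min + 1)  # (77, 923) 847
```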

S53, initial states for the different episodes are taken at intervals of 1/20θ in the state space (optimizing from one set of initial states is called an episode). The complete learning process covers all episodes and forms the outer loop of the algorithm. Given a set of initial states, optimization is carried out within the current distribution schemes determined in S52; this forms the inner loop of the algorithm. In the inner loop, the ε-greedy strategy is designed as follows: when the generated random number is smaller than ε, the optimal action, that is, the one with the largest action value, is selected, and if several actions share the largest value, one of them is chosen at random; when the generated random number is greater than ε, an action is selected at random.
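The action selection rule of the inner loop can be sketched as follows. Note the convention used in this document: ε is the probability of taking the greedy action, not the exploration probability. The function name and the `rng` parameter are assumptions for illustration.

```python
import random


def select_action(q_values, epsilon=0.9, rng=random):
    """Pick an action index from the list of action values of one state."""
    if rng.random() < epsilon:
        best = max(q_values)
        candidates = [a for a, q in enumerate(q_values) if q == best]
        return rng.choice(candidates)       # greedy choice, ties broken at random
    return rng.randrange(len(q_values))     # otherwise pick any action at random


rng = random.Random(0)
print(select_action([0.0, 5.0, 1.0], epsilon=1.0, rng=rng))  # 1
```

With ε = 0.9 as in this embodiment, roughly nine selections out of ten are greedy.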

S54, searching an optimal current distribution scheme among all possible current distribution schemes according to a Q-learning reinforcement learning algorithm. Specifically, the method comprises the following steps:

First, Q(s, a) is initialized, the algorithm parameters are given, and the boundary constraints of the objective function determined in S52 are loaded. The parameters are as follows: the learning rate α, for which a smaller value means slower training, is generally set to 0.8-1; the discount factor γ is generally set to 0-1; the state quantization step θ gives higher accuracy the smaller it is set; and the greedy coefficient ε is used for action selection, the optimal action being selected when the generated random number is smaller than ε and a random action when it is greater than ε. In this embodiment, the parameters are set to α = 1, γ = 0.8, θ = 1/1000, and ε = 0.9.

Second, given the initial state s of the algorithm, an action a is selected according to the ε-greedy strategy: the optimal action is selected when the generated random number is smaller than 0.9, and a random action is selected when it is greater than 0.9. Under policy π, Q(s, a) can be calculated as:

where R_s(a) is the average reward of state s, P_ss'[π(s)] is the transition probability of state s under policy π, and V^π(s) is the expected cumulative return in state s, defined as:

Third, the reward r(s, a) is calculated with the reward function, the next state s' is obtained from the current state and action, the Q value of state s' is calculated based on the ε-greedy strategy, and the value of Q(s, a) is updated.

Selecting action a in state s according to the ε-greedy strategy yields the return r(s, a) and the next state s', where the return r(s, a) is calculated from the reward function set in S52. The value of Q(s, a) is then updated as follows:

where max_{a'} Q(s', a') indicates that the action in state s' is selected by the greedy strategy, that is, the action with the largest action value in the action space is selected;

Fourth, the state and action are updated: s = s', a = a';

Fifth, whether the final state has been reached is checked; if not, the procedure returns to the third step, otherwise it continues with the next step;

Sixth, whether the learning process has finished is checked; if not, the procedure returns to the second step, otherwise it continues with the next step;

Seventh, once all Q(s, a) have converged, the final strategy is output and the optimal allocation scheme is obtained. The optimal current distribution scheme determined by the Q-learning algorithm in this embodiment is shown in Table 1:

Table 1. Current distribution scheme when system efficiency is optimal

In this embodiment, the two phase converters are not forced to operate simultaneously, and only the first phase operates at load currents of 1-12 A. If both phases are required to operate, only the boundary conditions of the objective function need to be changed.
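The seven steps above can be condensed into the following simplified sketch. It is not the implementation of this embodiment: it assumes a one-dimensional state h (the quantized share of phase 1, with phase 2 taking the remainder, so no penalty term is needed), three actions instead of nine, a fixed number of steps per episode instead of a convergence test, and the standard Q-learning update Q(s, a) ← Q(s, a) + α[r + γ max Q(s', a') − Q(s, a)]. The objective `psi` passed in at the end is a hypothetical stand-in.

```python
import random


def q_update(q, s, a, r, s_next, alpha=1.0, gamma=0.8):
    """Standard Q-learning update applied in place to the table q[state][action]."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def q_learning_search(psi, h_min, h_max, episodes=21, steps=200,
                      alpha=1.0, gamma=0.8, epsilon=0.9, seed=0):
    """Search the integer state h in [h_min, h_max] that minimises psi(h)."""
    rng = random.Random(seed)
    moves = (-1, 0, 1)                                   # decrease, hold, increase
    q = {h: [0.0] * len(moves) for h in range(h_min, h_max + 1)}
    starts = [h_min + k * (h_max - h_min) // max(episodes - 1, 1)
              for k in range(episodes)]
    best_h = starts[0]
    for s in starts:                                     # outer loop: episodes
        if psi(s) < psi(best_h):
            best_h = s
        for _ in range(steps):                           # inner loop: one episode
            if rng.random() < epsilon:                   # greedy, ties broken at random
                top = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == top])
            else:                                        # random action
                a = rng.randrange(len(moves))
            s_next = min(max(s + moves[a], h_min), h_max)
            delta = psi(s_next) - psi(s)
            if psi(s_next) < psi(best_h):
                r, best_h = 50.0, s_next                 # new historical minimum
            else:
                r = (delta < 0) - (delta > 0)            # +1, -1 or 0
            q_update(q, s, a, r, s_next, alpha, gamma)
            s = s_next
    return best_h, q


# Hypothetical convex objective with its minimum at h = 640, for illustration only.
psi = lambda h: (h - 640) ** 2
best_h, q_table = q_learning_search(psi, 77, 923)
print(best_h, best_h / 1000)
```

In an actual application, `psi` would evaluate the objective function of S52 at the current split implied by h, using the fitted efficiency characteristic curves.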

The optimal current distribution scheme given by the Q-learning algorithm is verified by simulation.

The controller for efficiency optimization control comprises a main controller and a current distribution controller. The main controller implements a voltage-current double closed-loop PI control strategy to ensure stable operation of the system. When the system is disturbed and undergoing dynamic regulation, the current sharing control mode is used. When the system is in a stable working state, the main controller sends a control signal to the current distribution controller, which solves for the current distribution scheme that maximizes the efficiency of the parallel system and returns it to the main controller. The main controller then redistributes the current references of the two phases according to this scheme, and the system operates in the optimal efficiency mode.
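The switching between the two modes described above can be sketched as follows. This is only an illustration under assumptions: the text does not state how the stable working state is detected, so the voltage-error band, the sample count and the names SteadyStateDetector and phase_current_references are hypothetical. The callable allocate stands in for the current distribution controller, and the PI loops themselves are not shown.

```python
from dataclasses import dataclass


@dataclass
class SteadyStateDetector:
    """Hypothetical detector: steady once the voltage error stays in a band."""
    tolerance: float = 0.01   # relative voltage error band (assumed)
    hold_samples: int = 5     # consecutive in-band samples required (assumed)
    _count: int = 0

    def update(self, v_ref, v_out):
        in_band = abs(v_out - v_ref) <= self.tolerance * abs(v_ref)
        self._count = self._count + 1 if in_band else 0
        return self._count >= self.hold_samples


def phase_current_references(i_total_ref, steady_state, allocate):
    """Return (I_o1 reference, I_o2 reference) for the two phases.

    During dynamic regulation the total reference is shared equally.
    In steady state, `allocate` (a stand-in for the current distribution
    controller) supplies the phase-1 current, clipped to [0, i_total_ref].
    """
    if not steady_state:
        half = i_total_ref / 2
        return half, half
    i1 = min(max(allocate(i_total_ref), 0.0), i_total_ref)
    return i1, i_total_ref - i1
```

In a control loop, the detector would be updated every sampling period and its result passed as steady_state, so that any disturbance that pushes the output voltage out of the band immediately restores current sharing.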

The optimal current distribution scheme given by the Q-learning algorithm is verified by simulation under this control strategy. As can be seen from Fig. 5, in the initial stage of system operation the output voltage rises rapidly with little overshoot, and the output current operates in the current sharing mode. After the system reaches steady state, it operates in the efficiency optimization mode. The load is suddenly reduced from 2 Ω to 1 Ω at 10 ms and suddenly increased from 1 Ω to 1.5 Ω at 20 ms; the output voltage fluctuates slightly at 10 ms and 20 ms and then returns to steady state. The system operates in the current sharing mode immediately after each sudden load change and in the efficiency optimization mode once steady state is reached.

The load resistance is varied over multiple simulations, and the overall system efficiency is recorded in both the current sharing mode and the efficiency optimization mode. The efficiency of the two modes is plotted against load current in Fig. 6. The system efficiency in the efficiency optimization mode is clearly higher than in the current sharing mode, which verifies the effectiveness of the proposed efficiency optimization algorithm.

It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention; any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims

1. A method for optimizing the efficiency of a two-phase staggered parallel DC-DC converter, characterized by comprising the following steps:

S2, training efficiency data of the two phases under different working conditions to obtain an efficiency prediction model of the two phases of the two-phase staggered parallel DC-DC converter under any working condition; the step S2 comprises the following steps:

S23, calculating the prediction precision of the established efficiency prediction model, and evaluating the rationality and accuracy of the efficiency prediction model

2. The method of claim 1, wherein S1 comprises the steps of:

3. The method of claim 1, wherein S3 comprises the steps of:

S31, expressing the efficiency η_i of each phase of the two-phase staggered parallel DC-DC converter as a function of its output current I_oi:

η_i = f(I_oi), i = 1, 2

S32, given a set of converter operating conditions, predicting how the efficiency of each phase varies with the output current under those conditions according to the efficiency prediction model, and then obtaining an accurate expression of the relation between the efficiency of each phase and the output current, namely the efficiency characteristic curve, by polynomial fitting.

4. A method according to claim 3, wherein S4 comprises the steps of:

wherein I_oimin and I_oimax are respectively the minimum and maximum values of the i-th phase output current, and I_loadmin and I_loadmax are respectively the minimum and maximum values of the load current.

5. The method of claim 4, wherein S5 comprises the steps of:

wherein θ is a quantized value of the state space, [ ] is a rounding operation;

setting an action space A of the Q-learning reinforcement learning strategy: wherein C_Di = [0, ±1]θ, Δλ_i ∈ C_Di;

setting the objective function of the Q-learning reinforcement learning strategy, wherein the objective function includes a penalty term and R is a penalty factor;

6. A two-phase interleaved parallel DC-DC converter efficiency optimization system, comprising: a computer readable storage medium and a processor;

the computer-readable storage medium is for storing executable instructions;

the processor is configured to read executable instructions stored in the computer readable storage medium and execute the two-phase interleaved parallel DC-DC converter efficiency optimization method of any one of claims 1 to 5.