CN114362519A

CN114362519A - Efficiency optimization method and system for two-phase interleaved parallel DC-DC converter

Info

Publication number: CN114362519A
Application number: CN202111661765.6A
Authority: CN
Inventors: 尹泉; 缪佶桂; 刘洋
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2021-12-31
Filing date: 2021-12-31
Publication date: 2022-04-15
Anticipated expiration: 2041-12-31
Also published as: CN114362519B

Abstract

The invention discloses an efficiency optimization method and system for a two-phase interleaved parallel DC-DC converter, belonging to the field of power electronics. The method comprises the following steps: collecting efficiency data for each of the two phases of the converter under different working conditions; training an SVR (support vector regression) algorithm on the efficiency data to obtain an efficiency prediction model of the two phases under any working condition; fitting an efficiency characteristic curve of each phase under any working condition according to the efficiency prediction model; formulating the overall-efficiency optimization problem based on the efficiency characteristic curves and converting it into a current distribution optimization problem; and finding the optimal current distribution scheme by using a Q-learning reinforcement learning strategy. The method adapts to changes in the system operating condition, predicts the efficiency under any working condition, handles unknown or incomplete information about the dynamic environment, and provides an optimal current distribution scheme without changing the model, which improves the applicability of the model. It requires few parameters, the algorithm is concise and easy to understand and implement, and it has industrial application value.

Description

Efficiency optimization method and system for two-phase interleaved parallel DC-DC converter

Technical Field

The invention belongs to the field of power electronics, and particularly relates to an efficiency optimization method and system for a two-phase interleaved parallel DC-DC converter.

Background

The two-phase interleaved parallel DC-DC converter is widely used in high-current, high-power applications owing to its high power, high power density, high efficiency and high reliability. The load current is usually distributed among the single-phase converters by a current-sharing control strategy to ensure safe operation of the system. However, because the component parameters and circuit parasitic parameters of the single-phase converters in an interleaved parallel converter differ, their efficiency characteristics also differ; if a current-sharing control scheme is adopted, the overall efficiency of the system cannot be guaranteed to be optimal, which may waste energy. An efficiency optimization method for the two-phase interleaved parallel DC-DC converter based on non-current-sharing control is therefore of great significance.

The traditional non-current-sharing current distribution scheme optimizes the total system efficiency by establishing an accurate power loss model, fitting the efficiency characteristic curve of each single-phase converter, and combining these with an efficiency calculation method. The operating conditions of the converter are complex and variable, for example when the load or the voltage conversion ratio changes, and the efficiency characteristics differ from one working condition to another. Whenever the working condition changes, data must be collected again under the new condition and the efficiency characteristic curve refitted, which is cumbersome and reduces the applicability of the model. In addition, when the system operates under complex working conditions, the optimal current distribution scheme is difficult to solve by purely mathematical methods. Various heuristic algorithms are currently used, including particle swarm, ant colony, simulated annealing, hill climbing, differential evolution and genetic algorithms. However, these algorithms cannot adapt to dynamic changes in the external conditions, that is, the working conditions, and must be reset when those conditions change. The search performance of some of them also depends heavily on the choice of initial values and optimization parameters, so they easily fall into local optima and produce large errors.

Disclosure of Invention

In view of the defects of the prior art, the invention provides an efficiency optimization method and system for a two-phase interleaved parallel DC-DC converter, aiming to solve the problem that existing methods for non-current-sharing control of the converter have large errors.

In order to achieve the above object, an aspect of the present invention provides an efficiency optimization method for a two-phase interleaved parallel DC-DC converter, including the following steps:

S1, collecting efficiency data for each of the two phases of the two-phase interleaved parallel DC-DC converter under different working conditions;

S2, training an SVR (support vector regression) algorithm on the efficiency data of the two phases under different working conditions to obtain an efficiency prediction model of the two phases of the converter under any working condition;

S3, fitting an efficiency characteristic curve of each of the two phases of the converter under any working condition according to the efficiency prediction model;

S4, formulating the overall-efficiency optimization problem of the converter based on the efficiency characteristic curves, and converting it into a current distribution optimization problem;

S5, abstracting the current distribution optimization problem, and searching for the optimal current distribution scheme by using a Q-learning reinforcement learning strategy.

Further, S1 includes the following steps:

S11, setting the working condition parameters of the two-phase interleaved parallel DC-DC converter, the working condition parameters comprising the switching frequency, the input voltage range, the output voltage range and the output current range;

S12, under the set working conditions, giving different input voltages and output voltages, changing the load resistance, operating each phase of the converter independently, and calculating the efficiency of each phase at different output currents.

Further, S2 includes the following steps:

S21, randomly dividing the efficiency data under different working conditions into a training set and a test set according to a preset ratio;

S22, training an efficiency prediction model for each phase of the converter by using the SVR algorithm, and performing 10-fold cross validation;

S23, calculating the prediction accuracy of the established efficiency prediction model, and evaluating the rationality and accuracy of the efficiency prediction model.

Further, S3 includes the following steps:

S31, expressing the efficiency η_i of each phase of the two-phase interleaved parallel DC-DC converter as a function of the output current I_o:

η_i = f(I_o), i = 1, 2

S32, given a set of converter operating conditions, predicting how the efficiency of each phase varies with the output current under those conditions according to the efficiency prediction model, and then obtaining, by polynomial fitting, an explicit expression of the relation in S31 between the efficiency of each phase and the output current, namely the efficiency characteristic curve.

Further, S4 includes the following steps:

S41, deriving the overall efficiency η of the two-phase interleaved parallel DC-DC converter and expressing it in terms of the efficiency of each phase:

η = I_load / Σ_{i=1}^{2} [I_oi / η_i(I_oi)]

where I_oi is the output current of the i-th phase, η_i(I_oi) is the efficiency of the i-th phase, and I_load is the total load current, equal to the sum of the two phase output currents;

S42, replacing the single-phase efficiency in S41 with the efficiency characteristic curve of each phase obtained in S3, so that the overall efficiency of the system is expressed solely in terms of the output current of each phase;

S43, determining the constraint conditions of the load current distribution problem from the range limits of the output current of each phase and of the total load current, and converting the maximization of the overall efficiency into the optimal current distribution problem:

max η(I_o1, I_o2), subject to I_o1 + I_o2 = I_load, I_oimin ≤ I_oi ≤ I_oimax (i = 1, 2), I_loadmin ≤ I_load ≤ I_loadmax

where I_oimin and I_oimax are the minimum and maximum output current of the i-th phase, and I_loadmin and I_loadmax are the minimum and maximum load current, respectively.

Further, S5 includes the following steps:

S51, setting the state space λ_i of the Q-learning reinforcement learning strategy: λ_i = I_oi/I_load,

where the value range of λ_i satisfies:

where θ is the quantization value of the state space and [·] is a rounding operation;

setting the action space A of the Q-learning reinforcement learning strategy:

setting the objective function ψ of the Q-learning reinforcement learning strategy:

which contains a penalty term, r being the penalty factor;

setting the reward function r(s, a) of the Q-learning reinforcement learning strategy:

where ψ_min is the minimum value of the objective function over the historical moments;

S52, running the Q-learning reinforcement learning strategy with an ε-greedy strategy and exploring continuously until the optimal current distribution scheme is found.

Beneficial effects: the method adapts to changes in the system operating condition, predicts the efficiency under any working condition, handles unknown or incomplete information about the dynamic environment, and provides an optimal current distribution scheme without changing the model, which improves the applicability of the model. It requires few parameters, the algorithm is concise and easy to understand and implement, and it has industrial application value.

In another aspect, the present invention provides an efficiency optimization system for a two-phase interleaved parallel DC-DC converter, including: a computer-readable storage medium and a processor;

the computer-readable storage medium is used for storing executable instructions;

the processor is used for reading the executable instructions stored in the computer readable storage medium and executing the efficiency optimization method of the two-phase interleaved parallel DC-DC converter.

Through the technical scheme, compared with the prior art, the invention can obtain the following beneficial effects.

(1) The invention provides an efficiency optimization method of a two-phase interleaved parallel DC-DC converter, which adopts an SVR algorithm to train efficiency data under different working conditions to obtain a model capable of predicting the efficiency characteristic of the converter under any working condition, when the working condition changes, the data under the working condition does not need to be collected again to carry out the fitting of an efficiency characteristic curve, and the applicability is strong;

(2) the efficiency optimization method for the two-phase interleaved parallel DC-DC converter provided by the invention adopts non-current-sharing control to realize efficiency optimization, the current distribution scheme of non-current sharing is obtained by optimizing a Q-learning algorithm, and the intelligent algorithm avoids the problem that the traditional mathematical solving method is difficult to calculate under complex working conditions such as load or voltage conversion ratio change and the like;

(3) compared with the current-sharing control scheme, the efficiency optimization method for the two-phase interleaved parallel DC-DC converter provided by the invention can obviously improve the efficiency of the converter, has the advantages of concise and understandable algorithm, less required parameters, capability of adapting to the change of the system operation condition, capability of processing unknown or incomplete information of a dynamic environment, no need of resetting the algorithm when the environment changes, easiness in realization and industrial application value.

Drawings

FIG. 1 is a flow chart of a method for optimizing efficiency of a two-phase interleaved parallel DC-DC converter according to the present invention;

FIG. 2 is a schematic diagram of a two-phase interleaved synchronous rectification Buck converter system;

FIG. 3 is a two-phase interleaved synchronous rectification Buck type DC-DC converter topology;

FIG. 4 is a single-phase synchronous rectification Buck circuit topology;

FIG. 5 is a graph of voltage and current waveforms before and after a sudden change in load;

FIG. 6 is a comparison of the overall system efficiency in the current-sharing mode and in the efficiency optimization mode.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

In order to achieve the above object, the present invention provides an efficiency optimization method for a two-phase interleaved parallel DC-DC converter, as shown in fig. 1, including the following steps:

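S1, collecting efficiency data for each of the two phases of the two-phase interleaved parallel DC-DC converter under different working conditions;

S2, training an SVR algorithm on the efficiency data of the two phases under different working conditions to obtain an efficiency prediction model of the two phases of the converter under any working condition;

S3, fitting an efficiency characteristic curve of each of the two phases of the converter under any working condition according to the efficiency prediction model;

S4, formulating the overall-efficiency optimization problem of the converter based on the efficiency characteristic curves, and converting it into a current distribution optimization problem;

S5, abstracting the current distribution optimization problem, and searching for the optimal current distribution scheme by using a Q-learning reinforcement learning strategy.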

In order to clearly describe the efficiency optimization method of the two-phase interleaved parallel DC-DC converter according to the present invention, the following description will be made by taking the converter in fig. 2 as an example:

Specifically, S1 includes the steps of:

S11, FIG. 3 shows the topology of the two-phase interleaved synchronous rectification Buck converter. A simulation platform is built in Saber and the working condition parameters of the converter are set: the inductors L1 and L2 are both 33 μH, the switching frequency is 200 kHz, the input voltage range is 30-48 V, the output voltage range is 12-24 V, and the output current ranges of the single-phase converters, IL1 and IL2, are both 1-12 A;

S12, within the set working condition range of the converter, different input voltages and output voltages are given, the load resistance is changed, each single-phase converter operates independently, and the efficiency of each single-phase converter at different output currents is recorded; the efficiency is calculated from the average input power and average output power when the converter operates in steady state.

In this embodiment, efficiency data of each single-phase converter are collected at input voltages of 30, 33, 36, 39, 42, 45 and 48 V, output voltages of 12, 15, 18, 21 and 24 V, and output currents from 1 A to 12 A in 1 A steps. Each single-phase converter thus has 7 × 5 × 12 = 420 sets of data.
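The sampling grid described above can be enumerated directly. The following is a minimal sketch that only reproduces the count of operating points; the variable names are illustrative and no measured efficiency values are included.

```python
from itertools import product

# Operating-point grid from the embodiment (volts, volts, amperes).
input_voltages = [30, 33, 36, 39, 42, 45, 48]
output_voltages = [12, 15, 18, 21, 24]
output_currents = list(range(1, 13))  # 1 A to 12 A in 1 A steps

# One efficiency measurement per (V_in, V_o, I_o) combination and per phase.
operating_points = list(product(input_voltages, output_voltages, output_currents))

print(len(operating_points))  # 7 * 5 * 12 = 420
```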

Specifically, S2 includes the steps of:

S21, the efficiency data of each phase are randomly divided into a training set and a test set at a ratio of 8:2; the training set is used to build the efficiency prediction model and the test set is used to test the prediction accuracy of the model;
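A random 8:2 split of 420 samples yields 336 training samples and 84 test samples. The sketch below illustrates this with placeholder sample indices; the helper name `split_train_test` and the fixed seed are assumptions made for reproducibility, not part of the described method.

```python
import random

def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and cut it at the given fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 420 placeholder indices standing in for one phase's measurements.
train, test = split_train_test(range(420))
print(len(train), len(test))  # 336 84
```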

S22, an efficiency prediction model of each single-phase converter is trained with the SVR algorithm, and 10-fold cross validation is performed. The basic principle of the SVR algorithm is to map the samples into a high-dimensional feature space and find an optimal hyperplane there such that the total deviation of all sample data from the hyperplane is minimized. Let φ(x) denote the feature vector obtained by mapping the input vector x; the model corresponding to the hyperplane in the feature space can be expressed as:

f(x) = ω^T φ(x) + b

where ω is the normal vector, which determines the direction of the optimal hyperplane, and b is the displacement term, which determines the distance between the optimal hyperplane and the origin. The SVR problem can be formalized as:

where x_i and y_i are the i-th input vector and the corresponding output, ξ_i and ξ_i^* are slack variables, and C is a penalty parameter, taken as 10000 in this embodiment. ε is the maximum tolerable deviation between the model output f(x) and the true output y: a loss is counted only when the absolute value of the deviation is larger than ε, which is equivalent to constructing a band of width 2ε centred on f(x), and training data falling inside the band are considered correctly predicted. The SVR problem is converted into its dual problem:

where α_i and α_i^* are Lagrange multipliers. Solving the dual problem yields the SVR solution:

where κ(x_i, x) is a kernel function, which can be expressed as:

κ(x_i,x)＝φ(x_i)^Tφ(x)
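For reference, the primal problem, the dual problem and the resulting regression function referred to above are commonly written in the standard ε-SVR form below. This is the textbook formulation, given on the assumption that the embodiment uses it unchanged; the sample count m is a symbol introduced here for illustration.

```latex
% Primal problem (standard epsilon-SVR)
\min_{\omega,\, b,\, \xi_i,\, \xi_i^*} \;
  \frac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{m} \left( \xi_i + \xi_i^* \right)
\quad \text{s.t.} \quad
\begin{cases}
  f(x_i) - y_i \le \varepsilon + \xi_i, \\
  y_i - f(x_i) \le \varepsilon + \xi_i^*, \\
  \xi_i \ge 0,\; \xi_i^* \ge 0, \quad i = 1, \dots, m
\end{cases}

% Dual problem
\max_{\alpha,\, \alpha^*} \;
  \sum_{i=1}^{m} y_i \left( \alpha_i^* - \alpha_i \right)
  - \varepsilon \sum_{i=1}^{m} \left( \alpha_i^* + \alpha_i \right)
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
    \left( \alpha_i^* - \alpha_i \right) \left( \alpha_j^* - \alpha_j \right) \kappa(x_i, x_j)
\quad \text{s.t.} \quad
\sum_{i=1}^{m} \left( \alpha_i^* - \alpha_i \right) = 0, \quad
0 \le \alpha_i,\, \alpha_i^* \le C

% Regression function obtained from the dual solution
f(x) = \sum_{i=1}^{m} \left( \alpha_i^* - \alpha_i \right) \kappa(x_i, x) + b
```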

In this embodiment, the commonly used Gaussian kernel function is selected, and its expression is:

κ(x_i, x) = exp(-‖x_i - x‖² / (2σ²))

where σ is the bandwidth of the Gaussian kernel, taken in this embodiment as:
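The way a trained SVR model with a Gaussian kernel produces a prediction can be sketched as follows. This assumes the usual kernel form exp(-‖x_i - x‖²/(2σ²)) and the usual regression function f(x) = Σ(α_i^* - α_i)κ(x_i, x) + b; the support vectors, coefficients, bias, σ value and the feature order (V_in, V_o, I_o) are hypothetical numbers chosen only to exercise the functions.

```python
import math

def gaussian_kernel(x_i, x, sigma):
    """kappa(x_i, x) = exp(-||x_i - x||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """f(x) = sum_i (alpha_i^* - alpha_i) * kappa(x_i, x) + b.

    dual_coefs[i] holds the difference (alpha_i^* - alpha_i).
    """
    return b + sum(
        coef * gaussian_kernel(x_i, x, sigma)
        for x_i, coef in zip(support_vectors, dual_coefs)
    )

# Hypothetical values, not taken from the embodiment.
support_vectors = [(42.0, 20.0, 6.0), (36.0, 18.0, 3.0)]
dual_coefs = [1.5, -0.5]
print(svr_predict((42.0, 20.0, 6.0), support_vectors, dual_coefs, b=90.0, sigma=5.0))
```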

S23, the prediction accuracy of the established efficiency prediction model is calculated, and the rationality and accuracy of the model are evaluated. The model is evaluated using the mean absolute error (MAE) and the root mean square error (RMSE):

The mean of the root mean square error over the folds is used to evaluate the model trained and tested by 10-fold cross validation. The scores obtained by the efficiency prediction models of the two phases are 0.481 and 0.408, respectively, indicating small errors and good model performance.

The trained models are then used to predict the efficiency for the 84 test samples, and the predictions are compared with the true values. The first-phase model has a mean absolute error of 0.146 and a root mean square error of 0.335; the second-phase model has a mean absolute error of 0.264 and a root mean square error of 0.379. The prediction accuracy is high, which verifies the rationality and accuracy of the model.
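The two error metrics named above follow their usual definitions, MAE = (1/n)Σ|y_i - ŷ_i| and RMSE = sqrt((1/n)Σ(y_i - ŷ_i)²). A minimal sketch is given below; the measured and predicted values are hypothetical and are not data from the embodiment.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical efficiencies in percent.
measured = [95.0, 94.0, 92.0, 90.0]
predicted = [95.5, 94.0, 91.0, 90.5]
print(mean_absolute_error(measured, predicted))  # 0.5
print(root_mean_square_error(measured, predicted))
```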

Specifically, FIG. 4 is a single-phase synchronous rectification Buck circuit topology. In the figure, V_in is the input voltage; R_dson and R_sdson are the on-resistances of the main power switch and the synchronous rectifier, respectively; L_esr is the ESR of the energy-storage inductor; C_iesr and C_oesr are the ESRs of the input filter capacitor and the output filter capacitor, respectively; I_L is the inductor current; R is the load resistance; I_o is the load current; and V_o is the output voltage.

In practical engineering applications the ESR of the filter capacitors is very small, and the loss it generates is negligible. The loss of the energy-storage inductor comprises copper loss and iron loss. The copper loss is mainly generated by the ESR of the energy-storage inductor, and the iron loss can be ignored. The loss of the energy-storage inductor is expressed as:

P_L = I_L² × L_esr

The losses of the main power switch include conduction loss, switching loss and output capacitance loss. The conduction loss is expressed as:

P_on = R_dson × I_ds(rms)²

where I_ds(rms) is the RMS current of the main power switch, which can be expressed as:

where D is the duty ratio of the main power switch and ΔI_L is the inductor current ripple.

The turn-on loss of the main power switch can be expressed as:

where t_r is the time for the drain-source voltage to fall from 90% to 10% when the main power switch turns on, and f is the switching frequency.

The turn-off loss can be expressed as:

where t_f is the time for the drain-source voltage to rise from 10% to 90% when the main power switch turns off.

The power loss caused by the discharge of the output capacitance can be expressed as:

where C_oss is the output capacitance of the main power switch.

In summary, the total loss of the main power switch can be expressed as:

P_sw = P_on + P_swon + P_swoff + P_co

The losses of the synchronous rectifier include conduction loss, switching loss, output capacitance loss and the reverse recovery loss of its parasitic body diode. The conduction loss can be expressed as:

P_{syn_on} = R_sdson × I_sds(rms)²

where I_sds(rms) is the RMS current of the synchronous rectifier, which can be expressed as:

The turn-on loss of the synchronous rectifier can be expressed as:

where t_sr is the time for the drain-source voltage to fall from 90% to 10% when the synchronous rectifier turns on.

The turn-off loss can be expressed as:

where t_sf is the time for the drain-source voltage to rise from 10% to 90% when the synchronous rectifier turns off.

The output capacitance loss can be expressed as:

where C_{syn_oss} is the output capacitance of the synchronous rectifier.

The reverse recovery loss of the parasitic body diode of the synchronous rectifier can be expressed as:

P_qrr = Q_rr × V_r × f

where V_r is the reverse voltage sustained by the synchronous rectifier during turn-off and Q_rr is the reverse recovery charge of the parasitic body diode.

In summary, the total loss of the synchronous rectifier can be expressed as:

P_syn = P_{syn_on} + P_{syn_swon} + P_{syn_swoff} + P_{syn_co} + P_qrr

Based on the above loss analysis, the efficiency of a single-phase synchronous rectification Buck converter can be expressed as:

where:

A = R_dson × D + R_sdson × (1 - D)

It can be seen that, if the input voltage, output voltage and switching frequency of the converter are constant and stable, and the influence of all the parasitic parameters in the circuit is not taken into account, the efficiency characteristic of a single-phase synchronous rectification Buck converter can be expressed as a function of the output current.
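The dependence of efficiency on output current can be illustrated with a simplified loss budget. The sketch below is not the loss model of the embodiment: it assumes common textbook approximations (ideal duty ratio D = V_o/V_in, triangular inductor ripple, squared RMS inductor current I_o² + ΔI_L²/12, main-switch switching loss 0.5·V_in·I_o·(t_r + t_f)·f, output capacitance loss 0.5·C·V_in²·f, and V_r taken as V_in), omits the switching loss of the synchronous rectifier, and uses hypothetical device values. Only the inductance, switching frequency and voltages follow the embodiment.

```python
def buck_phase_efficiency(v_in, v_o, i_o, f, inductance, r_dson, r_sdson,
                          l_esr, t_r, t_f, c_oss, c_syn_oss, q_rr):
    """Estimate single-phase efficiency from a simplified loss budget."""
    d = v_o / v_in                                    # ideal duty ratio (assumption)
    ripple = v_o * (1.0 - d) / (inductance * f)       # peak-to-peak inductor ripple
    i_rms_sq = i_o ** 2 + ripple ** 2 / 12.0          # squared RMS inductor current
    a = r_dson * d + r_sdson * (1.0 - d)              # the quantity A defined above
    p_cond = a * i_rms_sq                             # conduction loss of both switches
    p_ind = l_esr * i_rms_sq                          # inductor copper loss
    p_switch = 0.5 * v_in * i_o * (t_r + t_f) * f     # main-switch switching loss
    p_cap = 0.5 * (c_oss + c_syn_oss) * v_in ** 2 * f # output capacitance losses
    p_qrr = q_rr * v_in * f                           # reverse recovery, V_r = V_in assumed
    p_out = v_o * i_o
    return p_out / (p_out + p_cond + p_ind + p_switch + p_cap + p_qrr)

# Device values below are hypothetical.
params = dict(v_in=42.0, v_o=20.0, f=200e3, inductance=33e-6,
              r_dson=0.010, r_sdson=0.008, l_esr=0.020,
              t_r=20e-9, t_f=20e-9, c_oss=300e-12, c_syn_oss=300e-12, q_rr=50e-9)
for i_o in (1.0, 6.0, 12.0):
    print(i_o, round(buck_phase_efficiency(i_o=i_o, **params), 4))
```

With these assumed values the efficiency peaks at a mid-range current, because fixed losses dominate at light load and resistive losses dominate at heavy load; this is the kind of curve shape that makes unequal current sharing worthwhile.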

Specifically, S3 includes the steps of:

S31, the efficiency η_i of each phase is expressed as a function of the output current I_o:

η_i = f(I_o), i = 1, 2

S32, a converter operating condition not included in the training data is selected; in this embodiment the input voltage is 42 V and the output voltage is 20 V. The model obtained by SVR training is used to predict how the efficiency of each single-phase converter varies with the output current under this working condition, and a 5th-degree polynomial is fitted to obtain an explicit expression of the relation between the efficiency of each phase and the output current, namely the efficiency characteristic curve, as follows:
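The polynomial fitting step can be sketched without external libraries as a least-squares fit solved through the normal equations. The fitted coefficients of the embodiment are not reproduced here; the "predicted" efficiencies below are a hypothetical stand-in for SVR output, and rescaling the current to the interval [-1, 1] is a numerical-conditioning choice made for this sketch.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_polynomial(xs, ys, degree, x_min, x_max):
    """Least-squares polynomial fit; returns a callable curve(x)."""
    def scale(x):
        return (2.0 * x - x_min - x_max) / (x_max - x_min)

    ts = [scale(x) for x in xs]
    n = degree + 1
    normal = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    rhs = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    coefs = solve_linear_system(normal, rhs)

    def curve(x):
        t = scale(x)
        result = 0.0
        for c in reversed(coefs):                    # Horner evaluation
            result = result * t + c
        return result

    return curve

currents = list(range(1, 13))
# Hypothetical stand-in for SVR-predicted efficiencies (percent).
predicted = [90.0 + 0.8 * i - 0.05 * i ** 2 for i in currents]
eta_1 = fit_polynomial(currents, predicted, degree=5, x_min=1, x_max=12)
print(eta_1(6.5))
```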

Specifically, S4 includes the steps of:

S41, the overall efficiency of the two-phase interleaved parallel synchronous rectification Buck converter can be expressed as:

η = P_out / P_in = (P_o1 + P_o2) / (P_i1 + P_i2)

where P_in and P_out are the input power and output power, respectively, and P_ii and P_oi are the input power and output power of the i-th phase, respectively.

Because the phases are connected in parallel and share the same output voltage, the output power of each phase can be expressed through its output current, and with P_ii = P_oi / η_i(I_oi) the overall efficiency can be written as:

η = I_load / Σ_{i=1}^{2} [I_oi / η_i(I_oi)]

where I_oi is the output current of the i-th phase, η_i(I_oi) is the efficiency of the i-th phase, and I_load is the load current, equal to the sum of the two phase output currents;

the single-phase efficiency in S41 is replaced by the efficiency characteristic curve of each single-phase converter obtained in S3, so that the overall system efficiency is expressed solely in terms of the output currents of the two single-phase converters;

S42, the efficiency optimization problem is defined as maximizing the overall efficiency of the parallel converter system. As can be seen from S3, the overall efficiency is a function of the output current of each single-phase converter, so maximizing the overall efficiency is converted into the problem of distributing the load current among the phases, which can be expressed as:

The total load current is limited to 1-24 A, and the output current of each single-phase converter is limited to 0-12 A. Combining these load current constraints with the above formula gives the following constrained optimization problem:
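The constrained problem described above (maximize the overall efficiency subject to I_o1 + I_o2 = I_load and 0 ≤ I_oi ≤ 12 A) can be checked on a small scale by exhaustive search. The sketch below is a brute-force baseline, not the Q-learning method of the invention; the two efficiency curves are hypothetical, and an idle phase is assumed to draw no input power.

```python
def overall_efficiency(i_o1, i_o2, eta_1, eta_2):
    """eta = I_load / (I_o1/eta_1(I_o1) + I_o2/eta_2(I_o2))."""
    p_in = 0.0
    for current, eta in ((i_o1, eta_1), (i_o2, eta_2)):
        if current > 0:                      # an idle phase is assumed lossless
            p_in += current / eta(current)
    return (i_o1 + i_o2) / p_in

def best_split_by_grid(i_load, eta_1, eta_2, i_max=12.0, steps=1000):
    """Scan lambda_1 = k/steps and keep the feasible split with highest efficiency."""
    best = None
    for k in range(steps + 1):
        i_o1 = i_load * k / steps
        i_o2 = i_load - i_o1
        if i_o1 > i_max or i_o2 > i_max:     # single-phase current limit
            continue
        eta = overall_efficiency(i_o1, i_o2, eta_1, eta_2)
        if best is None or eta > best[0]:
            best = (eta, i_o1, i_o2)
    return best

# Hypothetical efficiency curves (fractions, not percent).
eta_1 = lambda i: 0.97 - 0.002 * (i - 6.0) ** 2
eta_2 = lambda i: 0.95 - 0.003 * (i - 5.0) ** 2
eta, i_o1, i_o2 = best_split_by_grid(13.0, eta_1, eta_2)
print(round(eta, 4), i_o1, i_o2)
```

An exhaustive scan like this is practical only for two phases and one fixed working condition, which is part of the motivation for a learning-based search.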

Specifically, S5 includes the steps of:

S51, the optimal efficiency can be expressed as:

where λ_i = I_oi / I_load.

S52, abstracting the current distribution problem.

The ratio λ_i of the output current of each single-phase converter to the total load current defines the state space of the Q-learning reinforcement learning strategy;

a current increment unit θ is defined; the output current of each single-phase converter may increase by one increment unit, decrease by one increment unit, or remain unchanged, which forms the action space of the Q-learning optimization strategy. The action space of the converter system therefore contains 9 combinations of output current changes of the two single-phase converters;

A penalty term is defined, and an objective function ψ containing the penalty term is obtained:

where r is a penalty factor; the penalty term is defined with respect to an expected value, which is fixed in the present embodiment.

Combining the objective function and the constraint conditions, for a given load current the optimal current distribution problem can be expressed as:

where λ_imin and λ_imax are the minimum and maximum values of λ_i.

A reward function is designed according to the objective function. When a certain action moves the system from the current state to the next state: if Δψ is less than 0, the reward for the action is 1; if Δψ is greater than 0, the reward is -1; if Δψ is equal to 0, the reward is 0; and if ψ reaches the minimum value over the historical moments, a higher reward, for example 50, can be given.
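The reward rule described above can be sketched as a small function. Two readings are assumed here and are not stated in the source: Δψ is taken as ψ(next state) minus ψ(current state), and "reaches the minimum value over the historical moments" is taken to mean that the new ψ is strictly below the best value seen so far, with this check taking precedence over the sign rule.

```python
def reward(psi_current, psi_next, psi_min, bonus=50.0):
    """Reward for moving from the current state to the next state.

    psi_min is the smallest objective value observed so far.
    """
    if psi_next < psi_min:          # new historical minimum: larger reward
        return bonus
    delta_psi = psi_next - psi_current
    if delta_psi < 0:
        return 1.0
    if delta_psi > 0:
        return -1.0
    return 0.0

print(reward(1.10, 1.08, psi_min=1.05))  # 1.0
print(reward(1.10, 1.04, psi_min=1.05))  # 50.0
```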

The limitation imposed by the current boundary conditions on the state space of the Q-learning current distribution optimization algorithm is examined further. The value range of λ_i is determined from the ranges of the converter system load current and the single-phase output current: the total load current is limited to 1-24 A, and the output current of each single-phase converter is limited to 0-12 A. When the load current is 1-12 A, 0 ≤ λ_i ≤ 1, i = 1, 2; when the load current is 13-24 A,

(I_load - 12) / I_load ≤ λ_i ≤ 12 / I_load, i = 1, 2.

Given the value range of λ_i, the current increment unit θ is set to 1/1000, and the ratio of the load-current share of each single-phase converter to θ is:

H_i = [λ_i / θ]

where [·] is a rounding operation. The conditions above bound H_i, and the value range of H_i determines all possible cases of the current distribution scheme and thus constitutes the entire state space.

Taking a load current of 6 A as an example: since the load current is smaller than the maximum single-phase output current, each single-phase output current can range from 0 to 6 A, the sum of the two phase output currents is 6 A, and with θ = 1/1000 there are 1000 distribution schemes.

Taking a load current of 13 A as an example: the load current is larger than the maximum single-phase output current, the single-phase output current ranges over 0-12 A, the sum of the two phase output currents is 13 A, and the share of each phase in the total load current satisfies 1/13 ≤ λ_i ≤ 12/13, i = 1, 2. With θ = 1/1000, H_i ranges from 77 to 923, giving 847 distribution schemes in total, which form the whole state space.
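The 13 A example can be reproduced with exact rational arithmetic. In this sketch the lower bound is rounded up and the upper bound is rounded down so that every index stays inside the current limits; this rounding direction is an assumption, since the source only states that [·] is a rounding operation. The helper name `state_index_range` is illustrative.

```python
from fractions import Fraction
from math import ceil, floor

def state_index_range(i_load, i_phase_max=12, theta=Fraction(1, 1000)):
    """Return (H_min, H_max) for one phase, where H_i = lambda_i / theta."""
    i_load = Fraction(i_load)
    lam_min = max(Fraction(0), (i_load - i_phase_max) / i_load)
    lam_max = min(Fraction(1), Fraction(i_phase_max) / i_load)
    return ceil(lam_min / theta), floor(lam_max / theta)

h_min, h_max = state_index_range(13)
print(h_min, h_max, h_max - h_min + 1)  # 77 923 847
```

For a 6 A load the same function returns the indices 0 to 1000, which corresponds to the roughly 1000 schemes quoted above when both end points are counted.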

S53, taking 1/20θ as the initial states of the different episodes in the state space (one optimization run from a given initial state is called an episode). The entire learning process comprises all episodes, which form the outer loop of the algorithm. Given an initial state, optimization is carried out over the current distribution schemes determined in S52, which forms the inner loop of the algorithm. In the inner loop, the ε-greedy strategy is designed as follows: when the generated random number is smaller than ε, the optimal action, that is, the one with the maximum action-value function, is selected, and if several actions share the same maximum expected value, one of them is selected at random; when the generated random number is greater than ε, an action is selected at random.
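The action selection rule described above can be sketched as follows. Note that in this description ε is the probability of choosing the best-valued action, which is the reverse of the more common convention in which ε is the exploration probability. The function name and the dictionary representation of the Q values are illustrative assumptions.

```python
import random

def select_action(q_row, epsilon, rng):
    """Pick an action for one state.

    q_row maps each action to its current Q value for that state.
    With probability epsilon the best-valued action is taken (ties are
    broken at random); otherwise an action is drawn uniformly at random.
    """
    actions = list(q_row)
    if rng.random() < epsilon:
        best_value = max(q_row.values())
        best_actions = [a for a in actions if q_row[a] == best_value]
        return rng.choice(best_actions)
    return rng.choice(actions)

rng = random.Random(0)
q_row = {-1: 0.0, 0: 2.0, +1: 1.0}
print(select_action(q_row, epsilon=0.9, rng=rng))
```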

S54, searching for the optimal current distribution scheme among all possible current distribution schemes according to the Q-learning reinforcement learning algorithm. Specifically:

First, Q(s, a) is initialized,

the algorithm parameters are given, and the boundary constraints of the objective function determined in S52 are loaded. The smaller the learning rate α, the slower the training, so α is generally set to 0.8-1; the discount factor γ is generally set to 0-1; the smaller the state quantization θ, the higher the precision; the greedy coefficient ε is used for action selection: when the generated random number is smaller than ε the optimal action is selected, and when it is greater than ε an action is selected at random. In the present embodiment, α = 1, γ = 0.8, θ = 1/1000 and ε = 0.9.

Second, an initial state s of the algorithm is given and an action a is selected according to the ε-greedy strategy: the optimal action is selected when the generated random number is less than 0.9, and a random action is selected when it is greater than 0.9. Under strategy π, Q(s, a) can be calculated as:

where R_s(a) is the average reward for state s, P_ss'[π(s)] is the transition probability of state s under strategy π, and V^π(s) is the expected value of the cumulative reward in state s, defined as:

Third, the reward r(s, a) is calculated with the reward function, the next state s' is obtained from the current state and action, the Q value of state s' is calculated based on the ε-greedy strategy, and the value of Q(s, a) is updated.

In state s, action a is selected according to the ε-greedy strategy, yielding a reward r(s, a) and a next state s'. Here the reward r(s, a) is calculated from the reward function set in S52. The value of Q(s, a) is then updated as follows:

where, in state s', the action is selected by a greedy strategy, that is, the action with the maximum action-value function in the action space is selected;

Fourth, the state and action are updated: s = s', a = a';

Fifth, judging whether the final state has been reached; if not, returning to the third step, otherwise proceeding to the next step;

Sixth, judging whether the learning process is finished; if not, returning to the second step, otherwise proceeding to the next step;

Seventh, once all Q(s, a) have converged, outputting the final strategy

and obtaining the optimal distribution scheme. The optimal current distribution scheme determined by the Q-learning algorithm in this embodiment is shown in Table 1:

TABLE 1 Current Allocation scheme for optimal System efficiency

In this embodiment, the two single-phase converters are not forced to operate simultaneously, and only the first phase works when the load current is 1-12 A. If the two phases are required to operate simultaneously, only the boundary conditions of the objective function need to be changed.
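The seven steps above can be sketched as a tabular Q-learning loop. This is a simplified illustration under several assumptions and not the algorithm of the embodiment: the state is the single index H_1 (with H_2 = 1/θ - H_1, so the sum constraint holds by construction and no penalty term is needed), the action set is reduced to {-1, 0, +1}, the objective ψ is taken as the reciprocal of the overall efficiency, the update is the standard rule Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',·) - Q(s,a)], the 21 initial states are spread evenly over the state space, and the efficiency curves are hypothetical.

```python
import random

def q_learning_split(psi, h_min, h_max, episodes=21, steps=500,
                     alpha=1.0, gamma=0.8, epsilon=0.9, seed=0):
    """Search for the state index with the smallest objective psi."""
    rng = random.Random(seed)
    actions = (-1, 0, 1)
    q = {h: {a: 0.0 for a in actions} for h in range(h_min, h_max + 1)}
    best_h, psi_min = h_min, psi(h_min)
    for episode in range(episodes):                       # outer loop
        # Initial states spread evenly over the state space (assumption).
        h = h_min + (h_max - h_min) * episode // max(episodes - 1, 1)
        if psi(h) < psi_min:
            best_h, psi_min = h, psi(h)
        for _ in range(steps):                            # inner loop
            if rng.random() < epsilon:                    # exploit, random tie-break
                top = max(q[h].values())
                a = rng.choice([x for x in actions if q[h][x] == top])
            else:                                         # explore
                a = rng.choice(actions)
            h_next = min(max(h + a, h_min), h_max)        # stay inside the limits
            delta = psi(h_next) - psi(h)
            if psi(h_next) < psi_min:                     # new historical minimum
                r, best_h, psi_min = 50.0, h_next, psi(h_next)
            elif delta < 0:
                r = 1.0
            elif delta > 0:
                r = -1.0
            else:
                r = 0.0
            q[h][a] += alpha * (r + gamma * max(q[h_next].values()) - q[h][a])
            h = h_next
    return best_h, psi_min, q

# Hypothetical efficiency curves and a 13 A load with theta = 1/1000.
i_load, n = 13.0, 1000
eta_1 = lambda i: 0.97 - 0.002 * (i - 6.0) ** 2
eta_2 = lambda i: 0.95 - 0.003 * (i - 5.0) ** 2

def psi(h):
    lam_1 = h / n
    lam_2 = 1.0 - lam_1
    return lam_1 / eta_1(lam_1 * i_load) + lam_2 / eta_2(lam_2 * i_load)

best_h, psi_min, q = q_learning_split(psi, 77, 923)
print(best_h, best_h / n * i_load, round(1.0 / psi_min, 4))
```

The printed values are the best index visited, the corresponding first-phase current and the overall efficiency at that point; how close this comes to the true optimum depends on the number of episodes and steps, which are arbitrary choices in this sketch.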

The optimal current distribution scheme given by the Q-learning algorithm is then verified by simulation.

The controller for efficiency optimization control includes a main controller and a current distribution controller. The main controller implements a voltage and current double closed-loop PI control strategy and ensures stable operation of the system. When the system is disturbed and is in a dynamic adjustment process, a current sharing control mode is adopted. When the system is in a steady working state, a control signal is sent to the current distribution controller, which solves for the current distribution scheme at which the efficiency of the parallel system is optimal and sends the scheme back to the main controller. After receiving the scheme, the main controller redistributes the current settings of the two phases accordingly, and the system operates in the efficiency optimal mode.
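The mode switching described above can be sketched as follows. This is only an illustration under stated assumptions: the steady-state test based on a relative voltage tolerance `tol`, the function names, and the representation of the distribution scheme as a single ratio `lam_opt` are all hypothetical; the description does not specify how the steady state is detected.

```python
def is_steady(v_out, v_ref, tol=0.01):
    """Assumed steady-state test: output voltage within tol of its reference."""
    return abs(v_out - v_ref) <= tol * abs(v_ref)


def phase_current_references(i_total_ref, steady_state, lam_opt=None):
    """Return the current references (i_ref1, i_ref2) of the two phases.

    During dynamic adjustment, or when no distribution scheme is available,
    the total reference is shared equally (current sharing mode). In steady
    state the ratio lam_opt supplied by the current distribution controller
    is applied (efficiency optimal mode).
    """
    if not steady_state or lam_opt is None:
        half = i_total_ref / 2.0
        return half, half
    return lam_opt * i_total_ref, (1.0 - lam_opt) * i_total_ref
```

In this sketch the main controller would call `phase_current_references` in every control period, so that a disturbance immediately returns the system to equal current sharing.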

The optimal current distribution scheme given by the Q-learning algorithm is verified by simulation under this control strategy. As can be seen from fig. 5, in the initial stage of system operation the output voltage rises rapidly with small overshoot, and the output current is controlled in the current sharing mode. After the system reaches steady state, it operates in the efficiency optimization mode. At 10 ms the load resistance steps down from 2 Ω to 1 Ω, and at 20 ms it steps up from 1 Ω to 1.5 Ω; the output voltage fluctuates only slightly at 10 ms and 20 ms and then returns to steady state. The system operates in the current sharing mode in the initial stage of each sudden load change, and operates in the efficiency optimization mode after reaching steady state.

The load resistance is changed over multiple simulations, and the overall efficiency of the system in the current sharing mode and in the efficiency optimization mode is recorded. The efficiency of the two modes is plotted against the load current in fig. 6. It can be seen that the system efficiency in the efficiency optimization mode is clearly higher than in the current sharing mode, which verifies the effectiveness of the efficiency optimization algorithm.

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A method for optimizing the efficiency of a two-phase interleaved parallel DC-DC converter is characterized by comprising the following steps:

S2, training efficiency data of the two phases under different working conditions to obtain an efficiency prediction model of the two phases of the two-phase interleaved parallel DC-DC converter under any working condition;

2. The method of claim 1, wherein the S1 includes the steps of:

3. The method of claim 1, wherein the S2 includes the steps of:

4. The method of claim 1, wherein the S3 includes the steps of:

S31, expressing the efficiency η_i of each phase of the two-phase interleaved parallel DC-DC converter as a function of the output current I_oi:

η_i = f(I_oi), i = 1, 2

S32, given a set of converter operating conditions, predicting, according to the efficiency prediction model, how the efficiency of each phase varies with the output current under those conditions, and then obtaining by polynomial fitting an explicit expression of the relation between the efficiency of each phase and the output current, namely the efficiency characteristic curve.

5. The method of claim 4, wherein the S4 includes the steps of:

wherein I_oi is the output current of the i-th phase, η_i(I_oi) is the efficiency of the i-th phase, and I_load is the total load current, which is the sum of the two phase output currents;

6. The method of claim 5, wherein the S5 includes the steps of:

the value range of λ_i satisfies:

setting the action space A of the Q-learning reinforcement learning strategy:

wherein

setting the objective function of the Q-learning reinforcement learning strategy as

wherein

is a penalty term,

is

and R is a penalty factor;

setting the reward function r(s, a) of the Q-learning reinforcement learning strategy:

7. An efficiency optimization system for a two-phase interleaved parallel DC-DC converter, comprising: a computer-readable storage medium and a processor;

the processor is used for reading executable instructions stored in the computer readable storage medium and executing the efficiency optimization method of the two-phase interleaved parallel DC-DC converter according to any one of claims 1 to 6.