CN103679292B - Electricity collaborative optimization method for double batteries of intelligent micro power grid

Info

Publication number: CN103679292B
Application number: CN201310695793.9A
Authority: CN
Inventors: 刘德荣; 魏庆来; 徐延才
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Institute of Automation of Chinese Academy of Science
Priority date: 2013-12-17
Filing date: 2013-12-17
Publication date: 2017-01-11
Anticipated expiration: 2033-12-17
Also published as: CN103679292A

Abstract

The invention discloses a collaborative electric energy optimization method for the dual batteries of a smart microgrid. The method comprises: initializing the relevant parameters; starting a global loop and initializing the critic network weights; starting an outer loop, initializing the battery control strategy, and adjusting that strategy according to actual electricity consumption; starting an inner loop, in which the critic neural network weights are iteratively corrected using the collected system state and the adjusted battery control strategy, and the performance of the current battery control strategy is evaluated with the critic network weights; determining whether the inner loop has finished and, if not, repeating it; otherwise determining whether the outer loop has finished and, if not, repeating it; otherwise determining whether the global loop has finished and, if not, repeating it; and, once the search procedure has run to completion, saving the optimal battery control strategy according to the evaluated performance and calculating the electricity cost.

Description

Collaborative electric energy optimization method for the dual batteries of a smart microgrid

Technical field

The invention belongs to the technical field of electric energy optimization for smart grids, and specifically relates to a collaborative optimal control method for the electric energy of dual batteries in a smart microgrid.

Background technology

A storage battery, also called a rechargeable battery, is the energy storage device commonly used at the residential terminals of smart microgrid users. It is an electrochemical device that stores chemical energy and converts it directly into electric energy when needed. A storage battery is designed to be rechargeable, and recharging is achieved through a reversible chemical reaction. Taking cost and other factors into account, lead-acid batteries are mainly used for energy storage. The working process is as follows: during charging, external electric energy regenerates the internal active material, so that electric energy is stored as chemical energy; during discharging, the chemical energy is converted back into electric energy for output. This completes the storage cycle of electric energy.

Battery energy storage is an important part of residential power storage in the smart grid. Storage batteries have a wide operating temperature range, strong charge acceptance, long service life, and easy maintenance. Energy storage batteries are important equipment for ensuring the safe and stable operation of the smart grid: they can provide emergency power to residential users, cut power consumption during peak periods, and reduce the peak-valley load difference of the grid, which makes them an extremely important component of the smart grid. In practical applications, on the one hand, the charging and discharging characteristics of the dual energy storage batteries are analyzed through the working mechanism of the batteries; on the other hand, parameters such as the charging and discharging order and timing of the dual batteries are optimally controlled according to the actual operating conditions of the two loads at the residential end of the smart microgrid. This optimizes the dual-battery electric energy at the residential end, reduces the peak-valley load difference of the grid, improves the efficiency of grid operation, and reduces the user's actual electricity cost. In actual collaborative operation of the dual batteries, however, load changes at the residential end involve subjective human factors and are difficult to predict accurately. In addition, the time span is long, the control effect is not obvious in the short term, and the nonlinearity is severe. These characteristics make it difficult to build an accurate mechanism model for collaborative dual-battery optimization, which poses great difficulty for the collaborative optimal control of battery energy in the smart grid. Therefore, designing an effective collaborative optimal control scheme for dual-battery electric energy based on historical grid operation data, so as to reduce the maximum grid load, promote load balance, improve the flexibility and compatibility of the grid, and reduce users' electricity cost, is a problem that urgently needs to be solved in the development of the smart grid.

Summary of the invention

(1) Technical problem to be solved

The technical problem to be solved by the invention is to use the historical load operation data of two users in a smart grid, real-time electricity price information, and a neural network to build a collaborative optimal control network method for the dual-battery electric energy of a smart microgrid, and to use a self-learning method based on adaptive dynamic programming to achieve collaborative optimal control of the dual-battery electric energy.

(2) Technical solution

The invention proposes a collaborative electric energy optimization method for the dual batteries of a smart microgrid, comprising:

S1: initializing the relevant parameters of the method, including the real-time electricity price, the load data of two end users within a time period, and the parameters of the critic-module neural network;

S2: starting the global loop and initializing the critic network weights randomly within a given range;

S3: starting the outer loop, initializing the battery control strategy, and adjusting the battery control strategy according to the actual electricity consumption;

S4: starting the inner loop, in which the action-dependent heuristic dynamic programming (ADHDP) method is used to iteratively correct the critic neural network weights from the collected system state and the adjusted battery control strategy, and the critic network weights are used to evaluate the performance of the current battery control strategy; determining whether the current inner loop is complete and, if not, returning to step S4 to continue the inner loop; otherwise determining whether the current outer loop is complete and, if not, returning to step S3 to continue the outer loop; otherwise determining whether the current global loop is complete and, if not, returning to step S2 to continue the global loop; otherwise proceeding to the next step;

S5: after the search procedure has run to completion, saving the optimal battery control strategy according to the evaluated performance and calculating the electricity cost.

(3) Beneficial effects

Based on the historical load operation data of two users at the load end and real-time electricity price information, the invention builds a collaborative optimal control network method for dual-battery electric energy and uses adaptive dynamic programming to predict the optimal control strategy for the corresponding time period. This reduces the peak-valley load difference of the grid, improves the flexibility and compatibility of the grid, and reduces the user's electricity cost.

Brief description of the drawings

Fig. 1 is a schematic diagram of the dual-battery structure of the smart microgrid according to the invention;

Fig. 2 is a flowchart of the collaborative optimal control method for dual-battery electric energy in the smart microgrid according to the invention;

Fig. 3 is a block diagram of the implementation of the ADHDP method according to the invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

Fig. 1 is a schematic diagram of the dual-battery structure of the smart microgrid according to the invention. As shown in Fig. 1, the smart microgrid is supplied by the external public grid 1, which is priced in real time and only delivers electric energy to the residential end; energy cannot be fed back to the grid. The electric energy provided by the external public grid 1 can charge energy storage battery 2 and energy storage battery 4, and can also supply residence 3 and residence 5. Energy storage battery 2 and energy storage battery 4 can each also supply the electrical terminals of residence 3 and residence 5. However, at any given moment residence 3 or residence 5 can be powered by only one of the external public grid 1, energy storage battery 2, and energy storage battery 4; energy storage battery 2 and energy storage battery 4 cannot discharge to the same residential terminal at the same time, and the two batteries cannot charge each other. Based on the actual computation results, the controller decides which one or two of the external public grid 1, energy storage battery 2, and energy storage battery 4 supply the electrical terminals of residence 3 and residence 5 at a given moment, and it can also have the external public grid 1 charge the energy storage batteries. Note that Fig. 1 is a simplified schematic. An actual smart microgrid and the users' residential terminals also include other components, such as DC/AC converters, transformers, controllers, and household electrical terminals such as washing machines. These are well known to those skilled in the art and do not affect the control method of the invention, so they are not described further here.

Fig. 2 shows the flowchart of the collaborative optimal control method for dual-battery electric energy in a smart microgrid proposed by the invention. As shown in Fig. 2, the method comprises the following steps, performed in sequence:

S1: initializing the relevant parameters of the method, including the real-time electricity price, the load data of two end users within a time period, and the parameters of the critic-module neural network;

S2: starting the global loop and initializing the critic network weights randomly within a given range;

S3: starting the outer loop, initializing the battery control parameters, and optimizing and adjusting the electric energy stored in the dual batteries according to the initialization data;

S4: starting the inner loop, in which the action-dependent heuristic dynamic programming (ADHDP) method is used with the collected system state and battery control strategy as the input data of the critic and selection modules to correct the critic network weights and evaluate the performance of the current battery control strategy; determining whether the current inner loop is complete and, if not, returning to step S4 to continue the inner loop; otherwise determining whether the current outer loop is complete and, if not, returning to step S3 to continue the outer loop; otherwise determining whether the current global loop is complete and, if not, returning to step S2 to continue the global loop; otherwise proceeding to the next step;

S5: after the search procedure has run to completion, saving the best battery control strategy found and calculating the electricity cost.

Each of the above steps is described below.

The collaborative dual-battery electric energy optimization method of the invention is built on data-based adaptive dynamic programming theory and does not require a specific mathematical model. When processing and analyzing a problem, an adaptive dynamic programming algorithm automatically adjusts its processing method, processing order, and processing parameters according to the characteristics of the data, so that they match the characteristics of the problem. The method exploits this property to adjust and optimize the charging and discharging order and amounts of the dual batteries at the residential end. Using the relevant historical data of the grid and the residential users in the smart microgrid, adaptive dynamic programming can learn automatically and update the structure of the current optimal control system, thereby optimizing and adjusting the electric energy of the dual batteries collaboratively.

According to the requirements of the action-dependent heuristic dynamic programming (ADHDP) algorithm and of the smart microgrid, the parameters must be initialized at the start of the collaborative dual-battery optimization algorithm to prepare the basic data the algorithm needs to run.

Parameter initialization covers the real-time electricity price, the load power of the two residential end users, and the parameters of the critic-module neural network in the adaptive dynamic programming algorithm.

Real-time electricity price: the simplest peak-valley pricing strategy is used here, that is, the price is high during peak periods and low during off-peak periods, to encourage users to avoid consuming electricity at peak times. The real-time price data are derived from real-time electricity price data of some regions of China; the price changes once per hour, and the price data are normalized.

Loads of the two residential users: with reference to the historical residential consumption data collected by the power supply department, the data are analyzed and processed to obtain the load data of two ordinary residences in a community over a given time period. Both load series are normalized so that the adaptive dynamic programming method can use them.

Algorithm parameter initialization: at the start of the algorithm, the following must be initialized for the adaptive dynamic programming method: the storage capacity and initial charge of the dual batteries, the numbers of global, outer-loop, and inner-loop iterations, the discount factor, and the neural network structure. The discount factor is a variable used in the ADHDP algorithm that reflects how strongly system performance at the previous moment influences the system at the current moment; its value range is (0, 1].

The critic network structure is set to 7-20-1, where 7 is the number of input-layer nodes, 20 is the number of hidden-layer nodes, and 1 is the number of output-layer nodes. The number of hidden nodes can be adjusted empirically to obtain the best approximation.

In practical applications, real-time price information can be obtained from the real-time data published by the power department, and the accumulated historical user load data can be used to re-run and re-plan the collaborative optimization algorithm for each time period so that it better matches current conditions; the algorithm learns and adjusts automatically. The real-time price and the two residential load data sets used here are therefore close to actual conditions.

According to the collaborative dual-battery optimization algorithm (that is, the action-dependent heuristic dynamic programming method, ADHDP), the critic network weights must be initialized. Taking both the convergence speed and the stability of the weights into account, the weights W₁ and W₂ are initialized randomly within the range (-0.01, 0.01) inside the global loop. W₁ denotes the weights between the input-layer nodes and the hidden-layer nodes of the critic network, and W₂ the weights between its hidden-layer nodes and output-layer nodes. The inputs of the critic network are the normalized real-time electricity price C_norm(t), the normalized user loads P_{load1_norm}(t) and P_{load2_norm}(t), the normalized real-time stored energy of the two batteries E_{b1_norm}(t) and E_{b2_norm}(t), and the charge/discharge control strategies u₁(t) and u₂(t) of battery 1 and battery 2; the output is the performance index function.

The critic network weights are initialized inside the global loop, so they are re-initialized at the start of every global iteration. This helps the neural network converge while preserving network stability and convergence speed, so that the optimal dual-battery control strategy can be found as quickly as possible.
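The weight initialization described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `init_critic_weights`, the nested-list weight layout, and the `seed` parameter are assumptions introduced here; only the 7-20-1 structure and the (-0.01, 0.01) range come from the text.

```python
import random

def init_critic_weights(n_in=7, n_hidden=20, n_out=1, bound=0.01, seed=None):
    """Draw W1 (hidden x input) and W2 (output x hidden) uniformly from [-bound, bound]."""
    rng = random.Random(seed)  # seed is only for reproducible demos (assumption)
    w1 = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-bound, bound) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2

# One fresh draw per global iteration, as the text describes.
w1, w2 = init_critic_weights(seed=0)
```

Calling the function again at the start of each global iteration gives the critic a new random starting point, which matches the stated purpose of the global loop.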

S3: starting the outer loop, initializing the battery control parameter u(t), and optimizing and adjusting the electric energy stored in the dual batteries according to the initialization data;

The control strategy for a single battery in the smart microgrid takes one of the following values: 1, the battery charges; 0, the battery is idle; -1, the battery supplies the load on its own side (for battery 1, load 1 is the own-side load); -2, the battery supplies the load on the other side (for battery 1, load 2 is the other-side load); -3, the battery supplies both loads at once (load 1 and load 2). Taking battery 1, battery 2, and the two loads into account, all charge/discharge control strategies of the dual batteries are listed in Table 1. During initialization, however, seven combinations, (u₁(t), u₂(t)) ∈ {(-1, -2), (-1, -3), (-2, -1), (-2, -3), (-3, -1), (-3, -2), (-3, -3)}, are infeasible because the discharge strategies of the two batteries are mutually exclusive, and they must be readjusted.

Table 1. Feasible control schemes for the dual batteries
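The seven infeasible combinations can be reproduced from a single rule: no load may be fed by both batteries at the same moment. The sketch below enumerates all 25 pairs under that rule. The helper names `loads_served` and `is_feasible` are hypothetical, and the rule itself is an interpretation of the mutual-exclusion statement in the text rather than a reproduction of Table 1.

```python
from itertools import product

ACTIONS = (1, 0, -1, -2, -3)  # charge, idle, own-side, other-side, both loads

def loads_served(battery, action):
    """Return the set of loads (1 or 2) that a battery discharges into."""
    own, other = (1, 2) if battery == 1 else (2, 1)
    return {-1: {own}, -2: {other}, -3: {own, other}}.get(action, set())

def is_feasible(u1, u2):
    """A pair is feasible when the two batteries never feed the same load."""
    return not (loads_served(1, u1) & loads_served(2, u2))

all_pairs = list(product(ACTIONS, ACTIONS))
infeasible = [pair for pair in all_pairs if not is_feasible(*pair)]
feasible = [pair for pair in all_pairs if is_feasible(*pair)]
```

Under this rule, `infeasible` contains exactly the seven pairs listed in the text and `feasible` contains the remaining 18.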

The control strategy of the dual batteries at the residential end is initialized inside the outer loop: each battery's control value is drawn at random from {1, 0, -1, -2, -3}. The outer loop thus re-initializes the dual-battery control strategy at random every time it starts, so that the search covers as many of the possible control strategies as it can. Tables 2 and 3 illustrate the adjustment of the charge/discharge control strategy of the energy storage batteries for u₁(t) ∈ {1, 0, -1, -2, -3}; when u₂(t) conflicts with u₁(t), u₁(t) takes priority by default.

Table 2. Control strategy adjustment when u₁(t) ∈ {1, 0, -1, -2}

Table 3. Control strategy adjustment when u₁(t) = -3

After the battery control strategy has been initialized, the strategy for the fixed time period is adjusted according to the user's actual electricity consumption, which includes the residential consumption data in that period and the stored-energy state of the two batteries. The charge/discharge mode and the charged or discharged energy of the two batteries can be computed at the same time. The adjustment after initialization is illustrated for energy storage battery 1; see Tables 2 and 3 for the details.

In this process, power must be conserved among the loads, the batteries, and the grid:

P_load1(t)+P_load2(t)=P_battery1(t)+P_battery2(t)+P_grid(t),

where P_load1(t) is the real-time power of load 1 (all electrical appliances in residence 1), P_load2(t) is the real-time power of load 2 (all electrical appliances in residence 2), P_battery1(t) is the real-time charge/discharge power of battery 1, P_battery2(t) is the real-time charge/discharge power of battery 2, and P_grid(t) is the real-time power drawn from the grid.
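As a small worked illustration of the balance equation, the grid power can be solved for once the loads and battery powers are known. The sign convention is an assumption made here so that the equation holds as written: battery power is positive when the battery discharges into a load and negative when it charges. The function name `grid_power` and the numbers are illustrative only.

```python
def grid_power(p_load1, p_load2, p_battery1, p_battery2):
    """Solve the power balance for P_grid (battery power > 0 means discharging)."""
    p_grid = p_load1 + p_load2 - p_battery1 - p_battery2
    # The text states that energy cannot flow back to the public grid.
    if p_grid < -1e-9:
        raise ValueError("batteries would push power back into the grid")
    return max(p_grid, 0.0)

# Battery 1 discharges 1.0 unit, battery 2 charges at 0.5 unit.
p_grid = grid_power(2.0, 1.5, 1.0, -0.5)
```

In this example the grid has to cover the part of the loads that battery 1 does not supply, plus the charging power of battery 2.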

S4: starting the inner loop and collecting the system state, which comprises the normalized real-time electricity price C_norm(t), the normalized user loads P_{load1_norm}(t) and P_{load2_norm}(t), and the normalized real-time stored energy of the two batteries E_{b1_norm}(t) and E_{b2_norm}(t); using the action-dependent heuristic dynamic programming (ADHDP) method with the collected system state and the battery control strategy as the input data of the critic network to correct the critic network weights and evaluate the performance of the current battery control strategy; determining whether the current inner loop is complete and, if not, returning to step S4 to continue the inner loop; otherwise determining whether the current outer loop is complete and, if not, returning to step S3 to continue the outer loop; otherwise determining whether the current global loop is complete and, if not, returning to step S2 to continue the global loop; otherwise proceeding to the next step;

The S3 stage at the start of the outer loop yields the adjusted dual-battery control strategy, the charge/discharge mode and energy data of the batteries, the utility function, and the system state and battery control strategy matrix. The utility function is obtained from the battery control strategy.

Fig. 3 shows the block diagram of the implementation of the ADHDP method in the invention. As shown in Fig. 3, this stage uses the ADHDP method to train the critic network module on the previously obtained network input data, that is, the system state and battery control strategy matrix, correcting its network weights, and then uses the critic network to evaluate the performance of the current battery control strategy.

According to Bellman's principle of optimality, dynamic programming is used to find the optimal control strategy for complex nonlinear optimization problems. For the collaborative dual-battery electric energy optimization problem of the smart microgrid, the system can be written as

x(t+1) = F[x(t), u(t)],

where x(t) is the state vector of the system at time t (comprising the normalized real-time electricity price C_norm(t), the normalized user loads P_{load1_norm}(t) and P_{load2_norm}(t), and the normalized real-time stored energy of the two batteries E_{b1_norm}(t) and E_{b2_norm}(t)), u(t) is the control strategy of the two batteries at time t, and F is the transition function that maps the current system state x(t) to the next system state x(t+1) under the battery control strategy u(t).

The performance index function used to evaluate the quality of the system is

J[x(t)] = \sum_{k=t}^{\infty} \gamma^{k-t} U[x(k), u(k)],

where U is the utility function of the system, which represents the cost incurred by the system's electricity consumption at a given moment, and γ is the discount factor with value range (0, 1]. The goal of dynamic programming is to find the optimal battery control strategy that minimizes the performance index function. According to Bellman's principle, the optimal performance index function is

J^*[x(t)] = \min_{u(t)} \left( U[x(t), u(t)] + \gamma J^*[x(t+1)] \right).

The optimal control strategy at time t is therefore

u^*(t) = \arg\min_{u(t)} \left( U[x(t), u(t)] + \gamma J^*[x(t+1)] \right).

In the concrete collaborative dual-battery electric energy optimization method for the smart microgrid, the parameters are expressed as follows.

The system state vector is

x(t)=[C_norm(t), P_{load1_norm}(t), P_{load2_norm}(t), E_{b1_norm}(t), E_{b2_norm}(t)],

where C_norm(t) is the normalized real-time electricity price, P_{load1_norm}(t) and P_{load2_norm}(t) are the normalized user load 1 and user load 2, and E_{b1_norm}(t) and E_{b2_norm}(t) are the normalized real-time stored energy of battery 1 and battery 2.

The control strategy of the energy storage batteries is

u(t)=(u₁(t), u₂(t)),

where u₁(t) and u₂(t) are the charge/discharge control strategies of battery 1 and battery 2, respectively.

The utility function is computed as

U(t)=c(t)(P_load1(t)+P_load2(t)+E_{battery1_change}(t)+E_{battery2_change}(t))/U_max,

where c(t) is the real-time electricity price at the current time, P_load1(t) and P_load2(t) are user load 1 and user load 2, and E_{battery1_change}(t) and E_{battery2_change}(t) are the energy charged into or discharged from energy storage battery 1 and battery 2 during the current time period, negative when discharging and positive when charging. U_max is a preset maximum of the utility function.

From the performance index function of the system it follows that

J[x(t)] = U[x(t), u(t)] + γJ[x(t+1)].
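For readers who want the intermediate step, the recursion follows from the infinite-sum definition of J[x(t)] by splitting off the k = t term and factoring one γ out of the remainder. The derivation below uses only the symbols already defined in the text.

```latex
\begin{aligned}
J[x(t)] &= \sum_{k=t}^{\infty} \gamma^{k-t}\, U[x(k), u(k)] \\
        &= U[x(t), u(t)] + \sum_{k=t+1}^{\infty} \gamma^{k-t}\, U[x(k), u(k)] \\
        &= U[x(t), u(t)] + \gamma \sum_{k=t+1}^{\infty} \gamma^{k-(t+1)}\, U[x(k), u(k)] \\
        &= U[x(t), u(t)] + \gamma\, J[x(t+1)].
\end{aligned}
```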

Therefore, the goal of the critic network in the ADHDP scheme of the collaborative dual-battery optimization method is to minimize the critic network's output error through training. Because J[x(t+1)] is the performance index function at the next moment, it is difficult to obtain. To make the problem tractable, a new performance index function Q(t) = J[x(t+1)] is defined and the utility function U[x(t), u(t)] is abbreviated as U(t), which gives

Q(t-1)=U(t)+γQ(t).

It can be seen that the performance index function Q(t) in this formula is related only to the utility function at time t and the performance index function at the previous moment, and not to the performance index function at time t+1. The critic network is used below to approximate Q(t).

The prediction error of the critic network is defined as

e_c(t)=γQ(t)+U(t)-Q(t-1),

where Q(t) is the output of the critic network at time t, Q(t-1) is the output of the critic network at time t-1, and γ is the discount factor.

The error function that the critic network minimizes is

E_c(t) = \frac{1}{2} e_c^2(t),

and the weight update rule of the critic network follows from gradient descent:

w_c(t+1) = w_c(t) + \Delta w_c(t),

\Delta w_c(t) = l_c(t) \left[ -\frac{\partial E_c(t)}{\partial w_c(t)} \right],

\frac{\partial E_c(t)}{\partial w_c(t)} = \frac{\partial E_c(t)}{\partial Q(t)} \frac{\partial Q(t)}{\partial w_c(t)},

where w_c(t) is the critic network weight at time t, Δw_c(t) is its change, and l_c(t) is the learning rate of the critic network at time t, with l_c(t) > 0.
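The gradient-descent rule above can be sketched for a single sample with a 7-20-1 critic that has a tanh hidden layer and a linear output. This is a hedged illustration rather than the patented training code: the function names, the fixed learning rate `lr`, the discount `gamma=0.95`, the sample values, and the choice to treat Q(t-1) as a stored constant when differentiating are all assumptions made here.

```python
import math
import random

def critic_forward(w1, w2, z):
    """Return Q and the hidden activations for input z = state + control."""
    hidden = [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in w1]
    q = sum(w * h for w, h in zip(w2, hidden))  # linear output layer
    return q, hidden

def critic_update(w1, w2, z_t, q_prev, utility_t, gamma=0.95, lr=0.1):
    """One gradient-descent step on E_c = 0.5 * e_c**2 with Q(t-1) held fixed."""
    q_t, hidden = critic_forward(w1, w2, z_t)
    e_c = gamma * q_t + utility_t - q_prev   # e_c(t) = gamma*Q(t) + U(t) - Q(t-1)
    de_dq = gamma * e_c                      # dE_c/dQ(t)
    new_w2 = [w - lr * de_dq * h for w, h in zip(w2, hidden)]
    new_w1 = [[w - lr * de_dq * w2j * (1.0 - h * h) * zi for w, zi in zip(row, z_t)]
              for row, w2j, h in zip(w1, w2, hidden)]
    return new_w1, new_w2, e_c

rng = random.Random(0)
w1 = [[rng.uniform(-0.01, 0.01) for _ in range(7)] for _ in range(20)]
w2 = [rng.uniform(-0.01, 0.01) for _ in range(20)]
z = [0.6, 0.4, 0.3, 0.5, 0.5, 1.0, -1.0]    # illustrative x(t) followed by (u1, u2)

w1_new, w2_new, e_before = critic_update(w1, w2, z, q_prev=0.0, utility_t=0.5)
_, _, e_after = critic_update(w1_new, w2_new, z, q_prev=0.0, utility_t=0.5)
```

Evaluating the error again on the same sample after the update gives a slightly smaller magnitude, which is the behaviour the gradient-descent rule is meant to produce.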

As the number of training iterations increases, the prediction error e_c(t) of the critic network gradually approaches zero. The relation between the performance index functions Q(t-1) and Q(t) can therefore be approximated as

Q(t-1)≈U(t)+γQ(t).

The critic network has a three-layer structure consisting of an input layer, a hidden layer, and an output layer. The activation function between the input layer and the hidden layer is the bipolar sigmoid function, and the activation function between the hidden layer and the output layer is a linear function. Let the number of hidden-layer neurons be L and the number of output-layer neurons be M. Given the system state and battery control strategy matrix statecontrol and the utility function matrix U over a fixed time period, a first forward pass through the network gives the output

Q=purline(W₂*σ(W₁*statecontrol)),

where

σ(W₁*statecontrol)∈R^L,

\sigma(z_i) = \frac{e^{z_i} - e^{-z_i}}{e^{z_i} + e^{-z_i}}, \quad i = 1, \ldots, L,

purline(W₂*σ(W₁*statecontrol))∈R^M, purline(z)=z.

Here Q is the matrix of the performance index values Q(t) at all moments in the fixed time period, statecontrol=[x, u], x contains the state vectors x(t) at all moments in the period, u contains the control strategies u(t) at all moments in the period, and U is the utility function matrix of U(t) at all moments.

As shown above, the performance index function at the previous moment t-1 can be computed as

Q(t-1)=U(t)+γQ(t).

The target output of the critic network is then Q(t-1), and its input matrix is statecontrol. With the input and output data thus determined, one round of weight training is performed on the critic network, yielding the trained weights W₁ and W₂, where W₁ denotes the weights between the input-layer nodes and the hidden-layer nodes of the critic network and W₂ the weights between its hidden-layer nodes and output-layer nodes.

The two weights W₁ and W₂ are then passed to the selection module. Using these weights and the system state x(t) (the normalized real-time electricity price, the normalized real-time power of user load 1 and load 2, and the normalized real-time stored energy of the two batteries), the selection module evaluates every possible battery control strategy combination u and obtains the corresponding selection-module function output. The magnitude of this output indicates how good each control strategy currently is. For example, when all battery control strategies are evaluated with the current system state and the critic network weights, the control strategy with the smallest selection-module output is the current optimal battery control strategy. The selection module also has L hidden-layer neurons and M output-layer neurons.

F_S=purline(W₂*σ(W₁*statecontrol)),

where

σ(W₁*statecontrol)∈R^L,

\sigma(z_i) = \frac{e^{z_i} - e^{-z_i}}{e^{z_i} + e^{-z_i}}, \quad i = 1, \ldots, L,

purline(W₂*σ(W₁*statecontrol))∈R^M, purline(z)=z, and F_S is the selection-module function.
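The selection step can be sketched as an exhaustive search: evaluate the network for every candidate pair (u₁, u₂) appended to the current state and keep the pair with the smallest output. The sketch below is illustrative only. Skipping the seven mutually exclusive pairs up front is a simplification assumed here (the text instead adjusts conflicting strategies afterwards), and the toy weights are chosen so that the result is easy to check by hand.

```python
import math
from itertools import product

ACTIONS = (1, 0, -1, -2, -3)
# The seven mutually exclusive pairs listed in the text.
INFEASIBLE = {(-1, -2), (-1, -3), (-2, -1), (-2, -3), (-3, -1), (-3, -2), (-3, -3)}

def selection_output(w1, w2, z):
    """F_S = W2 * tanh(W1 * z) for a single-output network."""
    hidden = [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))

def select_control(w1, w2, state):
    """Return the feasible (u1, u2) with the smallest selection-module output."""
    best_u, best_q = None, math.inf
    for u in product(ACTIONS, ACTIONS):
        if u in INFEASIBLE:
            continue
        q = selection_output(w1, w2, list(state) + list(u))
        if q < best_q:
            best_u, best_q = u, q
    return best_u, best_q

# Toy weights: a single hidden unit that only looks at u1 + u2.
toy_w1 = [[0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.1]]
toy_w2 = [1.0]
best_u, best_q = select_control(toy_w1, toy_w2, [0.5, 0.4, 0.3, 0.6, 0.6])
```

With these toy weights the output grows with u₁ + u₂, so the search returns the feasible pair with the smallest sum, (-2, -2).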

From all selection-module outputs obtained, the smallest output is identified together with its corresponding control strategy u. According to the control strategy u and the two actual user load data sets, the charging and discharging order and amounts of the two batteries of the smart microgrid are planned and adjusted, and the user's cost is calculated as

Cost=(P_load1+P_load2+E_{battery1_change}+E_{battery2_change})*C,

where C is the matrix of all real-time electricity prices in the fixed time period, and E_{battery1_change} and E_{battery2_change} are the energy charged into or discharged from energy storage battery 1 and battery 2, negative when discharging and positive when charging.
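Read as a sum over the time steps of the period, the cost formula can be sketched as below. The function name `total_cost`, the three-hour horizon, and all numbers are made up for illustration; the sign convention for the battery terms follows the text.

```python
def total_cost(prices, load1, load2, e_b1_change, e_b2_change):
    """Sum price * (loads + battery energy change) over all time steps."""
    return sum(
        c * (p1 + p2 + d1 + d2)
        for c, p1, p2, d1, d2 in zip(prices, load1, load2, e_b1_change, e_b2_change)
    )

prices = [0.3, 0.8, 0.5]       # off-peak, peak, mid (illustrative)
load1 = [1.0, 2.0, 1.0]
load2 = [1.0, 1.0, 2.0]
e_b1 = [1.0, -1.0, 0.0]        # charge off-peak, discharge at peak
e_b2 = [0.5, -0.5, 0.0]

cost_with_batteries = total_cost(prices, load1, load2, e_b1, e_b2)
cost_without = total_cost(prices, load1, load2, [0.0] * 3, [0.0] * 3)
```

In this made-up example, shifting 1.5 units of energy from the peak hour to the off-peak hour lowers the bill from 4.5 to 3.75, which is the effect the method aims for.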

Setting cycle-index according to interior circulation, if circulation not yet terminates in now, continues to return S4 step section start and continues Continuous network weight of passing judgment on is trained, if interior circulation is over, now the inventive method may proceed to whether judge current outer circulation Terminate.If outer circulation not yet terminates, then return to again the double cell control strategy of intelligent micro-grid be initialized at S3 Assignment, is optimized adjustment according to initialization result to the electric energy of two pieces of battery storage subsequently；If outer circulation terminates the most, then originally Inventive method may proceed to judge whether current global loops terminates, if global loops not yet terminates, can return at S2 commenting Sentence two weights W of modular neural network₁、W₂Carry out random initializtion, start to calculate next time；If global loops is the most Terminate, then the method may proceed to carry out next step S5.

After each round of critic network training ends in the inner loop, the control strategy of the dual batteries of the intelligent micro-grid is recomputed from the weights of this neural network, adjusted according to the actual user load and battery energy-storage data, and the cost of the dual-battery user is calculated from the result. Throughout the repeated training of the network, the user cost value Cost of each training result is calculated and compared with the minimum Cost_min currently recorded by the system, to determine whether it is better than the smallest user cost known so far. If it is less than the recorded value, the recorded minimum is replaced by the current value; otherwise the record is left unchanged. In this way the program keeps searching for the neural network data that give the best battery control strategy and the optimal user cost, until network training ends.
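The record-keeping described above amounts to a running minimum. A minimal sketch, assuming the record starts at positive infinity and that the strategy and network weights are stored together with the cost (the class `BestRecord` and its method `offer` are hypothetical names):

```python
import math


class BestRecord:
    """Keeps the smallest user cost seen so far and the data that produced it."""

    def __init__(self):
        self.cost_min = math.inf   # assumption: no cost has been recorded yet
        self.strategy = None
        self.weights = None

    def offer(self, cost, strategy, weights):
        """Update the record only if cost is strictly below the recorded minimum."""
        if cost < self.cost_min:
            self.cost_min = cost
            self.strategy = strategy
            self.weights = weights
            return True
        return False


record = BestRecord()
updates = [
    record.offer(5.0, "strategy_a", "weights_a"),
    record.offer(7.0, "strategy_b", "weights_b"),  # worse, record unchanged
    record.offer(3.0, "strategy_c", "weights_c"),  # better, record replaced
]
```

Calling `offer` once per training result inside the inner loop leaves the best strategy and its cost in `record` when training ends.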

The greatest advantage of data-based control algorithms is model-free control. Real systems exhibit serious nonlinearity, uncertainty, time variation and similar factors. When an accurate mathematical model cannot be obtained, model-free adaptive optimal control can play to its strengths: it overcomes the contradiction between the theoretical model and the practical application, lowers the requirements placed on the model, and achieves a better overall control effect.

The specific embodiments described above explain the objects, technical solutions and beneficial effects of the present invention in further detail. It should be understood that the foregoing is merely a description of specific embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, improvement or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A dual-battery electric energy collaborative optimization method for an intelligent micro-grid, comprising:

S1, initializing the relevant parameters of the dual-battery electric energy collaborative optimization method, including the real-time electricity price, the load data of two end users within the time period, and the parameters of the critic neural network;

S2, starting the global loop and initializing the critic network weights, using a random method within a preset range;

S3, starting the outer loop, initializing the battery control strategy, and adjusting said battery control strategy according to the actual electricity consumption;

S4, starting the inner loop, in which an action-dependent heuristic dynamic programming method is used: the collected system state and the adjusted battery control strategy are used to iteratively correct the critic neural network weights, and the critic network weights are used to evaluate the performance of the current battery control strategy; determining whether the current inner loop is complete and, if not, returning to step S4 to continue the inner loop; otherwise determining whether the current outer loop is complete and, if not, returning to step S3 to continue the outer loop; otherwise determining whether the current global loop is complete and, if not, returning to step S2 to continue the global loop; otherwise proceeding to the next step;

S5, after the search program has run to completion, saving the optimal battery control strategy according to the evaluated performance and calculating the electricity cost.

2. the method for claim 1, it is characterised in that in step sl, Spot Price data are with reference to China partly The Spot Price information in district and the Spot Price that obtains, two user load data messages are according to residential building electricity consumption historical data Obtain.

3. the method for claim 1, it is characterised in that in step sl, the structure arranging judge network is 7-20-1, Wherein 7 for passing judgment on network input layer number of nodes, and 20 for passing judgment on network hidden node quantity, and 1 for passing judgment on network output layer node Quantity, and its hidden node quantity can rule of thumb be adjusted obtaining optimal Approximation effect.

4. the method for claim 1, it is characterised in that in step s 2, with random in the range of (-0.01,0.01) Initial method obtains passes judgment on network weight.

5. the method for claim 1, it is characterised in that in step s 4, the input data of described judge network and defeated Go out data to include:

Input data: the normalized real-time electricity price C_norm(t), the normalized user loads P_{load1_norm}(t) and P_{load2_norm}(t), the normalized real-time energies of the two batteries E_{b1_norm}(t) and E_{b2_norm}(t), and the battery control strategies u₁(t) and u₂(t) of the two batteries;

Output data: according to the Bellman equation, the data Q(t-1) of the current moment is calculated from the output Q(t) of the critic network at the previous moment, the utility function U(t) and the discount factor γ, using the following formula:

Q(t-1) = U(t) + γ Q(t);

where the utility function U(t) represents the cost incurred by the electricity consumed at the current time t.

6. the method for claim 1, it is characterised in that in step S4, utility function computing formula is:

U(t) = c(t) (P_load1(t) + P_load2(t) + E_{battery1_change}(t) + E_{battery2_change}(t)) / U_max;

where c(t) is the real-time electricity price at the current time, P_load1(t) and P_load2(t) are the load powers of user 1 and user 2, respectively, and E_{battery1_change}(t) and E_{battery2_change}(t) are the energies charged into the two batteries during the current time period, negative when discharging and positive when charging; U_max is the preset maximum value of the utility function.

7. the method for claim 1, it is characterised in that in step S4, the right value update rule passing judgment on network is as follows:

w_c(t+1)=w_c(t)+Δw_c(t)

{Δw}_{c} (t) = l_{c} (t) [- \frac{\partial E_{c} (t)}{\partial w_{c} (t)}]

\frac{\partial E_{c} (t)}{\partial w_{c} (t)} = \frac{\partial E_{c} (t)}{\partial Q (t)} \frac{\partial Q (t)}{\partial w_{c} (t)}

E_{c} (t) = \frac{1}{2} {e_{c}}^{2} (t)

where w_c(t) is the critic network weight at time t, Δw_c(t) is its weight change, and l_c(t) is the learning rate of the critic network at time t, with l_c(t) > 0; Q(t) is the output of the critic network, e_c(t) is the prediction error of the critic network, and E_c(t) is the error function to be minimized for the critic network.

8. the method for claim 1, it is characterised in that pass judgment on present battery control strategy performance in step S4 concrete For: according to described judge network, utilize the neural network weight obtained that all of battery control strategy is evaluated, it is thus achieved that Optimum battery control strategy under Current Situation of Neural Network weights and the evaluation of estimate of correspondence thereof.

9. the method for claim 1, it is characterised in that step S4 also includes, the optimum that the most current interior circulation obtains Evaluation of estimate corresponding to battery control strategy and current optimal value, if the optimum battery control strategy that in current, circulation obtains is corresponding Evaluation of estimate be better than optimal value, then update described optimal value.