WO2018145498A1 - Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm - Google Patents

Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm

Info

Publication number
WO2018145498A1
WO2018145498A1 (PCT/CN2017/110899, CN2017110899W)
Authority
WO
WIPO (PCT)
Prior art keywords
controller
action
value
stator
control
Prior art date
Application number
PCT/CN2017/110899
Other languages
English (en)
French (fr)
Inventor
余涛
程乐峰
李靖
王克英
Original Assignee
华南理工大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华南理工大学 (South China University of Technology)
Publication of WO2018145498A1 publication Critical patent/WO2018145498A1/zh

Classifications

    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02PCONTROL OR REGULATION OF ELECTRIC MOTORS, ELECTRIC GENERATORS OR DYNAMO-ELECTRIC CONVERTERS; CONTROLLING TRANSFORMERS, REACTORS OR CHOKE COILS
    • H02P21/00Arrangements or methods for the control of electric machines by vector control, e.g. by control of field orientation
    • H02P21/14Estimation or adaptation of machine parameters, e.g. flux, current or voltage

Definitions

  • the invention relates to self-correcting control of a doubly-fed induction wind turbine, and in particular to a self-correcting control method for a doubly-fed induction wind turbine based on a Reinforcement Learning (RL) algorithm.
  • RL: Reinforcement Learning
  • Variable-speed constant-frequency doubly-fed power generation is a commonly used scheme for wind power generation. Its generator is a doubly-fed induction generator (DFIG). When the unit operates below the rated wind speed, maximum capture of wind energy can be achieved by adjusting the rotor speed of the generator to maintain the optimum tip speed ratio.
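For context, the tip-speed-ratio relation behind this maximum-capture strategy can be written as follows (a standard formulation assumed here for illustration; it is not stated explicitly in this text), where ωr is the rotor speed, R the blade radius, v the wind speed and Cp the power coefficient:

```latex
\lambda = \frac{\omega_r R}{v}, \qquad
P_{\mathrm{opt}} = \tfrac{1}{2}\,\rho\,\pi R^{2}\,C_{p,\max}\!\left(\lambda_{\mathrm{opt}}\right) v^{3}
```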
  • the control system often adopts vector control based on stator field orientation to realize decoupling control of generator active and reactive power.
  • the object of the present invention is to overcome the problems of the prior art and to provide a self-correction control method for a doubly-fed induction wind turbine based on a reinforcement learning algorithm that can quickly and automatically optimize the output of the wind turbine control system, not only achieving maximum tracking of wind energy but also providing good dynamic performance and significantly enhanced robustness and adaptability of the control system.
  • Self-correcting control method for a doubly-fed induction wind turbine based on a reinforcement learning algorithm: an RL controller is added to the PI controller in a PI-control-based vector control system to dynamically correct the output of the PI controller; the RL controller includes an RL-P controller and an RL-Q controller, which correct the active and reactive power control signals respectively;
  • the self-correction control method comprises the following steps:
  • the RL-P controller and the RL-Q controller respectively sample the active power error value ΔP and the reactive power error value ΔQ; the RL-P controller and the RL-Q controller respectively determine the interval sk to which the power error values ΔP and ΔQ belong;
  • for the RL-P controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value iqs* of the stator q-axis current, that is, the control signal of the active power;
  • for the RL-Q controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value ids* of the stator d-axis current, that is, the control signal of the reactive power;
  • the RL controller obtains the immediate reward value rk from the reward function; the reward function is designed as a negative weighted sum of two squared terms (the formula is given in the description);
  • in that formula, the pointer value is the index of the kth action value α in the action set A, and μ1 and μ2 are the weights that balance the two squared terms; their values are all obtained through a large number of simulation experiments;
  • α and γ are discount factors, and their values are likewise obtained through a large number of simulation experiments;
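A minimal sketch of such a reward, assuming — since the exact formula appears only as an image — that the two squared terms are the power error and the selected action value:

```python
def reward(delta_p: float, action: float, mu1: float = 0.001, mu2: float = 0.001) -> float:
    """Hypothetical immediate reward: negative weighted sum of squared terms.

    The patent gives the reward formula only as an image; here the squared power
    error and the squared action value are assumed as the two balanced terms.
    """
    return -(mu1 * delta_p ** 2 + mu2 * action ** 2)
```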
  • the invention provides a self-correcting control architecture, that is, an RL controller is added to the PI controller in a PI-control-based vector control system to dynamically correct the output of the PI controller, wherein the RL-P and RL-Q controllers correct the active and reactive power control signals respectively.
  • the present invention has the following advantages:
  • the present invention proposes a self-tuning control method for a doubly-fed induction wind turbine based on a reinforcement learning algorithm.
  • the method introduces a reinforcement learning control algorithm, which is insensitive to the mathematical model and operating state of the controlled object and whose self-learning ability gives it strong adaptability and robustness to parameter variations or external disturbances.
  • the method is simulated on the Matlab/Simulink simulation platform. The simulation results show that the self-correcting controller can quickly and automatically optimize the output of the wind turbine control system, which not only achieves maximum tracking of wind energy but also has good dynamic performance and significantly enhances the robustness and adaptability of the control system.
  • the control strategy of the present invention does not need to change the structure and parameters of the original PI controller, and only needs to add a self-correction module, and the engineering implementation is very simple.
  • since the control signal of the RL controller is a discrete action value, it can easily cause overshoot.
  • in follow-up research, fuzzy control may be considered to fuzzify the input and output signals.
  • FIG. 1 is a schematic diagram of a reinforcement learning system of the present invention
  • FIG. 2 is a block diagram of self-tuning control of a doubly-fed wind power generation system according to the present invention
  • FIG. 3 is a flow chart of self-correction learning of a doubly-fed induction wind turbine based on a reinforcement learning algorithm
  • Figure 5 is the RL-Q controller control signal during reactive power regulation in the embodiment
  • Figure 9 is the reactive power curve during active power regulation in the embodiment.
  • Figure 11 is the reactive power curve when the parameters change during the disturbance analysis in the embodiment.
  • Figure 13 is the RL-Q controller control signal when the parameters change during the disturbance analysis in the embodiment.
  • the doubly-fed induction wind power generation system has a complex structure, is significantly affected by parameter changes and external disturbances, and is nonlinear, time-varying and strongly coupled; if only traditional vector control is used, it is difficult to meet the control system's requirements for high adaptability and high robustness.
  • on the basis of traditional vector control, the present invention proposes a self-correcting control method for a doubly-fed induction wind turbine based on a reinforcement learning (RL) algorithm.
  • this method introduces the Q-learning algorithm as the core reinforcement learning algorithm, which can quickly and automatically optimize the output of the PI controller online; after the reinforcement-learning self-correction control is introduced, the ability of the original system to capture maximum wind energy is maintained, while its dynamic performance is improved and its robustness and adaptability are enhanced.
  • the meanings of the variables are as follows: P: active power; Q: reactive power; Uqs: q-axis component of the stator voltage vector; Iqs: q-axis component of the stator current vector; Us: stator voltage vector magnitude; ids: d-axis component of the stator current.
  • the transfer function of the power controlled by the stator current can be obtained from formula (7).
  • idr: the d-axis component of the rotor current
  • iqr: the q-axis component of the rotor current
  • Ls: stator inductance
  • Lm: mutual inductance between the stator and the rotor
  • ids: the d-axis component of the stator current
  • iqs: the q-axis component of the stator current
  • ψs: the stator flux vector magnitude
  • ψdr: the d-axis component of the rotor flux vector
  • ψqr: the q-axis component of the rotor flux vector
  • ψs: the stator flux vector magnitude
  • Lm: the mutual inductance between the stator and the rotor
  • Ls: stator inductance
  • Lr: rotor inductance
  • idr: d-axis component of rotor current
  • iqr: q-axis component of rotor current
  • udr: the d-axis component of the rotor voltage
  • uqr: the q-axis component of the rotor voltage
  • idr: the d-axis component of the rotor current
  • iqr: the q-axis component of the rotor current
  • ψs: stator flux vector magnitude
  • Rr: rotor resistance
  • p: differential operator
  • ωs: slip electrical angular velocity.
  • the transfer function of the stator current controlled by the rotor voltage can be obtained from equations (8), (9) and (10).
  • a PI-control-based vector control system with stator flux orientation can be designed for the doubly-fed induction wind power generation system.
  • the self-correction control method of the present invention adds an RL controller to the PI controller in the above-designed system, and uses the superposition of the output signals of the two controllers as the power control signal.
  • the self-tuning controller design based on reinforcement learning.
  • the reinforcement learning (RL) algorithm is the learning by a system of a mapping from environmental states to actions, a learning process of trial and evaluation. This can be described using FIG. 1.
  • the Agent selects an action to act on the environment (i.e., the system) according to the learning algorithm, causing a change of the environmental state s; the environment then feeds back an immediate reinforcement signal (a reward or penalty) to the Agent, and the Agent selects the next action according to the reinforcement signal and the new environmental state s'.
  • the learning principle of RL is: if a certain decision behaviour (action) of the Agent improves the reinforcement signal, the tendency to produce this decision behaviour is strengthened. In recent years, RL theory has produced notable application results in power systems in fields such as dispatch, reactive power optimization and electricity markets.
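This interaction can be sketched generically as follows (env and agent are assumed interfaces standing for the generation system and the RL controller; they are not defined in the patent):

```python
def run_episode(env, agent, steps: int) -> None:
    """Generic RL interaction loop: act, observe reward and next state, learn."""
    state = env.reset()
    for _ in range(steps):
        action = agent.select_action(state)              # sample from the action probabilities
        next_state, reward = env.step(action)            # environment feeds back an immediate reward
        agent.update(state, action, reward, next_state)  # e.g. a Q-learning update
        state = next_state
```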
  • Figure 1 is a schematic diagram of the reinforcement learning system.
  • the Q-learning algorithm is a reinforcement learning algorithm that improves the control strategy from a long-term perspective through trial and error and interaction with the environment.
  • one of its salient features is its independence from the model of the controlled object.
  • the purpose of Q-learning is to estimate the Q value of the optimal control strategy.
  • let Qk denote the kth iteration value of the optimal value function Q*; the Q value is updated according to iteration formula (11):
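Formula (11) itself appears only as an image in this extract; a plausible reconstruction, assuming the standard tabular Q-learning update with the factors α and γ named above, is:

```latex
Q_{k+1}(s_k, a_k) = Q_k(s_k, a_k)
  + \alpha \left[ r_k + \gamma \max_{a \in A} Q_k(s_{k+1}, a) - Q_k(s_k, a_k) \right]
```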
  • the action selection strategy is the key to the Q learning control algorithm.
  • the policy under which the Agent selects the action with the highest Q value in state s is called the greedy policy p*, and the corresponding action is called the greedy action.
  • if the Agent selects the action with the highest Q value at every iteration, it will converge to a local optimum, because the same action chain is always executed and other actions are never explored.
  • the present invention utilizes a tracking algorithm to design an action selection strategy.
  • the algorithm is based on a probability distribution: at initialization, every feasible action in each state is given an equal probability of being selected; as the iteration progresses, the probabilities change as the Q-value table changes.
  • the update formula is as follows:
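The update formula likewise appears only as an image; the sketch below assumes the standard pursuit-style update, in which the greedy action's selection probability is pushed toward 1 at rate β while all other probabilities are scaled down:

```python
import numpy as np

def pursuit_update(p: np.ndarray, greedy_idx: int, beta: float) -> np.ndarray:
    """Pursuit (tracking) update of the action-probability vector for one state.

    p          -- current selection probabilities over the action set A
    greedy_idx -- index of the greedy action (highest Q value in this state)
    beta       -- action search speed, 0 < beta < 1
    """
    p = p * (1.0 - beta)          # shrink every action's probability
    p[greedy_idx] += beta         # move the greedy action's probability toward 1
    return p                      # still sums to 1
```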
  • the existing doubly-fed induction wind turbine control system, constructed with fixed-gain PI controllers, suffers reduced control performance when the operating conditions of the system change.
  • the invention proposes a self-correcting control architecture; Fig. 2 is the self-correcting control block diagram of the doubly-fed wind power generation system.
  • An RL controller is added to the original PI controller to dynamically correct the output of the PI controller.
  • the RL controller includes an RL-P controller and an RL-Q controller, wherein the RL-P controller and the RL-Q controller correct the active and reactive power control signals respectively.
  • the input value of the RL-P controller is the active power error value ΔP; according to the action probability distribution obtained by the Q-learning algorithm, an action αk is selected and output, and this action αk is added to the output signal of the PI controller by an adder to obtain the given value iqs* of the stator q-axis current, that is, the control signal of the active power;
  • the input value of the RL-Q controller is the reactive power error value ΔQ; according to the action probability distribution obtained by the Q-learning algorithm, an action αk is selected and output, and this action αk is added to the output signal of the PI controller by an adder to obtain the given value ids* of the stator d-axis current, that is, the control signal of the reactive power.
  • the RL controller is always in an online learning state during operation; once the controlled quantity deviates from the control target (for example due to a parameter change or an external disturbance), the control strategy is automatically adjusted, which increases the adaptive and self-learning ability of the original control system.
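A minimal sketch of this adder stage (the function and signal names are illustrative, not taken from the patent):

```python
def control_signals(pi_p_out: float, pi_q_out: float,
                    rl_p_action: float, rl_q_action: float) -> tuple[float, float]:
    """Adder stage of the self-correcting architecture.

    i_qs_ref -- given value of the stator q-axis current (active power channel)
    i_ds_ref -- given value of the stator d-axis current (reactive power channel)
    """
    i_qs_ref = pi_p_out + rl_p_action   # RL-P correction superimposed on the PI output
    i_ds_ref = pi_q_out + rl_q_action   # RL-Q correction superimposed on the PI output
    return i_qs_ref, i_ds_ref
```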
  • the RL-P controller and the RL-Q controller respectively sample the active power error value ⁇ P and the reactive power error value ⁇ Q.
  • the RL-P controller and the RL-Q controller respectively determine the interval sk to which the power error values ΔP and ΔQ belong; the power error values are divided into 11 different intervals s: (-∞, -0.1), [-0.1, -0.06), [-0.06, -0.03), [-0.03, -0.02), [-0.02, -0.005), [-0.005, 0.005], (0.005, 0.02], (0.02, 0.03], (0.03, 0.06], (0.06, 0.1], (0.1, +∞), forming the state set S;
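A sketch of this state discretization (the interval boundaries are those listed above; the helper name is illustrative):

```python
def error_to_state(delta: float) -> int:
    """Map a power error value (ΔP or ΔQ) to the index of its interval in the state set S.

    Interval brackets follow the list above: half-open [a, b) below zero,
    closed around zero, and half-open (a, b] above zero.
    """
    edges = [-0.1, -0.06, -0.03, -0.02, -0.005, 0.005, 0.02, 0.03, 0.06, 0.1]
    for i, edge in enumerate(edges):
        if (delta < edge) or (delta == edge and edge >= 0.005):
            return i
    return len(edges)  # (0.1, +inf)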
  • the RL controller obtains the immediate reward value rk from the reward function; the reward function (given as a formula in the description) takes a negative value so that the power error of the control target is made as small as possible; in that formula, the pointer value is the index of the kth action value α in the action set A, and μ1 and μ2 are the weights that balance the two squared terms, their values being obtained through a large number of simulation experiments;
  • α and γ are discount factors, and their values are likewise obtained through a large number of simulation experiments.
  • S6: update the action probability distribution according to the action-selection-strategy update formula; if the Agent selects the action with the highest Q value at every iteration, it will converge to a local optimum, because the same action chain is always executed and other actions are never explored.
  • the present invention utilizes a tracking (pursuit) algorithm to design an action selection strategy.
  • the strategy is based on a probability distribution: at initialization, every feasible action in each state is given an equal probability of being selected, and as the iteration proceeds the probabilities change as the Q-value table changes; the RL controller finds the action ag with the highest Q value in state sk, where ag is called the greedy action; the iteration formula of the action probability distribution is given in the description;
  • the output value ag of the RL controller is superimposed on the PI controller's output signal in each state s to automatically optimize the control performance of the PI controller, so that the power error value becomes as small as possible.
  • the Q matrix and the probability distribution need to be initialized before iteration.
  • each element of the Q matrix is initialized to 0, and each feasible action in each state is given an equal probability of being selected.
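A sketch of this initialization for the 11-state, 11-action tables used by each RL controller (the table sizes follow the patent; the variable names are illustrative):

```python
import numpy as np

N_STATES, N_ACTIONS = 11, 11          # 11 error intervals, 11 discrete action values

Q = np.zeros((N_STATES, N_ACTIONS))   # Q0(s, a) = 0 for every state-action pair
P = np.full((N_STATES, N_ACTIONS), 1.0 / N_ACTIONS)  # equal selection probability per state
```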
  • the present invention provides a self-tuning control method for a doubly-fed induction wind turbine based on a reinforcement learning algorithm, which does not need to change the structure and parameters of the original PI controller after introducing the reinforcement learning self-correction control.
  • the implementation is very simple, maintaining the ability of the original system to capture the maximum wind energy, while improving its dynamic performance, enhancing robustness and adaptability.
  • the algorithm provided by the present invention is used to control the reactive power regulation process of the doubly-fed wind power generator.
  • the reactive power is initially set to 0.9 Mvar, reduced to 0 var at 1 s, and raised back to 0.9 Mvar after 2 s.
  • the simulation ends at 3 s.
  • the wind speed is kept constant at 10m/s.
  • the reactive power response curve during reactive power regulation is given by Figure 4.
  • Figure 4 shows that the dynamic performance of the self-correction control based on the reinforcement learning algorithm is superior to that of traditional vector control.
  • Figure 5 is the correction control signal output by the reinforcement learning controller based on the reactive power deviation.
  • Figure 6 is the active power curve during the reactive power adjustment process. As can be seen from Figure 6, during the reactive power adjustment process the active power remains unchanged, so decoupling is achieved well.
  • the algorithm provided by the present invention is used to control the active power adjustment process of the doubly-fed wind power generator.
  • the wind speed is initially set to 10 m/s, and at 2 s, it is increased to 11 m/s, and the simulation ends at 30 s.
  • the reactive power is set to 0var, and the simulation result of the active power adjustment process system is shown in the figure below.
  • Figure 7 shows the active power response curve during the active power adjustment process. It can be seen from the figure that the active power response curves of the self-correction control based on the reinforcement learning algorithm and of the traditional vector control basically coincide; this is because, based on the principle of maximum wind energy capture, when the wind speed changes abruptly the active power reference value does not jump but instead changes along the optimal power curve.
  • the algorithm provided by the present invention is used to analyze the disturbance in the control process of the doubly-fed wind power generator.
  • Figure 10, Figure 11, Figure 12 and Figure 13 show, for the dynamic response after the parameter change and under the same conditions, the active power curve, the reactive power curve, the RL-P controller control signal and the RL-Q controller control signal of traditional vector control and of the self-correction control based on the reinforcement learning algorithm. It can be seen from FIG. 12 and FIG. 13 that when the parameter change causes the active and reactive power to deviate from their reference values, the reinforcement learning controller immediately outputs a correction control signal according to the deviation value to compensate for the influence of the parameter change. It can be seen from Fig. 10 and Fig. 11 that with self-correction control the overshoot is smaller, the dynamic quality is improved, and the control performance is enhanced.
  • the invention provides a self-correcting control method for a doubly-fed induction wind turbine based on a reinforcement learning algorithm; the object controlled by the algorithm is a doubly-fed wind power generation system, which is multivariable, nonlinear, and significantly affected by parameter changes and external disturbances.
  • utilizing the online self-learning ability and model independence of the reinforcement learning algorithm, the present invention designs a self-correcting controller for the wind turbine, which can effectively improve the robustness and adaptability of the control system.
  • the control strategy does not need to change the structure and parameters of the original PI controller, just add a self-correction module, and the project implementation is very simple.
  • since the control signal of the RL controller is a discrete action value, it can easily cause overshoot.
  • in follow-up research, fuzzy control may be considered to fuzzify the input and output signals.
  • the invention provides a self-tuning control method for a doubly-fed induction wind turbine based on a reinforcement learning algorithm.
  • the method introduces a Q learning algorithm as a reinforcement learning core algorithm, and the reinforcement learning control algorithm is insensitive to the mathematical model and operating state of the controlled object.
  • its learning ability has strong adaptability and robustness to parameter changes or external disturbances, and it can quickly and automatically optimize the output of the PI controller online.
  • based on the MATLAB/Simulink environment, the system is simulated when the wind speed is lower than the rated wind speed.
  • the simulation results show that, after entering the reinforcement-learning self-correction control, the proposed method can quickly and automatically optimize the output of the wind turbine control system; it not only achieves maximum tracking of wind energy but also has good dynamic performance, significantly enhancing the robustness and adaptability of the control system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Control Of Electric Generators (AREA)
  • Feedback Control In General (AREA)

Abstract

A self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm. In this method, an RL controller is added to the PI controller in a PI-control-based vector control system to dynamically correct the output of the PI controller; the RL controller comprises an RL-P controller and an RL-Q controller, which correct the active and reactive power control signals respectively. The method introduces the Q-learning algorithm as the core reinforcement learning algorithm; the reinforcement learning control algorithm is insensitive to the mathematical model and operating state of the controlled object, its learning ability has strong adaptability and robustness to parameter changes or external disturbances, and it can quickly and automatically optimize the output of the PI controller online. The method has good dynamic performance and significantly enhances the robustness and adaptability of the control system.

Description

Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm
Technical Field
The present invention relates to self-correcting control of a doubly-fed induction wind power generator, and in particular to a self-correcting control method for a doubly-fed induction wind power generator based on a Reinforcement Learning (RL) algorithm.
Background Art
Variable-speed constant-frequency doubly-fed generation is a generation mode commonly used in wind power at present; its generator is a doubly-fed induction generator (DFIG). When the unit operates below the rated wind speed, the rotor speed of the generator is adjusted to maintain the optimum tip-speed ratio and achieve maximum capture of wind energy. The control system usually adopts vector control based on stator field orientation to realize decoupled control of the generator's active and reactive power.
Because wind energy is strongly random and time-varying, and the system contains dynamics that are unmodelled or cannot be modelled accurately, the doubly-fed generation system is a multivariable, nonlinear, strongly coupled system, so it is difficult for traditional vector control alone to meet the control system's requirements for high adaptability and high robustness. Various control schemes have been tried, but the control results are not very satisfactory: a neural-network control scheme improves the control performance but has a large steady-state error, while a fuzzy sliding-mode control strategy, which combines fuzzy control and sliding-mode control, achieves a good control effect but is relatively complex to implement.
Summary of the Invention
The object of the present invention is to overcome the problems of the prior art and to provide a self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm that can quickly and automatically optimize the output of the wind turbine control system, not only achieving maximum tracking of wind energy but also providing good dynamic performance and significantly enhancing the robustness and adaptability of the control system.
The object of the present invention is achieved by the following technical solution:
Self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm: an RL controller is added to the PI controller in a PI-control-based vector control system to dynamically correct the output of the PI controller; the RL controller includes an RL-P controller and an RL-Q controller, which correct the active and reactive power control signals respectively; the self-correcting control method comprises the following steps:
S1: the RL-P controller and the RL-Q controller sample the active power error value ΔP and the reactive power error value ΔQ respectively; the RL-P controller and the RL-Q controller determine the interval sk to which the power error values ΔP and ΔQ belong;
S2: for the identified interval sk, the RL-P controller or the RL-Q controller, according to the action probability distribution corresponding to that sk
Figure PCTCN2017110899-appb-000001
outputs an action αk using a random function, giving the correction signal output by the RL-P controller or the RL-Q controller; the set of the selection probabilities corresponding to the actions α constitutes a probability distribution, and each interval s has its corresponding probability distribution Ps(a);
for the RL-P controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value iqs* of the stator q-axis current, that is, the control signal of the active power;
for the RL-Q controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value ids* of the stator d-axis current, that is, the control signal of the reactive power;
S3: the RL-P controller and the RL-Q controller sample the active power error value ΔP and the reactive power error value ΔQ respectively and determine the interval sk+1 to which they belong;
S4: the RL controller obtains the immediate reward value rk from the reward function; the reward function is designed as:
Figure PCTCN2017110899-appb-000002
where the
Figure PCTCN2017110899-appb-000003
value is a pointer into the action set A, i.e. the index of the kth action value α in the action set A; μ1 and μ2 are the weights that balance the squared terms, and their values are all obtained through a large number of simulation experiments;
S5: the Q matrix is updated based on the Q-value iteration formula; the Q function is a kind of expected discounted reward, and the purpose of Q-learning is to estimate the Q value of the optimal control strategy; let Qk be the kth iteration value of the optimal value function Q*; the Q-value iteration formula is designed as:
Figure PCTCN2017110899-appb-000004
where α and γ are discount factors, whose values are all obtained through a large number of simulation experiments;
S6: the action probability distribution is updated according to the action-selection-strategy update formula; a tracking (pursuit) algorithm is used to design the action selection strategy; the strategy is based on a probability distribution: at initialization, every feasible action in each state is given an equal probability of being selected, and as the iteration proceeds the probabilities change as the Q-value table changes; the RL controller finds the action ag with the highest Q value in state sk, where ag is called the greedy action; the iteration formula of the action probability distribution is:
Figure PCTCN2017110899-appb-000005
Figure PCTCN2017110899-appb-000006
Figure PCTCN2017110899-appb-000007
are respectively the probabilities of selecting action a in state sk and in states other than sk at the kth iteration; β is the action search speed, whose value is obtained through a large number of simulation experiments;
S7: let k = k + 1 and return to step S2; an action αk+1 is selected and output according to the action probability distribution, and the selected action is superimposed on the output signal of the PI controller to produce the corresponding stator current reference signal, i.e. the power control signal; the subsequent steps are then executed in order in a continuous loop. After many iterations, for each state s, Qsk converges to Qs* with probability 1, i.e. an optimal control strategy represented by Qs* and the greedy action ag corresponding to that optimal control strategy are obtained; the self-correction process is thereby completed, and at this point the superposition of the RL controller output value ag and the PI controller output signal in each state s automatically optimizes the control performance of the PI controller so that the power error value is small.
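Steps S1–S7 can be summarised in the following sketch of one correction channel (RL-P or RL-Q). The Q-learning and pursuit-style updates follow the standard forms assumed earlier; the reward form, the plant interfaces (sample_error, pi_output, apply_reference) and the helper to_state are assumptions rather than the patent's implementation, and the default gains are those of the embodiment below:

```python
import numpy as np

ACTIONS = np.array([0.06, 0.04, 0.03, 0.02, 0.01, 0.0,
                    -0.01, -0.02, -0.03, -0.04, -0.06])   # action set A given in the patent

def rl_correction_step(Q, P, s, to_state, sample_error, pi_output, apply_reference,
                       alpha=0.6, gamma=0.001, beta=0.9, mu1=0.001, mu2=0.001):
    """One pass of steps S2-S7 for a single channel (RL-P or RL-Q).

    Q, P     -- 11x11 Q-value table and action-probability table
    to_state -- maps a power error to its interval index s
    sample_error, pi_output, apply_reference -- plant/controller interfaces (placeholders)
    """
    a_idx = np.random.choice(len(ACTIONS), p=P[s])            # S2: draw action from P_s(a)
    action = ACTIONS[a_idx]
    apply_reference(pi_output() + action)                     # S2: adder -> stator current reference
    error = sample_error()                                    # S3: sample the new power error
    s_next = to_state(error)                                  # S3: interval s_{k+1}
    r = -(mu1 * error ** 2 + mu2 * action ** 2)               # S4: assumed quadratic reward
    Q[s, a_idx] += alpha * (r + gamma * Q[s_next].max() - Q[s, a_idx])  # S5: Q-value update
    greedy = Q[s].argmax()                                    # S6: pursuit update toward greedy action
    P[s] *= (1.0 - beta)
    P[s, greedy] += beta
    return s_next                                             # S7: k <- k+1, continue from new state
```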
The present invention proposes a self-correcting control architecture, i.e. an RL controller is added to the PI controller in a PI-control-based vector control system to dynamically correct the output of the PI controller, where the RL-P and RL-Q controllers correct the active and reactive power control signals respectively.
Compared with the prior art, the present invention has the following advantages:
1) The present invention proposes a self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm. The method introduces a reinforcement learning control algorithm, which is insensitive to the mathematical model and operating state of the controlled object and whose self-learning ability gives it strong adaptability and robustness to parameter changes or external disturbances. The method is simulated on the Matlab/Simulink simulation platform, and the simulation results show that the self-correcting controller can quickly and automatically optimize the output of the wind turbine control system, not only achieving maximum tracking of wind energy but also providing good dynamic performance and significantly enhancing the robustness and adaptability of the control system.
2) The control strategy of the present invention does not need to change the structure and parameters of the original PI controller; only a self-correction module needs to be added, so the engineering implementation is very simple. Meanwhile, since the control signal of the RL controller is a discrete action value, it easily causes overshoot; in follow-up research, fuzzy control may be considered to fuzzify the input and output signals.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the reinforcement learning system of the present invention;
Fig. 2 is a self-correcting control block diagram of the doubly-fed wind power generation system of the present invention;
Fig. 3 is a flow chart of the self-correction learning of the doubly-fed induction wind power generator based on the reinforcement learning algorithm;
Fig. 4 is the reactive power response curve during reactive power regulation in the embodiment;
Fig. 5 is the RL-Q controller control signal during reactive power regulation in the embodiment;
Fig. 6 is the active power curve during reactive power regulation in the embodiment;
Fig. 7 is the active power response curve during active power regulation in the embodiment;
Fig. 8 is the RL-P controller control signal during active power regulation in the embodiment;
Fig. 9 is the reactive power curve during active power regulation in the embodiment;
Fig. 10 is the active power curve when the parameters change during the disturbance analysis in the embodiment;
Fig. 11 is the reactive power curve when the parameters change during the disturbance analysis in the embodiment;
Fig. 12 is the RL-P controller control signal when the parameters change during the disturbance analysis in the embodiment;
Fig. 13 is the RL-Q controller control signal when the parameters change during the disturbance analysis in the embodiment.
Detailed Description of the Embodiments
For a better understanding of the present invention, the invention is further described below with reference to the drawings and embodiments, but the embodiments of the present invention are not limited thereto.
The doubly-fed induction wind power generation system has a complex structure, is significantly affected by parameter changes and external disturbances, and is nonlinear, time-varying and strongly coupled; if only traditional vector control is used, it is difficult to meet the control system's requirements for high adaptability and high robustness.
On the basis of traditional vector control, the present invention proposes a self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning (RL) algorithm. The method introduces the Q-learning algorithm as the core reinforcement learning algorithm, which can quickly and automatically optimize the output of the PI controller online; after the reinforcement-learning self-correction control is introduced, the ability of the original system to capture maximum wind energy is maintained, while its dynamic performance is improved and its robustness and adaptability are enhanced.
First, the design of the PI-control-based vector control system with stator flux orientation for the doubly-fed induction wind power generation system.
When the generator convention is used for the stator and the motor convention for the rotor, the mathematical model of the doubly-fed induction generator with a uniform air gap in a three-phase symmetric system, expressed in the two-phase synchronously rotating dq coordinate frame, is:
Stator voltage equation
Figure PCTCN2017110899-appb-000008
Rotor voltage equation
Figure PCTCN2017110899-appb-000009
Stator flux-linkage equation
Figure PCTCN2017110899-appb-000010
Rotor flux-linkage equation
Figure PCTCN2017110899-appb-000011
Electromagnetic torque equation
Figure PCTCN2017110899-appb-000012
Stator power output equation
Figure PCTCN2017110899-appb-000013
In formulas (1)–(6): subscripts d and q denote the d-axis and q-axis components respectively; subscripts s and r denote stator and rotor quantities respectively; U, i, ψ, Te, P and Q denote voltage, current, flux linkage, electromagnetic torque, active power and reactive power respectively; R and L denote resistance and inductance respectively; ω1 is the synchronous speed; ωs is the slip electrical angular velocity, ωs = ω1 − ωr = sω1; ωr is the electrical angular velocity of the generator rotor; s is the slip; np is the number of pole pairs; p is the differential operator.
With stator-flux-oriented vector control, the stator flux vector is oriented on the d-axis, so ψds = ψs and ψqs = 0. In steady-state operation the stator flux remains constant; neglecting the voltage drop across the stator winding resistance gives Uds = 0 and Uqs = ω1ψs = Us, where Us is the magnitude of the stator voltage vector.
From equation (6):
Figure PCTCN2017110899-appb-000014
In this formula the variables have the following meanings: P: active power; Q: reactive power; Uqs: q-axis component of the stator voltage vector; Iqs: q-axis component of the stator current vector; Us: magnitude of the stator voltage vector; ids: d-axis component of the stator current. From formula (7), the transfer function of the power controlled by the stator current can be obtained.
From formula (3):
Figure PCTCN2017110899-appb-000015
In this formula the variables have the following meanings: idr: d-axis component of the rotor current; iqr: q-axis component of the rotor current; Ls: stator inductance; Lm: mutual inductance between the stator and the rotor; ids: d-axis component of the stator current; iqs: q-axis component of the stator current; ψs: magnitude of the stator flux vector;
From formula (4):
Figure PCTCN2017110899-appb-000016
In this formula,
Figure PCTCN2017110899-appb-000017
the variables have the following meanings: ψdr: d-axis component of the rotor flux vector; ψqr: q-axis component of the rotor flux vector; ψs: magnitude of the stator flux vector; Lm: mutual inductance between the stator and the rotor; Ls: stator inductance; Lr: rotor inductance; idr: d-axis component of the rotor current; iqr: q-axis component of the rotor current;
Then from formula (2):
Figure PCTCN2017110899-appb-000018
In this formula,
Figure PCTCN2017110899-appb-000019
the variables have the following meanings: udr: d-axis component of the rotor voltage; uqr: q-axis component of the rotor voltage; idr: d-axis component of the rotor current; iqr: q-axis component of the rotor current; ψs: magnitude of the stator flux vector; Rr: rotor resistance; p: differential operator; ωs: slip electrical angular velocity. From formulas (8), (9) and (10), the transfer function of the stator current controlled by the rotor voltage can be obtained.
According to the above formulas (7)–(10), the PI-control-based vector control system with stator flux orientation for the doubly-fed induction wind power generation system can be designed. The self-correcting control method of the present invention adds an RL controller on top of the PI controller in the system designed above, and uses the superposition of the output signals of the two controllers as the power control signal.
Second, the design of the self-correcting controller based on reinforcement learning.
The reinforcement learning (RL) algorithm is the learning by a system of a mapping from environmental states to actions, a learning process of trial and evaluation, which can be described by Fig. 1. The Agent selects an action to act on the environment (i.e., the system) according to the learning algorithm, causing a change of the environmental state s; the environment then feeds back an immediate reinforcement signal (a reward or penalty) to the Agent, and the Agent selects the next action according to the reinforcement signal and the new environmental state s'. The learning principle of RL is: if a certain decision behaviour (action) of the Agent improves the reinforcement signal, the tendency to produce this decision behaviour in the future is strengthened. In recent years, RL theory has produced notable application results in power systems in fields such as dispatch, reactive power optimization and electricity markets.
As shown in Fig. 1, Fig. 1 is a schematic diagram of the reinforcement learning system. According to Fig. 1, the Q-learning algorithm is a reinforcement learning algorithm that improves the control strategy from a long-term perspective through trial and error and interaction with the environment; one of its salient features is its independence from the model of the controlled object.
The purpose of Q-learning is to estimate the Q value of the optimal control strategy. Let Qk denote the kth iteration value of the optimal value function Q*; the Q value is updated according to iteration formula (11):
Figure PCTCN2017110899-appb-000020
The action selection strategy is the key to the Q-learning control algorithm. The policy under which the Agent selects the action with the highest Q value in state s is called the greedy policy p*, and the corresponding action is called the greedy action.
Figure PCTCN2017110899-appb-000021
If the Agent selects the action with the highest Q value at every iteration, it will converge to a local optimum, because the same action chain is always executed and other actions are never explored. To avoid this, the present invention uses a tracking (pursuit) algorithm to design the action selection strategy. The algorithm is based on a probability distribution: at initialization, every feasible action in each state is given an equal probability of being selected; as the iteration proceeds, the probabilities change as the Q-value table changes, with the update formula as follows:
Figure PCTCN2017110899-appb-000022
where:
Figure PCTCN2017110899-appb-000023
Figure PCTCN2017110899-appb-000024
are respectively the probabilities of selecting action a in state sk and in states other than sk at the kth iteration; ag is the greedy action; β is the action search speed. It can be seen from formula (13) that an action with a higher Q value has a larger probability of being selected; for a specific state of the environment, the selection probability of the greedy action keeps increasing as that state recurs, and after a sufficiently large number of iterations Qk converges to Q* with probability 1, i.e. an optimal control strategy represented by Q* is obtained.
On this basis, the structural design of the self-correcting controller is described as follows. The existing doubly-fed induction wind turbine control system, built with fixed-gain PI controllers, suffers reduced control performance when the operating conditions of the system change. The present invention proposes a self-correcting control architecture; Fig. 2 shows the self-correcting control block diagram of the doubly-fed wind power generation system. An RL controller is added on top of the original PI controller to dynamically correct the output of the PI controller; the RL controller includes an RL-P controller and an RL-Q controller, of which the RL-P controller and the RL-Q controller correct the active and reactive power control signals respectively. The input value of the RL-P controller is the active power error value ΔP; according to the action probability distribution obtained by the Q-learning algorithm
Figure PCTCN2017110899-appb-000025
an action αk is selected and output, and this action αk is added to the output signal of the PI controller by an adder to obtain the given value iqs* of the stator q-axis current, that is, the control signal of the active power; the input value of the RL-Q controller is the reactive power error value ΔQ; according to the action probability distribution obtained by the Q-learning algorithm
Figure PCTCN2017110899-appb-000026
an action αk is selected and output, and this action αk is added to the output signal of the PI controller by an adder to obtain the given value ids* of the stator d-axis current, that is, the control signal of the reactive power. The RL controller is always in an online learning state during operation; once the controlled quantity deviates from the control target (for example due to a parameter change or an external disturbance), the control strategy is automatically adjusted, thereby increasing the adaptive and self-learning ability of the original control system.
The flow of the core control algorithm of the self-correcting controller is shown in Fig. 3 and described as follows:
S1: the RL-P controller and the RL-Q controller sample the active power error value ΔP and the reactive power error value ΔQ respectively. The RL-P controller and the RL-Q controller determine the interval sk to which the power error values ΔP and ΔQ belong; the power error values are divided into 11 different intervals s: (-∞, -0.1), [-0.1, -0.06), [-0.06, -0.03), [-0.03, -0.02), [-0.02, -0.005), [-0.005, 0.005], (0.005, 0.02], (0.02, 0.03], (0.03, 0.06], (0.06, 0.1], (0.1, +∞), forming the state set S;
S2: for the identified interval sk, the RL-P controller or the RL-Q controller, according to the action probability distribution corresponding to that sk
Figure PCTCN2017110899-appb-000027
outputs an action αk using a random function, giving the correction signal output by the RL-P controller or the RL-Q controller. The action αk has a total of 11 possible choices under each s, forming the action space A; the 11 choices are [0.06, 0.04, 0.03, 0.02, 0.01, 0, -0.01, -0.02, -0.03, -0.04, -0.06], and under the same interval s each action α has a corresponding probability of being selected; the set of the selection probabilities of the 11 actions α constitutes the probability distribution Ps(a), and each interval s has its corresponding probability distribution Ps(a). For the RL-P controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value iqs* of the stator q-axis current, that is, the control signal of the active power; for the RL-Q controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value ids* of the stator d-axis current, that is, the control signal of the reactive power.
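The random selection in step S2 can be sketched as follows (the action values are those listed above; the function and variable names are illustrative):

```python
import numpy as np

A = [0.06, 0.04, 0.03, 0.02, 0.01, 0.0, -0.01, -0.02, -0.03, -0.04, -0.06]

def select_action(P_s: np.ndarray) -> float:
    """Draw one action from the current interval's probability distribution Ps(a)."""
    return float(np.random.choice(A, p=P_s))
```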
S3: the RL-P controller and the RL-Q controller sample the active power error value ΔP and the reactive power error value ΔQ respectively and determine the interval sk+1 to which they belong;
S4: the RL controller obtains the immediate reward value rk from the reward function; the reward function is designed as:
Figure PCTCN2017110899-appb-000028
where the
Figure PCTCN2017110899-appb-000029
value is a pointer into the action set A, i.e. the index of the kth action value α in the action set A; μ1 and μ2 are the weights that balance the squared terms, and their values are all obtained through a large number of simulation experiments; taking a negative value for the reward function makes the power error of the control target as small as possible;
S5: the Q matrix is updated based on the Q-value iteration formula; the Q function is a kind of expected discounted reward, and the purpose of Q-learning is to estimate the Q value of the optimal control strategy; let Qk be the kth iteration value of the optimal value function Q*; the Q-value iteration formula is designed as:
Figure PCTCN2017110899-appb-000030
where α and γ are discount factors, whose values are all obtained through a large number of simulation experiments. The smaller the power error value in step S4, the larger the value of rk and the larger the value of Qk+1(sk, ak);
S6: the action probability distribution is updated according to the action-selection-strategy update formula. If the Agent selects the action with the highest Q value at every iteration, it will converge to a local optimum, because the same action chain is always executed and other actions are never explored; to avoid this, the present invention uses a tracking (pursuit) algorithm to design the action selection strategy. The strategy is based on a probability distribution: at initialization, every feasible action in each state is given an equal probability of being selected, and as the iteration proceeds the probabilities change as the Q-value table changes; the RL controller finds the action ag with the highest Q value in state sk, where ag is called the greedy action; the iteration formula of the action probability distribution is:
Figure PCTCN2017110899-appb-000031
Figure PCTCN2017110899-appb-000032
Figure PCTCN2017110899-appb-000033
are respectively the probabilities of selecting action a in state sk and in states other than sk at the kth iteration; β is the action search speed, whose value is obtained through a large number of simulation experiments.
It can be seen from the probability-distribution iteration formula that an action with a higher Q value, i.e. an action that makes the power error value smaller, has a larger probability of being selected; for a specific state s of the environment, the selection probability of the greedy action keeps increasing and approaches 1 as that state recurs;
S7: let k = k + 1 and return to step S2; an action αk+1 is selected and output according to the action probability distribution, and the selected action is superimposed on the output signal of the PI controller to produce the corresponding stator current reference signal, i.e. the power control signal. The subsequent steps are then executed in order in a continuous loop. After a sufficiently large number of iterations, for each state s, Qsk converges to Qs* with probability 1, i.e. an optimal control strategy represented by Qs* and the greedy action ag corresponding to that optimal control strategy are obtained; the self-correction process is thereby completed, and at this point the superposition of the RL controller output value ag and the PI controller output signal in each state s automatically optimizes the control performance of the PI controller, making the power error value as small as possible.
The Q matrix and the probability distribution need to be initialized before iteration. The initial value of every element of the Q matrix is 0, i.e. Q0(s, a) = 0,
Figure PCTCN2017110899-appb-000034
and every feasible action in each state is given an equal probability of being selected, i.e.
Figure PCTCN2017110899-appb-000035
According to the foregoing description, the present invention provides a self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm. After the reinforcement-learning self-correction control is introduced, the method does not need to change the structure and parameters of the original PI controller, so the engineering implementation is very simple; the ability of the original system to capture maximum wind energy is maintained, while its dynamic performance is improved and its robustness and adaptability are enhanced.
Embodiment
The correctness and effectiveness of the controller designed by the present invention are verified for a doubly-fed induction wind power generator.
The following parameters are selected for the simulation of the doubly-fed induction wind power generator: rated power of the doubly-fed wind power generator P = 9 MW (= 6 × 1.5 MW), Rs = 0.007 pu, Rr = 0.005 pu, Ls = 3.071 pu, Lr = 3.056 pu, Lm = 2.9 pu, np = 3; these parameters can be substituted into formulas (1)–(10) above to calculate the corresponding quantities of the doubly-fed wind power generator. The parameters of the two PI controllers are: proportional gain Kp = 6.9; integral gain Ki = 408. The parameters of the RL-P controller are: weight μ1 = 0.001, discount factors α = 0.6 and γ = 0.001, action search speed β = 0.9; the parameters of the RL-Q controller are: weight μ2 = 0.001, discount factors α = 0.6 and γ = 0.001, action search speed β = 0.9.
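For reference, the embodiment's parameters can be collected in a configuration structure such as the following sketch (the structure and names are illustrative; the values are those listed above):

```python
from dataclasses import dataclass

@dataclass
class RLControllerParams:
    mu: float = 0.001      # weight of the squared term (mu1 for RL-P, mu2 for RL-Q)
    alpha: float = 0.6     # discount factor alpha
    gamma: float = 0.001   # discount factor gamma
    beta: float = 0.9      # action search speed

PI_GAINS = {"Kp": 6.9, "Ki": 408.0}
DFIG_PARAMS = {"P_rated_MW": 9.0, "Rs_pu": 0.007, "Rr_pu": 0.005,
               "Ls_pu": 3.071, "Lr_pu": 3.056, "Lm_pu": 2.9, "np": 3}
RL_P = RLControllerParams(mu=0.001)
RL_Q = RLControllerParams(mu=0.001)
```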
(1) Reactive power regulation
The algorithm provided by the present invention is applied to control the reactive power regulation process of the doubly-fed wind power generator. In this process, the reactive power is initially set to 0.9 Mvar, reduced to 0 var at 1 s, and raised back to 0.9 Mvar after 2 s; the simulation ends at 3 s. During the simulation the wind speed is kept constant at 10 m/s. The reactive power response curve during reactive power regulation is given in Fig. 4; in this figure, the dynamic performance of the self-correcting control based on the reinforcement learning algorithm is superior to that of traditional vector control. Fig. 5 shows the correction control signal output by the reinforcement learning controller based on the reactive power deviation, and Fig. 6 shows the active power curve during the reactive power regulation process; it can be seen from Fig. 6 that during reactive power regulation the active power remains unchanged, so decoupling is achieved well.
(2) Active power regulation
The algorithm provided by the present invention is applied to control the active power regulation process of the doubly-fed wind power generator. In this process, the wind speed is initially set to 10 m/s and rises to 11 m/s at 2 s; the simulation ends at 30 s. During the simulation the reactive power is kept at 0 var, and the simulation results of the system response during the active power regulation process are shown in the figures below. Fig. 7 gives the active power response curve during active power regulation; it can be seen from the figure that the active power response curves of the self-correcting control based on the reinforcement learning algorithm and of the traditional vector control basically coincide. This is because, based on the principle of maximum wind energy capture, when the wind speed changes abruptly the active power reference value does not jump but instead changes along the optimal power curve; the power deviation therefore remains small and never reaches the state corresponding to the minimum action value set for reinforcement learning, so the output control signal of the reinforcement learning controller is 0 and the two curves coincide. Fig. 8 shows the RL-P controller control signal during the active power regulation process, and Fig. 9 shows the reactive power curve during the active power regulation process; it can be seen from Fig. 9 that the reactive power is not affected during active power regulation, so decoupling is achieved.
(3) Disturbance analysis
The algorithm provided by the present invention is applied to analyse disturbances in the control process of the doubly-fed wind power generator. To examine the robustness of the system to changes in the machine parameters, the wind speed is assumed to remain constant at 10 m/s, and b is doubled at t = 2 s. Fig. 10, Fig. 11, Fig. 12 and Fig. 13 respectively give, for the dynamic response after the parameter change and under the same conditions, the active power curve, the reactive power curve, the RL-P controller control signal and the RL-Q controller control signal of the traditional vector control and of the self-correcting control based on the reinforcement learning algorithm. It can be seen from Fig. 12 and Fig. 13 that when the parameter change causes the active and reactive power to deviate from their reference values, the reinforcement learning controller immediately outputs a correction control signal according to the deviation value to compensate for the influence of the parameter change. It can be seen from Fig. 10 and Fig. 11 that with the self-correcting control the overshoot is smaller, the dynamic quality is improved, and the control performance is enhanced.
The present invention provides a self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm. The object controlled by the algorithm is a doubly-fed wind power generation system, which is multivariable, nonlinear, and significantly affected by parameter changes and external disturbances. Making use of the online self-learning ability and the model independence of the reinforcement learning algorithm, the present invention designs a self-correcting controller for the wind turbine, which can effectively improve the robustness and adaptability of its control system. Moreover, the control strategy does not need to change the structure and parameters of the original PI controller; only a self-correction module needs to be added, so the engineering implementation is very simple. Meanwhile, since the control signal of the RL controller is a discrete action value, it easily causes overshoot; in follow-up research, fuzzy control may be considered to fuzzify the input and output signals.
The present invention provides a self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm. The method introduces the Q-learning algorithm as the core reinforcement learning algorithm; the reinforcement learning control algorithm is insensitive to the mathematical model and operating state of the controlled object, its learning ability has strong adaptability and robustness to parameter changes or external disturbances, and it can quickly and automatically optimize the output of the PI controller online. Based on the MATLAB/Simulink environment, the system is simulated when the wind speed is below the rated wind speed; the results show that, after entering the reinforcement-learning self-correction control, the method can quickly and automatically optimize the output of the wind turbine control system, not only achieving maximum tracking of wind energy but also providing good dynamic performance and significantly enhancing the robustness and adaptability of the control system.
The embodiment described above expresses only one implementation of the present invention, and its description is relatively specific and detailed, but it should not therefore be understood as limiting the scope of the present invention. It should be noted that, for a person of ordinary skill in the art, several variations and improvements can also be made without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the appended claims.

Claims (5)

  1. Self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm, characterized in that an RL controller is added to the PI controller in a PI-control-based vector control system to dynamically correct the output of the PI controller; the RL controller includes an RL-P controller and an RL-Q controller, which correct the active and reactive power control signals respectively; the self-correcting control method comprises the following steps:
    S1: the RL-P controller and the RL-Q controller sample the active power error value ΔP and the reactive power error value ΔQ respectively; the RL-P controller and the RL-Q controller determine the interval sk to which the power error values ΔP and ΔQ belong;
    S2: for the identified interval sk, the RL-P controller or the RL-Q controller, according to the action probability distribution corresponding to that sk
    Figure PCTCN2017110899-appb-100001
    outputs an action αk using a random function, giving the correction signal output by the RL-P controller or the RL-Q controller; the set of the selection probabilities corresponding to the actions α constitutes a probability distribution, and each interval s has its corresponding probability distribution Ps(a);
    for the RL-P controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value iqs* of the stator q-axis current, that is, the control signal of the active power;
    for the RL-Q controller, the action value αk and the output signal of the PI controller are added by an adder to obtain the given value ids* of the stator d-axis current, that is, the control signal of the reactive power;
    S3: the RL-P controller and the RL-Q controller sample the active power error value ΔP and the reactive power error value ΔQ respectively and determine the interval sk+1 to which they belong;
    S4: the RL controller obtains the immediate reward value rk from the reward function; the reward function is designed as:
    Figure PCTCN2017110899-appb-100002
    where the
    Figure PCTCN2017110899-appb-100003
    value is a pointer into the action set A, the pointer being the index of the kth action value α in the action set A; μ1 and μ2 are the weights that balance the squared terms, and their values are all obtained through a large number of simulation experiments;
    S5: the Q matrix is updated based on the Q-value iteration formula; the Q function is a kind of expected discounted reward, and the purpose of Q-learning is to estimate the Q value of the optimal control strategy; let Qk be the kth iteration value of the optimal value function Q*; the Q-value iteration formula is designed as:
    Figure PCTCN2017110899-appb-100004
    where α and γ are discount factors, whose values are all obtained through a large number of simulation experiments;
    S6: the action probability distribution is updated according to the action-selection-strategy update formula; a tracking (pursuit) algorithm is used to design the action selection strategy; the strategy is based on a probability distribution: at initialization, every feasible action in each state is given an equal probability of being selected, and as the iteration proceeds the probabilities change as the Q-value table changes; the RL controller finds the action ag with the highest Q value in state sk, where ag is called the greedy action; the iteration formula of the action probability distribution is:
    Figure PCTCN2017110899-appb-100005
    Figure PCTCN2017110899-appb-100006
    Figure PCTCN2017110899-appb-100007
    are respectively the probabilities of selecting action a in state sk and in states other than sk at the kth iteration; β is the action search speed, whose value is obtained through a large number of simulation experiments;
    S7: let k = k + 1 and return to step S2; an action αk+1 is selected and output according to the action probability distribution, and the selected action is superimposed on the output signal of the PI controller to produce the corresponding stator current reference signal, i.e. the power control signal; the subsequent steps are then executed in order in a continuous loop. After many iterations, for each state s, Qsk converges to Qs* with probability 1, i.e. an optimal control strategy represented by Qs* and the greedy action ag corresponding to that optimal control strategy are obtained; the self-correction process is thereby completed, and at this point the superposition of the RL controller output value ag and the PI controller output signal in each state s automatically optimizes the control performance of the PI controller, making the power error value small.
  2. The self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm according to claim 1, characterized in that the interval sk to which the power error values ΔP and ΔQ belong is divided, according to the power error value, into the 11 different intervals (-∞, -0.1), [-0.1, -0.06), [-0.06, -0.03), [-0.03, -0.02), [-0.02, -0.005), [-0.005, 0.005], (0.005, 0.02], (0.02, 0.03], (0.03, 0.06], (0.06, 0.1], (0.1, +∞), which form the state set S.
  3. The self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm according to claim 2, characterized in that the action αk has a total of 11 possible choices under each interval s, forming the action space A; the 11 choices are [0.06, 0.04, 0.03, 0.02, 0.01, 0, -0.01, -0.02, -0.03, -0.04, -0.06], and under the same interval s each action α has a corresponding probability of being selected.
  4. The self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm according to claim 1, characterized in that the Q matrix and the probability distribution need to be initialized before iteration; the initial value of every element of the Q matrix is 0, i.e.
    Figure PCTCN2017110899-appb-100008
    and every feasible action in each state is given an equal probability of being selected, i.e.
    Figure PCTCN2017110899-appb-100009
  5. The self-correcting control method for a doubly-fed induction wind power generator based on a reinforcement learning algorithm according to claim 1, characterized in that the PI-control-based vector control system is designed according to the following formulas (7)–(10):
    Figure PCTCN2017110899-appb-100010
    where P: active power; Q: reactive power; Uqs: q-axis component of the stator voltage vector; Iqs: q-axis component of the stator current vector; Us: magnitude of the stator voltage vector; ids: d-axis component of the stator current;
    Figure PCTCN2017110899-appb-100011
    where idr: d-axis component of the rotor current; iqr: q-axis component of the rotor current; Ls: stator inductance; Lm: mutual inductance between the stator and the rotor; ids: d-axis component of the stator current; iqs: q-axis component of the stator current; ψs: magnitude of the stator flux vector;
    Figure PCTCN2017110899-appb-100012
    where,
    Figure PCTCN2017110899-appb-100013
    the variables have the following meanings: ψdr: d-axis component of the rotor flux vector; ψqr: q-axis component of the rotor flux vector; ψs: magnitude of the stator flux vector; Lm: mutual inductance between the stator and the rotor; Ls: stator inductance; Lr: rotor inductance; idr: d-axis component of the rotor current; iqr: q-axis component of the rotor current;
    Figure PCTCN2017110899-appb-100014
    in this formula,
    Figure PCTCN2017110899-appb-100015
    the variables have the following meanings: udr: d-axis component of the rotor voltage; uqr: q-axis component of the rotor voltage; idr: d-axis component of the rotor current; iqr: q-axis component of the rotor current; ψs: magnitude of the stator flux vector; Rr: rotor resistance; p: differential operator; ωs: slip electrical angular velocity; from formulas (8), (9) and (10), the transfer function of the stator current controlled by the rotor voltage can be obtained.
PCT/CN2017/110899 2017-02-10 2017-11-14 Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm WO2018145498A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710073833.4A CN106877766A (zh) 2017-02-10 2017-02-10 基于强化学习算法的双馈感应风力发电机自校正控制方法
CN201710073833.4 2017-02-10

Publications (1)

Publication Number Publication Date
WO2018145498A1 true WO2018145498A1 (zh) 2018-08-16

Family

ID=59167407

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/110899 WO2018145498A1 (zh) 2017-02-10 2017-11-14 基于强化学习算法的双馈感应风力发电机自校正控制方法

Country Status (2)

Country Link
CN (1) CN106877766A (zh)
WO (1) WO2018145498A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714786A (zh) * 2019-03-06 2019-05-03 重庆邮电大学 Femtocell power control method based on Q-learning

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106877766A (zh) * 2017-02-10 2017-06-20 华南理工大学 Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm
CN108429475B (zh) * 2018-02-11 2020-02-18 东南大学 Grid-connected inverter control method for a wave power generation system
CN110244077B (zh) * 2019-06-04 2021-03-30 哈尔滨工程大学 Constant-power regulation and accuracy compensation method for a thermal wind speed sensor
CN114002957B (zh) * 2021-11-02 2023-11-03 广东技术师范大学 Intelligent control method and system based on deep reinforcement learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100114388A1 (en) * 2006-11-28 2010-05-06 The Royal Institution For The Advancement Of Learning/Mcgill University Method and system for controlling a doubly-fed induction machine
CN104506106A (zh) * 2014-12-30 2015-04-08 徐州中矿大传动与自动化有限公司 Doubly-fed machine excitation control and zero-speed start-up method
CN105897102A (zh) * 2016-03-18 2016-08-24 国家电网公司 Method for accurately calculating the stator flux linkage of a doubly-fed generator during grid faults
CN106877766A (zh) * 2017-02-10 2017-06-20 华南理工大学 Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7066034B2 (en) * 2001-11-12 2006-06-27 International Rectifier Corporation Start-up method and system for permanent magnet synchronous motor drive
CN102611380B (zh) * 2012-03-09 2014-08-13 哈尔滨工业大学 Online parameter identification method for a doubly-fed machine
CN103746628B (zh) * 2013-12-31 2014-11-26 华北电力大学(保定) Control method for the rotor-side converter of a doubly-fed induction wind power generator
CN103904641B (zh) * 2014-03-14 2016-05-04 华南理工大学 Intelligent generation control method for an islanded microgrid based on correlated-equilibrium reinforcement learning
CN104993759B (zh) * 2015-07-07 2017-08-25 河南师范大学 Fast field-weakening control method for a doubly-fed wind power generator
CN104967376B (zh) * 2015-07-07 2017-08-25 河南师范大学 Deadbeat rotor-flux fault operation method for a doubly-fed wind power generator

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100114388A1 (en) * 2006-11-28 2010-05-06 The Royal Institution For The Advancement Of Learning/Mcgill University Method and system for controlling a doubly-fed induction machine
CN104506106A (zh) * 2014-12-30 2015-04-08 徐州中矿大传动与自动化有限公司 Doubly-fed machine excitation control and zero-speed start-up method
CN105897102A (zh) * 2016-03-18 2016-08-24 国家电网公司 Method for accurately calculating the stator flux linkage of a doubly-fed generator during grid faults
CN106877766A (zh) * 2017-02-10 2017-06-20 华南理工大学 Self-correcting control method for doubly-fed induction wind power generator based on reinforcement learning algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI, JING ET AL.: "Self-Tuning Control Based on Reinforcement Learning Algorithm for Doubly-Fed Induction Wind Power Generator", SMALL & SPECIAL ELECTRICAL MACHINES, vol. 41, no. 3, 28 March 2013 (2013-03-28), pages 53-54, ISSN: 1004-7018 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714786A (zh) * 2019-03-06 2019-05-03 重庆邮电大学 Femtocell power control method based on Q-learning
CN109714786B (zh) * 2019-03-06 2021-07-16 重庆邮电大学 Femtocell power control method based on Q-learning

Also Published As

Publication number Publication date
CN106877766A (zh) 2017-06-20


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17896071

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03/12/2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17896071

Country of ref document: EP

Kind code of ref document: A1