JP7090445B2 - Control systems, learning devices, control devices, and control methods

Info

Publication number: JP7090445B2
Application number: JP2018053510A
Authority: JP
Inventors: 房二堀部
Original assignee: Lixil Corp
Current assignee: Lixil Corp
Priority date: 2018-03-20
Filing date: 2018-03-20
Publication date: 2022-06-24
Anticipated expiration: 2038-03-20
Also published as: JP2019165602A

Description

The present invention relates to a control system, a learning device, a control device, and a control method.

Conventionally, wind power generation systems have used a technique that controls output with high efficiency by changing the mounting angle (pitch angle) of the wind turbine blades. However, many vertical-axis wind turbines have no blade pitch control. In a wind power generation system without pitch control, the rotation speed of the wind turbine is reduced when the turbine receives wind at or above a certain speed (strong wind). This prevents the turbine from over-rotating and being forcibly stopped under strong wind, so that the turbine keeps rotating even in strong wind. In this way, the range of operable conditions has been expanded (for example, Patent Document 1).

Patent Document 1: Japanese Patent No. 4401117

However, the time-series change in wind speed (hereinafter referred to as the wind condition) varies from moment to moment, so it is unstable and difficult to predict. Consequently, if the rotation speed of the wind turbine is reduced in preparation for a strong wind and the expected strong wind does not blow, the generated power falls by an amount corresponding to the reduction in rotation speed, which makes control more difficult than with pitch control.
Furthermore, because the wind condition changes with the terrain of the installation site, the season, and so on, it is difficult to determine how much the rotation of the wind turbine should be slowed in strong wind to avoid over-rotation.
For this reason, in strong wind, the wind turbine was in some conditions forcibly stopped due to over-rotation, and in other conditions, even though over-rotation was suppressed and power generation was maintained, the turbine was forcibly stopped because the wind speed reached the upper-limit wind speed at which operation (power generation) is stopped (the limit wind speed).

An object of the present invention is to provide a control system, a learning device, a control device, and a control method that make it possible to change the limit wind speed according to the wind condition.

To solve the above problem, one embodiment of the present invention is a control system that controls the rotation speed of a wind turbine of a wind power generation system, the control system comprising: a learning unit that learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine, rotation information on the rotation of the wind turbine, and relationship information between the wind speed and the rotation that indicates the wind speed and the maximum rotation speed at which the wind turbine can rotate; a storage unit that stores the correspondence information between the wind speed and the rotation; a state detection unit that detects the rotation information and the wind speed information when a rotation control parameter for controlling the rotation speed of the wind turbine is set in the wind turbine; a determination unit that determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information; and a control unit that controls the rotation of the wind turbine based on the rotation information determined by the determination unit. In the time-series change of the wind speed, when the difference between the rotation speed detected by the state detection unit in a strong wind section, in which the wind speed is at or above a predetermined strong wind determination threshold, and the maximum rotation speed is less than a predetermined margin threshold, and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is less than a predetermined change threshold, the control system increases the wind speed limit, which is the upper limit of the wind speed at which the wind power generation system can generate power.
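The condition for raising the wind speed limit can be illustrated with a minimal Python sketch. All names, units, and the fixed increment `limit_step` are hypothetical; the source states only that the limit is increased, not by how much. Treating the rate of change as a magnitude (`abs`) is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class LimitUpdateConfig:
    # All four values are hypothetical tuning parameters.
    strong_wind_threshold: float  # wind speed at or above this is a strong wind section
    margin_threshold: float       # required closeness of rotation speed to its maximum
    change_threshold: float       # rotation speed rate of change regarded as steady
    limit_step: float             # assumed increment applied to the wind speed limit


def update_wind_speed_limit(wind_speed_limit, wind_speed, rpm, rpm_max,
                            rpm_change_rate, cfg):
    """Return the (possibly increased) wind speed limit.

    The limit is raised only when all three conditions hold:
    strong wind, rotation speed close to its maximum, and a small
    rate of change of the rotation speed.
    """
    in_strong_wind = wind_speed >= cfg.strong_wind_threshold
    near_maximum = (rpm_max - rpm) < cfg.margin_threshold
    # Assumption: the rate of change is compared by magnitude.
    steady = abs(rpm_change_rate) < cfg.change_threshold
    if in_strong_wind and near_maximum and steady:
        return wind_speed_limit + cfg.limit_step
    return wind_speed_limit
```

In this sketch the limit is left unchanged whenever any one of the three conditions fails, which matches the conjunctive wording of the embodiment.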

Further, one embodiment of the present invention is the above control system, further comprising a reward calculation unit that calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit, wherein the relationship information includes the reward calculated by the reward calculation unit, and the learning unit is a reinforcement learning model that learns the correspondence information based on the reward.

Further, one embodiment of the present invention is the above control system, in which the reward is calculated based on the maximum rotation speed, the rotation speed, and the rate of change of the rotation speed.

Further, one embodiment of the present invention is a control system that controls the rotation speed of a wind turbine of a wind power generation system, the control system comprising: a learning unit that learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine, rotation information on the rotation of the wind turbine, and relationship information between the wind speed and the rotation that indicates the wind speed and the maximum rotation speed at which the wind turbine can rotate; a storage unit that stores the correspondence information between the wind speed and the rotation; a state detection unit that detects the rotation information and the wind speed information when a rotation control parameter for controlling the rotation speed of the wind turbine is set in the wind turbine; a determination unit that determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information; a control unit that controls the rotation of the wind turbine based on the rotation information determined by the determination unit; and a reward calculation unit that calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit. The relationship information includes the reward calculated by the reward calculation unit, and the learning unit learns the correspondence information by reinforcement learning based on the reward. The reward calculation unit calculates the reward based on the maximum rotation speed, the rotation speed, and the rate of change of the rotation speed. When the difference between the rotation speed detected by the state detection unit and the maximum rotation speed is less than a predetermined margin threshold and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is at or above a predetermined change threshold, the reward calculation unit calculates a first-level reward; when the rate of change is less than the change threshold, it calculates a second-level reward higher than the first level.

Further, one embodiment of the present invention is the above control system, in which the reward calculation unit calculates a third-level reward, higher than the first level and lower than the second level, when the difference between the rotation speed detected by the state detection unit and the maximum rotation speed is at or above the predetermined margin threshold.
Further, one embodiment of the present invention is the above control system, in which, in the time-series change of the wind speed, when the difference between the rotation speed detected by the state detection unit in a strong wind section, in which the wind speed is at or above a predetermined strong wind determination threshold, and the maximum rotation speed is less than a predetermined margin threshold, and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is less than a predetermined change threshold, the wind speed limit, which is the upper limit of the wind speed at which the wind power generation system can generate power, is increased.
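The three reward levels described in these embodiments can be sketched as a small Python function. The numeric reward values are hypothetical; the source fixes only their ordering (first level lowest, third level in between, second level highest). Comparing the rate of change by magnitude is also an assumption.

```python
# Hypothetical reward values; only the ordering
# R_LEVEL_1 < R_LEVEL_3 < R_LEVEL_2 comes from the source.
R_LEVEL_1 = -1.0
R_LEVEL_3 = 0.0
R_LEVEL_2 = 1.0


def compute_reward(rpm, rpm_max, rpm_change_rate,
                   margin_threshold, change_threshold):
    """Map the detected rotation state to one of three reward levels."""
    margin = rpm_max - rpm
    if margin >= margin_threshold:
        # Far from the maximum rotation speed: third level.
        return R_LEVEL_3
    # Close to the maximum rotation speed from here on.
    if abs(rpm_change_rate) >= change_threshold:
        # Rotation speed still changing quickly: first (lowest) level.
        return R_LEVEL_1
    # Close to the maximum and steady: second (highest) level.
    return R_LEVEL_2
```

Under these assumptions the agent is rewarded most for holding the rotation speed steadily near its maximum, and least for approaching the maximum while the rotation speed is still changing rapidly.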

Further, one embodiment of the present invention is a learning device comprising a learning unit that learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of a wind turbine of a wind power generation system, rotation information on the rotation of the wind turbine, power information on the regenerative power generated by the wind power generation system, and relationship information between the wind speed and the rotation that indicates the limit wind speed, which is the limit of the wind speed at which the wind power generation system can generate power, and the maximum rotation speed at which the wind turbine can rotate, the learning device being able to acquire a reward obtained in a control device. The control device includes a state detection unit that detects the rotation information and the wind speed information when a rotation control parameter for controlling the rotation speed of the wind turbine is set in the wind turbine, and a reward calculation unit that calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit. The reward calculation unit calculates the reward based on the maximum rotation speed, the rotation speed, and the rate of change of the rotation speed. When the difference between the rotation speed detected by the state detection unit and the maximum rotation speed is less than a predetermined margin threshold and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is at or above a predetermined change threshold, the reward calculation unit calculates a first-level reward; when the rate of change is less than the change threshold, it calculates a second-level reward higher than the first level. The relationship information includes the reward calculated by the reward calculation unit, and the learning unit learns the correspondence information by reinforcement learning based on the reward.

Further, one embodiment of the present invention is a control device comprising: a state detection unit that detects rotation information on the rotation of a wind turbine of a wind power generation system, power information on the regenerative power generated by the wind power generation system, and wind speed information on the wind speed at the wind turbine, when a rotation control parameter for controlling the rotation speed of the wind turbine is set in the wind power generation system; a determination unit that determines the rotation information based on the power information and the wind speed information detected by the state detection unit and on correspondence information between the wind speed and the rotation of the wind turbine; and a control unit that controls the rotation speed based on the rotation information determined by the determination unit. In the time-series change of the wind speed, when the difference between the rotation speed detected by the state detection unit in a strong wind section, in which the wind speed is at or above a predetermined strong wind determination threshold, and the maximum rotation speed is less than a predetermined margin threshold, and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is less than a predetermined change threshold, the control device increases the wind speed limit, which is the upper limit of the wind speed at which the wind power generation system can generate power.

Further, one embodiment of the present invention is a control method for controlling the rotation speed of a wind turbine of a wind power generation system, in which: a learning unit learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine, rotation information on the rotation of the wind turbine, and relationship information between the wind speed and the rotation that indicates the wind speed and the maximum rotation speed at which the wind turbine can rotate; a storage unit stores the correspondence information between the wind speed and the rotation; a state detection unit detects the rotation information and the wind speed information when a rotation control parameter for controlling the rotation speed is set in the wind turbine; a determination unit determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information; and a control unit controls the rotation of the wind turbine based on the rotation information determined by the determination unit. In the time-series change of the wind speed, when the difference between the rotation speed detected by the state detection unit in a strong wind section, in which the wind speed is at or above a predetermined strong wind determination threshold, and the maximum rotation speed is less than a predetermined margin threshold, and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is less than a predetermined change threshold, the wind speed limit, which is the upper limit of the wind speed at which the wind power generation system can generate power, is increased.
Further, one embodiment of the present invention is a control method for controlling the rotation speed of a wind turbine of a wind power generation system, in which: a learning unit learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine, rotation information on the rotation of the wind turbine, and relationship information between the wind speed and the rotation that indicates the wind speed and the maximum rotation speed at which the wind turbine can rotate; a storage unit stores the correspondence information between the wind speed and the rotation; a state detection unit detects the rotation information and the wind speed information when a rotation control parameter for controlling the rotation speed is set in the wind turbine; and a reward calculation unit calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit. The reward calculation unit calculates the reward based on the maximum rotation speed, the rotation speed, and the rate of change of the rotation speed. When the difference between the rotation speed detected by the state detection unit and the maximum rotation speed is less than a predetermined margin threshold and the rate of change of the rotation speed indicated by the rotation information detected by the state detection unit is at or above a predetermined change threshold, the reward calculation unit calculates a first-level reward; when the rate of change is less than the change threshold, it calculates a second-level reward higher than the first level. The relationship information includes the reward calculated by the reward calculation unit, and the learning unit learns the correspondence information by reinforcement learning based on the reward. The determination unit determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information, and the control unit controls the rotation of the wind turbine based on the rotation information determined by the determination unit.
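The detect, determine, and control sequence of the control methods can be sketched as follows. This is an illustration only: the source does not say how the correspondence information is represented or how the determination unit combines its inputs. Here the correspondence information is assumed to be a lookup table from wind speed bins to target rotation speeds, and the detected rotation speed is assumed to bound how far the target may move in one step. `detect_state`, `set_rotation`, `bin_edges`, `target_rpms`, and `max_step` are all hypothetical names.

```python
from bisect import bisect_right


def decide_target_rpm(wind_speed, rpm, bin_edges, target_rpms, max_step):
    """Determination step: pick a target rotation speed.

    bin_edges   -- ascending wind speed boundaries (assumed table layout)
    target_rpms -- one target per bin, len(bin_edges) + 1 entries
    max_step    -- assumed cap on the change from the detected rpm
    """
    idx = bisect_right(bin_edges, wind_speed)
    table_rpm = target_rpms[idx]
    # Keep the target within +/- max_step of the detected rotation speed.
    low, high = rpm - max_step, rpm + max_step
    return min(max(table_rpm, low), high)


def control_step(detect_state, set_rotation, bin_edges, target_rpms,
                 max_step=20.0):
    """One pass of state detection, determination, and control."""
    wind_speed, rpm = detect_state()          # state detection unit
    target = decide_target_rpm(wind_speed, rpm, bin_edges,
                               target_rpms, max_step)  # determination unit
    set_rotation(target)                      # control unit
    return target
```

In a real system `detect_state` would read the sensors and `set_rotation` would write the rotation control parameter; both are passed in as callables here so that the sketch stays self-contained.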

As described above, according to the present invention, the limit wind speed can be changed according to the wind condition.

A block diagram showing an example of the schematic configuration of the wind power generation system 1 according to the first embodiment.
A block diagram showing an example of the configuration of the control device 60 and the learning device 70 of the wind power generation system 1 according to the first embodiment.
A diagram showing examples of reward conditions in strong wind according to the first embodiment.
A diagram showing an example of the relationship between wind speed and the rotation speed of the wind turbine in the wind power generation system 1 according to the first embodiment.
A diagram showing an example of the relationship between wind speed and generated power in the wind power generation system 1 according to the first embodiment.
A flowchart showing an operation example of the control device 60 according to the first embodiment.
A diagram explaining the target section according to the second embodiment.
A flowchart showing an operation example of the control device 60 according to the second embodiment.
A flowchart showing an operation example of the control device 60 according to the third embodiment.
A flowchart showing an operation example of the control device 60 according to the fourth embodiment.
A block diagram showing an example of the configuration of the control device 60A of the wind power generation system 1A according to the fifth embodiment.
A block diagram showing an example of the configuration of the control device 60B of the wind power generation system 1B according to the sixth embodiment.

Hereinafter, the control system, the learning device, and the control device of the embodiments will be described with reference to the drawings.

<First Embodiment>
FIG. 1 is a block diagram showing an example of the schematic configuration of the wind power generation system 1 according to the first embodiment. The wind power generation system 1 includes a wind power generator main body 10 and a control system 50. Various kinds of information are exchanged between the wind power generator main body 10 and the control system 50.
As shown in FIG. 1, for example, control parameters for controlling the wind power generator main body 10 are output from the control system 50 to the wind power generator main body 10.
In addition, for example, state parameters indicating the state of the wind power generator main body 10 are output from the wind power generator main body 10 to the control system 50.

The control parameters are, for example, a rotation speed control parameter for controlling the rotation speed of the wind turbine 20 and a power control parameter for controlling the amount of regenerative power generated by the generator 30.
The state parameters are information indicating, for example, the wind speed at the wind turbine 20, the rotation speed of the wind turbine 20 (hereinafter also simply referred to as the rotation speed), and the amount of regenerative power generated by the generator 30.

The wind power generator main body 10 includes a wind turbine 20, a generator 30, a rectifying/boosting unit 31, a voltage detection unit 32, a current detection unit 33, a wind speed sensor 41, and a rotation speed sensor 42.
The wind turbine 20 is configured, for example, as a vertical-axis wind turbine, such as a straight-blade vertical-axis wind turbine in which a plurality of straight blades are connected around a vertically extending rotating shaft so as to rotate as one unit.
The wind turbine 20 is connected, for example, via the rotating shaft to the rotor of the generator 30 described later, and rotates together with that rotor. Here, the rotor of the generator 30 rotates at a rotation speed corresponding to the amount of regenerative power generated by the generator 30. The amount of regenerative power is in turn subject to MPPT (Maximum Power Point Tracking) control by the control system 50 described later. The rotation speed of the wind turbine 20 is therefore controlled indirectly through the MPPT control performed by the control system 50.

The generator 30 is a device that converts the rotational force of the wind turbine 20 into electric power. It is configured, for example, as a three-phase AC generator, and generates AC power when its rotor, which is coupled to the rotating shaft of the wind turbine 20, rotates together with the wind turbine 20. The generator 30 supplies the generated AC power to the rectifying/boosting unit 31. In addition to operating as a generator that supplies the generated power to the rectifying/boosting unit 31, the generator 30 also operates as an electric motor supplied with AC power from the rectifying/boosting unit 31. The generator 30 operates as an electric motor, for example, when assist control is performed to assist rotation at startup of the wind turbine 20.

The rectifying/boosting unit 31 converts the AC power generated by the generator 30 into DC power and converts (boosts) the voltage of the resulting DC power. The rectifying/boosting unit 31 is, for example, a boost chopper circuit. The DC power output from the rectifying/boosting unit 31 corresponds to the regenerative power generated by the generator 30.

The rectifying/boosting unit 31 is a circuit that operates as a boost chopper circuit when the generator 30 performs power generation, and as an inverter when the generator 30 is operated as an electric motor, for example during assist control. The power supplied during assist control may come from a battery (not shown) of the wind power generation system 1.

The voltage detection unit 32 is a known voltmeter. It detects the output voltage of the rectifying/boosting unit 31 and outputs the detected output voltage to the control system 50.
The current detection unit 33 is a known ammeter. It detects the output current of the rectifying/boosting unit 31 and outputs the detected output current to the control system 50.
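The text does not state how the control system 50 uses the detected voltage and current. Since the output of the rectifying/boosting unit 31 is DC, one plausible use is to obtain the regenerative power as the product of the two readings and to accumulate it into an energy amount. The following Python sketch shows that arithmetic; the function names, the fixed sampling interval, and the watt-hour unit are assumptions.

```python
def regenerative_power_w(output_voltage_v, output_current_a):
    """DC power in watts from one voltage/current reading (P = V * I)."""
    return output_voltage_v * output_current_a


def accumulate_energy_wh(samples, dt_s):
    """Energy in watt-hours from (voltage, current) samples.

    samples -- iterable of (volts, amperes) pairs
    dt_s    -- assumed fixed sampling interval in seconds
    """
    total_joules = sum(regenerative_power_w(v, i) * dt_s for v, i in samples)
    return total_joules / 3600.0  # 1 Wh = 3600 J
```

For example, a constant 48 V and 2.5 A corresponds to 120 W, and holding that for one hour accumulates 120 Wh.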

The wind speed sensor 41 is a known wind speed sensor. It is provided, for example, at a predetermined position near the wind turbine 20 (for example, on a part of the wind turbine 20 other than the rotor blades) and detects the speed of the wind received by the wind turbine. The wind speed sensor 41 outputs information indicating the detected wind speed to the control system 50.

The rotation speed sensor 42 detects the rotation speed of the wind turbine 20. The rotation speed sensor 42 may be any sensor capable of detecting the rotation speed of the rotating shaft portion (not shown) of the wind turbine 20, and various known rotation speed sensors can be used. The rotation speed sensor 42 outputs information indicating the detected rotation speed to the control system 50.

制御システム５０は、制御装置６０と、学習装置７０とを備える。
制御装置６０は、風速センサ４１により検出された風速、及び回転速度センサ４２により検出された回転速度に基づいて、回転数制御パラメータを決定することにより風車の回転数を制御する。制御装置６０は、学習装置７０を用いて、回転数制御パラメータを決定する。制御装置６０が学習装置７０を用いて回転数制御パラメータを決定する方法については後で詳しく説明する。 The control system 50 includes a control device 60 and a learning device 70.
The control device 60 controls the rotation speed of the wind turbine by determining the rotation speed control parameter based on the wind speed detected by the wind speed sensor 41 and the rotation speed detected by the rotation speed sensor 42. The control device 60 uses the learning device 70 to determine the rotation speed control parameters. The method in which the control device 60 determines the rotation speed control parameter using the learning device 70 will be described in detail later.

学習装置７０は、例えば、強化学習を行う装置である。この場合、学習装置７０は、強化学習における学習する主体となるエージェントに相当し、制御対象（本実施形態では、風力発電機本体１０）とのやりとりにより、制御対象をより適切に制御するための学習を進める。
以下では、学習装置７０が強化学習を行う場合を例示して説明するが、これに限定されない。学習装置７０は、制御対象（風力発電機本体１０）に関する状態に基づいて、制御対象を制御するパラメータがより適切となるように学習するものであればよい。学習装置７０は、教師あり学習を行ってもよいし、教師なし学習を行ってもよいし、その他の学習を行ってもよい。ここで、制御対象（風力発電機本体１０）に関する状態とは、風力発電機本体１０及び風力発電機本体１０の周囲の状態であり、例えば、状態パラメータで示される風車２０における風速、風車２０の回転速度、及び発電機３０の発電量等の変数である。また、ここでの状態には、上述した風速等のような時々刻々変化する状態の他、予め定められた状態、例えば、風車２０の回転速度の限界値、風車２０の回転トルクの上下限、及び発電機３０が発電可能な最大の電力量等を含む。 The learning device 70 is, for example, a device that performs reinforcement learning. In this case, the learning device 70 corresponds to the agent, that is, the learning subject in reinforcement learning, and advances its learning through interaction with the controlled object (in the present embodiment, the wind power generator main body 10) so as to control the controlled object more appropriately.
Hereinafter, the case where the learning device 70 performs reinforcement learning will be described as an example, but the present invention is not limited to this. The learning device 70 may be any device that learns, based on the state of the controlled object (wind power generator main body 10), so that the parameters for controlling the controlled object become more appropriate. The learning device 70 may perform supervised learning, unsupervised learning, or other learning. Here, the state of the controlled object (wind power generator main body 10) means the state of the wind power generator main body 10 and its surroundings, for example, variables indicated by state parameters such as the wind speed at the wind turbine 20, the rotation speed of the wind turbine 20, and the power generation amount of the generator 30. In addition to states that change from moment to moment, such as the above-mentioned wind speed, the state here includes predetermined states, for example, the limit value of the rotation speed of the wind turbine 20, the upper and lower limits of the rotation torque of the wind turbine 20, and the maximum amount of electric power that the generator 30 can generate.

本実施形態では、学習装置７０は、風力発電機本体１０を制御する回転数制御パラメータを出力し、出力した回転数制御パラメータに応じて、風力発電機本体１０の状態を観察し、状態の変化に応じて、次の回転数制御パラメータを決定する。
また、学習装置７０は、風力発電機本体１０の状態に応じた報酬を受け取る。これにより、学習装置７０は、報酬を手掛かりとして自身が出力した回転数制御パラメータの良し悪しを判断することにより学習を進め、より適した回転数制御パラメータを出力することが可能となる。 In the present embodiment, the learning device 70 outputs a rotation speed control parameter for controlling the wind power generator main body 10, observes the state of the wind power generator main body 10 that results from the output rotation speed control parameter, and determines the next rotation speed control parameter according to the change in that state.
Further, the learning device 70 receives a reward according to the state of the wind power generator main body 10. As a result, the learning device 70 can proceed with learning by determining whether the rotation speed control parameter output by itself is good or bad by using the reward as a clue, and can output a more suitable rotation speed control parameter.

図２は、本発明の一実施形態に係る風力発電システム１の制御装置６０の構成の一例を示すブロック図である。
図２に示すように、制御装置６０は、パラメータ取得部６１と、状態検出部６２と、報酬算出部６３と、報酬出力部６４とを備える。また、学習装置７０は、強化学習部７１を備える。ここで、強化学習部７１は、「学習部」の一例である。 FIG. 2 is a block diagram showing an example of the configuration of the control device 60 of the wind power generation system 1 according to the embodiment of the present invention.
As shown in FIG. 2, the control device 60 includes a parameter acquisition unit 61, a state detection unit 62, a reward calculation unit 63, and a reward output unit 64. Further, the learning device 70 includes a reinforcement learning unit 71. Here, the reinforcement learning unit 71 is an example of the “learning unit”.

パラメータ取得部６１は、強化学習部７１から出力される回転数制御パラメータを取得する。パラメータ取得部６１は、取得した回転数制御パラメータを、風力発電機本体１０に対して出力する。 The parameter acquisition unit 61 acquires the rotation speed control parameter output from the reinforcement learning unit 71. The parameter acquisition unit 61 outputs the acquired rotation speed control parameter to the wind power generator main body 10.

状態検出部６２は、風力発電機本体１０の状態を示す状態パラメータを検出する。状態パラメータは、風力発電機本体１０に含まれる風車２０や発電機３０に関する情報であり、例えば、風速センサ４１により検出された風速、回転速度センサ４２により検出された回転速度、及び発電機３０により発電された回生電力を示す情報である。状態検出部６２は、検出した状態パラメータを、報酬算出部６３に出力する。 The state detection unit 62 detects a state parameter indicating the state of the wind power generator main body 10. The state parameters are information about the wind turbine 20 and the generator 30 included in the wind power generator main body 10, and are, for example, the wind speed detected by the wind speed sensor 41, the rotation speed detected by the rotation speed sensor 42, and the generator 30. It is information indicating the regenerated electric power generated. The state detection unit 62 outputs the detected state parameter to the reward calculation unit 63.

ここで、風速センサ４１により検出された風速は「風速情報」の一例である。回転速度センサ４２により検出された回転速度は、「回転情報」の一例である。また、発電機３０により発電された回生電力は、「電力情報」の一例である。 Here, the wind speed detected by the wind speed sensor 41 is an example of "wind speed information". The rotation speed detected by the rotation speed sensor 42 is an example of "rotation information". Further, the regenerative electric power generated by the generator 30 is an example of "electric power information".

報酬算出部６３は、状態検出部６２から取得した風速、及び回転速度を示す情報に基づいて、報酬を算出する。報酬算出部６３は、予め定めた所定の報酬条件に応じて報酬を算出する。
ここで、報酬条件は、例えば、風速に対して回転速度がより適切に制御されたと判定される場合に、より高い報酬が得られるように設定される。 The reward calculation unit 63 calculates the reward based on the information indicating the wind speed and the rotation speed acquired from the state detection unit 62. The reward calculation unit 63 calculates the reward according to a predetermined reward condition.
Here, the reward condition is set so that a higher reward can be obtained, for example, when it is determined that the rotation speed is more appropriately controlled with respect to the wind speed.

例えば、報酬算出部６３は、風速、及び回転速度に基づいて、強風時において風車の回転が過回転となることなく制御され、発電電力が低下し過ぎることが抑制されている場合には、より高い報酬を算出する。また、例えば、報酬算出部６３は、風速、及び回転速度に基づいて、発電に適した風況にも係らず、風車の回転が不適切に抑制されてしまい、発電電力の低下を引き起こしている場合には、より低い報酬を算出する。報酬算出部６３は、算出した報酬を報酬出力部６４に出力する。 For example, based on the wind speed and the rotation speed, the reward calculation unit 63 calculates a higher reward when, in a strong wind, the wind turbine is controlled without over-rotating and an excessive drop in generated power is suppressed. Conversely, based on the wind speed and the rotation speed, the reward calculation unit 63 calculates a lower reward when the rotation of the wind turbine is inappropriately suppressed despite wind conditions suitable for power generation, causing a decrease in the generated power. The reward calculation unit 63 outputs the calculated reward to the reward output unit 64.

報酬出力部６４は、報酬算出部６３から取得した報酬を、強化学習部７１に出力する。 The reward output unit 64 outputs the reward acquired from the reward calculation unit 63 to the reinforcement learning unit 71.

強化学習部７１は、風力発電機本体１０の状態を示す風速と回転速度を示す情報を取得する。また、強化学習部７１は、報酬出力部６４から報酬を取得する。強化学習部７１は、取得した風速と回転速度を示す情報、及び風速に対して予め定められた回転数の許容範囲と回転数の目標値を示す情報に基づいて、報酬を手掛かりとしてより高い報酬が得られるように学習を進め、より適切な回転数制御パラメータを出力する。 The reinforcement learning unit 71 acquires information indicating the wind speed and the rotation speed, which represent the state of the wind power generator main body 10. The reinforcement learning unit 71 also acquires a reward from the reward output unit 64. Based on the acquired information indicating the wind speed and the rotation speed, and on information indicating the allowable range and the target value of the rotation speed predetermined for the wind speed, the reinforcement learning unit 71 advances its learning using the reward as a clue so that a higher reward is obtained, and outputs a more appropriate rotation speed control parameter.

ここで、回転数の許容範囲とは、風車２０や発電機３０の機械的な耐用限界等に基づいて定められる風車２０の回転数として許容される範囲である。また、回転数の目標値とは、風速ごとに定まる回生電力が最大となる風車の回転数である。 Here, the permissible range of the rotation speed is the range permitted for the rotation speed of the wind turbine 20, determined based on, for example, the mechanical service limits of the wind turbine 20 and the generator 30. The target value of the rotation speed is the rotation speed of the wind turbine, determined for each wind speed, at which the regenerative power is maximized.

風力発電では、強風時においては、特に風車の回転数を制御することが困難となる場合があり、風車の回転速度が過多になった場合には風車が強制停止されてしまい発電電力の低下を招く。また、強風に備えて風車の回転速度を抑制し過ぎると、発電電力の低下を招く要因となり得る。
そこで、本実施形態では強風時における回転数制御パラメータに対し、風車が適切に制御されたか否かに応じて、報酬に差がつくように報酬条件を設定する。これにより、強化学習部７１に、強風時においてもより適切な回転数制御パラメータが出力できるように学習させることが可能となる。 In wind power generation, controlling the rotation speed of the wind turbine can be particularly difficult in strong winds. If the rotation speed becomes excessive, the wind turbine is forcibly stopped, which reduces the generated power. Conversely, suppressing the rotation speed too much in preparation for a strong wind can also reduce the generated power.
Therefore, in the present embodiment, the reward condition is set so that the reward is different depending on whether or not the wind turbine is appropriately controlled with respect to the rotation speed control parameter in the strong wind. This makes it possible for the reinforcement learning unit 71 to learn so that more appropriate rotation speed control parameters can be output even in a strong wind.

図３は、第１の実施形態に係る強風時における報酬条件の例を示す図である。
図３に示すように、強風時において、回転速度が所定の第１閾値以上である場合、最低レベルである第１レベルの報酬とする。つまり、強風時において回転速度が超過している場合には、最も低い報酬とする。最低レベルである第１レベルの報酬とは、例えば、マイナスの報酬である。 FIG. 3 is a diagram showing an example of a reward condition in a strong wind according to the first embodiment.
As shown in FIG. 3, when the rotation speed is equal to or higher than a predetermined first threshold value in a strong wind, the reward is the first level, which is the lowest level. In other words, if the rotation speed is exceeded in a strong wind, the lowest reward will be given. The lowest level first level reward is, for example, a negative reward.

また、回転速度が所定の第２閾値以上、尚且つ第１閾値未満である場合、最高ランクである第２レベルの報酬とする。つまり、強風時において回転速度が適正範囲に制御されている場合には、最も高い報酬とする。なお、第２閾値は第１閾値よりも低い閾値である。 Further, when the rotation speed is equal to or higher than a predetermined second threshold value and less than the first threshold value, the reward is the second level, which is the highest rank. That is, when the rotation speed is controlled within an appropriate range in a strong wind, the highest reward is given. The second threshold value is lower than the first threshold value.

また、回転速度が所定の第２閾値未満である場合、第２レベルよりも低く、尚且つ第１レベルよりは高い第３レベルの報酬とする。つまり、強風時において回転速度が速度不足である場合には、最も高い報酬よりは低い報酬であるが、回転速度が速度超過である場合よりは高い報酬とする。 Further, when the rotation speed is less than a predetermined second threshold value, the reward is a third level lower than the second level and higher than the first level. That is, when the rotation speed is insufficient in a strong wind, the reward is lower than the highest reward, but the reward is higher than when the rotation speed is excessive.

なお、ここでの強風時とは、所定の強風判定閾値以上の風速が検出された場合であり、この強風判定閾値は、風車の構造や機械的な強度に応じて任意に定められてよい。また、上述した第１閾値、及び第２閾値は、風速に依らず一定の値であってもよいし、風速に応じでそれぞれ異なる値であってもよい。 The strong wind here is a case where a wind speed equal to or higher than a predetermined strong wind determination threshold value is detected, and this strong wind determination threshold value may be arbitrarily determined according to the structure of the wind turbine and the mechanical strength. Further, the above-mentioned first threshold value and the second threshold value may be constant values regardless of the wind speed, or may be different values depending on the wind speed.

図４は、第１の実施形態に係る風力発電システム１における風速と風車の回転速度との関係の一例を示す図である。図４の横軸は風速［ｍ／ｓ］、縦軸は風車２０の回転数［ｒ／ｍｉｎ］を示す。また、図４においては、風速Ｂ［ｍ／ｓ］以上である場合に強風となる。 FIG. 4 is a diagram showing an example of the relationship between the wind speed and the rotation speed of the wind turbine in the wind power generation system 1 according to the first embodiment. The horizontal axis of FIG. 4 shows the wind speed [m / s], and the vertical axis shows the rotation speed [r / min] of the wind turbine 20. Further, in FIG. 4, when the wind speed is B [m / s] or more, the wind becomes strong.

図４に示すように、強風ではない風速Ａ［ｍ／ｓ］～風速Ｂ［ｍ／ｓ］までの間において、風速と回転数とが正の比例係数で比例する関係となるように制御される。一方、強風である風速Ｂ［ｍ／ｓ］以上の風速である場合、風速と回転数とが所定の関係となるように制御されることが望ましい。ここでの所定の関係とは、例えば、風速が増加すると回転数が低下する関係である。風速と回転数との関係が、この例における特性ＦＧ上の点となるように制御した場合、発電機３０のトルクを最大トルクに維持することができる。この場合、最大の発電電力を得ることが可能となる。この例では特性ＦＧに示す線上に沿って回転数が制御されることが最も望ましい。このため、本実施形態では、強風時において、風車２０の回転数が、その風速に応じた適正範囲に制御された場合に、最も高い報酬を設定する。この場合の適正範囲は、例えば、風速に応じた特性ＦＧ上の点を含む所定の範囲である。 As shown in FIG. 4, between the wind speed A [m / s] and the wind speed B [m / s], where the wind is not strong, the wind turbine is controlled so that the rotation speed is proportional to the wind speed with a positive proportionality coefficient. On the other hand, at wind speeds of B [m / s] or higher, where the wind is strong, it is desirable that the wind speed and the rotation speed are controlled so as to have a predetermined relationship. The predetermined relationship here is, for example, one in which the rotation speed decreases as the wind speed increases. When the relationship between the wind speed and the rotation speed is controlled to lie on the characteristic FG in this example, the torque of the generator 30 can be maintained at the maximum torque, and the maximum generated power can be obtained. In this example, it is most desirable that the rotation speed is controlled along the line of the characteristic FG. Therefore, in the present embodiment, the highest reward is set when, in a strong wind, the rotation speed of the wind turbine 20 is controlled within the appropriate range for that wind speed. The appropriate range in this case is, for example, a predetermined range including the point on the characteristic FG for that wind speed.

一方、回転数が特性ＦＧ上の点を超過してしまった場合、発電機３０の回転子の機械的な耐用限界を超過する可能性があることから、風車が強制停止される。風車が強制停止されてしまうと発電をすることができない。つまり、風速と回転数との関係が領域Ｄにある場合、風車が強制停止される可能性が高まることから、このような事態は望ましくない。このため、本実施形態では、強風時において、風車２０の回転数が、その風速に応じた回転速度における速度超過の範囲に制御された場合に、最も低い報酬を設定する。この場合の速度超過の範囲は、例えば、風速に応じた領域Ｄに含まれる所定の範囲である。 On the other hand, if the rotation speed exceeds the point on the characteristic FG, the wind turbine is forcibly stopped because the mechanical service limit of the rotor of the generator 30 may be exceeded. Once the wind turbine is forcibly stopped, it cannot generate power. That is, when the relationship between the wind speed and the rotation speed lies in the region D, the likelihood of a forced stop increases, which is undesirable. Therefore, in the present embodiment, the lowest reward is set when, in a strong wind, the rotation speed of the wind turbine 20 falls within the overspeed range for that wind speed. The overspeed range in this case is, for example, a predetermined range included in the region D for that wind speed.

また、回転数が特性ＦＧ上の点よりも低下した場合、風車２０の回転数を過剰に抑制することになり、発電機３０はさらに発電をすることが可能であるにも係らず、発電できていない状態となる。この場合、風車２０が強制停止されたり、発電機３０の機械的な耐用限界を超えたりする心配はないが、発電電力を最大限に引き出せていない。このため、本実施形態では、強風時において、風車２０の回転数が、その風速に応じた速度不足の範囲に制御された場合に、最も高い報酬より低いが、最も低い報酬よりは高い報酬を設定する。この場合の速度不足の範囲は、例えば、領域ＦＢＣＧで示される範囲である。 Further, when the rotation speed falls below the point on the characteristic FG, the rotation speed of the wind turbine 20 is excessively suppressed, and the generator 30 fails to generate power that it would otherwise be able to generate. In this case, there is no concern that the wind turbine 20 will be forcibly stopped or that the mechanical service limit of the generator 30 will be exceeded, but the generated power is not being drawn out to the maximum. Therefore, in the present embodiment, when, in a strong wind, the rotation speed of the wind turbine 20 falls within the underspeed range for that wind speed, a reward lower than the highest reward but higher than the lowest reward is set. The underspeed range in this case is, for example, the range indicated by the region FBCG.

図５は、第１の実施形態に係る風力発電システム１における風速と発電電力の出力との関係の一例を示す図である。図５の横軸は風速［ｍ／ｓ］、縦軸は発電電力［ｋＷ］を示す。図５では、図４における制御特性に基づいて風車の回転数が制御された場合における風力と発電電力との関係を示している。また、図５においては、図４同様に、風速Ｂ［ｍ／ｓ］以上である場合に強風となる。
図５に示すように、強風ではない風速Ａ［ｍ／ｓ］～風速Ｂ［ｍ／ｓ］までの間において、風速に応じて発電電力が三次関数的に増大する。一方、強風である風速Ｂ［ｍ／ｓ］以上の風速である場合、図４に示す特性ＦＧに沿って回転数が制御されることで、発電電力が最大出力Ｍａｘに維持される。 FIG. 5 is a diagram showing an example of the relationship between the wind speed and the output of the generated power in the wind power generation system 1 according to the first embodiment. The horizontal axis of FIG. 5 shows the wind speed [m / s], and the vertical axis shows the generated power [kW]. FIG. 5 shows the relationship between the wind power and the generated power when the rotation speed of the wind turbine is controlled based on the control characteristics in FIG. Further, in FIG. 5, as in FIG. 4, when the wind speed is B [m / s] or more, the wind becomes strong.
As shown in FIG. 5, between the wind speed A [m / s] and the wind speed B [m / s], where the wind is not strong, the generated power increases as a cubic function of the wind speed. On the other hand, at wind speeds of B [m / s] or higher, where the wind is strong, the generated power is maintained at the maximum output Max by controlling the rotation speed along the characteristic FG shown in FIG. 4.

図６は、第１の実施形態に係る制御装置６０の動作例を示すフローチャートである。
まず、制御装置６０のパラメータ取得部６１は、強化学習部７１から回転数制御パラメータを取得する（ステップＳ１０）。
次に、状態検出部６２は、風速、及び回転速度を検出する（ステップＳ１１）。状態検出部６２は、風速センサ４１により検出された風速、及び回転速度センサ４２により検出された回転速度を取得することにより、風速、及び回転速度を検出する。状態検出部６２は、検出した風速、及び回転速度を、報酬算出部６３に出力する。 FIG. 6 is a flowchart showing an operation example of the control device 60 according to the first embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires the rotation speed control parameter from the reinforcement learning unit 71 (step S10).
Next, the state detection unit 62 detects the wind speed and the rotation speed (step S11). The state detection unit 62 detects the wind speed and the rotation speed by acquiring the wind speed detected by the wind speed sensor 41 and the rotation speed detected by the rotation speed sensor 42. The state detection unit 62 outputs the detected wind speed and rotation speed to the reward calculation unit 63.

次に、報酬算出部６３は、報酬を算出する。
まず、報酬算出部６３は、風速が強風であるか否かを判定する（ステップＳ１２）。報酬算出部６３は、風速が強風である場合、回転速度が第１閾値以上であるか否かを判定する（ステップＳ１３）。報酬算出部６３は、回転速度が第１閾値以上である場合、第１レベルの報酬とする（ステップＳ１４）。一方、報酬算出部６３は、回転速度が第１閾値未満である場合、回転速度が第２閾値以上であるか否かを判定する（ステップＳ１６）。 Next, the reward calculation unit 63 calculates the reward.
First, the reward calculation unit 63 determines whether or not the wind speed is a strong wind (step S12). When the wind speed is a strong wind, the reward calculation unit 63 determines whether or not the rotation speed is equal to or higher than the first threshold value (step S13). When the rotation speed is equal to or higher than the first threshold value, the reward calculation unit 63 sets the reward as the first level reward (step S14). On the other hand, when the rotation speed is less than the first threshold value, the reward calculation unit 63 determines whether or not the rotation speed is equal to or higher than the second threshold value (step S16).

報酬算出部６３は、回転速度が第２閾値以上である場合、第２レベルの報酬とする（ステップＳ１７）。一方、報酬算出部６３は、回転速度が第２閾値未満である場合、第３レベルの報酬とする（ステップＳ１８）。 When the rotation speed is equal to or higher than the second threshold value, the reward calculation unit 63 sets the reward as the second level reward (step S17). On the other hand, when the rotation speed is less than the second threshold value, the reward calculation unit 63 sets the reward as the third level reward (step S18).

なお、ステップＳ１２において、風速が強風でない場合、報酬算出部６３は、通常レベルの報酬とする（ステップＳ１９）。通常レベルの報酬とは、例えば、回転速度が適正範囲に含まれるように制御されている場合には最も高い報酬とし、適正範囲から外れた場合には外れた方向（速度超過、又は速度不足）に関わらず、適正範囲から乖離した度合に応じて、報酬を低減させる。 In step S12, if the wind speed is not a strong wind, the reward calculation unit 63 sets the reward as a normal level reward (step S19). The normal level reward is, for example, the highest reward when the rotation speed is controlled to be included in the appropriate range, and when it is out of the appropriate range, the direction is out of the range (excessive speed or insufficient speed). Regardless, the reward will be reduced according to the degree of deviation from the appropriate range.

報酬算出部６３は、算出した報酬を、報酬出力部６４に出力する。報酬出力部６４は、報酬を、強化学習部７１に出力する（ステップＳ１５）。 The reward calculation unit 63 outputs the calculated reward to the reward output unit 64. The reward output unit 64 outputs the reward to the reinforcement learning unit 71 (step S15).
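The branching in steps S12 to S19 can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the names `RewardConfig` and `compute_reward`, the numeric reward values, and the linear penalty used for the normal level are all assumptions, and the sketch further assumes that the same two thresholds delimit the appropriate range when the wind is not strong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    """Hypothetical container for the thresholds and reward levels."""
    strong_wind_threshold: float  # wind speed [m/s] at or above which the wind is "strong"
    first_threshold: float        # rotation speed [r/min] at or above which the turbine overspeeds
    second_threshold: float       # lower edge of the appropriate range (below first_threshold)
    level1: float = -1.0          # first level: lowest reward (overspeed), assumed value
    level2: float = 1.0           # second level: highest reward (appropriate range), assumed value
    level3: float = 0.0           # third level: between the two (underspeed), assumed value
    normal_slope: float = 0.01    # assumed penalty per r/min of deviation in normal wind


def compute_reward(wind_speed: float, rotation_speed: float, cfg: RewardConfig) -> float:
    """Return the reward for one observation, following steps S12 to S19."""
    if wind_speed >= cfg.strong_wind_threshold:        # S12: strong wind?
        if rotation_speed >= cfg.first_threshold:      # S13: overspeed?
            return cfg.level1                          # S14: first level reward
        if rotation_speed >= cfg.second_threshold:     # S16: within appropriate range?
            return cfg.level2                          # S17: second level reward
        return cfg.level3                              # S18: third level reward
    # S19: normal level. Highest reward inside the range; outside it, the reward
    # shrinks with the size of the deviation regardless of its direction.
    if cfg.second_threshold <= rotation_speed < cfg.first_threshold:
        return cfg.level2
    if rotation_speed >= cfg.first_threshold:
        deviation = rotation_speed - cfg.first_threshold
    else:
        deviation = cfg.second_threshold - rotation_speed
    return cfg.level2 - cfg.normal_slope * deviation
```

Under these assumed values, overspeed in a strong wind is the only case that yields a negative reward, which matches the ordering of the three levels described for FIG. 3.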

＜第２の実施形態＞
次に第２の実施形態について説明する。
本実施形態では、制御装置６０の制御対象が回生電力である点において、他の実施形態と相違する。以下では、上述した実施形態と異なる点を説明し、上述した実施形態と同一または類似の機能を有する構成に同一の符号を付し、その説明を省略する。 <Second embodiment>
Next, the second embodiment will be described.
This embodiment differs from other embodiments in that the controlled object of the control device 60 is regenerative power. Hereinafter, the points different from those of the above-described embodiment will be described, and the same reference numerals will be given to the configurations having the same or similar functions as those of the above-described embodiments, and the description thereof will be omitted.

まず前提として、風力発電においては、風車２０が受ける風速に対する回生電力の最大値は風車ごとに固有の特性値として一義的に定められている。そして、この回生電力の最大値が、風速に応じた回生電力の目標値として設定され、回生電力の目標値に近づくように回生電力が制御される。 First, as a premise, in wind power generation, the maximum value of regenerative power with respect to the wind speed received by the wind turbine 20 is uniquely determined as a characteristic value peculiar to each wind turbine. Then, the maximum value of this regenerative power is set as a target value of the regenerative power according to the wind speed, and the regenerative power is controlled so as to approach the target value of the regenerative power.

制御装置６０は、学習装置７０を用いて、電力制御パラメータを決定する。制御装置６０は、電力制御パラメータに基づいて、整流・昇圧部３１から出力される直流電力が、電力制御パラメータにより指示された電力値に近づくようにＭＰＰＴ（Maximum Power Point Tracking）制御を行う。具体的には、制御装置６０は、電圧検出部３２で検出される出力電圧、および電流検出部で検出される出力電流によって決定される出力電力が、電力制御パラメータにより指示された回生電力の目標値となるように整流・昇圧部３１に与えるＰＷＭ信号のデューティ比を変化させる。 The control device 60 uses the learning device 70 to determine a power control parameter. Based on the power control parameter, the control device 60 performs MPPT (Maximum Power Point Tracking) control so that the DC power output from the rectifying / boosting unit 31 approaches the power value specified by the power control parameter. Specifically, the control device 60 changes the duty ratio of the PWM signal given to the rectifying / boosting unit 31 so that the output power, determined by the output voltage detected by the voltage detection unit 32 and the output current detected by the current detection unit 33, reaches the target value of the regenerative power specified by the power control parameter.
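One control step of the duty ratio adjustment described above might look like the following Python sketch. It is a simple proportional update offered for illustration only; the description does not state the update rule. The function name `update_duty`, the gain, and the duty limits are hypothetical, and the sketch assumes that raising the duty ratio raises the output power, which in a real converter depends on the operating point.

```python
def update_duty(duty: float, voltage: float, current: float, target_power: float,
                gain: float = 1e-4, duty_min: float = 0.05, duty_max: float = 0.95) -> float:
    """Nudge the PWM duty ratio so the measured output power approaches the target.

    Assumption: a higher duty ratio yields a higher output power. The gain and
    the duty limits are illustrative values, not taken from the description.
    """
    power = voltage * current              # output power from the detected voltage and current
    error = target_power - power           # positive when more power is wanted
    new_duty = duty + gain * error         # proportional correction
    return min(max(new_duty, duty_min), duty_max)  # keep the duty ratio within its limits
```

Calling this once per control cycle with the latest readings from the voltage and current detection units moves the output power step by step toward the target specified by the power control parameter.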

学習装置７０は、状態検出部６２により検知される状態パラメータ、及び報酬出力部６４により出力される報酬に基づいて、電力制御パラメータを出力する。また、学習装置７０は、出力した電力制御パラメータが風力発電機本体１０の整流・昇圧部３１に設定されたことによる、風力発電機本体１０の状態の変化を観察し、状態の変化や報酬に応じて、次の電力制御パラメータを決定する。 The learning device 70 outputs a power control parameter based on the state parameters detected by the state detection unit 62 and the reward output by the reward output unit 64. Further, the learning device 70 observes the change in the state of the wind power generator main body 10 that results from the output power control parameter being set in the rectifying / boosting unit 31 of the wind power generator main body 10, and determines the next power control parameter according to the change in the state and the reward.

強化学習部７１は、風力発電機本体１０の状態を示す風速と回転速度と回生電力を示す情報を取得する。強化学習部７１は、取得した風速と回転速度と回生電力を示す情報、及び回生電力の目標値を示す情報に基づいて、報酬を手掛かりとしてより高い報酬が得られるように学習を進め、より適切な電力制御パラメータを出力する。ここで、回生電力の目標値とは、風速ごとに定まる発電可能な回生電力の最大値である。 The reinforcement learning unit 71 acquires information indicating the wind speed, the rotation speed, and the regenerative power, which represent the state of the wind power generator main body 10. Based on the acquired information indicating the wind speed, the rotation speed, and the regenerative power, and on information indicating the target value of the regenerative power, the reinforcement learning unit 71 advances its learning using the reward as a clue so that a higher reward is obtained, and outputs a more appropriate power control parameter. Here, the target value of the regenerative power is the maximum regenerative power that can be generated, which is determined for each wind speed.

報酬算出部６３は、報酬を算出する場合、報酬を算出する対象とする所定の対象区間を抽出する。ここで、対象区間とは、風況に応じて定まる所定の区間であり、例えば、風速が減速する減速区間と風速が加速する加速区間とを合わせた区間である。 When calculating the reward, the reward calculation unit 63 extracts a predetermined target section for which the reward is calculated. Here, the target section is a predetermined section determined according to the wind condition, and is, for example, a section in which a deceleration section in which the wind speed decelerates and an acceleration section in which the wind speed accelerates are combined.

風力発電においては、風速が減速しているにも係らず、風速に応じた最大の回生電力が得られる回転数で風車を回転させ続けると、発電負荷により風車の回転が失速してしまう場合があった。このための対策として、例えば、減速区間では回生電力を最大にする制御を行わず、回転が失速しないように回転速度を維持する制御を行い、加速時に回転数を増加させることでより高い回生電力が得られるように制御することが考えられる。このように制御すれば、トータルの発電量を増やすことが可能である。 In wind power generation, if the wind turbine continues to be rotated at the rotation speed that yields the maximum regenerative power for the wind speed even though the wind speed is decelerating, the rotation of the wind turbine may stall due to the power generation load. As a countermeasure, it is conceivable, for example, not to maximize the regenerative power in the deceleration section but instead to maintain the rotation speed so that the rotation does not stall, and then to increase the rotation speed during acceleration so that higher regenerative power is obtained. Controlling the wind turbine in this way makes it possible to increase the total amount of power generation.

そこで、本実施形態では、報酬算出部６３は、所定の対象区間における回生電力の加算値が所定の電力閾値以上である場合には、より高い報酬を算出する。また、報酬算出部６３は、対象区間における回生電力の加算値が所定の電力閾値未満である場合には、より低い報酬を算出する。これにより、学習装置７０に、所定の対象区間における回生電力の加算値が高くなるような電力制御パラメータを出力するように学習させることができる。例えば、学習装置７０に、減速区間のどのタイミングで回生電力を最大にする制御から回転速度を維持する制御に切替え、加速区間のどのタイミングで回転速度を維持する制御から回生電力を最大にする制御に切替えれば、トータルの発電量が増えるかを学習させることができる。 Therefore, in the present embodiment, the reward calculation unit 63 calculates a higher reward when the summed value of the regenerative power in the predetermined target section is equal to or higher than a predetermined power threshold value. Further, the reward calculation unit 63 calculates a lower reward when the summed value of the regenerative power in the target section is less than the predetermined power threshold value. As a result, the learning device 70 can be trained to output power control parameters that increase the summed value of the regenerative power in the predetermined target section. For example, the learning device 70 can be made to learn at which timing in the deceleration section to switch from the control that maximizes the regenerative power to the control that maintains the rotation speed, and at which timing in the acceleration section to switch back from the control that maintains the rotation speed to the control that maximizes the regenerative power, so that the total amount of power generation increases.
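As a minimal sketch of this section-based reward, the Python function below sums the sampled regenerative power over one target section and compares the sum with a power threshold. The function name `section_reward`, the index-pair representation of a section, and the two reward values are assumptions introduced for illustration.

```python
from typing import Sequence, Tuple


def section_reward(regen_power: Sequence[float], section: Tuple[int, int],
                   power_threshold: float, high: float = 1.0, low: float = -1.0) -> float:
    """Reward for one target section based on the summed regenerative power.

    `section` is an assumed (start, end) pair of sample indices, both inclusive.
    The reward values `high` and `low` are illustrative, not from the description.
    """
    start, end = section
    total = sum(regen_power[start:end + 1])  # summed value of regenerative power in the section
    return high if total >= power_threshold else low
```

Because the reward depends on the sum over a deceleration section and the following acceleration section together, a policy that gives up some power while the wind slows can still score well if it recovers more power once the wind picks up again.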

図７は、第２の実施形態に係る風力発電システム１における対象区間を示す図である。図７（ａ）は風速の時間変化を模式的に示す図である。図７（ｂ）は風速の時間変化の一例である。図７（ａ）、及び（ｂ）の横軸は時間［ｍｉｎ］、縦軸は風車２０の風速［ｍ／ｓ］を示す。 FIG. 7 is a diagram showing a target section in the wind power generation system 1 according to the second embodiment. FIG. 7A is a diagram schematically showing the time change of the wind speed. FIG. 7B is an example of the time change of the wind speed. The horizontal axis of FIGS. 7A and 7B shows the time [min], and the vertical axis shows the wind speed [m / s] of the wind turbine 20.

図７（ａ）に示すように、風速の時間変化においては、加速のピークＰ１を示した後、減速に転じて減速のピークＰ２を示し、その後加速に転じて加速のピークＰ３を示す。風速はこのように減速と加速とを交互に繰り返しながら変化する。報酬算出部６３は、加速のピークＰ１から減速のピークＰ２までを減速区間、減速のピークＰ２から加速のピークＰ３までを加速区間とし、減速区間とその後の加速区間とを合わせた対象区間を抽出する。 As shown in FIG. 7A, the wind speed over time reaches an acceleration peak P1, then turns to deceleration and reaches a deceleration peak P2, and then turns to acceleration and reaches an acceleration peak P3. In this way, the wind speed changes while alternating between deceleration and acceleration. The reward calculation unit 63 treats the span from the acceleration peak P1 to the deceleration peak P2 as a deceleration section and the span from the deceleration peak P2 to the acceleration peak P3 as an acceleration section, and extracts a target section that combines the deceleration section with the acceleration section that follows it.

図７（ｂ）に示すように、時間Ｔ１において風速が加速のピークとなり、その後減速した風速が時間Ｔ２において再び加速のピークとなる場合、対象区間は時間Ｔ１からＴ２までの間（以下、単に「時間Ｔ１～Ｔ２」と記す）である。同様に、時間Ｔ２～Ｔ３、時間Ｔ３～Ｔ４、…がそれぞれ対象区間である。対象区間の時間は、風況に応じて定まる任意の時間であってよく、ある対象区間と他の対象区間との時間が異なっていてよい。また、時間Ｔ４～Ｔ５のように、一旦減速した風速がしばらく維持され、再度減速したような場合も減速区間としてよく、その後の加速区間と合わせて対象区間としてよい。また、時間Ｔ５～Ｔ６のように対象区間に対して減速区間の割合が極端に少ない場合や、時間Ｔ６～Ｔ７のように対象区間に対して増加区間の割合が極端に少ない場合も、対象区間としてよい。 As shown in FIG. 7B, when the wind speed reaches an acceleration peak at time T1, decelerates, and then reaches an acceleration peak again at time T2, the target section is the period from time T1 to T2 (hereinafter simply referred to as "time T1 to T2"). Similarly, the times T2 to T3, T3 to T4, and so on are each target sections. The duration of a target section may be any duration determined by the wind conditions, and may differ from one target section to another. Further, as in time T4 to T5, a case where the wind speed decelerates, is held for a while, and then decelerates again may also be treated as a deceleration section and combined with the subsequent acceleration section to form a target section. A period may also be treated as a target section when the deceleration section makes up an extremely small share of it, as in time T5 to T6, or when the acceleration section makes up an extremely small share of it, as in time T6 to T7.

報酬算出部６３は、対象区間を抽出する場合、例えば、状態検出部６２により検出された風速Ｖ（ｎ）と、その前に状態検出部６２により検出された風速Ｖ（ｎ－１）との風速差分（Ｖ（ｎ）－Ｖ（ｎ－１））を算出することで、風速が減速しているか、減速のピークであるか、加速しているか、又は加速のピークであるかを判定する。報酬算出部６３は、風速差分がマイナスの値である場合、風速が減速であると判定する。報酬算出部６３は、風速差分がマイナスの値から０（ゼロ）、又は０（ゼロ）に近い所定の範囲内に変化した場合、風速が減速のピークであると判定する。報酬算出部６３は、風速差分がプラスの値である場合、風速が加速であると判定する。報酬算出部６３は、風速差分がプラスの値から０（ゼロ）、又は０（ゼロ）に近い所定の範囲内に変化した場合、風速が加速のピークであると判定する。 When extracting the target section, the reward calculation unit 63 calculates, for example, the wind speed difference (V (n) - V (n-1)) between the wind speed V (n) detected by the state detection unit 62 and the wind speed V (n-1) detected by the state detection unit 62 immediately before it, and thereby determines whether the wind speed is decelerating, at a deceleration peak, accelerating, or at an acceleration peak. When the wind speed difference is a negative value, the reward calculation unit 63 determines that the wind speed is decelerating. When the wind speed difference changes from a negative value to 0 (zero) or to a value within a predetermined range close to 0 (zero), the reward calculation unit 63 determines that the wind speed is at a deceleration peak. When the wind speed difference is a positive value, the reward calculation unit 63 determines that the wind speed is accelerating. When the wind speed difference changes from a positive value to 0 (zero) or to a value within a predetermined range close to 0 (zero), the reward calculation unit 63 determines that the wind speed is at an acceleration peak.
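The difference-based classification above, together with the extraction of target sections between consecutive acceleration peaks, can be sketched in Python as follows. The names `classify_wind` and `extract_target_sections`, the tolerance `eps`, and the extra "steady" label for samples that are neither moving nor at a peak are assumptions made for illustration. The sketch also inherits a limitation of the rule as written: if the difference jumps directly from negative to positive between two samples, no peak is reported.

```python
from typing import List, Sequence, Tuple


def classify_wind(prev_diff: float, diff: float, eps: float = 0.05) -> str:
    """Classify a sample from the previous and current differences V(n) - V(n-1)."""
    if abs(diff) <= eps:                 # difference is 0 or within the range close to 0
        if prev_diff < -eps:
            return "deceleration_peak"   # changed from a negative value to about 0
        if prev_diff > eps:
            return "acceleration_peak"   # changed from a positive value to about 0
        return "steady"                  # assumed label: no change before or now
    return "decelerating" if diff < 0 else "accelerating"


def extract_target_sections(wind_speeds: Sequence[float],
                            eps: float = 0.05) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Label every sample and pair up consecutive acceleration peaks as target sections."""
    labels = ["steady"]                  # the first sample has no difference yet
    prev_diff = 0.0
    for n in range(1, len(wind_speeds)):
        diff = wind_speeds[n] - wind_speeds[n - 1]
        labels.append(classify_wind(prev_diff, diff, eps))
        prev_diff = diff
    peaks = [i for i, label in enumerate(labels) if label == "acceleration_peak"]
    # Each target section runs from one acceleration peak to the next, so it
    # contains a deceleration section followed by an acceleration section.
    return labels, list(zip(peaks, peaks[1:]))
```

For a series such as `[5.0, 6.0, 6.0, 5.0, 4.0, 4.0, 5.0, 6.0, 6.0]`, the sketch reports acceleration peaks at indices 2 and 8 and a deceleration peak at index 5, giving a single target section `(2, 8)`.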

なお、風速を検出する周期は、風力発電機本体１０に対して制御を行う制御周期（例えば、１０［ｍｓ］）や風況等に応じて任意に定められてよい。また、風況に応じた回生電力が出力されるまでに所定の遅延があることが考えられることから、報酬算出部６３は、対象区間に応じた時間に所定の遅延時間を考慮した時間における回生電力に基づいて、報酬を算出するようにしてよい。この場合の遅延時間は、風速に依らず一定の値であってもよいし、風速に応じて変動する値であってもよい。 The cycle at which the wind speed is detected may be set arbitrarily according to the control cycle for controlling the wind power generator main body 10 (for example, 10 [ms]), the wind conditions, and the like. In addition, because a predetermined delay may occur before the regenerative power corresponding to the wind conditions is output, the reward calculation unit 63 may calculate the reward based on the regenerative power over a period obtained by applying a predetermined delay time to the period corresponding to the target section. The delay time in this case may be a constant value regardless of the wind speed, or a value that varies with the wind speed.

また、上記では、報酬算出部６３は、対象区間における回生電力の加算値の大きさに基づいて、報酬を算出したが、これに限定されることはない。報酬算出部６３は、減速区間において回転速度が失速することなく維持された場合により高い報酬を算出し、減速区間において回転速度が失速した場合にはより低い報酬を算出するようにしてもよい。 Further, in the above, the reward calculation unit 63 calculates the reward based on the magnitude of the added value of the regenerative power in the target section, but the reward is not limited to this. The reward calculation unit 63 may calculate a higher reward when the rotation speed is maintained without stalling in the deceleration section, and may calculate a lower reward when the rotation speed stalls in the deceleration section.

また、上記では、報酬算出部６３は、風速に基づいて対象区間を抽出したが、これに限定されない。報酬算出部６３は、風速に代えて回転数を用いて、対象区間を抽出してもよいし、風速と回転数を用いて対象区間を抽出してもよい。 Further, in the above, the reward calculation unit 63 has extracted the target section based on the wind speed, but the present invention is not limited to this. The reward calculation unit 63 may extract the target section by using the rotation speed instead of the wind speed, or may extract the target section by using the wind speed and the rotation speed.

また、報酬算出部６３は、風速に基づいて、対象区間におけるトータル発電量に基づいて報酬を算出するか否かを判定してもよい。報酬算出部６３は、例えば、風速が所定の強風閾値未満である場合、つまり強風でない場合、対象区間におけるトータル発電量に基づいて報酬を算出すると判定する。一方、報酬算出部６３は、風速が所定の強風閾値以上である場合、つまり強風である場合、対象区間におけるトータル発電量に基づいて報酬を算出しないと判定する。強風時に発電量を高めようとすれば、風車２０が過回転となる可能性があるためである。 Further, the reward calculation unit 63 may determine whether or not to calculate the reward based on the total power generation amount in the target section based on the wind speed. The reward calculation unit 63 determines that the reward is calculated based on the total power generation amount in the target section, for example, when the wind speed is less than a predetermined strong wind threshold value, that is, when the wind speed is not strong. On the other hand, the reward calculation unit 63 determines that the reward is not calculated based on the total power generation amount in the target section when the wind speed is equal to or higher than the predetermined strong wind threshold value, that is, when the wind speed is strong. This is because if an attempt is made to increase the amount of power generation in a strong wind, the wind turbine 20 may over-rotate.

図８は、第２の実施形態に係る制御装置６０の動作例を示すフローチャートである。
まず、制御装置６０のパラメータ取得部６１は、強化学習部７１から電力制御パラメータを取得する（ステップＳ２０）。
次に、状態検出部６２は、風速、回転速度、及び回生電力を検出する（ステップＳ２１）。状態検出部６２は、風速センサ４１により検出された風速、及び回転速度センサ４２により検出された回転速度、を取得することにより、風速、及び回転速度を検出する。また、状態検出部６２は、電圧検出部３２より検出された回生電力の電圧、及び電流検出部３３により検出された回生電力の電流を取得することにより、回生電力を検出する。状態検出部６２は、検出した風速、回転速度、及び回生電力を、報酬算出部６３に出力する。 FIG. 8 is a flowchart showing an operation example of the control device 60 according to the second embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires the power control parameter from the reinforcement learning unit 71 (step S20).
Next, the state detection unit 62 detects the wind speed, the rotation speed, and the regenerative power (step S21). The state detection unit 62 detects the wind speed and the rotation speed by acquiring the wind speed detected by the wind speed sensor 41 and the rotation speed detected by the rotation speed sensor 42. Further, the state detection unit 62 detects the regenerative power by acquiring the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33. The state detection unit 62 outputs the detected wind speed, rotation speed, and regenerative power to the reward calculation unit 63.

次に、報酬算出部６３は、報酬を算出する。
まず、報酬算出部６３は、風速が加速のピークであるか否かを判定する（ステップＳ２２）。報酬算出部６３は、風速が加速のピークである場合、対象区間における回生電力の加算値（トータル発電量）が所定の電力閾値以上であるか否かを判定する（ステップＳ２３）。報酬算出部６３は、トータル発電量が電力閾値以上である場合、第２レベルの報酬とする（ステップＳ２４）。一方、報酬算出部６３は、トータル発電量が電力閾値未満である場合、第２レベルより低い第１レベルの報酬とする（ステップＳ２５）。
報酬算出部６３は、トータル電力量をクリアし、ステップＳ２０に示す処理に戻る（ステップＳ２６）。 Next, the reward calculation unit 63 calculates the reward.
First, the reward calculation unit 63 determines whether or not the wind speed is the peak of acceleration (step S22). When the wind speed is the peak of acceleration, the reward calculation unit 63 determines whether or not the added value (total power generation amount) of the regenerative power in the target section is equal to or higher than the predetermined power threshold value (step S23). When the total power generation amount is equal to or greater than the power threshold value, the reward calculation unit 63 sets the reward as the second level reward (step S24). On the other hand, when the total power generation amount is less than the power threshold value, the reward calculation unit 63 sets the reward as the first level reward lower than the second level (step S25).
The reward calculation unit 63 clears the total electric energy and returns to the process shown in step S20 (step S26).

一方、ステップＳ２２において、風速が加速のピークでない場合、報酬算出部６３は、トータル発電量に、検出した回生電力を加算し、ステップＳ２０に示す処理に戻る（ステップＳ２７）。 On the other hand, if the wind speed is not the peak of acceleration in step S22, the reward calculation unit 63 adds the detected regenerative power to the total power generation amount, and returns to the process shown in step S20 (step S27).
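The loop of steps S22 to S27 can be sketched in Python as follows. The class name `SectionRewardCalculator`, the numeric values chosen for the two reward levels, and the `power_threshold` argument are illustrative assumptions; the text only states that the second level is higher than the first. Following the flowchart as described, the sample detected at the acceleration peak itself is not added to the total.

```python
from dataclasses import dataclass
from typing import Optional

FIRST_LEVEL = 1   # lower reward (assumed numeric value)
SECOND_LEVEL = 2  # higher reward (assumed numeric value)


@dataclass
class SectionRewardCalculator:
    """Accumulates regenerative power over one target section."""

    power_threshold: float
    total_power: float = 0.0

    def step(self, regenerative_power: float, at_acceleration_peak: bool) -> Optional[int]:
        """Process one detection cycle; return a reward only at an acceleration peak."""
        if not at_acceleration_peak:
            # Step S27: keep accumulating within the current target section.
            self.total_power += regenerative_power
            return None
        # Steps S23 to S25: compare the section total with the power threshold.
        if self.total_power >= self.power_threshold:
            reward = SECOND_LEVEL
        else:
            reward = FIRST_LEVEL
        # Step S26: clear the total before the next target section starts.
        self.total_power = 0.0
        return reward
```

For example, with `power_threshold=10.0`, feeding samples of 4.0 and 7.0 and then signalling an acceleration peak yields the second-level reward, because the section total of 11.0 meets the threshold.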

＜第３の実施形態＞
次に第３の実施形態について説明する。
本実施形態では、風車２０の回転数を制御する場合に、風況に応じて、それぞれ異なる制御を行う点において、他の実施形態と相違する。以下では、上述した実施形態と異なる点を説明し、上述した実施形態と同一または類似の機能を有する構成に同一の符号を付し、その説明を省略する。 <Third embodiment>
Next, a third embodiment will be described.
This embodiment is different from other embodiments in that when the rotation speed of the wind turbine 20 is controlled, different controls are performed according to the wind conditions. Hereinafter, the points different from those of the above-described embodiment will be described, and the same reference numerals will be given to the configurations having the same or similar functions as those of the above-described embodiments, and the description thereof will be omitted.

本実施形態では、風況を減速区間、加速区間、及び強風区間に分類し、分類した区間の各々に基づいて、風車２０の回転数を制御するように、学習装置７０に学習させる。ここで、減速区間、及び加速区間は、第２の実施形態における減速区間、及び加速区間と同等である。強風区間は、風速が所定の強風判定閾値以上となる区間である。 In the present embodiment, the wind conditions are classified into a deceleration section, an acceleration section, and a strong wind section, and the learning device 70 is made to learn to control the rotation speed of the wind turbine 20 based on each of the classified sections. Here, the deceleration section and the acceleration section are equivalent to the deceleration section and the acceleration section in the second embodiment. The strong wind section is a section in which the wind speed is equal to or higher than a predetermined strong wind determination threshold value.

風力発電においては、加速区間では、風速に応じた最大の回生電力が得られる回転数で風車を回転させることが望ましい。風速と回転数との関係は、風速が加速すれば、風車２０の回転数が増加する傾向にあるが、風車の慣性（イナーシャ）により、風速に対して一定ではない遅延が発生する。このため、風速に対応する回転数で風車を回転させた場合であっても、想定される最大の回生電力が得られない場合があった。このため、加速区間では、風車のイナーシャを考慮して制御されることが望ましい。 In wind power generation, it is desirable to rotate the wind turbine at a rotation speed at which the maximum regenerative power corresponding to the wind speed can be obtained in the acceleration section. Regarding the relationship between the wind speed and the rotation speed, the rotation speed of the wind turbine 20 tends to increase as the wind speed accelerates, but the inertia of the wind turbine causes a non-constant delay with respect to the wind speed. Therefore, even when the wind turbine is rotated at a rotation speed corresponding to the wind speed, the maximum expected regenerative power may not be obtained. Therefore, it is desirable to control the acceleration section in consideration of the inertia of the wind turbine.

また、減速区間では、風速に応じた最大の回生電力が得られる回転数で風車を回転させ続けると、発電負荷により風車の回転が失速してしまう場合があった。このため、減速区間では、風速の変化（減速）の度合い（単位時間あたりの風速の変化量）を考慮して制御されることが望ましい。 Further, in the deceleration section, if the wind turbine is continuously rotated at a rotation speed at which the maximum regenerative power corresponding to the wind speed can be obtained, the rotation of the wind turbine may stall due to the power generation load. Therefore, it is desirable that the deceleration section is controlled in consideration of the degree of change (deceleration) of the wind speed (the amount of change in the wind speed per unit time).

また、強風区間では、回転数を減速させることで風車２０の回転速度が速度超過に陥らないように制御するが、減速し過ぎると回生電力が低下してしまう場合があった。このため、強風区間では、風速の変化（加速）の度合い（単位時間あたりの風速の変化量）と回生電力とを考慮して制御されることが望ましい。 Further, in the strong wind section, the rotation speed of the wind turbine 20 is controlled so as not to exceed the speed by decelerating the rotation speed, but if the speed is reduced too much, the regenerative power may decrease. Therefore, in the strong wind section, it is desirable that the control is performed in consideration of the degree of change (acceleration) of the wind speed (the amount of change in the wind speed per unit time) and the regenerative power.

そこで、本実施形態では、加速区間では、風速に応じて定まる風車２０の回転数の目標値を基準とし、基準である目標値を含む所定の範囲の回転数を目標の範囲として、目標の範囲内で風車２０の回転数を制御し、より大きな回生電力が得られる場合により高い報酬を与えることで、風況に応じてより適した目標値を探すように学習させる。 Therefore, in the present embodiment, in the acceleration section, the target value of the rotation speed of the wind turbine 20 determined according to the wind speed is used as a reference, and a predetermined range of rotation speeds including that reference target value is used as the target range. The rotation speed of the wind turbine 20 is controlled within the target range, and a higher reward is given when larger regenerative power is obtained, so that learning proceeds toward finding a target value better suited to the wind conditions.

具体的には、強化学習部７１は、風力発電機本体１０の状態を示す風速と回転速度を示す情報を取得する。強化学習部７１は、取得した風速と回転速度を示す情報、及び回生電力の目標値を含む風車２０に設定可能な所定の範囲の回転数を示す情報に基づいて、報酬を手掛かりとしてより高い報酬が得られるように学習を進め、より適切な電力制御パラメータを出力する。ここで、風車２０に設定可能な所定の範囲の回転数とは、風速ごとに定まる回生電力が最大となる風車の回転数を含む所定の範囲の回転数である。この範囲には、風車２０や発電機３０の機械的な耐用限界等に基づいて定められる風車２０の回転数として許容される範囲内であることが望ましい。 Specifically, the reinforcement learning unit 71 acquires information indicating the wind speed and the rotation speed, which represent the state of the wind power generator main body 10. Based on the acquired information indicating the wind speed and the rotation speed, and on information indicating a predetermined range of rotation speeds that can be set for the wind turbine 20 and that includes the target value of the regenerative power, the reinforcement learning unit 71 uses the reward as a cue, advances learning so that a higher reward is obtained, and outputs a more appropriate power control parameter. Here, the predetermined range of rotation speeds that can be set for the wind turbine 20 is a predetermined range that includes the rotation speed at which the regenerative power is maximized, which is determined for each wind speed. This range is desirably within the range permitted as the rotation speed of the wind turbine 20, which is determined based on the mechanical service limits and the like of the wind turbine 20 and the generator 30.

また、報酬算出部６３は、加速区間では、より大きな回生電力が得られる場合により高い報酬を算出する。これにより、強化学習部７１は、加速区間では、目標の範囲内で回転数制御パラメータを出力し、出力した回転数制御パラメータが、風車２０に設定された場合の風車２０の回転の状態に応じて、より大きな回生電力が得られる制御を学習する。 In the acceleration section, the reward calculation unit 63 calculates a higher reward when larger regenerative power is obtained. As a result, in the acceleration section, the reinforcement learning unit 71 outputs a rotation speed control parameter within the target range and, according to the rotation state of the wind turbine 20 when the output rotation speed control parameter is set in the wind turbine 20, learns control that yields larger regenerative power.

また、本実施形態においては、減速区間では、報酬算出部６３は、風速の減速の度合いに応じて報酬を算出する。具体的には、報酬算出部６３は、風速の減速の度合いに対する、風車２０の回転速度の変化量がより小さい場合に、より高い報酬を算出する。これにより、強化学習部７１は、減速区間では、風速が減速する場合であっても、風車２０の回転速度が失速しないように維持するような制御を学習する。 Further, in the present embodiment, in the deceleration section, the reward calculation unit 63 calculates the reward according to the degree of deceleration of the wind speed. Specifically, the reward calculation unit 63 calculates a higher reward when the amount of change in the rotation speed of the wind turbine 20 with respect to the degree of deceleration of the wind speed is smaller. As a result, the reinforcement learning unit 71 learns the control for maintaining the rotational speed of the wind turbine 20 so as not to stall even when the wind speed is decelerated in the deceleration section.

また、報酬算出部６３は、減速区間とその後の加速区間とを合わせた区間（減速対象区間）における回生電力の加算値がより大きい場合に、より高い報酬を算出する。これにより、強化学習部７１は、減速区間で回生電力が小さくなった場合であっても、その後の加速区間でより大きな回生電力が出力されるような制御を学習する。 Further, the reward calculation unit 63 calculates a higher reward when the added value of the regenerative power in the section (deceleration target section) in which the deceleration section and the subsequent acceleration section are combined is larger. As a result, the reinforcement learning unit 71 learns the control so that a larger regenerative power is output in the subsequent acceleration section even when the regenerative power becomes smaller in the deceleration section.

また、本実施形態においては、強風区間では、強風時において風速が加速する強風加速区間とその後の減速区間とを合わせた区間（強風対象区間）における回生電力の加算値がより大きい場合に、より高い報酬を算出する。これにより、強化学習部７１は、強風区間で減速させ過ぎてその後の減速区間で回生電力が小さくならないような制御を学習する。 Further, in the present embodiment, in the strong wind section, a higher reward is calculated when the sum of the regenerative power is larger over the section (strong wind target section) that combines the strong wind acceleration section, in which the wind speed accelerates during strong wind, with the subsequent deceleration section. As a result, the reinforcement learning unit 71 learns control that avoids decelerating too much in the strong wind section and thereby reducing the regenerative power in the subsequent deceleration section.

なお、報酬算出部６３は、強風区間では、風車２０の回転速度が速度超過となった場合には、最も低い報酬を算出する。
これにより、報酬算出部６３は、強風区間で速度超過とならないような制御を学習する。 In the strong wind section, the reward calculation unit 63 calculates the lowest reward when the rotation speed of the wind turbine 20 exceeds the speed.
As a result, the reward calculation unit 63 learns the control so as not to cause the speed to be exceeded in the strong wind section.

図９は、第３の実施形態に係る制御装置６０の動作例を示すフローチャートである。
まず、制御装置６０のパラメータ取得部６１は、強化学習部７１から、目標の範囲内の回転数制御パラメータを取得する（ステップＳ３０）。
次に、状態検出部６２は、風速、回転速度、及び回生電力を検出する（ステップＳ３１）。
次に、報酬算出部６３は、取得した風速に基づいて、区間を抽出するか否か判定する（ステップＳ３２）。報酬算出部６３は、取得した風速が強風から強風ではない通常の風速に変化した場合、加速のピークとなった場合、又は減速のピークとなった場合、区間を抽出すると判定する。報酬算出部６３は、区間を抽出しない場合、ステップＳ３０に戻る。 FIG. 9 is a flowchart showing an operation example of the control device 60 according to the third embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires the rotation speed control parameter within the target range from the reinforcement learning unit 71 (step S30).
Next, the state detection unit 62 detects the wind speed, the rotation speed, and the regenerative power (step S31).
Next, the reward calculation unit 63 determines whether or not to extract the section based on the acquired wind speed (step S32). The reward calculation unit 63 determines that the section is extracted when the acquired wind speed changes from a strong wind to a normal wind speed which is not a strong wind, when the acceleration peaks, or when the deceleration peaks. If the reward calculation unit 63 does not extract the section, the reward calculation unit 63 returns to step S30.

報酬算出部６３は、区間を抽出する場合、風速が所定の強風判定閾値以上となった場合、風速が所定の強風判定閾値未満となるまでを強風区間として抽出する。また、報酬算出部６３は、風速が加速のピークとなった場合、その後に風速が減速のピークとなるまでを減速区間として抽出する。報酬算出部６３は、風速が所定の減速のピークとなった場合、その後に風速が加速のピークとなるまでを加速区間として抽出する。 When extracting a section, the reward calculation unit 63 extracts the section until the wind speed becomes less than the predetermined strong wind determination threshold when the wind speed becomes equal to or higher than the predetermined strong wind determination threshold. Further, when the wind speed reaches the peak of acceleration, the reward calculation unit 63 extracts the period until the wind speed reaches the peak of deceleration as a deceleration section. When the wind speed reaches a predetermined deceleration peak, the reward calculation unit 63 extracts the period until the wind speed reaches the acceleration peak as an acceleration section.

報酬算出部６３は、抽出した区間が強風区間であるか否かを判定する（ステップＳ３３）。報酬算出部６３は、抽出した区間が強風区間である場合、強風加速区間とその後の減速区間における回生電力の加算値に応じた報酬を算出する（ステップＳ３４）一方、報酬算出部６３は、区間が強風区間でない場合、抽出した区間が減速区間であるか否かを判定する（ステップＳ３５）。
報酬算出部６３は、抽出した区間が減速区間である場合、減速区間とその後の加速区間における回生電力の加算値に応じた報酬を算出する（ステップＳ３６）。
一方、報酬算出部６３は、抽出した区間が減速区間でない場合、つまり加速区間である場合、回生電力に応じた報酬を算出する（ステップＳ３７）。 The reward calculation unit 63 determines whether or not the extracted section is a strong wind section (step S33). When the extracted section is a strong wind section, the reward calculation unit 63 calculates a reward according to the sum of the regenerative power in the strong wind acceleration section and the subsequent deceleration section (step S34). On the other hand, when the extracted section is not a strong wind section, the reward calculation unit 63 determines whether or not the extracted section is a deceleration section (step S35).
When the extracted section is a deceleration section, the reward calculation unit 63 calculates a reward according to the added value of the regenerative power in the deceleration section and the subsequent acceleration section (step S36).
On the other hand, when the extracted section is not a deceleration section, that is, an acceleration section, the reward calculation unit 63 calculates a reward according to the regenerative power (step S37).
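The branching of steps S33 to S37 can be sketched in Python as below. The text specifies only that the reward depends on a sum of regenerative power, not the mapping from that sum to a reward, so this sketch returns the sum itself as the reward; that mapping, the names `Section` and `section_reward`, and the constant `LOWEST_REWARD` are all assumptions made for illustration. The `overspeed` flag reflects the statement that excessive rotation speed in a strong wind section receives the lowest reward.

```python
from enum import Enum
from typing import Sequence


class Section(Enum):
    STRONG_WIND = "strong_wind"
    DECELERATION = "deceleration"
    ACCELERATION = "acceleration"


LOWEST_REWARD = -1.0  # assumed value, below any non-negative power sum


def section_reward(
    kind: Section,
    powers: Sequence[float],
    following_powers: Sequence[float] = (),
    overspeed: bool = False,
) -> float:
    """Return a reward for one extracted section.

    powers holds the regenerative power samples of the extracted section and
    following_powers those of the section that follows it.
    """
    if kind is Section.STRONG_WIND:
        if overspeed:
            # Excessive rotation speed in strong wind gets the lowest reward.
            return LOWEST_REWARD
        # Step S34: strong wind acceleration section plus the following deceleration section.
        return sum(powers) + sum(following_powers)
    if kind is Section.DECELERATION:
        # Step S36: deceleration section plus the following acceleration section.
        return sum(powers) + sum(following_powers)
    # Step S37: acceleration section, reward follows the regenerative power alone.
    return sum(powers)
```

Summing across the following section is what lets the learner accept lower power during deceleration or strong wind when doing so pays off immediately afterwards.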

＜第４の実施形態＞
次に第４の実施形態について説明する。
本実施形態では、風車２０の回転数を制御する場合に、風況に応じて限界風速を変化させる点において、他の実施形態と相違する。以下では、上述した実施形態と異なる点を説明し、上述した実施形態と同一または類似の機能を有する構成に同一の符号を付し、その説明を省略する。 <Fourth Embodiment>
Next, a fourth embodiment will be described.
This embodiment is different from other embodiments in that the limit wind speed is changed according to the wind condition when the rotation speed of the wind turbine 20 is controlled. Hereinafter, the points different from those of the above-described embodiment will be described, and the same reference numerals will be given to the configurations having the same or similar functions as those of the above-described embodiments, and the description thereof will be omitted.

限界風速とは、風力発電システム１による発電が可能な風速の上限である。限界風速は、例えば、風車２０や発電機３０の機械的な耐用限界等に基づいて決定される破壊風速に所定の余裕（マージン）を考慮して決定される。ここで、破壊風速とは、風車２０や発電機３０が損傷したり破壊されたりする程度の風速である。 The limit wind speed is the upper limit of the wind speed at which the wind power generation system 1 can generate power. The limit wind speed is determined, for example, by applying a predetermined margin to the breaking wind speed, which is determined based on the mechanical service limits and the like of the wind turbine 20 and the generator 30. Here, the breaking wind speed is a wind speed high enough to damage or destroy the wind turbine 20 or the generator 30.

風力発電システム１においては、風車２０が受ける風速が限界風速に達すると、運転を停止する。この場合、風力発電システム１は、発電が制御されているか否かに関わらず、強制的に運転を停止する。このため、強風時に発電が制御されている場合であっても、風速が限界風速に達すれば運転を停止せざるを得ず、回生電力の低減の要因となる場合があった。このため、強風時であっても発電が適切に制御されている場合には、限界風速をより破壊風速に近付ける方向に変化させ、発電を維持できるように制御されることが望ましい。 In the wind power generation system 1, operation stops when the wind speed received by the wind turbine 20 reaches the limit wind speed. In this case, the wind power generation system 1 forcibly stops operation regardless of whether power generation is under control. Consequently, even when power generation is under control during strong wind, operation has to stop once the wind speed reaches the limit wind speed, which can reduce the regenerative power. It is therefore desirable that, when power generation is appropriately controlled even in strong wind, the limit wind speed be moved closer to the breaking wind speed so that power generation can be maintained.

そこで、本実施形態では、強風区間における制御の安定度合に基づいて、限界風速に近づいている場合でも、安定した制御が行われている場合には破壊風速に近付ける方向に変化させ、より安定した制御が行われている場合により高い報酬を与えることで、強風時においても風況に応じてより安定した制御が行われるように学習させる。 Therefore, in the present embodiment, based on the stability of the control in the strong wind section, even if the wind speed is approaching the limit wind speed, if the stable control is performed, the wind speed is changed to approach the breaking wind speed to make it more stable. By giving a higher reward when control is performed, learning is performed so that more stable control is performed according to the wind conditions even in strong winds.

具体的には、報酬算出部６３は、強風区間において、風車２０の回転速度と最大回転速度との差分、及び風車２０の回転速度の変化率を算出し、算出した差分と変化率とに基づいて、報酬を算出する。 Specifically, in the strong wind section, the reward calculation unit 63 calculates the difference between the rotation speed of the wind turbine 20 and the maximum rotation speed, as well as the rate of change of the rotation speed of the wind turbine 20, and calculates the reward based on the calculated difference and rate of change.

また、報酬算出部６３は、風車２０の回転速度と最大回転速度との差分が所定の余裕閾値未満である場合、風車２０の回転速度の変化率が所定の変化閾値未満である場合より高い報酬（第２レベル）を算出し、当該回転速度の変化率が所定の変化閾値以上である場合より低い報酬（第１レベル）を算出する。風車２０の回転速度の変化率が低い場合、風車２０の回転速度の変化率が高い場合と比較して、風車２０の回転がより安定して制御されていると判断できるためである。 When the difference between the rotation speed of the wind turbine 20 and the maximum rotation speed is less than a predetermined margin threshold, the reward calculation unit 63 calculates a higher reward (second level) if the rate of change of the rotation speed of the wind turbine 20 is less than a predetermined change threshold, and a lower reward (first level) if that rate of change is equal to or greater than the predetermined change threshold. This is because, when the rate of change of the rotation speed is low, the rotation of the wind turbine 20 can be judged to be controlled more stably than when the rate of change is high.

報酬算出部６３は、風車２０の回転速度と最大回転速度との差分が所定の余裕閾値以上である場合には、上述した当該回転速度の変化率が所定の変化閾値以上である場合の報酬（第１レベル）よりも高く、当該回転速度の変化率が所定の変化閾値未満である場合の報酬（第２レベル）よりも低い報酬（第３レベル）を算出する。風車２０の回転速度と最大回転速度との差分が余裕閾値以上である場合、強風時の制御として、過度に安全な方向により過ぎていると判断できるためである。 When the difference between the rotation speed of the wind turbine 20 and the maximum rotation speed is equal to or greater than the predetermined margin threshold, the reward calculation unit 63 calculates a reward (third level) that is higher than the reward for the case where the rate of change of the rotation speed is equal to or greater than the predetermined change threshold (first level) and lower than the reward for the case where the rate of change is less than the predetermined change threshold (second level). This is because, when the difference between the rotation speed of the wind turbine 20 and the maximum rotation speed is equal to or greater than the margin threshold, the control during strong wind can be judged to lean too far toward the safe side.

ここで、風車２０の最大回転速度とは、風車２０に回転させることが可能な回転速度の最大値である。最大回転速度は、例えば、強度設計上の上限とする。 Here, the maximum rotation speed of the wind turbine 20 is the highest rotation speed at which the wind turbine 20 can be rotated. The maximum rotation speed is, for example, the upper limit set by the strength design.

また、本実施形態の制御装置６０では、強風区間において、風車２０の回転速度と最大回転速度との差分が所定の余裕閾値未満である場合であって、尚且つ、風車２０の回転速度の変化率が所定の変化閾値未満である場合、限界風速を破壊風速に近づく方向に変化させる。具体的には、限界風速を記憶する風速情報記憶部（不図示）に記憶させている限界風速を書き換える。これにより、変更後の限界風速に応じた制御が行われる。つまり、制御装置６０は、状態検出部６２により検出された風速に基づいて、風速情報記憶部に記憶された限界風速を参照し、風速が限界風速以上である場合には、風力発電機本体１０の動作を停止させる。 Further, in a strong wind section, the control device 60 of the present embodiment moves the limit wind speed toward the breaking wind speed when the difference between the rotation speed of the wind turbine 20 and the maximum rotation speed is less than the predetermined margin threshold and, in addition, the rate of change of the rotation speed of the wind turbine 20 is less than the predetermined change threshold. Specifically, it rewrites the limit wind speed stored in a wind speed information storage unit (not shown). Control is then performed according to the changed limit wind speed. That is, the control device 60 compares the wind speed detected by the state detection unit 62 with the limit wind speed stored in the wind speed information storage unit, and stops the operation of the wind power generator main body 10 when the wind speed is equal to or higher than the limit wind speed.

なお、限界風速の変更は、所定の限界風速変更値に基づいて、段階的に変更することが望ましい。限界風速変更値は、風車２０の構造や、立地条件、季節等に基づいて任意に設定されてよい。 It is desirable to change the limit wind speed step by step based on a predetermined limit wind speed change value. The limit wind speed change value may be arbitrarily set based on the structure of the wind turbine 20, location conditions, seasons, and the like.

図１０は、第４の実施形態に係る制御装置６０の動作例を示すフローチャートである。
まず、制御装置６０のパラメータ取得部６１は、強化学習部７１から、回転数制御パラメータを取得する（ステップＳ４０）。
次に、状態検出部６２は、風速、回転速度、及び回生電力を検出する（ステップＳ４１）。
次に、報酬算出部６３は、取得した風速が強風であるか否か判定する（ステップＳ４２）。 FIG. 10 is a flowchart showing an operation example of the control device 60 according to the fourth embodiment.
First, the parameter acquisition unit 61 of the control device 60 acquires the rotation speed control parameter from the reinforcement learning unit 71 (step S40).
Next, the state detection unit 62 detects the wind speed, the rotation speed, and the regenerative power (step S41).
Next, the reward calculation unit 63 determines whether or not the acquired wind speed is a strong wind (step S42).

報酬算出部６３は、取得した風速が強風である場合、状態検出部６２から取得した回転速度と最大回転速度との差分を算出し、算出した差分が余裕閾値以上か否かを判定する（ステップＳ４３）。報酬算出部６３は、算出した差分が余裕閾値以上である場合、第３レベルの報酬を算出する（ステップＳ４４）。 When the acquired wind speed is a strong wind, the reward calculation unit 63 calculates the difference between the rotation speed acquired from the state detection unit 62 and the maximum rotation speed, and determines whether or not the calculated difference is equal to or greater than the margin threshold value (step). S43). When the calculated difference is equal to or greater than the margin threshold value, the reward calculation unit 63 calculates the third level reward (step S44).

報酬算出部６３は、算出した差分が余裕閾値未満である場合、状態検出部６２から取得した回転速度の変化率を算出し、算出した変化率が所定の変化閾値未満か否かを判定する（ステップＳ４５）。報酬算出部６３は、算出した変化率が変化閾値未満である場合、第２レベルの報酬（最高レベル）を算出する（ステップＳ４６）。この場合、制御装置６０は、限界風速を破壊風速に近付ける方向に変更する（ステップＳ４８）。
報酬算出部６３は、算出した変化率が変化閾値以上である場合、第１レベルの報酬（最低レベル）を算出する（ステップＳ４７）。 When the calculated difference is less than the margin threshold, the reward calculation unit 63 calculates the rate of change of the rotation speed acquired from the state detection unit 62, and determines whether or not the calculated rate of change is less than the predetermined change threshold (step S45). When the calculated rate of change is less than the change threshold, the reward calculation unit 63 calculates the second level reward (highest level) (step S46). In this case, the control device 60 moves the limit wind speed toward the breaking wind speed (step S48).
When the calculated change rate is equal to or higher than the change threshold value, the reward calculation unit 63 calculates the first level reward (lowest level) (step S47).
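The decision in steps S43 to S48 can be sketched in Python as follows. The numeric reward values are assumptions chosen only to respect the stated ordering (first level lowest, third level in between, second level highest). Treating the rate of change by its magnitude, capping the limit wind speed at the breaking wind speed, and the names `strong_wind_reward`, `raise_limit_wind_speed`, and `step` (standing in for the limit wind speed change value) are likewise illustrative assumptions.

```python
from typing import Tuple

FIRST_LEVEL = 0.0   # lowest reward (assumed value)
THIRD_LEVEL = 0.5   # intermediate reward (assumed value)
SECOND_LEVEL = 1.0  # highest reward (assumed value)


def strong_wind_reward(
    rotation_speed: float,
    max_rotation_speed: float,
    change_rate: float,
    margin_threshold: float,
    change_threshold: float,
) -> Tuple[float, bool]:
    """Return (reward, raise_limit) for one strong-wind sample."""
    margin = max_rotation_speed - rotation_speed
    if margin >= margin_threshold:
        # Steps S43 and S44: large margin, control is overly cautious.
        return THIRD_LEVEL, False
    if abs(change_rate) < change_threshold:
        # Steps S45, S46 and S48: near the maximum but stable.
        return SECOND_LEVEL, True
    # Step S47: near the maximum and unstable.
    return FIRST_LEVEL, False


def raise_limit_wind_speed(limit: float, breaking: float, step: float) -> float:
    """Move the limit wind speed one step toward the breaking wind speed."""
    # Assumption: the limit is never allowed to exceed the breaking wind speed.
    return min(limit + step, breaking)
```

A caller would invoke `raise_limit_wind_speed` only when the second element of the returned tuple is true, which gives the stepwise change of the limit wind speed described in the text.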

（第５の実施形態）
次に、第５の実施形態について説明する。
本実施形態では、制御装置６０が、学習済みモデルを用いて風車２０の回転数を制御する点において、上述した実施形態と相違する。
図１１は、第５の実施形態の変形例に係る風力発電システム１Ａの概略構成の一例を示すブロック図である。図１１に示すように、制御装置６０Ａは、学習済みモデル記憶部６５と、決定部６６と、制御部６７とを備える。 (Fifth Embodiment)
Next, a fifth embodiment will be described.
The present embodiment differs from the above-described embodiment in that the control device 60 controls the rotation speed of the wind turbine 20 by using the trained model.
FIG. 11 is a block diagram showing an example of a schematic configuration of the wind power generation system 1A according to the modified example of the fifth embodiment. As shown in FIG. 11, the control device 60A includes a trained model storage unit 65, a determination unit 66, and a control unit 67.

学習済みモデル記憶部６５は、学習済みモデルを記憶する。学習済みモデルは、制御対象である風力発電機本体１０の状態と、風力発電機本体１０に対する制御との関係を示す情報（関係情報）が格納されたデータベース（学習済みモデル）である。学習済みモデルは、風力発電機本体１０の状態に応じて、その状態に対応する風力発電機本体１０を制御する指標を示すパラメータ（以下、制御指標パラメータという）を推定するモデルである。 The trained model storage unit 65 stores the trained model. The trained model is a database (learned model) in which information (relationship information) showing the relationship between the state of the wind power generator main body 10 to be controlled and the control for the wind power generator main body 10 is stored. The trained model is a model that estimates a parameter (hereinafter, referred to as a control index parameter) indicating an index for controlling the wind power generator main body 10 corresponding to the state of the wind power generator main body 10.

ここで、制御指標パラメータは、風力発電機本体１０を制御する指標となる情報であって、制御パラメータそのものであってもよいし、制御パラメータを導出するために用いられる情報であってもよい。 Here, the control index parameter is information that is an index for controlling the wind power generator main body 10, and may be the control parameter itself or the information used for deriving the control parameter.

例えば、制御指標パラメータが風車２０の回転を制御する指標となる情報である場合、制御指標パラメータは、回転数制御パラメータそのものであってもよいし、風車２０の回転数や回転速度を数値で示すものであってもよいし、回転数や回転速度を増加させる、又は減少させるというような風車２０の回転数の制御を相対的に示すものであってもよい。 For example, when the control index parameter is information that is an index for controlling the rotation of the wind turbine 20, the control index parameter may be the rotation speed control parameter itself, or indicates the rotation speed or the rotation speed of the wind turbine 20 numerically. It may be one, or it may be relatively indicating the control of the rotation speed of the wind turbine 20 such as increasing or decreasing the rotation speed or the rotation speed.

例えば、制御指標パラメータが回生電力を制御する指標となる情報である場合、制御指標パラメータは、電力制御パラメータそのものであってもよいし、回生電力の目標値を示すものであってもよいし、回生電力を増加させる、又は減少させるというような風回生電力の制御を相対的に示すものであってもよい。 For example, when the control index parameter is information that is an index for controlling the regenerative power, the control index parameter may be the power control parameter itself, or may indicate a target value of the regenerative power. It may relatively indicate the control of the wind regenerative power such as increasing or decreasing the regenerative power.

学習済みモデルは、例えば、上述した実施形態において強化学習部７１により学習が実施されることにより作成された学習済みモデルであってもよいし、他の風車であって、風車２０と似た構造を有し、風車２０が設置された地域と似たような地域に設けられた風車における風力発電システムの状態と制御との関係を学習した学習済みモデルであってもよい。 The trained model may be, for example, a trained model created by learning by the reinforcement learning unit 71 in the above-described embodiment, or another wind turbine having a structure similar to that of the wind turbine 20. It may be a trained model that has learned the relationship between the state and control of the wind power generation system in a wind turbine installed in an area similar to the area where the wind turbine 20 is installed.

決定部６６は、取得した制御指標パラメータに基づいて、風力発電機本体１０に対する制御に関する制御情報を決定する。ここでの制御情報は、制御指標パラメータに応じて決定される制御を示す情報であり、例えば風車の回転に関する回転情報であり、又、例えば回生電力に関する電力情報である。つまり、決定部６６は、回転指標パラメータに基づいて回転情報を決定する。また、決定部６６は、電力指標パラメータに基づいて電力情報を決定する。決定部６６は、決定した回転情報を、制御部６７に出力する。 The determination unit 66 determines control information regarding control for the wind power generator main body 10 based on the acquired control index parameters. The control information here is information indicating control determined according to a control index parameter, for example, rotation information regarding the rotation of a wind turbine, and power information regarding, for example, regenerative power. That is, the determination unit 66 determines the rotation information based on the rotation index parameter. Further, the determination unit 66 determines the power information based on the power index parameter. The determination unit 66 outputs the determined rotation information to the control unit 67.

ここでの回転情報には、例えば、風車の回転数を増加させるか、或いは減少させるかといった回転数の変化を示す情報の他、段階的に変化させるか、一気に変化させるかといった回転数を変化させる方法を示す情報も含まれる。 The rotation information here includes, for example, information indicating a change in the rotation speed, such as whether to increase or decrease the rotation speed of the wind turbine, as well as information indicating how the rotation speed is to be changed, such as whether to change it stepwise or all at once.

また、電力情報には、例えば、回生電力を増加させるか、或いは減少させるかといった回生電力の変化を示す情報の他、段階的に変化させるか、一気に変化させるかといった回生電力を変化させる度合を示す情報も含まれる。 Similarly, the power information includes, for example, information indicating a change in the regenerative power, such as whether to increase or decrease the regenerative power, as well as information indicating the degree to which the regenerative power is to be changed, such as whether to change it stepwise or all at once.

制御部６７は、決定部６６により決定された制御情報に基づいて、風力発電機本体１０を制御する制御パラメータを決定する。制御部６７は、例えば、決定部６６により決定された回転情報に基づいて、風車２０の回転が許容範囲に収まり、尚且つ目標に近づくよう、風車２０の回転を制御する回転数制御パラメータを決定する。また、制御部６７は、例えば、決定部６６により決定された電力情報に基づいて、回生電力が目標に近づくよう、回生電力を制御する電力制御パラメータを決定する。制御部６７は、決定した制御パラメータを、パラメータ取得部６１を介して風力発電機本体１０に出力する。 The control unit 67 determines the control parameters for controlling the wind power generator main body 10 based on the control information determined by the determination unit 66. For example, based on the rotation information determined by the determination unit 66, the control unit 67 determines a rotation speed control parameter that controls the rotation of the wind turbine 20 so that the rotation of the wind turbine 20 falls within the allowable range and approaches the target. Likewise, based on the power information determined by the determination unit 66, the control unit 67 determines, for example, a power control parameter that controls the regenerative power so that the regenerative power approaches the target. The control unit 67 outputs the determined control parameters to the wind power generator main body 10 via the parameter acquisition unit 61.

ここで、風車２０の回転における許容範囲は、風車２０の回転数として許容される範囲のことであり、例えば、図４における特性ＥＦ、及び特性ＦＧよりも回転数が低い領域、つまり、図４のＡＥＦＧＣで囲まれた領域である。また、目標は、風車２０の回転数の目標となる値であり、例えば図４における特性ＥＦ、及び特性ＦＧに沿った値である。 Here, the allowable range for the rotation of the wind turbine 20 is the range of rotation speeds permitted for the wind turbine 20, for example, the region in which the rotation speed is lower than the characteristic EF and the characteristic FG in FIG. 4, that is, the region enclosed by AEFGC in FIG. 4. The target is the target value of the rotation speed of the wind turbine 20, for example, a value along the characteristic EF and the characteristic FG in FIG. 4.
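The way the control unit 67 keeps the rotation within the allowable range while approaching the target can be illustrated with a minimal sketch. The function name, the per-cycle step size, and the use of a single upper bound in place of the EF/FG characteristics of FIG. 4 are assumptions made for illustration only; the specification does not give concrete values or formulas.

```python
def next_rotation_setpoint(current_rpm, target_rpm, max_allowed_rpm,
                           stepwise=True, step_rpm=5.0):
    """Return the next rotation setpoint (hypothetical helper).

    target_rpm      -- target rotation speed (e.g. a point on the EF/FG curve)
    max_allowed_rpm -- assumed upper bound of the allowable range
    stepwise        -- True: change gradually; False: change all at once
    step_rpm        -- assumed maximum change per control cycle
    """
    # Never aim above the upper bound of the allowable range.
    goal = min(target_rpm, max_allowed_rpm)
    if not stepwise:
        # "All at once": jump straight to the capped goal.
        return goal
    # "Stepwise": move toward the goal by at most step_rpm per cycle.
    delta = goal - current_rpm
    delta = max(-step_rpm, min(step_rpm, delta))
    return current_rpm + delta


if __name__ == "__main__":
    print(next_rotation_setpoint(80, 100, 90))                  # gradual
    print(next_rotation_setpoint(80, 100, 90, stepwise=False))  # immediate
```

The `stepwise` flag corresponds to the "stepwise or all at once" distinction carried in the rotation information; a real controller would derive the bound and the target from the wind speed rather than take them as constants.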

（第６の実施形態）
次に、第６の実施形態について説明する。
本実施形態では、制御装置６０が学習済みモデルを用いて出力した制御指標パラメータ（以下、単にパラメータという）と、学習装置７０が出力したパラメータとのいずれかを用いて、風車２０の回転数を制御する点において、上述した実施形態と相違する。
図１２は、第６の実施形態の変形例に係る風力発電システム１Ｂの概略構成の一例を示すブロック図である。図１２に示すように、制御装置６０Ｂは、選択部６８を備える。 (Sixth Embodiment)
Next, the sixth embodiment will be described.
In the present embodiment, the rotation speed of the wind turbine 20 is determined by using either the control index parameter (hereinafter, simply referred to as a parameter) output by the control device 60 using the trained model or the parameter output by the learning device 70. It differs from the above-described embodiment in that it is controlled.
FIG. 12 is a block diagram showing an example of a schematic configuration of the wind power generation system 1B according to the modified example of the sixth embodiment. As shown in FIG. 12, the control device 60B includes a selection unit 68.

選択部６８は、学習済みモデル記憶部６５に記憶される学習済みモデルから出力されるパラメータと、学習装置７０により出力されるパラメータとの何れか一方を決定部６６に出力する。選択部６８は、何れの一方を選択するかを、予め定められたフェーズに従って決定するようにしてよい。選択部６８は、例えば、風車２０の回転数の制御を学習装置70に学習させる学習フェーズにおいては、学習装置７０により出力されるパラメータを選択する。一方、選択部６８は、風車２０の回転数の制御を学習済みの学習モデルが学習済みモデル記憶部６５に記憶され、学習済みモデルを用いて風車２０の回転数の制御する制御フェーズにおいては、学習済みモデル記憶部６５に記憶される学習済みモデルから出力されるパラメータを選択する。 The selection unit 68 outputs to the determination unit 66 either the parameter output from the trained model stored in the trained model storage unit 65 or the parameter output by the learning device 70. The selection unit 68 may decide which of the two to select according to a predetermined phase. For example, in the learning phase, in which the learning device 70 is made to learn the control of the rotation speed of the wind turbine 20, the selection unit 68 selects the parameter output by the learning device 70. On the other hand, in the control phase, in which a model that has already learned the control of the rotation speed of the wind turbine 20 is stored in the trained model storage unit 65 and the rotation speed of the wind turbine 20 is controlled using that trained model, the selection unit 68 selects the parameter output from the trained model stored in the trained model storage unit 65.
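The phase-based selection performed by the selection unit 68 can be sketched as follows. The `Phase` enumeration and the function name are hypothetical; the specification only states that the choice follows a predetermined phase.

```python
from enum import Enum


class Phase(Enum):
    """Operating phases named in the text (enum itself is an assumption)."""
    LEARNING = "learning"  # learning device 70 is being trained
    CONTROL = "control"    # trained model is used for control


def select_parameter(phase, model_parameter, learner_parameter):
    """Pick the parameter that is forwarded to the determination unit 66."""
    if phase is Phase.LEARNING:
        # Learning phase: use the parameter output by the learning device 70.
        return learner_parameter
    # Control phase: use the parameter output from the stored trained model.
    return model_parameter


if __name__ == "__main__":
    print(select_parameter(Phase.LEARNING, model_parameter=0.8, learner_parameter=0.5))
    print(select_parameter(Phase.CONTROL, model_parameter=0.8, learner_parameter=0.5))
```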

また、上述した少なくとも一つの実施形態では、強化学習部７１が学習した内容を、学習済みモデル記憶部６５やその他の図示しない記憶部に記憶させておき、記憶させた内容に基づいて、更に学習を進めるようにしてよい。これにより、風車２０に共通するある程度の基本的な制御について学習したモデルを、風車２０が設けられた地域の風況や、季節の風況、昼夜の時間帯による風況の相違や、天候等の状態に応じた制御を行うことが可能となる。
なお、上述した少なくとも一つの実施形態では、風車２０の回転数を制御するパラメータとして回転数制御パラメータが用いられる場合を例示して説明したが、これに限定されることはない。制御システム５０は、風車２０の回転数を制御するパラメータとして、回転速度や回転時間等を制御するようにしてもよい。この場合、風車２０の回転数を制御するパラメータは、例えば回転速度パラメータ、回転時間パラメータ等であってよい。このような、風車２０の回転数を制御するパラメータの総称として、回転制御パラメータが用いられてよい。つまり、回転数制御パラメータは、「回転数制御パラメータ」の一例である。 Further, in at least one embodiment described above, the content learned by the reinforcement learning unit 71 is stored in the trained model storage unit 65 and other storage units (not shown), and further learning is performed based on the stored content. You may try to proceed. As a result, the model learned about some basic control common to the wind turbine 20 can be used for wind conditions in the area where the wind turbine 20 is installed, seasonal wind conditions, differences in wind conditions depending on the time of day and night, weather, etc. It is possible to perform control according to the state.
In at least one of the embodiments described above, the case where the rotation speed control parameter is used as the parameter for controlling the rotation speed of the wind turbine 20 has been described as an example, but the present invention is not limited to this. The control system 50 may control the rotational velocity, the rotation time, and the like as parameters for controlling the rotation speed of the wind turbine 20. In this case, the parameter that controls the rotation speed of the wind turbine 20 may be, for example, a rotational velocity parameter, a rotation time parameter, or the like. "Rotation control parameter" may be used as a general term for such parameters for controlling the rotation speed of the wind turbine 20. That is, the rotation speed control parameter is an example of the "rotation control parameter".

以上説明したように、第１の実施形態の制御システム５０は、風車２０の設置場所における風速を示す風速情報（例えば、風速センサ４１により検出された風速）、風車２０の回転に関する回転情報（例えば、回転速度センサ４２により検出された回転速度）、及び、風速と回転の関係情報であって許容範囲と目標とを示す関係情報（例えば、図４に示す、風速と回転数との関係）に基づいて、風速と回転との対応情報を学習する強化学習部７１と、風速と回転の対応情報を記憶する学習済みモデル記憶部６５と、風車の回転数を制御する回転制御パラメータを風車２０に設定した場合における風速を検出する状態検出部６２と、状態検出部６２により検出された風速、及び対応情報に基づいて、回転情報を決定する決定部６６と、決定部６６により決定された回転情報に基づいて、回転情報が許容範囲に収まり、尚且つ目標に近づくように、風車２０の回転を制御する制御部６７とを備える。これにより、実施形態の制御システム５０は、不安定な風況であっても風車２０の回転速度を最適化させるように制御を行うことが可能となる。 As described above, the control system 50 of the first embodiment includes: a reinforcement learning unit 71 that learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information regarding the rotation of the wind turbine 20 (for example, the rotation speed detected by the rotation speed sensor 42), and relationship information between wind speed and rotation that indicates an allowable range and a target (for example, the relationship between wind speed and rotation speed shown in FIG. 4); a trained model storage unit 65 that stores the correspondence information between wind speed and rotation; a state detection unit 62 that detects the wind speed when a rotation control parameter for controlling the rotation speed of the wind turbine is set in the wind turbine 20; a determination unit 66 that determines the rotation information based on the wind speed detected by the state detection unit 62 and the correspondence information; and a control unit 67 that controls the rotation of the wind turbine 20, based on the rotation information determined by the determination unit 66, so that the rotation information falls within the allowable range and approaches the target. As a result, the control system 50 of the embodiment can perform control that optimizes the rotation speed of the wind turbine 20 even in unstable wind conditions.

また、第１の実施形態の制御システム５０は、状態検出部６２により検出された風速、及び風車２０の回転速度に基づいて、所定の報酬条件に応じた報酬を算出する報酬算出部６３を更に備え、強化学習部７１は、報酬に基づいて風速と回転との対応情報（例えば、回転数制御パラメータ）を学習する強化学習モデルである。これにより、実施形態の制御システム５０は、報酬手掛かりとしてより適切な制御を学習することができる。 Further, the control system 50 of the first embodiment further includes a reward calculation unit 63 that calculates a reward according to a predetermined reward condition based on the wind speed detected by the state detection unit 62 and the rotation speed of the wind turbine 20, and the reinforcement learning unit 71 is a reinforcement learning model that learns the correspondence information between wind speed and rotation (for example, the rotation speed control parameter) based on the reward. As a result, the control system 50 of the embodiment can learn more appropriate control by using the reward as a clue.

また、第１の実施形態の制御システム５０では、報酬算出部６３は、状態検出部６２により検出された風速が強風であり、尚且つ、風車２０の回転速度が第１閾値以上である場合、第１レベルの報酬を算出する。また、報酬算出部６３は、状態検出部６２により検出された風速が強風であり、尚且つ、風車２０の回転速度が前記第１閾値より小さい第２閾値以上である場合、第１レベルより高い第２レベルの報酬を算出する。また、報酬算出部６３は、状態検出部６２により検出された風速が強風であり、尚且つ、風車２０の回転速度が第２閾値未満である場合、第１レベルより高く、尚且つ第２レベルより低い第３レベルの報酬を算出する。これにより、実施形態の制御システム５０は、風速が強風である場合に、回転速度が超過しないように制御することが可能である。また、回転速度が超過しない場合には、回転速度が速度不足となるよりも適正範囲となるように、学習させることが可能となるため、回転速度が超過し易い強風時にも、強制停止してしまうことを抑制し、また、発電電力が最大を維持するように学習させることができる。 Further, in the control system 50 of the first embodiment, the reward calculation unit 63 calculates a first-level reward when the wind speed detected by the state detection unit 62 is a strong wind and the rotation speed of the wind turbine 20 is equal to or higher than a first threshold. The reward calculation unit 63 calculates a second-level reward, higher than the first level, when the wind speed detected by the state detection unit 62 is a strong wind and the rotation speed of the wind turbine 20 is equal to or higher than a second threshold that is smaller than the first threshold. The reward calculation unit 63 calculates a third-level reward, higher than the first level and lower than the second level, when the wind speed detected by the state detection unit 62 is a strong wind and the rotation speed of the wind turbine 20 is less than the second threshold. As a result, the control system 50 of the embodiment can control the rotation speed so that it does not become excessive when the wind is strong. In addition, as long as the rotation speed is not excessive, the system can be trained so that the rotation speed stays within the appropriate range rather than falling short. This suppresses forced stops even in strong winds, when the rotation speed tends to become excessive, and allows the system to learn to keep the generated power at its maximum.
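The three reward levels described above can be expressed as a short sketch. The numeric reward values, the use of a single strong-wind threshold on the wind speed, and the function name are assumptions for illustration; the text fixes only the ordering (first level < third level < second level). The second rule is read here as applying to rotation speeds at or above the second threshold but below the first.

```python
def rotation_reward(wind_speed, rotation_speed,
                    strong_wind_threshold, first_threshold, second_threshold,
                    level1=-1.0, level2=1.0, level3=0.0):
    """Reward for the rotation-speed controller in strong wind (hypothetical).

    Returns None when the wind is not strong, because this passage only
    specifies rewards for the strong-wind case.
    """
    if not second_threshold < first_threshold:
        raise ValueError("second_threshold must be smaller than first_threshold")
    if not level1 < level3 < level2:
        raise ValueError("reward levels must satisfy level1 < level3 < level2")
    if wind_speed < strong_wind_threshold:
        return None
    if rotation_speed >= first_threshold:
        return level1  # excessive rotation: lowest reward
    if rotation_speed >= second_threshold:
        return level2  # appropriate range: highest reward
    return level3      # too slow: intermediate reward


if __name__ == "__main__":
    for rpm in (250, 180, 100):
        print(rpm, rotation_reward(20.0, rpm, 15.0, 200, 150))
```

Ranking the "too slow" case above the "excessive" case reflects the stated aim: avoiding a forced stop takes priority, and among safe states the one with more generated power is preferred.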

また、第１の実施形態の学習装置７０は、風力発電システム１の風車２０の設置場所における風速を示す風速情報、及び風車２０の回転に関する回転情報に基づいて、風速と回転の対応情報を学習する強化学習部７１を備えるため、風速と風車の回転との状態に応じて、どのように風車の回転を制御すべきかを学習することができるため、風況が不安定であっても風車の回転をより適切に制御することが可能となる。 Further, the learning device 70 of the first embodiment includes the reinforcement learning unit 71, which learns the correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine 20 of the wind power generation system 1 and rotation information regarding the rotation of the wind turbine 20. The learning device 70 can therefore learn how the rotation of the wind turbine should be controlled according to the state of the wind speed and the rotation of the wind turbine, which makes it possible to control the rotation of the wind turbine more appropriately even when the wind conditions are unstable.

また、第１の実施形態の制御装置６０は、風力発電システム１の風車２０の回転数を制御する回転制御パラメータを風車２０に設定した場合における風速を検出する状態検出部６２と、状態検出部６２により検出された風速、及び、風速と風車の回転との対応情報（例えば、学習済みモデル記憶部６５に記憶された学習済みモデル）に基づいて、風車の回転に関する回転情報を決定する決定部６６と、決定部６６により決定された回転情報に基づいて、風車２０の回転を制御する制御部６７とを備える。これにより、実施形態の制御装置６０は、風速等の状態が学習済みモデルで学習済みの状態と似たような状態である場合に、学習済みモデルから出力された制御パラメータに応じた制御を行うことができ、より適切に制御することが可能となる。 Further, the control device 60 of the first embodiment includes: a state detection unit 62 that detects the wind speed when a rotation control parameter for controlling the rotation speed of the wind turbine 20 of the wind power generation system 1 is set in the wind turbine 20; a determination unit 66 that determines rotation information regarding the rotation of the wind turbine based on the wind speed detected by the state detection unit 62 and correspondence information between the wind speed and the rotation of the wind turbine (for example, the trained model stored in the trained model storage unit 65); and a control unit 67 that controls the rotation of the wind turbine 20 based on the rotation information determined by the determination unit 66. As a result, when the state such as the wind speed is similar to a state already learned by the trained model, the control device 60 of the embodiment can perform control according to the control parameters output from the trained model, and can therefore control the wind turbine more appropriately.

以上説明したように、第２の実施形態の制御システム５０は、風車２０の設置場所における風速を示す風速情報（例えば、風速センサ４１により検出された風速）、風車２０の回転に関する回転情報（例えば、回転速度センサ４２により検出された回転速度）、発電システムにより発電される回生電力に関する電力情報（例えば、電圧検出部３２により検出された回生電力の電圧、および電流検出部３３により検出された回生電力の電流）及び、風速と回転と回生電力との関係情報であって回生電力の目標を示す関係情報に基づいて、風速と回転と回生電力との対応情報を学習する強化学習部７１と、風速と回転と回生電力の対応情報を記憶する学習済みモデル記憶部６５と、回生電力を制御する電力制御パラメータに基づいて整流・昇圧部３１を制御した場合における回生電力の変化、および風速を検出する状態検出部６２と、状態検出部６２により検出された回生電力と風速、及び対応情報に基づいて、電力情報を決定する決定部６６と、決定部６６により決定された電力情報に基づいて、電力情報の目標に近づくように、回生電力を制御する制御部６７とを備える。これにより、実施形態の制御システム５０は、風況が変化した場合であってもトータルの発電電力が最大となるように制御を行うことが可能となる。 As described above, the control system 50 of the second embodiment includes: a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power based on wind speed information indicating the wind speed at the installation location of the wind turbine 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information regarding the rotation of the wind turbine 20 (for example, the rotation speed detected by the rotation speed sensor 42), power information regarding the regenerative power generated by the power generation system (for example, the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33), and relationship information among wind speed, rotation, and regenerative power that indicates a target for the regenerative power; a trained model storage unit 65 that stores the correspondence information among wind speed, rotation, and regenerative power; a state detection unit 62 that detects the change in the regenerative power and the wind speed when the rectification/boosting unit 31 is controlled based on a power control parameter for controlling the regenerative power; a determination unit 66 that determines the power information based on the regenerative power and the wind speed detected by the state detection unit 62 and the correspondence information; and a control unit 67 that controls the regenerative power, based on the power information determined by the determination unit 66, so that the power information approaches its target. As a result, the control system 50 of the embodiment can perform control so that the total generated power is maximized even when the wind conditions change.

また、第２の実施形態の制御システム５０は、状態検出部６２により検出された風速、及び回生電力に基づいて、所定の報酬条件に応じた報酬を算出する報酬算出部６３を更に備え、強化学習部７１は、報酬に基づいて風速と回生電力との対応情報（例えば、電力制御パラメータ）を学習する強化学習モデルである。これにより、実施形態の制御システム５０は、報酬を手掛かりとしてより適切な制御を学習することができる。 Further, the control system 50 of the second embodiment further includes a reward calculation unit 63 that calculates a reward according to a predetermined reward condition based on the wind speed and the regenerative power detected by the state detection unit 62, and the reinforcement learning unit 71 is a reinforcement learning model that learns the correspondence information between wind speed and regenerative power (for example, the power control parameter) based on the reward. As a result, the control system 50 of the embodiment can learn more appropriate control by using the reward as a clue.

また、第２の実施形態の制御システム５０では、報酬算出部６３は、状態検出部６２により検出された回生電力が所定の電力閾値未満である場合、第１レベルの報酬を算出する。また、報酬算出部６３は、状態検出部６２により検出された回生電力が所定の電力閾値以上である場合、第１レベルより高い第２レベルの報酬を算出する。これにより、実施形態の制御システム５０は、回生電力が大きくなるように制御することが可能である。
また、第２の実施形態の制御システム５０では、報酬算出部６３は、状態検出部６２により検出された風速に基づいて、減速区間とその後の加速区間とを含む対象区間を抽出し、対象区間における、状態検出部６２により検出された回生電力の加算値が所定の電力閾値以上である場合、第２レベルの報酬を算出する。これにより、第２の実施形態の制御システム５０では、減速区間において回生電力を出力させ続けると風車の回転が失速する場合があっても、減速区間においては回生電力の出力を抑制して、加速区間で回生電力をより高く出力させるなどの制御を学習させ、対象区間におけるトータルの回生電力が大きくなるように制御することが可能である。
また、第２の実施形態の制御システム５０では、報酬算出部６３は、状態検出部６２により検出された風速が所定の強風判定閾値未満である場合に報酬を算出する。これにより、第２の実施形態の制御システム５０では、強風時にも回生電力を大きくしようとして過回転に陥ってしまうような間違った制御を抑制することが可能である。 Further, in the control system 50 of the second embodiment, the reward calculation unit 63 calculates the first level reward when the regenerative power detected by the state detection unit 62 is less than a predetermined power threshold value. Further, the reward calculation unit 63 calculates a second level reward higher than the first level when the regenerative power detected by the state detection unit 62 is equal to or higher than a predetermined power threshold value. Thereby, the control system 50 of the embodiment can be controlled so that the regenerative power becomes large.
Further, in the control system 50 of the second embodiment, the reward calculation unit 63 extracts a target section including a deceleration section and a subsequent acceleration section based on the wind speed detected by the state detection unit 62, and the target section. When the added value of the regenerative power detected by the state detection unit 62 is equal to or higher than the predetermined power threshold value, the second level reward is calculated. As a result, in the control system 50 of the second embodiment, even if the rotation of the wind turbine may stall if the regenerative power is continuously output in the deceleration section, the output of the regenerative power is suppressed in the deceleration section to accelerate. It is possible to learn the control such as outputting the regenerative power higher in the section and control so that the total regenerative power in the target section becomes large.
Further, in the control system 50 of the second embodiment, the reward calculation unit 63 calculates the reward when the wind speed detected by the state detection unit 62 is less than a predetermined strong wind determination threshold value. As a result, in the control system 50 of the second embodiment, it is possible to suppress erroneous control that causes over-rotation in an attempt to increase the regenerative power even in a strong wind.
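The three reward rules of the second embodiment can be sketched together. Everything concrete here is an assumption made for illustration: the reward values, the function names, and in particular the reading of "deceleration section" as a run of falling wind speed followed by a run of rising wind speed; the specification does not define how the sections are detected.

```python
def power_reward(wind_speed, regen_power, power_threshold,
                 strong_wind_threshold, level1=0.0, level2=1.0):
    """Per-sample reward; None in strong wind, where no reward is calculated."""
    if wind_speed >= strong_wind_threshold:
        return None
    return level2 if regen_power >= power_threshold else level1


def extract_target_section(wind_speeds):
    """Return (start, end) indices of the first falling-then-rising run, or None."""
    n = len(wind_speeds)
    i = 0
    # Skip samples until the wind speed starts to fall.
    while i + 1 < n and wind_speeds[i + 1] >= wind_speeds[i]:
        i += 1
    start = i
    # Follow the deceleration section.
    while i + 1 < n and wind_speeds[i + 1] < wind_speeds[i]:
        i += 1
    if i == start:
        return None  # no deceleration section found
    bottom = i
    # Follow the subsequent acceleration section.
    while i + 1 < n and wind_speeds[i + 1] > wind_speeds[i]:
        i += 1
    if i == bottom:
        return None  # no acceleration section after the deceleration
    return start, i


def section_reward(wind_speeds, regen_powers, section_threshold, level2=1.0):
    """Second-level reward if the summed power over the target section is large enough.

    Returns None otherwise, since the text does not say what is awarded then.
    """
    section = extract_target_section(wind_speeds)
    if section is None:
        return None
    start, end = section
    total = sum(regen_powers[start:end + 1])
    return level2 if total >= section_threshold else None


if __name__ == "__main__":
    wind = [5, 6, 4, 3, 4, 6, 6]
    power = [10, 2, 1, 1, 8, 12, 9]
    print(extract_target_section(wind))
    print(section_reward(wind, power, section_threshold=20))
```

Rewarding the sum over the whole section, rather than each sample, is what lets the learner trade low output during deceleration for higher output during the following acceleration.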

また、第２の実施形態の学習装置７０は、風力発電システム１の風車２０の設置場所における風速を示す風速情報、風車２０の回転に関する回転情報、及び風力発電システム１により発電される回生電力に関する電力情報に基づいて、風速と回転と回生電力の対応情報を学習する強化学習部７１を備えるため、風速と風車の回転と回生電力の状態に応じて、どのように回生電力を制御すべきかを学習することができるため、風況が変化する場合であっても回生電力をより適切に制御することが可能となる。 Further, the learning device 70 of the second embodiment includes the reinforcement learning unit 71, which learns the correspondence information among wind speed, rotation, and regenerative power based on wind speed information indicating the wind speed at the installation location of the wind turbine 20 of the wind power generation system 1, rotation information regarding the rotation of the wind turbine 20, and power information regarding the regenerative power generated by the wind power generation system 1. The learning device 70 can therefore learn how the regenerative power should be controlled according to the state of the wind speed, the rotation of the wind turbine, and the regenerative power, which makes it possible to control the regenerative power more appropriately even when the wind conditions change.

また、第２の実施形態の制御装置６０は、風力発電システム１により発電された回生電力を制御する電力制御パラメータを整流・昇圧部３１に設定した場合における回生電力と風速とを検出する状態検出部６２と、状態検出部６２により検出された回生電力、風速、及び、風速と風車の回転と回生電力との対応情報（例えば、学習済みモデル記憶部６５に記憶された学習済みモデル）に基づいて、回生電力に関する電力情報を決定する決定部６６と、決定部６６により決定された電力情報に基づいて、回生電力を制御する制御部６７とを備える。これにより、実施形態の制御装置６０は、風速等の状態が学習済みモデルで学習済みの状態と似たような状態である場合に、学習済みモデルから出力された制御パラメータに応じた制御を行うことができ、より適切に制御することが可能となる。 Further, the control device 60 of the second embodiment includes: a state detection unit 62 that detects the regenerative power and the wind speed when a power control parameter for controlling the regenerative power generated by the wind power generation system 1 is set in the rectification/boosting unit 31; a determination unit 66 that determines power information regarding the regenerative power based on the regenerative power and the wind speed detected by the state detection unit 62 and on correspondence information among the wind speed, the rotation of the wind turbine, and the regenerative power (for example, the trained model stored in the trained model storage unit 65); and a control unit 67 that controls the regenerative power based on the power information determined by the determination unit 66. As a result, when the state such as the wind speed is similar to a state already learned by the trained model, the control device 60 of the embodiment can perform control according to the control parameters output from the trained model, and can therefore control the regenerative power more appropriately.

以上説明したように、第３の実施形態の制御システム５０は、風車２０の設置場所における風速を示す風速情報（例えば、風速センサ４１により検出された風速）、風車２０の回転に関する回転情報（例えば、回転速度センサ４２により検出された回転速度）、発電システムにより発電される回生電力に関する電力情報（例えば、電圧検出部３２により検出された回生電力の電圧、および電流検出部３３により検出された回生電力の電流）及び、風速と回転と回生電力との関係情報であって回転数の目標を含む前記風車に設定可能な前記回転数の範囲を示す関係情報に基づいて、風速と回転と回生電力との対応情報を学習する強化学習部７１と、風速と回転と回生電力の対応情報を記憶する学習済みモデル記憶部６５と、回転制御パラメータに基づいて制御した場合における回転速度、及び風速を検出する状態検出部６２と、状態検出部６２により検出された回転速度と風速、及び対応情報に基づいて、回転情報を決定する決定部６６と、決定部６６により決定された回転情報に基づいて、風車２０の回転を制御する制御部６７とを備える。これにより、実施形態の制御システム５０は、風況が変化した場合であっても発電電力量が最大となるように制御を行うことが可能となる。 As described above, the control system 50 of the third embodiment includes: a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power based on wind speed information indicating the wind speed at the installation location of the wind turbine 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information regarding the rotation of the wind turbine 20 (for example, the rotation speed detected by the rotation speed sensor 42), power information regarding the regenerative power generated by the power generation system (for example, the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33), and relationship information among wind speed, rotation, and regenerative power that indicates the range of rotation speeds that can be set for the wind turbine, including the target rotation speed; a trained model storage unit 65 that stores the correspondence information among wind speed, rotation, and regenerative power; a state detection unit 62 that detects the rotation speed and the wind speed when control is performed based on the rotation control parameter; a determination unit 66 that determines the rotation information based on the rotation speed and the wind speed detected by the state detection unit 62 and the correspondence information; and a control unit 67 that controls the rotation of the wind turbine 20 based on the rotation information determined by the determination unit 66. As a result, the control system 50 of the embodiment can perform control so that the amount of generated power is maximized even when the wind conditions change.

以上説明したように、第４の実施形態の制御システム５０は、風車２０の設置場所における風速を示す風速情報（例えば、風速センサ４１により検出された風速）、風車２０の回転に関する回転情報（例えば、回転速度センサ４２により検出された回転速度）、発電システムにより発電される回生電力に関する電力情報（例えば、電圧検出部３２により検出された回生電力の電圧、および電流検出部３３により検出された回生電力の電流）及び、風速と回転と回生電力との関係情報であって前記風速と前記風車が回転可能な回転数の最大値とを示す関係情報に基づいて、風速と回転と回生電力との対応情報を学習する強化学習部７１と、風速と回転と回生電力の対応情報を記憶する学習済みモデル記憶部６５と、回転制御パラメータに基づいて制御した場合における回転速度、および風速を検出する状態検出部６２と、状態検出部６２により検出された回転速度と風速、及び対応情報に基づいて、回転情報を決定する決定部６６と、決定部６６により決定された回転情報に基づいて、風車２０の回転を制御する制御部６７とを備える。これにより、実施形態の制御システム５０は、風況が変化した場合であってもトータルの発電電力が最大となるように制御を行うことが可能となる。 As described above, the control system 50 of the fourth embodiment includes: a reinforcement learning unit 71 that learns correspondence information among wind speed, rotation, and regenerative power based on wind speed information indicating the wind speed at the installation location of the wind turbine 20 (for example, the wind speed detected by the wind speed sensor 41), rotation information regarding the rotation of the wind turbine 20 (for example, the rotation speed detected by the rotation speed sensor 42), power information regarding the regenerative power generated by the power generation system (for example, the voltage of the regenerative power detected by the voltage detection unit 32 and the current of the regenerative power detected by the current detection unit 33), and relationship information among wind speed, rotation, and regenerative power that indicates the wind speed and the maximum rotation speed at which the wind turbine can rotate; a trained model storage unit 65 that stores the correspondence information among wind speed, rotation, and regenerative power; a state detection unit 62 that detects the rotation speed and the wind speed when control is performed based on the rotation control parameter; a determination unit 66 that determines the rotation information based on the rotation speed and the wind speed detected by the state detection unit 62 and the correspondence information; and a control unit 67 that controls the rotation of the wind turbine 20 based on the rotation information determined by the determination unit 66. As a result, the control system 50 of the embodiment can perform control so that the total generated power is maximized even when the wind conditions change.

上述した実施形態における制御システム５０、制御装置６０、及び学習装置７０の各々が行う処理の全部または一部をコンピュータで実現するようにしてもよい。その場合、この機能を実現するためのプログラムをコンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行することによって実現してもよい。なお、ここでいう「コンピュータシステム」とは、ＯＳや周辺機器等のハードウェアを含むものとする。また、「コンピュータ読み取り可能な記録媒体」とは、フレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ－ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置のことをいう。さらに「コンピュータ読み取り可能な記録媒体」とは、インターネット等のネットワークや電話回線等の通信回線を介してプログラムを送信する場合の通信線のように、短時間の間、動的にプログラムを保持するもの、その場合のサーバやクライアントとなるコンピュータシステム内部の揮発性メモリのように、一定時間プログラムを保持しているものも含んでもよい。また上記プログラムは、前述した機能の一部を実現するためのものであってもよく、さらに前述した機能をコンピュータシステムにすでに記録されているプログラムとの組み合わせで実現できるものであってもよく、ＦＰＧＡ等のプログラマブルロジックデバイスを用いて実現されるものであってもよい。 All or part of the processing performed by each of the control system 50, the control device 60, and the learning device 70 in the above-described embodiments may be realized by a computer. In that case, a program for realizing this function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into a computer system and executed. The term "computer system" as used herein includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. Furthermore, the "computer-readable recording medium" may also include a medium that dynamically holds the program for a short period of time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client in that case. The above program may realize only a part of the above-described functions, may realize the above-described functions in combination with a program already recorded in the computer system, or may be realized using a programmable logic device such as an FPGA.

以上、この発明の実施形態について図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計等も含まれる。 Although the embodiments of the present invention have been described in detail with reference to the drawings, the specific configuration is not limited to this embodiment, and includes designs and the like within a range that does not deviate from the gist of the present invention.

１風力発電システム 1 Wind power generation system
１０風力発電機本体 10 Wind power generator main body
２０風車 20 Wind turbine
３０発電機 30 Generator
３１整流・昇圧部 31 Rectification/boosting unit
３２電圧検出部 32 Voltage detection unit
３３電流検出部 33 Current detection unit
４１風速センサ 41 Wind speed sensor
４２回転速度センサ 42 Rotation speed sensor
５０制御システム 50 Control system
６０制御装置 60 Control device
６１パラメータ取得部 61 Parameter acquisition unit
６２状態検出部 62 State detection unit
６３報酬算出部 63 Reward calculation unit
６４報酬出力部 64 Reward output unit
６５学習済みモデル記憶部 65 Trained model storage unit
６６決定部 66 Determination unit
６７制御部 67 Control unit
７０学習装置 70 Learning device
７１強化学習部 71 Reinforcement learning unit

Claims

A control system that controls the number of revolutions of a wind turbine in a wind power generation system, the control system comprising:
a learning unit that learns correspondence information between wind speed and rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine of the wind power generation system, rotation information regarding the rotation of the wind turbine, and relationship information between the wind speed and the rotation, the relationship information indicating the wind speed and the maximum value of the rotation speed at which the wind turbine can rotate;
a storage unit that stores the correspondence information between the wind speed and the rotation;
a state detection unit that detects the rotation information and the wind speed information when a rotation control parameter for controlling the rotation speed of the wind turbine is set for the wind turbine;
a determination unit that determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information; and
a control unit that controls the rotation of the wind turbine based on the rotation information determined by the determination unit,
wherein, when, in a strong wind section of the time-series change of the wind speed in which the wind speed is equal to or higher than a predetermined strong wind determination threshold, the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than a predetermined margin threshold, and the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is less than a predetermined change threshold, a limit wind speed, which is the upper limit of the wind speed at which the wind power generation system can generate power, is increased.

The control system according to claim 1, further comprising a reward calculation unit that calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit,
wherein the relationship information includes the reward calculated by the reward calculation unit, and the learning unit learns the correspondence information by reinforcement learning based on the reward.

The control system according to claim 2, wherein the reward calculation unit calculates a reward based on the maximum value of the rotation speed, the rotation speed, and the rate of change of the rotation speed.

A control system that controls the number of revolutions of a wind turbine in a wind power generation system.
A learning unit that learns correspondence information between the wind speed and the rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine of the wind power generation system, rotation information regarding the rotation of the wind turbine, and relational information that concerns the relationship between the wind speed and the rotation and indicates the wind speed and the maximum value of the rotation speed at which the wind turbine can rotate.
A storage unit that stores information on the correspondence between the wind speed and the rotation,
A state detection unit that detects the rotation information and the wind speed information when the rotation control parameter for controlling the rotation speed of the wind turbine is set for the wind turbine.
A determination unit that determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information.
A control unit that controls the rotation of the wind turbine based on the rotation information determined by the determination unit is provided.
Further, a reward calculation unit for calculating a reward according to a predetermined reward condition based on the rotation information detected by the state detection unit and the wind speed information is provided.
The relational information includes the reward calculated by the reward calculation unit.
The learning unit learns the correspondence information by reinforcement learning based on the reward, and
The reward calculation unit calculates a reward based on the maximum value of the rotation speed, the rotation speed, and the rate of change of the rotation speed.
When the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than a predetermined margin threshold value, the reward calculation unit calculates a first-level reward if the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is equal to or greater than a predetermined change threshold value, and calculates a second-level reward, which is higher than the first level, if the rate of change is less than the change threshold value.
Control system.

The control system according to claim 4, wherein, when the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is equal to or greater than the margin threshold value, the reward calculation unit calculates a third-level reward that is higher than the first level and lower than the second level.
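The three reward levels of claims 4 and 5 can be sketched as a small function. This is an illustration only: it reads the third level as lying between the first and second levels, and the numeric reward values, the names, and the use of the absolute value of the rate of change are assumptions not given in the claims.

```python
def calculate_reward(rotation_speed, rotation_rate_of_change,
                     max_rotation_speed, margin_threshold, change_threshold,
                     first_level=-1.0, second_level=1.0, third_level=0.0):
    """Return a reward level; the default values are hypothetical and only
    preserve the ordering first_level < third_level < second_level."""
    margin = max_rotation_speed - rotation_speed
    if margin >= margin_threshold:
        # Far from the maximum rotation speed: intermediate (third) level.
        return third_level
    if abs(rotation_rate_of_change) >= change_threshold:
        # Close to the maximum but changing quickly: lowest (first) level.
        return first_level
    # Close to the maximum and steady: highest (second) level.
    return second_level
```

Under this reading, steady operation just below the maximum rotation speed is rewarded most, and operation near the maximum with a rapidly changing rotation speed is rewarded least.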

The control system according to claim 4 or 5, wherein, in a strong wind section, in which the wind speed in the time-series change of the wind speed is equal to or higher than a predetermined strong wind determination threshold value, when the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than the predetermined margin threshold value, and the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is less than the predetermined change threshold value, the limit wind speed, which is the upper limit of the wind speed at which the wind power generation system can generate power, is increased.

The learning device comprises a learning unit that learns correspondence information between the wind speed and the rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine of the wind power generation system, rotation information regarding the rotation of the wind turbine, power information regarding the regenerative power generated by the wind power generation system, and relational information that concerns the relationship between the wind speed and the rotation and indicates the limit wind speed, which is the limit of the wind speed at which the wind power generation system can generate power, and the maximum value of the rotation speed at which the wind turbine can rotate, and the learning device is capable of acquiring a reward obtained by a control device.
The control device comprises:
A state detection unit that detects the rotation information and the wind speed information when the rotation control parameter for controlling the rotation speed of the wind turbine is set for the wind turbine, and
A reward calculation unit that calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit.
The reward calculation unit calculates a reward based on the maximum value of the rotation speed, the rotation speed, and the rate of change of the rotation speed.
When the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than a predetermined margin threshold value, the reward calculation unit calculates a first-level reward if the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is equal to or greater than a predetermined change threshold value, and calculates a second-level reward, which is higher than the first level, if the rate of change is less than the change threshold value,
The relational information includes the reward calculated by the reward calculation unit.
The learning unit learns the correspondence information by reinforcement learning based on the reward.
Learning device.
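The claim requires only that the learning unit learn the correspondence information by reinforcement learning based on the reward; it does not name an algorithm. As one possible illustration, the sketch below uses tabular Q-learning, which is a technique swapped in here and not stated in the source. The class name, the state encoding (for example a pair of wind speed and rotation speed bins), the action labels, and the hyperparameters are all assumptions.

```python
from collections import defaultdict


class CorrespondenceLearner:
    """Hypothetical learning unit: keeps a table mapping (state, action)
    pairs to learned values, standing in for the correspondence information."""

    def __init__(self, actions, learning_rate=0.1, discount=0.9):
        self.actions = list(actions)
        self.learning_rate = learning_rate
        self.discount = discount
        self.q = defaultdict(float)  # (state, action) -> value, default 0.0

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.discount * best_next
        error = target - self.q[(state, action)]
        self.q[(state, action)] += self.learning_rate * error
```

Here `reward` would be the value acquired from the control device's reward calculation unit, and `action` would correspond to a choice of rotation control parameter.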

A state detection unit that detects, when a rotation control parameter for controlling the rotation speed of the wind turbine of the wind power generation system is set in the wind power generation system, rotation information regarding the rotation of the wind turbine, power information regarding the regenerative power generated by the wind power generation system, and wind speed information regarding the wind speed at the wind turbine.
A determination unit that determines the rotation information based on the power information and the wind speed information detected by the state detection unit and on the correspondence information between the wind speed and the rotation of the wind turbine.
A control unit that controls the rotation speed based on the rotation information determined by the determination unit is provided.
In a strong wind section, in which the wind speed in the time-series change of the wind speed is equal to or higher than a predetermined strong wind determination threshold value, when the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than a predetermined margin threshold value, and the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is less than a predetermined change threshold value, the limit wind speed, which is the upper limit of the wind speed at which the wind power generation system can generate power, is increased.
Control device.

A control method that controls the rotation speed of the wind turbine of the wind power generation system.
The learning unit learns correspondence information between the wind speed and the rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine of the wind power generation system, rotation information regarding the rotation of the wind turbine, and relational information that concerns the relationship between the wind speed and the rotation and indicates the wind speed and the maximum value of the rotation speed at which the wind turbine can rotate.
The storage unit stores the correspondence information between the wind speed and the rotation.
The state detection unit detects the rotation information and the wind speed information when the rotation control parameter for controlling the rotation speed is set in the wind turbine.
The determination unit determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information.
The control unit controls the rotation of the wind turbine based on the rotation information determined by the determination unit.
In a strong wind section, in which the wind speed in the time-series change of the wind speed is equal to or higher than a predetermined strong wind determination threshold value, when the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than a predetermined margin threshold value, and the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is less than a predetermined change threshold value, the limit wind speed, which is the upper limit of the wind speed at which the wind power generation system can generate power, is increased.
Control method.

A control method that controls the rotation speed of the wind turbine of the wind power generation system.
The learning unit learns correspondence information between the wind speed and the rotation based on wind speed information indicating the wind speed at the installation location of the wind turbine of the wind power generation system, rotation information regarding the rotation of the wind turbine, and relational information that concerns the relationship between the wind speed and the rotation and indicates the wind speed and the maximum value of the rotation speed at which the wind turbine can rotate.
The storage unit stores the correspondence information between the wind speed and the rotation.
The state detection unit detects the rotation information and the wind speed information when the rotation control parameter for controlling the rotation speed is set in the wind turbine.
The reward calculation unit calculates a reward according to a predetermined reward condition based on the rotation information and the wind speed information detected by the state detection unit.
The reward calculation unit calculates a reward based on the maximum value of the rotation speed, the rotation speed, and the rate of change of the rotation speed.
When the difference between the rotation speed detected by the state detection unit and the maximum value of the rotation speed is less than a predetermined margin threshold value, the reward calculation unit calculates a first-level reward if the rate of change of the rotation speed indicated in the rotation information detected by the state detection unit is equal to or greater than a predetermined change threshold value, and calculates a second-level reward, which is higher than the first level, if the rate of change is less than the change threshold value,
The relational information includes the reward calculated by the reward calculation unit.
The learning unit learns the correspondence information by reinforcement learning based on the reward, and
The determination unit determines the rotation information based on the rotation information and the wind speed information detected by the state detection unit and on the correspondence information.
A control method in which the control unit controls the rotation of the wind turbine based on the rotation information determined by the determination unit.
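The determination step of this method, in which the detected wind speed and rotation are combined with the stored correspondence information to decide how the turbine should be controlled, might look like the following sketch. The table layout (a dictionary keyed by a binned state and a command), the bin widths, and the command labels are assumptions made for illustration; the claims do not specify how the correspondence information is represented.

```python
def determine_rotation_command(correspondence, wind_speed, rotation_speed,
                               commands, wind_bin=1.0, rotation_bin=10.0):
    """Pick the command with the highest stored value for the current state.

    `correspondence` is a hypothetical dict mapping (state, command) to a
    learned value; unseen pairs are treated as 0.0.
    """
    # Discretize the detected values into a table key (assumed bin widths).
    state = (int(wind_speed // wind_bin), int(rotation_speed // rotation_bin))
    # Greedy choice; ties fall back to the first command listed.
    return max(commands, key=lambda c: correspondence.get((state, c), 0.0))
```

A control unit would then apply the returned command, for example by setting the corresponding rotation control parameter on the turbine.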