JP7517313B2 - Apparatus, method and program

Info

Publication number: JP7517313B2
Application number: JP2021191945A
Authority: JP
Inventors: 豪 ▲高▼見; 順二山本; 恵一郎小渕; 宏明鹿子木; 陽太古川
Original assignee: Yokogawa Electric Corp
Current assignee: Yokogawa Electric Corp
Priority date: 2021-11-26
Filing date: 2021-11-26
Publication date: 2024-07-17
Anticipated expiration: 2041-11-26
Also published as: JP2023078694A

Description

The present invention relates to an apparatus, a method, and a program.

Patent Document 1 states that "in response to input of measurement data, a learning process is executed for a first model that outputs recommended control parameters indicating a first type of control content recommended for increasing a reward value determined by a preset reward function."
[Prior Art Documents]
[Patent Documents]
[Patent Document 1] JP 2021-086283 A
[Patent Document 2] JP 2020-027556 A
[Patent Document 3] JP 2019-020885 A
[Non-Patent Documents]
[Non-Patent Document 1] Go Takami, "Realizing Plant Control AI," Yokogawa Technical Report, Yokogawa Electric Corporation, 2020, Vol. 63, No. 1, pp. 33-36
[Non-Patent Document 2] Takuji Imai, "Yokogawa Electric and NAIST Achieve Reinforcement Learning for Chemical Plants, Advanced Control with Fewer Trials," Nikkei Robotics, Nikkei BP, March 2019 issue

In a first aspect of the present invention, an apparatus is provided. The apparatus may include a supply unit that supplies a value of a state parameter to an operation model that, in response to input of a value of a state parameter related to equipment, outputs a recommended value of a control parameter of the equipment. The apparatus may include a control parameter acquisition unit that acquires the recommended value of the control parameter output from the operation model in response to the supply unit supplying the value of the state parameter to the operation model. The apparatus may include an acquisition unit that acquires a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired by the control parameter acquisition unit. The apparatus may include an evaluation unit that evaluates the operation model based on the model evaluation value and a reference evaluation value corresponding to a result of operating the equipment by manual operation.

The reference evaluation value may be calculated based on a result of inputting manual operations into a simulator of the equipment.

The model evaluation value may be calculated based on a result of inputting the recommended value acquired by the control parameter acquisition unit into a simulator of the equipment.

The model evaluation value may be calculated based on whether a parameter related to the equipment operated with the recommended value falls within a target range. The reference evaluation value may be calculated based on whether a parameter related to the equipment operated by manual operation falls within the target range.
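
The comparison described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the scoring rule (fraction of samples inside the target range), the "at least as good as manual operation" criterion, and all names (`TargetRange`, `evaluation_value`, `evaluate_operation_model`) are assumptions introduced here, since the text only states that both evaluation values may depend on whether a parameter falls within the target range.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TargetRange:
    """Closed interval [low, high] set for one selected parameter (hypothetical type)."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def evaluation_value(values: Sequence[float], target: TargetRange) -> float:
    """Assumed scoring rule: fraction of samples that fall within the target range."""
    if not values:
        raise ValueError("at least one sample is required")
    hits = sum(1 for v in values if target.contains(v))
    return hits / len(values)


def evaluate_operation_model(
    model_values: Sequence[float],
    manual_values: Sequence[float],
    target: TargetRange,
) -> bool:
    """Assumed criterion: the model passes when it scores at least as well as manual operation."""
    model_eval = evaluation_value(model_values, target)       # model evaluation value
    reference_eval = evaluation_value(manual_values, target)  # reference evaluation value
    return model_eval >= reference_eval
```

Using the same scoring rule for both runs keeps the two evaluation values directly comparable; any other shared metric could be substituted without changing the structure.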

The apparatus may further include a target range acquisition unit that acquires a target range set by an operator for a selected parameter that the operator selects from among multiple types of parameters related to the equipment.

The apparatus may further include a display control unit that, in response to the selected parameter being selected from among the multiple types of parameters, displays values of the selected parameter from past operation of the equipment.

The display control unit may display the values of each selected parameter from past operation of the equipment in a coordinate space whose coordinate axes are the selected parameters.

The equipment may be equipment that manufactures a product. The parameter related to the equipment may be at least one of an index value indicating the quality of the product or the production volume of the product.

The apparatus may further include a learning processing unit that executes learning processing of the operation model using learning data including values of the state parameter and values of the control parameter.

The learning processing unit may execute the learning processing of the operation model using the learning data and a reward value determined by a preset reward function.

In a second aspect of the present invention, a method is provided. The method may include a supply step of supplying a value of a state parameter to an operation model that, in response to input of a value of a state parameter indicating a state related to equipment, outputs a recommended value of a control parameter of the equipment. The method may include a control parameter acquisition step of acquiring the recommended value of the control parameter output from the operation model in response to the value of the state parameter being supplied to the operation model in the supply step. The method may include an acquisition step of acquiring a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired in the control parameter acquisition step. The method may include an evaluation step of evaluating the operation model based on the model evaluation value and a reference evaluation value corresponding to a result of operating the equipment by manual operation.

In a third aspect of the present invention, a program is provided. The program may cause a computer to function as a supply unit that supplies a value of a state parameter to an operation model that, in response to input of a value of a state parameter related to equipment, outputs a recommended value of a control parameter of the equipment. The program may cause the computer to function as a control parameter acquisition unit that acquires the recommended value of the control parameter output from the operation model in response to the supply unit supplying the value of the state parameter to the operation model. The program may cause the computer to function as an acquisition unit that acquires a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired by the control parameter acquisition unit. The program may cause the computer to function as an evaluation unit that evaluates the operation model based on the model evaluation value and a reference evaluation value corresponding to a result of operating the equipment by manual operation.

Note that the above summary of the invention does not enumerate all of the necessary features of the present invention. Subcombinations of these feature groups may also constitute inventions.

A system 1 according to an embodiment is shown.
The data structure of the operation model 401 is shown.
An action decision table is shown.
A learning operation of the operation model 401 is shown.
Another learning operation of the operation model 401 is shown.
A learning operation of the target setting model 414 is shown.
An evaluation operation of the operation model 401 is shown.
An operation of operating the equipment 2 is shown.
A target setting model 414A according to a modification is shown.
An example of a computer 2200 in which aspects of the present invention may be embodied, in whole or in part, is shown.

The present invention is described below through embodiments of the invention, but the following embodiments do not limit the invention according to the claims. Furthermore, not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention.

[1. System]
FIG. 1 shows a system 1 according to the present embodiment. The system 1 includes equipment 2 and an apparatus 4. Note that the blocks in the figure are functional blocks separated by function, and do not necessarily correspond to the actual apparatus configuration. That is, a block shown as a single block in the figure need not be implemented by a single apparatus, and blocks shown as separate blocks need not be implemented by separate apparatuses.

[1.1. Equipment 2]
The equipment 2 is a facility, apparatus, or the like in which one or more devices (not shown) are installed. For example, the equipment 2 may be a plant, or may be a composite apparatus combining multiple devices. Examples of the plant include industrial plants such as chemical and biotechnology plants, plants that manage and control the wellheads of gas fields, oil fields, and the like and their surroundings, plants that manage and control power generation such as hydroelectric, thermal, and nuclear power, plants that manage and control energy harvesting such as solar and wind power, and plants that manage and control water supply, sewage, dams, and the like.

Each device is an instrument, machine, or apparatus, and may be, for example, an actuator such as a valve, pump, heater, fan, motor, or switch that controls at least one physical quantity such as pressure, temperature, pH, speed, or flow rate in a process of the equipment 2. The devices may all be of different types, or at least two of them may be of the same type. Each device may be controlled by the apparatus 4 in a wired or wireless manner.

The equipment 2 may be provided with one or more sensors (not shown). Each sensor measures or determines a state related to the equipment 2. Each sensor may measure or determine an operating state such as the production volume of the equipment 2, the proportion of impurities mixed in, the operating status of each controlled object, or the occurrence of alarms. As an example, the operating status of a device may be represented by at least one physical quantity controlled by the device, such as pressure, temperature, pH, speed, or flow rate. Each sensor may supply the result of the measurement or determination to the apparatus 4.

[1.2. Apparatus 4]
The apparatus 4 may operate the equipment 2 using an operation model 401 for operating the equipment 2, and may include the operation model 401, an operation unit 402, and a parameter acquisition unit 403. The apparatus 4 may also perform learning processing of the operation model 401, and may include an input unit 411, a storage unit 412, a second learning processing unit 413, a target setting model 414, a first supply unit 415, a second acquisition unit 416, and a second supply unit 417. The apparatus 4 may also perform learning processing of the target setting model 414, and may include a first acquisition unit 421 and a first learning processing unit 422. The apparatus 4 may also evaluate the operation model 401, and may include a simulator 431, a display control unit 432, a target range acquisition unit 433, an evaluation value acquisition unit 434, and an evaluation unit 435.

[1.2.1. Operation Model 401]
The operation model 401 is a model for operating the equipment 2. The operation model 401 may output recommended values of control parameters of the equipment 2 in response to input of values of state parameters related to the equipment 2. The operation model 401 may supply the control parameters to the operation unit 402.

In response to input of values of the state parameters, the operation model 401 may output recommended values of the control parameters that bring the state of the equipment 2 closer to a state corresponding to the content of the target setting data used for learning of the operation model 401. The target setting data may include identification information of a parameter, among the parameters related to the equipment 2, for which a target range is set, and the target range set for that parameter. The target setting data may include only one combination of parameter identification information and a target range, or may include multiple such combinations.
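
One possible in-memory representation of the target setting data is sketched below, assuming each entry pairs a parameter identifier with a closed numeric range. The class names, the `unmet` helper, and the parameter identifiers are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TargetEntry:
    """One combination of parameter identification information and target range."""
    parameter_id: str
    target_range: Tuple[float, float]  # (low, high), assumed to be a closed interval


@dataclass
class TargetSettingData:
    """Holds one or more (parameter, target range) combinations."""
    entries: List[TargetEntry] = field(default_factory=list)

    def unmet(self, state: Dict[str, float]) -> List[str]:
        """Return ids of parameters that are missing or outside their target range."""
        result = []
        for entry in self.entries:
            low, high = entry.target_range
            value = state.get(entry.parameter_id)
            if value is None or not (low <= value <= high):
                result.append(entry.parameter_id)
        return result
```

For example, a data set with targets for a quality value and a production rate reports only the parameters that currently miss their ranges, which is the information a reward function or evaluation step would need.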

The parameters related to the equipment 2 may include state parameters related to the equipment 2 and control parameters of the equipment 2. The state parameters related to the equipment 2 may include state parameters of the equipment 2 itself and state parameters of the product produced by operation of the equipment 2 (also referred to as performance parameters). The state parameters of the equipment 2 may be, for example, pressure, flow rate, temperature, pH, speed, power consumption, or concentration. The state parameters of the equipment 2 may be parameters related to the energy consumption of the equipment 2, parameters related to greenhouse gas emissions, or parameters related to yield. The state parameters of the product may be, for example, an index value indicating quality (also referred to as a quality value) or a production volume. The quality value may be, for example, a value indicating the purity, concentration, composition, viscosity, or color of the product. The control parameter may be, for example, a valve manipulation amount. In the present embodiment, as an example, the parameter for which the target range is set may be a state parameter related to the equipment 2.

[1.2.2. Operation Unit 402]
The operation unit 402 operates the equipment 2 using the operation model 401. The operation unit 402 may be an example of the control parameter acquisition unit, and may acquire the recommended values of the control parameters output from the operation model 401 in response to the parameter acquisition unit 403, described later, supplying values of the state parameters to the operation model 401. The operation unit 402 may operate the equipment 2 by controlling each device of the equipment 2 using the control parameters output from the operation model 401. The operation unit 402 may also supply the recommended values of the control parameters output from the operation model 401 to the simulator 431 to have it simulate operation of the equipment 2.

Note that the operation unit 402 may operate the equipment 2 in response to manual operations input via the input unit 411. The operation unit 402 may also supply control parameters corresponding to the manual operations to the simulator 431 to have it simulate operation of the equipment 2.

[1.2.3. Parameter Acquisition Unit 403]
The parameter acquisition unit 403 acquires parameters related to the equipment 2. The parameter acquisition unit 403 may acquire the control parameters of the equipment 2 from the operation unit 402. The parameter acquisition unit 403 may acquire the state parameters related to the equipment 2 (in the present embodiment, as an example, the state parameters of the product and the state parameters of the equipment 2) from the equipment 2. However, the parameter acquisition unit 403 may instead acquire the control parameters of the equipment 2 from the equipment 2, and may acquire the state parameters of the product from the operator. When the simulator 431 simulates the equipment 2, the parameter acquisition unit 403 may acquire, from the simulator 431, the state parameters related to the equipment 2 among the parameters of the simulated equipment 2.

The parameter acquisition unit 403 may store each acquired parameter in the storage unit 412. The parameter acquisition unit 403 may be an example of the supply unit, and may supply values of the state parameters to the operation model 401.
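
The data flow among the parameter acquisition unit 403, the operation model 401, the operation unit 402, and the storage unit 412 can be sketched as a single control step. The function name, the callable-based interfaces, and the dictionary layout are assumptions made for illustration; the patent does not prescribe any particular interface.

```python
from typing import Callable, Dict, List

State = Dict[str, float]    # state parameter name -> value
Control = Dict[str, float]  # control parameter name -> recommended value


def run_operation_step(
    read_state: Callable[[], State],
    operation_model: Callable[[State], Control],
    apply_control: Callable[[Control], None],
    storage: List[dict],
) -> Control:
    """Run one acquire -> recommend -> operate -> store cycle (hypothetical helper)."""
    # Parameter acquisition unit 403: acquire state from the equipment or simulator.
    state = read_state()
    # Supply the state to operation model 401 and obtain the recommended values.
    recommended = operation_model(state)
    # Operation unit 402: drive the equipment 2 or the simulator 431.
    apply_control(recommended)
    # Storage unit 412: keep both parameter sets for later learning or evaluation.
    storage.append({"state": state, "control": recommended})
    return recommended
```

Because `apply_control` is only a callable, the same step can target the real equipment or a simulator, mirroring the two options the text gives for the operation unit 402.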

[1.2.4. Input Unit 411]
The input unit 411 receives various input operations from an operator. The input unit 411 may receive, from the operator, an input operation of an operation plan for the equipment 2. The input unit 411 may also receive, from the operator, an input operation of target setting data. When the learning processing of the target setting model 414 has not been completed, the operation plan and the target setting data may be input in association with each other.

Here, the operation plan for the equipment 2 may indicate at least one of a planned production volume, a target quality, and a material type of the product produced by the equipment 2. The operation plan for the equipment 2 may also indicate other content, such as the energy efficiency, power consumption, yield, or greenhouse gas emissions of the equipment 2.

The input unit 411 may store the input operation plan and target setting data in the storage unit 412. The input unit 411 may supply the input target setting data to the second learning processing unit 413. The input unit 411 may supply the input operation plan to the first supply unit 415.

[1.2.5. Storage Unit 412]
The storage unit 412 stores various data. The storage unit 412 may store each parameter acquired by the parameter acquisition unit 403. The storage unit 412 may also store the target setting data input via the input unit 411 and the operation plan for the equipment 2. The target setting data stored in the storage unit 412 may be data used for learning of the operation model 401 and, as an example, may be data set by an experienced operator. When target setting data is generated by supplying an operation plan to the target setting model 414 described later, the storage unit 412 may further store that target setting data and the operation plan. The data stored in the storage unit 412 may be used in the learning processing by the first learning processing unit 422 and the second learning processing unit 413.

[1.2.6. Second Learning Processing Unit 413]
The second learning processing unit 413 uses learning data including values of the state parameters related to the equipment 2 and values of the control parameters of the equipment 2 to execute learning processing of the operation model 401 so that the model outputs recommended values of the control parameters in response to input of values of the state parameters. The types of state parameters and control parameters included in the learning data used by the second learning processing unit 413 may be freely selected by the operator from among the parameters acquired by the parameter acquisition unit 403.

The second learning processing unit 413 may perform the learning processing of the operation model 401 by reinforcement learning. For example, the second learning processing unit 413 may execute the learning processing of the operation model 401 using the learning data and a reward value determined by a preset reward function.

The second learning processing unit 413 may additionally use target setting data in the learning processing of the operation model 401, and may perform the learning processing so that, in response to input of values of the state parameters, the model outputs values of the control parameters that bring the state of the equipment 2 closer to a state corresponding to the content of the target setting data. In this case, the second learning processing unit 413 may perform the learning processing using a reward value determined by a reward function set based on the content of the target setting data.

For example, the reward function may be a function that sets the reward value to 1 when the values of the state parameters related to the equipment 2 operated using the control parameters output from the operation model 401 satisfy the content of the target setting data, and sets the reward value to 0 when they do not. Alternatively, the reward function may be a function that varies the reward value according to the degree to which the values of the state parameters related to the equipment 2 operated using the control parameters output from the operation model 401 deviate from the target range of the target setting data. As an example, the reward function may be the function shown in the following formula (1).

Reward value = a * (energy saving index) - b * (degree of deviation of the quality value)   (1)
In formula (1), "a" and "b" may be coefficients. The "energy saving index" is an index indicating the degree of energy saving of the equipment 2, and may be a value calculated from the state parameters of the equipment 2. The "degree of deviation of the quality value" may be the amount by which the quality value of the product falls outside the target range of the quality value in the target setting data.
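
Both reward variants can be written down directly. The sketch below assumes the degree of deviation is the distance from the quality value to the nearest bound of the target range (zero inside the range) and that the coefficients default to 1.0; the text fixes neither choice, and the function names are hypothetical.

```python
from typing import Tuple


def quality_deviation(quality: float, low: float, high: float) -> float:
    """Assumed definition: distance by which quality lies outside [low, high]; 0.0 inside."""
    if quality < low:
        return low - quality
    if quality > high:
        return quality - high
    return 0.0


def reward(
    energy_saving_index: float,
    quality: float,
    target_range: Tuple[float, float],
    a: float = 1.0,
    b: float = 1.0,
) -> float:
    """Formula (1): a * energy saving index - b * degree of deviation of the quality value."""
    low, high = target_range
    return a * energy_saving_index - b * quality_deviation(quality, low, high)


def binary_reward(quality: float, target_range: Tuple[float, float]) -> int:
    """Alternative reward: 1 when the target range is satisfied, otherwise 0."""
    low, high = target_range
    return 1 if low <= quality <= high else 0
```

The ratio of `b` to `a` sets how much energy saving the learner may trade for a quality excursion, so in practice the two coefficients would be tuned together.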

[1.2.7. Target Setting Model 414]
In response to input of an operation plan, the target setting model 414 outputs at least one of the parameter identification information or the target range of the target setting data to be used for learning of the operation model 401. In the present embodiment, as an example, the target setting model 414 may output both the parameter identification information and the target range. The target setting model 414 may output only one combination of parameter identification information and a target range, or may output multiple combinations.

[1.2.8. First Supply Unit 415]
In response to input of an operation plan for the equipment 2, the first supply unit 415 supplies the operation plan to the target setting model 414. In response to a new operation plan being input to the input unit 411, the first supply unit 415 may supply the new operation plan to the target setting model 414. As a result, data corresponding to the operation plan is output from the target setting model 414.

[1.2.9. Second Acquisition Unit 416]
The second acquisition unit 416 acquires output data from the target setting model 414 to which the operation plan has been supplied by the first supply unit 415. In the present embodiment, as an example, the second acquisition unit 416 may acquire both the parameter identification information and the target range from the target setting model 414 as the output data. The second acquisition unit 416 may supply the acquired output data to the second supply unit 417.

[1.2.10. Second Supply Unit 417]
The second supply unit 417 supplies the second learning processing unit 413 with target setting data corresponding to the output data acquired by the second acquisition unit 416. As a result, the learning processing of the operation model 401 is performed using the target setting data supplied from the second supply unit 417.

In the present embodiment, as an example, the output data from the target setting model 414 includes both the parameter identification information and the target range. Therefore, the second supply unit 417 may supply the output data as-is to the operation model 401 as the target setting data.

[1.2.11. First Acquisition Unit 421]
The first acquisition unit 421 acquires the operation plan for the equipment 2 and at least the parameter identification information of the target setting data used for learning of the operation model 401. In the present embodiment, as an example, the first acquisition unit 421 may acquire both the parameter identification information and the target range of the target setting data. The first acquisition unit 421 may supply the acquired data to the first learning processing unit 422.

［１．２．１２．第１学習処理部４２２］
第１学習処理部４２２は、第１取得部４２１が取得したパラメータの識別情報および操業計画を含む学習データを用いて目標設定モデル４１４の学習処理を行う。 [1.2.12. First learning processing unit 422]
The first learning processing unit 422 performs learning processing of the target setting model 414 using learning data including the parameter identification information and the operation plan acquired by the first acquisition unit 421 .

第１学習処理部４２２は、ディープラーニングなどの教師あり学習によって目標設定モデル４１４の学習処理を行うが、他の機械学習の手法によって目標設定モデル４１４の学習を行ってもよい。例えば、第１学習処理部４２２は、操業モデル４０１の学習に用いられた目標設定データにおけるパラメータの識別情報と、当該目標設定データと対応付けて入力された操業計画とを含む学習データを用いて目標設定モデル４１４の学習処理を行ってよい。本実施形態では一例として、第１学習処理部４２２は、ベテランのオペレータにより設定されて操業モデル４０１の学習に用いられた目標設定データにおけるパラメータの識別情報と、当該目標設定データと対応付けて入力された操業計画とを含む学習データを用いて目標設定モデル４１４の学習処理を行ってよい。 The first learning processing unit 422 performs learning processing of the target setting model 414 by supervised learning such as deep learning, but may also perform learning of the target setting model 414 by other machine learning methods. For example, the first learning processing unit 422 may perform learning processing of the target setting model 414 using learning data including identification information of parameters in the target setting data used to learn the operation model 401 and an operation plan input in association with the target setting data. As an example in this embodiment, the first learning processing unit 422 may perform learning processing of the target setting model 414 using learning data including identification information of parameters in the target setting data set by a veteran operator and used to learn the operation model 401 and an operation plan input in association with the target setting data.

第１学習処理部４２２は、目標設定モデル４１４からの出力データの内容が、操業モデル４０１の学習に用いられた目標設定データの内容に近似するように、目標設定モデル４１４の学習処理を行ってよい。また、第１学習処理部４２２は、目標設定モデル４１４に対し、操業計画が入力されることに応じて、当該操業計画が達成されるために操業モデル４０１の学習に用いられるべき目標設定データのパラメータの識別情報や目標範囲を出力するように学習処理を行ってよい。 The first learning processing unit 422 may perform learning processing of the target setting model 414 so that the contents of the output data from the target setting model 414 approximate the contents of the target setting data used in learning the operation model 401. In addition, the first learning processing unit 422 may perform learning processing to output identification information and target ranges of parameters of the target setting data to be used in learning the operation model 401 in order to achieve an operation plan, in response to an operation plan being input to the target setting model 414.

第１学習処理部４２２は、第１取得部４２１が取得したパラメータの目標範囲をさらに含む学習データを用いて目標設定モデル４１４の学習処理を行ってよい。つまり、第１学習処理部４２２は、操業モデル４０１の学習に用いられた目標設定データにおけるパラメータの識別情報および目標範囲を含む学習データを用いて目標設定モデル４１４の学習処理を行ってよい。第１学習処理部４２２は、目標設定モデル４１４に対し、操業計画が入力されることに応じて、当該操業計画が達成されるために操業モデル４０１の学習に用いられるべき目標設定データのうち、パラメータの識別情報および目標範囲の両方を出力するように学習処理を行ってよい。 The first learning processing unit 422 may perform learning processing of the target setting model 414 using learning data that further includes the target range of the parameter acquired by the first acquisition unit 421. In other words, the first learning processing unit 422 may perform learning processing of the target setting model 414 using learning data that includes the identification information and target range of the parameter in the target setting data used in learning the operation model 401. The first learning processing unit 422 may perform the learning processing so that, in response to an operation plan being input, the target setting model 414 outputs both the identification information and the target range of the parameter from among the target setting data to be used in learning the operation model 401 to achieve the operation plan.

［１．２．１３．シミュレータ４３１］
シミュレータ４３１は、設備２の状態をシミュレーションする。シミュレータ４３１は、設備２の定常状態から停止までを動的にシミュレーションするダイナミックシミュレータでもよいし、設備２の定常状態をシミュレーションするスタティックシミュレータでもよい。 [1.2.13. Simulator 431]
The simulator 431 simulates the state of the facility 2. The simulator 431 may be a dynamic simulator that dynamically simulates the facility 2 from a steady state to a stop, or may be a static simulator that simulates the steady state of the facility 2.

シミュレータ４３１は、操業部４０２から供給される制御パラメータの値に基づいて操業された設備２に関する状態をシミュレートしてよい。操業部４０２から供給される制御パラメータの値は、操業モデル４０１から出力される制御パラメータの推奨値であってもよいし、人手の操作に応じた制御パラメータであってもよい。シミュレータ４３１は、シミュレーションにおいて操業された設備２に関するパラメータ（本実施形態では一例として、設備２に関する状態パラメータ）を評価値取得部４３４およびパラメータ取得部４０３に供給してよい。 The simulator 431 may simulate the state of the operated equipment 2 based on the values of the control parameters supplied from the operation unit 402. The values of the control parameters supplied from the operation unit 402 may be recommended values of the control parameters output from the operation model 401, or may be control parameters according to manual operation. The simulator 431 may supply parameters of the equipment 2 operated in the simulation (as an example in this embodiment, state parameters of the equipment 2) to the evaluation value acquisition unit 434 and the parameter acquisition unit 403.

［１．２．１４．表示制御部４３２］
表示制御部４３２は、図示しない表示装置に種々の情報を表示させる。例えば、表示制御部４３２は、目標設定モデル４１４から第２取得部４１６が取得したパラメータの識別情報や目標範囲を表示させてもよい。また、表示制御部４３２は、パラメータ取得部４０３により取得された各パラメータを記憶部４１２から読み出して表示させてよい。表示制御部４３２は、設備２に関する複数種類のパラメータから何れかのパラメータ（選択パラメータとも称する）がオペレータにより選択されることに応じて、設備２の過去の操業での当該選択パラメータの値を表示させてよい。 [1.2.14. Display control unit 432]
The display control unit 432 displays various information on a display device (not shown). For example, the display control unit 432 may display identification information and target ranges of parameters acquired by the second acquisition unit 416 from the target setting model 414. The display control unit 432 may read out and display each parameter acquired by the parameter acquisition unit 403 from the storage unit 412. In response to an operator selecting any parameter (also referred to as a selected parameter) from a plurality of types of parameters related to the equipment 2, the display control unit 432 may display the value of the selected parameter in past operations of the equipment 2.

なお、設備２が物の製造を行う場合には、表示制御部４３２と、後述の目標範囲取得部４３３、評価値取得部４３４および評価部４３５とにおいて、設備２に関するパラメータは、生産物に関する状態パラメータであってよく、本実施形態では一例として生産物の品質を示す指標値または生産物の生産量の少なくとも１つであってよい。これに加えて、または、これに代えて、設備２に関するパラメータは、設備２の状態パラメータ（一例として設備２のエネルギー効率や消費電力など）であってもよいし、設備２の制御パラメータであってもよい。 When equipment 2 manufactures goods, the parameters related to equipment 2 in the display control unit 432, the target range acquisition unit 433, the evaluation value acquisition unit 434, and the evaluation unit 435 described below may be state parameters related to the product, and in this embodiment, as an example, may be at least one of an index value indicating the quality of the product or the production volume of the product. In addition to this, or instead of this, the parameters related to equipment 2 may be state parameters of equipment 2 (for example, the energy efficiency and power consumption of equipment 2, etc.) or may be control parameters of equipment 2.

［１．２．１５．目標範囲取得部４３３］
目標範囲取得部４３３は、オペレータにより選択された選択パラメータについて、操業モデル４０１を評価するためにオペレータにより設定される目標範囲（評価用目標範囲とも称する）を取得する。例えば、目標範囲取得部４３３は、表示制御部４３２によって各選択パラメータについて表示される過去の操業での値に基づいてオペレータにより設定される評価用目標範囲を取得してよい。目標範囲取得部４３３は、取得した評価用目標範囲を評価値取得部４３４に供給してよい。 [1.2.15. Target range acquisition unit 433]
The target range acquisition unit 433 acquires a target range (also referred to as an evaluation target range) set by the operator for the selected parameter selected by the operator in order to evaluate the operation model 401. For example, the target range acquisition unit 433 may acquire the evaluation target range set by the operator based on the value in the past operation displayed for each selected parameter by the display control unit 432. The target range acquisition unit 433 may supply the acquired evaluation target range to the evaluation value acquisition unit 434.

なお、評価用目標範囲は、操業モデル４０１の学習に用いられた目標設定データ内の目標範囲と同じであってもよいし、異なってもよい。また、評価用目標範囲が設定されるパラメータは、操業モデル４０１の学習に用いられた目標設定データ内のパラメータと同じであってもよいし、異なってもよい。 The evaluation target range may be the same as or different from the target range in the target setting data used to train the operation model 401. The parameters for which the evaluation target range is set may be the same as or different from the parameters in the target setting data used to train the operation model 401.

［１．２．１６．評価値取得部４３４］
評価値取得部４３４は、操業部４０２により操業モデル４０１から取得された推奨値により設備２を操業した結果に応じたモデル評価値を取得する。評価値取得部４３４は、取得したモデル評価値を評価部４３５に供給してよい。 [1.2.16. Evaluation value acquisition unit 434]
The evaluation value acquisition unit 434 acquires a model evaluation value according to a result of operating the equipment 2 based on the recommended value acquired from the operation model 401 by the operation unit 402. The evaluation value acquisition unit 434 may supply the acquired model evaluation value to the evaluation unit 435.

モデル評価値は、操業モデル４０１を評価するための評価値であってよい。本実施形態においては一例として、モデル評価値は、操業モデル４０１から出力された推奨値によって操業された設備２に関するパラメータが評価用目標範囲に収まるか否かに基づいて算出されてよい。なお、モデル評価値は、第２学習処理部４１３が操業モデル４０１の強化学習において用いる報酬値と同じ値であってもよいし、異なる値であってもよい。 The model evaluation value may be an evaluation value for evaluating the operation model 401. In this embodiment, as an example, the model evaluation value may be calculated based on whether or not the parameters related to the equipment 2 operated according to the recommended value output from the operation model 401 fall within the evaluation target range. Note that the model evaluation value may be the same value as the reward value used by the second learning processing unit 413 in the reinforcement learning of the operation model 401, or may be a different value.

また、評価値取得部４３４は、設備２を人手の操作（一例としてベテランのオペレータの操作）により操業した結果に応じた基準評価値をさらに取得してよい。評価値取得部４３４は、取得した基準評価値を評価部４３５に供給してよい。 The evaluation value acquisition unit 434 may further acquire a reference evaluation value according to the results of operating the equipment 2 through manual operation (for example, operation by an experienced operator). The evaluation value acquisition unit 434 may supply the acquired reference evaluation value to the evaluation unit 435.

基準評価値は、モデル評価値の基準値であってよい。基準評価値は、人手の操作により操業された設備２に関するパラメータが目標範囲内に収まるか否かに基づいて、モデル評価値と同様に算出されてよい。 The reference evaluation value may be a reference value of the model evaluation value. The reference evaluation value may be calculated in the same manner as the model evaluation value, based on whether the parameters related to the equipment 2 operated by manual operation fall within a target range.
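The calculation of the model evaluation value and the reference evaluation value can be illustrated with a minimal sketch. It assumes, purely for illustration, that an evaluation value is the fraction of observed parameter values that fall within the evaluation target range; the text only states that the value is based on whether the parameter falls within the range. The function name, the sample values, and the range are all hypothetical.

```python
from typing import Sequence, Tuple


def evaluation_value(values: Sequence[float], target_range: Tuple[float, float]) -> float:
    """Return the fraction of observed values inside [low, high] (assumed metric)."""
    low, high = target_range
    if not values:
        raise ValueError("at least one observed value is required")
    in_range = sum(1 for v in values if low <= v <= high)
    return in_range / len(values)


# Hypothetical observations of one selected parameter (e.g. a quality index).
model_run = [9.8, 10.1, 10.4, 11.2, 10.0]    # equipment operated with recommended values
manual_run = [9.1, 10.2, 11.6, 10.9, 10.3]   # equipment operated manually
evaluation_target_range = (9.5, 11.0)        # hypothetical evaluation target range

# Both values are computed the same way, so they are directly comparable.
model_evaluation_value = evaluation_value(model_run, evaluation_target_range)
reference_evaluation_value = evaluation_value(manual_run, evaluation_target_range)
```

Because both values use the same range and the same calculation, the comparison between model-driven and manual operation stays like for like.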

［１．２．１７．評価部４３５］
評価部４３５は、モデル評価値、および、基準評価値に基づいて操業モデル４０１を評価する。評価部４３５は、モデル評価値と基準評価値との比較結果に基づいて操業モデル４０１を評価してよい。例えば、評価部４３５は、モデル評価値が基準評価値よりも良好な値である場合に、操業モデル４０１が良好である旨の評価を行ってよい。評価部４３５は、評価結果を表示制御部４３２などに出力してよい。 [1.2.17. Evaluation unit 435]
The evaluation unit 435 evaluates the operation model 401 based on the model evaluation value and the reference evaluation value. The evaluation unit 435 may evaluate the operation model 401 based on a comparison result between the model evaluation value and the reference evaluation value. For example, when the model evaluation value is a better value than the reference evaluation value, the evaluation unit 435 may evaluate that the operation model 401 is good. The evaluation unit 435 may output the evaluation result to the display control unit 432 or the like.
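The comparison performed by the evaluation unit 435 can be sketched as follows. This is an illustration only: the function name, the `higher_is_better` flag, and the returned labels are assumptions, since the text does not specify how "better" is defined for a given evaluation value.

```python
def evaluate_operation_model(model_value: float,
                             reference_value: float,
                             higher_is_better: bool = True) -> str:
    """Judge the operation model by comparing it with the manual-operation baseline."""
    if higher_is_better:
        better = model_value > reference_value
    else:
        # For evaluation values where a smaller number is preferable (assumed option).
        better = model_value < reference_value
    return "good" if better else "not good"


# Hypothetical evaluation values for model-driven and manual operation.
result = evaluate_operation_model(model_value=0.8, reference_value=0.6)
```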

以上の装置４によれば、目標設定モデル４１４は、操業計画が入力されることに応じて、操業モデル４０１の学習に用いるべき目標設定データのうちパラメータの識別情報または目標範囲の少なくとも一方を出力する。また、目標設定モデル４１４の学習処理は、設備２の操業計画と、操業モデル４０１の学習に用いられた目標設定データのうちの少なくともパラメータの識別情報とを含む学習データを用いて行われる。従って、目標設定モデル４１４からの出力データ（ここではパラメータの識別情報または目標範囲の少なくとも一方）の内容を、操業計画が達成されるために操業モデル４０１の学習において用いられた目標設定データの内容に近似させることができる。よって、目標設定モデル４１４からの出力データを用いて操業モデル４０１の学習処理を行うことにより、操業計画に応じた適切な操業を行う操業モデル４０１を生成することができる。 According to the device 4 described above, the target setting model 414 outputs, in response to the input of an operation plan, at least one of the parameter identification information or the target range from among the target setting data to be used for learning the operation model 401. In addition, the learning process of the target setting model 414 is performed using learning data that includes the operation plan of the equipment 2 and at least the parameter identification information from the target setting data used for learning the operation model 401. Therefore, the contents of the output data from the target setting model 414 (here, at least one of the parameter identification information or the target range) can be approximated to the contents of the target setting data used in learning the operation model 401 to achieve the operation plan. Accordingly, by performing the learning process of the operation model 401 using the output data from the target setting model 414, it is possible to generate an operation model 401 that performs appropriate operation according to the operation plan.

また、目標設定データを用い、状態パラメータの値が入力されることに応じて、設備２に関する状態を当該目標設定データの内容に応じた状態に近づける制御パラメータの値を出力するように操業モデル４０１の学習処理が行われる。従って、適切な操業状態で設備２の操業を行う操業モデル４０１を生成することができる。 In addition, using the target setting data, a learning process is performed on the operation model 401 so as to output the value of the control parameter that brings the state of the equipment 2 closer to the state corresponding to the content of the target setting data in response to the input of the state parameter value. Therefore, an operation model 401 that operates the equipment 2 in an appropriate operating state can be generated.

また、新たな操業計画が入力されることに応じて、当該操業計画が目標設定モデル４１４に供給され、目標設定モデル４１４からの出力データに応じた目標設定データを用いて操業モデル４０１の学習処理が行われる。従って、操業計画が変更されるごとに、操業計画に応じた目標設定データを用いて操業モデル４０１の学習処理を行い、操業計画に応じた適切な操業を行う操業モデル４０１を生成することができる。 In addition, when a new operation plan is input, the operation plan is supplied to the target setting model 414, and the operation model 401 undergoes a learning process using target setting data corresponding to the output data from the target setting model 414. Therefore, each time the operation plan is changed, the operation model 401 undergoes a learning process using target setting data corresponding to the operation plan, and an operation model 401 that performs appropriate operation according to the operation plan can be generated.

また、操業モデル４０１から出力される制御パラメータの推奨値により設備２を操業した結果に応じたモデル評価値と、設備２を人手の操作により操業した結果に応じた基準評価値と基づいて操業モデル４０１が評価される。従って、操業モデル４０１を用いることによる操業結果の良否、ひいては操業モデル４０１の良否を画一的に判断することができる。 The operation model 401 is evaluated based on a model evaluation value corresponding to the results of operating the equipment 2 using the recommended values of the control parameters output from the operation model 401, and a reference evaluation value corresponding to the results of operating the equipment 2 by manual operation. Therefore, the quality of the operation results obtained by using the operation model 401, and therefore the quality of the operation model 401, can be uniformly determined.

また、操業モデル４０１から出力される制御パラメータの推奨値により操業された設備２に関するパラメータが操業モデル４０１の評価用目標範囲内に収まるか否かに基づいてモデル評価値が算出され、人手の操作により操業された設備２に関するパラメータが評価用目標範囲内に収まるか否かに基づいて基準評価値が算出される。従って、操業モデル４０１を用いることによる操業結果の良否をいっそう画一的に判断することができる。 In addition, a model evaluation value is calculated based on whether the parameters of equipment 2 operated according to the recommended values of the control parameters output from the operation model 401 fall within the evaluation target range of the operation model 401, and a reference evaluation value is calculated based on whether the parameters of equipment 2 operated by manual operation fall within the evaluation target range. Therefore, the quality of the operation results by using the operation model 401 can be judged more uniformly.

また、設備２に関する複数種類のパラメータのうち、オペレータにより選択される選択パラメータについて、オペレータにより設定される評価用目標範囲が取得されるので、任意のパラメータについて任意の評価用目標範囲を設定することができる。従って、操業結果の評価基準を任意に設定することができる。 In addition, the evaluation target range set by the operator is acquired for the selected parameter that the operator selects from among the multiple types of parameters related to the equipment 2, so any evaluation target range can be set for any parameter. Therefore, the evaluation criteria for the operation results can be set arbitrarily.

また、複数種類のパラメータから選択パラメータが選択されることに応じて、設備２の過去の操業での当該選択パラメータの値が表示されるので、過去の選択パラメータの値に基づいて評価用目標範囲を設定することができる。 In addition, when a selected parameter is selected from the multiple types of parameters, the value of the selected parameter in past operations of the equipment 2 is displayed, so that the evaluation target range can be set based on the past values of the selected parameter.

また、操業モデル４０１の評価において、設備２に関するパラメータは設備２による生産物の品質を示す指標値または生産物の生産量の少なくとも１つであるので、生産量や品質が向上する操業モデル４０１を良好な操業モデル４０１とする評価結果を取得することができる。従って、評価の高い操業モデル４０１を用いることにより、生産量や品質を向上させることができる。 In addition, in the evaluation of the operation model 401, the parameter related to the equipment 2 is at least one of an index value indicating the quality of the product produced by the equipment 2 or the production volume of the product, so that an evaluation result can be obtained that determines that the operation model 401 that improves the production volume or quality is a good operation model 401. Therefore, by using a highly evaluated operation model 401, the production volume and quality can be improved.

また、状態パラメータの値、および、制御パラメータの値を含む学習データを用いて第２学習処理部４１３により操業モデル４０１の学習処理が実行されるので、評価の低い操業モデル４０１に学習処理を行い、評価の高い操業モデル４０１を得ることができる。 In addition, the second learning processing unit 413 performs learning processing of the operation model 401 using learning data including the values of the state parameters and the values of the control parameters, so that it is possible to perform learning processing on an operation model 401 with a low evaluation to obtain an operation model 401 with a high evaluation.

また、学習データと、予め設定された報酬関数により定まる報酬値とを用いて操業モデル４０１の学習処理が実行されるので、評価の高い操業モデル４０１を確実に得ることができる。 In addition, the learning process of the operation model 401 is performed using the learning data and a reward value determined by a preset reward function, so that a highly rated operation model 401 can be reliably obtained.

［２．操業モデル４０１］
図２は、操業モデル４０１のデータ構造を示す。操業モデル４０１は、サンプリングされた状態データの集合を示す状態ｓと各状態下に取られた行動ａとの組み合わせ（ｓ，ａ）と、報酬によって計算されたウエイトｗとで構成されるデータ構造を有する。なお、このようなウエイトは、目標設定データを用いた報酬関数により定まる報酬に基づいて決定されてよい。本図においては、一例として、状態ｓ＝（ＴＩ００１，ＴＩ００２，ＴＩ００３，ＦＩ００１，ＦＩ００２，ＶＩ００１）とした場合を示している。そして、本図においては、例えば、ｓ＝（－２．４７８０３，－２．４８４１３，－０．０７３２４，２９．７１１９１，２４．２５１１，７０）の状態下でａ＝１の行動が取られた場合に、報酬によって計算されたウエイトがｗ＝１４４．１４８４であることを意味している。このような操業モデル４０１により次の行動が決定される。 [2. Operation Model 401]
FIG. 2 shows the data structure of the operation model 401. The operation model 401 has a data structure consisting of combinations (s, a) of a state s, which represents a set of sampled state data, and an action a taken under that state, together with a weight w calculated from the reward. Such weights may be determined based on the reward given by a reward function that uses the target setting data. As an example, this figure shows a case where the state is s = (TI001, TI002, TI003, FI001, FI002, VI001). The figure indicates, for example, that when the action a = 1 is taken under the state s = (-2.47803, -2.48413, -0.07324, 29.71191, 24.2511, 70), the weight calculated from the reward is w = 144.1484. The next action is determined by such an operation model 401.
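The data structure described for FIG. 2 can be pictured as a list of weighted samples. The sketch below is one possible in-memory representation, not the format used by the device; the class name `Sample` and the field names are hypothetical, while the tag names and the numeric row are taken from the example in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Tag names of the state parameters, in the order used in the example.
STATE_TAGS = ("TI001", "TI002", "TI003", "FI001", "FI002", "VI001")


@dataclass(frozen=True)
class Sample:
    """One row of the operation model: a (state, action) pair and its weight."""
    state: Tuple[float, ...]   # sampled state data s, one value per tag
    action: int                # action a taken under that state
    weight: float              # weight w calculated from the reward


# The operation model is a collection of such rows; this is the row quoted in the text.
operation_model: List[Sample] = [
    Sample(
        state=(-2.47803, -2.48413, -0.07324, 29.71191, 24.2511, 70.0),
        action=1,
        weight=144.1484,
    ),
]
```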

図３は、行動決定テーブルを示す。行動決定テーブルは、入力された状態ｓと取り得る行動ａとで構成される。本図においては、一例として、入力された状態がｓ＝（０．１，０．２，０．４，０．３，０．８，０．２）であり、取り得る行動がａ＝（－３，－１，０，１，３）の５つである場合を示している。例えば、このような行動決定テーブルを図４に示される操業モデル４０１に入力することにより、次の行動が決定される。これについてフローを用いて詳細に説明する。 FIG. 3 shows the action decision table. The action decision table consists of the input state s and the possible actions a. As an example, this figure shows a case where the input state is s = (0.1, 0.2, 0.4, 0.3, 0.8, 0.2) and there are five possible actions, a = (-3, -1, 0, 1, 3). For example, the next action is determined by inputting such an action decision table into the operation model 401 shown in FIG. 4. This will be explained in detail using a flow.
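Building the action decision table amounts to pairing the single input state with every possible action. A minimal sketch follows, using the state and action values from the example; the function name and the row layout (state values followed by the action) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def build_action_decision_table(state: Sequence[float],
                                actions: Sequence[int]) -> List[Tuple[float, ...]]:
    """Return one row per candidate action: the state values followed by the action."""
    return [tuple(state) + (action,) for action in actions]


state = (0.1, 0.2, 0.4, 0.3, 0.8, 0.2)   # input state s from the example
actions = (-3, -1, 0, 1, 3)              # the five possible actions a
table = build_action_decision_table(state, actions)
```

Each row can then be compared against the stored samples of the operation model to score the corresponding action.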

［３．装置４の動作］
［３．１．操業モデル４０１の学習動作］
図４は、操業モデル４０１の学習動作を示す。装置１は、ステップＳ１０１～Ｓ１１９の処理により操業モデル４０１を生成してよい。 [3. Operation of Device 4]
[3.1. Learning Operation of Operation Model 401]
FIG. 4 shows the learning operation of the operation model 401. The device 1 may generate the operation model 401 by the processes of steps S101 to S119.

ステップＳ１０１において、第２学習処理部４１３は、目標設定データを取得する。本図の動作において第２学習処理部４１３は、入力部４１１を介して入力されたパラメータの識別情報と、当該パラメータの目標範囲とを含む目標設定データを取得してよい。 In step S101, the second learning processing unit 413 acquires target setting data. In the operation of this figure, the second learning processing unit 413 may acquire target setting data including identification information of a parameter input via the input unit 411 and a target range of the parameter.

なお、ステップＳ１０１において表示制御部４３２は、操業モデル４０１の学習で以前に用いられた目標設定データの内容を表示させてもよい。例えば、表示制御部４３２は、設備２に関する複数種類のパラメータの何れかが目標範囲の設定対象としてオペレータにより選択されることに応じて、当該パラメータについて以前に設定された目標範囲を表示させてよい。また、表示制御部４３２は、以前に設定された目標範囲に含まれる少なくとも一部の領域を、推奨される目標範囲としてさらに表示させてもよい。例えば、表示制御部４３２は、以前に設定された目標範囲のうち、予め指定された割合の中央部分の範囲を推奨される目標範囲として表示させてよい。一例として、パラメータＰａについて以前に設定された目標範囲が５～１５であり、パラメータＰｂについて以前に設定された目標範囲が１０～３０であり、指定割合が９０％である場合には、表示制御部４３２は、パラメータＰａについて推奨される目標範囲を６～１４、パラメータＰｂについて推奨される目標範囲を１２～２８としてよい。 In step S101, the display control unit 432 may display the contents of the target setting data previously used in the learning of the operation model 401. For example, the display control unit 432 may display the target range previously set for the parameter in response to the operator selecting one of the multiple types of parameters related to the equipment 2 as the target range setting target for the parameter. The display control unit 432 may further display at least a part of the area included in the previously set target range as the recommended target range. For example, the display control unit 432 may display the central range of the previously set target range with a pre-specified ratio as the recommended target range. As an example, if the previously set target range for the parameter Pa is 5 to 15, the previously set target range for the parameter Pb is 10 to 30, and the specified ratio is 90%, the display control unit 432 may set the recommended target range for the parameter Pa to 6 to 14 and the recommended target range for the parameter Pb to 12 to 28.
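The recommended target range can be worked out with a short calculation. The sketch below assumes that the "central portion" is obtained by trimming the previously set range symmetrically around its midpoint; the function name and arguments are hypothetical. Under this reading, the ranges quoted in the example (6 to 14 for Pa and 12 to 28 for Pb) are what the calculation yields for a ratio of 80%, while a ratio of 90% yields 5.5 to 14.5 and 11 to 29.

```python
from typing import Tuple


def central_range(low: float, high: float, ratio: float) -> Tuple[float, float]:
    """Return the central portion of [low, high] that covers `ratio` of its width."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    if high < low:
        raise ValueError("high must not be smaller than low")
    center = (low + high) / 2.0
    half_width = (high - low) * ratio / 2.0
    return (center - half_width, center + half_width)


# Previously set target ranges from the example in the text.
pa_80 = central_range(5.0, 15.0, 0.8)
pb_80 = central_range(10.0, 30.0, 0.8)
pa_90 = central_range(5.0, 15.0, 0.9)
pb_90 = central_range(10.0, 30.0, 0.9)
```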

表示制御部４３２は、目標範囲の設定対象の各パラメータを座標軸とする座標空間に、以前に設定された目標範囲を表示させてよい。また、表示制御部４３２は、推奨される目標範囲を座標空間にさらに表示させてよい。以前に設定された目標範囲が座標空間に表示される場合には、第２学習処理部４１３は、入力部４１１により座標空間内で範囲指定が行われることに応じて、その指定範囲を目標範囲として取得してよい。 The display control unit 432 may display a previously set target range in a coordinate space having each parameter for which the target range is set as a coordinate axis. The display control unit 432 may also display a recommended target range in the coordinate space. When a previously set target range is displayed in the coordinate space, the second learning processing unit 413 may acquire the specified range as the target range in response to a range specified in the coordinate space by the input unit 411.

ステップＳ１０３において、第２学習処理部４１３は、目標設定データを用いて報酬関数を決定する。第２学習処理部４１３は、操業モデル４０１により操業された設備２に関する状態が、目標設定データの内容に応じた状態に近づく場合に報酬値が高くなるように報酬関数を決定してよい。また、第２学習処理部４１３は、操業モデル４０１により操業された設備２に関する状態パラメータが、目標設定データの内容を満たす場合に報酬値が高くなるように報酬関数を決定してよい。 In step S103, the second learning processing unit 413 determines a reward function using the target setting data. The second learning processing unit 413 may determine the reward function such that the reward value becomes high when the state of the equipment 2 operated by the operation model 401 approaches a state corresponding to the contents of the target setting data. The second learning processing unit 413 may also determine the reward function such that the reward value becomes high when the state parameters of the equipment 2 operated by the operation model 401 satisfy the contents of the target setting data.
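One way to derive a reward function from target setting data is sketched below. The specific shape is an assumption chosen only to satisfy the two properties stated for step S103: a parameter inside its target range earns a fixed reward, and a parameter outside it is penalized in proportion to its distance from the range, so the reward rises as the state approaches the target. The type alias, function names, and parameter values are hypothetical.

```python
from typing import Callable, Dict, Mapping, Tuple

# Target setting data: parameter identification -> (low, high) target range, with low < high.
TargetSetting = Dict[str, Tuple[float, float]]


def make_reward_function(target_setting: TargetSetting) -> Callable[[Mapping[str, float]], float]:
    """Build a reward function from the target setting data (assumed reward shape)."""

    def reward(state: Mapping[str, float]) -> float:
        total = 0.0
        for name, (low, high) in target_setting.items():
            value = state[name]
            if low <= value <= high:
                total += 1.0  # the parameter satisfies its target range
            else:
                # Distance to the nearest bound, scaled by the width of the range.
                distance = (low - value) if value < low else (value - high)
                total -= distance / (high - low)
        return total

    return reward


reward_fn = make_reward_function({"TI001": (5.0, 15.0)})
inside = reward_fn({"TI001": 10.0})
near = reward_fn({"TI001": 16.0})
far = reward_fn({"TI001": 20.0})
```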

ステップＳ１０５において、パラメータ取得部４０３は、設備２に関する状態パラメータを取得する。例えば、パラメータ取得部４０３は、設備２またはシミュレータ４３１から状態パラメータを取得してよい。 In step S105, the parameter acquisition unit 403 acquires state parameters related to the equipment 2. For example, the parameter acquisition unit 403 may acquire the state parameters from the equipment 2 or the simulator 431.

ステップＳ１０７において、第２学習処理部４１３は、行動を決定し、決定した行動に応じた制御パラメータを決定する。例えば、第２学習処理部４１３は、ランダムに行動を決定する。なお、上述の説明では、第２学習処理部４１３がランダムに行動を決定する場合を一例として示したが、これに限定されるものではない。第２学習処理部４１３が行動を決定するにあたって、例えば、ＦＫＤＰＰ（ＦａｃｔｏｒｉａｌＫｅｒｎｅｌＤｙｎａｍｉｃＰｏｌｉｃｙＰｒｏｇｒａｍｍｉｎｇ）等の既知のＡＩアルゴリズムが用いられてもよい。このようなカーネル法を用いる場合、第２学習処理部４１３は、状態データから状態ｓのベクトルを生成する。次に、第２学習処理部４１３は、状態ｓと、取り得る全ての行動ａとの組み合わせを、例えば図３に示されるような行動決定テーブルとして生成する。そして、第２学習処理部４１３は、行動決定テーブルを、例えば図２に示されるような操業モデル４０１へ入力する。これに応じて、行動決定テーブルの各行と、操業モデル４０１のうちのウエイト列を除いた各サンプルデータとの間でカーネル計算が行われ、各サンプルデータとの間の距離がそれぞれ算出される。そして、各サンプルデータについて算出した距離にそれぞれのウエイト列の値を乗算したものが順次足し合わせられて、各行動における報酬期待値が計算される。操業モデル４０１は、このようにして計算された報酬期待値が最も高くなる行動を選択する。第２学習処理部４１３は、例えばこのようにして、更新中の操業モデルを用いて報酬期待値が最も高いと判断された行動を選択することにより行動を決定してもよい。学習時においては、第２学習処理部４１３は、ランダムに行動を決定するか、操業モデル４０１を用いて行動を決定するかを適宜選択しながら行動を決定すればよい。第２学習処理部４１３は、決定した行動に応じた制御パラメータを操業部４０２へ供給する。 In step S107, the second learning processing unit 413 determines an action and determines control parameters according to the determined action. For example, the second learning processing unit 413 determines an action at random. Random selection is only one example, and the determination is not limited to it. To determine an action, the second learning processing unit 413 may use a known AI algorithm such as FKDPP (Factorial Kernel Dynamic Policy Programming). When such a kernel method is used, the second learning processing unit 413 generates a vector of state s from the state data. Next, the second learning processing unit 413 generates the combinations of the state s and all possible actions a as an action decision table, for example, as shown in FIG. 3. Then, the second learning processing unit 413 inputs the action decision table into the operation model 401, for example, as shown in FIG. 2.
In response, a kernel calculation is performed between each row of the action decision table and each sample data item of the operation model 401, excluding the weight column, and the distance to each sample data item is calculated. The distance calculated for each sample data item is then multiplied by the value in the corresponding weight column, and the products are added together in sequence to calculate the expected reward value for each action. The operation model 401 selects the action with the highest expected reward value calculated in this way. The second learning processing unit 413 may determine the action in this way, for example, by selecting the action judged to have the highest expected reward value using the operation model being updated. During learning, the second learning processing unit 413 may determine actions while choosing as appropriate between determining an action at random and determining it using the operation model 401. The second learning processing unit 413 supplies the control parameters according to the determined action to the operation unit 402.
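The kernel-weighted scoring described for step S107 can be sketched as follows. This is not an implementation of FKDPP. It assumes a Gaussian kernel as the kernel calculation (the text does not name one), treats the kernel value as the quantity multiplied by each weight, and uses an epsilon-greedy rule as one possible way of choosing between random and model-based action selection. All function names, the bandwidth, and the sample values are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

# One stored sample of the operation model: (state, action, weight).
Sample = Tuple[Tuple[float, ...], int, float]


def gaussian_kernel(x: Sequence[float], y: Sequence[float], bandwidth: float = 1.0) -> float:
    """Gaussian kernel value between two (state, action) rows (assumed kernel)."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * bandwidth ** 2))


def expected_rewards(samples: List[Sample], state: Sequence[float],
                     actions: Sequence[int]) -> Dict[int, float]:
    """Sum weight * kernel value over all samples, for each candidate action."""
    scores: Dict[int, float] = {}
    for action in actions:
        row = tuple(state) + (action,)  # one row of the action decision table
        scores[action] = sum(
            weight * gaussian_kernel(row, tuple(s) + (a,)) for s, a, weight in samples
        )
    return scores


def choose_action(samples: List[Sample], state: Sequence[float], actions: Sequence[int],
                  epsilon: float = 0.1, rng: random.Random = random.Random()) -> int:
    """With probability epsilon act randomly, otherwise pick the highest-scoring action."""
    if rng.random() < epsilon:
        return rng.choice(list(actions))
    scores = expected_rewards(samples, state, actions)
    return max(actions, key=lambda a: scores[a])


# Hypothetical two-sample model over a two-dimensional state.
samples: List[Sample] = [((0.0, 0.0), 1, 2.0), ((0.0, 0.0), -1, 0.5)]
best = choose_action(samples, (0.0, 0.0), (-1, 0, 1), epsilon=0.0)
```

Setting `epsilon` to 0 makes the choice fully model-based, and setting it to 1 makes it fully random, which covers the two cases mentioned for the learning phase.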

ステップＳ１０９において、操業部４０２は、供給された制御パラメータに応じて設備２を操業する。操業部４０２は、供給された制御パラメータに応じてシミュレータ４３１にシミュレーションを行わせてもよい。 In step S109, the operation unit 402 operates the equipment 2 according to the supplied control parameters. The operation unit 402 may cause the simulator 431 to perform a simulation according to the supplied control parameters.

ステップＳ１１１において、パラメータ取得部４０３は、設備２に関する状態パラメータを取得する。これにより、決定された制御パラメータにより設備２が操業されたことに応じて変化した後の状態パラメータが取得される。なお、ステップＳ１０９においてシミュレーションが行われた場合には、パラメータ取得部４０３は、状態パラメータをシミュレータ４３１から取得してよい。 In step S111, the parameter acquisition unit 403 acquires state parameters related to the equipment 2. This acquires state parameters that have changed in response to the equipment 2 being operated according to the determined control parameters. Note that, if a simulation has been performed in step S109, the parameter acquisition unit 403 may acquire the state parameters from the simulator 431.

ステップＳ１１３において、第２学習処理部４１３は、取得されたパラメータに基づいて報酬値を算出する。第２学習処理部４１３は、ステップＳ１０３で決定した報酬関数を用いて報酬値を算出してよい。 In step S113, the second learning processing unit 413 calculates a reward value based on the acquired parameters. The second learning processing unit 413 may calculate the reward value using the reward function determined in step S103.

ステップＳ１１５において、第２学習処理部４１３は、制御パラメータの決定に応じたパラメータの取得処理が、指定されたステップ回数を超えたかどうか判定する。なお、このようなステップ回数は、予めオペレータにより指定されたものであってもよいし、学習対象期間（例えば１０日間等）を基に定められたものであってもよい。上述の処理が指定されたステップ回数を超えていないと判定された場合（ステップＳ１１５；Ｎｏ）、第２学習処理部４１３は、処理をステップＳ１０７に戻してフローを継続する。これにより、制御パラメータの決定に応じた状態パラメータの取得処理が指定されたステップ回数だけ実行される。 In step S115, the second learning processing unit 413 determines whether the process of acquiring parameters in response to the determination of the control parameters has exceeded a specified number of steps. Note that such a number of steps may be specified in advance by the operator, or may be determined based on a learning period (e.g., 10 days, etc.). If it is determined that the above-mentioned process has not exceeded the specified number of steps (step S115; No), the second learning processing unit 413 returns the process to step S107 and continues the flow. As a result, the process of acquiring state parameters in response to the determination of the control parameters is executed the specified number of steps.

ステップＳ１１５において、上述の処理が指定されたステップ回数を超えたと判定された場合（ステップＳ１１５；Ｙｅｓ）、第２学習処理部４１３は、処理をステップＳ１１７へ進める。ステップＳ１１７において、第２学習処理部４１３は、操業モデル４０１を更新する。例えば、第２学習処理部４１３は、図２に示される操業モデルにおけるウエイト列の値を上書きするほか、これまでに保存されていない新たなサンプルデータを操業モデル４０１に追加する。 If it is determined in step S115 that the above-mentioned processing has exceeded the specified number of steps (step S115; Yes), the second learning processing unit 413 advances the processing to step S117. In step S117, the second learning processing unit 413 updates the operation model 401. For example, the second learning processing unit 413 overwrites the values of the weight column in the operation model shown in FIG. 2, and adds new sample data that has not been saved to the operation model 401.

ステップＳ１１９において、第２学習処理部４１３は、操業モデル４０１の更新処理が、指定された繰り返し回数を超えたかどうか判定する。なお、このような繰り返し回数は、予めオペレータにより指定されたものであってもよいし、操業モデル４０１の妥当性に応じて定められたものであってもよい。上述の処理が指定された繰り返し回数を超えていないと判定された場合（ステップＳ１１９；Ｎｏ）、第２学習処理部４１３は、処理をステップＳ１０５へ戻してフローを継続する。 In step S119, the second learning processing unit 413 determines whether the update process of the operation model 401 has exceeded a specified number of repetitions. Note that such a number of repetitions may be specified in advance by an operator, or may be determined according to the validity of the operation model 401. If it is determined that the above-mentioned process has not exceeded the specified number of repetitions (step S119; No), the second learning processing unit 413 returns the process to step S105 and continues the flow.

ステップＳ１１９において、上述の処理が指定された繰り返し回数を超えたと判定された場合（ステップＳ１１９；Ｙｅｓ）、第２学習処理部４１３は、フローを終了する。第２学習処理部４１３は、例えばこのようにして、設備２に関する状態パラメータに応じた制御パラメータを出力する操業モデル４０１を生成することができる。 If it is determined in step S119 that the above-mentioned process has exceeded the specified number of repetitions (step S119; Yes), the second learning processing unit 413 ends the flow. In this way, for example, the second learning processing unit 413 can generate an operation model 401 that outputs control parameters according to state parameters related to the equipment 2.
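The control flow of steps S105 to S119 is a nested loop: an inner loop that collects experience for a specified number of steps, and an outer loop that updates the model a specified number of times. The sketch below shows only this structure. Every callable is a hypothetical stand-in for the corresponding unit, the toy equipment is a single number, and the loops run exactly the specified counts as a simplification of the "exceeded" checks in steps S115 and S119.

```python
import random
from typing import Callable, List, Tuple

Experience = Tuple[float, int, float]  # (state, action, reward)


def run_learning(acquire_state: Callable[[], float],
                 operate: Callable[[float, int], float],
                 decide_action: Callable[[float], int],
                 reward_fn: Callable[[float], float],
                 update_model: Callable[[List[Experience]], None],
                 steps_per_update: int,
                 repetitions: int) -> None:
    """Skeleton of the learning flow; step labels refer to the description above."""
    for _ in range(repetitions):                       # outer loop, closed by S119
        state = acquire_state()                        # S105: acquire state parameters
        collected: List[Experience] = []
        for _ in range(steps_per_update):              # inner loop, closed by S115
            action = decide_action(state)              # S107: decide action
            next_state = operate(state, action)        # S109 and S111: operate, re-acquire
            collected.append((state, action, reward_fn(next_state)))  # S113: reward
            state = next_state
        update_model(collected)                        # S117: update the operation model


# Toy stand-ins: the state is one number and each action shifts it.
rng = random.Random(0)
model: List[Experience] = []
run_learning(
    acquire_state=lambda: 0.0,
    operate=lambda s, a: s + a,
    decide_action=lambda s: rng.choice((-1, 0, 1)),
    reward_fn=lambda s: 1.0 if -2.0 <= s <= 2.0 else 0.0,
    update_model=model.extend,
    steps_per_update=5,
    repetitions=3,
)
```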

［３．２．操業モデル４０１の他の学習動作］
図５は、操業モデル４０１の他の学習動作を示す。目標設定モデル４１４の学習処理が完了している場合には、装置１は、ステップＳ１２１～Ｓ１２３，Ｓ１０３～Ｓ１１９の処理により操業モデル４０１を生成してもよい。 [3.2. Other learning operations of the operation model 401]
FIG. 5 shows another learning operation of the operation model 401. When the learning process of the target setting model 414 is completed, the device 1 may generate the operation model 401 by the processes of steps S121 to S123 and S103 to S119.

ステップＳ１２１において、第１供給部４１５は、新たに入力される操業計画を取得する。第１供給部４１５は、入力部４１１に新たに入力される操業計画を取得してよい。 In step S121, the first supply unit 415 acquires a newly input operation plan. The first supply unit 415 may acquire a newly input operation plan to the input unit 411.

ステップＳ１２３において、第１供給部４１５は、取得した操業計画を目標設定モデル４１４に供給する。これにより、新たに操業計画が入力されることに応じて、当該操業計画が目標設定モデル４１４に供給される。 In step S123, the first supply unit 415 supplies the acquired operation plan to the target setting model 414. As a result, when a new operation plan is input, the operation plan is supplied to the target setting model 414.

ステップＳ１２５において、第２供給部４１７は、目標設定モデルからの出力データ（本実施形態では一例として、パラメータの識別情報および目標範囲）を取得し、当該出力データに応じた目標設定データを取得する。第２供給部４１７は、出力データをそのまま目標設定データとして取得してよい。 In step S125, the second supply unit 417 acquires output data (as an example in this embodiment, parameter identification information and target range) from the target setting model, and acquires target setting data corresponding to the output data. The second supply unit 417 may acquire the output data as it is as the target setting data.

これに代えて、第２供給部４１７は、取得した出力データの内容を、推奨される目標設定データとして表示制御部４３２に表示させ、表示内容に基づいてオペレータにより入力される目標設定データを取得してもよい。一例として、或るパラメータについて目標範囲を１～１０とする出力データの内容が表示され、オペレータが当該パラメータの目標範囲を２～９と入力した場合には、第２供給部４１７は、当該パラメータについての目標範囲を２～９とする目標設定データを取得してよい。 Alternatively, the second supply unit 417 may cause the display control unit 432 to display the contents of the acquired output data as recommended target setting data, and acquire the target setting data input by the operator based on the displayed contents. As an example, if output data giving a target range of 1 to 10 for a certain parameter is displayed and the operator inputs a target range of 2 to 9 for that parameter, the second supply unit 417 may acquire target setting data in which the target range for the parameter is 2 to 9.

以降、上述のステップＳ１０３～Ｓ１１９と同様にして、操業モデル４０１の学習動作が行われてよい。 After this, the learning operation of the operation model 401 may be performed in the same manner as steps S103 to S119 described above.

［３．３．目標設定モデル４１４の学習動作］
図６は、目標設定モデル４１４の学習動作を示す。装置１は、ステップＳ１３１～Ｓ１３３の処理により目標設定モデル４１４を生成してよい。 [3.3. Learning Operation of Target Setting Model 414]
FIG. 6 shows the learning operation of the target setting model 414. The device 1 may generate the target setting model 414 by the processes of steps S131 to S133.

ステップＳ１３１において、第１取得部４２１は、設備２の操業計画と、操業モデル４０１の学習に用いられた目標設定データのうちの少なくともパラメータの識別情報と、を取得する。本実施形態では一例として、第１取得部４２１は、操業モデル４０１の学習に用いられた目標設定データのうち少なくともパラメータの識別情報と、当該目標設定データと対応付けて入力部４１１により入力された操業計画とを取得してよい。また、第１取得部４２１は、目標設定データのうちのパラメータの識別情報および目標範囲の両方を取得してよい。第１取得部４２１は、目標設定データにおけるパラメータの識別情報や、操業計画などを記憶部４１２から取得してよい。 In step S131, the first acquisition unit 421 acquires the operation plan of the equipment 2 and at least parameter identification information of the target setting data used to learn the operation model 401. As an example in this embodiment, the first acquisition unit 421 may acquire at least parameter identification information of the target setting data used to learn the operation model 401 and the operation plan input by the input unit 411 in association with the target setting data. The first acquisition unit 421 may acquire both the parameter identification information and the target range of the parameter in the target setting data. The first acquisition unit 421 may acquire the parameter identification information in the target setting data, the operation plan, and the like from the storage unit 412.

In step S133, the first learning processing unit 422 performs a learning process for the target setting model 414 using learning data that includes the parameter identification information and the operation plan acquired by the first acquisition unit 421. The first learning processing unit 422 may perform the learning process by supervised learning such as deep learning, and may perform it so that the content of the output data from the target setting model 414 approximates the content of the target setting data used in the learning of the operation model 401. Also, as an example in this embodiment, the first learning processing unit 422 may perform the learning process so that, in response to input of an operation plan, the target setting model 414 outputs the content of the target setting data that should be used in the learning of the operation model 401 in order to achieve that operation plan.

[3.4. Evaluation Operation of Operation Model 401]
Figure 7 shows the evaluation operation of the operation model 401. The device 1 may evaluate the generated operation model 401 by the processes of steps S141 to S175.

ステップＳ１４１において、表示制御部４３２は、設備２に関する複数種類のパラメータの何れかが選択パラメータとしてオペレータにより選択されることに応じて、設備２の過去の操業での当該選択パラメータの値を表示させる。例えば、表示制御部４３２は、各選択パラメータを座標軸とする座標空間に、設備２の過去の操業での各選択パラメータの値を表示させてよい。一例として、表示制御部４３２は、設備２の過去の操業での各選択パラメータの値をそれぞれ表示させてもよいし、設備２の過去の操業での各選択パラメータの値の最大値および最小値を表示させることで、各選択パラメータの値の範囲を表示させてもよい。 In step S141, in response to the operator selecting one of multiple types of parameters related to equipment 2 as a selected parameter, the display control unit 432 displays the value of the selected parameter in past operations of equipment 2. For example, the display control unit 432 may display the value of each selected parameter in past operations of equipment 2 in a coordinate space with each selected parameter as a coordinate axis. As one example, the display control unit 432 may display the value of each selected parameter in past operations of equipment 2, or may display the maximum and minimum values of each selected parameter in past operations of equipment 2 to display the range of values of each selected parameter.

ステップＳ１４３において、目標範囲取得部４３３は、選択パラメータについて、オペレータにより設定される評価用目標範囲を取得する。目標範囲取得部４３３は、表示制御部４３２によって表示される、各選択パラメータを座標軸とする座標空間内で範囲指定が行われることに応じて、その指定範囲を評価用目標範囲として取得してよい。 In step S143, the target range acquisition unit 433 acquires the evaluation target range set by the operator for the selected parameters. The target range acquisition unit 433 may acquire the specified range as the evaluation target range in response to range specification within a coordinate space displayed by the display control unit 432, with each selected parameter as its coordinate axis.

ステップＳ１４５において状態パラメータ取得部４０３は、設備２に関する状態パラメータをシミュレータ４３１から取得する。なお、ステップＳ１４５の処理が最初に実行される場合には、設備２の状態は、予め設定された初期状態であってよい。 In step S145, the state parameter acquisition unit 403 acquires state parameters related to the equipment 2 from the simulator 431. When the process of step S145 is executed for the first time, the state of the equipment 2 may be a preset initial state.

ステップＳ１４７においてシミュレータ４３１は、人手の操作に応じたシミュレーションを行う。シミュレータ４３１は、人手の操作に応じた制御パラメータに基づいて操業された設備２の状態をシミュレートしてよい。 In step S147, the simulator 431 performs a simulation according to the manual operation. The simulator 431 may simulate the state of the equipment 2 operated based on the control parameters according to the manual operation.

ステップＳ１４９においてシミュレータ４３１は、シミュレーションの終了が指示されたか否かを判定する。例えば、シミュレータ４３１は、入力部４１１を介してシミュレーションの終了指示が入力されたか否かを判定してよい。ステップＳ１４９においてシミュレーションの終了が指示されていないと判定した場合（ステップＳ１４９；Ｎｏ）には、上述のステップＳ１４５に処理が移行してよい。ステップＳ１４９においてシミュレーションの終了が指示されたと判定した場合（ステップＳ１４９；Ｙｅｓ）には、ステップＳ１５１に処理が移行してよい。 In step S149, the simulator 431 determines whether or not an instruction to end the simulation has been given. For example, the simulator 431 may determine whether or not an instruction to end the simulation has been input via the input unit 411. If it is determined in step S149 that an instruction to end the simulation has not been given (step S149; No), the process may proceed to step S145 described above. If it is determined in step S149 that an instruction to end the simulation has been given (step S149; Yes), the process may proceed to step S151.

ステップＳ１５１において評価値取得部４３４は、設備２を人手の操作により操業した結果に応じた基準評価値を取得する。本実施形態では一例として、基準評価値は、シミュレータ４３１に人手の操作を入力した結果に基づいて算出されてよい。 In step S151, the evaluation value acquisition unit 434 acquires a reference evaluation value according to the result of operating the equipment 2 by manual operation. In this embodiment, as an example, the reference evaluation value may be calculated based on the result of inputting the manual operation into the simulator 431.

基準評価値は、人手の操作に応じた制御パラメータによって操業された設備２に関するパラメータが評価用目標範囲に収まるか否かに基づいて算出されてよい。設備２に関する複数のパラメータのそれぞれについて評価用目標範囲が設定される場合には、基準評価値は、評価用目標範囲が設定されたパラメータ数（ａ）のうち、対応する目標範囲内に収まるパラメータ数（ｂ）の割合（つまりｂ／ａ）に基づいて算出されてよい。なお、基準評価値は、シミュレータ４３１により算出されてもよいし、操業された設備２に関するパラメータをシミュレータ４３１から取得した評価値取得部４３４によって算出されてもよい。 The reference evaluation value may be calculated based on whether the parameters for the equipment 2 operated by the control parameters according to the human operation fall within the target range for evaluation. When a target range for evaluation is set for each of the multiple parameters for the equipment 2, the reference evaluation value may be calculated based on the ratio (i.e., b/a) of the number of parameters (b) that fall within the corresponding target range out of the number of parameters (a) for which the target range for evaluation is set. Note that the reference evaluation value may be calculated by the simulator 431, or may be calculated by the evaluation value acquisition unit 434 that acquires the parameters for the operated equipment 2 from the simulator 431.
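The b/a calculation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `evaluation_value`, the parameter names, and the choice to use the ratio b/a directly as the evaluation value (the text only says the value is "based on" the ratio) are all assumptions, as is treating both range bounds as inclusive.

```python
from typing import Dict, Tuple

# (lower bound, upper bound); inclusive bounds are an assumption of this sketch
Range = Tuple[float, float]


def evaluation_value(final_values: Dict[str, float],
                     target_ranges: Dict[str, Range]) -> float:
    """Return b / a, where a is the number of parameters that have an
    evaluation target range and b is how many of them fall inside it."""
    a = len(target_ranges)
    if a == 0:
        raise ValueError("at least one evaluation target range is required")
    b = sum(
        1
        for name, (low, high) in target_ranges.items()
        if name in final_values and low <= final_values[name] <= high
    )
    return b / a


# Hypothetical parameters and values, for illustration only
ranges = {"temperature": (80.0, 90.0), "pressure": (1.0, 2.0), "flow": (10.0, 12.0)}
observed = {"temperature": 85.0, "pressure": 2.5, "flow": 11.0}
reference_value = evaluation_value(observed, ranges)  # 2 of 3 parameters are in range
```

The same function could be reused for the model evaluation value, since the text states that it is calculated in the same manner as the reference evaluation value.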

In step S153, the simulator 431 ends the simulation performed according to the manual operation. When the simulation ends, the simulated state of the equipment 2 may be reset to the initial state.

ステップＳ１６１において状態パラメータ取得部４０３は、設備２に関する状態パラメータをシミュレータ４３１から取得する。なお、ステップＳ１６１の処理が最初に実行される場合には、設備２の状態は、予め設定された初期状態であってよい。 In step S161, the state parameter acquisition unit 403 acquires state parameters related to the equipment 2 from the simulator 431. Note that when the process of step S161 is executed for the first time, the state of the equipment 2 may be a preset initial state.

In step S163, the state parameter acquisition unit 403 supplies the acquired state parameters to the operation model 401. This causes the operation model 401 to output recommended values for the control parameters.

ステップＳ１６５において操業部４０２は、操業モデル４０１から出力される制御パラメータの推奨値を取得する。 In step S165, the operation unit 402 obtains the recommended values of the control parameters output from the operation model 401.

ステップＳ１６７においてシミュレータ４３１は、操業モデル４０１からの制御パラメータの推奨値に応じたシミュレーションを行う。シミュレータ４３１は、制御パラメータの推奨値に基づいて操業された設備２の状態をシミュレートしてよい。 In step S167, the simulator 431 performs a simulation according to the recommended values of the control parameters from the operation model 401. The simulator 431 may simulate the state of the equipment 2 operated based on the recommended values of the control parameters.

ステップＳ１６９においてシミュレータ４３１は、シミュレーションの終了が指示されたか否かを判定する。ステップＳ１６９においてシミュレーションの終了が指示されていないと判定した場合（ステップＳ１６９；Ｎｏ）には、上述のステップＳ１６１に処理が移行してよい。ステップＳ１６９においてシミュレーションの終了が指示されたと判定した場合（ステップＳ１６９；Ｙｅｓ）には、ステップＳ１７１に処理が移行してよい。 In step S169, the simulator 431 determines whether or not an instruction to end the simulation has been given. If it is determined in step S169 that an instruction to end the simulation has not been given (step S169; No), the process may proceed to step S161 described above. If it is determined in step S169 that an instruction to end the simulation has been given (step S169; Yes), the process may proceed to step S171.

ステップＳ１７１において評価値取得部４３４は、設備２を制御パラメータの推奨値により操業した結果に応じたモデル評価値を取得する。本実施形態では一例として、モデル評価値は、操業部４０２により取得された推奨値を設備２のシミュレータ４３１に入力した結果に基づいて算出されてよい。モデル評価値は、基準評価値と同様にして、制御パラメータの推奨値によって操業された設備２に関するパラメータが評価用目標範囲に収まるか否かに基づいて算出されてよい。 In step S171, the evaluation value acquisition unit 434 acquires a model evaluation value according to the result of operating the equipment 2 with the recommended values of the control parameters. As an example in this embodiment, the model evaluation value may be calculated based on the result of inputting the recommended values acquired by the operation unit 402 to the simulator 431 of the equipment 2. The model evaluation value may be calculated, similar to the reference evaluation value, based on whether the parameters related to the equipment 2 operated with the recommended values of the control parameters fall within the evaluation target range.

ステップＳ１７３においてシミュレータ４３１は制御パラメータの推奨値に応じたシミュレーションを終了する。シミュレーションの終了により、シミュレートされた設備２の状態は初期状態にリセットされてよい。 In step S173, the simulator 431 ends the simulation according to the recommended values of the control parameters. Upon ending the simulation, the simulated state of the equipment 2 may be reset to the initial state.

ステップＳ１７５において評価部４３５は、モデル評価値、および、基準評価値に基づいて操業モデル４０１を評価する。本実施形態では一例として、評価部４３５は、モデル評価値が基準評価値よりも大きい場合に、操業モデル４０１が良好である旨の評価を行ってよい。 In step S175, the evaluation unit 435 evaluates the operation model 401 based on the model evaluation value and the reference evaluation value. As an example in this embodiment, the evaluation unit 435 may evaluate the operation model 401 as being good when the model evaluation value is greater than the reference evaluation value.
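The evaluation flow of steps S145 to S175 can be sketched as two simulator runs from the same initial state followed by a comparison. Everything in this sketch is a hypothetical stand-in: the toy `simulate` dynamics take the place of the simulator 431, a fixed step count replaces the operator's end instruction (S149, S169), and the policies are simple constant functions rather than a learned operation model.

```python
from typing import Callable, Dict, Tuple

State = Dict[str, float]
Policy = Callable[[State], float]  # maps state parameters to one control value


def simulate(policy: Policy, initial: State, steps: int) -> State:
    """Toy stand-in for the simulator: the level moves halfway toward the
    control value on every step."""
    state = dict(initial)  # each run starts from the same preset initial state
    for _ in range(steps):
        control = policy(state)
        state["level"] += 0.5 * (control - state["level"])
    return state


def fraction_in_range(state: State,
                      ranges: Dict[str, Tuple[float, float]]) -> float:
    """Share of ranged parameters whose final value lies inside its range."""
    hits = sum(1 for key, (low, high) in ranges.items() if low <= state[key] <= high)
    return hits / len(ranges)


def evaluate_model(manual: Policy, model: Policy, initial: State,
                   ranges: Dict[str, Tuple[float, float]], steps: int = 20) -> bool:
    reference_value = fraction_in_range(simulate(manual, initial, steps), ranges)  # S145-S151
    model_value = fraction_in_range(simulate(model, initial, steps), ranges)       # S161-S171
    return model_value > reference_value                                           # S175


# Hypothetical policies: the "manual" operator holds 3.0, the "model" recommends 5.0
is_good = evaluate_model(
    manual=lambda s: 3.0,
    model=lambda s: 5.0,
    initial={"level": 0.0},
    ranges={"level": (4.5, 5.5)},
)
```

Because both runs copy the same `initial` state, the comparison reflects only the difference between the two policies, which is the point the text makes about aligning the pre-operation state.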

以上の動作によれば、設備２のシミュレータ４３１に人手の操作を入力した結果に基づいて基準評価値が算出されるので、実際に設備２を操業することなく速やかに基準評価値を得ることができる。 According to the above operation, the reference evaluation value is calculated based on the results of manual operations input into the simulator 431 of the equipment 2, so the reference evaluation value can be obtained quickly without actually operating the equipment 2.

また、操業モデル４０１から取得された制御パラメータの推奨値を設備２のシミュレータ４３１に入力した結果に基づいてモデル評価値が算出されるので、実際に設備２を操業することなく速やかにモデル評価値を得ることができる。 In addition, the model evaluation value is calculated based on the results of inputting the recommended values of the control parameters obtained from the operation model 401 into the simulator 431 of the equipment 2, so that the model evaluation value can be obtained quickly without actually operating the equipment 2.

In addition, since the reference evaluation value and the model evaluation value are both calculated from simulation results, the equipment 2 can be placed in the same pre-operation state whether it is operated manually or using the operation model 401. Therefore, whether using the operation model 401 yields good or poor operation results can be judged accurately.

また、評価用目標範囲を設定する場合に、オペレータにより選択された各選択パラメータを座標軸とする座標空間に、設備２の過去の操業での各選択パラメータの値が表示されるので、選択パラメータの過去の値や、その範囲の把握を容易化し、評価用目標範囲の設定を容易化することができる。 In addition, when setting a target range for evaluation, the values of each selected parameter from past operations of equipment 2 are displayed in a coordinate space with each selected parameter selected by the operator as its coordinate axis, making it easy to understand the past values and ranges of the selected parameters and to set a target range for evaluation.

[3.5. Operation of Equipment 2]
Figure 8 shows the operation of the equipment 2. The device 1 may operate the equipment 2 by the processes of steps S181 to S191.

In step S181, the state parameter acquisition unit 403 acquires state parameters related to the equipment 2. In step S183, the state parameter acquisition unit 403 supplies the acquired state parameters to the operation model 401. As a result, recommended values of the control parameters are output from the operation model 401. In step S185, the operation unit 402 acquires the recommended values of the control parameters output from the operation model 401. In step S187, the operation unit 402 operates the equipment 2 according to the recommended values of the control parameters from the operation model 401.

In step S189, the operation unit 402 determines whether or not an instruction to end the operation has been given. If it is determined in step S189 that no such instruction has been given (step S189; No), the process may proceed to step S181 described above. If it is determined in step S189 that an instruction to end the operation has been given (step S189; Yes), the process proceeds to step S191, and the operation unit 402 ends the operation of the equipment 2.
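The loop formed by steps S181 to S191 can be sketched as below. The callables `read_state`, `operation_model`, `apply_control`, and `end_requested` are hypothetical placeholders for the state parameter acquisition, the operation model 401, the operation unit 402, and the end instruction; none of these names or signatures come from the source.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Control = Dict[str, float]


def run_operation(read_state: Callable[[], State],
                  operation_model: Callable[[State], Control],
                  apply_control: Callable[[Control], None],
                  end_requested: Callable[[], bool]) -> int:
    """Repeat acquire -> recommend -> operate until an end is requested.
    Returns the number of completed cycles."""
    cycles = 0
    while True:
        state = read_state()                  # S181: acquire state parameters
        recommended = operation_model(state)  # S183, S185: supply state, get recommendation
        apply_control(recommended)            # S187: operate with the recommended values
        cycles += 1
        if end_requested():                   # S189: has an end been instructed?
            break                             # S191: end the operation
    return cycles


# Hypothetical stand-ins: the end is requested after the third cycle
applied: List[Control] = []
end_flags = [False, False, True]
cycles = run_operation(
    read_state=lambda: {"temperature": 85.0},
    operation_model=lambda s: {"valve_opening": 0.01 * s["temperature"]},
    apply_control=applied.append,
    end_requested=lambda: end_flags.pop(0),
)
```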

[4. Modification (1) of Target Setting Model 414]
In the above embodiment, the target setting model 414 was described as being subjected to a learning process using learning data that includes the operation plan and the parameter identification information and target ranges in the target setting data used to learn the operation model 401, and the target setting model 414 that has undergone the learning process was described as outputting, in response to input of an operation plan, the parameter identification information and target ranges of the target setting data to be used to learn the operation model 401. However, the combination of learning data, input data, and output data for the target setting model 414 is not limited to this.

For example, the target setting model 414 may be subjected to a learning process using learning data that includes the operation plan and the parameter identification information in the target setting data used to learn the operation model 401, and, in response to input of an operation plan, may output only the parameter identification information of the target setting data to be used to learn the operation model 401, without outputting the target range. The learning data may include both the parameter identification information and the target range in the target setting data used to learn the operation model 401. The target setting model 414 may output the identification information of a single parameter or of multiple parameters. When only parameter identification information is output from the target setting model 414, the second supply unit 417 may cause the display control unit 432 to display the identification information of each output parameter, acquire the target range input by the operator for each piece of identification information, generate target setting data indicating the identification information and the target ranges, and supply it to the second learning processing unit 413.

In addition, the target setting model 414 may be subjected to a learning process using learning data that includes the operation plan and the parameter identification information and target range in the target setting data used to learn the operation model 401, and, in response to input of an operation plan and the identification information of a parameter for which a target range is to be set, may output only the target range for that parameter from the target setting data to be used to learn the operation model 401, without outputting the parameter identification information. In this case, the identification information of a single parameter for which a target range is to be set may be input by the operator to the target setting model 414 via the input unit 411, and the target range of that single parameter may be output from the target setting model 414. Alternatively, the identification information of multiple parameters for which target ranges are to be set may be input by the operator to the target setting model 414 via the input unit 411, and the target range of each parameter may be output from the target setting model 414. The second supply unit 417 may generate target setting data indicating the identification information of each parameter input to the target setting model 414 and the target range of each parameter output from the target setting model 414, and supply it to the second learning processing unit 413.

[5. Modification (2) of Target Setting Model 414]
Furthermore, in the above embodiment, the target setting model 414 has been described as a single model, but it may have a plurality of models with different functions.

図９は、本変形例に係る目標設定モデル４１４Ａを示す。目標設定モデル４１４Ａは、少なくとも１つのパラメータ設定モデル４１４１と、少なくとも１つの目標範囲設定モデル４１４２とを有してよい。本実施形態においては一例として、目標設定モデル４１４Ａは、２つのパラメータ設定モデル４１４１ａ，４１４１ｂと、４つの目標範囲設定モデル４１４２ａ～４１４２ｄとを有する。各パラメータ設定モデル４１４１は、操業計画が入力されることに応じて、目標範囲の設定対象とされるべきパラメータの識別情報を出力する。各目標範囲設定モデル４１４２は、操業計画と、目標範囲の設定対象とされるべきパラメータの識別情報とが入力されることに応じて、当該パラメータに対して設定されるべき目標範囲を出力する。 Figure 9 shows the target setting model 414A according to this modified example. The target setting model 414A may have at least one parameter setting model 4141 and at least one target range setting model 4142. In this embodiment, as an example, the target setting model 414A has two parameter setting models 4141a and 4141b and four target range setting models 4142a to 4142d. Each parameter setting model 4141 outputs identification information of a parameter for which a target range is to be set in response to input of an operation plan. Each target range setting model 4142 outputs a target range to be set for that parameter in response to input of an operation plan and identification information of a parameter for which a target range is to be set.

このうち、パラメータ設定モデル４１４１ａは、操業計画が入力されることに応じて、目標範囲の設定対象とされるべきパラメータ（パラメータＰａとも称する）の識別情報を出力してよい。本実施形態においては一例として、パラメータ設定モデル４１４１ａは、第１供給部４１５から操業計画が入力されることに応じて、当該操業計画と、パラメータＰａの識別情報とを目標範囲設定モデル４１４２ａに供給する。 Of these, the parameter setting model 4141a may output identification information of a parameter (also referred to as parameter Pa) for which a target range is to be set in response to input of an operation plan. In the present embodiment, as an example, in response to input of an operation plan from the first supply unit 415, the parameter setting model 4141a supplies the operation plan and identification information of parameter Pa to the target range setting model 4142a.

The target range setting model 4142a may output the upper limit of the target range to be set for a parameter in response to input of the operation plan and the identification information of the parameter. In the present embodiment, as an example, in response to input of the operation plan and the identification information of the parameter Pa from the parameter setting model 4141a, to which the operation plan has been input by the first supply unit 415, the target range setting model 4142a supplies the operation plan, the identification information of the parameter Pa, and the upper limit value V_{PaMAX} of the target range of the parameter Pa to the target range setting model 4142b. Note that, instead of being input from the first supply unit 415 to the target range setting model 4142a via the parameter setting model 4141a, the operation plan may be input directly from the first supply unit 415 to the target range setting model 4142a. The same applies to the target range setting models 4142b to 4142d and the parameter setting model 4141b described below.

The target range setting model 4142b may output the lower limit of the target range to be set for a parameter in response to input of the operation plan and the identification information of the parameter. In the present embodiment, as an example, in response to input of the operation plan, the identification information of the parameter Pa, and the upper limit value V_{PaMAX} of the target range of the parameter Pa from the parameter setting model 4141a, to which the operation plan has been input by the first supply unit 415, the target range setting model 4142b supplies the operation plan, the identification information of the parameter Pa, and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of the target range of the parameter Pa to the parameter setting model 4141b.

In response to input of the operation plan and the identification information of the parameter Pa for which a target range has already been set, the parameter setting model 4141b may output the identification information of another parameter (also referred to as parameter Pb) for which a target range should be set. In the present embodiment, as an example, in response to input of the operation plan, the identification information of the parameter Pa, and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of its target range from the target range setting model 4142b, the parameter setting model 4141b supplies the operation plan, the identification information and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of the parameter Pa, and the identification information of the parameter Pb, which is different from the parameter Pa, to the target range setting model 4142c.

The target range setting model 4142c may output the upper limit of the target range to be set for a parameter in response to input of the operation plan and the identification information of the parameter. In the present embodiment, as an example, in response to the operation plan, the identification information of the parameter Pa and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of its target range, and the identification information of the parameter Pb being supplied from the parameter setting model 4141b, the target range setting model 4142c supplies the operation plan, the identification information of the parameter Pa, the upper and lower limit values V_{PaMAX}, V_{PaMIN} of the target range of the parameter Pa, the identification information of the parameter Pb, and the upper limit value V_{PbMAX} of the target range of the parameter Pb to the target range setting model 4142d.

The target range setting model 4142d may output the lower limit of the target range to be set for a parameter in response to input of the operation plan and the identification information of the parameter. In the present embodiment, as an example, in response to the operation plan, the identification information of the parameter Pa and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of its target range, the identification information of the parameter Pb, and the upper limit value V_{PbMAX} of the target range of the parameter Pb being supplied from the target range setting model 4142c, the target range setting model 4142d outputs the operation plan, the identification information of the parameter Pa, the upper and lower limit values V_{PaMAX}, V_{PaMIN} of the target range of the parameter Pa, the identification information of the parameter Pb, and the upper and lower limit values V_{PbMAX}, V_{PbMIN} of the target range of the parameter Pb.
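The chain of models described above (4141a, 4142a, 4142b, then 4141b, 4142c, 4142d) can be sketched as a pipeline in which each stage picks a parameter, then an upper limit, then a lower limit, while passing along everything decided so far. The type aliases, the function `build_target_settings`, and the toy lambda "models" are hypothetical; real models would be learned, and their exact inputs may differ from this simplification.

```python
from typing import Callable, Dict, List, Tuple

Plan = Dict[str, float]
Ranges = Dict[str, Tuple[float, float]]  # parameter id -> (lower limit, upper limit)

# Hypothetical signatures for the three kinds of sub-model
ParamModel = Callable[[Plan, Ranges], str]               # like 4141a / 4141b
UpperModel = Callable[[Plan, str, Ranges], float]        # like 4142a / 4142c
LowerModel = Callable[[Plan, str, float, Ranges], float] # like 4142b / 4142d
Stage = Tuple[ParamModel, UpperModel, LowerModel]


def build_target_settings(plan: Plan, stages: List[Stage]) -> Ranges:
    """Run the stages in order; each stage sees the plan plus all
    parameters and limits settled by earlier stages."""
    settled: Ranges = {}
    for pick_param, upper_model, lower_model in stages:
        param_id = pick_param(plan, settled)              # which parameter gets a range
        upper = upper_model(plan, param_id, settled)      # its upper limit
        lower = lower_model(plan, param_id, upper, settled)  # its lower limit
        settled[param_id] = (lower, upper)
    return settled


# Toy stand-ins for learned models, for illustration only
stage_a: Stage = (
    lambda plan, settled: "Pa",
    lambda plan, pid, settled: plan["throughput"] * 1.1,
    lambda plan, pid, upper, settled: upper * 0.8,
)
stage_b: Stage = (
    lambda plan, settled: "Pb" if "Pa" in settled else "Pa",
    lambda plan, pid, settled: plan["throughput"] * 0.5,
    lambda plan, pid, upper, settled: upper * 0.9,
)
target_settings = build_target_settings({"throughput": 100.0}, [stage_a, stage_b])
```

Passing `settled` into the second stage mirrors how the parameter setting model 4141b receives the identification information and limits of Pa before choosing Pb.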

以上の目標設定モデル４１４Ａによれば、パラメータ設定モデル４１４１に操業計画が入力されることに応じて、目標範囲の設定対象とされるべきパラメータの識別情報が出力される。従って、目標範囲が設定されるパラメータを、操業モデル４０１の学習に用いられた目標設定データのパラメータに合わせることができる。 According to the above-described target setting model 414A, when an operation plan is input to the parameter setting model 4141, identification information of the parameters for which the target range should be set is output. Therefore, the parameters for which the target range is set can be matched to the parameters of the target setting data used to train the operation model 401.

また、パラメータ設定モデル４１４１ｂに操業計画と、既に目標範囲の設定対象とされたパラメータＰａの識別情報とが入力されることに応じて、目標範囲の設定対象とされるべきパラメータＰｂの識別情報が出力される。従って、目標範囲が設定されるパラメータＰｂを、操業モデル４０１の学習においてパラメータＰａとともに目標設定データに用いられたパラメータに合わせることができる。 In addition, in response to inputting an operation plan and identification information of parameter Pa for which a target range has already been set to parameter setting model 4141b, identification information of parameter Pb for which a target range should be set is output. Therefore, parameter Pb for which a target range is set can be matched to a parameter used in the target setting data together with parameter Pa in the learning of operation model 401.

Furthermore, in response to input of the operation plan, the identification information of the parameter Pa for which a target range has already been set, and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of that target range to the parameter setting model 4141b, the identification information of the parameter Pb for which a target range should be set is output. Therefore, the parameter Pb for which a target range is set can be matched to the parameter that was used in the target setting data, together with the identification information and the upper and lower limit values V_{PaMAX}, V_{PaMIN} of the parameter Pa, in learning the operation model 401.

また、目標範囲設定モデル４１４２に操業計画と、目標範囲の設定対象とされるべきパラメータＰａ，Ｐｂの識別情報とが入力されることに応じて、当該パラメータＰａ，Ｐｂに対して設定されるべき目標範囲が出力される。従って、パラメータＰａ，Ｐｂに対して設定されるべき目標範囲を、操業モデル４０１の学習に用いられた目標設定データの目標範囲に合わせることができる。 In addition, in response to inputting an operation plan and identification information of parameters Pa and Pb for which the target ranges are to be set to the target range setting model 4142, the target ranges to be set for the parameters Pa and Pb are output. Therefore, the target ranges to be set for parameters Pa and Pb can be matched to the target ranges of the target setting data used to train the operation model 401.

Furthermore, in response to an operation plan being input to the parameter setting model 4141a, the identification information of the parameter Pa for which a target range is to be set is output from the parameter setting model 4141a, and in response to the operation plan being input to the target range setting model 4142a and the identification information of the parameter Pa being input from the parameter setting model 4141, the target range V_{PaMAX}, V_{PaMIN} to be set for the parameter Pa is output from the target range setting model 4142a. Therefore, the parameters of the target setting data and their target ranges can be acquired automatically, one after another.

なお、以上の目標設定モデル４１４Ａのパラメータ設定モデル４１４１ａは、第１取得部４２１が取得したパラメータの識別情報と、操業計画とを含む学習データを用いて第１学習処理部４２２により学習処理が行われてよい。これにより、パラメータ設定モデル４１４１ａを学習するための学習データではパラメータの目標範囲を省くことができるため、学習処理を容易化することができる。 The parameter setting model 4141a of the above target setting model 414A may be subjected to learning processing by the first learning processing unit 422 using learning data including the parameter identification information acquired by the first acquisition unit 421 and the operation plan. This makes it possible to omit the target range of the parameters from the learning data for learning the parameter setting model 4141a, thereby facilitating the learning processing.

また、パラメータ設定モデル４１４１ｂと、目標範囲設定モデル４１４２ａ～４１４２ｄとは、第１取得部４２１が取得したパラメータの識別情報および目標範囲と、操業計画とを含む学習データを用いて第１学習処理部４２２により学習処理が行われてよい。これにより、モデルからの出力データの内容を、操業計画が達成されるために操業モデル４０１の学習に用いられた目標設定データの内容に近似させることができる。 The parameter setting model 4141b and the target range setting models 4142a to 4142d may be subjected to learning processing by the first learning processing unit 422 using learning data including the parameter identification information and target range acquired by the first acquisition unit 421, and the operation plan. This allows the contents of the output data from the model to be approximated to the contents of the target setting data used in learning the operation model 401 to achieve the operation plan.

[6. Other Modifications]
In the above embodiment, the device 4 has been described as having the operation model 401 and the target setting model 414, but it need not have one or both of them. When the device 4 does not have the operation model 401 and the target setting model 414, the device 4 may perform a learning process on the operation model 401 and the target setting model 414 in an externally connected storage device, may evaluate the operation model 401 in an externally connected storage device, or may perform operation using the operation model 401 in an externally connected storage device.

The device 4 has also been described as having the first acquisition unit 421, the first learning processing unit 422, and the like in order to perform learning processing of the target setting model 414, but it need not have them. In this case, the device 4 may perform learning processing of the operation model 401 using a trained target setting model 414. A trained target setting model 414 may be shared among multiple devices 4, with each device 4 performing learning processing of a separate operation model 401.

The device 4 has also been described as having the second learning processing unit 413 and the like in order to perform learning processing of the operation model 401, but it need not have them. In this case, the device 4 may acquire from outside the parameter identification information and other contents of the target setting data used in training the operation model 401, and perform learning processing of the target setting model 414.

The evaluation value acquisition unit 434 has been described as acquiring a reference evaluation value and a model evaluation value according to simulation results, but it may instead acquire a reference evaluation value and a model evaluation value according to the results of actually operating the equipment 2.

The evaluation value acquisition unit 434 has been described as calculating the reference evaluation value, but the reference evaluation value may instead be stored in advance in the device 4 as a fixed value.
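The comparison between the model evaluation value and the reference evaluation value can be sketched as follows. This is a minimal illustration, not the implementation of the evaluation unit 435: the function names, the scoring rule (the fraction of samples that fall within the target range), and the sample values are assumptions. The specification states only that the evaluation values depend on whether a parameter falls within the target range, and that the reference evaluation value may be either calculated or stored as a fixed value.

```python
from typing import Optional, Sequence, Tuple


def in_range_ratio(values: Sequence[float], target_range: Tuple[float, float]) -> float:
    """Fraction of samples inside the target range (an assumed scoring rule)."""
    if not values:
        raise ValueError("values must not be empty")
    low, high = target_range
    return sum(low <= v <= high for v in values) / len(values)


def evaluate_operation_model(
    model_values: Sequence[float],
    target_range: Tuple[float, float],
    manual_values: Optional[Sequence[float]] = None,
    fixed_reference: Optional[float] = None,
) -> bool:
    """Return True when the model evaluation value beats the reference value.

    The reference is taken from fixed_reference if given (the stored fixed
    value variant); otherwise it is computed from manual-operation results.
    """
    model_score = in_range_ratio(model_values, target_range)
    if fixed_reference is not None:
        reference_score = fixed_reference
    elif manual_values is not None:
        reference_score = in_range_ratio(manual_values, target_range)
    else:
        raise ValueError("provide manual_values or fixed_reference")
    return model_score > reference_score


# Hypothetical parameter values observed under each mode of operation.
target = (0.8, 1.2)
by_model = [1.0, 1.1, 0.9, 1.5]   # 3 of 4 samples in range
by_manual = [1.0, 1.3, 1.4, 0.9]  # 2 of 4 samples in range

computed_result = evaluate_operation_model(by_model, target, manual_values=by_manual)
fixed_result = evaluate_operation_model(by_model, target, fixed_reference=0.9)
```

With these sample values the model scores 0.75 against a computed reference of 0.5, so `computed_result` is true, while the stricter fixed reference of 0.9 makes `fixed_result` false.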

Various embodiments of the present invention may be described with reference to flowcharts and block diagrams, in which a block may represent (1) a stage of a process in which an operation is performed or (2) a section of an apparatus responsible for performing the operation. Particular stages and sections may be implemented by dedicated circuitry, by programmable circuitry supplied with computer-readable instructions stored on a computer-readable medium, and/or by a processor supplied with computer-readable instructions stored on a computer-readable medium. Dedicated circuitry may include digital and/or analog hardware circuits, and may include integrated circuits (ICs) and/or discrete circuits. Programmable circuitry may include reconfigurable hardware circuits that include logical AND, logical OR, logical XOR, logical NAND, logical NOR, and other logical operations, as well as memory elements such as flip-flops, registers, field programmable gate arrays (FPGAs), and programmable logic arrays (PLAs).

A computer-readable medium may include any tangible device capable of storing instructions to be executed by a suitable device. As a result, a computer-readable medium having instructions stored thereon constitutes an article of manufacture that includes instructions which can be executed to create means for performing the operations specified in the flowcharts or block diagrams. Examples of computer-readable media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, and semiconductor storage media. More specific examples of computer-readable media may include floppy (registered trademark) disks, diskettes, hard disks, random access memories (RAMs), read-only memories (ROMs), erasable programmable read-only memories (EPROMs or flash memories), electrically erasable programmable read-only memories (EEPROMs), static random access memories (SRAMs), compact disc read-only memories (CD-ROMs), digital versatile discs (DVDs), Blu-ray (registered trademark) discs, memory sticks, and integrated circuit cards.

Computer-readable instructions may include assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk (registered trademark), JAVA (registered trademark), and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages.

Computer-readable instructions may be provided to a processor or programmable circuitry of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, either locally or via a local area network (LAN) or a wide area network (WAN) such as the Internet, and may be executed to create means for performing the operations specified in the flowcharts or block diagrams. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, and microcontrollers.

FIG. 10 shows an example of a computer 2200 in which multiple aspects of the present invention may be embodied in whole or in part. A program installed on the computer 2200 can cause the computer 2200 to function as, or perform operations associated with, an apparatus according to an embodiment of the present invention or one or more sections of that apparatus, and/or can cause the computer 2200 to perform a process according to an embodiment of the present invention or stages of that process. Such a program may be executed by the CPU 2212 to cause the computer 2200 to perform particular operations associated with some or all of the blocks of the flowcharts and block diagrams described herein.

The computer 2200 according to this embodiment includes a CPU 2212, a RAM 2214, a graphics controller 2216, and a display device 2218, which are interconnected by a host controller 2210. The computer 2200 also includes input/output units such as a communication interface 2222, a hard disk drive 2224, a DVD-ROM drive 2226, and an IC card drive, which are connected to the host controller 2210 via an input/output controller 2220. The computer 2200 further includes legacy input/output units such as a ROM 2230 and a keyboard 2242, which are connected to the input/output controller 2220 via an input/output chip 2240.

The CPU 2212 operates according to programs stored in the ROM 2230 and the RAM 2214, thereby controlling each unit. The graphics controller 2216 acquires image data generated by the CPU 2212 in a frame buffer or the like provided in the RAM 2214 or in the graphics controller 2216 itself, and causes the image data to be displayed on the display device 2218.

The communication interface 2222 communicates with other electronic devices via a network. The hard disk drive 2224 stores programs and data used by the CPU 2212 in the computer 2200. The DVD-ROM drive 2226 reads programs or data from the DVD-ROM 2201 and provides them to the hard disk drive 2224 via the RAM 2214. The IC card drive reads programs and data from an IC card and/or writes programs and data to the IC card.

The ROM 2230 stores a boot program or the like that is executed by the computer 2200 at activation, and/or programs that depend on the hardware of the computer 2200. The input/output chip 2240 may also connect various input/output units to the input/output controller 2220 via a parallel port, a serial port, a keyboard port, a mouse port, or the like.

Programs are provided by a computer-readable medium such as the DVD-ROM 2201 or an IC card. The programs are read from the computer-readable medium, installed on the hard disk drive 2224, the RAM 2214, or the ROM 2230, which are themselves examples of computer-readable media, and executed by the CPU 2212. The information processing described in these programs is read by the computer 2200 and brings about cooperation between the programs and the various types of hardware resources described above. An apparatus or method may be constituted by realizing the manipulation or processing of information through the use of the computer 2200.

For example, when communication is performed between the computer 2200 and an external device, the CPU 2212 may execute a communication program loaded into the RAM 2214 and instruct the communication interface 2222 to perform communication processing based on the processing described in the communication program. Under the control of the CPU 2212, the communication interface 2222 reads transmission data stored in a transmission buffer area provided in a recording medium such as the RAM 2214, the hard disk drive 2224, the DVD-ROM 2201, or an IC card, and transmits the read transmission data to the network, or writes reception data received from the network to a reception buffer area or the like provided on the recording medium.

The CPU 2212 may also cause all or a necessary portion of a file or database stored on an external recording medium, such as the hard disk drive 2224, the DVD-ROM drive 2226 (DVD-ROM 2201), or an IC card, to be read into the RAM 2214, and may perform various types of processing on the data in the RAM 2214. The CPU 2212 then writes the processed data back to the external recording medium.

Various types of information, such as various types of programs, data, tables, and databases, may be stored on the recording medium and subjected to information processing. On data read from the RAM 2214, the CPU 2212 may perform various types of processing described throughout this disclosure and specified by the instruction sequences of the programs, including various types of operations, information processing, conditional judgment, conditional branching, unconditional branching, and information search and replacement, and writes the results back to the RAM 2214. The CPU 2212 may also search for information in files, databases, and the like in the recording medium. For example, when multiple entries, each having an attribute value of a first attribute associated with an attribute value of a second attribute, are stored in the recording medium, the CPU 2212 may search the entries for one that matches a condition specifying the attribute value of the first attribute, read the attribute value of the second attribute stored in that entry, and thereby obtain the attribute value of the second attribute associated with the first attribute that satisfies the predetermined condition.
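The attribute search described above can be sketched in a few lines. This is an illustrative sketch only: the function name `find_second_attribute`, the representation of entries as dictionaries, and the attribute names and values are assumptions chosen for the example and are not taken from the specification.

```python
from typing import Any, Callable, Iterable, List, Mapping


def find_second_attribute(
    entries: Iterable[Mapping[str, Any]],
    first_attr: str,
    second_attr: str,
    condition: Callable[[Any], bool],
) -> List[Any]:
    """Return the second-attribute value of every entry whose
    first-attribute value satisfies the given condition."""
    return [
        entry[second_attr]
        for entry in entries
        if condition(entry[first_attr])
    ]


# Hypothetical entries: a parameter identifier (first attribute)
# associated with a target range (second attribute).
entries = [
    {"parameter_id": "temperature", "target_range": (350.0, 360.0)},
    {"parameter_id": "pressure", "target_range": (1.0, 1.2)},
    {"parameter_id": "temperature", "target_range": (340.0, 350.0)},
]

matches = find_second_attribute(
    entries,
    first_attr="parameter_id",
    second_attr="target_range",
    condition=lambda value: value == "temperature",
)
```

The sketch returns every matching entry rather than the first one only, a design choice made here because the text does not say how multiple matches are handled.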

The programs or software modules described above may be stored on a computer-readable medium on or near the computer 2200. A recording medium such as a hard disk or RAM provided in a server system connected to a dedicated communication network or the Internet can also be used as the computer-readable medium, thereby providing the programs to the computer 2200 via the network.

The present invention has been described above using embodiments, but the technical scope of the present invention is not limited to the scope described in those embodiments. It is apparent to those skilled in the art that various modifications and improvements can be made to the above embodiments. It is apparent from the claims that forms with such modifications or improvements can also be included in the technical scope of the present invention.

It should be noted that the processes, such as operations, procedures, steps, and stages, in the devices, systems, programs, and methods shown in the claims, the specification, and the drawings may be performed in any order, unless the order is explicitly indicated by "before," "prior to," or the like, or unless the output of an earlier process is used in a later process. Even if an operation flow in the claims, the specification, or the drawings is described using "first," "next," and the like for convenience, this does not mean that the processes must be performed in that order.

REFERENCE SIGNS LIST
1 System
2 Equipment
4 Device
401 Operation model
402 Operation unit
403 Parameter acquisition unit
411 Input unit
412 Storage unit
413 Second learning processing unit
414 Target setting model
415 First supply unit
416 Second acquisition unit
417 Second supply unit
421 First acquisition unit
422 First learning processing unit
431 Simulator
432 Display control unit
433 Target range acquisition unit
434 Evaluation value acquisition unit
435 Evaluation unit
2200 Computer
2201 DVD-ROM
2210 Host controller
2212 CPU
2214 RAM
2216 Graphics controller
2218 Display device
2220 Input/output controller
2222 Communication interface
2224 Hard disk drive
2226 DVD-ROM drive
2230 ROM
2240 Input/output chip
2242 Keyboard
4141 Parameter setting model
4142 Target range setting model

Claims

An apparatus comprising:
a supply unit that supplies a value of a state parameter related to equipment to an operation model that outputs a recommended value of a control parameter of the equipment in response to input of the value of the state parameter;
a control parameter acquisition unit that acquires the recommended value of the control parameter output from the operation model in response to the supply unit supplying the value of the state parameter to the operation model;
an acquisition unit that acquires a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired by the control parameter acquisition unit; and
an evaluation unit that evaluates the operation model based on the model evaluation value and a reference evaluation value calculated based on a result of inputting a manual operation into a simulator of the equipment.

An apparatus comprising:
a supply unit that supplies a value of a state parameter related to equipment to an operation model that outputs a recommended value of a control parameter of the equipment in response to input of the value of the state parameter;
a control parameter acquisition unit that acquires the recommended value of the control parameter output from the operation model in response to the supply unit supplying the value of the state parameter to the operation model;
an acquisition unit that acquires a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired by the control parameter acquisition unit; and
an evaluation unit that evaluates the operation model based on the model evaluation value and a reference evaluation value corresponding to a result of operating the equipment by manual operation,
wherein the model evaluation value is calculated based on whether a parameter related to the equipment operated according to the recommended value falls within a target range, and
the reference evaluation value is calculated based on whether the parameter related to the equipment operated by manual operation falls within the target range.

The apparatus according to claim 2, further comprising a target range acquisition unit that acquires the target range set by an operator for a selected parameter selected by the operator from among multiple types of parameters related to the equipment.

The apparatus according to claim 3, further comprising a display control unit that displays the values of the selected parameter in past operations of the equipment in response to the selection of the selected parameter from among the multiple types of parameters.

The apparatus according to claim 4, wherein the display control unit displays the values of each selected parameter in past operations of the equipment in a coordinate space with each selected parameter as a coordinate axis.

The apparatus according to claim 2, wherein the equipment is equipment for manufacturing a product, and the parameter related to the equipment is at least one of an index value indicating a quality of the product or a production volume of the product.

The apparatus according to any one of claims 1 to 6, wherein the evaluation unit evaluates the operation model as good if the model evaluation value is better than the reference evaluation value.

The apparatus according to any one of claims 1 to 7, wherein the model evaluation value is calculated based on a result of inputting the recommended value acquired by the control parameter acquisition unit into a simulator of the equipment.

The apparatus according to any one of claims 1 to 8, further comprising a learning processing unit that executes learning processing of the operation model using learning data including values of state parameters and values of control parameters.

The apparatus according to claim 9, wherein the learning processing unit executes learning processing of the operation model using the learning data and a reward value determined by a preset reward function.

A method comprising:
a supply step of supplying a value of a state parameter indicating a state of equipment to an operation model that outputs a recommended value of a control parameter of the equipment in response to input of the value of the state parameter;
a control parameter acquisition step of acquiring the recommended value of the control parameter output from the operation model in response to the value of the state parameter being supplied to the operation model in the supply step;
an acquisition step of acquiring a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired in the control parameter acquisition step; and
an evaluation step of evaluating the operation model based on the model evaluation value and a reference evaluation value calculated based on a result of inputting a manual operation into a simulator of the equipment.

A method comprising:
a supply step of supplying a value of a state parameter indicating a state of equipment to an operation model that outputs a recommended value of a control parameter of the equipment in response to input of the value of the state parameter;
a control parameter acquisition step of acquiring the recommended value of the control parameter output from the operation model in response to the value of the state parameter being supplied to the operation model in the supply step;
an acquisition step of acquiring a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired in the control parameter acquisition step; and
an evaluation step of evaluating the operation model based on the model evaluation value and a reference evaluation value corresponding to a result of operating the equipment by manual operation,
wherein the model evaluation value is calculated based on whether a parameter related to the equipment operated according to the recommended value falls within a target range, and
the reference evaluation value is calculated based on whether the parameter related to the equipment operated by manual operation falls within the target range.

A program that causes a computer to function as:
a supply unit that supplies a value of a state parameter related to equipment to an operation model that outputs a recommended value of a control parameter of the equipment in response to input of the value of the state parameter;
a control parameter acquisition unit that acquires the recommended value of the control parameter output from the operation model in response to the supply unit supplying the value of the state parameter to the operation model;
an acquisition unit that acquires a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired by the control parameter acquisition unit; and
an evaluation unit that evaluates the operation model based on the model evaluation value and a reference evaluation value calculated based on a result of inputting a manual operation into a simulator of the equipment.

A program that causes a computer to function as:
a supply unit that supplies a value of a state parameter related to equipment to an operation model that outputs a recommended value of a control parameter of the equipment in response to input of the value of the state parameter;
a control parameter acquisition unit that acquires the recommended value of the control parameter output from the operation model in response to the supply unit supplying the value of the state parameter to the operation model;
an acquisition unit that acquires a model evaluation value corresponding to a result of operating the equipment using the recommended value acquired by the control parameter acquisition unit; and
an evaluation unit that evaluates the operation model based on the model evaluation value and a reference evaluation value corresponding to a result of operating the equipment by manual operation,
wherein the model evaluation value is calculated based on whether a parameter related to the equipment operated according to the recommended value falls within a target range, and
the reference evaluation value is calculated based on whether the parameter related to the equipment operated by manual operation falls within the target range.