JP6915371B2

JP6915371B2 - Control device, control program, learning data creation method, and learning method

Info

Publication number: JP6915371B2
Application number: JP2017096165A
Authority: JP
Inventors: 安藤　丹一
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2017-05-15
Filing date: 2017-05-15
Publication date: 2021-08-04
Anticipated expiration: 2037-05-15
Also published as: WO2018211927A1; JP2018194932A

Description

The present invention relates to a control device, a control program, a learning data creation method, and a learning method.

In recent years, machine learning such as deep learning has sometimes been used to acquire the capability of controlling the operation of a desired device. For example, Patent Document 1 proposes an intelligent housing system that uses a neural device. Specifically, Patent Document 1 proposes a system that performs machine learning, using identity information, coordinate information, and control information on how the user adjusted an electronic device in the past, to acquire the capability of adjusting and controlling that electronic device to the user's customary operating state.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2016-532355

As described above, machine learning makes it easy to build a system that has acquired the capability of controlling the operation of a desired device. However, the present inventor has found that the following problem can arise in a system that has acquired, through machine learning, the capability of controlling the operation of a controlled device.

Specifically, when multiple different learners are used, each of which has acquired through machine learning the capability of controlling the operation of a controlled device, conflicts may arise among the controls applied to one or more controlled devices. For example, when a single controlled device is controlled using learners built for individual users, a conflict arises in the control of that device if the learners of different users issue control commands that call for different operations. Likewise, when multiple controlled devices are controlled simultaneously using learners built for the individual devices, a conflict arises in the control of those devices if control commands are issued for operations that cannot be carried out at the same time, such as moving to the same place. The present inventor has found that, when such a control conflict occurs in a conventional system, the controlled device may become inoperable.

In one aspect, the present invention has been made in view of these circumstances, and its object is to provide a technique that keeps a controlled device from becoming inoperable even when a conflict arises among multiple controls.

To solve the above problem, the present invention adopts the following configurations.

That is, a control device according to one aspect of the present invention includes: a first control processing unit that controls the operation of a first controlled device based on a control value output from a trained first learner that has been trained to control the operation of the first controlled device; a second control processing unit that controls the operation of a second controlled device based on a control value output from a trained second learner that has been trained to control the operation of the second controlled device; and a conflict resolution unit that, when the control of the first controlled device based on the control value output from the first learner conflicts with the control of the second controlled device based on the control value output from the second learner, resolves the conflict by modifying the control of the first controlled device and the second controlled device.

In this configuration, the first control processing unit uses the first learner to control the operation of the first controlled device, and the second control processing unit uses the second learner to control the operation of the second controlled device. When a conflict arises between the control of the first controlled device and the control of the second controlled device, the conflict resolution unit resolves it by modifying the control of the first controlled device and the second controlled device. With this configuration, therefore, the controlled devices can be kept from becoming inoperable even when a conflict arises among multiple controls. The first controlled device and the second controlled device may be the same device or different devices. The controlled device may be any kind of device that can be controlled, for example an air conditioner or a robot device.
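The arrangement described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the class name `ControlDevice`, the scalar control values, the `tolerance` threshold used to decide that two controls conflict, and the example resolver are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical types: a learner maps an observation (e.g. room temperature)
# to a control value; a resolver maps two control values to two corrected ones.
Learner = Callable[[float], float]
Resolver = Callable[[float, float], Tuple[float, float]]


@dataclass
class ControlDevice:
    first_learner: Learner    # trained to control the first controlled device
    second_learner: Learner   # trained to control the second controlled device
    resolve: Resolver         # plays the role of the conflict resolution unit
    tolerance: float = 0.0    # assumed conflict test for a shared device

    def step(self, observation: float) -> Tuple[float, float]:
        """Return the control values to apply to the first and second devices."""
        v1 = self.first_learner(observation)
        v2 = self.second_learner(observation)
        if abs(v1 - v2) > self.tolerance:
            # The two controls conflict, so both are modified.
            v1, v2 = self.resolve(v1, v2)
        return v1, v2


def average_both(a: float, b: float) -> Tuple[float, float]:
    """Example resolver: replace both values with their mean."""
    mean = (a + b) / 2.0
    return mean, mean


device = ControlDevice(lambda t: 26.0, lambda t: 22.0, average_both)
print(device.step(24.0))  # (24.0, 24.0)
```

The sketch assumes both learners drive the same device, so "conflict" reduces to the two values differing; for distinct devices the conflict test would have to be replaced with something device-specific.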

In the control device according to the above aspect, the conflict resolution unit may resolve the conflict by using a trained third learner. The third learner has been trained so that, when it receives the control value for the first controlled device output from the first learner and the control value for the second controlled device output from the second learner, it outputs a control value for the first controlled device and a control value for the second controlled device that have been modified so as to resolve the conflict. With this configuration, conflicts that may arise in the controlled devices can be resolved easily even when their controls can conflict in complicated ways.

The control device according to the above aspect may further include a conflict type identification unit that, based on the control value for the first controlled device output from the first learner and the control value for the second controlled device output from the second learner, identifies conflict type information indicating how the controls of the first controlled device and the second controlled device conflict. The conflict resolution unit may additionally input the identified conflict type information to the third learner. With this configuration, a resolution method suited to the manner (type) of the conflict can be adopted.

In the control device according to the above aspect, the conflict type identification unit may identify the conflict type information by using a trained fourth learner. The fourth learner has been trained so that, when it receives the control value for the first controlled device output from the first learner and the control value for the second controlled device output from the second learner, it outputs an output value corresponding to the conflict type information. With this configuration, the type of conflict that may arise in the controlled devices can be identified easily even when their controls can conflict in complicated ways.
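To make the idea of conflict type information concrete, the following Python sketch uses a hand-written rule as a stand-in for the trained fourth learner. The patent does not enumerate conflict types; the labels `NONE`, `OPPOSING`, and `MAGNITUDE`, the use of the current state as an extra input, and the `tolerance` parameter are all assumptions made purely for illustration, for the case of two learners commanding one shared device with a scalar target.

```python
from enum import Enum


class ConflictType(Enum):
    """Hypothetical conflict labels; the patent does not enumerate types."""
    NONE = 0        # targets agree within tolerance
    OPPOSING = 1    # one command raises the value, the other lowers it
    MAGNITUDE = 2   # same direction (or one unchanged), different targets


def classify_conflict(current: float, first_target: float,
                      second_target: float,
                      tolerance: float = 0.5) -> ConflictType:
    """Rule-based stand-in for the fourth learner (illustrative only)."""
    if abs(first_target - second_target) <= tolerance:
        return ConflictType.NONE
    first_delta = first_target - current
    second_delta = second_target - current
    if first_delta * second_delta < 0:
        # The two commands push the device in opposite directions.
        return ConflictType.OPPOSING
    return ConflictType.MAGNITUDE


print(classify_conflict(24.0, 26.0, 22.0))  # ConflictType.OPPOSING
```

In the configuration described above, a label like this would be passed to the third learner as an additional input so that the correction can depend on the kind of conflict.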

In the control device according to the above aspect, the first, second, third, and fourth learners may each be configured as a neural network. With this configuration, a control device capable of resolving conflicts that may arise in the control of controlled devices can be realized simply.

In the control device according to the above aspect, the conflict resolution unit may resolve the conflict by giving priority to either the control of the first controlled device based on the control value output from the first learner or the control of the second controlled device based on the control value output from the second learner. With this configuration, one of the two control values is prioritized and the other is ignored, so that a conflict arising in the control of the controlled devices can be resolved reliably.

In the control device according to the above aspect, the first controlled device and the second controlled device may be the same device, and the conflict resolution unit may resolve the conflict by averaging the control value output from the first learner and the control value output from the second learner. With this configuration, adopting the average of the two control values resolves a conflict arising in the control of the controlled device.
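The two rule-based resolution strategies just described, prioritizing one control value or averaging both, can be sketched as follows. The function and enum names are hypothetical, and a single scalar control value per learner is assumed.

```python
from enum import Enum


class Strategy(Enum):
    PRIORITIZE_FIRST = "prioritize_first"
    PRIORITIZE_SECOND = "prioritize_second"
    AVERAGE = "average"


def resolve_conflict(first_value: float, second_value: float,
                     strategy: Strategy) -> float:
    """Return one corrected control value for a shared controlled device."""
    if strategy is Strategy.PRIORITIZE_FIRST:
        return first_value            # the second value is ignored
    if strategy is Strategy.PRIORITIZE_SECOND:
        return second_value           # the first value is ignored
    if strategy is Strategy.AVERAGE:
        return (first_value + second_value) / 2.0
    raise ValueError(f"unknown strategy: {strategy!r}")


print(resolve_conflict(26.0, 22.0, Strategy.AVERAGE))           # 24.0
print(resolve_conflict(26.0, 22.0, Strategy.PRIORITIZE_FIRST))  # 26.0
```

Averaging only makes sense when both values address the same quantity on the same device, which is why the text restricts it to the case where the first and second controlled devices are identical.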

As another form of the control device according to each of the above forms, the invention may be an information processing method that realizes each of the above configurations, a program, or a storage medium that stores such a program and is readable by a computer or other device or machine. Here, a recording medium readable by a computer or the like is a medium that stores information such as a program through electrical, magnetic, optical, mechanical, or chemical action.

For example, a control program according to one aspect of the present invention is a program that causes a computer controlling the operation of a first controlled device and a second controlled device to execute: a step of acquiring a control value for controlling the first controlled device, output from a trained first learner that has been trained to control the operation of the first controlled device; a step of acquiring a control value for controlling the second controlled device, output from a trained second learner that has been trained to control the operation of the second controlled device; a step of acquiring, when the control of the first controlled device based on the control value output from the first learner conflicts with the control of the second controlled device based on the control value output from the second learner, a control value for the first controlled device and a control value for the second controlled device that have been modified so as to resolve the conflict; and a step of controlling the first controlled device and the second controlled device based on the acquired control values.

A learning data creation method according to one aspect of the present invention includes: a step of acquiring a control value for controlling a first controlled device, output from a trained first learner that has been trained to control the operation of the first controlled device; a step of acquiring a control value for controlling a second controlled device, output from a trained second learner that has been trained to control the operation of the second controlled device; a step of determining whether the control of the first controlled device based on the control value output from the first learner conflicts with the control of the second controlled device based on the control value output from the second learner; a step of determining, when the two controls conflict, correction values for the control values of the first controlled device and the second controlled device so as to resolve the conflict; and a step of creating learning data for training a learner, using the control value obtained from the first learner and the control value obtained from the second learner as input data and the determined correction values as teacher data. With this configuration, learning data can be collected for building the third learner used to resolve conflicts that may arise in the control of the controlled devices.

In the learning data creation method according to the above aspect, the correction values may be determined by operator input. When the controlled device is one that people use, this makes it possible to create learning data that is optimal for building the third learner used to resolve conflicts that may arise in the control of the controlled device.

In the learning data creation method according to the above aspect, the correction values may be determined according to a predetermined rule. With this configuration, the learning data used to build the third learner can be created simply.

A learning method according to one aspect of the present invention includes: a step of acquiring the learning data created by the learning data creation method according to any of the above forms; and a step of training a learner with the acquired learning data. With this configuration, the third learner used to resolve conflicts that may arise in the control of the controlled devices can be built.

According to the present invention, it is possible to provide a technique that keeps a controlled device from becoming inoperable even when a conflict arises among multiple controls.

FIG. 1 schematically illustrates an example of a situation in which the control device and the learning device according to the embodiment are applied.
FIG. 2 schematically illustrates an example of the hardware configuration of the control device according to the embodiment.
FIG. 3 schematically illustrates an example of the hardware configuration of the data collection control device according to the embodiment.
FIG. 4 schematically illustrates an example of the hardware configuration of the learning device according to the embodiment.
FIG. 5 schematically illustrates an example of the functional configuration of the control device according to the embodiment.
FIG. 6 schematically illustrates an example of the functional configuration of the data collection control device according to the embodiment.
FIG. 7 schematically illustrates an example of the functional configuration of the learning device according to the embodiment.
FIG. 8 illustrates an example of the processing procedure of the control device according to the embodiment.
FIG. 9 illustrates an example of the processing procedure of the data collection control device according to the embodiment.
FIG. 10 illustrates an example of the processing procedure of the learning device according to the embodiment.
FIG. 11 schematically illustrates an example of the configuration of a control device according to a modification.
FIG. 12 schematically illustrates an example of the configuration of a control device according to a modification.
FIG. 13 schematically illustrates an example of the configuration of a control device according to a modification.

An embodiment according to one aspect of the present invention (hereinafter also referred to as "the present embodiment") is described below with reference to the drawings. The embodiment described below is, however, merely an illustration of the present invention in every respect. Needless to say, various improvements and modifications can be made without departing from the scope of the present invention. In other words, in carrying out the present invention, a specific configuration suited to the embodiment may be adopted as appropriate. Although the data appearing in the present embodiment is described in natural language, more specifically it is specified in a computer-recognizable pseudo-language, commands, parameters, machine language, or the like.

§1 Application example
First, an example of a situation to which the present invention is applied is described with reference to FIG. 1. FIG. 1 schematically illustrates an example of a situation in which the control device 1 and the learning device 3 according to the present embodiment are applied.

As shown in FIG. 1, the control device 1 according to the present embodiment is an information processing device that controls the operation of an air conditioner 4, the controlled device, in accordance with instructions from multiple users (users A and B in FIG. 1). The air conditioner 4 is, for example, a known air conditioner that regulates room temperature, and corresponds to both the "first controlled device" and the "second controlled device" of the present invention. That is, in the present embodiment the first controlled device and the second controlled device are the same. The first controlled device and the second controlled device need not be limited to this example, however, and may be different devices.

The control device 1 according to the present embodiment includes two learners for controlling the operation of the air conditioner 4. The first learner (a first neural network 5, described later) has been trained in advance to control the operation of the air conditioner 4 according to the preferences of user A. The second learner (a second neural network 6, described later) has been trained in advance to control the operation of the air conditioner 4 according to the preferences of user B. The control device 1 controls the operation of the air conditioner 4 based on the control values output from the first learner and the second learner.

If the control value output from the first learner differs from the control value output from the second learner, a conflict may arise in the control of the operation of the air conditioner 4. For example, suppose the room temperature is 24 degrees. If the control value output from the first learner constitutes a command to bring the room temperature to 26 degrees while the control value output from the second learner constitutes a command to bring it to 22 degrees, a conflict arises in the control of the operation of the air conditioner 4.

Therefore, when the control of the air conditioner 4 based on the control value output from the first learner conflicts with the control of the air conditioner 4 based on the control value output from the second learner, the control device 1 according to the present embodiment resolves the conflict by modifying the control of the air conditioner 4. Specifically, the control device 1 uses a third learner (a third neural network 7, described later) to resolve the conflict in the control of the air conditioner 4.

The third learner has been trained in advance so that, when it receives the control values output from the first learner and the second learner, it outputs a control value modified so as to resolve the conflict (hereinafter also referred to as the "corrected control value"). The control device 1 can therefore obtain a control value corrected so that no conflict arises by inputting the control values obtained from the first learner and the second learner to the third learner. The control device 1 controls the operation of the air conditioner 4 based on the corrected control value obtained in this way.

The learning device 3 according to the present embodiment, on the other hand, is an information processing device that performs machine learning of the third learner. The learning device 3 uses a data collection control device 2 to collect the learning data used for machine learning of the third learner. Like the control device 1, the data collection control device 2 uses the first learner and the second learner to control the operation of the air conditioner 4 so as to suit the preferences of each user (A and B). The data collection control device 2 differs from the control device 1, however, in that it does not resolve conflicts in the control of the air conditioner 4 (it does not use the third learner).

In other words, conflicts of the kind described above can arise when the data collection control device 2 controls the air conditioner 4. The data collection control device 2 therefore determines whether the control of the air conditioner 4 based on the control value obtained from the first learner conflicts with the control of the air conditioner 4 based on the control value obtained from the second learner. If it determines that the two controls conflict, the data collection control device 2 determines a correction value for the control value so as to resolve the conflict.

For example, the data collection control device 2 gives priority to one of the control values obtained from the first learner and the second learner; that is, it treats the prioritized control value as the corrected control value. Alternatively, for example, the data collection control device 2 calculates the average of the control values obtained from the first learner and the second learner as the corrected control value. In this way, the data collection control device 2 can obtain a corrected control value determined so as to resolve the conflict.

The data collection control device 2 then creates the learning data used for machine learning of the third learner, using the control values obtained from the first learner and the second learner as input data and the corrected control value obtained as described above as teacher data. In other words, the data collection control device 2 creates learning data by pairing the uncorrected control values with the corrected control value.
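The pairing of uncorrected control values with a corrected control value can be sketched in Python as below. This is a minimal illustration, not the patented procedure: `LearningSample`, `make_sample`, the `tolerance` used as the conflict test, and the pluggable `correction_rule` are hypothetical names and assumptions, and scalar control values for a shared device are assumed.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class LearningSample:
    inputs: Tuple[float, float]  # control values from the first and second learners
    target: float                # corrected control value (teacher data)


def average_rule(first_value: float, second_value: float) -> float:
    """One possible predetermined rule: take the mean of the two values."""
    return (first_value + second_value) / 2.0


def make_sample(first_value: float, second_value: float,
                correction_rule: Callable[[float, float], float],
                tolerance: float = 0.5) -> Optional[LearningSample]:
    """Create a learning sample only when the two controls conflict.

    The conflict test (difference above a tolerance) is an assumption.
    """
    if abs(first_value - second_value) <= tolerance:
        return None  # no conflict, so nothing to record
    corrected = correction_rule(first_value, second_value)
    return LearningSample((first_value, second_value), corrected)


print(make_sample(26.0, 22.0, average_rule))
```

A rule that returns an operator-entered value could be passed as `correction_rule` instead of `average_rule`, which corresponds to the operator-input variant of determining the correction value.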

The learning device 3 acquires the learning data created in this way and performs machine learning of the third learner using the acquired learning data, thereby building a trained third learner that the control device 1 can use. The control device 1 may acquire the trained third learner from the learning device 3, for example via a network. Alternatively, the trained third learner may be built into the control device 1 as embedded data when the control device 1 is manufactured.
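As a rough illustration of this training step, the sketch below fits a very small model to learning data of the form (control value 1, control value 2) → corrected value. It deliberately swaps in a linear model trained by batch gradient descent in place of the neural network used in the embodiment, so that the example stays short and dependency-free. The function name, the learning rate, the number of epochs, the centering constant of 24.0, and the synthetic training set (averages of differing temperature pairs) are all assumptions.

```python
from itertools import product
from typing import Callable, List, Tuple

Sample = Tuple[Tuple[float, float], float]


def train_linear_resolver(samples: List[Sample], epochs: int = 500,
                          lr: float = 0.02,
                          center: float = 24.0) -> Callable[[float, float], float]:
    """Fit y ~ w1*x1 + w2*x2 + b on centered values by batch gradient descent.

    A linear stand-in for the third learner; `center` keeps inputs small so
    that plain gradient descent stays stable.
    """
    w1 = w2 = b = 0.0
    n = len(samples)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in samples:
            c1, c2, cy = x1 - center, x2 - center, y - center
            err = (w1 * c1 + w2 * c2 + b) - cy
            # Gradients of the mean squared error.
            g1 += 2.0 * err * c1 / n
            g2 += 2.0 * err * c2 / n
            gb += 2.0 * err / n
        w1 -= lr * g1
        w2 -= lr * g2
        b -= lr * gb

    def predict(x1: float, x2: float) -> float:
        return w1 * (x1 - center) + w2 * (x2 - center) + b + center

    return predict


# Synthetic learning data: conflicting pairs labeled with their average.
temps = [20.0, 22.0, 24.0, 26.0, 28.0]
samples = [((a, b), (a + b) / 2.0) for a, b in product(temps, temps) if a != b]
resolver = train_linear_resolver(samples)
print(round(resolver(26.0, 22.0), 3))
```

With teacher data produced by the averaging rule, the fitted weights approach 0.5 each, so the trained model reproduces the rule; teacher data from operator input would instead let the model capture corrections that no simple rule expresses.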

As described above, by using the trained first learner and second learner, the control device 1 according to the present embodiment can control the operation of the air conditioner 4 so as to suit the preferences of each user (A and B). In addition, when a conflict arises in the control of the air conditioner 4 based on the control values obtained from the first learner and the second learner, the control device 1 can resolve the conflict by using the third learner. According to the present embodiment, therefore, the air conditioner 4 can be kept from becoming inoperable even when a conflict arises between the controls of the users (A and B).

§2 Configuration example
[Hardware configuration]
<Control device>
Next, an example of the hardware configuration of the control device 1 according to the present embodiment is described with reference to FIG. 2. FIG. 2 schematically illustrates an example of the hardware configuration of the control device 1 according to the present embodiment.

As shown in FIG. 2, the control device 1 according to the present embodiment is a computer in which a control unit 11, a storage unit 12, and an external interface 13 are electrically connected. In FIG. 2, the external interface is labeled "external I/F".

制御部１１は、ＣＰＵ（Central Processing Unit）、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）等を含み、プログラム及びデータに基づいて各種情報処理を実行するように構成される。記憶部１２は、制御部１１で実行される制御プログラム１２１、学習済みの第１の学習器に関する情報を示す第１動作制御学習結果データ１２２、学習済みの第２の学習器に関する情報を示す第２動作制御学習結果データ１２３、学習済みの第３の学習器に関する情報を示す競合解消学習結果データ１２４等を記憶する。 The control unit 11 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like, and is configured to execute various kinds of information processing based on programs and data. The storage unit 12 stores the control program 121 executed by the control unit 11, the first motion control learning result data 122 indicating information about the trained first learner, the second motion control learning result data 123 indicating information about the trained second learner, the conflict resolution learning result data 124 indicating information about the trained third learner, and the like.

制御プログラム１２１は、後述する空調装置４の動作を制御する処理（図８）を制御部１１に実行させるためのプログラムである。第１動作制御学習結果データ１２２は、学習済みの第１の学習器の設定に利用するデータである。第２動作制御学習結果データ１２３は、学習済みの第２の学習器の設定に利用するデータである。競合解消学習結果データ１２４は、学習済みの第３の学習器の設定に利用するデータである。詳細は後述する。 The control program 121 is a program for causing the control unit 11 to execute a process (FIG. 8) for controlling the operation of the air conditioner 4 described later. The first motion control learning result data 122 is data used for setting the learned first learner. The second motion control learning result data 123 is data used for setting the learned second learner. The conflict resolution learning result data 124 is data used for setting the learned third learner. Details will be described later.

外部インタフェース１３は、外部装置と接続するためのインタフェースであり、接続する外部装置に応じて適宜構成される。本実施形態では、制御装置１は、外部インタフェース１３を介して、空調装置４に接続する。なお、制御装置１は、外部インタフェース１３を介して、記憶媒体に記憶されたデータを読み込むためのドライブ装置等に接続されてもよい。この場合、制御装置１は、ドライブ装置を介して、上記制御プログラム１２１、第１動作制御学習結果データ１２２、第２動作制御学習結果データ１２３、及び競合解消学習結果データ１２４を取得してもよい。また、ドライブ装置を利用する場合、上記制御プログラム１２１、第１動作制御学習結果データ１２２、第２動作制御学習結果データ１２３、及び競合解消学習結果データ１２４は、記憶媒体に記憶されていてもよい。記憶媒体は、コンピュータその他装置、機械等が記録されたプログラム等の情報を読み取り可能なように、当該プログラム等の情報を、電気的、磁気的、光学的、機械的又は化学的作用によって蓄積する媒体である。記憶媒体は、例えば、ＣＤ（Compact Disk）、ＤＶＤ（Digital Versatile Disk）、フラッシュメモリ等である。 The external interface 13 is an interface for connecting to an external device and is configured as appropriate for the external device to be connected. In the present embodiment, the control device 1 is connected to the air conditioner 4 via the external interface 13. The control device 1 may also be connected, via the external interface 13, to a drive device or the like for reading data stored in a storage medium. In this case, the control device 1 may acquire the control program 121, the first motion control learning result data 122, the second motion control learning result data 123, and the conflict resolution learning result data 124 via the drive device. When the drive device is used, the control program 121, the first motion control learning result data 122, the second motion control learning result data 123, and the conflict resolution learning result data 124 may be stored in the storage medium. The storage medium is a medium that accumulates information such as a program by electrical, magnetic, optical, mechanical, or chemical action so that a computer or other device or machine can read the recorded information. The storage medium is, for example, a CD (Compact Disk), a DVD (Digital Versatile Disk), a flash memory, or the like.

＜データ収集用制御装置＞ <Control device for data collection>
次に、図３を用いて、本実施形態に係るデータ収集用制御装置２のハードウェア構成の一例について説明する。図３は、本実施形態に係るデータ収集用制御装置２のハードウェア構成の一例を模式的に例示する。 Next, an example of the hardware configuration of the data collection control device 2 according to the present embodiment will be described with reference to FIG. 3. FIG. 3 schematically illustrates an example of the hardware configuration of the data collection control device 2 according to the present embodiment.

図３に示されるとおり、本実施形態に係るデータ収集用制御装置２は、学習データの収集の場面で利用される制御装置であり、上記制御装置１とほぼ同様に構成される。すなわち、データ収集用制御装置２は、制御部２１、記憶部２２、及び外部インタフェース２３が電気的に接続されたコンピュータである。なお、図３では、上記図２と同様に、外部インタフェースを「外部Ｉ／Ｆ」と記載している。 As shown in FIG. 3, the data collection control device 2 according to the present embodiment is a control device used when collecting learning data and is configured in substantially the same manner as the control device 1. That is, the data collection control device 2 is a computer in which the control unit 21, the storage unit 22, and the external interface 23 are electrically connected. In FIG. 3, as in FIG. 2, the external interface is labeled "external I/F".

制御部２１、記憶部２２及び外部インタフェース２３は、上記制御装置１の制御部１１、記憶部１２及び外部インタフェース１３と同様に構成される。ただし、記憶部２２は、データ収集用制御プログラム２２１、第１動作制御学習結果データ１２２、第２動作制御学習結果データ１２３、データ収集用制御プログラム２２１を実行することで作成される学習データ２２３等を記憶する。 The control unit 21, the storage unit 22, and the external interface 23 are configured in the same manner as the control unit 11, the storage unit 12, and the external interface 13 of the control device 1. However, the storage unit 22 stores the data collection control program 221, the first motion control learning result data 122, the second motion control learning result data 123, the learning data 223 created by executing the data collection control program 221, and the like.

データ収集用制御プログラム２２１は、データ収集用制御装置２に後述する学習データの収集処理（図９）を実行させるためのプログラムである。学習データ２２３は、第１の学習器及び第２の学習器それぞれから出力される制御値を入力すると、競合を解消するように修正済みの制御値を出力するように第３の学習器の学習を行うためのデータである。詳細は後述する。 The data collection control program 221 is a program for causing the data collection control device 2 to execute the learning data collection process (FIG. 9) described later. The learning data 223 is data for training the third learner so that, when the control values output from the first learner and the second learner are input, it outputs a control value corrected so as to resolve the conflict. Details will be described later.

＜学習装置＞ <Learning device>
次に、図４を用いて、本実施形態に係る学習装置３のハードウェア構成の一例を説明する。図４は、本実施形態に係る学習装置３のハードウェア構成の一例を模式的に例示する。 Next, an example of the hardware configuration of the learning device 3 according to the present embodiment will be described with reference to FIG. 4. FIG. 4 schematically illustrates an example of the hardware configuration of the learning device 3 according to the present embodiment.

図４に示されるとおり、本実施形態に係る学習装置３は、制御部３１、記憶部３２、通信インタフェース３３、入力装置３４、出力装置３５、及びドライブ３６が電気的に接続されたコンピュータである。なお、図４では、通信インタフェースを「通信Ｉ／Ｆ」と記載している。 As shown in FIG. 4, the learning device 3 according to the present embodiment is a computer in which the control unit 31, the storage unit 32, the communication interface 33, the input device 34, the output device 35, and the drive 36 are electrically connected. In FIG. 4, the communication interface is labeled "communication I/F".

制御部３１は、ＣＰＵ、ＲＡＭ、ＲＯＭ等を含み、プログラム及びデータに基づいて各種情報処理を実行するように構成される。記憶部３２は、制御部３１で実行される学習プログラム３２１、第３の学習器の学習に利用する学習データ２２３、学習プログラム３２１を実行することで作成した競合解消学習結果データ１２４等を記憶する。学習プログラム３２１は、学習装置３に後述する学習処理（図１０）を実行させるためのプログラムである。 The control unit 31 includes a CPU, a RAM, a ROM, and the like, and is configured to execute various kinds of information processing based on programs and data. The storage unit 32 stores the learning program 321 executed by the control unit 31, the learning data 223 used for training the third learner, the conflict resolution learning result data 124 created by executing the learning program 321, and the like. The learning program 321 is a program for causing the learning device 3 to execute the learning process (FIG. 10) described later.

通信インタフェース３３は、例えば、有線ＬＡＮ（Local Area Network）モジュール、無線ＬＡＮモジュール等であり、ネットワークを介した有線又は無線通信を行うためのインタフェースである。入力装置３４は、例えば、マウス、キーボード等の入力を行うための装置である。出力装置３５は、例えば、ディスプレイ、スピーカ等の出力を行うための装置である。 The communication interface 33 is, for example, a wired LAN (Local Area Network) module, a wireless LAN module, or the like, and is an interface for performing wired or wireless communication via a network. The input device 34 is a device for performing input, such as a mouse or a keyboard. The output device 35 is a device for performing output, such as a display or a speaker.

ドライブ３６は、例えば、ＣＤドライブ、ＤＶＤドライブ等であり、記憶媒体９１に記憶されたプログラムを読み込むためのドライブ装置である。ドライブ３６の種類は、記憶媒体９１の種類に応じて適宜選択されてよい。上記学習プログラム３２１及び／又は学習データ２２３は、この記憶媒体９１に記憶されていてもよい。 The drive 36 is, for example, a CD drive, a DVD drive, or the like, and is a drive device for reading a program stored in the storage medium 91. The type of the drive 36 may be appropriately selected according to the type of the storage medium 91. The learning program 321 and / or the learning data 223 may be stored in the storage medium 91.

記憶媒体９１は、コンピュータその他装置、機械等が記録されたプログラム等の情報を読み取り可能なように、当該プログラム等の情報を、電気的、磁気的、光学的、機械的又は化学的作用によって蓄積する媒体である。学習装置３は、この記憶媒体９１から、上記学習プログラム３２１及び／又は学習データ２２３を取得してもよい。 The storage medium 91 is a medium that accumulates information such as a program by electrical, magnetic, optical, mechanical, or chemical action so that a computer or other device or machine can read the recorded information. The learning device 3 may acquire the learning program 321 and/or the learning data 223 from the storage medium 91.

なお、図４では、記憶媒体９１の一例として、ＣＤ、ＤＶＤ等のディスク型の記憶媒体を例示している。しかしながら、記憶媒体９１の種類は、ディスク型に限定される訳ではなく、ディスク型以外であってもよい。ディスク型以外の記憶媒体として、例えば、フラッシュメモリ等の半導体メモリを挙げることができる。 Note that FIG. 4 illustrates a disc-type storage medium such as a CD or DVD as an example of the storage medium 91. However, the type of the storage medium 91 is not limited to the disc type, and may be other than the disc type. Examples of storage media other than the disk type include semiconductor memories such as flash memories.

［機能構成］ [Functional configuration]
＜制御装置＞ <Control device>
次に、図５を用いて、本実施形態に係る制御装置１の機能構成の一例を説明する。図５は、本実施形態に係る制御装置１の機能構成の一例を模式的に例示する。 Next, an example of the functional configuration of the control device 1 according to the present embodiment will be described with reference to FIG. 5. FIG. 5 schematically illustrates an example of the functional configuration of the control device 1 according to the present embodiment.

制御装置１の制御部１１は、記憶部１２に記憶された制御プログラム１２１をＲＡＭに展開する。そして、制御部１１は、ＲＡＭに展開された制御プログラム１２１をＣＰＵにより解釈及び実行して、各構成要素を制御する。これによって、図５に示されるとおり、本実施形態に係る制御装置１は、第１の制御処理部１１１、第２の制御処理部１１２、及び競合解消部１１３を備えるコンピュータとして機能する。 The control unit 11 of the control device 1 expands the control program 121 stored in the storage unit 12 into the RAM. Then, the control unit 11 interprets and executes the control program 121 expanded in the RAM by the CPU to control each component. As a result, as shown in FIG. 5, the control device 1 according to the present embodiment functions as a computer including the first control processing unit 111, the second control processing unit 112, and the conflict resolving unit 113.

第１の制御処理部１１１は、第１の学習器である第１のニューラルネットワーク５を利用して、空調装置４の動作を制御する。第１のニューラルネットワーク５は、利用者Ａの好みに適した空調装置４の動作の制御を予め学習済みである。第１の制御処理部１１１は、利用者Ａからの指示データ、位置情報等を第１のニューラルネットワーク５に入力することで、当該第１のニューラルネットワーク５から空調装置４に対する制御値を取得する。 The first control processing unit 111 controls the operation of the air conditioner 4 by using the first neural network 5, which is the first learner. The first neural network 5 has been trained in advance to control the operation of the air conditioner 4 in a manner suited to the preference of the user A. The first control processing unit 111 acquires a control value for the air conditioner 4 from the first neural network 5 by inputting instruction data, position information, and the like from the user A into the first neural network 5.

一方、第２の制御処理部１１２は、第２の学習器である第２のニューラルネットワーク６を利用して、空調装置４の動作を制御する。第２のニューラルネットワーク６は、利用者Ｂの好みに適した空調装置４の動作の制御を予め学習済みである。第２の制御処理部１１２は、利用者Ｂからの指示データ、位置情報等を第２のニューラルネットワーク６に入力することで、当該第２のニューラルネットワーク６から空調装置４に対する制御値を取得する。 On the other hand, the second control processing unit 112 controls the operation of the air conditioner 4 by using the second neural network 6, which is the second learner. The second neural network 6 has been trained in advance to control the operation of the air conditioner 4 in a manner suited to the preference of the user B. The second control processing unit 112 acquires a control value for the air conditioner 4 from the second neural network 6 by inputting instruction data, position information, and the like from the user B into the second neural network 6.

なお、第１のニューラルネットワーク５及び第２のニューラルネットワーク６に入力する情報（データ）の種類は、実施の形態に応じて適宜決定されてよい。利用者Ａ及びＢは、例えば、ＰＣ（Personal Computer）、携帯電話、リモートコントローラ等のユーザ端末を用いて、空調装置４に対して室温調整の要求を行ってもよい。これに応じて、制御装置１は、公知の無線又は有線のデータ通信により、各利用者（Ａ、Ｂ）のユーザ端末から指示データを受信してもよい。 The type of information (data) to be input to the first neural network 5 and the second neural network 6 may be appropriately determined according to the embodiment. Users A and B may request the air conditioner 4 to adjust the room temperature by using, for example, a user terminal such as a PC (Personal Computer), a mobile phone, or a remote controller. In response to this, the control device 1 may receive instruction data from the user terminals of each user (A, B) by known wireless or wired data communication.

このとき、制御装置１は、ユーザ端末からの指示データに付随して、各ニューラルネットワーク（５、６）に入力する各種情報を取得してもよい。例えば、ユーザ端末が、ＧＰＳ（Global Positioning System）信号の受信機を備える場合には、制御装置１は、各ニューラルネットワーク（５、６）に入力する情報として、各ユーザ端末から各利用者（Ａ、Ｂ）の位置情報を取得してもよい。 At this time, the control device 1 may acquire, along with the instruction data from the user terminal, various kinds of information to be input to each neural network (5, 6). For example, when the user terminal includes a receiver for GPS (Global Positioning System) signals, the control device 1 may acquire the position information of each user (A, B) from each user terminal as information to be input to each neural network (5, 6).

また、制御装置１は、各利用者（Ａ、Ｂ）の個人情報を記憶部１２に予め保持していてもよい。この場合、制御部１１は、各ユーザ端末から指示データを受信した際に、各ニューラルネットワーク（５、６）に入力する情報として、記憶部１２から各利用者（Ａ、Ｂ）の個人情報を取得してもよい。 Further, the control device 1 may hold the personal information of each user (A, B) in the storage unit 12 in advance. In this case, when the control unit 11 receives the instruction data from each user terminal, it may acquire the personal information of each user (A, B) from the storage unit 12 as information to be input to each neural network (5, 6).

競合解消部１１３は、第１のニューラルネットワーク５から出力される制御値に基づく空調装置４の制御と第２のニューラルネットワーク６から出力される制御値に基づく空調装置４の制御とが競合する場合に、空調装置４の制御を修正することで、当該競合を解消する。本実施形態では、第３の学習器である第３のニューラルネットワーク７を利用して、当該競合の解消を行う。 When the control of the air conditioner 4 based on the control value output from the first neural network 5 conflicts with the control of the air conditioner 4 based on the control value output from the second neural network 6, the conflict resolving unit 113 resolves the conflict by correcting the control of the air conditioner 4. In the present embodiment, the conflict is resolved by using the third neural network 7, which is the third learner.

第３のニューラルネットワーク７は、第１のニューラルネットワーク５及び第２のニューラルネットワーク６それぞれから出力される制御値を入力すると、競合を解消するように修正済みの制御値を出力するように予め学習済みである。そのため、競合解消部１１３は、第１のニューラルネットワーク５及び第２のニューラルネットワーク６それぞれから出力される制御値を第３のニューラルネットワーク７に入力することで、競合を解消するように修正済みの制御値を第３のニューラルネットワーク７から取得することができる。 The third neural network 7 has been trained in advance so that, when the control values output from the first neural network 5 and the second neural network 6 are input, it outputs a control value corrected so as to resolve the conflict. Therefore, by inputting the control values output from the first neural network 5 and the second neural network 6 into the third neural network 7, the conflict resolving unit 113 can obtain from the third neural network 7 a control value corrected so as to resolve the conflict.
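
As a rough illustration of the flow described above, the following Python sketch passes the two learners' control values to a third learner only when they conflict. The function name `decide_control_values`, the callable-based learner interface, the `in_conflict` predicate, and the pass-through behaviour when there is no conflict are all assumptions made for illustration; the description does not prescribe them.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a learner maps input features to one control value.
Learner = Callable[[Sequence[float]], float]


def decide_control_values(
    inputs_a: Sequence[float],
    inputs_b: Sequence[float],
    first: Learner,
    second: Learner,
    third: Learner,
    in_conflict: Callable[[float, float], bool],
) -> List[float]:
    """Return the control values to apply to the controlled device."""
    v1 = first(inputs_a)   # control value suited to user A
    v2 = second(inputs_b)  # control value suited to user B
    if in_conflict(v1, v2):
        # The third learner receives both control values and returns a corrected one.
        return [third([v1, v2])]
    # Assumption: without a conflict, both values are passed through unchanged.
    return [v1, v2]


# Stand-in learners (plain functions, not trained networks).
result = decide_control_values(
    [1.0], [2.0],
    first=lambda x: 26.0,
    second=lambda x: 22.0,
    third=lambda v: sum(v) / len(v),
    in_conflict=lambda a, b: a != b,
)
```

Here the averaging function only stands in for a trained third network; in practice its output would come from the learned weights rather than from a fixed rule.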

次に、各ニューラルネットワーク５〜７について説明する。図５に示されるとおり、第１のニューラルネットワーク５は、いわゆる深層学習に用いられる多層構造のニューラルネットワークであり、入力から順に、入力層５１、中間層（隠れ層）５２、及び出力層５３を備えている。 Next, each of the neural networks 5 to 7 will be described. As shown in FIG. 5, the first neural network 5 is a multilayer neural network of the kind used for so-called deep learning, and includes, in order from the input side, an input layer 51, an intermediate layer (hidden layer) 52, and an output layer 53.

なお、図５の例では、第１のニューラルネットワーク５は、１層の中間層５２を備えており、入力層５１の出力が中間層５２の入力となり、中間層５２の出力が出力層５３の入力となっている。ただし、中間層５２の数は１層に限られなくてもよく、第１のニューラルネットワーク５は、２層以上の中間層５２を備えてもよい。 In the example of FIG. 5, the first neural network 5 includes one intermediate layer 52; the output of the input layer 51 is the input of the intermediate layer 52, and the output of the intermediate layer 52 is the input of the output layer 53. However, the number of intermediate layers 52 need not be limited to one, and the first neural network 5 may include two or more intermediate layers 52.

各層５１〜５３は、１又は複数のニューロンを備えている。例えば、入力層５１のニューロンの数は、入力に利用する情報の件数に応じて設定することができる。中間層５２のニューロンの数は、実施の形態に応じて適宜設定することができる。また、出力層５３のニューロンの数は、出力する制御値の種類数に応じて設定することができる。 Each layer 51-53 comprises one or more neurons. For example, the number of neurons in the input layer 51 can be set according to the number of pieces of information used for input. The number of neurons in the middle layer 52 can be appropriately set according to the embodiment. Further, the number of neurons in the output layer 53 can be set according to the number of types of control values to be output.

隣接する層のニューロン同士は適宜結合され、各結合には重み（結合荷重）が設定されている。図５の例では、各ニューロンは、隣接する層の全てのニューロンと結合されているが、ニューロンの結合は、このような例に限定されなくてもよく、実施の形態に応じて適宜設定されてよい。 Neurons in adjacent layers are connected to each other as appropriate, and a weight (connection weight) is set for each connection. In the example of FIG. 5, each neuron is connected to all neurons in the adjacent layers, but the connections between neurons need not be limited to this example and may be set as appropriate according to the embodiment.

各ニューロンには閾値が設定されており、基本的には、各入力と各重みとの積の和が閾値を超えているか否かによって各ニューロンの出力が決定される。第１の制御処理部１１１は、このような第１のニューラルネットワーク５の入力層５１に利用者Ａからの指示データ、位置情報等の各種情報を入力し、順伝搬の方向に各層５１〜５３に含まれる各ニューロンの発火判定を行うことで、出力層５３から制御値（出力値）を得ることができる。 A threshold is set for each neuron, and basically the output of each neuron is determined by whether or not the sum of the products of each input and each weight exceeds the threshold. The first control processing unit 111 inputs various kinds of information, such as instruction data and position information from the user A, to the input layer 51 of the first neural network 5 and determines the firing of each neuron included in the layers 51 to 53 in the forward propagation direction, thereby obtaining a control value (output value) from the output layer 53.
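
The firing rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: a hard step function stands in for the transfer function, the input layer simply passes its values through, and the names `step_neuron`, `forward`, and the example weights are hypothetical.

```python
from typing import List, Sequence, Tuple

# One neuron is described by its incoming weights and its threshold.
Neuron = Tuple[List[float], float]


def step_neuron(inputs: Sequence[float], weights: Sequence[float], threshold: float) -> float:
    """Fire (1.0) if the weighted sum of the inputs exceeds the threshold, else 0.0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 if total > threshold else 0.0


def forward(inputs: Sequence[float], layers: List[List[Neuron]]) -> List[float]:
    """Propagate the inputs layer by layer and return the output-layer values."""
    activations = list(inputs)
    for layer in layers:
        activations = [step_neuron(activations, w, t) for (w, t) in layer]
    return activations


# Hypothetical network: 2 inputs, one hidden layer with 2 neurons, 1 output neuron.
hidden = [([1.0, 1.0], 0.5), ([1.0, 1.0], 1.5)]  # fires on "any input" / "both inputs"
output = [([1.0, -1.0], 0.5)]                    # fires when exactly one input is set
network = [hidden, output]

outputs = [forward(x, network) for x in ([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
```

A real controller would more likely use a smooth transfer function so that the output can represent a continuous control value; the step function is used here only because it matches the threshold wording of the text.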

なお、以上のような第１のニューラルネットワーク５の構成（例えば、ニューラルネットワークの層数、各層におけるニューロンの個数、ニューロン同士の結合関係、各ニューロンの伝達関数）、各ニューロン間の結合の重み、及び各ニューロンの閾値を示す情報は、第１動作制御学習結果データ１２２に含まれている。第１の制御処理部１１１は、第１動作制御学習結果データ１２２を参照して、利用者Ａの好みに適した空調装置４の動作の制御を学習済みの第１のニューラルネットワーク５の設定を行う。 Information indicating the configuration of the first neural network 5 described above (for example, the number of layers of the neural network, the number of neurons in each layer, the connection relationships between neurons, and the transfer function of each neuron), the weights of the connections between neurons, and the threshold of each neuron is included in the first motion control learning result data 122. The first control processing unit 111 refers to the first motion control learning result data 122 to set up the first neural network 5, which has been trained to control the operation of the air conditioner 4 in a manner suited to the preference of the user A.
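
One possible in-memory and serialized shape for such learning result data is sketched below. The class name `LearningResultData`, its field names, the JSON encoding, and the assumptions of full connectivity and of thresholds only on non-input layers are all hypothetical; the description lists what the data contains but not how it is stored.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class LearningResultData:
    layer_sizes: List[int]            # neurons per layer, from input to output
    transfer_function: str            # e.g. "step" (same for all neurons here)
    weights: List[List[List[float]]]  # weights[l][j][i]: layer l neuron i -> layer l+1 neuron j
    thresholds: List[List[float]]     # thresholds[l][j] for neuron j of layer l+1


def is_consistent(data: LearningResultData) -> bool:
    """Check that the weight and threshold shapes match the declared layer sizes."""
    sizes = data.layer_sizes
    if len(data.weights) != len(sizes) - 1 or len(data.thresholds) != len(sizes) - 1:
        return False
    for l in range(len(sizes) - 1):
        if len(data.weights[l]) != sizes[l + 1] or len(data.thresholds[l]) != sizes[l + 1]:
            return False
        if any(len(row) != sizes[l] for row in data.weights[l]):
            return False
    return True


def to_json(data: LearningResultData) -> str:
    return json.dumps(asdict(data))


def from_json(text: str) -> LearningResultData:
    return LearningResultData(**json.loads(text))


example = LearningResultData(
    layer_sizes=[2, 2, 1],
    transfer_function="step",
    weights=[[[1.0, 1.0], [1.0, 1.0]], [[1.0, -1.0]]],
    thresholds=[[0.5, 1.5], [0.5]],
)
restored = from_json(to_json(example))
```

Keeping the structure and the parameters in one record means a control processing unit can rebuild the network from a single file, which matches the way the text describes the data being referenced at setup time.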

第２のニューラルネットワーク６及び第３のニューラルネットワーク７も、第１のニューラルネットワーク５と同様に構成される。すなわち、第２のニューラルネットワーク６は、入力から順に、入力層６１、中間層（隠れ層）６２、及び出力層６３を備えている。第３のニューラルネットワーク７は、入力から順に、入力層７１、中間層（隠れ層）７２、及び出力層７３を備えている。各中間層（６２、７２）の数、各層（６１〜６３、７１〜７３）のニューロンの数、及び隣接する層のニューロンの結合は、実施の形態に応じて適宜設定されてよい。 The second neural network 6 and the third neural network 7 are configured in the same manner as the first neural network 5. That is, the second neural network 6 includes an input layer 61, an intermediate layer (hidden layer) 62, and an output layer 63 in this order from the input. The third neural network 7 includes an input layer 71, an intermediate layer (hidden layer) 72, and an output layer 73 in this order from the input. The number of each intermediate layer (62, 72), the number of neurons in each layer (61-63, 71-73), and the connection of neurons in adjacent layers may be appropriately set according to the embodiment.

第２の制御処理部１１２は、第２のニューラルネットワーク６の入力層６１に利用者Ｂからの指示データ、位置情報等の各種情報を入力し、順伝搬の方向に各層６１〜６３に含まれる各ニューロンの発火判定を行うことで、出力層６３から制御値（出力値）を得ることができる。また、競合解消部１１３は、第１のニューラルネットワーク５及び第２のニューラルネットワーク６の各出力層（５３、６３）から出力される制御値を入力層７１に入力し、順伝搬の方向に各層７１〜７３に含まれる各ニューロンの発火判定を行うことで、出力層７３から修正済みの制御値（出力値）を得ることができる。 The second control processing unit 112 inputs various information such as instruction data and position information from the user B to the input layer 61 of the second neural network 6, and is included in each of the layers 61 to 63 in the forward propagation direction. A control value (output value) can be obtained from the output layer 63 by determining the firing of each neuron. Further, the conflict resolving unit 113 inputs the control values output from the output layers (53, 63) of the first neural network 5 and the second neural network 6 to the input layer 71, and each layer in the forward propagation direction. By determining the firing of each neuron included in 71 to 73, a corrected control value (output value) can be obtained from the output layer 73.

なお、以上のような第２のニューラルネットワーク６の構成、各ニューロン間の結合の重み、及び各ニューロンの閾値を示す情報は、第２動作制御学習結果データ１２３に含まれている。第２の制御処理部１１２は、第２動作制御学習結果データ１２３を参照して、利用者Ｂの好みに適した空調装置４の動作の制御を学習済みの第２のニューラルネットワーク６の設定を行う。 Information indicating the configuration of the second neural network 6 described above, the weights of the connections between neurons, and the threshold of each neuron is included in the second motion control learning result data 123. The second control processing unit 112 refers to the second motion control learning result data 123 to set up the second neural network 6, which has been trained to control the operation of the air conditioner 4 in a manner suited to the preference of the user B.

同様に、以上のような第３のニューラルネットワーク７の構成、各ニューロン間の結合の重み、及び各ニューロンの閾値を示す情報は、競合解消学習結果データ１２４に含まれている。競合解消部１１３は、競合解消学習結果データ１２４を参照して、第１のニューラルネットワーク５及び第２のニューラルネットワーク６それぞれから出力される制御値を入力すると、競合を解消するように修正済みの制御値を出力するように学習済みである第３のニューラルネットワーク７の設定を行う。 Similarly, information indicating the configuration of the third neural network 7 described above, the weights of the connections between neurons, and the threshold of each neuron is included in the conflict resolution learning result data 124. The conflict resolving unit 113 refers to the conflict resolution learning result data 124 to set up the third neural network 7, which has been trained so that, when the control values output from the first neural network 5 and the second neural network 6 are input, it outputs a control value corrected so as to resolve the conflict.

＜データ収集用制御装置＞ <Control device for data collection>
次に、図６を用いて、本実施形態に係るデータ収集用制御装置２の機能構成の一例を説明する。図６は、本実施形態に係るデータ収集用制御装置２の機能構成の一例を模式的に例示する。 Next, an example of the functional configuration of the data collection control device 2 according to the present embodiment will be described with reference to FIG. 6. FIG. 6 schematically illustrates an example of the functional configuration of the data collection control device 2 according to the present embodiment.

データ収集用制御装置２の制御部２１は、記憶部２２に記憶されたデータ収集用制御プログラム２２１をＲＡＭに展開する。そして、制御部２１は、ＲＡＭに展開されたデータ収集用制御プログラム２２１をＣＰＵにより解釈及び実行して、各構成要素を制御する。これにより、図６に示されるとおり、本実施形態に係るデータ収集用制御装置２は、第１の制御処理部２１１、第２の制御処理部２１２、修正値決定部２１３、及び学習データ作成部２１４を備えるコンピュータとして機能する。 The control unit 21 of the data collection control device 2 loads the data collection control program 221 stored in the storage unit 22 into the RAM. The control unit 21 then uses the CPU to interpret and execute the data collection control program 221 loaded in the RAM, and controls each component. As a result, as shown in FIG. 6, the data collection control device 2 according to the present embodiment functions as a computer including the first control processing unit 211, the second control processing unit 212, the correction value determination unit 213, and the learning data creation unit 214.

第１の制御処理部２１１は、上記制御装置１の第１の制御処理部１１１と同様である。すなわち、第１の制御処理部２１１は、第１動作制御学習結果データ１２２を参照して、第１のニューラルネットワーク５の設定を行う。そして、第１の制御処理部２１１は、設定した第１のニューラルネットワーク５の入力層５１に利用者Ａからの指示データ、位置情報等の各種情報を入力し、順伝搬の方向に各層５１〜５３に含まれる各ニューロンの発火判定を行うことで、利用者Ａの好みに応じた空調装置４に対する制御値（出力値）を出力層５３から取得する。 The first control processing unit 211 is the same as the first control processing unit 111 of the control device 1. That is, the first control processing unit 211 sets the first neural network 5 with reference to the first motion control learning result data 122. Then, the first control processing unit 211 inputs various information such as instruction data and position information from the user A to the input layer 51 of the set first neural network 5, and each layer 51 to 51 in the forward propagation direction. By determining the firing of each neuron included in the 53, the control value (output value) for the air conditioner 4 according to the preference of the user A is acquired from the output layer 53.

第２の制御処理部２１２は、上記制御装置１の第２の制御処理部１１２と同様である。すなわち、第２の制御処理部２１２は、第２動作制御学習結果データ１２３を参照して、第２のニューラルネットワーク６の設定を行う。そして、第２の制御処理部２１２は、設定した第２のニューラルネットワーク６の入力層６１に利用者Ｂからの指示データ、位置情報等の各種情報を入力し、順伝搬の方向に各層６１〜６３に含まれる各ニューロンの発火判定を行うことで、利用者Ｂの好みに応じた空調装置４に対する制御値（出力値）を出力層６３から取得する。 The second control processing unit 212 is the same as the second control processing unit 112 of the control device 1. That is, the second control processing unit 212 sets the second neural network 6 with reference to the second motion control learning result data 123. Then, the second control processing unit 212 inputs various information such as instruction data and position information from the user B to the input layer 61 of the set second neural network 6, and each layer 61 to 61 in the forward propagation direction. By determining the firing of each neuron included in 63, the control value (output value) for the air conditioner 4 according to the preference of the user B is acquired from the output layer 63.

データ収集用制御装置２は、第１のニューラルネットワーク５及び第２のニューラルネットワーク６それぞれから得られる制御値に基づいて、空調装置４の動作を制御する。ただし、空調装置４の動作を制御しようとした結果、空調装置４の制御に競合が生じる場合には、空調装置４は動作不能に陥る可能性がある。 The data collection control device 2 controls the operation of the air conditioner 4 based on the control values obtained from each of the first neural network 5 and the second neural network 6. However, if there is a conflict in the control of the air conditioner 4 as a result of trying to control the operation of the air conditioner 4, the air conditioner 4 may become inoperable.

例えば、上記のとおり、室温が２４度である状況で、第１のニューラルネットワーク５から出力される制御値が室温を２６度に上げる指令を構成しており、第２のニューラルネットワーク６から出力される制御値が室温を２２度に下げる指令を構成しているとする。このような場合、空調装置４の制御に競合が生じてしまい、データ収集用制御装置２は、室温を上げるように空調装置４の動作を制御すればよいのか、室温を下げるように空調装置４の動作を制御すればよいのか判断できなくなってしまう。 For example, as described above, suppose that the room temperature is 24 degrees, the control value output from the first neural network 5 constitutes a command to raise the room temperature to 26 degrees, and the control value output from the second neural network 6 constitutes a command to lower the room temperature to 22 degrees. In such a case, a conflict arises in the control of the air conditioner 4, and the data collection control device 2 can no longer determine whether it should control the operation of the air conditioner 4 so as to raise the room temperature or so as to lower it.
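
The temperature example can be expressed as a small predicate. The sketch below is only one possible criterion, chosen here as an assumption: two target temperatures are treated as conflicting when one requires raising and the other requires lowering the current room temperature. The function name `is_conflicting` is hypothetical, and the description does not define the detection rule in this form.

```python
def is_conflicting(current_temp: float, target_a: float, target_b: float) -> bool:
    """True when one target asks to raise and the other to lower the room temperature."""
    # The product is negative only when the two requested changes have opposite signs.
    return (target_a - current_temp) * (target_b - current_temp) < 0


example_from_text = is_conflicting(24.0, 26.0, 22.0)  # raise to 26 vs. lower to 22
same_direction = is_conflicting(24.0, 26.0, 25.0)     # both ask to raise
```

Under this rule, two different targets in the same direction (26 and 25 degrees) are not flagged; a stricter rule that treats any differing targets as a conflict would be an equally valid design choice.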

そこで、空調装置４の動作を制御しようとした結果、空調装置４の制御に競合が生じる場合には、修正値決定部２１３が、当該競合を解消するように制御値の修正値を決定する。そして、学習データ作成部２１４は、各ニューラルネットワーク（５、６）から得られる制御値を入力データとし、修正値決定部２１３により決定された修正済みの制御値を教師データとして、第３のニューラルネットワーク７を構築するための学習データ２２３を作成する。 Therefore, when a conflict arises in the control of the air conditioner 4 as a result of attempting to control its operation, the correction value determination unit 213 determines a corrected control value so as to resolve the conflict. The learning data creation unit 214 then creates the learning data 223 for constructing the third neural network 7, using the control values obtained from each neural network (5, 6) as input data and the corrected control value determined by the correction value determination unit 213 as teacher data.
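
The pairing of input data and teacher data described above might look like the following sketch. The function name `create_learning_data`, the tuple layout of a sample, and in particular the midpoint rule used as the correction are assumptions for illustration only; this passage does not state how the correction value determination unit 213 chooses its value.

```python
from typing import Callable, List, Sequence, Tuple

# One sample: (control values from learners 1 and 2, corrected control value).
Sample = Tuple[Tuple[float, float], float]


def create_learning_data(
    control_value_pairs: Sequence[Tuple[float, float]],
    in_conflict: Callable[[float, float], bool],
    decide_correction: Callable[[float, float], float],
) -> List[Sample]:
    """Record an (input, teacher) sample for every conflicting pair of control values."""
    data: List[Sample] = []
    for v1, v2 in control_value_pairs:
        if in_conflict(v1, v2):
            data.append(((v1, v2), decide_correction(v1, v2)))
    return data


CURRENT_TEMP = 24.0  # hypothetical room temperature

learning_data = create_learning_data(
    [(26.0, 22.0), (26.0, 25.0)],
    # Assumed conflict rule: the two targets pull the temperature in opposite directions.
    in_conflict=lambda a, b: (a - CURRENT_TEMP) * (b - CURRENT_TEMP) < 0,
    # Placeholder correction rule: take the midpoint of the two targets.
    decide_correction=lambda a, b: (a + b) / 2,
)
```

Passing the conflict test and the correction rule in as functions keeps the data-collection loop independent of whichever policy (operator input, fixed rule, priority between users) actually supplies the corrected value.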

＜学習装置＞ <Learning device>
次に、図７を用いて、本実施形態に係る学習装置３の機能構成の一例を説明する。図７は、本実施形態に係る学習装置３の機能構成の一例を模式的に例示する。 Next, an example of the functional configuration of the learning device 3 according to the present embodiment will be described with reference to FIG. 7. FIG. 7 schematically illustrates an example of the functional configuration of the learning device 3 according to the present embodiment.

学習装置３の制御部３１は、記憶部３２に記憶された学習プログラム３２１をＲＡＭに展開する。そして、制御部３１は、ＲＡＭに展開された学習プログラム３２１をＣＰＵにより解釈及び実行して、各構成要素を制御する。これによって、図７に示されるとおり、本実施形態に係る学習装置３は、学習データ取得部３１１及び学習処理部３１２を備えるコンピュータとして機能する。 The control unit 31 of the learning device 3 expands the learning program 321 stored in the storage unit 32 into the RAM. Then, the control unit 31 interprets and executes the learning program 321 expanded in the RAM by the CPU to control each component. As a result, as shown in FIG. 7, the learning device 3 according to the present embodiment functions as a computer including the learning data acquisition unit 311 and the learning processing unit 312.

学習データ取得部３１１は、上記により作成された学習データ２２３を取得する。学習処理部３１２は、取得した学習データ２２３及び学習用のニューラルネットワーク８を利用して、上記制御装置１で利用する第３のニューラルネットワークの構築を行う。すなわち、学習処理部３１２は、各ニューラルネットワーク（５、６）から得られる制御値を入力すると、競合を解消するように修正済みの制御値を出力するようにニューラルネットワーク８を学習させる。 The learning data acquisition unit 311 acquires the learning data 223 created above. The learning processing unit 312 constructs a third neural network used by the control device 1 by using the acquired learning data 223 and the neural network 8 for learning. That is, when the control value obtained from each neural network (5, 6) is input, the learning processing unit 312 trains the neural network 8 so as to output the corrected control value so as to eliminate the conflict.

学習対象となるニューラルネットワーク８は、第３のニューラルネットワーク７と同様に構成される。すなわち、学習用のニューラルネットワーク８は、入力層８１、中間層（隠れ層）８２、及び出力層８３を備え、各層８１〜８３は、上記第３のニューラルネットワーク７の各層７１〜７３と同様に構成される。 The neural network 8 to be trained is configured in the same manner as the third neural network 7. That is, the neural network 8 for learning includes an input layer 81, an intermediate layer (hidden layer) 82, and an output layer 83, and the layers 81 to 83 are configured in the same manner as the layers 71 to 73 of the third neural network 7.

学習処理部３１２は、ニューラルネットワークの学習処理により、各ニューラルネットワーク（５、６）から得られる制御値を入力層８１に入力すると、競合を解消するように修正済みの制御値を出力層８３から出力するニューラルネットワーク８を構築する。これにより構築されたニューラルネットワーク８は、学習済みの第３のニューラルネットワーク７として利用可能である。学習処理部３１２は、構築したニューラルネットワーク８の構成、各ニューロン間の結合の重み、及び各ニューロンの閾値を示す情報を競合解消学習結果データ１２４として記憶部３２に格納する。 Through the neural network learning process, the learning processing unit 312 constructs a neural network 8 that, when the control values obtained from each neural network (5, 6) are input to the input layer 81, outputs from the output layer 83 a control value corrected so as to resolve the conflict. The neural network 8 constructed in this way can be used as the trained third neural network 7. The learning processing unit 312 stores information indicating the configuration of the constructed neural network 8, the weights of the connections between neurons, and the threshold of each neuron in the storage unit 32 as the conflict resolution learning result data 124.
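
To make the idea of fitting a model to (control values, corrected value) pairs concrete, the sketch below trains a single linear unit by stochastic gradient descent. It is a deliberately simplified stand-in: the description uses a multilayer neural network 8 and does not specify the training algorithm in this passage, and the function name `train_linear_unit`, the sample values (offsets from a hypothetical current room temperature), and the hyperparameters are all assumptions.

```python
from typing import List, Tuple

# One sample: ((offset requested by learner 1, offset requested by learner 2), corrected offset).
Sample = Tuple[Tuple[float, float], float]


def train_linear_unit(
    samples: List[Sample], epochs: int = 5000, lr: float = 0.05
) -> Tuple[float, float, float]:
    """Fit y = w1*x1 + w2*x2 + b to the samples by per-sample gradient descent."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = (w1 * x1 + w2 * x2 + b) - target
            # Gradient of the squared error with respect to each parameter.
            w1 -= lr * error * x1
            w2 -= lr * error * x2
            b -= lr * error
    return w1, w2, b


# Hypothetical teacher data: the corrected offset is the midpoint of the two requests.
samples: List[Sample] = [
    ((2.0, -2.0), 0.0),
    ((4.0, -2.0), 1.0),
    ((2.0, -4.0), -1.0),
    ((1.0, -3.0), -1.0),
]
w1, w2, b = train_linear_unit(samples)
```

After training, the parameters approach w1 = w2 = 0.5 and b = 0, which reproduces the midpoint rule used to generate the teacher data; the resulting parameters play the role that the weights and thresholds play in the conflict resolution learning result data.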

＜その他＞ <Others>
制御装置１、データ収集用制御装置２、及び学習装置３の各機能に関しては後述する動作例で詳細に説明する。なお、本実施形態では、制御装置１、データ収集用制御装置２、及び学習装置３の各機能がいずれも汎用のＣＰＵにより実現される例について説明した。しかしながら、以上の機能の一部又は全部が、１又は複数の専用のハードウェアプロセッサにより実現されてもよい。また、制御装置１、データ収集用制御装置２、及び学習装置３それぞれの機能構成に関して、実施形態に応じて、適宜、機能の省略、置換及び追加が行われてもよい。 Each function of the control device 1, the data collection control device 2, and the learning device 3 will be described in detail in an operation example described later. In this embodiment, an example in which each function of the control device 1, the data collection control device 2, and the learning device 3 is realized by a general-purpose CPU has been described. However, some or all of the above functions may be realized by one or more dedicated hardware processors. Further, with respect to the functional configurations of the control device 1, the data collection control device 2, and the learning device 3, the functions may be omitted, replaced, or added as appropriate according to the embodiment.

§3 Operation example
[Control device]
Next, an operation example of the control device 1 will be described with reference to FIG. 8. FIG. 8 is a flowchart illustrating an example of the processing procedure of the control device 1. The processing procedure described below is only an example, and each process may be changed to the extent possible. Steps of the procedure may also be omitted, replaced, or added as appropriate for the embodiment.

(Start-up)
First, the system including the control device 1 and the air conditioner 4 is started as appropriate. When the system starts, the control device 1 reads the control program 121 and executes the initial setting process. Specifically, the control unit 11 refers to the learning result data 122 to 124 and sets the structure of each of the neural networks 5 to 7, the weights of the connections between neurons, and the threshold of each neuron. The control unit 11 then controls the operation of the air conditioner 4 according to the following processing procedure.

(Step S101)
In step S101, the control unit 11 acquires from each user (A, B) the various information used for controlling the operation of the air conditioner 4, in other words, the information that determines how the air conditioner 4 should operate. As described above, the type of information used for controlling the operation of the air conditioner 4 may be determined as appropriate for the embodiment. Each user (A, B) may request the air conditioner 4 to adjust the room temperature by using a user terminal such as a PC (Personal Computer), a mobile phone, or a remote controller. In response, the control unit 11 may acquire from the user terminal, by known wireless or wired data communication, various information such as instruction data and position information used for controlling the operation of the air conditioner 4. The control unit 11 may also acquire personal information of each user (A, B) from the storage unit 12 as information used for controlling the operation of the air conditioner 4.

(Step S102)
In the next step S102, the control unit 11 functions as the first control processing unit 111 and inputs the various information acquired from the user A, such as instruction data and position information, to the first neural network 5. The first neural network 5 has been trained in advance so that, when the various information acquired from the user A is input, it outputs a control value for the air conditioner 4 that matches the preference of the user A. The control unit 11 therefore inputs the various information acquired from the user A to the neurons of the input layer 51 and determines the firing of each neuron included in the layers 51 to 53 in the forward propagation direction, thereby obtaining from the neurons of the output layer 53 a control value for the air conditioner 4 that matches the preference of the user A.

The control unit 11 also functions as the second control processing unit 112 and inputs the various information acquired from the user B, such as instruction data and position information, to the second neural network 6. The second neural network 6 has been trained in advance so that, when the various information acquired from the user B is input, it outputs a control value for the air conditioner 4 that matches the preference of the user B. The control unit 11 therefore inputs the various information acquired from the user B to the neurons of the input layer 61 and determines the firing of each neuron included in the layers 61 to 63 in the forward propagation direction, thereby obtaining from the neurons of the output layer 63 a control value for the air conditioner 4 that matches the preference of the user B.
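The forward propagation with firing determination used in step S102 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names `step` and `forward`, the list-based layer representation, and the hard step transfer function are assumptions made for the example (the description leaves the transfer function of each neuron to the embodiment).

```python
def step(x):
    """Hypothetical transfer function: the neuron fires (1.0) when x is positive."""
    return 1.0 if x > 0.0 else 0.0


def forward(layers, inputs, transfer=step):
    """Propagate inputs through the layers in the forward direction.

    layers: list of (weights, thresholds) pairs, one per layer, where
            weights[j][i] is the connection weight from input i to neuron j
            and thresholds[j] is the threshold of neuron j.
    """
    activations = list(inputs)
    for weights, thresholds in layers:
        # A neuron fires when its weighted input sum exceeds its threshold.
        activations = [
            transfer(sum(w * a for w, a in zip(row, activations)) - theta)
            for row, theta in zip(weights, thresholds)
        ]
    return activations


# One neuron with two inputs, both weights 1.0, threshold 1.5.
single_neuron = [([[1.0, 1.0]], [1.5])]
print(forward(single_neuron, [1.0, 1.0]))
print(forward(single_neuron, [1.0, 0.0]))
```

A network that outputs a continuous control value such as a target temperature would typically use a smooth transfer function in place of `step`; it can be passed in through the `transfer` parameter.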

(Step S103)
In the next step S103, the control unit 11 functions as the conflict resolution unit 113 and inputs the control values obtained from the neural networks (5, 6) to the input layer 71 of the third neural network 7. The control unit 11 then determines the firing of each neuron included in the layers 71 to 73 in the forward propagation direction, thereby obtaining from the output layer 73 of the third neural network 7 a control value corrected so as to resolve the conflict. In this way, when a conflict arises in the control of the air conditioner 4 using the neural networks (5, 6), the present embodiment can correct the control of the air conditioner 4 so as to resolve that conflict.

Note that in step S103 the control values acquired in step S102 are input to the third neural network 7 without distinguishing whether or not they cause a conflict in the control of the air conditioner 4. That is, the control values acquired in step S102 are input to the third neural network 7 even when they cause no conflict in the control of the air conditioner 4.

For the case where the control values acquired in step S102 cause no conflict in the control of the air conditioner 4, the third neural network 7 may be trained either to output each control value as it is or to output a corrected control value in the same way as when a conflict occurs. In the following, a control value that the third neural network 7 outputs unchanged, without correcting the input control value, is also referred to as a "corrected control value".

(Step S104)
In the next step S104, the control unit 11 controls the operation of the air conditioner 4 based on the corrected control value acquired from the third neural network 7 in step S103. The control value indicates, for example, the desired room temperature to be reached by operating the air conditioner 4. The control unit 11 compares the desired room temperature indicated by the control value with the current room temperature and controls the cooling and heating operation of the air conditioner 4 so that the desired room temperature is reached.
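The comparison in step S104 between the desired and the current room temperature can be illustrated with a short sketch. The function name `choose_mode`, the mode strings, and the dead band of 0.5 degrees are assumptions introduced for this example; the description does not specify how the cooling and heating operation is selected.

```python
def choose_mode(desired_c, current_c, deadband_c=0.5):
    """Pick a cooling/heating action that moves the room toward desired_c.

    deadband_c is a hypothetical tolerance that avoids switching modes
    on very small temperature differences.
    """
    if current_c > desired_c + deadband_c:
        return "cool"
    if current_c < desired_c - deadband_c:
        return "heat"
    return "idle"


print(choose_mode(24.0, 27.0))  # room is warmer than desired
print(choose_mode(24.0, 20.0))  # room is colder than desired
```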

With the above, the control unit 11 ends the processing according to this operation example. The control unit 11 may repeat the processing of steps S101 to S104 periodically or irregularly. The control device 1 can thereby continuously control the operation of the air conditioner 4 in accordance with the preferences of the users (A, B).

[Data collection control device]
Next, an operation example of the data collection control device 2 will be described with reference to FIG. 9. FIG. 9 is a flowchart illustrating an example of the processing procedure of the data collection control device 2. The processing procedure described below corresponds to the "learning data creation method" of the present invention. However, the procedure is only an example, and each process may be changed to the extent possible. Steps of the procedure may also be omitted, replaced, or added as appropriate for the embodiment.

(Start-up)
As above, the system including the data collection control device 2 and the air conditioner 4 is started as appropriate. When the system starts, the data collection control device 2 reads the data collection control program 221 and executes the initial setting process. That is, the control unit 21 refers to the learning result data (122, 123) and sets the structure of each of the neural networks (5, 6), the weights of the connections between neurons, and the threshold of each neuron. The control unit 21 then creates the learning data 223 for constructing the third neural network 7 according to the following processing procedure.

(Steps S201 and S202)
In step S201, as in step S101, the control unit 21 acquires from each user (A, B) the various information to be input to the neural networks (5, 6).

In the next step S202, as in step S102, the control unit 21 functions as the first control processing unit 211 and inputs the various information acquired from the user A to the input layer 51 of the first neural network 5. The control unit 21 then determines the firing of each neuron included in the layers 51 to 53 in the forward propagation direction, thereby acquiring the control value for the air conditioner 4 that is output from the output layer 53 of the first neural network 5 and matches the preference of the user A.

The control unit 21 also functions as the second control processing unit 212 and inputs the various information acquired from the user B to the input layer 61 of the second neural network 6. The control unit 21 then determines the firing of each neuron included in the layers 61 to 63 in the forward propagation direction, thereby acquiring the control value for the air conditioner 4 that is output from the output layer 63 of the second neural network 6 and matches the preference of the user B.

(Steps S203 and S204)
In the next step S203, the control unit 21 controls the operation of the air conditioner 4, which is the device to be controlled, based on the control values acquired from the neural networks (5, 6) in step S202. Then, in step S204, the control unit 21 determines whether or not a conflict occurs in the control of the air conditioner 4.

Here, the control unit 21 may actually operate the air conditioner 4 in step S203 to determine whether the control values acquired from the neural networks (5, 6) cause a conflict. Alternatively, the control unit 21 may leave the air conditioner 4 idle in step S203 and instead simulate its operation based on the control values acquired from the neural networks (5, 6) to determine whether a conflict occurs.

The method of determining whether a conflict occurs may be set as appropriate for the embodiment. For example, the control unit 21 may determine that a conflict occurs in the control of the air conditioner 4 when the controls based on the control values acquired from the neural networks (5, 6) cannot be executed at the same time. When it determines that a conflict occurs, the control unit 21 proceeds to the next step S205. When it determines that no conflict occurs, the control unit 21 ends the processing according to this operation example.
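One possible form of the conflict determination in step S204 is sketched below. It assumes, purely for illustration, that each control value is a target room temperature and that a single air conditioner cannot realize two targets that differ by more than a tolerance; the name `has_conflict` and the tolerance value are hypothetical and are not taken from the embodiment.

```python
def has_conflict(control_values, tolerance=0.5):
    """Return True when the target temperatures cannot be realized together.

    control_values: target temperatures issued by the individual learners.
    tolerance: hypothetical spread (in degrees) below which the targets
               are treated as compatible.
    """
    return max(control_values) - min(control_values) > tolerance


print(has_conflict([22.0, 26.0]))  # targets 4 degrees apart
print(has_conflict([24.0, 24.2]))  # targets within the tolerance
```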

(Step S205)
In the next step S205, the control unit 21 functions as the correction value determination unit 213 and determines correction values for the control values acquired from the neural networks (5, 6) so as to resolve the conflict that arose in the control of the air conditioner 4. The control unit 21 thereby acquires a control value corrected so as to resolve the conflict.

The method of correcting the control values may be selected as appropriate for the embodiment. For example, the control unit 21 may determine the correction value according to a predetermined rule. If the predetermined rule gives priority to one of the users A and B, the control unit 21 gives priority to the corresponding one of the control values acquired from the neural networks (5, 6); that is, it treats the prioritized control value as the corrected control value. If the predetermined rule treats the users A and B equally, the control unit 21 obtains the corrected control value by averaging the control values acquired from the neural networks (5, 6).

The predetermined rule need not be limited to these examples. For example, when the rule assigns a priority to each of the users A and B, the control unit 21 may use the weighted average of the control values acquired from the neural networks (5, 6) as the corrected control value.

As another example, the control unit 21 may accept input of the correction value from an operator. That is, the control unit 21 may determine the corrected control value based on the operator's input. In this case, the data collection control device 2 may be connected to an input device such as a keyboard or a microphone via the external interface 23, so that the operator can enter the corrected control value by keyboard input, voice input, or the like.
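The rule-based options described for step S205 (priority to one user, equal treatment by averaging, and a weighted average by priority) can be written down compactly. The sketch below is illustrative only; the function names and the representation of priorities as non-negative numbers are assumptions.

```python
def resolve_by_priority(values, preferred_index):
    """Use the control value of the prioritized user as the corrected value."""
    return values[preferred_index]


def resolve_by_average(values):
    """Treat all users equally by averaging their control values."""
    return sum(values) / len(values)


def resolve_by_weighted_average(values, priorities):
    """Weight each control value by the (hypothetical) priority of its user."""
    total = sum(priorities)
    if total <= 0:
        raise ValueError("priorities must sum to a positive number")
    return sum(v * p for v, p in zip(values, priorities)) / total


values = [22.0, 26.0]  # control values from networks 5 and 6
print(resolve_by_priority(values, 0))
print(resolve_by_average(values))
print(resolve_by_weighted_average(values, [3, 1]))
```

An operator-supplied correction, the other option described above, would simply replace the return value of these functions with the value entered through the input device.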

(Step S206)
In the next step S206, the control unit 21 functions as the learning data creation unit 214 and pairs the control values acquired from the neural networks (5, 6) in step S202 with the corrected control value determined in step S205. The control unit 21 thereby creates learning data 223 in which the control values before correction are the input data and the corrected control value is the teacher data. The control unit 21 then saves the created learning data 223 in the storage unit 22.

With the above, the control unit 21 ends the processing according to this operation example. By repeating the series of steps S201 to S206, the control unit 21 can collect a plurality of items of learning data 223.
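Taken together, steps S204 to S206 turn each pair of conflicting control values into one item of learning data. The following sketch shows that flow under stated assumptions: control values are target temperatures, a conflict is a difference above a tolerance, the correction rule is a plain average, and the teacher data is a single corrected value. All names (`make_sample`, `collect`) are hypothetical.

```python
def make_sample(value_a, value_b, tolerance=0.5):
    """Return (input data, teacher data) for a conflicting pair, else None."""
    if abs(value_a - value_b) <= tolerance:
        return None  # no conflict: the basic procedure ends without a sample
    corrected = (value_a + value_b) / 2.0  # assumed rule: equal treatment
    return ((value_a, value_b), corrected)


def collect(pairs, tolerance=0.5):
    """Repeat the procedure over many pairs and keep the created samples."""
    samples = []
    for value_a, value_b in pairs:
        sample = make_sample(value_a, value_b, tolerance)
        if sample is not None:
            samples.append(sample)
    return samples


observed = [(22.0, 26.0), (24.0, 24.2), (20.0, 25.0)]
for inputs, teacher in collect(observed):
    print(inputs, teacher)
```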

To construct a third neural network 7 that outputs each control value as it is when no conflict occurs in the control of the air conditioner 4, the control unit 21 may create learning data 223 in which the control values obtained when step S204 found no conflict serve as both the input data and the teacher data.

To construct a third neural network 7 that corrects the control values obtained from the neural networks (5, 6) even when no conflict occurs in the control of the air conditioner 4, the control unit 21 may execute steps S205 and S206 even when step S204 finds no conflict.

In this case, the control unit 21 may accept input from the operator and determine the correction value so as to obtain a corrected control value that suits the preferences of both users A and B. This makes it possible to create learning data 223 usable for constructing a third neural network 7 that, given the control values obtained from the neural networks (5, 6), outputs a corrected control value suited to the preferences of both users A and B.

[Learning device]
Next, an operation example of the learning device 3 will be described with reference to FIG. 10. FIG. 10 is a flowchart illustrating an example of the processing procedure of the learning device 3. The processing procedure described below is only an example, and each process may be changed to the extent possible. Steps of the procedure may also be omitted, replaced, or added as appropriate for the embodiment.

(Step S301)
In step S301, the control unit 31 functions as the learning data acquisition unit 311 and acquires the learning data 223 created by the data collection control device 2.

The method of transferring the learning data 223 created by the data collection control device 2 to the learning device 3 may be selected as appropriate for the embodiment. For example, when the learning device 3 and the data collection control device 2 are connected via a network, the control unit 31 can acquire the learning data 223 by accessing the data collection control device 2 over the network. The learning data 223 created by the data collection control device 2 may also be stored in another information processing device (storage device) such as a NAS (Network Attached Storage); in this case, the control unit 31 can acquire the learning data 223 by accessing that device. The learning data 223 may also be stored in the storage medium 91; in this case, the control unit 31 can acquire the learning data 223 from the storage medium 91 via the drive 36. The number of items of learning data 223 acquired in step S301 may be determined as appropriate for the embodiment so that the neural network 8 for learning can be trained.

(Step S302)
In the next step S302, the control unit 31 functions as the learning processing unit 312 and uses the learning data 223 acquired in step S301 to train the neural network 8 for learning so that, when the control values obtained from the neural networks (5, 6) are input, it outputs a control value corrected so as to resolve the conflict.

Specifically, the control unit 31 first prepares the neural network 8 for learning that is to be trained. The configuration of the prepared neural network 8, the initial values of the weights of the connections between neurons, and the initial value of the threshold of each neuron may be given by a template or by operator input. When retraining, the control unit 31 may prepare the neural network 8 for learning based on the conflict resolution learning result data 124 to be retrained.

Next, the control unit 31 trains the neural network 8 using the control values obtained from the neural networks (5, 6), which are included in the learning data 223 acquired in step S301, as the input data and the corrected control value as the teacher data. Gradient descent, stochastic gradient descent, or the like may be used for this training.

For example, the control unit 31 inputs the control values obtained from the neural networks (5, 6), which are included in the learning data 223, to the input layer 81 and performs the forward propagation computation of the neural network 8 for learning. The control unit 31 thereby obtains an output value from the output layer 83 of the neural network 8 for learning. Next, the control unit 31 calculates the error between the output value from the output layer 83 and the corrected control value included in the learning data 223. The control unit 31 then uses the error of the output value to calculate, by error backpropagation, the error of each connection weight between neurons and of each neuron threshold. Finally, the control unit 31 updates the value of each connection weight and each neuron threshold based on the calculated errors.

The control unit 31 trains the neural network 8 by repeating this series of processes for each item of learning data 223 until the output value from the output layer 83 matches the corresponding corrected control value. This makes it possible to construct a neural network 8 that, when the control values obtained from the neural networks (5, 6) are input, outputs a control value corrected so as to resolve the conflict.
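The forward computation, error calculation, error backpropagation, and update of weights and thresholds described for step S302 can be sketched in plain Python. This is an illustrative sketch rather than the training procedure of the embodiment: the network size, the tanh hidden units with a linear output neuron, the learning rate, the fixed epoch count used in place of "until the outputs match", and the normalized sample values are all assumptions.

```python
import math
import random


def init_network(n_in, n_hidden, seed=0):
    """Create small random weights and zero thresholds (hypothetical template)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "t1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "t2": 0.0,
    }


def predict(net, x):
    """Forward propagation: returns (hidden activations, output value)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) - t)
              for row, t in zip(net["w1"], net["t1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) - net["t2"]
    return hidden, output


def mean_squared_error(net, samples):
    return sum((predict(net, x)[1] - y) ** 2 for x, y in samples) / len(samples)


def train(net, samples, lr=0.1, epochs=500, seed=0):
    """Stochastic gradient descent with error backpropagation."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            hidden, output = predict(net, x)
            err = output - y  # error against the teacher data
            # Errors propagated back to the hidden neurons (tanh derivative),
            # computed before the output weights are changed.
            deltas = [err * w * (1.0 - h * h) for w, h in zip(net["w2"], hidden)]
            for j, h in enumerate(hidden):
                net["w2"][j] -= lr * err * h
            net["t2"] += lr * err  # thresholds enter with a minus sign
            for j, d in enumerate(deltas):
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * d * xi
                net["t1"][j] += lr * d
    return net


# Normalized control values from networks 5 and 6, with averaged teacher data.
samples = [((0.2, 0.6), 0.4), ((0.1, 0.9), 0.5), ((0.7, 0.3), 0.5), ((0.8, 0.4), 0.6)]
net = init_network(n_in=2, n_hidden=4)
print("error before:", mean_squared_error(net, samples))
train(net, samples)
print("error after:", mean_squared_error(net, samples))
```

In practice an exact match between output and teacher data is rarely reached, so a sketch like this stops after a fixed number of epochs; stopping once the error falls below a chosen tolerance is an equally reasonable choice.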

(Step S303)
In the next step S303, the control unit 31 functions as the learning processing unit 312 and stores information indicating the configuration of the constructed neural network 8, the weights of the connections between neurons, and the threshold of each neuron in the storage unit 32 as the conflict resolution learning result data 124. The control unit 31 thereby ends the learning process according to this operation example.
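The learning result data stored in step S303 holds the network configuration, the connection weights, and the neuron thresholds. A minimal sketch of such a record is shown below; the JSON layout and the function names `dump_learning_result` and `load_learning_result` are assumptions for illustration, since the description does not define a storage format.

```python
import json


def dump_learning_result(layers):
    """Serialize a network given as a list of (weights, thresholds) per layer."""
    payload = {
        # Configuration: number of neurons in each layer after the input layer.
        "structure": [len(thresholds) for _, thresholds in layers],
        "layers": [{"weights": w, "thresholds": t} for w, t in layers],
    }
    return json.dumps(payload)


def load_learning_result(text):
    """Restore the (weights, thresholds) list, e.g. during initial setting."""
    payload = json.loads(text)
    return [(layer["weights"], layer["thresholds"]) for layer in payload["layers"]]


layers = [([[0.5, 0.5], [0.25, -0.25]], [0.0, 0.1]), ([[1.0, -1.0]], [0.2])]
text = dump_learning_result(layers)
print(load_learning_result(text) == layers)
```

A device that receives such a record can restore the network at start-up by calling the loading function on the stored text.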

When the learning device 3 and the control device 1 can be connected via a network, the control unit 31 may transfer the created conflict resolution learning result data 124 to the control device 1 as appropriate after step S303 is completed. The control unit 31 may also update the conflict resolution learning result data 124 periodically or irregularly by executing the learning process of steps S301 to S303 periodically or irregularly. By transferring the created conflict resolution learning result data 124 to the control device 1 each time the learning process is executed, the control unit 31 may then periodically or irregularly update the conflict resolution learning result data 124 held by the control device 1.

[Action and effect]
As described above, in the present embodiment the neural networks (5, 6) make it possible to control the operation of the air conditioner 4 so as to suit the preferences of the users (A, B). However, a conflict can occur in the control of the air conditioner 4, for example when the two users give contradictory instructions. In the present embodiment, the third neural network 7 used in step S103 can correct the control values obtained from the neural networks (5, 6) so as to resolve that conflict. Therefore, according to the present embodiment, the air conditioner 4 can be kept from becoming inoperable even when a conflict occurs between the controls for the users (A, B).

§4 Modifications
Although the embodiments of the present invention have been described in detail above, the foregoing description is in all respects merely illustrative of the present invention. Needless to say, various improvements and modifications can be made without departing from the scope of the present invention. For example, the following changes are possible. In the following, the same reference numerals are used for components that are the same as in the above embodiment, and description of points that are the same as in the above embodiment is omitted as appropriate. The following modifications may be combined as appropriate.

<4.1>
In the above embodiment, an air conditioner is given as an example of the device controlled by the control device 1. However, the type of device to be controlled need not be limited to an air conditioner and may be selected as appropriate for the embodiment. The device to be controlled may be, for example, a robot device.

In the above embodiment, the first neural network 5 and the second neural network 6 control the same device (the air conditioner 4). However, the device controlled by the first neural network 5 and the device controlled by the second neural network 6 may be different.

For example, the first neural network 5 may control a first robot device as the first device to be controlled, and the second neural network 6 may control a second robot device, different from the first robot device, as the second device to be controlled. In this case, a conflict can occur between the controls of the two robot devices when, for example, control commands are issued that move the first robot device and the second robot device to the same position at the same time.
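For the robot example above, a conflict check could compare the commanded arrival time and position of the two robot devices. The sketch below is only an illustration: the command format `(time_step, x, y)`, the name `robots_conflict`, and the minimum separation distance are assumptions that do not appear in the description.

```python
import math


def robots_conflict(cmd_a, cmd_b, min_distance=0.5):
    """Return True when both robots are sent to (nearly) the same place
    at the same time step.

    cmd_a, cmd_b: hypothetical commands of the form (time_step, x, y).
    min_distance: hypothetical separation below which positions count as equal.
    """
    same_time = cmd_a[0] == cmd_b[0]
    distance = math.hypot(cmd_a[1] - cmd_b[1], cmd_a[2] - cmd_b[2])
    return same_time and distance < min_distance


print(robots_conflict((3, 1.0, 2.0), (3, 1.0, 2.0)))  # same time, same place
print(robots_conflict((3, 1.0, 2.0), (4, 1.0, 2.0)))  # same place, later time
```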

In the above embodiment, two learners (the first neural network 5 and the second neural network 6) are used as the learners that issue control values for controlling the operation of the device to be controlled. However, the number of such learners need not be limited to two and may be three or more.

<4.2>
In the specific hardware configuration of each of the control device 1, the data collection control device 2, and the learning device 3, components may be omitted, replaced, or added as appropriate for the embodiment. For example, the control unit 11 may include a plurality of processors. The control device 1 and the data collection control device 2 may include a communication interface and be configured to exchange data with other information processing devices via a network. Each of the control device 1, the data collection control device 2, and the learning device 3 may be composed of a plurality of computers.

In addition to an information processing device designed exclusively for the service provided, such as an ECU (Electronic Control Unit), a general-purpose desktop PC, tablet PC, mobile phone, or the like may be used as each of the control device 1 and the data collection control device 2, as appropriate for the control target device to be controlled. Likewise, in addition to an information processing device designed exclusively for the service provided, a general-purpose server device, desktop PC, or the like may be used as the learning device 3.

<4.3>
In the above embodiment, as shown in FIGS. 5 to 7, a general feedforward neural network having a multi-layer structure is used as each of the neural networks 5 to 8. However, the type of each of the neural networks 5 to 8 need not be limited to this example and may be selected as appropriate according to the embodiment. For example, when images are used as input data, a convolutional neural network including convolutional layers and pooling layers may be used for each of the neural networks 5 to 8. Further, for example, when time-series data is used as input data, a recurrent neural network having connections that loop back from the output side to the input side, such as from an intermediate layer to the input layer, may be used for each of the neural networks 5 to 8. The number of layers of each of the neural networks 5 to 8, the number of neurons in each layer, the connections between neurons, and the transfer function of each neuron may be determined as appropriate according to the embodiment.

<4.4>
In step S103 above, the control values acquired from the neural networks (5, 6) in step S102 are input to the third neural network 7 without distinguishing whether or not they cause a conflict in the control of the air conditioner 4. However, the processing procedure of the control device 1 need not be limited to this example. The control unit 11 may input the control values acquired in step S102 to the third neural network 7 only when those control values cause a conflict in the control of the air conditioner 4. In this case, when the control values acquired from the neural networks (5, 6) in step S102 do not cause a conflict in the control of the air conditioner 4, the control unit 11 may omit step S103 and execute the processing of the next step S104, thereby using the control values acquired from the neural networks (5, 6) as they are for the control of the air conditioner 4.
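The conditional flow described above, in which step S103 is skipped when no conflict arises, might be sketched as follows. This is only an illustration under stated assumptions: `values_conflict`, its `tolerance` threshold, and the `resolver` callable standing in for the third neural network 7 are hypothetical and do not appear in the embodiment.

```python
from typing import Callable, Tuple

ControlValues = Tuple[float, float]


def values_conflict(value_1: float, value_2: float, tolerance: float = 0.5) -> bool:
    """Assumed conflict test: the two control values differ by more than a tolerance."""
    return abs(value_1 - value_2) > tolerance


def apply_with_optional_resolution(
    value_1: float,
    value_2: float,
    resolver: Callable[[float, float], ControlValues],
    tolerance: float = 0.5,
) -> ControlValues:
    """Call the resolver (stand-in for step S103) only when a conflict exists;
    otherwise pass the control values through unchanged to the next step."""
    if values_conflict(value_1, value_2, tolerance):
        return resolver(value_1, value_2)
    return (value_1, value_2)
```

With this arrangement, the resolver is evaluated only for conflicting inputs, which may reduce the computation performed when the two learners already agree.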

In step S103 above, the third neural network 7 is used to resolve the conflict that arises in the control of the air conditioner 4. However, the method of resolving such a conflict need not be limited to this example. The conflict arising in the control of the air conditioner 4 may be resolved without using a neural network.

FIG. 11 schematically illustrates a control device 1A according to this modification. The control device 1A is configured in the same manner as the control device 1, except that it does not hold the conflict resolution learning result data 124 and includes a conflict resolution unit 113A that does not use a neural network. In this case, in step S103, the control unit 11 functions as the conflict resolution unit 113A and determines whether or not the control values acquired from the neural networks (5, 6) cause a conflict in the control of the air conditioner 4. For example, the control unit 11 determines whether or not a conflict arises in the control of the air conditioner 4 by simulating the operation of the air conditioner 4.

When it determines that no conflict arises in the control of the air conditioner 4, the control unit 11 controls the operation of the air conditioner 4 based on the control values acquired from the neural networks (5, 6). On the other hand, when it determines that a conflict arises in the control of the air conditioner 4, the control unit 11 modifies the control values acquired from the neural networks (5, 6) so as to eliminate the conflict, and controls the operation of the air conditioner 4 based on the modified control values.

As in step S205 above, the method of modifying the control values may be selected as appropriate according to the embodiment. For example, the control unit 11 may determine the modified control value by giving priority to one of the control values acquired from the neural networks (5, 6). Alternatively, for example, the control unit 11 may determine the modified control value by averaging the control values acquired from the neural networks (5, 6).
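The two modification methods mentioned above, prioritization and averaging, can be sketched as below. The function names and the `strategy` parameter are hypothetical, and the control values are assumed to be scalar quantities such as temperature setpoints.

```python
def resolve_by_priority(value_1: float, value_2: float, prefer_first: bool = True) -> float:
    """Give priority to one of the two control values."""
    return value_1 if prefer_first else value_2


def resolve_by_average(value_1: float, value_2: float) -> float:
    """Average the two control values."""
    return (value_1 + value_2) / 2.0


def modify_control_value(value_1: float, value_2: float, strategy: str = "average") -> float:
    """Select a modification method by name (an assumed, illustrative interface)."""
    if strategy == "average":
        return resolve_by_average(value_1, value_2)
    if strategy == "priority":
        return resolve_by_priority(value_1, value_2)
    raise ValueError(f"unknown strategy: {strategy}")
```

For instance, given assumed setpoints of 26.0 and 22.0, averaging yields 24.0, whereas prioritizing the first value yields 26.0.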

<4.5>
In the above embodiment, the control unit 11 inputs the control values acquired from the neural networks (5, 6) to the third neural network 7 without identifying what kind of conflict they cause in the control of the air conditioner 4. However, the processing procedure of the control device 1 need not be limited to this example, and the control unit 11 may identify what kind of conflict the control values acquired from the neural networks (5, 6) cause in the control of the air conditioner 4.

FIG. 12 schematically illustrates a control device 1B according to this modification. The control device 1B is configured in the same manner as the control device 1, except that it includes a conflict type identification unit 114 that identifies, based on the control values acquired from the neural networks (5, 6), conflict type information 125 indicating how the control of the air conditioner 4 conflicts, and that it uses the identified conflict type information 125 as an input to the third neural network 7.

In this case, before executing step S103, the control unit 11 identifies, based on the control values acquired from the neural networks (5, 6), the conflict type information 125 indicating how the control of the air conditioner 4 conflicts. The types (manners) of conflict may be set as appropriate according to the embodiment. Then, in step S103, the control unit 11 inputs the control values acquired from the neural networks (5, 6) and the conflict type information 125 to the input layer 71 of the third neural network 7.

As a result, the control device 1B can reliably change the method of modifying the control values according to the type of conflict. For example, when user A desires a higher room temperature than user B, the third neural network 7 may output the average of the control values acquired from the neural networks (5, 6) as the modified control value. When user B desires a higher room temperature than user A, the third neural network 7 may preferentially output the control value acquired from the neural network 5 as the modified control value.
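The behavior in this example can be mimicked by a hand-written stand-in, shown below purely for illustration. In the modification itself, this mapping is learned by the third neural network 7; the `ConflictType` enumeration, the function names, and the `tolerance` threshold are assumptions, and the control values are assumed to be temperature setpoints for users A and B.

```python
from enum import Enum


class ConflictType(Enum):
    """Hypothetical encoding of the conflict type information."""
    NONE = "none"
    A_WANTS_WARMER = "a_wants_warmer"
    B_WANTS_WARMER = "b_wants_warmer"


def identify_conflict_type(value_a: float, value_b: float, tolerance: float = 0.5) -> ConflictType:
    """Classify how the setpoints requested for users A and B conflict."""
    if abs(value_a - value_b) <= tolerance:
        return ConflictType.NONE
    return ConflictType.A_WANTS_WARMER if value_a > value_b else ConflictType.B_WANTS_WARMER


def resolve_by_type(value_a: float, value_b: float, conflict_type: ConflictType) -> float:
    """Pick the modification method according to the conflict type."""
    if conflict_type is ConflictType.A_WANTS_WARMER:
        return (value_a + value_b) / 2.0  # average the two control values
    # B_WANTS_WARMER: prioritize the value from the first network (user A).
    # NONE: the values already agree within tolerance, so user A's value is used as-is.
    return value_a
```

With assumed setpoints of 26.0 for user A and 22.0 for user B, the sketch returns the average 24.0; with the setpoints swapped, it returns user A's value 22.0.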

As illustrated in FIG. 13, a learner such as a neural network may be used to identify how the control of the air conditioner 4 conflicts. FIG. 13 schematically illustrates a control device 1C according to this modification. The control device 1C is configured in the same manner as the control device 1B, except that it includes a conflict type identification unit 114C that identifies the conflict type information 125 by using a fourth neural network 115. The fourth neural network 115 has been trained to output an output value corresponding to the conflict type information 125 when the control values output from the neural networks (5, 6) are input. The fourth neural network 115 may be configured, for example, in the same manner as each of the neural networks 5 to 7. In the above processing procedure, when the system is started, the control unit 11 refers to the learning result data 126 and sets the structure of the fourth neural network 115, the weights of the connections between neurons, and the threshold of each neuron. This makes it possible to classify the conflicts arising in the control of the air conditioner 4 in complex ways and to adopt a resolution method appropriate to each class.

<4.6>
In the above embodiment (and modifications), each learner is constituted by a neural network. However, the type of each learner need not be limited to a neural network and may be selected as appropriate according to the embodiment. For example, a support vector machine, a self-organizing map, a learner that learns by reinforcement learning, or the like may be used as each learner.

<4.7>
A learning device for creating each of the neural networks (5, 6) may also be prepared. For example, by changing the learning data used for machine learning from the learning data 223 to learning data for learning control suited to each user (A, B), the learning device 3 can create the trained neural networks (5, 6). The learning data for learning control suited to each user (A, B) can be created by combining various information acquired from each user (A, B), which serves as input data, with the original control values suited to the preferences of each user (A, B), which serve as teacher data. Using such learning data, the learning device can execute the processing of steps S301 to S303 to construct the trained neural networks (5, 6) and create the respective operation control learning result data (122, 123).

Similarly, a learning device for creating the fourth neural network 115 may be prepared. For example, by changing the learning data used for machine learning from the learning data 223 to learning data for learning to identify the type of conflict, the learning device 3 can create the trained fourth neural network 115. The learning data for learning to identify the type of conflict can be created by combining the control values acquired from the neural networks (5, 6), which serve as input data, with output values corresponding to the conflict type information 125, which serve as teacher data. Using such learning data, the learning device can execute the processing of steps S301 to S303 to construct the trained fourth neural network 115 and create the learning result data 126.
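The pairing of input data and teacher data described above might be sketched as follows. The integer label encoding, the `label_conflict_type` rule, and the `tolerance` threshold are hypothetical; in practice the teacher labels would be prepared according to the embodiment rather than by this fixed rule.

```python
from typing import Callable, List, Sequence, Tuple

# One sample: (control values from networks 5 and 6, conflict type label).
Sample = Tuple[Tuple[float, float], int]


def label_conflict_type(value_1: float, value_2: float, tolerance: float = 0.5) -> int:
    """Assumed labeling rule: 0 = no conflict, 1 = first value higher, 2 = second value higher."""
    if abs(value_1 - value_2) <= tolerance:
        return 0
    return 1 if value_1 > value_2 else 2


def build_conflict_type_dataset(
    control_value_pairs: Sequence[Tuple[float, float]],
    labeler: Callable[[float, float], int] = label_conflict_type,
) -> List[Sample]:
    """Combine each pair of control values (input data) with its label (teacher data)."""
    return [((v1, v2), labeler(v1, v2)) for v1, v2 in control_value_pairs]
```

A list produced this way could then be supplied to whatever training procedure is used to construct the fourth learner.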

1, 1A, 1B, 1C ... control device,
11 ... control unit, 12 ... storage unit, 13 ... external interface,
111 ... first control processing unit, 112 ... second control processing unit,
113, 113A ... conflict resolution unit,
114, 114C ... conflict type identification unit,
115 ... fourth neural network,
121 ... control program, 122 ... first operation control learning result data,
123 ... second operation control learning result data,
124 ... conflict resolution learning result data,
125 ... conflict type information, 126 ... learning result data,
2 ... data collection control device,
21 ... control unit, 22 ... storage unit, 23 ... external interface,
211 ... first control processing unit, 212 ... second control processing unit,
213 ... modified value determination unit, 214 ... learning data creation unit,
221 ... data collection control program, 223 ... learning data,
3 ... learning device,
31 ... control device, 32 ... storage unit, 33 ... communication interface,
34 ... input device, 35 ... output device, 36 ... drive,
311 ... learning data acquisition unit, 312 ... learning processing unit,
321 ... learning program,
5 ... first neural network,
51 ... input layer, 52 ... intermediate layer (hidden layer), 53 ... output layer,
6 ... second neural network,
61 ... input layer, 62 ... intermediate layer (hidden layer), 63 ... output layer,
7 ... third neural network,
71 ... input layer, 72 ... intermediate layer (hidden layer), 73 ... output layer,
8 ... neural network for learning,
81 ... input layer, 82 ... intermediate layer (hidden layer), 83 ... output layer,
91 ... storage medium

Claims

A control device comprising:
a first control processing unit that controls the operation of a first control target device based on a control value output from a trained first learner that has been trained to control the operation of the first control target device;
a second control processing unit that controls the operation of a second control target device based on a control value output from a trained second learner that has been trained to control the operation of the second control target device; and
a conflict resolution unit that, when the control of the first control target device based on the control value output from the first learner and the control of the second control target device based on the control value output from the second learner conflict with each other, resolves the conflict by using a trained third learner that has been trained to output, when the control value of the first control target device output from the first learner and the control value of the second control target device output from the second learner are input, a control value of the first control target device and a control value of the second control target device modified so as to eliminate the conflict.

The control device according to claim 1, further comprising a conflict type identification unit that identifies, based on the control value of the first control target device output from the first learner and the control value of the second control target device output from the second learner, conflict type information indicating how the control of the first control target device and the control of the second control target device conflict with each other,
wherein the conflict resolution unit further inputs the identified conflict type information to the third learner.

The control device according to claim 2, wherein the conflict type identification unit identifies the conflict type information by using a trained fourth learner that has been trained to output an output value corresponding to the conflict type information when the control value of the first control target device output from the first learner and the control value of the second control target device output from the second learner are input.

The control device according to claim 3, wherein the first, second, third, and fourth learners are each constituted by a neural network.

The control device according to any one of claims 1 to 4, wherein the conflict resolution unit resolves the conflict by giving priority to one of the control of the first control target device based on the control value output from the first learner and the control of the second control target device based on the control value output from the second learner.

The control device according to any one of claims 1 to 5, wherein the first control target device and the second control target device are the same control target device, and
the conflict resolution unit resolves the conflict by averaging the control value output from the first learner and the control value output from the second learner.

A control program for causing a computer that controls the operation of a first control target device and a second control target device to execute:
a step of acquiring a control value for controlling the first control target device, output from a trained first learner that has been trained to control the operation of the first control target device;
a step of acquiring a control value for controlling the second control target device, output from a trained second learner that has been trained to control the operation of the second control target device;
a step of, when the control of the first control target device based on the control value output from the first learner and the control of the second control target device based on the control value output from the second learner conflict with each other, acquiring a modified control value of the first control target device and a modified control value of the second control target device from a trained third learner that has been trained to output, when the control value of the first control target device output from the first learner and the control value of the second control target device output from the second learner are input, a control value of the first control target device and a control value of the second control target device modified so as to eliminate the conflict; and
a step of controlling the first control target device and the second control target device based on the acquired control values.