JP2014099113A - Electric appliance network system

Info

Publication number: JP2014099113A
Application number: JP2012251659A
Authority: JP
Inventors: Masakazu Maehara; 正和前原
Original assignee: Samsung R&D Institute Japan Co Ltd
Current assignee: Samsung R&D Institute Japan Co Ltd
Priority date: 2012-11-15
Filing date: 2012-11-15
Publication date: 2014-05-29
Also published as: KR20140063392A

Abstract

PROBLEM TO BE SOLVED: To perform learning control of a plurality of electric appliances in an autonomous and distributed manner, to achieve optimal control of the plurality of electric appliances by eliminating the problems associated with sensor extraction, and to make it easy to migrate a per-appliance model to another master appliance.

SOLUTION: A master appliance includes: a plurality of electric appliance agents 30a to 30e corresponding to a plurality of electric appliances; and an agent management part 37 that inputs appliance information obtained from the plurality of electric appliances to each of the electric appliance agents 30a to 30e, outputs the control commands obtained from the electric appliance agents 30a to 30e to the plurality of electric appliances, calculates a reward from the state change quantity obtained from the plurality of electric appliances as a result, and updates the value function of each of the electric appliance agents 30a to 30e using the reward as a parameter.

Description

The present invention relates to a home appliance network system in which a plurality of home appliances and a master device that controls them are connected via a communication network, and in which the master device performs learning control of the plurality of home appliances.

As a conventional home appliance network system, Patent Document 1 describes an intelligent house equipped with a plurality of sensors that has at least one home appliance and a device control apparatus (agent) that controls the home appliance. Specifically, for the control of a given home appliance, the device control apparatus identifies the sensors that correlate strongly with that appliance, uses a neural network to learn the control with respect to the identified sensors, and thereby optimizes the control of the appliance.

However, to perform optimal control, the home appliance network system of Patent Document 1 identifies the sensors that correlate strongly with a given home appliance and learns the optimal operation of the appliance with respect to those sensors. Strongly correlated sensors must therefore be extracted for each home appliance, and the extraction requires time-series data to be stored and processed, which is inefficient.

Moreover, in an environment crowded with a wide variety of home appliances, it is difficult to extract the strongly correlated sensors for each of the appliances. Furthermore, every sensor may have a non-negligible correlation with each home appliance, so merely extracting the strongly correlated sensors and controlling on that basis makes optimal control of the appliances difficult and may bias the control toward specific sensors.

In addition, the device control apparatus is configured as a single agent that learns the control of the home appliances in a centralized manner. It is therefore difficult to migrate a per-appliance model to another device control apparatus, and fault tolerance is poor.

Patent Document 1: International Publication No. WO 2005/083531

The present invention has been made to solve the above problems. Its intended objects are to perform learning control of a plurality of home appliances in an autonomous and distributed manner, to enable optimal control of the plurality of home appliances by eliminating the problems associated with sensor extraction, and to make it easy to migrate a per-appliance model to another master device.

That is, the home appliance network system according to the present invention has a master device that is connected via a communication network to a plurality of home appliances, each having an arbitrary standardized protocol, and that controls the plurality of home appliances. The master device includes a plurality of home appliance agents respectively corresponding to the plurality of home appliances, and an agent management unit. The agent management unit inputs device information obtained from the plurality of home appliances to each of the home appliance agents, outputs the control commands obtained from the home appliance agents to the plurality of home appliances, calculates a reward from the state change amount obtained from the plurality of home appliances as a result, and updates the value functions of the home appliance agents using the reward as a parameter.

With this configuration, the master device has a plurality of home appliance agents respectively corresponding to the plurality of home appliances, and the agent management unit generates control commands by inputting device information obtained from the plurality of home appliances to each home appliance agent. The plurality of home appliances can therefore be learned and controlled in an autonomous and distributed manner, the problems associated with sensor extraction are eliminated, and optimal control of the plurality of home appliances becomes possible.

In addition, because the agent management unit calculates a reward from the state change amount obtained from the plurality of home appliances and updates the value functions of the home appliance agents using the reward as a parameter, learning is possible even in an unknown environment, and each home appliance can be made to operate optimally.

Furthermore, because the master device has a plurality of home appliance agents respectively corresponding to the plurality of home appliances, it is relatively easy to move some of the agents to another master device, and fault tolerance is excellent.

The system desirably has an agent generation unit that generates the plurality of home appliance agents respectively corresponding to the plurality of home appliances. With this unit, even when more home appliances are connected via the communication network, home appliance agents corresponding to the added appliances can be generated automatically.

The agent management unit desirably calculates the reward using a power consumption difference as the state change amount obtained from the plurality of home appliances and, using the reward as a parameter, updates the value functions of the home appliance agents so as to minimize the power consumption of the plurality of home appliances.

The master device is desirably itself a home appliance. In this case, there is no need to provide a separate control apparatus for device control.

The home appliance control program according to the present invention is used in a home appliance network system having a master device that is connected via a communication network to a plurality of home appliances, each having an arbitrary standardized protocol, and that controls the plurality of home appliances. The program causes the master device to function as a plurality of home appliance agents respectively corresponding to the plurality of home appliances, and as an agent management unit that inputs device information obtained from the plurality of home appliances to each of the home appliance agents, outputs the control commands obtained from the home appliance agents to the plurality of home appliances, calculates a reward from the state change amount obtained from the plurality of home appliances as a result, and updates the value functions of the home appliance agents using the reward as a parameter.

According to the present invention configured as described above, a plurality of home appliances can be learned and controlled in an autonomous and distributed manner, optimal control of the plurality of home appliances is made possible by eliminating the problems associated with sensor extraction, and a per-appliance model can easily be migrated to another master device.

FIG. 1 is a schematic diagram showing a home appliance network system according to an embodiment of the present invention.
FIG. 2 is a schematic diagram showing information transmission from the plurality of home appliances to the master device in the embodiment.
FIG. 3 is a schematic diagram showing control of the plurality of home appliances by the master device in the embodiment.
FIG. 4 is a functional block diagram of the master device in the embodiment.
FIG. 5 is a schematic diagram showing the inputs to and outputs from each home appliance agent in the embodiment.
FIG. 6 is a flowchart showing the control procedure of the master device in the embodiment.
FIG. 7 is a schematic diagram showing a simple model of the embodiment.
FIG. 8 is a diagram showing the operation when state change information is acquired in the model of FIG. 7.
FIG. 9 is a diagram showing the operation when each home appliance is controlled in the model of FIG. 7.
FIG. 10 is a diagram showing the operation when power consumption information is acquired from each home appliance in the model of FIG. 7.

An embodiment of the home appliance network system according to the present invention is described below with reference to the drawings.

As shown in FIGS. 1 to 3, the home appliance network system 100 according to the present embodiment has a master device 3 that is connected to a plurality of home appliances 2a to 2e via a communication network NT and that controls the plurality of home appliances 2a to 2e.

Each of the home appliances 2a to 2e has an arbitrary standardized protocol, for example a communication protocol such as ECHONET, ZigBee, or UPnP. In the present embodiment, the home appliances 2a to 2e are a refrigerator 2a, a BD (Blu-ray Disc) recorder 2b, an air conditioner 2c, a washing machine 2d, a microwave oven 2e, and the like. Other home appliances, such as a television, a fan heater, an air purifier, or a lighting device, may also be included.

To communicate with the home appliances 2a to 2e connected via the communication network NT, the master device 3 has, as shown in FIG. 3, a control function for the protocol of each of the home appliances 2a to 2e (for example, ECHONET, ZigBee, or UPnP). The master device 3 of the present embodiment is itself a home appliance, such as a television or a recorder, and is a computer having a CPU, a memory, a communication interface, and the like. The CPU and its peripheral devices operate according to a program stored in a predetermined area of the memory, so that the master device 3 functions, as shown in FIG. 4, as a communication protocol reception unit 31, a communication protocol transmission unit 32, an input conversion unit 33, an output conversion unit 34, a protocol analysis unit 35, an agent generation unit 36, an agent management unit 37, and the like.

The communication protocol reception unit 31 receives input protocols Xa to Xe from the respective home appliances 2a to 2e, and the communication protocol transmission unit 32 transmits output protocols Ya to Ye to the respective home appliances 2a to 2e.

The input conversion unit 33 uses the protocol analysis unit 35 to convert the input protocols Xa to Xe received by the communication protocol reception unit 31 into agent input values. The output conversion unit 34 uses the protocol analysis unit 35 to convert output values, such as the control commands described later, into the output protocols Ya to Ye and outputs them to the communication protocol transmission unit 32.

The protocol analysis unit 35 analyzes the input protocols Xa to Xe and converts them into agent input values X1a to X1e. It also analyzes output values, such as control commands, and converts them into the output protocols Ya to Ye.
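As an illustration of the conversion performed by the protocol analysis unit 35, the following minimal sketch parses a toy text frame into an agent input value and wraps a control command in an output frame. The frame layout (`protocol|device|kind|name|value`), the `AgentInput` type, and the function names are assumptions made for this example; they are not the ECHONET, ZigBee, or UPnP wire formats, which the description does not detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentInput:
    """Protocol-independent value handed to the agents (hypothetical type)."""
    device_id: str  # e.g. "2a" for the refrigerator
    kind: str       # "profile" or "state_change"
    name: str       # property name, e.g. "door_open"
    value: str      # raw property value as text


def analyze_input(frame: str) -> AgentInput:
    """Parse a toy frame 'protocol|device|kind|name|value' into an AgentInput."""
    _protocol, device_id, kind, name, value = frame.split("|")
    if kind not in ("profile", "state_change"):
        raise ValueError(f"unknown message kind: {kind}")
    return AgentInput(device_id, kind, name, value)


def build_output(protocol: str, device_id: str, command: str) -> str:
    """Wrap a control command in the toy frame format for transmission."""
    return f"{protocol}|{device_id}|command|{command}"


# Input protocol Xa -> agent input value X1a; control command -> output protocol Ya.
x1a = analyze_input("echonet|2a|state_change|door_open|1")
ya = build_output("echonet", x1a.device_id, "boost_cooling")
print(x1a)
print(ya)
```

Keeping the agents behind a protocol-independent value of this kind is what would let appliances with different protocols feed the same set of agents.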

The agent generation unit 36 generates the home appliance agents 30a to 30e, which are analysis models respectively corresponding to the home appliances 2a to 2e, in a virtual space set up in the internal memory of the master device 3.

The agent management unit 37 inputs, to each of the home appliance agents 30a to 30e, the agent input values X1a to X1e indicating the device information (for example, state change amounts) obtained from the home appliances 2a to 2e (see FIG. 5). It outputs the control commands Y1a to Y1e obtained from the home appliance agents 30a to 30e to the home appliances 2a to 2e, calculates a reward from the state change amount obtained from the home appliances 2a to 2e as a result, and updates the value functions of the home appliance agents 30a to 30e using the reward as a parameter. In this way, the agent management unit 37 performs learning control of the home appliances 2a to 2e using reinforcement learning.
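The description does not fix a particular value function or update rule. As one common possibility, assuming (this is not stated in the text) that each home appliance agent $i$ keeps an action-value function $Q_i$ and is updated by Q-learning, the update performed after a reward $r$ is observed could be written as:

```latex
Q_i(s, a_i) \leftarrow Q_i(s, a_i)
  + \alpha \left[ r + \gamma \max_{a'} Q_i(s', a') - Q_i(s, a_i) \right]
```

Here $s$ stands for the agent input values X1a to X1e, $a_i$ for the control command output by agent $i$, $s'$ for the agent input values observed afterwards, $\alpha$ for a learning rate, and $\gamma$ for a discount factor. All of these symbols are introduced for illustration only.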

As for the details of the reinforcement learning method (the details of the value function), a learning method that can be applied to continuous state and action spaces is desirable.

The agent management unit 37 also calculates the reward using a power consumption difference as the state change amount obtained from the home appliances 2a to 2e and, using the reward as a parameter, updates the value functions of the home appliance agents 30a to 30e so as to minimize the power consumption of the home appliances 2a to 2e.
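A reward based on a power consumption difference can be sketched as follows. The sign convention (a positive reward when total consumption drops), the use of the total across appliances, and the readings themselves are assumptions for illustration; the description only says that the difference is used to compute the reward.

```python
from typing import Mapping


def total_consumption(readings: Mapping[str, float]) -> float:
    """Sum the reported consumption of all appliances (same unit for all)."""
    return sum(readings.values())


def power_reward(previous_total: float, current_total: float) -> float:
    """Reward equals the reduction in consumption; negative if usage rose."""
    return previous_total - current_total


# Hypothetical readings before and after a round of control commands.
before = {"2a": 120.0, "2c": 800.0}
after = {"2a": 125.0, "2c": 700.0}

reward = power_reward(total_consumption(before), total_consumption(after))
print(reward)  # 95.0
```

With this convention, maximizing the accumulated reward corresponds to minimizing power consumption, which matches the stated aim.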

The procedure by which the master device 3 controls the home appliances 2a to 2e is described below, with particular reference to FIG. 6.

First, the communication protocol reception unit 31 of the master device 3 receives the input protocols Xa to Xe from the home appliances 2a to 2e, which are slave devices (step S1; see FIG. 2). The input protocols Xa to Xe received by the communication protocol reception unit 31 are passed to the input conversion unit 33.

The input conversion unit 33 then uses the protocol analysis unit 35 to analyze the input protocols Xa to Xe and obtain the agent input values X1a to X1e (step S2). The input conversion unit 33 determines whether the agent input values X1a to X1e are profile information of the home appliances 2a to 2e or state change information relating to a change in device state (step S3). If they are profile information, the input conversion unit 33 sends the agent input values X1a to X1e to the agent generation unit 36 (step S4).

On receiving agent input values X1a to X1e that indicate profile information, the agent generation unit 36 newly generates the agents 30a to 30e, which are virtual models of the home appliances 2a to 2e, based on the profile information (step S5). When the home appliances 2a to 2e are connected to the master device 3 via the communication network NT, the agent generation unit 36 thus automatically generates the home appliance agents 30a to 30e respectively corresponding to the home appliances 2a to 2e. Alternatively, the generation may be performed after the home appliances 2a to 2e have been connected to the master device 3 and a control start signal input by the user has been received.

If the agent generation unit 36 receives agent input values X1a to X1e indicating profile information for home appliances 2a to 2e whose agents 30a to 30e have already been generated, it updates the information held by the home appliance agents 30a to 30e based on the profile information (step S5).
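The create-or-update behavior of step S5 can be sketched as below. The class names, the dictionary-based profile, and the choice to keep learned values when a profile is updated are assumptions made for this example, not details given in the description.

```python
from dataclasses import dataclass, field


@dataclass
class ApplianceAgent:
    """Virtual model of one appliance (hypothetical structure)."""
    device_id: str
    profile: dict
    # Learned values; assumed here to be kept when the profile changes.
    q_table: dict = field(default_factory=dict)


class AgentGenerator:
    """Creates an agent on first profile message, updates it afterwards."""

    def __init__(self) -> None:
        self.agents = {}  # device_id -> ApplianceAgent

    def on_profile(self, device_id: str, profile: dict) -> ApplianceAgent:
        agent = self.agents.get(device_id)
        if agent is None:
            # First profile message from this appliance: generate a new agent.
            agent = ApplianceAgent(device_id, dict(profile))
            self.agents[device_id] = agent
        else:
            # Agent already exists: only refresh its profile information.
            agent.profile.update(profile)
        return agent


generator = AgentGenerator()
first = generator.on_profile("2a", {"type": "refrigerator"})
second = generator.on_profile("2a", {"firmware": "1.1"})
print(first is second, first.profile)
```

Because agents are keyed by appliance, connecting an additional appliance only adds one entry and leaves the existing agents untouched.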

After the home appliance agents 30a to 30e for all of the home appliances 2a to 2e have been generated in this way, the master device 3 waits for state change information (input protocols Xa to Xe) from the home appliances 2a to 2e.

If, on the other hand, the agent input values X1a to X1e obtained by the input conversion unit 33 are state change information, the input conversion unit 33 sends the agent input values X1a to X1e to the agent management unit 37 (step S6).

On receiving agent input values X1a to X1e that indicate state change information, the agent management unit 37 determines whether the state change information (agent input value) is the optimization element (power consumption in the present embodiment) for an action taken by the agents 30a to 30e (step S7). If it is, the agent management unit 37 determines a reward value that becomes larger the closer the optimization element is to its target value, gives this reward value to the agents 30a to 30e, and updates the evaluation functions (value functions) of the agents 30a to 30e (step S8).

If the state change information (agent input value) is any other element, that is, an element other than the optimization element (power consumption), it is input to all of the home appliance agents 30a to 30e as a simple state change, and the optimal actions (control commands Y1a to Y1e) are obtained from the value functions of the respective home appliance agents 30a to 30e (step S9). The optimal actions (control commands Y1a to Y1e) are then sent to the output conversion unit 34.

On receiving the optimal actions (control commands Y1a to Y1e), the output conversion unit 34 uses the protocol analysis unit 35 to convert them into the output protocols Ya to Ye indicating the optimal action for each of the home appliances 2a to 2e, and sends these to the communication protocol transmission unit 32 (step S10).

On receiving the output protocols Ya to Ye, the communication protocol transmission unit 32 transmits each of them to the corresponding one of the home appliances 2a to 2e (step S11).
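The branching in steps S3 and S7 of the procedure above can be summarized in a small routing function. The dictionary keys (`kind`, `name`) and the returned labels are hypothetical names chosen for this sketch; only the three-way split itself comes from the described flow.

```python
def route(agent_input: dict, optimization_element: str = "power_consumption") -> str:
    """Decide which unit handles an agent input value (steps S3 and S7)."""
    if agent_input["kind"] == "profile":
        # S4-S5: the agent generation unit creates or updates an agent.
        return "agent_generation"
    if agent_input["name"] == optimization_element:
        # S7-S8: the input is the optimization element, so compute a reward
        # and update the agents' value functions.
        return "update_value_functions"
    # S9-S11: any other state change is fed to all agents to select commands.
    return "select_control_commands"


messages = [
    {"kind": "profile", "name": "device_type", "value": "refrigerator"},
    {"kind": "state_change", "name": "power_consumption", "value": 120.0},
    {"kind": "state_change", "name": "door_open", "value": 1},
]
routes = [route(m) for m in messages]
print(routes)
```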

Next, as a simple model, the main part of the learning control performed by the master device 3 is described with reference to FIGS. 7 to 10 for the case in which the master device 3 controls the refrigerator 2a and the air conditioner 2c as slave devices.

When an arbitrary state change (operation) occurs in the refrigerator 2a, which is a slave device, the state change is transmitted to the master device 3 via ECHONET, as shown in FIG. 8. On receiving it, the master device 3 inputs the state change information indicating the state change not only to the refrigerator agent 30a but also to the air conditioner agent 30c.

Then, as shown in FIG. 9, the refrigerator agent 30a and the air conditioner agent 30c, having received the state change information, each yield a control command indicating the optimal action. The master device 3 transmits the control command obtained from the refrigerator agent 30a to the refrigerator 2a via ECHONET to control it, and transmits the control command obtained from the air conditioner agent 30c to the air conditioner 2c via ECHONET to control it.

Next, as shown in FIG. 10, once the refrigerator 2a has operated under this control, the resulting power consumption information of the refrigerator 2a is transmitted to the master device 3 via ECHONET. Likewise, once the air conditioner 2c has operated under this control, the resulting power consumption information of the air conditioner 2c is transmitted to the master device 3 via ECHONET. On acquiring the power consumption information, the master device 3 calculates a reward value using the power consumption difference obtained from it, and updates the value functions for the refrigerator 2a and the air conditioner 2c (held by the agents 30a and 30c) using the reward value as a parameter. Through this control, the refrigerator 2a and the air conditioner 2c can be optimally controlled using reinforcement learning.
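The refrigerator and air conditioner model above can be played through with a toy simulation. Everything numeric here is invented for illustration: the command names, the power figures, and the baseline. The learner is a simple tabular update toward the observed reward with no bootstrapping, which is a stand-in chosen for brevity and not a method named in the description. Each joint command is tried once, and both agents receive the same reward, computed as the drop in total power relative to the baseline.

```python
from collections import defaultdict
from itertools import product

# Hypothetical power draw (W) per control command; not taken from the patent.
POWER = {
    "2a": {"normal": 100.0, "boost": 150.0},  # refrigerator
    "2c": {"keep": 800.0, "eco": 600.0},      # air conditioner
}
BASELINE = 900.0  # assumed total draw before any command is applied


class Agent:
    """Tabular agent: moves its estimate toward each observed reward."""

    def __init__(self, actions, alpha=0.5):
        self.actions = list(actions)
        self.alpha = alpha
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def update(self, state, action, reward):
        key = (state, action)
        self.q[key] += self.alpha * (reward - self.q[key])

    def best_action(self, state):
        # Greedy choice; ties go to the action listed first.
        return max(self.actions, key=lambda a: self.q[(state, a)])


agents = {device: Agent(commands) for device, commands in POWER.items()}
state = "fridge_door_opened"  # state change reported by refrigerator 2a

# Try every joint command once; the same reward is given to both agents.
for fridge_cmd, ac_cmd in product(POWER["2a"], POWER["2c"]):
    commands = {"2a": fridge_cmd, "2c": ac_cmd}
    total = sum(POWER[device][cmd] for device, cmd in commands.items())
    reward = BASELINE - total  # power consumption difference
    for device, cmd in commands.items():
        agents[device].update(state, cmd, reward)

best = {device: agent.best_action(state) for device, agent in agents.items()}
print(best)  # {'2a': 'normal', '2c': 'eco'}
```

Note that the state change from the refrigerator is fed to both agents, mirroring FIG. 8, and that each agent learns only over its own commands even though the reward reflects the combined consumption.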

According to the present embodiment configured as described above, the master device 3 has the home appliance agents 30a to 30e respectively corresponding to the home appliances 2a to 2e, and the agent management unit 37 generates control commands by inputting the device information obtained from the home appliances 2a to 2e to each of the home appliance agents 30a to 30e. The home appliances 2a to 2e can therefore be learned and controlled in an autonomous and distributed manner, the problems associated with sensor extraction are eliminated, and optimal control of the home appliances 2a to 2e becomes possible.

In addition, because the agent management unit 37 calculates a reward from the state change amount obtained from the home appliances 2a to 2e and updates the value functions of the home appliance agents 30a to 30e using the reward as a parameter, learning is possible even in an unknown environment, and each of the home appliances 2a to 2e can be made to operate optimally.

Furthermore, because the master device 3 has the home appliance agents 30a to 30e respectively corresponding to the home appliances 2a to 2e, it is relatively easy to move some of the agents to another master device, and fault tolerance is excellent.

Moreover, because the agent generation unit 36 generates the home appliance agents 30a to 30e corresponding to the connected home appliances 2a to 2e, home appliance agents corresponding to additional home appliances can be generated automatically even when more appliances are connected via the communication network NT.

In addition, because the master device is itself a home appliance such as a television or a BD recorder, there is no need to provide a separate control apparatus for device control.

The present invention is not limited to the above embodiment.

For example, in the above embodiment the home appliance agents are formed in a virtual space in the internal memory of the master device 3, but the home appliance agents may instead be distributed across a plurality of master devices. This further improves fault tolerance.
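One way to picture moving an agent between master devices is to serialize its per-appliance state and load it elsewhere. The sketch below uses JSON for this; the `Master` class, the stored fields, and the string-keyed value table are assumptions for illustration, since the description does not specify how an agent would be transferred.

```python
import json


class Master:
    """Holds per-appliance agent state (hypothetical structure)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.agents = {}  # device_id -> {"profile": {...}, "q": {...}}

    def export_agent(self, device_id: str) -> str:
        """Remove an agent from this master and return it as a JSON payload."""
        agent = self.agents.pop(device_id)
        return json.dumps({"device_id": device_id, **agent})

    def import_agent(self, payload: str) -> None:
        """Install an agent received from another master."""
        data = json.loads(payload)
        self.agents[data.pop("device_id")] = data


tv = Master("tv")
tv.agents["2c"] = {
    "profile": {"type": "air_conditioner"},
    "q": {"fridge_door_opened|eco": 125.0},  # learned values travel with the agent
}
recorder = Master("recorder")

recorder.import_agent(tv.export_agent("2c"))
print(sorted(tv.agents), sorted(recorder.agents))
```

Because each agent carries only the state for its own appliance, a single agent can be moved without touching the others, which is the property the per-appliance design relies on.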

It goes without saying that the present invention is not limited to the above embodiment and that various modifications are possible without departing from its spirit.

DESCRIPTION OF SYMBOLS
100 ... Home appliance network system
2a to 2e ... Home appliances
3 ... Master device
30a to 30e ... Home appliance agents
31 ... Communication protocol reception unit
32 ... Communication protocol transmission unit
33 ... Input conversion unit
34 ... Output conversion unit
35 ... Protocol analysis unit
36 ... Agent generation unit
37 ... Agent management unit

Claims

1. A home appliance network system comprising a master device that is connected via a communication network to a plurality of home appliances, each having an arbitrary standardized protocol, and that controls the plurality of home appliances,
wherein the master device comprises:
a plurality of home appliance agents respectively corresponding to the plurality of home appliances; and
an agent management unit that inputs device information obtained from the plurality of home appliances to each of the plurality of home appliance agents, outputs control commands obtained from the plurality of home appliance agents to the plurality of home appliances, calculates a reward from a state change amount obtained from the plurality of home appliances as a result, and updates value functions of the plurality of home appliance agents using the reward as a parameter.

2. The home appliance network system according to claim 1, further comprising an agent generation unit that generates the plurality of home appliance agents respectively corresponding to the plurality of home appliances.

3. The home appliance network system according to claim 1 or 2, wherein the agent management unit calculates the reward using a power consumption difference as the state change amount obtained from the plurality of home appliances and, using the reward as a parameter, updates the value functions of the plurality of home appliance agents so as to minimize power consumption in the plurality of home appliances.

4. The home appliance network system according to any one of claims 1 to 3, wherein the master device is constituted by a home appliance.

5. A home appliance control program for use in a home appliance network system having a master device that is connected via a communication network to a plurality of home appliances, each having an arbitrary standardized protocol, and that controls the plurality of home appliances, the program causing the master device to function as:
a plurality of home appliance agents respectively corresponding to the plurality of home appliances; and
an agent management unit that inputs device information obtained from the plurality of home appliances to each of the plurality of home appliance agents, outputs control commands obtained from the plurality of home appliance agents to the plurality of home appliances, calculates a reward from a state change amount obtained from the plurality of home appliances as a result, and updates value functions of the plurality of home appliance agents using the reward as a parameter.