JP6945734B2

JP6945734B2 - Audio output device, device control system, audio output method, and program

Info

Publication number: JP6945734B2
Application number: JP2020522542A
Authority: JP
Inventors: 紀之小宮
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2018-06-01
Filing date: 2018-06-01
Publication date: 2021-10-06
Anticipated expiration: 2038-06-01
Also published as: WO2019229978A1; JPWO2019229978A1

Description

The present invention relates to an audio output device, a device control system, an audio output method, and a program.

Various techniques for controlling equipment in accordance with user operations are currently known. Such operations include voice operations as well as screen operations. When equipment is controlled by voice operation, it is controlled, for example, through a device control device that controls the equipment based on speech uttered by the user. Using such a device control device often improves user convenience.

However, it is very troublesome for a user to repeat the operations corresponding to various control contents in order to bring equipment into a desired control state. To reduce this burden, methods are known that store, for each user, information indicating the desired control state. For example, Patent Document 1 describes a personal information storage system that includes a portable device which transmits user identification information when the user controls equipment with the portable device, and that stores personal information acquired through use of the equipment in association with the user identification information.

Patent Document 1: Japanese Unexamined Patent Publication No. 2008-234371

However, the technique described in Patent Document 1 does not use a device control device that controls equipment based on voice, so it is difficult to apply it directly to techniques that use such a device control device. A technique is therefore desired that uses a device control device that controls equipment based on voice to easily bring the equipment into a desired control state.

The present invention has been made in view of the above problem, and an object thereof is to provide an audio output device, a device control system, an audio output method, and a program that easily bring equipment into a desired control state by using a device control device that controls the equipment based on voice.

In order to achieve the above object, the audio output device according to the present invention is
an audio output device for bringing equipment into a desired control state by using a device control device that controls the equipment based on voice, the audio output device comprising:
operation detection means for detecting a first operation, performed on the audio output device, for controlling first equipment;
audio output means for outputting, when the first operation is detected, a first voice representing the content of the first operation;
voice detection means for detecting a user voice that is uttered by a user and represents a first word; and
history information generation means for generating, when the first operation and the user voice are detected within a predetermined time, history information in which the content of the first operation is associated with the first word,
wherein, after the history information is generated, the audio output means outputs the first voice when the user voice representing the first word associated with the content of the first operation by the history information is detected.

In the present invention, when an operation performed on equipment by a user and an action taken by the user are in a predetermined relationship and the action is detected, a voice representing the content of the operation is output. Therefore, according to the present invention, equipment can easily be brought into a desired control state by using a device control device that controls the equipment based on voice.

Configuration diagram of the device control system according to Embodiment 1 of the present invention.
Configuration diagram of the audio output device according to Embodiment 1 of the present invention.
Configuration diagram of the device control device according to Embodiment 1 of the present invention.
Functional configuration diagram of the audio output device according to Embodiment 1 of the present invention.
Functional configuration diagram of the device control device according to Embodiment 1 of the present invention.
Diagram showing history information according to Embodiment 1 of the present invention.
Flowchart showing audio output processing executed by the audio output device according to Embodiment 1 of the present invention.
Functional configuration diagram of the audio output device according to Embodiment 2 of the present invention.
Diagram showing history information according to Embodiment 2 of the present invention.
Flowchart showing audio output processing executed by the audio output device according to Embodiment 2 of the present invention.
Functional configuration diagram of the audio output device according to Embodiment 3 of the present invention.
Configuration diagram of the device control system according to Embodiment 4 of the present invention.
Diagram showing history information according to Embodiment 4 of the present invention.
Diagram showing history information according to Embodiment 5 of the present invention.
Flowchart showing setting processing executed by the audio output device according to Embodiment 6 of the present invention.

(Embodiment 1)
First, the configuration of the device control system 1000 according to Embodiment 1 of the present invention is described with reference to FIG. 1. The device control system 1000 includes an audio output device 100 and a device control device 200, which cooperate to control equipment. In this embodiment, the equipment is an air conditioner 300. The audio output device 100 accepts an operation on the equipment from the user 10 and outputs a voice representing that operation. The device control device 200 detects the voice output by the audio output device 100, generates a control command corresponding to the voice, and transmits the control command to the air conditioner 300. The device control device 200 and the air conditioner 300 are connected to each other via a communication network 600. The communication network 600 is, for example, a wireless LAN (Local Area Network) built in the home.

The device control device 200 is generally used to detect speech uttered by the user 10 and transmit a control command corresponding to that speech to the air conditioner 300. In this embodiment, however, it detects not speech uttered by the user 10 but the voice that the audio output device 100 emits in response to an operation accepted from the user 10, and transmits a control command corresponding to that voice to the air conditioner 300. Having the audio output device 100 relay between the user 10 and the device control device 200 in this way is expected to yield various benefits. For example, the user 10 can control the air conditioner 300 by screen operation rather than voice operation. In addition, not only the control corresponding to the operation performed by the user 10 but also control corresponding to other operations related to that operation can be executed automatically.

The audio output device 100 accepts from the user 10 an operation instructing control of the air conditioner 300 and outputs a voice corresponding to the content of the accepted operation. The voice output by the audio output device 100 is detected by the device control device 200. Based on the history of operations accepted from the user 10, the audio output device 100 can automatically output a voice for bringing the air conditioner 300 into the control state desired by the user 10. To this end, the audio output device 100 can output a voice other than the one corresponding to the content of the accepted operation, and can also output a voice when no operation has been accepted. The audio output device 100 is, for example, a smartphone, a tablet terminal, or a personal computer.

The configuration of the audio output device 100 is described below with reference to FIG. 2. As shown in FIG. 2, the audio output device 100 includes a processor 11, a flash memory 12, a touch screen 13, a microphone 14, a speaker 15, and a communication interface 16. The processor 11 controls the overall operation of the audio output device 100. The processor 11 is, for example, a CPU (Central Processing Unit) with a built-in ROM (Read Only Memory), RAM (Random Access Memory), RTC (Real Time Clock), and the like. The CPU operates, for example, according to a basic program stored in the ROM and uses the RAM as a work area.

The flash memory 12 is a non-volatile memory that stores various types of information. For example, the flash memory 12 stores the program executed by the processor 11. The touch screen 13 detects operations performed by the user and supplies a signal indicating the detection result to the processor 11. The touch screen 13 also displays information under the control of the processor 11.

The microphone 14 is a device that converts sound into an electric signal. For example, the microphone 14 converts speech uttered by the user 10 into an electric signal. The speaker 15 is a device that converts a supplied electric signal into physical vibration to generate sound. For example, the speaker 15 outputs voice for conveying various messages to the user 10. The communication interface 16 connects the audio output device 100 to a telephone network (not shown) or the Internet (not shown).

The device control device 200 detects the voice output by the audio output device 100 and generates a control command corresponding to the detected voice. The device control device 200 transmits the generated control command to the air conditioner 300 via the communication network 600. The device control device 200 has a function for converting voice into words and a function for converting words into control commands. The device control device 200 is, for example, a smart speaker.
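As a non-limiting illustration, the second of the two functions described above (converting words into a control command) can be sketched in Python as a table lookup. The ControlCommand type, the phrase table, and the normalization rule are assumptions made for this sketch; the description does not define a command format or a fixed set of phrases, and the conversion of voice into words is outside the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlCommand:
    """Hypothetical command record; the actual command format is not specified."""
    device: str
    attribute: str
    value: str


# Assumed phrase table mapping recognized words to control commands.
PHRASE_TO_COMMAND = {
    "please turn on the power of the air conditioner 300":
        ControlCommand("air conditioner 300", "power", "on"),
    "please set the air conditioning mode of the air conditioner 300 to cooling":
        ControlCommand("air conditioner 300", "mode", "cooling"),
}


def words_to_command(words: str) -> Optional[ControlCommand]:
    """Return the command for the recognized words, or None if they are unknown."""
    # Normalize case, surrounding whitespace, and a trailing period before lookup.
    key = words.strip().rstrip(".").lower()
    return PHRASE_TO_COMMAND.get(key)
```

A lookup that returns None for unrecognized words lets the caller decide whether to ignore the utterance or report an error; either behavior would be a design choice beyond what the description states.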

The configuration of the device control device 200 is described below with reference to FIG. 3. As shown in FIG. 3, the device control device 200 includes a processor 21, a flash memory 22, a touch screen 23, a microphone 24, a speaker 25, and a communication interface 26. The processor 21 controls the overall operation of the device control device 200. The processor 21 is, for example, a CPU with a built-in ROM, RAM, RTC, and the like. The CPU operates, for example, according to a basic program stored in the ROM and uses the RAM as a work area.

The flash memory 22 is a non-volatile memory that stores various types of information. For example, the flash memory 22 stores the program executed by the processor 21. The touch screen 23 detects operations performed by the user and supplies a signal indicating the detection result to the processor 21. The touch screen 23 also displays information under the control of the processor 21.

The microphone 24 is a device that converts sound into an electric signal. For example, the microphone 24 converts the voice output by the audio output device 100 into an electric signal. The speaker 25 is a device that converts a supplied electric signal into physical vibration to generate sound. For example, the speaker 25 outputs voice for conveying various messages to the user 10. The communication interface 26 connects the device control device 200 to the communication network 600.

The air conditioner 300 is the equipment controlled by the device control system 1000. The air conditioner 300 is, for example, a device that conditions the air in a space inside the home. It has, for example, a heating function, a cooling function, a dehumidifying function, and a fan function. It includes, for example, an indoor unit (not shown) installed inside the home, an outdoor unit (not shown) installed outside the home, and a remote controller (not shown) for operating the indoor unit and the outdoor unit. The air conditioner 300 has a function for connecting to the communication network 600 and is controlled according to control commands received from the device control device 200 via the communication network 600.

Next, the functions of the audio output device 100 are described with reference to FIG. 4. As shown in FIG. 4, the audio output device 100 functionally includes a control unit 101, an audio output unit 103, a voice information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107. The action detection unit 105 includes an operation detection unit 102. The operation detection means corresponds to, for example, the operation detection unit 102. The audio output means corresponds to, for example, the audio output unit 103. The action detection means corresponds to, for example, the action detection unit 105. The history information generation means corresponds to, for example, the history information generation unit 106.

The control unit 101 controls the overall operation of the audio output device 100. For example, the control unit 101 causes the audio output unit 103 to output a voice based on the detection result of the operation detection unit 102. The control unit 101 also causes the history information generation unit 106 to generate history information based on the detection result of the operation detection unit 102 and the detection result of the action detection unit 105. Further, the control unit 101 causes the audio output unit 103 to output a voice based on the history information and on at least one of the detection result of the operation detection unit 102 and the detection result of the action detection unit 105. The functions of the control unit 101 are realized, for example, by the processor 11 executing the program stored in the flash memory 12.

The operation detection unit 102 detects an operation performed by the user 10 on the air conditioner 300. This operation is an operation for controlling the air conditioner 300 and specifies the control content for the air conditioner 300. It is, for example, an operation instructing the air conditioner 300 to turn on its power, an operation switching the air conditioning mode to cooling, or an operation switching the set temperature to 22°C. The operation is a screen operation or a voice operation performed on the audio output device 100. For example, the operation detection unit 102 accepts a screen operation on an operation screen provided for accepting operations on the air conditioner 300. Alternatively, the operation detection unit 102 detects a voice representing the control content for the air conditioner 300. The operation detection unit 102 can in effect be regarded as an operation reception unit that accepts operations by the user 10. The functions of the operation detection unit 102 are realized, for example, by the touch screen 13 or the microphone 14.

When the operation detection unit 102 detects the operation, the audio output unit 103 outputs a voice representing the content of the operation. For example, suppose that the control unit 101 identifies the content of the operation and acquires voice information representing the content of the control. In this case, the audio output unit 103 generates an electric signal based on the voice information supplied from the control unit 101 and produces a voice corresponding to that electric signal. The functions of the audio output unit 103 are realized, for example, by the processor 11 and the speaker 15 working together.

The voice information storage unit 104 stores voice information. The voice information is, for example, information indicating the voice to be output for each operation content, that is, for each control content. For example, the voice information associates the operation content "air conditioner 300: power: on" with information corresponding to the electric signal for outputting the voice "Please turn on the power of the air conditioner 300." The functions of the voice information storage unit 104 are realized, for example, by the flash memory 12.
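As a non-limiting illustration, voice information of this kind can be sketched in Python as a table keyed by operation content. The tuple layout (equipment, item, value), the function name phrase_for, and the second table entry are assumptions made for this sketch. The sketch also stores the phrase as text, whereas the description refers to information corresponding to the electric signal for outputting the voice.

```python
from typing import Dict, Optional, Tuple

# Assumed representation of an operation content: (equipment, item, value).
Operation = Tuple[str, str, str]

# Voice information: one phrase to output per operation content.
VOICE_INFO: Dict[Operation, str] = {
    # Example taken from the description.
    ("air conditioner 300", "power", "on"):
        "Please turn on the power of the air conditioner 300.",
    # Hypothetical additional entry, not taken from the description.
    ("air conditioner 300", "mode", "cooling"):
        "Please set the air conditioning mode of the air conditioner 300 to cooling.",
}


def phrase_for(operation: Operation) -> Optional[str]:
    """Return the phrase to output for an operation content, or None if none is stored."""
    return VOICE_INFO.get(operation)
```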

The action detection unit 105 detects an action taken by the user 10. This action is, for example, an operation performed by the user 10 on the air conditioner 300 or the utterance of words by the user 10. In this embodiment, an operation on the air conditioner 300 is in effect an operation on the audio output device 100. The functions of the action detection unit 105 are realized, for example, by the touch screen 13 or the microphone 14.

When the operation detected by the operation detection unit 102 and the action detected by the action detection unit 105 are in a predetermined relationship, the history information generation unit 106 generates history information in which the content of the operation is associated with the content of the action. The predetermined relationship is, for example, one in which the difference between the two detection times is within a threshold, or one in which both detection times fall within a setting mode. The functions of the history information generation unit 106 are realized, for example, by the processor 11 executing the program stored in the flash memory 12.

The history information storage unit 107 stores the history information generated by the history information generation unit 106. The functions of the history information storage unit 107 are realized, for example, by the flash memory 12.

When the action detection unit 105 detects the action, the audio output unit 103 outputs the voice representing the content of the operation that the history information associates with the content of the action. That is, when history information exists in which the content of the detected action is associated with the content of an operation, the audio output unit 103 outputs a voice representing the content of that operation.

Here, the history information generation unit 106 generates history information in which the content of the operation is associated with the content of the action when the operation and the action are detected within a predetermined time. This predetermined time is, for example, on the order of several minutes. After the history information is generated, the audio output unit 103 outputs the voice when the action is detected. When the user 10 has a record of performing the operation and the action in succession at a relatively short interval, the user 10 is likely to perform the operation together with the action. Accordingly, when such a record exists and the action is detected, the audio output unit 103 regards the user 10 as likely to perform the operation and automatically outputs the voice for realizing the control that the operation would produce.
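As a non-limiting illustration, the time-window condition can be sketched in Python as follows. The 3-minute window, the string encoding of operations and actions, and the function names are assumptions made for this sketch; the description states only that the predetermined time is on the order of several minutes.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set

# Assumed value; the description says only "on the order of several minutes".
WINDOW = timedelta(minutes=3)


def record_if_close(history: Dict[str, Set[str]],
                    operation: str, operation_time: datetime,
                    action: str, action_time: datetime,
                    window: timedelta = WINDOW) -> bool:
    """Associate the operation with the action if both were detected within the window."""
    if abs(operation_time - action_time) <= window:
        history.setdefault(action, set()).add(operation)
        return True
    return False


def operations_for_action(history: Dict[str, Set[str]], action: str) -> List[str]:
    """Return the operations whose voices should be output when the action is detected."""
    return sorted(history.get(action, set()))
```

Keying the history by action keeps the later lookup to a single dictionary access when an action is detected; this is one possible layout, not one given in the description.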

Specifically, in this embodiment, the operation detection unit 102 detects a first operation and a second operation performed by the user 10 on the air conditioner 300. The audio output unit 103 outputs a first voice representing the content of the first operation when the first operation is detected, and outputs a second voice representing the content of the second operation when the second operation is detected. The action detection unit 105, which includes the operation detection unit 102, detects the second operation as the action taken by the user 10.

The history information generation unit 106 generates history information in which the content of the first operation is associated with the content of the second operation when the first operation and the second operation are detected within the predetermined time. After this history information is generated, the audio output unit 103 outputs both the first voice and the second voice when the second operation is detected. When the first operation and the second operation have been detected in succession in the past, they are likely to be performed in succession again. Accordingly, when such a record exists and the second operation is detected, the audio output unit 103 regards the first operation as likely to follow and outputs not only the second voice representing the content of the second operation but also the first voice representing the content of the first operation.

In this embodiment, the history information generation unit 106 generates history information in which the content of the first operation is associated with the content of the second operation when the first operation is detected before the predetermined time elapses after the second operation is detected. That is, when there is a record of the first operation having been detected after the second operation and the second operation is newly detected, the first voice representing the content of the first operation is output in addition to the second voice representing the content of the second operation. Conversely, when there is such a record and the first operation is newly detected, the first voice representing the content of the first operation is output, but the second voice representing the content of the second operation is not.

Thus, in this embodiment, a voice representing the content of an operation that is likely to be performed after a given operation is detected is output automatically, whereas a voice representing the content of an operation that is unlikely to be performed after the given operation is detected is not. For example, suppose that the second operation turns on the power of the air conditioner 300 and the first operation sets the air conditioning mode of the air conditioner 300 to cooling. In this case, the first operation is likely to be performed after the second operation, but the second operation is unlikely to be performed after the first operation. Therefore, only when the operation that was detected earlier is newly detected is the voice indicating the content of the operation that was detected later additionally output.
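As a non-limiting illustration, this one-directional association can be sketched in Python as follows. The operation labels and function names are assumptions made for this sketch, and the check that the two operations fall within the predetermined time is assumed to have been made before record_sequence is called.

```python
from typing import Dict, List


def record_sequence(history: Dict[str, List[str]], earlier: str, later: str) -> None:
    """Record that `later` was detected after `earlier` (time-window check assumed done)."""
    followers = history.setdefault(earlier, [])
    if later not in followers:
        followers.append(later)


def operations_to_announce(history: Dict[str, List[str]], detected: str) -> List[str]:
    """Return the detected operation followed by the operations recorded as following it."""
    return [detected] + history.get(detected, [])


# Example: power-on (second operation) was followed by cooling mode (first operation).
history: Dict[str, List[str]] = {}
record_sequence(history, earlier="power:on", later="mode:cooling")
```

Because the association is stored only under the earlier operation, detecting "power:on" also yields "mode:cooling", while detecting "mode:cooling" yields only itself.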

Here, it is preferable that the history information generation unit 106 generate the history information in which the content of the operation is associated with the content of the action when the number of times the operation and the action have been detected within the predetermined time of each other during the most recent predetermined period reaches a predetermined threshold. The most recent predetermined period is, for example, the most recent month. The predetermined time is, for example, several minutes. The predetermined threshold is, for example, five times. When the operation and the action have been detected in succession, for example, five times in the most recent month, the operation is likely to be performed together with the action. In such a case, it is preferable that a voice representing the content of the operation be output when the action is detected. This configuration suppresses inappropriate output of voice.
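As a non-limiting illustration, the count-based condition can be sketched in Python as follows. The 30-day period stands in for "the most recent month", the threshold of 5 follows the example in the description, and the function name and the list of co-occurrence timestamps are assumptions made for this sketch.

```python
from datetime import datetime, timedelta
from typing import Iterable

PERIOD = timedelta(days=30)  # assumed stand-in for "the most recent month"
THRESHOLD = 5                # example threshold from the description


def should_generate_history(co_occurrence_times: Iterable[datetime],
                            now: datetime,
                            period: timedelta = PERIOD,
                            threshold: int = THRESHOLD) -> bool:
    """Return True if enough co-occurrences fall within the most recent period.

    Each timestamp marks one occasion on which the operation and the action
    were detected within the predetermined time of each other.
    """
    recent = [t for t in co_occurrence_times if now - period <= t <= now]
    return len(recent) >= threshold
```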

Next, the functions of the device control device 200 will be described with reference to FIG. 5. As shown in FIG. 5, the device control device 200 functionally includes a control unit 201, a voice detection unit 202, a voice output unit 203, a voice information storage unit 204, a device control unit 205, and a command information storage unit 206. The voice detection means of the device control device 200 corresponds to, for example, the voice detection unit 202. The device control means corresponds to, for example, the device control unit 205.

The control unit 201 controls the overall operation of the device control device 200. For example, the control unit 201 identifies the content of control for the air conditioner 300 from the voice detected by the voice detection unit 202, and causes the device control unit 205 to transmit a control command representing the identified control content. The function of the control unit 201 is realized, for example, by the processor 21 executing a program stored in the flash memory 22.

The voice detection unit 202 detects the voice output by the voice output unit 103. Therefore, the voice detection unit 202 is desirably arranged near the voice output unit 103. For example, the voice detection unit 202 is arranged within a few meters of the voice output unit 103. The function of the voice detection unit 202 is realized by, for example, the function of the microphone 24.

The voice output unit 203 outputs various voices under the control of the control unit 201. For example, the voice output unit 203 outputs a voice representing an announcement to the user 10. The function of the voice output unit 203 is realized, for example, by the processor 21 and the speaker 25 working together.

The voice information storage unit 204 stores voice information. The voice information is, for example, information used to identify the content of control from the voice detected by the voice detection unit 202. For example, the voice information indicates, for each control content, the electrical signal corresponding to the voice that represents that control content. The function of the voice information storage unit 204 is realized by, for example, the function of the flash memory 22.

The device control unit 205 controls the air conditioner 300 based on the content of the operation represented by the voice detected by the voice detection unit 202. For example, under the control of the control unit 201, the device control unit 205 transmits a control command corresponding to the detected voice to the air conditioner 300 via the communication network 600. The function of the device control unit 205 is realized, for example, by the processor 21 and the communication interface 26 working together.

The command information storage unit 206 stores command information. The command information is, for example, information in which the control content corresponding to the content of an operation is associated with a control command. The function of the command information storage unit 206 is realized by, for example, the function of the flash memory 22.
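As a rough illustration of how the voice information and the command information might be used together, the following Python sketch resolves a detected voice to a control command in two lookups. The phrases, control identifiers, and command bytes are all hypothetical; in the description the voice information represents electrical signals rather than text, so the string keys here are only a stand-in.

```python
# Hypothetical voice information: recognized phrase -> control content.
VOICE_INFO = {
    "turn on the power": "POWER_ON",
    "set the mode to cooling": "MODE_COOLING",
    "set the temperature to 28 degrees": "SET_TEMP_28",
}

# Hypothetical command information: control content -> control command.
COMMAND_INFO = {
    "POWER_ON": b"\x01\x01",
    "MODE_COOLING": b"\x02\x01",
    "SET_TEMP_28": b"\x03\x1c",
}


def command_for(recognized_phrase):
    """Return the control command for a detected voice, or None if unknown."""
    control = VOICE_INFO.get(recognized_phrase)
    if control is None:
        return None
    return COMMAND_INFO.get(control)
```

Keeping the two tables separate mirrors the split between the voice information storage unit 204 and the command information storage unit 206.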

Next, the history information will be described with reference to FIG. 6. The history information shown in FIG. 6 indicates all combinations of operations that were performed in succession in the past. The detection start time is the time at which detection of the combination in the corresponding record started. Operation A is the first of the operations performed in succession and corresponds to, for example, the second operation. Operation B is the second of the operations performed in succession and corresponds to, for example, the first operation. Operation C is the third of the operations performed in succession and also corresponds to, for example, the first operation. In the present embodiment, there is one second operation and there are one or more first operations.

The top record of the history information shown in FIG. 6 indicates that, at 12:00 on May 18, 2018, an operation of turning on the power of the air conditioner 300, an operation of setting the air conditioning mode of the air conditioner 300 to cooling, and an operation of setting the set temperature of the air conditioner 300 to 28°C were performed in succession. When such a record exists and the operation of turning on the power of the air conditioner 300 is detected, the operation of setting the air conditioning mode to cooling and the operation of setting the set temperature to 28°C are likely to be performed. Therefore, when the operation of turning on the power of the air conditioner 300 is detected, a voice instructing that the air conditioning mode of the air conditioner 300 be set to cooling and a voice instructing that the set temperature of the air conditioner 300 be set to 28°C are output in addition to a voice instructing that the power of the air conditioner 300 be turned on. In the example shown in FIG. 6, the content of operation A, which corresponds to the second operation, is the same in the top record and the bottom record. In such a case, it is preferable to adopt the top record, whose detection start time is newer.
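The record selection described above can be sketched in Python as follows. The HistoryRecord class and the select_record function are hypothetical names, and the operation labels are illustrative; only the rule itself (match on operation A, prefer the newest detection start time) comes from the description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class HistoryRecord:
    detection_start: datetime
    operations: list  # [operation A, operation B, operation C, ...]


def select_record(detected_op, records):
    """Return the newest record whose operation A equals detected_op, or None."""
    candidates = [r for r in records
                  if r.operations and r.operations[0] == detected_op]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.detection_start)
```

When a record is found, the voices for every operation in record.operations would be output in order; when None is returned, only the voice for the detected operation would be output.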

Although FIG. 6 shows an example in which all detected combinations of operations are included in the history information, the history information is not limited to this example. For example, the history information may include only the combinations of operations detected in the most recent predetermined period. Alternatively, the history information may include only the combinations of operations detected a predetermined number of times or more in the most recent predetermined period. Further, among competing combinations, the record with the older detection start time may be excluded from the history information. Competing combinations are, for example, combinations in which the second operation is the same and at least one first operation differs.
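The three optional restrictions on the history information can be sketched as separate Python filters. This is a minimal illustration under assumptions: each record is modeled as a tuple (detection_start, operations), the function names are hypothetical, and the last filter keeps one record per operation A, which also collapses identical repeated combinations in addition to strictly competing ones.

```python
from collections import Counter
from datetime import datetime, timedelta


def within_period(records, now, period):
    """Keep only combinations detected in the most recent period."""
    return [r for r in records if timedelta(0) <= now - r[0] <= period]


def frequent_only(records, min_count):
    """Keep only combinations detected at least min_count times."""
    counts = Counter(ops for _, ops in records)
    return [r for r in records if counts[r[1]] >= min_count]


def drop_older_competitors(records):
    """For each operation A, keep only the record with the newest start time."""
    newest = {}
    for start, ops in records:
        key = ops[0]
        if key not in newest or start > newest[key][0]:
            newest[key] = (start, ops)
    return sorted(newest.values(), reverse=True)
```

The filters are independent, so an implementation could apply any one of them or chain them, for example frequent_only(within_period(records, now, period), min_count).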

Next, the voice output process executed by the voice output device 100 will be described with reference to the flowchart of FIG. 7. The voice output process is executed, for example, in response to the voice output device 100 being powered on.

First, the processor 11 determines whether an operation has been detected (step S101). If the processor 11 determines that no operation has been detected (step S101: NO), it returns the process to step S101. If the processor 11 determines that an operation has been detected (step S101: YES), it stores the detection start time (step S102). After completing step S102, the processor 11 determines whether there is an interlocking setting (step S103). Specifically, the processor 11 determines whether the history information includes a record in which the operation detected in step S101 is the second operation.

If the processor 11 determines that there is an interlocking setting (step S103: YES), it outputs a voice group (step S104). For example, the processor 11 repeats a process of selecting one operation content included in the record and a process of causing the speaker 15 to output a voice representing the selected operation content, until all operation contents included in the record have been selected. If the processor 11 determines that there is no interlocking setting (step S103: NO), it outputs a single voice (step S105). For example, the processor 11 causes the speaker 15 to output a voice representing the content of the operation detected in step S101.

After completing step S104 or step S105, the processor 11 determines whether an operation has been detected (step S106). If the processor 11 determines that an operation has been detected (step S106: YES), it determines whether there is an interlocking setting (step S107). Specifically, the processor 11 determines whether the history information includes a record in which the operation detected in step S106 is the second operation. If the processor 11 determines that there is an interlocking setting (step S107: YES), it outputs a voice group (step S108). If the processor 11 determines that there is no interlocking setting (step S107: NO), it outputs a single voice (step S109).

After completing step S108 or step S109, or if it determines that no operation has been detected (step S106: NO), the processor 11 determines whether the first time has elapsed since the detection start time (step S110). The first time is the predetermined time described above and is, for example, a few minutes. If the processor 11 determines that the first time has not elapsed since the detection start time (step S110: NO), it returns the process to step S106.

If the processor 11 determines that the first time has elapsed since the detection start time (step S110: YES), it determines whether a plurality of operations were detected within the first period (step S111). The first period is the period from the detection start time until the first time elapses. If the processor 11 determines that a plurality of operations were detected within the first period (step S111: YES), it generates history information (step S112). For example, the processor 11 updates the history information so that it includes a record in which the operation detected in step S101 is the second operation and the operation detected in step S106 is the first operation. If the processor 11 determines that a plurality of operations were not detected within the first period (step S111: NO), or after completing step S112, it returns the process to step S101.
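The flow of steps S101 to S112 can be approximated by the following Python sketch, which replays a time-ordered list of detected operations instead of polling hardware. The function names, the event format (timestamp in seconds, operation label), the dictionary used for the history information, and the 180-second window are assumptions made for illustration; the sketch also simply overwrites an existing record for the same trigger, which is one possible reading of "update".

```python
def voices_for(op, history):
    """S103-S105 / S107-S109: a voice group if a record exists, else one voice."""
    if op in history:
        return [op] + history[op]
    return [op]


def process_events(events, history, window=180.0):
    """Replay time-ordered (timestamp_seconds, operation) events.

    history maps a trigger (the "second operation") to the list of
    follow-up operations (the "first operations"). Returns the voices
    that would be output, in order, and updates history in place.
    """
    spoken = []
    i = 0
    while i < len(events):
        start, trigger = events[i]                  # S101: YES, S102
        session = [trigger]
        spoken.extend(voices_for(trigger, history))
        i += 1
        # S106/S110: accept further operations until the first time elapses.
        while i < len(events) and events[i][0] - start < window:
            op = events[i][1]
            session.append(op)
            spoken.extend(voices_for(op, history))
            i += 1
        if len(session) > 1:                        # S111: YES
            history[trigger] = session[1:]          # S112
    return spoken
```

In this sketch, a first session of three consecutive operations creates a record, and a later detection of the first of those operations alone yields voices for all three.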

The voice output method according to the present embodiment is realized by the voice output device 100 according to the present embodiment executing the voice output process shown in FIG. 7. This voice output method first detects an operation on the equipment by the user 10 and, when the operation is detected, outputs a voice representing the content of the operation. The method also detects an action performed by the user 10. When the action is detected and the operation and the action have a predetermined relationship, the method outputs the above voice.

As described above, in the present embodiment, when an operation on the equipment by the user 10 and an action performed by the user 10 have a predetermined relationship and the action is detected, a voice representing the content of the operation is output. Therefore, according to the present embodiment, the equipment can easily be brought into a desired control state by using the device control device 200, which controls the equipment based on voice. For example, control of the equipment that reflects the preferences of the user 10 can be expected from a single operation on the voice output device 100.

Further, in the present embodiment, when a plurality of operations have previously been performed within a predetermined time and any one of these operations is performed, voices representing the contents of the other operations are output automatically. Therefore, according to the present embodiment, the equipment can be brought into a desired control state with few operations.

Further, in the present embodiment, when a plurality of operations have previously been performed within a predetermined time and the first of these operations is performed, voices representing the contents of the other operations are output automatically. Therefore, according to the present embodiment, the equipment can be appropriately brought into a desired control state with few operations.

(Embodiment 2)
In Embodiment 1, an example was described in which no keyword is associated with the operations and actions performed in succession. In the present embodiment, an example in which a keyword is associated with the operations and actions performed in succession will be described. The following description basically covers the differences from Embodiment 1.

First, the functions of the voice output device 120 will be described with reference to FIG. 8. As shown in FIG. 8, the voice output device 120 functionally includes a control unit 101, an operation detection unit 102, a voice output unit 103, a voice information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107. The action detection unit 105 includes a voice detection unit 108. The voice detection means of the voice output device 120 corresponds to, for example, the voice detection unit 108.

The operation detection unit 102 detects a first operation on the equipment by the user 10. When the first operation is detected, the voice output unit 103 outputs a first voice representing the content of the first operation. The action detection unit 105 includes the voice detection unit 108, which detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10. The function of the voice detection unit 108 is realized by, for example, the function of the microphone 14.

When the first operation and the third voice are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation and the above word are associated with each other. After this history information has been generated, the voice output unit 103 outputs the first voice when the third voice is detected.

The word represented by the third voice is a word that is likely to be uttered together with the execution of the first operation, and is treated as a keyword. When this keyword has previously been uttered together with the execution of the first operation and the keyword is uttered again, a voice representing the first operation is output automatically. Any number of first operations, one or more, may be associated with the keyword by the history information. The history information may associate the content of the first operation with a keyword uttered immediately before or immediately after the first operation, with a keyword uttered immediately before the first operation, or with a keyword uttered immediately after the first operation.

Next, the history information in the present embodiment will be described with reference to FIG. 9. In the present embodiment, the history information associates a keyword with the content of operation A, the content of operation B, and the content of operation C. Operation A, operation B, and operation C are operations performed together with the utterance of the keyword, and are first operations. The top record of the history information indicates that, together with the utterance of the keyword "air conditioner", an operation of turning on the power of the air conditioner 300, an operation of setting the air conditioning mode of the air conditioner 300 to cooling, and an operation of setting the set temperature of the air conditioner 300 to 28°C have been performed. When such a record exists and the keyword "air conditioner" is uttered, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the air conditioning mode of the air conditioner 300 be set to cooling, and a voice instructing that the set temperature of the air conditioner 300 be set to 28°C are output automatically. Although FIG. 9 shows an example in which all detected combinations of a keyword and first operations are included in the history information, the history information is not limited to this example. For example, the history information may include only the combinations of a keyword and first operations detected in the most recent predetermined period. Alternatively, the history information may include only the combinations of a keyword and first operations detected a predetermined number of times or more in the most recent predetermined period. Further, among competing combinations, the record with the older detection start time may be excluded from the history information. Competing combinations are, for example, combinations in which the keyword is the same and at least one first operation differs.

Next, the voice output process executed by the voice output device 120 will be described with reference to the flowchart of FIG. 10. The voice output process is executed, for example, in response to the voice output device 120 being powered on. Here, an example is described in which the contents of a series of operations performed after the utterance of a keyword are associated with the keyword.

First, the processor 11 determines whether a word has been detected (step S201). For example, the processor 11 determines whether a voice representing a word that can serve as a keyword has been detected by the microphone 14. If the processor 11 determines that no word has been detected (step S201: NO), it returns the process to step S201. If the processor 11 determines that a word has been detected (step S201: YES), it stores the detection start time (step S202).

After completing step S202, the processor 11 determines whether there is an interlocking setting (step S203). Specifically, the processor 11 determines whether the history information includes a record whose keyword is the word detected in step S201. If the processor 11 determines that there is an interlocking setting (step S203: YES), it outputs a single voice or a voice group (step S204). A single voice is output when the record contains the content of a single operation, and a voice group is output when the record contains the contents of a plurality of operations.

After completing step S204, or if it determines that there is no interlocking setting (step S203: NO), the processor 11 determines whether an operation has been detected (step S205). If the processor 11 determines that an operation has been detected (step S205: YES), it outputs a single voice (step S206). After completing step S206, or if it determines that no operation has been detected (step S205: NO), the processor 11 determines whether the first time has elapsed since the detection start time (step S207).

If the processor 11 determines that the first time has not elapsed since the detection start time (step S207: NO), it returns the process to step S205. If the processor 11 determines that the first time has elapsed since the detection start time (step S207: YES), it determines whether an operation was detected within the first period (step S208). If the processor 11 determines that an operation was detected within the first period (step S208: YES), it generates history information (step S209). Specifically, the processor 11 updates the history information so that it includes a record in which the keyword, that is, the detected word, is associated with the contents of the operations detected within the first period. After completing step S209, or if it determines that no operation was detected within the first period (step S208: NO), the processor 11 returns the process to step S201.
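Steps S201 to S209 can be approximated by the following Python sketch, which replays a time-ordered event list. The function name, the event format (timestamp in seconds, kind, value), the dictionary used for the history information, and the 180-second window are assumptions made for illustration. The sketch follows the flowchart as described, so an operation that arrives before any word has been detected is skipped, and a word uttered during an open window is ignored.

```python
def process_keyword_events(events, history, window=180.0):
    """Replay time-ordered (timestamp_seconds, kind, value) events, where
    kind is "word" or "operation".

    history maps a keyword to the list of operations associated with it.
    Returns the voices that would be output and updates history in place.
    """
    spoken = []
    i = 0
    while i < len(events):
        start, kind, value = events[i]
        i += 1
        if kind != "word":                     # S201: still waiting for a word
            continue
        keyword = value                        # S201: YES, S202
        if keyword in history:                 # S203: YES
            spoken.extend(history[keyword])    # S204
        operations = []
        # S205/S207: collect operations until the first time elapses.
        while i < len(events) and events[i][0] - start < window:
            _, k, v = events[i]
            i += 1
            if k == "operation":
                spoken.append(v)               # S206
                operations.append(v)
        if operations:                         # S208: YES
            history[keyword] = operations      # S209
    return spoken
```

In this sketch, uttering a keyword and then performing two operations creates a record, after which uttering the keyword alone yields the voices for both operations.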

In the present embodiment, when at least one operation has previously been detected together with the utterance of a keyword and the utterance of the keyword is detected, a voice representing the content of that at least one operation is output automatically. Therefore, according to the present embodiment, the equipment can be brought into a desired control state by uttering the keyword. Further, in the present embodiment, the user can freely choose the keyword associated with the series of operations for bringing the equipment into the desired control state, which improves convenience for the user.

(Embodiment 3)
In Embodiment 1, an example was described in which the action detected by the action detection unit 105 is an operation detected by the operation detection unit 102. In Embodiment 2, an example was described in which the action detected by the action detection unit 105 is the utterance of a voice detected by the voice detection unit 108. In the present embodiment, an example will be described in which the action detected by the action detection unit 105 is both an operation detected by the operation detection unit 102 and the utterance of a voice detected by the voice detection unit 108. The following description basically covers the differences from Embodiments 1 and 2.

First, the functions of the voice output device 130 will be described with reference to FIG. 11. As shown in FIG. 11, the voice output device 130 functionally includes a control unit 101, a voice output unit 103, a voice information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107. The action detection unit 105 includes an operation detection unit 102 and a voice detection unit 108.

The action detection unit 105 includes the voice detection unit 108, which detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10. When the first operation, the second operation, and the third voice are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation, the content of the second operation, and the above word are associated with one another. After the history information has been generated, the voice output unit 103 outputs the first voice and the second voice when the third voice is detected.

That is, in the present embodiment, when the first operation, the second operation, and the third voice have previously been detected in succession and at least one of the second operation and the third voice is detected, the first voice and the second voice are output. For example, assume that the history information shown in FIG. 9 has been generated, as in Embodiment 2. In the present embodiment, both when the keyword "air conditioner" is uttered and when the operation of turning on the power of the air conditioner 300 is performed, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the air conditioning mode of the air conditioner 300 be set to cooling, and a voice instructing that the set temperature of the air conditioner 300 be set to 28°C are output.
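The two-way trigger of this embodiment can be sketched in Python as follows. The KeywordRecord class and the linked_voices function are hypothetical names, and the operation labels are illustrative; the rule shown (a record fires on its keyword or on its first operation) is taken from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class KeywordRecord:
    keyword: str
    operations: list  # in the order performed; operations[0] is operation A


def linked_voices(trigger, records):
    """Return the operations to announce when trigger matches a record's
    keyword or its first operation; return [] when no record matches."""
    for record in records:
        first_op = record.operations[0] if record.operations else None
        if trigger == record.keyword or trigger == first_op:
            return list(record.operations)
    return []
```

An empty result means no interlocking record applies; in that case only the voice for the detected operation itself would be output, as in Embodiment 1.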

本実施形態では、キーワードの発話とともに複数の操作が検知された実績がある場合において、キーワードの発話が検知された場合、又は、複数の操作のうちの最初の操作が検知された場合、これらの複数の操作の内容を表す音声が自動で出力される。従って、本実施形態によれば、キーワードの発話又は一連の操作のうちの最初の操作により設備機器を所望する制御状態にすることができる。 In the present embodiment, when there is a record of a plurality of operations having been detected together with the utterance of a keyword, voices representing the contents of these operations are automatically output if the utterance of the keyword is detected or the first of the operations is detected. Therefore, according to the present embodiment, the equipment can be brought into a desired control state by uttering the keyword or by performing the first operation in the series of operations.

（実施形態４） (Embodiment 4)
実施形態１−３では、制御対象である設備機器が、１つだけである例について説明した。本実施形態では、制御対象である設備機器が、複数個である例について説明する。本実施形態では、図１２に示すように、制御対象である設備機器が、空調機３００と浴室暖房器３１０と給湯器３２０との３つである例について説明する。 In the first to third embodiments, examples with only one piece of equipment to be controlled have been described. In the present embodiment, an example with a plurality of pieces of equipment to be controlled will be described. Specifically, as shown in FIG. 12, the equipment to be controlled consists of three pieces: the air conditioner 300, the bathroom heater 310, and the water heater 320.

操作検知部１０２は、ユーザ１０による複数の設備機器のうち第１設備機器に対する第１操作とユーザ１０による複数の設備機器のうち第２設備機器に対する第２操作とを検知する。本実施形態では、空調機３００と浴室暖房器３１０と給湯器３２０とのうちいずれか１つの設備機器が第２設備機器であり、残りの２つの設備機器が第１設備機器であるものとする。ただし、３つの設備機器のうちいずれの設備機器が第２設備機器であってもよい。 The operation detection unit 102 detects a first operation performed by the user 10 on first equipment among the plurality of pieces of equipment and a second operation performed by the user 10 on second equipment among the plurality of pieces of equipment. In the present embodiment, any one of the air conditioner 300, the bathroom heater 310, and the water heater 320 is the second equipment, and the remaining two are the first equipment. Any of the three pieces of equipment may be the second equipment.

音声出力部１０３は、第１操作が検知された場合、第１操作の内容を表す第１音声を出力し、第２操作が検知された場合、第２操作の内容を表す第２音声を出力する。行動検知部１０５は、操作検知部１０２を備え、ユーザ１０によりなされた行動として、第２操作を検知する。履歴情報生成部１０６は、予め定められた時間内に第１操作と第２操作とが検知された場合、第１操作の内容と第２操作の内容とが対応付けられた履歴情報を生成する。そして、音声出力部１０３は、履歴情報が生成された後、第２操作が検知された場合、第１音声と第２音声とを出力する。 When the first operation is detected, the voice output unit 103 outputs a first voice representing the content of the first operation, and when the second operation is detected, it outputs a second voice representing the content of the second operation. The action detection unit 105 includes the operation detection unit 102 and detects the second operation as an action performed by the user 10. When the first operation and the second operation are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation and the content of the second operation are associated with each other. Then, when the second operation is detected after the history information has been generated, the voice output unit 103 outputs the first voice and the second voice.
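
The time-window rule described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `Operation` type, the `build_history_record` function, the equipment labels, and the 60-second window are all assumptions introduced here; the description only states that the time is predetermined.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Operation:
    equipment: str  # hypothetical label, e.g. "air conditioner 300"
    content: str    # hypothetical label, e.g. "power on"


def build_history_record(
    events: List[Tuple[float, Operation]], window_seconds: float
) -> Optional[Tuple[Operation, ...]]:
    """Group operations detected within window_seconds of the first one.

    events must be sorted by timestamp (in seconds). Returns None when fewer
    than two operations fall inside the window, because a record associates
    at least a first and a second operation.
    """
    if not events:
        return None
    start = events[0][0]
    grouped = tuple(op for t, op in events if t - start <= window_seconds)
    return grouped if len(grouped) >= 2 else None


events = [
    (0.0, Operation("air conditioner 300", "power on")),
    (20.0, Operation("bathroom heater 310", "power on")),
    (200.0, Operation("water heater 320", "power on")),  # outside the window
]
record = build_history_record(events, window_seconds=60.0)
```

With the assumed 60-second window, only the first two operations are associated; the third is detected too late to join the record.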

次に、図１３を参照して、本実施形態における履歴情報について説明する。本実施形態では、履歴情報は、検知開始時刻と操作Ａの内容と操作Ｂの内容と操作Ｃの内容とが対応付けられた情報である。操作Ａと操作Ｂと操作Ｃとは、予め定められた時間内に連続して検知された一連の操作である。この一連の操作には、１つの設備機器に対する複数の操作が含まれていてもよい。操作Ａと操作Ｂと操作Ｃとのうちのいずれか１つの操作が第２操作である。残りの２つの操作が第１操作である。 Next, the history information in the present embodiment will be described with reference to FIG. 13. In the present embodiment, the history information is information in which the detection start time, the content of operation A, the content of operation B, and the content of operation C are associated with one another. Operations A, B, and C are a series of operations detected in succession within a predetermined time. This series of operations may include a plurality of operations on a single piece of equipment. Any one of operations A, B, and C is the second operation, and the remaining two are the first operations.

履歴情報により示される一番上のレコードは、検知開始時刻である２０１８年５月１８日１２：００から予め定められた時間が経過するまでの間に、空調機３００の電源をオンする操作と、浴室暖房器３１０の電源をオンする操作と、給湯器３２０の電源をオンする操作と、が連続して実行された実績があることを示している。このような実績がある場合において、空調機３００の電源をオンする操作と、浴室暖房器３１０の電源をオンする操作と、給湯器３２０の電源をオンする操作とのうち、いずれかの操作が検知されると、空調機３００の電源をオンすることを指示する音声と、浴室暖房器３１０の電源をオンすることを指示する音声と、給湯器３２０の電源をオンすることを指示する音声とが、自動で出力される。 The top record of the history information indicates that, between the detection start time of 12:00 on May 18, 2018 and the elapse of the predetermined time, the operation of turning on the power of the air conditioner 300, the operation of turning on the power of the bathroom heater 310, and the operation of turning on the power of the water heater 320 were performed in succession. Given such a record, when any one of these three operations is detected, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the power of the bathroom heater 310 be turned on, and a voice instructing that the power of the water heater 320 be turned on are automatically output.

本実施形態では、複数の設備機器に対して一連の操作がされた実績がある場合において、この一連の操作のうちのいずれかの操作が検知された場合、これらの一連の操作の内容を表す音声が自動で出力される。従って、本実施形態によれば、一連の操作のうちの１つの操作により複数の設備機器を所望する制御状態にすることができる。 In the present embodiment, when there is a record of a series of operations having been performed on a plurality of pieces of equipment, voices representing the contents of the series of operations are automatically output if any one of the operations is detected. Therefore, according to the present embodiment, a plurality of pieces of equipment can be brought into a desired control state by performing just one operation in the series.
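
The lookup described in this embodiment can be sketched as follows, using the top record of FIG. 13 as sample data. The tuple layout `(equipment, setting, value)`, the name `voices_for`, and the comma-separated phrase format are assumptions made for illustration; the description does not specify a data representation.

```python
from typing import List, Tuple

Op = Tuple[str, str, str]  # hypothetical layout: (equipment, setting, value)

# One record: operations previously detected in succession (cf. FIG. 13).
HISTORY: List[Tuple[Op, ...]] = [
    (
        ("air conditioner", "power", "on"),
        ("bathroom heater", "power", "on"),
        ("water heater", "power", "on"),
    ),
]


def voices_for(detected: Op, history: List[Tuple[Op, ...]]) -> List[str]:
    """Return the phrases to speak for a detected operation.

    If the operation belongs to a history record, every operation in that
    record is voiced; otherwise only the detected operation is voiced.
    """
    for record in history:
        if detected in record:
            return [", ".join(op) for op in record]
    return [", ".join(detected)]
```

For example, `voices_for(("bathroom heater", "power", "on"), HISTORY)` yields all three power-on phrases, while an operation that appears in no record is voiced alone.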

（実施形態５） (Embodiment 5)
実施形態４では、複数の設備機器に対する一連の操作にキーワードが対応付けられない例について説明した。本実施形態では、複数の設備機器に対する一連の操作にキーワードが対応付けられる例について説明する。以下、基本的に、実施形態４と異なる部分について説明する。 In the fourth embodiment, an example in which no keyword is associated with a series of operations on a plurality of pieces of equipment has been described. In the present embodiment, an example in which a keyword is associated with such a series of operations will be described. The following description focuses mainly on the differences from the fourth embodiment.

行動検知部１０５は、ユーザ１０が発した言葉を表す第３音声を検知する音声検知部１０８を備え、ユーザ１０による上記言葉の発声を、ユーザ１０によりなされた行動として検知する。履歴情報生成部１０６は、予め定められた時間内に第１操作と第２操作と第３音声とが検知された場合、第１操作の内容と第２操作の内容と上記言葉とが対応付けられた履歴情報を生成する。 The action detection unit 105 includes a voice detection unit 108 that detects a third voice representing a word uttered by the user 10, and detects the utterance of that word by the user 10 as an action performed by the user 10. When the first operation, the second operation, and the third voice are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation, the content of the second operation, and the word are associated with one another.

音声出力部１０３は、履歴情報が生成された後、第３音声が検知された場合、第１音声と第２音声とを出力する。このように、本実施形態では、複数の設備機器に対する一連の操作と第３音声に対応するキーワードの発話とが連続して検知された実績がある場合において、一連の操作のうちの１つの操作又はキーワードの発話が検知された場合、一連の操作を表す一連の音声が出力される。 When the third voice is detected after the history information has been generated, the voice output unit 103 outputs the first voice and the second voice. Thus, in the present embodiment, when there is a record of a series of operations on a plurality of pieces of equipment and the utterance of the keyword corresponding to the third voice having been detected in succession, a series of voices representing the series of operations is output if one of the operations or the utterance of the keyword is detected.

次に、図１４を参照して、本実施形態における履歴情報について説明する。本実施形態では、履歴情報は、キーワードと操作Ａの内容と操作Ｂの内容と操作Ｃの内容とが対応付けられた情報である。操作Ａと操作Ｂと操作Ｃとは、予め定められた時間内に連続して検知された一連の操作である。キーワードは、上記一連の操作とともに、上記予め定められた時間内に検知された言葉である。操作Ａと操作Ｂと操作Ｃとのうちのいずれか１つの操作が第２操作である。残りの２つの操作が第１操作である。 Next, the history information in the present embodiment will be described with reference to FIG. 14. In the present embodiment, the history information is information in which a keyword, the content of operation A, the content of operation B, and the content of operation C are associated with one another. Operations A, B, and C are a series of operations detected in succession within a predetermined time. The keyword is a word detected within that predetermined time together with the series of operations. Any one of operations A, B, and C is the second operation, and the remaining two are the first operations.

履歴情報により示される一番上のレコードは、「今帰った」というキーワードの発音と、空調機３００の電源をオンする操作と、浴室暖房器３１０の電源をオンする操作と、給湯器３２０の電源をオンする操作と、が連続して実行された実績があることを示している。このような実績がある場合において、「今帰った」というキーワードの発音、又は、空調機３００の電源をオンする操作と、浴室暖房器３１０の電源をオンする操作と、給湯器３２０の電源をオンする操作とのうち、いずれかの操作が検知されると、空調機３００の電源をオンすることを指示する音声と、浴室暖房器３１０の電源をオンすることを指示する音声と、給湯器３２０の電源をオンすることを指示する音声とが、自動で出力される。 The top record of the history information indicates that the utterance of the keyword "I'm back", the operation of turning on the power of the air conditioner 300, the operation of turning on the power of the bathroom heater 310, and the operation of turning on the power of the water heater 320 were performed in succession. Given such a record, when the utterance of the keyword "I'm back" or any one of these three power-on operations is detected, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the power of the bathroom heater 310 be turned on, and a voice instructing that the power of the water heater 320 be turned on are automatically output.

本実施形態では、キーワードの発話とともに複数の設備機器に対して一連の操作がされた実績がある場合において、キーワードの発話又はこの一連の操作のうちのいずれかの操作が検知された場合、これらの一連の操作の内容を表す音声が自動で出力される。従って、本実施形態によれば、キーワードの発話又は一連の操作のうちの１つの操作により複数の設備機器を所望する制御状態にすることができる。 In the present embodiment, when there is a record of a series of operations having been performed on a plurality of pieces of equipment together with the utterance of a keyword, voices representing the contents of the series of operations are automatically output if the utterance of the keyword or any one of the operations is detected. Therefore, according to the present embodiment, a plurality of pieces of equipment can be brought into a desired control state by uttering the keyword or by performing one operation in the series.
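
One way to treat the keyword and each operation as equivalent triggers is to index a record under all of them. The sketch below assumes this indexing approach for illustration; `build_trigger_index`, the tuple layout, and the English rendering of the keyword are hypothetical and do not come from the description.

```python
from typing import Dict, List, Tuple, Union

Op = Tuple[str, str, str]   # hypothetical layout: (equipment, setting, value)
Trigger = Union[str, Op]    # a keyword string or a detected operation


def build_trigger_index(keyword: str, operations: List[Op]) -> Dict[Trigger, List[Op]]:
    """Map the keyword and every operation in a record to the full series."""
    index: Dict[Trigger, List[Op]] = {keyword: operations}
    for op in operations:
        index[op] = operations
    return index


# Sample data modeled on the top record of FIG. 14.
ops: List[Op] = [
    ("air conditioner", "power", "on"),
    ("bathroom heater", "power", "on"),
    ("water heater", "power", "on"),
]
index = build_trigger_index("I'm back", ops)
```

Looking up either `"I'm back"` or any of the three operations in `index` returns the same series, so a single dictionary access decides which voices to output.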

（実施形態６） (Embodiment 6)
実施形態１−５では、履歴情報が自動で生成される例について説明した。本実施形態では、履歴情報が手動で生成される例について説明する。本実施形態では、タッチスクリーン１３又はマイクロフォン１４は、設定モードへの移行指示を受け付ける。移行指示受付手段は、例えば、タッチスクリーン１３又はマイクロフォン１４に対応する。 In the first to fifth embodiments, examples in which history information is generated automatically have been described. In the present embodiment, an example in which history information is generated manually will be described. In the present embodiment, the touch screen 13 or the microphone 14 receives an instruction to shift to a setting mode. The transition instruction receiving means corresponds to, for example, the touch screen 13 or the microphone 14.

履歴情報生成部１０６は、タッチスクリーン１３又はマイクロフォン１４により移行指示が受け付けられ、移行指示に基づいて設定モードが設定されている間に、操作と行動とが検知された場合、操作の内容と行動の内容とが対応付けられた履歴情報を生成する。音声出力部１０３は、履歴情報が生成された後、上記行動が検知された場合、上記音声を出力する。 When the transition instruction is received by the touch screen 13 or the microphone 14, and an operation and an action are detected while the setting mode is set based on the transition instruction, the history information generation unit 106 generates history information in which the content of the operation and the content of the action are associated with each other. When the action is detected after the history information has been generated, the voice output unit 103 outputs the voice.

例えば、音声出力装置１００に対してユーザ１０が移行指示をした場合を想定する。なお、音声出力装置１００は、プロセッサ１１とスピーカ１５とが協働して発声し、ユーザ１０の発話した音声をマイクロフォン１４により検知するものとする。この場合、例えば、音声出力装置１００は、「何を連動しますか？」と発声する。ユーザ１０が、「給湯器、電源、オン」と発声すると、音声出力装置１００は、「何を連動しますか？」と発声する。ユーザ１０が、「浴室暖房、電源、オン」と発声すると、音声出力装置１００は、「何を連動しますか？」と発声する。ユーザ１０が、「終わり」と発声すると、音声出力装置１００は、「キーワードは何ですか？」と発声する。ユーザ１０が、「今、帰った」と発声すると、音声出力装置１００は、「設定が完了しました」と発声する。 For example, assume that the user 10 gives a transition instruction to the voice output device 100. Here, the voice output device 100 speaks through the cooperation of the processor 11 and the speaker 15, and detects the voice uttered by the user 10 with the microphone 14. In this case, for example, the voice output device 100 says "What would you like to link?". When the user 10 says "Water heater, power, on", the voice output device 100 again says "What would you like to link?". When the user 10 says "Bathroom heating, power, on", the voice output device 100 again says "What would you like to link?". When the user 10 says "End", the voice output device 100 says "What is the keyword?". When the user 10 says "I'm back now", the voice output device 100 says "The setting is complete".

以下、図１５を参照して、音声出力装置１００が実行する設定処理について説明する。 Hereinafter, the setting process executed by the voice output device 100 will be described with reference to FIG. 15.

まず、プロセッサ１１は、設定モードへの移行指示があるか否かを判別する（ステップＳ３０１）。プロセッサ１１は、設定モードへの移行指示がないと判別すると（ステップＳ３０１：ＮＯ）、ステップＳ３０１に処理を戻す。一方、プロセッサ１１は、設定モードへの移行指示があると判別すると（ステップＳ３０１：ＹＥＳ）、操作を促すメッセージを発声する（ステップＳ３０２）。 First, the processor 11 determines whether or not there is an instruction to shift to the setting mode (step S301). When the processor 11 determines that there is no instruction to shift to the setting mode (step S301: NO), the processor 11 returns the process to step S301. On the other hand, when the processor 11 determines that there is an instruction to shift to the setting mode (step S301: YES), the processor 11 utters a message prompting the operation (step S302).

プロセッサ１１は、ステップＳ３０２の処理を完了すると、制御指定操作があるか否かを判別する（ステップＳ３０３）。制御指定操作は、例えば、音声により制御を指定する操作である。プロセッサ１１は、制御指定操作があると判別すると（ステップＳ３０３：ＹＥＳ）、指定された制御を記憶する（ステップＳ３０４）。プロセッサ１１は、制御指定操作がないと判別した場合（ステップＳ３０３：ＮＯ）、又は、ステップＳ３０４の処理を完了した場合、設定終了操作があるか否かを判別する（ステップＳ３０５）。プロセッサ１１は、設定終了操作がないと判別すると（ステップＳ３０５：ＮＯ）、ステップＳ３０２に処理を戻す。一方、プロセッサ１１は、設定終了操作があると判別すると（ステップＳ３０５：ＹＥＳ）、キーワードの発声を促すメッセージを発声する（ステップＳ３０６）。 When the processor 11 completes the process of step S302, it determines whether or not there is a control designation operation (step S303). The control designation operation is, for example, an operation of designating control by voice. When the processor 11 determines that there is a control designation operation (step S303: YES), the processor 11 stores the designated control (step S304). When the processor 11 determines that there is no control designation operation (step S303: NO), or when the process of step S304 is completed, the processor 11 determines whether or not there is a setting end operation (step S305). When the processor 11 determines that there is no setting end operation (step S305: NO), the processor 11 returns the process to step S302. On the other hand, when the processor 11 determines that there is a setting end operation (step S305: YES), the processor 11 utters a message prompting the utterance of the keyword (step S306).

プロセッサ１１は、ステップＳ３０６の処理を完了すると、キーワードの発話があるか否かを判別する（ステップＳ３０７）。プロセッサ１１は、キーワードの発話があると判別すると（ステップＳ３０７：ＹＥＳ）、キーワード付きの履歴情報を生成する（ステップＳ３０８）。一方、プロセッサ１１は、キーワードの発話がないと判別すると（ステップＳ３０７：ＮＯ）、キーワード付きでない履歴情報を生成する（ステップＳ３０９）。プロセッサ１１は、ステップＳ３０８の処理又はステップＳ３０９の処理を完了すると、ステップＳ３０１に処理を戻す。 When the processor 11 completes the process of step S306, it determines whether or not there is an utterance of the keyword (step S307). When the processor 11 determines that there is an utterance of the keyword (step S307: YES), the processor 11 generates history information with the keyword (step S308). On the other hand, when the processor 11 determines that there is no utterance of the keyword (step S307: NO), the processor 11 generates history information without the keyword (step S309). When the processor 11 completes the process of step S308 or the process of step S309, the processor 11 returns the process to step S301.
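
The flow of steps S302 to S309 can be sketched as a single function. This is a simplified sketch under stated assumptions, not the flowchart of FIG. 15 itself: speech recognition is replaced by an iterable of already-recognized strings, each utterance is treated as either a control designation or the end operation, and the names `run_setup`, `END`, and the prompt wording are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional

END = "end"  # hypothetical utterance that serves as the setting end operation


def run_setup(
    heard: Iterable[Optional[str]], speak: Callable[[str], None]
) -> Dict[str, object]:
    """Run one pass of the setting mode after the transition instruction.

    heard yields what the microphone picks up after each prompt; None means
    nothing was recognized. speak receives every prompt the device utters.
    """
    it = iter(heard)
    controls: List[str] = []
    while True:
        speak("What would you like to link?")  # S302: prompt for an operation
        utterance = next(it, END)  # exhausted input is treated as the end operation
        if utterance == END:       # S305: YES
            break
        if utterance is not None:  # S303: YES
            controls.append(utterance)  # S304: store the designated control
    speak("What is the keyword?")  # S306: prompt for a keyword
    keyword = next(it, None)       # S307: keyword uttered or not
    speak("The setting is complete.")
    # S308 when a keyword was heard, S309 (record without keyword) otherwise.
    return {"keyword": keyword, "operations": controls}


spoken: List[str] = []
record = run_setup(
    ["water heater, power, on", "bathroom heating, power, on", "end", "I'm back now"],
    spoken.append,
)
```

Replaying the dialogue from the example above produces a record with the keyword "I'm back now" and two stored controls, and the operation prompt is spoken three times before the keyword prompt.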

本実施形態では、ユーザが所望する一連の操作を手動で設定される。従って、本実施形態によれば、ユーザが意図しない設定がされることを抑制することができる。 In the present embodiment, the series of operations desired by the user is set manually. Therefore, according to the present embodiment, settings that the user does not intend are less likely to be made.

（変形例） (Modification example)
以上、本発明の実施形態を説明したが、本発明を実施するにあたっては、種々の形態による変形及び応用が可能である。 Although the embodiments of the present invention have been described above, various modifications and applications are possible in carrying out the present invention.

本発明において、上記実施形態において説明した構成、機能、動作のどの部分を採用するのかは任意である。また、本発明において、上述した構成、機能、動作のほか、更なる構成、機能、動作が採用されてもよい。また、上記実施形態において説明した構成、機能、動作は、自由に組み合わせることができる。 In the present invention, which part of the configuration, function, and operation described in the above embodiment is adopted is arbitrary. Further, in the present invention, in addition to the above-described configurations, functions, and operations, further configurations, functions, and operations may be adopted. Further, the configurations, functions, and operations described in the above embodiments can be freely combined.

例えば、実施形態２では、キーワードの発話の後に、設備機器に対する一連の操作がなされる例について説明した。設備機器に対する一連の操作の後に、キーワードの発話がされてもよい。 For example, in the second embodiment, an example in which a series of operations on the equipment is performed after the utterance of the keyword has been described. A keyword may be spoken after a series of operations on the equipment.

画面操作として説明した操作を音声操作にしてもよいし、音声操作として説明した操作を画面操作にしてもよい。 The operation described as a screen operation may be a voice operation, or the operation described as a voice operation may be a screen operation.

本発明に係る音声出力装置１００の動作を規定する動作プログラムを既存のパーソナルコンピュータや情報端末装置に適用することで、当該パーソナルコンピュータ等を本発明に係る音声出力装置１００として機能させることも可能である。また、このようなプログラムの配布方法は任意であり、例えば、ＣＤ−ＲＯＭ（Compact Disk Read-Only Memory）、ＤＶＤ（Digital Versatile Disk）、メモリカードなどのコンピュータ読み取り可能な記録媒体に格納して配布してもよいし、インターネットなどの通信ネットワークを介して配布してもよい。 By applying an operation program that defines the operation of the voice output device 100 according to the present invention to an existing personal computer or information terminal device, it is also possible to make the personal computer or the like function as the voice output device 100 according to the present invention. Such a program may be distributed by any method; for example, it may be stored and distributed on a computer-readable recording medium such as a CD-ROM (Compact Disk Read-Only Memory), a DVD (Digital Versatile Disk), or a memory card, or it may be distributed via a communication network such as the Internet.

本発明は、本発明の広義の精神と範囲を逸脱することなく、様々な実施形態及び変形が可能とされるものである。また、上述した実施形態は、本発明を説明するためのものであり、本発明の範囲を限定するものではない。つまり、本発明の範囲は、実施形態ではなく、請求の範囲によって示される。そして、請求の範囲内及びそれと同等の発明の意義の範囲内で施される様々な変形が、本発明の範囲内とみなされる。 The present invention allows for various embodiments and modifications without departing from the broad spirit and scope of the present invention. Moreover, the above-described embodiment is for explaining the present invention, and does not limit the scope of the present invention. That is, the scope of the present invention is indicated not by the embodiment but by the claims. Then, various modifications made within the scope of the claims and within the equivalent meaning of the invention are considered to be within the scope of the present invention.

本発明は、音声に基づいて設備機器を制御する機器制御装置を備える機器制御システムに適用可能である。 The present invention is applicable to an equipment control system including an equipment control device that controls equipment equipment based on voice.

１０ユーザ、１１，２１プロセッサ、１２，２２フラッシュメモリ、１３，２３タッチスクリーン、１４，２４マイクロフォン、１５，２５スピーカ、１６，２６通信インターフェース、１００，１２０，１３０音声出力装置、１０１，２０１制御部、１０２操作検知部、１０３，２０３音声出力部、１０４，２０４音声情報記憶部、１０５行動検知部、１０６履歴情報生成部、１０７履歴情報記憶部、１０８，２０２音声検知部、２００機器制御装置、２０５機器制御部、２０６コマンド情報記憶部、３００空調機、３１０浴室暖房器、３２０給湯器、６００通信ネットワーク、１０００機器制御システム 10: user; 11, 21: processor; 12, 22: flash memory; 13, 23: touch screen; 14, 24: microphone; 15, 25: speaker; 16, 26: communication interface; 100, 120, 130: voice output device; 101, 201: control unit; 102: operation detection unit; 103, 203: voice output unit; 104, 204: voice information storage unit; 105: action detection unit; 106: history information generation unit; 107: history information storage unit; 108, 202: voice detection unit; 200: device control device; 205: device control unit; 206: command information storage unit; 300: air conditioner; 310: bathroom heater; 320: water heater; 600: communication network; 1000: device control system

Claims

A voice output device for bringing equipment into a desired control state by using a device control device that controls the equipment based on voice, the voice output device comprising:
an operation detection means for detecting a first operation, performed on the voice output device, for controlling first equipment;
a voice output means for outputting, when the first operation is detected, a first voice representing the content of the first operation;
a voice detection means for detecting a user voice that is a voice uttered by a user and represents a first word; and
a history information generation means for generating, when the first operation and the user voice are detected within a predetermined time, history information in which the content of the first operation and the first word are associated with each other,
wherein, after the history information is generated, the voice output means outputs the first voice when the user voice representing the first word associated with the content of the first operation by the history information is detected.

The voice output device according to claim 1, wherein
the operation detection means detects a second operation, performed on the voice output device, for controlling the first equipment,
the voice output means outputs, when the second operation is detected, a second voice representing the content of the second operation,
the history information generation means generates, when the first operation, the second operation, and the user voice are detected within the predetermined time, the history information in which the content of the first operation, the content of the second operation, and the first word are associated with one another, and
after the history information is generated, the voice output means outputs the first voice and the second voice when the second operation, corresponding to the content of the second operation associated with the content of the first operation and the first word by the history information, is detected.

The voice output device according to claim 2, wherein
the history information generation means generates the history information in which the content of the first operation, the content of the second operation, and the first word are associated with one another when the first operation is detected before the predetermined time elapses after the second operation is detected.

The voice output device according to claim 1, wherein
the operation detection means detects a second operation, performed on the voice output device, for controlling second equipment,
the voice output means outputs, when the second operation is detected, a second voice representing the content of the second operation,
the history information generation means generates, when the first operation, the second operation, and the user voice are detected within the predetermined time, the history information in which the content of the first operation, the content of the second operation, and the first word are associated with one another, and
after the history information is generated, the voice output means outputs the first voice and the second voice when the user voice representing the first word associated with the content of the first operation and the content of the second operation by the history information is detected.

The voice output device according to any one of claims 1 to 4, wherein
the history information generation means generates the history information in which the content of the first operation and the first word are associated with each other when the number of times that the first operation and the user voice representing the first word have been detected within the predetermined time, counted over the latest predetermined period, reaches a predetermined threshold.

A voice output device for bringing equipment into a desired control state by using a device control device that controls the equipment based on voice, the voice output device comprising:
an operation detection means for detecting a first operation, performed on the voice output device, for controlling first equipment;
a voice output means for outputting, when the first operation is detected, a first voice representing the content of the first operation;
a voice detection means for detecting a user voice that is a voice uttered by a user and represents a first word;
a transition instruction receiving means for receiving an instruction to shift to a setting mode; and
a history information generation means for generating, when the transition instruction is received by the transition instruction receiving means and the first operation and the user voice are detected while the setting mode is set based on the transition instruction, history information in which the content of the first operation and the first word are associated with each other,
wherein, after the history information is generated, the voice output means outputs the first voice when the user voice representing the first word associated with the content of the first operation by the history information is detected.

A voice output device for bringing equipment into a desired control state by using a device control device that controls the equipment based on voice, the voice output device comprising:
an operation detection means for detecting a first operation, performed on the voice output device, for controlling the equipment, and a second operation, performed on the voice output device, for controlling the equipment;
a voice output means for outputting a first voice representing the content of the first operation when the first operation is detected, and outputting a second voice representing the content of the second operation when the second operation is detected; and
a history information generation means for generating history information in which the content of the first operation and the content of the second operation are associated with each other when the number of times that the first operation and the second operation have been detected within a predetermined time, counted over the latest predetermined period, reaches a predetermined threshold,
wherein, after the history information is generated, the voice output means outputs the first voice and the second voice when the second operation, corresponding to the content of the second operation associated with the content of the first operation by the history information, is detected.

A device control system comprising: a device control device that controls equipment based on voice; and a voice output device for bringing the equipment into a desired control state by using the device control device, wherein
the voice output device comprises:
an operation detection means for detecting an operation, performed on the voice output device, for controlling the equipment;
a voice output means for outputting, when the operation is detected, a first voice representing the content of the operation;
a voice detection means for detecting a user voice that is a voice uttered by a user and represents a first word; and
a history information generation means for generating, when the operation and the user voice are detected within a predetermined time, history information in which the content of the operation and the first word are associated with each other,
the device control device comprises:
a voice detection means for detecting the first voice output by the voice output means; and
a device control means for controlling the equipment based on the content of the operation represented by the first voice detected by the voice detection means, and
after the history information is generated, the voice output means outputs the first voice when the user voice representing the first word associated with the content of the operation by the history information is detected.

A voice output method executed by a voice output device for bringing equipment into a desired control state by using a device control device that controls the equipment based on voice, the method comprising:
detecting an operation, performed on the voice output device, for controlling the equipment;
outputting, when the operation is detected, a first voice representing the content of the operation;
detecting a user voice that is a voice uttered by a user and represents a first word;
generating, when the operation and the user voice are detected within a predetermined time, history information in which the content of the operation and the first word are associated with each other; and
outputting the first voice when, after the history information is generated, the user voice representing the first word associated with the content of the operation by the history information is detected.

A program for causing a computer provided in a voice output device, the voice output device being for bringing equipment into a desired control state by using a device control device that controls the equipment based on voice, to function as:
an operation detection means for detecting an operation, performed on the voice output device, for controlling the equipment;
a voice output means for outputting, when the operation is detected by the operation detection means, a first voice representing the content of the operation;
a voice detection means for detecting a user voice that is a voice uttered by a user and represents a first word; and
a history information generation means for generating, when the operation and the user voice are detected within a predetermined time, history information in which the content of the operation and the first word are associated with each other,
wherein, after the history information is generated, the voice output means outputs the first voice when the user voice representing the first word associated with the content of the operation by the history information is detected.