WO2019229978A1 - Audio output device, apparatus control system, audio output method, and program - Google Patents


Info

Publication number
WO2019229978A1
Authority
WO
WIPO (PCT)
Prior art keywords
detected
history information
content
voice
sound
Prior art date
Application number
PCT/JP2018/021204
Other languages
French (fr)
Japanese (ja)
Inventor
Noriyuki Komiya (小宮 紀之)
Original Assignee
Mitsubishi Electric Corporation (三菱電機株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Electric Corporation
Priority to JP2020522542A priority Critical patent/JP6945734B2/en
Priority to PCT/JP2018/021204 priority patent/WO2019229978A1/en
Publication of WO2019229978A1 publication Critical patent/WO2019229978A1/en

Classifications

    • G06F 13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 Sound input; Sound output
    • G10L 13/00 Speech synthesis; Text to speech systems
    • G10L 15/00 Speech recognition
    • G10L 15/28 Constructional details of speech recognition systems
    • H04Q 9/00 Arrangements in telecontrol or telemetry systems for selectively calling a substation from a main station, in which substation desired apparatus is selected for applying a control signal thereto or for obtaining measured values therefrom

Definitions

  • the present invention relates to an audio output device, a device control system, an audio output method, and a program.
  • Such operations include screen operations and voice operations.
  • In a voice operation, for example, equipment is controlled using an equipment control device that controls the equipment based on the voice uttered by the user.
  • When an equipment control apparatus that controls facility devices based on voice is used, user convenience is often improved.
  • Patent Literature 1 describes a personal information storage system that includes a portable device which transmits user identification information when a user controls a device using the portable device, and that stores personal information acquired through use of the device in association with the user identification information.
  • However, since the technique described in Patent Literature 1 does not use a device control apparatus that controls equipment based on voice, it is difficult to apply it directly to techniques that use such a device control apparatus. For this reason, there is a demand for a technique for easily bringing a facility device into a desired control state using a device control device that controls the facility device based on voice.
  • the present invention has been made in view of the above problems, and an object thereof is to provide an audio output device, a device control system, an audio output method, and a program that make it easy to set a facility device to a desired control state using a device control device that controls the facility device based on sound.
  • In order to achieve the above object, an audio output device according to the present invention includes: operation detection means for detecting an operation on the facility device by the user; audio output means for outputting audio representing the content of the operation when the operation is detected by the operation detection means; action detection means for detecting an action performed by the user; and history information generation means for generating, when the operation detected by the operation detection means and the action detected by the action detection means have a predetermined relationship, history information in which the content of the operation and the content of the action are associated with each other.
  • When the action is detected by the action detection means, the audio output means outputs the audio representing the content of the operation associated with the content of the action in the history information.
  • In the present invention, the operation performed on the facility device by the user and the action performed by the user have a predetermined relationship, and when the action is detected, a sound representing the content of the operation is output. Therefore, according to this invention, a facility device can easily be brought into a desired control state using the device control apparatus that controls the facility device based on audio.
  • Configuration diagram of the device control system according to Embodiment 1 of the present invention
  • Configuration diagram of the audio output device according to Embodiment 1 of the present invention
  • Functional configuration diagram of the device control apparatus according to Embodiment 1 of the present invention
  • Diagram showing the history information according to Embodiment 1 of the present invention
  • Functional configuration diagram of the audio output device according to Embodiment 2 of the present invention
  • Diagram showing the history information according to Embodiment 2 of the present invention
  • Functional configuration diagram of the audio output device according to Embodiment 3 of the present invention
  • Configuration diagram of the device control system according to Embodiment 4 of the present invention
  • the device control system 1000 includes a sound output device 100 and a device control device 200, and the sound output device 100 and the device control device 200 cooperate to control facility devices.
  • the equipment is assumed to be an air conditioner 300.
  • the voice output device 100 receives an operation on the equipment from the user 10 and outputs a voice representing the operation.
  • the device control device 200 detects the sound output by the sound output device 100, generates a control command corresponding to the sound, and transmits the control command to the air conditioner 300.
  • the device control apparatus 200 and the air conditioner 300 are connected to each other via a communication network 600.
  • the communication network 600 is, for example, a wireless LAN (Local Area Network) built in a home.
  • the device control apparatus 200 is generally used to detect a voice spoken by the user 10 and transmit a control command corresponding to the voice to the air conditioner 300. In this embodiment, however, the device control apparatus 200 detects not a voice uttered by the user 10 but the voice that the voice output device 100 outputs in response to an operation received from the user 10, and transmits a control command corresponding to this voice to the air conditioner 300.
  • various effects can be expected by the audio output device 100 relaying the user 10 and the device control device 200.
  • the user 10 can control the air conditioner 300 not by voice operation but by screen operation.
  • control corresponding to another operation related to this operation can be automatically executed.
  • the audio output device 100 receives an operation for instructing control on the air conditioner 300 from the user 10.
  • the audio output device 100 outputs audio corresponding to the content of the operation received from the user 10.
  • the sound output from the sound output device 100 is detected by the device control device 200.
  • the audio output device 100 can automatically output audio for setting the air conditioner 300 to a control state desired by the user 10 based on the history of operations received from the user 10. For this reason, the audio output device 100 can output audio other than audio corresponding to the content of the accepted operation, or can output audio when no operation is accepted.
  • the audio output device 100 is, for example, a smartphone, a tablet terminal, or a personal computer.
  • the audio output device 100 includes a processor 11, a flash memory 12, a touch screen 13, a microphone 14, a speaker 15, and a communication interface 16.
  • the processor 11 controls the overall operation of the audio output device 100.
  • the processor 11 is, for example, a CPU (Central Processing Unit) that incorporates ROM (Read Only Memory), RAM (Random Access Memory), RTC (Real Time Clock), and the like.
  • the CPU operates according to a basic program stored in the ROM, for example, and uses the RAM as a work area.
  • the flash memory 12 is a nonvolatile memory that stores various types of information.
  • the flash memory 12 stores a program executed by the processor 11.
  • the touch screen 13 detects an operation performed by the user and supplies a signal indicating the detection result to the processor 11.
  • the touch screen 13 displays information according to control by the processor 11.
  • the microphone 14 is a device that converts sound into an electrical signal. For example, the microphone 14 converts a voice uttered by the user 10 into an electrical signal.
  • the speaker 15 is a device that converts a supplied electric signal into physical vibration and generates sound. For example, the speaker 15 outputs sound for transmitting various messages to the user 10.
  • the communication interface 16 is a communication interface for connecting the audio output device 100 to a telephone network (not shown) or the Internet (not shown).
  • the device control device 200 detects the sound output from the sound output device 100 and generates a control command corresponding to the detected sound.
  • the device control apparatus 200 transmits the generated control command to the air conditioner 300 via the communication network 600.
  • the device control apparatus 200 has a function of converting speech into words and a function of converting words into control commands.
  • the device control apparatus 200 is, for example, a smart speaker.
  • the device control apparatus 200 includes a processor 21, a flash memory 22, a touch screen 23, a microphone 24, a speaker 25, and a communication interface 26.
  • the processor 21 controls the overall operation of the device control apparatus 200.
  • the processor 21 is, for example, a CPU incorporating a ROM, RAM, RTC, and the like.
  • the CPU operates according to a basic program stored in the ROM, for example, and uses the RAM as a work area.
  • the flash memory 22 is a non-volatile memory that stores various types of information.
  • the flash memory 22 stores a program executed by the processor 21.
  • the touch screen 23 detects an operation performed by the user and supplies a signal indicating the detection result to the processor 21.
  • the touch screen 23 displays information according to control by the processor 21.
  • the microphone 24 is a device that converts sound into an electrical signal. For example, the microphone 24 converts the sound output from the sound output device 100 into an electrical signal.
  • the speaker 25 is a device that converts a supplied electric signal into physical vibration and utters a sound. For example, the speaker 25 outputs sound for transmitting various messages to the user 10.
  • the communication interface 26 is a communication interface for connecting the device control apparatus 200 to the communication network 600.
  • the air conditioner 300 is a facility device to be controlled by the device control system 1000.
  • the air conditioner 300 is, for example, a device that harmonizes air in a home space.
  • the air conditioner 300 includes, for example, a heating function, a cooling function, a dehumidifying function, and a blowing function.
  • the air conditioner 300 includes, for example, an indoor unit (not shown) installed in the house, an outdoor unit (not shown) installed outside the house, and a remote controller (not shown) for controlling the indoor unit and the outdoor unit.
  • the air conditioner 300 has a function of connecting to the communication network 600.
  • the air conditioner 300 is controlled according to a control command received from the device control apparatus 200 via the communication network 600.
  • the audio output device 100 functionally includes a control unit 101, an audio output unit 103, an audio information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107.
  • the behavior detection unit 105 includes an operation detection unit 102.
  • The operation detection means corresponds to the operation detection unit 102, for example.
  • The audio output means corresponds to the audio output unit 103, for example.
  • The action detection means corresponds to the action detection unit 105, for example.
  • The history information generation means corresponds to the history information generation unit 106, for example.
  • the control unit 101 controls the overall operation of the audio output device 100.
  • the control unit 101 causes the audio output unit 103 to output sound based on the detection result by the operation detection unit 102.
  • the control unit 101 causes the history information generation unit 106 to generate history information based on the detection result by the operation detection unit 102 and the detection result by the behavior detection unit 105.
  • the control unit 101 outputs a sound from the sound output unit 103 based on at least one detection result of the detection result by the operation detection unit 102 and the detection result by the behavior detection unit 105 and the history information. Output.
  • the function of the control unit 101 is realized, for example, when the processor 11 executes a program stored in the flash memory 12.
  • the operation detection unit 102 detects an operation performed on the air conditioner 300 by the user 10.
  • This operation is an operation for controlling the air conditioner 300, and is an operation for instructing control contents for the air conditioner 300.
  • This operation is, for example, an operation of instructing the air conditioner 300 to turn on the power, an operation of switching the air conditioning mode to cooling, or an operation of switching the set temperature to 22 °C.
  • This operation is a screen operation or a voice operation on the voice output device 100.
  • the operation detection unit 102 receives a screen operation on an operation screen for receiving an operation on the air conditioner 300.
  • For example, the operation detection unit 102 detects a voice uttered by the user 10 as a voice operation on the air conditioner 300.
  • It can be said that the operation detection unit 102 is substantially an operation reception unit that receives an operation by the user 10.
  • the function of the operation detection unit 102 is realized by the function of the touch screen 13 or the function of the microphone 14, for example.
  • When the operation detection unit 102 detects the operation, the audio output unit 103 outputs a sound representing the content of the operation. For example, suppose the control unit 101 specifies the content of the operation and acquires audio information representing the content of the control. In this case, the audio output unit 103 generates an electric signal based on the audio information supplied from the control unit 101 and produces audio corresponding to the electric signal.
  • the function of the audio output unit 103 is realized by the cooperation of the processor 11 and the speaker 15, for example.
  • the voice information storage unit 104 stores voice information.
  • the sound information is information indicating sound to be output for each operation content, that is, for each control content.
  • For example, the operation content "air conditioner 300: power: on" is associated with information corresponding to an electrical signal for outputting the sound "Please turn on the air conditioner 300."
  • The function of the voice information storage unit 104 is realized by the function of the flash memory 12, for example.
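The voice information described above is essentially a lookup from operation content to the phrase to synthesize. The following is a minimal, hypothetical sketch of that table; the key tuples and phrases are assumptions for illustration, not data from the publication.

```python
# Hypothetical sketch of the voice information held in the voice
# information storage unit 104: each operation content (equivalently,
# control content) maps to the phrase whose audio the audio output
# unit 103 should produce.
VOICE_INFO = {
    ("air_conditioner", "power", "on"): "Please turn on the air conditioner.",
    ("air_conditioner", "mode", "cooling"): "Please set the air conditioner to cooling.",
    ("air_conditioner", "set_temperature", 22): "Please set the temperature to 22 degrees.",
}

def phrase_for(operation):
    """Look up the phrase to speak for a detected operation (None if unknown)."""
    return VOICE_INFO.get(operation)
```

In a real device the stored value would be waveform or synthesis data rather than a plain string; a string stands in for it here.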
  • the behavior detection unit 105 detects a behavior performed by the user 10. This action is, for example, an operation on the air conditioner 300 by the user 10 or a utterance of words by the user 10. In the present embodiment, the operation on the air conditioner 300 is substantially an operation on the audio output device 100.
  • the function of the behavior detection unit 105 is realized by, for example, the function of the touch screen 13 or the function of the microphone 14.
  • When the operation detected by the operation detection unit 102 and the action detected by the action detection unit 105 have a predetermined relationship, the history information generation unit 106 generates history information in which the content of the operation and the content of the action are associated with each other.
  • the predetermined relationship is, for example, a relationship in which the difference between the detected times is within a threshold, or a relationship in which both were detected while a setting mode was active.
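The first variant of the predetermined relationship can be sketched as a simple time comparison. This is a minimal sketch assuming that variant; the threshold value and the function name are assumptions, not values from the publication (the text only says the time span is "about several minutes").

```python
from datetime import datetime, timedelta

# Sketch of the "predetermined relationship" check: the operation and the
# action are related when their detection times differ by no more than a
# threshold. The 3-minute default is an illustrative assumption.
def has_predetermined_relationship(operation_time, action_time,
                                   threshold=timedelta(minutes=3)):
    """True when the two detection times are within the threshold of each other."""
    return abs(operation_time - action_time) <= threshold
```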
  • the function of the history information generation unit 106 is realized by, for example, the processor 11 executing a program stored in the flash memory 12.
  • the history information storage unit 107 stores the history information generated by the history information generation unit 106.
  • the function of the history information storage unit 107 is realized by the function of the flash memory 12, for example.
  • When the action detection unit 105 detects the action, the voice output unit 103 outputs the voice representing the content of the operation associated with the content of the action in the history information. That is, when there is history information in which the detected action content and an operation content are associated with each other, the audio output unit 103 outputs a sound representing that operation content.
  • the history information generation unit 106 when the operation and the action are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the operation and the content of the action are associated with each other. This predetermined time is, for example, about several minutes. Then, the sound output unit 103 outputs the sound when the behavior is detected after the history information is generated.
  • In other words, when the action is detected, the voice output unit 103 considers it highly likely that the user 10 will perform the operation, and automatically outputs a voice for realizing the control corresponding to the operation.
  • the operation detection unit 102 detects a first operation on the air conditioner 300 by the user 10 and a second operation on the air conditioner 300 by the user 10.
  • When the first operation is detected, the voice output unit 103 outputs a first voice representing the content of the first operation; when the second operation is detected, it outputs a second voice representing the content of the second operation.
  • the behavior detection unit 105 includes an operation detection unit 102 and detects the second operation as the behavior performed by the user 10.
  • When the first operation and the second operation have the predetermined relationship, the history information generation unit 106 generates history information in which the content of the first operation is associated with the content of the second operation. Then, after the history information is generated, the voice output unit 103 outputs both the first voice and the second voice when the second operation is detected.
  • In other words, when the second operation is detected, the audio output unit 103 considers it highly likely that the first operation will also be performed, and therefore outputs, in addition to the second voice representing the content of the second operation, the first voice representing the content of the first operation.
  • In the present embodiment, when the first operation is detected before the predetermined time elapses after the second operation is detected, the history information generation unit 106 generates history information in which the content of the first operation is associated with the content of the second operation. That is, when there is a track record of the first operation being detected after the second operation was detected, a newly detected second operation causes the first voice representing the content of the first operation to be output in addition to the second voice representing the content of the second operation.
  • In other words, a voice representing the content of an operation that is considered highly likely to be executed after a certain operation is detected is output automatically, while a voice representing the content of an operation that is considered unlikely to be executed after a certain operation is detected is not output automatically.
  • For example, suppose that the second operation is an operation for turning on the power of the air conditioner 300 and the first operation is an operation for setting the air conditioning mode of the air conditioner 300 to cooling.
  • In this case, the first operation is likely to be performed after the second operation, but the second operation is unlikely to be performed after the first operation. Therefore, a sound representing the content of the operation detected later is additionally output only when the operation that was detected earlier is newly detected.
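This one-directional interlock can be sketched as a lookup keyed on the earlier operation; only that key triggers the voice group. The record layout and names below are assumptions for illustration.

```python
# Sketch of the one-directional interlock: history records key on the
# operation detected first (the "second operation"). Detecting it again
# yields the whole voice group; a follow-on operation on its own yields
# only its own voice. The sample data is illustrative.
HISTORY = {
    ("power", "on"): [("mode", "cooling"), ("set_temperature", 22)],
}

def voices_for(detected_operation, history=HISTORY):
    """Return the list of operation contents to voice, in order."""
    followups = history.get(detected_operation)
    if followups is None:
        return [detected_operation]          # no interlock: single voice
    return [detected_operation] + followups  # interlock: voice group
```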
  • It is preferable that the history information generation unit 106 generate the history information in which the content of the operation and the content of the action are associated with each other when the number of times the operation and the action have been detected within the predetermined time during the most recent predetermined period reaches a predetermined threshold.
  • the most recent predetermined period is, for example, the most recent month.
  • the predetermined time is, for example, several minutes.
  • the predetermined threshold is, for example, 5 times.
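Using the example values above (one month, five times), the frequency condition can be sketched as follows. The function name and the idea of passing the co-occurrence times as a list are assumptions, not details from the publication.

```python
from datetime import datetime, timedelta

# Sketch of the frequency condition: a history record is generated only
# when the operation/action pair co-occurred at least `threshold` times
# within the most recent `period`. Defaults follow the text's examples.
def should_generate_history(co_occurrence_times, now,
                            period=timedelta(days=30), threshold=5):
    """True when the pair co-occurred often enough in the recent period."""
    recent = [t for t in co_occurrence_times if now - t <= period]
    return len(recent) >= threshold
```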
  • the device control apparatus 200 functionally includes a control unit 201, a sound detection unit 202, a sound output unit 203, a sound information storage unit 204, a device control unit 205, and a command information storage unit 206.
  • the sound detection means included in the device control apparatus 200 corresponds to the sound detection unit 202, for example.
  • the device control means corresponds to the device control unit 205, for example.
  • the control unit 201 controls the overall operation of the device control apparatus 200.
  • the control unit 201 specifies the content of control for the air conditioner 300 from the sound detected by the sound detection unit 202, and causes the device control unit 205 to transmit a control command representing the specified content of control.
  • the function of the control unit 201 is realized, for example, when the processor 21 executes a program stored in the flash memory 22.
  • the voice detection unit 202 detects the voice output by the voice output unit 103. Therefore, it is desirable that the sound detection unit 202 be disposed near the sound output unit 103. For example, the voice detection unit 202 is disposed in an area within several meters from the voice output unit 103. The function of the voice detection unit 202 is realized by the function of the microphone 24, for example.
  • the sound output unit 203 outputs various sounds according to the control by the control unit 201.
  • the voice output unit 203 outputs a voice representing an announcement to the user 10.
  • the function of the audio output unit 203 is realized by the cooperation of the processor 21 and the speaker 25, for example.
  • the voice information storage unit 204 stores voice information.
  • the voice information is information used to specify the content of control from the voice detected by the voice detection unit 202, for example.
  • For example, the audio information associates each content of control with information representing the electrical signal corresponding to a sound expressing that content of control.
  • the function of the audio information storage unit 204 is realized by the function of the flash memory 22, for example.
  • the device control unit 205 controls the air conditioner 300 based on the content of the operation represented by the voice detected by the voice detection unit 202. For example, the device control unit 205 transmits a control command corresponding to the detected voice to the air conditioner 300 via the communication network 600 according to the control by the control unit 201.
  • the function of the device control unit 205 is realized by the cooperation of the processor 21 and the communication interface 26, for example.
  • the command information storage unit 206 stores command information.
  • the command information is, for example, information in which the control content corresponding to the operation content is associated with the control command.
  • the function of the command information storage unit 206 is realized by the function of the flash memory 22, for example.
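The command information is another lookup: from control content (recognized from the detected voice) to the control command sent over the network. Below is a hypothetical sketch; the command payloads are invented for illustration, since real commands are device-specific and not given in the publication.

```python
# Sketch of the command information in the command information storage
# unit 206: control content recognized from the detected voice maps to
# the control command transmitted to the air conditioner 300 via the
# communication network 600.
COMMAND_INFO = {
    ("power", "on"): {"cmd": "SET_POWER", "value": "ON"},
    ("mode", "cooling"): {"cmd": "SET_MODE", "value": "COOL"},
    ("set_temperature", 28): {"cmd": "SET_TEMP", "value": 28},
}

def command_for(control_content):
    """Return the control command for a recognized control content."""
    return COMMAND_INFO.get(control_content)
```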
  • the history information shown in FIG. 6 is information indicating all combinations of a plurality of operations performed continuously in the past.
  • the detection start time is the time when detection of the corresponding record combination is started.
  • the operation A is an operation performed first among a plurality of operations performed continuously in the past.
  • the operation A corresponds to the second operation, for example.
  • the operation B is the second operation among a plurality of operations performed continuously in the past.
  • the operation B corresponds to the first operation, for example.
  • the operation C is the third operation among a plurality of operations performed continuously in the past.
  • the operation C corresponds to the first operation, for example.
  • there is one second operation and there are one or more first operations.
  • the top record in the history information shown in FIG. 6 indicates a track record in which an operation for turning on the power of the air conditioner 300, an operation for setting the air conditioning mode of the air conditioner 300 to cooling, and an operation for setting the set temperature of the air conditioner 300 to 28 °C were executed in succession, starting at 12:00 on May 18, 2018.
  • In this case, when the operation detected first is newly detected, a sound for instructing that the set temperature of the air conditioner 300 be set to 28 °C is also output.
  • the content of the operation A corresponding to the second operation is the same in the top record and the bottom record. In such a case, it is preferable to employ the record with the newer detection start time, here the top record.
  • Although FIG. 6 shows an example in which all detected combinations of operations are included in the history information, the history information is not limited to this example. For example, only combinations of operations detected during the most recent predetermined period may be included in the history information, or only combinations of operations detected a predetermined number of times or more within that period.
  • a record with an older detection start time among combinations having a competitive relationship may be excluded from the history information.
  • the combinations in the competitive relationship are, for example, combinations in which the second operation is the same and at least one first operation is different.
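Resolving competing records as described above (same operation A, different follow-ons, keep the newest) can be sketched like this. The record layout is an assumption based on FIG. 6 as described in the text.

```python
# Sketch of pruning "competing" history records: records that share the
# same operation A (the trigger) but differ in their follow-on operations.
# Only the record with the newest detection start time survives.
def resolve_competing(records):
    """records: list of dicts with 'start', 'op_a', and 'followups' keys."""
    newest = {}
    # Sorting by start time lets later (newer) records overwrite older ones.
    for rec in sorted(records, key=lambda r: r["start"]):
        newest[rec["op_a"]] = rec
    return list(newest.values())
```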
  • the audio output process executed by the audio output device 100 will be described with reference to the flowchart of FIG.
  • the audio output process is executed in response to, for example, turning on the power of the audio output device 100.
  • the processor 11 determines whether or not an operation is detected (step S101). When determining that the operation is not detected (step S101: NO), the processor 11 returns the process to step S101. On the other hand, if the processor 11 determines that an operation has been detected (step S101: YES), the processor 11 stores the detection start time (step S102). When completing the process of step S102, the processor 11 determines whether or not there is an interlocking setting (step S103). Specifically, the processor 11 determines whether or not the history information includes a record having the operation detected in step S101 as the second operation.
  • In step S104, the processor 11 repeatedly performs a process of selecting one operation content included in the record and a process of outputting sound representing the selected operation content from the speaker 15, until all operation contents included in the record have been selected.
  • In step S105, the processor 11 causes the speaker 15 to output sound representing the content of the operation detected in step S101.
  • the processor 11 determines whether or not an operation is detected when the process of step S104 or the process of step S105 is completed (step S106). When it is determined that the operation is detected (step S106: YES), the processor 11 determines whether or not there is an interlocking setting (step S107). Specifically, the processor 11 determines whether or not the history information includes a record having the operation detected in step S106 as the second operation. If the processor 11 determines that there is an interlocking setting (step S107: YES), the processor 11 outputs a voice group (step S108). On the other hand, when determining that there is no interlocking setting (step S107: NO), the processor 11 outputs a single sound (step S109).
  • On the other hand, when the processor 11 determines in step S106 that no operation is detected (step S106: NO), it determines whether or not the first time has elapsed from the detection start time (step S110).
  • the first time is the above-described predetermined time, for example, several minutes.
  • When the first time has elapsed from the detection start time, the processor 11 determines whether or not a plurality of operations were detected within the first period (step S111).
  • the first period is a period until the first time elapses from the detection start time.
  • When determining that a plurality of operations were detected within the first period (step S111: YES), the processor 11 generates history information (step S112). For example, the processor 11 updates the history information so as to include a record in which the operation detected in step S101 is the second operation and the operation detected in step S106 is the first operation.
  • When determining that a plurality of operations were not detected within the first period (step S111: NO), the processor 11 returns the process to step S101.
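The steps above can be condensed into a single-cycle sketch. This is a non-authoritative simplification of the flowchart, assuming hardware I/O is replaced by injected callables and the history is a plain dict keyed on the first operation of a run; all names are assumptions.

```python
from datetime import datetime, timedelta

# Condensed sketch of the audio output process (steps S101-S112).
def audio_output_cycle(detected, history, first_time, speak):
    """detected: list of (time, operation) pairs detected in one cycle."""
    if not detected:
        return                              # S101: no operation detected
    start_time = detected[0][0]             # S102: store the detection start time
    for _, op in detected:
        followups = history.get(op)         # S103/S107: interlock setting?
        if followups:
            for content in [op] + followups:
                speak(content)              # S104/S108: output the voice group
        else:
            speak(op)                       # S105/S109: output a single voice
    # S110-S112: operations within the first period become a history record,
    # with the later operations recorded as follow-ons of the first.
    within = [op for t, op in detected if t - start_time <= first_time]
    if len(within) > 1:
        history[within[0]] = within[1:]
```

Run twice, the second cycle replays the follow-on voice learned in the first.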
  • the audio output method is realized by the audio output device 100 according to the present embodiment executing the audio output process shown in FIG.
  • In this voice output method, first, an operation performed on the facility device by the user 10 is detected, and when the operation is detected, a voice representing the content of the operation is output.
  • In this voice output method, an action performed by the user 10 is also detected.
  • When the action is detected and the operation and the action have a predetermined relationship, history information in which the content of the operation and the content of the action are associated is generated; thereafter, when the action is detected, the above voice is output.
  • In this way, the operation performed on the facility device by the user 10 and the action performed by the user 10 are in a predetermined relationship, and when the action is detected, the voice representing the content of the operation is output. Therefore, according to the present embodiment, the facility device can easily be brought into a desired control state by using the device control apparatus 200, which controls the facility device based on voice. For example, control of the facility device reflecting the preference of the user 10 can be expected to be realized by a single operation on the audio output device 100.
  • When there is a track record of a plurality of operations being performed within a predetermined time, and any one of these operations is performed, a voice representing the contents of the other operations is automatically output. Therefore, according to the present embodiment, the facility device can be brought into a desired control state with few operations.
  • When there is a track record of a plurality of operations being performed within a predetermined time, and the first of these operations is performed, voices representing the contents of the other operations are automatically output. Therefore, according to the present embodiment, the facility device can be appropriately brought into a desired control state with few operations.
  • The audio output device 120 functionally includes a control unit 101, an operation detection unit 102, an audio output unit 103, an audio information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107.
  • the behavior detection unit 105 includes a voice detection unit 108.
  • the sound detection means included in the sound output device 120 corresponds to, for example, the sound detection unit 108.
  • the operation detection unit 102 detects a first operation performed on the equipment by the user 10.
  • the voice output unit 103 outputs a first voice representing the content of the first operation.
  • the behavior detection unit 105 includes a voice detection unit 108 that detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action made by the user 10.
  • the function of the voice detection unit 108 is realized by the function of the microphone 14, for example.
  • the history information generation unit 106 generates history information in which the contents of the first operation and the above words are associated when the first operation and the third voice are detected within a predetermined time.
  • the sound output unit 103 outputs the first sound when the third sound is detected after the history information is generated.
  • The word expressed by the third voice is a word that is highly likely to be uttered along with the execution of the first operation, and is treated as a keyword. When there is a track record of this keyword having been uttered along with the execution of the first operation, and this keyword is uttered again, a voice representing the first operation is automatically output.
  • The history information may be information in which the content of the first operation is associated with a keyword uttered immediately before or immediately after the first operation, information in which the content of the first operation is associated with a keyword uttered immediately before the first operation, or information in which the content of the first operation is associated with a keyword uttered immediately after the first operation.
  • the history information is information in which a keyword, the content of the operation A, the content of the operation B, and the content of the operation C are associated with each other.
  • Operation A, operation B, and operation C are operations performed together with keyword utterances, and are first operations.
  • The top record of the history information shows that there is a track record of the keyword “air conditioner” being uttered together with an operation of turning on the power of the air conditioner 300, an operation of setting the air conditioning mode of the air conditioner 300 to cooling, and an operation of setting the set temperature to 28 °C.
  • Although FIG. 9 shows an example in which all detected combinations of a keyword and first operations are included in the history information, the history information is not limited to this example. For example, only the combinations of a keyword and first operations detected during the most recent predetermined period may be included in the history information. Alternatively, only the combinations of a keyword and first operations detected a predetermined number of times or more in the most recent predetermined period may be included. In addition, among combinations having a competitive relationship, the record with the older detection start time may be excluded from the history information.
  • Combinations having a competitive relationship are, for example, combinations that have the same keyword but differ in at least one first operation.
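The record-retention policy for competing combinations can be sketched as follows. The record fields and the choice of a dict keyed by keyword are illustrative assumptions, not the patent's data layout; the sketch keeps, per keyword, only the record whose detection start time is newest.

```python
def prune_competing(records):
    """Among records in a competitive relationship (same keyword, at least
    one differing first operation), keep only the record with the newest
    detection start time; older records are excluded from the history."""
    newest = {}
    for rec in records:  # each rec: {'keyword', 'detection_start', 'operations'}
        key = rec["keyword"]
        kept = newest.get(key)
        if kept is None or rec["detection_start"] > kept["detection_start"]:
            newest[key] = rec
    return list(newest.values())

records = [
    {"keyword": "air conditioner", "detection_start": 1,
     "operations": ["power on", "cooling", "28C"]},
    {"keyword": "air conditioner", "detection_start": 2,
     "operations": ["power on", "heating", "24C"]},
]
pruned = prune_competing(records)
print(pruned)
# only the newer "heating" record survives
```

Records with identical keyword and identical operations are simply duplicates, so keeping the newest is harmless for them as well.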
  • the voice output process is executed in response to, for example, the power of the voice output device 120 being turned on.
  • an example will be described in which the contents of a series of operations performed after the keyword utterance are associated with the keyword.
  • the processor 11 determines whether or not a word is detected (step S201). For example, the processor 11 determines whether or not a voice representing a word that can be a keyword has been detected by the microphone 14. If the processor 11 determines that no word is detected (step S201: NO), the process returns to step S201. On the other hand, if the processor 11 determines that a word has been detected (step S201: YES), the processor 11 stores the detection start time (step S202).
  • In step S203, the processor 11 determines whether or not there is an interlocking setting. Specifically, the processor 11 determines whether or not the history information includes a record having the word detected in step S201 as its keyword. If the processor 11 determines that there is an interlocking setting (step S203: YES), it outputs a single voice or a voice group (step S204): when the record contains the content of a single operation, a single voice is output; when the record contains the contents of a plurality of operations, a voice group is output.
  • When the process of step S204 is completed, the processor 11 determines whether or not an operation is detected (step S205). If the processor 11 determines that an operation is detected (step S205: YES), the process proceeds to step S206. When the process of step S206 is completed, or when the processor 11 determines that no operation is detected (step S205: NO), the processor 11 determines whether or not the first time has elapsed from the detection start time (step S207).
  • step S207: NO If the processor 11 determines that the first time has not elapsed since the detection start time (step S207: NO), the processor 11 returns the process to step S205.
  • If the processor 11 determines that the first time has elapsed from the detection start time (step S207: YES), the processor 11 determines whether or not an operation was detected within the first period (step S208).
  • step S208: YES the processor 11 generates history information (step S209).
  • the processor 11 updates the history information so as to include a record in which the keyword, which is the detected word, is associated with the content of the operation detected within the first period.
  • step S208: NO the processor 11 returns the process to step S201.
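One detection cycle of steps S203 through S209 might look like the sketch below. The function and field names are assumptions, and voice synthesis itself is reduced to returning the strings that would be announced.

```python
def keyword_cycle(word, history, ops_in_first_period):
    """word: the word detected in S201.
    ops_in_first_period: operations detected before the first time
    elapsed (steps S205/S208)."""
    announced = []
    for record in history:                       # S203: interlocking setting?
        if record["keyword"] == word:
            # S204: a single voice if the record holds one operation,
            # a voice group if it holds several.
            announced = list(record["operations"])
            break
    if ops_in_first_period:                      # S208: YES
        history.append({"keyword": word,         # S209: update history
                        "operations": list(ops_in_first_period)})
    return announced

history = []
# First cycle: the word is new, so nothing is announced but a record is made.
keyword_cycle("air conditioner", history, ["power on", "cooling", "28C"])
# Second cycle: the keyword alone now announces the recorded series.
print(keyword_cycle("air conditioner", history, []))
# -> ['power on', 'cooling', '28C']
```

This mirrors the embodiment's behavior: the first utterance-plus-operations session trains the record, and a later utterance of the same keyword replays the series as voices.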
  • According to the present embodiment, the facility device can be brought into a desired control state by uttering the keyword. Moreover, since the user can freely choose the keyword associated with the series of operations for bringing the facility device into the desired control state, the user's convenience is increased.
  • The audio output device 130 functionally includes a control unit 101, an audio output unit 103, an audio information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107.
  • the behavior detection unit 105 includes an operation detection unit 102 and a voice detection unit 108.
  • the behavior detection unit 105 includes a voice detection unit 108 that detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10.
  • When the first operation, the second operation, and the third voice are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation, the content of the second operation, and the above word are associated with each other.
  • the sound output unit 103 outputs the first sound and the second sound when the third sound is detected after the history information is generated.
  • When there is a track record of the first operation, the second operation, and the third voice being detected in succession, and at least one of the second operation and the third voice is detected, the first voice and the second voice are output.
  • the history information illustrated in FIG. 9 is generated.
  • In both the case where the keyword “air conditioner” is uttered and the case where the operation of turning on the power of the air conditioner 300 is performed, a voice instructing to turn on the power of the air conditioner 300, a voice instructing to set the air conditioning mode of the air conditioner 300 to cooling, and a voice instructing to set the set temperature of the air conditioner 300 to 28 °C are output.
  • According to the present embodiment, the facility device can be brought into a desired control state by the utterance of the keyword or by the first operation of the series of operations.
  • Embodiment 4: In Embodiments 1 to 3, examples in which there is only one facility device to be controlled have been described. In the present embodiment, an example in which there are a plurality of facility devices to be controlled will be described. As shown in FIG. 12, the facility devices to be controlled in this example are three: an air conditioner 300, a bathroom heater 310, and a water heater 320.
  • The operation detection unit 102 detects a first operation performed by the user 10 on a first facility device among the plurality of facility devices, and a second operation performed by the user 10 on a second facility device among the plurality of facility devices.
  • any one equipment device among the air conditioner 300, the bathroom heater 310, and the water heater 320 is the second equipment device, and the remaining two equipment devices are the first equipment devices.
  • any of the three equipment may be the second equipment.
  • the sound output unit 103 outputs a first sound representing the content of the first operation when the first operation is detected, and outputs a second sound representing the content of the second operation when the second operation is detected.
  • the behavior detection unit 105 includes an operation detection unit 102 and detects the second operation as the behavior performed by the user 10.
  • When the first operation and the second operation are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation and the content of the second operation are associated with each other. Then, after the history information is generated, the sound output unit 103 outputs the first sound and the second sound when the second operation is detected.
  • the history information is information in which the detection start time, the content of the operation A, the content of the operation B, and the content of the operation C are associated with each other.
  • the operation A, the operation B, and the operation C are a series of operations detected continuously within a predetermined time. This series of operations may include a plurality of operations for one facility device. Any one of the operation A, the operation B, and the operation C is the second operation. The remaining two operations are the first operations.
  • The top record of the history information shows that there is a track record of an operation of turning on the power of the air conditioner 300, an operation of turning on the power of the bathroom heater 310, and an operation of turning on the power of the water heater 320 being performed in succession before the predetermined time elapsed from 12:00 on May 18, 2018, which is the detection start time.
  • When any one of the operation of turning on the power of the air conditioner 300, the operation of turning on the power of the bathroom heater 310, and the operation of turning on the power of the water heater 320 is then performed, a sound instructing to turn on the power of the air conditioner 300, a sound instructing to turn on the power of the bathroom heater 310, and a sound instructing to turn on the power of the water heater 320 are automatically output.
  • Embodiment 5: In Embodiment 4, an example in which no keyword is associated with the series of operations on the plurality of facility devices has been described.
  • an example in which keywords are associated with a series of operations on a plurality of facility devices will be described.
  • In the following, mainly the differences from Embodiment 4 will be described.
  • the behavior detection unit 105 includes a voice detection unit 108 that detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10.
  • When the first operation, the second operation, and the third voice are detected within a predetermined time, the history information generation unit 106 generates history information in which the content of the first operation, the content of the second operation, and the above word are associated with each other.
  • the voice output unit 103 outputs the first voice and the second voice when the third voice is detected after the history information is generated.
  • When a series of operations on a plurality of facility devices and an utterance of the keyword corresponding to the third voice have been detected in succession, and one operation of the series of operations or the keyword utterance is detected, a series of sounds representing the series of operations is output.
  • the history information is information in which a keyword, the content of the operation A, the content of the operation B, and the content of the operation C are associated with each other.
  • the operation A, the operation B, and the operation C are a series of operations that are continuously detected within a predetermined time.
  • the keyword is a word detected within the predetermined time together with the series of operations. Any one of the operation A, the operation B, and the operation C is the second operation. The remaining two operations are the first operations.
  • The top record of the history information shows that there is a track record of the keyword “returned now” being uttered and an operation of turning on the power of the air conditioner 300, an operation of turning on the power of the bathroom heater 310, and an operation of turning on the power of the water heater 320 being performed in succession. When the utterance of the keyword “returned now” or any one of these power-on operations is then detected, a sound instructing to turn on the power of the air conditioner 300, a sound instructing to turn on the power of the bathroom heater 310, and a sound instructing to turn on the power of the water heater 320 are automatically output.
  • a plurality of facility devices can be brought into a desired control state by keyword utterance or one of a series of operations.
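Extending the earlier multi-device sketch with a keyword field gives this embodiment's behavior: either the keyword utterance or any operation in the series announces the whole series. The event encoding and field names are assumptions.

```python
def voices_for_event(event, history):
    """event: ('keyword', word) for an utterance, or ('operation', op)
    for a detected operation on a facility device."""
    kind, value = event
    for record in history:
        if (kind == "keyword" and record["keyword"] == value) or \
           (kind == "operation" and value in record["operations"]):
            return list(record["operations"])   # announce the whole series
    # No matching record: a lone operation is still announced; an
    # unrecognized keyword announces nothing.
    return [] if kind == "keyword" else [value]

history = [{"keyword": "returned now",
            "operations": ["aircon: power on",
                           "bathroom heater: power on",
                           "water heater: power on"]}]
print(voices_for_event(("keyword", "returned now"), history))
print(voices_for_event(("operation", "water heater: power on"), history))
```

Both calls announce the same three power-on voices, so the user can trigger the "coming home" series either by speaking or by touching any one device.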
  • Embodiment 6: In Embodiments 1 to 5, examples in which history information is automatically generated have been described. In the present embodiment, an example in which history information is manually generated will be described.
  • the touch screen 13 or the microphone 14 receives an instruction to shift to the setting mode.
  • the transition instruction receiving unit corresponds to, for example, the touch screen 13 or the microphone 14.
  • When the history information generation unit 106 receives the transition instruction via the touch screen 13 or the microphone 14, and an operation and an action are detected while the setting mode is active based on the transition instruction, the history information generation unit 106 generates history information in which the content of the operation and the content of the action are associated with each other.
  • the voice output unit 103 outputs the voice when the behavior is detected after the history information is generated.
  • the audio output device 100 utters “What will be linked?”.
  • the audio output device 100 utters “Hot water heater, power, on”
  • the audio output device 100 utters “What do you want to work with?”.
  • the audio output device 100 utters “End”
  • the audio output device 100 utters “What is the keyword?”.
  • the audio output device 100 utters “Setting is complete”.
  • the processor 11 determines whether or not there is an instruction to shift to the setting mode (step S301). When the processor 11 determines that there is no instruction to shift to the setting mode (step S301: NO), the processor 11 returns the process to step S301. On the other hand, when the processor 11 determines that there is an instruction to shift to the setting mode (step S301: YES), the processor 11 utters a message for prompting the operation (step S302).
  • step S303 the processor 11 determines whether or not there is a control designation operation.
  • the control designation operation is an operation for designating control by voice, for example.
  • step S303: YES the processor 11 stores the designated control (step S304).
  • In step S305, the processor 11 determines whether or not there is a setting end operation.
  • step S305: NO the processor 11 returns the process to step S302.
  • step S305: YES the processor 11 utters a message that prompts the keyword to be uttered (step S306).
  • step S306 determines whether or not there is a keyword utterance (step S307).
  • step S307: YES the processor 11 generates history information with a keyword (step S308).
  • step S307: NO the processor 11 generates history information without a keyword (step S309).
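The manual setting dialog of steps S301 through S309 can be reduced to the bookkeeping below. The spoken prompts ("What will be linked?", "What is the keyword?") are omitted, and the event encoding is an assumption for illustration.

```python
def run_setting_mode(events):
    """events: the user's inputs after the transition instruction, e.g.
    ('control', 'water heater: power on'), ('end',), ('keyword', 'bath time').
    Returns the history record generated in S308 (with a keyword) or
    S309 (without one)."""
    controls = []
    keyword = None
    for event in events:
        if event[0] == "control":      # S303: control designation operation
            controls.append(event[1])  # S304: store the designated control
        elif event[0] == "keyword":    # S307: keyword uttered
            keyword = event[1]
        # ('end',) merely moves the dialog on to the keyword prompt
        # (steps S305/S306), so it needs no bookkeeping here.
    return {"keyword": keyword, "operations": controls}

record = run_setting_mode([("control", "water heater: power on"),
                           ("end",),
                           ("keyword", "bath time")])
print(record)
# -> {'keyword': 'bath time', 'operations': ['water heater: power on']}
```

Dropping the final keyword event yields a record with `"keyword": None`, corresponding to the keyword-less history of step S309.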
  • In Embodiment 2, an example in which a series of operations on the facility device is performed after the utterance of a keyword has been described.
  • the keyword may be uttered after a series of operations on the equipment.
  • the operation described as the screen operation may be a voice operation, or the operation described as the voice operation may be a screen operation.
  • By applying the program, a personal computer or the like can also function as the audio output device 100 according to the present invention.
  • the distribution method of such a program is arbitrary.
  • The program may be stored and distributed on a computer-readable recording medium such as a CD-ROM (Compact Disk Read-Only Memory), a DVD (Digital Versatile Disk), or a memory card, or may be distributed via a communication network such as the Internet.
  • the present invention is applicable to a device control system including a device control device that controls equipment based on voice.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Selective Calling Equipment (AREA)
  • User Interface Of Digital Computer (AREA)
  • Air Conditioning Control Device (AREA)

Abstract

In the present invention, an operation sensing unit (102) senses an operation performed by a user (10) on an equipment apparatus. If the operation sensing unit (102) has sensed an operation, an audio output unit (103) outputs audio representing the content of the operation. An action sensing unit (105) senses an action performed by the user. If the operation sensed by the operation sensing unit (102) and the action sensed by the action sensing unit (105) are in a predetermined relationship, a history information generation unit (106) generates history information in which the content of the operation and the content of the action are associated. If an action has been sensed by the action sensing unit (105), the audio output unit (103) outputs the audio representing the content of the operation which was associated with the content of the action, in the history information.

Description

Audio output device, device control system, audio output method, and program
The present invention relates to an audio output device, a device control system, an audio output method, and a program.
Currently, various techniques for controlling facility devices according to user operations are known. Such operations include screen operations and voice operations. When controlling a facility device by voice operation, for example, the facility device is controlled using a device control apparatus that controls it based on a voice uttered by the user. Using a device control apparatus that controls facility devices based on voice often improves user convenience.
By the way, it is very troublesome for the user to repeatedly perform operations corresponding to various control contents in order to bring a facility device into a desired control state. In order to reduce such annoyance, a method of storing, for each user, information indicating a desired control state is known. For example, Patent Literature 1 describes a personal information storage system that includes a portable device that transmits user identification information when the user controls a device using the portable device, and that stores personal information acquired through use of the device in association with the user identification information.
JP 2008-234371 A
However, since the technique described in Patent Literature 1 does not use a device control apparatus that controls facility devices based on voice, it is difficult to immediately apply it to techniques that use such a device control apparatus. For this reason, there is a demand for a technique for easily bringing a facility device into a desired control state by using a device control apparatus that controls the facility device based on voice.
The present invention has been made in view of the above problem, and an object thereof is to provide an audio output device, a device control system, an audio output method, and a program that easily bring a facility device into a desired control state by using a device control apparatus that controls the facility device based on voice.
In order to achieve the above object, an audio output device according to the present invention includes:
operation detection means for detecting an operation performed by a user on a facility device;
audio output means for outputting, when the operation is detected by the operation detection means, a voice representing the content of the operation;
action detection means for detecting an action performed by the user; and
history information generation means for generating, when the operation detected by the operation detection means and the action detected by the action detection means have a predetermined relationship, history information in which the content of the operation and the content of the action are associated with each other,
wherein, when the action is detected by the action detection means, the audio output means outputs the voice representing the content of the operation associated with the content of the action in the history information.
In the present invention, the operation performed by the user on the facility device and the action performed by the user have a predetermined relationship, and when the action is detected, a voice representing the content of the operation is output. Therefore, according to the present invention, the facility device can easily be brought into a desired control state by using a device control apparatus that controls the facility device based on voice.
  • FIG. 1 is a configuration diagram of the device control system according to Embodiment 1 of the present invention.
  • FIG. 2 is a configuration diagram of the audio output device according to Embodiment 1 of the present invention.
  • FIG. 3 is a configuration diagram of the device control apparatus according to Embodiment 1 of the present invention.
  • FIG. 4 is a functional configuration diagram of the audio output device according to Embodiment 1 of the present invention.
  • FIG. 5 is a functional configuration diagram of the device control apparatus according to Embodiment 1 of the present invention.
  • FIG. 6 is a diagram showing history information according to Embodiment 1 of the present invention.
  • FIG. 7 is a flowchart showing audio output processing executed by the audio output device according to Embodiment 1 of the present invention.
  • FIG. 8 is a functional configuration diagram of the audio output device according to Embodiment 2 of the present invention.
  • FIG. 9 is a diagram showing history information according to Embodiment 2 of the present invention.
  • FIG. 10 is a flowchart showing audio output processing executed by the audio output device according to Embodiment 2 of the present invention.
  • FIG. 11 is a functional configuration diagram of the audio output device according to Embodiment 3 of the present invention.
  • FIG. 12 is a configuration diagram of the device control system according to Embodiment 4 of the present invention.
  • FIG. 13 is a diagram showing history information according to Embodiment 4 of the present invention.
  • FIG. 14 is a diagram showing history information according to Embodiment 5 of the present invention.
  • FIG. 15 is a flowchart showing setting processing executed by the audio output device according to Embodiment 6 of the present invention.
(Embodiment 1)
First, the configuration of a device control system 1000 according to Embodiment 1 of the present invention will be described with reference to FIG. 1. The device control system 1000 includes an audio output device 100 and a device control apparatus 200, and the audio output device 100 and the device control apparatus 200 cooperate to control facility devices. In the present embodiment, the facility device is assumed to be an air conditioner 300. The audio output device 100 receives an operation on the facility device from a user 10 and outputs a voice representing the operation. The device control apparatus 200 detects the voice output by the audio output device 100, generates a control command corresponding to the voice, and transmits the control command to the air conditioner 300. The device control apparatus 200 and the air conditioner 300 are connected to each other via a communication network 600. The communication network 600 is, for example, a wireless LAN (Local Area Network) built in a home.
The device control apparatus 200 is generally used to detect a voice uttered by the user 10 and transmit a control command corresponding to the voice to the air conditioner 300. In the present embodiment, however, the device control apparatus 200 detects not the voice uttered by the user 10 but the voice that the audio output device 100 outputs in response to an operation received from the user 10, and transmits a control command corresponding to this voice to the air conditioner 300. Various effects can be expected from the audio output device 100 thus relaying between the user 10 and the device control apparatus 200. For example, with this configuration, the user 10 can control the air conditioner 300 by screen operation rather than voice operation. Moreover, with this configuration, not only the control corresponding to an operation performed by the user 10 but also the control corresponding to other operations related to that operation can be executed automatically.
The audio output device 100 receives, from the user 10, an operation instructing control of the air conditioner 300, and outputs a voice corresponding to the content of the received operation. The voice output by the audio output device 100 is detected by the device control apparatus 200. Based on the history of operations received from the user 10, the audio output device 100 can automatically output a voice for bringing the air conditioner 300 into the control state desired by the user 10. To this end, the audio output device 100 can also output a voice other than the voice corresponding to the content of the received operation, and can output a voice even when no operation is being received. The audio output device 100 is, for example, a smartphone, a tablet terminal, or a personal computer.
 The configuration of the audio output device 100 will now be described with reference to FIG. 2. As shown in FIG. 2, the audio output device 100 includes a processor 11, a flash memory 12, a touch screen 13, a microphone 14, a speaker 15, and a communication interface 16. The processor 11 controls the overall operation of the audio output device 100. The processor 11 is, for example, a CPU (Central Processing Unit) with built-in ROM (Read Only Memory), RAM (Random Access Memory), RTC (Real Time Clock), and the like. The CPU operates, for example, according to a basic program stored in the ROM and uses the RAM as a work area.
 The flash memory 12 is a nonvolatile memory that stores various types of information; for example, it stores a program executed by the processor 11. The touch screen 13 detects operations performed by the user and supplies a signal indicating the detection result to the processor 11. The touch screen 13 also displays information under the control of the processor 11.
 The microphone 14 is a device that converts sound into an electrical signal; for example, it converts voice uttered by the user 10 into an electrical signal. The speaker 15 is a device that converts a supplied electrical signal into physical vibration to produce sound; for example, it outputs voice for conveying various messages to the user 10. The communication interface 16 is a communication interface for connecting the audio output device 100 to a telephone network (not shown) or the Internet (not shown).
 The device control apparatus 200 detects the voice output by the audio output device 100 and generates a control command corresponding to the detected voice. The device control apparatus 200 transmits the generated control command to the air conditioner 300 via the communication network 600. The device control apparatus 200 has a function of converting voice into words and a function of converting words into control commands. The device control apparatus 200 is, for example, a smart speaker.
 The configuration of the device control apparatus 200 will now be described with reference to FIG. 3. As shown in FIG. 3, the device control apparatus 200 includes a processor 21, a flash memory 22, a touch screen 23, a microphone 24, a speaker 25, and a communication interface 26. The processor 21 controls the overall operation of the device control apparatus 200. The processor 21 is, for example, a CPU with built-in ROM, RAM, RTC, and the like. The CPU operates, for example, according to a basic program stored in the ROM and uses the RAM as a work area.
 The flash memory 22 is a nonvolatile memory that stores various types of information; for example, it stores a program executed by the processor 21. The touch screen 23 detects operations performed by the user and supplies a signal indicating the detection result to the processor 21. The touch screen 23 also displays information under the control of the processor 21.
 The microphone 24 is a device that converts sound into an electrical signal; for example, it converts the voice output by the audio output device 100 into an electrical signal. The speaker 25 is a device that converts a supplied electrical signal into physical vibration to produce sound; for example, it outputs voice for conveying various messages to the user 10. The communication interface 26 is a communication interface for connecting the device control apparatus 200 to the communication network 600.
 The air conditioner 300 is a facility device controlled by the device control system 1000. The air conditioner 300 is, for example, a device that conditions the air in a home space, and has, for example, a heating function, a cooling function, a dehumidifying function, and a fan function. The air conditioner 300 includes, for example, an indoor unit (not shown) installed inside the house, an outdoor unit (not shown) installed outside the house, and a remote controller (not shown) for operating the indoor unit and the outdoor unit. The air conditioner 300 has a function of connecting to the communication network 600, and is controlled according to control commands received from the device control apparatus 200 via the communication network 600.
 Next, the functions of the audio output device 100 will be described with reference to FIG. 4. As shown in FIG. 4, the audio output device 100 functionally includes a control unit 101, an audio output unit 103, an audio information storage unit 104, a behavior detection unit 105, a history information generation unit 106, and a history information storage unit 107. The behavior detection unit 105 includes an operation detection unit 102. The operation detection means corresponds, for example, to the operation detection unit 102; the audio output means corresponds, for example, to the audio output unit 103; the behavior detection means corresponds, for example, to the behavior detection unit 105; and the history information generation means corresponds, for example, to the history information generation unit 106.
 The control unit 101 controls the overall operation of the audio output device 100. For example, the control unit 101 causes the audio output unit 103 to output voice based on the detection result of the operation detection unit 102. The control unit 101 also causes, for example, the history information generation unit 106 to generate history information based on the detection result of the operation detection unit 102 and the detection result of the behavior detection unit 105. Further, the control unit 101 causes, for example, the audio output unit 103 to output voice based on the history information together with at least one of those two detection results. The functions of the control unit 101 are realized, for example, by the processor 11 executing a program stored in the flash memory 12.
 The operation detection unit 102 detects an operation performed on the air conditioner 300 by the user 10. This operation is an operation for controlling the air conditioner 300, that is, an operation instructing the content of control for the air conditioner 300. The operation is, for example, an operation instructing the air conditioner 300 to turn on its power, an operation switching the air-conditioning mode to cooling, or an operation changing the set temperature to 22°C. The operation is a screen operation or a voice operation on the audio output device 100. For example, the operation detection unit 102 receives a screen operation on an operation screen for accepting operations on the air conditioner 300, or detects voice expressing the content of control for the air conditioner 300. The operation detection unit 102 can therefore also be regarded, in substance, as an operation reception unit that receives operations by the user 10. The functions of the operation detection unit 102 are realized, for example, by the touch screen 13 or the microphone 14.
 When the operation detection unit 102 detects the above operation, the audio output unit 103 outputs voice representing the content of the operation. For example, suppose the control unit 101 identifies the content of the operation and acquires audio information representing the content of the control. In this case, the audio output unit 103 generates an electrical signal based on the audio information supplied from the control unit 101 and produces voice corresponding to that signal. The functions of the audio output unit 103 are realized, for example, by the processor 11 and the speaker 15 working together.
 The audio information storage unit 104 stores audio information. The audio information indicates, for each operation content, that is, for each control content, the voice to be output. For example, the audio information associates the operation content "air conditioner 300: power: on" with information corresponding to an electrical signal for outputting the voice "Please turn on the power of the air conditioner 300." The functions of the audio information storage unit 104 are realized, for example, by the flash memory 12.
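 As an illustration only, the correspondence held by the audio information storage unit 104 can be modeled as a lookup table from operation content to output phrase. The key format and the phrases below are hypothetical stand-ins for the stored electrical-signal information, not the actual stored data:

```python
# Hypothetical model of the audio information table: operation content
# (device, item, value) mapped to the phrase to synthesize and output.
VOICE_TABLE = {
    ("air_conditioner_300", "power", "on"):
        "Please turn on the power of the air conditioner 300.",
    ("air_conditioner_300", "mode", "cooling"):
        "Please set the air conditioner 300 to cooling mode.",
    ("air_conditioner_300", "set_temp", "28C"):
        "Please set the air conditioner 300 to 28 degrees.",
}

def phrase_for(operation):
    """Return the phrase registered for an operation, or None if unregistered."""
    return VOICE_TABLE.get(operation)
```

In this sketch, an unregistered operation simply yields no voice, which mirrors the fact that only operations with stored audio information can be spoken.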
 The behavior detection unit 105 detects behavior performed by the user 10. This behavior is, for example, an operation on the air conditioner 300 by the user 10, or the utterance of words by the user 10. In this embodiment, an operation on the air conditioner 300 is, in substance, an operation on the audio output device 100. The functions of the behavior detection unit 105 are realized, for example, by the touch screen 13 or the microphone 14.
 When the operation detected by the operation detection unit 102 and the behavior detected by the behavior detection unit 105 are in a predetermined relationship, the history information generation unit 106 generates history information in which the content of the operation and the content of the behavior are associated with each other. The predetermined relationship is, for example, a relationship in which the difference between the detection times is within a threshold, or a relationship in which both detection times fall within a setting mode. The functions of the history information generation unit 106 are realized, for example, by the processor 11 executing a program stored in the flash memory 12.
 The history information storage unit 107 stores the history information generated by the history information generation unit 106. The functions of the history information storage unit 107 are realized, for example, by the flash memory 12.
 When the behavior detection unit 105 detects the above behavior, the audio output unit 103 outputs the voice representing the content of the operation associated with the content of that behavior in the history information. That is, when history information exists in which the content of the detected behavior is associated with the content of an operation, the audio output unit 103 outputs voice representing the content of that operation.
 Here, when the operation and the behavior are detected within a predetermined time of each other, the history information generation unit 106 generates history information associating the content of the operation with the content of the behavior. The predetermined time is, for example, a few minutes. After the history information has been generated, the audio output unit 103 outputs the voice when the behavior is detected. When the user 10 has a track record of performing the operation and the behavior in succession at relatively short intervals, the user 10 is likely to perform the operation together with the behavior. Given such a track record, when the behavior is detected, the audio output unit 103 regards it as likely that the user 10 will perform the operation, and automatically outputs the voice for realizing the control corresponding to that operation.
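 A minimal sketch of this time-window rule, assuming detections are timestamped in seconds; the 180-second window is an illustrative value for "a few minutes":

```python
PREDETERMINED_TIME = 180.0  # seconds; stands in for "a few minutes"

def should_associate(operation_time, behavior_time, window=PREDETERMINED_TIME):
    """Associate the operation and the behavior in history information
    only if they were detected within the predetermined time of each other."""
    return abs(operation_time - behavior_time) <= window
```

For example, a behavior detected at t = 100 s and an operation detected at t = 160 s fall within the window and would be associated; an operation at t = 400 s would not.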
 Specifically, in this embodiment, the operation detection unit 102 detects a first operation on the air conditioner 300 by the user 10 and a second operation on the air conditioner 300 by the user 10. When the first operation is detected, the audio output unit 103 outputs a first voice representing the content of the first operation; when the second operation is detected, the audio output unit 103 outputs a second voice representing the content of the second operation. The behavior detection unit 105 includes the operation detection unit 102 and detects the second operation as the behavior performed by the user 10.
 When the first operation and the second operation are detected within the predetermined time, the history information generation unit 106 generates history information in which the content of the first operation and the content of the second operation are associated with each other. After this history information has been generated, the audio output unit 103 outputs both the first voice and the second voice when the second operation is detected. When there is a track record of the first operation and the second operation being detected in succession, the two operations are likely to be performed in succession again. Given such a track record, when the second operation is detected, the audio output unit 103 regards it as likely that the first operation will also be performed, and outputs not only the second voice representing the content of the second operation but also the first voice representing the content of the first operation.
 In this embodiment, the history information generation unit 106 generates the history information associating the content of the first operation with the content of the second operation when the first operation is detected before the predetermined time has elapsed since the second operation was detected. That is, when there is a track record of the first operation being detected after the second operation, a newly detected second operation triggers output of the first voice representing the content of the first operation in addition to the second voice representing the content of the second operation. Conversely, given the same track record, a newly detected first operation triggers output of only the first voice representing the content of the first operation; the second voice representing the content of the second operation is not output.
 Thus, in this embodiment, voice representing the content of an operation considered likely to be performed after a detected operation is output automatically, whereas voice representing the content of an operation considered unlikely to follow is not output automatically. For example, suppose the second operation turns on the power of the air conditioner 300 and the first operation sets the air-conditioning mode of the air conditioner 300 to cooling. The first operation is likely to be performed after the second operation, but the second operation is unlikely to be performed after the first operation. Therefore, only when the operation that was detected earlier is newly detected is the voice indicating the content of the later-detected operation additionally output.
 Here, it is preferable that the history information generation unit 106 generate the history information associating the content of the operation with the content of the behavior when the number of times the operation and the behavior have been detected within the predetermined time, within a most recent predetermined period, has reached a predetermined threshold. The most recent predetermined period is, for example, the most recent month; the predetermined time is, for example, a few minutes; and the predetermined threshold is, for example, five times. When the operation and the behavior have been detected in succession, for example, five times in the most recent month, the operation is considered likely to be performed together with the behavior. In such a case, it is preferable that the voice representing the content of the operation be output when the behavior is detected. This configuration suppresses inappropriate voice output.
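 The period-and-threshold condition above can be sketched as follows. The 30-day period and the threshold of five are the example values from the text; the list of co-detection timestamps is a hypothetical input:

```python
from datetime import datetime, timedelta

RECENT_PERIOD = timedelta(days=30)  # "the most recent month"
THRESHOLD = 5                       # "for example, five times"

def history_warranted(co_detection_times, now):
    """Return True when the operation and the behavior have been co-detected
    (within the predetermined time of each other) at least THRESHOLD times
    within the most recent predetermined period."""
    recent = [t for t in co_detection_times if now - t <= RECENT_PERIOD]
    return len(recent) >= THRESHOLD
```

With five co-detections spread over the past 30 days, `history_warranted` returns True; with only four, it returns False, and no history information would be generated yet.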
 Next, the functions of the device control apparatus 200 will be described with reference to FIG. 5. As shown in FIG. 5, the device control apparatus 200 functionally includes a control unit 201, a voice detection unit 202, an audio output unit 203, an audio information storage unit 204, a device control unit 205, and a command information storage unit 206. The voice detection means of the device control apparatus 200 corresponds, for example, to the voice detection unit 202; the device control means corresponds, for example, to the device control unit 205.
 The control unit 201 controls the overall operation of the device control apparatus 200. For example, the control unit 201 identifies the content of control for the air conditioner 300 from the voice detected by the voice detection unit 202, and causes the device control unit 205 to transmit a control command representing the identified content of control. The functions of the control unit 201 are realized, for example, by the processor 21 executing a program stored in the flash memory 22.
 The voice detection unit 202 detects the voice output by the audio output unit 103. It is therefore desirable that the voice detection unit 202 be placed near the audio output unit 103; for example, the voice detection unit 202 is placed within a few meters of the audio output unit 103. The functions of the voice detection unit 202 are realized, for example, by the microphone 24.
 The audio output unit 203 outputs various voices under the control of the control unit 201; for example, it outputs voice representing announcements to the user 10. The functions of the audio output unit 203 are realized, for example, by the processor 21 and the speaker 25 working together.
 The audio information storage unit 204 stores audio information. This audio information is used to identify the content of control from the voice detected by the voice detection unit 202. For example, for each content of control, the audio information indicates information representing the electrical signal corresponding to the voice expressing that content of control. The functions of the audio information storage unit 204 are realized, for example, by the flash memory 22.
 The device control unit 205 controls the air conditioner 300 based on the content of the operation represented by the voice detected by the voice detection unit 202. For example, under the control of the control unit 201, the device control unit 205 transmits a control command corresponding to the detected voice to the air conditioner 300 via the communication network 600. The functions of the device control unit 205 are realized, for example, by the processor 21 and the communication interface 26 working together.
 The command information storage unit 206 stores command information. The command information is, for example, information in which the content of control corresponding to the content of an operation is associated with a control command. The functions of the command information storage unit 206 are realized, for example, by the flash memory 22.
 Next, history information will be described with reference to FIG. 6. The history information shown in FIG. 6 indicates all combinations of operations that were performed in succession in the past. The detection start time is the time at which detection of the corresponding record's combination began. Operation A is the operation performed first among the successive operations and corresponds, for example, to the second operation. Operation B is the operation performed second and corresponds, for example, to the first operation. Operation C is the operation performed third and also corresponds, for example, to the first operation. In this embodiment, there is one second operation and there are one or more first operations.
 The top record of the history information shown in FIG. 6 indicates that, at 12:00 on May 18, 2018, an operation turning on the power of the air conditioner 300, an operation setting the air-conditioning mode of the air conditioner 300 to cooling, and an operation setting the set temperature of the air conditioner 300 to 28°C were performed in succession. Given such a track record, when an operation turning on the power of the air conditioner 300 is detected, the operation setting the air-conditioning mode to cooling and the operation setting the set temperature to 28°C are likely to be performed. Therefore, when the power-on operation is detected, in addition to the voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the air-conditioning mode of the air conditioner 300 be set to cooling and a voice instructing that the set temperature of the air conditioner 300 be set to 28°C are output. In the example shown in FIG. 6, the content of operation A, which corresponds to the second operation, is the same in the top record and the bottom record. In such a case, it is preferable to adopt the top record, whose detection start time is newer.
 Although FIG. 6 shows an example in which all detected combinations of operations are included in the history information, the history information is not limited to this example. For example, the history information may include only combinations detected within a most recent predetermined period, or only combinations detected at least a predetermined number of times within that period. Further, of two combinations in a competing relationship, the record with the older detection start time may be excluded from the history information. Combinations are in a competing relationship when, for example, their second operations are the same but at least one first operation differs.
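 The record structure of FIG. 6, together with the rule of adopting the record with the newest detection start time when several records share the same operation A, can be sketched as follows. The field names and the example values are hypothetical:

```python
# Hypothetical records modeled on FIG. 6: each holds a detection start time,
# the second operation (A), and the subsequent first operations (B, C, ...).
HISTORY = [
    {"start": "2018-05-18 12:00", "A": "power:on",
     "rest": ["mode:cooling", "set_temp:28C"]},
    {"start": "2018-04-01 09:00", "A": "power:on",
     "rest": ["mode:heating"]},
]

def record_for(operation_a, history=HISTORY):
    """Among records whose operation A matches the newly detected operation,
    adopt the one with the newest detection start time."""
    matches = [r for r in history if r["A"] == operation_a]
    # ISO-style timestamps compare correctly as strings.
    return max(matches, key=lambda r: r["start"]) if matches else None
```

Here a newly detected "power:on" selects the newer record, so the voices for "mode:cooling" and "set_temp:28C" would be output rather than the older "mode:heating".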
 Next, the audio output process executed by the audio output device 100 will be described with reference to the flowchart of FIG. 7. The audio output process is executed, for example, in response to the audio output device 100 being powered on.
 First, the processor 11 determines whether an operation has been detected (step S101). If no operation has been detected (step S101: NO), the processor 11 returns to step S101. If an operation has been detected (step S101: YES), the processor 11 stores the detection start time (step S102). After completing step S102, the processor 11 determines whether there is a linked setting (step S103). Specifically, the processor 11 determines whether the history information contains a record whose second operation is the operation detected in step S101.
 If the processor 11 determines that there is a linked setting (step S103: YES), it outputs a voice group (step S104). For example, the processor 11 repeats a process of selecting one operation content contained in the record and causing the speaker 15 to output a voice representing the selected operation content, until all operation contents contained in the record have been selected. If the processor 11 determines that there is no linked setting (step S103: NO), it outputs a single voice (step S105); for example, the processor 11 causes the speaker 15 to output a voice representing the content of the operation detected in step S101.
When the processor 11 has completed step S104 or step S105, it determines whether an operation has been detected (step S106). If the processor 11 determines that an operation has been detected (step S106: YES), it determines whether an interlock setting exists (step S107). Specifically, the processor 11 determines whether the history information contains a record in which the operation detected in step S106 is registered as the second operation. If the processor 11 determines that an interlock setting exists (step S107: YES), it outputs a voice group (step S108). If, on the other hand, the processor 11 determines that no interlock setting exists (step S107: NO), it outputs a single voice (step S109).
When the processor 11 has completed step S108 or step S109, or has determined that no operation has been detected (step S106: NO), it determines whether the first time has elapsed since the detection start time (step S110). The first time is the predetermined time described above, for example several minutes. If the processor 11 determines that the first time has not yet elapsed since the detection start time (step S110: NO), it returns the process to step S106.
If, on the other hand, the processor 11 determines that the first time has elapsed since the detection start time (step S110: YES), it determines whether a plurality of operations were detected within the first period (step S111). The first period is the period from the detection start time until the first time elapses. If the processor 11 determines that a plurality of operations were detected within the first period (step S111: YES), it generates history information (step S112). For example, the processor 11 updates the history information to include a record in which the operation detected in step S101 is the second operation and the operation detected in step S106 is the first operation. If the processor 11 determines that a plurality of operations were not detected within the first period (step S111: NO), or when step S112 has been completed, it returns the process to step S101.
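The interlock decision in steps S103 to S105 can be sketched as follows. This is a minimal illustrative sketch, not code from the publication: the record layout (a `second_op` trigger field and an `ops` list) and all names are assumptions made for the example.

```python
# Sketch (illustrative, not from the publication) of steps S103-S105 of
# FIG. 7: if the history information contains a record whose second
# operation matches the detected operation, voices for all operations in
# that record are output; otherwise only a single voice is output.

def voices_to_output(detected_op, history):
    """Return the list of operation contents to announce for detected_op.

    history is a list of records; each record is a dict with a
    'second_op' key (the triggering operation) and an 'ops' key
    (all operation contents registered in that record).
    """
    for record in history:
        if record["second_op"] == detected_op:   # interlock setting exists (S103: YES)
            return list(record["ops"])           # output the voice group (S104)
    return [detected_op]                         # no interlock setting: single voice (S105)


history = [
    {"second_op": "aircon power on",
     "ops": ["aircon power on", "aircon mode cooling", "aircon set 28C"]},
]

print(voices_to_output("aircon power on", history))
# voice group: all three operation contents are announced
print(voices_to_output("aircon power off", history))
# no matching record: only the detected operation itself is announced
```

In this sketch the record itself contains the triggering operation, so the triggering operation is announced as part of the voice group, matching the behaviour described for step S104.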
The audio output method according to the present embodiment is realized by the audio output device 100 according to the present embodiment executing the audio output process shown in FIG. 7. In this audio output method, first, an operation performed on equipment by the user 10 is detected, and when the operation is detected, a voice representing the content of the operation is output. The audio output method also detects an action performed by the user 10. When the action is detected and the operation and the action have a predetermined relationship, the audio output method outputs the above voice.
As described above, in the present embodiment, when an operation performed on equipment by the user 10 and an action performed by the user 10 have a predetermined relationship and the action is detected, a voice representing the content of the operation is output. Therefore, according to the present embodiment, the equipment can easily be brought into a desired control state by using the device control apparatus 200, which controls the equipment based on voice. For example, a single operation on the audio output device 100 can be expected to achieve equipment control that reflects the preferences of the user 10.
Further, in the present embodiment, when there is a history of a plurality of operations having been performed within the predetermined time and any one of these operations is performed, voices representing the contents of the other operations are automatically output. Therefore, according to the present embodiment, the equipment can be brought into a desired control state with few operations.
Further, in the present embodiment, when there is a history of a plurality of operations having been performed within the predetermined time and the first of these operations is performed, voices representing the contents of the other operations are automatically output. Therefore, according to the present embodiment, the equipment can be brought into a desired control state appropriately and with few operations.
(Embodiment 2)
In Embodiment 1, an example was described in which no keyword is associated with operations and actions performed in succession. In the present embodiment, an example is described in which a keyword is associated with operations and actions performed in succession. The following description focuses on the points that differ from Embodiment 1.
First, the functions of the audio output device 120 will be described with reference to FIG. 8. As shown in FIG. 8, the audio output device 120 functionally comprises a control unit 101, an operation detection unit 102, an audio output unit 103, an audio information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107. The action detection unit 105 comprises a voice detection unit 108. The voice detection means of the audio output device 120 corresponds, for example, to the voice detection unit 108.
The operation detection unit 102 detects a first operation performed on the equipment by the user 10. When the first operation is detected, the audio output unit 103 outputs a first voice representing the content of the first operation. The action detection unit 105 comprises the voice detection unit 108, which detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10. The function of the voice detection unit 108 is realized, for example, by the function of the microphone 14.
When the first operation and the third voice are detected within the predetermined time, the history information generation unit 106 generates history information in which the content of the first operation is associated with the word. After this history information has been generated, the audio output unit 103 outputs the first voice when the third voice is detected.
The word represented by the third voice is a word that is highly likely to be uttered together with the execution of the first operation, and is treated as a keyword. When there is a history of this keyword having been uttered together with the execution of the first operation, a voice representing the first operation is automatically output when the keyword is uttered. Note that any number of first operations, as long as it is one or more, may be associated with the keyword in the history information. The history information may be information in which the content of the first operation is associated with a keyword uttered immediately before or immediately after the first operation, information in which the content of the first operation is associated with a keyword uttered immediately before the first operation, or information in which the content of the first operation is associated with a keyword uttered immediately after the first operation.
Next, the history information in the present embodiment will be described with reference to FIG. 9. In the present embodiment, the history information associates a keyword with the contents of operation A, operation B, and operation C. Operations A, B, and C are operations performed together with the utterance of the keyword, and are first operations. The top record of the history information indicates that there is a history of an operation turning on the power of the air conditioner 300, an operation setting the air-conditioning mode of the air conditioner 300 to cooling, and an operation setting the target temperature of the air conditioner 300 to 28°C having been performed together with the utterance of the keyword "air conditioner". When such a history exists and the keyword "air conditioner" is uttered, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the air-conditioning mode of the air conditioner 300 be set to cooling, and a voice instructing that the target temperature of the air conditioner 300 be set to 28°C are automatically output. Although FIG. 9 shows an example in which all detected combinations of a keyword and first operations are included in the history information, the history information is not limited to this example. For example, the history information may include only the combinations of a keyword and first operations detected within the most recent predetermined period, or only the combinations detected at least a predetermined number of times within the most recent predetermined period. Further, among competing combinations, the record with the older detection start time may be excluded from the history information. Competing combinations are, for example, combinations that have the same keyword but differ in at least one first operation.
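One of the pruning rules just described can be sketched as follows. This is an illustrative sketch only; the tuple layout of a record and the function name are assumptions, and timestamps are compared lexically for simplicity.

```python
# Sketch (illustrative, not from the publication) of excluding competing
# combinations: among records with the same keyword but different first
# operations, only the record with the newer detection start time is kept.

def prune_competing(records):
    """records: list of (start_time, keyword, ops) tuples.
    Returns the records with, for each keyword, only the newest entry kept."""
    newest = {}
    for start_time, keyword, ops in records:
        kept = newest.get(keyword)
        if kept is None or start_time > kept[0]:
            newest[keyword] = (start_time, keyword, ops)
    return sorted(newest.values())


records = [
    ("2018-05-18 12:00", "air conditioner",
     ("aircon power on", "aircon mode cooling", "aircon set 28C")),
    ("2018-06-01 09:30", "air conditioner",
     ("aircon power on", "aircon mode cooling", "aircon set 26C")),
]
print(prune_competing(records))
# only the newer 2018-06-01 record for "air conditioner" survives
```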
Next, the audio output process executed by the audio output device 120 will be described with reference to the flowchart of FIG. 10. The audio output process is executed, for example, in response to the audio output device 120 being powered on. Here, an example is described in which the contents of a series of operations performed after the utterance of a keyword are associated with that keyword.
First, the processor 11 determines whether a word has been detected (step S201). For example, the processor 11 determines whether a voice representing a word that can serve as a keyword has been detected by the microphone 14. If the processor 11 determines that no word has been detected (step S201: NO), it returns the process to step S201. If, on the other hand, the processor 11 determines that a word has been detected (step S201: YES), it stores the detection start time (step S202).
Upon completing step S202, the processor 11 determines whether an interlock setting exists (step S203). Specifically, the processor 11 determines whether the history information contains a record whose keyword is the word detected in step S201. If the processor 11 determines that an interlock setting exists (step S203: YES), it outputs a single voice or a voice group (step S204). When the record contains the content of a single operation, a single voice is output; when the record contains the contents of a plurality of operations, a voice group is output.
When the processor 11 has completed step S204, or has determined that no interlock setting exists (step S203: NO), it determines whether an operation has been detected (step S205). If the processor 11 determines that an operation has been detected (step S205: YES), it outputs a single voice (step S206). When the processor 11 has completed step S206, or has determined that no operation has been detected (step S205: NO), it determines whether the first time has elapsed since the detection start time (step S207).
If the processor 11 determines that the first time has not yet elapsed since the detection start time (step S207: NO), it returns the process to step S205. If, on the other hand, the processor 11 determines that the first time has elapsed since the detection start time (step S207: YES), it determines whether an operation was detected within the first period (step S208). If the processor 11 determines that an operation was detected within the first period (step S208: YES), it generates history information (step S209). The processor 11 updates the history information to include a record in which the detected word, i.e. the keyword, is associated with the contents of the operations detected within the first period. When the processor 11 has completed step S209, or has determined that no operation was detected within the first period (step S208: NO), it returns the process to step S201.
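The learning behaviour of FIG. 10 for one keyword utterance can be sketched as follows. The sketch is illustrative only: the mapping of a keyword to a list of operation contents, and all names, are assumptions made for the example.

```python
# Sketch (illustrative, not from the publication) of steps S203-S209 of
# FIG. 10 for a single keyword utterance: operations already linked to the
# keyword are announced (S203-S204), and operations detected within the
# first period are then recorded under the keyword (S208-S209).

def process_keyword_period(keyword, ops_in_first_period, history):
    """history maps a keyword to the list of operation contents linked to it.
    Returns the operation contents to announce for this utterance."""
    announced = list(history.get(keyword, []))        # S203: interlock setting?
    if ops_in_first_period:                           # S208: operations detected
        history[keyword] = list(ops_in_first_period)  # S209: update history
    return announced


history = {}
# First utterance of "air conditioner": nothing to announce yet, but the
# operations that follow the utterance are learned.
first = process_keyword_period(
    "air conditioner", ["aircon power on", "aircon mode cooling"], history)
# Second utterance of the same keyword: the learned operations are announced.
second = process_keyword_period("air conditioner", [], history)
print(first, second)
```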
In the present embodiment, when there is a history of at least one operation having been detected together with the utterance of a keyword, voices representing the contents of the at least one operation are automatically output when the utterance of the keyword is detected. Therefore, according to the present embodiment, the equipment can be brought into a desired control state by the utterance of a keyword. Further, in the present embodiment, the user can freely choose the keyword associated with the series of operations that brings the equipment into the desired control state, which improves the user's convenience.
(Embodiment 3)
In Embodiment 1, an example was described in which the action detected by the action detection unit 105 is an operation detected by the operation detection unit 102. In Embodiment 2, an example was described in which the action detected by the action detection unit 105 is a voice utterance detected by the voice detection unit 108. In the present embodiment, an example is described in which the actions detected by the action detection unit 105 are both operations detected by the operation detection unit 102 and voice utterances detected by the voice detection unit 108. The following description focuses on the points that differ from Embodiments 1 and 2.
First, the functions of the audio output device 130 will be described with reference to FIG. 11. As shown in FIG. 11, the audio output device 130 functionally comprises a control unit 101, an audio output unit 103, an audio information storage unit 104, an action detection unit 105, a history information generation unit 106, and a history information storage unit 107. The action detection unit 105 comprises the operation detection unit 102 and the voice detection unit 108.
The action detection unit 105 comprises the voice detection unit 108, which detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10. When the first operation, the second operation, and the third voice are detected within the predetermined time, the history information generation unit 106 generates history information in which the content of the first operation, the content of the second operation, and the word are associated with one another. After the history information has been generated, the audio output unit 103 outputs the first voice and the second voice when the third voice is detected.
That is, in the present embodiment, when there is a history of the first operation, the second operation, and the third voice having been detected in succession, the first voice and the second voice are output when at least one of the second operation and the third voice is detected. For example, as in Embodiment 2, assume that the history information shown in FIG. 9 has been generated. In the present embodiment, both when the keyword "air conditioner" is uttered and when an operation turning on the power of the air conditioner 300 is performed, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the air-conditioning mode of the air conditioner 300 be set to cooling, and a voice instructing that the target temperature of the air conditioner 300 be set to 28°C are output.
In the present embodiment, when there is a history of a plurality of operations having been detected together with the utterance of a keyword, voices representing the contents of the plurality of operations are automatically output when the utterance of the keyword is detected or when the first of the plurality of operations is detected. Therefore, according to the present embodiment, the equipment can be brought into a desired control state by the utterance of the keyword or by the first operation of the series of operations.
(Embodiment 4)
In Embodiments 1 to 3, examples were described in which there is only one piece of equipment to be controlled. In the present embodiment, an example is described in which there are a plurality of pieces of equipment to be controlled. Specifically, as shown in FIG. 12, an example is described in which the equipment to be controlled consists of three devices: the air conditioner 300, the bathroom heater 310, and the water heater 320.
The operation detection unit 102 detects a first operation performed by the user 10 on first equipment among the plurality of pieces of equipment, and a second operation performed by the user 10 on second equipment among the plurality of pieces of equipment. In the present embodiment, any one of the air conditioner 300, the bathroom heater 310, and the water heater 320 is the second equipment, and the remaining two devices are the first equipment. However, any of the three devices may serve as the second equipment.
When the first operation is detected, the audio output unit 103 outputs a first voice representing the content of the first operation; when the second operation is detected, it outputs a second voice representing the content of the second operation. The action detection unit 105 comprises the operation detection unit 102 and detects the second operation as an action performed by the user 10. When the first operation and the second operation are detected within the predetermined time, the history information generation unit 106 generates history information in which the content of the first operation and the content of the second operation are associated with each other. After the history information has been generated, the audio output unit 103 outputs the first voice and the second voice when the second operation is detected.
Next, the history information in the present embodiment will be described with reference to FIG. 13. In the present embodiment, the history information associates a detection start time with the contents of operation A, operation B, and operation C. Operations A, B, and C are a series of operations detected in succession within the predetermined time. This series of operations may include a plurality of operations on a single piece of equipment. Any one of operations A, B, and C is the second operation; the remaining two are first operations.
The top record of the history information indicates that, between the detection start time of 12:00 on May 18, 2018 and the elapse of the predetermined time, an operation turning on the power of the air conditioner 300, an operation turning on the power of the bathroom heater 310, and an operation turning on the power of the water heater 320 were performed in succession. When such a history exists and any one of these three operations is detected, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the power of the bathroom heater 310 be turned on, and a voice instructing that the power of the water heater 320 be turned on are automatically output.
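The triggering behaviour just described can be sketched as follows. This is an illustrative sketch only; the record layout (a detection start time paired with the series of operation contents) and the function name are assumptions.

```python
# Sketch (illustrative, not from the publication) of the behaviour in
# Embodiment 4: when a detected operation appears anywhere in a recorded
# series of operations on several devices, the whole series is announced.

def series_to_announce(detected_op, history):
    """history: list of (start_time, ops) records, where ops is the series
    of operation contents detected in succession within the predetermined
    time."""
    for _start_time, ops in history:
        if detected_op in ops:
            return list(ops)          # announce the whole series
    return [detected_op]              # no matching series: single voice


history = [
    ("2018-05-18 12:00",
     ("aircon power on", "bathroom heater power on", "water heater power on")),
]
print(series_to_announce("bathroom heater power on", history))
# any one of the three operations triggers voices for all three devices
```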
In the present embodiment, when there is a history of a series of operations having been performed on a plurality of pieces of equipment, voices representing the contents of the series of operations are automatically output when any one operation of the series is detected. Therefore, according to the present embodiment, a plurality of pieces of equipment can be brought into a desired control state by a single operation of the series.
(Embodiment 5)
In Embodiment 4, an example was described in which no keyword is associated with the series of operations on the plurality of pieces of equipment. In the present embodiment, an example is described in which a keyword is associated with the series of operations on the plurality of pieces of equipment. The following description focuses on the points that differ from Embodiment 4.
The action detection unit 105 comprises the voice detection unit 108, which detects a third voice representing a word uttered by the user 10, and detects the utterance of the word by the user 10 as an action performed by the user 10. When the first operation, the second operation, and the third voice are detected within the predetermined time, the history information generation unit 106 generates history information in which the content of the first operation, the content of the second operation, and the word are associated with one another.
After the history information has been generated, the audio output unit 103 outputs the first voice and the second voice when the third voice is detected. Thus, in the present embodiment, when there is a history of a series of operations on a plurality of pieces of equipment and the utterance of the keyword corresponding to the third voice having been detected in succession, a series of voices representing the series of operations is output when either one operation of the series or the utterance of the keyword is detected.
Next, the history information in the present embodiment will be described with reference to FIG. 14. In the present embodiment, the history information associates a keyword with the contents of operation A, operation B, and operation C. Operations A, B, and C are a series of operations detected in succession within the predetermined time. The keyword is a word detected within the predetermined time together with the series of operations. Any one of operations A, B, and C is the second operation; the remaining two are first operations.
The top record of the history information indicates that the utterance of the keyword "I'm home now", an operation turning on the power of the air conditioner 300, an operation turning on the power of the bathroom heater 310, and an operation turning on the power of the water heater 320 were performed in succession. When such a history exists and either the utterance of the keyword "I'm home now" or any one of the three power-on operations is detected, a voice instructing that the power of the air conditioner 300 be turned on, a voice instructing that the power of the bathroom heater 310 be turned on, and a voice instructing that the power of the water heater 320 be turned on are automatically output.
In the present embodiment, when there is a history of a series of operations having been performed on a plurality of pieces of equipment together with the utterance of a keyword, voices representing the contents of the series of operations are automatically output when the utterance of the keyword or any one operation of the series is detected. Therefore, according to the present embodiment, a plurality of pieces of equipment can be brought into a desired control state by the utterance of the keyword or by a single operation of the series.
(Embodiment 6)
 In Embodiments 1 to 5, examples in which the history information is generated automatically were described. In this embodiment, an example in which the history information is generated manually is described. In this embodiment, the touch screen 13 or the microphone 14 receives an instruction to shift to a setting mode; the transition instruction receiving means thus corresponds to, for example, the touch screen 13 or the microphone 14.
 When the transition instruction is received by the touch screen 13 or the microphone 14 and an operation and a behavior are detected while the setting mode set in response to that instruction is active, the history information generation unit 106 generates history information in which the content of the operation and the content of the behavior are associated with each other. After the history information has been generated, the voice output unit 103 outputs the voice when the behavior is detected.
 For example, assume that the user 10 issues the transition instruction to the voice output device 100. The voice output device 100 speaks through cooperation of the processor 11 and the speaker 15, and detects the speech of the user 10 with the microphone 14. In this case, the voice output device 100 says, for example, "What should be linked?". When the user 10 says "Water heater, power, on", the device again asks "What should be linked?". When the user 10 says "Bathroom heating, power, on", the device asks "What should be linked?" once more. When the user 10 says "End", the device asks "What is the keyword?". When the user 10 says "I'm home now", the device responds "Setting complete".
 The setting process executed by the voice output device 100 will now be described with reference to FIG. 15.
 First, the processor 11 determines whether there is an instruction to shift to the setting mode (step S301). If the processor 11 determines that there is no such instruction (step S301: NO), it returns the process to step S301. If the processor 11 determines that there is an instruction to shift to the setting mode (step S301: YES), it utters a message prompting an operation (step S302).
 After completing step S302, the processor 11 determines whether there is a control designation operation (step S303). The control designation operation is, for example, an operation designating a control by voice. If the processor 11 determines that there is a control designation operation (step S303: YES), it stores the designated control (step S304). If it determines that there is no control designation operation (step S303: NO), or after completing step S304, the processor 11 determines whether there is a setting end operation (step S305). If it determines that there is no setting end operation (step S305: NO), the processor 11 returns the process to step S302. If it determines that there is a setting end operation (step S305: YES), the processor 11 utters a message prompting the user to speak a keyword (step S306).
 After completing step S306, the processor 11 determines whether a keyword has been uttered (step S307). If the processor 11 determines that a keyword has been uttered (step S307: YES), it generates history information with the keyword (step S308). If it determines that no keyword has been uttered (step S307: NO), it generates history information without a keyword (step S309). After completing step S308 or step S309, the processor 11 returns the process to step S301.
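 The setting process just described can be sketched as a simple dialogue loop. The `ScriptedIO` stand-in for the speaker/microphone pair and all names below are illustrative assumptions, not the patent's implementation:

```python
class ScriptedIO:
    """Stand-in for the speaker (say) and microphone (listen), for illustration."""
    def __init__(self, replies):
        self.replies = list(replies)
        self.prompts = []
    def say(self, text):
        self.prompts.append(text)
    def listen(self):
        return self.replies.pop(0)

def setting_mode(io):
    """Sketch of steps S302-S309: collect designated controls until a
    setting-end operation, then ask for a keyword and build one record."""
    controls = []
    while True:
        io.say("What should be linked?")   # S302: prompt an operation
        utterance = io.listen()
        if utterance == "end":             # S305: setting-end operation
            break
        controls.append(utterance)         # S303-S304: store the designated control
    io.say("What is the keyword?")         # S306: prompt for a keyword
    keyword = io.listen() or None          # S307: was a keyword uttered?
    # S308/S309: history information with or without a keyword.
    return {"keyword": keyword, "operations": tuple(controls)}

io = ScriptedIO(["water heater, power, on",
                 "bathroom heating, power, on",
                 "end",
                 "I'm home now"])
record = setting_mode(io)
```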
 In this embodiment, the series of operations desired by the user is set manually. Therefore, according to this embodiment, settings not intended by the user can be suppressed.
(Modification)
 While embodiments of the present invention have been described above, various modifications and applications are possible when implementing the present invention.
 In the present invention, which parts of the configurations, functions, and operations described in the above embodiments are adopted is arbitrary. Further configurations, functions, and operations may be employed in addition to those described above, and the configurations, functions, and operations described in the above embodiments may be combined freely.
 For example, in Embodiment 2, an example was described in which a series of operations on equipment is performed after the utterance of a keyword. The keyword may instead be uttered after the series of operations on the equipment.
 An operation described as a screen operation may be implemented as a voice operation, and an operation described as a voice operation may be implemented as a screen operation.
 By applying an operation program that defines the operation of the voice output device 100 according to the present invention to an existing personal computer or information terminal device, the personal computer or the like can be made to function as the voice output device 100 according to the present invention. The distribution method of such a program is arbitrary: for example, the program may be stored on and distributed via a computer-readable recording medium such as a CD-ROM (Compact Disc Read-Only Memory), a DVD (Digital Versatile Disc), or a memory card, or it may be distributed via a communication network such as the Internet.
 The present invention is capable of various embodiments and modifications without departing from its broad spirit and scope. The above-described embodiments are for explaining the present invention and do not limit its scope; that is, the scope of the present invention is indicated not by the embodiments but by the claims, and various modifications made within the scope of the claims and within the meaning of inventions equivalent thereto are regarded as within the scope of the present invention.
 The present invention is applicable to a device control system including a device control device that controls equipment based on voice.
10 user; 11, 21 processor; 12, 22 flash memory; 13, 23 touch screen; 14, 24 microphone; 15, 25 speaker; 16, 26 communication interface; 100, 120, 130 voice output device; 101, 201 control unit; 102 operation detection unit; 103, 203 voice output unit; 104, 204 voice information storage unit; 105 behavior detection unit; 106 history information generation unit; 107 history information storage unit; 108, 202 voice detection unit; 200 device control device; 205 device control unit; 206 command information storage unit; 300 air conditioner; 310 bathroom heater; 320 water heater; 600 communication network; 1000 device control system

Claims (13)

  1.  An audio output device comprising:
      operation detection means for detecting an operation on equipment by a user;
      audio output means for outputting, when the operation is detected by the operation detection means, a voice representing the content of the operation;
      behavior detection means for detecting a behavior performed by the user; and
      history information generation means for generating, when the operation detected by the operation detection means and the behavior detected by the behavior detection means are in a predetermined relationship, history information in which the content of the operation and the content of the behavior are associated with each other,
      wherein the audio output means outputs, when the behavior is detected by the behavior detection means, the voice representing the content of the operation associated by the history information with the content of the behavior.
  2.  The audio output device according to claim 1, wherein:
      the history information generation means generates, when the operation and the behavior are detected within a predetermined time, the history information in which the content of the operation and the content of the behavior are associated with each other; and
      the audio output means outputs the voice when the behavior is detected after the history information has been generated.
  3.  The audio output device according to claim 2, wherein:
      the operation detection means detects a first operation on the equipment by the user and a second operation on the equipment by the user;
      the audio output means outputs a first voice representing the content of the first operation when the first operation is detected, and outputs a second voice representing the content of the second operation when the second operation is detected;
      the behavior detection means comprises the operation detection means and detects the second operation as the behavior performed by the user;
      the history information generation means generates, when the first operation and the second operation are detected within the predetermined time, the history information in which the content of the first operation and the content of the second operation are associated with each other; and
      the audio output means outputs the first voice and the second voice when the second operation is detected after the history information has been generated.
  4.  The audio output device according to claim 3, wherein the history information generation means generates the history information in which the content of the first operation and the content of the second operation are associated with each other when the first operation is detected before the predetermined time elapses after the second operation is detected.
  5.  The audio output device according to claim 3 or 4, wherein:
      the behavior detection means comprises voice detection means for detecting a third voice representing a word uttered by the user, and detects the utterance of the word by the user as the behavior performed by the user;
      the history information generation means generates, when the first operation, the second operation, and the third voice are detected within the predetermined time, the history information in which the content of the first operation, the content of the second operation, and the word are associated with one another; and
      the audio output means outputs the first voice and the second voice when the third voice is detected after the history information has been generated.
  6.  The audio output device according to claim 2, wherein:
      the operation detection means detects a first operation on the equipment by the user;
      the audio output means outputs a first voice representing the content of the first operation when the first operation is detected;
      the behavior detection means comprises voice detection means for detecting a third voice representing a word uttered by the user, and detects the utterance of the word by the user as the behavior performed by the user;
      the history information generation means generates, when the first operation and the third voice are detected within the predetermined time, the history information in which the content of the first operation and the word are associated with each other; and
      the audio output means outputs the first voice when the third voice is detected after the history information has been generated.
  7.  The audio output device according to claim 2, wherein:
      there are a plurality of pieces of the equipment;
      the operation detection means detects a first operation by the user on a first piece of equipment among the plurality of pieces of equipment and a second operation by the user on a second piece of equipment among the plurality of pieces of equipment;
      the audio output means outputs a first voice representing the content of the first operation when the first operation is detected, and outputs a second voice representing the content of the second operation when the second operation is detected;
      the behavior detection means comprises the operation detection means and detects the second operation as the behavior performed by the user;
      the history information generation means generates, when the first operation and the second operation are detected within the predetermined time, the history information in which the content of the first operation and the content of the second operation are associated with each other; and
      the audio output means outputs the first voice and the second voice when the second operation is detected after the history information has been generated.
  8.  The audio output device according to claim 7, wherein:
      the behavior detection means comprises voice detection means for detecting a third voice representing a word uttered by the user, and detects the utterance of the word by the user as the behavior performed by the user;
      the history information generation means generates, when the first operation, the second operation, and the third voice are detected within the predetermined time, the history information in which the content of the first operation, the content of the second operation, and the word are associated with one another; and
      the audio output means outputs the first voice and the second voice when the third voice is detected after the history information has been generated.
  9.  The audio output device according to any one of claims 2 to 8, wherein the history information generation means generates the history information in which the content of the operation and the content of the behavior are associated with each other when the number of times the operation and the behavior have been detected within the predetermined time during a most recent predetermined period reaches a predetermined threshold.
  10.  The audio output device according to claim 1, further comprising transition instruction receiving means for receiving an instruction to shift to a setting mode, wherein:
      the history information generation means generates, when the transition instruction is received by the transition instruction receiving means and the operation and the behavior are detected while the setting mode set based on the transition instruction is active, the history information in which the content of the operation and the content of the behavior are associated with each other; and
      the audio output means outputs the voice when the behavior is detected after the history information has been generated.
  11.  A device control system comprising an audio output device and a device control device, wherein:
      the audio output device comprises
      operation detection means for detecting an operation on equipment by a user,
      audio output means for outputting, when the operation is detected by the operation detection means, a voice representing the content of the operation,
      behavior detection means for detecting a behavior performed by the user, and
      history information generation means for generating, when the operation detected by the operation detection means and the behavior detected by the behavior detection means are in a predetermined relationship, history information in which the content of the operation and the content of the behavior are associated with each other;
      the device control device comprises
      voice detection means for detecting the voice output by the audio output means, and
      device control means for controlling the equipment based on the content of the operation represented by the voice detected by the voice detection means; and
      the audio output means outputs, when the behavior is detected by the behavior detection means, the voice representing the content of the operation associated by the history information with the content of the behavior.
  12.  An audio output method comprising:
      detecting an operation on equipment by a user;
      outputting, when the operation is detected, a voice representing the content of the operation;
      detecting a behavior performed by the user; and
      outputting the voice when the behavior is detected and the operation and the behavior are in a predetermined relationship.
  13.  A program for causing a computer to function as:
      operation detection means for detecting an operation on equipment by a user;
      audio output means for outputting, when the operation is detected by the operation detection means, a voice representing the content of the operation;
      behavior detection means for detecting a behavior performed by the user; and
      history information generation means for generating, when the operation detected by the operation detection means and the behavior detected by the behavior detection means are in a predetermined relationship, history information in which the content of the operation and the content of the behavior are associated with each other,
      wherein the audio output means outputs, when the behavior is detected by the behavior detection means, the voice representing the content of the operation associated by the history information with the content of the behavior.
PCT/JP2018/021204 2018-06-01 2018-06-01 Audio output device, apparatus control system, audio output method, and program WO2019229978A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2020522542A JP6945734B2 (en) 2018-06-01 2018-06-01 Audio output device, device control system, audio output method, and program
PCT/JP2018/021204 WO2019229978A1 (en) 2018-06-01 2018-06-01 Audio output device, apparatus control system, audio output method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/021204 WO2019229978A1 (en) 2018-06-01 2018-06-01 Audio output device, apparatus control system, audio output method, and program

Publications (1)

Publication Number Publication Date
WO2019229978A1 true WO2019229978A1 (en) 2019-12-05

Family

ID=68696888

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/021204 WO2019229978A1 (en) 2018-06-01 2018-06-01 Audio output device, apparatus control system, audio output method, and program

Country Status (2)

Country Link
JP (1) JP6945734B2 (en)
WO (1) WO2019229978A1 (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04167695A (en) * 1990-10-26 1992-06-15 Sharp Corp Remote control system
JP2001036981A (en) * 1999-07-16 2001-02-09 Fujitsu Ltd Remote controller and computer readable recording medium recording remote control program


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020105466A1 (en) * 2018-11-21 2020-05-28 ソニー株式会社 Information processing device and information processing method
JPWO2020105466A1 (en) * 2018-11-21 2021-10-07 ソニーグループ株式会社 Information processing device and information processing method
JP7456387B2 (en) 2018-11-21 2024-03-27 ソニーグループ株式会社 Information processing device and information processing method
JP7474084B2 (en) 2020-03-17 2024-04-24 アイホン株式会社 Intercom System

Also Published As

Publication number Publication date
JP6945734B2 (en) 2021-10-06
JPWO2019229978A1 (en) 2021-01-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18920639; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2020522542; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 18920639; Country of ref document: EP; Kind code of ref document: A1)