JP2021032906A

JP2021032906A - Receiving device

Info

Publication number: JP2021032906A
Application number: JP2019148384A
Authority: JP
Inventors: 丈次山下; Joji Yamashita
Original assignee: Toshiba Visual Solutions Corp
Current assignee: Toshiba Visual Solutions Corp
Priority date: 2019-08-13
Filing date: 2019-08-13
Publication date: 2021-03-01
Anticipated expiration: 2039-08-13
Also published as: WO2021027892A1; JP7206167B2; CN112930686A; CN112930686B

Abstract

To reduce the starting of a voice recognition service when the voice recognition service is unnecessary. SOLUTION: A receiving device comprises a voice input unit, a selection unit, and a voice recognition unit. The voice input unit accepts input of a user's voice. The selection unit selects either an enabled state or a disabled state of voice recognition based on a predetermined condition. The voice recognition unit executes voice recognition processing on the voice input to the voice input unit when the enabled state is selected, and does not execute the voice recognition processing when the disabled state is selected. SELECTED DRAWING: Figure 2

Description

Embodiments of the present invention relate to a receiving device.

In recent years, there has been a growing need for voice recognition services that allow a user to operate a device by voice. For example, devices such as television devices having a voice recognition function are known. In such a television device or the like, when a wake word uttered by the user is detected, the voice recognition service is activated and, for example, returns some response or lowers the volume of the content being played so that the user's voice is easier to recognize.

However, in such a television device or the like, the voice recognition service may be activated at a timing not intended by the user, for example because of erroneous detection of the wake word. In such a case, the user's viewing of the content is disturbed, and the user may find this annoying.

Japanese Unexamined Patent Publication No. 2013-235032

An object is to reduce the starting of the voice recognition service in situations where the voice recognition service is not needed.

The receiving device of the embodiment includes a voice input unit, a selection unit, and a voice recognition unit. The voice input unit accepts input of the user's voice. The selection unit selects either the enabled state or the disabled state of voice recognition based on a predetermined condition. The voice recognition unit executes voice recognition processing on the voice input to the voice input unit when the enabled state is selected, and does not execute the voice recognition processing when the disabled state is selected.

FIG. 1 is a diagram showing an example of the hardware configuration of the television device according to the first embodiment.
FIG. 2 is a diagram showing an example of the functional configuration of the television device according to the first embodiment.
FIG. 3 is a flowchart showing an example of the flow of the process of selecting the enabled state or the disabled state of voice recognition according to the first embodiment.
FIG. 4 is a diagram showing an example of the functional configuration of the television device according to the second embodiment.
FIG. 5 is a diagram showing an example of the functional configuration of the television device according to the third embodiment.
FIG. 6 is a diagram showing an example of the functional configuration of the television device according to the fourth embodiment.
FIG. 7 is a diagram showing an example of the functional configuration of the television device according to the fifth embodiment.

(First Embodiment)
FIG. 1 is a diagram showing an example of the hardware configuration of the television device 10 according to the present embodiment. As shown in FIG. 1, the television device 10 includes an antenna 101, an input terminal 102a, a tuner 103, a demodulator 104, a demultiplexer 105, input terminals 102b and 102c, an A/D (analog/digital) converter 106, a selector 107, a signal processing unit 108, a speaker 109, a display panel 110, an operation unit 111, a light receiving unit 112, an IP communication unit 113, a CPU (Central Processing Unit) 114, a memory 115, a storage 116, a microphone 117, and an audio I/F (interface) 118. The television device 10 is an example of the receiving device in the present embodiment.

The antenna 101 receives broadcast signals of digital broadcasting and supplies the received broadcast signals to the tuner 103 via the input terminal 102a. The tuner 103 selects the broadcast signal of a desired channel from the broadcast signals supplied from the antenna 101 and supplies the selected broadcast signal to the demodulator 104. A broadcast signal is also called a broadcast wave.

The demodulator 104 demodulates the broadcast signal supplied from the tuner 103 and supplies the demodulated broadcast signal to the demultiplexer 105. The demultiplexer 105 separates the broadcast signal supplied from the demodulator 104 to generate a video signal and an audio signal, and supplies the generated video signal and audio signal to the selector 107.

The selector 107 is configured to select one of the plurality of signals supplied from the demultiplexer 105, the A/D converter 106, and the input terminal 102c, and to supply the selected signal to the signal processing unit 108.

The signal processing unit 108 is configured to perform predetermined signal processing on the video signal supplied from the selector 107 and to supply the processed video signal to the display panel 110. The signal processing unit 108 is also configured to perform predetermined signal processing on the audio signal supplied from the selector 107 and to supply the processed audio signal to the speaker 109.

The speaker 109 is configured to output voice or various sounds based on the audio signal supplied from the signal processing unit 108. The speaker 109 also changes the volume of the voice or sounds it outputs under the control of the CPU 114.

The display panel 110 is configured to display video such as still images and moving images based on the video signal supplied from the signal processing unit 108 or under the control of the CPU 114. The display panel 110 is an example of a display unit.

The input terminal 102b receives analog signals (a video signal and an audio signal) input from the outside. The input terminal 102c is configured to receive digital signals (a video signal and an audio signal) input from the outside. For example, the input terminal 102c can receive digital signals from a recorder (BD recorder) or the like equipped with a drive device that drives a recording medium for recording and playback, such as a BD (Blu-ray Disc) (registered trademark), to record and play back. The A/D converter 106 performs A/D conversion on the analog signal supplied from the input terminal 102b and supplies the resulting digital signal to the selector 107.

The operation unit 111 receives the user's operation input. The light receiving unit 112 receives infrared rays from the remote controller 119. The IP communication unit 113 is a communication interface for performing IP (Internet Protocol) communication via the network 300.

The CPU 114 is a control unit that controls the entire television device 10. The memory 115 is, for example, a ROM (Read Only Memory) that stores various computer programs executed by the CPU 114 and a RAM (Random Access Memory) that provides a work area for the CPU 114. The storage 116 is an HDD (Hard Disk Drive), an SSD (Solid State Drive), or the like. The storage 116 records, for example, the signal selected by the selector 107 as recorded data.

The microphone 117 acquires the voice uttered by the user and sends it to the audio I/F 118. The microphone 117 is an example of the voice input unit. The microphone 117 can accept voice input in the "on state" and cannot accept voice input in the "off state". In the present embodiment, the microphone 117 is automatically turned on when the television device 10 starts up. For example, the microphone 117 remains in the on state while the enabled state of voice recognition is selected under the control of the CPU 114. The microphone 117 is switched to the off state when the disabled state of voice recognition is selected under the control of the CPU 114. Details of the selection between the enabled state and the disabled state of voice recognition are described later as the processing of the selection unit 15.

The audio I/F 118 performs analog-to-digital conversion on the voice acquired by the microphone 117 and sends the result to the CPU 114 as a voice signal.

Next, the functions of the television device 10 according to the present embodiment will be described.

FIG. 2 is a diagram showing an example of the functional configuration of the television device 10 according to the present embodiment. As shown in FIG. 2, the television device 10 includes an acquisition unit 11, a wake word detection unit 12, a voice recognition unit 13, a display control unit 14, a selection unit 15, and a device control unit 16.

The program executed by the television device 10 of the present embodiment has a modular configuration including the units described above (the acquisition unit, the wake word detection unit, the voice recognition unit, the display control unit, the selection unit, and the device control unit). As actual hardware, the CPU 114 reads the program from the ROM or the like and executes it, whereby the units are loaded onto a main storage device such as the RAM, and the acquisition unit, the wake word detection unit, the voice recognition unit, the display control unit, the selection unit, and the device control unit are generated on the main storage device.

The program executed by the television device 10 of the present embodiment is provided, for example, by being incorporated in advance in the ROM or the like. The program may also be provided by being recorded, as a file in an installable or executable format, on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile Disk).

Furthermore, the program executed by the television device 10 of the present embodiment may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. The program may also be provided or distributed via a network such as the Internet. In the present embodiment, each functional unit is described as being realized by a single CPU, but each functional unit may be realized by a plurality of CPUs or by various circuits.

The acquisition unit 11 acquires the user's voice input to the microphone 117 via the audio I/F 118. The acquisition unit 11 sends the acquired voice to the wake word detection unit 12 and the voice recognition unit 13. The "voice" acquired by the acquisition unit 11 is a digital voice signal converted by the audio I/F 118, but is hereinafter simply referred to as "voice".

The acquisition unit 11 also acquires various signals from the operation unit 111, the light receiving unit 112, the IP communication unit 113, the selector 107, the signal processing unit 108, and other components connected to the CPU 114. For example, the acquisition unit 11 accepts the user's operation based on the infrared rays from the remote controller 119 received by the light receiving unit 112 or on the operation input to the operation unit 111. The acquisition unit 11 sends the content of the accepted user operation to the display control unit 14 and the device control unit 16.

The wake word detection unit 12 detects a wake word from the voice acquired by the acquisition unit 11. The wake word is a predetermined voice command that triggers the activation of the voice recognition service, and is assumed to be defined in advance. A known voice recognition technique can be used to determine whether or not a voice signal contains the wake word.

In the present embodiment, the setting of the wake word detection unit 12 itself does not change depending on whether the selection unit 15, described later, selects the enabled state or the disabled state of voice recognition. However, when the disabled state is selected, the microphone 117 is in the off state and cannot accept voice input, so no voice is acquired. The wake word detection unit 12 therefore does not execute the wake word detection process while the disabled state of voice recognition is selected. When the enabled state of voice recognition is selected, the microphone 117 is in the on state and can accept voice input. The wake word detection unit 12 therefore executes the wake word detection process on the voice input to the microphone 117 while the enabled state of voice recognition is selected.

When the wake word detection unit 12 detects the wake word in the voice acquired by the acquisition unit 11, it notifies the display control unit 14 and the device control unit 16 that the wake word has been detected. When the user's voice continues after the wake word, the wake word detection unit 12 sends the voice following the wake word to the voice recognition unit 13.
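As an illustration only, the hand-off described above (detect the wake word, then forward whatever follows it) might be sketched as below. The description leaves the detection technique to known voice recognition methods that operate on the voice signal, so this sketch assumes the utterance has already been transcribed to text. The function name `split_at_wake_word` and the wake word "hello tv" are hypothetical and do not come from the description.

```python
def split_at_wake_word(utterance: str, wake_word: str = "hello tv"):
    """Return (detected, trailing_text) for a transcribed utterance.

    Assumption: detection is done on text for illustration only. The
    wake word "hello tv" is a made-up example.
    """
    # Case-insensitive search for the wake word.
    index = utterance.lower().find(wake_word.lower())
    if index < 0:
        # No wake word: nothing is forwarded to voice recognition.
        return False, ""
    # The part after the wake word is what would be passed on to the
    # voice recognition unit; strip separators such as ", ".
    trailing = utterance[index + len(wake_word):].strip(" ,.")
    return True, trailing
```

For example, `split_at_wake_word("Hello TV, turn up the volume")` yields a detection together with the trailing command text, while an utterance without the wake word yields no detection and an empty string.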

The voice recognition unit 13 executes voice recognition processing on the voice input to the microphone 117. In the present embodiment, the setting of the voice recognition unit 13 itself does not change depending on whether the selection unit 15, described later, selects the enabled state or the disabled state of voice recognition. However, when the disabled state is selected, the microphone 117 cannot accept voice input, so no voice is acquired. The voice recognition unit 13 therefore does not execute the voice recognition processing while the disabled state of voice recognition is selected. When the enabled state of voice recognition is selected, the microphone 117 can accept voice input. The voice recognition unit 13 therefore executes the voice recognition processing on the voice input to the microphone 117 while the enabled state of voice recognition is selected.

More specifically, when the wake word detection unit 12 detects the wake word, the voice recognition unit 13 identifies the content of the user's voice by performing voice recognition processing on the voice following the wake word. A known technique can be applied to the voice recognition processing. For example, the voice recognition unit 13 converts the content of the user's voice into text data using a known technique. The voice recognition unit 13 sends the voice recognition result to the display control unit 14 and the device control unit 16. In the present embodiment, the voice recognition service is realized by the functional units, such as the display control unit 14 and the device control unit 16, executing processing based on the result of the voice recognition unit 13 recognizing the user's voice.

The display control unit 14 controls various displays on the display panel 110. For example, when the acquisition unit 11 acquires a user operation input to the remote controller 119 or the like, the display control unit 14 displays an operation screen corresponding to the operation on the display panel 110. More specifically, when the user performs an operation such as pressing a button that starts the setting of a recording reservation, the display control unit 14 displays, on the display panel 110, an operation screen that can accept the user's operation. The operation screen may be displayed, for example, as an OSD (On Screen Display) superimposed on the screen of the content being played, or as a full-screen display covering the entire display panel 110. In the present embodiment, "content" includes television programs, moving images recorded on a DVD or the like, and moving images played by an application.

The display control unit 14 also displays various notification screens on the display panel 110. For example, the display control unit 14 displays a notification screen containing a message that provides information to the user, warns the user, or calls the user's attention, as an OSD superimposed on the screen of the content being played.

When the wake word detection unit 12 detects the wake word, the display control unit 14 displays, on the display panel 110, a message, an icon, or the like that responds to the voice. The message or icon may, for example, prompt the user to speak, or may show the recognition result of the user's voice as character data. The display of the message or icon lets the user easily recognize that the wake word has been recognized and that the voice uttered next will be treated as an instruction to the television device 10.

In addition, for example, when displaying an operation screen or a notification screen on the display panel 110, the display control unit 14 sets, in the memory 115, an operation screen display flag indicating that an operation screen is being displayed or a notification screen display flag indicating that a notification screen is being displayed. When the display of the operation screen or the notification screen ends, the display control unit 14 deletes the operation screen display flag or the notification screen display flag from the memory 115. The method of indicating that an operation screen or a notification screen is being displayed on the display panel 110 is not limited to this. For example, the display control unit 14 may notify the selection unit 15 that an operation screen or a notification screen has been displayed on the display panel 110, or that the display of the operation screen or the notification screen has ended.
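The flag handling described above can be pictured with a small sketch. This is a minimal illustration under stated assumptions, not the patent's implementation: the class `DisplayFlags`, its method names, and the two flag constants are hypothetical, and a Python set stands in for the flag area of the memory 115.

```python
OPERATION_SCREEN = "operation_screen_display_flag"      # hypothetical name
NOTIFICATION_SCREEN = "notification_screen_display_flag"  # hypothetical name


class DisplayFlags:
    """Hypothetical stand-in for the display flags kept in the memory 115."""

    def __init__(self):
        self._flags = set()

    def on_screen_shown(self, flag: str) -> None:
        # Set the flag when the corresponding screen is displayed.
        self._flags.add(flag)

    def on_screen_closed(self, flag: str) -> None:
        # Delete the flag when the display ends. discard() does nothing
        # if the flag was never set.
        self._flags.discard(flag)

    def is_set(self, flag: str) -> bool:
        return flag in self._flags
```

Keeping the two flags separate lets a reader of the flags tell which kind of screen is displayed, although the condition used later only needs to know whether at least one of them is set.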

The display control unit 14 also controls the display of the display panel 110 based on a command included in the user's voice recognized by the voice recognition unit 13. For example, based on the command included in the user's voice, the display control unit 14 controls the tuner 103 to select the channel on which the program specified by the user's voice is being broadcast, and displays the program on the display panel 110. The display control unit 14 may also, based on the command included in the user's voice, play back recorded data of a program stored in the storage 116 or in an external storage device and display it on the display panel 110.

The selection unit 15 selects either the enabled state or the disabled state of voice recognition based on a predetermined condition.

The predetermined condition in the present embodiment is that "at least one of an operation screen and a notification screen is displayed on the display panel 110". The selection unit 15 of the present embodiment selects the disabled state when the state of the display panel 110 of the television device 10 satisfies the predetermined condition, and selects the enabled state when it does not.

For example, the selection unit 15 determines that an operation screen is displayed when the operation screen display flag is set in the memory 115, and determines that a notification screen is displayed when the notification screen display flag is set in the memory 115. When the selection unit 15 determines that at least one of an operation screen and a notification screen is displayed on the display panel 110, it determines that the television device 10 satisfies the predetermined condition. In this case, the selection unit 15 selects the disabled state.

The method of determining whether an operation screen or a notification screen is displayed is not limited to this. For example, the selection unit 15 may determine whether at least one of an operation screen and a notification screen is displayed on the display panel 110 based on display information about the operation screen or the notification screen obtained from the display control unit 14.

When the selection unit 15 determines that neither an operation screen nor a notification screen is displayed on the display panel 110, it determines that the television device 10 does not satisfy the predetermined condition. In this case, the selection unit 15 selects the enabled state.

The selection unit 15 sends the result of selecting the enabled state or the disabled state of voice recognition to the device control unit 16.
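The selection rule of the first embodiment reduces to a single boolean test, which the following sketch makes explicit. The names `RecognitionState` and `select_state` are hypothetical, and the two boolean arguments stand for whatever mechanism (flags in memory or a notification from the display control unit) reports the screen state.

```python
from enum import Enum


class RecognitionState(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"


def select_state(operation_screen_shown: bool,
                 notification_screen_shown: bool) -> RecognitionState:
    """Illustrative selection rule: disable voice recognition while at
    least one of the two screens is displayed, otherwise enable it."""
    if operation_screen_shown or notification_screen_shown:
        return RecognitionState.DISABLED
    return RecognitionState.ENABLED
```

Written as a pure function of the screen state, the rule has no memory of earlier selections, which matches the description: once both screens are closed, the next evaluation returns the enabled state again.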

The device control unit 16 controls various devices included in the television device 10. For example, the device control unit 16 turns the microphone 117 off when the selection unit 15 selects the disabled state of voice recognition, and turns the microphone 117 on when the selection unit 15 selects the enabled state of voice recognition.

When the wake word detection unit 12 detects the wake word, the device control unit 16 controls the speaker 109 to lower the volume. This reduces interference from the sound of the content with the voice that the user utters after the wake word.
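A rough sketch of the volume reduction on wake word detection is shown below. The description only states that the volume is lowered, so the `Speaker` class, the `duck_volume` function, the target level of 5, and the returned previous volume are all assumptions made for illustration.

```python
class Speaker:
    """Hypothetical stand-in for the speaker 109 with an integer volume."""

    def __init__(self, volume: int = 20):
        self.volume = volume


def duck_volume(speaker: Speaker, ducked_volume: int = 5) -> int:
    """Lower the speaker volume when a wake word is detected.

    Returns the volume in effect before the call. Returning it is an
    assumption that would let a caller restore the level later; the
    description does not say whether or when the volume is restored.
    """
    previous = speaker.volume
    # Never raise the volume: a speaker already quieter than the
    # ducked level is left unchanged.
    speaker.volume = min(speaker.volume, ducked_volume)
    return previous
```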

The device control unit 16 also controls various devices included in the television device 10 based on a command included in the user's voice recognized by the voice recognition unit 13. For example, when the user's voice includes the command "turn up the volume", the device control unit 16 controls the speaker 109 to raise the volume. The device control unit 16 may also search for information on the Internet based on a command included in the user's voice recognized by the voice recognition unit 13.
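The "turn up the volume" example could be handled along the following lines once the voice recognition result is available as text. This is a sketch under assumptions: the function name `apply_voice_command`, the substring match, the step size, and the volume limit are hypothetical and are not specified in the description.

```python
def apply_voice_command(recognized_text: str, volume: int,
                        step: int = 1, max_volume: int = 100) -> int:
    """Return the new speaker volume for a recognized command.

    Only the command from the description's example is handled. Any
    other text leaves the volume unchanged.
    """
    if "turn up the volume" in recognized_text.lower():
        # Raise the volume by one step without exceeding the limit.
        return min(volume + step, max_volume)
    return volume
```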

次に、以上のように構成されたテレビジョン装置１０で実行される音声認識の有効状態と無効状態の選択処理の流れを説明する。 Next, the flow of the selection process of the enabled state and the disabled state of the voice recognition executed by the television device 10 configured as described above will be described.

図３は、本実施形態にかかる音声認識の有効状態と無効状態の選択処理の流れの一例を示すフローチャートである。このフローチャートの処理は、テレビジョン装置１０が稼動している間は実行され続けるものとする。また、このフローチャートの開始時点においては、音声認識は有効状態であり、マイク１１７はオン状態であるものとする。 FIG. 3 is a flowchart showing an example of the flow of the process of selecting the enabled state and the disabled state of the voice recognition according to the present embodiment. It is assumed that the processing of this flowchart continues to be executed while the television device 10 is in operation. Further, at the start of this flowchart, it is assumed that the voice recognition is in the enabled state and the microphone 117 is in the on state.

まず、選択部１５は、例えば、メモリ１１５に操作画面表示フラグまたは通知画面表示フラグが立っているか否かに基づいて、テレビジョン装置１０が所定の条件を満たすか否かを判定する（Ｓ１）。 First, the selection unit 15 determines whether or not the television device 10 satisfies a predetermined condition based on, for example, whether or not the operation screen display flag or the notification screen display flag is set in the memory 115 (S1). ..

選択部１５は、メモリ１１５に操作画面表示フラグまたは通知画面表示フラグが立っている場合に、テレビジョン装置１０が所定の条件を満たすと判定する（Ｓ１“Ｙｅｓ”）。この場合、選択部１５は、音声認識の無効状態を選択する（Ｓ２）。選択部１５は、音声認識の無効状態を選択したことを、機器制御部１６に送出する。 The selection unit 15 determines that the television device 10 satisfies a predetermined condition when the operation screen display flag or the notification screen display flag is set in the memory 115 (S1 “Yes”). In this case, the selection unit 15 selects the disabled state of voice recognition (S2). The selection unit 15 sends out to the device control unit 16 that the invalid state of voice recognition has been selected.

次に、機器制御部１６は、マイク１１７を“オフ状態”にする（Ｓ３）。これにより、マイク１１７は音声の入力を受け付けない状態となる。機器制御部１６によってマイク１１７が“オフ状態”にされた後は、Ｓ１の処理に戻り、処理が繰り返される。 Next, the device control unit 16 turns the microphone 117 into an “off state” (S3). As a result, the microphone 117 is in a state of not accepting the input of voice. After the microphone 117 is turned "off" by the device control unit 16, the process returns to the process of S1 and the process is repeated.

また、選択部１５は、メモリ１１５に操作画面表示フラグおよび通知画面表示フラグのいずれも立っていない場合に、テレビジョン装置１０が所定の条件を満たさないと判定する（Ｓ１“Ｎｏ”）。この場合、選択部１５は、音声認識の有効状態を選択する（Ｓ４）。例えば、音声認識が無効状態になった後に、操作画面または通知画面の表示が終了してフラグが削除された場合、選択部１５が有効状態を選択することにより、音声認識が無効状態から有効状態に切り替わる。選択部１５は、音声認識の有効状態を選択したことを、機器制御部１６に送出する。 Further, the selection unit 15 determines that the television device 10 does not satisfy the predetermined condition when neither the operation screen display flag nor the notification screen display flag is set in the memory 115 (S1 “No”). In this case, the selection unit 15 selects the enabled state of voice recognition (S4). For example, when the display of the operation screen or the notification screen ends and the flag is deleted after voice recognition has been disabled, the selection unit 15 selects the enabled state, and voice recognition switches from the disabled state to the enabled state. The selection unit 15 notifies the device control unit 16 that the enabled state of voice recognition has been selected.

次に、機器制御部１６は、マイク１１７をオン状態にする（Ｓ５）。これにより、マイク１１７は音声の入力を受け付け可能な状態となる。なお、既にマイク１１７がオン状態である場合は、機器制御部１６は、特に何も処理を実行しない。 Next, the device control unit 16 turns on the microphone 117 (S5). As a result, the microphone 117 is in a state where it can accept voice input. If the microphone 117 is already on, the device control unit 16 does not perform any particular process.

次に、ウェイクワード検出部１２は、マイク１１７に入力されたユーザの音声を、オーディオＩ／Ｆ１１８を介して取得する（Ｓ６）。取得部１１は、取得した音声を、ウェイクワード検出部１２と音声認識部１３とに送出する。 Next, the wake word detection unit 12 acquires the user's voice input to the microphone 117 via the audio I / F 118 (S6). The acquisition unit 11 sends the acquired voice to the wake word detection unit 12 and the voice recognition unit 13.

そして、ウェイクワード検出部１２は、取得部１１によって取得された音声にウェイクワードが含まれるか否かを判断する（Ｓ７）。ウェイクワード検出部１２は、取得された音声からウェイクワードを検出した場合（Ｓ７“Ｙｅｓ”）、表示制御部１４および機器制御部１６にウェイクワードを検出したことを通知する。また、ウェイクワード検出部１２は、ウェイクワードの後にユーザの音声が続いて入力された場合、ウェイクワードの後に続く音声を音声認識部１３に送出する。 Then, the wake word detection unit 12 determines whether or not the voice acquired by the acquisition unit 11 includes the wake word (S7). When the wake word is detected from the acquired voice (S7 “Yes”), the wake word detection unit 12 notifies the display control unit 14 and the device control unit 16 that the wake word has been detected. Further, when the user's voice is input after the wake word, the wake word detection unit 12 sends the voice following the wake word to the voice recognition unit 13.

次に、機器制御部１６は、スピーカ１０９を制御して再生中のコンテンツの音量を下げる（Ｓ８）。また、表示制御部１４は、ユーザに対する応答メッセージまたはアイコンを表示パネルに表示パネル１１０に表示する（Ｓ９）。このような機器制御部１６または表示制御部１４による処理は、音声認識サービスの開始時の処理の一例である。 Next, the device control unit 16 controls the speaker 109 to reduce the volume of the content being played (S8). Further, the display control unit 14 displays a response message or an icon for the user on the display panel 110 (S9). Such processing by the device control unit 16 or the display control unit 14 is an example of processing at the start of the voice recognition service.

そして、音声認識部１３は、ウェイクワードの後にマイク１１７に入力された音声に対する音声認識処理を実行する（Ｓ１０）。音声認識部１３は、音声認識処理による音声認識結果を、表示制御部１４と機器制御部１６とに送出する。そして、表示制御部１４または機器制御部１６は、音声認識結果に基づく処理を実行することにより、音声認識サービスを実現する（Ｓ１１）。その後、Ｓ１の処理に戻り、テレビジョン装置１０の電源が切られるまで、このフローチャートの処理が繰り返される。 Then, the voice recognition unit 13 executes a voice recognition process for the voice input to the microphone 117 after the wake word (S10). The voice recognition unit 13 sends the voice recognition result of the voice recognition process to the display control unit 14 and the device control unit 16. Then, the display control unit 14 or the device control unit 16 realizes the voice recognition service by executing the process based on the voice recognition result (S11). After that, the process returns to S1 and the process of this flowchart is repeated until the power of the television device 10 is turned off.
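The flow of S1 to S11 described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Television` class, the `select_state` and `run_once` functions, and the wake word `"hello tv"` are all hypothetical names, and plain string matching stands in for wake word detection and voice recognition on audio.

```python
from dataclasses import dataclass, field


@dataclass
class Television:
    """Hypothetical stand-in for the television device 10."""
    operation_screen_flag: bool = False     # flag assumed to be held in memory 115
    notification_screen_flag: bool = False  # flag assumed to be held in memory 115
    mic_on: bool = True                     # microphone 117 starts in the on state
    log: list = field(default_factory=list)


def select_state(tv: Television) -> str:
    """S1, S2, S4: select 'disabled' if either screen flag is set."""
    if tv.operation_screen_flag or tv.notification_screen_flag:
        return "disabled"
    return "enabled"


def run_once(tv: Television, utterance: str, wake_word: str = "hello tv") -> None:
    """One pass through S1 to S11 for a single utterance (text stands in for audio)."""
    if select_state(tv) == "disabled":
        tv.mic_on = False                    # S3: microphone off, then back to S1
        return
    tv.mic_on = True                         # S5: microphone on
    text = utterance.lower()                 # S6: acquire the voice
    if not text.startswith(wake_word):       # S7: wake word check
        return
    tv.log.append("volume down")             # S8: lower the content volume
    tv.log.append("show response icon")      # S9: show a response message or icon
    command = text[len(wake_word):].strip()  # S10: recognize what follows the wake word
    tv.log.append(f"execute: {command}")     # S11: act on the recognition result
```

Calling `run_once` repeatedly corresponds to the loop back to S1. While either flag is set, the utterance is never inspected, so a misdetected wake word cannot start the service.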

このように、本実施形態のテレビジョン装置１０は、所定の条件に基づいて、音声認識の有効状態と無効状態のいずれかを選択し、有効状態を選択した場合はマイク１１７に入力された音声に対する音声認識処理を実行し、無効状態を選択した場合は音声認識処理を実行しない。このため、本実施形態のテレビジョン装置１０によれば、音声認識サービスが不要な場面において音声認識サービスが開始することを低減することができる。 As described above, the television device 10 of the present embodiment selects either the enabled state or the disabled state of voice recognition based on a predetermined condition. When the enabled state is selected, it executes the voice recognition process on the voice input to the microphone 117; when the disabled state is selected, it does not execute the voice recognition process. Therefore, the television device 10 of the present embodiment can reduce cases in which the voice recognition service starts in situations where it is not needed.

例えば、ユーザが発話した音声が、ウェイクワードではないにも関わらず、ウェイクワードとして誤認識される場合がある。一般に、ユーザがリモートコントローラ等を操作している場面においては、音声認識サービスによる操作は不要であることが多い。しかしながら、従来技術においては、ユーザが表示パネル上の操作画面を見ながらリモートコントローラ等を操作している場面において、ユーザの発話した音声がウェイクワードとして誤認識されると、音声認識サービスが開始し、表示パネル上にユーザに対する応答メッセージまたはアイコンが表示されて操作画面が消えてしまったり、見えにくくなってしまったりすることがあった。 For example, a voice spoken by a user may be erroneously recognized as a wake word even though it is not one. In general, when the user is operating a remote controller or the like, operation through the voice recognition service is often unnecessary. In the prior art, however, if the user's speech is erroneously recognized as a wake word while the user is operating the remote controller or the like and looking at the operation screen on the display panel, the voice recognition service starts and a response message or icon for the user is displayed on the display panel, so that the operation screen may disappear or become difficult to see.

また、表示パネル上に通知画面が表示されている場合、ユーザは通知画面に表示されたメッセージ等を読んでいるため、該通知画面の表示の終了までは、他の画面によって該通知画面が遮られることは望ましくない。しかしながら、従来技術においては、ユーザが表示パネル上の通知画面を見ていても、ユーザの発話した音声がウェイクワードとして誤認識されると、音声認識サービスが開始し、表示パネル上にユーザに対する応答メッセージまたはアイコンが表示され、通知画面が消えてしまったり、見えにくくなってしまったりすることがあった。このような場合、ユーザが煩わしさを感じたり、ユーザへの情報提供に支障が出たりする場合がある。 Further, when a notification screen is displayed on the display panel, the user is reading the message or the like shown on it, so it is undesirable for the notification screen to be blocked by another screen before its display ends. In the prior art, however, even while the user is looking at the notification screen on the display panel, if the user's speech is erroneously recognized as a wake word, the voice recognition service starts and a response message or icon for the user is displayed on the display panel, so that the notification screen may disappear or become difficult to see. In such a case, the user may feel annoyed, or the provision of information to the user may be hindered.

これに対して、本実施形態のテレビジョン装置１０は、表示パネル１１０に操作画面または通知画面の少なくともいずれかが表示されている場合に、テレビジョン装置１０が所定の条件を満たすと判断し、無効状態を選択する。このため、本実施形態のテレビジョン装置１０によれば、表示パネル１１０に操作画面または通知画面が表示されている場合に、音声認識サービスが開始することを低減することができる。このため、本実施形態のテレビジョン装置１０によれば、ユーザが操作画面または通知画面を使用しているときに、表示パネル１１０上にユーザに対する応答メッセージまたはアイコンが表示されてユーザが操作画面または通知画面を見にくくなるということを、低減することができる。 In contrast, the television device 10 of the present embodiment determines that the predetermined condition is satisfied when at least one of the operation screen and the notification screen is displayed on the display panel 110, and selects the disabled state. Therefore, the television device 10 of the present embodiment can reduce cases in which the voice recognition service starts while the operation screen or the notification screen is displayed on the display panel 110. As a result, it can reduce cases in which, while the user is using the operation screen or the notification screen, a response message or icon for the user is displayed on the display panel 110 and makes the operation screen or the notification screen difficult to see.

また、本実施形態のテレビジョン装置１０は、有効状態を選択した場合にマイク１１７をオン状態にし、無効状態を選択した場合にマイク１１７をオフ状態にする。このため、本実施形態のテレビジョン装置１０によれば、無効状態においては物理的にユーザの音声の入力を不可にし、音声認識サービス開始することを低減することができる。 Further, the television device 10 of the present embodiment turns on the microphone 117 when the enabled state is selected, and turns off the microphone 117 when the disabled state is selected. Therefore, according to the television device 10 of the present embodiment, it is possible to physically disable the input of the user's voice in the disabled state and reduce the start of the voice recognition service.

なお、本実施形態では、ハードウェアであるマイク１１７を音声入力部の一例としたが、プログラムによって実現される取得部１１を、音声入力部の一例としても良い。また、マイク１１７は、テレビジョン装置１０本体ではなく、リモートコントローラ１１９に設けられても良い。また、音声入力部は、テレビジョン装置１０の外部の音声認識機器によって実現されても良い。 In the present embodiment, the microphone 117, which is hardware, is used as an example of the voice input unit, but the acquisition unit 11 realized by the program may be used as an example of the voice input unit. Further, the microphone 117 may be provided on the remote controller 119 instead of the television device 10 main body. Further, the voice input unit may be realized by an external voice recognition device of the television device 10.

また、本実施形態では、「操作画面または通知画面の少なくともいずれかが表示パネル１１０に表示されていること」を所定の条件としたが、「操作画面が表示パネル１１０に表示されていること」または「通知画面が表示パネル１１０に表示されていること」を所定の条件としても良い。例えば、「操作画面が表示パネル１１０に表示されていること」を所定の条件とする場合、選択部１５は、操作画面が表示パネル１１０に表示されている場合に、通知画面の表示の有無に関わらず、所定の条件を満たすと判定する。また、選択部１５は、操作画面が表示パネル１１０に表示されていない場合に、通知画面の表示の有無に関わらず、所定の条件を満たさないと判定する。 Further, in the present embodiment, the predetermined condition is "at least one of the operation screen and the notification screen is displayed on the display panel 110", but the predetermined condition may instead be "the operation screen is displayed on the display panel 110" or "the notification screen is displayed on the display panel 110". For example, when the predetermined condition is "the operation screen is displayed on the display panel 110", the selection unit 15 determines that the predetermined condition is satisfied whenever the operation screen is displayed on the display panel 110, regardless of whether the notification screen is displayed. Likewise, the selection unit 15 determines that the predetermined condition is not satisfied whenever the operation screen is not displayed on the display panel 110, regardless of whether the notification screen is displayed.

また、本実施形態では、ウェイクワード検出部１２と音声認識部１３とを別個の機能部としたが、音声認識部１３がウェイクワード検出部１２の機能を備えるものとしても良い。また、音声認識部１３とウェイクワード検出部１２とを総称して、音声認識部と称しても良い。なお、本実施形態で例示した音声認識サービスの内容は一例であり、音声認識サービスの内容は、例示した内容に限定されるものではない。 Further, in the present embodiment, the wake word detection unit 12 and the voice recognition unit 13 are separate functional units, but the voice recognition unit 13 may have the function of the wake word detection unit 12. Further, the voice recognition unit 13 and the wake word detection unit 12 may be collectively referred to as a voice recognition unit. The content of the voice recognition service illustrated in this embodiment is an example, and the content of the voice recognition service is not limited to the illustrated content.

また、本実施形態における音量の低下や表示パネル１１０への応答メッセージ等の表示は、音声認識サービス開始時の処理の一例であり、音声認識サービス開始時の処理はこれらに限定されるものではない。例えば、テレビジョン装置１０は、音声認識サービス開始時に、応答メッセージを音声出力しても良い。 Further, lowering the volume and displaying a response message or the like on the display panel 110 in the present embodiment are examples of processing at the start of the voice recognition service, and the processing at the start of the voice recognition service is not limited to these. For example, the television device 10 may output a response message by voice when the voice recognition service starts.

また、本実施形態では、選択部１５は、所定の条件を満たすと判定した場合に、音声認識の無効状態を選択し、所定の条件を満たさないと判定した場合に、音声認識の有効状態を選択するものとしたが、選択基準はこれに限定されるものではない。 Further, in the present embodiment, the selection unit 15 selects the disabled state of voice recognition when it determines that the predetermined condition is satisfied, and selects the enabled state of voice recognition when it determines that the predetermined condition is not satisfied, but the selection criteria are not limited to this.

例えば、音声認識が無効状態であることが通常の状態である場合、選択部１５は、所定の条件を満たすと判定した場合に、音声認識の有効状態を選択し、所定の条件を満たさないと判定した場合に、音声認識の無効状態を選択するものとしても良い。具体的な例を挙げると、所定の条件が「操作画面および通知画面のいずれも表示パネル１１０に表示されていないこと」である場合、選択部１５は、操作画面および通知画面のいずれも表示パネル１１０に表示されていないと判断した場合に、所定の条件を満たすと判定し、音声認識の有効状態を選択しても良い。また、選択部１５は、操作画面または通知画面のいずれかが表示パネル１１０に表示されていると判断した場合に、所定の条件を満たさないと判定し、音声認識の無効状態を選択するものとしても良い。 For example, when the disabled state is the normal state of voice recognition, the selection unit 15 may select the enabled state of voice recognition when it determines that the predetermined condition is satisfied, and select the disabled state when it determines that the predetermined condition is not satisfied. As a specific example, when the predetermined condition is "neither the operation screen nor the notification screen is displayed on the display panel 110", the selection unit 15 may determine that the predetermined condition is satisfied when it judges that neither the operation screen nor the notification screen is displayed on the display panel 110, and select the enabled state of voice recognition. The selection unit 15 may also determine that the predetermined condition is not satisfied when it judges that either the operation screen or the notification screen is displayed on the display panel 110, and select the disabled state of voice recognition.

（第２の実施形態） (Second embodiment)
上述の第１の実施形態では、音声認識の無効状態が選択される所定の条件は、「操作画面または通知画面の少なくともいずれかが表示パネル１１０に表示されていること」であった。これに対して、この第２の実施形態では、音声認識の無効状態が選択される所定の条件は、「所定のアプリケーションが実行中であること」である。 In the first embodiment described above, the predetermined condition for selecting the disabled state of voice recognition is "at least one of the operation screen and the notification screen is displayed on the display panel 110". In this second embodiment, by contrast, the predetermined condition for selecting the disabled state of voice recognition is "a predetermined application is being executed".

本実施形態にかかるテレビジョン装置１０のハードウェア構成は、第１の実施形態と同様である。 The hardware configuration of the television device 10 according to the present embodiment is the same as that of the first embodiment.

図４は、本実施形態にかかるテレビジョン装置１０の機能的構成の一例を示す図である。図４に示すように、テレビジョン装置１０は、取得部１１と、ウェイクワード検出部１２と、音声認識部１３と、表示制御部１４と、選択部１０１５と、機器制御部１６と、アプリケーション実行部１７とを備える。アプリケーション実行部１７も、他の機能部と同様に、ＣＰＵ１１４がプログラムを実行することによって実現される。取得部１１と、ウェイクワード検出部１２と、音声認識部１３と、表示制御部１４と、機器制御部１６とは、第１の実施形態と同様の機能を備える。 FIG. 4 is a diagram showing an example of the functional configuration of the television device 10 according to the present embodiment. As shown in FIG. 4, the television device 10 includes an acquisition unit 11, a wake word detection unit 12, a voice recognition unit 13, a display control unit 14, a selection unit 1015, a device control unit 16, and an application execution unit 17. Like the other functional units, the application execution unit 17 is realized by the CPU 114 executing a program. The acquisition unit 11, the wake word detection unit 12, the voice recognition unit 13, the display control unit 14, and the device control unit 16 have the same functions as in the first embodiment.

アプリケーション実行部１７は、コンテンツ配信のアプリケーションを実行し、該アプリケーションによって配信されるコンテンツの動画を、表示パネル１１０に表示させる。 The application execution unit 17 executes an application for content distribution, and displays a moving image of the content distributed by the application on the display panel 110.

アプリケーション実行部１７によって実行されるコンテンツ配信のアプリケーションは、本実施形態における所定のアプリケーションの一例である。コンテンツ配信のアプリケーションは、例えば、外部のサーバから、ネットワーク３００を介してドラマや映画等のコンテンツ動画の配信を受けるアプリケーションとするが、他の機能を含むアプリケーションであっても良い。 The content distribution application executed by the application execution unit 17 is an example of a predetermined application in the present embodiment. The content distribution application is, for example, an application that receives distribution of content videos such as dramas and movies from an external server via the network 300, but may be an application that includes other functions.

アプリケーション実行部１７は、例えば、コンテンツ配信のアプリケーションの実行中は、メモリ１１５にコンテンツ配信のアプリケーションが実行中であることを示すアプリケーション実行フラグを設定するものとする。 For example, while the content distribution application is being executed, the application execution unit 17 sets an application execution flag indicating that the content distribution application is being executed in the memory 115.

本実施形態の選択部１０１５は、第１の実施形態と同様に、所定の条件に基づいて、音声認識の有効状態と無効状態のいずれかを選択するが、本実施形態においては第１の実施形態とは異なる条件を用いて有効状態と無効状態のいずれかを選択する。 As in the first embodiment, the selection unit 1015 of the present embodiment selects either the enabled state or the disabled state of voice recognition based on a predetermined condition, but it uses a condition different from that of the first embodiment to make the selection.

より詳細には、本実施形態における所定の条件は、「所定のアプリケーション（コンテンツ配信のアプリケーション）が実行中であること」である。本実施形態の選択部１０１５は、コンテンツ配信のアプリケーションの実行状態を取得し、コンテンツ配信のアプリケーションが実行中である場合に、所定の条件が満たされていると判定し、音声認識の無効状態を選択する。また、選択部１０１５は、コンテンツ配信のアプリケーションが実行中ではない場合に、所定の条件が満たされていないと判定し、音声認識の有効状態を選択する。 More specifically, the predetermined condition in the present embodiment is "a predetermined application (the content distribution application) is being executed". The selection unit 1015 of the present embodiment acquires the execution state of the content distribution application. When the content distribution application is running, it determines that the predetermined condition is satisfied and selects the disabled state of voice recognition. When the content distribution application is not running, it determines that the predetermined condition is not satisfied and selects the enabled state of voice recognition.

選択部１０１５は、例えば、メモリ１１５のアプリケーション実行フラグの有無に基づいて、所定のアプリケーションが実行中であるか否かを判定するが、他の手法で所定のアプリケーションの実行状態を取得しても良い。 The selection unit 1015 determines whether the predetermined application is being executed based on, for example, the presence or absence of the application execution flag in the memory 115, but it may acquire the execution state of the predetermined application by another method.
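As a rough illustration of the flag-based check described above, the following Python sketch models the application execution flag with a small in-memory store. The `FlagMemory` class, the flag name `APP_RUNNING_FLAG`, and the helper `run_content_app` are hypothetical and only assume that the flag is set while the application runs and cleared when it ends.

```python
from contextlib import contextmanager

# Hypothetical flag name; the source only says that an application
# execution flag is set in memory 115 while the application runs.
APP_RUNNING_FLAG = "content_distribution_app_running"


class FlagMemory:
    """Hypothetical flag store standing in for memory 115."""

    def __init__(self):
        self._flags = set()

    def set_flag(self, name):
        self._flags.add(name)

    def clear_flag(self, name):
        self._flags.discard(name)

    def has_flag(self, name):
        return name in self._flags


@contextmanager
def run_content_app(memory):
    """Set the flag for the lifetime of the application, then clear it."""
    memory.set_flag(APP_RUNNING_FLAG)
    try:
        yield
    finally:
        memory.clear_flag(APP_RUNNING_FLAG)


def select_state(memory):
    """Return 'disabled' while the predetermined application is running."""
    return "disabled" if memory.has_flag(APP_RUNNING_FLAG) else "enabled"
```

Using a context manager here is a design choice of the sketch: it guarantees the flag is cleared even if the application exits with an error, so voice recognition does not stay disabled by accident.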

また、本実施形態にかかる音声認識の有効状態と無効状態の選択処理の流れは、図３で示した第１の実施形態と同様である。 Further, the flow of the selection process of the valid state and the invalid state of the voice recognition according to the present embodiment is the same as that of the first embodiment shown in FIG.

このように、本実施形態のテレビジョン装置１０は、コンテンツ配信のアプリケーションが実行中ではない場合に有効状態を選択し、コンテンツ配信のアプリケーションが実行中である場合に無効状態を選択する。このため、本実施形態のテレビジョン装置１０によれば、第１の実施形態の効果に加えて、コンテンツ配信のアプリケーションによって動画コンテンツ等が表示パネル１１０に表示されている場合に、音声認識サービスが開始することを低減する。 As described above, the television device 10 of the present embodiment selects the enabled state when the content distribution application is not running, and selects the disabled state when the content distribution application is running. Therefore, in addition to the effect of the first embodiment, the television device 10 of the present embodiment reduces cases in which the voice recognition service starts while video content or the like is displayed on the display panel 110 by the content distribution application.

すなわち、本実施形態のテレビジョン装置１０によれば、音声認識サービスの開始によって表示パネル１１０上に表示されたコンテンツ動画が消えてしまったり、コンテンツ動画の上に応答メッセージ等が表示されてコンテンツ動画が隠れてしまったりという事態の発生を低減することができる。また、音声認識サービスが開始すると、スピーカ１０９の音量が下げられるため、再生中のコンテンツ動画の視聴が妨げられる場合がある。本実施形態のテレビジョン装置１０によれば、コンテンツ配信のアプリケーションによって動画コンテンツ等が表示パネル１１０に表示されている場合に、音声認識サービスが開始することを低減するため、再生中のコンテンツ動画をユーザが視聴することを妨げることを低減することができる。 That is, the television device 10 of the present embodiment can reduce situations in which the start of the voice recognition service causes the content video displayed on the display panel 110 to disappear, or causes a response message or the like to be displayed over the content video and hide it. In addition, when the voice recognition service starts, the volume of the speaker 109 is lowered, which may interfere with viewing of the content video being played. Because the television device 10 of the present embodiment reduces cases in which the voice recognition service starts while video content or the like is displayed on the display panel 110 by the content distribution application, it can reduce interference with the user's viewing of the content video being played.

また、実際には音声認識サービスが開始しなくても、音声認識サービスが開始することをユーザが警戒し、動画コンテンツ等の視聴に集中できない場合があるが、本実施形態のテレビジョン装置１０は、このような事態を低減することができる。 Further, even if the voice recognition service does not actually start, the user may be wary that it might start and be unable to concentrate on viewing video content or the like. The television device 10 of the present embodiment can reduce such situations.

なお、本実施形態においては、所定のアプリケーションはコンテンツ配信のアプリケーションであるものとしたが、テレビジョン装置１０で実行可能なアプリケーションのうち、いずれのアプリケーションが「所定のアプリケーション」となるかは、テレビジョン装置１０に予め設定されていても良いし、ユーザが設定可能であるものとしても良い。 In the present embodiment, the predetermined application is assumed to be the content distribution application, but which of the applications executable on the television device 10 is treated as the "predetermined application" may be set in advance in the television device 10 or may be configurable by the user.

（第３の実施形態） (Third embodiment)
この第３の実施形態では、音声認識の無効状態が選択される所定の条件は、「現在時刻が無効期間内であること」である。 In this third embodiment, the predetermined condition for selecting the disabled state of voice recognition is "the current time is within the invalid period".

図５は、本実施形態にかかるテレビジョン装置１０の機能的構成の一例を示す図である。図５に示すように、テレビジョン装置１０は、取得部１０１１と、ウェイクワード検出部１２と、音声認識部１３と、表示制御部１４と、選択部２０１５と、機器制御部１６とを備える。ウェイクワード検出部１２と、音声認識部１３と、表示制御部１４と、機器制御部１６とは、第１の実施形態と同様の機能を備える。 FIG. 5 is a diagram showing an example of the functional configuration of the television device 10 according to the present embodiment. As shown in FIG. 5, the television device 10 includes an acquisition unit 1011, a wake word detection unit 12, a voice recognition unit 13, a display control unit 14, a selection unit 2015, and a device control unit 16. The wake word detection unit 12, the voice recognition unit 13, the display control unit 14, and the device control unit 16 have the same functions as those in the first embodiment.

本実施形態のテレビジョン装置１０は、音声認識を無効状態にする無効期間の設定を有する。無効期間は、音声認識が無効状態となる期間である。無効期間の設定は、例えば、ストレージ１１６に保存される。本実施形態においては、ユーザの操作によって該無効期間の設定が登録または変更されるものとする。無効期間の設定とは、例えば、無効期間の開始時刻および終了時刻に関する設定である。 The television device 10 of the present embodiment has an invalid period setting for disabling voice recognition. The invalid period is a period during which voice recognition is disabled. The invalid period setting is stored in the storage 116, for example. In the present embodiment, the setting of the invalid period is registered or changed by the operation of the user. The invalid period setting is, for example, a setting relating to a start time and an end time of the invalid period.

より詳細には、本実施形態の取得部１０１１は、第１の実施形態の機能を備えた上で、ユーザによる無効期間の開始時刻および終了時刻の入力操作を受け付ける。例えば、取得部１０１１は、受光部１１２が受光したリモートコントローラ１１９からの赤外線または操作部１１１に入力された操作に基づいて、ユーザによる無効期間の開始時刻および終了時刻の入力操作を受け付け、受け付けた無効期間の開始時刻および終了時刻を示す無効期間情報を、ストレージ１１６等に保存する。なお、無効期間情報の保存場所はこれに限定されるものではない。 More specifically, the acquisition unit 1011 of the present embodiment has the functions of the first embodiment and accepts the input operation of the start time and the end time of the invalid period by the user. For example, the acquisition unit 1011 receives and accepts the input operation of the start time and the end time of the invalid period by the user based on the infrared rays from the remote controller 119 received by the light receiving unit 112 or the operation input to the operation unit 111. The invalid period information indicating the start time and end time of the invalid period is stored in the storage 116 or the like. The storage location of the invalid period information is not limited to this.

例えば、ユーザは、就寝中に音声認識サービスが起動しないように、“ＰＭ２３：００〜ＡＭ０６：００”を無効期間として設定しても良い。また、ユーザは、自宅を留守にする期間に音声認識サービスが起動しないように、“ＡＭ０９：００〜ＰＭ１７：００”を無効期間として設定しても良い。 For example, the user may set "PM 23:00 to AM 06:00" as the invalid period so that the voice recognition service does not start while the user is asleep. The user may also set "AM 09:00 to PM 17:00" as the invalid period so that the voice recognition service does not start while the user is away from home.

また、本実施形態においては、無効期間として設定されていない期間は全て有効期間であるものとする。なお、本実施形態においては、第１の実施形態と同様に、通常の状態では音声認識が有効状態でマイク１１７がオン状態であるものとする。 Further, in the present embodiment, all the periods not set as the invalid period shall be the valid period. In the present embodiment, as in the first embodiment, it is assumed that the voice recognition is enabled and the microphone 117 is on in the normal state.

本実施形態の選択部２０１５は、第１の実施形態と同様に、所定の条件に基づいて、音声認識の有効状態と無効状態のいずれかを選択するが、本実施形態においては第１の実施形態とは異なる条件を用いて有効状態と無効状態のいずれかを選択する。 As in the first embodiment, the selection unit 2015 of the present embodiment selects either the enabled state or the disabled state of voice recognition based on a predetermined condition, but it uses a condition different from that of the first embodiment to make the selection.

より詳細には、本実施形態における所定の条件は、「現在時刻が無効期間内であること」である。本実施形態の選択部２０１５は、現在時刻が無効期間内である場合に、所定の条件が満たされていると判定し、音声認識の無効状態を選択する。また、選択部２０１５は、現在時刻が有効期間内であるである場合に、所定の条件が満たされていないと判定し、音声認識の有効状態を選択する。 More specifically, the predetermined condition in the present embodiment is "the current time is within the invalid period". When the current time is within the invalid period, the selection unit 2015 of the present embodiment determines that the predetermined condition is satisfied, and selects the invalid state of voice recognition. Further, the selection unit 2015 determines that the predetermined condition is not satisfied when the current time is within the valid period, and selects the valid state of voice recognition.
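The time-based condition can be sketched as below. One detail worth making explicit is that an invalid period such as 23:00 to 06:00 crosses midnight, so a plain `start <= now < end` comparison is not enough. The function names are hypothetical, and treating the period as half-open (start included, end excluded) is an assumption of this sketch, since the source does not specify how the boundaries are handled.

```python
from datetime import time


def in_invalid_period(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls within [start, end), allowing wrap past midnight."""
    if start <= end:
        # Same-day period, e.g. 09:00 to 17:00.
        return start <= now < end
    # Period that crosses midnight, e.g. 23:00 to 06:00.
    return now >= start or now < end


def select_state(now: time, start: time, end: time) -> str:
    """Select 'disabled' inside the invalid period, otherwise 'enabled'."""
    return "disabled" if in_invalid_period(now, start, end) else "enabled"
```

For example, with the period 23:00 to 06:00, `select_state(time(23, 30), time(23, 0), time(6, 0))` yields `"disabled"`, while noon yields `"enabled"`. When `start` equals `end`, this sketch treats the period as empty.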

また、本実施形態にかかる音声認識の状態選択処理の流れは、図３で示した第１の実施形態と同様である。 Further, the flow of the voice recognition state selection process according to the present embodiment is the same as that of the first embodiment shown in FIG.

このように、本実施形態のテレビジョン装置１０によれば、現在時刻が有効期間内である場合に有効状態を選択し、現在時刻が無効期間内である場合に無効状態を選択することにより、第１の実施形態の効果に加えて、ユーザが音声認識サービスの開始を望まない時間帯に、音声認識サービスが開始されることを低減することができる。 As described above, according to the television device 10 of the present embodiment, the valid state is selected when the current time is within the valid period, and the invalid state is selected when the current time is within the invalid period. In addition to the effect of the first embodiment, it is possible to reduce the start of the voice recognition service at a time when the user does not want to start the voice recognition service.

なお、本実施形態においては、ユーザによる無効期間の設定を受け付けるものとしたが、有効期間の設定を受け付けるものとしても良い。例えば、テレビジョン装置１０において音声認識が無効状態であることが通常の状態である場合、設定された有効期間に限り、音声認識が有効状態になるものとしても良い。この場合、所定の条件は、例えば、「現在時刻が有効期間内であること」としても良い。また、当該構成を採用する場合、選択部２０１５は、所定の条件を満たすと判定した場合に、音声認識の有効状態を選択し、所定の条件を満たさないと判定した場合に、音声認識の無効状態を選択しても良い。 In the present embodiment, the invalid period setting by the user is accepted, but the valid period setting may be accepted. For example, when it is a normal state that the voice recognition is disabled in the television device 10, the voice recognition may be enabled only for the set valid period. In this case, the predetermined condition may be, for example, "the current time is within the valid period". Further, when adopting the configuration, the selection unit 2015 selects the enabled state of voice recognition when it is determined that the predetermined condition is satisfied, and invalidates the voice recognition when it is determined that the predetermined condition is not satisfied. You may select the state.

なお、本実施形態においては、無効期間は単に開始時刻と終了時刻とで定義されるものとしたが、曜日、または祝日等のカレンダ情報によってさらに詳細に定義されても良い。 In the present embodiment, the invalid period is simply defined by the start time and the end time, but may be defined in more detail by calendar information such as a day of the week or a holiday.

（第４の実施形態） (Fourth embodiment)
この第４の実施形態では、音声認識の無効状態が選択される所定の条件は、第３の実施形態と同様に「現在時刻が無効期間内であること」である。ただし、第３の実施形態では、ユーザが無効期間を設定していたのに対して、この第４の実施形態では、テレビジョン装置１０が学習結果に基づいて、無効期間を設定する。 In this fourth embodiment, the predetermined condition for selecting the disabled state of voice recognition is "the current time is within the invalid period", as in the third embodiment. However, whereas the user sets the invalid period in the third embodiment, in this fourth embodiment the television device 10 sets the invalid period based on a learning result.

図６は、本実施形態にかかるテレビジョン装置１０の機能的構成の一例を示す図である。図６に示すように、テレビジョン装置１０は、取得部１１と、ウェイクワード検出部１２と、音声認識部１３と、表示制御部１４と、選択部２０１５と、機器制御部１６と、学習部１８とを備える。学習部１８も、他の機能部と同様に、ＣＰＵ１１４がプログラムを実行することによって実現される。取得部１１と、ウェイクワード検出部１２と、音声認識部１３と、表示制御部１４と、機器制御部１６とは、第１の実施形態と同様の機能を備える。また、選択部２０１５は、第３の実施形態と同様の機能を備える。 FIG. 6 is a diagram showing an example of the functional configuration of the television device 10 according to the present embodiment. As shown in FIG. 6, the television device 10 includes an acquisition unit 11, a wake word detection unit 12, a voice recognition unit 13, a display control unit 14, a selection unit 2015, a device control unit 16, and a learning unit 18. Like the other functional units, the learning unit 18 is realized by the CPU 114 executing a program. The acquisition unit 11, the wake word detection unit 12, the voice recognition unit 13, the display control unit 14, and the device control unit 16 have the same functions as in the first embodiment. The selection unit 2015 has the same function as in the third embodiment.

学習部１８は、ユーザによる操作のパターンを学習し、学習済みモデルを生成する。本実施形態における学習済モデルは、一例として、時刻と、該時刻における音声認識サービスの要否とを対応付けた情報である。学習部１８が学習をする手法には、例えば公知の機械学習または深層学習における教師なし学習の技術を適用することができる。学習済モデルは、例えばストレージ１１６等に保存されるが、保存場所はこれに限定されるものではない。 The learning unit 18 learns a pattern of operations by the user and generates a learned model. The trained model in the present embodiment is, for example, information in which the time is associated with the necessity of the voice recognition service at the time. For example, a known technique of unsupervised learning in machine learning or deep learning can be applied to the method in which the learning unit 18 learns. The trained model is stored in, for example, storage 116, but the storage location is not limited to this.

学習部１８の入力データは、ユーザの操作内容と時刻であり、例えば、ユーザが音声認識サービスの取り消し操作をした時刻、ユーザによる音声認識サービスの利用時刻等である。例えば、開始した音声認識サービスをユーザが利用せずにリモートコントローラ１１９等で終了させた場合、該時刻と、ユーザが音声認識サービスの取り消し操作をしたことを学習する。 The input data of the learning unit 18 is the operation content and time of the user, for example, the time when the user cancels the voice recognition service, the time when the user uses the voice recognition service, and the like. For example, when the started voice recognition service is terminated by the remote controller 119 or the like without being used by the user, the time and the user canceling the voice recognition service are learned.

学習部１８は、学習結果に基づいて、音声認識サービスが不要な時刻を出力する。学習部１８は、当該出力の結果を、無効期間の開始時刻および終了時刻を示す無効期間情報として、ストレージ１１６等に保存する。 The learning unit 18 outputs a time when the voice recognition service is unnecessary based on the learning result. The learning unit 18 stores the result of the output in the storage 116 or the like as invalid period information indicating the start time and end time of the invalid period.
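To make the input and output of the learning unit concrete, the sketch below derives candidate invalid periods from logged user actions. It deliberately swaps in a simple per-hour counting heuristic rather than the machine learning or deep learning techniques the embodiment mentions, so it should be read as an illustration of the data flow only. The function names, the `(hour, action)` event format, and the `min_events` threshold are all assumptions.

```python
from collections import Counter


def learn_unneeded_hours(events, min_events=3):
    """Return the hours in which cancellations dominate.

    `events` is an iterable of (hour, action) pairs, where action is
    'cancel' (the user dismissed the started service) or 'use'
    (the user actually used the service).
    """
    cancels, uses = Counter(), Counter()
    for hour, action in events:
        if action == "cancel":
            cancels[hour] += 1
        elif action == "use":
            uses[hour] += 1
    # An hour counts as "service not needed" only with enough evidence
    # and with more cancellations than uses.
    return sorted(
        h for h in cancels
        if cancels[h] >= min_events and cancels[h] > uses[h]
    )


def to_periods(hours):
    """Merge consecutive hours into (start_hour, end_hour) pairs, end exclusive."""
    periods = []
    for h in hours:
        if periods and periods[-1][1] == h:
            periods[-1] = (periods[-1][0], h + 1)
        else:
            periods.append((h, h + 1))
    return periods
```

The resulting pairs play the role of the invalid period information (start and end times) that is saved to storage. This sketch does not merge periods across midnight, and rerunning it on a growing event log is one simple way to mimic continued learning.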

また、学習部１８は、一度学習済みモデルを生成した後も、継続的にユーザによる操作のパターンを学習し、学習済みモデルの精度を向上させるものとする。 Further, the learning unit 18 continuously learns the operation pattern by the user even after the trained model is generated once, and improves the accuracy of the trained model.

このように、本実施形態のテレビジョン装置１０は、ユーザによる操作のパターンを学習した結果に基づいて、音声認識の無効期間を設定し、現在時刻が有効期間内である場合に有効状態を選択し、現在時刻が無効期間内である場合に無効状態を選択する。このため、本実施形態のテレビジョン装置１０によれば、第１，３の実施形態の効果に加えて、ユーザによる無効期間の設定操作の手間を低減することができる。 As described above, the television device 10 of the present embodiment sets the invalid period of voice recognition based on the result of learning the operation pattern by the user, and selects the effective state when the current time is within the valid period. Then, if the current time is within the invalid period, select the invalid state. Therefore, according to the television device 10 of the present embodiment, in addition to the effects of the first and third embodiments, it is possible to reduce the time and effort of the user to set the invalid period.

The input data and output of the learning unit 18 illustrated in the present embodiment are merely examples and are not limiting. The learning unit 18 may also set different invalid periods depending not only on the time of day but also on calendar information such as the day of the week or public holidays.
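
One way calendar-dependent invalid periods could be represented is sketched below. The table INVALID_PERIODS, the HOLIDAYS set, the three day types, and all times shown are hypothetical values chosen for illustration; the sketch also assumes that no period crosses midnight and that end times are exclusive.

```python
from datetime import date, datetime, time

# Hypothetical invalid periods per day type; end times are exclusive
# and no period is assumed to cross midnight.
INVALID_PERIODS = {
    "weekday": [(time(21, 0), time(23, 0))],
    "weekend": [(time(9, 0), time(12, 0))],
    "holiday": [(time(9, 0), time(12, 0))],
}
HOLIDAYS = {date(2019, 8, 12)}  # hypothetical holiday calendar

def day_type(day):
    """Classify a date as holiday, weekend, or weekday."""
    if day in HOLIDAYS:
        return "holiday"
    return "weekend" if day.weekday() >= 5 else "weekday"

def in_invalid_period(now):
    """Return True when `now` falls in an invalid period for its day type."""
    t = now.time()
    return any(start <= t < end
               for start, end in INVALID_PERIODS[day_type(now.date())])
```

Here a holiday takes precedence over the ordinary weekday classification, so the same clock time can fall inside an invalid period on one day and outside it on another.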

In the present embodiment, the television device 10 sets the invalid period of voice recognition based on the result of learning the user's operation patterns; however, it may instead set the valid period of voice recognition based on the learned result.

(Fifth Embodiment)
In the fifth embodiment, the predetermined condition under which the disabled state of voice recognition is selected is that "the current time is between the start time and the end time of a specific program".

FIG. 7 is a diagram showing an example of the functional configuration of the television device 10 according to the present embodiment. As shown in FIG. 7, the television device 10 includes an acquisition unit 2011, a wake word detection unit 12, a voice recognition unit 13, a display control unit 14, a selection unit 3015, a device control unit 16, and a program guide generation unit 19. The wake word detection unit 12, the voice recognition unit 13, the display control unit 14, and the device control unit 16 have the same functions as in the first embodiment.

The acquisition unit 2011 of the present embodiment has the functions described in the first embodiment and, in addition, acquires information about programs from the SI (Service Information) contained in the broadcast signal. The acquisition unit 2011 sends the acquired program information to the program guide generation unit 19.

The acquisition unit 2011 of the present embodiment also accepts an operation by which the user designates a specific program. For example, the acquisition unit 2011 accepts this operation based on infrared rays from the remote controller 119 received by the light receiving unit 112, or on an operation input to the operation unit 111. The acquisition unit 2011 then acquires the start time and end time of the specific program designated by the user from the program guide stored in the storage 116, and stores program time information indicating that start time and end time in the storage 116 or the like. The storage location of the program time information is not limited to this.

The program guide generation unit 19 generates a program guide based on the program information acquired by the acquisition unit 2011. The program guide generation unit 19 stores the generated program guide in, for example, the storage 116.
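
The flow from program information to program time information might look like the following minimal sketch. It assumes the SI has already been decoded into plain records; the field names event_id, title, start, and duration, the sample entries, and both function names are hypothetical and do not represent an actual SI parser.

```python
from datetime import datetime, timedelta

def build_program_guide(si_records):
    """Index already-decoded program records by event id (hypothetical format)."""
    return {record["event_id"]: record for record in si_records}

def program_time_info(guide, event_id):
    """Look up the designated program and return its start and end times."""
    entry = guide[event_id]
    return {"start": entry["start"], "end": entry["start"] + entry["duration"]}

# Hypothetical decoded records for two programs.
records = [
    {"event_id": 101, "title": "News",
     "start": datetime(2019, 8, 13, 19, 0), "duration": timedelta(minutes=30)},
    {"event_id": 102, "title": "Drama",
     "start": datetime(2019, 8, 13, 21, 0), "duration": timedelta(minutes=54)},
]
guide = build_program_guide(records)
info = program_time_info(guide, 102)  # the user designated event 102
```

The returned dictionary corresponds to the program time information that would be saved for later use by the selection unit.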

Alternatively, the user may input the start time and end time of the specific program.

As in the first embodiment, the selection unit 3015 of the present embodiment selects either the enabled state or the disabled state of voice recognition based on a predetermined condition; in the present embodiment, however, it uses a condition different from that of the first embodiment.

More specifically, the predetermined condition in the present embodiment is that "the current time is between the start time and the end time of a specific program". The interval "between the start time and the end time of a specific program" is an example of the invalid period in the present embodiment.

The selection unit 3015 of the present embodiment selects either the enabled state or the disabled state based on whether or not the current time is between the start time and the end time of the specific program. For example, when the current time is between the start time and the end time of the specific program, the selection unit 3015 determines that the predetermined condition is satisfied and selects the disabled state of voice recognition. When the current time is not between the start time and the end time of the specific program, the selection unit 3015 determines that the predetermined condition is not satisfied and selects the enabled state of voice recognition.
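
The selection rule of this embodiment can be sketched as follows. The names RecognitionState and select_state are hypothetical, and treating the interval as half-open (start inclusive, end exclusive) is an assumption, since the embodiment does not state how the boundary instants are handled.

```python
from datetime import datetime
from enum import Enum

class RecognitionState(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"

def select_state(now, program_start, program_end):
    """Disable voice recognition while the specific program is on air.

    The interval is treated as [program_start, program_end); this
    boundary handling is an assumption.
    """
    if program_start <= now < program_end:
        return RecognitionState.DISABLED  # predetermined condition satisfied
    return RecognitionState.ENABLED       # condition not satisfied

start = datetime(2019, 8, 13, 21, 0)
end = datetime(2019, 8, 13, 21, 54)
during = select_state(datetime(2019, 8, 13, 21, 30), start, end)
after = select_state(datetime(2019, 8, 13, 22, 0), start, end)
```

In this example, during evaluates to the disabled state and after to the enabled state.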

As described above, the television device 10 of the present embodiment selects either the enabled state or the disabled state based on whether or not the current time is between the start time and the end time of a specific program. Therefore, according to the television device 10 of the present embodiment, in addition to the effects of the first embodiment, the voice recognition service can be prevented from starting while the user is viewing the specific program. This reduces the likelihood that the user's viewing of a favorite program is interrupted by an unnecessary start of the voice recognition service. Furthermore, the television device 10 of the present embodiment can reduce malfunctions in which, while the user is viewing the specific program, the voice recognition service unintentionally switches to another program or turns off the television device 10, and can thereby reduce the likelihood that the user misses the program because of such a malfunction.

In the present embodiment, the user sets the specific program; however, the television device 10 may set the specific program based on the result of learning the user's viewing history.

In the present embodiment, the television device 10, which is an example of the receiving device, acquires information about programs from the broadcast signal; however, the receiving device may acquire program guide data from an external source via the IP communication unit 113 and the network 300.

(Modification 1)
In the first to fifth embodiments described above, the microphone 117 is switched on and off according to whether voice recognition is enabled or disabled; however, the microphone 117 may remain on while the voice recognition function itself is switched between the enabled state and the disabled state.

For example, when the disabled state of voice recognition is selected, the wake word detection unit 12 and the voice recognition unit 13 do not execute the wake word detection process or the voice recognition process on the voice input to the microphone 117. Therefore, when the disabled state is selected, the voice recognition service does not start even if the microphone 117 is able to receive voice input.

When the enabled state of voice recognition is selected, the wake word detection unit 12 and the voice recognition unit 13 execute the wake word detection process or the voice recognition process on the voice input to the microphone 117, as in the first to fifth embodiments.
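
A minimal sketch of this modification follows: the microphone keeps delivering audio, and the enabled or disabled state is checked before any detection or recognition is attempted. The class VoicePipeline, its on_audio method, and the two callables passed to it are hypothetical stand-ins for the wake word detection unit 12 and the voice recognition unit 13, and text strings stand in for audio.

```python
class VoicePipeline:
    """Gate wake word detection and recognition in software (mic stays on)."""

    def __init__(self, detect_wake_word, recognize):
        self.enabled = True              # state chosen by the selection unit
        self._detect = detect_wake_word  # stand-in for detection unit 12
        self._recognize = recognize      # stand-in for recognition unit 13
        self._awake = False

    def on_audio(self, utterance):
        """Handle one utterance captured by the always-on microphone."""
        if not self.enabled:
            # Disabled state: neither process runs, so the service never starts.
            self._awake = False
            return None
        if not self._awake:
            self._awake = self._detect(utterance)
            return None
        self._awake = False
        return self._recognize(utterance)

pipeline = VoicePipeline(lambda u: u == "hello tv", str.upper)

pipeline.enabled = False
ignored = [pipeline.on_audio("hello tv"), pipeline.on_audio("volume up")]

pipeline.enabled = True
pipeline.on_audio("hello tv")
command = pipeline.on_audio("volume up")
```

Because the gate sits in front of the wake word detector, a wake word spoken during the disabled state leaves no pending "awake" flag that could trigger recognition once the state changes.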

(Modification 2)
In the first to fifth embodiments described above, the enabled state and the disabled state of voice recognition are selected based on a different predetermined condition in each embodiment; however, the predetermined conditions of different embodiments may be combined. For example, the predetermined condition under which the disabled state is selected may be the OR combination of the predetermined conditions of the first to fifth embodiments, namely that "at least one of the operation screen and the notification screen is displayed on the display panel 110, a predetermined application is running, the current time is within the invalid period, or the current time is between the start time and the end time of a specific program", or it may be a combination of only some of these predetermined conditions.
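
The OR combination described above can be sketched as a list of predicates evaluated against the current device status. The DeviceStatus fields, the CONDITIONS tuple, and the function name voice_recognition_disabled are hypothetical; passing a shorter tuple corresponds to combining only some of the conditions.

```python
from dataclasses import dataclass

@dataclass
class DeviceStatus:
    """Hypothetical snapshot of the facts each embodiment checks."""
    screen_displayed: bool = False     # operation or notification screen shown
    app_running: bool = False          # predetermined application running
    in_invalid_period: bool = False    # current time within invalid period
    in_specific_program: bool = False  # current time within specific program

CONDITIONS = (
    lambda s: s.screen_displayed,
    lambda s: s.app_running,
    lambda s: s.in_invalid_period,
    lambda s: s.in_specific_program,
)

def voice_recognition_disabled(status, conditions=CONDITIONS):
    """Return True when any of the given conditions holds (OR combination)."""
    return any(condition(status) for condition in conditions)

idle = voice_recognition_disabled(DeviceStatus())
busy = voice_recognition_disabled(DeviceStatus(app_running=True))
subset = voice_recognition_disabled(DeviceStatus(app_running=True),
                                    conditions=CONDITIONS[2:])
```

In the last call only the two time-based conditions are combined, so a running application alone does not select the disabled state.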

(Modification 3)
In the first to fifth embodiments described above, the television device 10 is used as an example of the receiving device, but the receiving device is not limited to this. For example, the receiving device may be a set-top box or a PC (Personal Computer) with a television function, or a recording and playback device such as a BD (Blu-ray Disc) (registered trademark) recorder or a DVD recorder.

As described above, according to the first to fifth embodiments, it is possible to reduce the occurrence of the voice recognition service starting in situations where it is unnecessary.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are also included in the invention described in the claims and its equivalents.

10 Television device
11, 1011, 2011 Acquisition unit
12 Wake word detection unit
13 Voice recognition unit
14 Display control unit
15, 1015, 2015, 3015 Selection unit
16 Device control unit
17 Application execution unit
18 Learning unit
19 Program guide generation unit
110 Display panel
111 Operation unit
112 Light receiving unit
115 Memory
116 Storage
117 Microphone
119 Remote controller
300 Network

Claims

A receiving device comprising:
a voice input unit that accepts input of a user's voice;
a selection unit that selects either an enabled state or a disabled state of voice recognition based on a predetermined condition; and
a voice recognition unit that executes a voice recognition process on the voice input to the voice input unit when the enabled state is selected, and does not execute the voice recognition process when the disabled state is selected.

The receiving device according to claim 1, wherein
the predetermined condition is that at least one of an operation screen capable of accepting an operation by the user and a notification screen is displayed on a display unit, and
the selection unit selects the disabled state when at least one of the operation screen and the notification screen is displayed on the display unit, and selects the enabled state when neither the operation screen nor the notification screen is displayed on the display unit.

The receiving device according to claim 1, wherein
the predetermined condition is that a predetermined application is running, and
the selection unit acquires the execution state of the predetermined application, selects the enabled state when the predetermined application is not running, and selects the disabled state when the predetermined application is running.

The receiving device according to claim 1, wherein
the predetermined condition is that the current time is within an invalid period or within a valid period, and
the selection unit selects the enabled state when the current time is within the valid period, and selects the disabled state when the current time is within the invalid period.

The receiving device according to any one of claims 1 to 4, wherein
the voice input unit is a microphone, and
the receiving device further comprises a device control unit that turns the microphone on when the enabled state is selected by the selection unit, and turns the microphone off when the disabled state is selected by the selection unit.