JP7424156B2

JP7424156B2 - Content output control device, content output system, content output control method, and program

Info

Publication number: JP7424156B2
Application number: JP2020054197A
Authority: JP
Inventors: 建小林; 利一藤井; 一夫野村; 達弘 ▲鮭▼川; 真史上村; 丁珠崔
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2020-03-25
Filing date: 2020-03-25
Publication date: 2024-01-30
Anticipated expiration: 2040-03-25
Also published as: JP2021157246A

Description

The present invention relates to a content output control device, a content output system, a content output control method, and a program.

Techniques are known for outputting audio in addition to displaying content on a display screen, as in audio-capable electronic books and games (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2016-192211

However, when such a device is used while riding public transportation, for example, ambient sounds such as in-vehicle announcements may be difficult to hear.

The present invention has been made in view of the above, and an object thereof is to enable ambient sound to be checked appropriately.

To solve the above problem and achieve the object, a content output control device according to the present invention includes: a line-of-sight detection unit that detects a user's line-of-sight direction; a gaze determination unit that determines, based on the detection result of the line-of-sight detection unit, whether the user is gazing at a display screen displaying content; an ambient sound acquisition unit that acquires ambient sound; an audio acquisition unit that acquires audio related to the content; and an audio output control unit that outputs the audio related to the content acquired by the audio acquisition unit when the gaze determination unit determines that the user is gazing at the display screen displaying the content, and outputs the ambient sound acquired by the ambient sound acquisition unit when the gaze determination unit determines that the user is not gazing at the display screen displaying the content.

A content output system according to the present invention includes the content output control device described above, a sound collection unit that collects ambient sound, and an audio output unit that outputs audio.

A content output control method according to the present invention includes the steps of: detecting the line-of-sight direction of a user using an audio output device; determining, based on the detection result of the line-of-sight direction, whether the user is gazing at a display screen displaying content; acquiring ambient sound; acquiring audio related to the content; and outputting the audio related to the content when it is determined that the user is gazing at the display screen displaying the content, and outputting the ambient sound when it is determined that the user is not gazing at the display screen displaying the content.

A program according to the present invention causes a computer to execute the steps of: detecting the line-of-sight direction of a user using an audio output device; determining, based on the detection result of the line-of-sight direction, whether the user is gazing at a display screen displaying content; acquiring ambient sound; acquiring audio related to the content; and outputting the audio related to the content when it is determined that the user is gazing at the display screen displaying the content, and outputting the ambient sound when it is determined that the user is not gazing at the display screen displaying the content.

According to the present invention, ambient sound can be checked appropriately.

FIG. 1 is a schematic diagram showing a content output system according to the first embodiment.
FIG. 2 is a block diagram showing the content output system according to the first embodiment.
FIG. 3 is a flowchart showing an example of the flow of processing in the content output system according to the first embodiment.
FIG. 4 is a flowchart showing another example of the flow of processing in the content output system according to the first embodiment.
FIG. 5 is a block diagram of a content output system according to the second embodiment.
FIG. 6 is a flowchart showing an example of the flow of processing in the content output system according to the second embodiment.
FIG. 7 is a block diagram of a content output system according to the third embodiment.
FIG. 8 is a flowchart showing an example of the flow of processing in the content output system according to the third embodiment.

Embodiments of a content output system 1 according to the present invention are described in detail below with reference to the accompanying drawings. The present invention is not limited to the following embodiments.

[First embodiment]
<Content output system>
FIG. 1 is a schematic diagram showing a content output system 1 according to the first embodiment. FIG. 2 is a block diagram showing the content output system 1 according to the first embodiment. The content output system 1 includes headphones 10 serving as an audio output device, and an electronic device 30 that plays back and displays content composed of video and audio, such as an electronic book terminal, a smartphone, a tablet terminal, a portable music player, or a portable game machine. In other words, the content output system 1 is a combination of the headphones 10 serving as an audio output device and the electronic device 30. Depending on whether the user is gazing at the display screen 36, the content output system 1 outputs from the headphones 10 either audio related to the content displayed on the display screen 36 or ambient sound.

<Headphones>
The headphones 10 are, for example, of an overhead type and are worn on the user's head. The headphones 10 output audio related to the content displayed on the display screen 36 based on audio data output from the content output control device 40 of the electronic device 30. The headphones 10 can also output ambient sound based on an audio signal output from the content output control device 40. The headphones 10 are connected to the electronic device 30 by wire or wirelessly so that data can be transmitted and received. The headphones 10 include a left audio output unit 11, a right audio output unit 12, a content input unit 31, a left microphone 14, a right microphone 15, and a content output control device 40.

The left audio output unit 11 is an audio output unit for the left ear. The left audio output unit 11 has a housing that covers the left ear. The left audio output unit 11 outputs audio to be heard by the left ear. The left audio output unit 11 acquires audio data from the content output control device 40 and outputs the left channel data of the audio data. Specifically, the left audio output unit 11 converts the electrical signal obtained by D/A conversion of the left channel data into sound and outputs the sound.

The right audio output unit 12 is an audio output unit for the right ear. The right audio output unit 12 outputs audio to be heard by the right ear. The right audio output unit 12 acquires audio data from the content output control device 40 and outputs the right channel data of the audio data. Specifically, the right audio output unit 12 converts the electrical signal obtained by D/A conversion of the right channel data into sound and outputs the sound.

The left microphone 14 is arranged in the housing of the left audio output unit 11. The left microphone 14 picks up ambient sound. Ambient sound is environmental sound including, for example, the voices of third parties and vehicle noise. The left microphone 14 outputs the acquired sound to the ambient sound acquisition unit 52 of the electronic device 30.

The right microphone 15 is arranged in the housing of the right audio output unit 12. The right microphone 15 picks up ambient sound. The right microphone 15 outputs the acquired sound to the ambient sound acquisition unit 52 of the electronic device 30.

<Electronic device>
The electronic device 30 includes a content input unit 31, a display unit 32, a line-of-sight sensor 33, and a content output control device 40.

The content input unit 31 receives content data of content such as music, video, or games. The content data may be input to the content input unit 31 from, for example, a storage unit (not shown), or from an external storage device by wire or wirelessly. The content data input to the content input unit 31 is, for example, content data accompanied by audio output, such as video content, game content, or web content.

The content data includes video data and audio data related to the content. The video data related to the content is the video data of the content displayed on the display screen 36. The audio data related to the content is the audio data of the audio output in correspondence with the video data of the content displayed on the display screen 36. The audio data related to the content may also be, for example, text-to-speech audio of an electronic book, or text-to-speech audio or commentary audio of a web page.

The line-of-sight sensor 33 is arranged facing the same direction as the display screen 36 of the electronic device 30. The line-of-sight sensor 33 is a sensor that detects the line of sight of a person facing the display screen 36 of the electronic device 30. The line-of-sight sensor 33 is placed at a position facing the user's face while the user is viewing the display screen 36, for example, above the display screen 36 of the electronic device 30. The line-of-sight sensor 33 outputs the captured image data to the line-of-sight detection unit 43 of the content output control device 40.

The line-of-sight sensor 33 includes, for example, an infrared light emitting unit composed of a group of infrared LEDs, and a pair of infrared cameras. In this embodiment, the line-of-sight sensor 33 irradiates infrared light toward the user's face using a pair of infrared light emitting units and captures images with the infrared cameras. From the images captured by the infrared cameras in this manner, the line-of-sight detection unit 43, described later, determines whether the user's line of sight is directed toward the display screen 36 based on the positions of the user's pupil and the corneal reflection. The position of the user's line of sight on the display screen 36 is also determined based on the positions of the user's pupil and the corneal reflection. The line-of-sight sensor 33 may have another configuration with a similar function, such as a visible light camera.

The display unit 32 displays the video of the content input to the content input unit 31. The display unit 32 is a display such as a liquid crystal display (LCD) or an organic EL (organic electroluminescence) display. The display unit 32 displays the video of the content based on display data output from the display control unit 42. The display unit 32 includes the display screen 36 on which the video is displayed.

The headphones 10 may be equipped with a sensor that detects whether the headphones 10 are worn on the user's head. Specifically, the headphones 10 are equipped with a three-axis acceleration sensor, and whether the headphones 10 are worn on the user's head is determined based on the direction in which gravitational acceleration is detected. Other sensors may also be used, for example, sensors that detect how far the headband is opened or the pressure on the ear pads.

<Content output control device>
The content output control device 40 outputs either audio related to the content or ambient sound from the headphones 10, depending on whether the user is gazing at the display screen 36. The content output control device 40 is an arithmetic processing device (control unit) composed of, for example, a CPU (Central Processing Unit) or a video processing processor. The content output control device 40 loads a program stored in a storage unit (not shown) into memory and executes the instructions included in the program. The content output control device 40 includes a video acquisition unit 41, a display control unit 42, a line-of-sight detection unit 43, a gaze determination unit 44, an audio processing unit 50, and a storage unit that is an internal memory. The audio processing unit 50 includes an audio acquisition unit 51, an ambient sound acquisition unit 52, and an audio output control unit 53. The content output control device 40 may be composed of one or more devices.

The video acquisition unit 41 acquires, from the content input unit 31, the video data of the content to be output to the display screen 36 of the display unit 32.

The display control unit 42 displays the video data of the content acquired by the video acquisition unit 41 on the display screen 36 of the display unit 32.

The line-of-sight detection unit 43 detects the user's line-of-sight direction based on the image data captured by the line-of-sight sensor 33. The method of detecting the line of sight is not limited; in this embodiment, the line of sight is detected by corneal reflection.

The gaze determination unit 44 determines, based on the detection result of the line-of-sight detection unit 43, whether the user is gazing at the display screen 36 of the electronic device 30 on which the content is displayed. Gazing at the display screen 36 means that the state in which the display screen 36 is located in the user's line-of-sight direction, in other words, the state in which the user's line-of-sight direction intersects the display screen 36, continues for a first predetermined period or longer. The first predetermined period is, for example, about 5 seconds. Not gazing at the display screen 36 means that the state in which the user's line-of-sight direction points away from the display screen 36, in other words, the state in which the user's line-of-sight direction does not intersect the display screen 36, continues for a second predetermined period or longer. The second predetermined period is, for example, about 5 seconds.
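
The two-period determination described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the class name `GazeJudge`, the `update` method, the per-sample boolean input, and the initial "not gazing" state are all assumptions made for illustration. Only the two periods of about 5 seconds come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GazeJudge:
    """Dwell-time gaze judgment (hypothetical helper, not from the specification)."""

    first_period_s: float = 5.0    # on-screen time required to judge "gazing"
    second_period_s: float = 5.0   # off-screen time required to judge "not gazing"
    gazing: bool = False           # assumed initial judgment
    _candidate: Optional[bool] = None  # raw state currently being timed
    _since: float = 0.0                # time at which that raw state began

    def update(self, on_screen: bool, now_s: float) -> bool:
        """Feed one raw sample: does the line of sight intersect the screen?"""
        if on_screen != self._candidate:
            # The raw state changed, so restart the dwell timer.
            self._candidate = on_screen
            self._since = now_s
        required = self.first_period_s if on_screen else self.second_period_s
        # Flip the judgment only after the raw state has lasted long enough.
        if on_screen != self.gazing and now_s - self._since >= required:
            self.gazing = on_screen
        return self.gazing
```

With this structure, a brief glance away from the screen that is shorter than the second predetermined period does not change the judgment.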

The audio acquisition unit 51 acquires, from the content input unit 31, the audio data related to the content to be output from the left audio output unit 11 and the right audio output unit 12.

The ambient sound acquisition unit 52 acquires ambient sound from the left microphone 14 and the right microphone 15.

The audio output control unit 53 performs control to output audio data as sound from the headphones 10. More specifically, the audio output control unit 53 causes the left audio output unit 11 to output a signal obtained by D/A converting and amplifying the left channel data of the audio data, and causes the right audio output unit 12 to output a signal obtained by D/A converting and amplifying the right channel data of the audio data.

When the gaze determination unit 44 determines that the user is gazing at the display screen 36 displaying the content, the audio output control unit 53 outputs the audio related to the content acquired by the audio acquisition unit 51. When the gaze determination unit 44 determines that the user is not gazing at the display screen 36 displaying the content, the audio output control unit 53 outputs the ambient sound acquired by the ambient sound acquisition unit 52.

When the gaze determination unit 44 determines that the user is not gazing at the display screen 36 displaying the content, the audio output control unit 53 may output the ambient sound acquired by the ambient sound acquisition unit 52 in addition to the audio related to the content acquired by the audio acquisition unit 51. In this case, the ambient sound is output from the headphones 10 together with the audio related to the content.
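
The selection performed by the audio output control unit 53 can be summarized in a short Python sketch. The names `Source` and `select_sources` and the flag `keep_content_when_away` are hypothetical; the flag merely distinguishes the basic behavior (ambient sound only) from the optional behavior (content audio plus ambient sound) described above.

```python
from enum import Enum
from typing import FrozenSet


class Source(Enum):
    CONTENT = "content"   # audio related to the displayed content
    AMBIENT = "ambient"   # ambient sound picked up by the microphones


def select_sources(gazing: bool, keep_content_when_away: bool = False) -> FrozenSet[Source]:
    """Return the set of sources to output for the current gaze judgment."""
    if gazing:
        # User is gazing at the screen: content audio only.
        return frozenset({Source.CONTENT})
    if keep_content_when_away:
        # Optional variant: ambient sound in addition to content audio.
        return frozenset({Source.CONTENT, Source.AMBIENT})
    # Basic variant: ambient sound only.
    return frozenset({Source.AMBIENT})
```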

Next, information processing in the headphones 10 is described with reference to FIGS. 3 and 4. FIG. 3 is a flowchart showing an example of the flow of processing in the content output system 1 according to the first embodiment. FIG. 4 is a flowchart showing another example of the flow of processing in the content output system 1 according to the first embodiment. The processing of the flowcharts shown in FIGS. 3 and 4 is executed, for example, when the headphones 10 are powered on, when the headphones 10 are put on the user's head, when audio output from the headphones 10 starts, or when an operation to start the content output control processing is performed.

First, the processing shown in FIG. 3 is described. The content output control device 40 determines whether content accompanied by audio output is being displayed on the display unit 32 (step S101). More specifically, it determines whether content consisting of audio and video input to the content input unit 31 is being played back, with the video of the content displayed on the display unit 32 and the audio of the content output to the headphones 10. In other words, it determines whether content consisting of audio and video is being played back. The state in which content accompanied by audio output is displayed on the display unit 32 is the state in which the video of the content is displayed on the display unit 32 and the audio of the content is output to the headphones 10; these outputs continue until playback of the content ends.

If it is determined that content accompanied by audio output is being displayed on the display unit 32 (Yes in step S101), the process proceeds to step S102. If it is not determined that content accompanied by audio output is being displayed on the display unit 32 (No in step S101), this process ends. In step S102, it is determined whether the user is gazing at the display screen 36. More specifically, the gaze determination unit 44 determines, from the detection result of the line-of-sight detection unit 43, whether the user is gazing at the display screen 36 of the electronic device 30 on which the content is displayed. The gaze determination unit 44 determines whether the state in which the display screen 36 is located in the user's line-of-sight direction has continued for the first predetermined period or longer. If the gaze determination unit 44 determines that the user is gazing at the display screen 36 (Yes in step S102), the process proceeds to step S103. If the gaze determination unit 44 does not determine that the user is gazing at the display screen 36 (No in step S102), the process proceeds to step S104.

If it is determined that the user is gazing at the display screen 36 (Yes in step S102), the content output control device 40 outputs the audio of the content (step S103). More specifically, the audio output control unit 53 outputs the audio related to the content acquired by the audio acquisition unit 51 from the headphones 10. As a result, the audio of the content is output from the left audio output unit 11 and the right audio output unit 12 of the headphones 10. The content output control device 40 then proceeds to step S105.

In the process of step S103, if ambient sound is not being output at the time of Yes in step S101, output of the content audio continues with ambient sound still not output; if ambient sound is being output at the time of Yes in step S101, output of the content audio continues and output of the ambient sound is stopped.

If it is not determined that the user is gazing at the display screen 36 (No in step S102), the content output control device 40 outputs ambient sound (step S104). More specifically, the audio output control unit 53 outputs the ambient sound acquired by the ambient sound acquisition unit 52. As a result, ambient sound is output from the left audio output unit 11 and the right audio output unit 12 of the headphones 10. The content output control device 40 then proceeds to step S105.

In the process of step S104, if ambient sound is not being output at the time of Yes in step S101, output of the ambient sound is started; if ambient sound is already being output at the time of Yes in step S101, output of the ambient sound is maintained.

The content output control device 40 determines whether to end use of the content output system 1, such as the headphones 10 and the electronic device 30 (step S105). For example, when the headphones 10 or the electronic device 30 are powered off, when the headphones 10 are removed from the user's head, when audio output to the headphones 10 is stopped, or when an operation to end the content output control processing is performed, the content output control device 40 determines that use is to be ended (Yes in step S105) and ends the process. If none of the above applies, the content output control device 40 does not determine that use is to be ended (No in step S105) and executes the process of step S101 again.
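
As a reading aid, the loop of steps S101 to S105 in FIG. 3 can be sketched in Python as follows. The `Tick` record and the function `run_fig3` are hypothetical names introduced only for this illustration; each tick stands for one pass through the flowchart with pre-computed answers to the three determinations, and the gaze answer is assumed to already reflect the first and second predetermined periods.

```python
from typing import Iterable, List, NamedTuple


class Tick(NamedTuple):
    """Answers to the three determinations for one pass (hypothetical record)."""

    content_playing: bool  # step S101: content with audio output is displayed
    gazing: bool           # step S102: user is gazing at the display screen
    end_requested: bool    # step S105: use of the system is to be ended


def run_fig3(ticks: Iterable[Tick]) -> List[str]:
    """Return the sound output chosen on each pass through the FIG. 3 loop."""
    outputs: List[str] = []
    for t in ticks:
        if not t.content_playing:      # S101: No, so the process ends
            break
        if t.gazing:                   # S102: Yes
            outputs.append("content")  # S103: output content audio
        else:                          # S102: No
            outputs.append("ambient")  # S104: output ambient sound
        if t.end_requested:            # S105: Yes, so the process ends
            break
        # S105: No, so return to S101 on the next tick
    return outputs
```

In a real device the loop would be driven by sensor input and playback state rather than by a prepared list; the list form is used here only so that the control flow can be followed step by step.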

Next, the processing shown in FIG. 4 is described. Steps S111, S112, S113, and S115 of the flowchart shown in FIG. 4 perform the same processing as steps S101, S102, S103, and S105 of the flowchart shown in FIG. 3, respectively.

If it is not determined that the user is gazing at the display screen 36 (No in step S112), the content output control device 40 outputs ambient sound together with the audio related to the content (step S114). More specifically, the ambient sound acquired by the ambient sound acquisition unit 52 is output from the left audio output unit 11 and the right audio output unit 12 of the headphones 10 together with the audio related to the content acquired by the audio acquisition unit 51. The content output control device 40 then proceeds to step S115.

In the process of step S113, if ambient sound is not being output at the time of Yes in step S111, output of the content audio continues with ambient sound still not output; if both the content audio and ambient sound are being output at the time of Yes in step S111, output of the content audio continues and output of the ambient sound is stopped.

In the process of step S114, if the content audio is being output and ambient sound is not being output at the time of Yes in step S111, output of the ambient sound is started while output of the content audio is maintained; if both the content audio and ambient sound are being output at the time of Yes in step S111, output of both is maintained.

When ambient sound is output together with the content audio in step S114, the volume of the content audio may remain at the volume set by the user, or the volume of the content audio may be made lower than the volume set by the user during the period in which ambient sound is output.
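
The two volume options for step S114 can be illustrated with a minimal Python sketch. Everything beyond the two options themselves is an assumption for illustration: the function name `mix_with_ambient`, the representation of audio as floating-point samples in the range -1.0 to 1.0, the unity gain applied to ambient sound, the `duck_gain` parameter, and the final clipping are not taken from the specification.

```python
from typing import List, Sequence


def mix_with_ambient(
    content: Sequence[float],
    ambient: Sequence[float],
    user_volume: float,
    duck_gain: float = 1.0,
) -> List[float]:
    """Mix content audio and ambient sound sample by sample (hypothetical helper).

    duck_gain = 1.0 keeps the content audio at the user-set volume;
    duck_gain < 1.0 lowers it while ambient sound is being output.
    """
    content_gain = user_volume * duck_gain
    mixed = [content_gain * c + a for c, a in zip(content, ambient)]
    # Assumed safeguard: clip to the valid sample range.
    return [max(-1.0, min(1.0, s)) for s in mixed]
```

Calling the function with `duck_gain=1.0` corresponds to keeping the user-set volume, and a value such as `duck_gain=0.5` corresponds to lowering the content audio while ambient sound is output.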

In this way, audio related to the content is output when the user wearing the headphones 10 is gazing at the display screen 36, and ambient sound is output when the user is not gazing at the display screen 36.

<Effects>
As described above, in this embodiment, the audio of the content is output when the user wearing the headphones 10 is gazing at the display screen 36, and ambient sound can be output when the user is not gazing at the display screen 36. According to this embodiment, when the user stops gazing at the display screen 36 and is presumed to have turned attention to the surroundings, the user can check the ambient sound appropriately. Thus, according to this embodiment, when the user wants to hear the ambient sound, it can be checked appropriately without any operation by the user.

In this embodiment, when the user is not gazing at the display screen 36, the ambient sound can be output together with the audio of the content. According to this embodiment, the user can check the ambient sound while continuing to view the content.

[Second embodiment]
A content output system 1A according to this embodiment will be described with reference to FIGS. 5 and 6. FIG. 5 is a block diagram of the content output system 1A according to the second embodiment. FIG. 6 is a flowchart showing an example of the flow of processing in the content output system 1A according to the second embodiment. The basic configuration of the content output system 1A is the same as that of the content output system 1 of the first embodiment. In the following description, components similar to those of the content output system 1 are denoted by the same or corresponding reference signs, and their detailed description is omitted. This embodiment differs from the first embodiment in that the electronic device 30A has a GNSS (Global Navigation Satellite System) receiving unit 34A and, in the content output control device 40A, a position information calculation unit 45A and a determination unit 46A, and in the processing performed by the audio output control unit 53A.

The GNSS receiving unit 34A is composed of, for example, a GNSS receiver that receives GNSS signals from GNSS satellites. The GNSS receiving unit 34A outputs the received GNSS signals to the position information calculation unit 45A.

The position information calculation unit 45A receives the GNSS signals from the GNSS receiving unit 34A and calculates current position information based on them. The position information calculation unit 45A and the GNSS receiving unit 34A are not limited to GNSS signals and may support other types of positioning satellite systems.

The determination unit 46A determines whether the user is using transportation. For example, based on the position information calculated by the position information calculation unit 45A, the determination unit 46A may determine that the user is using transportation when the current position of the user wearing the headphones 10 corresponds to the position information of transportation in map information (not shown), or based on the movement history or the movement speed. The method by which the determination unit 46A determines whether transportation is being used is not limited, and other methods may be used, such as one that uses noise or vibration around the headphones 10.
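One way to picture the position-based and speed-based determination is the following Python sketch. It is only an illustration under stated assumptions: the map information is reduced to hypothetical rectangular areas (`TransitArea`), the speed threshold `min_speed_mps` is an arbitrary value, and the two cues are combined with a simple OR. None of these details come from the description, which leaves the determination method open.

```python
from typing import NamedTuple, Sequence


class TransitArea(NamedTuple):
    """Hypothetical bounding box standing in for transportation map information."""
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float


def in_area(lat: float, lon: float, area: TransitArea) -> bool:
    """Return True if the position lies inside the bounding box."""
    return (area.lat_min <= lat <= area.lat_max
            and area.lon_min <= lon <= area.lon_max)


def is_using_transit(lat: float,
                     lon: float,
                     speed_mps: float,
                     transit_areas: Sequence[TransitArea],
                     min_speed_mps: float = 8.0) -> bool:
    """Judge transportation use from the current position or the movement speed.

    The threshold and the OR combination are assumptions for illustration.
    """
    on_transit_position = any(in_area(lat, lon, a) for a in transit_areas)
    moving_fast = speed_mps >= min_speed_mps
    return on_transit_position or moving_fast
```

A stricter variant could require both cues at once, or add movement history, noise, or vibration as further inputs.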

When the user is using transportation, the audio output control unit 53A outputs the audio related to the content acquired by the audio acquisition unit 51 if it is determined that the user is gazing at the display screen 36 on which the content is displayed, and outputs the ambient sound acquired by the ambient sound acquisition unit 52 if it is determined that the user is not gazing at the display screen 36 on which the content is displayed.

Next, information processing in the content output system 1A will be described with reference to FIG. 6. Steps S122 through S126 of the flowchart shown in FIG. 6 perform the same processing as steps S101 through S105 of the flowchart shown in FIG. 3.

The content output control device 40A determines whether transportation is being used (step S121). More specifically, the determination unit 46A determines that transportation is being used based on a determination result such as the user's current position, obtained from the position information calculated by the position information calculation unit 45A, corresponding to the position information of transportation in the map information. If the determination unit 46A determines that transportation is being used (Yes in step S121), the process proceeds to step S122. If the determination unit 46A does not determine that transportation is being used (No in step S121), the process ends.

In the end-of-use determination in step S126, in addition to determining whether use of the headphones 10 or the electronic device 30A is to end, the content output control device 40A also determines whether use of transportation is to end (step S126). More specifically, the determination unit 46A determines that use of transportation is to end upon detecting that the use has ended, for example when the user's current position, obtained from the position information calculated by the position information calculation unit 45A, leaves the position information of transportation in the map information. If the determination unit 46A determines that use of transportation is to end (Yes in step S126), the process ends. If the determination unit 46A does not determine that use of transportation is to end (No in step S126), the process proceeds to step S122.
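The flow of steps S121 through S126 can be sketched as a small loop. This Python sketch is an illustration only: the names `Pass` and `run_transit_control` are hypothetical, steps S122 through S125 are collapsed into a single gaze check because their details follow FIG. 3, and each loop iteration is assumed to carry the sensor results for one pass through the flowchart.

```python
from typing import Iterable, List, NamedTuple


class Pass(NamedTuple):
    """Results observed during one pass through steps S122 to S126 (hypothetical)."""
    gazing: bool         # gaze determination result for this pass
    left_transit: bool   # use of transportation has ended (checked in S126)
    device_off: bool     # headphones or electronic device no longer in use (S126)


def run_transit_control(in_transit_at_start: bool,
                        passes: Iterable[Pass]) -> List[str]:
    """Return the output selected on each pass: "content" or "ambient"."""
    selected: List[str] = []

    # Step S121: end immediately if transportation is not being used.
    if not in_transit_at_start:
        return selected

    for p in passes:
        # Steps S122 to S125, summarized: choose the output from the gaze result.
        selected.append("content" if p.gazing else "ambient")

        # Step S126: end when device use or transportation use ends.
        if p.device_off or p.left_transit:
            break
        # Otherwise the flow returns to step S122 (next iteration).
    return selected
```

Because step S121 gates the whole loop, no ambient sound is selected when the user is not using transportation, which matches the aim of avoiding unintended ambient output.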

<Effect>
As described above, in this embodiment, the ambient sound is output when the user wearing the headphones 10 is using transportation and is not gazing at the display screen 36. According to this embodiment, when the user wearing the headphones 10 stops gazing at the display screen 36 of the content while using transportation, for example in order to listen to an announcement, the user can hear ambient sound such as the announcement through the headphones 10. According to this embodiment, unintended output of the ambient sound can be suppressed.

[Third embodiment]
A content output system 1B according to this embodiment will be described with reference to FIGS. 7 and 8. FIG. 7 is a block diagram of the content output system 1B according to the third embodiment. FIG. 8 is a flowchart showing an example of the flow of processing in the content output system 1B according to the third embodiment. The basic configuration of the content output system 1B is the same as that of the content output system 1 of the first embodiment. This embodiment differs from the first embodiment in that the electronic device 30B has a photographing unit 35B and, in the content output control device 40B, a face detection unit 47B, and in the processing performed by the audio output control unit 53B.

The photographing unit 35B is a visible light camera that photographs a person facing the display screen 36 of the electronic device 30B. The photographing unit 35B photographs the user's face. The photographing unit 35B is arranged at a position where it can photograph the user's face while the user is viewing the display screen 36, for example at the top of the display screen 36 of the electronic device 30B. The photographing unit 35B outputs the captured data to the face detection unit 47B of the content output control device 40B. The photographing unit 35B and the line-of-sight sensor 33 may be the same visible light camera.

The face detection unit 47B recognizes the user's face from the data captured by the photographing unit 35B and detects the orientation of the recognized face. More specifically, the face detection unit 47B detects whether the detected face orientation faces the display screen 36. For example, when the user is gazing at the display screen 36 on which the content is displayed, the user's face faces the display screen 36. When the user turns attention to the surroundings and looks around, the user's face does not face the display screen 36.

The orientation facing the display screen 36 is an orientation in which the user can view the image or other content displayed on the display screen 36. It suffices to define an orientation in which the user is clearly regarded as looking at the display screen 36, for example a range in which, in both the vertical-direction view and the horizontal-direction view, the angle at which a straight line passing through the center between the user's eyes and extending forward from the user intersects the display screen 36 is about 90° ± 30°.
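The example range of about 90° ± 30° can be expressed as a simple check, sketched here in Python. The function name `is_facing_screen` and its parameters are hypothetical, and the sketch assumes the two intersection angles (one per view) have already been measured in degrees by some face-orientation estimator that the description does not specify.

```python
def is_facing_screen(angle_vertical_view_deg: float,
                     angle_horizontal_view_deg: float,
                     tolerance_deg: float = 30.0) -> bool:
    """Return True if the face is regarded as facing the display screen.

    Each argument is the angle, in one view, between the display screen and the
    straight line extending forward from the center between the user's eyes.
    The face counts as facing the screen when both angles lie within
    90 degrees plus or minus tolerance_deg.
    """
    return (abs(angle_vertical_view_deg - 90.0) <= tolerance_deg
            and abs(angle_horizontal_view_deg - 90.0) <= tolerance_deg)
```

With the default tolerance, angles of 60° and 120° are still treated as facing the screen, while 59° in either view is not. The tolerance is kept as a parameter because the description gives ± 30° only as an example.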

The audio output control unit 53B outputs the audio related to the content acquired by the audio acquisition unit 51 when the gaze determination unit 44 determines that the user is gazing at the display screen 36 on which the content is displayed and, in addition, the face orientation detected by the face detection unit 47B faces the display screen 36. The audio output control unit 53B outputs the ambient sound acquired by the ambient sound acquisition unit 52 when the gaze determination unit 44 determines that the user is not gazing at the display screen 36 on which the content is displayed and, in addition, the face orientation detected by the face detection unit 47B does not face the display screen 36. Even when the user is not gazing at the display screen 36 on which the content is displayed, if the user's face faces the display screen 36, the user is presumed to intend to continue viewing the content. In this case, it is preferable to let the user continue viewing the content. In contrast, when the user is not gazing at the display screen 36 on which the content is displayed and the user's face does not face the display screen 36, the user is presumed to be paying close attention to the surroundings. In this case, it is preferable to allow the user to check the ambient sound.

Next, information processing in the content output system 1B will be described with reference to FIG. 8. Steps S131, S132, and S134 through S136 of the flowchart shown in FIG. 8 perform the same processing as steps S101, S102, and S103 through S105 of the flowchart shown in FIG. 3.

The content output control device 40B determines whether the user's face is facing the display screen 36 (step S133). More specifically, if the face orientation detected by the face detection unit 47B faces the display screen 36 (Yes in step S133), the process proceeds to step S134. If the face orientation detected by the face detection unit 47B does not face the display screen 36 (No in step S133), the process proceeds to step S135.
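The way the gaze result and the face orientation combine can be summarized in a short Python sketch. The function name `select_output` and the string return values are hypothetical. The sketch also assumes that a gazing user always receives the content audio, including the unlikely case of gazing without facing the screen, which the description does not address explicitly.

```python
def select_output(gazing: bool, facing_screen: bool) -> str:
    """Choose "content" or "ambient" from gaze and face-orientation results."""
    if gazing:
        # Assumption: gazing at the screen leads to content audio output.
        return "content"
    if facing_screen:
        # Step S133 Yes: the user is presumed to intend to keep viewing,
        # so content audio continues (step S134).
        return "content"
    # Step S133 No: the user is presumed to be attending to the surroundings,
    # so ambient sound is output (step S135).
    return "ambient"
```

Ambient sound is selected only when both cues indicate that attention has moved away from the screen, so a brief glance away while the face still faces the screen does not interrupt the content audio.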

<Effect>
As described above, in this embodiment, the audio of the content is output when the user is gazing at the display screen 36 and the user's face faces the display screen 36, and the ambient sound acquired by the ambient sound acquisition unit 52 can be output when the user is not gazing at the display screen 36 and the user's face does not face the display screen 36. According to this embodiment, the user can appropriately check the ambient sound when the user turns the face away from the display screen 36 and is presumed to have turned attention to the surroundings.

Each component of the illustrated content output system 1 is functionally conceptual and does not necessarily have to be physically configured as illustrated. That is, the specific form of each device is not limited to the illustrated one, and all or part of each device may be functionally or physically distributed or integrated in arbitrary units according to the processing load, usage conditions, and the like of each device.

The configuration of the content output system 1 is realized, for example, as software by a program loaded into a memory. In the above embodiments, the configuration has been described as functional blocks realized by cooperation of such hardware or software. That is, these functional blocks can be realized in various forms by hardware only, by software only, or by a combination of the two.

The components described above include those that a person skilled in the art can easily conceive of and those that are substantially the same. Furthermore, the configurations described above can be combined as appropriate. Various omissions, substitutions, or changes of the configurations are also possible without departing from the gist of the present invention.

Although the headphones 10 have been described above as an example of the audio output device, the audio output device is not limited to them. The audio output device may be, for example, earphones or a neck-worn speaker.

1 Content output system
10 Headphones (audio output device)
11 Left audio output unit
12 Right audio output unit
14 Left microphone
15 Right microphone
30 Electronic device
31 Content input unit
32 Display unit
33 Line-of-sight sensor
40 Content output control device
41 Video acquisition unit
42 Display control unit
43 Line-of-sight detection unit
44 Gaze determination unit
50 Audio processing unit
51 Audio acquisition unit
52 Ambient sound acquisition unit
53 Audio output control unit

Claims

A content output control device comprising:
a line-of-sight detection unit that detects a line-of-sight direction of a user using a line-of-sight sensor disposed facing the same direction as a display screen;
a gaze determination unit that determines, based on a detection result of the line-of-sight detection unit, whether the user is gazing at the display screen on which content is displayed;
an ambient sound acquisition unit that acquires ambient sound around the user from a microphone provided in headphones worn by the user;
a determination unit that determines that the user is using transportation;
an audio acquisition unit that acquires audio related to the content; and
an audio output control unit that, when the user is using transportation, outputs the audio related to the content acquired by the audio acquisition unit if the gaze determination unit determines that the user is gazing at the display screen on which the content is displayed, and outputs the ambient sound acquired by the ambient sound acquisition unit if the gaze determination unit determines that the user is not gazing at the display screen on which the content is displayed.

The content output control device according to claim 1, wherein, when it is determined that the user is not gazing at the display screen on which the content is displayed, the audio output control unit outputs the ambient sound acquired by the ambient sound acquisition unit in addition to the audio related to the content acquired by the audio acquisition unit.

The content output control device according to claim 1 or 2, further comprising a face detection unit that recognizes the user's face and detects an orientation of the recognized face,
wherein the audio output control unit outputs the audio related to the content acquired by the audio acquisition unit when it is determined that the user is gazing at the display screen on which the content is displayed and, in addition, the orientation of the face detected by the face detection unit faces the display screen, and outputs the ambient sound acquired by the ambient sound acquisition unit when it is determined that the user is not gazing at the display screen on which the content is displayed and, in addition, the orientation of the face detected by the face detection unit does not face the display screen.

A content output system comprising:
the content output control device according to any one of claims 1 to 3;
a sound collection unit that collects ambient sound; and
an audio output unit that outputs audio.

A content output control method executed by a content output control device, the method comprising:
detecting a line-of-sight direction of a user who is using an audio output device, using a line-of-sight sensor disposed facing the same direction as a display screen;
determining, based on a detection result of the line-of-sight direction, whether the user is gazing at the display screen on which content is displayed;
acquiring ambient sound around the user from a microphone provided in headphones worn by the user;
determining that the user is using transportation;
acquiring audio related to the content; and
when the user is using transportation, outputting the audio related to the content if it is determined that the user is gazing at the display screen on which the content is displayed, and outputting the ambient sound if it is determined that the user is not gazing at the display screen on which the content is displayed.

A program that causes a computer to execute:
detecting a line-of-sight direction of a user who is using an audio output device, using a line-of-sight sensor disposed facing the same direction as a display screen;
determining, based on a detection result of the line-of-sight direction, whether the user is gazing at the display screen on which content is displayed;
acquiring ambient sound around the user from a microphone provided in headphones worn by the user;
determining that the user is using transportation;
acquiring audio related to the content; and
when the user is using transportation, outputting the audio related to the content if it is determined that the user is gazing at the display screen on which the content is displayed, and outputting the ambient sound if it is determined that the user is not gazing at the display screen on which the content is displayed.