JP2018093447A - Voice output controller

Info

Publication number: JP2018093447A
Application number: JP2016237679A
Authority: JP
Inventors: 直樹樋上; Naoki Higami; 侑人村; Yukihito Mura
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2016-12-07
Filing date: 2016-12-07
Publication date: 2018-06-14
Anticipated expiration: 2036-12-07
Also published as: JP6993774B2

Abstract

PROBLEM TO BE SOLVED: To realize a voice output controller capable of providing suitable immersion feeling according to the reproduction environment of a listener.

SOLUTION: A voice output controller (1) outputting a voice to voice output devices (2a, 2b, 2c) includes a voice data acquisition unit (11) for acquiring the voice data, a metadata acquisition unit (12) for acquiring the metadata indicating correspondence of the voice data and an object, and a determination unit (13) for determining the voice output devices (2a, 2b, 2c) for outputting the voice indicated by the voice data, with reference to the mapping information indicating the mapping of the object and the voice output devices (2a, 2b, 2c), and the metadata.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an audio output control device.

Techniques for playing back audio by reproducing the sound field at the time of recording are known. For example, Patent Document 1 discloses a system and method for an adaptive audio system that renders reflected sound in an adaptive audio system that has no overhead speakers.

Patent Document 2 discloses an acoustic system that can provide an optimum sound field for a listener who may change listening position, by including a speaker device that outputs the amplified sound signal input from an amplifier device with omnidirectional output characteristics.

Patent Document 3 discloses an acoustic reproduction apparatus that can suitably adjust the acoustic characteristics at the time of reproduction according to the room acoustic characteristics of the reproduction sound field space (the size, shape, interior, and so on of the listening room).

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2015-530824 (published October 15, 2015)
Patent Document 2: Japanese Unexamined Patent Application Publication No. 2014-103616 (published June 5, 2014)
Patent Document 3: Japanese Unexamined Patent Application Publication No. 2008-233920 (published October 2, 2008)

However, even if the sound field at the time of recording is reproduced as in the prior art, the listener's sense of immersion in the audio does not necessarily increase, depending on the reproduction environment. More specifically, depending on the positions of the speakers that play back the audio, the sense of immersion may actually decrease. For example, when listening indoors to an environmental sound such as rain, if the sound of rain striking a window or roof is heard from a speaker placed somewhere other than the window or roof, the sense of immersion is reduced rather than enhanced.

One aspect of the present invention has been made in view of the above problem, and an object thereof is to realize an audio output control device that can provide a suitable sense of immersion according to the reproduction environment of the listener.

To solve the above problem, an audio output control device according to one aspect of the present invention is an audio output control device that causes an audio output device to output audio, and includes: an audio data acquisition unit that acquires audio data; a metadata acquisition unit that acquires metadata indicating a correspondence between the audio data and an object; and a determination unit that determines the audio output device that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating an association between the object and the audio output device.

The audio output control device according to one aspect of the present invention has the effect of providing a suitable sense of immersion according to the reproduction environment of the listener.

FIG. 1 is a block diagram showing the main configuration of the audio output control device according to Embodiment 1 of the present invention.
FIG. 2 is a flowchart illustrating an example of the operation of the audio output control device according to Embodiment 1 of the present invention.
FIG. 3 is a diagram showing an example of the “audio data / object correspondence information” included in the metadata.
FIG. 4 is a diagram showing an example of the “speaker / object correspondence information” stored in the storage unit.
FIG. 5 is a diagram showing an example of the “audio data / speaker correspondence information” generated by the speaker determination unit.
FIGS. 6 to 8 are diagrams each illustrating an example of audio data output by the audio output control device according to Embodiment 1 of the present invention.
FIG. 9 is a block diagram showing the main configuration of the audio output control device according to Embodiment 2 of the present invention.
FIG. 10 is a flowchart illustrating an example of the operation of the audio output control device according to Embodiment 2 of the present invention.
FIGS. 11 to 17 are diagrams each showing an example of a UI screen generated by the UI generation unit of the audio output control device according to Embodiment 2 of the present invention.
FIG. 18 is a diagram showing the external appearance of a television provided with the audio output control device according to the present invention.

[Embodiment 1]
An audio output control device 1 according to Embodiment 1 of the present invention is described in detail below.

(1. Configuration of the main part of the audio output control device 1)
FIG. 1 is a block diagram showing the main configuration of the audio output control device 1 according to the present embodiment. As shown in FIG. 1, the audio output control device 1 causes a plurality of speakers (audio output devices) 2a, 2b, and 2c to output audio. The audio output control device 1 is connected to the speakers 2a, 2b, and 2c wirelessly or by wire. Although three speakers are illustrated, this does not limit the present embodiment, and any number of speakers may be used. Although not shown, the audio output control device 1 and the speakers 2a, 2b, and 2c each include a communication unit or a connection unit for realizing the wireless or wired connection.

In this description, “audio” is not limited to the human voice but refers to sound in general that propagates as vibration of the air. “Audio” includes music, environmental sounds, human voices, and the like.

The audio output control device 1 includes a control unit 10 and a storage unit 20. The control unit 10 performs overall control of the audio output control device 1.

The control unit 10 includes an audio data acquisition unit 11, a metadata acquisition unit 12, a speaker determination unit (determination unit) 13, and an output speaker control unit 14.

The audio data acquisition unit 11 refers to the content data to be processed by the audio output control device 1 and acquires audio data from that content data. The content data includes audio data and metadata. The content data may be acquired from a server or may be stored in the storage unit 20 in advance. The content data is not limited to audio-related data and may further include other data such as image data.

The audio data acquisition unit 11 also supplies the acquired audio data to the output speaker control unit 14. The audio data acquisition unit 11 may be configured to perform data processing such as decoding on the acquired audio data as appropriate before supplying it to the output speaker control unit 14.

The metadata acquisition unit 12 acquires metadata from the content data. The acquired metadata is supplied to the speaker determination unit 13. As described in detail later, the metadata includes “audio data / object correspondence information” indicating the correspondence between each piece of audio data and an object.

The storage unit 20, on the other hand, stores “speaker / object correspondence information” indicating the association between objects and the speakers 2a, 2b, and 2c.

Here, an “object” refers to at least one of an arbitrary region, a part of an arbitrary region, and a physical item existing in an arbitrary region. An “arbitrary region” is one region obtained by dividing the listener's reproduction environment functionally or physically. Specifically, if the listener's reproduction environment is a house, an “arbitrary region” may be a room such as a kitchen, a living room, or a bedroom. A “part of an arbitrary region” may be a member that makes up a room, such as a window or a ceiling. A “physical item existing in an arbitrary region” may be an article present in a room, such as a television receiver (television) or a bookshelf.

The speaker determination unit 13 refers to the “speaker / object correspondence information” stored in the storage unit 20 and the “audio data / object correspondence information” included in the metadata, and generates “audio data / speaker correspondence information”, which associates the audio indicated by the audio data with the speaker that is to output that audio. The generated “audio data / speaker correspondence information” is provided to the output speaker control unit 14.

The output speaker control unit 14 causes each piece of audio data to be output from the speaker associated with it, in accordance with the “audio data / speaker correspondence information”.

(2. Operation of the audio output control device 1)
FIG. 2 is a flowchart illustrating an example of the operation of the audio output control device 1 according to the present embodiment.

(Step S11)
First, the audio data acquisition unit 11 acquires audio data from the content data. The audio data acquisition unit 11 supplies the acquired audio data to the output speaker control unit 14.

(Step S12)
Next, the metadata acquisition unit 12 acquires metadata from the acquired audio data. The metadata acquisition unit 12 supplies the acquired metadata to the speaker determination unit 13.

(Step S13)
Next, the speaker determination unit 13 refers to the “speaker / object correspondence information” stored in the storage unit 20 and the “audio data / object correspondence information” included in the metadata, determines the speaker that is to output the audio indicated by the audio data, and generates “audio data / speaker correspondence information” indicating the result of that determination.

(Step S14)
Next, the output speaker control unit 14 causes each piece of audio data to be output from the speaker associated with it, in accordance with the “audio data / speaker correspondence information”.
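Steps S11 to S14 can be pictured as a short pipeline. The following Python sketch is illustrative only: the function name `play`, the dictionary layout of the content data, and the `determine` and `send` callbacks are assumptions introduced here and are not taken from the described device.

```python
from typing import Callable


def play(content: dict,
         determine: Callable[[dict], dict[int, list[str]]],
         send: Callable[[str, int, bytes], None]) -> None:
    """Run steps S11 to S14 once for a single piece of content data."""
    # S11: acquire the audio data (assumed layout: channel number -> samples).
    audio = content["audio"]
    # S12: acquire the metadata (assumed to be stored alongside the audio).
    metadata = content["metadata"]
    # S13: determine, for each channel, the speakers that should output it.
    channel_to_speakers = determine(metadata)
    # S14: output each channel from every speaker associated with it.
    for channel, speaker_ids in channel_to_speakers.items():
        for speaker_id in speaker_ids:
            send(speaker_id, channel, audio[channel])
```

Passing `determine` and `send` as callbacks keeps the sketch independent of how the correspondence tables are stored and of how a speaker is actually driven.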

(3. Specific examples of each type of correspondence information)
The various types of correspondence information that appeared in the above description are described more specifically below with reference to other drawings.

(Audio data / object correspondence information)
FIG. 3 is a diagram showing an example of the “audio data / object correspondence information” included in the metadata. The “audio data / object correspondence information” is information indicating the association between the audio data included in the content data and objects.

As shown in FIG. 3, for example, the content data with the content name “Relax Music [Rain]” includes audio data for audio channels 1 to 5. The audio data of audio channels 1 to 3 is associated with “Ceiling” as the output-destination object, and the audio data of audio channels 4 to 5 is associated with “Window” as the output-destination object. The content data with the content name “Relax Music [Rain]” has no designation regarding “Room”.

By contrast, the audio data of audio channels 1 to 2 included in the content data with the content name “Relax Music [Cooking]” is associated with “Kitchen” as the output-destination object, but has no designation regarding “Place”.
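One way to picture the “audio data / object correspondence information” is as a small table of rows, each naming a group of channels and an optional “Room” and “Place”. The sketch below is a hypothetical representation: the class name `ChannelTarget`, its field names, and the helper `targets_for` are assumptions for illustration, and only the two content items described in the text are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChannelTarget:
    """One row of the audio data / object correspondence information."""
    content: str
    channels: tuple[int, ...]
    room: Optional[str]   # None means there is no "Room" designation
    place: Optional[str]  # None means there is no "Place" designation


AUDIO_OBJECT_INFO = [
    ChannelTarget("Relax Music [Rain]", (1, 2, 3), None, "Ceiling"),
    ChannelTarget("Relax Music [Rain]", (4, 5), None, "Window"),
    ChannelTarget("Relax Music [Cooking]", (1, 2), "Kitchen", None),
]


def targets_for(content: str) -> list[ChannelTarget]:
    """Return the rows that belong to the given content name."""
    return [row for row in AUDIO_OBJECT_INFO if row.content == content]
```

Representing a missing designation as `None` makes the two cases in the text (no “Room” for Rain, no “Place” for Cooking) explicit in the data.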

(Speaker / object correspondence information)
FIG. 4 is a diagram showing an example of the “speaker / object correspondence information” stored in the storage unit 20. The “speaker / object correspondence information” is information indicating the association between speakers and objects.

As shown in FIG. 4, for example, the speaker with the ID “SP-01” is associated with the “Display Side” of the “Living Room”. The speaker with the ID “SP-06” is associated with the “Ceiling” of the “Kitchen”. Each speaker is preferably placed in contact with, or near, the object (Place) that exists within the object (Room) with which the speaker is associated in the “speaker / object correspondence information”.

(Audio data / speaker correspondence information)
FIG. 5 is a diagram showing an example of the “audio data / speaker correspondence information” generated by the speaker determination unit 13. The speaker determination unit 13 refers to the “speaker / object correspondence information” (FIG. 4) and the “audio data / object correspondence information” (FIG. 3) included in the metadata, determines the speaker that is to output the audio indicated by the audio data, and generates the “audio data / speaker correspondence information” (FIG. 5).

For example, on receiving the metadata of the content data with the content name “Relax Music [Rain]”, the speaker determination unit 13 refers to the “speaker / object correspondence information” shown in FIG. 4, determines the speakers with the IDs “SP-04” and “SP-06”, which are associated with “Ceiling”, as the output destinations of the audio data of audio channels 1 to 3, and determines the speakers with the IDs “SP-02” and “SP-03”, which are associated with “Window”, as the output destinations of the audio data of audio channels 4 to 5.

As another example, on receiving the metadata of the content data with the content name “Relax Music [Cafe]”, the speaker determination unit 13 refers to the “speaker / object correspondence information” shown in FIG. 4 and determines all registered speakers (the speakers with the IDs “SP-01”, “SP-02”, ... “SP-06”) as the output destinations of the audio data of audio channel 1.

As a further example, on receiving the metadata of the content data with the content name “Relax Music [Cooking]”, the speaker determination unit 13 refers to the “speaker / object correspondence information” shown in FIG. 4 and determines the speaker with the ID “SP-06”, which is associated with “Kitchen”, as the output destination of the audio data of audio channels 1 to 2. When, as with the content data with the content name “Relax Music [Cooking]”, only “Room” is associated as the output-destination object and no “Place” is designated, all speakers whose “Room” information matches may be determined as the output destinations of the audio data.
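The three examples above amount to a matching rule in which a missing “Room” or “Place” designation matches every speaker. The following Python sketch shows one way to express that rule. It is not the patented implementation: the function name `determine_speakers` is hypothetical, only the rows for SP-01 and SP-06 come from the text, the remaining table rows are assumed so that the results agree with the examples, and the Cafe content is assumed to carry neither designation.

```python
from typing import Optional

# Speaker / object correspondence information: speaker ID -> (Room, Place).
# SP-01 and SP-06 follow the text; every other Room or row is an assumption.
SPEAKER_OBJECT_INFO = {
    "SP-01": ("Living Room", "Display Side"),
    "SP-02": ("Living Room", "Window"),   # Room assumed
    "SP-03": ("Bedroom", "Window"),       # Room assumed
    "SP-04": ("Living Room", "Ceiling"),  # Room assumed
    "SP-05": ("Bedroom", "Bed Side"),     # whole row assumed
    "SP-06": ("Kitchen", "Ceiling"),
}


def determine_speakers(room: Optional[str], place: Optional[str]) -> list[str]:
    """Return the IDs of speakers matching the designation.

    A designation of None matches any value, so (None, None) selects
    every registered speaker.
    """
    return [
        speaker_id
        for speaker_id, (sp_room, sp_place) in SPEAKER_OBJECT_INFO.items()
        if (room is None or room == sp_room)
        and (place is None or place == sp_place)
    ]


rain_ceiling = determine_speakers(None, "Ceiling")  # channels 1 to 3
rain_window = determine_speakers(None, "Window")    # channels 4 to 5
cooking = determine_speakers("Kitchen", None)       # channels 1 to 2
cafe = determine_speakers(None, None)               # channel 1
```

Treating an absent designation as a wildcard means one function covers the Room-only, Place-only, and undesignated cases without special branches.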

The output speaker control unit 14 causes each piece of audio data to be output from the speaker associated with it, in accordance with the “audio data / speaker correspondence information” shown in FIG. 5. FIGS. 6 to 8 are diagrams illustrating examples of audio data output by the audio output control device 1 according to Embodiment 1 of the present invention.

FIG. 6 shows an output example for the audio data included in the content data with the content name “Relax Music [Rain]”. When playing back this content data, the output speaker control unit 14, in accordance with the “audio data / speaker correspondence information” shown in FIG. 5, causes the audio data of audio channels 1 to 3 to be output from the speaker 2b associated with the window 201, and causes the audio data of audio channels 4 to 5 to be output from the speaker 2a associated with the ceiling 202. The output speaker control unit 14 does not cause audio data to be output from the speaker 2c associated with the television 200. As a result, the environmental sound of rain is heard from the direction of the window and the ceiling, so the listener 203 can feel as though rain is striking the window and the roof, and can obtain a suitable sense of immersion in the environmental sound of rain. Furthermore, the audio output control device 1 according to Embodiment 1 of the present invention may be configured to display video of rain on the television 200 while causing the speakers to output the audio data. This gives the listener 203 an even greater sense of immersion in the environmental sound of rain.

FIG. 7 shows an output example for the audio data included in the content data with the content name “Relax Music [Cafe]”. When playing back this content data, the output speaker control unit 14, in accordance with the “audio data / speaker correspondence information” shown in FIG. 5, causes the audio data of audio channel 1 to be output from all speakers present in the reproduction environment (that is, the speaker 2b associated with the window 201, the speaker 2a associated with the ceiling 202, and the speaker 2c associated with the television 200). As a result, the environmental sound of a bustling cafe is heard from throughout the reproduction environment, so the listener 203 can feel as though he or she is in a cafe, and can obtain a suitable sense of immersion in that environmental sound. Furthermore, the audio output control device 1 according to Embodiment 1 of the present invention may be configured to display video of a cafe on the television 200 while causing the speakers to output the audio data. This gives the listener 203 an even greater sense of immersion in the environmental sound of the bustling cafe.

FIG. 8 shows an output example for the audio data included in the content data with the content name “Relax Music [Fire Place]”. When playing back this content data, the output speaker control unit 14, in accordance with the “audio data / speaker correspondence information” shown in FIG. 5, causes the audio data of audio channels 1 to 2 to be output from the speaker 2c associated with the television 200. The output speaker control unit 14 does not cause audio data to be output from the speaker 2b associated with the window 201 or the speaker 2a associated with the ceiling 202. Because the environmental sound of a fire burning in a fireplace is therefore not heard from the speaker 2b associated with the window 201 or the speaker 2a associated with the ceiling 202, the sense of immersion of the listener 203 in that environmental sound is not impaired. Furthermore, the audio output control device 1 according to Embodiment 1 of the present invention may be configured to display video of a fire burning in a fireplace on the television 200 while causing the speakers to output the audio data. This gives the listener 203 an even greater sense of immersion in the environmental sound of the fireplace fire.

(4. Modifications)
In one modification, when outputting the audio data of content whose content data has no “Room” designation, such as the content data with the content name “Relax Music [Rain]” in FIG. 3, the audio output control device 1 according to Embodiment 1 of the present invention may be configured so that the speaker determination unit 13 takes into account information on the region of the reproduction environment in which the listener is present when determining the speaker that is to output the audio indicated by the audio data. Accordingly, the audio output control device 1 according to Embodiment 1 of the present invention may further include a region information acquisition unit (not shown in FIG. 1) for acquiring information on the region of the reproduction environment in which the listener is present. For example, as shown in FIGS. 6 to 8, when the reproduction environment of the listener 203 is a house and the listener 203 is in the Living Room, the region information acquisition unit acquires the region information “Living Room” as the information on the region in which the listener is present. The acquired region information is supplied to the speaker determination unit 13. Taking the region information into account, the speaker determination unit 13 determines, from among all the speakers associated with the audio data, only the speakers associated with the Living Room as the output destinations of the audio data.

As one method of acquiring the information on the region in which the listener is present, the user may, for example, be allowed to add a designation of a desired “Room” to the metadata acquired by the audio output control device 1. In that case, the speaker determination unit 13 refers to the user's designation in addition to the “audio data / object correspondence information” (FIG. 3) and the “speaker / object correspondence information” (FIG. 4), determines the speaker that is to output the audio indicated by the audio data, and generates “audio data / speaker correspondence information” (not shown). Accordingly, the audio output control device 1 according to Embodiment 1 of the present invention may further include a UI generation unit and a display unit for letting the user designate a “Room”, and an operation reception unit (not shown in FIG. 1) for reflecting the user's operation in the determination of the output speakers.
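The region-based narrowing in this modification can be sketched as a filter applied after the speakers have been determined. The Python below is a minimal illustration under stated assumptions: the function name `filter_by_listener_room` and the `SPEAKER_ROOMS` table are hypothetical, the “Room” of SP-04 is assumed, and passing `None` stands for the case in which no region information is available.

```python
from typing import Optional


def filter_by_listener_room(candidates: list[str],
                            speaker_rooms: dict[str, str],
                            listener_room: Optional[str]) -> list[str]:
    """Keep only the candidate speakers located in the listener's Room."""
    if listener_room is None:
        # No region information: leave the determined speakers unchanged.
        return list(candidates)
    return [sid for sid in candidates if speaker_rooms.get(sid) == listener_room]


# SP-06 is in the Kitchen per the text; the Room of SP-04 is an assumption.
SPEAKER_ROOMS = {"SP-04": "Living Room", "SP-06": "Kitchen"}
ceiling_speakers = ["SP-04", "SP-06"]  # speakers associated with "Ceiling"
living_room_only = filter_by_listener_room(
    ceiling_speakers, SPEAKER_ROOMS, "Living Room")
```

With these assumed values, a listener in the Living Room would hear the ceiling channels only from SP-04, and the Kitchen speaker would stay silent.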

In another modification, the audio output control device 1 may hold an intermediate table (not shown) corresponding to the acquired metadata in the storage unit 20 as cache data, and the user's designation of a desired “Room” may be added to the cached intermediate table. In this configuration, the speaker determination unit 13 may refer to the cached intermediate table to which the user's designation has been added and to the “speaker / object correspondence information” (FIG. 4), determine the speaker that is to output the audio indicated by the audio data, and generate “audio data / speaker correspondence information” (not shown).

In another modification, the audio output control device 1 according to Embodiment 1 of the present invention may be configured to determine the speaker that outputs the sound indicated by the audio data in response to a user operation. For example, the user may be allowed to add, to the “speaker / object correspondence information” (FIG. 4), a designation of the speaker from which the sound is to be output. In that case, the speaker determination unit 13 refers to the user's designation in addition to the “audio data / object correspondence information” (FIG. 3) and the “speaker / object correspondence information” (FIG. 4), determines the speaker that outputs the sound indicated by the audio data, and generates “audio data / speaker correspondence information” (not shown). Accordingly, the audio output control device 1 according to Embodiment 1 of the present invention may further include a UI generation unit and a display unit for letting the user select an output speaker, and an operation receiving unit (not shown in FIG. 1) for reflecting the user's operation in the determination of the output speaker.

[Embodiment 2]
The audio output control device 1a according to Embodiment 2 of the present invention is described in detail below. For convenience of explanation, members having the same functions as the members described in the preceding embodiment are given the same reference numerals, and their description is omitted.

(1. Main configuration of the audio output control device 1a)
FIG. 9 is a block diagram showing the main configuration of the audio output control device 1a according to the present embodiment. As shown in FIG. 9, the audio output control device 1a differs from the audio output control device 1 of Embodiment 1 in that (i) the audio output control device 1a further includes a display unit 30, and (ii) the control unit 10a further includes an operation receiving unit 15, an information updating unit 16, and a UI generation unit 17. With this configuration, the audio output control device 1a can update the “speaker / object correspondence information” stored in the storage unit 20 based on a user instruction. The processing of the operation receiving unit 15, the information updating unit 16, and the UI generation unit 17 is described in detail later.

The display unit 30 is connected to the audio output control device 1a wirelessly or by wire. As shown in FIG. 9, the audio output control device 1a causes the display unit 30 to display an image of a user interface screen (UI screen) related to the “speaker / object correspondence information”. The display unit 30 may be a touch panel. Although not shown, the audio output control device 1a and the display unit 30 each include a communication unit or a connection unit for establishing the wireless or wired connection.

The operation receiving unit 15 receives an instruction from the user regarding the association between objects and the speakers 2a, 2b, and 2c. For example, the operation receiving unit 15 receives the user's instruction via an input device (not shown) such as a mouse or a touch panel.

The information updating unit 16 updates the “speaker / object correspondence information” stored in the storage unit 20 based on the user instruction received by the operation receiving unit 15.

The UI generation unit 17 acquires the “speaker / object correspondence information” stored in the storage unit 20 and generates an image of a UI screen related to that information. The generated UI screen image is displayed on the display unit 30.

(2. Operation of the audio output control device 1a)
As described above, the audio output control device 1a according to the present embodiment differs from the audio output control device 1 of Embodiment 1 in that it can update the “speaker / object correspondence information” stored in the storage unit 20 based on a user instruction. Only the operation relating to this difference is described below.

FIG. 10 is a flowchart illustrating an example of the operation of the audio output control device 1a according to the present embodiment.

(Step S21)
First, the operation receiving unit 15 receives an instruction from the user regarding the association between objects and the speakers 2a, 2b, and 2c. The operation receiving unit 15 supplies the received user instruction to the information updating unit (association information updating unit) 16.

(Step S22)
Next, the information updating unit 16 updates the “speaker / object correspondence information” stored in the storage unit 20 based on the received user instruction.
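Steps S21 and S22 can be pictured as a received instruction being applied to the stored table. The following is a minimal sketch under several assumptions: the table is modeled as a dictionary from speaker ID to a (Room, Place) pair, and the names `UserInstruction` and `update_mapping`, the three action strings, and the error handling are all hypothetical and not taken from the original.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UserInstruction:
    """Instruction handed from the operation receiving side to the updating side (assumed shape)."""
    action: str  # "add", "edit", or "delete"
    speaker_id: str
    room: str = ""
    place: str = ""


def update_mapping(mapping: dict[str, tuple[str, str]], instruction: UserInstruction) -> None:
    """Apply one user instruction to the speaker / object correspondence table in place."""
    if instruction.action == "add":
        if instruction.speaker_id in mapping:
            raise ValueError(f"{instruction.speaker_id} is already registered")
        mapping[instruction.speaker_id] = (instruction.room, instruction.place)
    elif instruction.action == "edit":
        if instruction.speaker_id not in mapping:
            raise KeyError(instruction.speaker_id)
        mapping[instruction.speaker_id] = (instruction.room, instruction.place)
    elif instruction.action == "delete":
        del mapping[instruction.speaker_id]  # raises KeyError for an unknown ID
    else:
        raise ValueError(f"unknown action: {instruction.action}")


# Step S21: an instruction is received; step S22: the stored table is updated.
stored = {"SP-01": ("Living Room", "TV")}
update_mapping(stored, UserInstruction("add", "SP-02", "Living Room", "Window"))
print(stored)
```

Rejecting a duplicate ID on "add" is an assumption of this sketch; the original does not state how such a case is handled.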

(3. Specific examples of UI screens, and operation examples of the audio output control device 1a based on received user operations)
With reference to FIGS. 11 to 17, the following describes specific examples of the UI screens (100a to 100g) generated by the UI generation unit (UI screen generation device) 17, and examples of the operation of the audio output control device 1a based on received user operations.

(Initial screen)
FIG. 11 is a diagram showing an example of the UI screen 100a generated by the UI generation unit 17; it shows the initial screen for registering “speaker / object correspondence information”.

As shown in FIG. 11, the UI screen 100a includes a speaker / object correspondence information display area 102a, a slide bar 104, and an add button 101a.

The speaker / object correspondence information display area 102a is an area for displaying part or all of the registered speaker / object correspondence information. The slide bar 104 can be moved up and down by the user to display the portion of the speaker / object correspondence information that is not currently displayed.

The add button 101a is a button used to add a new entry to the speaker / object correspondence information.

The slide bar 104 may be moved and the add button 101a pressed by, for example, a combination of selection with a user-operated cursor and a physical button provided on a remote controller or the like. Alternatively, the display unit 30 (FIG. 9) may be a touch panel that the user operates by touching it directly.

As shown in FIG. 11, the UI screen 100a contains, as speaker / object correspondence information, (i) the ID of each speaker, (ii) the “Room” corresponding to that ID (the name of an “area” as an object), and (iii) the “Place” corresponding to that ID (the name of a “part of an area” or of an “object present in an area” as an object).

In the example shown in FIG. 11, the ID that has focus, together with the “Room” and the “Place” corresponding to that ID, is highlighted by enclosing them in a rectangular frame 103. Specific examples of user instructions for the focused ID and its associated information are described later.

(Device name selection screen)
FIG. 12 is a diagram showing an example of the UI screen 100b generated by the UI generation unit 17; it shows the UI screen displayed on the display unit 30 after the user presses the add button 101a in the UI screen 100a shown in FIG. 11.

As shown in FIG. 12, the UI screen 100b includes a “Device name” display area 102b and a slide bar 104.

The “Device name” display area 102b is an area for displaying part or all of the information on the speakers that the audio output control device 1a can detect. The slide bar 104 can be moved up and down by the user to display the portion of the speaker information that is not currently displayed.

As shown in FIG. 12, the UI screen 100b contains “Device name” information as the information on the speakers that the audio output control device 1a can detect. In the example shown in FIG. 12, the “Device name” that has focus is highlighted by enclosing it in a rectangular frame 103. The user can select the focused “Device name” as a speaker to be newly registered in the “speaker / object correspondence information”.

The information updating unit 16 identifies the speaker selected by the user as the speaker to be newly registered.

(Speaker registration screen)
FIG. 13 is a diagram showing an example of the UI screen 100c generated by the UI generation unit 17; it shows the UI screen displayed on the display unit 30 after the user selects the speaker to be newly registered on the UI screen 100b shown in FIG. 12.

As shown in FIG. 13, the UI screen 100c includes a “Device name” display area 106, a speaker ID display area 107, a “Room” display area 108, a “Place” display area 109, and an add button 101b.

The “Device name” display area 106 is an area for displaying the name of the speaker selected by the user on the UI screen 100b shown in FIG. 12.

The speaker ID display area 107 is an area for displaying a speaker ID. The speaker ID display area 107 may display an ID that is automatically assigned, in ascending order, from among the IDs not yet associated with a “Room” and a “Place”, or may display an ID freely chosen by the user.
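Automatic assignment of unused IDs in ascending order could look like the following minimal sketch. The function name `next_free_speaker_id`, the assumption that numbering starts at 1, and the two-digit “SP-NN” format (modeled on IDs such as “SP-03” that appear elsewhere in this description) are illustrative and are not specified by the original.

```python
def next_free_speaker_id(used_ids: set[str], prefix: str = "SP-") -> str:
    """Return the lowest-numbered ID of the form SP-NN that is not yet in use."""
    number = 1  # assumed starting number
    while f"{prefix}{number:02d}" in used_ids:
        number += 1
    return f"{prefix}{number:02d}"


# IDs already associated with a Room and a Place (hypothetical sample).
print(next_free_speaker_id({"SP-01", "SP-02", "SP-04"}))  # SP-03
```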

The “Room” display area 108 is an area for displaying “Room” information. The “Room” display area 108 includes a “Room” candidate list display area 108a, a slide bar 108b, an add button 108c, and a list display button 108d.

The list display button 108d is a button used to toggle the “Room” candidate list display area 108a between shown and hidden. When the “Room” candidate list display area 108a is hidden, the user can show it by pressing the list display button 108d, and vice versa.

The “Room” candidate list display area 108a is an area for displaying part or all of the registered “Room” candidate list. The slide bar 108b can be moved up and down by the user to display the portion of the “Room” candidate list that is not currently displayed. The add button 108c is a button used to add new “Room” candidate information to the “Room” candidate list. The “Room” that the user designates from the “Room” candidate list is displayed in the “Room” display area 108.

The “Place” display area 109 is an area for displaying “Place” information. Although not shown, the “Place” display area 109 can, like the “Room” display area 108, be configured to include a “Place” candidate list display area and a slide bar. Pressing the list display button 109d toggles the “Place” candidate list display area between shown and hidden. The “Place” display area 109 displays the “Place” that the user designates from the “Place” candidate list.

When the add button 101b is pressed, the operation receiving unit 15 receives the user's designation of (i) the speaker ID, (ii) the “Room” corresponding to that ID, and (iii) the “Place” corresponding to that ID. The operation receiving unit 15 then supplies the received user instruction to the information updating unit 16. Based on the received user instruction, the information updating unit 16 associates the speaker ID displayed in the speaker ID display area 107, the “Room” information displayed in the “Room” display area 108, and the “Place” information displayed in the “Place” display area 109 with one another, and adds them to the “speaker / object correspondence information” stored in the storage unit 20.

(“Room” candidate list update screen)
FIG. 14 is a diagram showing an example of the UI screen 100d generated by the UI generation unit 17; it shows the UI screen displayed on the display unit 30 after the user presses the add button 108c on the UI screen 100c shown in FIG. 13. In principle, the “Room” names specified by the metadata correspond to the names displayed in the “Room” candidate list. However, a “Room” name specified by the metadata may change, for example because of a version upgrade on the metadata transmitting side, so that the “Room” candidate list needs to be updated. Even in such a case, the user can update the “Room” candidate list by pressing the add button 108c on the UI screen 100c shown in FIG. 13.

As shown in FIG. 14, the UI screen 100d consists of the UI screen 100c shown in FIG. 13 with a “Room” candidate list addition screen 110 superimposed on it.

The “Room” candidate list addition screen 110 includes a “Room” information input area 111 and an add button 101c.

The “Room” information input area 111 is an area for entering the name of a new “Room” candidate. The user can enter the name of the “Room” candidate in the “Room” information input area 111 via an input device (not shown) such as a keyboard. FIG. 14 shows an example in which the user has entered “Kids Room” as the name of a new “Room” candidate.

The add button 101c is a button used to add a “Room” candidate to the “Room” candidate list.

When the add button 101c is pressed, the operation receiving unit 15 receives the user's designation of the name of the new “Room” candidate. The operation receiving unit 15 then supplies the received user instruction to the information updating unit 16. Based on the received user instruction, the information updating unit 16 adds “Kids Room” as a new “Room” candidate to the “Room” candidate list displayed in the “Room” candidate list display area 108a.
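Adding a name such as “Kids Room” to the “Room” candidate list might be sketched as below. The function name `add_room_candidate` is hypothetical, and the handling of blank or duplicate names is an assumption of this sketch; the original only states that the new candidate is added to the list.

```python
def add_room_candidate(candidates: list[str], name: str) -> bool:
    """Append a new Room name to the candidate list.

    Returns False (and leaves the list unchanged) for blank or duplicate names.
    """
    cleaned = name.strip()
    if not cleaned or cleaned in candidates:
        return False
    candidates.append(cleaned)
    return True


rooms = ["Living Room", "Bed Room"]  # hypothetical existing candidates
add_room_candidate(rooms, "Kids Room")
print(rooms)  # ['Living Room', 'Bed Room', 'Kids Room']
```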

(Speaker / object correspondence information update screen (1))
FIG. 15 is a diagram showing an example of the UI screen 100e generated by the UI generation unit 17; it shows the UI screen displayed on the display unit 30 after the user selects, on the UI screen 100a shown in FIG. 11, the focused ID together with the “Room” and the “Place” corresponding to that ID.

As shown in FIG. 15, the UI screen 100e consists of the UI screen 100a shown in FIG. 11 with a selected speaker information display screen 120 superimposed on it.

The selected speaker information display screen 120 includes a speaker / object correspondence information display area 123, an edit button 121, and a delete button 122.

The speaker / object correspondence information display area 123 is an area for displaying the speaker / object correspondence information for the selected speaker. The edit button 121 is a button used to edit the speaker / object correspondence information displayed in the speaker / object correspondence information display area 123. The delete button 122 is a button used to delete the speaker / object correspondence information displayed in the speaker / object correspondence information display area 123 from the speaker / object correspondence information display area 102a.

When the edit button 121 is pressed, the operation receiving unit 15 receives the user's edit designation. The UI generation unit 17 then generates the UI screen shown in FIG. 17, which is described in detail later.

When the delete button 122 is pressed, the operation receiving unit 15 receives the user's delete designation. The UI generation unit 17 then generates the UI screen shown in FIG. 16, which is described in detail later.

(Speaker / object correspondence information update screen (2))
FIG. 16 is a diagram showing an example of the UI screen 100f generated by the UI generation unit 17; it shows the UI screen displayed on the display unit 30 after the user presses the delete button 122 in the UI screen 100e shown in FIG. 15.

As shown in FIG. 16, the UI screen 100f consists of the UI screen 100a shown in FIG. 11 with a confirmation screen 130 superimposed on it.

The confirmation screen 130 includes a Yes button 131 and a No button 132. The Yes button 131 is used to delete the speaker / object correspondence information displayed in the speaker / object correspondence information display area 123 of FIG. 15. The No button 132 is used to leave that speaker / object correspondence information undeleted.

When the Yes button 131 is pressed, the operation receiving unit 15 receives the user's designation of Yes and supplies the received user instruction to the information updating unit 16. Based on the received user instruction, the information updating unit 16 deletes the speaker / object correspondence information displayed in the speaker / object correspondence information display area 123 of FIG. 15 from the “speaker / object correspondence information” stored in the storage unit 20. Based on the updated “speaker / object correspondence information”, the UI generation unit 17 causes the new speaker / object correspondence information to be displayed in the speaker / object correspondence information display area 123 (not shown).

When the No button 132 is pressed, the operation receiving unit 15 receives the user's designation of No. In this case, the speaker / object correspondence information displayed in the speaker / object correspondence information display area 123 of FIG. 15 is not deleted from the “speaker / object correspondence information” stored in the storage unit 20.

When speaker information is deleted from the speaker / object correspondence information, the speaker IDs numbered after the deleted speaker ID may be displayed with their numbers moved up by one. For example, when the information for speaker ID “SP-03” is deleted, the speaker IDs “SP-04”, “SP-05”, and “SP-06” become “SP-03”, “SP-04”, and “SP-05”, respectively.
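The renumbering described above, in which each speaker ID after the deleted one moves up by one position, can be sketched as follows. The function names, the dictionary layout (speaker ID mapped to a Room and Place pair), and the sample Room and Place values are hypothetical; the sketch also assumes that IDs follow the “SP-NN” pattern used in the example.

```python
def parse_number(speaker_id: str) -> int:
    """Extract the numeric part of an ID such as 'SP-03'."""
    return int(speaker_id.split("-")[1])


def delete_and_renumber(
    table: dict[str, tuple[str, str]], deleted_id: str
) -> dict[str, tuple[str, str]]:
    """Return a new table without deleted_id, moving later IDs up by one."""
    if deleted_id not in table:
        raise KeyError(deleted_id)
    deleted_number = parse_number(deleted_id)
    result: dict[str, tuple[str, str]] = {}
    for speaker_id, objects in table.items():
        number = parse_number(speaker_id)
        if number == deleted_number:
            continue  # drop the deleted entry
        if number > deleted_number:
            number -= 1  # e.g. SP-04 becomes SP-03
        result[f"SP-{number:02d}"] = objects
    return result


# Hypothetical table with SP-01 .. SP-06.
table = {f"SP-{n:02d}": (f"Room {n}", f"Place {n}") for n in range(1, 7)}
after = delete_and_renumber(table, "SP-03")
print(list(after))  # ['SP-01', 'SP-02', 'SP-03', 'SP-04', 'SP-05']
```

The sketch renumbers the stored keys themselves; the original only says that the IDs may be displayed with the new numbers, so renumbering at display time alone would be an equally valid reading.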

(Speaker / object correspondence information update screen (3))
FIG. 17 is a diagram showing an example of the UI screen 100g generated by the UI generation unit 17; it shows the UI screen displayed on the display unit 30 after the user presses the edit button 121 in the UI screen 100e shown in FIG. 15.

The UI screen 100g shown in FIG. 17 is the UI screen 100c shown in FIG. 13 with the add button 101b replaced by a change button 105.

When the change button 105 is pressed, the operation receiving unit 15 receives the user's new designation of the “Room” information and the “Place” information and supplies the received user instruction to the information updating unit 16. Then, of the “speaker / object correspondence information” stored in the storage unit 20, the “Room” information and the “Place” information associated with the speaker ID “SP-07” are updated to the information displayed in the “Room” display area 108 and the “Place” display area 109, respectively.

The present embodiment has shown an example of changing the object information (the “Room” information and the “Place” information) corresponding to a speaker ID. Conversely, the speaker ID corresponding to given object information can also be changed.

[Embodiment 3]
The audio output control device according to the present invention may be provided in an image display device. FIG. 18 is a diagram showing the external appearance of a television 200 that includes the audio output control device according to the present invention and a tuner.

In another embodiment, the audio output control device according to the present invention may be a device that is separate from the television 200 and externally attached to it.

When the audio output control device 1a according to Embodiment 2 is provided, the television 200 may also serve as the display unit 30 (FIG. 9), or the display unit 30 may be provided separately from the television 200.

The image display device is not limited to a television and may be, for example, a monitor for a personal computer.

[Example of software implementation]
The control blocks of the audio output control devices 1 and 1a (in particular, the audio data acquisition unit 11, the metadata acquisition unit 12, the speaker determination unit 13, and the output speaker control unit 14) may be implemented by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or by software using a CPU (Central Processing Unit).

In the latter case, the audio output control devices 1 and 1a include a CPU that executes the instructions of a program, which is software implementing each function; a ROM (Read Only Memory) or storage device (referred to as a “recording medium”) on which the program and various data are recorded so as to be readable by a computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and the like. The object of the present invention is achieved when the computer (or CPU) reads the program from the recording medium and executes it. As the recording medium, a “non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The program may also be supplied to the computer via any transmission medium capable of transmitting it (such as a communication network or a broadcast wave). One aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

[Summary]
An audio output control device (1, 1a) according to aspect 1 of the present invention is an audio output control device (1, 1a) that causes audio output devices (speakers 2a, 2b, 2c) to output audio, and includes: an audio data acquisition unit (11) that acquires audio data; a metadata acquisition unit (12) that acquires metadata indicating the correspondence between the audio data and an object; and a determination unit (speaker determination unit 13) that determines the audio output device (speakers 2a, 2b, 2c) that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating the association between the object and the audio output devices (speakers 2a, 2b, 2c).
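The determination in aspect 1 amounts to joining two tables on the object: the metadata (audio data to object) and the association information (object to speaker). The following is a minimal sketch of that join, assuming an object is identified by a “Room” and a “Place” and that a missing “Room” in the metadata matches any room. The class and function names, the track names, and the sample rows are hypothetical and are not taken from the original.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AudioObject:
    """Metadata row: the object an audio track corresponds to (assumed layout)."""
    track: str
    place: str
    room: str | None = None  # None when the content specifies no Room


@dataclass(frozen=True)
class SpeakerObject:
    """Association row: the object a speaker is associated with (assumed layout)."""
    speaker_id: str
    room: str
    place: str


def determine_speakers(
    metadata: list[AudioObject], association: list[SpeakerObject]
) -> dict[str, list[str]]:
    """Map each audio track to the IDs of the speakers sharing its object."""
    result: dict[str, list[str]] = {}
    for audio in metadata:
        result[audio.track] = [
            speaker.speaker_id
            for speaker in association
            if speaker.place == audio.place
            and (audio.room is None or speaker.room == audio.room)
        ]
    return result


metadata = [
    AudioObject("track-1", "Window"),             # no Room specified
    AudioObject("track-2", "TV", "Living Room"),
]
association = [
    SpeakerObject("SP-01", "Living Room", "TV"),
    SpeakerObject("SP-02", "Living Room", "Window"),
    SpeakerObject("SP-03", "Bed Room", "Window"),
]
print(determine_speakers(metadata, association))
# {'track-1': ['SP-02', 'SP-03'], 'track-2': ['SP-01']}
```

The returned dictionary plays the role of the “audio data / speaker correspondence information” mentioned in the description, in simplified form.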

上記の構成によれば、受聴者の再生環境に応じて好適な没入感を提供することができる。 According to said structure, a suitable immersion feeling can be provided according to a listener's reproduction environment.

本発明の態様２に係る音声出力制御装置（１、１ａ）は、上記の態様１において、上記メタデータは、上記オブジェクトとしての領域、領域の一部、および領域内に存在している物体の少なくとも何れかと、上記音声データとの対応関係を示すものである構成としてもよい。 The sound output control device (1, 1a) according to aspect 2 of the present invention is the aspect 1, wherein the metadata includes an area as the object, a part of the area, and an object existing in the area. It is good also as a structure which shows the correspondence of at least one and the said audio | voice data.

本発明の態様３に係る音声出力制御装置（１ａ）は、上記の態様１または２において、上記対応付け情報を表示する表示部（３０）と、上記オブジェクトと音声出力装置（スピーカ２ａ、２ｂ、２ｃ）との対応付けに関するユーザからの指示を受け付ける操作受付部（１５）と、上記操作受付部（１５）が受け付けたユーザ指示に基づいて上記対応付け情報を更新する対応付け情報更新部（情報更新部１６）と、を更に備えている構成としてもよい。 The audio output control device (1a) according to aspect 3 of the present invention is the display unit (30) that displays the association information, the object, and the audio output device (speakers 2a, 2b, 2c) an operation reception unit (15) that receives an instruction from the user regarding the association with the association, and an association information update unit (information) that updates the association information based on the user instruction received by the operation reception unit (15) The update unit 16) may be further provided.

With the above configuration, the association information indicating the association between the object and the audio output device can be updated based on a user instruction.
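The update described in aspect 3 can be sketched as replacing one entry of the association table in response to a user instruction. The sketch below assumes that the instruction has already been reduced to an object name and a list of speaker ids. The function name `update_association` and the dictionary layout are hypothetical and are not taken from the specification.

```python
from typing import Dict, List


def update_association(
    association: Dict[str, List[str]],
    obj: str,
    speakers: List[str],
) -> Dict[str, List[str]]:
    """Return a new association table in which `obj` maps to `speakers`.

    The input table is left unmodified (a design choice of this sketch),
    so the previous association stays available, for example for display.
    """
    updated = {name: list(ids) for name, ids in association.items()}
    updated[obj] = list(speakers)
    return updated


# Invented example: the user reassigns the object "window" to speakers 2a and 2c.
association = {"window": ["2a"], "sofa": ["2b"]}
new_association = update_association(association, "window", ["2a", "2c"])
```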

Metadata according to aspect 4 of the present invention is metadata referred to by an audio output control device (1, 1a) and includes a correspondence between audio data and an object, wherein the audio output control device (1, 1a) determines the audio output device (speaker 2a, 2b, 2c) that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating the association between the object and the audio output devices (speakers 2a, 2b, 2c).

With the above configuration, the audio output control device can provide a suitable sense of immersion according to the listener's reproduction environment.

A user interface screen generation device (UI generation unit 17) according to aspect 5 of the present invention generates a user interface screen for inputting association information referred to by an audio output control device (1, 1a) that causes an audio output device to output audio, wherein the user interface screen is configured to receive an instruction from a user regarding the association between an object and an audio output device (speakers 2a, 2b, 2c).

A television receiver (television 200) according to aspect 6 of the present invention may include the audio output control device (1, 1a) according to any one of aspects 1 to 3 above.

With the above configuration, the television receiver can provide a suitable sense of immersion according to the listener's reproduction environment.

An audio output control method according to aspect 7 of the present invention is an audio output control method for causing an audio output device to output audio, and includes: an audio data acquisition step (step S11) of acquiring audio data; a metadata acquisition step (step S12) of acquiring metadata indicating the correspondence between the audio data and an object; and a determination step (step S13) of determining the audio output device (speaker 2a, 2b, 2c) that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating the association between the object and the audio output devices (speakers 2a, 2b, 2c).

The above method provides the same effect as aspect 1.
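The three steps of the method of aspect 7 can be sketched as a single function that runs them in order. This is a minimal sketch under stated assumptions: the two acquisition steps are modeled as caller-supplied callables, and the result is a table from audio data id to speaker ids. The name `control_audio_output`, its parameters, and the data layouts are all hypothetical.

```python
from typing import Callable, Dict, List


def control_audio_output(
    acquire_audio: Callable[[], Dict[str, bytes]],
    acquire_metadata: Callable[[], Dict[str, str]],
    association: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Run steps S11 to S13 and return audio data id -> speaker ids."""
    audio = acquire_audio()        # step S11: acquire the audio data
    metadata = acquire_metadata()  # step S12: acquire the metadata
    plan: Dict[str, List[str]] = {}
    for audio_id in audio:         # step S13: determine the output speakers
        obj = metadata.get(audio_id)
        # Assumption: audio with no matching object or speaker gets no speaker.
        plan[audio_id] = list(association.get(obj, []))
    return plan


# Invented example inputs.
plan = control_audio_output(
    lambda: {"dialogue": b"\x00\x01"},
    lambda: {"dialogue": "sofa"},
    {"sofa": ["2b"]},
)
```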

The audio output control device (1, 1a) according to each aspect of the present invention may be realized by a computer. In that case, an audio output control program that realizes the audio output control device (1, 1a) on a computer by causing the computer to operate as each unit (software element) of the audio output control device (1, 1a), and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.

The present invention is not limited to the embodiments described above. Various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in the respective embodiments.

1, 1a Audio output control device
2a, 2b, 2c Speaker (audio output device)
10, 10a Control unit
11 Audio data acquisition unit
12 Metadata acquisition unit
13 Speaker determination unit (determination unit)
15 Operation reception unit
16 Information update unit
17 UI generation unit (user interface screen generation device)
30 Display unit
200 Television (image display device)

Claims

An audio output control device that causes an audio output device to output audio, the audio output control device comprising:
an audio data acquisition unit that acquires audio data;
a metadata acquisition unit that acquires metadata indicating a correspondence between the audio data and an object; and
a determination unit that determines the audio output device that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating an association between the object and the audio output device.

The audio output control device according to claim 1, wherein the metadata indicates a correspondence between the audio data and at least one of the following serving as the object: a region, a part of a region, and a physical object present in a region.

The audio output control device according to claim 1, further comprising:
a display unit that displays the association information;
an operation reception unit that receives an instruction from a user regarding the association between the object and the audio output device; and
an association information update unit that updates the association information based on the user instruction received by the operation reception unit.

Metadata referred to by an audio output control device,
the metadata including a correspondence between audio data and an object,
wherein the audio output control device determines the audio output device that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating an association between the object and the audio output device.

A user interface screen generation device that generates a user interface screen for inputting association information referred to by an audio output control device that causes an audio output device to output audio,
wherein the user interface screen is configured to receive an instruction from a user regarding an association between an object and an audio output device.

A television receiver comprising the audio output control device according to claim 1.

An audio output control method for causing an audio output device to output audio, the method comprising:
an audio data acquisition step of acquiring audio data;
a metadata acquisition step of acquiring metadata indicating a correspondence between the audio data and an object; and
a determination step of determining the audio output device that is to output the audio indicated by the audio data, with reference to the metadata and to association information indicating an association between the object and the audio output device.

An audio output control program for causing a computer to function as the audio output control device according to claim 1, the program causing the computer to function as the audio data acquisition unit, the metadata acquisition unit, and the determination unit.

A computer-readable recording medium on which the audio output control program according to claim 8 is recorded.