JP2004147283A

JP2004147283A - Sound image localization device, sound image localization method, sound data distribution system, sound data distribution method and program

Info

Publication number: JP2004147283A
Application number: JP2003051877A
Authority: JP
Inventors: Akane Noguchi; 野口　あかね; Yu Nishibori; 西堀　佑
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2002-08-27
Filing date: 2003-02-27
Publication date: 2004-05-20
Anticipated expiration: 2023-02-27
Also published as: JP3982431B2

Abstract

PROBLEM TO BE SOLVED: To provide a highly entertaining sound amusement.
SOLUTION: A sound data distribution server distributes sound data and sound source location information to a terminal. Here, the sound data is information representing a sound output from a virtual sound source specified in three-dimensional coordinates, and the sound source location information is information indicating the location where the sound is generated. On receiving the sound data and the sound source location information, an audio signal generation unit 160 in the terminal generates audio signals in which the sound image of the sound output from the sound source is localized, in accordance with the sound source location information, terminal location information indicating the location of the user, direction information indicating the direction of the user's face, and the distance e between the user's ears.
COPYRIGHT: (C)2004,JPO

Description

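[0001]
[Technical Field of the Invention]
The present invention relates to a sound image localization device, a sound image localization method, and a program for localizing a sound image, and to a sound data distribution system and a sound data distribution method for distributing sound data representing sound.
[0002]
[Prior Art]
Audio distribution systems are known in which a mobile terminal receives audio data, such as music, stream-distributed over a wireless communication network and outputs it as sound from headphones or the like connected to the mobile terminal (see, for example, Patent Document 1). With such a system, a user can easily enjoy music even when away from home.
[0003]
[Patent Document 1]
Japanese Unexamined Patent Application Publication No. H9-181510 (FIG. 3)
[0004]
[Problems to Be Solved by the Invention]
However, although a conventional audio distribution system can faithfully reproduce the distributed audio data on the mobile terminal, it cannot offer the user entertainment such as taking part in the generation of the audio data.
[0005]
The present invention has been made in view of the above circumstances, and its object is to provide a sound image localization device, a sound image localization method, a sound data distribution system, a sound data distribution method, and a program that make it possible to provide highly entertaining sound amusement.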
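[0006]
[Means for Solving the Problems]
To achieve the above object, a sound image localization device according to the present invention comprises: user information acquisition means for acquiring user information indicating the position of a user and the direction in which the user's face is turned; sounding position information acquisition means for acquiring sounding position information indicating the position of a virtual sounding point; and localization means for localizing a sound image so that, as perceived by the user located at the position indicated by the acquired user information and facing the direction indicated by that user information, a sound of a type associated in advance with the sounding point is output from the position indicated by the sounding position information.
With this configuration, when the user is located at a certain point and faces a certain direction, the sound of the type associated in advance with the virtual sounding point is localized as if it were output from that sounding point, which gives the user the sensation of being in the space in which the sounding point is placed.
[0007]
Preferably, the sounding position information acquisition means acquires, as the sounding position information, moving body position information indicating the position of a moving body associated with the sounding point, and the localization means localizes the sound image so that the sound is output from the position of the moving body indicated by the acquired moving body position information.
With this configuration, the user can perceive the approximate positional relationship to the moving body through the sound image.
[0008]
In another preferred aspect, the device comprises receiving means for receiving sound data indicating the sound of the type associated in advance with the sounding point, and the localization means localizes the sound image of the sound indicated by the sound data received by the receiving means. Because the sound data is obtained through the receiving means, the sound image localization device needs no dedicated storage device for holding the sound data in a non-volatile manner. Furthermore, because the distribution device that distributes sound data to the sound image localization device can manage the sound data centrally, updating the sound data becomes easy.
The present invention can be realized not only as the above sound image localization device but also as a sound image localization method and a program, which achieve the same effects as the device.
[0009]
The present invention also provides a sound data distribution system comprising: a sound data distribution device that distributes sound data representing a sound of a type associated in advance with a virtual sounding point; and a terminal that receives the sound data distributed from the sound data distribution device and uses the received sound data to localize a sound image so that, as perceived by a user located at a certain point and facing a certain direction, the sound of the type associated in advance with the sounding point is output from the position of that sounding point.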
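[0010]
[Embodiments of the Invention]
Embodiments of the present invention are described below with reference to the drawings.
[0011]
＜Schematic configuration of the sound data distribution system＞
First, the schematic configuration of the sound data distribution system according to this embodiment is described with reference to FIG. 1. In this figure, the satellite group 400 consists of artificial satellites supporting a GNSS (Global Navigation Satellite System) such as GPS (Global Positioning System). The satellites are controlled by a ground control station (not shown) and transmit satellite signals toward the ground. Each satellite signal contains information such as the time at which it was transmitted from the satellite and the orbital position of the transmitting satellite.
[0012]
The mobile communication network 500 includes various devices for providing a data communication service, such as base station controllers, and a large number of base stations 510 are connected to it. The sound data distribution server 300 stream-distributes sound data to the terminal 100 via the mobile communication network 500 and a base station 510. Here, sound data is information representing the sound output from a virtual sounding point defined by three-dimensional coordinates.
[0013]
The terminal 100 is a portable wireless communication terminal. Via one of the base stations 510, it transmits information such as terminal position information indicating the position of the terminal 100 to the sound data distribution server 300 and receives sound data and other information from the server. As described later, when the terminal 100 receives sound data, it generates from that sound data an audio signal in which the sound image of the sound assumed to be emitted from the position of the corresponding sounding point is localized.
[0014]
Stereo headphones 200 can be connected to the terminal 100, and the audio signal generated from the sound data is emitted through the headphones 200. The headphones 200 have a direction sensor 210 that detects the direction in which the user's face is turned while the headphones are worn on the user's head, and they transmit a signal indicating the detected direction to the terminal 100 for as long as an audio signal is being input. The terminal 100 also has a receiver for receiving the satellite signals transmitted from the satellite group 400.
Although the figure illustrates two sets of a terminal 100 and headphones 200, namely a set used by user U1 and a set used by user U2, there may be only one set or three or more sets.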
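[0015]
＜Configuration of the sound data distribution server＞
FIG. 2 is a block diagram showing the configuration of the sound data distribution server 300. In this figure, the control unit 310 controls each unit via the bus B3. The control unit 310 also executes, as described later, processing for selecting sounding points and processing for converting the data amount of sound data.
[0016]
The communication unit 320 receives information such as terminal position information transmitted from the terminal 100 via the mobile communication network 500. The communication unit 320 also transmits to the terminal 100, in parallel, the sound data corresponding to each of the four sounding points selected by the control unit 310 as described later. The storage unit 330 consists of a magnetic disk or the like and stores various kinds of information.
[0017]
FIG. 3 shows part of the information stored in the storage unit 330. As shown in the figure, the storage unit 330 stores a plurality of sounding point IDs together with the sounding position information and sound data associated with each ID. A sounding point ID identifies a sounding point. In this embodiment, each sounding point is defined at a position corresponding to a structure in a town (for example, a shop or a building), and the names of those structures, such as "Mini Store A" and "Building A", are registered as sounding point IDs. Sounding points need not be placed at positions corresponding to structures; they can be placed anywhere. The sounding position information consists of, for example, latitude, longitude, and altitude, and defines the position of the sounding point in three-dimensional coordinates.
[0018]
Sound data is data representing the sound assumed to be emitted from a sounding point, sampled at a predetermined frequency (for example, 44.1 kHz). Any data representing sound, such as a piece of music, a musical tone, or a voice, may be used. In this embodiment, "Shopping March (music)", "Drum sound (musical tone)", "Dog barking (voice)", "Chime (electronic sound)", and so on are recorded as sound data. The sound data distribution server 300 distributes four of the sound data items stored in the storage unit 330 to the terminal 100. The terminal 100 processes and mixes the four distributed sound data items and then emits the result through the headphones 200.
[0019]
The storage unit 330 also stores a data amount conversion table used to convert the data amount of sound data. FIG. 4 shows the structure of this table. As shown in the figure, the data amount conversion table TBL associates the distance D between the terminal 100 and a sounding point with the sampling frequency to which the sound data for that sounding point is to be converted. In FIG. 4, for example, a distance D of "0" or more and less than "L1 (> 0)" is associated with the sampling frequency "f1 (= 44.1 kHz)", and a distance D of "L1" or more and less than "L2 (> L1)" is associated with the sampling frequency "f2 (= 22 kHz)". The control unit 310 converts the data amount of sound data according to the table TBL in the data amount conversion processing described later. "L1", "L2", "L3", and "L4" in FIG. 4 satisfy 0 < L1 < L2 < L3 < L4.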
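[0020]
＜Configuration of the terminal＞
FIG. 5 is a block diagram showing the configuration of the terminal 100. In this figure, the control unit 110 controls each unit via the bus B1. The satellite radio wave receiving unit 145 receives satellite signals in parallel from each of a plurality of satellites in the satellite group 400 and inputs each received signal to the positioning unit 140. The positioning unit 140 generates terminal position information indicating the position of the terminal 100, using information such as the transmission time and orbital position contained in each satellite signal input from the satellite radio wave receiving unit 145. In doing so, the positioning unit 140 measures the distance (pseudorange) from the terminal 100 to the satellite that transmitted each satellite signal and substitutes the measured distances into the positioning equations to generate terminal position information in three-dimensional coordinates.
The terminal 100 is carried by the user while in use. The position of the terminal 100 measured by the positioning unit 140 can therefore be regarded as equal to the position (center position) of the user.
[0021]
The instruction input unit 120 consists of operation buttons and the like, and inputs to the control unit 110 signals such as a generation start signal instructing the start of audio signal generation. Audio signal generation here means the processing of generating, from sound data, an audio signal in which the sound image of the sound output from a sounding point is localized. The control unit 110 controls the terminal 100 as a whole in accordance with user instructions given through the instruction input unit 120.
[0022]
The direction detection unit 150 uses the direction sensor 210 of the headphones 200 to detect which direction the face of the user wearing the headphones is turned, and supplies the result to the bus B1 as direction information. The direction sensor 210 may detect geomagnetism or use a gyroscope. Alternatively, a plurality of positioning units may be provided in the headphones 200, and the direction of the user's face may be detected from the relative changes in the positions detected by those positioning units.
[0023]
Under the control of the control unit 110, the wireless communication unit 130 establishes a wireless link with the base station 510 of the area in which the terminal 100 is located and, over this link, transmits terminal position information to the sound data distribution server 300 and receives four sound data items in parallel from the server.
[0024]
The audio signal generation unit 160 generates a two-channel audio signal from each of the four sound data items input in parallel from the wireless communication unit 130 and supplies the generated audio signal to the bus B1. It generates the L-channel signal (the audio signal for the left ear) and the R-channel signal (the audio signal for the right ear) independently of each other and supplies each to the bus B1. The L-channel and R-channel signals supplied to the bus B1 are output as sound from the headphones 200 via the audio signal output unit 190.
[0025]
Next, the detailed configuration of the audio signal generation unit 160 is described with reference to FIG. 6. As shown in the figure, the audio signal generation unit 160 contains four processing units 170-1, 170-2, 170-3, and 170-4, equal in number to the sound data items that the terminal 100 receives in parallel from the sound data distribution server 300. Each processing unit processes one of the four sound data items and generates an audio signal in which the sound image of the sound assumed to be emitted from the sounding point is localized. As for which processing unit handles which sound data item, one possible arrangement is to assign sound data to the processing units 170-1, 170-2, 170-3, and 170-4 in order of increasing distance between the corresponding sounding point and the terminal 100. In the following description, the processing units are referred to simply as 170 when there is no need to distinguish them.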
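[0026]
Before describing the processing unit 170 in detail, we explain sound image localization, that is, the mechanism by which a listener hearing a sound actually output from a certain point (sound source) perceives the direction of and distance to that source. For example, when a sound source is located to the listener's right, the distance from the listener's right ear to the source is shorter than the distance from the left ear. Sound output from the source at a given moment therefore takes less time to reach the right ear than the left ear. The listener perceives the direction of the source from this delay between the left and right ears. Now suppose there are two sound sources, one near the listener and one far away. When each source outputs sound of a given volume (sound pressure), the sound from the nearer source is louder at the listener's position than the sound from the farther source. The listener perceives the distance to a source from this difference in volume.
[0027]
Accordingly, so that the user can experience the sounding points as if they really existed, each processing unit 170 in this embodiment generates, for its sounding point, an audio signal whose delay and volume at the left and right ears are determined by the position of the sounding point and the positions of the user's (listener's) left and right ears. The following description focuses on one processing unit 170 and the generation of the audio signal for one sounding point.
[0028]
As shown in FIG. 7, suppose that the center position P (X_P, Y_P, Z_P) of the user U is given by the terminal position information and the user's face direction A is given by the direction information, and let e be the distance between the left and right ears. The position L (X_L, Y_L, Z_L) of the user's left ear is then the point e/2 to the left of the center position P, measured horizontally and perpendicular to the direction A, and the position R (X_R, Y_R, Z_R) of the right ear is the point e/2 to the right of P, measured horizontally and perpendicular to A. Assume that the sounding point S (X_S, Y_S, Z_S) is far enough from the user (center position P) that the sound reaches the user's ears as a plane wave. Assume also that the sounding point S lies to the user's front right and that the angle between the face direction A and the direction of S as seen from the user is θ. When sound is output from S, the difference Δt in arrival time (delay) between the right and left ears is expressed, using the path length difference d and the speed of sound c, as
[0029]
[Math 1]
$$\Delta t = \frac{d}{c} \qquad (1)$$
[0030]
Since d = e·sin θ holds, the delay Δt becomes
[0031]
[Math 2]
$$\Delta t = \frac{e \sin\theta}{c} \qquad (2)$$
[0032]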
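Let D_L and D_R be the distances from the sounding point S to the user's left and right ears respectively, t the time, and f the wave equation of a spherical wave. The sound pressure P_L at the left ear and the sound pressure P_R at the right ear can then be expressed as follows.
[0033]
[Math 3]
$$P_L = f(D_L, t) \qquad (3)$$
[0034]
[Math 4]
$$P_R = f(D_R, t) \qquad (4)$$
[0035]
The processing unit 170 generates from the sound data an audio signal that realizes the delay Δt of equation (2) and the sound pressures P_L and P_R of equations (3) and (4). As a result, the sound image of the virtual sounding point is localized for the user as if the sound were being emitted from the position indicated by the sounding position information.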
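The geometry behind equations (2) to (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the heading convention (radians, counter-clockwise from the +x axis), the speed of sound (343 m/s), the neglect of elevation when computing θ, and the 1/D amplitude law standing in for the unspecified spherical-wave function f are all choices made here, not details given in the specification. The function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 degrees C


def ear_positions(p, heading, e):
    """Return (left, right) ear positions for center position p = (x, y, z).

    heading is the face direction A in radians, measured counter-clockwise
    from the +x axis in the horizontal plane (an assumed convention).
    """
    x, y, z = p
    # Horizontal unit vector pointing to the user's left (A rotated by +90 degrees).
    ux, uy = -math.sin(heading), math.cos(heading)
    h = e / 2.0
    return (x + h * ux, y + h * uy, z), (x - h * ux, y - h * uy, z)


def interaural_delay(p, heading, s, e, c=SPEED_OF_SOUND):
    """Equation (2): delta_t = e * sin(theta) / c.

    The result is positive when the sounding point s lies to the user's right.
    Elevation is ignored; theta is taken in the horizontal plane.
    """
    bearing = math.atan2(s[1] - p[1], s[0] - p[0])
    theta = heading - bearing  # clockwise angle from A to the direction of S
    return e * math.sin(theta) / c


def ear_gains(s, left, right, ref=1.0):
    """Amplitude factors for P_L and P_R, assuming 1/D spherical spreading."""
    def gain(d):
        return ref / max(d, ref)  # clamp so the gain never exceeds 1
    return gain(math.dist(s, left)), gain(math.dist(s, right))
```

For a user at the origin facing along +y with a sounding point 10 m away along +x (that is, directly to the right), the delay comes out positive and the right-ear gain is slightly larger than the left-ear gain, which matches the qualitative behavior described in the text.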
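[0036]
Returning to FIG. 6, each processing unit 170 includes a parameter generation unit 172, a delay unit 176, and an amplifier 178. The parameter generation unit 172 in turn includes a delay parameter generation unit 173 and an amplifier parameter generation unit 174. The delay parameter generation unit 173 generates a parameter that defines the delay Δt between the L-channel and R-channel signals. More specifically, it takes as input the sounding position information received from the sound data distribution server 300, the direction information detected by the direction detection unit 150, the terminal position information detected by the positioning unit 140, and the distance e between the ears; generates, by equation (2), a parameter DP defining the delay Δt between the left and right ears; and sends DP to the delay unit 176.
[0037]
The amplifier parameter generation unit 174 generates parameters representing the sound pressure with which each of the L-channel and R-channel signals is emitted. More specifically, it takes as input the same sounding position information, direction information, terminal position information, and inter-ear distance e; generates, by equations (3) and (4), a parameter AL defining the sound pressure P_L at the left ear and a parameter AR defining the sound pressure P_R at the right ear; and sends AL and AR to the amplifier 178.
The distance e between the user's ears that is input to the parameter generation unit 172 may be stored in a ROM (Read Only Memory) or the like in the control unit 110 and read from there, or it may be entered by the user through the instruction input unit 120. The method of determining the ear positions and equations (2), (3), and (4) for the delay Δt and the sound pressures P_L and P_R are only examples; various changes and improvements are possible, such as incorporating head-related transfer functions, changes in sound quality due to changes in the frequency spectrum, and the effect of the ratio of direct sound to reverberation.
[0038]
The delay unit 176 generates an L-channel signal SL1 for the left ear and an R-channel signal SR1 for the right ear from the sound data input through the wireless communication unit 130 and sends each to the amplifier 178. More specifically, it generates the two signals so that a delay arises between SL1 and SR1 in accordance with the delay parameter DP received from the delay parameter generation unit 173. The L-channel signal SL1 and R-channel signal SR1 for one sound data item are thus generated as if an arrival-time difference from the sounding point arose according to the positions of the user's ears, that is, as if the sound were output from a sounding point located in a certain direction from the user.
[0039]
The amplifier 178 amplifies the L-channel signal SL1 received from the delay unit 176 by the parameter AL received from the amplifier parameter generation unit 174, amplifies the R-channel signal SR1 by the parameter AR, and sends the results to the mixing unit 180 as the L-channel signal SL2 and the R-channel signal SR2. The signals SL2 and SR2 are thus generated as if their sound pressure levels differed according to the distance between each of the user's ears and the sounding point. This adjustment of sound pressure level by the amplifier 178 is performed in each of the processing units 170-1, 170-2, 170-3, and 170-4. The audio signals generated by the processing units can therefore give the user the sensation that the sounding points are at different distances.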
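As a rough illustration of what the delay unit 176 and the amplifier 178 do to one sound data item, the following Python sketch delays one channel by a whole number of samples and then scales each channel. It assumes mono samples held in a list of floats, a sample rate of 44.1 kHz, and a sign convention in which a positive delay means the sounding point is on the right (so the L channel lags). The function name and argument layout are hypothetical; the specification does not say how DP, AL, and AR are encoded.

```python
def process_sound_data(samples, delay_seconds, gain_l, gain_r, sample_rate=44100):
    """Return (SL2, SR2) for one sound data item.

    delay_seconds plays the role of parameter DP, gain_l and gain_r the roles
    of AL and AR. A positive delay makes the L channel lag the R channel.
    """
    n = round(abs(delay_seconds) * sample_rate)  # delay in whole samples
    pad = [0.0] * n
    lagging = pad + list(samples)   # channel that hears the sound later
    leading = list(samples) + pad   # padded at the end to keep equal length

    # Delay unit 176: produce SL1 and SR1 with the inter-channel delay.
    if delay_seconds >= 0:
        sl1, sr1 = lagging, leading
    else:
        sl1, sr1 = leading, lagging

    # Amplifier 178: scale each channel by its own gain to get SL2 and SR2.
    sl2 = [gain_l * v for v in sl1]
    sr2 = [gain_r * v for v in sr1]
    return sl2, sr2
```

Rounding to whole samples is the simplest choice; at 44.1 kHz one sample is about 23 microseconds, so a fractional-delay filter would be needed for finer control.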
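[0040]
The mixing unit 180 mixes the four L-channel signals SL2 sent from the four processing units 170 and sends the result to the audio signal output unit 190 as the L-channel signal SL3; likewise, it mixes the four R-channel signals SR2 and sends the result as the R-channel signal SR3. Preferably, the mixing unit 180 limits the signal level of the mixed signals SL3 and SR3 so as not to damage the user's hearing. The signals SL3 and SR3 sent from the mixing unit 180 are D/A (digital-to-analog) converted by the audio signal output unit 190, output to the headphones 200, and emitted through the left-ear sound emitting unit 220 and the right-ear sound emitting unit 230.
For the avoidance of doubt, the processing by the audio signal generation unit 160 runs in parallel with other processing such as reception of sound data by the wireless communication unit 130, generation of terminal position information by the positioning unit 140, and generation of direction information by the direction detection unit 150, and the audio signal is generated from the sound data in stream form. When the user moves, the terminal position information and direction information are updated accordingly, so that wherever the user moves and whichever way the user faces, the audio signal is generated so that the sound image of the sound output from each sounding point is localized as seen from the user.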
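The mixing step with a level limit might look like the sketch below. The specification only says that the level of the mixed signal is preferably limited; the hard clip used here is one simple way to do that and is an assumption, as are the function name and the default limit of 1.0.

```python
def mix_channels(channels, limit=1.0):
    """Sum per-sounding-point signals sample by sample and clip to +/- limit.

    channels is a list of sample lists (for example the four SL2 signals);
    shorter signals are treated as silent once they run out.
    """
    length = max((len(ch) for ch in channels), default=0)
    mixed = []
    for i in range(length):
        total = sum(ch[i] for ch in channels if i < len(ch))
        # Hard limiter: keep the output within [-limit, +limit].
        mixed.append(max(-limit, min(limit, total)))
    return mixed
```

Calling the function once with the four SL2 signals and once with the four SR2 signals yields SL3 and SR3. A soft limiter or a compressor would distort less than a hard clip; the choice is left open by the text.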
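[0041]
＜Operation of the sound data distribution system＞
Next, the operation of the sound data distribution system is described with reference to FIG. 8. In this operation, sound data is distributed from the sound data distribution server 300 to the terminal 100, and the terminal 100 generates an audio signal from the distributed sound data while updating the terminal position information and direction information. The operation starts when a generation start signal is input from the instruction input unit 120 of the terminal 100 and is thereafter invoked by timer interrupts in the terminal 100. Processing performed in ordinary mobile communication systems, such as connection authentication and terminal authentication between the sound data distribution server 300 and the terminal 100, is not directly related to the present invention and is not described.
[0042]
First, in step SA1, the control unit 110 of the terminal 100 acquires the satellite signals transmitted from the satellite group 400 by receiving them with the satellite radio wave receiving unit 145. Next, in step SA2, it has the positioning unit 140 generate terminal position information SP indicating the position of the terminal 100 from the acquired satellite signals. Then, in step SA3, it transmits the generated terminal position information SP to the base station 510.
On receiving the terminal position information SP from the terminal 100, the base station 510 forwards it to the sound data distribution server 300 in step SA4.
[0043]
On receiving the terminal position information SP forwarded from the base station 510, the control unit 310 of the sound data distribution server 300 executes sounding point selection processing in step SA5. This processing selects sounding points, up to a predetermined number, according to the position of the terminal 100 given by the received terminal position information SP and the positions of the sounding points given by the sounding position information. The sounding point selection processing executed by the control unit 310 is described with reference to FIG. 9.
[0044]
First, in step SA51, the control unit 310 initializes the selection count n, which indicates the number of sounding points selected, to "0". Next, in step SA52, it selects the nearest of the sounding points not yet selected, using the received terminal position information SP of the terminal 100 and the sounding position information of each sounding point stored in the storage unit 330. For example, suppose that eight sounding points S1, S2, ..., S8 are placed around the terminal 100 as shown in FIG. 11, at increasing distance from the terminal 100 in that order. If none of the sounding points S1, S2, ..., S8 has been selected (selection count n = 0), the control unit 310 selects the sounding point S1 in step SA52.
[0045]
Next, in step SA53, the control unit 310 increments the selection count n by 1. Then, in step SA54, it determines whether n has reached the predetermined number (four in this embodiment). If not, it returns to step SA52 and repeats steps SA52 to SA54 until n reaches the predetermined number.
[0046]
When the result of step SA54 becomes affirmative, the predetermined number of sounding points has been selected, and the control unit 310 ends the sounding point selection processing. In FIG. 11, for example, the control unit 310 selects, from the eight sounding points S1, S2, ..., S8, the four nearest the terminal 100, namely S1, S2, S3, and S4, shown as black circles. Although four sounding points are selected in this embodiment, the number can be set freely.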
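The loop of steps SA51 to SA54 can be sketched as follows. The dictionary layout (sounding point ID mapped to a coordinate tuple) and the function name are assumptions made for the example; the extra check for an empty candidate set is also an addition, since the flow in FIG. 9 as described does not say what happens when fewer sounding points exist than the predetermined number.

```python
import math


def select_sounding_points(terminal_pos, points, count=4):
    """Pick the `count` sounding points nearest to terminal_pos.

    points maps a sounding point ID to its position tuple.
    """
    selected = []               # SA51: selection count n = len(selected) = 0
    remaining = dict(points)
    while len(selected) < count and remaining:   # SA54: has n reached count?
        # SA52: nearest sounding point among those not yet selected.
        nearest = min(remaining,
                      key=lambda pid: math.dist(terminal_pos, remaining[pid]))
        selected.append(nearest)                 # SA53: n is incremented by 1
        del remaining[nearest]
    return selected
```

With eight points S1 to S8 placed at increasing distance from the terminal, the function returns S1, S2, S3, and S4 in that order, as in the FIG. 11 example. Sorting all distances once and slicing would give the same result with less repeated work; the loop form is kept here because it mirrors the flowchart.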
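[0047]
Returning to FIG. 8, when the sounding point selection processing (step SA5) ends, the control unit 310 of the sound data distribution server 300 executes data amount conversion processing in step SA6. This processing converts the data amount of the sound data corresponding to the selected sounding points, that is, of the sound data to be distributed to the terminal 100. The data amount conversion processing executed by the control unit 310 is described with reference to FIG. 10. In this description, the sound data recorded in advance in the storage unit 330 of the sound data distribution server 300 is assumed to have a sampling frequency of 44.1 kHz or higher.
[0048]
First, in step SA61, the control unit 310 obtains the distance D between the terminal 100 and each sounding point selected by the sounding point selection processing, using the sounding position information and the terminal position information SP. Next, in step SA62, it refers to the data amount conversion table TBL shown in FIG. 4 and determines, from the distance D between each sounding point and the terminal 100, the sampling frequency to which the sound data of that sounding point is to be converted. For example, suppose that in FIG. 11 the distance D between the terminal 100 and the sounding point S1 is at least 0 and less than L1, the distance to S2 is at least L1 and less than L2, the distance to S3 is at least L2 and less than L3, and the distance to S4 is at least L3 and less than L4. The control unit 310 then refers to the table TBL and sets the sampling frequency to f1 (44.1 kHz) for the sound data of S1, f2 (22 kHz) for S2, f3 (10 kHz) for S3, and f4 (5 kHz) for S4.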
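The lookup in step SA62 amounts to finding which half-open distance band contains D. A minimal sketch using the standard `bisect` module follows. The boundary values standing in for L1 to L4 are invented for the example, since the specification gives no numbers for them, and f3 is taken as 10 kHz, which is what the descending sequence f1 > f2 > f3 > f4 suggests. Returning `None` beyond L4 is also an assumption; the text does not say what happens there.

```python
from bisect import bisect_right

# Hypothetical band edges L1..L4 in metres (not given in the specification).
BOUNDS_M = [50.0, 100.0, 200.0, 400.0]
# Sampling frequencies f1..f4 in Hz.
RATES_HZ = [44100, 22000, 10000, 5000]


def sampling_rate_for(distance_m):
    """Return the target sampling frequency for a sounding point at distance D.

    Bands are half-open: [0, L1) -> f1, [L1, L2) -> f2, and so on.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    idx = bisect_right(BOUNDS_M, distance_m)
    if idx >= len(RATES_HZ):
        return None  # D >= L4: behaviour not specified in the text
    return RATES_HZ[idx]
```

`bisect_right` places a distance equal to a boundary in the upper band, which matches the "at least L1 and less than L2" wording of the table.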
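[0049]
Next, in step SA63, the control unit 310 generates, from the sound data of each sounding point recorded in advance in the storage unit 330, sound data at the sampling frequency determined in step SA62. The farther a sounding point is from the terminal 100, the lower the sampling frequency of its generated sound data and hence the smaller its data amount. This reduces the total amount of sound data distributed from the sound data distribution server 300 and, as a result, reduces both the network traffic in the mobile communication network 500 caused by distribution and the load on the sound data distribution server 300 caused by transmission.
In general, lowering the sampling frequency of an audio signal degrades its sound quality when emitted. In this embodiment, however, the processing units 170 of the terminal 100 process the distributed sound data so that sound data for sounding points farther from the terminal 100 is quieter. Lowering the sampling frequency of the sound data for distant sounding points therefore has almost no effect on the sound quality of the audio signal generated by the terminal 100. In other words, the data amount conversion processing reduces the data amount of the sound data, and with it the network traffic and the load on the sound data distribution server 300, without unduly impairing sound quality.
[0050]
Returning to FIG. 8, when the data amount conversion processing (step SA6) ends, the control unit 310 of the sound data distribution server 300 transmits the four converted sound data items SD1, SD2, SD3, and SD4 to the base station 510 in parallel in step SA7. It appends to each sound data item the sounding position information recorded in the storage unit 330 and then transmits the items in stream form. For example, if the sound data to be transmitted to the base station 510 is "Shopping March", which corresponds to the sounding point ID "Mini Store A" in FIG. 3, the control unit 310 appends the sounding position information (x1, y1, z1) to "Shopping March" and then transmits it.
On receiving the sound data SD1, SD2, SD3, and SD4 from the sound data distribution server 300, the base station 510 forwards them to the terminal 100 in step SA8.
[0051]
Meanwhile, after transmitting the terminal position information SP to the base station 510 in step SA3, the control unit 110 of the terminal 100 has the direction detection unit 150 generate direction information indicating the user's face direction in step SA9. Then, in step SA10, it generates an audio signal from the sound data SD1, SD2, SD3, and SD4 received from the base station 510; here the audio signal generation unit 160 generates the audio signal in stream form according to the terminal position information SP, the direction information, and the sounding position information. Next, in step SA11, the control unit 110 outputs the audio signal from the audio signal output unit 190. The audio signal output from the terminal 100 is emitted as sound through the headphones 200.
[0052]
For example, suppose that the sounding points S1, S2, S3, and S4 are positioned as shown in FIG. 12, that the position of the terminal 100 (user) is given by the terminal position information SP, and that the user's face direction A is given by the direction information. Suppose also that the distances from the user to S1, S2, S3, and S4 increase in that order for both ears. Of the sound data output from the headphones 200, that for S1 then has the highest sound pressure (volume) and that for S4 the lowest, so the user perceives S1 as nearest and S4 as farthest. Because the distance from S1 to the user's right ear is shorter than the distance to the left ear, the L-channel signal for the S1 sound data is delayed relative to the R-channel signal, and the user perceives S1 as being on the right. Similarly, from the delay (delay time Δt) between the L-channel and R-channel signals, the user perceives S4 as being on the left.
[0053]
Next, suppose the user turns to face the sounding point S4, as shown in FIG. 13. The direction detection unit 150 updates the user's face direction A, so the delay (delay time Δt) of each sound data item between the L-channel and R-channel signals is updated. The user then perceives S2 as being on the right and S3 as being on the left.
[0054]
Next, suppose the user approaches S4 as shown in FIG. 14 and moves to a position from which the distances to the sounding points increase in the order S4, S2, S3, S1. As the user moves away from S1, the sound pressure (volume) of the S1 sound data decreases, and as the user approaches S4, the sound pressure of the S4 sound data increases. The user thus perceives having moved away from S1 and toward S4.
[0055]
As described above, this embodiment generates an audio signal localized so that, as perceived by the user, sound is emitted from the position indicated by the sounding position information, according to the user's position, the direction the user faces, and the positions of the sounding points. The user thus gets the sensation that each sounding point really exists at its defined position. For example, if the parts of an orchestra are placed as sounding points in some area, a user moving through that area gets the sensation of moving through the space in which the parts are arranged. The user can thus take part in generating the audio data, and a varied and enjoyable sound amusement can be provided.
[0056]
Because the sound image of a sounding point is localized, when this embodiment is applied to a voice information system that announces the position of the user's target by voice, a message such as "Turn right just before the gas station on your right" is heard as if it came from the right when the target is on the right. Compared with conventional voice information systems that ignore the sound image, this conveys directional information more intuitively and makes voice guidance more efficient.
The sound data distribution system can also be used as a guide for visually impaired people. For example, sounding points that output voices identifying a ticket vending machine, the station staff office, the ticket gate, and so on may be placed at those locations in a station. Because positions can then be conveyed by voice much as they would be by sight, the user can approach the target unaided.
[0057]
In addition, sounding points may be placed at the positions of shops and made to output voice advertisements for those shops. Because the voice is heard as if it came from the shop, the user can more easily find a shop in an inconspicuous location, such as the second floor of a building. The shop, for its part, can expect an advertising effect and more business. Replacing real sounds such as touting with advertisements from virtual sounding points also reduces street noise.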
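[0058]
＜Modification of the first embodiment＞
The first embodiment described above reduces the data amount of sound data in the data amount conversion processing in order to ease the network traffic caused by distributing sound data and the load on the sound data distribution server. When these are not a problem, the data amount conversion processing can be omitted.
[0059]
＜Second embodiment＞
＜Configuration of the sound data distribution system＞
The first embodiment described a sound data distribution system in which a single sound data distribution server 300 distributes sound data to the terminal 100. The second embodiment describes a system in which each of a plurality of sound data distribution servers distributes sound data to the terminal 100.
Components of the second embodiment's system that are common to the first embodiment's system are given the same reference numerals.
[0060]
FIG. 15 shows the schematic configuration of the sound data distribution system in the second embodiment. As shown in the figure, two broad kinds of server are connected to the mobile communication network 500: a control server 600 and sound data distribution servers 610A, 610B, and 610C. Each of the sound data distribution servers 610A, 610B, and 610C stores sound data to be distributed to the terminal 100, and each stores sound data for different sounding points. The control server 600 manages the distribution of sound data from the servers 610A, 610B, and 610C to the terminal 100. Specifically, the control server 600 selects a predetermined number (for example, two) of sounding points according to the position of the terminal 100 and the position of each sounding point. The servers 610A, 610B, and 610C distribute the sound data corresponding to the sounding points selected by the control server 600 to the terminal 100. The control server 600 and each of the servers 610A, 610B, and 610C are assigned a server ID that identifies them in the mobile communication network 500.
For convenience of explanation, the sound data distribution servers 610A, 610B, and 610C are connected directly to the mobile communication network 500 in the second embodiment, but they may instead be connected to it via the Internet or the like. The second embodiment describes an example with three sound data distribution servers, but the number is not limited to three.
[0061]
First, the configuration of the sound data distribution servers 610A, 610B, and 610C is described. They have the same configuration as the sound data distribution server 300 of the first embodiment (see FIG. 2): a control unit that controls each unit, a communication unit that exchanges data with the mobile communication network, and a storage unit that stores various kinds of information.
[0062]
FIG. 16 shows the main information stored in the storage unit of the sound data distribution server 610A, and FIG. 17 shows the main information stored in the storage unit of the sound data distribution server 610B. Although not illustrated, the storage unit of the server 610C stores similar information. As shown in these figures, the storage unit of each of the servers 610A and 610B stores sounding point IDs and the sound data associated with each ID. What distinguishes this storage unit from the storage unit 330 of the sound data distribution server 300 in the first embodiment is that it does not store sounding position information.
[0063]
Next, the configuration of the control server 600 is described. The control server 600 has the same configuration as the sound data distribution servers 610: a control unit that controls each unit, a communication unit that exchanges data with the mobile communication network 500, and a storage unit that stores various kinds of information.
[0064]
FIG. 18 shows the main information stored in the storage unit of the control server 600. As shown in the figure, the storage unit stores sounding point IDs together with the sounding position information and server ID associated with each. What distinguishes it from the storage unit 330 of the sound data distribution server 300 in the first embodiment is that it stores no sound data and that it stores server IDs. A sounding point ID identifies a sounding point whose sound data is stored in the servers 610A, 610B, and 610C. A server ID indicates which of the servers 610A, 610B, and 610C stores the sound data of the sounding point identified by the sounding point ID. In this figure, server ID "A" denotes the server 610A, server ID "B" the server 610B, and server ID "C" the server 610C. For example, the figure shows that the sounding point with ID "Mini Store A" is located at coordinates (x1, y1, z1) and its sound data is stored in the server 610A, and that the sounding point with ID "Mini Store B" is located at coordinates (x2, y2, z2) and its sound data is stored in the server 610B.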
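[0065]
＜Operation of the sound data distribution system＞
The operation of the sound data distribution system in the second embodiment is described with reference to FIG. 19. In this operation, the sound data corresponding to the sounding points selected by the control server 600 is distributed from the sound data distribution servers 610 to the terminal 100, and the terminal 100 generates an audio signal from the distributed sound data. The operation starts when the user inputs, through the instruction input unit 120 of the terminal 100, a generation start signal instructing the start of audio signal generation, and is thereafter invoked by timer interrupts in the terminal 100. Processing performed in ordinary mobile communication systems, such as connection authentication and terminal authentication between the control server 600 and the terminal 100, is not directly related to the present invention and is not described.
[0066]
First, in step SB1, the control unit 110 of the terminal 100 acquires the satellite signals transmitted from the satellite group 400 by receiving them with the satellite radio wave receiving unit 145. Next, in step SB2, it has the positioning unit 140 generate terminal position information SP indicating the three-dimensional position of the terminal 100 from the acquired satellite signals.
[0067]
Next, in step SB3, the control unit 110 transmits the terminal position information SP to the base station 510 together with the server ID of the control server 600, which is the destination of the information. On receiving the terminal position information SP from the terminal 100, the base station 510 forwards it to the control server 600 in step SB4.
[0068]
On receiving the terminal position information SP, the control unit of the control server 600 executes sounding point selection processing in step SB5 according to the received terminal position information SP and the sounding position information stored in its storage unit. This processing is the same as the sounding point selection processing of the first embodiment (see FIG. 9): it selects sounding points in order of proximity to the terminal 100 until the predetermined number (two) is reached. In this description, as an example, the control unit of the control server 600 is assumed to have selected the two sounding points "Mini Store A" and "Building B" from the sounding point IDs shown in FIG. 18, so that "Shopping March A" stored in the server 610A (see FIG. 16) and "Drum sound B" stored in the server 610B (see FIG. 17) are distributed to the terminal 100. The number of sounding points selected by the control unit of the control server 600 is not limited to two and can be set freely.
[0069]
Next, in step SB6, the control unit of the control server 600 transmits to the base station 510 the server ID SID of each sound data distribution server 610 holding the sound data of a selected sounding point. Before transmitting, it appends to each server ID SID the sounding point ID and sounding position information stored in its storage unit. That is, it appends the sounding point ID "Mini Store A" and the sounding position information (x1, y1, z1) to the server ID "A" of FIG. 18, appends the sounding point ID "Building B" and the sounding position information (x2, y2, z2) to the server ID "B", and then transmits the server IDs "A" and "B" to the base station 510.
On receiving the server IDs SID from the control server 600, the base station 510 forwards them to the terminal 100 in step SB7.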
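Steps SB5 and SB6 can be pictured as a lookup in a small directory followed by a reply that tells the terminal which server to ask for which sounding point. The sketch below is only an illustration: the directory contents, coordinates, field names, and function name are all invented for the example and do not reproduce FIG. 18.

```python
import math

# Hypothetical directory held by the control server 600:
# sounding point ID -> position and ID of the server holding its sound data.
DIRECTORY = {
    "Mini Store A": {"position": (0.0, 10.0, 0.0), "server_id": "A"},
    "Building B": {"position": (20.0, 0.0, 0.0), "server_id": "B"},
    "Park C": {"position": (300.0, 300.0, 0.0), "server_id": "C"},
}


def route_sounding_points(terminal_pos, directory, count=2):
    """Return one routing entry per selected sounding point, nearest first.

    Each entry carries the server ID with the sounding point ID and
    sounding position information attached, as in step SB6.
    """
    nearest = sorted(
        directory,
        key=lambda pid: math.dist(terminal_pos, directory[pid]["position"]),
    )[:count]
    return [
        {
            "server_id": directory[pid]["server_id"],
            "sounding_point_id": pid,
            "position": directory[pid]["position"],
        }
        for pid in nearest
    ]
```

The terminal would then issue one distribution request per entry, addressed to the listed server. Keeping positions on the control server and sound data on the distribution servers is what lets the two kinds of server be scaled and updated independently.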
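[0070]
On receiving the server IDs SID forwarded by the base station 510, the control unit 110 of the terminal 100 transmits distribution requests DRA and DRB to the base station 510 in step SB8 in order to request sound data from the sound data distribution servers 610A and 610B identified by those server IDs. That is, it transmits a distribution request DRA for "Shopping March A" addressed to the server 610A and a distribution request DRB for "Drum sound B" addressed to the server 610B.
[0071]
On receiving the distribution requests DRA and DRB from the terminal 100, the base station 510 forwards each to the corresponding sound data distribution server in steps SB9 and SB10: it forwards DRA to the server 610A in step SB9 and DRB to the server 610B in step SB10.
[0072]
On receiving the distribution request DRA from the base station 510, the control unit of the sound data distribution server 610A transmits the sound data SDA indicated by the request (here, "Shopping March A") to the base station 510 in stream form in step SB11. On receiving the sound data SDA from the server 610A, the base station 510 forwards it to the terminal 100 in step SB12.
[0073]
Likewise, on receiving the distribution request DRB from the base station 510, the control unit of the sound data distribution server 610B transmits the sound data SDB indicated by the request (here, "Drum sound B") to the base station 510 in stream form in step SB13. On receiving the sound data SDB from the server 610B, the base station 510 forwards it to the terminal 100 in step SB14. Steps SB13 and SB14 are executed in parallel with steps SB11 and SB12.
[0074]
As in the first embodiment, the servers 610A and 610B may execute data amount conversion processing before transmitting the sound data SDA and SDB. That is, each of the servers 610A and 610B may reduce the data amount of its sound data according to the distance between the terminal 100 and the sounding point and then transmit the sound data to the base station 510.
[0075]
When the control unit 110 of the terminal 100 receives, in parallel via the base station 510, the sound data SDA transmitted from the server 610A (here, "Shopping March A") and the sound data SDB transmitted from the server 610B (here, "Drum sound B"), it inputs them to the audio signal generation unit 160 in step SB15 to generate a left/right two-channel audio signal. Each of the "Shopping March A" and "Drum sound B" sound data items is processed by one of the two processing units 170 in the audio signal generation unit 160. Then, in step SB16, the control unit 110 outputs the audio signal from the audio signal output unit 190. The audio signal output from the terminal 100 is emitted as sound through the headphones 200.
[0076]
Thus, in the second embodiment, sound data is distributed from the sound data distribution servers 610A, 610B, and 610C to the terminal 100 under the management of the control server 600, and the terminal 100 generates an audio signal in which the sound images of the virtual sounding points are localized according to the user's position and face direction A. As in the first embodiment, the user can move through a space as if the sounding points really existed, and a novel sound amusement can be provided.
[0077]
In the second embodiment, sound data is distributed from a plurality of sound data distribution servers 610A, 610B, and 610C. This avoids a situation in which many terminals 100 all receive sound data from a single sound data distribution server, and the load on the sound data distribution servers is spread out. Furthermore, because the control server 600 centrally manages distribution, sound data is easy to manage and easy to add to, which enriches the variety and content of the sound data distributed to the terminal 100.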
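[0078]
＜Modifications of the first and second embodiments＞
The first and second embodiments generate the terminal position information SP in the terminal 100 by GNSS, but this is not the only option. For example, the sound data distribution server 300 or the control server 600 may generate the terminal position information SP from, for example, the position of the base station 510 with which the terminal 100 has established a wireless link.
[0079]
The first and second embodiments describe a terminal 100 with a wireless communication unit 130 that communicates wirelessly with the base station 510, but this is not the only option. For example, a detachable communication module may be attached to a portable terminal without a wireless communication function, such as a PDA (Personal Digital Assistant), to exchange data with the sound data distribution server 300 and the like.
[0080]
The first and second embodiments distribute sound data in stream form, but the terminal 100 may instead be provided with a storage unit for sound data and generate the audio signal from sound data cached in that storage unit.
[0081]
In addition, the first and second embodiments convert the data amount of the sound data to be distributed to the terminal 100 in the sound data distribution server 300, 610A, or the like according to the distance between the terminal 100 and the sounding point, but this is not the only option. For example, the storage unit 330 of the sound data distribution server 300 may store in advance several versions of the sound data that represent the same sound with different data amounts (for example, different sampling frequencies); one of these is then selected according to the distance between the terminal 100 and the sounding point and transmitted to the terminal 100. This reduces network traffic as in the embodiments described above.
[0082]
＜Third embodiment＞
The first and second embodiments describe a terminal 100 that receives sound data from the sound data distribution server 300 or 610 via the mobile communication network 500 and generates an audio signal from the received sound data. The third embodiment describes a terminal that generates an audio signal from sound data stored in a storage unit inside the terminal. The third embodiment requires neither the sound data distribution server 300 of the embodiments above nor wireless communication equipment such as the base stations 510.
[0083]
FIG. 20 shows the configuration of the terminal in the third embodiment. In this figure, components identical to those of the terminal 100 of the first embodiment are given the same reference numerals.
The characteristic component of the terminal 700 in the third embodiment is the sounding information storage unit 720. It stores sounding position information and sound data for the sounding points and supplies them to the bus B1. Whereas the control unit 110 of the terminal 100 in the first and second embodiments generates an audio signal from sound data distributed by the sound data distribution server 300, the control unit 710 of the terminal 700 reads sound data from the sounding information storage unit 720 and generates an audio signal from it.
[0084]
More specifically, the control unit 710 first executes sounding point selection processing according to the position of the terminal 700 indicated by the terminal position information SP generated by the positioning unit 140 and the positions of the sounding points indicated by the sounding position information stored in the sounding information storage unit 720. This processing is the same as the sounding point selection processing executed by the sound data distribution server 300 in the first embodiment (see FIG. 9): it selects sounding points in order of proximity to the terminal 700 until a predetermined number is reached. Next, the control unit 710 reads the sound data for each selected sounding point from the sounding information storage unit 720 in parallel. It then has the audio signal generation unit 160 generate a two-channel audio signal from the read sound data according to the terminal position information SP and the sounding position information.
[0085]
Thus, in the third embodiment, the audio signal is generated from sound data read from the sounding information storage unit 720 in the terminal 700. As in the first and second embodiments, the user can move through a space in which virtual sounding points are placed, and a novel sound amusement can be provided. Moreover, because no arrangement for distributing sound data is needed, the configuration is simpler.
[0086]
The sounding information storage unit 720 need not be built into the terminal 700. For example, the terminal 700 may have removable storage that reads information on sounding points from a recording medium such as an optical disc, or an interface for attaching external removable storage.
[0087]
The first, second, and third embodiments describe sounding points at fixed positions, but the position of a sounding point may change over time. This makes it possible, for example, to virtually generate the sound assumed to be output from a moving body such as a jet aircraft, making the sound image output from the headphones 200 even more varied and enjoyable.
[0088]
In addition, the embodiments above define the sounding position information and the terminal position information SP as three-dimensional positions, but they may be defined as two-dimensional positions.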
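[0089]
＜Fourth embodiment＞
The embodiments above describe sound data distribution systems that localize sound images for sounding points at fixed positions. The fourth embodiment describes a sound data distribution system that associates the position of a sounding point with the position of a terminal carried by a user and localizes the sound image for a user as if sound were output from the position of another, moving user. Components of this embodiment's system that are common to the first embodiment's system are given the same reference numerals.
[0090]
＜Configuration of the sound data distribution server＞
The sound data distribution server 300 of the first embodiment distributes sound data stored in advance in the storage unit 330 to the terminal 100. The sound data distribution server of this embodiment, by contrast, receives sound data and terminal position information SP from a terminal and distributes information containing them to other terminals. The sound data uploaded from a terminal to the server indicates the sound virtually output from the position of that terminal. The terminal position information SP uploaded with the sound data indicates the position of the virtual sound source from which the sound corresponding to the sound data is to be output, and it plays the same role as the sounding position information in the embodiments above.
[0091]
FIG. 21 shows the configuration of the sound data distribution server 800. As shown in the figure, the server 800 differs from the server 300 described above in having a movement vector calculation unit 810 and in the information stored in the storage unit 330. The movement vector calculation unit 810 calculates, from the relative positions of one terminal at two points in time, a vector indicating the terminal's displacement per unit time, that is, its velocity (hereinafter "movement vector MV"). For example, as shown in FIG. 22, suppose a terminal 900 located at point P_T−1 (x_T−1, y_T−1) moves to point P_T (x_T, y_T) after a unit time "ut" has elapsed. The movement vector calculation unit 810 then calculates, as the movement vector MV,
[0092]
[Math 5]
$$MV = \left( \frac{x_T - x_{T-1}}{ut},\ \frac{y_T - y_{T-1}}{ut} \right)$$
[0093]
As described later, this movement vector MV is used in a terminal 900 to predict the path of another terminal 900.
FIG. 23 shows the information stored in the storage unit 330 of the sound data distribution server 800. As shown in the figure, the storage unit 330 stores a "terminal ID", "terminal position information SP", "movement vector MV", and "sound data" in association with one another. The terminal ID identifies a terminal 900 in the sound data distribution system. Of the terminal IDs of all terminals 900, the storage unit 330 stores those of the terminals 900 currently connected to the server 800.
[0094]
The terminal position information SP indicates the position of the terminal 900 corresponding to the terminal ID, and for each terminal ID it contains position information for two points in time, "period T−1" and "period T". Here, "period T−1" is the time one unit time "ut" before "period T".
[0095]
The terminal position information SP is updated each time the unit time "ut" elapses for as long as the terminal 900 is within the service area of a base station 510. In this embodiment, the terminal position information SP is two-dimensional position information, as in (x1_T−1, y1_T−1) and (x1_T, y1_T), but it may be three-dimensional like the sounding position information above.
[0096]
The movement vector MV is generated by the movement vector calculation unit 810 described above and indicates the velocity of the terminal 900 corresponding to the terminal ID over the interval from "period T−1" to "period T". The movement vector calculation unit 810 calculates the movement vector MV from the terminal position information SP for "period T−1" and "period T".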
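The movement vector is simply the displacement between the two stored positions divided by the unit time ut. A minimal Python sketch follows; the function name is hypothetical, and the check on `unit_time` is an addition for robustness.

```python
def movement_vector(pos_prev, pos_curr, unit_time):
    """Displacement per unit time between period T-1 and period T.

    pos_prev and pos_curr are coordinate tuples of equal length
    (two-dimensional in this embodiment, but any dimension works).
    """
    if unit_time <= 0:
        raise ValueError("unit_time must be positive")
    return tuple((c - p) / unit_time for p, c in zip(pos_prev, pos_curr))
```

For a terminal that moves from (0, 0) to (3, 4) during a unit time of 2, the result is (1.5, 2.0).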
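[0097]
The sound data is data uploaded from the terminal 900 corresponding to the terminal ID and is distributed from the sound data distribution server 800 to other terminals 900. This embodiment assumes that the sound virtually output from the position of a terminal 900 is a sound making up a piece of music, but it is not limited to this and may be any sound such as a musical tone or a voice.
The set of terminal ID, terminal position information SP, movement vector MV, and sound data described above is stored in the storage unit 330 only while the terminal 900 corresponding to the terminal ID is connected to the server 800; when the connection is cut, the control unit 310 deletes it from the storage unit 330.
[0098]
＜Configuration of the terminal＞
A terminal 900 uploads to the sound data distribution server 800 sound data indicating the sound virtually output from its own position and terminal position information SP indicating its own position, and downloads from the server 800 the sound data, terminal position information SP, and movement vector MV for other terminals 900. Using the downloaded sound data, terminal position information SP, and movement vector MV for another terminal 900, the terminal 900 localizes the sound image as if the music indicated by the sound data were being output from the position of that other terminal 900 (user).
[0099]
FIG. 24 is a block diagram showing the configuration of the terminal 900. As shown in the figure, the terminal 900 has, in addition to the components of the terminal 100 of the first embodiment (see FIG. 5), a storage unit 910 and a position prediction unit 920.
The storage unit 910 stores "sound data" and the "terminal ID" of the terminal itself (terminal 900). This sound data indicates the sound assumed to be virtually output from the terminal's own position. The user can store sound data in the storage unit 910 of the terminal 900 by, for example, downloading it over a network.
[0100]
The position prediction unit 920 predicts the path along which another terminal 900 moves after "period T", using the "period T" terminal position information SP and the movement vector MV for that terminal distributed from the sound data distribution server 800. For example, as shown in FIG. 22, given the position P_T (x_T, y_T) of the terminal 900 in "period T" and its movement vector MV from "period T−1" to "period T", the position prediction unit 920 predicts the position P_C (x_C, y_C) of the terminal 900 over time using, for example, the following equation.
[0101]
[Math 6]
$$(x_C,\ y_C) = (x_T,\ y_T) + MV \cdot et$$
[0102]
Here, et is the time elapsed since "period T".
Returning to FIG. 24, the audio signal generation unit 160 generates, as in the first embodiment, the audio signals SL3 and SR3 in which the sound image is localized from the sound data, using the terminal position information SP and the direction information A generated by the terminal itself. However, in place of sounding position information indicating the position of a sounding point, it localizes the sound image using the position P_C of the other terminal 900 predicted by the position prediction unit 920. The reason for using the predicted value (position P_C) from the position prediction unit 920 rather than a measured value for the position of the other terminal 900 is to prevent the localized sound image from becoming unnatural because of network traffic and similar effects; this point is discussed later.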
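The linear prediction used by the position prediction unit 920 adds the movement vector, scaled by the elapsed time et, to the last reported position. A minimal sketch, with a hypothetical function name:

```python
def predict_position(pos_t, mv, elapsed):
    """Predicted position P_C = P_T + MV * et.

    pos_t is the position reported for period T, mv the movement vector,
    and elapsed the time et since period T (same time unit as mv).
    """
    return tuple(p + v * elapsed for p, v in zip(pos_t, mv))
```

Calling the function repeatedly with a growing `elapsed` between two position reports yields a smoothly moving source position instead of a jump when the next report arrives.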
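[0103]
＜Operation of the sound data distribution system＞
Next, the operation of the sound data distribution system is described with reference to FIG. 25. In this operation, a terminal 900 localizes a sound image as if music were being output from the position of another terminal 900. In the sound data distribution system, each of the plurality of terminals 900 concurrently uploads sound data and, after downloading sound data from the sound data distribution server 800, localizes the sound image according to its position relative to the other terminals 900. For convenience, however, the following description considers only two of the terminals 900: for one (hereinafter "900U") it concentrates on the upload of sound data, and for the other (hereinafter "900D") on the download of sound data and the processing for localizing the sound image.
[0104]
First, when the terminal 900U enters the service area of one of the base stations 510, its control unit 110 transmits, in steps SC1 and SC2, the sound data SD stored in the storage unit 910 and its own terminal ID MI to the sound data distribution server 800 via the base station 510. On receiving the sound data SD and the terminal ID MI from the terminal 900U, the control unit 310 of the server 800 stores them in the storage unit 330 in association with each other, as shown in FIG. 23.
[0105]
After transmitting the sound data SD and the terminal ID to the base station 510 (step SC1), the control unit 110 of the terminal 900U receives, in step SC4, the satellite signals transmitted from the satellite group 400 with the satellite radio wave receiving unit 145. Then, in step SC5, it has the positioning unit 140 generate terminal position information SP1 indicating its own position from the received satellite signals. Next, in steps SC6 and SC7, it transmits the generated terminal position information SP1 and its own terminal ID MI to the server 800 via the base station 510.
[0106]
On receiving the terminal position information SP1 and the terminal ID MI from the terminal 900U via the base station 510, the control unit 310 of the server 800 stores, in step SC8, the received terminal position information SP1 in the storage unit 330 in association with the stored terminal ID that matches the received terminal ID MI. For example, suppose that, as shown in the upper part of FIG. 26, the storage unit 330 already holds the terminal ID "MS1", the "period T−1" terminal position information SP (x1_T−2, y1_T−2), the "period T" terminal position information SP (x1_T−1, y1_T−1), the movement vector MV (x1_V−1, y1_V−1), and the sound data SD "Music 1" in association with one another. Suppose the server 800 then receives, in step SC7, "MS1" as the terminal ID and (x1_T, y1_T) as the terminal position information SP1 from the base station 510. As shown in the lower part of the figure, the control unit 310 first rewrites the current "period T" terminal position information SP (x1_T−1, y1_T−1) as the "period T−1" terminal position information SP and then stores the received terminal position information SP1 (x1_T, y1_T) in the storage unit 330 as the "period T" terminal position information SP.
[0107]
Next, in step SC9, the control unit 310 of the server 800 has the movement vector calculation unit 810 calculate the movement vector MV of the terminal 900U from the "period T−1" and "period T" terminal position information SP stored in the storage unit 330, and then stores the calculated movement vector MV in the storage unit 330 in association with the terminal ID. If there is no "period T−1" terminal position information SP, that is, if this is the first execution of step SC9 by the server 800, the zero vector is recorded in the storage unit 330 as the movement vector MV.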
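The bookkeeping of steps SC8 and SC9 (shift "period T" to "period T−1", store the new position, recompute MV, and fall back to the zero vector on the first report) can be sketched with a small record type. The class and field names below are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


@dataclass
class TerminalRecord:
    """One row of the server-side table: ID, two positions, MV, sound data."""
    terminal_id: str
    sound_data: str
    pos_prev: Optional[Vec2] = None   # position for period T-1
    pos_curr: Optional[Vec2] = None   # position for period T
    mv: Vec2 = (0.0, 0.0)             # movement vector MV


def update_position(record: TerminalRecord, new_pos: Vec2, unit_time: float) -> TerminalRecord:
    """Apply steps SC8 and SC9 to one record and return it."""
    # SC8: the old period-T position becomes the period-(T-1) position.
    record.pos_prev = record.pos_curr
    record.pos_curr = new_pos
    # SC9: recompute MV; the first report has no earlier position, so MV is zero.
    if record.pos_prev is None:
        record.mv = (0.0, 0.0)
    else:
        record.mv = (
            (new_pos[0] - record.pos_prev[0]) / unit_time,
            (new_pos[1] - record.pos_prev[1]) / unit_time,
        )
    return record
```

Deleting the record when the terminal disconnects, as the text requires, would be handled by whatever container holds the records (for example a dictionary keyed by terminal ID).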
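[0108]
The description now turns to the operation of the terminal 900D, which downloads sound data SD from the sound data distribution server 800.
When a generation start signal instructing the start of audio signal generation is input from the instruction input unit 120, the control unit 110 of the terminal 900D receives, in step SC10, the satellite signals transmitted from the satellite group 400 with the satellite radio wave receiving unit 145. Next, in step SC11, it has the positioning unit 140 generate terminal position information SP2 indicating its own position from the received satellite signals. Then, in steps SC12 and SC13, it transmits its terminal ID MI and the generated terminal position information SP2 to the server 800 via the base station 510.
[0109]
On receiving the terminal position information SP2 and the terminal ID MI forwarded from the base station 510, the control unit 310 of the server 800 executes terminal selection processing in step SC15. This processing is substantially the same as the sounding point selection processing of the first embodiment (see FIG. 9): according to the relative distances between the terminal 900D and the other terminals 900, it selects those other terminals 900 located near the terminal 900D. In this example, the terminal 900U is assumed to have been selected, and the following describes the operation of localizing the sound image of the sound data SD for the terminal 900U.
[0110]
Having selected the terminal 900U in the terminal selection processing, the control unit 310 of the server 800 transmits to the base station 510, in step SC16, a set consisting of the terminal ID MI of the selected terminal 900U and the "period T" terminal position information SP1, movement vector MV, and sound data SD associated with that terminal ID in the storage unit 330. It transmits them in a format in which the terminal ID MI, the terminal position information SP1, and the movement vector MV are attached as a header of the sound data SD. If several terminals 900 are selected in the terminal selection processing, the control unit 310 transmits to the base station 510 one set of terminal ID MI, "period T" terminal position information SP, movement vector MV, and sound data SD per terminal 900.
On receiving the terminal ID MI, the "period T" terminal position information SP1, the movement vector MV, and the sound data SD, the base station 510 forwards them to the terminal 900D in step SC17.
[0111]
Meanwhile, after transmitting the terminal position information SP2 and the terminal ID MI to the base station 510 (step SC12), the control unit 110 of the terminal 900D has the direction detection unit 150 generate, in step SC14, direction information OD indicating the user's face direction A. Then, on receiving from the base station 510 the terminal ID MI, the "period T" terminal position information SP1, the movement vector MV, and the sound data SD (step SC17), it generates in step SC18 an audio signal in which the sound image of the received sound data SD is localized and emits the generated audio signal through the headphones 200. While generating the audio signal, the terminal 900D updates its own terminal position information SP2 and direction information OD at fixed intervals.
In the sound data distribution system, steps SC4 to SC18 described above are repeated, so that the sound data SD distributed from the server 800 to the terminal 900D is reproduced in the terminal 900D in stream form.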
【０１１２】
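The process by which the terminal 900D localizes the sound image in step SC18 is described in detail below. First, the control unit 110 of the terminal 900D uses the position prediction unit 920 to predict the position of the terminal 900U from the "period T" terminal position information SP1 and the movement vector MV. Next, using the audio signal generation unit 160, the control unit 110 of the terminal 900D generates from the sound data SD an audio signal with a localized sound image, according to the predicted position of the terminal 900U and the terminal position information SP2 and direction information OD generated by the terminal 900D itself.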
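The position prediction in step SC18 can be sketched as a linear extrapolation. The sketch assumes that the movement vector MV is the displacement over one reporting interval and that the terminal knows how much time has elapsed since the last report; the function name `predict_position` and its parameters are hypothetical.

```python
from typing import Tuple

Vec = Tuple[float, float]


def predict_position(last_pos: Vec, move_vec: Vec,
                     elapsed: float, interval: float) -> Vec:
    """Linearly extrapolate the other terminal's position.

    last_pos : most recent reported position ("period T")
    move_vec : movement vector MV, assumed to be displacement per interval
    elapsed  : time since last_pos was reported
    interval : time between two consecutive reports (same unit as elapsed)
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    k = elapsed / interval
    return (last_pos[0] + move_vec[0] * k,
            last_pos[1] + move_vec[1] * k)
```

Evaluating this function at each audio update, rather than only when a new report arrives, yields a position that moves smoothly between reports.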
【０１１３】
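For example, as shown by the solid lines in FIG. 27, suppose that the user 902D of the terminal 900D is at a position away from the user 902U of the terminal 900U and is facing in direction A, toward the user 902U. In this case, the control unit 110 of the terminal 900D generates an audio signal in which the song sounds as if it were being emitted from the position of the user 902U, and emits it through the headphones 200.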
【０１１４】
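Next, as shown by the broken lines in the figure, when the users 902D and 902U move toward each other, the control unit 110 of the terminal 900D increases the sound pressure (volume) of the song emitted from the headphones 200 as the distance between the terminals 900U and 900D decreases. This gives the user 902D the sensation that the positional relationship between the user 902D and the point from which the song is emitted tracks the positional relationship between the user 902D and the other user 902U.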
【０１１５】
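Next, as shown by the broken lines in FIG. 28, suppose that the user 902U moves away to the right as seen from the user 902D of the terminal 900D. When the user 902U (terminal 900U) moves in this way, the sound pressure of the song emitted from the headphones 200 decreases in both ears as the user 902U recedes. However, because the distance from the right ear of the user 902D to the user 902U is shorter than the distance from the left ear, the sound pressure reaching the right ear is higher than that reaching the left ear. For the same reason, the L channel signal is delayed relative to the R channel signal. As a result, the user 902D perceives that the point from which the song is emitted is receding to the right, and thereby learns that the other user 902U is moving to the right. In other words, if the user 902D knows in advance which kind of sound is associated with the position of the other user 902U, the user 902D can learn the approximate position of the other user 902U simply by listening to the sound emitted from the headphones 200.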
【０１１６】
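The reason for localizing the sound image with the value predicted by the position prediction unit 920, rather than with the measured position of the terminal 900U, is as follows.
When, for example, the data transmission rate between the sound data distribution server 800 and the terminal 900D is low, or the processing capacity of the sound data distribution server 800 is low, the terminal 900D obtains fewer samples per unit time of the terminal position information SP1 of the other terminal 900U. When the samples are sparse, the positions indicated by temporally consecutive items of terminal position information SP1 can be far apart from one another. In that case, the localized sound image may give the user 902D the unnatural impression that the sounding point has jumped instantaneously to a distant point. To address this, in the present embodiment the terminal 900D, after obtaining one item of terminal position information SP1 and until it obtains the next, updates the position of the terminal 900U by prediction and localizes the sound image according to the updated position. That is, the measured values are interpolated over time with predicted values, so that the sound image is localized while avoiding large jumps between the temporally consecutive positions of the terminal 900U used for localization. This makes it possible to eliminate the unnatural impression given to the user 902D.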
【０１１７】
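The method of predicting the position of the terminal 900U is not limited to linear prediction using the movement vector MV. For example, a nonlinear path of the terminal 900U may be predicted with a nonlinear function whose parameters are pairs of three or more past time points and the positions of the terminal 900U at those time points. Furthermore, geographic information such as road information may be used so that the prediction takes into account the geography of the area through which the user 902U is moving. Because the path can then be predicted within the range in which a person can actually move, prediction accuracy can be improved.
Of course, if the terminal 900D can obtain a sufficient number of samples of the terminal position information SP1, the sound image may be localized using only the measured values (terminal position information SP1), without predicted values. Alternatively, the sound image may in principle be localized using measured values, with predicted values used selectively only when the number of samples of the terminal position information SP1 is small.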
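One way to realize a nonlinear prediction from three past samples is quadratic extrapolation. The patent does not name a specific function, so the Lagrange form below is only one illustrative choice, and the name `predict_quadratic` is hypothetical.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float]


def predict_quadratic(samples: Sequence[Tuple[float, Vec]], t: float) -> Vec:
    """Extrapolate a position at time t from three (time, position) samples.

    Fits one quadratic per coordinate through the three samples
    (Lagrange interpolation) and evaluates it at t.
    The three sample times must be distinct.
    """
    (t0, p0), (t1, p1), (t2, p2) = samples
    w0 = (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
    w1 = (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
    w2 = (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1))
    return (w0 * p0[0] + w1 * p1[0] + w2 * p2[0],
            w0 * p0[1] + w1 * p1[1] + w2 * p2[1])
```

Quadratic extrapolation follows curved paths better than the linear method but can overshoot when extrapolated far beyond the last sample, so in practice the prediction horizon would be kept short.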
【０１１８】
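In the present embodiment, every point from which sound is virtually output is associated with the position of a terminal 900, but sounding points associated with the positions of terminals 900 may be mixed with the fixed-position sounding points described in the first embodiment. In such a configuration, the sound data distribution server 800 can reduce its overall processing load by handling moving and non-moving sounding points separately, that is, by calculating the movement vector MV only for sounding points whose positions move.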
【０１１９】
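In the present embodiment, the sound data SD distributed to the terminal 900D is data uploaded from the terminal 900U, but this is not a limitation. For example, a plurality of items of sound data SD may be stored in advance in the storage unit 330 of the sound data distribution server 800, and one of them distributed to the terminal 900D. Which sound data SD is associated with which other terminal 900U may be specified by the user 902U via the terminal 900U, or by the user 902D via the terminal 900D. Because the sound data SD can then be omitted from the information uploaded from the terminal 900U, the amount of uploaded data is greatly reduced.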
【０１２０】
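In the present embodiment, the sound image is localized according to the positional relationship between the users 902D and 902U and the direction A in which the face of the user 902D is turned, but various further modifications such as the following are possible. For example, the Doppler effect may be incorporated into the sound image by changing the frequency of the sound according to the difference in velocity between the terminal 900U and the terminal 900D. This makes it possible to convey realistically the impression of the users 902D and 902U passing each other.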
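The Doppler modification can be sketched with the standard formula for a moving source and a moving listener, using only the velocity components along the line joining the two terminals. The patent does not give a formula; the function name `doppler_factor` and the default speed of sound of 340 m/s are assumptions.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def doppler_factor(src_pos: Vec, src_vel: Vec,
                   lis_pos: Vec, lis_vel: Vec, c: float = 340.0) -> float:
    """Return the ratio of perceived frequency to emitted frequency.

    Uses f_obs / f_src = (c + v_listener) / (c - v_source), where both
    speeds are the components directed toward the other party.
    Valid only while both speeds are well below c.
    """
    dx, dy = lis_pos[0] - src_pos[0], lis_pos[1] - src_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return 1.0  # coincident positions: direction undefined, no shift applied
    ux, uy = dx / dist, dy / dist                    # unit vector, source -> listener
    v_src = src_vel[0] * ux + src_vel[1] * uy        # source speed toward listener
    v_lis = -(lis_vel[0] * ux + lis_vel[1] * uy)     # listener speed toward source
    return (c + v_lis) / (c - v_src)
```

A factor above 1 raises the pitch while the two users approach each other, and a factor below 1 lowers it once they have passed and are moving apart.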
【０１２１】
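An effect may also be applied to the sound output from the headphones 200 according to the position of the terminal 900D or 900U. For example, when the terminal 900D or 900U is located in a specific region of the service area provided by the base station 510, the tempo of the song, the feel of its chords, or the like may be changed. To realize such a sound data distribution system, parameters for applying effects to the sound data SD are stored in the storage unit 330 of the sound data distribution server 800 for each region into which the service area is divided. Then, when the sound data SD is distributed, a parameter is selected according to the position of the terminal 900D or 900U, and the sound data SD with the corresponding effect applied is distributed to the terminal 900D. Since an effect is thus applied to the sound from the sounding point when the user 902D or 902U passes through a specific region such as a shopping street, the amusement value is enhanced.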
【０１２２】
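Furthermore, the direction A in which the face of the user 902U is turned can also be reflected in the sound image localization. For example, the sound virtually output from the position of the user 902U may be given directivity in the direction A in which the face of the user 902U is turned. This gives the user 902D the sensation that the sound is coming from the mouth of the user 902U.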
【０１２３】
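The terminal 900D described above indicates the positional relationship with the other terminal 900U to the user 902D by sound alone, but other means may be used in parallel. For example, in addition to the configuration described above, the terminal 900D may be provided with display means such as an LED (Light Emitting Diode) whose emission intensity is increased as the distance to the other terminal 900U decreases, thereby indicating the positional relationship to the user 902D. Alternatively, the terminal 900D may incorporate a vibrator that generates mechanical vibration and indicate the positional relationship with the other terminal 900U by vibration. The user 902D can thereby grasp the position of the other user 902U by sight or touch in addition to hearing.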
【０１２４】
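<Modifications of the first, second, third and fourth embodiments>
The audio signal generation unit 160 in each of the embodiments described above generates audio signals according to the position and face direction of the user and the positions of the sounding points, but it may additionally take into account the sound field of the space in which the user is located (for example, effects such as reflection and diffraction of sound by building walls).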
【０１２５】
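In each of the embodiments described above, the audio signal is output as sound through the headphones 200, but this is not a limitation. For example, sound may be emitted from a plurality of speakers installed inside an automobile. In such a configuration, the audio signal generation unit 160 may, for example, determine the direction of the user's face from the direction of travel of the automobile, and generate the audio signal to be output from each speaker according to the determined face direction, the position of the automobile, and the relative positions of each speaker and the left and right ears.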
【０１２６】
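Furthermore, in each of the embodiments described above, two-channel audio signals are generated in order to localize the sound image of a sounding point, but the present invention is not limited to this. For example, audio signals with two or more channels, such as 5.1 channels, may be generated and emitted from sound emitting devices such as speakers.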
【０１２７】
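In each of the embodiments described above, the sounding point selection process selects sounding points in order of increasing distance from the terminal 100, 700, or 900 until a predetermined number is reached (see FIG. 9), but the selection method is not limited to this. The present invention can be applied with any method that selects sounding points according to the position of the terminal 100 and the positions of the sounding points, for example a method that selects every sounding point whose distance from the terminal 100 is at or below a threshold, regardless of how many are selected.
When the sound data distribution system is applied to a relatively small area such as a theme park and sound images are localized for all sounding points, the sounding point selection process can be omitted.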
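The two selection strategies mentioned above, the nearest fixed number and everything within a threshold, can be compared in a short sketch. The representation of a sounding point as an `(id, position)` pair and the function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[str, Tuple[float, float, float]]  # (sounding point ID, position)


def select_nearest(terminal_pos: Sequence[float],
                   points: Sequence[Point], count: int) -> List[Point]:
    """Select up to `count` sounding points in order of increasing distance."""
    ranked = sorted(points, key=lambda p: math.dist(terminal_pos, p[1]))
    return ranked[:count]


def select_within(terminal_pos: Sequence[float],
                  points: Sequence[Point], threshold: float) -> List[Point]:
    """Select every sounding point at or below `threshold` distance."""
    return [p for p in points if math.dist(terminal_pos, p[1]) <= threshold]
```

The first strategy bounds the number of parallel streams the terminal must process, while the second bounds the audible radius but lets the stream count vary with the density of sounding points.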
【０１２８】
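In the first, second, and fourth embodiments described above, the sound image of each virtual sounding point is localized in each of the terminals 100 and 900, but the sound image may instead be localized in the sound data distribution server 300, 610A, 610B, 610C, or 800, or in the control server 600, and a signal representing that sound image distributed to the terminal. In short, the present invention can be applied to any configuration that acquires information indicating the user's position and the direction A in which the user's face is turned and, according to that position and direction and the position of a virtual sounding point, localizes the sound image so that, as perceived by the user, the sound associated in advance with the sounding point is output from the position of that sounding point.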
【０１２９】
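The present invention can also be implemented as a program for causing a computer to function as the terminal 100, 700, or 900 that localizes sound images as described above. That is, the program is specified as a program for causing a computer to realize: a function of acquiring user information indicating the user's position and the direction in which the user's face is turned; a function of acquiring sounding position information indicating the position of a virtual sounding point; and a function of localizing a sound image so that, as perceived by a user who is at the position indicated by the acquired user information and whose face is turned in the direction indicated by that information, a sound of the type associated in advance with the sounding point is output from the position indicated by the sounding position information.
The present invention can further be realized as a computer-readable recording medium on which this program is recorded.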
【０１３０】
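[Effects of the invention]
As described above, the present invention provides a sound image localization device, a sound image localization method, a sound data distribution system, a sound data distribution method, and a program that make it possible to provide highly entertaining acoustic amusement.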
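[Brief description of the drawings]
[FIG. 1] A diagram showing the configuration of the sound data distribution system according to the first embodiment of the present invention.
[FIG. 2] A diagram showing the configuration of the sound data distribution server included in the sound data distribution system.
[FIG. 3] A diagram showing information stored in the storage unit of the sound data distribution server.
[FIG. 4] A diagram showing the data amount conversion table stored in the storage unit of the sound data distribution server.
[FIG. 5] A diagram showing the configuration of a terminal included in the sound data distribution system.
[FIG. 6] A diagram showing the configuration of the audio signal generation unit and related parts included in the terminal.
[FIG. 7] A diagram for explaining the processing performed by the audio signal generation unit.
[FIG. 8] A flowchart showing the operation of the sound data distribution system.
[FIG. 9] A flowchart showing the sounding point selection process executed by the sound data distribution server.
[FIG. 10] A flowchart showing the data amount conversion process executed by the sound data distribution server.
[FIG. 11] A diagram showing the sounding points selected by the sound data distribution server.
[FIG. 12] A diagram for explaining sound image localization by the audio signal generated by the terminal.
[FIG. 13] A diagram for explaining the sound image localization.
[FIG. 14] A diagram for explaining the sound image localization.
[FIG. 15] A diagram showing the configuration of the sound data distribution system according to the second embodiment of the present invention.
[FIG. 16] A diagram showing information stored in a sound data distribution server included in the sound data distribution system.
[FIG. 17] A diagram showing information stored in the sound data distribution server.
[FIG. 18] A diagram showing information stored in the control server included in the sound data distribution system.
[FIG. 19] A flowchart showing the operation of the sound data distribution system.
[FIG. 20] A diagram showing the configuration of the terminal according to the third embodiment of the present invention.
[FIG. 21] A diagram showing the configuration of the sound data distribution server according to the fourth embodiment of the present invention.
[FIG. 22] A diagram for explaining a method of predicting the movement path of a terminal.
[FIG. 23] A diagram showing information stored in the sound data distribution server in the fourth embodiment.
[FIG. 24] A diagram showing the configuration of the terminal in the fourth embodiment.
[FIG. 25] A flowchart showing the operation of the sound data distribution system in the fourth embodiment.
[FIG. 26] A diagram for explaining the operation.
[FIG. 27] A diagram for explaining sound image localization in the operation.
[FIG. 28] A diagram for explaining sound image localization in the operation.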
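[Explanation of reference signs]
100, 700, 900: terminal; 110, 710: control unit; 120: instruction input unit; 130: wireless communication unit; 140: positioning unit; 145: satellite radio wave receiving unit; 150: direction detection unit; 160: audio signal generation unit; 170: processing unit; 172: parameter generation unit; 173: delay parameter generation unit; 174: amplifier parameter generation unit; 176: delay unit; 178: amplifier; 180: mixing unit; 190: audio signal output unit; 200: headphones; 210: direction sensor; 220, 230: sound emitting unit; 300, 610A, 610B, 610C, 800: sound data distribution server; 310: control unit; 320: communication unit; 330: storage unit; 400: satellite group; 500: mobile communication network; 510: base station; 600: control server; 720: sounding information storage unit.
[0001]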
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a sound image localization device for localizing a sound image, a sound image localization method and a program, and a sound data distribution system and a sound data distribution method for distributing sound data representing sound.
[0002]
[Prior art]
Conventionally, there have been known audio distribution systems in which audio data such as music, stream-distributed via a wireless communication network, is received by a mobile terminal and output as sound from headphones or the like connected to the mobile terminal (see, for example, Patent Document 1). With such an audio distribution system, the user can easily enjoy music even when away from home.
[0003]
[Patent Document 1]
JP-A-9-181510 (FIG. 3)
[0004]
[Problems to be solved by the invention]
However, although a conventional audio distribution system can faithfully reproduce the distributed audio data on the mobile terminal, it cannot offer the user entertainment value such as letting the user take part in generating the audio data.
[0005]
The present invention has been made in view of the circumstances described above, and its object is to provide a sound image localization device, a sound image localization method, a sound data distribution system, a sound data distribution method, and a program that make it possible to provide highly entertaining acoustic amusement.
[0006]
[Means for Solving the Problems]
In order to achieve the above object, a sound image localization device according to the present invention comprises: user information acquisition means for acquiring user information indicating the position of a user and the direction in which the user's face is turned; sounding position information acquisition means for acquiring sounding position information indicating the position of a virtual sounding point; and localization means for localizing a sound image so that, as perceived by a user who is at the position indicated by the acquired user information and whose face is turned in the direction indicated by that information, a sound of a type associated in advance with the sounding point is output from the position indicated by the sounding position information.
According to this configuration, when the user is located at a certain point with the face turned in a certain direction, the sound image is localized so that a sound of the type associated in advance with the virtual sounding point is output from that sounding point. The user can thus be given the sensation of being in the space in which the sounding point is placed.
[0007]
Here, it is preferable that the sounding position information acquisition means acquires, as the sounding position information, moving body position information indicating the position of a moving body associated with the sounding point, and that the localization means localizes the sound image so that the sound is output from the position of the moving body indicated by the acquired moving body position information.
According to this configuration, the user can perceive the approximate positional relationship with the moving body through the sound image.
[0008]
In another preferred aspect, the device further includes receiving means for receiving sound data representing the type of sound associated in advance with the sounding point, and the localization means localizes the sound image of the sound represented by the sound data received by the receiving means. Because the sound data is acquired via the receiving means, the sound image localization device does not need a dedicated storage device for storing sound data in a nonvolatile manner. Furthermore, since the sound data can be managed centrally by the distribution device that distributes it to the sound image localization device, updating the sound data and similar tasks are easy.
The present invention can be realized as a sound image localization method and a program in addition to the above sound image localization device, and can achieve the same effects as the above sound image localization device.
[0009]
The present invention also provides a sound data distribution system comprising: a sound data distribution device that distributes sound data representing a type of sound associated in advance with a virtual sounding point; and a sound image localization device that receives the sound data distributed from the sound data distribution device and, using the received sound data, localizes a sound image so that, as perceived by a user who is located at a certain point and faces in a certain direction, the sound of the type associated in advance with the sounding point is output from the position of the sounding point.
[0010]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0011]
<Schematic configuration of sound data distribution system>
First, the schematic configuration of the sound data distribution system according to the present embodiment will be described with reference to FIG. 1. In this figure, the satellite group 400 consists of artificial satellites of a GNSS (Global Navigation Satellite System) such as GPS (Global Positioning System); the satellites are controlled by a ground control station (not shown) and transmit satellite signals toward the ground. Each satellite signal includes information such as the time at which the signal was transmitted and the orbital position of the transmitting satellite.
[0012]
On the other hand, the mobile communication network 500 includes various devices for providing a data communication service, such as a base station control device, and is connected to many base stations 510. The sound data distribution server 300 stream-distributes sound data to the terminal 100 via the mobile communication network 500 and the base station 510. Here, the sound data is information representing a sound output from a virtual sound generation point defined by three-dimensional coordinates.
[0013]
The terminal 100 is a portable wireless communication terminal. Via one of the base stations 510, it transmits terminal position information indicating its own position to the sound data distribution server 300 and receives sound data and the like. Upon receiving the sound data, the terminal 100 generates from it, as described later, an audio signal in which the sound image of the sound assumed to be emitted from the position of the sounding point corresponding to the sound data is localized.
[0014]
The terminal 100 can be connected to stereo headphones 200 and emits, through the headphones 200, the audio signal generated from the sound data as sound. The headphones 200 have a direction sensor 210 that detects the direction in which the user's face is turned when the headphones 200 are worn on the user's head, and, while the audio signal is being input, transmit a signal indicating the detected direction to the terminal 100. The terminal 100 also includes a receiver for receiving the satellite signals transmitted from the satellite group 400.
Although the figure illustrates two sets of the terminal 100 and the headphones 200, one used by the user U1 and one used by the user U2, the number of sets may be one, or three or more.
[0015]
<Structure of sound data distribution server>
FIG. 2 is a block diagram showing a configuration of the sound data distribution server 300. In this figure, a control unit 310 controls each unit via a bus B3. Further, the control unit 310 executes a process for selecting a sound generation point, a process for converting the data amount of sound data, and the like as described later.
[0016]
The communication unit 320 receives information such as terminal position information transmitted from the terminal 100 via the mobile communication network 500. The communication unit 320 transmits the sound data corresponding to the four sounding points selected by the control unit 310 to the terminal 100 in parallel, as described later. The storage unit 330 is composed of a magnetic disk or the like, and stores various information.
[0017]
FIG. 3 is a diagram showing part of the information stored in the storage unit 330. As shown in this figure, the storage unit 330 stores a plurality of sounding point IDs together with the sounding position information and sound data associated with each sounding point ID. A sounding point ID identifies a sounding point. In the present embodiment, each sounding point is defined at a position corresponding to a structure in the city (for example, a store or a building), and names indicating the structure, such as “mini store A” or “building A”, are registered as sounding point IDs. The location of a sounding point is not limited to a position corresponding to a structure and can be set arbitrarily. The sounding position information consists of, for example, latitude, longitude, and altitude, and defines the position of the sounding point in three-dimensional coordinates.
[0018]
The sound data represents the sound assumed to be emitted from a sounding point and is sampled at a predetermined frequency (for example, 44.1 kHz). Any data representing sound, such as a song, a musical tone, or a voice, may be used as sound data. In the present embodiment, the sound data includes “shopping march (song)”, “taiko sound (musical tone)”, “dog cry (voice)”, “chime (electronic sound)”, and the like. From the sound data stored in the storage unit 330, the sound data distribution server 300 distributes four items to the terminal 100. The terminal 100 processes and mixes the four distributed items of sound data and then emits the result through the headphones 200.
[0019]
The storage unit 330 also stores a data amount conversion table used to convert the data amount of sound data. FIG. 4 is a diagram showing the configuration of this table. As shown in the figure, the data amount conversion table TBL associates the distance D between the terminal 100 and a sounding point with the sampling frequency to which the sound data corresponding to that sounding point is to be converted. In FIG. 4, for example, a distance D of “0” or more and less than “L1 (> 0)” is associated with the sampling frequency “f1 (= 44.1 kHz)”, and a distance D of “L1” or more and less than “L2 (> L1)” is associated with the sampling frequency “f2 (= 22 kHz)”. In the data amount conversion process described later, the control unit 310 converts the data amount of the sound data according to the data amount conversion table TBL. “L1”, “L2”, “L3”, and “L4” in FIG. 4 satisfy the relationship “0” < “L1” < “L2” < “L3” < “L4”.
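A lookup against a table of this shape can be sketched as follows. Only f1 = 44.1 kHz and f2 = 22 kHz are stated in the text; the distance boundaries L1 to L4, the remaining frequencies, and the existence of a fifth band beyond L4 are placeholder assumptions, as is the function name `sampling_rate_for`.

```python
import bisect

# Upper bounds L1..L4 of the distance bands, in metres (assumed values).
BOUNDS = [100.0, 200.0, 300.0, 400.0]
# Sampling frequency in Hz for each band; only the first two are from the text.
RATES = [44100, 22000, 11000, 5500, 2750]


def sampling_rate_for(distance: float) -> int:
    """Return the target sampling frequency for a terminal-to-point distance D.

    Band i covers BOUNDS[i-1] <= D < BOUNDS[i], matching the
    "L1 or more and less than L2" convention of the table.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return RATES[bisect.bisect_right(BOUNDS, distance)]
```

Lowering the sampling frequency for distant sounding points reduces the amount of data streamed for sounds that will be played quietly anyway.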
[0020]
<Configuration of terminal>
FIG. 5 is a block diagram showing the configuration of the terminal 100. In this figure, a control unit 110 controls each unit via a bus B1. The satellite radio wave receiving unit 145 receives satellite signals in parallel from a plurality of satellites in the satellite group 400 and inputs each received signal to the positioning unit 140. The positioning unit 140 generates terminal position information indicating the position of the terminal 100 using information such as the transmission time and orbital position included in each satellite signal input from the satellite radio wave receiving unit 145. In doing so, the positioning unit 140 measures the distance (pseudorange) from the terminal 100 to the satellite that transmitted each satellite signal and substitutes the measured distances into a positioning equation to generate terminal position information in three-dimensional coordinates.
The terminal 100 is carried by the user during use. The position of the terminal 100 measured by the positioning unit 140 can therefore be regarded as equal to the position (center position) of the user.
[0021]
The instruction input unit 120 is configured with operation buttons and the like, and inputs a generation start signal or the like for instructing the start of audio signal generation to the control unit 110. Here, the generation of the audio signal is a process of generating, from the sound data, an audio signal in which a sound image of a sound output from a sound generation point is localized. The control unit 110 controls the entire terminal 100 according to an instruction from the user given via the instruction input unit 120.
[0022]
The direction detection unit 150 detects the direction of the face of the user wearing the headphones 200 by using the direction sensor 210 provided in the headphones 200, and supplies it to the bus B1 as direction information. The direction sensor 210 provided in the headphones 200 may detect geomagnetism, may use a gyroscope, or may use the following method: a plurality of positioning units are provided in the headphones 200, and the direction in which the user's face is turned is detected from the relative change in the positions detected by the individual positioning units.
[0023]
Under the control of the control unit 110, the wireless communication unit 130 establishes a wireless link with the base station 510 serving the area in which the terminal 100 is located, transmits the terminal position information to the sound data distribution server 300 via the wireless link, and receives the four items of sound data from the sound data distribution server 300 in parallel.
[0024]
The audio signal generation unit 160 generates a two-channel audio signal from each of the four items of sound data input in parallel from the wireless communication unit 130, and supplies the generated audio signals to the bus B1. In doing so, it generates an L channel signal for the left ear and an R channel signal for the right ear separately and independently, and supplies each of them to the bus B1. The L channel signal and the R channel signal supplied to the bus B1 are output as sound from the headphones 200 via the audio signal output unit 190.
[0025]
Next, the detailed configuration of the audio signal generation unit 160 will be described with reference to FIG. 6. As shown in the figure, the audio signal generation unit 160 includes four processing units 170-1, 170-2, 170-3, and 170-4, equal in number to the items of sound data that the terminal 100 receives in parallel from the sound data distribution server 300. Each of these processing units processes one of the four items of sound data to generate an audio signal in which the sound image of the sound assumed to be emitted from the corresponding sounding point is localized. Which processing unit handles which sound data is determined by the distance between each sounding point and the terminal 100: sound data is assigned to the processing units 170-1, 170-2, 170-3, and 170-4, in that order, in order of increasing distance between the corresponding sounding point and the terminal 100. In the following description, when the processing units need not be distinguished from one another, they are referred to simply as the processing unit 170.
[0026]
Before the processing unit 170 is described in detail, the mechanism by which a listener who actually hears a sound output from a certain point (sound source) perceives the direction of and distance to the sound source, that is, the mechanism of sound image localization, will be explained. For example, when the sound source is located to the listener's right, the distance from the listener's right ear to the sound source is shorter than the distance from the left ear. The sound output from the source at a given moment therefore takes less time to reach the right ear than to reach the left ear. The listener perceives the direction of the sound source from this time difference between the left and right ears. Now suppose that there are two sound sources, one near the listener and one far away. If each outputs sound at the same volume (sound pressure), the sound from the nearer source is louder at the listener's position than the sound from the farther source. The listener perceives the distance to the sound source from this difference in volume.
[0027]
Accordingly, each processing unit 170 of the present embodiment generates, for its sounding point, an audio signal in which the delay time and volume arising at the left and right ears are determined according to the position of the sounding point and the positions of the left and right ears of the user (listener), so that the user perceives the sounding point as if it actually existed. The following focuses on one processing unit 170 and describes the generation of the audio signal for one sounding point.
[0028]
As shown in FIG. 7, given the user's center position $P(X_P, Y_P, Z_P)$, the direction A of the user's face indicated by the direction information, and the distance $e$ between the left ear and the right ear, the left ear position $L(X_L, Y_L, Z_L)$ is specified as the position displaced from the center position $P(X_P, Y_P, Z_P)$ by a distance of $e/2$ to the left, horizontally and perpendicular to the direction A, and the right ear position $R(X_R, Y_R, Z_R)$ is specified as the position displaced from the center position $P(X_P, Y_P, Z_P)$ by a distance of $e/2$ to the right, horizontally and perpendicular to the direction A. Here, it is assumed that the sounding point $S(X_S, Y_S, Z_S)$ and the user (center position P) are sufficiently far apart that the sound arrives at the user's ears as a plane wave. It is also assumed that the sounding point S is located to the front right as viewed from the user, and that the angle between the direction A in which the user's face is turned and the direction of the sounding point S as viewed from the user is $\theta$. When sound is output from the sounding point S, the difference $\Delta t$ (delay time) between the times at which the sound reaches the right ear and the left ear is expressed, using the path length difference $d$ and the speed of sound $c$, as
[0029]
(Equation 1)
\[ \Delta t = \frac{d}{c} \tag{1} \]
[0030]
Here, since $d = e \sin\theta$ holds, the delay time $\Delta t$ becomes
[0031]
(Equation 2)
\[ \Delta t = \frac{e \sin\theta}{c} \tag{2} \]
[0032]
Further, let $D_L$ and $D_R$ be the distances from the sounding point S to the user's left and right ears, respectively, let $t$ be the time, and let $f$ be the function describing the spherical wave. The sound pressure $P_L$ generated at the left ear and the sound pressure $P_R$ generated at the right ear can then be expressed as
[0033]
(Equation 3)
\[ P_L = \frac{1}{D_L}\, f\!\left(t - \frac{D_L}{c}\right) \tag{3} \]
[0034]
(Equation 4)
\[ P_R = \frac{1}{D_R}\, f\!\left(t - \frac{D_R}{c}\right) \tag{4} \]
[0035]
The processing unit 170 generates, from the sound data, an audio signal that expresses the delay time $\Delta t$ of Equation (2), the sound pressure $P_L$ of Equation (3), and the sound pressure $P_R$ of Equation (4). As a result, the user perceives the sound image of the virtual sounding point as if the sound were being emitted from the position indicated by the sounding position information.
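The geometry above can be sketched in the horizontal plane. The sketch assumes a face direction given as an angle measured counterclockwise from the +x axis, an ear distance of 0.16 m, a speed of sound of 340 m/s, and an amplitude that falls off as 1/D (standard spherical spreading); none of these values, nor the function name `localization_params`, comes from the patent.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def localization_params(user_pos: Vec, face_deg: float, src_pos: Vec,
                        ear_dist: float = 0.16,
                        c: float = 340.0) -> Tuple[float, float, float]:
    """Return (delta_t, gain_left, gain_right) for one sounding point.

    delta_t > 0 means the sound reaches the right ear first, so the
    left channel is the delayed one; delta_t < 0 means the opposite.
    """
    a = math.radians(face_deg)
    fx, fy = math.cos(a), math.sin(a)      # unit vector along direction A
    rx, ry = fy, -fx                       # unit vector toward the user's right
    h = ear_dist / 2.0
    left = (user_pos[0] - rx * h, user_pos[1] - ry * h)
    right = (user_pos[0] + rx * h, user_pos[1] + ry * h)
    sx, sy = src_pos[0] - user_pos[0], src_pos[1] - user_pos[1]
    # theta: angle from direction A to the source, positive to the right.
    theta = math.atan2(sx * rx + sy * ry, sx * fx + sy * fy)
    delta_t = ear_dist * math.sin(theta) / c   # delay time, e * sin(theta) / c
    d_left = math.dist(left, src_pos)
    d_right = math.dist(right, src_pos)
    # Assumed 1/D amplitude law; fails if an ear coincides with the source.
    return delta_t, 1.0 / d_left, 1.0 / d_right
```

For a source directly to the right, the delay is at its maximum of `ear_dist / c` (about 0.47 ms with the assumed values) and the right-ear gain is slightly larger than the left-ear gain.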
[0036]
Returning to FIG. 6, each processing unit 170 includes a parameter generation unit 172, a delay unit 176, and an amplifier 178. The parameter generation unit 172 in turn includes a delay parameter generation unit 173 and an amplifier parameter generation unit 174. The delay parameter generation unit 173 generates a parameter that defines the delay time $\Delta t$ between the L channel signal and the R channel signal. More specifically, the delay parameter generation unit 173 takes as input the sounding position information received from the sound data distribution server 300, the direction information detected by the direction detection unit 150, the terminal position information generated by the positioning unit 140, and the distance $e$ between the ears, generates from Equation (2) a parameter DP that defines the delay time $\Delta t$ between the left and right ears, and sends the parameter DP to the delay unit 176.
[0037]
Meanwhile, the amplifier parameter generation unit 174 generates parameters representing the sound pressure at which the L channel signal and the R channel signal are each emitted. More specifically, the amplifier parameter generation unit 174 takes as input the sounding position information received from the sound data distribution server 300, the direction information detected by the direction detection unit 150, the terminal position information generated by the positioning unit 140, and the distance $e$ between the ears, generates from Equations (3) and (4) a parameter AL that determines the sound pressure $P_L$ generated at the left ear and a parameter AR that determines the sound pressure $P_R$ generated at the right ear, and sends the parameters AL and AR to the amplifier 178.
The distance $e$ between the user's left and right ears that is input to the parameter generation unit 172 may be stored in advance in a ROM (Read Only Memory) or the like included in the control unit 110 and read from it, or may be entered by the user via the instruction input unit 120. The method of specifying the positions of the left and right ears and Equations (2), (3), and (4) for the delay time $\Delta t$ and the sound pressures $P_L$ and $P_R$ described above are merely examples; various modifications and refinements are possible, such as incorporating head-related transfer functions, qualitative changes in the sound caused by changes in the frequency spectrum, and the ratio of direct sound to reverberant sound.
[0038]
The delay unit 176 generates an L channel signal SL1 for the left ear and an R channel signal SR1 for the right ear from the sound data input via the wireless communication unit 130, and sends each of them to the amplifier 178. More specifically, in accordance with the delay parameter DP received from the delay parameter generation unit 173, the delay unit 176 generates the two signals so that a delay arises between the L channel signal SL1 and the R channel signal SR1. The L channel signal SL1 and the R channel signal SR1 for one item of sound data are thus generated as if the arrival times from the sounding point differed according to the positions of the user's left and right ears, that is, as if, from the user's point of view, the sound were output from a sounding point located in a particular direction.
[0039]
The amplifier 178 amplifies the L channel signal SL1 received from the delay unit 176 according to the parameter AL received from the amplifier parameter generation unit 174, amplifies the R channel signal SR1 received from the delay unit 176 according to the parameter AR received from the amplifier parameter generation unit 174, and sends the results to the mixing unit 180 as an L channel signal SL2 and an R channel signal SR2, respectively. The L channel signal SL2 and the R channel signal SR2 are thus generated with sound pressure levels that differ according to the distances between the positions of the user's left and right ears and the sounding point. This sound pressure adjustment by the amplifier 178 is performed in each of the processing units 170-1, 170-2, 170-3, and 170-4. The audio signals generated by the processing units 170-1, 170-2, 170-3, and 170-4 can therefore give the user the sensation that each sounding point is at a different distance.
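The delay and amplification stages for one sounding point can be sketched on a list of samples. The sketch rounds the delay to a whole number of samples and treats a positive delay as "left channel delayed"; both conventions, the 44.1 kHz default rate, and the function name `render_stereo` are assumptions.

```python
from typing import List, Sequence, Tuple


def render_stereo(samples: Sequence[float], delta_t: float,
                  gain_l: float, gain_r: float,
                  rate: int = 44100) -> Tuple[List[float], List[float]]:
    """Apply an inter-channel delay and per-channel gains to a mono signal.

    delta_t > 0: the source is on the right, so the left channel is delayed.
    delta_t < 0: the source is on the left, so the right channel is delayed.
    Both outputs are padded with zeros to the same length.
    """
    n = round(abs(delta_t) * rate)          # delay in whole samples
    delayed = [0.0] * n + list(samples)     # later-arriving channel
    direct = list(samples) + [0.0] * n      # earlier-arriving channel
    if delta_t >= 0:
        left, right = delayed, direct
    else:
        left, right = direct, delayed
    return [gain_l * x for x in left], [gain_r * x for x in right]
```

At 44.1 kHz one sample corresponds to about 23 microseconds, so whole-sample delays give only coarse angular resolution; a fractional-delay filter would be a natural refinement.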
[0040]
The mixing unit 180 mixes the four L channel signals SL2 sent from the four processing units 170 and sends the result to the audio signal output unit 190 as an L channel signal SL3; likewise, it mixes the four R channel signals SR2 and sends the result to the audio signal output unit 190 as an R channel signal SR3. In doing so, the mixing unit 180 preferably limits the signal levels of the mixed L channel signal SL3 and R channel signal SR3 so as not to harm the user's hearing. The L channel signal SL3 and the R channel signal SR3 sent from the mixing unit 180 are D/A (digital-to-analog) converted by the audio signal output unit 190, output to the headphones 200, and emitted as sound through the left-ear sound emitting unit 220 and the right-ear sound emitting unit 230.
For the avoidance of doubt, the processing by the audio signal generation unit 160 is executed in parallel with other processing such as reception of sound data by the wireless communication unit 130, generation of terminal position information by the positioning unit 140, and generation of direction information by the direction detection unit 150, and the audio signal is generated from the sound data in stream form. When the user moves, the terminal position information, direction information, and so on are therefore updated accordingly, and the audio signal is generated so that the sound image of the sound output from each sounding point is localized wherever the user moves and whichever way the user faces.
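The mixing stage with level limiting can be sketched for one channel as follows. The text only says that the level is preferably limited; hard clipping to a fixed ceiling is an assumed, simplest-possible choice, and the function name `mix` is hypothetical.

```python
from typing import List, Sequence


def mix(channels: Sequence[Sequence[float]], limit: float = 1.0) -> List[float]:
    """Sum several same-side channel signals sample by sample and clip.

    Shorter inputs are treated as silent once they run out of samples.
    `channels` must contain at least one signal.
    """
    length = max(len(ch) for ch in channels)
    out = []
    for i in range(length):
        total = sum(ch[i] for ch in channels if i < len(ch))
        # Assumed limiter: hard clip to the range [-limit, +limit].
        out.append(max(-limit, min(limit, total)))
    return out
```

The same function would be called once with the four left-ear signals and once with the four right-ear signals. Hard clipping distorts loud passages, so a real implementation might instead scale the sum or use a soft limiter.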
[0041]
<Operation of sound data distribution system>
Next, the operation of the sound data distribution system will be described with reference to FIG. In this operation, sound data is distributed from the sound data distribution server 300 to the terminal 100, and the terminal 100 generates an audio signal from the distributed sound data while updating the terminal position information and the azimuth information. The operation is started when a generation start signal is input from the instruction input unit 120 of the terminal 100, and is thereafter executed by timer interrupts in the terminal 100. Processes executed in any general mobile communication system, such as connection authentication and terminal authentication between the sound data distribution server 300 and the terminal 100, are not directly related to the present invention, and their description is omitted.
[0042]
First, in step SA1, the control unit 110 of the terminal 100 acquires the satellite signal transmitted from the satellite group 400 via the satellite radio wave receiving unit 145. Next, in step SA2, the control unit 110 of the terminal 100 uses the positioning unit 140 to generate terminal position information SP indicating the position of the terminal 100 according to the acquired satellite signal. Next, in step SA3, the control unit 110 of the terminal 100 transmits the generated terminal position information SP to the base station 510.
Upon receiving the terminal position information SP from the terminal 100, the base station 510 transfers it to the sound data distribution server 300 in step SA4.
[0043]
Upon receiving the terminal position information SP transferred from the base station 510, the control unit 310 of the sound data distribution server 300 executes a sounding point selection process in step SA5. This sounding point selection process selects sounding points, up to a predetermined number, according to the position of the terminal 100 indicated by the received terminal position information SP and the position of each sounding point indicated by the sounding position information. Here, the sounding point selection process executed by the control unit 310 of the sound data distribution server 300 will be described with reference to FIG.
[0044]
First, in step SA51, the control unit 310 initializes the number of selections n, which indicates how many sounding points have been selected, to “0”. Next, in step SA52, the control unit 310 selects, from among the sounding points not yet selected, the one closest to the terminal 100. To do so, the control unit 310 uses the received terminal position information SP of the terminal 100 and the sounding position information of each sounding point stored in the storage unit 330. For example, assume that eight sounding points S1, S2,..., S8 are arranged around the terminal 100 as shown in FIG. These sounding points S1, S2,..., S8 are assumed to be located progressively farther from the terminal 100 in this order. In this case, if none of the sounding points S1, S2,..., S8 has been selected (the number of selections n = 0), the control unit 310 selects the sounding point S1 in step SA52.
[0045]
Next, in Step SA53, the control section 310 increments the selection number n by “1”. Next, in step SA54, control unit 310 determines whether or not the number of selections n has reached a predetermined number (four in the present embodiment). If the determination result is negative, the control unit 310 returns the processing procedure to step SA52, and repeats the processing from step SA52 to step SA54 until the selected number n reaches the predetermined number.
[0046]
On the other hand, if the determination result in step SA54 is affirmative, the control unit 310 ends the sounding point selection process because a predetermined number of sounding points has been selected. For example, in FIG. 11, the control unit 310 selects four sounding points S1, S2, S3, and S4 indicated by black circles near the terminal 100 among the eight sounding points S1, S2,..., S8. In the present embodiment, four sounding points are selected by the control unit 310, but the number of sounding points to be selected can be set arbitrarily.
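The loop of steps SA51 to SA54 can be sketched as follows. The function name, the dictionary layout, and the guard for the case where fewer sounding points exist than the requested number are assumptions added for illustration; the text itself does not describe that case.

```python
import math

def select_sounding_points(terminal_pos, sounding_points, count=4):
    """Select the `count` sounding points closest to the terminal.

    sounding_points: dict mapping a sounding point ID to its (x, y, z) position.
    """
    selected = []                       # SA51: number of selections n = 0
    remaining = dict(sounding_points)   # sounding points not yet selected
    while len(selected) < count and remaining:   # SA54: has n reached the target?
        nearest = min(                  # SA52: closest unselected sounding point
            remaining,
            key=lambda pid: math.dist(terminal_pos, remaining[pid]),
        )
        selected.append(nearest)        # SA53: n = n + 1
        del remaining[nearest]
    return selected
```

For eight sounding points S1 to S8 placed at increasing distances from the terminal, the function returns S1, S2, S3, and S4, as in the example of FIG. 11.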
[0047]
Returning to FIG. 8, when the sounding point selection process (step SA5) is completed, the control unit 310 of the sound data distribution server 300 next executes a data amount conversion process in step SA6. This data amount conversion process converts the data amount of the sound data corresponding to each selected sounding point, that is, the data amount of the sound data delivered to the terminal 100. Here, the data amount conversion process executed by the control unit 310 of the sound data distribution server 300 will be described with reference to FIG. In this description, it is assumed that the sampling frequency of the sound data recorded in advance in the storage unit 330 of the sound data distribution server 300 is 44.1 kHz or more.
[0048]
First, in step SA61, the control unit 310 obtains the distance D between the terminal 100 and each of the sounding points selected by the sounding point selection process, using the sounding position information and the terminal position information SP. Next, in step SA62, the control unit 310 refers to the data amount conversion table TBL shown in FIG. 4 and specifies, according to the distance D from each sounding point to the terminal 100, the sampling frequency to which the sound data of that sounding point is to be converted. For example, in FIG. 11, suppose the distance D between the terminal 100 and the sounding point S1 is “0” or more and less than “L1”, the distance D between the terminal 100 and the sounding point S2 is “L1” or more and less than “L2”, the distance D between the terminal 100 and the sounding point S3 is “L2” or more and less than “L3”, and the distance D between the terminal 100 and the sounding point S4 is “L3” or more and less than “L4”. In this case, the control unit 310 refers to the data amount conversion table TBL and specifies the sampling frequency for the sound data of the sounding point S1 as f1 (44.1 kHz), that of the sounding point S2 as f2 (22 kHz), that of the sounding point S3 as f3 (10 kHz), and that of the sounding point S4 as f4 (5 kHz).
[0049]
Next, in step SA63, the control unit 310 generates sound data of the sampling frequency specified in step SA62 from the sound data of each sounding point recorded in advance in the storage unit 330. As a result, the farther a sounding point is from the terminal 100, the lower the sampling frequency of the generated sound data. The total amount of sound data distributed from the sound data distribution server 300 is thereby reduced, which in turn reduces the network traffic in the mobile communication network 500 caused by sound data distribution and the load on the sound data distribution server 300 caused by sound data transmission.
In general, lowering the sampling frequency of an audio signal degrades the sound quality when the signal is emitted. In the present embodiment, however, the sound data delivered to the terminal 100 is processed by the processing units 170 of the terminal 100 so that the sound data of a sounding point farther from the terminal 100 is reproduced at a lower volume. For this reason, even if the sampling frequency of the sound data of a sounding point far from the terminal 100 is lowered, the sound quality of the audio signal generated in the terminal 100 is hardly affected. In other words, the data amount conversion process reduces the data amount of the sound data without unduly degrading the sound quality, and thereby reduces both the network traffic caused by sound data distribution and the load on the sound data distribution server 300.
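Steps SA62 and SA63 can be sketched as a table lookup followed by resampling. The threshold values standing in for L1 to L4, the handling of distances beyond L4, and the naive linear-interpolation resampler (which has no anti-aliasing filter) are all assumptions for illustration; the text gives only the symbols and the example frequencies.

```python
import bisect

# Hypothetical distance thresholds L1..L4 in metres; the text gives only the symbols.
THRESHOLDS = [50.0, 100.0, 200.0, 400.0]
# Example sampling frequencies f1..f4 in Hz (f3 is read as 10 kHz, an assumption).
FREQUENCIES = [44100, 22000, 10000, 5000]

def sampling_frequency(distance):
    """SA62: look up the target sampling frequency for a distance D."""
    index = bisect.bisect_right(THRESHOLDS, distance)
    # Distances of L4 or more are not described; reuse the lowest rate (assumption).
    return FREQUENCIES[min(index, len(FREQUENCIES) - 1)]

def resample(samples, src_rate, dst_rate):
    """SA63: naive linear-interpolation resampling, without an anti-aliasing filter."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1.0 - frac) + nxt * frac)
    return out
```

Because the number of output samples scales with the target rate, sound data for a distant sounding point converted from 44.1 kHz to 5 kHz would shrink to roughly one ninth of its original size under these assumptions.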
[0050]
Returning to FIG. 8, when the data amount conversion process (step SA6) is completed, the control unit 310 of the sound data distribution server 300 next transmits, in step SA7, the four sound data SD1, SD2, SD3, and SD4 whose data amounts have been converted to the base station 510 in parallel. At this time, the control unit 310 adds the sounding position information recorded in the storage unit 330 to each of the sound data SD1, SD2, SD3, and SD4, and then transmits them in a stream format. For example, in FIG. 3, if the sound data to be transmitted to the base station 510 is “shopping march” corresponding to the sounding point ID “mini store A”, the control unit 310 adds the sounding position information (x1, y1, z1) to “shopping march” and then transmits it.
When receiving the sound data SD1, SD2, SD3, and SD4 transmitted from the sound data distribution server 300, the base station 510 transfers the sound data SD1, SD2, SD3, and SD4 to the terminal 100 in step SA8.
[0051]
Meanwhile, after transmitting the terminal position information SP to the base station 510 in step SA3, the control unit 110 of the terminal 100 causes the azimuth detection unit 150 to generate azimuth information indicating the direction of the user's face in step SA9. Next, in step SA10, the control unit 110 of the terminal 100 generates an audio signal from the sound data SD1, SD2, SD3, and SD4 received from the base station 510. At this time, the control unit 110 of the terminal 100 causes the audio signal generation unit 160 to generate the audio signal in a stream format according to the terminal position information SP, the azimuth information, and the sounding position information. Next, in step SA11, the control unit 110 of the terminal 100 outputs the audio signal from the audio signal output unit 190. The audio signal output from the terminal 100 is emitted as sound via the headphones 200.
[0052]
For example, as shown in FIG. 12, assume that the positions of the sounding points S1, S2, S3, and S4 are set, that the position of the terminal 100 (the user) is given by the terminal position information SP, and that the direction A in which the user's face is turned is given by the azimuth information. Assume also that, for both the left and right ears, the distance from the user to the sounding points S1, S2, S3, and S4 increases in this order. In this case, among the sounds output from the headphones 200, the sound pressure (volume) of the sound data corresponding to the sounding point S1 is the largest and that of the sound data corresponding to the sounding point S4 is the smallest. The user thereby perceives the sounding point S1 as the closest and the sounding point S4 as the farthest. Since the distance from the sounding point S1 to the user's right ear is shorter than the distance to the left ear, the L-channel signal for the sound data of the sounding point S1 is delayed relative to the R-channel signal. The user thereby perceives the sounding point S1 as located on the right side. Similarly, the user perceives the sounding point S4 as located on the left side, based on the delay amount (delay time Δt) between the L-channel signal and the R-channel signal.
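The relationship between the direction A, the ear distance e, and the delay time Δt can be illustrated with a small geometric sketch. It simplifies the three-dimensional coordinates of the text to a horizontal plane, measures the facing angle counterclockwise from the x axis, and takes the speed of sound as about 343 m/s; these conventions and all function names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at about 20 degrees C

def ear_positions(user_xy, facing_rad, ear_distance):
    """Place the ears at +/- e/2 along the axis perpendicular to the facing direction."""
    ux, uy = user_xy
    # Unit vector pointing to the user's right (facing vector rotated clockwise by 90 degrees).
    rx, ry = math.sin(facing_rad), -math.cos(facing_rad)
    half = ear_distance / 2.0
    left = (ux - rx * half, uy - ry * half)
    right = (ux + rx * half, uy + ry * half)
    return left, right

def interaural_delay(user_xy, facing_rad, ear_distance, source_xy):
    """Return delta t in seconds; a positive value means the L channel lags the R channel."""
    left, right = ear_positions(user_xy, facing_rad, ear_distance)
    d_left = math.dist(left, source_xy)
    d_right = math.dist(right, source_xy)
    return (d_left - d_right) / SPEED_OF_SOUND
```

Under these conventions, a sounding point to the user's right gives a positive Δt (the L-channel signal is delayed), a sounding point to the left gives a negative Δt, and turning the face changes only facing_rad, which is why updating the direction A is enough to update the delay.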
[0053]
Next, as shown in FIG. 13, assume that the user has turned to face the sounding point S4. At this time, since the direction A in which the user's face is turned is updated by the azimuth detection unit 150, the delay amount (delay time Δt) between the L-channel signal and the R-channel signal is updated for each sound data. The user thereby perceives the sounding point S2 as located on the right side and the sounding point S3 as located on the left side.
[0054]
Next, as shown in FIG. 14, assume that the user approaches the sounding point S4 and moves to a position where the distance from the user to the sounding points increases in the order of the sounding point S4, the sounding point S2, the sounding point S3, and the sounding point S1. When the user moves in this way, the sound pressure (volume) of the sound data of the sounding point S1 decreases because the user has moved away from the sounding point S1, and the sound pressure of the sound data of the sounding point S4 increases because the user has approached the sounding point S4. The user thereby perceives having moved away from the sounding point S1 and closer to the sounding point S4.
[0055]
As described above, according to the present embodiment, an audio signal is generated, according to the position of the user, the direction in which the user faces, and the position of each sounding point, as if sound were being emitted from the position indicated by the sounding position information. The user can thereby obtain the feeling that each sounding point actually exists at the specified position. For example, when each part of an orchestra is arranged as a sounding point in a certain area and the user moves within that area, the user gets the feeling of moving through the space in which the parts are arranged. In this way the user can take part in the generation of the audio data, and a varied and highly entertaining sound amusement can be provided.
[0056]
Further, since the sound image of each sounding point is localized in the present embodiment, when it is applied to a voice guidance system that indicates the position of the user's destination by voice and the destination is on the right side, a voice such as “To the right: turn right just before the gas station” is heard as if it were output from the right side. Compared with a conventional voice guidance system that does not take the sound image into account, this conveys direction to the user more intuitively and improves the efficiency of voice instructions.
Furthermore, the sound data distribution system can be used as a guide for the visually impaired. For example, sounding points may be arranged at a ticket vending machine, a station staff room, a ticket gate, and the like in a station, each outputting a sound that represents the facility. With such a configuration, positions can be indicated by sound in much the same way as they would be perceived by sight, so that the user can approach the destination unaided.
[0057]
In addition, a sounding point may be arranged at a store or the like to output a sound advertising the store. Because the sound is heard as if it came from the store itself, the user can easily find a store located somewhere out of sight, such as on the second floor of a building. For the store, an advertising effect can be expected and business is stimulated. Moreover, replacing audible advertising that uses actual sound, such as touting calls, with advertising at virtual sounding points reduces noise in the city.
[0058]
<Modification of First Embodiment>
In the first embodiment described above, an example has been described in which the data amount of the sound data is reduced in the data amount conversion process in order to reduce the network traffic due to the distribution of the sound data and the load on the sound data distribution server. However, when these do not cause a problem, the data amount conversion processing can be omitted.
[0059]
<Second embodiment>
<Sound data distribution system configuration>
In the first embodiment described above, the sound data distribution system that distributes sound data from one sound data distribution server 300 to the terminal 100 has been described. On the other hand, in the second embodiment, a sound data distribution system that distributes sound data from each of a plurality of sound data distribution servers to the terminal 100 will be described.
In the configuration of the sound data distribution system according to the second embodiment, the same reference numerals are given to components common to the system according to the first embodiment.
[0060]
FIG. 15 is a diagram illustrating a schematic configuration of a sound data distribution system according to the second embodiment. As shown in this figure, broadly two types of server devices are connected to the mobile communication network 500: the control server 600 and the sound data distribution servers 610A, 610B, and 610C. Of these, the sound data distribution servers 610A, 610B, and 610C each store sound data to be distributed to the terminal 100, with each server storing sound data corresponding to different sounding points. The control server 600 manages the distribution of sound data from the sound data distribution servers 610A, 610B, and 610C to the terminal 100. More specifically, the control server 600 selects a predetermined number (for example, two) of sounding points according to the position of the terminal 100 and the position of each sounding point. The sound data distribution servers 610A, 610B, and 610C distribute the sound data corresponding to the sounding points selected by the control server 600 to the terminal 100. Each of the control server 600 and the sound data distribution servers 610A, 610B, and 610C is assigned a server ID that identifies it in the mobile communication network 500.
For convenience of explanation, in the second embodiment the sound data distribution servers 610A, 610B, and 610C are directly connected to the mobile communication network 500, but they may instead be connected to the mobile communication network 500 via another communication network such as the Internet. Also, although the second embodiment describes an example with three sound data distribution servers 610A, 610B, and 610C, the number of sound data distribution servers is not limited to three and may be any other number.
[0061]
First, the configuration of the sound data distribution servers 610A, 610B, and 610C will be described. The sound data distribution servers 610A, 610B, and 610C according to the second embodiment have the same configuration as the sound data distribution server 300 (see FIG. 2) according to the first embodiment: each includes a control unit that controls each unit, a communication unit that exchanges data with the mobile communication network 500, and a storage unit that stores various information.
[0062]
FIG. 16 is a diagram showing the main information stored in the storage unit of the sound data distribution server 610A, and FIG. 17 is a diagram showing the main information stored in the storage unit of the sound data distribution server 610B. Although not shown, the storage unit of the sound data distribution server 610C stores the same kind of information as those of the sound data distribution servers 610A and 610B. As shown in these figures, the storage unit of each of the sound data distribution servers 610A and 610B stores sounding point IDs and the sound data associated with each sounding point ID. Unlike the storage unit 330 of the sound data distribution server 300 in the first embodiment, this storage unit does not store sounding position information.
[0063]
Next, the configuration of the control server 600 will be described. The control server 600 has a configuration similar to that of the sound data distribution server 610, and includes a control unit that controls each unit, a communication unit that exchanges data with the mobile communication network 500, and a storage unit that stores various information.
[0064]
FIG. 18 is a diagram illustrating the main information stored in the storage unit of the control server 600. As shown in the figure, the storage unit stores sounding point IDs, together with the sounding position information and the server ID associated with each sounding point ID. The storage unit of the control server 600 differs from the storage unit 330 of the sound data distribution server 300 in the first embodiment in that it stores no sound data and does store server IDs. The sounding point ID identifies a sounding point whose sound data is stored in one of the sound data distribution servers 610A, 610B, and 610C. The server ID indicates which of the sound data distribution servers 610A, 610B, and 610C stores the sound data of the sounding point identified by the sounding point ID. In this figure, the server ID “A” indicates the sound data distribution server 610A, the server ID “B” indicates the sound data distribution server 610B, and the server ID “C” indicates the sound data distribution server 610C. For example, the figure shows that the sounding point with the sounding point ID “mini store A” is located at coordinates (x1, y1, z1) and its sound data is stored in the sound data distribution server 610A, and that the sounding point with the sounding point ID “mini store B” is located at coordinates (x2, y2, z2) and its sound data is stored in the sound data distribution server 610B.
[0065]
<Operation of sound data distribution system>
The operation of the sound data distribution system according to the second embodiment will be described with reference to FIG. In this operation, sound data corresponding to the sounding points selected by the control server 600 is distributed from the sound data distribution servers 610 to the terminal 100, and the terminal 100 generates an audio signal from the distributed sound data. The operation starts when the user inputs, via the instruction input unit 120 of the terminal 100, a generation start signal instructing the start of audio signal generation, and is thereafter executed by timer interrupts in the terminal 100. Processes executed in any general mobile communication system, such as connection authentication and terminal authentication between the control server 600 and the terminal 100, are not directly related to the present invention, and their description is omitted.
[0066]
First, in step SB1, the control unit 110 of the terminal 100 receives a satellite signal transmitted from the satellite group 400 by the satellite radio wave receiving unit 145, and acquires a satellite signal. Next, in step SB2, the control unit 110 of the terminal 100 uses the positioning unit 140 to generate terminal position information SP indicating the three-dimensional position of the terminal 100 according to the acquired satellite signal.
[0067]
Next, in step SB3, the control unit 110 of the terminal 100 transmits the terminal position information SP to the base station 510 together with the server ID of the control server 600, which is the destination of the information. Upon receiving the terminal position information SP from the terminal 100, the base station 510 transfers it to the control server 600 in step SB4.
[0068]
Upon receiving the terminal position information SP, the control unit of the control server 600 executes a sounding point selection process in step SB5 according to the received terminal position information SP and the sounding position information stored in the storage unit. This sounding point selection process is the same as the sounding point selection process (see FIG. 9) in the first embodiment described above: sounding points are selected in order of proximity to the terminal 100 until a predetermined number (two) is reached. In this description of the operation, it is assumed as an example that the control unit of the control server 600 selects the two sounding points “mini store A” and “building B” from the sounding point IDs shown in FIG. 18, and that “Shopping March A” (see FIG. 16) stored in the sound data distribution server 610A and “Taiko B” (see FIG. 17) stored in the sound data distribution server 610B are distributed to the terminal 100. Note that the number of sounding points selected by the control unit of the control server 600 is not limited to two, and can be set arbitrarily.
[0069]
Next, in step SB6, the control unit of the control server 600 transmits to the base station 510 the server ID SID of each sound data distribution server 610 that holds the sound data of a selected sounding point. At this time, the control unit of the control server 600 adds the sounding point ID and the sounding position information stored in the storage unit to the server ID SID before transmitting it. That is, the control unit of the control server 600 adds the sounding point ID “mini store A” and the sounding position information (x1, y1, z1) to the server ID “A” in FIG. and the sounding point ID “Building B” and the sounding position information (x2, y2, z2) to the server ID “B”, and then transmits the server ID “A” and the server ID “B” to the base station 510.
Upon receiving each server ID SID from the control server 600, the base station 510 transfers them to the terminal 100 in step SB7.
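Steps SB5 and SB6 on the control server can be sketched as a lookup in a table shaped like the one in FIG. 18. The coordinates, the third entry “station C”, the record layout, and the function name are hypothetical values chosen only to make the example runnable.

```python
import math

# Hypothetical stand-in for the control server table:
# sounding point ID -> (sounding position, server ID).
CONTROL_TABLE = {
    "mini store A": ((10.0, 0.0, 0.0), "A"),
    "building B": ((0.0, 20.0, 0.0), "B"),
    "station C": ((50.0, 50.0, 0.0), "C"),   # invented entry for illustration
}

def select_servers(terminal_pos, table=CONTROL_TABLE, count=2):
    """Pick the nearest sounding points (SB5) and build the records sent in SB6."""
    nearest = sorted(
        table,
        key=lambda pid: math.dist(terminal_pos, table[pid][0]),
    )[:count]
    return [
        {
            "server_id": table[pid][1],        # which distribution server holds the data
            "sounding_point_id": pid,
            "position": table[pid][0],         # sounding position information
        }
        for pid in nearest
    ]
```

The terminal would then use the server_id of each record to address its distribution request, and the position to localize the sound image once the sound data arrives.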
[0070]
Upon receiving the server IDs SID transferred by the base station 510, the control unit 110 of the terminal 100, in step SB8, transmits to the base station 510 distribution requests DRA and DRB requesting the sound data distribution servers 610A and 610B specified by the server IDs SID to transmit sound data. That is, the control unit 110 of the terminal 100 transmits to the base station 510 a distribution request DRA for “Shopping March A” addressed to the sound data distribution server 610A and a distribution request DRB for “Taiko B” addressed to the sound data distribution server 610B.
[0071]
Upon receiving the distribution requests DRA and DRB from the terminal 100, the base station 510 transfers them to the corresponding sound data distribution servers 610A and 610B in steps SB9 and SB10. That is, in step SB9, the base station 510 transfers the distribution request DRA to the sound data distribution server 610A, and in step SB10, transfers the distribution request DRB to the sound data distribution server 610B.
[0072]
Upon receiving the distribution request DRA from the base station 510, the control unit of the sound data distribution server 610A transmits the sound data SDA (here, “Shopping March A”) indicated by the distribution request DRA to the base station 510 in a stream format in step SB11. Upon receiving the sound data SDA from the sound data distribution server 610A, the base station 510 transfers the sound data SDA to the terminal 100 in step SB12.
[0073]
Meanwhile, upon receiving the distribution request DRB from the base station 510, the control unit of the sound data distribution server 610B transmits the sound data SDB (here, “Taiko B”) indicated by the distribution request DRB to the base station 510 in a stream format in step SB13. Upon receiving the sound data SDB from the sound data distribution server 610B, the base station 510 transfers the sound data SDB to the terminal 100 in step SB14. The processing of steps SB13 and SB14 is executed in parallel with the processing of steps SB11 and SB12 described above.
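The parallel handling of the two distribution requests can be sketched from the terminal's point of view as follows. The function fetch_sound_data is a hypothetical stand-in that returns a placeholder string instead of contacting a server, and returning each result whole is a simplification, since the text describes the sound data as streamed.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_sound_data(server_id, sounding_point_id):
    """Hypothetical stand-in for one distribution request; returns a fake payload."""
    return f"sound data for {sounding_point_id} from server {server_id}"

def fetch_all(requests):
    """Issue all distribution requests concurrently and keep the request order."""
    with ThreadPoolExecutor(max_workers=max(1, len(requests))) as pool:
        futures = [pool.submit(fetch_sound_data, sid, pid) for sid, pid in requests]
        return [future.result() for future in futures]
```

A call such as fetch_all([("A", "mini store A"), ("B", "building B")]) corresponds to the requests DRA and DRB being served side by side rather than one after the other.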
[0074]
Note that, as in the first embodiment, the sound data distribution servers 610A and 610B may execute a data amount conversion process before transmitting the sound data SDA and SDB. That is, each of the sound data distribution servers 610A and 610B may reduce the data amount of the sound data SDA or SDB according to the distance between the terminal 100 and the sounding point, and then transmit the sound data to the base station 510.
[0075]
When the control unit 110 of the terminal 100 receives, in parallel via the base station 510, the sound data SDA (here, “Shopping March A”) transmitted from the sound data distribution server 610A and the sound data SDB (here, “Taiko B”) transmitted from the sound data distribution server 610B, it inputs the sound data to the audio signal generation unit 160 in step SB15 to generate left and right two-channel audio signals. At this time, the sound data of “Shopping March A” and “Taiko B” are each processed by one of the two processing units 170 included in the audio signal generation unit 160. Then, in step SB16, the control unit 110 of the terminal 100 outputs the audio signal from the audio signal output unit 190. The audio signal output from the terminal 100 is emitted as sound via the headphones 200.
[0076]
As described above, in the second embodiment, sound data is distributed from each of the sound data distribution servers 610A, 610B, and 610C to the terminal 100 under the control of the control server 600, and an audio signal in which the sound image of each virtual sounding point is localized is generated according to the position of the terminal 100, the direction A in which the user's face is turned, and the like. Thus, as in the first embodiment described above, the user can move through a space as if the sounding points actually existed there, and an unprecedented acoustic amusement can be provided.
[0077]
In the second embodiment, sound data is distributed from the plurality of sound data distribution servers 610A, 610B, and 610C. This avoids a situation in which sound data distribution to a plurality of terminals 100 concentrates on a single sound data distribution server, and distributes the load among the sound data distribution servers. Furthermore, since the distribution of the sound data is centrally managed by the control server 600, the sound data is easy to manage and new sound data can easily be added. As a result, the types and contents of the sound data distributed to the terminal 100 are enriched.
[0078]
<Modifications of First and Second Embodiments>
In the above-described first and second embodiments, an example has been described in which the terminal position information SP is generated in the terminal 100 by the GNSS, but the present invention is not limited to this. For example, the sound data distribution server 300 or the control server 600 may be configured to generate the terminal position information SP according to the position of the base station 510 with which the terminal 100 establishes a wireless link.
[0079]
Further, in the first and second embodiments described above, the terminal 100 including the wireless communication unit 130 that wirelessly communicates with the base station 510 has been described, but the present invention is not limited to this. For example, a detachable communication module may be attached to a portable terminal that has no wireless communication function, such as a PDA (Personal Digital Assistant), so that the portable terminal exchanges data with the sound data distribution server 300 and the like.
[0080]
In the above-described first and second embodiments, an example has been described in which sound data is distributed in a stream format. However, the terminal 100 may instead be provided with a storage unit for storing sound data, and the audio signal may be generated using sound data cached in this storage unit.
[0081]
In addition, in the first and second embodiments described above, an example has been shown in which the sound data distribution servers 300, 610A, and the like convert the data amount of the sound data to be distributed to the terminal 100 according to the distance between the terminal 100 and the sounding point, but the present invention is not limited to this. For example, the storage unit 330 of the sound data distribution server 300 may store in advance a plurality of sound data that represent the same sound but differ in data amount (for example, a plurality of sound data with different sampling frequencies); one of these is then selected according to the distance between the terminal 100 and the sounding point, and the selected sound data is transmitted to the terminal 100. As in the embodiments described above, this reduces network traffic.
[0082]
<Third embodiment>
In the first and second embodiments described above, the terminal 100 that receives sound data from the sound data distribution servers 300 and 610 via the wireless communication network 500 and generates an audio signal from the received sound data has been described. On the other hand, in the third embodiment, a terminal that generates an audio signal from sound data stored in a storage unit provided in the terminal will be described. In the third embodiment, the equipment for wireless communication such as the sound data distribution server 300 and the base station 510 in each of the above embodiments is not required.
[0083]
FIG. 20 is a diagram illustrating a configuration of a terminal according to the third embodiment. In this figure, the same components as those of the terminal 100 in the first embodiment are denoted by the same reference numerals.
The characteristic feature of the configuration of the terminal 700 in the third embodiment is a sounding information storage unit 720. The sounding information storage unit 720 stores sounding position information on sounding points together with sound data, and supplies this information to the bus B1. The control unit 110 of the terminal 100 in the first and second embodiments described above generates an audio signal from the sound data distributed from the sound data distribution server 300. In contrast, the control unit 710 of the terminal 700 in the third embodiment reads the sound data stored in the sounding information storage unit 720 and generates an audio signal from the read sound data.
[0084]
More specifically, the control unit 710 first executes a sounding point selection process according to the position of the terminal 700 indicated by the terminal position information SP generated by the positioning unit 140 and the positions of the sounding points indicated by the sounding position information stored in the sounding information storage unit 720. This sounding point selection process is the same as the sounding point selection process (see FIG. 9) executed in the sound data distribution server 300 in the first embodiment described above: sounding points are selected in order from the one closest to the terminal 700 until a predetermined number is reached. Next, the control unit 710 reads the sound data corresponding to each of the selected sounding points from the sounding information storage unit 720 in parallel. Then, the control unit 710 causes the audio signal generation unit 160 to generate a two-channel audio signal from each of the read sound data according to the terminal position information SP and the sounding position information.
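The nearest-first selection described above can be sketched as follows. This is a minimal illustration, assuming that positions are coordinate tuples and that sounding points are kept in a dictionary keyed by a hypothetical point ID; neither data layout is specified in the text.

```python
import heapq
import math

def select_sounding_points(terminal_pos, sounding_points, max_count):
    """Pick up to max_count sounding points, nearest to the terminal first.

    terminal_pos: (x, y, z) of the terminal.
    sounding_points: dict mapping a hypothetical point ID to its (x, y, z).
    """
    return heapq.nsmallest(
        max_count,
        sounding_points,  # iterating a dict yields its keys (point IDs)
        key=lambda pid: math.dist(terminal_pos, sounding_points[pid]),
    )
```

For example, with points at distances 10, 1 and 5 from the terminal and `max_count=2`, the two points at distances 1 and 5 are returned in that order.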
[0085]
As described above, in the third embodiment, an audio signal is generated based on the sound data read from the sounding information storage unit 720 included in the terminal 700. Thus, as in the first and second embodiments described above, the user can move through the space in which the virtual sounding points are arranged, and an unprecedented sound amusement can be provided. In addition, since a configuration for distributing sound data is not required, the configuration is simplified.
[0086]
Note that the sounding information storage unit 720 does not necessarily have to be built into the terminal 700. For example, the terminal 700 may be provided with a removable storage device for reading information about sounding points from a recording medium such as an optical disk, or the terminal 700 may be provided with an interface for externally attaching such a removable storage device.
[0087]
Further, in the first, second and third embodiments described above, an example in which the position of the sounding point is fixed has been described, but the position of the sounding point may be changed over time. With such a configuration, for example, a sound assumed to be output from a moving body such as a jet aircraft can be virtually generated. Thus, the sound image output from the headphones 200 becomes more varied and enjoyable.
[0088]
In addition, in each of the above-described embodiments, an example has been described in which each of the sounding position information and the terminal position information SP is defined by a three-dimensional position.
[0089]
<Fourth embodiment>
In each of the embodiments described above, a sound data distribution system that localizes the sound image at a sounding point whose position is fixed has been described. In contrast, the fourth embodiment describes a sound data distribution system in which the position of the sounding point is associated with the position of a terminal carried by a user, and the sound image is localized so that the user hears the sound as if it were output from the position of another, moving user. In the configuration of the sound data distribution system according to this embodiment, components common to the system according to the first embodiment are given the same reference numerals.
[0090]
<Structure of sound data distribution server>
The above-described sound data distribution server 300 of the first embodiment distributes sound data stored in advance in the storage unit 330 to the terminal 100. In contrast, the sound data distribution server according to the present embodiment receives sound data and terminal position information SP from a terminal, and distributes information including these data to another terminal. The sound data uploaded from the terminal to the sound data distribution server is information indicating a sound that is virtually output from the position of that terminal. The terminal position information SP uploaded together with the sound data is information indicating the position of the virtual sound source from which the sound corresponding to the sound data is to be output; in this respect it is common to the sounding position information in the above embodiments.
[0091]
FIG. 21 is a diagram showing a configuration of the sound data distribution server 800. As shown in this figure, the sound data distribution server 800 differs from the above-described sound data distribution server 300 in that it has a movement vector amount calculation unit 810 and in the information stored in the storage unit 330. Of these, the movement vector amount calculation unit 810 calculates, based on the relative positional relationship of one terminal between two time points, a vector amount indicating the displacement of the terminal per unit time, that is, its velocity (hereinafter referred to as the "movement vector amount MV"). For example, as shown in FIG., assume that a terminal that was located at the point P_{T-1}(x_{T-1}, y_{T-1}) is located at the point P_T(x_T, y_T) after the unit time "ut" has elapsed. At this time, the movement vector amount calculation unit 810 calculates the movement vector amount MV by the following equation.
[0092]
(Equation 5)
$$\mathrm{MV} = \left( \frac{x_T - x_{T-1}}{\mathit{ut}},\ \frac{y_T - y_{T-1}}{\mathit{ut}} \right)$$
[0093]
The movement vector amount MV is used in the terminal 900 to predict the movement route of another terminal 900, as described later.
FIG. 23 is a diagram illustrating information stored in the storage unit 330 of the sound data distribution server 800. As shown in this figure, the storage unit 330 stores “terminal ID”, “terminal position information SP”, “movement vector amount MV”, and “sound data” in association with each other. Among them, the terminal ID is information for identifying the terminal 900 included in the sound data distribution system. The storage unit 330 stores the terminal ID corresponding to the terminal 900 connected to the sound data distribution server 800 among the terminal IDs corresponding to all the terminals 900.
[0094]
The terminal position information SP is information indicating the position of the terminal 900 corresponding to the terminal ID. For one terminal ID, it includes position information at two time points: the information for the "T-1 period" and the information for the "T period". Here, the "T-1 period" corresponds to a point in time that precedes the "T period" by the unit time "ut" described above.
[0095]
The terminal position information SP is updated every time the unit time "ut" elapses while the terminal 900 is in the service area of the base station 510. In this embodiment, the terminal position information SP is position information defined in two dimensions (for example, (x1_{T-1}, y1_{T-1}) or (x1_T, y1_T)), but it may be position information defined in three dimensions in the same manner as the sounding position information.
[0096]
The movement vector amount MV is information generated by the movement vector amount calculation unit 810 described above, and indicates the speed of the terminal 900 corresponding to the terminal ID in the period from “T-1 period” to “T period”. The movement vector amount calculation unit 810 calculates the movement vector amount MV using the terminal position information SP of “T-1 period” and “T period”.
[0097]
The sound data is data uploaded from the terminal 900 corresponding to the terminal ID, and is distributed from the sound data distribution server 800 to another terminal 900. In this embodiment, the description assumes that the sound virtually output from the position of the terminal 900 is a sound constituting a piece of music. However, the sound virtually output from the position of the terminal 900 is not limited to this, and any sound such as a musical tone or a voice may be used.
The set of the terminal ID, the terminal position information SP, the movement vector amount MV, and the sound data described above is stored in the storage unit 330 only while the terminal 900 corresponding to the terminal ID is connected to the sound data distribution server 800. When the connection is terminated, the control unit 310 deletes the data from the storage unit 330.
[0098]
<Configuration of terminal>
The terminal 900 uploads, to the sound data distribution server 800, sound data indicating the sound virtually output from the position of the own device and terminal position information SP indicating the position of the own device. It also downloads, from the sound data distribution server 800, the sound data, the terminal position information SP, and the movement vector amount MV relating to another terminal 900. Using the downloaded sound data, terminal position information SP, and movement vector amount MV relating to the other terminal 900, the terminal 900 localizes the sound image as if the music indicated by the sound data were output from the position of the other terminal 900 (user).
[0099]
FIG. 24 is a block diagram showing a configuration of terminal 900. As shown in this figure, the terminal 900 includes a storage unit 910 and a position prediction unit 920 in addition to the components of the terminal 100 (see FIG. 5) in the first embodiment.
The storage unit 910 stores the “sound data” and the “terminal ID” of the own device (terminal 900). This sound data is data indicating a sound assumed to be virtually output from the position of the own device. The user can store the sound data in the storage unit 910 of the terminal 900 by, for example, downloading it via a network.
[0100]
The position prediction unit 920 predicts the route along which another terminal 900 will travel after the "T period", using the terminal position information SP of the "T period" for that terminal 900 and the movement vector amount MV, both distributed from the sound data distribution server 800. For example, as shown in FIG. 22, given the position P_T(x_T, y_T) of the terminal 900 in the "T period" and the movement vector amount MV of the terminal 900 from the "T-1 period" to the "T period", the position prediction unit 920 predicts the position P_C(x_C, y_C) of the terminal 900 in time series by, for example, the following equation.
[0101]
(Equation 6)
$$(x_C, y_C) = (x_T, y_T) + \mathit{et} \cdot \mathrm{MV}$$
[0102]
Here, et indicates the elapsed time from “T period”.
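The linear prediction performed by the position prediction unit 920 might be sketched as follows. The sketch assumes that the movement vector amount MV is a velocity (displacement per unit of time) and that the predicted position advances from the "T period" position by MV multiplied by the elapsed time et; the function name `predict_position` is hypothetical.

```python
def predict_position(p_t, mv, et):
    """Linearly extrapolate a position et time units after the "T period".

    p_t: (x, y) reported for the "T period".
    mv:  movement vector amount MV as (vx, vy) per time unit (assumed a velocity).
    et:  elapsed time since the "T period".
    """
    return (p_t[0] + mv[0] * et, p_t[1] + mv[1] * et)
```

With et = 0 the function returns the reported position unchanged, so the predicted track starts exactly at the last measured point.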
The description returns to FIG. The audio signal generation unit 160 generates, from the sound data, the audio signals SL3 and SR3 in which the sound image is localized, using the terminal position information SP and the direction information A generated by the own device as in the first embodiment. However, instead of the sounding position information indicating the position of a sounding point, it uses the position P_C of the other terminal 900 predicted by the position prediction unit 920. The reason for using the predicted value (position P_C) rather than an actually measured value as the position of the other terminal 900 when localizing the sound image is to prevent the localized sound image from becoming unnatural due to the influence of network traffic and the like, as described later.
[0103]
<Operation of sound data distribution system>
Next, the operation of the sound data distribution system will be described with reference to FIG. This operation is one in which the terminal 900 localizes the sound image as if the music were output from the position of another terminal 900. In the sound data distribution system, each of the plurality of terminals 900 performs, in parallel, the operation of uploading sound data and the operation of downloading sound data from the sound data distribution server 800 and localizing the sound image according to the relative positional relationship between itself and the other terminals 900. In the following, however, for convenience of description, attention is focused on only two of the plurality of terminals 900 included in the sound data distribution system. For one of these terminals (hereinafter referred to as "900U"), the description focuses on the processing for uploading sound data; for the other terminal (hereinafter referred to as "900D"), it focuses on the processing for downloading sound data and localizing a sound image.
[0104]
First, when the terminal 900U enters a service area under the control of any of the base stations 510, the control unit 110 of the terminal 900U, in steps SC1 and SC2, transmits the sound data SD stored in the storage unit 910 and the terminal ID_MI of its own device to the sound data distribution server 800 via the base station 510. When receiving the sound data SD and the terminal ID_MI from the terminal 900U, the control unit 310 of the sound data distribution server 800 causes the storage unit 330 to store the terminal ID_MI and the sound data SD in association with each other as shown in FIG.
[0105]
On the other hand, after transmitting the sound data SD and the terminal ID to the base station 510 (step SC1), the control unit 110 of the terminal 900U receives, in step SC4, a satellite signal transmitted from the satellite group 400 by means of the satellite radio wave receiving section 145. Subsequently, in step SC5, the control unit 110 of the terminal 900U causes the positioning unit 140 to generate, from the received satellite signal, the terminal position information SP1 indicating the position of the own device. Next, in steps SC6 and SC7, the control unit 110 of the terminal 900U transmits the generated terminal position information SP1 and the terminal ID_MI of the own device to the sound data distribution server 800 via the base station 510.
[0106]
When receiving the terminal position information SP1 and the terminal ID_MI from the terminal 900U via the base station 510, the control unit 310 of the sound data distribution server 800, in step SC8, stores the received terminal position information SP1 in the storage unit 330 in association with whichever of the terminal IDs stored in the storage unit 330 is equal to the received terminal ID_MI. For example, assume that, as shown in the upper part of FIG. 26, the terminal ID "MS1", the terminal position information SP (x1_{T-2}, y1_{T-2}), the terminal position information SP (x1_{T-1}, y1_{T-1}), the movement vector amount MV (x1_{V-1}, y1_{V-1}), and the sound data SD "song 1" are stored in association with each other. In this situation, suppose that the sound data distribution server 800 receives, in step SC7, the terminal ID "MS1" and the terminal position information SP1 (x1_T, y1_T) from the base station 510. At this time, as shown in the lower part of the figure, the control unit 310 first rewrites the terminal position information SP (x1_{T-1}, y1_{T-1}) as the terminal position information SP of the "T-1 period", and then stores the received terminal position information SP1 (x1_T, y1_T) in the storage unit 330 as the terminal position information SP of the "T period".
[0107]
Next, in step SC9, the control unit 310 of the sound data distribution server 800 causes the movement vector amount calculation unit 810 to calculate the movement vector amount MV of the terminal 900U, using the terminal position information SP of the "T-1 period" and the terminal position information SP of the "T period" stored in the storage unit 330. The control unit 310 then causes the storage unit 330 to store the calculated movement vector amount MV in association with the terminal ID. If the terminal position information SP of the "T-1 period" does not exist, that is, if the sound data distribution server 800 is executing step SC9 for the first time, the zero vector is stored in the storage unit 330 as the movement vector amount MV.
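Steps SC8 and SC9 can be illustrated with the following sketch. The record layout `TerminalRecord` and the function `update_record` are hypothetical, and the sketch assumes that MV is the displacement between the two stored positions divided by the unit time ut.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec = Tuple[float, float]

@dataclass
class TerminalRecord:
    terminal_id: str
    sp_prev: Optional[Vec] = None  # position in the "T-1 period"
    sp_curr: Optional[Vec] = None  # position in the "T period"
    mv: Vec = (0.0, 0.0)           # movement vector amount MV

def update_record(rec: TerminalRecord, new_sp: Vec, ut: float) -> None:
    """Shift the stored positions (step SC8) and recompute MV (step SC9)."""
    # The old "T period" position becomes the "T-1 period" position.
    rec.sp_prev, rec.sp_curr = rec.sp_curr, new_sp
    if rec.sp_prev is None:
        rec.mv = (0.0, 0.0)  # first report: no "T-1 period" position yet
    else:
        rec.mv = ((new_sp[0] - rec.sp_prev[0]) / ut,
                  (new_sp[1] - rec.sp_prev[1]) / ut)
```

Keeping only two positions per terminal means the stored state does not grow with the number of reports received.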
[0108]
Here, the description shifts to the operation of terminal 900D that downloads sound data SD from sound data distribution server 800.
When the control unit 110 of the terminal 900D receives, from the instruction input unit 120, a generation start signal instructing the start of generation of an audio signal, it receives, in step SC10, a satellite signal transmitted from the satellite group 400 by means of the satellite radio wave receiving section 145. Next, in step SC11, the control unit 110 of the terminal 900D causes the positioning unit 140 to generate, from the received satellite signal, terminal position information SP2 indicating the position of the own device. Next, in steps SC12 and SC13, the control unit 110 of the terminal 900D transmits the terminal ID_MI and the generated terminal position information SP2 to the sound data distribution server 800 via the base station 510.
[0109]
Upon receiving the terminal position information SP2 and the terminal ID_MI transferred from the base station 510, the control unit 310 of the sound data distribution server 800 executes a terminal selection process in step SC15. This terminal selection process is substantially the same as the sounding point selection process (see FIG. 9) in the first embodiment: it selects other terminals 900 located near the terminal 900D according to the relative distance between the terminal 900D and each other terminal 900. In this example, it is assumed that the terminal 900U has been selected by the terminal selection process, and the operation for localizing the sound image of the sound data SD relating to the terminal 900U will be described below.
[0110]
When the control unit 310 of the sound data distribution server 800 selects the terminal 900U by the terminal selection process, it transmits, in step SC16, the terminal ID_MI of the selected terminal 900U together with the terminal position information SP1 of the "T period", the movement vector amount MV, and the sound data SD associated with that terminal ID in the storage unit 330, as a set, to the base station 510. At this time, the control unit 310 transmits the information in a form in which the terminal ID_MI, the terminal position information SP1, and the movement vector amount MV are added as a header of the sound data SD. If a plurality of terminals 900 are selected in the terminal selection process, the control unit 310 transmits to the base station 510, for each selected terminal 900, a set consisting of the terminal ID_MI, the terminal position information SP of the "T period", the movement vector amount MV, and the sound data SD.
Upon receiving the terminal ID_MI, the terminal position information SP1 in the “T period”, the movement vector amount MV, and the sound data SD, the base station 510 transfers them to the terminal 900D in step SC17.
[0111]
On the other hand, after transmitting the terminal position information SP2 and the terminal ID_MI to the base station 510 (step SC12), the control unit 110 of the terminal 900D, in step SC14, causes the azimuth detecting unit 150 to generate azimuth information OD indicating the direction A in which the face of the user faces. Subsequently, when the control unit 110 of the terminal 900D receives the terminal ID_MI, the terminal position information SP1 of the "T period", the movement vector amount MV, and the sound data SD from the base station 510 (step SC17), it generates, in step SC18, an audio signal in which the sound image of the received sound data SD is localized, and emits the generated audio signal via the headphones 200. At this time, the terminal 900D generates the audio signal while updating the terminal position information SP2 and the azimuth information OD relating to the own device at regular time intervals.
In the sound data distribution system, the sound data SD distributed from the sound data distribution server 800 to the terminal 900D is reproduced on the terminal 900D in a stream format by repeating the processing from step SC4 to step SC18 described above.
[0112]
Hereinafter, the process in which the terminal 900D localizes the sound image in step SC18 will be described in detail. First, the control unit 110 of the terminal 900D causes the position prediction unit 920 to predict the position of the terminal 900U using the terminal position information SP1 of the "T period" and the movement vector amount MV. Next, the control unit 110 of the terminal 900D causes the audio signal generation unit 160 to generate, from the sound data SD, an audio signal in which the sound image is localized according to the predicted position of the terminal 900U and the terminal position information SP2 and azimuth information OD generated in the own device.
[0113]
For example, as shown by a solid line in FIG. 27, assume that the user 902D of the terminal 900D, at a position away from the user 902U of the terminal 900U, turns his or her face in the direction A toward the user 902U. In this case, the control unit 110 of the terminal 900D generates an audio signal as if the music were being emitted from the position of the user 902U, and emits the audio signal via the headphones 200.
[0114]
Next, as shown by a dashed line in the figure, when the users 902D and 902U move closer to each other, the control unit 110 of the terminal 900D increases the sound pressure (volume) of the music emitted from the headphones 200 as the distance between the terminals 900U and 900D becomes shorter. This allows the user 902D to feel as if the relative positional relationship between his or her own position and the point where the music is emitted were linked to the relative positional relationship between the user 902D and the other user 902U.
[0115]
Next, as shown by a broken line in FIG. 28, assume that the user 902U has moved away in the right-hand direction as viewed from the user 902D of the terminal 900D. When the user 902U (terminal 900U) moves in this way, the sound pressure of the music emitted from the headphones 200 decreases at both the left and right ears as the user 902U moves farther away. However, since the distance between the right ear of the user 902D and the user 902U is shorter than the distance between the left ear of the user 902D and the user 902U, the sound pressure of the sound reaching the right ear becomes higher than that of the sound reaching the left ear. For the same reason, the L channel signal is delayed relative to the R channel signal. Thereby, the user 902D can perceive that the point where the music is emitted is far away in the right-hand direction, and can thus learn that the other user 902U is moving in the right-hand direction. That is, if the user 902D knows in advance the type of sound associated with the position of the other user 902U, the user 902D can grasp the approximate position of the other user 902U merely by listening to the sound emitted from the headphones 200.
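The level and time differences between the two ears described above can be illustrated with a simple geometric sketch. The model here is an assumption, not the method of the specification: each ear is placed half the inter-ear distance e to the left or right of the listener's position, gain falls off as the inverse of the distance, and delay is the distance divided by an assumed speed of sound of 340 m/s.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value

def ear_cues(listener, facing_rad, source, ear_distance):
    """Return ((gain_L, delay_L), (gain_R, delay_R)) for a 2-D listener.

    listener, source: (x, y) positions in metres.
    facing_rad: direction A of the face, as an angle from the +x axis.
    ear_distance: distance e between the ears in metres.
    """
    fx, fy = math.cos(facing_rad), math.sin(facing_rad)
    rx, ry = fy, -fx  # unit vector pointing to the listener's right
    half = ear_distance / 2.0
    cues = []
    for sign in (-1.0, 1.0):  # left ear first, then right ear
        ear = (listener[0] + sign * half * rx, listener[1] + sign * half * ry)
        d = max(math.dist(ear, source), 0.1)  # clamp to avoid division by zero
        cues.append((1.0 / d, d / SPEED_OF_SOUND))
    return tuple(cues)
```

For a source to the listener's right, the right-ear gain comes out higher and the right-ear delay shorter than those of the left ear, which matches the behavior described for FIG. 28.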
[0116]
Here, the reason for using the predicted value by the position prediction unit 920 instead of the actually measured value of the position of the terminal 900U when localizing the sound image will be described.
For example, when the data transmission speed between the sound data distribution server 800 and the terminal 900D is low, or when the processing capacity of the sound data distribution server 800 is low, the number of samples per unit time of the terminal position information SP1 of the other terminal 900U that the terminal 900D can acquire decreases. When the number of samples of the terminal position information SP1 decreases in this way, the positions indicated by temporally consecutive items of terminal position information SP1 may be far apart from each other. In such a situation, the localized sound image may give the user 902D an uncomfortable feeling, as if the sounding point had instantaneously jumped to a distant point. To cope with this, in the present embodiment, the terminal 900D updates the position of the terminal 900U based on the prediction during the period after acquiring one item of terminal position information SP1 and before acquiring the next, and localizes the sound image according to the updated position. In other words, by interpolating between the measured values in time series with predicted values, the sound image is localized while avoiding a situation in which temporally consecutive positions of the terminal 900U used for sound image localization are far apart. This makes it possible to eliminate the sense of discomfort given to the user 902D.
[0117]
Note that the method of predicting the position of the terminal 900U is not limited to a linear prediction method using the movement vector amount MV. For example, a non-linear path of the terminal 900U may be predicted by a non-linear function whose parameters are three or more past time points and the position of the terminal 900U at each of those time points. Further, the route may be predicted using geographic information such as road information, so as to incorporate the influence of the geographical situation around the user 902U. This makes it possible to predict the route in consideration of the range in which a person can move, so that the prediction accuracy can be improved.
However, if the terminal 900D can acquire a sufficient number of samples of the terminal position information SP1, the sound image may be localized using only the actually measured values (terminal position information SP1) without using predicted values. Alternatively, the sound image may be localized using the actually measured values in principle, and using the predicted values selectively only when the number of samples of the terminal position information SP1 is small.
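As one possible instance of the non-linear prediction mentioned above, the following sketch extrapolates with the unique quadratic curve through three past samples (Lagrange form). The choice of a quadratic and the function name `predict_quadratic` are assumptions made for illustration; the specification does not name a particular non-linear function.

```python
def predict_quadratic(samples, t):
    """Extrapolate a position at time t from three (time, (x, y)) samples.

    samples: three pairs (t_i, (x_i, y_i)) with distinct times.
    """
    (t0, p0), (t1, p1), (t2, p2) = samples
    # Lagrange basis weights of the quadratic through the three samples.
    w0 = (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
    w1 = (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
    w2 = (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1))
    return (w0 * p0[0] + w1 * p1[0] + w2 * p2[0],
            w0 * p0[1] + w1 * p1[1] + w2 * p2[1])
```

Unlike the linear method, this captures a constant acceleration, so a terminal that is speeding up or turning is followed more closely over short prediction intervals.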
[0118]
In the present embodiment, the positions of the virtual sounding points are all associated with the positions of the terminals 900. However, a configuration may be adopted in which sounding points associated with the positions of the terminals 900 are mixed with sounding points whose positions are fixed, as described in the first embodiment. In such a configuration, the sound data distribution server 800 processes sounding points whose positions move separately from sounding points whose positions do not move; that is, it calculates the movement vector amount MV only for the sounding points whose positions move, which reduces the overall processing amount.
[0119]
In the present embodiment, the sound data SD delivered to the terminal 900D is data uploaded from the terminal 900U, but is not limited to this. For example, a plurality of sound data SD may be stored in the storage unit 330 of the sound data distribution server 800 in advance, and any one of them may be distributed to the terminal 900D. Which sound data SD is associated with which other terminal 900U may be instructed by the user 902U via the terminal 900U or may be instructed by the user 902D via the terminal 900D. As a result, the sound data SD can be omitted from the information uploaded from the terminal 900U, so that the data amount is significantly reduced.
[0120]
In the present embodiment, the sound image is localized according to the positional relationship between the users 902D and 902U and the direction A in which the face of the user 902D faces. However, various modifications as described below can be made. For example, the influence of the Doppler effect may be incorporated into the sound image by changing the frequency of the sound according to the speed difference between the terminal 900U and the terminal 900D. Thereby, the atmosphere when the users 902D and 902U pass each other can be expressed realistically.
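The frequency change could be computed with the standard Doppler formula for a moving source and a moving listener, as in the sketch below. The speed of sound of 340 m/s and the function name `doppler_frequency` are assumptions; the velocities could, for instance, be taken from the movement vector amounts MV of the two terminals.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value

def doppler_frequency(f, src_pos, src_vel, lis_pos, lis_vel):
    """Return the frequency heard by the listener for an emitted frequency f.

    Positions are (x, y) in metres; velocities are (vx, vy) in m/s.
    """
    dx, dy = lis_pos[0] - src_pos[0], lis_pos[1] - src_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return f  # direction undefined when the positions coincide
    nx, ny = dx / dist, dy / dist  # unit vector from source to listener
    v_src = src_vel[0] * nx + src_vel[1] * ny  # source speed toward listener
    v_lis = lis_vel[0] * nx + lis_vel[1] * ny  # listener speed away from source
    return f * (SPEED_OF_SOUND - v_lis) / (SPEED_OF_SOUND - v_src)
```

When the two terminals approach each other the result is higher than f, and once they have passed it drops below f, which produces the familiar pitch fall of a passing sound source.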
[0121]
Further, an effect may be applied to the sound output from the headphones 200 according to the position of the terminal 900D or 900U. For example, when the terminal 900D or 900U is located in a specific area of the service area provided by the base station 510, the tempo of the music, the chord feel of the sound, and the like may be changed. To realize such a sound data distribution system, the storage unit 330 of the sound data distribution server 800 stores, for each of the areas into which the service area is divided, a parameter for applying an effect to the sound data SD. Then, when distributing the sound data SD, a parameter is selected according to the position of the terminal 900D or 900U, and the sound data SD to which the effect corresponding to that parameter has been applied is distributed to the terminal 900D. Thereby, when the user 902D or 902U passes through a specific area such as a shopping street, for example, an effect is added to the sound from the sounding point, and the amusement value is improved.
[0122]
Further, it is also possible to incorporate into the sound image localization the influence of the direction A in which the user 902U faces. For example, the sound that is virtually output from the position of the user 902U may be given directivity in the direction A in which the face of the user 902U faces. Thereby, the user 902D can obtain a feeling as if the sound were being output from the mouth of the user 902U.
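One way to give the virtual source such directivity is a cardioid-shaped gain, sketched below. The cardioid pattern and the function name `directivity_gain` are assumptions chosen for illustration; the specification does not state a particular directivity pattern.

```python
import math

def directivity_gain(source_pos, source_facing_rad, listener_pos):
    """Cardioid-style gain: 1.0 in front of the source's face, 0.0 directly behind.

    source_facing_rad: direction A of the source user's face, as an angle
    from the +x axis.
    """
    angle_to_listener = math.atan2(listener_pos[1] - source_pos[1],
                                   listener_pos[0] - source_pos[0])
    return 0.5 * (1.0 + math.cos(angle_to_listener - source_facing_rad))
```

Multiplying the localized audio signal by this gain makes the sound loudest when the user 902U faces the listener and quietest when the user 902U faces away.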
[0123]
Although the above-described terminal 900D indicates the positional relationship with the other terminal 900U to the user 902D only by sound, the positional relationship with the terminal 900U may additionally be indicated by other means. For example, in addition to the above-described configuration, a display unit such as an LED (Light Emitting Diode) may be provided in the terminal 900D, and the light emission intensity of the LED may be made stronger as the distance to the other terminal 900U becomes shorter. Alternatively, the terminal 900D may be provided with a built-in vibrator for generating mechanical vibration, and the vibration may indicate the positional relationship with the other terminal 900U. Thus, the user 902D can grasp the position of the other user 902U not only by hearing but also by sight or touch.
[0124]
<Modifications of First, Second, Third and Fourth Embodiments>
Note that the audio signal generation unit 160 in each of the above-described embodiments generates an audio signal according to the position and face direction of the user and the position of the sound generation point; however, a configuration may be adopted in which other factors (for example, effects such as sound reflection and diffraction by a building wall) are also taken into account.
[0125]
Further, in each of the embodiments described above, an example in which the audio signal is output as sound by the headphones 200 has been described, but the present invention is not limited to this. For example, a configuration may be adopted in which sound is emitted from a plurality of speakers installed in the interior of an automobile. In such a configuration, the audio signal generation unit 160 specifies the direction of the user's face according to, for example, the traveling direction of the automobile, and generates the audio signal to be output from each speaker according to the specified face direction, the position of the automobile, and the relative positional relationship between each speaker and the left and right ears.
[0126]
Furthermore, in each of the above-described embodiments, an example has been described in which a two-channel audio signal is generated in order to localize a sound image at a sound generation point, but the present invention is not limited to this. For example, a configuration may be employed in which audio signals of two or more channels such as 5.1 channels are generated and emitted from a sound emitting device such as a speaker.
[0127]
In each of the above-described embodiments, a method of sequentially selecting sounding points in order of proximity to the terminals 100, 700, and 900 until a predetermined number is reached (see FIG. 9) has been described as the sounding point selection process, but the method of selecting sounding points is not limited to this. For example, a method of selecting every sounding point whose distance to the terminal 100 is equal to or less than a threshold value, regardless of the number selected, may be used. In short, any method can be applied as long as it selects sounding points according to the position of the terminal 100 and the positions of the sounding points.
Note that when the sound data distribution system is applied to a relatively small area such as a theme park and the sound images of all the sounding points are localized, the sounding point selection process can be omitted.
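The two selection strategies mentioned above (nearest points up to a predetermined number, and all points within a threshold distance) can be sketched in Python as follows. The function names, the dictionary layout with `"id"` and `"pos"` keys, and the use of straight-line distance in three-dimensional coordinates are assumptions for illustration and are not taken from FIG. 9.

```python
import math


def select_nearest(terminal_pos, sounding_points, max_count):
    """Select up to `max_count` sounding points in order of proximity."""
    ranked = sorted(
        sounding_points,
        key=lambda p: math.dist(terminal_pos, p["pos"]),
    )
    return ranked[:max_count]


def select_within(terminal_pos, sounding_points, threshold):
    """Select every sounding point no farther than `threshold`."""
    return [
        p for p in sounding_points
        if math.dist(terminal_pos, p["pos"]) <= threshold
    ]


# Hypothetical sounding points, coordinates in metres.
points = [
    {"id": "fountain", "pos": (0.0, 10.0, 0.0)},
    {"id": "clock", "pos": (3.0, 4.0, 0.0)},
    {"id": "gate", "pos": (100.0, 0.0, 0.0)},
]
nearest_two = select_nearest((0.0, 0.0, 0.0), points, 2)
in_range = select_within((0.0, 0.0, 0.0), points, 10.0)
```

The first strategy bounds the amount of sound data to be distributed, while the second bounds the audible range; in a small area, both can be skipped and every sounding point passed on to localization.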
[0128]
In the first, second, and fourth embodiments described above, the sound image of the virtual sounding point is localized in each of the terminals 100 and 900. However, the sound data distribution servers 300, 610A, 610B, 610C, and 800, or the control server 600, may localize the sound image and distribute a signal representing the localized sound image to the terminal. In short, the present invention can be applied in any form as long as information indicating the position of the user and the direction A in which the face is facing is acquired, and the sound image is localized according to that position, the direction A, and the position of the virtual sounding point so that, as perceived by the user, the sound previously associated with the sounding point is output from the position of the sounding point.
[0129]
The present invention can also be implemented as a program for causing a computer to function as the terminal 100, 700, or 900 that localizes a sound image as described above. That is, this program is specified as a program that causes a computer to realize a function of acquiring user information indicating the position of the user and the direction in which the face is facing, a function of acquiring sounding position information indicating the position of a virtual sounding point, and a function of localizing a sound image so that, as perceived by a user who is located at the position indicated by the acquired user information and faces in the direction indicated by that user information, a sound of the type previously associated with the sounding point is output from the position indicated by the sounding position information.
Further, the present invention can be realized as a computer-readable recording medium on which the program is recorded.
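As a minimal sketch of the localization function of such a program, the following Python code derives a delay and a gain for each ear from the user position, the face direction, the distance e between the ears, and the position of the sounding point. The two-dimensional geometry, the speed of sound of 343 m/s, the inverse-distance gain law, and all function names are assumptions for illustration; the actual processing of the delay parameter generation unit 173 and the amplifier parameter generation unit 174 may differ.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def ear_positions(user_pos, facing_angle, ear_distance):
    """Return (left, right) ear positions for a user facing `facing_angle`.

    Angles are in radians, counterclockwise from the x axis. The left ear
    lies 90 degrees counterclockwise from the face direction.
    """
    ox = math.cos(facing_angle + math.pi / 2) * ear_distance / 2
    oy = math.sin(facing_angle + math.pi / 2) * ear_distance / 2
    left = (user_pos[0] + ox, user_pos[1] + oy)
    right = (user_pos[0] - ox, user_pos[1] - oy)
    return left, right


def localization_params(user_pos, facing_angle, ear_distance, source_pos):
    """Return ((delay_l, gain_l), (delay_r, gain_r)) for one sounding point."""
    params = []
    for ear in ear_positions(user_pos, facing_angle, ear_distance):
        distance = math.dist(ear, source_pos)
        delay = distance / SPEED_OF_SOUND  # seconds of propagation delay
        gain = 1.0 / max(distance, 1.0)    # clamp to avoid excessive gain
        params.append((delay, gain))
    return tuple(params)


# User at the origin facing +y; sounding point 10 m to the user's right.
left, right = localization_params((0.0, 0.0), math.pi / 2, 0.2, (10.0, 0.0))
```

In this example the right ear is closer to the sounding point, so its delay is shorter and its gain is higher than those of the left ear; applying the two parameter pairs to the same sound data yields a two-channel signal whose sound image is displaced toward the right.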
[0130]
[Effect of the invention]
As described above, the present invention provides a sound image localization device, a sound image localization method, a sound data distribution system, a sound data distribution method, and a program that make it possible to provide highly entertaining sound amusement.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a configuration of a sound data distribution system according to a first embodiment of the present invention.
FIG. 2 is a diagram showing a configuration of a sound data distribution server included in the sound data distribution system.
FIG. 3 is a diagram showing information stored in a storage unit of the sound data distribution server.
FIG. 4 is a diagram showing a data amount conversion table stored in a storage unit of the sound data distribution server.
FIG. 5 is a diagram showing a configuration of a terminal included in the sound data distribution system.
FIG. 6 is a diagram illustrating a configuration of an audio signal generation unit and the like included in the terminal.
FIG. 7 is a diagram for explaining processing by the audio signal generation unit.
FIG. 8 is a flowchart showing the operation of the sound data distribution system.
FIG. 9 is a flowchart showing a sounding point selection process executed by the sound data distribution server.
FIG. 10 is a flowchart showing a data amount conversion process executed by the sound data distribution server.
FIG. 11 is a diagram showing a state of a sounding point selected by the sound data distribution server.
FIG. 12 is a diagram for describing sound image localization by an audio signal generated by the terminal.
FIG. 13 is a diagram for explaining the same sound image localization.
FIG. 14 is a diagram for explaining the same sound image localization.
FIG. 15 is a diagram illustrating a configuration of a sound data distribution system according to a second embodiment of the present invention.
FIG. 16 is a diagram showing information stored in a sound data distribution server included in the sound data distribution system.
FIG. 17 is a diagram showing information stored in the sound data distribution server.
FIG. 18 is a diagram showing information stored in a control server included in the sound data distribution system.
FIG. 19 is a flowchart showing the operation of the sound data distribution system.
FIG. 20 is a diagram illustrating a configuration of a terminal according to the third embodiment of the present invention.
FIG. 21 is a diagram illustrating a configuration of a sound data distribution server according to a fourth embodiment of the present invention.
FIG. 22 is a diagram for explaining a method of estimating a movement route of a terminal.
FIG. 23 is a diagram showing information stored in the sound data distribution server in the embodiment.
FIG. 24 is a diagram showing a configuration of a terminal in the embodiment.
FIG. 25 is a flowchart showing the operation of the sound data distribution system in the embodiment.
FIG. 26 is a diagram for explaining the operation.
FIG. 27 is a diagram for describing sound image localization in the same operation.
FIG. 28 is a diagram for describing sound image localization in the same operation.
[Explanation of symbols]
100, 700, 900: terminal; 110, 710: control unit; 120: instruction input unit; 130: wireless communication unit; 140: positioning unit; 145: satellite radio wave reception unit; 150: azimuth detection unit; 160: audio signal generation unit; 170: processing unit; 172: parameter generation unit; 173: delay parameter generation unit; 174: amplifier parameter generation unit; 176: delay unit; 178: amplifier; 180: mixing unit; 190: audio signal output unit; 200: headphones; 210: direction sensor; 220, 230: sound emitting unit; 300, 610A, 610B, 610C, 800: sound data distribution server; 310: control unit; 320: communication unit; 330: storage unit; 400: satellite group; 500: mobile communication network; 510: base station; 600: control server; 720: pronunciation information storage unit.

Claims

A sound image localization device comprising:
user information acquisition means for acquiring user information indicating the position of a user and the direction in which the user's face is facing;
sounding position information acquisition means for acquiring sounding position information indicating the position of a virtual sounding point; and
localization means for localizing a sound image so that, as perceived by the user who is located at the position indicated by the acquired user information and faces in the direction indicated by that user information, a sound of the type previously associated with the sounding point is output from the position indicated by the sounding position information.

The sound image localization device according to claim 1, wherein the sounding position information acquisition means acquires, as the sounding position information, moving body position information indicating the position of a moving body associated with the sounding point, and
the localization means localizes the sound image so that the sound is output from the position of the moving body indicated by the acquired moving body position information.

The sound image localization device according to claim 1, further comprising receiving means for receiving sound data indicating the type of sound previously associated with the sounding point,
wherein the localization means localizes a sound image of the sound indicated by the sound data received by the receiving means.

A sound image localization method comprising:
acquiring user information indicating the position of a user and the direction in which the user's face is facing;
acquiring sounding position information indicating the position of a virtual sounding point; and
localizing a sound image so that, as perceived by the user who is located at the position indicated by the acquired user information and faces in the direction indicated by that user information, a sound of the type previously associated with the sounding point is output from the position indicated by the sounding position information.

A sound data distribution system comprising:
a sound data distribution device that distributes sound data representing the type of sound previously associated with a virtual sounding point; and
a terminal that receives the sound data distributed from the sound data distribution device and, using the received sound data, localizes a sound image so that, as perceived by a user who is located at a certain point and faces in a certain direction, a sound of the type previously associated with the sounding point is output from the position of the sounding point.

A sound data distribution method wherein a sound data distribution device that distributes sound data representing the type of sound previously associated with a virtual sounding point
distributes, to a terminal that localizes a sound image so that, as perceived by a user who is located at a certain point and faces in a certain direction, a sound of the type previously associated with the sounding point is output from the position of the sounding point, sound data selected from among a plurality of sound data corresponding to the respective sounding points according to the position of the user and the position of each sounding point.

A program for causing a computer to function as:
user information acquisition means for acquiring user information indicating the position of a user and the direction in which the user's face is facing;
sounding position information acquisition means for acquiring sounding position information indicating the position of a virtual sounding point; and
localization means for localizing a sound image so that, as perceived by the user who is located at the position indicated by the acquired user information and faces in the direction indicated by that user information, a sound of the type previously associated with the sounding point is output from the position indicated by the sounding position information.