JP7074116B2 - Information processing method and information processing equipment

Info

Publication number: JP7074116B2
Application number: JP2019182585A
Authority: JP
Inventors: 優樹瀬戸
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2019-10-03
Filing date: 2019-10-03
Publication date: 2022-05-24
Anticipated expiration: 2038-03-01
Also published as: JP2020016901A

Description

The present invention relates to a technique for processing information.

Various techniques for processing information collected by terminal devices have been proposed. For example, Patent Document 1 discloses a configuration in which identification information is transmitted to a terminal device by acoustic communication and the terminal device outputs related information corresponding to that identification information.

Patent Document 1: Japanese Unexamined Patent Publication No. 2016-153906

However, in the technique of Patent Document 1, each terminal device merely acquires the related information corresponding to the identification information; there was no concept of collecting the identification information acquired by the individual terminal devices and utilizing it as a whole. An object of the present invention is to utilize, for various purposes, information on the voices picked up by each of a plurality of terminal devices.

In order to solve the above problem, an information processing method according to a preferred aspect of the present invention receives, from a terminal device, voice information indicating a voice that was emitted from a sound emitting device and picked up by that terminal device, and stores in a storage device a plurality of pieces of voice information respectively received from a plurality of terminal devices.
A data structure according to a preferred aspect of the present invention is a data structure including a plurality of pieces of voice information, each indicating a voice that was emitted from a sound emitting device and picked up by a terminal device, and is used in processing in which an information processing device generates provided information by using the plurality of pieces of voice information included in the data structure.

A block diagram illustrating the configuration of the information providing system in the first embodiment.
A block diagram illustrating the configuration of a terminal device.
A flowchart illustrating the processing executed by the control device of a terminal device.
A block diagram illustrating the configuration of the distribution device.
A schematic diagram of the related table.
A schematic diagram of the voice table.
A flowchart illustrating the processing for acquiring the related information.
A flowchart illustrating the processing for generating the provided information.
A schematic diagram of the voice table according to the second embodiment.
A schematic diagram of the voice table according to the third embodiment.

<First Embodiment>
FIG. 1 is a block diagram illustrating the configuration of an information providing system 100 according to the first embodiment of the present invention. As illustrated in FIG. 1, the information providing system 100 of the first embodiment includes a plurality of terminal devices 10A, a sound emitting device 20, a distribution device 30 (an example of an information processing device), and a terminal device 10B. The elements of the information providing system 100 can communicate with one another via a communication network 70 that includes, for example, a mobile communication network or the Internet. The information providing system 100 is a computer system for providing various kinds of information to the users of the terminal devices 10A and the user of the terminal device 10B. The terminal devices 10A and the terminal device 10B are portable information terminals such as mobile phones, smartphones, tablet terminals, or personal computers.

<Sound emitting device 20>
The sound emitting device 20 is installed in a specific facility P. Specifically, the sound emitting device 20 is an output device that emits a voice V (hereinafter referred to as "guidance voice") that gives various kinds of information to users of the facility P. Examples of the facility P include transportation facilities such as stations or bus stops, means of transportation such as trains or buses, commercial facilities such as shops or restaurants, accommodation facilities such as inns or hotels, exhibition facilities such as museums or art galleries, tourist facilities such as historic sites or famous places, and sports facilities such as stadiums or gymnasiums. For example, a guidance voice V announcing the business hours of the facility P, the products sold in the facility, or the occurrence of an emergency (for example, a fire) in the facility P is emitted. Alternatively, an in-vehicle announcement in a train or bus (facility P) that gives notice of the operation status such as a train delay, the arrival of a train, or precautions for boarding and alighting is another example of the guidance voice V. In practice, a sound emitting device 20 is installed in each of a plurality of facilities and a guidance voice V (an example of a voice) is emitted in each facility, but the following description focuses on one facility P for convenience. The guidance voice V emitted by the sound emitting device 20 may be a voice uttered by an employee of the facility P, or a voice prepared by, for example, voice synthesis or recording. For example, the guidance voice V represented by a character string may be generated in real time by voice synthesis in parallel with an employee's specification of that character string.

In addition to the guidance voice V, the sound emitting device 20 of the first embodiment emits a sound representing voice information D1 and a sound representing position information D2. The voice information D1 is information indicating the content of the guidance voice V (for example, a character string representing the spoken content of the guidance voice V). In the first embodiment, an identifier that identifies the content of the guidance voice V is used as the voice information D1. Different voice information D1 is set in advance for each guidance voice V. The voice information D1 is information for specifying the related information R corresponding to the guidance voice V.

The position information D2 is information indicating the position at which sound is emitted by the sound emitting device 20 (hereinafter referred to as the "sound emitting position"). Different position information D2 is set in advance for each sound emitting position. Examples of the sound emitting position include the name of the facility P in which the sound emitting device 20 is installed, the region in which the facility P is located (for example, a division such as the Kanto region or the Kinki region), and a geographical point such as the latitude and longitude of the facility P. The sound emitting position is not limited to information indicating a specific geographical point, and may be, for example, a floor within the facility P or a specific place within the facility P. The position information D2 may also be information for identifying each sound emitting device 20. In the first embodiment, an identifier that identifies the sound emitting position is used as the position information D2.

By supplying an acoustic signal X to the sound emitting device 20, the guidance voice V, the sound representing the voice information D1, and the sound representing the position information D2 are emitted from the sound emitting device 20. The acoustic signal X is a signal representing a sound that includes the guidance voice V, the sound representing the voice information D1, and the sound representing the position information D2. The sound emitting device 20 of the first embodiment functions as an acoustic device that reproduces the guidance voice V, and also functions as a transmitter that transmits the voice information D1 and the position information D2 to its surroundings by acoustic communication, which uses sound waves (air vibration) as the transmission medium. That is, in the first embodiment, the voice information D1 and the position information D2 are transmitted to the surroundings by acoustic communication in which the sound emitting device 20 that emits the guidance voice V also emits the sounds of the voice information D1 and the position information D2. The voice information D1 is transmitted each time the guidance voice V is emitted. For example, the voice information D1 is transmitted together with the guidance voice V (for example, in parallel with, or before or after, the emission of the guidance voice V). The position information D2, on the other hand, is transmitted repeatedly at a predetermined cycle, separately from the emission of the guidance voice V. The position information D2 may instead be transmitted together with the guidance voice V. However, the transmission of the voice information D1 and the transmission of the position information D2 do not overlap in time.

The acoustic signal X is generated by adding together an audio signal representing the guidance voice V, a modulated signal representing the voice information D1 as an acoustic component, and a modulated signal representing the position information D2 as an acoustic component. Each modulated signal is generated, for example, by frequency-modulating a carrier wave of a predetermined frequency with the corresponding information (voice information D1 or position information D2). A modulated signal may also be generated by sequentially performing spread modulation of the information using a spreading code and frequency conversion using a carrier wave of a predetermined frequency. The frequency band of the modulated signals is a band in which the sound emitting device 20 can emit sound and the terminal device 10A can pick up sound, and is set above the frequency band of the sounds that the user of the terminal device 10A hears in a normal environment (for example, 18 kHz or higher and 20 kHz or lower). The user therefore can hardly hear the acoustic components of the voice information D1 and the position information D2. The frequency band of the modulated signals is arbitrary, however, and a modulated signal within the audible band may also be generated.
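The generation of the acoustic signal X described above can be sketched as follows. This is only an illustration under stated assumptions: the description says that a carrier is frequency-modulated with each piece of information, and the sketch picks binary frequency-shift keying as one simple form of that. The sample rate, bit duration, the two tone frequencies, the 16-bit identifier width, and the function names `id_to_bits`, `modulate`, and `mix` are all hypothetical and do not come from the patent.

```python
import math

SAMPLE_RATE = 48_000      # Hz (assumed)
SYMBOL_SECONDS = 0.01     # duration of one bit (assumed)
FREQ_ZERO = 18_500.0      # tone for bit 0, inside the 18-20 kHz band (assumed)
FREQ_ONE = 19_500.0       # tone for bit 1 (assumed)


def id_to_bits(identifier: int, width: int = 16) -> list[int]:
    """Return the identifier (e.g. D1 or D2) as bits, most significant first."""
    return [(identifier >> shift) & 1 for shift in range(width - 1, -1, -1)]


def modulate(bits: list[int], amplitude: float = 0.1) -> list[float]:
    """Continuous-phase binary FSK: each bit selects the tone frequency."""
    samples_per_symbol = round(SAMPLE_RATE * SYMBOL_SECONDS)
    phase = 0.0
    out = []
    for bit in bits:
        step = 2.0 * math.pi * (FREQ_ONE if bit else FREQ_ZERO) / SAMPLE_RATE
        for _ in range(samples_per_symbol):
            out.append(amplitude * math.sin(phase))
            phase += step
    return out


def mix(guidance_voice: list[float], modulated: list[float], offset: int = 0) -> list[float]:
    """Acoustic signal X: guidance voice V plus a modulated component added at `offset`."""
    length = max(len(guidance_voice), offset + len(modulated))
    x = [0.0] * length
    for i, sample in enumerate(guidance_voice):
        x[i] += sample
    for i, sample in enumerate(modulated):
        x[offset + i] += sample
    return x
```

Calling `mix` once per modulated signal with different `offset` values keeps the D1 and D2 components from overlapping in time, which matches the constraint stated for the first embodiment.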

<Terminal device 10A>
The plurality of terminal devices 10A in FIG. 1 are located in the facility P in which the sound emitting device 20 is installed. The terminal device 10A of the first embodiment acquires information R related to the guidance voice V (hereinafter referred to as "related information") from the distribution device 30. The related information R is, for example, a character string representing the spoken content of the guidance voice V, a translation of that character string into another language, or information indicating the location of content related to the guidance voice V (for example, a URL). The related information R may be expressed as sound or as an image. A plurality of terminal devices 10A are likewise located in other facilities.

FIG. 2 is a block diagram illustrating the configuration of the terminal device 10A. As illustrated in FIG. 2, the terminal device 10A includes a control device 11, a storage device 12, a communication device 13, a sound collecting device 14, and a reproduction device 15. As described above, the terminal device 10A is typically an information terminal owned by a user. However, a display terminal for guidance, such as an electronic bulletin board installed in a means of transportation or an electronic signboard (digital signage) installed in a commercial facility, may also be used as the terminal device 10A.

The sound collecting device 14 is an acoustic device (microphone) that picks up ambient sound. Specifically, the sound collecting device 14 picks up the sound emitted into the facility P by the sound emitting device 20 and generates an acoustic signal Y representing the waveform of that sound. The acoustic signal Y generated by sound collection in the facility P can therefore contain the acoustic component of the voice information D1 and the acoustic component of the position information D2. The position information D2 can also be regarded as information indicating the position of the terminal device 10A at the time the guidance voice V was picked up.

As understood from the above description, the sound collecting device 14 is used for voice calls between terminal devices 10A and for audio recording during video shooting, and also functions as a receiver that receives the voice information D1 and the position information D2 by acoustic communication using sound waves (air vibration) as the transmission medium. An A/D converter that converts the acoustic signal Y generated by the sound collecting device 14 from analog to digital is omitted from the drawing for convenience. Instead of a sound collecting device 14 built into the terminal device 10A, a separate sound collecting device 14 may be connected to the terminal device 10A by wire or wirelessly.

The control device 11 (an example of a computer) consists of a processing circuit such as a CPU (Central Processing Unit) and centrally controls the elements of the terminal device 10A. The storage device 12 stores a program executed by the control device 11 and various data used by the control device 11. A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be freely adopted as the storage device 12.

As illustrated in FIG. 2, the control device 11 realizes a plurality of functions (an information extraction unit 41 and a reproduction control unit 42) by executing the program stored in the storage device 12. Some of the functions of the control device 11 may be realized by dedicated electronic circuits. The functions of the control device 11 may also be distributed over a plurality of devices.

The information extraction unit 41 extracts the voice information D1 and the position information D2 from the acoustic signal Y generated by the sound collecting device 14. Specifically, the information extraction unit 41 extracts the voice information D1 and the position information D2 by, for example, filter processing that emphasizes the frequency band of the acoustic signal Y containing the acoustic component of each piece of information (voice information D1 and position information D2) and demodulation processing corresponding to the modulation applied to each piece of information. The voice information D1 extracted by the information extraction unit 41 is used to acquire the related information R of the guidance voice V corresponding to that voice information D1 (that is, the guidance voice V emitted by the sound emitting device 20).
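The extraction step can be sketched as follows, again only under assumptions. The sketch assumes the information was sent as binary frequency-shift keying with two known tones, and it uses the Goertzel recurrence as a narrow-band detector that plays the role of both the band-emphasizing filter and the demodulator. The tone frequencies, bit duration, sample rate, and the assumption that the bit boundaries are already known (no synchronization is shown) are hypothetical, as are the function names.

```python
import math

SAMPLE_RATE = 48_000      # Hz (assumed)
SYMBOL_SECONDS = 0.01     # duration of one bit (assumed)
FREQ_ZERO = 18_500.0      # tone for bit 0 (assumed)
FREQ_ONE = 19_500.0       # tone for bit 1 (assumed)


def goertzel_power(samples, freq):
    """Power of `samples` near `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def demodulate(signal, bit_count):
    """Decide each bit by comparing the power at the two tone frequencies."""
    n = round(SAMPLE_RATE * SYMBOL_SECONDS)
    bits = []
    for k in range(bit_count):
        frame = signal[k * n:(k + 1) * n]
        one = goertzel_power(frame, FREQ_ONE)
        zero = goertzel_power(frame, FREQ_ZERO)
        bits.append(1 if one > zero else 0)
    return bits


def bits_to_id(bits):
    """Rebuild the identifier (D1 or D2) from bits, most significant first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

Because only the power near the two high-frequency tones is compared, the much louder guidance voice V in the lower band has little influence on the decision in this sketch.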

The communication device 13 communicates with the distribution device 30 via the communication network 70 under the control of the control device 11. The communication device 13 of the first embodiment transmits the voice information D1 and the position information D2 extracted by the information extraction unit 41 to the distribution device 30. The distribution device 30 acquires the related information R corresponding to the voice information D1 transmitted from the terminal device 10A and transmits it to the terminal device 10A. The communication device 13 receives the related information R transmitted from the distribution device 30. The processing by which the distribution device 30 acquires the related information R is described later. In practice, the voice information D1 and the position information D2 are transmitted to the distribution device 30 from each of the plurality of terminal devices 10A.

The reproduction control unit 42 causes the reproduction device 15 to reproduce the related information R received by the communication device 13. The reproduction device 15 is an output device that reproduces the related information R. Specifically, the reproduction device 15 includes a display device that displays an image represented by the related information R and a sound emitting device that emits a sound represented by the related information R. That is, reproduction by the reproduction device 15 encompasses both the display of images and the emission of sound. Instead of a reproduction device 15 built into the terminal device 10A, a separate reproduction device 15 may be connected to the terminal device 10A by wire or wirelessly. The reproduction device 15 may also include only one of the display device and the sound emitting device.

FIG. 3 is a flowchart illustrating the processing executed by the control device 11 of the terminal device 10A. The processing of FIG. 3 is executed repeatedly, for example at a predetermined cycle. When the processing of FIG. 3 starts, the information extraction unit 41 extracts the voice information D1 and the position information D2 from the acoustic signal Y generated by the sound collecting device 14 (Sa1). The information extraction unit 41 causes the communication device 13 to transmit the extracted voice information D1 and position information D2 to the distribution device 30 (Sa2). When the position information D2 is transmitted repeatedly from the sound emitting device 20 at a predetermined cycle, the control device 11 stores the most recently received position information D2 in the storage device 12 and transmits that position information D2 to the distribution device 30. The reproduction control unit 42 causes the reproduction device 15 to reproduce the related information R transmitted from the distribution device 30 (Sa3). Through the above processing, the related information R of the guidance voice V emitted from the sound emitting device 20 is provided to the user of the terminal device 10A.
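The flow of steps Sa1 to Sa3, including the retention of the most recently received position information D2, might look like the following sketch. The class name `TerminalController`, its callbacks, and the choice to skip transmission when no voice information D1 was extracted are hypothetical; the patent does not specify how the terminal behaves in that case.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TerminalController:
    """Hypothetical stand-in for control device 11 of terminal device 10A."""
    # Sends (D1, D2) to the distribution device and returns related information R.
    send_to_distribution: Callable[[int, Optional[int]], str]
    # Hands related information R to the reproduction device.
    play: Callable[[str], None]
    # Most recently received D2, kept because D2 arrives on its own cycle.
    last_position: Optional[int] = None

    def on_extracted(self, voice_id: Optional[int], position_id: Optional[int]) -> None:
        # Result of Sa1: either identifier may be absent from a given stretch of Y.
        if position_id is not None:
            self.last_position = position_id
        if voice_id is None:
            return  # assumption: nothing is sent without voice information D1
        related = self.send_to_distribution(voice_id, self.last_position)  # Sa2
        self.play(related)  # Sa3
```

Keeping `last_position` on the controller reflects the statement that D1 and D2 are not transmitted at the same time, so the two identifiers usually reach the terminal in separate extraction passes.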

<Distribution device 30>
FIG. 4 is a block diagram illustrating the configuration of the distribution device 30. The distribution device 30 is a server device (for example, a web server) that transmits various kinds of information to the terminal devices 10A and the terminal device 10B via the communication network 70. To each terminal device 10A, the distribution device 30 transmits the related information R corresponding to the voice information D1 transmitted from that terminal device 10A. To the terminal device 10B, on the other hand, it transmits information Q to be provided to the user of the terminal device 10B (hereinafter referred to as "provided information"). As illustrated in FIG. 4, the distribution device 30 of the first embodiment includes a control device 31, a storage device 32, and a communication device 33. The distribution device 30 may be realized as a single device or as a set of mutually separate devices (that is, a server system).

The communication device 33 communicates with the terminal devices 10A and the terminal device 10B via the communication network 70 under the control of the control device 31 (communication control unit 51). The communication device 33 of the first embodiment receives the voice information D1 and the position information D2 from each of the plurality of terminal devices 10A and transmits the related information R to that terminal device 10A. The communication device 33 also transmits the provided information Q to the terminal device 10B.

The control device 31 (an example of a computer) consists of a processing circuit such as a CPU (Central Processing Unit) and centrally controls the elements of the distribution device 30. The storage device 32 stores a program executed by the control device 31 and various data used by the control device 31. A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be freely adopted as the storage device 32. The storage device 32 of the first embodiment stores a related table Ta, a voice table Tb, and a terminal table Tc.

FIG. 5 is a schematic diagram of the related table Ta. As illustrated in FIG. 5, the related table Ta is a table in which a plurality of pieces of related information R are registered. Specifically, for each of a plurality of pieces of voice information D1, the related information R corresponding to that voice information D1 is registered. The voice information D1 of a particular guidance voice V is associated with related information R representing, for example, a character string of the spoken content of that guidance voice V or a translation of that character string into another language.

FIG. 6 is a schematic diagram of the voice table Tb. As illustrated in FIG. 6, the voice table Tb is a data table in which a plurality of pieces of voice information D1 respectively transmitted from the plurality of terminal devices 10A are registered. Specifically, the voice information D1 and the position information D2 received from each terminal device 10A are registered in the voice table Tb in association with each other. Because voice information D1 is transmitted to the distribution device 30 from terminal devices 10A located near a plurality of sound emitting positions (A, B, C, ...), voice information D1 corresponding to a wide variety of guidance voices V is registered in the voice table Tb. In other words, the voice table Tb is a collection (big data) of a plurality of pieces of voice information D1. The terminal table Tc is a table in which the plurality of terminal devices 10B to which the provided information Q is to be transmitted (specifically, information for identifying each terminal device 10B) are registered. For example, when the user of a terminal device 10B wants to acquire the provided information Q corresponding to a specific sound emitting position, the terminal device 10B is registered in the terminal table Tc in response to the user's operation on that terminal device 10B. Alternatively, the terminal device 10B may automatically (that is, without requiring an instruction from the user) transmit a request for registration in the terminal table Tc to the distribution device 30, triggered for example by the reception of voice information D1.
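One possible in-memory shape for the voice table Tb and the terminal table Tc is sketched below. The patent describes these tables only schematically, so the class names, the use of string identifiers, and the helper `voices_at` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VoiceRecord:
    """One row of the voice table Tb."""
    voice_id: str      # voice information D1
    position_id: str   # position information D2


@dataclass
class DistributionStorage:
    """Hypothetical stand-in for the tables held in storage device 32."""
    voice_table: list = field(default_factory=list)    # Tb: every report is kept
    terminal_table: set = field(default_factory=set)   # Tc: IDs of terminal devices 10B

    def register_voice(self, voice_id: str, position_id: str) -> None:
        # D1 and D2 from one terminal device 10A are stored in association.
        self.voice_table.append(VoiceRecord(voice_id, position_id))

    def register_terminal(self, terminal_id: str) -> None:
        # Registering the same terminal twice has no further effect.
        self.terminal_table.add(terminal_id)

    def voices_at(self, position_id: str) -> list:
        # All D1 values reported for one sound emitting position.
        return [r.voice_id for r in self.voice_table if r.position_id == position_id]
```

Tb is modeled as a list rather than a set so that repeated reports of the same D1 and D2 pair remain countable, which matters when the number of records is later used for estimation.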

As illustrated in FIG. 4, the control device 31 realizes a plurality of functions (a communication control unit 51, a storage control unit 52, a related information acquisition unit 53, and a provided information generation unit 54) by executing the program stored in the storage device 32. Some of the functions of the control device 31 may be realized by dedicated electronic circuits. The functions of the control device 31 may also be distributed over a plurality of devices.

The communication control unit 51 causes the communication device 33 to receive and transmit various kinds of information. The storage control unit 52 stores the voice information D1 and the position information D2 received by the communication device 33 in the storage device 32 (specifically, in the voice table Tb). The related information acquisition unit 53 acquires the related information R corresponding to the voice information D1 received by the communication device 33. The provided information generation unit 54 generates the provided information Q by using the plurality of pieces of voice information D1 stored in the storage device 32.

FIG. 7 is a flowchart of the process by which the control device 31 acquires the related information R. When the process of FIG. 7 starts, the communication control unit 51 causes the communication device 33 to receive the voice information D1 and the position information D2 transmitted from a terminal device 10A (Sb1). In practice, the voice information D1 and the position information D2 are transmitted from each of a plurality of terminal devices 10A located near a plurality of sound emission positions. The storage control unit 52 stores the plural pieces of voice information D1 received from the respective terminal devices 10A in the storage device 32 (Sb2). Specifically, the storage control unit 52 registers the voice information D1 and the position information D2 received from each terminal device 10A in the voice table Tb in association with each other. The related information acquisition unit 53 acquires the related information R corresponding to the voice information D1 received by the communication device 33 (Sb3). The related table Ta of FIG. 5 is used for this acquisition. Specifically, the related information acquisition unit 53 identifies, among the plural pieces of related information R registered in the related table Ta, the related information R associated with the voice information D1 received by the communication device 33. The communication control unit 51 causes the communication device 33 to transmit the related information R identified by the related information acquisition unit 53 to the terminal device 10A (Sb4). Through the above process, the related information R of the guidance voice V is transmitted to the terminal device 10A that picked up the guidance voice V emitted from the sound emitting device 20.
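Steps Sb1 to Sb4 can be illustrated with a minimal sketch. The class name `DistributionDevice`, the method `handle`, and the use of plain strings for D1, D2, and R are assumptions made for illustration only; the description does not specify data types or an implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DistributionDevice:
    """Hypothetical stand-in for the distribution device 30."""
    # Related table Ta: voice information D1 -> related information R.
    related_table: Dict[str, str]
    # Voice table Tb: one (D1, D2) record per received transmission.
    voice_table: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, d1: str, d2: str) -> Optional[str]:
        # Sb1: (d1, d2) is what was received from a terminal device 10A.
        # Sb2: register D1 and D2 in Tb in association with each other.
        self.voice_table.append((d1, d2))
        # Sb3: identify the related information R associated with D1 in Ta.
        related = self.related_table.get(d1)
        # Sb4: hand R back so the caller can send it to the terminal device.
        return related
```

In this sketch a record is stored even when no related information R is registered for the received D1, so the record remains available for later generation of the provided information Q.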

FIG. 8 is a flowchart of the process by which the control device 31 generates the provided information Q. The process of FIG. 8 is executed, for example, at predetermined time intervals. The provided information generation unit 54 generates the provided information Q by using the plural pieces of voice information D1 registered in the voice table Tb of FIG. 6 (Sc1). The communication control unit 51 causes the communication device 33 to transmit the provided information Q generated by the provided information generation unit 54 to the plurality of terminal devices 10B registered in the terminal table Tc (Sc2).

The provided information Q generated by the distribution device 30 is described below. The following description exemplifies provided information Q that is generated when an abnormal situation such as an emergency occurs. Assume that a specific facility (for example, a transportation facility) is congested because of an emergency. One example is a case in which a personal injury accident at a transportation facility (for example, a station) delays trains and thereby causes congestion. Each of the many terminal devices 10A located near a sound emitting device 20 in the facility where the emergency has occurred picks up the guidance voice V announcing the emergency, and transmits the voice information D1 corresponding to that guidance voice V and the position information D2 indicating the sound emission position to the distribution device 30. Records (combinations) of the position information D2 indicating the sound emission position that represents the facility where the emergency is occurring and the voice information D1 of the guidance voice V announcing the emergency are therefore registered in the voice table Tb in a concentrated manner within a short time. Accordingly, the occurrence of congestion can be estimated from the number of such records, and the cause of the congestion can be estimated from the voice information D1.

On the premise of the above circumstances, the provided information generation unit 54 generates the provided information Q by using the voice table Tb. Specifically, the provided information generation unit 54 determines, from the voice table Tb, the number N of records (hereinafter, the "registration count") that combine the voice information D1 of a guidance voice V announcing an emergency with the position information D2 indicating the sound emission position that represents the facility where the emergency is occurring. When the registration count N exceeds a threshold (that is, when the facility is congested because of the emergency), the unit generates provided information Q indicating that the emergency has occurred. The provided information Q is generated according to the content of the guidance voice V indicated by the voice information D1 of the records whose registration count N exceeds the threshold. For example, provided information Q indicating that a station is congested because of a personal injury accident is generated. When the guidance voice V indicated by the voice information D1 of those records announces an emergency such as a fire, provided information Q indicating, for example, the facility represented by the sound emission position in the position information D2 of the records (that is, the facility where the fire is occurring) may be generated. Provided information Q indicating an evacuation route from the sound emission position indicated by the position information D2 may also be generated. For example, an evacuation route associated with each of the plurality of sound emission positions is stored in the storage device 32 in advance and used to generate the provided information Q. As understood from the above description, the position information D2 transmitted from the terminal device 10A together with the voice information D1 is information used to generate the provided information Q.
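The threshold comparison described above can be sketched as follows. The function name, the `announcements` mapping from an emergency D1 to a message text, and the output string format are hypothetical; the description only states that Q is generated when the number N of matching records exceeds a threshold.

```python
from collections import Counter


def generate_provided_info(voice_table, announcements, threshold):
    """Return one message Q per (D1, D2) pair whose record count N exceeds the threshold.

    voice_table   -- iterable of (D1, D2) records from the voice table Tb
    announcements -- hypothetical mapping: emergency D1 -> text describing the emergency
    threshold     -- N must be strictly greater than this value
    """
    counts = Counter(voice_table)  # N for each (D1, D2) combination
    results = []
    for (d1, d2), n in counts.items():
        # Only D1 values that announce an emergency are considered.
        if d1 in announcements and n > threshold:
            results.append(f"{announcements[d1]} at {d2} (N={n})")
    return results
```

For example, five records for an accident announcement at one station and a threshold of 3 yield a single message for that station, while a pair that appears only twice yields nothing.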

The registration count N is, for example, the total number of records (voice information D1 and position information D2) received from the terminal devices 10A within a predetermined period (hereinafter, the "reference period"). For example, in a configuration in which the storage device 32 retains the records that the distribution device 30 received from the terminal devices 10A within a reference period of predetermined length extending back from the present, the total number of records held in the storage device 32 is counted as the registration count N. In a configuration in which the time at which the distribution device 30 received the voice information D1 and the position information D2 from a terminal device 10A is stored in the storage device 32, the number of records whose reception time falls within the reference period, among all records stored in the storage device 32, is counted as the registration count N.
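The second configuration, in which a reception time is stored with each record, can be sketched as a simple windowed count. The record layout `(D1, D2, received_at)` and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta


def registered_count(records, now, reference_period):
    """Count the records whose reception time lies in [now - reference_period, now].

    records          -- iterable of (D1, D2, received_at) tuples (hypothetical layout)
    now              -- datetime marking the end of the reference period
    reference_period -- timedelta giving the length of the reference period
    """
    start = now - reference_period
    return sum(1 for (_d1, _d2, received_at) in records if start <= received_at <= now)
```

In the first configuration no filtering is needed, because records older than the reference period are assumed not to be retained, so N is simply the number of stored records.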

As understood from the above description, the provided information generation unit 54 of the first embodiment generates the provided information Q by using, among the plural pieces of voice information D1 registered in the voice table Tb, two or more pieces of voice information D1 associated with position information D2 indicating a specific sound emission position (for example, a congested facility). Specifically, provided information Q is generated according to the content of the guidance voice V indicated by the plural pieces of voice information D1 registered in the voice table Tb. The provided information Q generated by the above procedure is transmitted, as described above, to the plurality of terminal devices 10B registered in the terminal table Tc.

The terminal device 10B reproduces the provided information Q transmitted from the distribution device 30. Specifically, the terminal device 10B reproduces the provided information Q by, for example, displaying a character string representing the provided information Q or emitting sound represented by the provided information Q.

As understood from the above description, in the first embodiment, the plural pieces of voice information D1 received from the respective terminal devices 10A are stored in the storage device 32 (voice table Tb), so that the plural pieces of voice information D1 can be used for various purposes. In particular, because the provided information Q generated by using the plural pieces of voice information D1 stored in the storage device 32 is transmitted to the terminal devices 10B, the voice information D1 transmitted from the plurality of terminal devices 10A can be used to generate the provided information Q. Furthermore, in the first embodiment, the provided information Q is generated from two or more pieces of voice information D1 that are associated with position information D2 indicating a specific position, among the plural pieces of voice information D1 stored in the storage device 32. This offers the advantage that the provided information Q can be generated from two or more pieces of voice information D1, each indicating a guidance voice V presumed to have been picked up at that specific position.

<Second Embodiment>
A second embodiment of the present invention will now be described. In each of the examples below, elements whose functions are the same as in the first embodiment are denoted by the reference signs used in the description of the first embodiment, and detailed description of each is omitted as appropriate.

The terminal device 10A of the second embodiment transmits time information D3 to the distribution device 30 in addition to the voice information D1 and the position information D2. The time information D3 indicates the time (typically the date and time) at which the terminal device 10A picked up the guidance voice V. For example, the terminal device 10A generates, as the time information D3, the time set in the terminal device 10A when the guidance voice V is picked up. Alternatively, the time at which the information extraction unit 41 extracted the voice information D1 of the guidance voice V may be generated as the time information D3. As in the first embodiment, the voice information D1 and the position information D2 are transmitted from the sound emitting device 20.

The communication control unit 51 of the distribution device 30 causes the communication device 33 to receive the voice information D1, the position information D2, and the time information D3. The storage control unit 52 registers the voice information D1, the position information D2, and the time information D3 transmitted by the terminal device 10A in the voice table Tb. FIG. 9 is a schematic diagram of the voice table Tb according to the second embodiment. As illustrated in FIG. 9, the voice information D1, the position information D2, and the time information D3 received from each terminal device 10A are associated with one another and registered as one record.

As in the first embodiment, the related information acquisition unit 53 of the second embodiment acquires the related information R from the voice information D1 transmitted from the terminal device 10A and transmits it to that terminal device 10A. As in the first embodiment, the provided information generation unit 54 of the second embodiment generates the provided information Q by using the plural pieces of voice information D1 stored in the storage device 32.

The provided information Q generated by the distribution device 30 of the second embodiment is described below. The second embodiment exemplifies the generation of provided information Q that reports the congestion status of a specific facility (for example, the cause and the degree of congestion). As described above, a plurality of terminal devices 10A located near a specific sound emission position each pick up various guidance voices V and transmit the voice information D1, the position information D2, and the time information D3 to the distribution device 30. Among the pieces of time information D3 that are commonly associated in the voice table Tb with position information D2 indicating a given sound emission position (for example, a sound emission position representing a commercial facility), the time information D3 that falls in a time slot (for example, morning, noon, or night) in which the commercial facility is congested is registered in a concentrated manner within a short time. Accordingly, by calculating, for each time slot, the registration count N of records that include the position information D2 indicating the sound emission position representing a specific commercial facility and time information D3 in that time slot, the time slot in which the commercial facility is congested can be estimated.

On the premise of the above circumstances, the provided information generation unit 54 of the second embodiment generates provided information Q that reports the congestion status of a commercial facility by using the plural pieces of voice information D1 registered in the voice table Tb. For example, the provided information generation unit 54 identifies, from the voice table Tb, the records that include position information D2 indicating the sound emission position representing a specific commercial facility. Next, the provided information generation unit 54 classifies the identified records by time slot (morning, noon, or night) using the time indicated by the time information D3 of each record. It then estimates the degree of congestion in each time slot (for example, high, normal, or low) according to the registration count N of the classified records, and generates the estimated degree of congestion in each time slot as the provided information Q. The time information D3 is information used to generate the provided information Q.
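The classification by time slot can be sketched as below. The slot boundaries (5:00, 11:00, 17:00), the default thresholds `high` and `low`, and the record layout `(D1, D2, D3)` are all assumptions; the description names the slots and the congestion levels but gives no numeric values.

```python
from collections import Counter
from datetime import datetime

SLOTS = ("morning", "noon", "night")


def time_slot(t):
    """Map a datetime to a time slot. The boundaries are hypothetical."""
    if 5 <= t.hour < 11:
        return "morning"
    if 11 <= t.hour < 17:
        return "noon"
    return "night"


def congestion_by_slot(records, facility, high=100, low=20):
    """Estimate a congestion level per time slot for one facility.

    records  -- iterable of (D1, D2, D3) tuples, with D3 given as a datetime
    facility -- the D2 value representing the facility of interest
    high/low -- hypothetical thresholds on the record count N per slot
    """
    counts = Counter(time_slot(d3) for (_d1, d2, d3) in records if d2 == facility)
    levels = {}
    for slot in SLOTS:
        n = counts.get(slot, 0)
        if n >= high:
            levels[slot] = "high"
        elif n < low:
            levels[slot] = "low"
        else:
            levels[slot] = "normal"
    return levels
```

The same structure would apply to classification by day of the week or by month, with `time_slot` replaced by a function that returns the corresponding key.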

In addition, provided information Q indicating the cause of congestion in each time slot is generated from the content of the guidance voice V indicated by the voice information D1 classified by time slot. For example, when the registration count N of records that include voice information D1 indicating a guidance voice V announcing some event (for example, a limited-time sale) is large, provided information Q indicating that the event is the cause of the congestion is generated. That is, provided information Q is generated according to the content of a plurality of guidance voices V. The degree of congestion is not limited to a degree per time slot and may be, for example, a degree per day of the week or per month. The provided information Q reporting the congestion status may also be generated from the position information D2 and the time information D3 without using the voice information D1. The provided information Q generated by the above procedure is transmitted, as described above, to the plurality of terminal devices 10B registered in the terminal table Tc.

As understood from the above description, the provided information generation unit 54 of the second embodiment generates the provided information Q by using, among the plural pieces of voice information D1 registered in the voice table Tb, two or more pieces of voice information D1 associated with time information D3 indicating a specific time (for example, a time slot). As in the first embodiment, the plural pieces of voice information D1 stored in the storage device 32 can be used for various purposes (for example, generation of the provided information Q). In the second embodiment in particular, the provided information Q is generated from two or more pieces of voice information D1 associated with time information D3 indicating a specific time, among the plural pieces of voice information D1 stored in the storage device 32. This offers the advantage that the provided information Q can be generated from two or more pieces of voice information D1, each indicating a guidance voice V presumed to have been picked up at that specific time.

In the second embodiment the terminal device 10A generates the time information D3, but the distribution device 30 may generate it instead. For example, the distribution device 30 generates time information D3 indicating the time at which it received the voice information D1 and the position information D2 from the terminal device 10A, associates that time information D3 with the voice information D1 and the position information D2 transmitted from the terminal device 10A, and registers them in the voice table Tb as one record.

<Third Embodiment>
The terminal device 10A of the third embodiment transmits language information D4 to the distribution device 30 in addition to the voice information D1 and the position information D2. The language information D4 indicates the language set in the terminal device 10A. The language information D4 may be obtained by any method. For example, it may be generated by referring to the language setting of the OS (Operating System) of the terminal device 10A, or it may indicate a language freely specified by the user of the terminal device 10A. As in the first embodiment, the voice information D1 and the position information D2 are transmitted from the sound emitting device 20.

The communication control unit 51 of the distribution device 30 causes the communication device 33 to receive the voice information D1, the position information D2, and the language information D4 transmitted by the terminal device 10A. The storage control unit 52 registers the voice information D1, the position information D2, and the language information D4 received by the communication device 33 in the voice table Tb. FIG. 10 is a schematic diagram of the voice table Tb according to the third embodiment. As illustrated in FIG. 10, the voice information D1, the position information D2, and the language information D4 received from each terminal device 10A are associated with one another and registered as one record.

As in the first embodiment, the related information acquisition unit 53 of the third embodiment acquires the related information R from the voice information D1 transmitted from the terminal device 10A and transmits it to that terminal device 10A. The provided information generation unit 54 of the third embodiment generates the provided information Q by using the voice table Tb.

The provided information Q generated by the distribution device 30 of the third embodiment is described below. The third embodiment exemplifies the generation of provided information Q indicating the language used by users located near a specific sound emission position. When many users of a specific language (that is, terminal devices 10A in which that language is set) are located near a sound emission position, language information D4 commonly associated with the position information D2 indicating that sound emission position is registered in the voice table Tb in a concentrated manner within a short time. Accordingly, by calculating, for each language, the registration count N of records that include the position information D2 indicating a specific sound emission position and the language information D4 indicating that language, the language used by many people at that sound emission position can be estimated.

On the premise of the above circumstances, the provided information generation unit 54 of the third embodiment generates provided information Q indicating the language used by many people at a specific sound emission position by using the plural pieces of voice information D1 registered in the voice table Tb. For example, the provided information generation unit 54 identifies, from the voice table Tb, the records that include position information D2 indicating a specific sound emission position. Next, the provided information generation unit 54 generates the provided information Q by classifying the identified records by language using the language indicated by the language information D4 of each record. For example, the number of people using each language at the sound emission position indicated by the position information D2 of the records is estimated according to the registration count N of the classified records, and provided information Q indicating the language with the largest estimated number of users is generated. That is, provided information Q indicating the language presumed to be most used at the specific sound emission position is generated. The language information D4 is information used to generate the provided information Q. Provided information Q indicating a plurality of languages that rank highest when the numbers of classified records are sorted in descending order may also be generated.
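The per-language count can be sketched with a counter over the records for one sound emission position. The function name, the record layout `(D1, D2, D4)`, and the parameter `k` are assumptions for illustration; `k=1` corresponds to the single most used language and a larger `k` to the variant that reports several top-ranked languages.

```python
from collections import Counter


def top_languages(records, position, k=1):
    """Return the k languages with the most records at one sound emission position.

    records  -- iterable of (D1, D2, D4) tuples (hypothetical layout)
    position -- the D2 value identifying the sound emission position
    k        -- how many top-ranked languages to report
    """
    counts = Counter(d4 for (_d1, d2, d4) in records if d2 == position)
    # most_common sorts by count in descending order.
    return [language for language, _n in counts.most_common(k)]
```

An empty list is returned when no record exists for the position, which a caller could treat as "no estimate available".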

The third embodiment exemplifies, as the terminal device 10B, a display terminal for guidance such as an electronic bulletin board installed in a transportation facility or an electronic signboard (digital signage) installed in a commercial facility. Among the plurality of terminal devices 10B registered in the terminal table Tc, the distribution device 30 transmits the provided information Q to a terminal device 10B located near the sound emission position indicated by the position information D2 that corresponds to the language information D4 used to generate the provided information Q (that is, a sound emission position near which many users of the language indicated by the provided information Q are located). The terminal device 10B displays various kinds of information in, for example, the language indicated by the provided information Q transmitted from the distribution device 30. That is, information is displayed in the language used by many people near the sound emission position.
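Selecting the destination terminals can be sketched as a lookup in the terminal table Tc. This sketch assumes that Tc records a position identifier for each terminal device 10B and treats "near" as an exact match on that identifier; neither detail is stated in the description.

```python
def dispatch_language(terminal_table, position, language):
    """Return {terminal_id: language} for the terminals 10B registered at `position`.

    terminal_table -- hypothetical form of the terminal table Tc: terminal id -> position id
    position       -- the D2 value for which the language was estimated
    language       -- the language indicated by the provided information Q
    """
    return {tid: language for tid, pos in terminal_table.items() if pos == position}
```

A real deployment would presumably need a notion of distance between a display terminal and a sound emission position; the exact-match rule here is only the simplest placeholder.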

In the third embodiment the terminal device 10A generates the language information D4, but the sound emitting device 20 may instead transmit language information D4 to the terminal device 10A together with the voice information D1 and the position information D2. For example, information indicating the language of the guidance voice V is transmitted to the terminal device 10A as the language information D4. The terminal device 10A transmits the language information D4 received from the sound emitting device 20 to the distribution device 30. This configuration also makes it possible to collect information on which language of guidance voice V is being emitted from the sound emitting device 20.

As understood from the above description, the provided information generation unit 54 of the third embodiment generates the provided information Q by using, among the plural pieces of voice information D1 registered in the voice table Tb, two or more pieces of voice information D1 associated with language information D4 indicating a specific language. As in the first embodiment, the plural pieces of voice information D1 stored in the storage device 32 can be used for various purposes (for example, generation of the provided information Q). In the third embodiment in particular, the voice information D1 and the language information D4 received from each terminal device 10A are stored in the storage device 32 in association with each other, so that the language information D4 can be used for various purposes.

<Modifications>
Specific modifications that may be added to each of the aspects exemplified above are described below. Two or more aspects freely selected from the following examples may be combined as appropriate to the extent that they do not contradict one another.

(1) In each of the above embodiments, the provided information Q is generated by using the plural pieces of voice information D1 stored in the storage device 32 (voice table Tb), but the plural pieces of voice information D1 stored in the storage device 32 can be used for various purposes other than generating the provided information Q. For example, the plural pieces of voice information D1 stored in the storage device 32, or their content, may be statistically analyzed and used for various kinds of business (for example, marketing). Information other than the voice information D1 stored in the storage device 32 (for example, the position information D2, the time information D3, or the language information D4) may also be used for the analysis. The voice table Tb may also be provided as a searchable database. As understood from the above description, the generation of the provided information Q may be omitted.

(2) In each of the above embodiments, the sound emitting device 20 emits the guidance voice V, but the sound emitted by the sound emitting device 20 is not limited to the guidance voice V. That is, the sound picked up by the terminal device 10A may be a sound other than the guidance voice V. For example, the terminal device 10A may pick up a sound such as a musical sound (the performance sound of a piece of music) or an alarm sound and generate voice information D1 indicating that sound. The guidance voice V is one example of such sound.

(3) In each of the above embodiments, an identifier that identifies the guidance voice V is exemplified as the voice information D1, but the voice information D1 is not limited to this example. For example, a character string representing the utterance content of the guidance voice V may be used as the voice information D1. In that case, the information extraction unit 41 of the terminal device 10A identifies the character string representing the utterance content of the guidance voice V as the voice information D1 by speech recognition of the acoustic signal Y generated by the sound collecting device 14. Any known technique may be adopted for the speech recognition of the acoustic signal Y, such as recognition processing that uses an acoustic model such as an HMM (Hidden Markov Model) together with a language model representing linguistic constraints. In a configuration in which the terminal device 10A generates the voice information D1 by speech recognition, transmission of the voice information D1 by the sound emitting device 20 is omitted. The terminal device 10A receives, from the distribution device 30, the related information R corresponding to the character string (voice information D1) representing the utterance content of the guidance voice V. The voice information D1 may also be information indicating the location of information (for example, a URL). As understood from the above examples, the voice information D1 is comprehensively expressed as information indicating the sound picked up by the terminal device 10A. Similarly, the position information D2 is not limited to an identifier that identifies the sound emission position. For example, the sound emitting device 20 may transmit to the terminal device 10A, as the position information D2, a character string indicating the sound emission position (for example, the name of a facility) or information indicating the location of content that indicates the sound emission position (for example, a URL).

(4) The position information D2 and the time information D3 exemplified in each of the above-described embodiments are comprehensively expressed as situation information indicating the situation in which the terminal device 10A picked up the sound. That is, the position at which the guidance voice V was picked up and the time at which it was picked up are examples of the sound collection situation. The situation information is not limited to the position information D2 and the time information D3. Any information generated or acquired by the terminal device 10A when the guidance voice V is picked up may serve as situation information: for example, an image captured by the terminal device 10A, a position acquired by using a positioning satellite (for example, a GPS satellite), a moving speed, the usage status of an application, a web browser's browsing history, or information delivered by push notification.

(5) In the first and second embodiments, the terminal device 10A transmits the voice information D1 and the situation information to the distribution device 30, and in the third embodiment, the terminal device 10A transmits the voice information D1, the situation information, and the language information D4 to the distribution device 30. However, the transmission of information other than the voice information D1 by the terminal device 10A may be omitted.

The information that the terminal device 10A adds to the voice information D1 and transmits is not limited to the situation information and the language information D4. For example, a terminal device 10A that picks up a guidance voice V announcing an emergency (for example, a fire) may also pick up sounds other than the guidance voice V (for example, a siren) together with it. In that situation, the terminal device 10A may add, to the voice information D1, information indicating the classification of the sound picked up together with the guidance voice V (hereinafter "classification information") and transmit it to the distribution device 30. Examples include classification information indicating a siren that announces an emergency, and classification information indicating an abnormal sound such as an explosion or an impact. Any known technique may be adopted to generate the classification information; for example, it is generated by analyzing the acoustic signal Y. The terminal device 10A transmits the voice information D1 and the classification information to the distribution device 30, and the distribution device 30 (storage control unit 52) stores the received voice information D1 and classification information in the storage device 32 in association with each other. Alternatively, the terminal device 10A may add the acoustic signal Y, which contains sounds other than the guidance voice V, to the voice information D1 and transmit it to the distribution device 30, and the distribution device 30 may generate the classification information by analyzing that acoustic signal Y. Further, information about the sound emitting device 20 that emitted the guidance voice V picked up by the terminal device 10A (for example, identification information) may be added to the voice information D1 and transmitted from the terminal device 10A.
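The two paths described above (the terminal sends classification information, or the distribution side derives it from the acoustic signal Y) can be sketched as follows. This is only an illustrative sketch in Python under stated assumptions: `VoiceStore`, `StoredEntry`, and `toy_classifier` are hypothetical names that do not appear in the source, and the peak-amplitude rule merely stands in for whatever known analysis technique is actually used.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class StoredEntry:
    voice_info: str       # voice information D1
    classification: str   # classification information for the co-recorded sound


class VoiceStore:
    """Hypothetical stand-in for the storage control unit 52 and storage device 32."""

    def __init__(self, classifier: Callable[[Sequence[float]], str]) -> None:
        self._classifier = classifier
        self.entries: List[StoredEntry] = []

    def receive(
        self,
        voice_info: str,
        classification: Optional[str] = None,
        acoustic_signal: Optional[Sequence[float]] = None,
    ) -> StoredEntry:
        # Path 1: the terminal already attached classification information.
        # Path 2: only the acoustic signal arrived, so classify it here.
        if classification is None:
            if acoustic_signal is None:
                raise ValueError("need classification or acoustic_signal")
            classification = self._classifier(acoustic_signal)
        entry = StoredEntry(voice_info, classification)
        self.entries.append(entry)  # stored in association with each other
        return entry


def toy_classifier(signal: Sequence[float]) -> str:
    # Placeholder analysis (an assumption): a very loud peak counts as abnormal.
    peak = max((abs(s) for s in signal), default=0.0)
    return "abnormal_sound" if peak > 0.9 else "other"


store = VoiceStore(toy_classifier)
from_terminal = store.receive("V-fire-alert", classification="siren")
from_signal = store.receive("V-fire-alert", acoustic_signal=[0.1, 0.95, -0.2])
```

Passing the classifier in as a callable keeps the sketch neutral about where and how the analysis is performed.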

(6) In the first and second embodiments, a plurality of pieces of voice information D1 are used to generate the provided information Q, but the provided information Q may be generated without using them. For example, the provided information Q may be generated using only the situation information. As one example, provided information Q indicating the degree of congestion at each sound emission position may be generated according to the number of pieces of position information D2 registered in the voice table Tb for that sound emission position.
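The congestion example can be illustrated with a short Python sketch. The row layout, the `busy_threshold` parameter, and the "high"/"low" labels are assumptions made for illustration; the source only states that the degree of congestion follows from the number of registered pieces of position information D2 per sound emission position.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def congestion_by_position(
    voice_table: Iterable[Mapping[str, str]], busy_threshold: int = 3
) -> Dict[str, Dict[str, object]]:
    """Count registered position information per sound emission position."""
    counts = Counter(row["position"] for row in voice_table)
    return {
        position: {
            "count": n,
            "level": "high" if n >= busy_threshold else "low",
        }
        for position, n in counts.items()
    }


# Hypothetical contents of the voice table Tb (only the position column is used).
table = [
    {"voice": "V1", "position": "gate_a"},
    {"voice": "V1", "position": "gate_a"},
    {"voice": "V2", "position": "gate_a"},
    {"voice": "V3", "position": "gate_b"},
]
provided_info = congestion_by_position(table)
```

Note that the voice column is never read, which mirrors the point of this modification: the provided information can be derived from situation information alone.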

(7) Because the voice information D1 can be received only at the specific sound emission position where the corresponding guidance voice V can be picked up, the voice information D1 can also be regarded as information indicating the sound emission position. Therefore, in generating the provided information Q exemplified in each of the above-described embodiments, it is not essential to use the position information D2 transmitted from the plurality of terminal devices 10A. In other words, the transmission of the position information D2 by the sound emitting device 20 may be omitted.

The terminal device 10A may also acquire accurate position information D2 of the terminal device 10A by receiving radio waves from a positioning satellite (for example, a GPS satellite), add that position information D2 to the voice information D1, and transmit it. Both the position information D2 determined from the satellite radio waves and the position information D2 received from the sound emitting device 20 by acoustic communication may be registered in the voice table Tb. The two differ in meaning: the position information D2 determined from satellite radio waves indicates the absolute position of the terminal device 10A, whereas the position information D2 received by acoustic communication indicates the sound emission position. For example, when the terminal device 10A is inside a moving body such as a train, the absolute position indicated by the satellite-derived position information D2 changes as the moving body moves, but the sound emission position indicated by the position information D2 received by acoustic communication does not change.

High-precision position information D2 (hereinafter "high-precision position information") generated by the telecommunications carrier that manages the communication network 70 to which the terminal device 10A connects may also be used to generate the provided information Q. For example, high-precision position information corresponding to the position of the terminal device 10A at the time indicated by the time information D3 added to the voice information D1 is registered in the voice table Tb in association with that voice information D1. With this configuration, the high-precision position information generated by the telecommunications carrier can be used in utilizing the plurality of pieces of voice information D1. As this description shows, the transmission of the position information D2 to the distribution device 30 may be omitted in each of the above-described embodiments. The provided information Q may also be generated according to the correspondence between the high-precision position information and the voice table Tb. For example, the provided information Q may be generated from the voice table Tb while taking into account the movement tendencies of many terminal devices 10A indicated by the high-precision position information.

(8) In each of the above-described embodiments, the provided information Q is generated by using a plurality of pieces of voice information D1, but doing so is not essential. That is, the provided information Q may be generated by using a single piece of voice information D1. For example, when voice information D1 of a guidance voice V announcing an emergency is transmitted from the terminal device 10A to the distribution device 30, that voice information D1 may be used to generate provided information Q announcing the emergency. In this configuration, the distribution device 30 functions as an element that transmits, to the terminal device 10B, the provided information Q generated by using that voice information D1. The terminal device 10A and the terminal device 10B may be the same device or different devices.

(9) In each of the above-described embodiments, the distribution device 30 transmits the related information R to the terminal device 10A, but this transmission may be omitted. In that case, the terminal device 10A is used as an information terminal for collecting the voice information D1 and transmitting it to the distribution device 30.

(10) In each of the above-described embodiments, the provided information Q is transmitted to the terminal device 10B registered in the terminal table Tc, but the provided information Q may instead be transmitted to the terminal device 10A. The provided information Q may also be supplied to the operator of the facility P in which the sound emitting device 20 is installed. For example, the provided information generation unit 54 generates provided information Q that reports an increasing trend in the voice information D1 indicating a specific guidance voice V (for example, the guidance voice V emitted at the facility P). Specifically, the provided information Q is generated when, among the plurality of pieces of voice information D1 registered in the voice table Tb, the voice information D1 associated with the position information D2 indicating the facility P is increasing over a short period (for example, when the number of registrations exceeds a predetermined threshold). The generated provided information Q reports, for example, that the number of registrations of the voice information D1 indicating the specific guidance voice V has increased, or the number of registrations itself. The generated provided information Q is transmitted to the information terminal of a user of the facility P. The user's information terminal is, for example, a control device for controlling the sound emitting device 20. The information terminal estimates the congestion situation of the facility P (for example, the degree of congestion) from the increasing trend in the voice information D1 indicated by the provided information Q received from the distribution device 30. Based on the estimated congestion situation, the information terminal then causes the sound emitting device 20 to emit, at a predetermined timing, a guidance voice V that guides users of the facility P to a less crowded place.
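The threshold check described in this modification can be sketched in Python as follows. The ten-minute window, the threshold value, the row layout, and the function names are all assumptions for illustration; the source only says that the provided information Q is generated when registrations for the facility increase in a short time, for example when their number exceeds a predetermined threshold.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Mapping, Optional


def registrations_in_window(
    voice_table: Iterable[Mapping[str, object]],
    position: str,
    now: datetime,
    window: timedelta,
) -> int:
    """Count rows for one position whose time falls within [now - window, now]."""
    start = now - window
    return sum(
        1
        for row in voice_table
        if row["position"] == position and start <= row["time"] <= now
    )


def make_provided_info(
    voice_table: Iterable[Mapping[str, object]],
    position: str,
    now: datetime,
    window: timedelta = timedelta(minutes=10),
    threshold: int = 3,
) -> Optional[Dict[str, object]]:
    """Return provided information only when registrations exceed the threshold."""
    n = registrations_in_window(voice_table, position, now, window)
    if n > threshold:
        return {"position": position, "registrations": n, "trend": "increasing"}
    return None


now = datetime(2019, 10, 3, 12, 0)
rows = [
    {"voice": "V1", "position": "facility_P", "time": datetime(2019, 10, 3, 11, 30)},
    {"voice": "V1", "position": "facility_P", "time": datetime(2019, 10, 3, 11, 51)},
    {"voice": "V1", "position": "facility_P", "time": datetime(2019, 10, 3, 11, 53)},
    {"voice": "V1", "position": "facility_P", "time": datetime(2019, 10, 3, 11, 55)},
    {"voice": "V1", "position": "facility_P", "time": datetime(2019, 10, 3, 11, 58)},
    {"voice": "V2", "position": "elsewhere", "time": datetime(2019, 10, 3, 11, 59)},
]
info = make_provided_info(rows, "facility_P", now)
quiet = make_provided_info(rows, "facility_P", now, threshold=4)
```

A receiving control device could then choose which guidance voice to emit based on `info["registrations"]`; that step is left out because the source does not specify how congestion is estimated.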

(11) In each of the above-described embodiments, the voice information D1 that each terminal device 10A transmits in order to acquire the related information R is stored in the storage device 32 (voice table Tb). However, voice information D1 transmitted at a timing different from that of the voice information D1 transmitted to acquire the related information R may be stored in the storage device 32 instead. In that case, on picking up the guidance voice V, the terminal device 10A transmits the voice information D1 to the distribution device 30 in order to acquire the related information R. After that, the terminal device 10A transmits to the distribution device 30 the voice information D1 (and other information) to be stored in the storage device 32 (voice table Tb). For example, a plurality of pieces of voice information D1 may be transmitted together at a preset time.

(12) In each of the above-described embodiments, the voice information D1 transmitted from each terminal device 10A is stored in the distribution device 30 (voice table Tb), but the voice table Tb (voice information D1) may instead be stored in an information processing device separate from the distribution device 30. Each terminal device 10A transmits the voice information D1 to both the distribution device 30 and the information processing device. The distribution device 30 stores the related table Ta and transmits the related information R corresponding to the voice information D1 to the terminal device 10A. The storage device of the information processing device, on the other hand, stores the voice table Tb containing the plurality of pieces of voice information D1 received from the terminal devices 10A. The provided information Q is transmitted from the information processing device to the terminal device 10B. In this configuration, the timing at which the voice information D1 is transmitted to the information processing device is arbitrary. For example, it may be transmitted at the same time as the voice information D1 is transmitted to the distribution device 30, or a plurality of pieces of voice information D1 may be transmitted together at a preset time. Information other than the voice information D1 (for example, the situation information or the language information D4) may be transmitted to the information processing device together with the voice information D1.

(13) In each of the above-described embodiments, the related information acquisition unit 53 acquires the related information R corresponding to the voice information D1 from the related table Ta, but the related information acquisition unit 53 may generate the related information R instead. For example, from voice information D1 that is a character string representing the utterance content of the guidance voice V, related information R corresponding to that character string is generated. That is, it is not essential to store the related table Ta in the storage device 32.

(14) The voice table Tb exemplified in each of the above-described embodiments can also be specified as a data structure. That is, the voice table Tb is a data structure containing a plurality of pieces of voice information D1, each of which indicates the content of a sound emitted from the sound emitting device 20 and picked up by a terminal device 10A, and it is used in processing in which the distribution device 30 (an example of an information processing device) generates the provided information Q by using the plurality of pieces of voice information D1 contained in the data structure.
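As a rough illustration of the voice table Tb viewed as a data structure, the Python sketch below models one row per received piece of voice information D1, with the optional fields mentioned in the embodiments (position information D2, time information D3, language information D4). The class names, field names, and the "most frequently reported voice" summary are assumptions chosen for the example, not something the source prescribes.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class VoiceRecord:
    voice_info: str                      # D1: identifies or describes the sound
    position_info: Optional[str] = None  # D2: sound emission position
    time_info: Optional[str] = None      # D3: time of sound collection
    language_info: Optional[str] = None  # D4: language set on the terminal


@dataclass
class VoiceTable:
    records: List[VoiceRecord] = field(default_factory=list)

    def register(self, record: VoiceRecord) -> None:
        """Add one piece of voice information received from a terminal."""
        self.records.append(record)

    def most_reported(self) -> Optional[Tuple[str, int]]:
        """One possible form of provided information: the most frequent D1."""
        counts = Counter(r.voice_info for r in self.records)
        top = counts.most_common(1)
        return top[0] if top else None


tb = VoiceTable()
tb.register(VoiceRecord("V1", position_info="station_x", language_info="ja"))
tb.register(VoiceRecord("V1", position_info="station_x", language_info="en"))
tb.register(VoiceRecord("V2", position_info="station_y"))
summary = tb.most_reported()
```

Only `voice_info` is mandatory here, which reflects modification (5): information other than the voice information D1 may be omitted.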

(15) The program that causes the control device 11 of the terminal device 10A to function as the information extraction unit 41 and the reproduction control unit 42 may be provided as stand-alone application software or, for example, as plug-in software for various application software (for example, a browser) used on the terminal device 10A.

(16) The functions of the distribution device 30 or the terminal device 10 (10A, 10B) according to each of the above-described embodiments are realized, as exemplified in each embodiment, by cooperation between the control device 31 and a program. The program according to each of the above-described embodiments may be provided in a form stored in a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium, a good example of which is an optical recording medium (optical disc) such as a CD-ROM, but it also encompasses any known type of recording medium such as a semiconductor recording medium or a magnetic recording medium. A non-transitory recording medium here includes any recording medium other than a transitory, propagating signal, and volatile recording media are not excluded. The program may also be provided to a computer by distribution over a communication network.

<Additional Notes>
From the embodiments exemplified above, the following configurations, for example, can be derived.

An information processing method according to a preferred aspect (first aspect) of the present invention receives, from a terminal device, voice information indicating a sound emitted from a sound emitting device and picked up by that terminal device, and stores in a storage device a plurality of pieces of voice information respectively received from a plurality of terminal devices. In this aspect, because the plurality of pieces of voice information received from the plurality of terminal devices are stored in the storage device, they can be utilized for various purposes.

In a preferred example of the first aspect (second aspect), provided information is generated by using the plurality of pieces of voice information stored in the storage device, and the provided information is transmitted to a terminal device. In this aspect, because the provided information generated from the stored voice information is transmitted to a terminal device, the voice information transmitted from the plurality of terminal devices can be utilized to generate the provided information.

In a preferred example of the second aspect (third aspect), the provided information is information according to the content of the sounds indicated by the plurality of pieces of voice information. In this aspect, provided information according to the content of the sounds picked up by the plurality of terminal devices can be generated.

In a preferred example of the second or third aspect (fourth aspect), situation information indicating the situation of the sound collection by the terminal device is received from that terminal device, the voice information and the situation information received from each terminal device are stored in the storage device in association with each other, and the provided information is generated by using, among the plurality of pieces of voice information stored in the storage device, two or more pieces of voice information associated with situation information indicating a specific situation. In this aspect, the provided information can be generated from two or more pieces of voice information, each indicating a sound presumed to have been picked up in the specific situation.

In a preferred example of the fourth aspect (fifth aspect), the situation information includes position information indicating, as the situation, the position of the terminal device when the sound was picked up, and the provided information is generated by using, among the plurality of pieces of voice information stored in the storage device, two or more pieces of voice information associated with position information indicating a specific position. In this aspect, the provided information can be generated from two or more pieces of voice information, each indicating a sound presumed to have been picked up at the specific position.

In a preferred example of the fourth or fifth aspect (sixth aspect), the situation information includes time information indicating, as the situation, the time at which the sound was picked up, and the provided information is generated by using, among the plurality of pieces of voice information stored in the storage device, two or more pieces of voice information associated with time information indicating a specific time. In this aspect, the provided information can be generated from two or more pieces of voice information, each indicating a sound presumed to have been picked up at the specific time.
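The selection step shared by the fourth to sixth aspects (pick the stored voice information whose situation information matches a specific position and/or time, then generate provided information from two or more matches) might look like the following Python sketch. The `Row` type, the treatment of a "specific time" as a closed interval, and the shape of the returned dictionary are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Row(NamedTuple):
    voice_info: str    # voice information D1
    position: str      # position information D2
    time: datetime     # time information D3


def select_rows(
    table: Iterable[Row],
    position: Optional[str] = None,
    time_range: Optional[Tuple[datetime, datetime]] = None,
) -> List[Row]:
    """Keep rows whose situation information matches the given conditions."""
    selected = []
    for row in table:
        if position is not None and row.position != position:
            continue
        if time_range is not None and not (time_range[0] <= row.time <= time_range[1]):
            continue
        selected.append(row)
    return selected


def generate_provided_info(rows: List[Row]) -> Optional[Dict[str, object]]:
    """Summarise the selected rows; require two or more, as in the aspects."""
    if len(rows) < 2:
        return None
    return {"matched": len(rows), "voices": dict(Counter(r.voice_info for r in rows))}


day = (2019, 10, 3)
table = [
    Row("delay_notice", "station_x", datetime(*day, 9, 5)),
    Row("delay_notice", "station_x", datetime(*day, 9, 10)),
    Row("platform_change", "station_x", datetime(*day, 9, 20)),
    Row("delay_notice", "station_y", datetime(*day, 9, 7)),
    Row("delay_notice", "station_x", datetime(*day, 15, 0)),
]
morning = (datetime(*day, 9, 0), datetime(*day, 10, 0))
info_x = generate_provided_info(select_rows(table, "station_x", morning))
info_y = generate_provided_info(select_rows(table, "station_y", morning))
```

Leaving either condition as `None` gives the position-only (fifth aspect) or time-only (sixth aspect) variant.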

In a preferred example of any of the first to sixth aspects (seventh aspect), language information indicating the language set on the terminal device is received from that terminal device, and the voice information and the language information received from each terminal device are stored in the storage device in association with each other. In this aspect, because the language information and the voice information received from each terminal device are stored in association with each other, the language information indicating the language set on the terminal device that picked up the guidance voice can be utilized for various purposes.

A data structure according to a preferred aspect (eighth aspect) of the present invention is a data structure containing a plurality of pieces of voice information, each indicating the content of a sound emitted from a sound emitting device and picked up by a terminal device, and is used in processing in which an information processing device generates provided information by using the plurality of pieces of voice information. In this aspect, because the plurality of pieces of voice information contained in the data structure are used to generate the provided information, the voice information transmitted from a plurality of terminal devices can be utilized to generate the provided information.

100 ... Information providing system, 10 ... Terminal device, 20 ... Sound emitting device, 30 ... Distribution device, 11 ... Control device, 12 ... Storage device, 13 ... Communication device, 14 ... Sound collecting device, 15 ... Playback device, 31 ... Control device, 32 ... Storage device, 33 ... Communication device, 41 ... Information extraction unit, 42 ... Reproduction control unit, 51 ... Communication control unit, 52 ... Storage control unit, 53 ... Related information acquisition unit, 54 ... Provided information generation unit, 70 ... Communication network.

Claims

An information processing method realized by a computer, the method comprising: receiving, from each of a plurality of terminal devices, voice information indicating a sound emitted by a sound emitting device and picked up by the terminal device; and
storing the voice information received from each of the plurality of terminal devices in a storage device in association with classification information indicating the classification of a sound picked up by the terminal device together with the sound indicated by the voice information.

The information processing method according to claim 1, wherein provided information is generated by using a plurality of pieces of voice information stored in the storage device, and
the provided information is transmitted to a terminal device.

The information processing method according to claim 1 or claim 2, wherein the classification information is received from each of the plurality of terminal devices, and
in the storing of the voice information, the voice information received from each of the plurality of terminal devices is stored in the storage device in association with the classification information received from that terminal device.

The information processing method according to claim 1 or claim 2, wherein an acoustic signal representing the sound picked up by the terminal device is received from each of the plurality of terminal devices, and
the classification information is generated by analyzing the acoustic signal received from the terminal device.

An information processing device comprising: a communication control unit that causes a communication device to execute an operation of receiving, from each of a plurality of terminal devices, voice information indicating a sound emitted by a sound emitting device and picked up by the terminal device; and
a storage control unit that stores the voice information received from each of the plurality of terminal devices in a storage device in association with classification information indicating the classification of a sound picked up by the terminal device together with the sound indicated by the voice information.