JP6045511B2 - Acoustic signal detection system, acoustic signal detection method, acoustic signal detection server, acoustic signal detection apparatus, and acoustic signal detection program

Info

Publication number: JP6045511B2
Application number: JP2014001511A
Authority: JP
Inventors: 裕子石若
Original assignee: PS Solutions Corp
Current assignee: PS Solutions Corp
Priority date: 2014-01-08
Filing date: 2014-01-08
Publication date: 2016-12-14
Anticipated expiration: 2034-01-08
Also published as: JP2015129868A

Description

The present invention relates to an acoustic signal detection system, an acoustic signal detection method, an acoustic signal detection server, an acoustic signal detection device, and an acoustic signal detection program for detecting a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation.

In recent years, techniques have become known for detecting, from an acoustic signal in which multiple types of acoustic signals are mixed, an acoustic signal with a harmonic structure, such as a specific person's voice or the sound of a musical instrument (see, for example, Patent Document 1). The speech recognition apparatus disclosed in Patent Document 1 detects the target acoustic signal by applying blind source separation to signals (sound, images, radio waves, and the like) captured in a real environment, where background noise is present and the sources are convolutively mixed, and thereby separating out the source signals.

Patent Document 1: JP 2003-78423 A

However, sound source separation of the kind disclosed in Patent Document 1 requires complex arithmetic processing, such as blind source separation, to split a noisy recording into signal and noise. Because information such as the positional relationship between the sound source and the microphone is unknown, and because audio delays and the like must be taken into account, blind source separation also involves steps such as restoring the frequency characteristics through filtering as preprocessing and inverse filtering as postprocessing. Conventional sound source separation is therefore computationally complex and imposes a heavy processing load: detecting the target specific acoustic signal takes time, and the load of real-time processing becomes excessive. As a result, it is poorly suited to portable devices such as mobile terminals.

The present invention solves the problems described above. Its object is to provide an acoustic signal detection system, an acoustic signal detection method, an acoustic signal detection server, an acoustic signal detection device, and an acoustic signal detection program that can detect a target specific acoustic signal in real time with a reduced computational load even in noisy conditions, and that can diversify and enrich services by providing services triggered by that specific acoustic signal.

To solve the above problems, the present invention provides an acoustic signal detection system that detects a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation, the system comprising:
an evaluation target conversion unit that acquires the evaluation acoustic signal, extracts from it the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generates a comparison target template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
a sound source template generation unit that, on the basis of the specific acoustic signal, extracts the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generates a sound source template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
a sound source template acquisition unit that acquires the sound source template;
a template comparison unit that compares the comparison target template with the sound source template and calculates a matching rate between the binarized acoustic signal distributions;
a matching rate output unit that outputs a comparison result signal corresponding to the matching rate calculated by the template comparison unit;
an event database that stores, in association with each other, a template identifier identifying a sound source template generated by the sound source template generation unit and an event identifier identifying an event to be generated; and
an event generation management unit that, on the basis of the comparison result signal output from the matching rate output unit, looks up the event database for the template identifier of the sound source template to which the comparison result signal relates, selects the event to be generated, and generates the selected event.
When generating the sound source template for the specific acoustic signal, the sound source template generation unit detects the signal intensity at each scale note and extracts the presence or absence of an acoustic signal according to whether the detected acoustic signal exceeds a predetermined threshold, and the threshold is set to a different value depending on the type of event to be generated.
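The template generation described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the Goertzel recurrence is swapped in here as one convenient way to measure the signal intensity at each note frequency, and the names `goertzel_power` and `make_template`, the 440 Hz reference pitch, the MIDI-style note numbering, and the normalization are all assumptions made for illustration.

```python
import math

def goertzel_power(samples, sample_rate, freq_hz):
    """Squared magnitude of the signal component at freq_hz (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def make_template(samples, sample_rate, notes, frame_len, threshold):
    """Return one tuple of 0/1 flags per frame, one flag per note in `notes`.

    `notes` holds MIDI-style note numbers (assumption: 69 = A4 = 440 Hz).
    """
    template = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        row = []
        for note in notes:
            freq = 440.0 * 2.0 ** ((note - 69) / 12.0)
            # Normalize so that a full-scale sine at the note frequency gives about 1.0.
            level = goertzel_power(frame, sample_rate, freq) / (frame_len / 2.0) ** 2
            row.append(1 if level > threshold else 0)  # binarize against the threshold
        template.append(tuple(row))
    return template
```

The same function could serve for both the sound source template and the comparison target template, since both are described as binarized, per-note, time-series records; only the input signal and the threshold would differ.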

Another invention is an acoustic signal detection method for detecting a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation, the method comprising:
(1) an evaluation target conversion process of acquiring the evaluation acoustic signal, extracting from it the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generating a comparison target template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
(2) a template comparison process of extracting, for the specific acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, generating and acquiring a sound source template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series, and comparing the comparison target template with the sound source template to calculate a matching rate between the binarized acoustic signal distributions;
(3) a matching rate output process of outputting a comparison result signal corresponding to the matching rate calculated in the template comparison process;
(4) a database control process of storing in an event database, in association with each other, a template identifier identifying the sound source template generated in the template comparison process and an event identifier identifying an event to be generated; and
(5) an event generation management process of looking up the event database, on the basis of the comparison result signal output by the matching rate output process, for the template identifier of the sound source template to which the comparison result signal relates, selecting the event to be generated, and generating the selected event.
In the template comparison process, when the sound source template is generated for the specific acoustic signal, the signal intensity at each scale note is detected and the presence or absence of an acoustic signal is extracted according to whether the detected acoustic signal exceeds a predetermined threshold.
The threshold is set to a different value depending on the type of event to be generated.
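Steps (2) through (5) can be sketched as follows. The text does not define how the matching rate is computed, so this sketch assumes it is the fraction of note cells that agree between two equally sized binary templates, and it assumes the sound source template is slid along a longer comparison target template. `EVENT_DB`, the identifiers in it, and the `min_rate` cutoff are hypothetical.

```python
def matching_rate(source, target):
    """Fraction of cells that agree between two equally sized binary templates."""
    total = agree = 0
    for src_frame, tgt_frame in zip(source, target):
        for s, t in zip(src_frame, tgt_frame):
            total += 1
            agree += (s == t)
    return agree / total if total else 0.0

def best_match(source, stream):
    """Slide `source` along the longer `stream`; return (best rate, frame offset)."""
    best = (0.0, -1)
    for offset in range(len(stream) - len(source) + 1):
        rate = matching_rate(source, stream[offset:offset + len(source)])
        if rate > best[0]:
            best = (rate, offset)
    return best

# Hypothetical event database: template identifier -> event identifier.
EVENT_DB = {"tmpl-001": "issue_coupon"}

def select_event(template_id, rate, min_rate=0.8):
    """Return the event identifier to fire, or None if the match is too weak or unknown."""
    return EVENT_DB.get(template_id) if rate >= min_rate else None
```

Because both templates hold only 0/1 flags, the comparison reduces to counting equal cells, which is consistent with the low processing load the text attributes to template matching.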

According to the invention as described above, the specific acoustic signal is detected by comparing templates generated by extracting the presence or absence of a signal in each frequency band on the basis of the equal-temperament scale. Matching against the sound source can therefore be performed without complex arithmetic processing, the load of real-time processing is reduced by fast computation on downsized data, and implementation on portable devices becomes possible. Because the presence or absence of acoustic signals is detected using binarized templates, detection is also possible in noisy conditions. As a result, services triggered by the detection of a specific sound can be provided, such as distributing coupons or points when a theme song being played in a store is detected, and services can be diversified and enriched.
When the sound source template is generated, it is preferable to detect the presence or absence of the sound of each scale note after removing harmonic components.
When the comparison with the sound source template is made, the presence or absence of each scale note may instead be detected with the harmonic components left in, without removing them.
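The text does not say how harmonic components are removed. One simple heuristic, shown here purely as an assumption, is to drop any active note that lies on an integer harmonic (2x to 6x) of a lower note that has been kept. The function names and the choice of six harmonics are illustrative.

```python
import math

def harmonic_offsets(max_harmonic=6):
    """Semitone offsets of harmonics 2..max_harmonic, rounded to the nearest note."""
    return [round(12.0 * math.log2(h)) for h in range(2, max_harmonic + 1)]

def remove_harmonics(active_notes, max_harmonic=6):
    """Keep only notes that do not sit on a harmonic of a lower kept note."""
    offsets = harmonic_offsets(max_harmonic)
    kept = set()
    for note in sorted(active_notes):
        if not any((note - off) in kept for off in offsets):
            kept.add(note)
    return kept
```

A heuristic like this also discards notes that are genuinely played an octave or a twelfth above another note, which may be one reason the text allows harmonics to be left in when the comparison target template is evaluated.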

Here, "acquiring the evaluation acoustic signal" covers not only capturing sound directly as an electrical signal with an acoustic device such as a microphone, but also obtaining it as an audio signal or audio file over wireless or wired communication. The "sound source template" is generated by any device in the system; for example, the user may generate it on the user terminal, or a previously generated sound source template may be downloaded from a server. The "equal-temperament scale" covers various scales in addition to the 12-note scale made up of semitone steps. Further, "binarizing the presence or absence of an acoustic signal" means detecting only whether or not the sound of each scale note is present, for example by determining whether the acoustic signal exceeds a predetermined threshold.
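For the 12-note case, each semitone step multiplies the frequency by 2^(1/12). The short sketch below maps between note numbers and frequencies; the 440 Hz reference pitch and the MIDI-style numbering (69 = A4) are assumptions, since the text does not fix a reference.

```python
import math

A4_HZ = 440.0   # assumed reference pitch
A4_NOTE = 69    # assumed MIDI-style number for A4

def note_frequency(note: int) -> float:
    """Frequency in Hz of a 12-tone equal-temperament note."""
    return A4_HZ * 2.0 ** ((note - A4_NOTE) / 12.0)

def nearest_note(freq_hz: float) -> int:
    """Note number whose equal-temperament frequency is closest to freq_hz."""
    return round(A4_NOTE + 12.0 * math.log2(freq_hz / A4_HZ))
```

Restricting analysis to these note frequencies is what reduces a full spectrum to a small set of per-note flags.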

Here, an "event" is processing that occurs on condition that certain requirements are met, such as exception handling or branch processing in a program, and the systems and services provided through the processing triggered by such an event are also included in the term. For example, various information processing services performed when a specific sound source is detected, such as outputting a message by voice or text display, playing back images or video, or launching an application, are events in this sense.

In this case, by associating the sound source template used to detect a specific sound with an arbitrary event, an event can be generated that is triggered by detection of the specific sound at an information processing terminal or server. In addition, information on which user detected what sound and where can be collected, so that real user behavior can be studied and marketing can be enhanced through the construction of big data.

In this case, by creating sound source templates with different threshold values, different events can be generated depending on the type of sound source template even when the same specific sound is detected. For example, even if a family enters the same store together and detects the same specific sound, different events can be generated for the adults and the children, so that services tailored to user attributes such as age and gender can be provided.
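The effect of event-dependent thresholds can be sketched as follows: the same per-note intensity frames are binarized once per event type, and each threshold yields a different sound source template. The event names and threshold values in `THRESHOLDS` are hypothetical, as is the assumption that intensities are normalized to the range 0 to 1.

```python
# Hypothetical event types and thresholds (intensities assumed normalized to 0..1).
THRESHOLDS = {"adult_coupon": 0.10, "child_stamp": 0.40}

def binarize(levels, threshold):
    """Turn per-note intensities into 0/1 flags."""
    return tuple(1 if v > threshold else 0 for v in levels)

def templates_by_event(level_frames, thresholds=THRESHOLDS):
    """Build one binarized template per event type from the same intensity frames."""
    return {
        event: [binarize(frame, th) for frame in level_frames]
        for event, th in thresholds.items()
    }
```

Each resulting template would be registered in the event database under its own template identifier, so that a match against one template or the other selects the corresponding event.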

Further inventions are an acoustic signal detection server and an acoustic signal detection device that can be used with the acoustic signal detection system and acoustic signal detection method described above and that detect a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation.

Specifically, the acoustic signal detection server of the present invention is an acoustic signal detection server that detects a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation, the server comprising:
an evaluation target conversion unit that acquires the evaluation acoustic signal through a communication network, extracts from it the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generates a comparison target template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
a sound source template acquisition unit that, for the specific acoustic signal, extracts the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generates and acquires a sound source template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
a template comparison unit that compares the comparison target template with the sound source template and calculates a matching rate between the binarized acoustic signal distributions;
a matching rate output unit that sends a comparison result signal corresponding to the matching rate calculated by the template comparison unit to the communication network;
an event database that stores, in association with each other, a template identifier identifying a sound source template generated by the sound source template generation unit and an event identifier identifying an event to be generated; and
an event generation management unit that, on the basis of the comparison result signal output from the matching rate output unit, looks up the event database for the template identifier of the sound source template to which the comparison result signal relates, selects the event to be generated, and generates the selected event.
When generating the sound source template for the specific acoustic signal, the sound source template generation unit detects the signal intensity at each scale note and extracts the presence or absence of an acoustic signal according to whether the detected acoustic signal exceeds a predetermined threshold, and the threshold is set to a different value depending on the type of event to be generated.

According to this invention, the specific acoustic signal is detected by comparing templates generated by extracting the presence or absence of a signal in each frequency band on the basis of the equal-temperament scale. Matching against the sound source can therefore be performed without complex arithmetic processing, and real-time processing becomes possible through fast computation on downsized data. Because the presence or absence of acoustic signals is detected using binarized templates, detection is also possible in noisy conditions. As a result, services triggered by the detection of a specific sound can be provided, such as distributing coupons or points when a theme song being played in a store is detected, and services can be diversified and enriched. In particular, because the server on the communication network generates the comparison target template from the evaluation acoustic signal and calculates the template matching rate, the processing load on the user's terminal is reduced and its memory capacity can be used effectively.
When the sound source template is generated, it is preferable to detect the presence or absence of the sound of each scale note after removing harmonic components.
When the comparison with the sound source template is made, the presence or absence of each scale note may instead be detected with the harmonic components left in, without removing them.

In this case, by associating the sound source template used to detect a specific sound with an arbitrary event, an event can be generated that is triggered by detection of the specific sound at the server. In addition, information on which user detected what sound and where can be collected, so that real user behavior can be studied and marketing can be enhanced through the construction of big data.

The acoustic signal detection device of the invention, for its part, is an acoustic signal detection device that detects a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation, the device comprising:
a sound acquisition unit that acquires sound and outputs it as the evaluation acoustic signal;
an evaluation target conversion unit that, for the evaluation acoustic signal output from the sound acquisition unit, extracts the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generates a comparison target template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
a sound source template generation unit that, on the basis of the specific acoustic signal, extracts the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generates a sound source template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
a sound source template acquisition unit that acquires the sound source template;
a template comparison unit that compares the comparison target template with the sound source template and calculates a matching rate between the binarized acoustic signal distributions;
a matching rate output unit that outputs a comparison result signal corresponding to the matching rate calculated by the template comparison unit;
an event database that stores, in association with each other, a template identifier identifying a sound source template generated by the sound source template generation unit and an event identifier identifying an event to be generated; and
an event generation management unit that, on the basis of the comparison result signal output from the matching rate output unit, looks up the event database for the template identifier of the sound source template to which the comparison result signal relates, selects the event to be generated, and generates the selected event.
When generating the sound source template for the specific acoustic signal, the sound source template generation unit detects the signal intensity at each scale note and extracts the presence or absence of an acoustic signal according to whether the detected acoustic signal exceeds a predetermined threshold, and the threshold is set to a different value depending on the type of event to be generated.

According to this invention, the specific acoustic signal is detected by comparing templates generated by extracting the presence or absence of a signal in each frequency band on the basis of the equal-temperament scale. Matching against the sound source can therefore be performed without complex arithmetic processing, and real-time processing becomes possible through fast computation on downsized data. Because the presence or absence of acoustic signals is detected using binarized templates, detection is also possible in noisy conditions. As a result, services triggered by the detection of a specific sound can be provided, such as distributing coupons or points when a theme song being played in a store is detected, and services can be diversified and enriched. In particular, because the acoustic signal detection device itself takes the lead in detecting the specific acoustic signal and calculating the comparison result, it does not depend on a server device or the like on the communication network, and stand-alone operation of the device makes real-time processing still more achievable.
When the sound source template is generated, it is preferable to detect the presence or absence of the sound of each scale note after removing harmonic components.
When the comparison with the sound source template is made, the presence or absence of each scale note may instead be detected with the harmonic components left in, without removing them.
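One way the binarized templates might keep stand-alone matching cheap on a device is bit packing, sketched below. This is an illustrative assumption, not something the text specifies: each frame of 0/1 note flags is packed into a single integer, and mismatches are counted with XOR and a bit count. The function names are hypothetical, and the matching rate is again assumed to be the share of agreeing cells.

```python
def pack_frame(flags):
    """Pack a sequence of 0/1 note flags into one integer, first flag in bit 0."""
    value = 0
    for bit, flag in enumerate(flags):
        if flag:
            value |= 1 << bit
    return value

def packed_matching_rate(source, target, notes_per_frame):
    """Share of agreeing cells between two equally long lists of packed frames."""
    if not source or len(source) != len(target):
        raise ValueError("templates must be non-empty and equally long")
    # XOR leaves a 1 bit wherever the two frames disagree.
    mismatches = sum(bin(s ^ t).count("1") for s, t in zip(source, target))
    return 1.0 - mismatches / (len(source) * notes_per_frame)
```

With 88 notes per frame, for example, each frame fits in a single Python integer (or two 64-bit words in a lower-level language), which illustrates how small the downsized data can be.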

In this case, by associating the sound source template used to detect a specific sound with an arbitrary event, an event can be generated that is triggered by detection of the specific sound at the information processing terminal. In addition, information on which user detected what sound and where can be collected, so that real user behavior can be studied and marketing can be enhanced through the construction of big data.

The system, method, server, and device according to the present invention described above can be realized by executing, on a computer, a program written in a predetermined language.

That is, the present invention is an acoustic signal detection program for detecting, by means of a computer, a specific acoustic signal contained in an evaluation acoustic signal that is the subject of evaluation, the program causing the computer to execute processing comprising:
(1) an evaluation target conversion step of acquiring the evaluation acoustic signal, extracting from it the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, and generating a comparison target template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series;
(2) a template comparison step of extracting, for the specific acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-temperament scale, generating and acquiring a sound source template in which the distribution of the acoustic signal and its change over time are binarized for each scale note and recorded in time series, and comparing the comparison target template with the sound source template to calculate a matching rate between the binarized acoustic signal distributions;
(3) a matching rate output step of outputting a comparison result signal corresponding to the matching rate calculated in the template comparison step;
(4) a database control step of storing in an event database, in association with each other, a template identifier identifying the sound source template generated in the template comparison step and an event identifier identifying an event to be generated; and
(5) an event generation management step of looking up the event database, on the basis of the comparison result signal output in the matching rate output step, for the template identifier of the sound source template to which the comparison result signal relates, selecting the event to be generated, and generating the selected event.
In the template comparison step, when the sound source template is generated for the specific acoustic signal, the signal intensity at each scale note is detected and the presence or absence of an acoustic signal is extracted according to whether the detected acoustic signal exceeds a predetermined threshold, and the threshold is set to a different value depending on the type of event to be generated.

By installing such a program on a computer such as a user terminal or Web server, or on an IC chip, and executing it on a CPU, the acoustic signal detection system, acoustic signal detection method, acoustic signal detection server, and acoustic signal detection device having the functions described above can be constructed easily. The program can be distributed, for example, over a communication line, and by recording it on a recording medium readable by a general-purpose computer it can be transferred as a packaged application that runs on a stand-alone computer. Specifically, the program can be recorded on various recording media, including magnetic recording media such as flexible disks and cassette tapes, optical disks such as CD-ROMs and DVD-ROMs, and RAM cards. With a computer-readable recording medium on which the program is recorded, the acoustic signal detection system, method, server, and device described above can be implemented simply using a general-purpose or dedicated computer, and the program can be stored, transported, and installed easily.

As described above, according to this invention, a target specific acoustic signal can be detected in real time with a reduced processing load even in noisy conditions. By detecting a specific sound in real time, services triggered by that sound can be provided, and services can be diversified and enriched.

A conceptual diagram showing the overall configuration of the acoustic signal detection system according to the first embodiment.
A block diagram showing the internal configuration of the management server according to the first embodiment.
A block diagram showing the internal configuration of the user terminal and the sound reproduction means according to the first embodiment.
An explanatory diagram showing the sound source template and the comparison target template according to the first embodiment.
An explanatory diagram showing the part positions at which sound source templates are created according to the first embodiment.
An explanatory diagram showing the double buffering process according to the first embodiment.
An explanatory diagram showing the template comparison method according to the first embodiment.
(a) and (b) are explanatory diagrams showing display information displayed according to the matching rate according to the first embodiment.
(a) to (c) are explanatory diagrams showing display information displayed according to the matching rate according to the present embodiment.
(a) and (b) are explanatory diagrams showing display information displayed according to the matching rate according to the first embodiment.
A flowchart showing the acoustic signal detection method according to the first embodiment.
A conceptual diagram showing the overall configuration of the acoustic signal detection system according to the second embodiment.
A block diagram showing the internal configuration of the management server according to the second embodiment.
A block diagram showing the internal configuration of the user terminal and the sound reproduction means according to the second embodiment.
A flowchart showing the acoustic signal detection method according to the second embodiment.

[First Embodiment]
(Outline of the acoustic signal detection system)
Embodiments of the system according to the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 is a conceptual diagram showing the overall configuration of the system according to the first embodiment.

As shown in FIG. 1, the acoustic signal detection system according to the present embodiment detects a specific acoustic signal contained in an evaluation acoustic signal to be evaluated. It comprises a management server 2 placed on a communication network 5 that is built by interconnecting communication lines, a radio base station 3, and a user terminal 1 capable of wireless communication through the radio base station 3.

The communication network 5 is an IP network that uses the TCP/IP communication protocol. It is a distributed communication network built by interconnecting various communication lines (public lines such as telephone, ISDN, ADSL, and optical lines, as well as dedicated lines and wireless communication networks). This IP network also includes LANs such as intranets (corporate networks) and home networks based on 10BASE-T, 100BASE-TX, or the like.

The radio base station 3 is a device that is connected to the communication network 5 through a gateway device (not shown), establishes a wireless communication connection with the user terminal 1, and provides voice calls and data communication for the user terminal 1.

The user terminal 1 is an arithmetic processing device equipped with a CPU, such as a portable mobile computer with a wireless communication function, a PDA (Personal Digital Assistant), a mobile phone, a smartphone, or a tablet PC. The user terminal 1 communicates wirelessly with a relay point such as a base station and can receive communication services such as voice calls and data communication while moving. Examples of the communication scheme of the user terminal 1 include FDMA, TDMA, CDMA, and W-CDMA, as well as PHS (Personal Handyphone System).

The user terminal 1 is also equipped with functions such as a digital camera function, an application software execution function, and a GPS function. In addition, it has a microphone for picking up acoustic signals from outside, and it functions as an acoustic signal detection device that uses this microphone to acquire external sound as an audio signal or audio file.

The management server 2 is a general server device that is placed on the communication network 5 and manages services for users. It is a server computer, or software with the equivalent function, that transmits information such as HTML (HyperText Markup Language) files, image files, and music files in a document system such as the WWW (World Wide Web). It stores information such as HTML documents and images and, in response to requests from client software such as a Web browser, transmits this information through the communication network 5, such as the Internet.

In particular, the management server 2 generates a sound source template D1 for detecting a specific acoustic signal, together with event information D2, which is information on the event to be generated in association with the sound source template D1, and distributes both to the user terminal 1.

Here, a "specific acoustic signal" means a signal that contains only the sound the system is to detect. Examples include an original theme song played inside a supermarket or consumer electronics store, a station name announcement or departure melody played in a station or on a train, and a single phrase of an artist's song.

The "sound source template D1" is information (STFT (short-time Fourier transform) data) obtained from a specific acoustic signal by extracting the presence or absence of an acoustic signal in a predetermined frequency band on the basis of an equal-temperament scale (音階平均率), and recording the distribution of the acoustic signal and its change over time in time series, binarized for each scale tone. The "sound source template" may be generated on any device in the system. For example, the user may generate it on the user terminal, or an already generated sound source template may be downloaded from the server.

The "equal-temperament scale" (音階平均率) includes the 12-tone scale made up of semitone steps as well as various other scales. "Binarizing the presence or absence of an acoustic signal" means detecting only whether the sound of each scale tone is present or not, for example by judging whether the acoustic signal exceeds a predetermined threshold.
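The threshold-based binarization described above can be sketched in a few lines of Python. This is only an illustration: the function name `binarize`, the sample levels, and the threshold value 0.3 are assumptions and are not values given in this specification.

```python
def binarize(levels, threshold):
    """Return 1 for each scale tone whose level exceeds the threshold, else 0."""
    return [1 if level > threshold else 0 for level in levels]


# Hypothetical signal levels of four scale tones in one time frame.
frame_levels = [0.02, 0.75, 0.40, 0.05]
frame_bits = binarize(frame_levels, threshold=0.3)
print(frame_bits)
```

Only the presence or absence of each tone survives this step, so differences in playback volume above the threshold do not change the result.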

An event is generated when the user moves into a predetermined area A and a specific sound source is detected. Events include various information processing services such as message output by voice or text display, playback of images or video, and launching an application. The data or programs for these information processing services are distributed as event information D2. The distributed event information D2 includes, for example, coupon information, ticket information, gift information, transfer guidance information, and notification information announcing that the user has reached a destination.

Area A is a place where a specific sound is output, such as a shopping mall or a concert venue. Within area A, sound (a theme song, a sound effect, or the like) is output from sound reproduction means 4 such as a speaker. The user terminal 1 picks up that sound with its microphone or the like and acquires it as audio data, and an event is generated depending on whether the sound source template D1 is found in that sound.

(Internal structure of each device)
Next, the internal structure of each device that makes up the acoustic signal detection system is described. FIG. 2 is a block diagram showing the internal configuration of the management server 2 according to the first embodiment, and FIG. 3 is a block diagram showing the internal configuration of the user terminal 1 according to the first embodiment. FIGS. 4(a) to 4(c) are explanatory diagrams showing the sound source template D1 and the comparison target templates according to the first embodiment. In FIG. 4, the horizontal axis indicates elapsed time and the vertical axis indicates frequency. FIG. 5 is an explanatory diagram showing the part positions from which sound source templates D1 are created. The term "module" used in the description refers to a functional unit for achieving a predetermined operation, made up of hardware such as an apparatus or device, software having the corresponding function, or a combination of these.

(1) Management server 2
The management server 2 can be a single server device or a group of several types of servers, such as a Web server and a database server. In the present embodiment, it includes a communication unit 21, a control unit 22, and databases 23 to 25.

The communication unit 21 is a communication interface that sends and receives data to and from the user terminal through the communication network 5. The databases 23 to 25 are a group of databases that store various kinds of information related to the system. In the present embodiment, they are a user information database 23, which holds information about the user who owns the user terminal 1, an event database 24, and a sound source template database 25.

In the user information database 23, the identification number that identifies each user is recorded in association with the telephone number and e-mail address of the user terminal, attribute information such as the user's age, sex, address, and preference information, and history information on the sound detections that have been performed.

The sound source template database 25 is a storage device that stores the sound source templates D1 generated for specific acoustic signals. Each sound source template D1 generated by the sound source template generation unit 222 of the control unit 22 is stored in association with a template identifier that identifies it.

The event database 24 is a device that stores the event information D2 to be generated for users, and each item has an event identifier that identifies the event. In the present embodiment, the event database 24 also stores each template identifier in association with the event identifier of the event to be generated.

The event to be generated may, for example, be common to all users or may vary with the user's attributes. That is, even for the same sound source template D1, an identifier of an event related to cosmetics, food, or the like can be associated if the user is female, and an identifier of an event related to sports such as golf, or to home appliances, can be associated if the user is male.

Also, as shown for example in FIG. 5, parts at different positions within one phrase of the same piece of music can be cut out and generated as sound source templates D11 to D13, and sound source templates from different positions can be distributed according to user attributes. In this way the event information generated for each user can be varied. For example, even when a family enters the same store at the same time, different events can be generated for the adults and the children.

The parts that are cut out may overlap, as with sound source templates D11 and D12, or may not overlap any other template, as with sound source template D13. Events can furthermore be selected on the basis of the user's age and preference information, so that services suited to each user, for example by age or sex, can be provided.
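As a rough illustration of cutting overlapping and non-overlapping parts out of one phrase, the following Python sketch slices a binarized phrase into three templates. The helper `cut_part`, the tiny 10-frame phrase, and the chosen frame positions are all hypothetical. Only the names D11 to D13 and the overlap pattern come from the description of FIG. 5.

```python
def cut_part(template, start, length):
    """Return `length` consecutive time frames of `template`, beginning at frame `start`."""
    return template[start:start + length]


# Hypothetical binarized phrase: 10 time frames, 3 scale tones per frame.
phrase = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1],
          [0, 0, 1], [1, 1, 0], [1, 1, 0], [0, 1, 1], [0, 1, 1]]

d11 = cut_part(phrase, start=0, length=4)  # frames 0 to 3
d12 = cut_part(phrase, start=2, length=4)  # frames 2 to 5, shares frames 2 and 3 with D11
d13 = cut_part(phrase, start=7, length=3)  # frames 7 to 9, overlaps neither D11 nor D12
```

Because each part covers a different stretch of the phrase, terminals holding different parts would match at different moments of the same playback.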

To vary the event generated for each user in this way, the event information in the event database 24 (for example, information about food) is associated with attribute information (for example, female), and the template identifier corresponding to that attribute information (for example, a cosmetics company's theme song) is associated with the event identifier.
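One possible way to hold these associations is a lookup keyed by template identifier and user attribute. The sketch below is an assumption for illustration only: the identifiers (`T-001`, `E-cosmetics`, and so on), the dictionary layout, and the fallback to a common event are not taken from this specification.

```python
# Hypothetical association table: (template identifier, attribute) -> event identifier.
# An attribute of None marks the event that is common to all users.
EVENT_TABLE = {
    ("T-001", "female"): "E-cosmetics",
    ("T-001", "male"): "E-golf",
    ("T-001", None): "E-common",
}


def select_event(template_id, attribute):
    """Prefer the attribute-specific event and fall back to the common one, if any."""
    specific = EVENT_TABLE.get((template_id, attribute))
    if specific is not None:
        return specific
    return EVENT_TABLE.get((template_id, None))


print(select_event("T-001", "female"))
print(select_event("T-001", "child"))
```

A relational table with the same three columns would serve equally well. The dictionary is used here only to keep the example self-contained.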

The control unit 22 is an arithmetic module made up of hardware such as a processor (a CPU, a DSP (Digital Signal Processor), or the like), memory, and other electronic circuits, software such as programs having the corresponding functions, or a combination of these. By loading and executing programs as needed, it virtually constructs various functional modules, and these functional modules control the operation of each unit and carry out various kinds of processing in response to user operations.

As functions for detecting sound, the control unit 22 includes a sound source data acquisition unit 221, a sound source template generation unit 222, an event generation unit 223, a distribution management unit 224, and a user registration unit 225.

The user registration unit 225 is a module that accepts user registration for using the system. When the user registration unit 225 acquires a user's information, it assigns an identifier for identifying that user and records the information in the user information database 23.

The sound source data acquisition unit 221 is a module that acquires the specific acoustic signal to be evaluated. For example, it acquires an audio signal or audio file from the service provider that issues coupons.

The sound source template generation unit 222 is a module that generates the sound source template D1 from the specific acoustic signal acquired by the sound source data acquisition unit 221. Specifically, the sound source data acquisition unit 221 generates a scale table by rounding the sound's frequency band to the 12-tone equal-temperament scale (12音階平均率). In the scale table, the boundary values around each center frequency fm are fm × 2^(±1/12). Then, from the values that exceed a certain threshold, up to N values with the largest volume (amplitude) are extracted, excluding harmonic components, where N is an integer of 1 or more. Harmonic components are handled differently in the two phases: when a sound source template is created, the presence or absence of each scale tone is detected after the harmonic components are removed, whereas when a signal is compared with a template, the presence or absence of each scale tone is detected with the harmonic components left in. This reduces the influence of the overtones of the scale tones being detected, so that a more appropriate sound source template can be created, while at comparison time the overtones are instead exploited to raise detection accuracy.
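The scale table and the selection of up to N tones can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of this specification: the reference pitch of 440 Hz, the function names `build_scale_table` and `pick_tones`, and above all the rule used to treat a tone as a harmonic (a tone lying 12, 19, or 24 semitones above a lower detected tone, which approximates the 2nd to 4th harmonics) are introduced purely for illustration. The specification does not say how harmonic components are identified.

```python
import math

# Assumption: a tone 12, 19, or 24 semitones above a lower detected tone is an overtone.
HARMONIC_OFFSETS = (12, 19, 24)


def build_scale_table(f_low, f_high, ref=440.0):
    """List (center, lower, upper) for each semitone whose center lies in [f_low, f_high].

    The boundaries are center * 2^(-1/12) and center * 2^(+1/12), as stated in the text.
    """
    ratio = 2 ** (1 / 12)
    first = math.ceil(12 * math.log2(f_low / ref))
    last = math.floor(12 * math.log2(f_high / ref))
    table = []
    for k in range(first, last + 1):
        fm = ref * 2 ** (k / 12)
        table.append((fm, fm / ratio, fm * ratio))
    return table


def pick_tones(levels, threshold, n):
    """Return up to n semitone indices with the largest amplitude above the threshold.

    `levels` maps a semitone index to its amplitude. Tones treated as overtones of a
    lower kept tone (see HARMONIC_OFFSETS) are skipped before the n strongest are chosen.
    """
    above = {k: a for k, a in levels.items() if a > threshold}
    fundamentals = []
    for k in sorted(above):
        if any(k - offset in fundamentals for offset in HARMONIC_OFFSETS):
            continue  # treated as an overtone of a lower tone
        fundamentals.append(k)
    strongest = sorted(fundamentals, key=lambda k: above[k], reverse=True)[:n]
    return sorted(strongest)


table = build_scale_table(430.0, 900.0)
picked = pick_tones({0: 0.9, 4: 0.5, 7: 0.1, 12: 0.8, 19: 0.6}, threshold=0.3, n=2)
```

With boundaries of fm × 2^(±1/12), each band reaches the center frequencies of its two neighbors, so adjacent bands overlap. At comparison time, where the text keeps the harmonic components, the skip inside `pick_tones` would simply be omitted.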

By carrying out this processing for each window width (number of blocks), as shown in FIG. 4(a), a sound source template D1 of a predetermined length is created in which the presence or absence of an acoustic signal in a predetermined frequency band is extracted on the basis of the equal-temperament scale (音階平均率), and the distribution of the acoustic signal and its change over time are recorded in time series, binarized for each scale tone. The number of feature points extracted in each window may be the same or may differ. When the number differs, all values that exceed the threshold, excluding harmonic components, are taken.
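The window-by-window construction of a binarized template can be illustrated with the Python sketch below. It is a simplified stand-in rather than the processing of this specification: it evaluates a single-bin DFT at each scale tone's center frequency instead of a full STFT, uses non-overlapping rectangular windows, and omits harmonic removal. The sample rate, window length, threshold, and the names `tone_level` and `make_template` are assumptions.

```python
import cmath
import math


def tone_level(frame, freq, sample_rate):
    """Amplitude of `freq` within `frame`, computed as a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / sample_rate)
              for n, x in enumerate(frame))
    return 2 * abs(acc) / len(frame)


def make_template(samples, sample_rate, centers, frame_len, threshold):
    """Return one row per time window and one 0/1 entry per scale tone."""
    template = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        template.append([1 if tone_level(frame, f, sample_rate) > threshold else 0
                         for f in centers])
    return template


RATE = 8000  # hypothetical sample rate in Hz
centers = [440.0 * 2 ** (k / 12) for k in range(13)]  # semitones from 440 Hz to 880 Hz


def sine(freq, count):
    """Generate `count` samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq * n / RATE) for n in range(count)]


# Test signal: 0.1 s of the first tone followed by 0.1 s of the fourth tone.
samples = sine(centers[0], 800) + sine(centers[3], 800)
template = make_template(samples, RATE, centers, frame_len=400, threshold=0.5)
```

Here rows correspond to time and columns to scale tones, which is FIG. 4 with its axes swapped. In this test signal the first two rows mark only the first tone and the last two rows mark only the fourth tone.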

In the present embodiment, when generating the sound source template D1 for a specific acoustic signal, the sound source template generation unit 222 may also detect the signal strength at each scale tone and determine the presence or absence of an acoustic signal according to whether the detected signal exceeds a predetermined threshold. This threshold is set to a different value depending on the type of event to be generated.

In the present embodiment, templates are prepared in units of fundamental tone + 1/2 Freq, at frequencies outside the speech range. In addition, a machine learning function is used here to learn the frequency regions to extract, their number, and the window width. Techniques such as artificial neural networks, genetic algorithms, reinforcement learning, and supervised learning can be used for this machine learning function, and training data for this purpose is stored in the server.

A sound source template D1 generated in this way is given a template identifier and stored in the sound source template database 25. As shown in FIG. 5, the sound source template generation unit 222 can also create sound source templates D11 to D13 from different parts of the same piece of music.

The event generation unit 223 is a module that generates the event information to be provided to users. It stores the generated event information D2 in the event database 24 in association with an event identifier that identifies it.

The distribution management unit 224 is a module that manages the distribution of various data over the communication network, and it includes a sound source template distribution unit 224a. The sound source template distribution unit 224a is a module that distributes the sound source template D1 generated by the sound source template generation unit 222 to the user terminal 1 in response to access from the user terminal.

At this point, the distribution management unit 224 registers, in the event database 24, the template identifier of the sound source template D1 distributed by the sound source template distribution unit 224a in association with the event identifier of the event to be generated. Alternatively, the distribution management unit 224 may associate the template identifier with the event identifier in advance and distribute the sound source template D1 and the event information D2 to the user terminal 1 together.

The distribution management unit 224 also has a function of referring to the attribute information of the accessing user and selecting event information related to that attribute information. With this function, when sound source templates D11 to D13 for different parts have been generated from a single phrase as shown in FIG. 5, for example, a sound source template can be chosen from D11 to D13 and a different event can be selected for each.

(2) User terminal and sound reproduction means
Next, the internal structures of the user terminal 1 and the sound reproduction means 4 are described. FIG. 6 is an explanatory diagram showing the double buffering process according to the present embodiment, and FIG. 7 is an explanatory diagram showing the template comparison method according to the present embodiment. FIGS. 8 to 10 are explanatory diagrams showing the display information displayed according to the matching rate according to the present embodiment.

As shown in FIG. 3, the sound reproduction means 4 is a means for reproducing sound. It consists of a control unit 41 that reads and plays back audio signals or audio files, a speaker 42 that outputs the audio signals or audio files to the outside as sound, and buffers 43a and 43b that temporarily store audio signals or audio files. The control unit 41 is an electronic circuit for music playback, such as an audio amplifier. It reads audio signals or audio files recorded on various recording media, amplifies them with an amplifier or the like, and feeds them to the speaker 42. The speaker 42 is a device that converts audio signals or audio files into physical vibration and outputs it to the outside.

When reproducing the specific sound, the control unit 41 performs double buffering using the buffers 43a and 43b, as shown in FIG. 6. In the present embodiment, the same raw waveform data is held in the two buffers 43a and 43b, and this raw waveform data is fed to the playback device of the control unit 41 so that the buffers output sound alternately. Playback data is therefore always available and the sound is output without interruption.

As shown in FIG. 6, the control unit 41 is set up so that an event is raised when playback has started and one buffer, 43a, becomes empty. When the event occurs, the next data is prepared in that buffer within the event handler, and playback is then started. Because sound data is kept in the other buffer, 43b, during this time, the sound is reproduced without interruption. In the present embodiment, both buffers 43a and 43b are set at the same time initially, and from then on the buffers 43a and 43b are used alternately.
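The alternating use of two buffers can be sketched as a small Python generator. This is a simplified model under assumptions: playback is represented by yielding a chunk, the buffer-empty event is represented by the refill that immediately follows, and the function name `play_double_buffered` and the chunk labels are hypothetical. The source is assumed to be split into consecutive chunks.

```python
def play_double_buffered(chunks):
    """Yield chunks in playback order while alternating between two buffers."""
    source = iter(chunks)
    # Initially both buffers are filled at the same time.
    buffers = [next(source, None), next(source, None)]
    active = 0
    while buffers[active] is not None:
        yield buffers[active]                 # play the active buffer until it is empty
        buffers[active] = next(source, None)  # buffer-empty event: prepare the next data
        active = 1 - active                   # the other buffer still holds sound data


order = list(play_double_buffered(["c0", "c1", "c2", "c3", "c4"]))
print(order)
```

While one buffer is being refilled, the other already holds the following chunk, which is what keeps the output continuous.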

The user terminal 1, as shown in FIG. 3, includes a wireless interface 11, an application execution unit 17, an output interface 13, an input interface 12, a memory 15, a sound acquisition unit 16, and a sound source template reception unit 14.

The wireless interface 11 provides a wireless communication function based on a mobile communication protocol for voice calls and data communication, and a wireless communication function based on a data communication protocol such as wireless LAN. In a mobile computer or PDA, this wireless interface can be realized with a wireless LAN adapter or the like. In addition to wireless LAN such as Wi-Fi, the wireless interface 11 includes interfaces for short-range communication such as infrared communication and Bluetooth (registered trademark).

The output interface 13 is a device that outputs video and sound, such as a display or a speaker. In particular, the output interface 13 includes a display unit 13a such as a liquid crystal display, on which the GUI (Graphical User Interface) built by an application is displayed.

The input interface 12 is a device for entering user operations, such as operation buttons or a touch panel. The input interface 12 also includes a microphone 12a that acquires the evaluation acoustic signal, that is, externally output sound to be evaluated. The evaluation acoustic signal acquired by the microphone 12a is converted into data and input to the sound acquisition unit 16. An "evaluation acoustic signal" is an acoustic signal in which various externally output sounds are mixed together: it may contain the specific acoustic signal mixed with noise, or it may consist of noise alone with no specific acoustic signal. "Acquiring an evaluation acoustic signal" covers both acquiring sound directly as an electrical signal with an acoustic device such as a microphone and acquiring it as an audio signal or audio file through wireless or wired communication.

The memory 15 is a storage device that stores the OS (Operating System), programs for various applications, and other data. Within the memory 15, the sound source template D1 distributed from the management server 2 and the event information D2 associated with the sound source template D1 are stored in an event database 151. These data are linked to each other by the template identifier and the event identifier. The memory 15 is also assumed to hold the acoustic signal detection program according to the present invention, downloaded from the management server 2.

The sound source template receiving unit 14 is a module that receives, through the communication network 5, the sound source template D1 and the event information D2 relating to the event to be generated on the basis of that sound source template D1. The sound source template receiving unit 14 thus functions as a sound source template acquisition unit that acquires the sound source template D1. The acquired sound source template D1 and event information D2 are accumulated in the event database 151.

The sound acquisition unit 16 is a module that acquires the sound input from the microphone 12a and outputs it as either an evaluation acoustic signal or a specific acoustic signal. The sound acquisition unit 16 also performs double buffering when capturing audio: the same incoming evaluation acoustic signal is recorded into two buffers at staggered times so that no audio is dropped.

A signal acquired as an evaluation acoustic signal is input to the evaluation target conversion unit 171, and a signal acquired as a specific acoustic signal is input to the sound source template generation unit 175. The input destination is determined when the acoustic signal detection program is started and either the menu for creating a sound source template D1 or the template comparison menu is selected.

The application execution unit 17 is a module that executes applications such as a general OS, browser software, e-mail, and image display software, and is normally implemented by a CPU or the like. When the application execution unit 17 executes the acoustic signal detection program downloaded from the management server 2, the following functional modules are virtually constructed on the CPU: the evaluation target conversion unit 171, the template comparison unit 172, the matching rate output unit 173, the event occurrence management unit 174, the reception management unit 176, and the sound source template generation unit 175.

The evaluation target conversion unit 171 is a module that, for the evaluation acoustic signal acquired from the sound acquisition unit 16, extracts the presence or absence of an acoustic signal in a predetermined frequency band on the basis of the equal-tempered scale, and generates a comparison target template D3 in which the distribution of the acoustic signal and its change over time are binarized for each scale step and recorded as a time series.

More specifically, the evaluation target conversion unit 171 rounds the frequency band of the sound to the 12-tone equal-tempered scale to create a scale table, and from among the values in that table that exceed a certain threshold, extracts up to M entries with the largest volume (amplitude) to generate the comparison target template D3. For this comparison target template, the setting is such that N < M. The comparison target template D3 also includes harmonic components.
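The rounding and extraction described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the 440 Hz reference pitch, and the representation of one block as a set of scale indices are hypothetical and are not taken from this description.

```python
import math

def semitone_index(freq_hz, ref_hz=440.0):
    """Round a frequency to the nearest 12-tone equal-temperament step.

    The index is counted in semitones relative to ref_hz (assumed reference).
    """
    return round(12 * math.log2(freq_hz / ref_hz))

def binarize_block(peaks, threshold, max_count):
    """Build one binarized template block.

    peaks: list of (frequency in Hz, amplitude) pairs for one block.
    Returns the set of scale indices whose amplitude exceeds the threshold,
    limited to the max_count loudest entries.
    """
    table = {}
    for freq, amp in peaks:
        idx = semitone_index(freq)
        # Keep the loudest amplitude that falls on each scale step.
        table[idx] = max(table.get(idx, 0.0), amp)
    loud = sorted(((amp, idx) for idx, amp in table.items() if amp > threshold),
                  reverse=True)
    return {idx for _, idx in loud[:max_count]}
```

Representing each block as a set of indices keeps only the presence or absence of a signal per scale step, which is the binarization that the comparison relies on.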

The same processing is then performed for each window width (number of blocks) to create the sound source template D1. Because the evaluation target conversion unit 171 performs real-time processing, it assumes the same number of extracted entries in each block, unlike sound source template generation. Through this processing, a comparison target template D3 such as that shown in FIG. 4(c) is generated.

The template comparison unit 172 is a module that compares the comparison target template D3 with the sound source template D1 and calculates the matching rate of the binarized acoustic signal distributions. Specifically, the template comparison unit 172 retrieves the sound source template D1 shown in FIG. 4(b) from the event database 151 and compares it against a comparison target template D3 such as that shown in FIG. 4(c) while shifting by one window width (number of blocks) at a time, calculating the matching rate at each position.
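The shifted comparison can be sketched as below. The description does not give a formula for the matching rate, so the definition used here (the fraction of cells set in the sound source template that are also set in the compared window) is an assumption, as are the function names and the list-of-sets template layout.

```python
def match_rate(source, window):
    """Assumed matching rate: share of set cells in source also set in window.

    Both arguments are lists of sets of scale indices, one set per block.
    """
    total = sum(len(block) for block in source)
    if total == 0:
        return 0.0
    hits = sum(len(s & w) for s, w in zip(source, window))
    return hits / total

def best_match(source, target):
    """Slide source over target one block at a time.

    Returns (best matching rate, block offset); offset is -1 if nothing matched.
    """
    n = len(source)
    best = (0.0, -1)
    for offset in range(len(target) - n + 1):
        rate = match_rate(source, target[offset:offset + n])
        if rate > best[0]:
            best = (rate, offset)
    return best
```

Because each block is a small set of integers, one comparison step amounts to a handful of set intersections, which is in keeping with the low processing load the description aims for.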

The processing by the evaluation target conversion unit 171 and the template comparison unit 172 is executed in real time by holding the incoming comparison target template D3 in a FIFO (First In, First Out) arrangement. Specifically, as shown in FIG. 7, the evaluation acoustic signal input to the evaluation target conversion unit 171 undergoes double buffering and is converted into template form one block width at a time. Data covering a template width equal in length to the previously analyzed sound source template D1 is held in the FIFO, and the held portion of the comparison target template D3 is compared with the sound source template D1 so that the matching rate output unit 173 can calculate the matching rate. Generation of the comparison target template D3 continues in real time even while the matching rate output unit 173 is calculating the matching rate against the sound source template D1.
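One way to hold the most recent blocks in FIFO fashion is a bounded queue, as in the minimal sketch below. The class name and its methods are hypothetical; the description itself only states that data of the same width as the sound source template D1 is held in a FIFO.

```python
from collections import deque

class TemplateFifo:
    """Holds the most recent `width` template blocks (hypothetical helper)."""

    def __init__(self, width):
        # deque with maxlen discards the oldest block automatically.
        self.blocks = deque(maxlen=width)

    def push(self, block):
        """Append one block; return True once a full-width window is available."""
        self.blocks.append(block)
        return len(self.blocks) == self.blocks.maxlen

    def window(self):
        """Return the held blocks, oldest first, for comparison with D1."""
        return list(self.blocks)
```

Each newly generated block is pushed as it arrives, and a comparison is run whenever `push` reports that a full window is held, so template generation never has to wait for the comparison to finish.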

The matching rate output unit 173 is a module that outputs a comparison result signal corresponding to the matching rate calculated by the template comparison unit 172. In the present embodiment, the comparison result may simply distinguish between an exact match and anything else, or the matching rate may be divided into a plurality of levels and the comparison result varied according to the level. Varying the comparison result signal according to the matching rate in this way makes it possible to provide a sound radar function that informs the user that the specific acoustic signal serving as the trigger for an event is nearby.

Specifically, when the sound radar function is executed, a matching rate for the sound radar is calculated. Letting [M_a] be the template matching rate for the frequency bands alone, ignoring time, [M_t] be the template matching rate that takes time into account, and [α] be a weighting coefficient, this matching rate is calculated by
(Expression 1) Sound radar matching rate = M_a + α * M_t
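Expression 1 can be illustrated with the sketch below. Only the combination M_a + α * M_t comes from the description; the concrete definitions of M_a (overlap of the scale indices present anywhere in the window) and M_t (block-by-block overlap), the default value of α, and all function names are assumptions made for illustration.

```python
def freq_only_rate(source, window):
    """Assumed M_a: overlap of scale indices, ignoring which block they occur in."""
    src = set().union(*source)
    win = set().union(*window)
    return len(src & win) / len(src) if src else 0.0

def time_aware_rate(source, window):
    """Assumed M_t: overlap counted block by block, so timing must agree."""
    total = sum(len(block) for block in source)
    if total == 0:
        return 0.0
    return sum(len(s & w) for s, w in zip(source, window)) / total

def radar_rate(source, window, alpha=0.5):
    """Expression 1: M_a + alpha * M_t (alpha value is a placeholder)."""
    return freq_only_rate(source, window) + alpha * time_aware_rate(source, window)
```

With these assumed definitions, a window that contains the right pitches in the wrong order still scores on M_a, which suits a radar-style indication that the target sound is somewhere nearby.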

The event occurrence management unit 174 is a module that, on the basis of the comparison result signal output from the matching rate output unit 173, looks up the template identifier of the sound source template D1 concerned in the event database 151, selects the event to be generated, and generates the selected event. The event occurrence management unit 174 includes a matching rate display unit 174a that performs control so that a graphic corresponding to the matching rate is displayed on the basis of the comparison result signal output from the matching rate output unit 173.

For example, when the matching rate output unit 173 has judged only whether or not the match is exact, the matching rate display unit 174a displays a character such as that shown in FIG. 8(a) on the screen of the display unit 13a upon receiving a comparison result signal indicating successful detection, and displays a character such as that shown in FIG. 8(b) on the screen of the display unit 13a upon receiving a comparison result signal indicating that the matching rate was not satisfied and detection failed.

When, on the other hand, the sound radar matching rate has been calculated by the sound radar function, a comparison result signal corresponding to that matching rate is input, and the event occurrence management unit 174 changes the event according to the comparison result and also changes what is displayed on the screen.

When this processing is executed, a plurality of sound source templates D1 in which the presence or absence of the acoustic signal has been varied according to predetermined thresholds are used, and the comparison result signal is changed in stages according to the signal strength. Specifically, when the matching rate is low, the matching rate display unit 174a displays a character whose expression indicates that “the specific acoustic signal cannot be heard,” as shown in FIG. 9(a). When the matching rate is moderate, the character changes to an expression indicating that “the specific sound is nearby,” as shown in FIG. 9(b). When the match is exact, a character whose expression indicates that “an event can be generated” is displayed, as shown in FIG. 9(c), and the event information is displayed thereafter. When the matching rate has not reached the level required to generate the event, the strength and input direction of the incoming acoustic signal may be detected, and the size of the eyes or the movement of the mouth may be varied to indicate the distance and direction to the specific sound.
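The staged display can be pictured as a simple mapping from matching rate to display state, as in the sketch below. The state names and the 0.5 boundary for “nearby” are hypothetical placeholders; the description specifies three expressions but no numeric boundaries.

```python
def radar_stage(rate, near_threshold=0.5):
    """Map a matching rate in [0, 1] to one of three display states.

    near_threshold is an assumed value, not taken from the description.
    """
    if rate >= 1.0:
        return "event_ready"   # exact match, FIG. 9(c)
    if rate >= near_threshold:
        return "nearby"        # moderate match, FIG. 9(b)
    return "not_heard"         # low match, FIG. 9(a)
```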

One method of detecting the input direction is, for example, to calculate the total amount of matching sound in the comparison with the template as the total number of matching blocks in the template, record the change in that block count as a history, and compare the block total of the previous calculation with that of the current calculation. In this method, the previous and current numbers of matching blocks are compared; if the number has increased, the terminal is judged to be approaching the source, and if it has decreased, the terminal is judged to be moving away. As another method, using a directional microphone device to capture the sound allows the direction to be detected more clearly, and a plurality of microphones may further be used with beamforming so that the direction of the sound source is identified from the relative strength of the left and right audio. When the terminal is equipped with position information acquisition means such as GPS, a compass, or a gyroscope, the input direction may be identified from the direction in which the terminal is moving and the change in the strength of the acoustic signal.
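The first method, comparing the previous and current numbers of matching blocks, can be sketched as follows. The class name, the returned labels, and the handling of the very first measurement are assumptions made for illustration.

```python
class ApproachTracker:
    """Records matching-block counts and reports the trend (hypothetical helper)."""

    def __init__(self):
        self.history = []

    def update(self, matching_blocks):
        """Store the new count and compare it with the previous one."""
        previous = self.history[-1] if self.history else None
        self.history.append(matching_blocks)
        if previous is None or matching_blocks == previous:
            return "unchanged"
        return "approaching" if matching_blocks > previous else "receding"
```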

By executing this sound radar function, the user can be notified that a sound source corresponding to a sound source template D1 for which a service is provided is nearby, and whether or not the user is approaching its location, and can thereby be guided toward it.

The matching rate display unit 174a may also display the generated comparison target template D3 as shown in FIGS. 10(a) and 10(b), with the presence or absence of the acoustic signal divided according to the equal-tempered scale and the distribution of the acoustic signal drawn as concentric circles. In this display, one full circle represents one octave, and circles farther out represent higher frequencies (higher octave ranges). In this case too, the display may be varied to indicate the distance and direction to the specific sound.
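The concentric layout can be sketched as a mapping from scale index to polar coordinates. This is an illustrative sketch only: the function name, the radius parameters, and the convention that index 0 is the lowest displayed pitch are assumptions.

```python
def polar_position(scale_index, base_radius=1.0, ring_step=1.0):
    """Place a scale step on concentric rings, one ring per octave.

    scale_index is counted in semitones from the lowest displayed pitch
    (assumed to be non-negative). Returns (radius, angle in degrees).
    """
    octave, step = divmod(scale_index, 12)
    angle = 360.0 * step / 12               # one full circle per octave
    radius = base_radius + ring_step * octave  # higher octaves lie farther out
    return radius, angle
```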

The sound source template generation unit 175 is a module that generates the sound source template D1 on the user terminal 1 side on the basis of a specific acoustic signal. The sound source template D1 is generated by processing similar to that of the sound source template generation unit 222 in the management server 2, and processing that extracts the presence or absence of the acoustic signal according to a threshold can also be executed.

The reception management unit 176 is a module that associates the template identifier of the sound source template D1 received by the sound source template receiving unit 14 with the information D2 on the event to be generated on the basis of that template identifier, and registers them in the event database 151.
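The association between template identifiers and event identifiers can be sketched with a small in-memory store. The class, its method names, and the use of plain dictionaries are hypothetical; the description states only that the two identifiers are associated and registered in the event database 151.

```python
class EventDatabase:
    """In-memory stand-in for the event database 151 (hypothetical)."""

    def __init__(self):
        self.templates = {}  # template identifier -> sound source template D1
        self.events = {}     # event identifier -> event information D2
        self.links = {}      # template identifier -> event identifier

    def register(self, template_id, template, event_id, event_info):
        """Store a received template and event, and link their identifiers."""
        self.templates[template_id] = template
        self.events[event_id] = event_info
        self.links[template_id] = event_id

    def event_for(self, template_id):
        """Return the event information linked to a template, or None."""
        event_id = self.links.get(template_id)
        return self.events.get(event_id)
```

Keeping the link table separate from the event table means that a changed event can be re-linked to an existing template identifier without storing the template again.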

(Acoustic signal detection method)
Next, the acoustic signal detection method of the present invention will be described. FIG. 11 is a flowchart outlining the operation of the acoustic signal detection system. In the present embodiment, it is assumed that user registration has already been completed on the management server 2.

In the acoustic signal detection method for detecting a specific acoustic signal contained in an evaluation acoustic signal to be evaluated, the event generation unit 223 first generates event information D2 on an event to be generated for the user (S101). This event information D2 is accumulated in the event database 24 in association with an event identifier that identifies the event.

Next, the sound source template generation unit 222 acquires a specific acoustic signal (S102) and performs sound source template generation processing, which generates the sound source template D1 on the basis of that specific acoustic signal (S103). In generating the sound source template D1, the signal strength at each scale step of the specific acoustic signal may be detected, and the presence or absence of the acoustic signal may be extracted according to whether the detected signal exceeds a predetermined threshold. The threshold is set to different values depending on the type of event to be generated.

The generated sound source template D1 is given a template identifier that identifies it and is accumulated in the sound source template database 25. The distribution management unit 224 then performs database control processing in which the template identifier identifying the sound source template D1 is associated with the event identifier identifying the event to be generated and stored in the event database 24 (S104). At this point, the distribution management unit 224 may refer to the user information database 23 and perform control so that the event information D2 is selected on the basis of attribute information (gender, age, or preference information) of the user who possesses the user terminal 1 to which distribution is to be made.

Thereafter, either at the initiative of the management server 2 or when a signal requesting the sound source template D1 is received from the user terminal 1, the sound source template distribution unit 224a executes sound source template distribution processing, which distributes the sound source template D1 through the communication network 5 (S105). The distribution management unit 224 may also distribute the event information D2 to be generated on the basis of this sound source template D1 at the same time, and when the event to be generated has been changed, it distributes the new event information D2 to the user terminal 1. When new event information D2 has been distributed, the distribution management unit 224 associates the event to be generated with the template identifier and updates the registration in the event database 24.

At the user terminal 1, when the sound source template receiving unit 14 receives the sound source template D1 and the event information D2 to be generated on the basis of that sound source template D1 (S106), the template identifier identifying the sound source template D1 and the event identifier identifying the event to be generated are associated with each other and stored in the event database 151 under the control of the reception management unit 176 (S107). The acoustic signal detection program is then started, and with a predetermined sound being output from the speaker 42 of the sound reproduction means 4 (S108), external sound is picked up by the microphone 12a (S109).

The sound acquisition unit 16 of the user terminal 1 acquires this sound as an evaluation acoustic signal and inputs the acquired evaluation acoustic signal to the evaluation target conversion unit 171. On the basis of this evaluation acoustic signal, the evaluation target conversion unit 171 executes evaluation target conversion processing, which generates the comparison target template D3 (S110).

Once the comparison target template D3 has been generated in FIFO fashion, it is input to the template comparison unit 172. The template comparison unit 172 executes template comparison processing, which compares the comparison target template D3 with the sound source template D1 and calculates the matching rate of the binarized acoustic signal distributions (S111).

The matching rate calculated by the template comparison processing is input to the matching rate output unit 173, which executes matching rate output processing to output a comparison result signal corresponding to that matching rate (S112). The comparison result signal output by the matching rate output processing is input to the event occurrence management unit 174.

The event occurrence management unit 174 determines, on the basis of the output comparison result signal, whether or not to generate an event (S113). This determination is made, for example, according to whether the comparison result signal indicates an exact match between the two templates, or whether the degree of match is at or above a predetermined threshold. The event to be generated may also be selected according to the degree of match, which makes it possible to switch the message displayed or the application launched depending on whether the target sound source is far away or close.

An event here means processing that occurs on the condition that certain requirements are met, such as exception handling or branch processing in a program, and the systems and services provided by the processing that such an event triggers are also included in the term “event.” For example, various information processing services performed when a specific sound source is detected, such as message output by voice or text display, playback of images or video, and launching of an application, are also events in this sense.

If it is determined in step S113 that no event is to be generated (“N” in S113), the processing of steps S108 to S112 is repeated without generating an event. If, on the other hand, it is determined that an event is to be generated (“Y” in S113), event occurrence management processing is executed, in which the template identifier of the sound source template D1 relating to the comparison result signal is looked up in the event database 24, the event to be generated is selected, and the selected event is generated (S114). At this time, the matching rate display unit 174a displays a graphic corresponding to the matching rate on the basis of the comparison result signal output by the matching rate output processing (S115).

Thereafter, the processing of steps S108 to S115 is repeated until termination of the specific sound detection processing is selected, either through the occurrence of an event or by user operation (“N” in S116). When termination of the specific sound detection processing has been selected (“Y” in S116), the processing ends.

(Acoustic signal detection program)
The acoustic signal detection system, acoustic signal detection device, acoustic signal detection server, and object control method according to the first embodiment described above can be realized by executing, on a computer, a program written in a predetermined language. That is, by installing this program on a computer such as a user terminal or a Web server, or on an IC chip, and executing it on the CPU, a system having each of the functions described above can be constructed easily. The program can be distributed through a communication line, for example, and can also be transferred as a packaged application that runs on a stand-alone computer.

Such a program can be recorded on a recording medium readable by a personal computer. The acoustic signal detection system, acoustic signal detection device, acoustic signal detection server, and object control method described above can then be implemented using a general-purpose or dedicated computer, and the program can be stored, transported, and installed easily.

(Operation and effects)
According to the present embodiment, the specific acoustic signal is detected by comparing templates that are generated by extracting, on the basis of the equal-tempered scale, the presence or absence of a signal in each frequency band. Matching against the sound source can therefore be performed without complex arithmetic processing, and the reduction in data size enables high-speed computation and hence real-time processing. In addition, because the presence or absence of the acoustic signal is detected using binarized templates, detection is possible in noisy conditions. As a result, services triggered by the detection of a specific sound can be provided, such as distributing coupons or points when a theme song being played in a store is detected, which allows services to be diversified and enriched.

In the present embodiment, associating the sound source template D1 used for detecting a specific sound with an arbitrary event also makes it possible to generate an event triggered by detection of the specific sound at the user terminal 1. Information on which user detected what sound, and where, can be collected, so that actual user behavior can be studied and marketing can be enhanced through the accumulation of big data. In particular, because in the present embodiment the user terminal itself takes the lead in detecting the specific acoustic signal and calculating the comparison result, processing closer to real time is possible without depending on a server device or the like located on the communication network 5.

Further, according to the present embodiment, when generating the sound source template D1, the sound source template generation unit 222 detects the signal strength at each scale step and extracts the presence or absence of the acoustic signal according to whether the detected signal exceeds a threshold that is set to a different value depending on the type of event. By creating sound source templates D1 with different threshold values, different events can therefore be generated depending on the type of sound source template D1, even when the same specific sound is detected. This makes it possible to provide services suited to the particular user, such as by age or gender; for example, even if a family enters the same store together and detects the same specific sound, different events can be generated for the adults and the children.

Further, according to the present embodiment, the template identifier of the sound source template D1 distributed by the sound source template distribution unit 224a is registered in the event database 24 in association with the event to be generated. A sound source template D1 can therefore be distributed from a server on the Internet via a Web page, e-mail, or the like, with an event associated with each sound source template D1, so that services on the Web and services in physical stores can be closely linked and services can be made more fine-grained.

In the present embodiment, the matching rate display unit 174a also displays a graphic corresponding to the matching rate on the basis of the comparison result signal output from the matching rate output unit 173. By displaying the matching rate as one of the events, the operator can see the matching rate and visually confirm the distance to the sound source. This makes it possible to construct a mechanism like a sound source radar that indicates the distance and direction to a specific sound, so that the user can be guided to the position of the sound source even in a noisy environment, and various game-like services using this mechanism can be provided.

[Second Embodiment]
In the embodiment described above, the acoustic signal detection program, the sound source template D1, the event information D2, and so on are held in the user terminal, and the processing for detecting the acoustic signal is executed in stand-alone fashion. The present invention is not limited to this, however, and this processing may instead be executed on a server located on the communication network 5. FIG. 12 is a conceptual diagram showing the overall configuration of an acoustic signal detection system according to the second embodiment. In the second embodiment, components identical to those of the first embodiment described above are given the same reference numerals; unless otherwise noted, their functions are the same, and their description is omitted.

(Outline of acoustic signal detection system)
As shown in FIG. 11, the acoustic signal detection system according to the present embodiment includes server groups 6 and 7 with different functions arranged on the communication network 5, a wireless base station 3, and a user terminal 1A capable of wireless communication through the wireless base station 3.

The user terminal 1A is a portable information processing terminal that, as in the first embodiment, has an arithmetic processing function provided by a CPU and a wireless communication function. It communicates wirelessly with a relay point such as a base station and can receive communication services such as voice calls and data communication while moving. The user terminal 1A also has functions such as executing application software and includes a microphone for acquiring acoustic signals from outside; this microphone captures external sound as an audio signal or an audio file.

The server groups 6 and 7 are general-purpose server devices with different processing functions. In the present embodiment, they consist of a plurality of template comparison servers 7 (7a to 7n), which compare the two templates, and a plurality of control management servers 6 (6a to 6n), which sit between the user terminal 1A and the template comparison servers 7 and perform load distribution. The control management server 6 also has a function of generating an event for the user terminal according to the comparison result acquired from the template comparison server 7.

These control management servers 6 (6a to 6n) and template comparison servers 7 (7a to 7n) are assumed to be made up of a plurality of server devices so that template comparison and event generation are performed according to the type of service to be provided.

In the present embodiment, the user terminal 1A acquires an evaluation acoustic signal from outside and transmits it over the communication network 5 in its original data format. The control management server 6 and the template comparison server 7 then execute the detection process for the specific acoustic signal, and the comparison result is returned to the user terminal 1A, where an event is generated.

(Internal structure of each device)
Next, the internal structure of each device constituting the acoustic signal detection system described above will be described. FIG. 12 is a block diagram showing the internal configuration of the user terminal 1A according to the second embodiment, and FIG. 13 is a block diagram showing the internal configuration of the control management server and the template comparison server according to the second embodiment.

(1) User terminal
First, the internal structure of the user terminal 1A will be described. As shown in FIG. 12, the user terminal 1A includes a wireless interface 11, an application execution unit 18, an output interface 13, an input interface 12, a memory 15, and a sound acquisition unit 16.

As described above, the sound acquisition unit 16 is a module that acquires sound input from the microphone 12a and outputs it as an evaluation acoustic signal or a specific acoustic signal. In the present embodiment, these signals are passed to the acoustic transmission unit 181. The application execution unit 18 is a module that executes applications such as a general OS, browser software, e-mail, and image display software, and in this embodiment too it is realized by a CPU or the like. When the application execution unit 18 executes the acoustic signal detection program downloaded from the management server 2, the acoustic transmission unit 181 and the event execution unit 182 are configured.

The acoustic transmission unit 181 is a module that transmits the evaluation acoustic signal or the specific acoustic signal acquired from the sound acquisition unit 16 to the control management server 6, for example by converting it into IP packets, together with a user identifier that identifies the user terminal 1A. The destination control management server 6 (6a to 6n) is assumed to be the server device with which the user is registered.
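
By way of non-limiting illustration, the pairing of an audio chunk with a user identifier might be sketched as follows. The JSON envelope, the field names (`user_id`, `kind`, `audio_b64`), and the function names are assumptions made for this sketch; the embodiment only states that the signal is packetized and sent together with a user identifier.

```python
import base64
import json
from typing import Tuple


def build_audio_message(user_id: str, pcm_bytes: bytes, kind: str = "evaluation") -> bytes:
    """Serialize one audio chunk together with the sender's identifier.

    `kind` distinguishes an evaluation acoustic signal from a specific
    acoustic signal. All field names are hypothetical.
    """
    if kind not in ("evaluation", "specific"):
        raise ValueError("kind must be 'evaluation' or 'specific'")
    payload = {
        "user_id": user_id,
        "kind": kind,
        # Raw audio bytes are base64-encoded so they survive JSON transport.
        "audio_b64": base64.b64encode(pcm_bytes).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")


def parse_audio_message(message: bytes) -> Tuple[str, str, bytes]:
    """Server-side inverse of build_audio_message."""
    payload = json.loads(message.decode("utf-8"))
    return payload["user_id"], payload["kind"], base64.b64decode(payload["audio_b64"])
```

On the receiving side, the recovered user identifier is what allows the result to be addressed back to the originating terminal.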

The event execution unit 182 is a module that acquires the event information D2 sent from the control management server 6, which has been selected based on the comparison result, and executes a predetermined event according to that event information D2. As in the first embodiment, the events executed include various information processing services such as message output by voice or text display, playback of images or video, and launching of applications. The event information D2 acquired includes coupon information, ticket information, gift information, transit guidance information, and notification information indicating arrival at a destination. The event execution unit 182 uses this event information D2 to display event-related information on the display unit 13a.

(2) Internal configuration of the control management server and the template comparison server
Next, the internal configuration of the control management server 6 and the template comparison server 7 will be described. In the present embodiment, the control management server 6 has a user management function, a sound source template creation function, a comparison target template creation function, and an event occurrence management function. It includes a user registration unit 62, an evaluation target conversion unit 63, an event occurrence management unit 64, an event generation unit 65, a sound source template generation unit 68, a distribution management unit 69, a communication unit 61, a user information database 67, and an event database 66.

The user registration unit 62 is a module that accepts user registration for use of this system. When the user registration unit 62 acquires the user's information, it assigns an identifier for identifying the user and records the information in the user information database 67. In the user information database 67, the identifier of each user is recorded in association with the telephone number and e-mail address of the user terminal, attribute information such as the user's age, gender, address, and preferences, and history information on the acoustic detections performed.

The evaluation target conversion unit 63 is a module that, for the evaluation acoustic signal transmitted from the user terminal 1A, extracts the presence or absence of an acoustic signal in predetermined frequency bands based on the equal-tempered scale, and generates a comparison target template D3 in which the distribution of the acoustic signal and its change over time are binarized for each note of the scale and recorded in time series. The specific processing is the same as in the first embodiment and is therefore omitted.
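
As a rough, non-authoritative sketch of this kind of conversion, the following computes a binarized note-by-frame template. It swaps in the Goertzel recurrence to measure power at each equal-temperament note frequency, which is a choice made for this example and not necessarily the processing of the first embodiment. The function names, the MIDI note numbering, the non-overlapping framing, and the power normalization are all assumptions.

```python
import math
from typing import List, Sequence


def note_frequency(midi_note: int) -> float:
    """Equal-temperament frequency in Hz (A4 = MIDI note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def goertzel_power(frame: Sequence[float], freq: float, sample_rate: int) -> float:
    """Squared spectral magnitude of `frame` at `freq` via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def make_template(samples: Sequence[float], sample_rate: int, notes: Sequence[int],
                  frame_len: int, threshold: float) -> List[List[int]]:
    """Return one row per frame, each row holding a 0/1 flag per note."""
    template = []
    # Non-overlapping frames; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        row = []
        for n in notes:
            # Normalize by frame_len**2 so a unit sine on the note gives about 0.25.
            power = goertzel_power(frame, note_frequency(n), sample_rate) / frame_len ** 2
            row.append(1 if power > threshold else 0)
        template.append(row)
    return template
```

Each row of the result is a time step and each column a note, so the template records only whether a note is present, not how loud it is.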

The event generation unit 65 is a module that generates the event information D2 to be provided to the user, and stores the generated event information D2 in the event database 66 in association with an event identifier that identifies it.

The sound source template generation unit 68 is a module that generates the sound source template D1 based on a specific acoustic signal acquired, for example, from a service provider that issues coupons, and a template identifier is added to the generated sound source template D1. In this embodiment as well, when generating the sound source template D1, the sound source template generation unit 68 may detect the signal intensity at each note of the scale and extract the presence or absence of an acoustic signal according to whether the detected intensity exceeds a predetermined threshold. The threshold is set to a different value depending on the type of event to be generated.
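
The effect of an event-dependent threshold can be illustrated with a minimal sketch. The event type names and threshold values below are invented for illustration and do not come from the embodiment.

```python
from typing import Dict, List, Sequence

# Hypothetical thresholds per event type; the values are illustrative only.
EVENT_THRESHOLDS: Dict[str, float] = {"coupon": 0.2, "notification": 0.5}


def binarize(intensities: Sequence[Sequence[float]], event_type: str) -> List[List[int]]:
    """Turn per-note intensities (rows = time steps) into a 0/1 template."""
    threshold = EVENT_THRESHOLDS[event_type]
    return [[1 if value > threshold else 0 for value in row] for row in intensities]
```

Applied to the same intensities, a lower threshold marks more notes as present, so one specific sound can yield distinct sound source templates for distinct event types.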

The distribution management unit 69 is a module that manages the distribution of various data over the communication network, and includes a sound source template distribution unit 69a. The sound source template distribution unit 69a is a module that distributes the sound source template generated by the sound source template generation unit 68, through the communication network 5, to the template comparison server 7 having the template comparison unit 74. In the present embodiment, the template is transmitted with its template identifier attached.

The distribution management unit 69 also stores, in the event database 66, the template identifier of the sound source template D1 distributed by the sound source template distribution unit 69a in association with the event identifier of the event to be generated. Although the present embodiment is configured so that the control management server 6 generates and stores the sound source template D1, the present invention is not limited to this, and the sound source template D1 may instead be generated and stored in the template comparison server 7.

The event database 66 is a device that stores the event information D2 to be generated for users. Each item of event information is stored with an event identifier that identifies the event, and the template identifier that identifies a sound source template D1 is stored in association with the event identifier of the event to be generated.
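
The two associations held by the event database (event identifier to event information, and template identifier to event identifier) might be modeled, purely as an assumed in-memory sketch, as follows. The class and method names are hypothetical, and a real deployment would presumably use a persistent database.

```python
from typing import Dict, Optional


class EventDatabase:
    """In-memory stand-in for the event database 66 (illustrative only)."""

    def __init__(self) -> None:
        self._events: Dict[str, dict] = {}              # event_id -> event information D2
        self._template_to_event: Dict[str, str] = {}    # template_id -> event_id

    def store_event(self, event_id: str, info: dict) -> None:
        self._events[event_id] = info

    def link_template(self, template_id: str, event_id: str) -> None:
        # Refuse links to events that were never stored.
        if event_id not in self._events:
            raise KeyError(f"unknown event: {event_id}")
        self._template_to_event[template_id] = event_id

    def event_for_template(self, template_id: str) -> Optional[dict]:
        """Look up the event information tied to a sound source template."""
        event_id = self._template_to_event.get(template_id)
        return self._events.get(event_id) if event_id is not None else None
```

Keeping the template-to-event link separate from the event record lets several templates point at the same event, or one template be re-linked without touching the event information.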

The event occurrence management unit 64 is a module that, based on the comparison result signal output from the matching rate output unit 75 of the template comparison server 7, looks up the template identifier of the sound source template D1 associated with that comparison result signal in the event database 66, selects the event to be generated, and notifies the user terminal 1A of information on the selected event.

The event occurrence management unit 64 includes a matching rate display control unit 64a which, based on the comparison result signal output from the matching rate output unit of the template comparison server 7, sends a control signal for displaying a graphic corresponding to the matching rate to the user terminal 1A through the communication network 5. The communication unit 61 is a communication interface that transmits and receives data to and from the user terminal 1A and the template comparison server 7 through the communication network 5.
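
The form of the graphic is not specified in this passage. As one assumed possibility, a terminal could render the matching rate as a simple text gauge; the function name and the bar style below are hypothetical.

```python
def rate_bar(rate: float, width: int = 10) -> str:
    """Render a matching rate in [0, 1] as a fixed-width text gauge."""
    rate = max(0.0, min(1.0, rate))       # clamp out-of-range values
    filled = int(rate * width + 0.5)      # round half up
    return "#" * filled + "-" * (width - filled)
```

Redrawing such a gauge on each comparison result would give the operator a continuously updated indication of how closely the captured sound matches the template.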

Next, the template comparison server 7 will be described. In the present embodiment, the template comparison server 7 has a template comparison function and a matching rate output function, and includes a sound source template acquisition unit 72, a template comparison unit 74, a matching rate output unit 75, a sound source template database 73, and a communication unit 71.

The sound source template acquisition unit 72 is a module that acquires the sound source template generated in the control management server 6 and stores it in the sound source template database 73 according to its associated template identifier. The communication unit 71 is a communication interface that transmits and receives data to and from the control management server 6 through the communication network 5.

The sound source template database 73 is a storage device that stores the sound source templates D1 generated for specific acoustic signals; each sound source template D1 is stored in association with the template identifier that identifies it.

The template comparison unit 74 is a module that compares the comparison target template D3 with the sound source template D1 and calculates the matching rate of the binarized acoustic signal distributions. The matching rate output unit 75 is a module that sends a comparison result signal corresponding to the matching rate calculated by the template comparison unit 74 to the control management server 6 via the communication network 5. The specific processing of the template comparison unit 74 and the matching rate output unit 75 is the same as in the first embodiment, and its description is omitted. The processes described in the first embodiment, including real-time processing and double buffering, are also assumed to be executed in the present embodiment, and their description is likewise omitted.
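
The exact definition of the matching rate is given in the first embodiment and is not repeated here. Purely as an assumption for illustration, the sketch below treats it as the fraction of note-by-frame cells on which two equally sized binarized templates agree; the function name is hypothetical.

```python
from typing import Sequence


def matching_rate(source: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Fraction of cells on which two binarized templates agree (assumed definition)."""
    if len(source) != len(target) or any(len(a) != len(b) for a, b in zip(source, target)):
        raise ValueError("templates must have the same shape")
    total = sum(len(row) for row in source)
    if total == 0:
        return 0.0
    agree = sum(1 for a, b in zip(source, target) for x, y in zip(a, b) if x == y)
    return agree / total
```

Because the comparison involves only equality tests on 0/1 cells, its cost grows linearly with template size, which is consistent with the lightweight processing the embodiment aims for.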

(Acoustic signal detection method)
Next, the acoustic signal detection method according to the second embodiment will be described. FIG. 15 is a flowchart showing an outline of the operation of the acoustic signal detection system. In this embodiment as well, user registration is assumed to have been completed in the control management server 6.

In the acoustic signal detection method according to the present embodiment, the event generation unit 65 first generates event information D2 on an event to be generated for the user (S201). This event information D2 is stored in the event database 66 in association with an event identifier that identifies the event.

Next, the sound source template generation unit 68 acquires a specific acoustic signal (S202) and performs sound source template generation processing that generates the sound source template D1 based on that signal (S203). In this generation processing, the signal intensity at each note of the scale may be detected for the specific acoustic signal, and the presence or absence of an acoustic signal may be extracted according to whether the detected intensity exceeds a predetermined threshold. The threshold is set to a different value depending on the type of event to be generated. A template identifier that identifies the sound source template D1 is then added to the generated sound source template D1, which is distributed to the template comparison server 7 by the sound source template distribution unit 69a (S204).

At this time, the distribution management unit 69 performs database control processing that stores, in the event database 66, the template identifier of the sound source template D1 distributed by the sound source template distribution unit 69a in association with the event identifier of the event to be generated (S205). In the template comparison server 7, the sound source template acquisition unit 72 acquires the transmitted sound source template D1 (S206) and stores it in the sound source template database 73 according to its associated template identifier (S207).

Thereafter, when the acoustic signal detection program is started while a predetermined sound is being output from the speaker 42 of the sound reproducing means 4 (S208), external sound is picked up by the microphone 12a. The sound acquisition unit 16 of the user terminal 1A acquires this sound as an evaluation acoustic signal, and the acquired evaluation acoustic signal is transmitted to the control management server 6 via the acoustic transmission unit 181 (S209). The evaluation target conversion unit 63 of the control management server 6 acquires this evaluation acoustic signal (S210) and executes evaluation target conversion processing that generates the comparison target template D3 (S211). A user identifier is attached to the evaluation acoustic signal sent from the user terminal 1A, so that it is possible to determine which user terminal 1A the signal came from, and subsequent communication in both directions is carried out based on this user identifier.

Once the comparison target template D3 has been generated, it is sent to the template comparison server 7 via the communication network 5 (S212). When the template comparison server 7 acquires the comparison target template D3 (S213), the data is input to the template comparison unit 74. The template comparison unit 74 executes template comparison processing that compares the comparison target template D3 with the sound source template D1 and calculates the matching rate of the binarized acoustic signal distributions (S214).

The matching rate calculated by the template comparison processing is input to the matching rate output unit 75, which executes matching rate output processing that outputs a comparison result signal corresponding to the matching rate (S215). The comparison result signal output by this processing is transmitted to the control management server 6 via the communication network 5.

In the control management server 6, the comparison result signal is input to the event occurrence management unit 64, which determines from it whether an event should be generated (S216). The criterion for this determination is chosen according to the event to be generated: for example, whether the comparison result signal indicates that the two templates match exactly, or whether the matching rate is at or above a predetermined threshold. If it is determined that no event should be generated ("N" in S216), no event is generated and the processing of steps S208 through S215 is repeated.

If, on the other hand, it is determined that an event should be generated ("Y" in S216), event occurrence management processing is executed: the template identifier of the sound source template D1 associated with the comparison result signal is looked up in the event database 66, the event to be generated is selected, and the selected event is generated (S217). Specifically, the event information D2 is delivered to the user terminal 1A based on the user identifier. At this time, the matching rate display control unit 64a also transmits a display control signal for controlling the display of a graphic corresponding to the matching rate, based on the comparison result signal output by the matching rate output processing.
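
Steps S216 and S217 can be summarized in a short sketch. The threshold criterion, the default value of 0.8, the message layout, and the function name are assumptions for illustration; the embodiment leaves the criterion dependent on the event to be generated.

```python
from typing import Dict, Optional


def handle_result(rate: float, template_id: str, user_id: str,
                  template_events: Dict[str, dict],
                  min_rate: float = 0.8) -> Optional[dict]:
    """Decide whether an event fires and, if so, build the message for the terminal.

    `template_events` maps a template identifier to event information
    (a stand-in for the lookup in the event database). Returns None when
    no event should be generated, corresponding to "N" in S216.
    """
    if rate < min_rate:
        return None
    event = template_events.get(template_id)
    if event is None:
        return None
    # The matching rate travels with the event so the terminal can draw a graphic.
    return {"to": user_id, "event": event, "display": {"matching_rate": rate}}
```

A caller would repeat capture and comparison whenever `None` is returned, and forward the returned message to the terminal identified by `to` otherwise.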

Next, the event execution unit 182 of the user terminal 1A displays a graphic corresponding to the comparison result on the display unit 13a, based on the event information D2 and the display control signal (S218). Thereafter, until termination of the specific sound detection process is selected, either by the occurrence of an event or by a user operation ("N" in S219), the processing of steps S208 through S220 is repeated; when termination is selected ("Y" in S219), the process ends.

(Acoustic signal detection program)
The acoustic signal detection system, acoustic signal detection device, acoustic signal detection server, and object control method according to the second embodiment described above can also be realized by executing, on a computer, a program written in a predetermined language. That is, by installing this program on a computer such as a user terminal or Web server, or on an IC chip, and executing it on a CPU, a system having the functions described above can be constructed easily. The program can be distributed, for example, over a communication line, and can also be transferred as a packaged application that runs on a stand-alone computer.

(Operation and effects)
In this embodiment as well, the specific acoustic signal is detected by comparing templates generated by extracting the presence or absence of a signal in each frequency band based on the equal-tempered scale. Matching against the sound source can therefore be performed without complex arithmetic processing, and the reduced data size allows high-speed computation and hence real-time processing. In addition, because the presence or absence of an acoustic signal is detected using binarized templates, detection is possible even under noise. As a result, services triggered by the detection of a specific sound can be provided, such as distributing coupons or points when a theme song played in a store is detected, which allows services to be diversified and enriched.

In the present embodiment, associating the sound source template D1 used for detecting a specific sound with an arbitrary event makes it possible to generate an event triggered by detection of the specific sound at the user terminal 1A. It is also possible to collect information on which user detected what sound and where, to investigate actual user behavior, and to strengthen marketing through the construction of big data. In particular, because the servers 6 and 7 arranged on the communication network 5 convert the evaluation acoustic signal into the comparison target template and compare the two templates to calculate the matching rate, the processing load on the user terminal 1A carried by the user is reduced and its memory capacity is used effectively.

According to the present embodiment, when generating the sound source template D1, the sound source template generation unit 68 detects the signal intensity at each note of the scale and extracts the presence or absence of an acoustic signal according to whether the detected intensity exceeds a threshold set to a different value for each type of event. By creating sound source templates D1 with different threshold values, different events can be generated depending on the type of sound source template D1 even when the same specific sound is detected. Thus, for example, even if members of a family enter the same store at the same time and detect the same specific sound, different events can be generated for adults and children, making it possible to provide services tailored to user attributes such as age and gender.

According to the present embodiment, the template identifier of the sound source template D1 distributed by the sound source template distribution unit 69a is registered in the event database 66 in association with the event to be generated. This makes it possible, for example, to distribute sound source templates D1 from a server on the Internet via Web pages, e-mail, or the like, and to associate an event with each sound source template D1. Services on the Web and services in physical stores can thus be linked closely, and services can be tailored more finely.

In the present embodiment, the matching rate display control unit 64a performs control for displaying a graphic corresponding to the matching rate, based on the comparison result signal output from the matching rate output unit 75. Because the matching rate is displayed as one of the events, the operator can see the matching rate and thereby visually gauge the distance to the sound source. This makes it possible, for example, to build a mechanism like a sound source radar that indicates the distance and direction to a specific sound, to guide the user to the position of the sound source even in a noisy environment, and to provide a variety of game-like services based on this mechanism.

[Various services]
Next, services that apply the acoustic signal detection system and method according to the embodiments described above will be described. The following services can be implemented in both the first embodiment and the second embodiment.

(Coupon distribution service)
First, a case in which the acoustic signal detection system and method according to the embodiments described above are applied to a coupon distribution service will be described as an example. In this coupon distribution service, when a user visits a physical store such as a supermarket or a consumer electronics retailer and the user terminal 1 or 1A recognizes a specific sound output from the store's speakers, coupons or point information are displayed on the screen of the display unit 13a of that user terminal 1 or 1A to provide a discount service.

More specifically, in the coupon distribution service, the "sound source template" is generated from the store's own theme song or from a sound effect played during a time sale. The event to be generated for this sound source template D1 may be, for example, the award of store-visit points, a discount service, or time sale information.

Here, by changing the generated event according to the user's attributes, even when the same theme song is broadcast throughout a shopping mall, a coupon for cosmetics or groceries can be generated as an event for women, a coupon for toys for children, and a coupon for golf or home appliances for men. In this case, the same part of the song may be used as the sound source template D1 and the associated event information changed according to the user attribute; alternatively, different parts may be generated as sound source templates D11 to D13, with different event information associated with each sound source template.
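The attribute-dependent event selection described above can be pictured as a lookup keyed by template identifier and user attribute. The following Python sketch is only an illustration under stated assumptions: the table contents, the attribute labels, and the names `EVENT_TABLE` and `select_event` are hypothetical and do not appear in the specification.

```python
# Hypothetical event table: (template identifier, user attribute) -> event identifier.
# The keys and values are illustrative assumptions, not data from the specification.
EVENT_TABLE = {
    ("D1", "female"): "coupon_cosmetics_groceries",
    ("D1", "child"): "coupon_toys",
    ("D1", "male"): "coupon_golf_appliances",
}


def select_event(template_id, user_attribute, table=EVENT_TABLE):
    """Return the event identifier for a detected template and user attribute.

    Returns None when no event is associated with the pair.
    """
    return table.get((template_id, user_attribute))
```

The alternative in the text, using separate templates D11 to D13, would correspond to a table keyed by template identifier alone.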

(Line quality determination service)
Next, a case where the acoustic signal detection system and method of each embodiment described above are applied to a line quality determination service will be described as an example. This line quality determination service outputs a specific sound around the telephone when telephone line installation work is completed, and determines the line quality according to whether an echo occurs at the user terminal 1 or 1A. In this case, the user terminal 1 or 1A is carried by the installation contractor. The line quality is rated on a 10-level scale using the following evaluation formula.

Evaluation formula = Normalized(Sum(M_n[i] × Amp[i])), range: [0, 10]
Here, "M_n" is the matching rate, and "Amp" is the average of the volume (amplitude) at that time over the template width. The normalization (Normalized) in the above formula may be performed as needed.
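As a rough illustration of the evaluation formula, the following Python sketch computes a score in the range [0, 10]. The specification does not say how the normalization is done; dividing by the sum of the amplitudes and scaling by 10 is an assumption made here, as are the function and argument names.

```python
def line_quality_score(matching_rates, amplitudes):
    """Score in [0, 10] from matching rates M_n[i] (each in 0..1) and
    average amplitudes Amp[i] (non-negative), one pair per template width."""
    if len(matching_rates) != len(amplitudes):
        raise ValueError("matching_rates and amplitudes must have the same length")
    total_amp = sum(amplitudes)
    if total_amp == 0:
        return 0.0  # no sound at all, so nothing to weight
    # Sum(M_n[i] x Amp[i]) as written in the evaluation formula
    weighted = sum(m * a for m, a in zip(matching_rates, amplitudes))
    # Assumed normalization: amplitude-weighted mean of the matching rates,
    # scaled to the [0, 10] range
    return 10.0 * weighted / total_amp
```

The score could then be rounded to obtain one of the 10 levels; how levels map to the presence or absence of echo is left to the application.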

Further, in this service, as shown in FIGS. 11(a) and 11(b), the facial expression of a character displayed on the display unit 13a changes according to the degree of echo, so that the installation contractor can easily recognize the determination result. Specifically, the image shown in FIG. 11(a) is displayed when no echo occurs, and the image shown in FIG. 11(b) is displayed when an echo occurs. In this embodiment the evaluation is expressed in 10 levels, but the number of levels is not limited to this.

When this line quality determination is performed with a live voice, a sound source template is created in real time and evaluated using the same evaluation formula as above. This can be handled by adjusting, as appropriate, the length of the buffered audio relative to the length of the live voice used as the sound source.

According to this embodiment, by using the present invention for the line quality determination service, whether an echo occurs can be checked easily during telephone line installation work.

(Route guidance service)
Next, a case where the acoustic signal detection system and method according to each embodiment described above are applied to a route guidance service will be described as an example. This route guidance service guides the user to a predetermined destination such as a target platform, ticket gate, or restroom, based on, for example, a specific sound output from a speaker in a station.

The "sound source templates" used in this service include, for example, announcements repeating the station name, sound effects played at each station (such as departure melodies), and other sound effects unique to each station. Destinations for guidance include, for example, a given platform, a ticket gate, or the location of a restroom. In this service, the relationship between the type and signal strength of each sound and the destination is stored as map information and associated with the event information.

This map information may be stored in the user terminal 1, or in the management server 2 or the control management server 6. Holding the map information in the management server 2 or the control management server 6 makes updating easy. When it is held in the user terminal 1, the latest map information is delivered to the user terminal 1 as it becomes available.

When the system is used for this service, the sound radar function described above is used to tell the user, according to the matching rate, whether the next place to go is near or far. This makes it easier to guide the user to the destination. When the present system and method are used for the route guidance service in this way, the user can be guided appropriately to the destination based on the sounds output at each location, even in stations with complicated transfers such as Shinjuku Station, Shibuya Station, and Tokyo Station, or in a large station visited for the first time.
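One way to picture the near/far guidance is a simple mapping from the matching rate to a hint shown to the user. The sketch below is an assumption-laden illustration: it presumes that a higher matching rate means the terminal is closer to the speaker, and the threshold values and the name `proximity_hint` are hypothetical.

```python
def proximity_hint(matching_rate, near_threshold=0.8, far_threshold=0.4):
    """Map a matching rate in [0, 1] to a coarse guidance hint.

    The thresholds are illustrative assumptions, not values from the text.
    """
    if not 0.0 <= matching_rate <= 1.0:
        raise ValueError("matching_rate must be within [0, 1]")
    if matching_rate >= near_threshold:
        return "near"
    if matching_rate >= far_threshold:
        return "approaching"
    return "far"
```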

(Sound authentication service)
Next, a case where the acoustic signal detection system and method of each embodiment described above are applied to an authentication service will be described as an example.

In this authentication service, the sound source template D1 is distributed to users as a ticket and used for authentication. The "sound source template" used in this service is generated from, for example, the company's theme song for a company visit, the team's cheering song for a sporting event, or the artist's music for a concert. The event generated is the issuing of a ticket.

Here, by playing music containing the specific acoustic signal at an entrance or the like, the event occurs only on user terminals 1 that have downloaded the sound source template D1, and the ticket is displayed on the screen, so that admission can be granted.

Further, when this service is used for concert tickets, different parts of a song may be generated as sound source templates D11 to D13, and event information with different content (ticket content) may be associated with each of these sound source templates.

Because a different event can thus be generated for each user, seat information for the concert venue, for example, can be delivered to each user. As in the route guidance service above, the sound radar function and map information of the venue can also be used to guide the user to a seat or to facilities in the venue (restrooms, shops, and so on).

In addition, by associating a prize-winning event with the sound source templates D11 to D13 generated from predetermined parts, a service can be provided in which selected users are notified that they have won a prize when the artist plays that part during the concert.

When an artist's song is distributed as the sound source template D1, the song may also be made available as a ringtone. This benefits users, who obtain a favorite artist's song as a ringtone for everyday use. In this case, the association with the event identifier is removed after the template has been used as a concert ticket. The sound source template D1 itself is also given copy protection or similar processing to enhance security.

(Missed-stop prevention service)
Next, a case where the acoustic signal detection system and method according to each embodiment described above are applied to a missed-stop prevention service will be described as an example.

The missed-stop prevention service notifies, for example, a user riding a train of arrival at the target station. The "sound source templates" used in this service include station names, departure sounds, door-opening sounds, and in-car announcements. If each station uses a different sound, the sound of the station one stop before may be recognized, for example, to notify the user that the next station is the destination. When a station name is recognized, the sound source template D1 is created from the sound of the announcement treated as music with intonation, rather than by speech recognition.

In this case, when the user is riding a train or bus, for example, the sound output at the destination is recognized and the user is notified of arrival, so that missing the stop can be prevented.

[Modifications]
The description of each embodiment above is an example of the present invention. The present invention is therefore not limited to the embodiments described above, and various changes can be made according to the design and the like without departing from the technical idea of the present invention. Modifications of the present invention are described below.

(Modification 1)
In the template comparison described above, the matching rate was calculated by comparing templates in which binarized acoustic signals are distributed; however, the present invention is not limited to this, and various comparison methods can be used. For example, difference data from the maximum volume may be used in the sound source template D1. Specifically, the difference from the maximum volume is first calculated for each block of the sound source template D1, normalized, and stored. Template matching is then performed, using a cross-correlation coefficient.

Then, in the region with the highest matching rate, the volume (amplitude) difference data in the same frequency range as the template is extracted and normalized. Next, the sound source template and the comparison target template are compared. Portions where the difference data differ greatly (beyond a threshold) are treated as noise. The template matching rate and the volume (amplitude) difference matching rate are added together, and the sum is compared with a threshold to determine the comparison result.
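The procedure of Modification 1 can be sketched in Python as follows. This is a minimal illustration under several assumptions: the cross-correlation coefficient is taken at zero lag (a Pearson coefficient), the amplitude difference matching rate is defined as the mean agreement of the normalized differences, and all function names and threshold values are hypothetical.

```python
import math


def normalized_diff_from_max(amplitudes):
    """Difference of each block's amplitude from the maximum, scaled to [0, 1]."""
    peak = max(amplitudes)
    diffs = [peak - a for a in amplitudes]
    largest = max(diffs)
    return [d / largest if largest > 0 else 0.0 for d in diffs]


def correlation_coefficient(x, y):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant sequence carries no shape information
    return cov / (dev_x * dev_y)


def amplitude_diff_match_rate(template_diffs, target_diffs, noise_threshold=0.5):
    """Mean agreement of normalized differences; blocks that differ by more
    than noise_threshold are treated as noise and skipped (assumed rule)."""
    kept = [(t, e) for t, e in zip(template_diffs, target_diffs)
            if abs(t - e) <= noise_threshold]
    if not kept:
        return 0.0
    return sum(1.0 - abs(t - e) for t, e in kept) / len(kept)


def is_match(template_rate, diff_rate, threshold=1.5):
    """Add the two rates and compare the sum with a threshold."""
    return template_rate + diff_rate >= threshold
```

Because the two rates are summed, the decision threshold here ranges over [0, 2] rather than [0, 1].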

The embodiments described above may also be given a function for automatically adjusting the template parameters. With this automatic adjustment function, the optimum parameters can be changed automatically and dynamically in environments with different SN ratios, the SN ratio being the ratio of the signal level to the noise level. For example, the parameters can be adapted to the different noise levels found outdoors and inside a home. The parameters can also be adjusted dynamically in real time.

This automatic parameter adjustment is described below, for the case of using machine learning and the case of using dynamic parameters.

- Parameter adjustment using machine learning
First, after the template is created, learning data is prepared. The learning data consists of recordings in which the template sound source is mixed with noise, and arbitrary recordings that do not contain the template sound source. Multiple mixed recordings with different noise levels are prepared.

Next, a supervised machine learning function is used to learn the window width (block size), the number of template volume (amplitude) values, and the number of volume (amplitude) values of the sound source being compared. If learning does not converge across both high and low noise levels, the learning results are held separately. If machine learning does not converge for all the learning data, the learning data is classified by SN ratio and learning is performed for each class.
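The classification of learning data by SN ratio might look like the following Python sketch. It assumes each learning sample carries known signal and noise powers and that classes are fixed-width decibel bins; the field names, the bin width, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def snr_db(signal_power, noise_power):
    """SN ratio in decibels for positive signal and noise powers."""
    return 10.0 * math.log10(signal_power / noise_power)


def group_by_snr(samples, bin_width_db=10.0):
    """Group learning samples into SN-ratio classes of bin_width_db each.

    Each sample is a dict with 'signal_power' and 'noise_power' keys
    (an assumed layout). Returns {bin index: [samples]}.
    """
    groups = defaultdict(list)
    for sample in samples:
        level = snr_db(sample["signal_power"], sample["noise_power"])
        groups[math.floor(level / bin_width_db)].append(sample)
    return dict(groups)
```

Learning would then be run once per returned class, and the resulting parameters stored together with the class's SN ratio.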

Using the results learned for each class in this way, the matching rate is stored for all the learning data. This data may be used for dynamic parameter adjustment. If learning converges for all the teacher signals, the learning result is applied to all the learned sound source templates. The generated learning data is held in either the user terminal or the server.

The data held consists of the sound source template, the SN ratio, the average volume (amplitude), the block size, the number of template volume (amplitude) values used for comparison, and the number of volume (amplitude) values of the sound source being compared.

When the learning function is used in this way, one template may have multiple sets of attribute information. Various machine learning techniques can be applied, such as artificial neural networks, genetic algorithms, reinforcement learning, and supervised learning.

Next, the application of dynamic parameters is described. After the application is started, the intensity of the sound entering the microphone is first held as an average over a fixed time (t_amp) (step 1). Parameter candidates are then extracted based on the average volume (amplitude) stored as template attribute information (step 2), and the matching rate is calculated for all the candidate parameters. The matching rate is then compared with a threshold (step 3).

If the threshold is exceeded and a match is determined (Yes in step 3), the matching rate is calculated using parameters that were not candidates (step 4). All of the non-candidate parameters may be used, or only one, or a random selection (step 5).

If the matching rate obtained with the non-candidate parameters is not far from the stored matching rate, a match is determined (step 6). Conversely, if it is far from the stored matching rate, no match is determined (step 7).

If, on the other hand, the threshold is not exceeded and no match is determined (No in step 3), the parameter candidates are extracted again (step 8). The candidates may be chosen based on the SN ratio, or all parameters may be tried. In this extraction, the sound source determined to match in step 4 is retained, and the learned parameter set whose behavior is closest is selected (step 9).

The matching rate is then calculated based on the selected parameters (step 10), and the processing is repeated until no match is determined as in step 7. When no match is determined as in step 7, the processing of step 8 is performed.
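Steps 1 to 7 of the dynamic parameter procedure can be sketched in Python as follows; the re-extraction loop of steps 8 to 10 is omitted. This is one possible reading of the steps, not a definitive implementation: the parameter-set layout (`avg_amp`, `expected_rate`), the tolerance values, and the function names are assumptions, and the matcher is passed in as a callable because its details lie outside this passage.

```python
def average_amplitude(samples):
    """Step 1: mean absolute amplitude of microphone samples over t_amp."""
    return sum(abs(s) for s in samples) / len(samples)


def select_candidates(param_sets, mic_avg_amp, tolerance=0.1):
    """Step 2: parameter sets whose stored average amplitude is close
    to the measured microphone average (tolerance is an assumed value)."""
    return [p for p in param_sets if abs(p["avg_amp"] - mic_avg_amp) <= tolerance]


def judge(param_sets, mic_avg_amp, match_rate, threshold=0.8, max_gap=0.2):
    """Steps 2 to 7. match_rate(params) returns a rate in [0, 1].

    Returns True when a match is determined, False otherwise.
    """
    candidates = select_candidates(param_sets, mic_avg_amp)
    others = [p for p in param_sets if p not in candidates]
    # Step 3: at least one candidate must reach the threshold
    if not any(match_rate(p) >= threshold for p in candidates):
        return False  # the text would re-extract candidates here (step 8)
    # Steps 4 to 7: cross-check with non-candidate parameters against the
    # rate stored for them during learning
    for p in others:
        if abs(match_rate(p) - p["expected_rate"]) > max_gap:
            return False  # step 7: far from the stored rate
    return True  # step 6
```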

(Modification 2)
In each of the embodiments described above, the sound source template D1 generated from the specific acoustic signal is stored in the user terminal or the server, and the evaluation acoustic signal output from the sound reproduction means 4 is read (as the comparison target template D3) and compared. However, the present invention is not limited to this. For example, sound data based on the sound source template may be output from the speaker or the like of the user terminal 1 or 1A, and read and compared by an external device. That is, in this modification the user terminal 1 or 1A functions as the sound reproduction means 4, and the external device takes the role played by the user terminal 1 or 1A in the embodiments.

In this case, the template held by the external device as the comparison target is also a template generated from the specific acoustic signal. The event information is stored on the server side and associated with the identifier of that template.

The sound source template to be output is acquired from the server and held in the memory of the user terminal 1, and is output so that the external device can read it. The external device determines whether the two templates match and, if they do, can generate the event associated with the template identifier.

With the configuration of Modification 2, a concert ticket authentication service can be provided. In this authentication service, a template generated from the specific acoustic signal is distributed to the user terminals 1 and 1A as a ticket. An authentication device equipped with a microphone capable of picking up acoustic signals is installed at the venue entrance. The comparison target template may be stored in the internal memory of the authentication device or, if the device can communicate, in a server on the communication network 5.

With the user terminal 1 held close to the authentication device at the entrance, sound based on the template is output from the terminal's speaker, and authentication is performed by comparing it with the template stored on the authentication side and determining whether the two match exactly.
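The exact-match check on the authentication side can be sketched as follows. The sketch assumes a template is a list of time frames, each a list of 0/1 values per scale step, and that the matching rate is the fraction of cells that agree; both the layout and the names `matching_rate` and `authenticate` are illustrative assumptions.

```python
def matching_rate(presented, stored):
    """Fraction of cells (time frame x scale step) with equal binary values."""
    if len(presented) != len(stored) or any(
        len(p) != len(s) for p, s in zip(presented, stored)
    ):
        raise ValueError("templates must have the same shape")
    cells = sum(len(row) for row in presented)
    if cells == 0:
        return 0.0
    same = sum(x == y for p, s in zip(presented, stored) for x, y in zip(p, s))
    return same / cells


def authenticate(presented, stored):
    """Grant admission only when the two templates match exactly."""
    return matching_rate(presented, stored) == 1.0
```

In a real venue the template recovered through the microphone may contain noise, so a deployment might relax the exact-match condition to a high threshold; the text itself specifies an exact match.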

In such an authentication service, frequencies are cut when the sound source template is generated, so the resulting sound differs from the original sound; security is therefore enhanced even when a clip of the sound is simply cut out and used as authentication information. In addition, because the data is dynamic rather than static like a barcode or QR code (registered trademark), it is difficult to imitate, which raises the security level.

In this modification, sound based on the template is output from the speaker of the user terminal 1 or 1A. Alternatively, as shown in FIGS. 10(a) and 10(b), a display screen based on the template may be shown on the display unit 13a, and the authentication device may be provided with imaging means such as a camera so that authentication is performed using an image. Because this processing is silent, authentication can be performed even in situations where sound cannot be output.

(Other modifications)
In the second embodiment, the functions related to acoustic signal detection are distributed across multiple server devices, but the present invention is not limited to this. A single server device may provide all the functions, or a larger number of server devices, each specialized for the function of one module, may be distributed and made to cooperate through communication.

A ... Area
D1, D11 to D13 ... Sound source template
D2 ... Event information
D3 ... Comparison target template
1, 1A ... User terminal
2 ... Management server
3 ... Radio base station
4 ... Sound reproduction means
5 ... Communication network
6 ... Control management server
7 ... Template comparison server
11 ... Wireless interface
12 ... Input interface
12a ... Microphone
13 ... Output interface
13a ... Display unit
14 ... Sound source template receiving unit
15 ... Memory
16 ... Sound acquisition unit
17, 18 ... Application execution unit
21 ... Communication unit
22 ... Control unit
23 ... User information database
24 ... Event database
25 ... Sound source template database
41 ... Control unit
42 ... Speaker
43a, 43b ... Buffer
61 ... Communication unit
62 ... User registration unit
63 ... Evaluation target conversion unit
64 ... Event occurrence management unit
64a ... Matching rate display control unit
65 ... Event generation unit
66 ... Event database
67 ... User information database
68 ... Sound source template generation unit
69 ... Distribution management unit
69a ... Sound source template distribution unit
71 ... Communication unit
72 ... Sound source template acquisition unit
73 ... Sound source template database
74 ... Template comparison unit
75 ... Matching rate output unit
151 ... Event database
171 ... Evaluation target conversion unit
172 ... Template comparison unit
173 ... Matching rate output unit
174 ... Event occurrence management unit
174a ... Matching rate display unit
175 ... Sound source template generation unit
176 ... Reception management unit
181 ... Sound transmission unit
182 ... Event execution unit
221 ... Sound source data acquisition unit
222 ... Sound source template generation unit
223 ... Event generation unit
224 ... Distribution management unit
224a ... Sound source template distribution unit
225 ... User registration unit

Claims

An acoustic signal detection system for detecting a specific acoustic signal included in an evaluation acoustic signal to be evaluated, the system comprising:
an evaluation target conversion unit that acquires the evaluation acoustic signal, extracts from the acquired evaluation acoustic signal the presence or absence of an acoustic signal in a predetermined frequency band based on equal temperament, and generates a comparison target template in which the distribution of the acoustic signal and its temporal change are binarized for each scale step and recorded in time series;
a sound source template generation unit that, based on the specific acoustic signal, extracts the presence or absence of an acoustic signal in a predetermined frequency band based on equal temperament, and generates a sound source template in which the distribution of the acoustic signal and its temporal change are binarized for each scale step and recorded in time series;
a sound source template acquisition unit that acquires the sound source template;
a template comparison unit that compares the comparison target template with the sound source template and calculates a matching rate between the binarized distributions of the acoustic signals;
a matching rate output unit that outputs a comparison result signal corresponding to the matching rate calculated by the template comparison unit;
an event database that stores a template identifier identifying the sound source template generated by the sound source template generation unit in association with an event identifier identifying an event to be generated; and
an event occurrence management unit that, based on the comparison result signal output from the matching rate output unit, looks up in the event database the template identifier of the sound source template related to the comparison result signal, selects the event to be generated, and generates the selected event,
wherein, when generating the sound source template from the specific acoustic signal, the sound source template generation unit detects the signal intensity at each scale step and extracts the presence or absence of the acoustic signal according to whether the detected signal intensity exceeds a predetermined threshold, and
the threshold is set to a different value depending on the type of the event to be generated.

The acoustic signal detection system according to claim 1, wherein the sound source template generation unit, when generating the sound source template, detects the presence or absence of a sound at each scale step after removing harmonic components.

The acoustic signal detection system according to claim 1 or 2, wherein the template comparison unit, when comparing with the sound source template, detects the presence or absence of a sound at each scale step with harmonic components included, without removing them.

An acoustic signal detection server for detecting a specific acoustic signal included in an evaluation acoustic signal to be evaluated, the server comprising:
An evaluation target conversion unit that acquires the evaluation acoustic signal through a communication network, extracts, for the acquired evaluation acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band based on the scale average rate, and generates a comparison target template in which the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series;
A sound source template generation unit that extracts, for the specific acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band based on the scale average rate, and generates a sound source template in which the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series;
A sound source template acquisition unit that acquires the sound source template;
A template comparison unit that compares the comparison target template with the sound source template and calculates a matching rate of the binarized distribution of the acoustic signal;
A matching rate output unit that sends a comparison result signal corresponding to the matching rate calculated by the template comparison unit to the communication network;
An event database that stores a template identifier for identifying the sound source template generated by the sound source template generation unit and an event identifier for identifying an event to be generated; and
An event generation management unit that, based on the comparison result signal output from the matching rate output unit, checks the event database for the template identifier of the sound source template related to the comparison result signal, selects the event to be generated, and generates the selected event,
wherein, when generating the sound source template for the specific acoustic signal, the sound source template generation unit detects the signal intensity at each scale and extracts the presence or absence of the acoustic signal according to whether the detected signal intensity exceeds a predetermined threshold value, and
the threshold value is set to a different value depending on the type of the event to be generated.
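
The template comparison and event selection recited above can be sketched as follows. This is a minimal illustration under stated assumptions: the matching rate is assumed to be the fraction of cells set to 1 in the sound source template that are also 1 in the comparison target template (the claims do not fix this definition), and the identifiers, the `min_rate` cutoff, and the dictionary standing in for the event database are all hypothetical.

```python
def matching_rate(source_template, target_template):
    """Fraction of cells set to 1 in the source template that are also 1 in the target."""
    active = 0
    matched = 0
    for source_frame, target_frame in zip(source_template, target_template):
        for source_bit, target_bit in zip(source_frame, target_frame):
            if source_bit:
                active += 1
                if target_bit:
                    matched += 1
    return matched / active if active else 0.0

# Hypothetical event database: template identifier -> event identifier.
EVENT_DATABASE = {"tmpl-001": "event-coupon", "tmpl-002": "event-alert"}

def select_event(template_id, rate, min_rate=0.8, database=EVENT_DATABASE):
    """Return the event identifier for a template when the rate is high enough, else None."""
    if rate < min_rate:
        return None
    return database.get(template_id)

# Rows are time frames, columns are scales.
source = [[1, 0, 1], [0, 1, 0]]
target = [[1, 1, 1], [0, 0, 0]]
rate = matching_rate(source, target)
print(rate, select_event("tmpl-001", rate, min_rate=0.6))
```

Because both templates are binary, the comparison reduces to counting matching cells, which is far lighter than source separation.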

The acoustic signal detection server according to claim 4, wherein the sound source template generation unit detects the presence or absence of a sound of each scale after removing a harmonic component when generating the sound source template.

The acoustic signal detection server according to claim 4 or 5, wherein, when comparing with the sound source template, the template comparison unit detects the presence or absence of a sound of each scale with harmonic components included, without removing them.

An acoustic signal detection device for detecting a specific acoustic signal included in an evaluation acoustic signal to be evaluated, the device comprising:
A sound acquisition unit that acquires sound and outputs the sound as the evaluation acoustic signal;
An evaluation target conversion unit that extracts, for the evaluation acoustic signal output from the sound acquisition unit, the presence or absence of an acoustic signal in a predetermined frequency band based on the scale average rate, and generates a comparison target template in which the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series;
A sound source template generation unit that extracts, for the specific acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band based on the scale average rate, and generates a sound source template in which the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series;
A sound source template acquisition unit that acquires the sound source template;
A template comparison unit that compares the comparison target template with the sound source template and calculates a matching rate of the binarized distribution of the acoustic signal;
A matching rate output unit that outputs a comparison result signal corresponding to the matching rate calculated by the template comparison unit;
An event database that stores a template identifier for identifying the sound source template generated by the sound source template generation unit and an event identifier for identifying an event to be generated; and
An event generation management unit that, based on the comparison result signal output from the matching rate output unit, checks the event database for the template identifier of the sound source template related to the comparison result signal, selects the event to be generated, and generates the selected event,
wherein, when generating the sound source template for the specific acoustic signal, the sound source template generation unit detects the signal intensity at each scale and extracts the presence or absence of the acoustic signal according to whether the detected signal intensity exceeds a predetermined threshold value, and
the threshold value is set to a different value depending on the type of the event to be generated.
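
The conversion from acquired sound to a binarized time-series template can be sketched as below. This is only an illustration under assumptions that the claims do not state: the "scale average rate" is read here as 12-tone equal temperament, the per-scale intensity is measured with the Goertzel algorithm (a technique chosen for this sketch, not named in the claims), and the sample rate, frame length, base frequency, and threshold are hypothetical values.

```python
import math

def scale_frequencies(lowest_hz=440.0, count=12):
    """Frequencies of `count` consecutive semitones in 12-tone equal temperament."""
    return [lowest_hz * 2 ** (k / 12) for k in range(count)]

def goertzel_power(samples, sample_rate, frequency):
    """Squared spectral magnitude at `frequency`, computed with the Goertzel algorithm."""
    coeff = 2.0 * math.cos(2.0 * math.pi * frequency / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def make_template(samples, sample_rate, frame_len, frequencies, threshold):
    """Binarize per-scale power for each frame; rows are frames in time order."""
    template = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        row = []
        for f in frequencies:
            # Normalize so a unit-amplitude sine at f gives roughly 0.25.
            power = goertzel_power(frame, sample_rate, f) / (frame_len ** 2)
            row.append(1 if power > threshold else 0)
        template.append(row)
    return template

rate = 8000
tone = [math.sin(2 * math.pi * 440.0 * n / rate) for n in range(4096)]
template = make_template(tone, rate, 2048, scale_frequencies(), threshold=0.05)
for row in template:
    print(row)  # only the first scale (440 Hz) is marked present in each frame
```

Each row holds one bit per scale, so a template for a few seconds of sound is a small binary matrix that can be compared cell by cell.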

The acoustic signal detection device according to claim 7, wherein the sound source template generation unit detects the presence or absence of a sound of each scale after removing a harmonic component when generating the sound source template.

The acoustic signal detection device according to claim 7 or 8, wherein, when comparing with the sound source template, the template comparison unit detects the presence or absence of a sound of each scale with harmonic components included, without removing them.

An acoustic signal detection program for detecting, by a computer, a specific acoustic signal included in an evaluation acoustic signal to be evaluated, the program causing the computer to execute processing including:
An evaluation target conversion step of acquiring the evaluation acoustic signal, extracting, for the acquired evaluation acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band based on the scale average rate, and generating a comparison target template in which the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series;
A template comparison step of generating and acquiring a sound source template in which, for the specific acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band is extracted based on the scale average rate and the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series, comparing the comparison target template with the sound source template, and calculating a matching rate of the binarized distribution of the acoustic signal;
A matching rate output step of outputting a comparison result signal corresponding to the matching rate calculated in the template comparison step;
A database control step of associating, in an event database, a template identifier for identifying the sound source template generated in the template comparison step with an event identifier for identifying an event to be generated; and
An event generation management step of, based on the comparison result signal output in the matching rate output step, checking the event database for the template identifier of the sound source template related to the comparison result signal, selecting the event to be generated, and generating the selected event,
wherein, in the template comparison step, when the sound source template is generated for the specific acoustic signal, the signal intensity at each scale is detected and the presence or absence of the acoustic signal is extracted according to whether the detected signal intensity exceeds a predetermined threshold value, and
the threshold value is set to a different value depending on the type of the event to be generated.

The acoustic signal detection program according to claim 10, wherein, in the template comparison step, the presence or absence of a sound of each scale is detected after the harmonic component is removed at the time of generating the sound source template.

The acoustic signal detection program according to claim 10 or 11, wherein, in the template comparison step, when comparing with the sound source template, the presence or absence of a sound of each scale is detected with harmonic components included, without removing them.

An acoustic signal detection method for detecting a specific acoustic signal included in an evaluation acoustic signal to be evaluated, the method including:
An evaluation target conversion process of acquiring the evaluation acoustic signal, extracting, for the acquired evaluation acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band based on the scale average rate, and generating a comparison target template in which the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series;
A template comparison process of generating and acquiring a sound source template in which, for the specific acoustic signal, the presence or absence of an acoustic signal in a predetermined frequency band is extracted based on the scale average rate and the distribution of the acoustic signal and its temporal change are binarized for each scale and recorded in time series, comparing the comparison target template with the sound source template, and calculating a matching rate of the binarized distribution of the acoustic signal;
A matching rate output process of outputting a comparison result signal corresponding to the matching rate calculated in the template comparison process;
A database control process of associating, in an event database, a template identifier for identifying the sound source template generated in the template comparison process with an event identifier for identifying an event to be generated; and
An event generation management process of, based on the comparison result signal output in the matching rate output process, checking the event database for the template identifier of the sound source template related to the comparison result signal, selecting the event to be generated, and generating the selected event,
wherein, in the template comparison process, when the sound source template is generated for the specific acoustic signal, the signal intensity at each scale is detected and the presence or absence of the acoustic signal is extracted according to whether the detected signal intensity exceeds a predetermined threshold value, and
the threshold value is set to a different value depending on the type of the event to be generated.

The acoustic signal detection method according to claim 13, wherein, in the template comparison process, the presence or absence of a sound of each scale is detected after removing harmonic components at the time of generating the sound source template.

The acoustic signal detection method according to claim 13 or 14, wherein, in the template comparison process, when comparing with the sound source template, the presence or absence of a sound of each scale is detected with harmonic components included, without removing them.