JP2012194653A - Data processing device, data processing system, data processing method, and program

Info

Publication number: JP2012194653A
Application number: JP2011056655A
Authority: JP
Inventors: Aiko Akutagawa; 愛子芥川
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2011-03-15
Filing date: 2011-03-15
Publication date: 2012-10-11

Abstract

PROBLEM TO BE SOLVED: To search for a musical piece quickly by reducing the amount of metadata used for the music search. SOLUTION: A data processing device comprises: first and second measurement units 102A and 102B that measure a fundamental frequency f(p) of audio data M1 and M2 of a musical piece at preset time intervals t; first and second calculation units 103A and 103B that use each measured fundamental frequency f(p) to calculate its deviation q from the reference frequency of a reference sound, as the sound corresponding to that fundamental frequency; and first and second metadata generation units 104A and 104B that use the deviation q to generate metadata MET1(t,q) and MET2(t,q) describing features of the musical piece.

Description

The present invention relates to a data processing device, a data processing system, a data processing method, and a program.

In recent years, with the spread of personal computers (hereinafter "PCs"), multifunction mobile phones, and the like, much music has been converted into digital data. In doing so, metadata describing attributes of the audio data, such as the song title and the composer's name, is often generated to make the music easier to search for and manage.

In addition, with the spread of electronic commerce over networks, content providers offer an enormous number of music pieces. For example, when an individual user searches among them for a desired piece using a PC, some keyword such as the song title or the composer's name is needed. Without such a keyword, it is difficult to find the desired piece.

To address this problem, Patent Document 1 discloses a technique that applies signal processing such as a fast Fourier transform to audio data at predetermined time intervals and extracts pitch features as metadata, so that music can be searched for using sound (a melody) as the keyword. The signal processing method is disclosed in Patent Document 2.

Patent Document 1: JP 2008-192102 A
Patent Document 2: JP 2006-201614 A

When sound is used as the keyword to search for one desired piece among many different recordings of the same work, the amount of metadata used for the search increases. Inevitably, the amount of computation needed to match the metadata of the query against the metadata of the search targets grows, and so does the time required for the search.

An object of the present invention is to provide a data processing device, a data processing system, a data processing method, and a program that reduce the amount of metadata used for a search and allow music to be searched for quickly.

The data processing device of the present invention includes: a measurement unit that measures the fundamental frequency of the audio data of a music piece at predetermined time intervals; a calculation unit that, using the fundamental frequency obtained at each measurement by the measurement unit, calculates the deviation of that fundamental frequency from the reference frequency of a reference sound, as the sound corresponding to that fundamental frequency; and a metadata generation unit that generates metadata on the features of the music piece using the deviations calculated by the calculation unit.

The data processing method of the present invention includes: a measurement step of measuring the fundamental frequency of the audio data of a music piece at predetermined time intervals; a calculation step of calculating, using the fundamental frequency obtained at each measurement in the measurement step, the deviation of that fundamental frequency from the reference frequency of a reference sound, as the sound corresponding to that fundamental frequency; and a metadata generation step of generating metadata on the features of the music piece using the deviations calculated in the calculation step.

The program of the present invention causes a computer to execute: a measurement procedure of measuring the fundamental frequency of the audio data of a music piece at predetermined time intervals; a calculation procedure of calculating, using the fundamental frequency obtained at each measurement in the measurement procedure, the deviation of that fundamental frequency from the reference frequency of a reference sound, as the sound corresponding to that fundamental frequency; and a metadata generation procedure of generating metadata on the features of the music piece using the deviations calculated in the calculation procedure.

According to the present invention, the amount of metadata used for a search can be reduced, and music can be searched for quickly.

FIG. 1 is a schematic block diagram showing a configuration example of a data processing system A according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a hardware configuration example of the server device 1 according to the embodiment of the present invention.
FIG. 3 is a block diagram showing the functions of the server device 1 shown in FIG. 2.
FIG. 4 is a diagram for explaining the calculation processing according to the embodiment of the present invention.
FIG. 5 is a diagram for explaining the structure of the metadata obtained by the metadata generation processing according to the embodiment of the present invention.
FIG. 6 is a sequence diagram showing an operation example of the data processing system A according to the embodiment of the present invention.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

[Configuration example of data processing system A]
FIG. 1 is a schematic block diagram showing a configuration example of a data processing system A according to an embodiment of the present invention. As shown in FIG. 1, the data processing system A includes a server device 1 serving as the data processing device, a music server device 2, a client terminal device 3, and a network 4.

The data processing system A is a system in which a content provider who owns the server device 1 and the music server device 2 provides, via the network 4, the music desired by the owner of the client terminal device 3 (hereinafter the "individual user").

When providing music, the data processing system A searches the thousands or tens of thousands of music pieces stored in the music server device 2 for the piece the individual user wants (music data M1). If a verbal keyword such as the song title is unknown, the search becomes difficult. The data processing system A therefore lets the user upload audio data M2 stored in the client terminal device 3 to the server device 1, so that the search can use not only words but also sound as the keyword. This embodiment takes the case of searching with sound as the keyword as an example.

To make searching convenient, the data processing system A has a music search mode and a similarity search mode. These two modes are described below.

(Music search mode)
The music search mode is a mode for using the individual user's audio data M2 to search the many music pieces stored in the music server device 2 for the piece the individual user wants.

(Similarity search mode)
The similarity search mode is a mode for searching for music similar to the melody of the audio data M2.

For example, classical music is performed and recorded by performers all over the world. As with classical music, there are countless different recordings of the same work: pieces that share a title but differ in performer or recording environment. The similarity search mode is well suited to searching for such different recordings of the same work.

The server device 1 includes a computer that performs arithmetic processing according to a program. The server device 1 is connected to the network 4 and communicates with the music server device 2 and the client terminal device 3. In response to requests from the client terminal device 3, the server device 1 internally performs the following processing.

(Music search processing)
When the client terminal device 3 requests execution of the music search mode, the server device 1 searches the music server device 2 on the network 4 for the music data M1 and sends the search result to the client terminal device 3.

(Similarity search processing)
When the client terminal device 3 requests execution of the similarity search mode, the server device 1 searches for music data similar to the audio data M2 (referred to as music data M1) and sends the search result to the client terminal device 3.

The music server device 2 stores a plurality of pieces of music data with various attributes, including the music data M1 to be searched for. The music server device 2 is connected to the network 4 and communicates with the server device 1 and the client terminal device 3.

The client terminal device 3 is an electronic device such as a PC, a digital audio device, a PDA (personal digital assistant), or a multifunction/high-function mobile phone. The client terminal device 3 basically has the following functions.

First, the client terminal device 3 can handle audio data. Second, the client terminal device 3 connects to the network 4, whether by wire or wirelessly, and communicates with the server device 1 and the music server device 2. Third, the client terminal device 3 displays text and images on its display screen.

In this embodiment, unless otherwise noted, a PC is used as the example of the client terminal device 3. The configuration of the client terminal device 3 is not particularly limited, as long as it is an electronic device that can handle audio data, connect to the network 4, and display text and images.

Hereinafter, the music data M1 and the audio data M2 are assumed to be digital audio data.

The network 4 is, for example, the Internet, on which TCP/IP (Transmission Control Protocol/Internet Protocol) can be used as the communication protocol.

[Hardware configuration example of server device 1]
FIG. 2 is a block diagram showing a hardware configuration example of the server device 1 according to the embodiment of the present invention. As shown in FIG. 2, the server device 1 includes a CPU 10, a first HDD 11, a second HDD 12, a ROM 13, a RAM 14, an I/O (input/output) port 15, and an internal bus 16. Each component of the server device 1 is connected to the internal bus 16.

The CPU 10 is, for example, a central processing unit. Following the program 131 stored in the ROM 13, the CPU 10 performs the music search processing and the similarity search processing while exchanging data with the first and second HDDs 11 and 12 and the RAM 14 via the internal bus 16.

The first HDD 11 is a storage device, for example a hard disk. The second HDD 12 is likewise a storage device, for example a hard disk. The first HDD 11 mainly stores the metadata used in the music search processing and the similarity search processing.

The second HDD 12 mainly stores the index files used in the music search processing and the similarity search processing.

The ROM 13 is a nonvolatile storage device. It stores the program 131 that the CPU 10 needs in order to execute the music search processing and the similarity search processing.

The RAM 14 is, for example, a DRAM (dynamic random access memory). It stores temporary data that the CPU 10 uses during arithmetic processing.

The I/O port 15 is a port for connecting external devices and passes data between an external device and the CPU 10 via the internal bus 16. External devices that can be connected to the I/O port 15 include, for example, a network card and a keyboard. In this embodiment, the server device 1 communicates with the music server device 2 and the client terminal device 3 on the network 4 via the I/O port 15 to which a network card is connected.

When the CPU 10 reads the program 131 from the ROM 13, it performs processing according to the procedure of the program 131 while accessing the first HDD 11, the second HDD 12, and the RAM 14. The functions of the server device 1 are thereby realized (see FIG. 3).

The hardware configuration shown in FIG. 2 is only an example, and various modifications are possible. For example, the program 131 may be stored in any of the ROM 13, the first HDD 11, and the second HDD 12. As another example, the various data used in this embodiment may be stored on a single hard disk.

[Functional block diagram of server device 1]
FIG. 3 is a block diagram showing the functions of the server device 1 shown in FIG. 2. As shown in FIG. 3, the server device 1 has a first processing system 10A, a second processing system 10B, and a music database DB. The processing of the first and second processing systems 10A and 10B is executed in software by the CPU 10 (see FIG. 2).

First, an overview of the first and second processing systems 10A and 10B is given.

(First processing system 10A)
The first processing system 10A has a first file format conversion unit 101A, a first measurement unit 102A, a first calculation unit 103A, a first metadata generation unit 104A, a query generation unit 105A, a query execution unit 106, and a display processing unit 107. The query generation unit 105A and the query execution unit 106 are an example of a collation unit.

The purpose of the first processing system 10A is to match the individual user's audio data M2 against the music database DB created by the second processing system 10B in order to perform the music search processing and the similarity search processing. Its basic processing is as follows.

(Measurement processing)
First, the first processing system 10A receives the audio data M2 from the client terminal device 3 and measures its fundamental frequency f(p) at predetermined time intervals. The fundamental frequency here is the physical quantity that determines the pitch of a sound (for example, "A" in note-name notation; American note names are used hereinafter); it is the lowest of the frequency components the sound contains. In this embodiment, the fundamental frequency f(p) is measured approximately every 100 ms.
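The text does not say how f(p) is measured. As a rough sketch only, the following assumes a plain autocorrelation estimator (a swapped-in technique, not necessarily the one used in the embodiment) that returns a single fundamental per 100 ms frame; chords with several simultaneous fundamentals are out of its scope. The names `estimate_f0` and `measure`, the sample rate, and the search range of 80 Hz to 1000 Hz are all illustrative assumptions.

```python
import math

def estimate_f0(frame, sample_rate, f_min=80.0, f_max=1000.0):
    """Estimate one fundamental frequency (Hz) of a frame by autocorrelation.

    Returns None when no positive correlation peak is found (e.g. silence).
    """
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

def measure(samples, sample_rate, interval_ms=100):
    """Return a list of (elapsed_ms, f0) pairs, one per interval_ms frame."""
    frame_len = sample_rate * interval_ms // 1000
    results = []
    for k in range(len(samples) // frame_len):
        frame = samples[k * frame_len:(k + 1) * frame_len]
        # Label each frame by the elapsed time at its end (100, 200, ...).
        results.append(((k + 1) * interval_ms, estimate_f0(frame, sample_rate)))
    return results

if __name__ == "__main__":
    # 200 ms of a synthetic 440 Hz tone sampled at 8 kHz.
    tone = [math.sin(2 * math.pi * 440.0 * n / 8000) for n in range(1600)]
    for elapsed_ms, f0 in measure(tone, 8000):
        print(elapsed_ms, f0)
```

Because the lag is an integer number of samples, the estimate is quantized; at 8 kHz the 440 Hz tone is reported as roughly 444 Hz, which is still well within a quarter tone of the true pitch.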

(Calculation processing)
Second, the first processing system 10A calculates, in semitone units, the deviation of the fundamental frequency f(p) from the reference frequency f(A) of a reference sound, and determines the sound corresponding to this deviation. In other words, the first processing system 10A determines the sound corresponding to the offset of the fundamental frequency f(p) from the reference frequency f(A). The reference sound here is the sound corresponding to "A" in American note-name notation. Its reference frequency f(A) is preferably 440 Hz.

It is well known that music consists of fundamental frequency components and various harmonic components. Measuring the fundamental frequency f(p) in the measurement processing, and calculating in the calculation processing how far it deviates from the reference frequency f(A), reveals the sound (for example, "B" in note-name notation) at the time f(p) was measured. The details are described later.

(Metadata generation processing)
Third, based on the sounds obtained by the calculation processing, the first processing system 10A generates metadata representing a feature of the audio data M2, specifically its melody.

(Music search processing / similarity search processing)
Fourth, in accordance with the processing request from the client terminal device 3, the first processing system 10A uses the metadata to search the music database DB for the music data M1 or to perform a similarity search, and sends the result to the client terminal device 3.

(Second processing system 10B)
The second processing system 10B has a second file format conversion unit 101B, a second measurement unit 102B, a second calculation unit 103B, a second metadata generation unit 104B, a crawler unit 108, and a music database generation unit 109.

The purpose of the second processing system 10B is to create the music database DB. The music database DB records information on the music data in the music server device 2, including the music data M1, for use in the music search processing and the similarity search processing. Its basic processing is as follows.

(Collection processing)
First, the second processing system 10B crawls the directories of the music server device 2 and collects the various music data stored there.

(Database creation processing)
Second, the second processing system 10B performs the measurement processing, the calculation processing, and the metadata generation processing to create the music database DB.

The calculation processing and the metadata generation processing, which characterize this embodiment, are described in order below.

[Method of calculating sounds in the calculation processing]
FIG. 4 is a diagram for explaining the calculation processing according to the embodiment of the present invention. In this embodiment, the chromatic scale is used to represent pitch. A semitone in twelve-tone equal temperament is one of twelve equal divisions of an octave. In American note-name notation, one octave consists of the following twelve sounds, from lowest to highest: "C", "D♭", "D", "E♭", "E", "F", "F♯", "G", "A♭", "A", "B♭", and "B".

The frequency ratio of two adjacent semitones is expressed by equation (1).

In this embodiment, as described above, "A" is the reference sound, and its reference frequency is f(A) = 440 Hz. For example, the fundamental frequency f(B♭) of the "B♭" sound, whose pitch is a semitone higher than the reference sound "A", satisfies equation (2) by the relationship in equation (1).

From equation (2), the frequency f(B♭) is expressed by equation (3).

Likewise, the fundamental frequency f(B) of the "B" sound, whose pitch is a semitone higher than "B♭", satisfies equation (4) by the same argument used for f(B♭).

That is, using equation (3), the frequency f(B) is expressed by equation (5).

FIG. 4 shows the fundamental frequency f(p) for each sound from a "C" to the "C" one octave above it. The variable p denotes any of the sounds from "C" to "B". The higher the sound, the higher the fundamental frequency f(p).

The ratio (frequency ratio) between the reference frequency f(A) of the reference sound "A" and the fundamental frequency f(B) of the "B" sound, which is higher than "A" and offset from it by two semitones, is expressed by equation (6).

FIG. 4 shows the ratio between the reference frequency f(A) and the fundamental frequency f(p) of each sound, with the reference sound "A" taken as 1.

Generalizing equation (6), the ratio between the reference frequency f(A) of the reference sound "A" and the fundamental frequency f(p) of a sound p offset from "A" by q semitones can be expressed by equation (7).

In equation (7), the variable q is an integer. Hereinafter, q is called the "deviation count" (the number of semitone steps from the reference sound). Rearranging equation (7) gives the relation in equation (8).

Taking the logarithm to base 2^(1/12) in equation (8), the deviation count q is expressed by equation (9).

Here the deviation count q is rounded to the nearest integer. In equation (9), the reference frequency f(A) of the reference sound "A" is known, so once the fundamental frequency f(p) is known, q can be obtained. This shows how many semitones the sound is offset from the reference sound "A".
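The images for equations (1) to (9) are not reproduced in this text. A reconstruction consistent with the description above and with twelve-tone equal temperament might read as follows. The exact form of the originals is an assumption, particularly the direction of the ratios and the split between (7) and (8); the labels f_upper and f_lower for two adjacent semitones are introduced here for illustration only.

```latex
\begin{align}
\frac{f_{\mathrm{upper}}}{f_{\mathrm{lower}}} &= 2^{1/12} \tag{1}\\
\frac{f(B\flat)}{f(A)} &= 2^{1/12} \tag{2}\\
f(B\flat) &= 2^{1/12}\, f(A) \approx 466.16\ \mathrm{Hz} \tag{3}\\
\frac{f(B)}{f(B\flat)} &= 2^{1/12} \tag{4}\\
f(B) &= 2^{1/12}\, f(B\flat) = 2^{2/12}\, f(A) \approx 493.88\ \mathrm{Hz} \tag{5}\\
\frac{f(B)}{f(A)} &= 2^{2/12} \tag{6}\\
\frac{f(p)}{f(A)} &= 2^{q/12} \tag{7}\\
\left(2^{1/12}\right)^{q} &= \frac{f(p)}{f(A)} \tag{8}\\
q &= \log_{2^{1/12}} \frac{f(p)}{f(A)} = 12 \log_{2} \frac{f(p)}{f(A)} \tag{9}
\end{align}
```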

FIG. 4 shows the correspondence between the deviation count q and each sound. If the fundamental frequency measured in the measurement processing is f(p) = 466.16 Hz, the deviation count is q = 1. This result denotes "B♭", which is offset from the reference sound "A" by q = 1 semitone. A note name is normally written the same way ("C", "D♭", ..., "B") regardless of octave, but in this embodiment the pitch itself can be represented.

In this way, calculating how far the fundamental frequency f(p) deviates from the reference frequency f(A) shows which sound the measured fundamental frequency f(p) corresponds to.
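The calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the embodiment's code: the function names `deviation_count` and `note_name` are hypothetical, the deviation is computed as 12·log2(f(p)/f(A)), and rounding is done half-up to match the rounding described in the text.

```python
import math

REFERENCE_HZ = 440.0  # f(A), the reference frequency given in the text

# Note names indexed by semitone offset from "A", modulo one octave.
NOTE_NAMES = ["A", "B♭", "B", "C", "D♭", "D", "E♭", "E", "F", "F♯", "G", "A♭"]

def deviation_count(f_p, f_ref=REFERENCE_HZ):
    """Number of semitones q by which f_p is offset from f_ref, rounded half-up."""
    exact = 12.0 * math.log2(f_p / f_ref)
    return math.floor(exact + 0.5)

def note_name(q):
    """Note name for a deviation count q (the octave is not encoded in the name)."""
    return NOTE_NAMES[q % 12]

if __name__ == "__main__":
    for f in (466.16, 293.7, 349.2):
        q = deviation_count(f)
        print(f, q, note_name(q))
```

With the frequencies quoted in the text, 466.16 Hz gives q = 1 ("B♭"), 293.7 Hz gives q = -7 ("D"), and 349.2 Hz gives q = -4 ("F"). Note that q itself distinguishes octaves, while the name lookup folds them together.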

[Metadata configuration example]
FIG. 5 is a diagram for explaining the structure of the metadata obtained by the metadata generation processing according to the embodiment of the present invention. In the measurement processing by the first processing system 10A, the fundamental frequency f(p) of the audio data M2 is measured approximately every 100 ms from the start of the piece. The measurement processing by the second processing system 10B is the same, so the processing of the first processing system 10A is described here as the example.

Because music often consists of chords, several fundamental frequencies f(p) may be measured at once. In the example shown in FIG. 5, at time t_1, 100 ms after the start of the piece, two fundamental frequencies are measured: f(p_1) = 293.7 Hz and f(p_2) = 349.2 Hz.

If the calculation processing gives a deviation count of q_1 = -7 for the fundamental frequency f(p_1) at time t_1, the sound corresponding to q_1 is "D". Similarly, the sound corresponding to the deviation count q_2 for the fundamental frequency f(p_2) is "F".

In the metadata generation processing, the first processing system 10A generates metadata in which the deviation counts q_n are written in chronological order together with the elapsed time t_n at which the fundamental frequencies f(p) were measured. The metadata takes the elapsed time t_n and the deviation counts q_n as parameters and is written in Syntax 1 below.

(Syntax 1)
(t_n),q_1,q_2,...,q_n,,

The variable n takes positive integer values (1, 2, 3, ...). The elapsed time t_n is the time elapsed since the start of the piece and is written inside parentheses "()". After the elapsed time t_n, the deviation count q_1 is written, separated by a "," (comma). When a deviation count q_n follows a deviation count q_(n-1), it is likewise separated from it by a ",". After the last deviation count q_n, a "," (comma) marking the end of the metadata is written.

For example, for the measurement at elapsed time t_1 = 100 ms, the deviation counts are q_1 = -7 and q_2 = -4, so the metadata "(100),-7,-4,," is generated. If the deviation counts q_n are associated in advance with note names (specifically, pitches), the metadata alone shows that the music data at elapsed time t_n = 100 ms consists of the "D" and "F" sounds.
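To make Syntax 1 concrete, here is a minimal sketch that writes and reads one record in that form. The helper names `format_record` and `parse_record` are hypothetical, and the sketch assumes the record ends with two commas, as in the example "(100),-7,-4,,", and contains at least one deviation count.

```python
def format_record(elapsed_ms, deviations):
    """Write one Syntax 1 record: "(t),q_1,q_2,...,q_n,," (assumes n >= 1)."""
    body = ",".join(str(q) for q in deviations)
    return "({}),{},,".format(elapsed_ms, body)

def parse_record(record):
    """Read a Syntax 1 record back into (elapsed_ms, [q_1, ..., q_n])."""
    head, _, tail = record.partition("),")
    elapsed_ms = int(head.lstrip("("))
    # Trailing commas produce empty tokens, which are skipped.
    deviations = [int(token) for token in tail.split(",") if token]
    return elapsed_ms, deviations

if __name__ == "__main__":
    record = format_record(100, [-7, -4])
    print(record)                # (100),-7,-4,,
    print(parse_record(record))  # (100, [-7, -4])
```

The record for the example in the text is 13 characters long, which is in line with the per-interval sizes of a few bytes to about 20 bytes mentioned for this format.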

For a piece of about 15 minutes, the metadata obtained at each elapsed time t_n ranges from a few bytes to about 20 bytes and averages about 15 bytes. Assuming that 15 bytes of metadata are obtained every 100 ms, the total size of the metadata obtained from this piece is approximately 15 bytes × (900,000 ms / 100 ms) = 135 KB.
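The size estimate above can be reproduced as a short calculation. The function name and its defaults are illustrative, and 1 KB is taken as 1,000 bytes, which is what the figure of 135 KB implies.

```python
def estimate_metadata_bytes(duration_ms: int,
                            interval_ms: int = 100,
                            bytes_per_record: int = 15) -> int:
    """Estimate total metadata size for a piece of the given duration."""
    records = duration_ms // interval_ms  # one record per measurement interval
    return records * bytes_per_record


fifteen_minutes_ms = 15 * 60 * 1000  # 900,000 ms
total = estimate_metadata_bytes(fifteen_minutes_ms)
print(total, "bytes =", total / 1000, "KB")
```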

As described above, the melody of the audio data M2 is represented using the elapsed time t_n and the numbers of deviations q_n, so the metadata amounts to only about 135 KB even for a piece of about 15 minutes. The server device 1 can thus generate lightweight metadata that still accurately represents the characteristics of the music.

(Each component of the first processing system 10A)
Next, each component of the first processing system 10A will be described with reference to FIG. 3.

The first file format conversion unit 101A receives the audio data M2 in time series from the client terminal device 3 via the network 4 (not shown), converts it into a predetermined file format, and outputs the converted audio data M2c to the first measurement unit 102A. This is done to unify the file formats to be processed. In the present embodiment, for example, when the audio data M2 is compressed in the MP3 (MPEG Audio Layer 3) file format, the first file format conversion unit 101A converts the audio data M2 into the MP4 file format.

The first measurement unit 102A takes in the audio data M2c in time series from the first file format conversion unit 101A, performs the measurement process approximately every 100 ms, and outputs the obtained fundamental frequency f(p) to the first calculation unit 103A. In the measurement process, the first measurement unit 102A converts the audio data M2c into frequency spectrum data using, for example, a fast Fourier transform (FFT) and calculates the fundamental frequency f(p) approximately every 100 ms. In addition, the first measurement unit 102A outputs the time at which the fundamental frequency f(p) was measured to the first metadata generation unit 104A as the elapsed time t_n from the start of the music. Besides the FFT, a harmonic clustering method, a comb filter method, or the like can be used to calculate the fundamental frequency f(p).
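The spectral measurement of a single 100 ms frame can be sketched as follows. This is only a rough stand-in: it uses a naive discrete Fourier transform from the standard library instead of an FFT, it returns a single frequency (the strongest spectral bin), and the strongest bin is not guaranteed to be the fundamental in real music. The function name, the 8 kHz sample rate, and the synthetic test tone are assumptions made for the illustration.

```python
import cmath
import math


def estimate_fundamental(samples: list[float], sample_rate: int) -> float:
    """Return the frequency (Hz) of the strongest DFT bin in one frame."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    # Scan the positive-frequency bins, skipping the DC component (k = 0).
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


# One 100 ms frame of a synthetic 440 Hz sine wave sampled at 8 kHz.
RATE = 8000
frame = [math.sin(2 * math.pi * 440 * i / RATE) for i in range(RATE // 10)]
print(estimate_fundamental(frame, RATE))
```

With a 100 ms frame the bin spacing is 10 Hz, so in this sketch neighbouring semitones below roughly 170 Hz would fall into the same bin; a practical implementation would need a finer estimate.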

The first calculation unit 103A receives the fundamental frequency f(p) from the first measurement unit 102A, performs the calculation process, and outputs the calculated number of deviations q_n to the first metadata generation unit 104A. In the calculation process, the first calculation unit 103A substitutes the reference frequency f(A) of the reference sound "A" and the fundamental frequency f(p) measured by the first measurement unit 102A into equation (9) and calculates the number of deviations q_n as an integer.
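As a hedged illustration of the calculation process, the sketch below assumes that equation (9) is equivalent to the usual equal-temperament relation q = 12 · log2(f(p) / f(A)), rounded to the nearest integer, and that f(A) = 440 Hz. Neither assumption is confirmed by this passage; they are chosen because they reproduce the example values q = −7 for "D" and q = −4 for "F". The names `REFERENCE_HZ` and `deviation_count` are hypothetical.

```python
import math

REFERENCE_HZ = 440.0  # assumed value of f(A); not fixed by this passage


def deviation_count(f_p: float, f_a: float = REFERENCE_HZ) -> int:
    """Return the integer number of deviations q of f(p) from f(A).

    Assumes one deviation is one equal-tempered semitone, i.e.
    q = round(12 * log2(f(p) / f(A))).
    """
    return round(12 * math.log2(f_p / f_a))


print(deviation_count(293.66), deviation_count(349.23))
```

Here 293.66 Hz and 349.23 Hz are the usual equal-tempered frequencies of D and F below A = 440 Hz.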

The first metadata generation unit 104A receives the elapsed time t_n from the first measurement unit 102A and the number of deviations q_n from the first calculation unit 103A, performs the metadata generation process approximately every 100 ms, and outputs the generated metadata MET2(t_n, q_n) to the query generation unit 105A. In the metadata generation process, the first metadata generation unit 104A generates metadata written as "(t_n),q_1,q_2,...,q_n,,", as shown in Syntax 1.

In the present embodiment, to make the flow of data clear, the elapsed time t_n is input to the first metadata generation unit 104A from the first measurement unit 102A; however, the first metadata generation unit 104A may obtain the elapsed time t_n directly. The same applies to the second processing system 10B.

The query generation unit 105A receives the metadata MET2(t_n, q_n) from the first metadata generation unit 104A, generates a query using the metadata MET2(t_n, q_n), and outputs the generated query QUE to the query execution unit 106. At that time, the query generation unit 105A also outputs the metadata MET2(t_n, q_n) to the query execution unit 106.

The query QUE is the data required to execute the music search process or the similar search process, and it describes a processing request for the music database DB. The processing request is the execution of the music search mode or the similar search mode that the server device 1 has received from the client terminal device 3.

The query execution unit 106 receives the query QUE from the query generation unit 105A and performs the following processing while accessing the music database DB in accordance with the query QUE.

(When the music search mode is executed)
When the music search mode is executed, the query execution unit 106 collates the metadata MET2(t_n, q_n) with the index file 121 stored in the second HDD 12 and outputs the collation result to the display processing unit 107 as the music search result ANS.

The index file 121 is, for example, a record in chronological order of the metadata MET1(t_n, q_n) generated by the second processing system 10B, as illustrated in FIG. 5. The index file 121 also records the metadata of data other than the music data M1.

During collation, the query execution unit 106 compares the metadata MET2(t_n, q_n) of the audio data M2 with the metadata MET1(t_n, q_n) described in the index file 121.

If the comparison shows that the two match, the query execution unit 106 determines that the music data M1 is in the music server device 2 and outputs a notice that the music data M1 has been found, as the music search result ANS, to the display processing unit 107 together with the attribute data D stored in the first HDD 11. If the two do not match, the query execution unit 106 determines that the music data M1 is not in the music server device 2 and outputs a notice that the music data M1 has not been found to the display processing unit 107 as the music search result ANS.
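The exact-match collation of the music search mode can be sketched as a lookup over an in-memory index. The layout of the index file 121 is defined by FIG. 5, which is not reproduced here, so the dictionary structure, the type alias `Metadata`, and the function name below are assumptions made purely for illustration.

```python
from typing import Optional

# One metadata sequence: [(t_n in ms, (q_1, ..., q_n)), ...] in time order.
Metadata = list[tuple[int, tuple[int, ...]]]


def music_search(query: Metadata, index: dict[str, Metadata]) -> Optional[str]:
    """Return the identifier of the first stored piece whose metadata
    equals the query metadata exactly, or None if there is no match."""
    for song_id, stored in index.items():
        if stored == query:
            return song_id
    return None


# Hypothetical index with two pieces; "M1" stands in for music data M1.
index = {
    "M1": [(100, (-7, -4)), (200, (-5,))],
    "other": [(100, (0,)), (200, (2,))],
}
met2 = [(100, (-7, -4)), (200, (-5,))]
print(music_search(met2, index))
```

A result of `None` corresponds to the "not found" search result ANS; a returned identifier would then be used to look up the attribute data D.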

(When the similar search mode is executed)
When the similar search mode is executed, the query execution unit 106 collates the metadata MET2(t_n, q_n) with the index file 121, as in the music search mode, and outputs the collation result to the display processing unit 107 as the search result ANS. However, it differs from the music search mode in the following respects.

During collation, the query execution unit 106 determines whether the metadata MET2(t_n, q_n) of the audio data M2 is similar to the metadata MET1(t_n, q_n) described in the index file 121. In the present embodiment, when the former metadata MET2(t_n, q_n) matches the latter metadata MET1(t_n, q_n) at a certain rate (for example, 80 percent), the query execution unit 106 determines that the two are similar. The rate that expresses this degree of similarity may be set as appropriate by the owner of the server device 1 or by the individual user.
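The text does not specify how the matching rate is computed. One possible reading, shown below as an assumption only, counts the fraction of query time slots whose deviations are identical to those stored for the same elapsed time and compares it with a configurable threshold. The function names and the default threshold of 0.8 (taken from the 80 percent example) are illustrative.

```python
def match_ratio(query, stored) -> float:
    """Fraction of query records (t_n, deviations) that also appear,
    with identical deviations, at the same elapsed time in `stored`."""
    if not query:
        return 0.0
    stored_by_time = dict(stored)
    hits = sum(1 for t, qs in query if stored_by_time.get(t) == qs)
    return hits / len(query)


def is_similar(query, stored, threshold: float = 0.8) -> bool:
    """Judge two metadata sequences similar when the match ratio
    reaches the threshold (80 percent in the example above)."""
    return match_ratio(query, stored) >= threshold


met1 = [(100, (-7,)), (200, (-5,)), (300, (-4,)), (400, (-2,)), (500, (0,))]
met2 = [(100, (-7,)), (200, (-5,)), (300, (-4,)), (400, (-2,)), (500, (3,))]
print(match_ratio(met2, met1), is_similar(met2, met1))
```

Other definitions of the rate, such as tolerating small timing offsets, are equally compatible with the description.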

If the comparison shows that the two are similar, the query execution unit 106 estimates that the audio data M2 is a piece similar to the music data M1 and outputs a notice to that effect, as the similar search result ANS, to the display processing unit 107 together with the attribute data D. If the two are not similar, the query execution unit 106 determines that the music server device 2 holds no piece similar to the audio data M2 and outputs a notice to that effect to the display processing unit 107 as the similar search result ANS.

The display processing unit 107 receives the attribute data D from the query execution unit 106 together with the music search result ANS or the similar search result ANS, performs display processing for displaying both on the client terminal device 3, and transmits the processing result R to the client terminal device 3 via the network 4. The display processing is, for example, processing that determines the layout of the display screen.

In either mode, when the search target is found, the client terminal device 3 displays a notice to that effect on the display screen together with the attribute data D of the music data M1.

(Each component of the second processing system 10B)
Each component of the second processing system 10B will be described with reference to FIG. 3.

As the collection process, the crawler unit 108 traverses the directories of the music server device 2 on the network 4 and collects cache data of the content data stored in the music server device 2. The cache data includes various music data, including the music data M1, and their attribute data. As described above, the attribute data is data such as the music title. Of the cache data, the crawler unit 108 outputs the music data M1 to the second file format conversion unit 101B and outputs the attribute data D to the first HDD 11.

The second file format conversion unit 101B receives the music data M1 from the crawler unit 108, performs the same processing as the first file format conversion unit 101A, and outputs the music data M1c converted into the predetermined file format to the second measurement unit 102B.

The second measurement unit 102B takes in the music data M1c in time series from the second file format conversion unit 101B, performs the same measurement process as the first measurement unit 102A, outputs the fundamental frequency f(p) obtained by this process to the second calculation unit 103B, and outputs the elapsed time t_n to the second metadata generation unit 104B.

The second calculation unit 103B receives the fundamental frequency f(p) from the second measurement unit 102B, performs the same calculation process as the first calculation unit 103A, and outputs the calculated number of deviations q_n to the second metadata generation unit 104B.

The second metadata generation unit 104B receives the elapsed time t_n from the second measurement unit 102B and the number of deviations q_n from the second calculation unit 103B, performs the same metadata generation process as the first metadata generation unit 104A, and outputs the generated metadata MET1(t_n, q_n) to the first HDD 11.

The music database generation unit 109 accesses the music database DB and creates the index file 121 using the metadata MET1(t_n, q_n) stored in the first HDD 11.

(Music database DB)
The music database DB will now be described. The music database DB consists of the various music data held in the music server device 2 and the index file 121 created from the metadata of these music data.

The first HDD 11 stores the metadata MET1(t_n, q_n) input from the second metadata generation unit 104B and the attribute data D input from the crawler unit 108. The second HDD 12 stores the index file 121.

If, for some reason, such as a failure of the collection process by the crawler unit 108, the attribute data D is not input from the crawler unit 108 to the first HDD 11, the attribute data D can be entered directly into the first HDD 11, for example by manual operation by the owner of the server device 1. The attribute data D can also be changed by manual operation.

[Operation example of data processing system A]
FIG. 6 is a sequence diagram showing an operation example of the data processing system A according to the embodiment of the present invention.

Here, a case is taken as an example in which an individual user who owns the client terminal device 3 searches the music server device 2 on the network 4 for the same piece as the audio data M2. The operation of the data processing system A is the same for the similar search process.

When the server device 1 receives from the client terminal device 3 a request to execute the music search mode and grants the request, it performs the following processing. First, the flow of processing by the first processing system 10A, shown in steps S1 to S5, is described.

When the first file format conversion unit 101A receives the audio data M2 in time series from the client terminal device 3, it converts the data into the predetermined file format (step S1).

After the file format conversion, the first measurement unit 102A detects the start of the music in the audio data M2c. The first measurement unit 102A then performs the measurement process approximately every 100 ms and obtains the fundamental frequency f(p) (step S2).

After the measurement process, the first calculation unit 103A performs the calculation process using the fundamental frequency f(p) and obtains the number of deviations q_n (step S3).

After the calculation process, the first metadata generation unit 104A performs the metadata generation process approximately every 100 ms using the elapsed time t_n from the start of the music and the number of deviations q_n, and obtains the metadata MET2(t_n, q_n) (step S4).

After the metadata generation process, the query generation unit 105A generates a query using the metadata MET2(t_n, q_n) and obtains the query QUE (step S5).

Next, the flow of processing by the second processing system 10B, shown in steps S6 to S11, is described.

The crawler unit 108 collects cache data of the content data stored in the music server device 2 and obtains the music data M1 and the attribute data D (step S6).

After the collection process, the second file format conversion unit 101B converts the music data M1, input in time series from the crawler unit 108, into the predetermined file format (step S7).

After the file format conversion, the second measurement unit 102B performs the measurement process and obtains the fundamental frequency f(p) (step S8).

After the measurement process, the second calculation unit 103B performs the calculation process using the fundamental frequency f(p) and obtains the number of deviations q_n (step S9).

After the calculation process, the second metadata generation unit 104B performs the metadata generation process approximately every 100 ms using the elapsed time t_n from the start of the music and the number of deviations q_n, and obtains the metadata MET1(t_n, q_n) (step S10).

After the metadata MET1(t_n, q_n) has been generated, the music database generation unit 109 creates the index file 121 using the metadata MET1(t_n, q_n) (step S11).

The music database DB is created by the processing of steps S6 to S11 described above. This series of processes is preferably completed before the processing of step S12 starts. It may also be executed periodically (for example, every 24 hours).

Next, the flow of processing by the first processing system 10A, shown in steps S12 and S13, is described.

The query execution unit 106 collates the metadata MET2(t_n, q_n) of the audio data M2 with the index file 121 (step S12).

When the display processing unit 107 obtains the search result, it performs display processing based on the result and transmits the processing result R to the client terminal device 3 (step S13).

When the client terminal device 3 receives the processing result R from the server device 1, it displays on the display screen a search result indicating whether the music data M1 has been found.

The processing of steps S1 to S13 is described as a processing procedure in the program 131 (see FIG. 2).

As described above, according to the present embodiment, the server device 1 generates the metadata using the elapsed time t_n and the number of deviations q_n, so the following notable effects are obtained.

As described above, the metadata is very lightweight, at about 135 KB even for a piece of about 15 minutes. The amount of computation required when the query execution unit 106 collates metadata can therefore be reduced significantly. This reduces the storage used by the database and shortens the time required for the music search process and the similar search process.

The present embodiment also provides the following notable effects.

Searches can be performed accurately regardless of language (Japanese, English, and so on). For different recordings of the same work, as is common in classical music, attributes such as the title are often identical across the recordings and are written in a wide variety of languages. In the first place, there is no unified way of describing attributes such as titles, and the attributes can be changed by third parties. It is therefore difficult to pick out one recording from among different recordings of the same work using verbal keywords such as the title. In the present embodiment, however, no verbal keywords are described in the metadata, so a search can be performed using the melody alone.

When the search target is found, the attributes of the search target are transmitted to the client terminal device 3 together with the search result. Therefore, even if the individual user does not know the title or other attributes of the piece being searched for, the user can learn them immediately once the search target is found.

The calculation process focuses on the fundamental frequency f(p) and calculates its deviation from the reference frequency f(A). The result is therefore less affected by the instrumentation used in the performance or by the sound quality of the recording, and metadata that accurately represents the characteristics of the music can be generated.

In addition, the data processing system A makes it possible to search for illegal music whose attribute data has been deliberately rewritten by a third party and which has been uploaded to another server or the like.

(Modification of the embodiment)
A modification of the embodiment will be described. This modification relates to searching for music with a key change, such as a uniform shift of pitch from the "C" sound to the "E" sound. The differences from the embodiment are described below.

In the similar search mode, the query execution unit 106 performs the following key-change music search process when determining whether the metadata MET2(t_n, q_n) is similar to the metadata MET1(t_n, q_n).

Specifically, when the numbers of deviations q_n of the two differ from each other by a fixed number q_w at every elapsed time t_n, the query execution unit 106 determines that there is a key change between the audio data M2 and the music data M1. The fixed number q_w can be set as appropriate, for example q_w = 3. The query execution unit 106 outputs a notice that there is a key change, as the similar search result ANS, to the display processing unit 107 together with the attribute data D. When the numbers of deviations q_n of the two do not differ by the fixed number q_w, the processing is the same as in the embodiment.
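The key-change check can be sketched as a test for a constant offset between two metadata sequences. This is an illustrative reading only: it assumes both sequences have records at the same elapsed times with the same number of deviations per record, and the function names `key_shift` and `has_key_change` are hypothetical.

```python
from typing import Optional


def key_shift(query, stored) -> Optional[int]:
    """Return the single offset by which every deviation in `stored`
    differs from the corresponding deviation in `query`, or None if the
    sequences are not aligned or the offset is not constant."""
    if not query or len(query) != len(stored):
        return None
    offsets = set()
    for (t_a, qs_a), (t_b, qs_b) in zip(query, stored):
        if t_a != t_b or len(qs_a) != len(qs_b):
            return None
        offsets.update(b - a for a, b in zip(qs_a, qs_b))
    return offsets.pop() if len(offsets) == 1 else None


def has_key_change(query, stored, q_w: int = 3) -> bool:
    """True when all deviations differ by exactly the fixed number q_w."""
    return key_shift(query, stored) == q_w


met2 = [(100, (-7, -4)), (200, (-5,))]
met1 = [(100, (-4, -1)), (200, (-2,))]
print(key_shift(met2, met1), has_key_change(met2, met1))
```

Returning the detected offset, rather than only a boolean, would also let a caller report by how many deviations the key was shifted.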

The key-change music search process described above can be added, as a key-change music search mode, to the music search mode and the similar search mode.

This modification makes it easy to search for pieces performed on different instruments and for original pieces. Needless to say, the effects of the embodiment are also obtained.

The present invention can be modified in various ways without departing from its gist. For example, the server device 1 and the music server device 2 can be connected to a home LAN (Local Area Network), and the music search mode, the similar search mode, and the key-change music search mode can be executed within the home using the client terminal device 3.

As another example, the functions of the music server device 2 can be incorporated into the server device 1 so that the two form a single server device.

DESCRIPTION OF SYMBOLS
1 ... Server device
2 ... Music server device
3 ... Client terminal device
4 ... Network
10 ... CPU
10A ... First processing system
10B ... Second processing system
11 ... First HDD
12 ... Second HDD
13 ... ROM
14 ... RAM
15 ... I/O port
16 ... Internal bus
101A ... First file format conversion unit
101B ... Second file format conversion unit
102A ... First measurement unit
102B ... Second measurement unit
103A ... First calculation unit
103B ... Second calculation unit
104A ... First metadata generation unit
104B ... Second metadata generation unit
105A ... Query generation unit
106 ... Query execution unit
107 ... Display processing unit
108 ... Crawler unit
109 ... Music database generation unit
121 ... Index file
131 ... Program

Claims

A measurement unit that measures the fundamental frequency of the audio data of the music at predetermined time intervals;
A calculation unit that calculates a deviation of the fundamental frequency from a reference frequency of a reference sound as a sound corresponding to the fundamental frequency, using the fundamental frequency obtained for each measurement of the measurement unit;
A metadata generation unit that generates metadata related to the characteristics of the music using the deviations calculated by the calculation unit;
A data processing apparatus.

The metadata generation unit
Associating each deviation with the elapsed time from the start of the music, and generating the metadata,
The data processing apparatus according to claim 1.

The calculation unit includes:
When the reference frequency is denoted by f(A), the fundamental frequency is denoted by f(p), and the deviation of the fundamental frequency f(p) with respect to the reference frequency f(A) is denoted by q, the deviation q is calculated using the following formula:
The data processing apparatus according to claim 1 or 2.

A collation unit that collates the metadata generated by the metadata generation unit with a database in which attributes of a plurality of music pieces including the music piece are registered;
The collation unit
Outputting the attribute of the music together with the matching result;
The data processing apparatus according to any one of claims 1 to 3.

The collation unit
If the metadata matches the metadata in the database at a certain rate, it is determined that the metadata is similar to the metadata in the database, and the attribute of the music is output together with a matching result.
The data processing apparatus according to claim 4.

A data processing device according to any one of claims 1 to 5;
A terminal device for transmitting audio data to the data processing device;
A data processing system.

A measurement step for measuring the fundamental frequency of the audio data of the music at predetermined time intervals;
Using the fundamental frequency obtained for each measurement in the measurement step, a calculation step for calculating a deviation of the fundamental frequency relative to a reference frequency of a reference sound as a sound corresponding to the fundamental frequency;
A metadata generation step for generating metadata relating to the characteristics of the music using the deviations calculated in the calculation step;
A data processing method.

A measurement procedure for measuring the fundamental frequency of the audio data of a song at predetermined time intervals;
Using the fundamental frequency obtained for each measurement in the measurement procedure, a calculation procedure for calculating a deviation of the fundamental frequency with respect to a reference frequency of a reference sound as a sound corresponding to the fundamental frequency,
A metadata generation procedure for generating metadata related to the characteristics of the music using the deviations calculated in the calculation procedure,
A program that causes a computer to execute.