JP2002055695A

JP2002055695A - Music search system

Info

Publication number: JP2002055695A
Application number: JP2000239763A
Authority: JP
Inventors: Tomohisa Himeno (姫野 朋久); Sakae Omachi (大町 栄)
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2000-08-08
Filing date: 2000-08-08
Publication date: 2002-02-20

Abstract

PROBLEM TO BE SOLVED: To provide a music search system that lets a user efficiently find the song the user wishes to select. SOLUTION: A search key database 26 stores recorded-song search keys, which simplify the change in pitch within each phrase, for a plurality of songs each consisting of a plurality of phrases. When a melody input by singing voice or the like is picked up by a microphone 20 and converted into melody data by a melody recognition unit 22, a search key creation unit 24 creates an input-song search key that simplifies the change in pitch of this melody data. A song search processing unit 28 compares the recorded-song search keys for each phrase of each song with the input-song search key, checks for each song whether it contains phrases whose recorded-song search key has the same tendency as the input-song search key, and gives a higher similarity rating to songs that contain many such phrases.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a music search system that retrieves a song desired by a user from a song database.

[0002]

[Description of the Related Art] Conventionally, for karaoke machines, a unique identification number is assigned to each of the songs available for playback, and a search booklet is produced that lists the correspondence between these identification numbers and song titles, artist names, and the like. A user looks up the identification number of the desired song in this booklet and enters it into the karaoke machine to have the song played.

[0003] However, it has been difficult to find a desired song when the user does not know its title or when no search booklet is at hand. Music search systems have therefore been proposed in which a part of the melody of a song is input by singing voice, instrument sound, or the like; the input melody is converted by predetermined processing into melody data (numerical data representing the change in pitch over time); and this melody data is used to search for the desired song among the many pieces of song data stored in a song database.

[0004] For example, Japanese Unexamined Patent Publication No. H8-129393 describes a technique in which a melody input from a microphone is converted into melody data and compared with the melody data of the songs stored in a song database to find the desired song. Japanese Unexamined Patent Publication No. H9-293083 describes a technique in which a melody input from a microphone is converted into predetermined pitch data (melody data) and rhythm data, these are compared with the pitch data and rhythm data of the songs stored in a song database, and the desired song is found by calculating the similarity ratio between the two for each bar. Japanese Unexamined Patent Publication No. H8-160975 describes a technique in which the relative change in pitch of a melody input from a microphone is obtained and compared with the relative change in pitch of the songs stored in a song database to find the desired song.

[0005]

[Problems to Be Solved by the Invention] The search methods disclosed in the above publications extract from the song database songs whose melody or the like resembles that of the song being sought, so songs are selected according to the degree of similarity of the melody. However, even at the same degree of similarity, the matching melody may belong to a phrase that is repeated within the song, or some part of the song may merely happen to be similar; conventional techniques judge the similarity to be the same in both cases. Since phrases that are repeated many times tend to stay in a listener's memory, evaluating these two kinds of songs equally does not allow the song the user wishes to select to be found efficiently.

[0006] The present invention was made in view of these points, and its object is to provide a music search system that can efficiently find the song a user wishes to select.

[0007]

[Means for Solving the Problems] To solve the above problem, in the music search system of the present invention, feature data storage means stores, for a plurality of songs each consisting of a plurality of phrases, first feature data in which the change in pitch in each phrase is simplified. Feature data extraction means extracts, from input audio data, second feature data in which the change in pitch is simplified, and feature comparison means compares the extracted second feature data with the first feature data stored in the feature data storage means for each phrase of each song. Based on the comparison result from the feature comparison means, similarity determination means checks, for each of the songs, whether there are phrases whose first feature data has the same tendency as the second feature data, and gives a higher similarity rating to songs having a larger number of such phrases.

[0008] In general, the more often a phrase (a partial melody) appears in a song, the more likely it is to stay in the user's memory, and such frequently appearing phrases are often what the user inputs as the search phrase. Therefore, by giving a higher similarity rating to songs with a larger number of phrases that have the same tendency as the audio data of the phrase input by the user, songs that the user remembers and wants to hear or sing are retrieved preferentially, and the song the user wishes to select can be found efficiently.
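As a rough illustration of this ranking idea, the following sketch counts, for each song, the phrases whose simplified pattern matches the pattern of the query and orders the songs by that count. Treating "same tendency" as exact equality of patterns is an assumption made here for simplicity; the function name `rank_songs`, the "U"/"D" pattern symbols, and the sample songs are likewise hypothetical and do not come from the specification.

```python
def rank_songs(query_pattern, songs):
    """Rank songs by how many of their phrases match the query pattern.

    songs maps a song title to a list of phrase patterns (tuples).
    Assumption: "same tendency" is approximated by exact equality.
    """
    scores = {
        title: sum(1 for phrase in phrases if phrase == query_pattern)
        for title, phrases in songs.items()
    }
    # Keep only songs with at least one matching phrase, best first.
    matched = [(title, n) for title, n in scores.items() if n > 0]
    return sorted(matched, key=lambda item: item[1], reverse=True)


# Hypothetical data: Song A repeats the queried phrase, Song B has it once.
songs = {
    "Song A": [("D", "D", "U"), ("D", "U", "D"), ("U", "U"), ("D", "U", "D")],
    "Song B": [("D", "U", "D"), ("U", "D"), ("U", "U", "U")],
    "Song C": [("U", "U"), ("D", "D")],
}
ranking = rank_songs(("D", "U", "D"), songs)
```

With this data, the song that repeats the queried phrase is ranked above the song that contains it only once, and the song without it is dropped.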

[0009] The feature data extraction means described above preferably extracts the vertices seen when the time variation of the melody corresponding to the audio data is observed, extracts as representative points those vertices whose pitch has changed from another vertex by more than a predetermined range, and extracts data representing the change in pitch at these representative points as the second feature data. Because a vertex is not extracted as a representative point when the pitch does not change beyond the predetermined range, fine pitch changes can be omitted, which removes the influence of the pitch of the input audio data wavering up and down within the predetermined range.

[0010] The feature data extraction means also preferably examines the amount of change in pitch of the next representative point relative to the pitch of one representative point, classifies this amount of change into one of a plurality of division ranges, repeats this operation for each representative point, and extracts the classification results obtained for the representative points as the second feature data. Because the amount of change in pitch from one representative point to the next is classified into one of a plurality of division ranges, the amount of second feature data can be reduced, so the comparison between the first and second feature data can be performed at high speed. In addition, because the second feature data is extracted from relative pitch changes referenced to the pitch of a representative point, there is the further advantage that, even when the key of the audio input by the user differs from that of the original song, the similarity of the songs can still be determined as long as the relative pitch changes match.
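The key-independence of relative pitch changes can be checked with a small sketch. The pitch values below are hypothetical note numbers in semitones, and the function name `relative_changes` is chosen here only for illustration.

```python
def relative_changes(pitches):
    """Pitch change from each point to the next, in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


original = [67, 64, 60, 64]               # hypothetical pitches in semitones
sung_lower = [p - 5 for p in original]    # same melody, five semitones lower

same = relative_changes(original) == relative_changes(sung_lower)
```

Shifting every pitch by the same amount leaves the differences between neighboring points unchanged, which is why a melody sung in a different key yields the same relative data.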

[0011] It is also preferable to further provide display means that displays songs with a high degree of similarity, in order of similarity, based on the determination result of the similarity determination means. This allows the user to grasp the determination result easily.

[0012]

[Embodiments of the Invention] A music search system according to an embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows the configuration of the music search system of this embodiment. The music search system shown in the figure consists of a server device 1 and a terminal device 2 connected via a predetermined communication line.

[0013] The server device 1 transmits song data in response to requests from the terminal device 2 and includes a song database (DB) 10. The song database 10 stores song data for many songs in association with song titles, artist names, and the like. In this embodiment, the songs stored in the song database 10 are basically assumed to be vocal songs, that is, songs in which the principal melody (hereinafter the "main melody") is expressed by a singing voice. The song data in this embodiment is also assumed to allow at least the main melody data, corresponding to the main melody of the song, and the accompaniment data, corresponding to the remaining accompaniment and the like, to be read out separately. One example of such song data is song data in the widely used and highly versatile MIDI (Musical Instrument Digital Interface) format.

[0014] The terminal device 2 is used to input a melody by a predetermined method, search for the desired song based on it, and acquire the corresponding song data. It includes a microphone 20, a melody recognition unit 22, a search key creation unit 24, a search key database (DB) 26, a song search processing unit 28, and a display unit 30.

[0015] The microphone 20 converts a melody input by the user as a singing voice, instrument sound, or the like into an analog electrical signal. The melody recognition unit 22 converts the melody input by the user into digital melody data based on the electrical signal output from the microphone 20. Specifically, the melody data can be obtained, for example, by converting the electrical signal output from the microphone 20 into digital data and then applying well-known fast Fourier transform processing to extract the change in frequency components over time, that is, the pitch.
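The specification only says that a well-known fast Fourier transform is applied, so the following is a minimal sketch under stated assumptions rather than the method of the embodiment: it runs a textbook radix-2 FFT on one frame of samples and takes the strongest spectral peak as the pitch of that frame. The function names, the frame length, and the sample rate are hypothetical. Real singing contains harmonics, so the strongest peak is not always the fundamental; a practical system would need a more robust estimator.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def peak_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest bin below the Nyquist frequency."""
    spectrum = fft(frame)
    half = len(frame) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(frame)


# Hypothetical test tone: 440 Hz sampled at 4096 Hz, one frame of 1024 samples.
SAMPLE_RATE = 4096
frame = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(1024)]
detected = peak_frequency(frame, SAMPLE_RATE)
```

Repeating this for successive frames gives one frequency per frame, that is, the pitch as a function of elapsed time.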

[0016] The search key creation unit 24 creates, based on the melody data output from the melody recognition unit 22, a predetermined search key that expresses the melody input by the user in simplified form. In the following description, a search key created from the melody data output by the melody recognition unit 22 is called an "input-song search key". This input-song search key corresponds to the "second feature data".

[0017] The search key creation unit 24 also acquires song data stored in the song database 10 in the server device 1 and, based on the main melody data contained in the song data, creates a predetermined search key that expresses the main melody of the song in simplified form. In the following description, a search key created from the main melody data contained in song data stored in the song database 10 is called a "recorded-song search key". This recorded-song search key corresponds to the "first feature data". The method of creating these search keys (the input-song search key and the recorded-song search key) is described in detail later.

[0018] The search key database 26 stores the recorded-song search keys created by the search key creation unit 24 in association with information specific to each song (for example, the song title and artist name). The search key database 26 preferably stores recorded-song search keys created for all the song data stored in the song database 10 in the server device 1. Alternatively, for example, a recorded-song search key may be created for a piece of song data each time that song data is newly retrieved from the server device 1 to the terminal device 2, or recorded-song search keys may be created in advance for a large number of pieces of song data and stored in the search key database 26.

[0019] The song search processing unit 28 acquires, via the melody recognition unit 22, the input-song search key created by the search key creation unit 24, compares it against the recorded-song search keys stored in the search key database 26 to determine the degree of similarity, and searches for the song corresponding to the melody input by the user. The display unit 30 displays the results of the search by the song search processing unit 28 and the like.

[0020] The search key database 26 corresponds to the feature data storage means; the melody recognition unit 22 and the search key creation unit 24 correspond to the feature data extraction means; the song search processing unit 28 corresponds to the feature comparison means and the similarity determination means; and the display unit 30 corresponds to the display means. The music search system of this embodiment has the configuration described above; its operation is described in detail next.

[0021] (a) Procedure for creating recorded-song search keys

FIG. 2 is a flowchart showing the operating procedure of the music search system when creating recorded-song search keys from the song data stored in the song database 10; it mainly describes the operations performed by the search key creation unit 24.

[0022] First, the search key creation unit 24 acquires song data from the song database 10 in the server device 1 and extracts the main melody data (step 100). FIG. 3 shows an example of the main melody data contained in song data acquired from the song database 10; the horizontal axis corresponds to elapsed time and the vertical axis to pitch.

[0023] In this embodiment, one octave is divided into 12 intervals of one semitone each, and the semitone is the smallest unit of pitch difference. To simplify the description below, a pitch difference of one semitone is expressed as "a pitch difference of 1". A pitch difference of one octave is therefore expressed as "a pitch difference of 12".
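For reference, a pitch difference in these units can be computed from two frequencies as sketched below. The use of equal temperament (each semitone is a frequency ratio of 2^(1/12)) and the function name `semitone_difference` are assumptions made here; the specification does not state how frequencies are mapped to semitones.

```python
import math


def semitone_difference(freq_from_hz, freq_to_hz):
    """Pitch difference in semitones, rounded to the nearest semitone.

    Assumes equal temperament: one octave (ratio 2) equals 12 semitones.
    """
    return round(12 * math.log2(freq_to_hz / freq_from_hz))


octave_up = semitone_difference(440.0, 880.0)
octave_down = semitone_difference(440.0, 220.0)
```

A positive result means the second pitch is higher than the first, and a negative result means it is lower.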

[0024] Next, the search key creation unit 24 divides the main melody into a plurality of phrases (step 101). In this embodiment, the partial melody contained in each section when the main melody is divided into several sections is called a "phrase". FIG. 4 illustrates phrases; in the example shown, the main melody of FIG. 3 is divided into four phrases. Various methods of dividing the melody into phrases are conceivable: for example, the producer of the song data containing the main melody data may include predetermined delimiter information marking the phrase boundaries in the song data in advance, and the melody is divided according to this information; the melody may be divided automatically every predetermined number of bars; or the melody may be divided where rests of a predetermined length (parts where the melody is interrupted) are detected.
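Of the division methods listed, the rest-based one can be sketched as follows. The note representation (start time, duration, pitch), the function name `split_into_phrases`, and the threshold value are hypothetical choices for illustration; the specification leaves these details open.

```python
def split_into_phrases(notes, min_rest):
    """Split notes into phrases wherever the silent gap is at least min_rest.

    notes: list of (start_time, duration, pitch) tuples sorted by start_time.
    """
    phrases = []
    current = []
    for note in notes:
        if current:
            prev_start, prev_duration, _ = current[-1]
            gap = note[0] - (prev_start + prev_duration)
            if gap >= min_rest:
                # The rest is long enough: close the current phrase.
                phrases.append(current)
                current = []
        current.append(note)
    if current:
        phrases.append(current)
    return phrases


# Hypothetical melody: three notes, a rest of 2.0 time units, then two notes.
notes = [
    (0.0, 1.0, 67), (1.0, 1.0, 64), (2.0, 1.0, 65),
    (5.0, 1.0, 60), (6.0, 1.0, 64),
]
phrases = split_into_phrases(notes, min_rest=1.5)
```

Here the melody is split into two phrases at the single rest that is at least 1.5 time units long.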

[0025] Once the main melody has been divided into phrases, the search key creation unit 24 extracts the vertices of the main melody data for each phrase and stores the position (time and pitch) of each vertex (step 102). FIG. 5 illustrates the vertices of the main melody data, showing as an example the vertices extracted from the main melody data of FIG. 4. In the figure, the positions labeled (1) to (21) are the vertices of the main melody data. Specifically, in this embodiment a vertex is a position where the main melody data is convex upward or convex downward. For example, in FIG. 5 the pitch falls from position (1) to position (2) and rises from position (2) to position (3), so position (2) is a position that is convex downward and is extracted as a vertex. Similarly, the pitch rises from position (2) to position (3) and keeps falling from position (3) to position (4), so position (3) is a position where the melody data is convex upward and is extracted as a vertex. The other vertices are extracted in the same way.
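The vertex extraction of step 102 can be sketched as a search for turning points in a sequence of pitches. Two details below are assumptions rather than statements of the specification: the first and last points of a phrase are kept as vertices (the worked example treats position (1) as a vertex), and held notes are collapsed before looking for turns. The pitch values are hypothetical and were chosen so that the vertex-to-vertex differences are -3, +1, -5, and +4, as in the first phrase of the example.

```python
def extract_vertices(pitches):
    """Return (index, pitch) pairs where the melody turns upward or downward.

    Assumption: the first and last points of the phrase are also kept.
    """
    # Drop repeated pitches so held notes do not hide a turning point.
    points = [(i, p) for i, p in enumerate(pitches)
              if i == 0 or p != pitches[i - 1]]
    if len(points) <= 2:
        return points
    vertices = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        rising_before = cur[1] > prev[1]
        rising_after = nxt[1] > cur[1]
        if rising_before != rising_after:   # convex upward or downward
            vertices.append(cur)
    vertices.append(points[-1])
    return vertices


# Hypothetical pitch sequence (semitones) for one phrase.
phrase = [67, 65, 64, 65, 63, 60, 62, 64]
vertices = extract_vertices(phrase)
```

Each returned pair records the position in time (here the sample index) and the pitch of a vertex, matching the "time and pitch" stored in step 102.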

[0026] Once the vertices have been extracted, the search key creation unit 24 sets an error range E for the pitch difference, examines the vertices extracted in step 102 in order from the beginning, and extracts as representative points those vertices for which the pitch difference between vertices exceeds E/2 (step 103).

[0027] FIG. 6 illustrates the representative points. In FIG. 6, if the error range is set to E = 2, for example, the vertices whose pitch differs by more than 1 from the vertex nearer the beginning (the vertex of interest) are extracted. In the first phrase, taking vertex (1) as the vertex of interest, the pitch difference between the next vertex (2) and vertex (1) is 3, so vertex (2) is extracted as a representative point.

[0028] With vertex (2) as the vertex of interest, the pitch difference between the next vertex (3) and vertex (2) is 1, so vertex (3) is not a representative point. With vertex (3) as the vertex of interest, the pitch difference between the next vertex (4) and vertex (3) is 5, so vertex (4) is extracted as a representative point. Similarly, with vertex (4) as the vertex of interest, the pitch difference between the next vertex (5) and vertex (4) is 4, so vertex (5) is extracted as a representative point. In summary, four vertices, (1), (2), (4), and (5), are extracted as representative points in the first phrase.

[0029] By the same procedure, five vertices, (6), (7), (9), (10), and (11), are extracted as representative points in the second phrase; three vertices, (12), (13), and (15), in the third phrase; and five vertices, (16), (17), (19), (20), and (21), in the fourth phrase.
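The selection of representative points in step 103 can be sketched as below. Following the worked example, each vertex is compared with the vertex immediately before it (not with the previous representative point), and the first vertex of a phrase is always kept; the latter is inferred from the listed results rather than stated outright. The absolute pitch values are hypothetical and only reproduce the pitch differences 3, 1, 5, and 4 given for the first phrase.

```python
def representative_points(vertex_pitches, error_range):
    """Indices of the vertices kept as representative points.

    The first vertex is kept; every later vertex is kept when its pitch
    differs from the immediately preceding vertex by more than E / 2.
    """
    if not vertex_pitches:
        return []
    kept = [0]
    for i in range(1, len(vertex_pitches)):
        if abs(vertex_pitches[i] - vertex_pitches[i - 1]) > error_range / 2:
            kept.append(i)
    return kept


# Vertices (1) to (5) of the first phrase; absolute pitches are hypothetical.
first_phrase = [67, 64, 65, 60, 64]
indices = representative_points(first_phrase, error_range=2)
```

With E = 2 the kept indices correspond to vertices (1), (2), (4), and (5), as in the first phrase of the example; vertex (3) is dropped because it differs from vertex (2) by only 1.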

[0030] Once the representative points have been extracted in each phrase, the search key creation unit 24 creates predetermined branching-tree data based on the size of the pitch differences between the extracted representative points (step 104). First, a threshold W for the pitch difference is set to a predetermined value, and, for each phrase, the relationship between each representative point and the next representative point (the one that follows it on the time axis) is classified according to this threshold W.

[0031] Specifically, in this embodiment the relationship between representative points is classified into four cases: (1) the pitch difference changes by more than +W; (2) it lies within the range 0 to +W; (3) it lies within the range 0 to -W; (4) it changes by more than -W. Here, the "+" prefixed to the threshold W indicates that the pitch of the next representative point is higher than that of the reference representative point, and "-" indicates that it is lower.

[0032] FIG. 7 illustrates the branching-tree data. FIG. 7(A) shows the branching-tree data obtained for the first phrase when the threshold is W = 5. As shown in the figure, the branching-tree data of this embodiment provides four branches, and one branch is selected according to which of the four relationships between representative points, classified by the threshold W, applies. For example, in the first phrase the pitch difference between representative points (1) and (2) is -3, which falls under case (3), "within the range 0 to -W", so the third branch from the top is selected. Similarly, the pitch difference between representative points (2) and (4) is -4, so the third branch from the top is selected, and the pitch difference between representative points (4) and (5) is +4, so the second branch from the top is selected. Connecting the selected branches, as drawn with solid lines in FIG. 7(A), gives branching-tree data with the pattern "third branch from the top, third branch from the top, second branch from the top", which expresses the first phrase in simplified form.
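The four-way classification and the resulting branch pattern can be sketched as follows. The handling of the boundary values (a difference of exactly 0 or exactly ±W) is an assumption made here, since the text does not say which side they fall on; the function names and the absolute pitch values are also hypothetical, chosen to reproduce the differences -3, -4, and +4 of the first phrase.

```python
def classify_change(diff, threshold):
    """Map a pitch difference to branch 1-4, counted from the top."""
    if diff > threshold:
        return 1          # rises by more than W
    if diff >= 0:
        return 2          # within 0 to +W (0 and +W assumed to belong here)
    if diff >= -threshold:
        return 3          # within 0 to -W (-W assumed to belong here)
    return 4              # falls by more than W


def branch_pattern(rep_pitches, threshold):
    """Branch numbers for each step between consecutive representative points."""
    return tuple(classify_change(b - a, threshold)
                 for a, b in zip(rep_pitches, rep_pitches[1:]))


# Representative points (1), (2), (4), (5); absolute pitches are hypothetical.
pattern = branch_pattern([67, 64, 60, 64], threshold=5)
```

With W = 5 this yields the pattern (3, 3, 2), that is, third branch, third branch, second branch, as described for FIG. 7(A).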

[0033] Branching-tree data is obtained for the second to fourth phrases in the same way as for the first phrase. FIG. 7(B) shows the branching-tree data obtained for the second phrase, FIG. 7(C) that for the third phrase, and FIG. 7(D) that for the fourth phrase.

[0034] Once the branching-tree data for each phrase has been obtained, the search key creation unit 24 examines the patterns of the branching-tree data and counts how many times each identical pattern appears (step 105). Abbreviating the first, second, third, and fourth branches from the top of the branching-tree data as "branch 1", "branch 2", "branch 3", and "branch 4", the patterns of the branching-tree data for the first to fourth phrases shown in FIG. 7 are expressed as follows.

[0035]
First phrase: (branch 3 - branch 3 - branch 2)
Second phrase: (branch 3 - branch 3 - branch 1 - branch 3)
Third phrase: (branch 2 - branch 2)
Fourth phrase: (branch 3 - branch 3 - branch 1 - branch 3)

Examining the patterns of these four phrases shows that the pattern for the first phrase and the pattern for the third phrase each appear only once in the song. The patterns for the second and fourth phrases are identical, so this pattern appears twice.
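Counting the appearances of each pattern (step 105) maps directly onto a frequency count. The sketch below uses the four patterns listed above, written as tuples of branch numbers; representing a pattern as a tuple is a choice made here for illustration.

```python
from collections import Counter

# Branch patterns of the first to fourth phrases, as listed above.
phrase_patterns = [
    (3, 3, 2),        # first phrase
    (3, 3, 1, 3),     # second phrase
    (2, 2),           # third phrase
    (3, 3, 1, 3),     # fourth phrase
]
occurrences = Counter(phrase_patterns)
```

The counter holds three distinct patterns, with the pattern shared by the second and fourth phrases counted twice.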

[0036] Next, the search key creation unit 24 classifies the branching-tree data by the number of appearances of each pattern and creates recorded-song search keys from the classified branching-tree data (step 106). FIG. 8 illustrates the recorded-song search keys. FIG. 8(A) shows the recorded-song search key created from the branching-tree data for the first and third phrases, whose patterns appear once. As shown in FIG. 8(A), combining the branching-tree data for the first and third phrases yields branching-tree data that contains both the pattern (branch 3 - branch 3 - branch 2) of the first phrase and the pattern (branch 2 - branch 2) of the third phrase, and this becomes one of the recorded-song search keys.

【００３７】FIG. 8(B) shows the recorded music search key created from the branch tree data corresponding to the second and fourth phrases, whose pattern appears twice. As shown in FIG. 8(B), the branch tree data for the second and fourth phrases yields branch tree data having the pattern (branch 3-branch 3-branch 1-branch 3), and this also becomes one of the recorded music search keys.

【００３８】Once the recorded music search keys are obtained, the search key creation unit 24 attaches to each of them a predetermined weighting coefficient according to the number of appearances of the branch tree data patterns from which the key was created (step 107). For the recorded music search keys shown in FIG. 8, for example, a weighting coefficient of "1" is attached to the key shown in FIG. 8(A) and a weighting coefficient of "2" is attached to the key shown in FIG. 8(B).

【００３９】When three or more recorded music search keys are obtained, weighting coefficients are likewise set and attached as appropriate according to the number of appearances of the branch tree data patterns from which each key was created. Next, the search key creation unit 24 stores the recorded music search keys, with their weighting coefficients attached, in the search key database 26 in association with information such as the song title (step 108).
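Steps 106 to 108 can be sketched as follows. This is only an illustration under stated assumptions: a search key is modeled as a set of patterns sharing one appearance count, the weighting coefficient is taken to equal that count (which matches the FIG. 8 example but is not stated as a general rule), and `build_recorded_keys` and the dictionary `search_key_db` are hypothetical stand-ins for the search key creation unit 24 and the search key database 26.

```python
from collections import Counter, defaultdict

def build_recorded_keys(phrase_patterns):
    """Group phrase patterns by appearance count; one search key per count."""
    counts = Counter(phrase_patterns)
    grouped = defaultdict(set)
    for pattern, appearances in counts.items():
        grouped[appearances].add(pattern)
    # Assumption: the weighting coefficient equals the appearance count.
    return [
        {"weight": appearances, "patterns": grouped[appearances]}
        for appearances in sorted(grouped)
    ]

# Patterns of the four phrases in FIG. 7 as tuples of branch numbers.
fig7_patterns = [(3, 3, 2), (3, 3, 1, 3), (2, 2), (3, 3, 1, 3)]
recorded_keys = build_recorded_keys(fig7_patterns)

# Hypothetical in-memory stand-in for the search key database 26 (step 108):
# keys are stored in association with the song title.
search_key_db = {"Song A": recorded_keys}
```

For the FIG. 7 patterns this produces two keys: one with weight 1 holding the first and third phrase patterns, and one with weight 2 holding the shared pattern of the second and fourth phrases.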

【００４０】The series of steps described above completes the creation of the recorded music search keys. By varying the error range E and the threshold value W and repeating steps 103 to 108 shown in FIG. 2, multiple patterns of recorded music search keys are created for a single song. For example, the example above used an error range of E = 2 and a threshold value of W = 5; by changing either or both of these values, about ten kinds of recorded music search keys are created for each song.

【００４１】(b) Procedure for searching for a desired song
Next, the procedure for searching for the song desired by the user, based on the melody input by the user, is described. FIG. 9 is a flowchart showing the operation procedure of the music search system when searching for a desired song.

【００４２】First, the melody recognition unit 22 determines whether a melody has been input by a singing voice, an instrument sound, or the like (step 200). When a melody has been input, the determination is affirmative and the melody recognition unit 22 converts the input melody into melody data (step 201). Once the input melody has been converted into melody data, the search key creation unit 24 extracts the vertices of the melody data and stores the position of each vertex (step 202). Next, the search key creation unit 24 sets an error range E for the pitch difference, examines the vertices extracted in step 202 in order from the beginning, and extracts as representative points those vertices at which the pitch difference between vertices changes beyond the range E/2 (step 203).
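One possible reading of step 203 is sketched below. The text only says that vertices whose pitch changes beyond E/2 are taken as representative points, so several details here are assumptions: pitches are given in semitones, the first vertex serves as the initial reference and is itself kept, and each newly found representative point becomes the next reference. The function name `extract_representative_points` is hypothetical.

```python
def extract_representative_points(vertex_pitches, error_range):
    """Return indices of vertices kept as representative points.

    A vertex is kept when its pitch differs from the current reference
    vertex by more than error_range / 2 (assumed interpretation).
    """
    if not vertex_pitches:
        return []
    half_range = error_range / 2
    representative = [0]           # assumption: first vertex is kept
    reference = vertex_pitches[0]
    for index, pitch in enumerate(vertex_pitches[1:], start=1):
        if abs(pitch - reference) > half_range:
            representative.append(index)
            reference = pitch      # assumption: reference moves forward
    return representative

# Hypothetical vertex pitches (MIDI-style semitone numbers), with E = 2.
points = extract_representative_points([60, 61, 63, 62, 59, 60], error_range=2)
print(points)
```

Small fluctuations of one semitone around the reference are ignored in this example, which is the simplifying effect the error range is meant to have.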

【００４３】Once the representative points are extracted, the search key creation unit 24 obtains branch tree data based on the size of the pitch difference between the extracted representative points and creates an input music search key from it (step 204). Specifically, the melody input by the user in step 200 is generally considered to be a part of the main melody of a song, that is, a phrase, so step 204 treats the entire input melody as a single phrase. Consequently, in step 204, the branch tree data obtained by the predetermined processing is used directly as the input music search key.

【００４４】The procedures for extracting the vertices and representative points of the melody data and for creating branch tree data from the representative points are the same as those used to create the recorded music search keys described above, so a detailed description is omitted here. Because the recorded music search keys and the input music search key are created by essentially the same algorithm in this embodiment, the target song can be found more reliably.

【００４５】FIG. 10 shows an example of the input music search key created in step 204. As an example, it shows the input music search key obtained when the user inputs the melody corresponding to the second or fourth phrase of the main melody shown in FIG. 4 and elsewhere. If the user inputs the melody corresponding to the second or fourth phrase reasonably accurately, branch tree data having the pattern (branch 3-branch 3-branch 1-branch 3) is obtained, as shown in FIG. 10, and this becomes the input music search key.

【００４６】Next, the song search processing unit 28 receives the created input music search key from the search key creation unit 24 via the melody recognition unit 22 and compares it with the recorded music search keys stored in the search key database 26. It extracts the songs that contain a recorded music search key matching or similar to the input music search key (step 205), and calculates a matching rate that takes into account both the degree of similarity between the search keys (the input music search key and the recorded music search key) and the weighting coefficient attached to the recorded music search key (step 206). Specifically, in this embodiment, the pattern of the branch tree data for the input music search key is compared with the pattern of the branch tree data for the recorded music search key, and the two search keys are judged to be "similar" when the branching of the branch tree data differs in only one place.
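The match/similar judgment of step 205 might look like the sketch below. It assumes patterns are tuples of branch numbers and that "differs in only one place" means equal length with exactly one differing position; how patterns of different lengths are handled is not stated in the text, so they are simply treated as different here. The name `compare_keys` is hypothetical.

```python
def compare_keys(input_pattern, recorded_pattern):
    """Classify a recorded pattern as 'match', 'similar', or 'different'."""
    if input_pattern == recorded_pattern:
        return "match"
    if len(input_pattern) == len(recorded_pattern):
        differing = sum(a != b for a, b in zip(input_pattern, recorded_pattern))
        if differing == 1:
            return "similar"
    # Assumption: anything else, including a length mismatch, is not similar.
    return "different"

input_key = (3, 3, 1, 3)
print(compare_keys(input_key, (3, 3, 1, 3)))
print(compare_keys(input_key, (3, 3, 1, 2)))
print(compare_keys(input_key, (2, 2)))
```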

【００４７】FIG. 11 shows an example of matching rate calculation results. Songs A to D in the figure are examples of songs containing a recorded music search key that is similar to or matches the input music search key. The input music search key is assumed to be the one shown in FIG. 10 (branch 3-branch 3-branch 1-branch 3).

【００４８】Song A in FIG. 11 is the song described above with reference to FIG. 8 and elsewhere; its recorded music search key with weighting coefficient 2 (branch 3-branch 3-branch 1-branch 3) matches the input music search key. Song B has a recorded music search key with weighting coefficient 1 (branch 3-branch 3-branch 1-branch 3) that matches the input music search key. Song C has a recorded music search key with weighting coefficient 2 (branch 3-branch 3-branch 1-branch 2) that is similar to the input music search key. Song D has a recorded music search key with weighting coefficient 1 (branch 3-branch 3-branch 2-branch 3) that is similar to the input music search key.

【００４９】In this embodiment, the matching rate is expressed as a percentage. First, based on the degree of similarity between the input music search key and the recorded music search key, the matching rate is reduced as follows: (a) no deduction for an exact match; (b) a 10% deduction for a similar key. Then, based on the weighting coefficient, no deduction is made for the largest weighting coefficient, and 5% is deducted each time the weighting coefficient decreases by 1.

【００５０】Specifically, for song A, the input music search key exactly matches the recorded music search key with weighting coefficient 2, so there is no deduction for similarity (that is, the deduction is 0%). The weighting coefficient of 2 is the largest weighting coefficient in song A, so there is no deduction for the weighting coefficient either, and the matching rate is calculated as 100%. For song B, the input music search key exactly matches the recorded music search key with weighting coefficient 1, so there is no deduction for similarity, but a 5% deduction is made because the weighting coefficient is 1, and the matching rate is calculated as 95%.

【００５１】Similarly, for song C, the recorded music search key with weighting coefficient 2 is similar to the input music search key, so a 10% deduction is made for similarity; there is no deduction for the weighting coefficient because it is 2, and the matching rate is calculated as 90%. For song D, the recorded music search key with weighting coefficient 1 is similar to the input music search key, so a 10% deduction is made for similarity and a further 5% deduction is made because the weighting coefficient is 1, and the matching rate is calculated as 85%.
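The deduction rules and the four worked results for songs A to D can be checked with a short sketch. The function name `matching_rate` and its parameters are hypothetical; the largest weighting coefficient is passed in explicitly and set to 2 for all four songs, which is an assumption consistent with the 5% deduction applied to songs B and D.

```python
def matching_rate(is_exact_match, weight, max_weight):
    """Matching rate in percent for one similar-or-matching recorded key."""
    similarity_deduction = 0 if is_exact_match else 10  # (a) exact / (b) similar
    weight_deduction = 5 * (max_weight - weight)        # 5% per step below max
    return 100 - similarity_deduction - weight_deduction

# (exact match?, weighting coefficient) for songs A to D in FIG. 11.
songs = {
    "A": (True, 2),
    "B": (True, 1),
    "C": (False, 2),
    "D": (False, 1),
}
rates = {
    name: matching_rate(exact, weight, max_weight=2)
    for name, (exact, weight) in songs.items()
}
print(rates)
```

The computed values reproduce the figures in the text: 100% for A, 95% for B, 90% for C, and 85% for D.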

【００５２】Looking at the matching rate results in FIG. 11, songs A and B both contain a recorded music search key that exactly matches the input music search key, but song A has the higher matching rate because the weighting coefficient attached to its recorded music search key is larger. Likewise, songs C and D both contain a recorded music search key similar to the input music search key, but song C has the higher matching rate because the weighting coefficient attached to its recorded music search key is larger. A recorded music search key with a large weighting coefficient corresponds to a phrase that appears more often in the song. Because such a phrase is heard repeatedly, the user is more likely to remember it and supply it as the input music search key. Calculating the matching rate with the weighting coefficient taken into account therefore gives a higher matching rate to songs that the user is more likely to want to select.

【００５３】As described above, in this embodiment several kinds (for example, ten) of recorded music search keys are created for each song by varying the error range E and the threshold value W. The song search processing unit 28 therefore combines the matching rates calculated with all of these recorded music search keys to obtain the final matching rate for the song (step 207). For example, if ten kinds of recorded music search keys are prepared for song A, comparing each of them with the input music search key yields ten matching rates; adding these and dividing the sum by 10 gives the final matching rate.
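The averaging in step 207 is a plain arithmetic mean, as in the sketch below. The ten per-key matching rates are made-up values used only to show the calculation, and the name `final_matching_rate` is hypothetical.

```python
def final_matching_rate(per_key_rates):
    """Average the matching rates obtained with each kind of recorded key."""
    return sum(per_key_rates) / len(per_key_rates)

# Hypothetical matching rates for ten kinds of recorded music search keys
# of one song (illustrative numbers, not taken from the patent).
per_key_rates = [100, 100, 95, 90, 100, 85, 100, 95, 90, 95]
final_rate = final_matching_rate(per_key_rates)
print(final_rate)
```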

【００５４】The song search processing unit 28 then extracts the songs containing recorded music search keys whose final matching rate is at or above a predetermined value (for example, 70%), and displays the song titles, artist names, and so on together with the matching rates on the display unit 30 in a predetermined order (for example, in descending order of matching rate) (step 208). When the user designates a desired song from the list of one or more candidate songs displayed on the display unit 30, using an operation unit or the like (not shown), the song search processing unit 28 outputs a song data transmission request to the server device 1, and the server device 1 sends the song data in response.
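The candidate selection in step 208 can be sketched as a threshold filter followed by a descending sort. The song titles and final matching rates below are hypothetical, and `select_candidates` is an illustrative name; the 70% threshold and the descending order are the examples given in the text.

```python
def select_candidates(final_rates, threshold=70.0):
    """Keep songs at or above the threshold, highest matching rate first."""
    hits = [(title, rate) for title, rate in final_rates.items() if rate >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical final matching rates per song title.
final_rates = {"Song C": 72.0, "Song A": 95.0, "Song D": 64.0, "Song B": 88.5}
candidates = select_candidates(final_rates)
for title, rate in candidates:
    print(f"{title}: {rate:.1f}%")
```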

【００５５】As described above, in the music search system of this embodiment, when recorded music search keys that simplify the change in pitch within each phrase are created for a plurality of songs each consisting of a plurality of phrases, a larger weighting coefficient is attached to a recorded music search key created from a phrase that appears more often in the song. When the user inputs a melody to search for a song, an input music search key that simplifies the change in pitch is created from it, the input music search key is compared with the recorded music search keys prepared for each of the songs, and the similarity is determined taking into account the weighting coefficients attached to the recorded music search keys. Consequently, a song containing many phrases with the same tendency as the melody input by the user, that is, a song the user is more likely to want, receives a higher similarity determination, and the song the user wants to select can be found efficiently.

【００５６】The present invention is not limited to the embodiment described above, and various modifications are possible within the scope of the gist of the invention. For example, in the embodiment described above each search key is created from branch tree data with four branches, but the number of branches in the branch tree data may be as few as two, or conversely more branches may be set. The fewer the branches, the smaller the data volume of each search key; the more the branches, the more finely the characteristics of a phrase can be expressed.

【００５７】In the embodiment described above, the input music search key and the recorded music search key are judged to be "similar" when the patterns of their branch tree data differ in only one place, but the criterion for the degree of similarity is not limited to this and other criteria may be used. For example, the deduction may be set according to the number of places where the patterns differ: a 10% deduction when the patterns of the branch tree data differ in one place, a 20% deduction when they differ in two places, and so on.
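The graded variant described here, 10% per differing place, could be sketched as follows. As before, patterns are assumed to be tuples of branch numbers of equal length; returning `None` for a length mismatch is an assumption, since the text does not cover that case. The name `graded_similarity_deduction` is hypothetical.

```python
def graded_similarity_deduction(input_pattern, recorded_pattern, per_difference=10):
    """Deduction in percent: per_difference for each differing branch position."""
    if len(input_pattern) != len(recorded_pattern):
        return None  # assumption: patterns of different length are not compared
    differing = sum(a != b for a, b in zip(input_pattern, recorded_pattern))
    return per_difference * differing

input_key = (3, 3, 1, 3)
print(graded_similarity_deduction(input_key, (3, 3, 1, 3)))
print(graded_similarity_deduction(input_key, (3, 3, 1, 2)))
print(graded_similarity_deduction(input_key, (3, 2, 1, 2)))
```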

【００５８】In the embodiment described above, the minimum unit of pitch change in the main melody data and the like is a semitone, but this is not a limitation and the minimum unit of pitch change may be set to a value other than a semitone. Also, although several kinds of recorded music search keys are prepared for each song in the embodiment described above, preparing several kinds is not essential; it is sufficient that at least one kind of recorded music search key is prepared for each song.

【００５９】In the embodiment described above, the songs recorded in the song database 10 were described as being basically vocal songs, but the recorded songs are not limited to these, and songs whose main melody is expressed by instrument sounds or the like rather than a singing voice may also be recorded. Examples include songs in genres such as classical music and jazz. In such cases, the melody that most strongly characterizes the song is selected as the main melody, and the corresponding main melody data is prepared.

【００６０】[0060]

【発明の効果】[Effects of the Invention] As described above, according to the present invention, a song is given a higher similarity determination the more phrases it contains that have the same tendency as the phrase corresponding to the input voice data. Songs that remain in the user's memory and that the user wants to listen to or sing are therefore retrieved preferentially, and the song the user wants to select can be found efficiently.

【図面の簡単な説明】[Brief description of the drawings]

【図１】FIG. 1 is a diagram showing the configuration of a music search system according to one embodiment.

【図２】FIG. 2 is a flowchart showing the operation procedure of the music search system when creating recorded music search keys based on the song data recorded in the song database.

【図３】FIG. 3 is a diagram showing an example of main melody data included in song data acquired from the song database.

【図４】FIG. 4 is a diagram explaining phrases.

【図５】FIG. 5 is a diagram specifically explaining the vertices of main melody data.

【図６】FIG. 6 is a diagram specifically explaining representative points.

【図７】FIG. 7 is a diagram explaining branch tree data.

【図８】FIG. 8 is a diagram explaining recorded music search keys.

【図９】FIG. 9 is a flowchart showing the operation procedure of the music search system when searching for a desired song.

【図１０】FIG. 10 is a diagram showing an example of an input music search key.

【図１１】FIG. 11 is a diagram showing an example of matching rate calculation results.

[Explanation of symbols]

1 server device
2 terminal device
10 song database (DB)
20 microphone
22 melody recognition unit
24 search key creation unit
26 search key database (DB)
28 song search processing unit
30 display unit

Continued from the front page: (51) Int. Cl.⁷, identification symbol, FI, theme code (reference): G10L 15/00, G10L 3/00 551G. F-terms (reference): 5B075 ND14 PQ04 PR06 QM08; 5D015 AA06 HH04; 5D108 BC01 BC17 BG06; 5D378 KK01 MM97 QQ01

Claims

[Claims]

1. A music search system comprising: feature data storage means for storing, for a plurality of songs each including a plurality of phrases, first feature data in which the change in pitch in each phrase is simplified; feature data extraction means for extracting, from input voice data, second feature data in which the change in pitch is simplified; feature comparison means for comparing the second feature data extracted by the feature data extraction means with the first feature data corresponding to each phrase of each song stored in the feature data storage means; and similarity determination means for checking, for each of the plurality of songs, the presence or absence of phrases corresponding to first feature data having the same tendency as the second feature data, and for giving a higher similarity determination to a song having a larger number of phrases with the same tendency.

2. The music search system according to claim 1, wherein the feature data extraction means extracts the vertices observed in the time change of the melody corresponding to the voice data, extracts as representative points other vertices whose pitch changes beyond a predetermined range from one vertex, and extracts, as the second feature data, data representing the change in pitch focused on these representative points.

3. The music search system according to claim 2, wherein the feature data extraction means repeats, for each of the representative points, an operation of examining the amount of change in pitch of the next representative point relative to the pitch of one representative point and classifying which of a plurality of division ranges this amount of change falls into, and extracts the classification results obtained for the representative points as the second feature data.

4. The music search system according to claim 1, further comprising display means for displaying songs having a high degree of similarity in order of similarity, based on the result of the determination by the similarity determination means.