JP4165645B2

JP4165645B2 - Music search system and music search method

Info

Publication number: JP4165645B2
Application number: JP2003376217A
Authority: JP
Inventors: 成文後田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2003-11-05
Filing date: 2003-11-05
Publication date: 2008-10-15
Anticipated expiration: 2023-11-05
Also published as: JP2005141431A

Description

The present invention relates to a music search system and a music search method for finding a desired song among a large amount of music data stored in a large-capacity storage means such as an HDD, and in particular to a music search system and a music search method capable of searching for songs on the basis of impression degree data, that is, data judged by human sensibility.

In recent years, large-capacity storage means such as HDDs have been developed, making it possible to store a large amount of music data. Such music data is generally searched using bibliographic data such as artist names, song titles, and other keywords. A bibliographic search, however, cannot take into account the feeling a song conveys and may return songs with quite different impressions, so it is ill-suited to finding songs that leave a similar impression on the listener.

To allow a user to search for desired songs on the basis of subjective impressions, an apparatus has been proposed that accepts the user's subjective requirements for the desired song and outputs them in numerical form, calculates from that output a predicted impression value quantifying the impression of the song being sought, and uses the predicted impression value as a key to search a music database that stores the acoustic signals of a plurality of songs together with impression values quantifying the impression of each song, thereby retrieving the desired song on the basis of the user's subjective image of it (see, for example, Patent Document 1).

With this conventional technique, however, the user must always perform the complicated operation of entering a subjective impression of a song before searching. In addition, the predicted impression value obtained by quantifying the user's subjective requirements is not necessarily close to the impression value of the song the user has in mind. As a result, songs whose impression resembles that of the intended song cannot be found quickly among the large amount of music data stored in a large-capacity storage means.
Patent Document 1: JP 2002-278547 A

The present invention has been made in view of this problem, and its object is to provide a music search system and a music search method that can quickly find songs whose impression resembles that of a representative song among a large amount of music data stored in a large-capacity storage means, through the simple operation of selecting the representative song.

To solve the above problem, the present invention adopts the following configurations.
The music search system of the present invention searches for desired music data among a plurality of pieces of music data stored in a music database, and comprises: music data input means for inputting the music data; feature data extraction means for extracting physical feature data by applying a fast Fourier transform to a fixed frame length of the music data input by the music data input means and calculating a power spectrum; impression degree data conversion means for converting the feature data extracted by the feature data extraction means into impression degree data, as judged by human sensibility, using a hierarchical neural network trained in advance; music mapping means for mapping the music data input by the music data input means onto a music map, which is a self-organizing map trained in advance, on the basis of the impression degree data produced by the impression degree data conversion means; music map storage means for storing the music data mapped by the music mapping means; representative song selection means for selecting a representative song from the music data mapped on the music map; keyword setting means for setting the songs corresponding to a keyword; music mapping display means for displaying the songs as mapped; keyword display means for displaying the keyword when a neuron representing a song displayed by the music mapping display means is pointed at; music search means for searching, on the basis of the keyword and the representative song selected by the representative song selection means, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons; and music data output means for outputting the music data found by the music search means.

The music search system of the present invention also comprises a music search device that searches for desired music data among a plurality of pieces of music data stored in a music database, and a terminal device configured to be connectable to the music search device. The music search device comprises: music data input means for inputting the music data; feature data extraction means for extracting physical feature data by applying a fast Fourier transform to a fixed frame length of the music data input by the music data input means and calculating a power spectrum; impression degree data conversion means for converting the feature data extracted by the feature data extraction means into impression degree data, as judged by human sensibility, using a hierarchical neural network trained in advance; music mapping means for mapping the music data input by the music data input means onto a music map, which is a self-organizing map trained in advance, on the basis of the impression degree data produced by the impression degree data conversion means; music map storage means for storing the music data mapped by the music mapping means; representative song selection means for selecting a representative song from the music data mapped on the music map; keyword setting means for setting the songs corresponding to a keyword; music mapping display means for displaying the songs as mapped; keyword display means for displaying the keyword when a neuron representing a song displayed by the music mapping display means is pointed at; music search means for searching, on the basis of the keyword and the representative song selected by the representative song selection means, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons; and music data output means for outputting the music data found by the music search means. The terminal device comprises: search result input means for inputting music data from the music search device; search result storage means for storing the music data input by the search result input means; and audio output means for playing the music data stored in the search result storage means.

The music search system of the present invention also comprises a music registration device that stores input music data in a music database, and a terminal device configured to be connectable to the music registration device. The music registration device comprises: music data input means for inputting the music data; feature data extraction means for extracting physical feature data by applying a fast Fourier transform to a fixed frame length of the music data input by the music data input means and calculating a power spectrum; impression degree data conversion means for converting the feature data extracted by the feature data extraction means into impression degree data, as judged by human sensibility, using a hierarchical neural network trained in advance; music mapping means for mapping the music data input by the music data input means onto a music map, which is a self-organizing map trained in advance, on the basis of the impression degree data produced by the impression degree data conversion means; music map storage means for storing the music data mapped by the music mapping means; and database output means for outputting the music data stored in the music database and the music map stored in the music map storage means to the terminal device. The terminal device comprises: database input means for inputting the music data and the music map from the music registration device; a terminal-side music database for storing the music data input by the database input means; terminal-side music map storage means for storing the music map input by the database input means; representative song selection means for selecting a representative song from the music data mapped on the music map; keyword setting means for setting the songs corresponding to a keyword; music mapping display means for displaying the songs as mapped; keyword display means for displaying the keyword when a neuron representing a song displayed by the music mapping display means is pointed at; music search means for searching, on the basis of the keyword and the representative song selected by the representative song selection means, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons; and audio output means for playing the music data found by the music search means.

Furthermore, the music search system of the present invention is characterized in that the hierarchical neural network has been trained using, as the teacher signal, impression degree data input by an evaluator who has listened to the music data.

Furthermore, the music search system of the present invention is characterized in that the feature data extraction means extracts a plurality of items of fluctuation information as the feature data.

Furthermore, the music search system of the present invention is characterized in that the music mapping means takes the impression degree data produced by the impression degree data conversion means as an input vector and maps the music data input by the music data input means to the neuron closest to that input vector.
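The mapping to the closest neuron can be sketched as follows. This is a minimal illustration rather than the patented implementation: the text only says "closest", so the Euclidean distance is an assumption, and the names `nearest_neuron`, `map_song`, `som`, and `song_map` are hypothetical.

```python
import math

def nearest_neuron(som, impression):
    """Return (row, col) of the neuron whose feature vector is closest to `impression`."""
    best, best_dist = None, math.inf
    for r, row in enumerate(som):
        for c, weights in enumerate(row):
            dist = math.dist(weights, impression)  # Euclidean distance (assumed metric)
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def map_song(song_map, som, song_id, impression):
    """Attach song_id to its closest neuron; song_map is a dict keyed by grid position."""
    pos = nearest_neuron(som, impression)
    song_map.setdefault(pos, []).append(song_id)
    return pos

# Hypothetical 2x2 map with 2-dimensional impression vectors.
som = [[(0.0, 0.0), (0.0, 1.0)],
       [(1.0, 0.0), (1.0, 1.0)]]
song_map = {}
print(map_song(song_map, som, "song-A", (0.9, 0.2)))  # (1, 0)
```

Keeping the song lists in a dictionary keyed by grid position leaves the trained feature vectors untouched while songs are registered.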

Furthermore, the music search system of the present invention is characterized in that the neighborhood radius used by the music search means to determine the neighboring neurons can be set arbitrarily.
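A search over the representative song's neuron and its neighbors within a configurable radius might look like the sketch below. The shape of the neighborhood is not specified in this passage, so the square (Chebyshev) neighborhood is an assumption, as are the names `search_similar` and `song_map`.

```python
def search_similar(song_map, representative_pos, radius):
    """Collect songs mapped to the representative neuron and to neurons within `radius` grid steps."""
    r0, c0 = representative_pos
    hits = []
    for (r, c), songs in sorted(song_map.items()):
        # Chebyshev distance yields a square neighborhood; this choice is an assumption.
        if max(abs(r - r0), abs(c - c0)) <= radius:
            hits.extend(songs)
    return hits

# Hypothetical map contents: grid position -> songs mapped to that neuron.
song_map = {(0, 0): ["a"], (0, 1): ["b"], (2, 2): ["c"], (1, 1): ["d", "e"]}
print(search_similar(song_map, (0, 0), 1))  # ['a', 'b', 'd', 'e']
```

With a radius of 0 only the representative song's own neuron is searched; a larger radius widens the result set.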

Furthermore, the music search system of the present invention is characterized in that training has been performed using impression degree data input by an evaluator who has listened to the music data.

The music search system of the present invention also searches for desired music data among a plurality of pieces of music data stored in a music database, and comprises: a music map, which is a self-organizing map trained in advance and onto which music data is mapped; representative song selection means for selecting a representative song from the music data mapped on the music map; keyword setting means for setting the songs corresponding to a keyword; music mapping display means for displaying the songs as mapped; keyword display means for displaying the keyword when a neuron representing a song displayed by the music mapping display means is pointed at; music search means for searching, on the basis of the keyword and the representative song selected by the representative song selection means, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons; and music data output means for outputting the music data found by the music search means.

Furthermore, the music search system of the present invention is characterized in that the music data is mapped onto the music map using the impression degree data of that music data as an input vector.

Furthermore, the music search system of the present invention is characterized in that the music map has been trained using impression degree data input by an evaluator who has listened to the music data.

The music search method of the present invention is a method, executed by a computer, of searching for desired music data among a plurality of pieces of music data stored in a music database, in which the computer executes: a music data input step in which a music data input unit accepts input of the music data; a feature data extraction step in which a feature data extraction unit extracts physical feature data by applying a fast Fourier transform to a fixed frame length of the input music data and calculating a power spectrum; an impression degree data conversion step in which an impression degree data conversion unit converts the extracted feature data into impression degree data, as judged by human sensibility, using a hierarchical neural network trained in advance; a music mapping step in which a music mapping unit maps the accepted music data onto a music map, which is a self-organizing map trained in advance, on the basis of the converted impression degree data; a keyword setting step in which a PC operation unit sets the songs corresponding to a keyword; a music mapping display step in which a PC display unit displays the songs as mapped; a keyword display step in which the PC display unit displays the keyword when a neuron representing a song displayed in the music mapping display step is pointed at; a representative song selection step in which the PC operation unit selects a representative song from the music data mapped on the music map; a music search step in which a music search unit searches, on the basis of the selected representative song, for the music data mapped on the music map that is contained in the neuron to which the representative song is mapped and in its neighboring neurons; and a music data output step in which a search result output unit outputs the music data found in the music search step.

Furthermore, the music search method of the present invention is characterized in that the extracted feature data is converted into impression degree data, as judged by human sensibility, using the hierarchical neural network trained with, as the teacher signal, impression degree data input by an evaluator who has listened to the music data.

The music search system and music search method of the present invention search using a music map, which is a self-organizing map trained in advance and onto which music data is mapped on the basis of the impression degree data of that music data. Consequently, merely by selecting a representative song, songs whose impression resembles that of the representative song can be found quickly among a large amount of music data stored in a large-capacity storage means.

Furthermore, in the music search system and music search method of the present invention, the hierarchical neural network used to convert music data into impression degree data is trained using, as the teacher signal, impression degree data input by an evaluator who has listened to the music data. For example, appointing a celebrity known to users as the evaluator can increase users' trust, and preparing several hierarchical neural networks, each trained by a different evaluator, and letting the user choose among them improves convenience for the user.

Furthermore, the music search system and music search method of the present invention extract a plurality of items of fluctuation information as the feature data, so that the physical features of the music data can be extracted accurately and the accuracy of the impression degree data converted from the feature data can be improved.

Furthermore, the music search system and music search method of the present invention use a self-organizing map trained in advance as the music map, so that songs with similar impressions are placed near one another, which improves search efficiency.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of an embodiment of the music search system according to the present invention, and FIG. 2 is a block diagram showing the configuration of a neural network learning device that trains in advance the neural network used in the music search device shown in FIG. 1.

In this embodiment, as shown in FIG. 1, the music search device 10 and the terminal device 30 are connected by a data transmission path such as USB, and the terminal device 30 can be disconnected from the music search device 10 and carried around.

As shown in FIG. 1, the music search device 10 comprises a music data input unit 11, a compression processing unit 12, a feature data extraction unit 13, an impression degree data conversion unit 14, a music database 15, a music mapping unit 16, a music map storage unit 17, a music search unit 18, a PC operation unit 19, a PC display unit 20, and a search result output unit 21.

The music data input unit 11 has a function of reading storage media such as CDs and DVDs on which music data is stored; it inputs music data from such a medium and outputs it to the compression processing unit 12 and the feature data extraction unit 13. It may also be configured to accept music data (distribution data) delivered over a network such as the Internet rather than from a storage medium. When compressed music data is input, the unit decompresses it before outputting it to the feature data extraction unit 13.

The compression processing unit 12 compresses the music data input from the music data input unit 11 in a compression format such as MP3 or ATRAC (Adaptive Transform Acoustic Coding), and stores the compressed music data in the music database 15 together with bibliographic data such as the artist name and song title.

The feature data extraction unit 13 extracts feature data consisting of fluctuation information from the music data input from the music data input unit 11, and outputs the extracted feature data to the impression degree data conversion unit 14.
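The first stage of this extraction, as stated in the summary of the invention, is a fast Fourier transform over a fixed frame length followed by a power spectrum calculation. The sketch below illustrates only that stage; how the fluctuation information is derived from the spectra is not described in this passage and is not attempted here. The frame length, the non-overlapping framing, and the names `fft` and `power_spectra` are assumptions.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def power_spectra(samples, frame_len):
    """Split samples into non-overlapping frames; return |X[k]|^2 for k < frame_len/2 per frame."""
    spectra = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        spectrum = fft(samples[start:start + frame_len])
        spectra.append([abs(v) ** 2 for v in spectrum[:frame_len // 2]])
    return spectra

# Synthetic test tone: 2 cycles per 8-sample frame, two frames.
frame_len = 8
samples = [math.sin(2 * math.pi * 2 * n / frame_len) for n in range(2 * frame_len)]
spectra = power_spectra(samples, frame_len)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
print(peak_bins)  # [2, 2]
```

A practical implementation would use a much longer frame and an optimized FFT library; the pure-Python version is kept only so the sketch is self-contained.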

Using a hierarchical neural network trained in advance, the impression degree data conversion unit 14 converts the feature data input from the feature data extraction unit 13 into impression degree data, as judged by human sensibility, and outputs the resulting impression degree data to the music mapping unit 16.
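The conversion amounts to a forward pass through the trained network. The following sketch assumes a sigmoid activation, one hidden layer, and made-up weights; none of these details are given in this passage, and the names `forward` and `layers` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, features):
    """Propagate features through a list of weight matrices; each row is [bias, w1, w2, ...]."""
    activations = list(features)
    for weights in layers:
        activations = [sigmoid(row[0] + sum(w * a for w, a in zip(row[1:], activations)))
                       for row in weights]
    return activations

# Hypothetical trained weights: 3 feature items -> 2 hidden units -> 2 impression items.
layers = [
    [[0.0, 0.5, -0.3, 0.8], [0.1, -0.6, 0.4, 0.2]],
    [[0.0, 1.0, -1.0], [0.2, 0.3, 0.7]],
]
impression = forward(layers, [0.2, 0.7, 0.1])
print(impression)
```

Because the sigmoid output lies between 0 and 1, impression scores on another scale would need rescaling; that step is omitted here.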

The music database 15 is a large-capacity storage means such as an HDD, and stores the music data compressed by the compression processing unit 12 and its bibliographic data in association with the feature data extracted by the feature data extraction unit 13.

On the basis of the impression degree data input from the impression degree data conversion unit 14, the music mapping unit 16 maps the music data onto the music map, which is a self-organizing map trained in advance, and stores the music map with the mapped music data in the music map storage unit 17.

The music map storage unit 17 is a large-capacity storage means such as an HDD, and stores the music map onto which the music mapping unit 16 has mapped the music data.

The music search unit 18 searches the music database 15 on the basis of impression degree data and bibliographic data input from the PC operation unit 19 and displays the search result on the PC display unit 20. It also searches the music map storage unit 17 on the basis of the representative song selected with the PC operation unit 19 and displays the representative-song search result on the PC display unit 20. In addition, the music search unit 18 outputs the music data selected with the PC operation unit 19 to the terminal device 30 via the search result output unit 21.

The PC operation unit 19 is an input means such as a keyboard or mouse. It is used to enter search conditions for searching the music data stored in the music database 15 and the music map storage unit 17, and to select the music data to be output to the terminal device 30.

The PC display unit 20 is a display means such as a liquid crystal display. It shows the mapping status of the music data stored in the music map storage unit 17, the search conditions used to search the music data stored in the music database 15 and the music map storage unit 17, and the music data found (the search results).

The search result output unit 21 is configured to be connectable to the search result input unit 31 of the terminal device 30 via a data transmission path such as USB, and outputs the music data found by the music search unit 18 and selected with the PC operation unit 19 to the search result input unit 31 of the terminal device 30.

The terminal device 30 is an audio playback device, such as a portable audio player, having a large-capacity storage means such as an HDD. As shown in FIG. 1, it comprises a search result input unit 31, a search result storage unit 32, a terminal operation unit 33, a terminal display unit 34, and an audio output unit 35.

The search result input unit 31 is configured to be connectable to the search result output unit 21 of the music search device 10 via a data transmission path such as USB, and stores the music data input from the search result output unit 21 in the search result storage unit 32.

The terminal operation unit 33 accepts inputs related to playback of music data, such as instructions to select and play music data stored in the search result storage unit 32 and volume control inputs.

The terminal display unit 34 is a display means such as a liquid crystal display, and shows the title of the song being played and various operation guidance.

The audio output unit 35 is an audio player that decompresses and plays the music data stored in compressed form in the search result storage unit 32.

The neural network learning device 40 trains the hierarchical neural network used in the impression degree data conversion unit 14 and the music map used in the music mapping unit 16. As shown in FIG. 2, it comprises a music data input unit 41, an audio output unit 42, a feature data extraction unit 43, an impression degree data input unit 44, a connection weight value learning unit 45, a music map learning unit 46, a connection weight value output unit 47, and a feature vector output unit 48.

The music data input unit 41 has a function of reading storage media such as CDs and DVDs on which music data is stored; it inputs music data from such a medium and outputs it to the audio output unit 42 and the feature data extraction unit 43. It may also be configured to accept music data (distribution data) delivered over a network such as the Internet rather than from a storage medium. When compressed music data is input, the unit decompresses it before outputting it to the audio output unit 42 and the feature data extraction unit 43.

音声出力部４２は、楽曲データ入力部４１から入力された楽曲データを伸長して再生するオーディオプレーヤである。 The audio output unit 42 is an audio player that decompresses and reproduces music data input from the music data input unit 41.

特徴データ抽出部４３は、楽曲データ入力部４１から入力された楽曲データから、ゆらぎ情報からなる特徴データを抽出し、抽出した特徴データを結合重み値学習部４５に出力する。 The feature data extraction unit 43 extracts feature data composed of fluctuation information from the song data input from the song data input unit 41 and outputs the extracted feature data to the connection weight value learning unit 45.

印象度データ入力部４４は、音声出力部４２からの音声出力に基づく、評価者による印象度データの入力を受け付け、受け付けた印象度データを、階層型ニューラルネットワークの学習に用いる教師信号として結合重み値学習部４５に出力すると共に自己組織化マップへの入力ベクトルとして楽曲マップ学習部４６に出力する。 The impression degree data input unit 44 accepts impression degree data entered by an evaluator on the basis of the audio output from the audio output unit 42. It outputs the accepted impression degree data to the connection weight value learning unit 45 as a teacher signal for training the hierarchical neural network, and to the music map learning unit 46 as an input vector to the self-organizing map.

結合重み値学習部４５は、特徴データ抽出部４３から入力された特徴データと、印象度データ入力部４４から入力された印象度データとに基づいて階層型ニューラルネットワークに学習を施し、各ニューロンの結合重み値を更新し、結合重み値出力部４７を介して更新した結合重み値を出力する。学習が施された階層型ニューラルネットワーク（更新された結合重み値）は、楽曲検索装置１０の印象度データ変換部１４に移植される。 The connection weight value learning unit 45 trains the hierarchical neural network on the feature data input from the feature data extraction unit 43 and the impression degree data input from the impression degree data input unit 44, updates the connection weight value of each neuron, and outputs the updated connection weight values via the connection weight value output unit 47. The trained hierarchical neural network (the updated connection weight values) is transplanted to the impression degree data conversion unit 14 of the music search apparatus 10.

楽曲マップ学習部４６は、印象度データ入力部４４から入力された印象度データを自己組織化マップへの入力ベクトルとして自己組織化マップに学習を施し、各ニューラルの特徴ベクトルを更新し、特徴ベクトル出力部４８を介して更新した特徴ベクトルを出力する。
学習が施された自己組織化マップ（更新された特徴ベクトル）は、楽曲マップとして楽曲検索装置１０の楽曲マップ記憶部１７に記憶される。 The music map learning unit 46 trains the self-organizing map using the impression degree data input from the impression degree data input unit 44 as input vectors to the self-organizing map, updates the feature vector of each neuron, and outputs the updated feature vectors via the feature vector output unit 48.
The learned self-organizing map (updated feature vector) is stored in the music map storage unit 17 of the music search device 10 as a music map.

次に、本実施の形態の動作について図３乃至図１５を参照して詳細に説明する。
図３は、図１に示す楽曲検索装置における楽曲登録動作を説明するためのフローチャートであり、図４は、図１に示す特徴データ抽出部における特徴データ抽出動作を説明するためのフローチャートであり、図５は、図２に示すニューラルネットワーク学習装置における階層型ニューラルネットワークの学習動作を説明するためのフローチャートであり、図６は、図２に示すニューラルネットワーク学習装置における楽曲マップの学習動作を説明するためのフローチャートあり、図７は、図１に示す楽曲検索装置における楽曲検索動作を説明するためのフローチャートであり、図８は、図２に示すニューラルネットワーク学習装置における階層型ニューラルネットワークの学習アルゴリズムを説明するための説明図であり、図９は、図２に示すニューラルネットワーク学習装置における楽曲マップの学習アルゴリズムを説明するための説明図であり、図１０は、図１に示すＰＣ表示部の表示画面例を示す図であり、図１１は、図１０に示す検索条件入力領域の表示例を示す図であり、図１２および図１３は、図１０に示す検索結果表示領域の表示例を示す図であり、
図１４は、図１０に示す表示画面例に表示される全楽曲リスト表示領域例を示す図であり、図１５は、図１０に示す表示画面例に表示されるキーワード検索領域例を示す図である。 Next, the operation of the present embodiment will be described in detail with reference to FIGS. 3 to 15.
FIG. 3 is a flowchart for explaining the music registration operation in the music search apparatus shown in FIG. 1; FIG. 4 is a flowchart for explaining the feature data extraction operation in the feature data extraction unit shown in FIG. 1; FIG. 5 is a flowchart for explaining the learning operation of the hierarchical neural network in the neural network learning apparatus shown in FIG. 2; FIG. 6 is a flowchart for explaining the music map learning operation in the neural network learning apparatus shown in FIG. 2; FIG. 7 is a flowchart for explaining the music search operation in the music search apparatus shown in FIG. 1; FIG. 8 is an explanatory diagram for explaining the learning algorithm of the hierarchical neural network in the neural network learning apparatus shown in FIG. 2; FIG. 9 is an explanatory diagram for explaining the music map learning algorithm in the neural network learning apparatus shown in FIG. 2; FIG. 10 is a diagram showing an example of the display screen of the PC display unit shown in FIG. 1; FIG. 11 is a diagram showing a display example of the search condition input area shown in FIG. 10; and FIGS. 12 and 13 are diagrams showing display examples of the search result display area shown in FIG. 10.
FIG. 14 is a diagram showing an example of the all-music list display area displayed on the display screen example shown in FIG. 10, and FIG. 15 is a diagram showing an example of the keyword search area displayed on the display screen example shown in FIG. 10.

まず、楽曲検索装置１０における楽曲登録動作について図３を参照して詳細に説明する。
楽曲データ入力部１１にＣＤ、ＤＶＤ等の楽曲データが記憶されている記憶媒体をセットし、楽曲データ入力部１１から楽曲データを入力する（ステップＡ１）。 First, the music registration operation in the music search apparatus 10 will be described in detail with reference to FIG.
A storage medium storing music data such as CD and DVD is set in the music data input section 11 and music data is input from the music data input section 11 (step A1).

圧縮処理部１２は、楽曲データ入力部１１から入力された楽曲データを圧縮し（ステップＡ２）、圧縮した楽曲データを、アーティスト名、曲名等の書誌データと共に楽曲データベース１５に記憶させる（ステップＡ３）。 The compression processing unit 12 compresses the music data input from the music data input unit 11 (step A2), and stores the compressed music data in the music database 15 together with the bibliographic data such as artist name and music name (step A3). .

特徴データ抽出部１３は、楽曲データ入力部１１から入力された楽曲データから、ゆらぎ情報からなる特徴データを抽出する（ステップＡ４）。
特徴データ抽出部１３における特徴データの抽出動作は、図４を参照すると、楽曲データの入力を受け付け（ステップＢ１）、楽曲データの予め定められたデータ解析開始点から一定のフレーム長に対しＦＦＴ（高速フーリエ変換）を行い（ステップＢ２）、パワースペクトルを算出する。なお、ステップＢ２の前に高速化を目的としてダウンサンプリングを行うようにしても良い。 The feature data extraction unit 13 extracts feature data including fluctuation information from the music data input from the music data input unit 11 (step A4).
Referring to FIG. 4, in the feature data extraction operation the feature data extraction unit 13 accepts input of music data (step B1), performs an FFT (fast Fourier transform) on a frame of fixed length starting from a predetermined data analysis start point of the music data (step B2), and calculates a power spectrum. Downsampling may be performed before step B2 to speed up processing.

次に、特徴データ抽出部１３は、ｌｏｗ、ｍｉｄｄｌｅ、ｈｉｇｈの周波数帯域を予め設定しておき、ｌｏｗ、ｍｉｄｄｌｅ、ｈｉｇｈの３帯域のパワースペクトルを積分し、平均パワーを算出すると共に（ステップＢ３）、ｌｏｗ、ｍｉｄｄｌｅ、ｈｉｇｈの周波数帯域の内、最大のパワーを持つ帯域をｐｉｔｃｈのデータ解析開始点値とし、ｐｉｔｃｈを測定する（ステップＢ４）。 Next, with the low, middle, and high frequency bands set in advance, the feature data extraction unit 13 integrates the power spectrum over each of the three bands to calculate the average power of each band (step B3), takes the band with the maximum power among the low, middle, and high bands as the data analysis start point value for Pitch, and measures Pitch (step B4).

ステップＢ２〜ステップＢ４の処理動作は、予め定められたフレーム個数分行われ、特徴データ抽出部１３は、ステップＢ２〜ステップＢ４の処理動作を行ったフレーム個数が予め定められた設定値に達したか否かを判断し（ステップＢ５）、ステップＢ２〜ステップＢ４の処理動作を行ったフレーム個数が予め定められた設定値に達していない場合には、データ解析開始点をシフトしながら（ステップＢ６）、ステップＢ２〜ステップＢ４の処理動作を繰り返す。 The processing operations in step B2 to step B4 are performed for a predetermined number of frames, and the feature data extraction unit 13 determines whether the number of frames for which the processing operations in steps B2 to B4 have been performed has reached a predetermined setting value. (Step B5), and if the number of frames for which the processing operations in steps B2 to B4 have been performed does not reach a predetermined set value, the data analysis start point is shifted (step B6). , The processing operations of Step B2 to Step B4 are repeated.
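The frame loop of steps B2 to B6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a naive DFT stands in for the FFT, the band edges in `BANDS`, the hop size, and all function names are hypothetical, and the Pitch measurement of step B4 is omitted because the text does not specify it in enough detail.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum (a slow stand-in for the FFT of step B2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def band_average_powers(spectrum, bands):
    """Step B3: average power per band; bands maps name -> (first_bin, end_bin_exclusive)."""
    return {name: sum(spectrum[lo:hi]) / (hi - lo) for name, (lo, hi) in bands.items()}

def band_power_series(samples, frame_len, hop, n_frames, bands):
    """Steps B2-B6: shift the analysis start point and collect band powers per frame."""
    series = {name: [] for name in bands}
    for f in range(n_frames):
        start = f * hop  # step B6: shifted data analysis start point
        frame = samples[start:start + frame_len]
        if len(frame) < frame_len:
            break
        for name, p in band_average_powers(power_spectrum(frame), bands).items():
            series[name].append(p)
    return series

# Hypothetical band edges, expressed in DFT bins of a 64-sample frame.
BANDS = {"low": (1, 4), "middle": (4, 12), "high": (12, 33)}
samples = [math.sin(2 * math.pi * 2 * t / 64) for t in range(256)]  # energy in bin 2
series = band_power_series(samples, frame_len=64, hop=32, n_frames=7, bands=BANDS)
```

Each entry of `series` is the per-frame time series of one band, which is the input that step B7 then transforms again.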

ステップＢ２〜ステップＢ４の処理動作を行ったフレーム個数が予め定められた設定値に達した場合には、特徴データ抽出部１３は、ステップＢ２〜ステップＢ４の処理動作によって算出したｌｏｗ、ｍｉｄｄｌｅ、ｈｉｇｈの平均パワーの時系列データに対しＦＦＴを行うと共に、ステップＢ２〜ステップＢ４の処理動作によって測定したＰｉｔｃｈの時系列データに対しＦＦＴを行う（ステップＢ７）。 When the number of frames subjected to the processing operations of Step B2 to Step B4 reaches a predetermined set value, the feature data extraction unit 13 calculates low, middle, high calculated by the processing operations of Step B2 to Step B4. The FFT is performed on the time series data of the average power and the time series data of the Pitch measured by the processing operations of Step B2 to Step B4 (Step B7).

次に、特徴データ抽出部１３は、ｌｏｗ、ｍｉｄｄｌｅ、ｈｉｇｈ、ＰｉｔｃｈにおけるＦＦＴ分析結果から、横軸を対数周波数、縦軸を対数パワースペクトルとしたグラフにおける回帰直線の傾きと、回帰直線のＹ切片とをゆらぎ情報として算出し（ステップＢ８）、ｌｏｗ、ｍｉｄｄｌｅ、ｈｉｇｈ、Ｐｉｔｃｈのそれぞれにおける回帰直線の傾きおよびＹ切片を８項目からなる特徴データとして印象度データ変換部１４に出力する。 Next, the feature data extraction unit 13 calculates the slope of the regression line in the graph with the horizontal axis representing the logarithmic frequency and the vertical axis representing the logarithmic power spectrum from the FFT analysis results for low, middle, high, and pitch, and the Y intercept of the regression line. Are calculated as fluctuation information (step B8), and the slope of the regression line and the Y-intercept in each of low, middle, high, and pitch are output to the impression degree data conversion unit 14 as feature data of eight items.
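The fluctuation information of step B8 is the slope and Y-intercept of a regression line on a log-log plot. A minimal sketch, assuming an ordinary least-squares fit and base-10 logarithms (neither is stated in the text; the function names are illustrative):

```python
import math

def loglog_regression(freqs, powers):
    """Least-squares line through (log10 f, log10 P); returns (slope, intercept).

    The zero-frequency bin must be excluded by the caller, since log10(0) is undefined.
    """
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in powers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fluctuation_features(spectra):
    """spectra: name -> (freqs, powers) for low, middle, high, pitch; returns 8 numbers."""
    features = []
    for name in ("low", "middle", "high", "pitch"):
        slope, intercept = loglog_regression(*spectra[name])
        features.extend([slope, intercept])
    return features

# Synthetic 1/f spectrum: P = 10 / f, so log10 P = 1 - log10 f.
freqs = [1.0, 2.0, 4.0, 8.0, 16.0]
powers = [10.0 / f for f in freqs]
slope, intercept = loglog_regression(freqs, powers)
```

For the synthetic 1/f spectrum the fit recovers a slope of -1 and an intercept of 1, which is the kind of value pair that each of the four series contributes to the eight-item feature data.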

印象度データ変換部１４は、図８に示すような入力層（第１層）、中間層（第ｎ層）、出力層（第Ｎ層）からなる階層型ニューラルネットワークを用い、入力層（第１層）に特徴データ抽出部１３で抽出された特徴データを入力することによって、出力層（第Ｎ層）から印象度データを出力、すなわち特徴データを印象度データに変換し（ステップＡ５）、出力層（第Ｎ層）から出力された印象度データを、楽曲マッピング部１６に出力すると共に、楽曲データと共に楽曲データベース１５に記憶させる。なお、中間層（第ｎ層）の各ニューラルの結合重み値ｗは、ニューラルネットワーク学習装置４０によって予め学習が施されている。また、本実施の形態の場合には、入力層（第１層）に入力される特徴データ、すなわち特徴データ抽出部１３によって抽出される特徴データの項目は、前述のように８項目であり、印象度データの項目としては、人間の感性によって判断される（明るい、暗い）、（重い、軽い）、（かたい、やわらかい）、（安定、不安定）、（澄んだ、にごった）、（滑らか、歯切れの良い）、（激しい、穏やか）、（厚い、薄い）の８項目を設定し、各項目を７段階評価で表すように設定した。従って、入力層（第１層）のニューロン数Ｌ_１と出力層（第Ｎ層）のニューロン数Ｌ_Ｎとは、それぞれ８個となっており、中間層（第ｎ層：ｎ＝２，…，Ｎ−１）のニューロン数Ｌ_ｎは、適宜設定されている。 The impression data conversion unit 14 uses a hierarchical neural network including an input layer (first layer), an intermediate layer (n-th layer), and an output layer (N-th layer) as shown in FIG. By inputting the feature data extracted by the feature data extraction unit 13 into the first layer), the impression data is output from the output layer (Nth layer), that is, the feature data is converted into impression data (step A5). Impression degree data output from the output layer (Nth layer) is output to the music mapping unit 16 and stored in the music database 15 together with the music data. Note that the neural network learning device 40 has previously learned the connection weight value w of each neural layer (n-th layer). In the case of the present embodiment, the feature data input to the input layer (first layer), that is, the feature data extracted by the feature data extraction unit 13 is eight items as described above, Impression data items are determined by human sensitivity (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, fuzzy), ( Eight items of (smooth, crisp), (violent, gentle), (thick, thin) were set, and each item was set to be expressed by a seven-level evaluation. 
Therefore, the number of neurons L_1 in the input layer (first layer) and the number of neurons L_N in the output layer (N-th layer) are both eight, and the number of neurons L_n in each intermediate layer (n-th layer: n = 2, ..., N−1) is set as appropriate.
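The conversion of step A5 is a forward pass through the hierarchical network. The sketch below assumes sigmoid neurons without bias terms, a single intermediate layer of five neurons, and a linear scaling of outputs to the seven evaluation levels; none of these details is given in the text, and the names are illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layers(sizes, rng):
    """One weight matrix per layer transition, values in [-0.1, 0.1]."""
    return [[[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(layers, features):
    """Return the activations of every layer, input layer first."""
    activations = [list(features)]
    for weights in layers:
        prev = activations[-1]
        activations.append([sigmoid(sum(w * x for w, x in zip(row, prev)))
                            for row in weights])
    return activations

def to_seven_levels(outputs):
    """Map sigmoid outputs in (0, 1) to levels 1..7 (an assumed scaling)."""
    return [1 + round(o * 6) for o in outputs]

rng = random.Random(0)
layers = init_layers([8, 5, 8], rng)        # 8 features -> 5 hidden -> 8 impressions
activations = forward(layers, [0.1] * 8)    # made-up feature data
impression = to_seven_levels(activations[-1])
```

In the device the weights would come from the neural network learning device 40 rather than from random initialization; the random weights here only make the example runnable.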

楽曲マッピング部１６は、楽曲データ入力部１１から入力された楽曲を楽曲マップ記憶部１７に記憶されている楽曲マップの該当箇所にマッピングする。楽曲マッピング部１６におけるマッピング動作に用いられる楽曲マップは、例えばニューロンが２次元に規則的に配置（図９に示す例では、９＊９の正方形）されている自己組織化マップ（ＳＯＭ）であり、教師信号を必要としない学習ニューラルネットワークで、入力パターン群をその類似度に応じて分類する能力を自律的に獲得していくニューラルネットワークである。なお、本実施の形態では、ニューロンが１００＊１００の正方形に配列された２次元ＳＯＭを使用したが、ニューロンの配列は、正方形であっても、蜂の巣であっても良い。 The music mapping unit 16 maps the music input from the music data input unit 11 to a corresponding portion of the music map stored in the music map storage unit 17. The music map used for the mapping operation in the music mapping unit 16 is, for example, a self-organizing map (SOM) in which neurons are regularly arranged in two dimensions (9 * 9 square in the example shown in FIG. 9). A learning neural network that does not require a teacher signal, and is a neural network that autonomously acquires the ability to classify input pattern groups according to their similarity. In the present embodiment, a two-dimensional SOM in which neurons are arranged in a 100 * 100 square is used. However, the arrangement of neurons may be a square or a honeycomb.

また、楽曲マッピング部１６におけるマッピング動作に用いられる楽曲マップは、ニューラルネットワーク学習装置４０によって、学習が施されており、各ニューロンには、予め学習されたｎ次元の特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎが内包されており、楽曲マッピング部１６は、印象度データ変換部１４によって変換された印象度データを入力ベクトルｘ_ｊとし、入力ベクトルｘ_ｊに最も近いニューロン、すなわちユークリッド距離‖ｘ_ｊ−ｍ_ｉ‖を最小にするニューロンに、入力された楽曲をマッピングし（ステップＡ６）、マッピングした楽曲マップを楽曲マップ記憶部１７に記憶させる。なお、Ｒは、印象度データの各項目の評価段階数を示し、ｎは、印象度データの項目数を示す。 The music map used for the mapping operation in the music mapping unit 16 has been trained by the neural network learning device 40, and each neuron holds a previously learned n-dimensional feature vector m_i(t) ∈ R^n. The music mapping unit 16 takes the impression degree data converted by the impression degree data conversion unit 14 as an input vector x_j, maps the input music to the neuron closest to the input vector x_j, that is, the neuron that minimizes the Euclidean distance ‖x_j − m_i‖ (step A6), and stores the mapped music map in the music map storage unit 17. Here, R denotes the number of evaluation levels of each item of impression degree data, and n denotes the number of items of impression degree data.
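The mapping of step A6 amounts to a nearest-neighbor lookup over the neurons' feature vectors. A minimal sketch, assuming the music map is held as a dictionary keyed by grid coordinates (a hypothetical data layout; the tiny 2 x 2 map and the song identifier are made up):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def best_matching_neuron(music_map, x):
    """music_map: dict (row, col) -> feature vector; returns the nearest neuron's coordinates."""
    return min(music_map, key=lambda pos: euclidean(music_map[pos], x))

def map_song(placements, music_map, song_id, impression):
    """Attach song_id to the neuron that minimizes the Euclidean distance to impression."""
    pos = best_matching_neuron(music_map, impression)
    placements.setdefault(pos, []).append(song_id)
    return pos

music_map = {
    (0, 0): [1] * 8,
    (0, 1): [7] * 8,
    (1, 0): [4] * 8,
    (1, 1): [1, 7] * 4,
}
placements = {}
pos = map_song(placements, music_map, "song-001", [6, 7, 7, 6, 7, 7, 6, 7])
```

Here the song lands on neuron (0, 1), whose feature vector is closest to its eight-item impression degree data.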

次に、印象度データ変換部１４における変換動作（ステップＡ５）に用いられる階層型ニューラルネットワークの学習動作について図５および図８を参照して詳細に説明する。
楽曲データ入力部４１にＣＤ、ＤＶＤ等の楽曲データが記憶されている記憶媒体をセットし、楽曲データ入力部４１から楽曲データを入力し（ステップＣ１）、特徴データ抽出部４３は、楽曲データ入力部４１から入力された楽曲データから、ゆらぎ情報からなる特徴データを抽出する（ステップＣ２）。 Next, the learning operation of the hierarchical neural network used for the conversion operation (step A5) in the impression degree data conversion unit 14 will be described in detail with reference to FIG. 5 and FIG.
A storage medium such as a CD or DVD on which music data is stored is set in the music data input unit 41, music data is input from the music data input unit 41 (step C1), and the feature data extraction unit 43 extracts feature data composed of fluctuation information from the music data input from the music data input unit 41 (step C2).

また、音声出力部４２は、楽曲データ入力部４１から入力された楽曲データを音声出力し（ステップＣ３）、評価者は、音声出力部４２からの音声出力を聞くことによって、楽曲の印象度を感性によって評価し、評価結果を印象度データとして印象度データ入力部４４から入力し（ステップＣ４）、結合重み値学習部４５は、印象度データ入力部４４から入力された印象度データを教師信号として受け付ける。なお、本実施の形態では、印象度の評価項目としては、人間の感性によって判断される（明るい、暗い）、（重い、軽い）、（かたい、やわらかい）、（安定、不安定）、（澄んだ、にごった）、（滑らか、歯切れの良い）、（激しい、穏やか）、（厚い、薄い）の８項目を設定し、各項目についての７段階評価を印象度データとして楽曲データ入力部４１で受け付けるように構成した。 The audio output unit 42 outputs the music data input from the music data input unit 41 as audio (step C3). The evaluator listens to the audio output from the audio output unit 42, evaluates the impression of the music by his or her own sensibility, and inputs the evaluation result as impression degree data from the impression degree data input unit 44 (step C4). The connection weight value learning unit 45 accepts the impression degree data input from the impression degree data input unit 44 as a teacher signal. In this embodiment, eight evaluation items judged by human sensibility are set: (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, muddy), (smooth, crisp), (intense, gentle), and (thick, thin); the seven-level evaluation of each item is accepted by the music data input unit 41 as impression degree data.

結合重み値学習部４５における階層型ニューラルネットワークの学習、すなわち各ニューロンの結合重み値ｗの更新は、誤差逆伝播学習法を用いて行う。
まず、初期値として、中間層（第ｎ層）の全てのニューロンの結合重み値ｗを乱数によって−０．１〜０．１程度の範囲の小さな値に設定しておき、結合重み値学習部４５は、特徴データ抽出部４３によって抽出された特徴データを入力信号 x_ｊ(ｊ＝１，２，…，８) として入力層（第１層）に入力し、入力層（第１層）から出力層（第Ｎ層）に向けて、各ニューロンの出力を計算する。 The learning of the hierarchical neural network in the connection weight value learning unit 45, that is, the update of the connection weight value w of each neuron is performed using an error back propagation learning method.
First, as initial values, the connection weight values w of all neurons in the intermediate layer (n-th layer) are set by random numbers to small values in the range of about −0.1 to 0.1. The connection weight value learning unit 45 then inputs the feature data extracted by the feature data extraction unit 43 to the input layer (first layer) as input signals x_j (j = 1, 2, ..., 8) and calculates the output of each neuron from the input layer (first layer) toward the output layer (N-th layer).

次に、結合重み値学習部４５は、印象度データ入力部４４から入力された印象度データを教師信号ｙ_ｊ(ｊ＝１，２，…，８) とし、出力層（第Ｎ層）の出力out_j ^Ｎと、教師信号ｙ_ｊとの誤差から、学習則δ_j ^Ｎを次式によって計算する。 Next, the connection weight value learning unit 45 takes the impression degree data input from the impression degree data input unit 44 as teacher signals y_j (j = 1, 2, ..., 8) and calculates the learning rule δ_j^N from the error between the output out_j^N of the output layer (N-th layer) and the teacher signal y_j by the following equation.
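The equation referred to here (数１) was an image and did not survive in this text. Assuming sigmoid neurons and the squared error E = ½ Σ_j (y_j − out_j^N)², the usual back-propagation form consistent with the symbols in this paragraph would be the following; it is a reconstruction under those assumptions, not the patent's verbatim formula.

```latex
\delta_j^{N} = -\left(y_j - out_j^{N}\right)\, out_j^{N}\left(1 - out_j^{N}\right)
```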

次に、結合重み値学習部４５は、学習則δ_j ^Ｎを使って、中間層（第ｎ層）の誤差信号 δ_j ⁿ を次式によって計算する。 Next, the connection weight value learning unit 45 uses the learning rule δ_j^N to calculate the error signal δ_j^n of the intermediate layer (n-th layer) by the following equation.
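The equation referred to here (数２) is likewise missing from this text. Assuming sigmoid neurons, the standard back-propagated error signal for the j-th neuron of the n-th layer, written with L_{n+1} for the number of neurons in the layer above, would take the following form; this is a hedged reconstruction rather than the original formula.

```latex
\delta_j^{n} = \left(\sum_{k=1}^{L_{n+1}} \delta_k^{n+1}\, w_{k,j}^{n+1,n}\right) out_j^{n}\left(1 - out_j^{n}\right)
```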

なお、数２において、ｗは、第 n 層 j 番目と第 n -1 層ｋ番目のニューロンの間の結合重み値を表している。 In Equation 2, w represents a connection weight value between the n-th layer j-th neuron and the (n −1) -th layer k-th neuron.

次に、結合重み値学習部４５は、中間層（第ｎ層）の誤差信号 δ_j ⁿ を用いて各ニューロンの結合重み値ｗの変化量Δｗを次式によって計算し、各ニューロンの結合重み値ｗを更新する（ステップＣ５）。 Next, the connection weight value learning unit 45 calculates the amount of change Δw of the connection weight value w of each neuron using the following equation using the error signal δ _j ⁿ of the intermediate layer (nth layer), and the connection weight of each neuron. The value w is updated (step C5).
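The equation referred to here (数３) is missing from this text. With η as the learning rate and w_{j,k}^{n,n-1} as the connection weight between the j-th neuron of the n-th layer and the k-th neuron of the (n−1)-th layer, the conventional gradient-descent update would read as follows; it is a reconstruction under that assumption.

```latex
\Delta w_{j,k}^{n,n-1} = -\eta\, \delta_j^{n}\, out_k^{n-1}
```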

なお、数３において、ηは、学習率を表し、 (0＜η≦1)に設定されている。 In Equation 3, η represents a learning rate and is set to (0 <η ≦ 1).

学習回数を定める設定値Ｔを予め設定しておき、学習回数ｔ＝０，１，…，Ｔについて学習を行い、結合重み値学習部４５は、学習回数ｔが設定値Ｔに達したか否かを判断し（ステップＣ６）、学習回数ｔが設定値Ｔに達するまでステップＣ１〜ステップＣ５の処理動作を繰り返し、学習回数ｔが設定値Ｔに達すると、結合重み値出力部４７を介して学習させた各ニューロンの結合重み値ｗを出力する（ステップＣ７）。出力された各ニューロンの結合重み値ｗは、楽曲検索装置１０の印象度データ変換部１４に記憶される。 A preset value T for determining the number of learning is set in advance, and learning is performed for the number of learning t = 0, 1,..., T. The connection weight value learning unit 45 determines whether the number of learning t has reached the set value T. (Step C6), the processing operation of Step C1 to Step C5 is repeated until the learning number t reaches the set value T. When the learning number t reaches the set value T, the connection weight value output unit 47 is used. The connection weight value w of each learned neuron is output (step C7). The output connection weight value w of each neuron is stored in the impression degree data conversion unit 14 of the music search device 10.
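The learning loop of steps C5 and C6 can be sketched as below. This is an illustration under stated assumptions, not the patent's implementation: sigmoid neurons without bias terms, one intermediate layer of five neurons, a single made-up training pair, teacher signals scaled to (0, 1), and η = 0.5 are all choices made only for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layers, x):
    acts = [list(x)]
    for w in layers:
        acts.append([sigmoid(sum(wi * a for wi, a in zip(row, acts[-1]))) for row in w])
    return acts

def backprop_step(layers, x, y, eta):
    """Update all weights for one (feature, teacher) pair; return the squared error before the update."""
    acts = forward(layers, x)
    out = acts[-1]
    # Output-layer delta: derivative of E = 1/2 * sum (y - out)^2 through the sigmoid.
    delta = [-(yj - oj) * oj * (1.0 - oj) for yj, oj in zip(y, out)]
    for n in range(len(layers) - 1, -1, -1):
        w, prev = layers[n], acts[n]
        if n > 0:
            # Error signal of the layer below, using the weights before they change.
            below = [sum(delta[j] * w[j][k] for j in range(len(w))) * prev[k] * (1.0 - prev[k])
                     for k in range(len(prev))]
        for j, row in enumerate(w):
            for k in range(len(row)):
                row[k] -= eta * delta[j] * prev[k]  # delta_w = -eta * delta_j * out_k
        if n > 0:
            delta = below
    return 0.5 * sum((yj - oj) ** 2 for yj, oj in zip(y, out))

rng = random.Random(0)
sizes = [8, 5, 8]
layers = [[[rng.uniform(-0.1, 0.1) for _ in range(a)] for _ in range(b)]
          for a, b in zip(sizes, sizes[1:])]
x = [0.2, 0.4, 0.1, 0.9, 0.5, 0.3, 0.7, 0.6]  # made-up feature data
y = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4]  # made-up teacher signal
errors = [backprop_step(layers, x, y, eta=0.5) for _ in range(200)]  # T = 200 iterations
```

The list `errors` records the squared error at each iteration, so a set value T can be judged by whether the error has become sufficiently small by the last entry.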

なお、学習回数を定める設定値Ｔは、次式に示す2乗誤差Ｅが十分に小さくなる値に設定すると良い。 The set value T for determining the number of learnings is preferably set to a value at which the square error E shown in the following equation becomes sufficiently small.
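The equation referred to here (数４) is missing from this text. The squared error between the teacher signals y_j and the outputs out_j^N of the output layer is conventionally defined as follows, with L_N the number of output neurons; the factor 1/2 is an assumption of this reconstruction.

```latex
E = \frac{1}{2} \sum_{j=1}^{L_N} \left(y_j - out_j^{N}\right)^2
```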

次に、楽曲マッピング部１６におけるマッピング動作（ステップＡ６）に用いられる楽曲マップの学習動作について図６および図９を参照して詳細に説明する。 Next, the music map learning operation used for the mapping operation (step A6) in the music mapping unit 16 will be described in detail with reference to FIGS.

楽曲データ入力部４１にＣＤ、ＤＶＤ等の楽曲データが記憶されている記憶媒体をセットし、楽曲データ入力部４１から楽曲データを入力し（ステップＤ１）、音声出力部４２は、楽曲データ入力部４１から入力された楽曲データを音声出力し（ステップＤ２）、評価者は、音声出力部４２からの音声出力を聞くことによって、楽曲の印象度を感性によって評価し、評価結果を印象度データとして印象度データ入力部４４から入力し（ステップＤ３）、楽曲マップ学習部４６は、印象度データ入力部４４から入力された印象度データを自己組織化マップへの入力ベクトルとして受け付ける。なお、本実施の形態では、印象度の評価項目としては、人間の感性によって判断される（明るい、暗い）、（重い、軽い）、（かたい、やわらかい）、（安定、不安定）、（澄んだ、にごった）、（滑らか、歯切れの良い）、（激しい、穏やか）、（厚い、薄い）の８項目を設定し、各項目についての７段階評価を印象度データとして楽曲データ入力部４１で受け付けるように構成した。 A storage medium such as a CD or DVD on which music data is stored is set in the music data input unit 41, and music data is input from the music data input unit 41 (step D1). The audio output unit 42 outputs the music data input from the music data input unit 41 as audio (step D2). The evaluator listens to the audio output from the audio output unit 42, evaluates the impression of the music by his or her own sensibility, and inputs the evaluation result as impression degree data from the impression degree data input unit 44 (step D3). The music map learning unit 46 accepts the impression degree data input from the impression degree data input unit 44 as an input vector to the self-organizing map. In this embodiment, eight evaluation items judged by human sensibility are set: (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, muddy), (smooth, crisp), (intense, gentle), and (thick, thin); the seven-level evaluation of each item is accepted by the music data input unit 41 as impression degree data.

楽曲マップ学習部４６は、印象度データ入力部４４から入力された印象度データを入力ベクトルｘ_ｊ（ｔ）∈Ｒ^ｎとし、各ニューロンの特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎを学習させる。なお、ｔは、学習回数を表し、学習回数を定める設定値Ｔを予め設定しておき、学習回数ｔ＝０，１，…，Ｔについて学習を行わせる。なお、Ｒは、各印象度項目の評価段階を示し、ｎは、印象度データの項目数を示す。 The music map learning unit 46 takes the impression degree data input from the impression degree data input unit 44 as an input vector x_j(t) ∈ R^n and trains the feature vector m_i(t) ∈ R^n of each neuron. Here, t denotes the number of learning iterations; a set value T that determines the number of iterations is set in advance, and learning is performed for t = 0, 1, ..., T. R denotes the evaluation level of each impression degree item, and n denotes the number of items of impression degree data.

まず、初期値として、全てのニューロンの特徴ベクトルｍ_ｃ（０）をそれぞれ０〜１の範囲でランダムに設定しておき、楽曲マップ学習部４６は、ｘ_ｊ（ｔ）に最も近いニューロンｃ、すなわち‖ｘ_ｊ（ｔ）−ｍ_ｃ（ｔ）‖を最小にする勝者ニューロンｃを求め、勝者ニューロンｃの特徴ベクトルｍ_ｃ（ｔ）と、勝者ニューロンｃの近傍にある近傍ニューロンｉの集合Ｎ_ｃのそれぞれの特徴ベクトルｍ_ｉ（ｔ）（ｉ∈Ｎ_ｃ）とを、次式に従ってそれぞれ更新する（ステップＤ４）。なお、近傍ニューロンｉを決定するための近傍半径は、予め設定されているものとする。 First, as initial values, the feature vectors m_c(0) of all neurons are each set at random in the range of 0 to 1. The music map learning unit 46 finds the neuron c closest to x_j(t), that is, the winner neuron c that minimizes ‖x_j(t) − m_c(t)‖, and updates the feature vector m_c(t) of the winner neuron c and the feature vectors m_i(t) (i ∈ N_c) of the set N_c of neighboring neurons i in the vicinity of the winner neuron c according to the following equation (step D4). The neighborhood radius used to determine the neighboring neurons i is assumed to be set in advance.
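The update equation referred to here (数５) is missing from this text. The standard self-organizing map update, written with the symbols of this paragraph and the learning rate h_ci(t), would be the following; it is offered as a reconstruction, not as the patent's verbatim formula.

```latex
m_i(t+1) = m_i(t) + h_{ci}(t)\left[x_j(t) - m_i(t)\right], \quad i \in N_c
```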

なお、数５において、ｈ_ｃｉ（ｔ）は、学習率を表し、次式によって求められる。 In Equation 5, h _ci (t) represents a learning rate and is obtained by the following equation.
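The equation for h_ci(t) (数６) is missing from this text. One common form, assuming a learning rate that decays linearly from its initial value α_init over the T iterations and a Gaussian neighborhood of width R(t), is sketched below; r_c and r_i denote the grid positions of the winner neuron c and of neuron i and are symbols introduced here for illustration only.

```latex
h_{ci}(t) = \alpha_{init}\left(1 - \frac{t}{T}\right)\exp\left(-\frac{\lVert r_c - r_i\rVert^2}{R^2(t)}\right)
```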

なお、α_initは学習率の初期値であり、Ｒ^２（ｔ）は、単調減少する一次関数もしくは指数関数が用いられる。 Here, α_init is the initial value of the learning rate, and a monotonically decreasing linear or exponential function is used for R^2(t).

次に、楽曲マップ学習部４６は、学習回数ｔが設定値Ｔに達したか否かを判断し（ステップＤ５）、学習回数ｔが設定値Ｔに達するまでステップＤ１〜ステップＤ４の処理動作を繰り返し、学習回数ｔが設定値Ｔに達すると、特徴ベクトル出力部４８を介して学習させた特徴ベクトルｍ_ｉ（Ｔ）∈Ｒ^ｎを出力する（ステップＤ６）。出力された各ニューロンｉの特徴ベクトルｍ_ｉ（Ｔ）は、楽曲検索装置１０の楽曲マップ記憶部１７に楽曲マップとして記憶される。 Next, the music map learning unit 46 determines whether the learning count t has reached the set value T (step D5) and repeats the processing operations of steps D1 to D4 until it does. When the learning count t reaches the set value T, the trained feature vectors m_i(T) ∈ R^n are output via the feature vector output unit 48 (step D6). The output feature vector m_i(T) of each neuron i is stored as the music map in the music map storage unit 17 of the music search device 10.
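The map learning of steps D4 to D6 can be sketched as follows. The sketch makes several assumptions that the text leaves open: impression degree data is scaled to the range 0 to 1, the learning rate and R(t) both decay linearly, the neighborhood is Gaussian on the grid, and the map is a small 5 x 5 grid rather than the 100 x 100 grid of the embodiment. All names are illustrative.

```python
import math
import random

def train_som(inputs, rows, cols, T, alpha_init, radius0, rng):
    """Train a rows x cols self-organizing map; returns dict (row, col) -> feature vector."""
    dim = len(inputs[0])
    m = {(r, c): [rng.random() for _ in range(dim)]
         for r in range(rows) for c in range(cols)}
    for t in range(T):
        x = inputs[t % len(inputs)]
        # Winner neuron c: smallest squared Euclidean distance to x.
        win = min(m, key=lambda p: sum((xi - mi) ** 2 for xi, mi in zip(x, m[p])))
        alpha = alpha_init * (1.0 - t / T)          # assumed linear decay of the learning rate
        radius = max(radius0 * (1.0 - t / T), 0.5)  # assumed linear decay of R(t), floored at 0.5
        for p, vec in m.items():
            d2 = (p[0] - win[0]) ** 2 + (p[1] - win[1]) ** 2
            if d2 > radius0 ** 2:
                continue  # outside the preset neighborhood N_c
            h = alpha * math.exp(-d2 / radius ** 2)
            for k in range(dim):
                vec[k] += h * (x[k] - vec[k])       # m_i(t+1) = m_i(t) + h_ci(t) [x - m_i(t)]
    return m

rng = random.Random(1)
inputs = [[rng.random() for _ in range(8)] for _ in range(20)]  # made-up impression vectors
som = train_som(inputs, rows=5, cols=5, T=200, alpha_init=0.5, radius0=2.0, rng=rng)
```

Because each update is a weighted average of the old feature vector and the input, the trained feature vectors stay within the range of the inputs.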

次に、楽曲検索装置１０における楽曲検索動作について図７を参照して詳細に説明する。
楽曲検索部１８は、ＰＣ表示部２０に、図１０に示すような検索画面５０を表示し、ＰＣ操作部１９からのユーザ入力を受け付ける。検索画面５０は、楽曲マップ記憶部１７に記憶されている楽曲データのマッピング状況が表示される楽曲マップ表示領域５１と、検索条件を入力する検索条件入力領域５２と、検索結果が表示される検索結果表示領域５３とからなる。図１０の楽曲マップ表示領域５１に示されている点は、楽曲データがマッピングされている楽曲マップのニューロンを示している。 Next, the music search operation in the music search apparatus 10 will be described in detail with reference to FIG.
The music search unit 18 displays a search screen 50 as shown in FIG. 10 on the PC display unit 20 and accepts user input from the PC operation unit 19. The search screen 50 includes a music map display area 51 in which the mapping status of music data stored in the music map storage unit 17 is displayed, a search condition input area 52 for inputting search conditions, and a search in which search results are displayed. It consists of a result display area 53. The points shown in the music map display area 51 in FIG. 10 indicate the neurons of the music map to which the music data is mapped.

検索条件入力領域５２は、図１１に示すように、検索条件として印象度データを入力する印象度データ入力領域５２１と、検索条件として書誌データを入力する書誌データ入力領域５２２と、検索の実行を指示する検索実行ボタン５２３とからなり、ユーザは、検索条件として印象度データおよび書誌データをＰＣ操作部１９から入力し（ステップＥ１）、検索実行ボタン５２３をクリックすることで、印象度データおよび書誌データに基づく検索を楽曲検索部１８に指示する。なお、ＰＣ操作部１９からの印象度データの入力は、図１１に示すように、印象度データの各項目を７段階評価で入力することによって行われる。 As shown in FIG. 11, the search condition input area 52 has an impression degree data input area 521 for inputting impression degree data as a search condition, a bibliographic data input area 522 for input of bibliographic data as a search condition, and executes the search. The search execution button 523 for instructing the user, the user inputs impression degree data and bibliographic data as search conditions from the PC operation unit 19 (step E1), and clicks the search execution button 523, whereby the impression degree data and bibliography are clicked. A search based on data is instructed to the music search unit 18. Note that the impression data is input from the PC operation unit 19 by inputting each item of the impression data in a seven-step evaluation as shown in FIG.

楽曲検索部１８は、ＰＣ操作部１９から入力された印象度データおよび書誌データに基づいて楽曲データベース１５を検索し（ステップＥ２）、図１２に示すような検索結果を検索結果表示領域５３に表示する。 The music search unit 18 searches the music database 15 based on the impression data and bibliographic data input from the PC operation unit 19 (step E2), and displays the search results as shown in FIG. To do.

ＰＣ操作部１９から入力された印象度データに基づく検索は、ＰＣ操作部１９から入力された印象度データを入力ベクトルｘ_ｊとし、楽曲データベース１５に楽曲データと共に記憶されている印象度データを検索対象ベクトルＸ_ｊとすると、入力ベクトルｘ_ｊに近い検索対象ベクトルＸ_ｊ、すなわちユークリッド距離‖ｘ_ｊ−ｍ_ｉ‖が小さい順に検索していく。検索する件数は、予め定めておいても、ユーザによって任意に設定するようにしても良い。また、印象度データと書誌データとが共に検索条件とされている場合には、書誌データに基づく検索を行った後、印象度データに基づく検索が行われる。なお、Ｒは、印象度データ各項目の評価段階数を示し、ｎは、印象度データの項目数を示す。 In the search based on the impression degree data input from the PC operation unit 19, the impression degree data input from the PC operation unit 19 is taken as an input vector x_j and the impression degree data stored together with the music data in the music database 15 is taken as a search target vector X_j; search target vectors X_j close to the input vector x_j are then retrieved, that is, in ascending order of the Euclidean distance ‖x_j − m_i‖. The number of results to retrieve may be predetermined or set arbitrarily by the user. When both impression degree data and bibliographic data are given as search conditions, the search based on bibliographic data is performed first, followed by the search based on impression degree data. Here, R denotes the number of evaluation levels of each item of impression degree data, and n denotes the number of items of impression degree data.
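The search of step E2 can be sketched as a bibliographic filter followed by a distance ranking. The record layout (`title`, `artist`, `impression`), the exact-match artist filter, and the sample songs are assumptions made for the example; the distance is taken between the query and each stored impression vector, as the prose of this paragraph describes.

```python
import math

def search(database, impression=None, artist=None, limit=10):
    """database: list of dicts with 'title', 'artist', 'impression' (eight levels, 1..7)."""
    # Bibliographic search first ...
    hits = [s for s in database if artist is None or s["artist"] == artist]
    # ... then ranking by ascending Euclidean distance to the query impression.
    if impression is not None:
        hits.sort(key=lambda s: math.dist(s["impression"], impression))
    return hits[:limit]

database = [
    {"title": "A", "artist": "X", "impression": [7, 2, 3, 5, 6, 4, 2, 3]},
    {"title": "B", "artist": "Y", "impression": [1, 6, 5, 3, 2, 4, 6, 5]},
    {"title": "C", "artist": "X", "impression": [6, 2, 3, 5, 6, 4, 3, 3]},
]
nearest = search(database, impression=[6, 2, 3, 5, 6, 4, 3, 3], limit=2)
by_artist = search(database, impression=[1, 6, 5, 3, 2, 4, 6, 5], artist="X", limit=5)
```

In the second call song B is the closest match by impression but is excluded by the bibliographic condition, so only the songs by artist X are ranked.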

検索条件入力領域５２を用いた検索以外に、楽曲マップ表示領域５１を用いた検索を行える様にしても良い。この場合には、楽曲マップ表示領域５１において検索対象領域を指定することで、検索対象領域内にマッピングされている楽曲データを検索結果として検索結果表示領域５３に表示する。 In addition to the search using the search condition input area 52, a search using the music map display area 51 may be performed. In this case, by designating the search target area in the music map display area 51, the music data mapped in the search target area is displayed in the search result display area 53 as a search result.

次に、ユーザは、検索結果表示領域５３に表示されている検索結果の中から代表曲を選択し（ステップＥ３）、代表曲検索実行ボタン５３１をクリックすることで、代表曲に基づく検索を楽曲検索部１８に指示する。 Next, the user selects a representative song from the search results displayed in the search result display area 53 (step E3), and clicks the representative song search execution button 531 to perform a search based on the representative song. The search unit 18 is instructed.

楽曲検索部１８は、選択された代表曲に基づいて楽曲マップ記憶部１７に記憶されている楽曲マップを検索し（ステップＥ４）、代表曲がマッピングされているニューロンと、その近傍ニューロンとにマッピングされている楽曲データを代表曲検索結果として検索結果表示領域５３に表示する。近傍ニューロンを決定するための近傍半径は、予め定めておいても、ユーザによって任意に設定するようにしても良い。 The music search unit 18 searches the music map stored in the music map storage unit 17 based on the selected representative music (step E4), and maps it to the neuron to which the representative music is mapped and its neighboring neurons. The stored music data is displayed in the search result display area 53 as a representative music search result. The neighborhood radius for determining the neighborhood neuron may be set in advance or arbitrarily set by the user.
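The representative-song search of step E4 collects the songs mapped to the representative song's neuron and to the neurons within the neighborhood radius. A minimal sketch, assuming the placements are kept as a dictionary from grid coordinates to song identifiers and that the neighborhood is a Euclidean disc on the grid (the shape of the neighborhood is not stated in the text):

```python
def representative_search(placements, song_id, radius):
    """placements: dict (row, col) -> list of song ids mapped to that neuron."""
    home = next(pos for pos, songs in placements.items() if song_id in songs)
    result = []
    for (r, c), songs in sorted(placements.items()):
        if (r - home[0]) ** 2 + (c - home[1]) ** 2 <= radius ** 2:
            result.extend(s for s in songs if s != song_id)
    return result

placements = {(0, 0): ["a", "b"], (0, 1): ["c"], (1, 1): ["e"], (3, 3): ["d"]}
close = representative_search(placements, "a", radius=1)
wider = representative_search(placements, "a", radius=1.5)
```

Enlarging the radius from 1 to 1.5 pulls in the diagonal neighbor (1, 1), which illustrates how a user-adjustable neighborhood radius widens the result set.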

次に、ユーザは、検索結果表示領域５３に表示されている代表曲検索結果の中から端末装置３０に出力する楽曲データを、図１３に示すように選択し（ステップＥ５）、出力ボタン５３２をクリックすることで、選択した楽曲データの出力を楽曲検索部１８に指示し、楽曲検索部１８は、検索結果出力部２１を介してユーザによって選択された楽曲データを端末装置３０に出力する（ステップＥ６）。 Next, the user selects music data to be output to the terminal device 30 from the representative music search results displayed in the search result display area 53 as shown in FIG. 13 (step E5), and the output button 532 is selected. By clicking, the music search unit 18 is instructed to output the selected music data, and the music search unit 18 outputs the music data selected by the user via the search result output unit 21 to the terminal device 30 (step). E6).

なお、検索条件入力領域５２、楽曲マップ表示領域５１を用いた代表曲の検索以外に、図１４に示すような、記憶されている全楽曲のリストが表示される全楽曲リスト表示領域５４を検索画面５０に表示させ、全楽曲リストから代表曲を直接選択して、代表曲選択実行ボタン５４１をクリックすることで、選択された代表曲に基づく検索を楽曲検索部１８に指示するように構成しても良い。 In addition to the search for representative songs using the search condition input area 52 and the music map display area 51, a search is made for an all music list display area 54 in which a list of all stored music is displayed as shown in FIG. It is configured to display on the screen 50 and directly select a representative song from the entire song list and click the representative song selection execution button 541 to instruct the music search unit 18 to search based on the selected representative song. May be.

さらに、上述した検索以外に、「明るい曲」、「楽しい曲」、「癒される曲」というように言葉で表現されるキーワードに対応するニューロン（あるいは楽曲）を設定しておき、キーワードを選択することによって楽曲の検索を行えるように構成しても良い。すなわち、図１５（ａ）に示すような、キーワード検索領域５５を検索画面５０に表示させ、キーワード選択領域５５１に表示されたキーワードのリストからいずれかを選択し、おまかせ検索ボタン５５３をクリックすることで、選択されたキーワードに対応するニューロンに基づく検索を楽曲検索部１８に指示するように構成する。図１５（ａ）に示す設定楽曲表示領域５５２には、選択されたキーワードに対応する楽曲が設定されている場合に、当該楽曲が設定楽曲として表示され、この場合には、おまかせ検索ボタン５５３をクリックすることで、選択されたキーワードに対応する設定楽曲を代表曲とする検索を楽曲検索部１８に指示する。また、図１５（ａ）に示す設定楽曲変更ボタン５５４は、キーワードに対応する楽曲を変更する際に使用されるもので、設定楽曲変更ボタン５５４をクリックすることで、全楽曲リストが表示されて、全楽曲リストの中から楽曲を選択することで、キーワードに対応する楽曲を変更できるように構成する。なお、キーワードに対応するニューロン（あるいは楽曲）の設定は、キーワードに印象度データを割り付けておき、当該印象度データを入力ベクトルｘ_ｊとし、入力ベクトルｘ_ｊに最も近いニューロン（あるいは楽曲）とを対応づけるようにしても良く、ユーザによって任意に設定できるように構成しても良い。 In addition to the search described above, neurons (or songs) corresponding to keywords expressed in words such as “bright songs”, “fun songs”, and “healed songs” are set and keywords are selected. It may be configured so that music can be searched. That is, as shown in FIG. 15A, a keyword search area 55 is displayed on the search screen 50, one is selected from the keyword list displayed in the keyword selection area 551, and the automatic search button 553 is clicked. Thus, the music search unit 18 is instructed to search based on the neuron corresponding to the selected keyword. In the set music display area 552 shown in FIG. 15A, when a music corresponding to the selected keyword is set, the music is displayed as the set music. In this case, an automatic search button 553 is displayed. By clicking, the music search unit 18 is instructed to search for the set music corresponding to the selected keyword as a representative music. The set music change button 554 shown in FIG. 15A is used when changing the music corresponding to the keyword. When the set music change button 554 is clicked, the entire music list is displayed. The music corresponding to the keyword can be changed by selecting the music from the entire music list. 
The neuron (or music) corresponding to a keyword may be set by assigning impression degree data to the keyword in advance, taking that impression degree data as an input vector x_j, and associating the keyword with the neuron (or music) closest to the input vector x_j; alternatively, it may be configured so that the user can set the correspondence arbitrarily.

このように、キーワードに対応するニューロンが設定されている場合には、図１５（ｂ）に示すように、楽曲マップ表示領域５１において楽曲がマッピンクされているニューロンをクリックすると、クリックされたニューロンに対応するキーワードがキーワード表示５１１としてポップアップ表示されるように構成すると、楽曲マップ表示領域５１を利用した楽曲の検索を容易に行うことができる。 Thus, when the neuron corresponding to the keyword is set, as shown in FIG. 15B, when the neuron to which the music is mapped is clicked in the music map display area 51, the clicked neuron is displayed. If the corresponding keyword is configured to be pop-up displayed as the keyword display 511, it is possible to easily search for a song using the song map display area 51.

As described above, according to the present embodiment, a music map, which is a self-organizing map trained in advance and onto which music data is mapped based on the impression degree data of that music data, is stored in the music map storage unit 17, and the music search unit 18 searches using the music map stored in the music map storage unit 17. As a result, simply by selecting a representative song, songs whose impression is similar to that of the representative song can be found quickly among the large amount of music data stored in the large-capacity storage means.
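A minimal sketch of such a representative-song search is given below. The dictionary layout of the map, the song titles, the square (Chebyshev) neighborhood, and the choice to omit the representative song from the result are assumptions made for illustration only.

```python
def search_similar(song_map, representative, radius=1):
    """Return songs mapped to the representative song's neuron and its neighbors.

    song_map maps (row, col) neuron positions to lists of song titles.
    A square neighborhood of the given radius is assumed for this sketch.
    """
    # Locate the neuron to which the representative song is mapped.
    home = next(
        (pos for pos, songs in song_map.items() if representative in songs),
        None,
    )
    if home is None:
        return []
    r0, c0 = home
    hits = []
    for (r, c), songs in sorted(song_map.items()):
        if max(abs(r - r0), abs(c - c0)) <= radius:
            # Leaving out the representative song itself is a design choice here.
            hits.extend(s for s in songs if s != representative)
    return hits

# Hypothetical music map with five songs on four neurons.
song_map = {
    (0, 0): ["A", "B"],
    (0, 1): ["C"],
    (1, 1): ["E"],
    (2, 2): ["D"],
}
nearby = search_similar(song_map, "A", radius=1)
```

Because only the neurons within the radius are visited, the cost of a search depends on the neighborhood size rather than on the total number of stored songs.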

Furthermore, according to the present embodiment, the hierarchical neural network used in the impression degree data conversion unit 14 is trained using, as teacher signals, impression degree data input by an evaluator who has listened to the music data. Thus, for example, employing a celebrity known to the user as an evaluator can increase the user's trust in the results. In addition, preparing hierarchical neural networks each trained by a different one of several evaluators, and letting the user choose among them, improves user convenience.

Furthermore, according to the present embodiment, the feature data extraction unit 13 extracts a plurality of items consisting of fluctuation information as feature data. This allows the physical features of the music data to be extracted accurately and improves the accuracy of the impression degree data converted from the feature data.
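As a rough illustration of frame-based feature extraction, the sketch below splits a signal into fixed-length frames, computes a power spectrum per frame, and takes the standard deviation of the per-frame power as one crude fluctuation measure. The test signal, the frame length, and this particular measure are assumptions for illustration and are not the feature items defined in the specification; a naive DFT stands in for a fast Fourier transform to keep the example dependency-free.

```python
import cmath
import math
import statistics

def power_spectrum(frame):
    """One-sided power spectrum of a frame, computed with a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def frame_powers(samples, frame_len):
    """Split samples into fixed-length frames and sum each frame's spectrum."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return [sum(power_spectrum(f)) for f in frames]

# Hypothetical signal: a sine wave whose amplitude doubles halfway through.
frame_len = 8
samples = [math.sin(2 * math.pi * t / 8) for t in range(16)]
samples += [2 * math.sin(2 * math.pi * t / 8) for t in range(16)]

powers = frame_powers(samples, frame_len)
# Assumed fluctuation measure: spread of per-frame power over time.
fluctuation = statistics.pstdev(powers)
```

A steady signal yields a fluctuation value near zero, while changes in level from frame to frame increase it.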

Furthermore, according to the present embodiment, using a self-organizing map trained in advance as the music map places songs with similar impressions near one another, which improves search efficiency.

Next, another embodiment of the present invention will be described in detail with reference to FIG. 16.
FIG. 16 is a block diagram showing the configuration of another embodiment of the music search system according to the present invention.

In the other embodiment shown in FIG. 16, the terminal device 30 includes a music database 36, a music map storage unit 37, and a music search unit 38 having functions equivalent to those of the music database 15, the music map storage unit 17, and the music search unit 18 shown in FIG. 1, respectively, so that the terminal device 30 can search the music database 36 and the music map stored in the music map storage unit 37. In this embodiment, the music search device 10 functions as a music registration device that stores the music data input from the music data input unit 11 in the music database 15, the impression degree data converted by the impression degree data conversion unit 14 in the music database 15, and the music map mapped by the music mapping unit 16 in the music map storage unit 17.

The contents of the music database 15 and the music map storage unit 17 of the music search device 10 are output to the terminal device 30 by the database output unit 22, and the database input unit 39 of the terminal device 30 stores those contents in the music database 36 and the music map storage unit 37. Search conditions are input from the terminal operation unit 33 based on what is displayed on the terminal display unit 34.

The present invention is not limited to the embodiments described above, and it is clear that each embodiment may be modified as appropriate within the scope of the technical idea of the present invention. The number, position, shape, and the like of the constituent members are likewise not limited to those of the embodiments above and may be set to whatever is suitable for practicing the present invention. In the drawings, the same components are denoted by the same reference numerals.

FIG. 1 is a block diagram showing the configuration of an embodiment of the music search system according to the present invention.
FIG. 2 is a block diagram showing the configuration of a neural network learning device that trains in advance the neural network used in the music search device shown in FIG. 1.
FIG. 3 is a flowchart for explaining the music registration operation in the music search device shown in FIG. 1.
FIG. 4 is a flowchart for explaining the feature data extraction operation in the feature data extraction unit shown in FIG. 1.
FIG. 5 is a flowchart for explaining the learning operation of the hierarchical neural network in the neural network learning device shown in FIG. 2.
FIG. 6 is a flowchart for explaining the learning operation of the music map in the neural network learning device shown in FIG. 2.
FIG. 7 is a flowchart for explaining the music search operation in the music search device shown in FIG. 1.
FIG. 8 is an explanatory diagram for explaining the learning algorithm of the hierarchical neural network in the neural network learning device shown in FIG. 2.
FIG. 9 is an explanatory diagram for explaining the learning algorithm of the music map in the neural network learning device shown in FIG. 2.
FIG. 10 is a diagram showing an example of the display screen of the PC display unit shown in FIG. 1.
FIG. 11 is a diagram showing a display example of the search condition input area shown in FIG. 10.
FIG. 12 is a diagram showing a display example of the search result display area shown in FIG. 10.
FIG. 13 is a diagram showing a display example of the search result display area shown in FIG. 10.
FIG. 14 is a diagram showing an example of the all-music list display area displayed on the display screen example shown in FIG. 10.
FIG. 15 is a diagram showing an example of the keyword search area displayed on the display screen example shown in FIG. 10.
FIG. 16 is a block diagram showing the configuration of another embodiment of the music search system according to the present invention.

Explanation of symbols

10 Music search device
11 Music data input unit
12 Compression processing unit
13 Feature data extraction unit
14 Impression degree data conversion unit
15 Music database
16 Music mapping unit
17 Music map storage unit
18 Music search unit
19 PC operation unit
20 PC display unit
21 Search result output unit
22 Database output unit
30 Terminal device
31 Search result input unit
32 Search result storage unit
33 Terminal operation unit
34 Terminal display unit
35 Audio output unit
36 Music database
37 Music map storage unit
38 Music search unit
39 Database input unit
40 Neural network learning device
41 Music data input unit
42 Audio output unit
43 Feature data extraction unit
44 Impression degree data input unit
45 Connection weight value learning unit
46 Music map learning unit
47 Connection weight value output unit
48 Feature vector output unit
50 Search screen
51 Music map display area
52 Search condition input area
53 Search result display area
54 All-music list display area
55 Keyword search area
511 Keyword display
521 Impression degree data input area
522 Bibliographic data input area
523 Search execution button
531 Representative song search execution button
532 Output button
541 Representative song selection execution button
551 Keyword selection area
552 Set music display area
553 Automatic search button
554 Set music change button

Claims

A music search system for searching desired music data from a plurality of music data stored in a music database,
Music data input means for inputting the music data;
Feature data extraction means for extracting physical feature data by performing a fast Fourier transform on a fixed frame length of the music data input by the music data input means and calculating a power spectrum;
Impression degree data conversion means for converting the feature data extracted by the feature data extraction means using a hierarchical neural network that has been subjected to learning into impression degree data determined by human sensitivity,
Music mapping means for mapping the music data input by the music data input means to a music map, which is a self-organizing map that has been learned in advance, based on the impression degree data converted by the impression degree data conversion means;
Music map storage means for storing music data mapped by the music mapping means;
Representative song selection means for selecting a representative song from song data mapped to the song map;
Keyword setting means for setting music corresponding to the keyword;
Music mapping display means for displaying music by mapping;
Keyword display means for displaying the keyword when pointing to a neuron that is a music displayed by the music mapping display means;
Music search means for searching, based on the representative song selected by the representative song selection means and to which the keyword is mapped, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons;
A music search system comprising: music data output means for outputting music data searched by the music search means.

A music search system comprising a music search device for searching desired music data from a plurality of music data stored in a music database, and a terminal device configured to be connectable to the music search device,
The music search device includes music data input means for inputting the music data;
Feature data extraction means for extracting physical feature data by performing a fast Fourier transform on a fixed frame length of the music data input by the music data input means and calculating a power spectrum;
Impression degree data conversion means for converting the feature data extracted by the feature data extraction means using a hierarchical neural network that has been subjected to learning into impression degree data determined by human sensitivity,
Music mapping means for mapping the music data input by the music data input means to a music map, which is a self-organizing map that has been learned in advance, based on the impression degree data converted by the impression degree data conversion means;
Music map storage means for storing music data mapped by the music mapping means;
Representative song selection means for selecting a representative song from song data mapped to the song map;
Keyword setting means for setting music corresponding to the keyword;
Music mapping display means for displaying music by mapping;
Keyword display means for displaying the keyword when pointing to a neuron that is a music displayed by the music mapping display means;
Music search means for searching, based on the representative song selected by the representative song selection means and to which the keyword is mapped, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons;
Music data output means for outputting the music data searched by the music search means,
The terminal device includes search result input means for inputting music data from the music search device;
Search result storage means for storing music data input by the search result input means;
A music search system comprising: audio output means for reproducing music data stored in the search result storage means.

A music search system comprising a music registration device for storing input music data in a music database and a terminal device configured to be connectable to the music registration device,
The music registration device includes music data input means for inputting the music data;
Feature data extraction means for extracting physical feature data by performing a fast Fourier transform on a fixed frame length of the music data input by the music data input means and calculating a power spectrum;
Impression degree data conversion means for converting the feature data extracted by the feature data extraction means using a hierarchical neural network that has been subjected to learning into impression degree data determined by human sensitivity,
Music mapping means for mapping the music data input by the music data input means to a music map, which is a self-organizing map that has been learned in advance, based on the impression degree data converted by the impression degree data conversion means;
Music map storage means for storing music data mapped by the music mapping means;
Database output means for outputting the music data stored in the music database and the music map stored in the music map storage means to the terminal device;
The terminal device includes database input means for inputting music data and a music map from the music registration device;
A terminal-side music database for storing music data input by the database input means;
Terminal-side music map storage means for storing the music map input by the database input means;
Representative song selection means for selecting a representative song from song data mapped to the song map;
Keyword setting means for setting music corresponding to the keyword;
Music mapping display means for displaying music by mapping;
Keyword display means for displaying the keyword when pointing to a neuron that is a music displayed by the music mapping display means;
Music search means for searching, based on the representative song selected by the representative song selection means and to which the keyword is mapped, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons;
A music search system comprising: audio output means for reproducing music data searched by the music search means.

The music search system according to any one of claims 1 to 3, wherein the hierarchical neural network is trained using, as teacher signals, impression degree data input by an evaluator who has listened to music data.

The music search system according to any one of claims 1 to 4, wherein the feature data extraction means extracts a plurality of items consisting of fluctuation information as the feature data.

The music search system according to any one of claims 1 to 5, wherein the music mapping means uses the impression degree data converted by the impression degree data conversion means as an input vector and maps the music data input by the music data input means to the neuron closest to the input vector.

The music search system according to any one of claims 1 to 6, wherein a neighborhood radius for determining a neighborhood neuron in the music search means can be arbitrarily set.

The music search system according to any one of claims 1 to 7, wherein the music map is learned by impression degree data input by an evaluator who has listened to music data.

A music search system for searching desired music data from a plurality of music data stored in a music database,
A music map, which is a self-organizing map that has been learned in advance and to which music data is mapped;
Representative song selection means for selecting a representative song from song data mapped to the song map;
Keyword setting means for setting music corresponding to the keyword;
Music mapping display means for displaying music by mapping;
Keyword display means for displaying the keyword when pointing to a neuron that is a music displayed by the music mapping display means;
Music search means for searching, based on the representative song selected by the representative song selection means and to which the keyword is mapped, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons;
A music search system comprising: music data output means for outputting music data searched by the music search means.

The music search system according to claim 9, wherein the music data is mapped to a music map using impression degree data included in the music data as an input vector.

The music search system according to claim 9 or 10, wherein a neighborhood radius for determining a neighborhood neuron in the music search means can be arbitrarily set.

The music search system according to any one of claims 9 to 11, wherein the music map is learned by impression degree data input by an evaluator who has listened to the music data.

A music search method, executed by a computer, for searching for desired music data from a plurality of music data stored in a music database, the computer executing:
A music data input step in which a music data input unit accepts input of the music data;
A feature data extraction step in which a feature data extraction unit extracts physical feature data by performing a fast Fourier transform on a predetermined frame length of the input music data and calculating a power spectrum;
An impression degree data conversion step in which an impression degree data conversion unit converts the extracted feature data into impression degree data determined by human sensitivity using a previously learned hierarchical neural network;
A music mapping step in which a music mapping unit maps the received music data to a music map, which is a self-organizing map that has been previously learned, based on the converted impression degree data;
A keyword setting step in which a PC operation unit sets music corresponding to a keyword;
A music mapping display step in which a PC display unit displays the music by mapping;
A keyword display step in which the PC display unit displays the keyword when a neuron that is a song displayed in the music mapping display step is pointed to;
A representative song selection step in which the PC operation unit selects a representative song from the music data mapped to the music map;
A music search step in which a music search unit searches, based on the selected representative song, for the music data on the music map contained in the neuron to which the representative song is mapped and in its neighboring neurons; and
A music data output step in which a search result output unit outputs the music data searched in the music search step.

The music search method according to claim 13, wherein the extracted feature data is converted into impression degree data determined by human sensitivity using the hierarchical neural network that has been trained using, as a teacher signal, impression degree data input by an evaluator who has listened to music data.

A music search program that causes a computer to execute the music search method according to claim 13 or 14.