JP2006031637A

JP2006031637A - Musical piece retrieval system and musical piece retrieval method

Info

Publication number: JP2006031637A
Application number: JP2004213608A
Authority: JP
Inventors: Narifumi Nochida; 成文後田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2004-07-21
Filing date: 2004-07-21
Publication date: 2006-02-02
Anticipated expiration: 2024-07-21
Also published as: JP4246120B2

Abstract

PROBLEM TO BE SOLVED: To provide a musical piece retrieval system and a musical piece retrieval method that strongly reflect the taste of a user even when using a hierarchical neural network that has been pre-learned by an evaluator in order to reduce the work the user must do to train the hierarchical neural network.

SOLUTION: A plurality of pre-learned hierarchical neural networks are stored in a learning data storage part 23, and an initial setting part 22 selects one of them. A hierarchical neural network learning part 25 trains the selected pre-learned hierarchical neural network so that it reflects the taste of the user. The hierarchical neural network trained by the hierarchical neural network learning part 25 directly associates physical characteristic data of a musical piece with impression degree data determined by human sensibility.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a music search system and a music search method for searching for a desired piece of music among a large amount of music data stored in a large-capacity storage means such as an HDD, and in particular to a music search system and a music search method capable of searching for music based on impression degree data determined by human sensibility.

In recent years, large-capacity storage means such as HDDs have been developed, making it possible to store a large amount of music data. Such music data is generally searched using bibliographic data such as artist names, song titles, and other keywords. A search by bibliographic data, however, cannot take into account the feeling that a piece of music conveys, so pieces with different impressions may be retrieved. It is therefore unsuitable when the user wants to find pieces that give a similar impression when listened to.

To make it possible to search for the music a user wants based on subjective impressions of the music, an apparatus has been proposed that takes the user's subjective requirements for the desired music as input, quantifies them, and outputs the result. From that output, the apparatus calculates a predicted impression value that quantifies the impression of the music to be searched. Using the calculated predicted impression value as a key, it then searches a music database that stores the acoustic signals of a plurality of pieces of music together with impression values quantifying their impressions, and thereby retrieves the desired music based on the user's subjective image of it (see, for example, Patent Document 1).

However, in the prior art, impression values converted from the physical features of the music are searched based on a predicted impression value obtained by quantifying the user's subjective requirements. The input items for the subjective requirements that the user enters as search conditions are therefore aggregated, and a highly accurate search of music data based on subjective requirements cannot be achieved. In addition, because the rule that converts the physical features of music into impression values is fixed, the converted impression values do not necessarily match the preferences of individual users.
Patent Document 1: JP 2002-278547 A

The present invention has been made in view of these problems. Its object is to provide a music search system and a music search method in which a hierarchical neural network trained to reflect the user's preferences directly associates feature data, consisting of a plurality of physical items of a piece of music, with impression degree data, consisting of items determined by human sensibility. This makes it possible to search music data with high accuracy based on the impression degree data that the user enters as a search condition. It also makes it possible to strongly reflect the user's preferences even when using a hierarchical neural network that has been pre-learned by an evaluator in order to reduce the work required of the user to train the network.

In order to solve the above problems, the present invention has the following configurations.
The music search system of the present invention stores a plurality of pieces of music data in a music database and searches for desired music data among the plurality of pieces of music data stored in the music database. The system comprises: learning data storage means storing a plurality of hierarchical neural networks that have been pre-learned by evaluators and that convert physical feature data of the music data into impression degree data determined by human sensibility; initial setting means for selecting one of the plurality of hierarchical neural networks stored in the learning data storage means; audio output means for outputting the music data as audio; impression degree data input means for inputting the impression degree data corresponding to the music data output as audio from the audio output means; hierarchical neural network learning means for training the hierarchical neural network selected by the initial setting means using the feature data of the music data output as audio from the audio output means and the impression degree data input from the impression degree data input means; music data input means for inputting the music data; feature data extraction means for extracting the feature data from the music data input by the music data input means; impression degree data conversion means for converting the feature data extracted by the feature data extraction means into the impression degree data using the hierarchical neural network trained by the hierarchical neural network learning means; storage control means for storing the impression degree data converted by the impression degree data conversion means in the music database together with the music data input by the music data input means; music search means for searching the music database based on the impression degree data input as a search condition; and music data output means for outputting the music data found by the music search means.

Furthermore, the music search system of the present invention comprises personal information input means for inputting personal information of a user. The learning data storage means stores a plurality of hierarchical neural networks, each pre-learned by a different evaluator, together with the personal information of those evaluators. The initial setting means selects one of the plurality of hierarchical neural networks stored in the learning data storage means by comparing the personal information of the user with the personal information of the evaluators.

Furthermore, the music search system of the present invention comprises genre input means for inputting the genre of the music data. The learning data storage means stores a plurality of hierarchical neural networks, each pre-learned with music data of a different genre. The initial setting means selects one of the plurality of hierarchical neural networks stored in the learning data storage means based on the genre input by the genre input means.

The music search method of the present invention stores a plurality of pieces of music data in a music database and searches for desired music data among the plurality of pieces of music data stored in the music database. The method comprises: storing a plurality of hierarchical neural networks that have been pre-learned by evaluators and that convert physical feature data of the music data into impression degree data determined by human sensibility; selecting one of the stored hierarchical neural networks; outputting the music data as audio; accepting input of the impression degree data corresponding to the music data output as audio; training the selected hierarchical neural network using the feature data of the music data output as audio and the accepted impression degree data; accepting input of the music data; extracting the feature data from the accepted music data; converting the extracted feature data into the impression degree data using the trained hierarchical neural network; storing the converted impression degree data in the music database together with the accepted music data; accepting input of the impression degree data as a search condition; searching the music database based on the accepted impression degree data; and outputting the music data found by the search.

Furthermore, the music search method of the present invention accepts input of personal information of a user, stores a plurality of hierarchical neural networks, each pre-learned by an evaluator, together with the personal information of those evaluators, and selects one of the stored hierarchical neural networks by comparing the accepted personal information of the user with the personal information of the evaluators.

Furthermore, the music search method of the present invention accepts input of the genre of the music data, stores a plurality of hierarchical neural networks, each pre-learned with music data of a different genre, and selects one of the stored hierarchical neural networks based on the accepted genre.

The music search system and the music search method of the present invention store a plurality of hierarchical neural networks that have been pre-learned by evaluators and that convert physical feature data of music data into impression degree data determined by human sensibility, select one of the stored hierarchical neural networks based on the user's input, and train the selected hierarchical neural network so that it reflects the user's preferences. With this configuration, the hierarchical neural network trained to reflect the user's preferences directly associates feature data, consisting of a plurality of physical items of a piece of music, with impression degree data, consisting of items determined by human sensibility. As a result, music data can be searched with high accuracy based on the impression degree data that the user enters as a search condition, and the user's preferences can be strongly reflected even when using a hierarchical neural network that has been pre-learned by an evaluator in order to reduce the work required of the user to train the network.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

(Present embodiment)
FIG. 1 is a block diagram showing the configuration of the present embodiment of the music search system according to the present invention. FIG. 2 is a block diagram showing the configuration of the terminal device shown in FIG. 1. FIG. 3 is a block diagram showing the configuration of a neural network learning device that trains in advance the hierarchical neural network and the music map used in the music search device shown in FIG. 1.

Referring to FIG. 1, in the present embodiment the music search device 10 and the terminal device 30 are connected by a data transmission path such as USB, and the terminal device 30 can be disconnected from the music search device 10 and carried.

Referring to FIG. 1, the music search device 10 comprises a music data input unit 11, a compression processing unit 12, a feature data extraction unit 13, an impression degree data conversion unit 14, a music database 15, a music mapping unit 16, a music map storage unit 17, a music search unit 18, a PC operation unit 19, a PC display unit 20, a search result output unit 21, an initial setting unit 22, a learning data storage unit 23, an audio output unit 24, and a hierarchical neural network learning unit 25.

The music data input unit 11 has a function of reading a storage medium, such as a CD or DVD, on which music data is stored. It inputs music data from the storage medium and outputs it to the compression processing unit 12 and the feature data extraction unit 13. Besides storage media such as CDs and DVDs, it may also be configured to input music data (distribution data) received via a network such as the Internet. When compressed music data is input, the music data input unit 11 decompresses it and outputs it to the feature data extraction unit 13.

The compression processing unit 12 compresses the music data input from the music data input unit 11 in a compression format such as MP3 or ATRAC (Adaptive Transform Acoustic Coding) and stores the compressed music data in the music database 15 together with bibliographic data such as the artist name and song title.

The feature data extraction unit 13 extracts feature data consisting of fluctuation information from the music data input from the music data input unit 11 and outputs the extracted feature data to the impression degree data conversion unit 14.

The impression degree data conversion unit 14 uses a hierarchical neural network that has been trained in advance to convert the feature data input from the feature data extraction unit 13 into impression degree data determined by human sensibility, and outputs the converted impression degree data to the music mapping unit 16.
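The conversion from feature data to impression degree data can be sketched as a forward pass through a small layered network. This is only an illustration: the three-layer structure, the sigmoid activation, the layer sizes, and the names `sigmoid`, `forward`, `w_hidden`, and `w_output` are assumptions, since this passage does not specify the network's structure.

```python
import math

def sigmoid(x):
    """Logistic activation; squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(features, w_hidden, w_output):
    """Convert a feature vector into impression degree values in (0, 1)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)))
              for row in w_hidden]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
            for row in w_output]

# Hypothetical sizes: 3 fluctuation features, 2 hidden neurons, 2 impression items.
w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
w_output = [[1.0, -1.0], [0.5, 0.5]]
impression = forward([0.2, 0.4, 0.6], w_hidden, w_output)
```

Each element of `impression` would correspond to one impression item; the connection weight values are what the learning step adjusts.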

The music database 15 is a large-capacity storage means such as an HDD. It stores the music data and bibliographic data compressed by the compression processing unit 12 in association with the feature data extracted by the feature data extraction unit 13.

The music mapping unit 16 maps music data onto a music map, which is a self-organizing map that has been trained in advance, based on the impression degree data input from the impression degree data conversion unit 14, and stores the music map with the mapped music data in the music map storage unit 17.
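One common way to map an item onto a trained self-organizing map is to place it on the neuron whose feature vector is closest to the item's input vector. The sketch below assumes that approach, a squared Euclidean distance, and a map stored as a nested list; the name `best_matching_neuron` and the sample values are hypothetical and not taken from the patent.

```python
def best_matching_neuron(impression, music_map):
    """Return (row, col) of the neuron whose feature vector is nearest."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(music_map):
        for c, vector in enumerate(row):
            # Squared Euclidean distance between input and neuron vector.
            dist = sum((a - b) ** 2 for a, b in zip(impression, vector))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

# Hypothetical 2x2 music map with two impression items per neuron.
music_map = [[[0.1, 0.1], [0.9, 0.1]],
             [[0.1, 0.9], [0.9, 0.9]]]
position = best_matching_neuron([0.8, 0.2], music_map)  # (0, 1)
```

Songs that land on the same or neighboring neurons would then be treated as having similar impressions.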

The music map storage unit 17 is a large-capacity storage means such as an HDD and stores the music map onto which music data has been mapped by the music mapping unit 16.

The music search unit 18 searches the music database 15 based on the impression degree data and bibliographic data input from the PC operation unit 19 and displays the search result on the PC display unit 20. It also searches the music map storage unit 17 based on a representative song selected with the PC operation unit 19 and displays the representative song search result on the PC display unit 20. In addition, the music search unit 18 outputs the music data selected with the PC operation unit 19 to the terminal device 30 via the search result output unit 21.
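A search keyed on impression degree data can be sketched as a nearest-neighbor ranking. The matching rule is not given in this passage, so the squared Euclidean distance, the record layout, and the names `search_by_impression`, `database`, and `limit` are all assumptions made for illustration; filtering on bibliographic data is omitted.

```python
def search_by_impression(database, query, limit=3):
    """Return the records whose impression data is closest to the query."""
    def distance(record):
        return sum((q - v) ** 2 for q, v in zip(query, record["impression"]))
    return sorted(database, key=distance)[:limit]

# Hypothetical records; each stores a title and two impression items.
database = [
    {"title": "Song A", "impression": [0.9, 0.1]},
    {"title": "Song B", "impression": [0.2, 0.8]},
    {"title": "Song C", "impression": [0.7, 0.3]},
]
results = search_by_impression(database, [1.0, 0.0], limit=2)
titles = [record["title"] for record in results]  # ["Song A", "Song C"]
```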

The PC operation unit 19 is an input means such as a keyboard or mouse. It is used to enter search conditions for searching the music data stored in the music database 15 and the music map storage unit 17 and to select the music data to be output to the terminal device 30. The PC operation unit 19 also accepts impression degree data that the user enters in response to the audio output from the audio output unit 24 and outputs the accepted impression degree data to the initial setting unit 22.

The PC display unit 20 is a display means such as a liquid crystal display. It displays the initial setting selection screen and the initial setting input screen, the mapping status of the music data stored in the music map storage unit 17, the search conditions for searching the music data stored in the music database 15 and the music map storage unit 17, and the music data found by the search (search results).

The search result output unit 21 can be connected to the search result input unit 31 of the terminal device 30 through a data transmission path such as USB. It outputs the music data found by the music search unit 18 and selected with the PC operation unit 19 to the search result input unit 31 of the terminal device 30.

The learning data storage unit 23 is a storage means such as a memory. It stores hierarchical neural networks that have been pre-learned by evaluators who are third parties (hereinafter referred to as pre-learned hierarchical neural networks) and the learning data used for the pre-learning (hereinafter referred to as pre-learning data). A plurality of pre-learned hierarchical neural networks are stored, each produced with a different evaluator and different pre-learning data. The learning data storage unit 23 also stores a plurality of learning maps, each pre-learned with a different evaluator and different pre-learning data. In addition, the learning data storage unit 23 stores, as initial setting samples, a set of initial setting music data collected at random regardless of genre and a plurality of sets of initial setting music data each collected from a single genre, together with the feature data of that initial setting music data. The initial setting music data in an initial setting sample need not be the whole song; the portion used to extract the feature data is sufficient.

When the power is turned on for the first time, the initial setting unit 22 displays the initial setting selection screen 49 on the PC display unit 20 and accepts input of personal information, a music genre, or an evaluator. Based on that input, it specifies the pre-learned hierarchical neural network that the hierarchical neural network learning unit 25 will use for training and the learning map to be stored in the music map storage unit 17, and outputs the specified pre-learned hierarchical neural network and learning map to the hierarchical neural network learning unit 25. The initial setting unit 22 also displays the initial setting input screen 50 on the PC display unit 20 and, based on instructions from the PC operation unit 19, outputs the initial setting music data in the initial setting samples stored in the learning data storage unit 23 to the audio output unit 24. It accepts the impression degree data that the user enters from the PC operation unit 19 for that music data, and outputs the pre-learning data used to train the pre-learned hierarchical neural network, the feature data in the initial setting samples, and the impression degree data entered by the user to the hierarchical neural network learning unit 25 as initial setting data.
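The selection of a pre-learned network from the user's input can be sketched as follows. The record layout, the personal information fields (`age_group`, `gender`), the match-counting score, and the name `select_network` are hypothetical; the text only says that the user's personal information is compared with the evaluators' or that a genre is used.

```python
def select_network(stored, user_info=None, genre=None):
    """Pick one stored pre-learned network by genre or by personal information."""
    if genre is not None:
        for entry in stored:
            if entry.get("genre") == genre:
                return entry
    if user_info is not None:
        def score(entry):
            evaluator = entry.get("evaluator", {})
            # Count the personal-information items that match the evaluator's.
            return sum(1 for key, value in user_info.items()
                       if evaluator.get(key) == value)
        return max(stored, key=score)
    return stored[0]  # Fallback when nothing matches (an assumption).

stored = [
    {"name": "net_a", "genre": "rock",
     "evaluator": {"age_group": "20s", "gender": "female"}},
    {"name": "net_b", "genre": "jazz",
     "evaluator": {"age_group": "40s", "gender": "male"}},
]
chosen = select_network(stored, user_info={"age_group": "40s", "gender": "male"})
```

Here `chosen` is the entry whose evaluator shares the most personal-information items with the user; passing `genre="rock"` instead would select `net_a` directly.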

The audio output unit 24 is an audio player that decompresses and plays back the initial setting music data input from the initial setting unit 22.

The hierarchical neural network learning unit 25 uses the error backpropagation learning method to train the pre-learned hierarchical neural network, that is, to update the connection weight value of each of its neurons, based on the initial setting data consisting of the pre-learning data, the feature data of the initial setting samples, and the impression degree data entered by the user. It outputs the updated connection weight values, that is, the trained hierarchical neural network, to the impression degree data conversion unit 14, and outputs the learning map specified by the initial setting unit 22 to the music map storage unit 17.
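The retraining step can be sketched as plain error backpropagation that starts from pre-learned weights and runs over the pre-learning data combined with the user's initial setting data. The network size, the absence of bias terms, the learning rate, the epoch count, and all sample values below are assumptions chosen to keep the example small.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_h, w_o):
    """Return hidden activations and outputs for one feature vector."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_h]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h))) for row in w_o]
    return h, y

def total_error(samples, w_h, w_o):
    """Sum of squared differences between outputs and teacher signals."""
    return sum((yk - tk) ** 2
               for x, t in samples
               for yk, tk in zip(forward(x, w_h, w_o)[1], t))

def train(samples, w_h, w_o, rate=0.5, epochs=200):
    """Update the connection weights in place by error backpropagation."""
    for _ in range(epochs):
        for x, t in samples:
            h, y = forward(x, w_h, w_o)
            # Error terms of the output layer (squared error, sigmoid units).
            d_o = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, t)]
            # Error terms of the hidden layer, propagated back through w_o.
            d_h = [hj * (1 - hj) * sum(d_o[k] * w_o[k][j] for k in range(len(w_o)))
                   for j, hj in enumerate(h)]
            for k, row in enumerate(w_o):
                for j in range(len(row)):
                    row[j] -= rate * d_o[k] * h[j]
            for j, row in enumerate(w_h):
                for i in range(len(row)):
                    row[i] -= rate * d_h[j] * x[i]

# Hypothetical pre-learned weights: 2 features, 2 hidden neurons, 1 impression item.
w_h = [[0.1, -0.1], [0.2, 0.1]]
w_o = [[0.1, -0.2]]
pre_learning_data = [([0.1, 0.9], [0.2]), ([0.9, 0.1], [0.8])]
user_data = [([0.5, 0.5], [0.9])]  # Features of a sample song plus the user's rating.
samples = pre_learning_data + user_data

before = total_error(samples, w_h, w_o)
train(samples, w_h, w_o)
after = total_error(samples, w_h, w_o)
```

Training on the combined set keeps what the evaluator taught the network while pulling its outputs toward the user's own ratings; comparing `before` and `after` shows whether the summed error fell.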

The terminal device 30 is an audio playback device, such as a portable audio player, that has a large-capacity storage means such as an HDD. Referring to FIG. 2, it comprises a search result input unit 31, a search result storage unit 32, a terminal operation unit 33, a terminal display unit 34, and an audio output unit 35.

The search result input unit 31 can be connected to the search result output unit 21 of the music search device 10 through a data transmission path such as USB, and stores the music data input from the search result output unit 21 of the music search device 10 in the search result storage unit 32.

The terminal operation unit 33 accepts inputs related to playback of music data, such as an instruction to select and play music data stored in the search result storage unit 32 and volume control input.

The terminal display unit 34 is a display means such as a liquid crystal display and displays the title of the song being played and various operation guidance.

The audio output unit 35 is an audio player that decompresses and plays back the music data stored in compressed form in the search result storage unit 32.

The neural network learning device 40 shown in FIG. 3 trains the hierarchical neural network used in the impression degree data conversion unit 14 and the music map used in the music mapping unit 16. Referring to FIG. 3, it comprises a music data input unit 41, an audio output unit 42, a feature data extraction unit 43, an impression degree data input unit 44, a connection weight value learning unit 45, a music map learning unit 46, a connection weight value output unit 47, and a feature vector output unit 48.

The music data input unit 41 has a function of reading a storage medium, such as a CD or DVD, on which music data is stored. It inputs music data from the storage medium and outputs it to the audio output unit 42 and the feature data extraction unit 43. Besides storage media such as CDs and DVDs, it may also be configured to input music data (distribution data) received via a network such as the Internet. When compressed music data is input, the music data input unit 41 decompresses it and outputs it to the audio output unit 42 and the feature data extraction unit 43.

The audio output unit 42 is an audio player that decompresses and plays back the music data input from the music data input unit 41.

The feature data extraction unit 43 extracts feature data consisting of fluctuation information from the music data input from the music data input unit 41 and outputs the extracted feature data to the connection weight value learning unit 45.

The impression degree data input unit 44 accepts impression degree data that the evaluator enters in response to the audio output from the audio output unit 42. It outputs the accepted impression degree data to the connection weight value learning unit 45 as the teacher signal used to train the hierarchical neural network, and to the music map learning unit 46 as the input vector to the self-organizing map.

The connection weight value learning unit 45 trains the hierarchical neural network based on the feature data input from the feature data extraction unit 43 and the impression degree data input from the impression degree data input unit 44, and updates the connection weight value of each neuron. It outputs the updated connection weight value w of each neuron and the pre-learning data used for the update (the feature data of the music data plus the impression degree data) to the learning data storage unit 23 through the connection weight value output unit 47.

The music map learning unit 46 trains the self-organizing map using the impression degree data input from the impression degree data input unit 44 as the input vector to the self-organizing map, updates the feature vector of each neuron, and outputs the updated feature vectors via the feature vector output unit 48. The trained self-organizing map (the updated feature vectors) is stored as the music map in the music map storage unit 17 of the music search device 10.
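Training a self-organizing map on impression degree vectors can be sketched with the standard update rule: find the nearest neuron, then move it and its grid neighbors toward the input. The 3x3 grid, the square neighborhood of radius 1, the linearly decaying learning rate, and the names `train_map` and `quantization_error` are assumptions rather than details from the patent.

```python
def nearest(vector, music_map):
    """Return (row, col) of the neuron closest to the input vector."""
    cells = [(r, c) for r in range(len(music_map))
             for c in range(len(music_map[0]))]
    return min(cells, key=lambda rc: sum(
        (a - b) ** 2 for a, b in zip(vector, music_map[rc[0]][rc[1]])))

def quantization_error(inputs, music_map):
    """Sum of squared distances from each input to its nearest neuron."""
    total = 0.0
    for vector in inputs:
        r, c = nearest(vector, music_map)
        total += sum((a - b) ** 2 for a, b in zip(vector, music_map[r][c]))
    return total

def train_map(music_map, inputs, epochs=50, rate=0.5, radius=1):
    """Update neuron feature vectors in place with a decaying learning rate."""
    for epoch in range(epochs):
        step = rate * (1.0 - epoch / epochs)
        for vector in inputs:
            br, bc = nearest(vector, music_map)
            for r, row in enumerate(music_map):
                for c, weights in enumerate(row):
                    # Only neurons within the square neighborhood are moved.
                    if abs(r - br) <= radius and abs(c - bc) <= radius:
                        for i in range(len(weights)):
                            weights[i] += step * (vector[i] - weights[i])

# Hypothetical untrained 3x3 map and two impression vectors from an evaluator.
music_map = [[[0.5, 0.5] for _ in range(3)] for _ in range(3)]
inputs = [[0.0, 0.0], [1.0, 1.0]]
before = quantization_error(inputs, music_map)
train_map(music_map, inputs)
after = quantization_error(inputs, music_map)
```

After training, different regions of the grid respond to different impressions, which is what allows similar-sounding songs to cluster on the music map.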

次に、本実施の形態の動作について図４乃至図１９を参照して詳細に説明する。 Next, the operation of the present embodiment will be described in detail with reference to FIGS. 4 to 19.
図４は、図１に示す階層型ニューラルネットワーク学習部における階層型ニューラルネットワークの学習アルゴリズムを説明するための説明図であり、図５は、図３に示すニューラルネットワーク学習装置における楽曲マップの学習アルゴリズムを説明するための説明図であり、 FIG. 4 is an explanatory diagram of the learning algorithm of the hierarchical neural network in the hierarchical neural network learning unit shown in FIG. 1, and FIG. 5 is an explanatory diagram of the music map learning algorithm in the neural network learning device shown in FIG. 3.
図６は、図３に示すニューラルネットワーク学習装置における階層型ニューラルネットワークの事前学習動作および楽曲マップの学習動作を説明するためのフローチャートであり、図７は、図１に示す学習データ記憶部の記憶内容を示す図であり、図８は、図１に示す階層型ニューラルネットワーク学習部において初期設定時に行われる階層型ニューラルネットワークの学習動作および学習マップの選択動作を説明するためのフローチャートであり、図９は、図１に示すＰＣ表示部に表示される初期設定選択画面例を示す図であり、図１０は、図１に示すＰＣ表示部に表示される初期設定入力画面例を示す図であり、図１１は、図１に示す楽曲検索装置における楽曲登録動作を説明するためのフローチャートであり、図１２は、図１に示す特徴データ抽出部における特徴データ抽出動作を説明するためのフローチャートであり、図１３は、図１に示す楽曲検索装置における楽曲検索動作を説明するためのフローチャートであり、図１４は、図１に示すＰＣ表示部に表示される検索画面例を示す図であり、図１５は、図１４に示す検索条件入力領域の表示例を示す図であり、図１６および図１７は、図１４に示す検索結果表示領域の表示例を示す図であり、図１８は、図１４に示す検索画面例に表示される全楽曲リスト表示領域例を示す図であり、図１９は、図１４に示す検索画面例に表示されるキーワード検索領域例を示す図である。 FIG. 6 is a flowchart of the pre-learning operation of the hierarchical neural network and the music map learning operation in the neural network learning device shown in FIG. 3; FIG. 7 shows the storage contents of the learning data storage unit shown in FIG. 1; FIG. 8 is a flowchart of the hierarchical neural network learning operation and the learning map selection operation performed at initial setting in the hierarchical neural network learning unit shown in FIG. 1; FIG. 9 shows an example of the initial setting selection screen displayed on the PC display unit shown in FIG. 1; FIG. 10 shows an example of the initial setting input screen displayed on the PC display unit shown in FIG. 1; FIG. 11 is a flowchart of the music registration operation in the music search device shown in FIG. 1; FIG. 12 is a flowchart of the feature data extraction operation in the feature data extraction unit shown in FIG. 1; FIG. 13 is a flowchart of the music search operation in the music search device shown in FIG. 1; FIG. 14 shows an example of the search screen displayed on the PC display unit shown in FIG. 1; FIG. 15 shows a display example of the search condition input area shown in FIG. 14; FIGS. 16 and 17 show display examples of the search result display area shown in FIG. 14; FIG. 18 shows an example of the all-music list display area displayed on the search screen shown in FIG. 14; and FIG. 19 shows an example of the keyword search area displayed on the search screen shown in FIG. 14.

本実施の形態の楽曲検索装置１０では、使用に先立って初期設定として、印象度データ変換部１４で用いられる階層型ニューラルネットワークの学習と、楽曲マップ記憶部１７に記憶される学習マップの選択とが行われる。 In the music search device 10 of the present embodiment, two initial settings are made prior to use: learning of the hierarchical neural network used by the impression degree data conversion unit 14, and selection of the learning map to be stored in the music map storage unit 17.

初期設定で行われる階層型ニューラルネットワークの学習は、学習データ記憶部２３に記憶されている事前学習済み階層型ニューラルネットワークおよび事前学習データに基づいて行われるものであり、学習データ記憶部２３には、異なる評価者および異なる事前学習データを用いて事前学習が施された複数の事前学習済み階層型ニューラルネットワークおよび事前学習データが記憶されており、初期設定では、学習データ記憶部２３に記憶されている複数の事前学習済み階層型ニューラルネットワークおよび事前学習データの中からいずれかの事前学習済み階層型ニューラルネットワークおよび事前学習データを選択し、選択した事前学習済み階層型ニューラルネットワークおよび事前学習データに基づいて階層型ニューラルネットワークの学習が行われる。 The learning of the hierarchical neural network performed in the initial setting is based on a pre-learned hierarchical neural network and pre-learning data stored in the learning data storage unit 23. The learning data storage unit 23 stores a plurality of pre-learned hierarchical neural networks, each pre-learned with a different evaluator and different pre-learning data, together with that pre-learning data. In the initial setting, one pre-learned hierarchical neural network and its pre-learning data are selected from among those stored in the learning data storage unit 23, and the hierarchical neural network is trained on the basis of the selected pre-learned hierarchical neural network and pre-learning data.

また、初期設定で行われる学習マップの選択は、学習データ記憶部２３に記憶されている複数の学習マップの中からいずれかを選択するものであり、学習データ記憶部２３に記憶されている複数の学習マップは、それぞれ異なる評価者および異なる事前学習データを用いて事前学習が施されている。 The selection of the learning map performed in the initial setting selects one of a plurality of learning maps stored in the learning data storage unit 23; each of these learning maps has been pre-learned using a different evaluator and different pre-learning data.

以下、第３者である評価者による階層型ニューラルネットワークおよび楽曲マップの事前学習について図４乃至図６を参照して詳細に説明する。 Hereinafter, the pre-learning of the hierarchical neural network and the music map by an evaluator, who is a third party, will be described in detail with reference to FIGS. 4 to 6.

印象度データ変換部１４で用いられる階層型ニューラルネットワークは、図４に示すように、入力層（第１層）、中間層（第ｎ層）および出力層（第Ｎ層）からなり、入力層（第１層）に特徴データを入力することによって、出力層（第Ｎ層）から印象度データを出力、すなわち特徴データを印象度データに変換し、出力層（第Ｎ層）から出力するものであり、中間層（第ｎ層）の各ニューロンの結合重み値ｗを事前学習する。なお、特徴データは、特徴データ抽出部１３によって楽曲データから抽出される８項目からなるデータであり、入力層（第１層）のニューロン数Ｌ_１は、８個となっている（特徴データ抽出部１３による特徴データの抽出方法については、後述する）。また、印象度データは、人間の感性によって判断される（明るい、暗い）、（重い、軽い）、（かたい、やわらかい）、（安定、不安定）、（澄んだ、にごった）、（滑らか、歯切れの良い）、（激しい、穏やか）、（厚い、薄い）の８項目がそれぞれ７段階評価で表されたデータであり、出力層（第Ｎ層）のニューロン数Ｌ_Ｎは、８個となっている。中間層（第ｎ層：ｎ＝２，…，Ｎ−１）のニューロン数Ｌ_ｎは、適宜設定されている。 As shown in FIG. 4, the hierarchical neural network used by the impression degree data conversion unit 14 consists of an input layer (first layer), intermediate layers (n-th layer), and an output layer (N-th layer). Feature data input to the input layer (first layer) is converted into impression degree data, which is output from the output layer (N-th layer); the connection weight value w of each neuron in the intermediate layers (n-th layer) is pre-learned. The feature data consists of eight items extracted from the music data by the feature data extraction unit 13, so the number of neurons L_1 in the input layer (first layer) is eight (the method by which the feature data extraction unit 13 extracts feature data is described later). The impression degree data expresses each of eight items judged by human sensibility, namely (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, muddy), (smooth, crisp), (intense, calm), and (thick, thin), on a seven-level scale, so the number of neurons L_N in the output layer (N-th layer) is also eight. The number of neurons L_n in each intermediate layer (n-th layer: n = 2, ..., N−1) is set as appropriate.

また、楽曲マップ記憶部１７に記憶される楽曲マップは、図５に示すように、ニューロンが２次元に規則的に配置（図５に示す例では、９・９の正方形）されている自己組織化マップ（ＳＯＭ）であり、教師信号を必要としない学習ニューラルネットワークで、入力パターン群をその類似度に応じて分類する能力を自律的に獲得して行くニューラルネットワークである。なお、本実施の形態では、ニューロンが１００・１００の正方形に配列された２次元ＳＯＭを使用したが、ニューロンの配列は、正方形であっても、蜂の巣であっても良い。楽曲マップの各ニューロンには、ｎ次元の特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎが内包されており、事前学習データによる事前学習によって各ニューロンに内包された特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎが学習される。 As shown in FIG. 5, the music map stored in the music map storage unit 17 is a self-organizing map (SOM) in which neurons are arranged regularly in two dimensions (a 9 × 9 square in the example of FIG. 5). A SOM is a neural network that learns without a teacher signal and autonomously acquires the ability to classify a group of input patterns according to their similarity. The present embodiment uses a two-dimensional SOM with neurons arranged in a 100 × 100 square, but the neurons may be arranged in either a square or a honeycomb pattern. Each neuron of the music map contains an n-dimensional feature vector m_i(t) ∈ R^n, and this feature vector is learned through pre-learning with the pre-learning data.

評価者による階層型ニューラルネットワーク（結合重み値ｗ）および楽曲マップ（特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎ）の事前学習は、図３に示すニューラルネットワーク学習装置４０を用いて行われ、まず、階層型ニューラルネットワーク（結合重み値ｗ）および楽曲マップ（特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎ）を事前学習させるための事前学習データ（楽曲データの特徴データ＋印象度データ）の入力が行われる。 Pre-learning of the hierarchical neural network (connection weight values w) and the music map (feature vectors m_i(t) ∈ R^n) by an evaluator is performed using the neural network learning device 40 shown in FIG. 3. First, the pre-learning data (feature data of the music data + impression degree data) used to pre-learn the hierarchical neural network (connection weight values w) and the music map (feature vectors m_i(t) ∈ R^n) is input.

楽曲データ入力部４１にＣＤ、ＤＶＤ等の楽曲データが記憶されている記憶媒体をセットし、楽曲データ入力部４１から楽曲データを入力し（ステップＡ１）、特徴データ抽出部４３は、楽曲データ入力部４１から入力された楽曲データから、ゆらぎ情報からなる特徴データを抽出する（ステップＡ２）。 A storage medium holding music data, such as a CD or DVD, is set in the music data input unit 41, and music data is input from the music data input unit 41 (step A1). The feature data extraction unit 43 extracts feature data consisting of fluctuation information from the music data input from the music data input unit 41 (step A2).

また、音声出力部４２は、楽曲データ入力部４１から入力された楽曲データを音声出力し（ステップＡ３）、評価者は、音声出力部４２からの音声出力を聞くことによって、楽曲の印象度を感性によって評価し、評価結果を印象度データとして印象度データ入力部４４から入力し（ステップＡ４）、結合重み値学習部４５は、印象度データ入力部４４から入力された印象度データを教師信号として受け付ける。なお、本実施の形態では、印象度の評価項目としては、人間の感性によって判断される（明るい、暗い）、（重い、軽い）、（かたい、やわらかい）、（安定、不安定）、（澄んだ、にごった）、（滑らか、歯切れの良い）、（激しい、穏やか）、（厚い、薄い）の８項目を設定し、各項目についての７段階評価を印象度データとして印象度データ入力部４４で受け付けるように構成した。 The audio output unit 42 outputs the music data input from the music data input unit 41 as audio (step A3). The evaluator listens to the audio output from the audio output unit 42, evaluates the impression of the music according to his or her own sensibility, and enters the evaluation result as impression degree data through the impression degree data input unit 44 (step A4). The connection weight value learning unit 45 accepts the impression degree data input from the impression degree data input unit 44 as a teacher signal. In the present embodiment, eight impression evaluation items judged by human sensibility are set, namely (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, muddy), (smooth, crisp), (intense, calm), and (thick, thin), and the impression degree data input unit 44 is configured to accept a seven-level rating for each item as the impression degree data.
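The eight-item, seven-level impression degree data described above can be sketched as a small validation helper. This is only an illustration: the integer coding 1 to 7, the scaling to [0, 1], and the names `IMPRESSION_ITEMS` and `encode_impression` are assumptions, since the text states only that each item is rated on seven levels.

```python
# Item names follow the eight evaluation pairs listed in the text.
IMPRESSION_ITEMS = (
    "bright/dark", "heavy/light", "hard/soft", "stable/unstable",
    "clear/muddy", "smooth/crisp", "intense/calm", "thick/thin",
)
LEVELS = 7  # seven-level rating per item (from the text)


def encode_impression(ratings):
    """Validate eight 1..7 ratings and scale them to [0, 1].

    The 1..7 integer coding and the [0, 1] scaling are assumptions made
    for this sketch; they are not specified in the source.
    """
    if len(ratings) != len(IMPRESSION_ITEMS):
        raise ValueError("expected one rating per impression item")
    for r in ratings:
        if not (isinstance(r, int) and 1 <= r <= LEVELS):
            raise ValueError(f"rating out of range: {r!r}")
    return [(r - 1) / (LEVELS - 1) for r in ratings]


# One evaluator's ratings for one song, usable as a teacher signal.
teacher = encode_impression([7, 1, 4, 4, 6, 2, 3, 5])
```

Scaling to [0, 1] is convenient if the output neurons use a sigmoid activation, which is itself an assumption here.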

次に、入力された事前学習データが予め定められたサンプル数Ｔ_１に達したか否かを判断し（ステップＡ５）、入力された事前学習データがサンプル数Ｔ_１に達するまでステップＡ１〜Ａ４の動作が繰り返される。 Next, it is determined whether the input pre-learning data has reached a predetermined number of samples T_1 (step A5), and the operations of steps A1 to A4 are repeated until the input pre-learning data reaches the number of samples T_1.

事前学習データが予め定められたサンプル数Ｔ_１に達すると、事前学習データに基づいて、結合重み値学習部４５によって階層型ニューラルネットワークの学習が（結合重み値ｗの更新）行われると共に（ステップＡ６）、事前学習データの内の印象度データ入力部４４から入力された印象度データに基づいて、楽曲マップ学習部４６によって、学習マップの学習（特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎの更新）が行われる（ステップＡ９）。 When the pre-learning data reaches the predetermined number of samples T_1, the connection weight value learning unit 45 trains the hierarchical neural network (updates the connection weight values w) on the basis of the pre-learning data (step A6), and the music map learning unit 46 trains the learning map (updates the feature vectors m_i(t) ∈ R^n) on the basis of the impression degree data in the pre-learning data that was input from the impression degree data input unit 44 (step A9).

ステップＡ６で行われる階層型ニューラルネットワークの学習、すなわち各ニューロンの結合重み値ｗの更新は、誤差逆伝播学習法を用いて行う。 The learning of the hierarchical neural network performed in step A6, that is, the update of the connection weight value w of each neuron, uses the error backpropagation learning method.
まず、初期値として、中間層（第ｎ層）の全てのニューロンの結合重み値ｗを乱数によって−０．１〜０．１程度の範囲の小さな値に設定しておき、結合重み値学習部４５は、特徴データ抽出部４３によって抽出された特徴データを入力信号 x_ｊ(ｊ＝１，２，…，８) として入力層（第１層）に入力し、入力層（第１層）から出力層（第Ｎ層）に向けて、各ニューロンの出力を計算する。 First, as initial values, the connection weight values w of all neurons in the intermediate layers (n-th layer) are set by random numbers to small values in the range of about −0.1 to 0.1. The connection weight value learning unit 45 then inputs the feature data extracted by the feature data extraction unit 43 to the input layer (first layer) as input signals x_j (j = 1, 2, ..., 8) and calculates the output of each neuron from the input layer (first layer) toward the output layer (N-th layer).
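The initialization and forward calculation described above can be sketched as follows. The hidden-layer size `N_HIDDEN`, the sigmoid activation, the absence of bias terms, and the sample feature values are all assumptions made for illustration; the text specifies only the eight inputs, the eight outputs, and the approximate weight range.

```python
import math
import random

N_INPUT = 8    # feature data items (from the text)
N_OUTPUT = 8   # impression items (from the text)
N_HIDDEN = 12  # hypothetical; the text only says "set as appropriate"


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(sizes, rng, scale=0.1):
    """One matrix per layer transition, entries drawn from [-scale, scale]."""
    return [
        [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(weights, x):
    """Return the outputs of every layer, input layer first."""
    outs = [list(x)]
    for w in weights:
        prev = outs[-1]
        outs.append([sigmoid(sum(a * b for a, b in zip(row, prev))) for row in w])
    return outs


rng = random.Random(0)
weights = init_weights([N_INPUT, N_HIDDEN, N_OUTPUT], rng)
features = [0.3, 0.8, 0.1, 0.5, 0.9, 0.2, 0.6, 0.4]  # made-up feature data
layer_outputs = forward(weights, features)
impression = layer_outputs[-1]  # eight values, one per impression item
```

Keeping the output of every layer, rather than only the last, is a design choice that makes a later backpropagation pass straightforward.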

次に、結合重み値学習部４５は、印象度データ入力部４４から入力された印象度データを教師信号ｙ_ｊ(ｊ＝１，２，…，８) とし、出力層（第Ｎ層）の出力out_j ^Ｎと、教師信号ｙ_ｊとの誤差から、学習則δ_j ^Ｎを次式によって計算する。 Next, the connection weight value learning unit 45 takes the impression degree data input from the impression degree data input unit 44 as teacher signals y_j (j = 1, 2, ..., 8) and, from the error between the output out_j^N of the output layer (N-th layer) and the teacher signal y_j, calculates the learning rule δ_j^N by the following equation.
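Equation 1 itself does not appear in this text. Assuming sigmoid output units and a squared-error criterion (neither is stated explicitly here), the conventional backpropagation form of the output-layer term would be the following; it is a sketch of the standard rule, not a confirmed copy of the patent's formula, and the sign convention is an assumption.

```latex
\delta_j^{N} = -\left(out_j^{N} - y_j\right)\, out_j^{N}\left(1 - out_j^{N}\right)
```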

次に、結合重み値学習部４５は、学習則δ_j ^Ｎを使って、中間層（第ｎ層）の誤差信号 δ_j ⁿ を次式によって計算する。 Next, the connection weight value learning unit 45 uses the learning rule δ_j^N to calculate the error signal δ_j^n of the intermediate layer (n-th layer) by the following equation.

なお、数式２において、ｗは、第 n 層 j 番目と第 n -1 層ｋ番目のニューロンの間の結合重み値を表している。 In Equation 2, w represents a connection weight value between the n-th layer j-th neuron and the (n −1) -th layer k-th neuron.
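Equation 2 is likewise not reproduced in this text. Under the same assumptions (sigmoid units, and weights indexed with the upper layer first as in the definition of w above), the standard backpropagated error signal for the n-th layer would take the following form, where L_{n+1} is the number of neurons in layer n+1; this is a hedged reconstruction only.

```latex
\delta_j^{n} = \left( \sum_{k=1}^{L_{n+1}} \delta_k^{n+1}\, w_{k,j}^{\,n+1,n} \right) out_j^{n}\left(1 - out_j^{n}\right)
```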

次に、結合重み値学習部４５は、中間層（第ｎ層）の誤差信号 δ_j ⁿ を用いて各ニューロンの結合重み値ｗの変化量Δｗを次式によって計算し、各ニューロンの結合重み値ｗを更新する（ステップＡ６）。なお、次式において、ηは、学習率を表し、評価者による事前学習では、η_１(0＜η_１≦1)に設定されている。 Next, the connection weight value learning unit 45 calculates the amount of change Δw of the connection weight value w of each neuron using the following equation using the error signal δ _j ⁿ of the intermediate layer (nth layer), and the connection weight of each neuron. The value w is updated (step A6). In the following equation, η represents a learning rate, and is set to η ₁ (0 <η ₁ ≦ 1) in the prior learning by the evaluator.
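A single update of the kind described above can be sketched as follows. The sketch assumes sigmoid units without bias terms, a squared error with a factor of one half, and the update Δw = η · δ_j^n · out_k^{n−1}; the equations themselves are not reproduced in this text, so these are the conventional backpropagation forms rather than the patent's confirmed formulas. The hidden size and the sample values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(weights, x):
    """Return the outputs of every layer, input layer first."""
    outs = [list(x)]
    for w in weights:
        prev = outs[-1]
        outs.append([sigmoid(sum(a * b for a, b in zip(row, prev))) for row in w])
    return outs


def backprop_step(weights, x, y, eta):
    """Update weights in place for one (feature, impression) pair."""
    outs = forward(weights, x)
    # Output-layer error signal (assumed sigmoid derivative out * (1 - out)).
    delta = [-(o - t) * o * (1.0 - o) for o, t in zip(outs[-1], y)]
    for layer in range(len(weights) - 1, -1, -1):
        w = weights[layer]
        prev = outs[layer]
        # Error signal for the layer below, computed before w is modified.
        if layer > 0:
            below = [
                sum(delta[j] * w[j][k] for j in range(len(w))) * prev[k] * (1.0 - prev[k])
                for k in range(len(prev))
            ]
        # delta_w = eta * delta_j * (output k of the previous layer).
        for j, row in enumerate(w):
            for k in range(len(row)):
                row[k] += eta * delta[j] * prev[k]
        if layer > 0:
            delta = below
    return outs[-1]


def squared_error(out, y):
    """Assumed form of E: half the sum of squared output errors."""
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))


rng = random.Random(1)
sizes = [8, 6, 8]  # hidden size 6 is hypothetical
weights = [
    [[rng.uniform(-0.1, 0.1) for _ in range(sizes[i])] for _ in range(sizes[i + 1])]
    for i in range(len(sizes) - 1)
]
x = [0.2, 0.4, 0.1, 0.9, 0.5, 0.3, 0.7, 0.6]  # made-up feature data
y = [1.0, 0.0, 0.5, 0.5, 1.0, 0.0, 0.5, 0.5]  # made-up teacher signal
e_before = squared_error(forward(weights, x)[-1], y)
for _ in range(200):
    backprop_step(weights, x, y, eta=0.5)
e_after = squared_error(forward(weights, x)[-1], y)
```

The learning rate `eta=0.5` is an arbitrary value inside the stated range 0 < η_1 ≤ 1.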

ステップＡ６では、サンプル数Ｔ_１の事前学習データのそれぞれについて学習が行われ、次に、次式に示す２乗誤差Ｅが予め定められた事前学習用の基準値Ｅ_１よりも小さいか否かが判断され（ステップＡ７）、２乗誤差Ｅが基準値Ｅ_１よりも小さくなるまでステップＡ６の動作が繰り返される。なお、２乗誤差Ｅが基準値Ｅ_１よりも小さくなると想定される学習反復回数Ｓを予め設定しておき、ステップＡ６の動作をＳ回繰り返すようにしても良い。 In step A6, learning is performed for each of the T_1 samples of pre-learning data. Next, it is determined whether the squared error E given by the following equation is smaller than a predetermined reference value E_1 for pre-learning (step A7), and the operation of step A6 is repeated until the squared error E becomes smaller than the reference value E_1. Alternatively, a number of learning iterations S at which the squared error E is expected to fall below the reference value E_1 may be set in advance, and the operation of step A6 repeated S times.
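The stopping logic of steps A6 and A7 can be sketched as a loop. The function name `train_until` and the toy epoch function are hypothetical, and combining the error threshold and the iteration cap in one loop is a choice made for this sketch; the text presents them as alternatives.

```python
def train_until(step_epoch, e1, max_iters):
    """Repeat step_epoch() until its returned error E falls below e1.

    max_iters plays the role of the optional preset iteration count S.
    """
    error = float("inf")
    iters = 0
    while error >= e1 and iters < max_iters:
        error = step_epoch()
        iters += 1
    return error, iters


# Toy stand-in for step A6: the error halves on every pass.
state = {"e": 1.0}


def toy_epoch():
    state["e"] *= 0.5
    return state["e"]


final_error, n_iters = train_until(toy_epoch, e1=0.01, max_iters=100)
```

In a real run, `step_epoch` would perform one pass over all T_1 samples and return the resulting squared error E.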

ステップＡ７で２乗誤差Ｅが基準値Ｅ_１よりも小さいと判断された場合には、結合重み値学習部４５は、事前学習させた各ニューロンの結合重み値ｗ、すなわち事前学習済み階層型ニューラルネットワークと、事前学習に用いた事前学習データ（楽曲データの特徴データ＋印象度データ）とを結合重み値出力部４７によって出力させ、出力された事前学習済み階層型ニューラルネットワークおよび事前学習データを、学習データ記憶部２３に記憶させる（ステップＡ８）。 When it is determined in step A7 that the squared error E is smaller than the reference value E_1, the connection weight value learning unit 45 causes the connection weight value output unit 47 to output the pre-learned connection weight value w of each neuron, that is, the pre-learned hierarchical neural network, together with the pre-learning data (feature data of the music data + impression degree data) used for the pre-learning, and the output pre-learned hierarchical neural network and pre-learning data are stored in the learning data storage unit 23 (step A8).

ステップＡ９で行われる楽曲マップの学習は、楽曲マップ学習部４６において、印象度データ入力部４４から入力された印象度データを入力ベクトルｘ_ｊ（ｔ）∈Ｒ^ｎとし、各ニューロンの特徴ベクトルｍ_ｉ（ｔ）∈Ｒ^ｎを学習させる。なお、ｔは、学習回数を表し、学習回数を定める設定値Ｔを予め設定しておき、学習回数ｔ＝０，１，…，Ｔについて学習を行わせる。なお、Ｒは、各印象度項目の評価段階を示し、ｎは、印象度データの項目数を示す。 In the music map learning performed in step A9, the music map learning unit 46 uses the impression degree data input from the impression degree data input unit 44 as an input vector x _j (t) εR ^n, and the feature vector m of each neuron. _i (t) ∈ R ⁿ is learned. Note that t represents the number of learning times, a preset value T that determines the number of learning times is set in advance, and learning is performed for the learning number t = 0, 1,. Note that R indicates the evaluation stage of each impression degree item, and n indicates the number of items of impression degree data.

まず、初期値として、楽曲マップを構成する全てのニューロンの特徴ベクトルｍ_ｃ（０）をそれぞれ０〜１の範囲でランダムに設定しておき、楽曲マップ学習部４６は、ｘ_ｊ（ｔ）に最も近いニューロンｃ、すなわち‖ｘ_ｊ（ｔ）−ｍ_ｃ（ｔ）‖を最小にする勝者ニューロンｃを求め、勝者ニューロンｃの特徴ベクトルｍ_ｃ（ｔ）と、勝者ニューロンｃの近傍にある近傍ニューロンｉの集合Ｎ_ｃのそれぞれの特徴ベクトルｍ_ｉ（ｔ）（ｉ∈Ｎ_ｃ）とを、次式に従ってそれぞれ更新する（ステップＡ９）。なお、近傍ニューロンｉを決定するための近傍半径は、予め設定されているものとする。 First, as initial values, the feature vectors m_c(0) of all neurons constituting the music map are set randomly in the range 0 to 1. The music map learning unit 46 finds the neuron c closest to x_j(t), that is, the winner neuron c that minimizes ‖x_j(t) − m_c(t)‖, and updates the feature vector m_c(t) of the winner neuron c and the feature vector m_i(t) (i ∈ N_c) of each neuron in the set N_c of neighboring neurons i around the winner neuron c according to the following equation (step A9). The neighborhood radius used to determine the neighboring neurons i is assumed to be set in advance.

なお、数式５において、ｈ_ｃｉ（ｔ）は、学習率を表し、次式によって求められる。 In Equation 5, h _ci (t) represents a learning rate and is obtained by the following equation.

なお、α_initは、学習率の初期値であり、Ｒ^２（ｔ）は、単調減少する一次関数もしくは指数関数が用いられる。 Here, α_init is the initial value of the learning rate, and R^2(t) is a monotonically decreasing linear or exponential function.
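The winner search and neighborhood update can be sketched as follows. Equations 5 and 6 are not reproduced in this text, so the sketch assumes the conventional Kohonen update m_i(t+1) = m_i(t) + h_ci(t)·(x(t) − m_i(t)) with a Gaussian neighborhood over grid positions and linearly decaying learning rate and radius; `alpha_init=0.5`, `radius0=3.0`, and the sample input are hypothetical values.

```python
import math
import random


def som_step(grid, x, t, total_steps, alpha_init=0.5, radius0=3.0):
    """One SOM update: find the winner for x, then pull its neighbors toward x."""
    rows, cols = len(grid), len(grid[0])
    # Winner c minimises the Euclidean distance ||x - m_c||.
    winner = min(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: math.dist(x, grid[rc[0]][rc[1]]),
    )
    frac = 1.0 - t / total_steps        # linear decay (assumed)
    radius = max(radius0 * frac, 1e-9)  # shrinking neighborhood radius R(t)
    for r in range(rows):
        for c in range(cols):
            d2 = (r - winner[0]) ** 2 + (c - winner[1]) ** 2
            if d2 > radius ** 2:
                continue                # outside the neighborhood set N_c
            h = alpha_init * frac * math.exp(-d2 / radius ** 2)
            m = grid[r][c]
            for k in range(len(m)):
                m[k] += h * (x[k] - m[k])
    return winner


rng = random.Random(0)
DIM = 8  # one component per impression item
# 9 x 9 map as in the FIG. 5 example, initialised randomly in [0, 1).
grid = [[[rng.random() for _ in range(DIM)] for _ in range(9)] for _ in range(9)]
x = [1.0, 0.0, 0.5, 0.5, 1.0, 0.0, 0.5, 0.5]  # made-up impression vector
before = min(math.dist(x, m) for row in grid for m in row)
T = 50
for t in range(T):
    som_step(grid, x, t, T)
after = min(math.dist(x, m) for row in grid for m in row)
```

`math.dist` requires Python 3.8 or later.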

次に、楽曲マップ学習部４６は、学習回数ｔが設定値Ｔに達したか否かを判断し（ステップＡ１０）、学習回数ｔが設定値Ｔに達するまでステップＡ９〜ステップＡ１０の処理動作を繰り返し、学習回数ｔが設定値Ｔに達すると、特徴ベクトル出力部４８を介して学習させた特徴ベクトルｍ_ｉ（Ｔ）∈Ｒ^ｎを出力させ（ステップＡ１１）、出力された各ニューロンｉの特徴ベクトルｍ_ｉ（Ｔ）を、楽曲検索装置１０の学習データ記憶部２３に事前学習が施された楽曲マップとして記憶する。 Next, the music map learning unit 46 determines whether or not the learning count t has reached the set value T (step A10), and performs the processing operations of step A9 to step A10 until the learning count t reaches the set value T. Repeatedly, when the learning count t reaches the set value T, the feature vector m _i (T) ∈R ⁿ learned through the feature vector output unit 48 is outputted (step A11), and the feature of each neuron i outputted is outputted. The vector m _i (T) is stored in the learning data storage unit 23 of the music search device 10 as a music map that has been pre-learned.

上述のステップＡ１〜Ａ１１の動作は、異なる評価者および異なる事前学習データについてそれぞれ行われ、学習データ記憶部２３には、図７に示すように複数の事前学習済み階層型ニューラルネットワークおよび事前学習データと、複数の楽曲マップが記憶されることになる。図７に示す例では、学習データ記憶部２３に、評価者Ａ〜Ｅがランダムに揃えられたサンプル数Ｔ_１の楽曲データに基づいてそれぞれ学習させた♯１〜♯５の事前学習済み階層型ニューラルネットワーク、事前学習データおよび楽曲マップが記憶されていると共に、評価者Ｆがそれぞれジャンル毎に揃えられたサンプル数Ｔ_１の楽曲データに基づいてそれぞれ学習させた♯６〜♯１０の事前学習済み階層型ニューラルネットワーク、事前学習データおよび楽曲マップが記憶されている。 The operations of steps A1 to A11 described above are performed for each of different evaluators and different pre-learning data, so that the learning data storage unit 23 stores a plurality of pre-learned hierarchical neural networks and pre-learning data, and a plurality of music maps, as shown in FIG. 7. In the example of FIG. 7, the learning data storage unit 23 stores pre-learned hierarchical neural networks #1 to #5, pre-learning data, and music maps that evaluators A to E each trained on T_1 samples of randomly assembled music data, and also stores pre-learned hierarchical neural networks #6 to #10, pre-learning data, and music maps that evaluator F trained on T_1 samples of music data assembled for each genre.

また、学習データ記憶部２３には、図７に示すように、評価者Ａ〜Ｅのそれぞれのパーソナル情報が、評価者Ａ〜Ｅがランダムに揃えられたサンプル数Ｔ_１の楽曲データに基づいてそれぞれ学習させた♯１〜♯５の事前学習済み階層型ニューラルネットワーク、事前学習データおよび楽曲マップに対応して記憶されている。パーソナル情報は、血液型、性別、年齢、星座、性格（短気、情に弱い、気長・・）等の項目からなる評価者の固有情報を示すものである。 As shown in FIG. 7, the learning data storage unit 23 also stores the personal information of each of evaluators A to E in association with the pre-learned hierarchical neural networks #1 to #5, pre-learning data, and music maps that evaluators A to E each trained on T_1 samples of randomly assembled music data. The personal information is information specific to an evaluator and consists of items such as blood type, gender, age, zodiac sign, and personality (short-tempered, soft-hearted, patient, and so on).

さらに、学習データ記憶部２３には、図７に示すように、ジャンルに関係なくランダムに揃えられたサンプル数Ｔ_２の初期設定用楽曲データ♯１と、ジャンル（ポップス、ジャズ、ロック、カントリーミュージック、演歌）を統一してそれぞれ揃えられたサンプル数Ｔ_２の初期設定用楽曲データ♯２〜♯６とが、その初期設定用楽曲データの特徴データと共に初期設定用サンプルとして記憶されている。 Furthermore, as shown in FIG. 7, the learning data storage unit 23 stores, as initial setting samples, initial setting music data #1 consisting of T_2 samples assembled randomly regardless of genre, and initial setting music data #2 to #6, each consisting of T_2 samples assembled within a single genre (pops, jazz, rock, country music, enka), together with the feature data of that initial setting music data.

次に、階層型ニューラルネットワーク学習部２５において初期設定時に行われる階層型ニューラルネットワークの学習動作および楽曲マップの選択動作について図８乃至図１０を参照して詳細に説明する。 Next, the learning operation of the hierarchical neural network and the music map selection operation performed at initial setting in the hierarchical neural network learning unit 25 will be described in detail with reference to FIGS. 8 to 10.

最初に電源が投入されると、初期設定部２２は、初期設定選択画面４９をＰＣ表示部２０に表示させる（ステップＢ１）。初期設定選択画面４９は、パーソナル情報、楽曲のジャンルもしくは評価者によって、学習データ記憶部２３に記憶されている複数の事前学習済み階層型ニューラルネットワークおよび事前学習データの中からいずれかの組を選択すると共に、学習データ記憶部２３に記憶されている複数の楽曲マップの中からいずれかを選択するための画面であり、図９に示すように、パーソナル情報によって選択するために、ユーザのパーソナル情報（血液型、性別、年齢、星座、性格）を入力するパーソナル情報入力欄４９１と、楽曲のジャンルによって選択するために、楽曲のジャンルを入力するジャンル入力欄４９２と、評価者によって選択するために、希望の評価者を入力する評価者入力欄４９３と、選択の実行を指示する選択実行ボタン４９４とからなる。 When the power is turned on for the first time, the initial setting unit 22 displays the initial setting selection screen 49 on the PC display unit 20 (step B1). The initial setting selection screen 49 is a screen for selecting, by personal information, music genre, or evaluator, one set from among the plurality of pre-learned hierarchical neural networks and pre-learning data stored in the learning data storage unit 23, and for selecting one of the plurality of music maps stored in the learning data storage unit 23. As shown in FIG. 9, it consists of a personal information input field 491 for entering the user's personal information (blood type, gender, age, zodiac sign, personality) in order to select by personal information, a genre input field 492 for entering a music genre in order to select by genre, an evaluator input field 493 for entering a desired evaluator in order to select by evaluator, and a selection execution button 494 for instructing execution of the selection.

ユーザによってパーソナル情報、楽曲のジャンルもしくは評価者が入力され、選択実行ボタン４９４がクリックされると、初期設定部２２は、パーソナル情報による選択であるか否か、すなわちパーソナル情報入力欄４９１にユーザのパーソナル情報が入力された上で選択実行ボタン４９４がクリックされたかを判断し（ステップＢ２）、パーソナル情報による選択である場合には、ユーザのパーソナル情報と評価者のパーソナル情報とを比較し（ステップＢ３）、ユーザのパーソナル情報に最も近い評価者が学習させた事前学習済み階層型ニューラルネットワークおよび学習マップを特定する（ステップＢ４）。すなわち、図７に示すように、評価者Ａ〜Ｅのパーソナル情報ＰＡ〜ＰＥが学習データ記憶部２３に記憶されており、ユーザのパーソナル情報とパーソナル情報ＰＡ〜ＰＥとを比較した結果、ユーザのパーソナル情報にパーソナル情報ＰＡが最も近い場合には、評価者Ａが学習させた事前学習済み階層型ニューラルネットワーク♯１および学習マップ♯１が特定される。なお、ユーザのパーソナル情報と評価者のパーソナル情報とを比較は、例えば、ユーザのパーソナル情報と評価者のパーソナル情報とをベクトル化し、両者のユークリッド距離が最も小さい評価者のパーソナル情報を最も近いパーソナルデータとすると良い。また、本実施の形態では、パーソナル情報の性格を直接入力するように構成したが、性格診断アンケートを用意し、性格診断アンケートに対するユーザの回答に基づいて性格を特定するようにしても良い。 When the user enters personal information, a music genre, or an evaluator and clicks the selection execution button 494, the initial setting unit 22 determines whether the selection is by personal information, that is, whether the selection execution button 494 was clicked after the user's personal information had been entered in the personal information input field 491 (step B2). If the selection is by personal information, the initial setting unit 22 compares the user's personal information with the personal information of the evaluators (step B3) and identifies the pre-learned hierarchical neural network and learning map trained by the evaluator whose personal information is closest to the user's (step B4). That is, as shown in FIG. 7, the personal information PA to PE of evaluators A to E is stored in the learning data storage unit 23, and if comparing the user's personal information with the personal information PA to PE shows that the personal information PA is closest to the user's, the pre-learned hierarchical neural network #1 and learning map #1 trained by evaluator A are identified. The comparison may be made, for example, by vectorizing the user's personal information and each evaluator's personal information and treating the evaluator's personal information with the smallest Euclidean distance as the closest. In the present embodiment, the personality item of the personal information is entered directly, but a personality diagnosis questionnaire may instead be prepared and the personality identified from the user's answers to it.
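The Euclidean-distance comparison suggested above can be sketched as follows. How the personal information items are turned into numbers is not specified in the text, so the five-component encoding, the sample values, and the function name `nearest_evaluator` are purely hypothetical.

```python
import math


def nearest_evaluator(user_vec, evaluator_vecs):
    """Return the evaluator ID whose vector has the smallest Euclidean distance."""
    return min(evaluator_vecs, key=lambda eid: math.dist(user_vec, evaluator_vecs[eid]))


# Hypothetical encoding: [blood type, gender, age / 100, zodiac sign / 12, personality].
evaluators = {
    "A": [0.0, 1.0, 0.25, 0.50, 0.0],
    "B": [1.0, 0.0, 0.40, 0.25, 1.0],
    "C": [0.0, 0.0, 0.60, 0.75, 0.5],
}
user = [0.0, 1.0, 0.30, 0.50, 0.0]
chosen = nearest_evaluator(user, evaluators)
```

Categorical items such as blood type would in practice need an encoding in which distance is meaningful (one-hot, for instance); the single-number coding here is only to keep the sketch short.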

次に、初期設定部２２は、初期設定入力画面５０をＰＣ表示部２０に表示させる（ステップＢ５）。初期設定入力画面５０は、図１０に示すように、印象度データ、すなわち（明るい、暗い）、（重い、軽い）、（かたい、やわらかい）、（安定、不安定）、（澄んだ、にごった）、（滑らか、歯切れの良い）、（激しい、穏やか）、（厚い、薄い）の８項目についての７段階評価をそれぞれ入力する印象度データ入力領域５１と、初期設定の開始を指示する初期設定開始ボタン５２と、印象度データ入力領域５１に入力した印象度データの入力を確定させる入力確定ボタン５３とからなる。 Next, the initial setting unit 22 displays the initial setting input screen 50 on the PC display unit 20 (step B5). As shown in FIG. 10, the initial setting input screen 50 consists of an impression degree data input area 51 for entering the impression degree data, that is, a seven-level rating for each of the eight items (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, muddy), (smooth, crisp), (intense, calm), and (thick, thin); an initial setting start button 52 for instructing the start of initial setting; and an input confirmation button 53 for confirming the impression degree data entered in the impression degree data input area 51.

ユーザによって初期設定開始ボタン５２がクリックされると、初期設定部２２は、学習データ記憶部２３に記憶されている複数の初期設定用サンプル♯１〜♯６の内、ジャンルに関係なくランダムに揃えられた初期設定用楽曲サンプル♯１の最初の１つを読み出し、読み出した初期設定用サンプルの内の初期設定用楽曲データを音声出力部２４に出力し、音声出力部２４は、初期設定部２２から入力された初期設定用楽曲データを音声出力する（ステップＢ６）。 When the user clicks the initial setting start button 52, the initial setting unit 22 reads the first sample of initial setting music sample #1, which was assembled randomly regardless of genre, from among the plurality of initial setting samples #1 to #6 stored in the learning data storage unit 23, and outputs the initial setting music data of the read sample to the audio output unit 24. The audio output unit 24 outputs the initial setting music data input from the initial setting unit 22 as audio (step B6).

ユーザは、音声出力部２４からの音声出力を聞くことによって、楽曲の印象度を感性によって評価し、ＰＣ操作部１９から印象度データ入力領域５１に評価結果、すなわち印象度データを入力し（ステップＢ７）、入力確定ボタン５３をクリックする。ユーザによって入力確定ボタン５３がクリックされると、初期設定部２２は、初期設定用サンプル（サンプル数Ｔ_２）の全てに対してユーザによる評価結果、すなわち印象度データが入力されたか否かを判断し（ステップＢ８）、初期設定用サンプル（サンプル数Ｔ_２）の全てに対してユーザによる評価結果、すなわち印象度データが入力されるまでステップＢ６〜Ｂ７の動作を繰り返す。 The user evaluates the impression level of the music based on the sensitivity by listening to the audio output from the audio output unit 24, and inputs the evaluation result, that is, the impression level data, from the PC operation unit 19 to the impression level data input area 51 (step B7), the input confirmation button 53 is clicked. When the input confirmation button 53 is clicked by the user, the initial setting unit 22 determines whether or not the evaluation result by the user, that is, impression degree data has been input for all of the initial setting samples (sample number T ₂ ). (Step B8), the operations of Steps B6 to B7 are repeated until the evaluation result by the user, that is, the impression degree data, is input to all the initial setting samples (sample number T ₂ ).

ステップＢ２でパーソナル情報による選択でないと判断された場合には、初期設定部２２は、評価者による選択であるか否か、すなわち評価者入力欄４９３にユーザのパーソナル情報が入力された上で選択実行ボタン４９４がクリックされたかを判断し（ステップＢ９）、評価者による選択である場合には、選択された評価者が学習させた事前学習済み階層型ニューラルネットワークおよび学習マップを特定し（ステップＢ１０）、ステップＢ５に至る。 If it is determined in step B2 that the selection is not based on personal information, the initial setting unit 22 selects whether the selection is made by the evaluator, that is, after the personal information of the user is input in the evaluator input field 493. It is determined whether or not the execution button 494 has been clicked (step B9). If the selection is made by the evaluator, the pre-learned hierarchical neural network and learning map learned by the selected evaluator are specified (step B10). ) To Step B5.

ステップＢ９で評価者による選択でないと判断された場合には、ジャンルによる選択、ジャンル入力欄４９２に楽曲のジャンルが入力された上で選択実行ボタン４９４がクリックされたことによるため、初期設定部２２は、選択されたジャンルで揃えられた事前学習データによって学習させた事前学習済み階層型ニューラルネットワークおよび学習マップを特定する（ステップＢ１１）。すなわち、図７に示すように、評価者Ｆがそれぞれジャンル（ポップス、ジャズ、ロック、カントリーミュージック、演歌）毎に揃えられたサンプル数Ｔ_１の楽曲データに基づいてそれぞれ学習させた事前学習済み階層型ニューラルネットワーク♯６〜♯１０および学習マップ♯６〜♯１０が記憶されており、ジャンルとしてポップスが選択された場合には、選択されたジャンル（ポップス）の楽曲データに基づいてそれぞれ学習させた事前学習済み階層型ニューラルネットワーク♯６および学習マップ♯６が特定される。 If it is determined in step B9 that the selection is not by evaluator, the selection is by genre, that is, the selection execution button 494 was clicked after a music genre had been entered in the genre input field 492, so the initial setting unit 22 identifies the pre-learned hierarchical neural network and learning map trained with pre-learning data assembled in the selected genre (step B11). That is, as shown in FIG. 7, pre-learned hierarchical neural networks #6 to #10 and learning maps #6 to #10, which evaluator F trained on T_1 samples of music data assembled for each genre (pops, jazz, rock, country music, enka), are stored, and when pops is selected as the genre, the pre-learned hierarchical neural network #6 and learning map #6 trained on music data of the selected genre (pops) are identified.
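The three-way selection in steps B2, B9, and B11 can be sketched as a simple dispatch. The lookup tables follow the FIG. 7 example as described in the text, but only the pairing of pops with #6 is stated; assigning #7 to #10 to the remaining genres in listed order is an assumption, as are the names `BY_EVALUATOR`, `BY_GENRE`, and `select_set`.

```python
# Hypothetical index of the stored sets, following the FIG. 7 example:
# #1-#5 were trained by evaluators A-E, #6-#10 by evaluator F per genre.
BY_EVALUATOR = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
# Only pops -> #6 is stated in the text; the rest follow the listed order.
BY_GENRE = {"pops": 6, "jazz": 7, "rock": 8, "country": 9, "enka": 10}


def select_set(personal_match=None, evaluator=None, genre=None):
    """Return the number of the pre-learned network / learning map set to use.

    Mirrors the branch order of steps B2, B9 and B11: personal
    information first, then evaluator, otherwise genre.
    personal_match is the evaluator found by the personal-information
    comparison (steps B3-B4), passed in already resolved.
    """
    if personal_match is not None:   # step B2 -> B4
        return BY_EVALUATOR[personal_match]
    if evaluator is not None:        # step B9 -> B10
        return BY_EVALUATOR[evaluator]
    if genre is not None:            # step B11
        return BY_GENRE[genre]
    raise ValueError("no selection criterion was entered")
```

For example, `select_set(genre="pops")` yields set 6, matching the pops case described in the text.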

次に、初期設定部２２は、初期設定入力画面５０をＰＣ表示部２０に表示させ（ステップＢ１２）、ユーザによって初期設定開始ボタン５２がクリックされると、初期設定部２２は、学習データ記憶部２３に記憶されている複数の初期設定用サンプル♯１〜♯６の内、選択されたジャンルに揃えられた初期設定用楽曲サンプルの最初の１つを読み出し、読み出した初期設定用サンプルの内の初期設定用楽曲データを音声出力部２４に出力し、音声出力部２４は、初期設定部２２から入力された初期設定用楽曲データを音声出力する（ステップＢ１３）。 Next, the initial setting unit 22 displays the initial setting input screen 50 on the PC display unit 20 (step B12), and when the user clicks the initial setting start button 52, the initial setting unit 22 displays the learning data storage unit. 23, the first one of the initial setting music samples aligned with the selected genre is read out from the plurality of initial setting samples # 1 to # 6 stored in the initial setting sample. The initial setting music data is output to the audio output unit 24, and the audio output unit 24 outputs the initial setting music data input from the initial setting unit 22 as audio (step B13).

ユーザは、音声出力部２４からの音声出力を聞くことによって、楽曲の印象度を感性によって評価し、ＰＣ操作部１９から印象度データ入力領域５１に評価結果、すなわち印象度データを入力し（ステップＢ１４）、入力確定ボタン５３をクリックする。ユーザによって入力確定ボタン５３がクリックされると、初期設定部２２は、初期設定用サンプル（サンプル数Ｔ_２）の全てに対してユーザによる評価結果、すなわち印象度データが入力されたか否かを判断し（ステップＢ１５）、初期設定用サンプル（サンプル数Ｔ_２）の全てに対してユーザによる評価結果、すなわち印象度データが入力されるまでステップＢ１３〜Ｂ１４の動作を繰り返す。 The user evaluates the impression level of the music based on the sensitivity by listening to the audio output from the audio output unit 24, and inputs the evaluation result, that is, the impression level data, from the PC operation unit 19 to the impression level data input area 51 (step B14), the input confirmation button 53 is clicked. When the input confirmation button 53 is clicked by the user, the initial setting unit 22 determines whether or not the evaluation result by the user, that is, impression degree data has been input for all of the initial setting samples (sample number T ₂ ). (Step B15), the operations of Steps B13 to B14 are repeated until the evaluation results by the user, that is, the impression degree data, are input to all the initial setting samples (sample number T ₂ ).

When it is determined in step B8 or step B15 that impression degree data has been entered by the user for all of the initial setting samples (sample count T₂), the initial setting unit 22 reads from the learning data storage unit 23 the pre-learned connection weight values w of the neurons of the pre-learned hierarchical neural network specified in step B4, step B10, or step B11, together with the learning map specified in the same step, and outputs them to the hierarchical neural network learning unit 25. It also reads from the learning data storage unit 23 the pre-learning data used for the pre-learning of the specified network and the feature data of the initial setting samples used for entering the impression degree data. The pre-learning data, with the feature data of the initial setting samples and the impression degree data entered by the user added to it, is output to the hierarchical neural network learning unit 25 as initial setting data.
That is, when the pre-learning data has a sample count T₁ of 100 songs and the initial setting samples have a sample count T₂ of 10 songs, feature data and impression degree data for 110 songs (T₁ + T₂), consisting of the 100 songs evaluated by the evaluator and the 10 songs evaluated by the user, are output to the hierarchical neural network learning unit 25 as initial setting data.
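The assembly of the initial setting data can be sketched as follows. This is a minimal illustration only: the `Sample` class, its field names, and the `build_initial_setting_data` function are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Sample:
    """One song's training record (hypothetical structure)."""
    features: List[float]    # 8 feature values (fluctuation information)
    impression: List[float]  # 8 impression degree values
    from_user: bool          # True if rated by the user, False if by the evaluator


def build_initial_setting_data(
    pre_learning_data: Sequence[Sample],
    user_features: Sequence[Sequence[float]],
    user_impressions: Sequence[Sequence[float]],
) -> List[Sample]:
    """Append the user's rated initial setting samples to the evaluator's data."""
    if len(user_features) != len(user_impressions):
        raise ValueError("each initial setting sample needs one impression rating")
    user_samples = [
        Sample(list(f), list(y), from_user=True)
        for f, y in zip(user_features, user_impressions)
    ]
    # Result holds T1 + T2 samples, evaluator data first.
    return list(pre_learning_data) + user_samples
```

Keeping a `from_user` flag on each sample is one way to let the later learning step treat user-rated songs differently from evaluator-rated ones.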

Based on the initial setting data, the hierarchical neural network learning unit 25 trains the hierarchical neural network, that is, updates the connection weight value w of each neuron, using the error back-propagation learning method.

First, the connection weight values w of the neurons in the intermediate layer (n-th layer) are initialized to the pre-learned connection weight values w. The hierarchical neural network learning unit 25 then inputs the feature data received from the initial setting unit 22 to the input layer (first layer) as input signals x_j (j = 1, 2, …, 8), and calculates the output of each neuron from the input layer (first layer) toward the output layer (N-th layer).

Next, the hierarchical neural network learning unit 25 takes the impression degree data received from the initial setting unit 22 as teacher signals y_j (j = 1, 2, …, 8), and calculates the learning rule δ_j^N by Equation 1 from the error between the output out_j^N of the output layer (N-th layer) and the teacher signal y_j.

Next, the hierarchical neural network learning unit 25 uses the learning rule δ_j^N to calculate the error signal δ_j^n of the intermediate layer (n-th layer) by Equation 2.

Next, the hierarchical neural network learning unit 25 uses the error signal δ_j^n of the intermediate layer (n-th layer) to calculate the change Δw of the connection weight value w of each neuron by Equation 3, and updates the connection weight value w of each neuron (step B16). The learning rate in Equation 3 is set to η₂ (0 < η₂ ≦ 1), and η₂ is preferably larger than the learning rate η₁ used in the pre-learning by the evaluator. Furthermore, when learning with the data for the 10 songs evaluated by the user, that is, with the impression degree data entered by the user as the teacher signal, setting the learning rate to η₃ (0 < η₁, η₂ < η₃ ≦ 1) is preferable because the user's preference is then strongly reflected in the learning of the hierarchical neural network.
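One back-propagation update with a per-sample learning rate can be sketched as below. Equations 1 to 3 are not reproduced in this text, so the sketch assumes the textbook delta rule with a sigmoid activation, a single intermediate layer, no bias terms, and teacher signals scaled into the range 0 to 1. The function names are hypothetical, and the specification may define the equations differently.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, w_hidden, w_out):
    """Compute intermediate-layer and output-layer activations."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, out


def learning_rate_for(from_user, eta2, eta3):
    """Use the larger rate eta3 for user-rated samples, eta2 otherwise."""
    return eta3 if from_user else eta2


def backprop_step(x, y, w_hidden, w_out, eta):
    """Update the weights in place; return the squared error before the update."""
    hidden, out = forward(x, w_hidden, w_out)
    # Output-layer error signal (assumed form: error times sigmoid derivative).
    delta_out = [(yj - oj) * oj * (1.0 - oj) for yj, oj in zip(y, out)]
    # Intermediate-layer error signal, propagated back through the old weights.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][i] for k in range(len(w_out)))
        for i, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_out):
        for i in range(len(row)):
            row[i] += eta * delta_out[k] * hidden[i]
    for i, row in enumerate(w_hidden):
        for j in range(len(row)):
            row[j] += eta * delta_hidden[i] * x[j]
    return 0.5 * sum((yj - oj) ** 2 for yj, oj in zip(y, out))
```

A caller would pick the rate with `learning_rate_for(sample_is_from_user, eta2, eta3)` and pass it as `eta`, so that user-rated songs move the weights further per update than evaluator-rated ones.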

In step B16, learning is performed for each of the T₁ + T₂ samples of initial setting data. It is then determined whether the squared error E given by Equation 4 is smaller than a predetermined reference value E₂ for initial setting (where E₂ is smaller than the reference value E₁ for pre-learning) (step B17), and step B16 is repeated until the squared error E becomes smaller than E₂. Alternatively, a number of learning iterations S at which the squared error E is expected to fall below E₂ may be set in advance, and step B16 may be repeated S times. It is also preferable to train on the user-evaluated data (sample count T₂) more times than on the other data (sample count T₁), because the user's preference is then strongly reflected in the learning of the hierarchical neural network. That is, in the learning of step B16, each of the T₁ evaluator samples is learned once, whereas each of the T₂ user-evaluated samples is learned several times. When step B16 is performed a predetermined S times, the number of iterations using the user-evaluated data (sample count T₂) is preferably made larger than the number of iterations using the other data (sample count T₁).
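The repetition of steps B16 and B17 can be sketched as a loop that presents each user-rated sample several times per pass. This is an illustrative sketch only: `train_one` is a hypothetical callback that performs one weight update and returns that sample's squared error, and summing the per-sample errors into E is an assumption, since Equation 4 is not reproduced in this text.

```python
def run_initial_setting(evaluator_samples, user_samples, train_one, e2,
                        user_repeats=3, max_passes=1000):
    """Repeat step B16 until the summed squared error E falls below e2.

    Returns (passes_performed, last_error).
    """
    if user_repeats < 1 or max_passes < 1:
        raise ValueError("user_repeats and max_passes must be at least 1")
    total_error = float("inf")
    for passes in range(1, max_passes + 1):
        total_error = 0.0
        # Evaluator-rated samples are learned once per pass.
        for sample in evaluator_samples:
            total_error += train_one(sample)
        # User-rated samples are learned several times per pass.
        for sample in user_samples:
            error = 0.0
            for _ in range(user_repeats):
                error = train_one(sample)
            total_error += error  # count each user sample once in E
        if total_error < e2:      # step B17
            return passes, total_error
    return max_passes, total_error
```

Setting `max_passes` to S and `e2` to 0 would give the fixed-iteration variant described above, because the threshold is then never reached.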

When it is determined in step B17 that the squared error E is smaller than the reference value E₂, the hierarchical neural network learning unit 25 outputs the learned connection weight values w of the neurons to the impression degree data conversion unit 14, and outputs the learning map specified in step B4, step B10, or step B11 to the music map storage unit 17 (step B18).

Next, the music registration operation in the music search device 10 will be described in detail with reference to FIG. 11.
A storage medium on which music data is stored, such as a CD or DVD, is set in the music data input unit 11, and music data is input from the music data input unit 11 (step C1).

The compression processing unit 12 compresses the music data input from the music data input unit 11 (step C2), and stores the compressed music data in the music database 15 together with bibliographic data such as the artist name and song title (step C3).

The feature data extraction unit 13 extracts feature data consisting of fluctuation information from the music data input from the music data input unit 11 (step C4).
Referring to FIG. 12, the feature data extraction unit 13 accepts the input of music data (step D1), performs an FFT (fast Fourier transform) on a fixed frame length starting from a predetermined data analysis start point in the music data (step D2), and calculates the power spectrum. Downsampling may be performed before step D2 to speed up processing.

Next, the feature data extraction unit 13, in which the Low, Middle, and High frequency bands are set in advance, integrates the power spectrum over each of the three bands and calculates the average power (step D3). It then takes the band with the greatest power among Low, Middle, and High as the data analysis start point value for Pitch, and measures Pitch (step D4).
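The per-frame power spectrum and band averages of steps D2 and D3 can be sketched as follows. The sketch swaps the FFT for a plain discrete Fourier transform so that it needs no external library, which gives the same spectrum but more slowly. The band edges passed in `bands` are hypothetical, since the specification does not state the Low, Middle, and High limits.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of each frequency bin from 0 up to the Nyquist bin (plain DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def band_average_power(spectrum, sample_rate, frame_length, bands):
    """Average the power over each named band given as (low_hz, high_hz)."""
    bin_hz = sample_rate / frame_length
    result = {}
    for name, (low_hz, high_hz) in bands.items():
        values = [p for k, p in enumerate(spectrum) if low_hz <= k * bin_hz < high_hz]
        result[name] = sum(values) / len(values) if values else 0.0
    return result


def strongest_band(averages):
    """Name of the band with the greatest average power (used in step D4)."""
    return max(averages, key=averages.get)
```

Calling these once per frame while shifting the analysis start point yields the time series of Low, Middle, and High average power that the later steps analyze.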

Steps D2 to D4 are performed for a predetermined number of frames. The feature data extraction unit 13 determines whether the number of frames processed in steps D2 to D4 has reached the predetermined set value (step D5); if it has not, steps D2 to D4 are repeated while the data analysis start point is shifted (step D6).

When the number of frames processed in steps D2 to D4 has reached the predetermined set value, the feature data extraction unit 13 performs an FFT on the time series of the Low, Middle, and High average power calculated in steps D2 to D4, and also performs an FFT on the time series of Pitch measured in steps D2 to D4 (step D7).

Next, from the FFT analysis results for Low, Middle, High, and Pitch, the feature data extraction unit 13 calculates, as fluctuation information, the slope and the Y-intercept of the regression line in a graph whose horizontal axis is logarithmic frequency and whose vertical axis is the logarithmic power spectrum (step D8). The slope and Y-intercept of the regression line for each of Low, Middle, High, and Pitch are output to the impression degree data conversion unit 14 as feature data consisting of eight items.
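The slope and Y-intercept of step D8 can be obtained with an ordinary least-squares fit on the log-log points, as in the sketch below. The function name is hypothetical, and the use of base-10 logarithms is an assumption; the specification does not state the base.

```python
import math


def fluctuation_features(frequencies, powers):
    """Least-squares line through (log10 f, log10 P); returns (slope, intercept)."""
    points = [
        (math.log10(f), math.log10(p))
        for f, p in zip(frequencies, powers)
        if f > 0 and p > 0  # the logarithm is undefined at zero, e.g. the DC bin
    ]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive (frequency, power) points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("frequencies must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Applying this to the four spectra (Low, Middle, High, Pitch) gives four slopes and four intercepts, which matches the eight feature items described above. A spectrum that falls off as 1/f, for example, gives a slope of about −1.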

The impression degree data conversion unit 14 uses a hierarchical neural network consisting of an input layer (first layer), intermediate layers (n-th layer), and an output layer (N-th layer), as shown in FIG. 4. By inputting the feature data extracted by the feature data extraction unit 13 to the input layer, it obtains impression degree data from the output layer, that is, it converts the feature data into impression degree data (step C5). The impression degree data output from the output layer is output to the music mapping unit 16 and stored in the music database 15 together with the music data. The connection weight values w of the neurons in the intermediate layers have been learned in advance by the neural network learning device 40. In this embodiment, the feature data input to the input layer, that is, the feature data extracted by the feature data extraction unit 13, has eight items as described above. The impression degree data also has eight items, each judged by human sensibility: (bright, dark), (heavy, light), (hard, soft), (stable, unstable), (clear, muddy), (smooth, crisp), (intense, calm), and (thick, thin), and each item is expressed on a seven-level scale. Accordingly, the number of neurons L₁ in the input layer (first layer) and the number of neurons L_N in the output layer (N-th layer) are both eight, and the number of neurons L_n in each intermediate layer (n-th layer: n = 2, …, N−1) is set as appropriate.

The music mapping unit 16 maps the song input from the music data input unit 11 to the corresponding location in the music map stored in the music map storage unit 17. Each neuron of the music map used in this mapping operation holds a previously learned n-dimensional feature vector m_i(t) ∈ R^n. The music mapping unit 16 takes the impression degree data converted by the impression degree data conversion unit 14 as the input vector x_j, maps the input song to the neuron closest to x_j, that is, the neuron that minimizes the Euclidean distance ‖x_j − m_i‖ (step C6), and stores the resulting music map in the music map storage unit 17. Here, R denotes the number of evaluation levels of each item of the impression degree data, and n denotes the number of items of the impression degree data.
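The mapping of step C6 amounts to a nearest-neighbor lookup over the neurons' feature vectors. The sketch below assumes, for illustration only, that the music map is a grid whose neurons are keyed by `(row, column)`; the dictionary layout and function names are hypothetical.

```python
import math


def nearest_neuron(impression, neuron_vectors):
    """Return the grid position whose feature vector is closest to `impression`."""
    if not neuron_vectors:
        raise ValueError("the music map has no neurons")
    return min(
        neuron_vectors,
        key=lambda pos: math.dist(impression, neuron_vectors[pos]),
    )


def map_song(music_map, title, impression, neuron_vectors):
    """Record `title` at its nearest neuron and return that neuron's position."""
    position = nearest_neuron(impression, neuron_vectors)
    music_map.setdefault(position, []).append(title)
    return position
```

Several songs can land on the same neuron, which is why each position holds a list of titles.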

Next, the music search operation in the music search device 10 will be described in detail with reference to FIGS. 13 to 19.
The music search unit 18 displays a search screen 60 such as that shown in FIG. 14 on the PC display unit 20 and accepts user input from the PC operation unit 19. The search screen 60 consists of a music map display area 61 showing the mapping status of the music data stored in the music map storage unit 17, a search condition input area 62 for entering search conditions, and a search result display area 63 in which search results are displayed. The points shown in the music map display area 61 in FIG. 14 indicate the neurons of the music map to which music data is mapped.

As shown in FIG. 15, the search condition input area 62 consists of an impression degree data input area 621 for entering impression degree data as a search condition, a bibliographic data input area 622 for entering bibliographic data as a search condition, and a search execution button 623 for instructing execution of the search. The user enters impression degree data and bibliographic data as search conditions from the PC operation unit 19 (step E1) and clicks the search execution button 623, thereby instructing the music search unit 18 to search based on the impression degree data and the bibliographic data. As shown in FIG. 15, impression degree data is entered from the PC operation unit 19 by rating each item on a seven-level scale.

The music search unit 18 searches the music database 15 based on the impression degree data and bibliographic data input from the PC operation unit 19 (step E2), and displays search results such as those shown in FIG. 16 in the search result display area 63.

In the search based on impression degree data, the impression degree data input from the PC operation unit 19 is taken as the input vector x_j and the impression degree data stored in the music database 15 together with the music data is taken as the search target vector X_j. Search target vectors X_j close to the input vector x_j are retrieved, that is, results are retrieved in ascending order of the Euclidean distance ‖x_j − X_j‖. The number of results to retrieve may be fixed in advance or set freely by the user. When both impression degree data and bibliographic data are given as search conditions, the search based on bibliographic data is performed first, followed by the search based on impression degree data.
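The two-stage search of step E2 can be sketched as a filter followed by a distance ranking. The record layout (dictionaries with `title`, `artist`, and `impression` keys) and the function name are hypothetical, and matching on the artist name alone stands in for whatever bibliographic conditions the user enters.

```python
import math


def search_songs(database, query_impression=None, artist=None, limit=10):
    """Filter by bibliographic data first, then rank by Euclidean distance."""
    # Stage 1: bibliographic search (here, an exact artist match).
    candidates = [
        song for song in database if artist is None or song["artist"] == artist
    ]
    # Stage 2: impression search, closest songs first.
    if query_impression is not None:
        candidates.sort(
            key=lambda song: math.dist(query_impression, song["impression"])
        )
    return candidates[:limit]
```

The `limit` parameter corresponds to the number of results to retrieve, which may be fixed or chosen by the user.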

In addition to the search using the search condition input area 62, a search using the music map display area 61 may also be made available. In this case, the user designates a search target area in the music map display area 61, and the music data mapped within that area is displayed in the search result display area 63 as the search result.

Next, the user selects a representative song from the search results displayed in the search result display area 63 (step E3) and clicks the representative song search execution button 631, thereby instructing the music search unit 18 to search based on the representative song.

The music search unit 18 searches the music map stored in the music map storage unit 17 based on the selected representative song (step E4), and displays in the search result display area 63, as the representative song search result, the music data mapped to the neuron to which the representative song is mapped and to its neighboring neurons. The neighborhood radius used to determine the neighboring neurons may be fixed in advance or set freely by the user.
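The representative song search of step E4 can be sketched as a lookup of all songs whose neuron lies within the neighborhood radius of the representative song's neuron. Measuring the neighborhood as Euclidean distance between `(row, column)` grid positions is an assumption, since the specification does not define how the radius is applied, and the names below are hypothetical.

```python
import math


def songs_near_representative(song_positions, representative, radius):
    """Return titles mapped within `radius` of the representative song's neuron.

    `song_positions` maps each title to the (row, column) of its neuron.
    The representative song itself is included in the result.
    """
    if representative not in song_positions:
        raise KeyError(f"{representative!r} is not on the music map")
    row0, col0 = song_positions[representative]
    return [
        title
        for title, (row, col) in song_positions.items()
        if math.hypot(row - row0, col - col0) <= radius
    ]
```

A radius of 0 returns only the songs that share the representative song's neuron, and a larger radius widens the result to surrounding neurons.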

Next, as shown in FIG. 17, the user selects, from the representative song search results displayed in the search result display area 63, the music data to be output to the terminal device 30 (step E5), and clicks the output button 632 to instruct the music search unit 18 to output the selected music data. The music search unit 18 outputs the music data selected by the user to the terminal device 30 via the search result output unit 21 (step E6).

In addition to searching for a representative song using the search condition input area 62 or the music map display area 61, the system may be configured so that an all-music list display area 64, which lists all stored songs as shown in FIG. 18, is displayed on the search screen 60. The user then selects a representative song directly from the list and clicks the representative song selection execution button 641 to instruct the music search unit 18 to search based on the selected representative song.

Furthermore, in addition to the searches described above, neurons (or songs) may be associated in advance with keywords expressed in words, such as "bright songs", "fun songs", and "soothing songs", so that songs can be searched for by selecting a keyword. That is, a keyword search area 65 such as that shown in FIG. 19(a) is displayed on the search screen 60; the user selects one of the keywords listed in the keyword selection area 651 and clicks the automatic search button 653 to instruct the music search unit 18 to search based on the neuron corresponding to the selected keyword. When a song has been set for the selected keyword, that song is displayed as the set song in the set music display area 652 shown in FIG. 19(a); in this case, clicking the automatic search button 653 instructs the music search unit 18 to search using the set song for the selected keyword as the representative song. The set music change button 654 shown in FIG. 19(a) is used to change the song associated with a keyword: clicking it displays the list of all songs, and selecting a song from that list changes the song associated with the keyword. The neuron (or song) corresponding to a keyword may be set by assigning impression degree data to the keyword, taking that impression degree data as the input vector x_j, and associating the keyword with the neuron (or song) closest to x_j, or the system may be configured so that the user can set it freely.

When neurons corresponding to keywords have been set in this way, the system may be configured, as shown in FIG. 19(b), so that clicking a neuron to which a song is mapped in the music map display area 61 pops up the keyword corresponding to the clicked neuron as a keyword display 611. This makes it easy to search for songs using the music map display area 61.

As described above, according to this embodiment, a plurality of hierarchical neural networks pre-learned by evaluators are stored in the learning data storage unit 23, the initial setting unit 22 selects one of them based on user input, and the hierarchical neural network learning unit 25 trains the selected network so that it reflects the user's preference. The hierarchical neural network trained in this way directly associates feature data, consisting of a plurality of physical items of a song, with impression degree data, consisting of items judged by human sensibility. As a result, music data can be searched with high accuracy based on the impression degree data that the user enters as a search condition, and the user's preference can be strongly reflected even though a hierarchical neural network pre-learned by an evaluator is used to reduce the amount of work the user must do to train the network.

Next, another embodiment of the present invention will be described in detail with reference to FIG. 20.
FIG. 20 is a block diagram showing the configuration of another embodiment of the music search system according to the present invention.

In the other embodiment shown in FIG. 20, the terminal device 30 is provided with a music database 36, a music map storage unit 37, and a music search unit 38 having functions equivalent to those of the music database 15, the music map storage unit 17, and the music search unit 18 shown in FIG. 1, respectively, so that the terminal device 30 can search the music database 36 and the music map stored in the music map storage unit 37. In this embodiment, the music search device 10 is used as a music registration device that stores the music data input from the music data input unit 11 in the music database 15, the impression degree data converted by the impression degree data conversion unit 14 in the music database 15, and the music map mapped by the music mapping unit 16 in the music map storage unit 17.

The contents of the music database 15 and the music map storage unit 17 of the music search device 10 are output to the terminal device 30 by the database output unit 26, and the database input unit 39 of the terminal device 30 stores those contents in the music database 36 and the music map storage unit 37. Search conditions are entered from the terminal operation unit 33 based on what is displayed on the terminal display unit 34.

The present invention is not limited to the embodiments described above, and it is clear that each embodiment may be modified as appropriate within the scope of the technical idea of the present invention. The number, position, shape, and so on of the constituent members are likewise not limited to those of the embodiments and may be chosen as suits the implementation of the present invention. In the drawings, identical components are given identical reference numerals.

FIG. 1 is a block diagram showing the configuration of this embodiment of the music search system according to the present invention.
FIG. 2 is a block diagram showing the configuration of the terminal device shown in FIG. 1.
FIG. 3 is a block diagram showing the configuration of the neural network learning device that trains in advance the hierarchical neural network and the music map used in the music search device shown in FIG. 1.
FIG. 4 is an explanatory diagram of the learning algorithm of the hierarchical neural network in the hierarchical neural network learning unit shown in FIG. 1.
FIG. 5 is an explanatory diagram of the learning algorithm of the music map in the neural network learning device shown in FIG. 3.
FIG. 6 is a flowchart of the pre-learning operation of the hierarchical neural network and the learning operation of the music map in the neural network learning device shown in FIG. 3.
FIG. 7 is a diagram showing the contents stored in the learning data storage unit shown in FIG. 1.
FIG. 8 is a flowchart of the learning operation of the hierarchical neural network performed at initial setting in the hierarchical neural network learning unit shown in FIG. 1.
FIG. 9 is a diagram showing an example of the initial setting selection screen displayed on the PC display unit shown in FIG. 1.
FIG. 10 is a diagram showing an example of the initial setting input screen displayed on the PC display unit shown in FIG. 1.
FIG. 11 is a flowchart of the music registration operation in the music search device shown in FIG. 1.
FIG. 12 is a flowchart of the feature data extraction operation in the feature data extraction unit shown in FIG. 1.
FIG. 13 is a flowchart of the music search operation in the music search device shown in FIG. 1.
FIG. 14 is a diagram showing an example of the search screen displayed on the PC display unit shown in FIG. 1.
FIG. 15 is a diagram showing a display example of the search condition input area shown in FIG. 14.
FIG. 16 is a diagram showing a display example of the search result display area shown in FIG. 14.
FIG. 17 is a diagram showing a display example of the search result display area shown in FIG. 14.
FIG. 18 is a diagram showing an example of the all-music list display area displayed on the search screen shown in FIG. 14.
FIG. 19 is a diagram showing an example of the keyword search area displayed on the search screen shown in FIG. 14.
FIG. 20 is a block diagram showing the configuration of another embodiment of the music search system according to the present invention.

Explanation of symbols

10 Music search device
11 Music data input unit
12 Compression processing unit
13 Feature data extraction unit
14 Impression degree data conversion unit
15 Music database
16 Music mapping unit
17 Music map storage unit
18 Music search unit
19 PC operation unit (impression degree data input means)
20 PC display unit
21 Search result output unit
22 Initial setting unit
23 Learning data storage unit
24 Audio output unit
25 Hierarchical neural network learning unit
26 Database output unit
30 Terminal device
31 Search result input unit
32 Search result storage unit
33 Terminal operation unit
34 Terminal display unit
35 Audio output unit
36 Music database
37 Music map storage unit
38 Music search unit
39 Database input unit
40 Neural network learning device
41 Music data input unit
42 Audio output unit
43 Feature data extraction unit
44 Impression degree data input unit
45 Connection weight value learning unit
46 Music map learning unit
47 Connection weight value output unit
48 Feature vector output unit
49 Initial setting selection screen
50 Initial setting input screen
51 Impression degree data input area
52 Initial setting start button
53 Input confirmation button
60 Search screen
61 Music map display area
62 Search condition input area
63 Search result display area
64 All-music list display area
65 Keyword search area
491 Personal information input field
492 Genre input field
493 Evaluator input field
494 Selection execution button
611 Keyword display
621 Impression degree data input area
622 Bibliographic data input area
623 Search execution button
631 Representative song search execution button
632 Output button
641 Representative song selection execution button
651 Keyword selection area
652 Set music display area
653 Automatic search button
654 Set music change button

Claims

1. A music search system that stores a plurality of pieces of music data in a music database and searches for desired music data among the plurality of pieces of music data stored in the music database, the music search system comprising:
learning data storage means storing a plurality of hierarchical neural networks, each trained in advance by an evaluator, that convert physical feature data of the music data into impression degree data determined by human sensitivity;
initial setting means for selecting one of the plurality of hierarchical neural networks stored in the learning data storage means;
audio output means for outputting the music data as audio;
impression degree data input means for inputting the impression degree data corresponding to the music data output from the audio output means;
hierarchical neural network learning means for training the hierarchical neural network selected by the initial setting means, using the feature data of the music data output from the audio output means and the impression degree data input from the impression degree data input means;
music data input means for inputting the music data;
feature data extraction means for extracting the feature data from the music data input by the music data input means;
impression degree data conversion means for converting the feature data extracted by the feature data extraction means into the impression degree data, using the hierarchical neural network trained by the hierarchical neural network learning means;
storage control means for storing the impression degree data converted by the impression degree data conversion means in the music database together with the music data input by the music data input means;
music search means for searching the music database based on the impression degree data input as a search condition; and
music data output means for outputting the music data found by the music search means.
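
As an illustration only, the learning and conversion elements of claim 1 could be sketched as follows. This is a minimal sketch, not the claimed implementation: it assumes a single hidden layer, sigmoid units, squared-error backpropagation, and two-dimensional feature data with one impression value. The class name `HierarchicalNet`, the helper `mean_squared_error`, and all sample values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class HierarchicalNet:
    """Input layer -> one hidden layer -> output layer (assumed layout)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0])))
             for row in self.w2]
        return h, y

    def convert(self, features):
        """Feature data -> impression degree data (each value in 0..1)."""
        return self.forward(list(features))[1]

    def train(self, samples, epochs=200, lr=0.5):
        """Backpropagation over (feature data, impression degree data) pairs."""
        for _ in range(epochs):
            for x, target in samples:
                x = list(x)
                h, y = self.forward(x)
                # Error terms for the output layer (squared error, sigmoid).
                d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
                # Error terms for the hidden layer, computed before any update.
                d_hid = [hj * (1 - hj) * sum(d * self.w2[k][j]
                                             for k, d in enumerate(d_out))
                         for j, hj in enumerate(h)]
                for k, row in enumerate(self.w2):
                    for j, v in enumerate(h + [1.0]):
                        row[j] -= lr * d_out[k] * v
                for j, row in enumerate(self.w1):
                    for i, v in enumerate(x + [1.0]):
                        row[i] -= lr * d_hid[j] * v


def mean_squared_error(net, samples):
    errors = [(yk - tk) ** 2
              for x, target in samples
              for yk, tk in zip(net.convert(x), target)]
    return sum(errors) / len(errors)


# Stand-in for a stored network trained in advance by an evaluator.
evaluator_samples = [([0.1, 0.9], [0.2]), ([0.8, 0.2], [0.7])]
net = HierarchicalNet(n_in=2, n_hidden=3, n_out=1)
net.train(evaluator_samples)

# Additional training with impression degree data entered by the user.
user_samples = [([0.1, 0.9], [0.9]), ([0.8, 0.2], [0.1])]
before = mean_squared_error(net, user_samples)
net.train(user_samples)
after = mean_squared_error(net, user_samples)
print(before, after)
```

Starting from the evaluator-trained weights rather than from random ones is what lets a small number of user evaluations shift the network toward the user's taste.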

2. The music search system according to claim 1, further comprising personal information input means for inputting personal information of a user,
wherein the learning data storage means stores a plurality of hierarchical neural networks trained in advance by different evaluators, each together with the personal information of its evaluator, and
the initial setting means selects one of the plurality of hierarchical neural networks stored in the learning data storage means by comparing the personal information of the user with the personal information of the evaluators.
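
The claim does not fix how the personal information is compared. One possible comparison, shown purely as a sketch, scores each stored network by how far its evaluator is from the user. The fields (age and gender), the penalty weights, and the names `StoredNetwork` and `select_by_personal_info` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class StoredNetwork:
    name: str              # hypothetical label for a stored network
    evaluator_age: int     # assumed personal information field
    evaluator_gender: str  # assumed personal information field


def select_by_personal_info(user_age, user_gender, stored):
    """Return the stored network whose evaluator is closest to the user.

    Assumed rule: a gender mismatch costs 10 points and each year of age
    difference costs 1 point; the lowest total is selected.
    """
    def mismatch(net):
        penalty = 0 if net.evaluator_gender == user_gender else 10
        return penalty + abs(net.evaluator_age - user_age)

    return min(stored, key=mismatch)


stored = [
    StoredNetwork("evaluator_a", 22, "female"),
    StoredNetwork("evaluator_b", 35, "male"),
    StoredNetwork("evaluator_c", 58, "male"),
]
chosen = select_by_personal_info(31, "male", stored)
print(chosen.name)  # evaluator_b
```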

3. The music search system as described above, further comprising genre input means for inputting a genre of the music data,
wherein the learning data storage means stores a plurality of hierarchical neural networks trained in advance on music data of different genres, and
the initial setting means selects one of the plurality of hierarchical neural networks stored in the learning data storage means based on the genre input by the genre input means.

4. A music search method for storing a plurality of pieces of music data in a music database and searching for desired music data among the plurality of pieces of music data stored in the music database, the method comprising:
storing a plurality of hierarchical neural networks, each trained in advance by an evaluator, that convert physical feature data of the music data into impression degree data determined by human sensitivity;
selecting one of the stored hierarchical neural networks;
outputting the music data as audio;
receiving an input of the impression degree data corresponding to the music data output as audio;
training the selected hierarchical neural network using the feature data of the music data output as audio and the received impression degree data;
receiving an input of the music data;
extracting the feature data from the received music data;
converting the extracted feature data into the impression degree data using the trained hierarchical neural network;
storing the converted impression degree data in the music database together with the received music data;
receiving an input of impression degree data as a search condition;
searching the music database based on the received impression degree data; and
outputting the music data found by the search.
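
The final steps of the method (storing impression degree data with the music data, then searching by impression degree data) can be sketched as below. The claim only requires a search based on the impression degree data; ranking by Euclidean distance, the three-valued impression vectors, the in-memory list standing in for the music database, and the name `search_by_impression` are all assumptions of this example.

```python
import math

# Hypothetical stand-in for the music database: each record pairs a piece
# of music with the impression degree data stored alongside it.
music_database = [
    {"title": "song_a", "impression": [0.9, 0.2, 0.4]},
    {"title": "song_b", "impression": [0.1, 0.8, 0.5]},
    {"title": "song_c", "impression": [0.8, 0.3, 0.5]},
]


def search_by_impression(database, query, max_results=2):
    """Return the records whose impression degree data lie closest to the
    query, using Euclidean distance (an assumed similarity measure)."""
    def distance(record):
        return math.dist(record["impression"], query)

    return sorted(database, key=distance)[:max_results]


results = search_by_impression(music_database, [1.0, 0.2, 0.4])
titles = [record["title"] for record in results]
print(titles)  # ['song_a', 'song_c']
```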

5. The music search method according to claim 4, further comprising receiving an input of personal information of a user,
wherein the plurality of hierarchical neural networks, each trained in advance by an evaluator, are stored together with the personal information of that evaluator, and
one of the plurality of stored hierarchical neural networks is selected by comparing the received personal information of the user with the personal information of the evaluators.

6. The music search method according to claim 4, further comprising receiving an input of a genre of the music data,
wherein a plurality of hierarchical neural networks trained in advance on music data of different genres are stored, and
one of the plurality of stored hierarchical neural networks is selected based on the received genre.

7. A music search program for causing a computer to execute the music search method according to claim 4.