JP2008181294A

JP2008181294A - Information processing apparatus, method and program

Info

Publication number: JP2008181294A
Application number: JP2007013830A
Authority: JP
Inventors: Yoshiyuki Kobayashi; 由幸小林
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2007-01-24
Filing date: 2007-01-24
Publication date: 2008-08-07

Abstract

PROBLEM TO BE SOLVED: To extract desired features from data more reliably. SOLUTION: When predetermined features are to be extracted from data, a next generation creation part 123 creates several candidates for a preprocessing filter used in the preprocessing applied to the data targeted for feature extraction. A gene evaluation part 122 evaluates the candidate preprocessing filters, and the next generation creation part 123 creates new candidates from several of the evaluated candidates until the evaluation value of the most highly evaluated candidate satisfies a predetermined condition. When that condition is satisfied, the gene evaluation part 122 outputs the most highly evaluated candidate as the preprocessing filter. This invention is applicable to an information processor. COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to an information processing apparatus, method, and program, and more particularly to an information processing apparatus, method, and program that make it possible to extract a desired feature from data more reliably.

Conventionally, signal processing devices that extract a predetermined feature from input data are known (see, for example, Patent Document 1). The algorithm by which such a signal processing device extracts features from data is constructed either by a dedicated device or by a human.

In particular, when a feature extraction algorithm is constructed by a human, the human can draw on prior knowledge, that is, on experience gained so far. The human can therefore construct an algorithm that can be optimized with teacher data to improve extraction accuracy, or an algorithm that extracts features efficiently even if its accuracy is low. Here, teacher data refers to the data needed to construct an algorithm, such as the data that is input to the signal processing device as the target of feature extraction and the features extracted from that data.

For example, one known method of improving accuracy with teacher data uses a GA (Genetic Algorithm) to optimize the parameters used in the feature extraction algorithm.

US Patent Application Publication No. 2004/0181401 A1

However, with the techniques described above it has not been easy to reliably extract a desired feature from data. For example, when a feature extraction algorithm is constructed by a human, the accuracy of the feature extraction depends heavily on the intuition of the person constructing the algorithm, and improving the accuracy is a matter of repeated trial and error, so constructing the algorithm has required a great deal of effort.

A GA can also be used to optimize the parameters used in a feature extraction algorithm, but parameter optimization has its limits, so the accuracy could not be improved sufficiently.

Furthermore, when an algorithm is constructed by a dedicated device, the device cannot, among other things, draw on human prior knowledge. As a result, when feature extraction requires complex processing, constructing the algorithm may require an enormous amount of teacher data, or the device may fail to construct an algorithm even when given a large amount of teacher data.

The present invention has been made in view of such circumstances and makes it possible to extract a desired feature from data more reliably.

An information processing apparatus according to one aspect of the present invention includes: filter generation means for generating candidate filters, which are candidates for a preprocessing filter used in the preprocessing applied to data from which a predetermined feature is to be extracted; preprocessing means for applying the preprocessing, using a candidate filter, to teacher data used to evaluate the candidate filter; extraction means for extracting the feature from the preprocessed teacher data; and evaluation means for calculating an evaluation value of the candidate filter, which indicates how well the feature is extracted from teacher data preprocessed with that candidate filter, on the basis of the feature extracted by the extraction means and the feature, determined in advance, that should be extracted from the teacher data.

The filter generation means can further generate a plurality of new candidate filters from some of the candidate filters, on the basis of the evaluation value of each of the plurality of candidate filters, and the evaluation means can calculate the evaluation value of each of the new candidate filters.

When the highest of the most recently calculated evaluation values satisfies a predetermined condition, the evaluation means can output the candidate filter with that highest evaluation value as the preprocessing filter to be used for the preprocessing.

When the highest of the most recently calculated evaluation values does not satisfy the condition, the filter generation means can further generate a plurality of new candidate filters from some of the most recently generated candidate filters, on the basis of the most recently calculated evaluation values.
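The generate, evaluate, and regenerate cycle described above can be sketched as a small search loop. This is only an illustration under stated assumptions, not the patented implementation: `evolve`, `evaluate`, `make_children`, and `is_good_enough` are hypothetical names, the candidates are plain integers rather than filters, and keeping the better half of the candidates as parents is an assumed selection rule.

```python
def evolve(population, evaluate, make_children, is_good_enough, max_generations=100):
    """Evaluate candidates and regenerate them until the best score meets the condition.

    All callables are supplied by the caller (hypothetical interface):
      evaluate(candidate) -> score, higher is better
      make_children(parents, size) -> new list of candidates
      is_good_enough(score) -> True when the stopping condition holds
    """
    if max_generations < 1:
        raise ValueError("max_generations must be at least 1")
    for generation in range(max_generations):
        # Score every candidate and rank from best to worst.
        scored = sorted(((evaluate(c), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        best_score, best = scored[0]
        if is_good_enough(best_score):
            return best, best_score, generation
        # Assumed selection rule: the better half become parents.
        parents = [c for _, c in scored[:max(1, len(scored) // 2)]]
        population = make_children(parents, len(population))
    # Condition never met: return the best candidate of the last evaluated generation.
    return best, best_score, max_generations


# Toy problem: find the integer 7. Children are the parents plus each parent + 1.
def toy_children(parents, size):
    return (parents + [p + 1 for p in parents])[:size]


best, score, generations = evolve(
    population=[0, 1, 2, 3],
    evaluate=lambda c: -abs(c - 7),
    make_children=toy_children,
    is_good_enough=lambda s: s == 0,
)
print(best, score, generations)
```

In a real setting each candidate would be a preprocessing filter, and `evaluate` would preprocess the teacher data, extract the feature, and compare it with the known correct feature.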

The preprocessing filter can be a filter that processes the data while preserving its format.

The information processing apparatus can further include discriminator creation means for creating, by machine learning, a discriminator for extracting the feature from data preprocessed with the candidate filter, on the basis of the teacher data preprocessed with the candidate filter and the feature, determined in advance, that should be extracted from the teacher data.

An information processing method or program according to one aspect of the present invention includes the steps of: generating candidate filters, which are candidates for a preprocessing filter used in the preprocessing applied to data from which a predetermined feature is to be extracted; applying the preprocessing, using a candidate filter, to teacher data used to evaluate the candidate filter; extracting the feature from the preprocessed teacher data by extraction means; and calculating an evaluation value of the candidate filter, which indicates how well the feature is extracted from teacher data preprocessed with that candidate filter, on the basis of the feature extracted by the extraction means and the feature, determined in advance, that should be extracted from the teacher data.

In one aspect of the present invention, candidate filters, which are candidates for a preprocessing filter used in the preprocessing applied to data from which a predetermined feature is to be extracted, are generated; the preprocessing is applied, using a candidate filter, to teacher data used to evaluate the candidate filter; the feature is extracted from the preprocessed teacher data by extraction means; and an evaluation value of the candidate filter, which indicates how well the feature is extracted from teacher data preprocessed with that candidate filter, is calculated on the basis of the feature extracted by the extraction means and the feature, determined in advance, that should be extracted from the teacher data.

According to one aspect of the present invention, a preprocessing filter used in the preprocessing for feature extraction can be generated. In particular, according to one aspect of the present invention, a desired feature can be extracted from data more reliably.

Embodiments of the present invention are described below. The correspondence between the constituent features of the present invention and the embodiments described in the specification or the drawings is illustrated as follows. This description is intended to confirm that embodiments supporting the present invention are described in the specification or the drawings. Accordingly, even if an embodiment is described in the specification or the drawings but is not listed here as corresponding to a constituent feature of the present invention, that does not mean the embodiment does not correspond to that constituent feature. Conversely, even if an embodiment is listed here as corresponding to a constituent feature, that does not mean the embodiment does not correspond to any other constituent feature.

An information processing apparatus according to one aspect of the present invention includes: filter generation means (for example, the initial generation generation unit 121 and the next generation generation unit 123 in FIG. 10) for generating candidate filters, which are candidates for a preprocessing filter used in the preprocessing applied to data from which a predetermined feature is to be extracted; preprocessing means (for example, the preprocessing unit 143 in FIG. 10) for applying the preprocessing, using a candidate filter, to teacher data used to evaluate the candidate filter; extraction means (for example, the chord discrimination unit 146 in FIG. 10) for extracting the feature from the preprocessed teacher data; and evaluation means (for example, the evaluation unit 147 in FIG. 10) for calculating an evaluation value of the candidate filter, which indicates how well the feature is extracted from teacher data preprocessed with that candidate filter, on the basis of the feature extracted by the extraction means and the feature, determined in advance, that should be extracted from the teacher data.

The filter generation means can further generate a plurality of new candidate filters from some of the candidate filters, on the basis of the evaluation value of each of the plurality of candidate filters (for example, the processing of steps S44 through S47 in FIG. 13), and the evaluation means can calculate the evaluation value of each of the new candidate filters (for example, the processing of step S97 in FIG. 14).

When the highest of the most recently calculated evaluation values satisfies a predetermined condition, the evaluation means can output the candidate filter with that highest evaluation value as the preprocessing filter to be used for the preprocessing (for example, the processing of step S48 in FIG. 13).

When the highest of the most recently calculated evaluation values does not satisfy the condition, the filter generation means can further generate a plurality of new candidate filters from some of the most recently generated candidate filters, on the basis of the most recently calculated evaluation values (for example, the processing of steps S44 through S47 in FIG. 13).

The information processing apparatus can further include discriminator creation means (for example, the chord discriminator learning unit 145 in FIG. 10) for creating, by machine learning, a discriminator for extracting the feature from data preprocessed with the candidate filter, on the basis of the teacher data preprocessed with the candidate filter and the feature, determined in advance, that should be extracted from the teacher data.

An information processing method or program according to one aspect of the present invention includes the steps of: generating candidate filters, which are candidates for a preprocessing filter used in the preprocessing applied to data from which a predetermined feature is to be extracted (for example, step S41 in FIG. 13); applying the preprocessing, using a candidate filter, to teacher data used to evaluate the candidate filter (for example, step S82 in FIG. 14); extracting the feature from the preprocessed teacher data by extraction means (for example, steps S90 and S91 in FIG. 14); and calculating an evaluation value of the candidate filter, which indicates how well the feature is extracted from teacher data preprocessed with that candidate filter, on the basis of the feature extracted by the extraction means and the feature, determined in advance, that should be extracted from the teacher data (for example, step S97 in FIG. 14).

Embodiments to which the present invention is applied are described below with reference to the drawings.

In a feature extraction system according to one embodiment to which the present invention is applied, predetermined features are extracted from input data and output according to a constructed algorithm, as shown in FIG. 1. This feature extraction system is configured, for example, as shown in FIG. 2.

The feature extraction system shown in FIG. 2 consists of a filter generation device 11 and a signal processing device 12, and the signal processing device 12 is provided with a preprocessing unit 21 and a feature extraction processing unit 22.

The filter generation device 11 generates a preprocessing filter composed of a combination of filters that each perform a predetermined process, for example a filter that performs integration or a high-pass filter (hereinafter also called unit filters, as appropriate), and supplies it to the preprocessing unit 21 of the signal processing device 12. Here, the preprocessing filter is a filter that processes the input data while preserving its format. Therefore, when preprocessing, that is, filter processing using the preprocessing filter, is applied to given data, the result is data in the same format as before the preprocessing.
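The idea of a preprocessing filter built from format-preserving unit filters can be sketched as function composition. This is a minimal illustration: `compose`, `square`, and `negate` are hypothetical names, the data is assumed to be a 2-D list, and "same format" is interpreted here as "same number of rows and same row lengths".

```python
def _shape(data):
    # Describe a 2-D list by the length of each of its rows.
    return [len(row) for row in data]


def compose(unit_filters):
    """Chain unit filters into one preprocessing filter and check the format is kept."""
    def preprocessing_filter(data):
        shape = _shape(data)
        for unit in unit_filters:
            data = unit(data)
            if _shape(data) != shape:
                raise ValueError("unit filter changed the data format")
        return data
    return preprocessing_filter


# Two toy unit filters; both keep the shape of the data.
def square(data):
    return [[v * v for v in row] for row in data]


def negate(data):
    return [[-v for v in row] for row in data]


pre = compose([square, negate])
result = pre([[1, 2], [3, 4]])
print(result)  # [[-1, -4], [-9, -16]]
```

Because every unit filter keeps the format, unit filters can be combined in any order and number, which is what makes an automated search over combinations possible.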

The filter generation device 11 also generates a discriminator, which is data used in the processing that extracts a desired feature from data, and supplies it to the feature extraction processing unit 22 of the signal processing device 12.

The preprocessing unit 21 applies preprocessing to the input data using the preprocessing filter supplied from the filter generation device 11, and supplies the preprocessed data to the feature extraction processing unit 22. The feature extraction processing unit 22 uses the discriminator supplied from the filter generation device 11 to extract a predetermined feature from the data supplied from the preprocessing unit 21, and outputs it.

Applying preprocessing to the input data in this way makes it possible to further emphasize the components of the input data that are useful for feature extraction, or to remove only the unwanted components, for example those that adversely affect feature extraction. A desired feature can therefore be extracted from the input data more reliably.
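The two-stage flow of the signal processing device (preprocess, then extract) can be sketched as follows. The class name `SignalProcessor`, its field names, and the toy filter and discriminator are illustrative assumptions; the real preprocessing filter and discriminator are the ones supplied by the filter generation device.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SignalProcessor:
    """Holds the two artifacts supplied by the filter generation device."""
    preprocessing_filter: Callable[[Any], Any]  # same data format in and out
    discriminator: Callable[[Any], Any]         # maps preprocessed data to a feature

    def extract(self, data):
        preprocessed = self.preprocessing_filter(data)  # role of the preprocessing unit
        return self.discriminator(preprocessed)         # role of the feature extraction unit


# Toy stand-ins: clip negatives to zero, then report the index of the largest value.
processor = SignalProcessor(
    preprocessing_filter=lambda xs: [max(x, 0.0) for x in xs],
    discriminator=lambda xs: xs.index(max(xs)),
)
feature = processor.extract([-5.0, 2.0, 9.0, -1.0])
print(feature)  # 2
```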

More specifically, in the feature extraction system the input data can be, for example, time-pitch data, and the extracted feature can be the chords of a piece of music. Time-pitch data is data obtained by analyzing the waveform of a musical piece or the like along two axes, time and energy per pitch; that is, it is obtained by applying, to the audio data used to play back the piece, a process that analyzes the waveform into energy per time and per pitch. Hereinafter, this process is referred to as twelve-tone analysis processing.

When the input data is time-pitch data and the extracted feature is the chords of a piece of music, the feature extraction system is configured as shown in FIG. 3.

The feature extraction system shown in FIG. 3 consists of a filter generation device 51 and a signal processing device 52, and the signal processing device 52 includes a preprocessing unit 61 and a chord extraction unit 62. Here, the filter generation device 51 of FIG. 3 corresponds to the filter generation device 11 of FIG. 2, and the signal processing device 52 of FIG. 3 corresponds to the signal processing device 12 of FIG. 2. Likewise, the preprocessing unit 61 and the chord extraction unit 62 of FIG. 3 correspond to the preprocessing unit 21 and the feature extraction processing unit 22 of FIG. 2, respectively.

The filter generation device 51 generates a preprocessing filter by combining unit filters and supplies it to the preprocessing unit 61 of the signal processing device 52. The filter generation device 51 also generates a chord discriminator, which is used to extract chords from the preprocessed time-pitch data, and supplies the generated chord discriminator to the chord extraction unit 62 of the signal processing device 52.

The preprocessing unit 61 of the signal processing device 52 applies preprocessing to the input time-pitch data using the preprocessing filter supplied from the filter generation device 51, and supplies the result to the chord extraction unit 62. The chord extraction unit 62 uses the chord discriminator supplied from the filter generation device 51 to extract the chords of the piece from the time-pitch data supplied from the preprocessing unit 61, and outputs them.

In this way, given time-pitch data as input, the feature extraction system can output the chord progression of the piece as the extracted feature.

FIG. 4 is a block diagram showing a more detailed configuration example of the chord extraction unit 62 of FIG. 3.

The chord extraction unit 62 consists of a beat detection unit 91, a per-beat feature quantity extraction unit 92, and a chord discrimination unit 93.

The beat detection unit 91 detects beats from the time-pitch data supplied from the preprocessing unit 61. Here, a beat is a pulse or count, the reference that is heard as the basic unit of a piece of music. The term is generally used in several senses, but hereinafter it means the time at which a basic unit of time in the piece begins. That time is called the beat position, and the span of a basic unit of time in the piece is called the beat range. The length of a beat corresponds to the so-called tempo.

That is, the beat detection unit 91 detects the beat positions in the time-pitch data obtained from the audio data of the piece. The beat detection unit 91 then supplies beat information indicating each beat position in the time-pitch data, together with the input time-pitch data, to the per-beat feature quantity extraction unit 92.

Note that the span from one beat position to the next in the time-pitch data is the beat range, so once the beat positions in the time-pitch data are known, the beat ranges are known as well.
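Deriving beat ranges from beat positions can be illustrated in a few lines. This is a sketch under assumptions: `beat_ranges` is a hypothetical helper, positions are frame indices in ascending order, and the last beat is assumed to end where the data ends.

```python
def beat_ranges(beat_positions, data_length):
    """Return (start, end) frame indices for each beat; end is exclusive."""
    # Each beat ends where the next one starts; the last beat ends with the data (assumption).
    ends = list(beat_positions[1:]) + [data_length]
    return list(zip(beat_positions, ends))


ranges = beat_ranges([0, 4, 8], data_length=12)
print(ranges)  # [(0, 4), (4, 8), (8, 12)]
```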

The per-beat feature quantity extraction unit 92 extracts, from the time-pitch data supplied from the beat detection unit 91, the audio feature quantities of a predetermined range, for example the audio feature quantities of each beat (hereinafter called the per-beat chord discrimination feature quantities). That is, on the basis of the beat information, the per-beat feature quantity extraction unit 92 extracts, for each beat range of the time-pitch data, feature quantities indicating the characteristics of the sound at each pitch of 12-tone equal temperament. The per-beat feature quantity extraction unit 92 supplies the extracted per-beat chord discrimination feature quantities to the chord discrimination unit 93.
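One possible per-beat feature is sketched below. The text does not specify which feature quantity is computed, so the choice here (mean energy of each of the 12 pitch classes within the beat, with all octaves folded together) is an assumption, as are the name `beat_features` and the data layout `data[pitch_row][frame]` with rows ordered octave by octave.

```python
def beat_features(data, ranges, pitches_per_octave=12):
    """Mean energy per pitch class for each beat range (assumed feature).

    data[row][t] is the energy of pitch row `row` at frame `t`; rows are ordered
    octave by octave with `pitches_per_octave` rows per octave (assumed layout).
    ranges is a list of (start, end) frame indices, end exclusive.
    """
    octaves = len(data) // pitches_per_octave
    features = []
    for start, end in ranges:
        sums = [0.0] * pitches_per_octave
        for row_index, row in enumerate(data):
            # Fold every octave onto the same 12 pitch classes.
            sums[row_index % pitches_per_octave] += sum(row[start:end])
        frames = end - start
        features.append([s / (octaves * frames) for s in sums])
    return features


# Two octaves (24 rows), four frames: only pitch class 0 carries energy.
data = [[0.0] * 4 for _ in range(24)]
data[0] = [1.0] * 4    # pitch class 0, lower octave
data[12] = [3.0] * 4   # pitch class 0, upper octave
features = beat_features(data, [(0, 2), (2, 4)])
print(features[0][:3])  # [2.0, 0.0, 0.0]
```

Each beat yields a 12-element vector, which is the kind of fixed-size input a learned chord discriminator could consume.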

The chord discrimination unit 93 uses the chord discriminator supplied from the filter generation device 51 to determine the chord of each beat from the per-beat chord discrimination feature quantities supplied from the per-beat feature quantity extraction unit 92, and outputs it. That is, the chord discrimination unit 93 determines the chord of each beat range from the per-beat chord discrimination feature quantities. As described later, the chord discriminator is created in advance by learning from feature quantities, and each time the preprocessing filter is changed, a new chord discriminator is supplied from the filter generation device 51 to the chord discrimination unit 93.

In this way, the signal processing device 52 determines the chord of each beat of a piece of music from the piece's time-pitch data. For example, as shown in FIG. 5, the signal processing device 52 determines, beat by beat, chords such as C, E minor, A minor, C, F, E minor, D minor, and G from the time-pitch data of the piece. The signal processing device 52 then outputs, for example, the chord name of the chord for each beat.

Next, the chord discrimination processing, in which the signal processing device 52 extracts the chord of each beat from the input time-pitch data and outputs it, is described with reference to the flowchart of FIG. 6. This chord discrimination processing starts when time-pitch data is input to the signal processing device 52.

In step S11, the preprocessing unit 61 applies preprocessing to the input time-pitch data using the preprocessing filter supplied from the filter generation device 51, and supplies the preprocessed time-pitch data to the beat detection unit 91.

Here, the time-pitch data input to the preprocessing unit 61 is data indicating the energy of each pitch at each time, as shown for example in FIG. 7. In FIG. 7, the vertical direction indicates pitch and the horizontal direction indicates time, and the shading density indicates the magnitude of the energy.

FIG. 7 shows, as time-pitch data, the energies of the 12 pitches of 12-tone equal temperament in each of several octaves, namely octaves 1 through 7.

The preprocessing unit 61 applies preprocessing to the time-pitch data shown in FIG. 7 using, for example, the preprocessing filter shown in FIG. 8A or FIG. 8B.

In the preprocessing filter shown in FIG. 8A, part A11, the string "Sqr", indicates squaring the energy of each pitch at each time in the time-pitch data. Part A12, the string "Time#LPF,0.5", indicates applying a low-pass filter with a coefficient of 0.5 in the time direction of the time-pitch data squared by part A11, that is, to the energy at each time of a given pitch in a given octave. Part A13, the string "Freq#HPF,0.3", indicates applying a high-pass filter with a coefficient of 0.3 in the frequency direction of the time-pitch data filtered by part A12, that is, to the energy of each pitch at a given time.

Parts A11 through A13 denote the unit filters that make up the preprocessing filter. The filter processing of the unit filter denoted by part A11 through that of the unit filter denoted by part A13 are applied to the time-pitch data in order.

Accordingly, preprocessing with the preprocessing filter shown in FIG. 8A applies to the time-pitch data, in order, an operation that squares each energy value, filtering with a low-pass filter, and filtering with a high-pass filter.
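The chain of FIG. 8A can be pictured as a list of functions applied in order. The following is a minimal Python sketch, not the implementation of this description: the one-pole form of the low-pass filter, the "input minus low-pass" form of the high-pass filter, and all function names (`sqr`, `time_lpf`, `freq_hpf`, `apply_chain`) are assumptions made for illustration, since the text gives only the filter names and their coefficients.

```python
from typing import Callable, List

Grid = List[List[float]]  # grid[t][p]: energy of pitch p at time t


def sqr(grid: Grid) -> Grid:
    """Unit filter "Sqr": square every energy value."""
    return [[v * v for v in row] for row in grid]


def time_lpf(grid: Grid, coeff: float) -> Grid:
    """Assumed one-pole low-pass along time: y[t] = coeff*x[t] + (1-coeff)*y[t-1]."""
    out: Grid = []
    for row in grid:
        if not out:
            out.append(list(row))  # first frame passes through unchanged
        else:
            prev = out[-1]
            out.append([coeff * x + (1.0 - coeff) * y for x, y in zip(row, prev)])
    return out


def freq_hpf(grid: Grid, coeff: float) -> Grid:
    """Assumed high-pass along pitch: the input minus a one-pole low-pass of it."""
    out: Grid = []
    for row in grid:
        new_row = []
        low = 0.0
        for i, x in enumerate(row):
            low = x if i == 0 else coeff * x + (1.0 - coeff) * low
            new_row.append(x - low)
        out.append(new_row)
    return out


def apply_chain(grid: Grid, chain: List[Callable[[Grid], Grid]]) -> Grid:
    """Apply the unit filters in order; every stage keeps the time-pitch shape."""
    for unit_filter in chain:
        grid = unit_filter(grid)
    return grid


# The chain of FIG. 8A: Sqr, then Time#LPF 0.5, then Freq#HPF 0.3.
FILTER_8A = [sqr, lambda g: time_lpf(g, 0.5), lambda g: freq_hpf(g, 0.3)]
```

Calling `apply_chain(grid, FILTER_8A)` returns a grid with the same number of frames and pitches as the input, so the result can be handed to the same downstream processing as unfiltered data.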

In the preprocessing filter shown in FIG. 8B, portion A21, the text "Time#Differential", indicates that differentiation is applied to the time-pitch data in the time direction, that is, to the energies at successive times of a given pitch in a given octave. Portion A22, the text "Abs", indicates that the absolute value of the energy of each pitch at each time is taken. Portion A23, the text "Freq#Order", indicates that ordering is performed in the frequency direction, that is, the energies of the pitches at a given time are ordered starting from the smallest value. Portion A24, the text "Time#LPF, 0.4", indicates that filtering is applied in the time direction, that is, to the energies at successive times of a given pitch in a given octave, with a low-pass filter whose coefficient is 0.4.

Portions A21 to A24 each represent a unit filter constituting the preprocessing filter, and the filtering by the unit filters indicated by portions A21 through A24 is applied to the time-pitch data in that order.

Accordingly, preprocessing with the preprocessing filter shown in FIG. 8B applies to the time-pitch data, in order, differentiation in the time direction, an operation that takes the absolute value, ordering, and filtering with a low-pass filter.
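The first three unit filters of FIG. 8B can be sketched in the same spirit. This is an illustrative Python sketch under stated assumptions: the description does not say how the first frame is handled by the differentiation or what exactly "ordering" outputs, so here the first frame is set to zero and each value is replaced by its rank within its time frame. The names `time_differential`, `absolute`, and `freq_order` are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # grid[t][p]: energy of pitch p at time t


def time_differential(grid: Grid) -> Grid:
    """Difference along time; the first frame is zero (assumption) to keep the shape."""
    if not grid:
        return []
    out: Grid = [[0.0] * len(grid[0])]
    for prev, cur in zip(grid, grid[1:]):
        out.append([c - p for p, c in zip(prev, cur)])
    return out


def absolute(grid: Grid) -> Grid:
    """Unit filter "Abs": absolute value of every element."""
    return [[abs(v) for v in row] for row in grid]


def freq_order(grid: Grid) -> Grid:
    """Replace each value by its rank within its frame, 0 = smallest (assumed meaning).

    Ties are broken by pitch index so the result is deterministic.
    """
    out: Grid = []
    for row in grid:
        order = sorted(range(len(row)), key=lambda i: (row[i], i))
        ranks = [0.0] * len(row)
        for rank, i in enumerate(order):
            ranks[i] = float(rank)
        out.append(ranks)
    return out
```

Each function returns a grid of the same shape as its input, so the three can be chained and followed by a time-direction low-pass filter to mirror FIG. 8B.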

In addition, the unit filters that can constitute a preprocessing filter include a filter that normalizes by absolute value, a filter that normalizes in the time direction or the frequency direction, a filter that normalizes in the time direction or the frequency direction so that the mean takes a predetermined value, and a filter that normalizes in the time direction or the frequency direction so that the mean and the variance take predetermined values.

Furthermore, the unit filters that can constitute a preprocessing filter include a filter that applies a logarithmic function, an exponential function, or the like; a filter that raises values to a power; a filter that performs an operation such as addition, subtraction, multiplication, division, differentiation, or integration; a filter that computes an absolute value, a square, a square root, or the like; a band-pass filter; a filter that removes the DC component; and a filter that performs ordering or peak detection.

Note that preprocessing with a preprocessing filter preserves the format of the data being processed, that is, the time-pitch data, before and after the processing. In other words, the preprocessed time-pitch data, like the time-pitch data before preprocessing, is data indicating the energy of each pitch at each time.

Returning to the flowchart of FIG. 6, once the time-pitch data has been preprocessed, in step S12 the beat detection unit 91 detects beats from the time-pitch data supplied from the preprocessing unit 61. That is, the beat detection unit 91 detects the beat positions in the time-pitch data and supplies beat information indicating each beat position, together with the time-pitch data, to the per-beat feature extraction unit 92.

For example, the beat detection unit 91 takes the difference of the energy in the time direction as the change in energy of each of the twelve equal-tempered pitches in each octave indicated by the time-pitch data, accumulates the change in energy at each time over the twelve pitches of all the octaves, and uses the result as attack information. Here, attack information is a time series representing the changes in loudness that make a listener perceive a beat.
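The computation of attack information described above can be sketched as follows. This is a minimal Python illustration, not the implementation of the beat detection unit 91: the function name `attack_information` is hypothetical, the first frame is given a value of zero by assumption, and the signed difference is summed as-is because the text does not say whether decreases in energy are clipped.

```python
from typing import List

Grid = List[List[float]]  # grid[t][p]: energy of pitch p (all octaves flattened) at time t


def attack_information(grid: Grid) -> List[float]:
    """Sum, over all pitches, the frame-to-frame change in energy at each time."""
    if not grid:
        return []
    attack = [0.0]  # no previous frame exists for t = 0 (assumption)
    for prev, cur in zip(grid, grid[1:]):
        attack.append(sum(c - p for p, c in zip(prev, cur)))
    return attack
```

The result is a one-dimensional time series with one value per frame, which is the kind of signal the later text treats as an ordinary waveform.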

The beat detection unit 91 then detects the most basic note length in the piece of music subject to chord detection. For example, the beat detection unit 91 treats the attack information, which is time-series information, as an ordinary waveform, applies processing such as a short-time Fourier transform to it, and performs fundamental pitch extraction on the attack information to obtain the most basic note length. The most basic note in a piece is, for example, a quarter note, an eighth note, or a sixteenth note. Hereinafter, the length of the most basic note in a piece is referred to as the basic beat period.

Furthermore, the beat detection unit 91 applies predetermined signal processing to the data of each of the twelve pitches in each octave indicated by the time-pitch data to extract music features such as the variance of the energy in the pitch direction and the number of peaks per unit time, and estimates the tempo from the extracted music features on the basis of data obtained in advance by learning from music features and tempos. The beat detection unit 91 then determines the final tempo using the estimated tempo and the basic beat period, and supplies beat information indicating the determined tempo, that is, the beat positions, together with the time-pitch data, to the per-beat feature extraction unit 92.

In step S13, the per-beat feature extraction unit 92 extracts chord discrimination features from each beat range of the time-pitch data on the basis of the beat information, supplied from the beat detection unit 91, indicating the beat positions, and supplies the features to the chord discrimination unit 93.

For example, on the basis of the beat positions indicated by the beat information, the per-beat feature extraction unit 92 cuts out of the time-pitch data only the data in the beat range from a given beat position to the next beat position, and averages over time the energy indicated by the data in that beat range. This yields the energy of each of the twelve equal-tempered pitches in each octave.

Furthermore, the per-beat feature extraction unit 92 weights, for example, the energy of each of the twelve equal-tempered pitches in each of seven octaves, and adds together the energies of the notes with the same note name across the seven octaves to obtain the energy of each of the twelve notes identified by note name. The per-beat feature extraction unit 92 arranges the energies of the twelve notes in scale order of their note names to generate a chord discrimination feature indicating the note energies in scale order. For example, the per-beat feature extraction unit 92 adds together, among the weighted energies, the energies of the notes named C in each octave to obtain the energy of the note named C.

Note that, as the chord discrimination features for each beat obtained from a beat range of the time-pitch data, the per-beat feature extraction unit 92 generates a feature used to determine the root (hereinafter referred to as the root discrimination feature) and a feature used to determine whether the chord is a major chord or a minor chord (hereinafter referred to as the major/minor discrimination feature). The weights applied to the note energies when generating the root discrimination feature differ from the weights applied when generating the major/minor discrimination feature.
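The per-beat feature extraction described above (time averaging within a beat range, weighting, and summing same-named notes across octaves) can be sketched in Python as follows. This is an illustration under assumptions: the function name `beat_feature`, the `frames[t][octave][note]` layout, and the weight tables passed in are hypothetical, and the description does not give the actual weight values.

```python
from typing import List, Sequence

Frame = List[List[float]]  # frame[octave][note]: energy of one of the 12 notes in one octave


def beat_feature(frames: Sequence[Frame], start: int, end: int,
                 weights: Sequence[Sequence[float]]) -> List[float]:
    """Average frames[start:end] over time, weight, and sum same-named notes across octaves.

    weights[octave][note] is the weight for that note; the result has 12 entries
    in scale order (C, C#, ..., B).
    """
    n_frames = end - start
    if n_frames <= 0:
        raise ValueError("beat range must contain at least one frame")
    feature = [0.0] * 12
    for octave, octave_weights in enumerate(weights):
        for note in range(12):
            mean = sum(frames[t][octave][note] for t in range(start, end)) / n_frames
            feature[note] += octave_weights[note] * mean
    return feature
```

Calling `beat_feature` twice on the same beat range, once with a weight table intended for root discrimination and once with a table intended for major/minor discrimination, would give the two features mentioned in the text.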

Once the chord discrimination features have been extracted for each beat, in step S14 the chord discrimination unit 93 determines the chord of each beat from the chord discrimination features supplied from the per-beat feature extraction unit 92, using the chord discriminator supplied from the filter generation device 51.

For example, as shown in FIG. 9, the chord discrimination unit 93 uses the chord discriminator to determine, for each beat (beat range), from the root discrimination feature and the major/minor discrimination feature serving as the chord discrimination features, the chord that indicates the correct harmony of the beat range from among C major, C minor, C# major, C# minor, D major, D minor, D# major, D# minor, E major, E minor, F major, F minor, F# major, F# minor, G major, G minor, G# major, G# minor, A major, A minor, A# major, A# minor, B major, and B minor.
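The set of candidate chords listed above is the product of twelve roots and two qualities. A short Python sketch of that label space follows; the names `NOTE_NAMES`, `CHORD_LABELS`, and `chord_label` are hypothetical, and the sketch only shows how a root decision and a major/minor decision combine into a chord name, not how the chord discriminator makes those decisions.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Every root paired with both qualities, in the order used in the text.
CHORD_LABELS = [f"{root} {quality}"
                for root in NOTE_NAMES
                for quality in ("major", "minor")]


def chord_label(root_index: int, is_major: bool) -> str:
    """Combine a root (0 = C, ..., 11 = B) and a major/minor flag into a chord name."""
    quality = "major" if is_major else "minor"
    return f"{NOTE_NAMES[root_index % 12]} {quality}"
```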

In step S15, the chord discrimination unit 93 outputs the chords obtained by the discrimination as the chord of each beat, and the chord discrimination process ends. More precisely, the chord discrimination unit 93 outputs, as the chord of each beat, the chord name of that chord.

In this way, the signal processing device 52 applies preprocessing to the input time-pitch data, extracts the chord of each beat from the preprocessed time-pitch data, and outputs it.

By applying preprocessing to the input time-pitch data in this way, the components of the time-pitch data that are useful for feature extraction can be emphasized and the components that hinder feature extraction can be removed. As a result, the desired feature can be extracted more reliably from the preprocessed time-pitch data.

Next, the creation of the preprocessing filter and the chord discriminator will be described. In the feature extraction system shown in FIG. 3, the algorithm that extracts the chords of a piece of music as features from the time-pitch data is constructed by a human, but part of that algorithm, for example the preprocessing, is constructed by the filter generation device 51 using GP (Genetic Programming) or the like.

For example, the preprocessing filter used for the preprocessing is generated by genetically evolving combinations of unit filters using GP, and the chord discriminator corresponding to that preprocessing filter is generated by machine learning.

FIG. 10 is a block diagram showing a configuration example of the filter generation device 51, which creates the preprocessing filter and the chord discriminator.

The filter generation device 51 includes an initial-generation generation unit 121, a gene evaluation unit 122, and a next-generation generation unit 123.

The initial-generation generation unit 121 generates several preprocessing filters that are candidates for the preprocessing filter to be supplied to the preprocessing unit 61 of the signal processing device 52. In the following description, one preprocessing filter that is a candidate for the preprocessing filter supplied to the preprocessing unit 61 is referred to as a gene, and in particular a gene generated by the initial-generation generation unit 121 is referred to as an initial-generation gene. The initial-generation generation unit 121 combines unit filters to generate several initial-generation genes and supplies them to the gene evaluation unit 122.

The gene evaluation unit 122 evaluates the genes supplied from the initial-generation generation unit 121 and supplies the evaluation results to the next-generation generation unit 123. The gene evaluation unit 122 also evaluates the next-generation genes supplied from the next-generation generation unit 123, that is, the genes of the generation following the generation that was evaluated last, and supplies the evaluation results to the next-generation generation unit 123. The gene evaluation unit 122 repeats this evaluation over several generations of genes to select the final preprocessing filter, and supplies the selected preprocessing filter and the chord discriminator corresponding to it to the signal processing device 52.

Here, next-generation genes are genes generated using, as a basis, the genes currently subject to evaluation in the gene evaluation unit 122. Thus, for example, the next-generation genes relative to the initial generation, that is, the genes of the generation following the initial generation, are the genes that are generated using the initial-generation genes and that are evaluated after the initial-generation genes.

The gene evaluation unit 122 includes a teacher data holding unit 141, an analysis unit 142, a preprocessing unit 143, a teacher data dividing unit 144, a chord discriminator learning unit 145, a chord discrimination unit 146, and an evaluation unit 147.

The teacher data holding unit 141 holds, as teacher data, audio data of pieces of music used for creating chord discriminators and evaluating genes. The teacher data holding unit 141 supplies the teacher data it holds to the analysis unit 142 and the teacher data dividing unit 144.

The analysis unit 142 applies twelve-tone analysis to the teacher data supplied from the teacher data holding unit 141 and supplies the resulting time-pitch data to the preprocessing unit 143. For each gene supplied from the initial-generation generation unit 121 or the next-generation generation unit 123, the preprocessing unit 143 applies preprocessing to each item of time-pitch data supplied from the analysis unit 142, using the preprocessing filter that the gene represents, and supplies the result to the chord discriminator learning unit 145 and the chord discrimination unit 146. Thus, for example, when gene G11 and gene G12 are supplied to the preprocessing unit 143, the preprocessing unit 143 preprocesses every item of time-pitch data supplied from the analysis unit 142 using gene G11 and supplies the result to the chord discriminator learning unit 145 and the chord discrimination unit 146, and likewise preprocesses every item of time-pitch data supplied from the analysis unit 142 using gene G12 and supplies the result to the chord discriminator learning unit 145 and the chord discrimination unit 146.

The teacher data dividing unit 144 randomly divides the teacher data supplied from the teacher data holding unit 141 into learning data, which is teacher data used for training the chord discriminator, and evaluation data, which is teacher data used for evaluating genes. The teacher data dividing unit 144 then supplies the teacher data designated as learning data to the chord discriminator learning unit 145 and the teacher data designated as evaluation data to the evaluation unit 147.
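The random division into learning data and evaluation data can be sketched in a few lines of Python. This is an illustration only: the function name `split_teacher_data`, the `learn_ratio` parameter, and the optional `seed` are assumptions, since the description states that the division is random but does not specify the proportion.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_teacher_data(items: Sequence[T], learn_ratio: float = 0.5,
                       seed: Optional[int] = None) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the teacher data and cut it into (learning, evaluation) parts."""
    rng = random.Random(seed)       # local generator; the global random state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * learn_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Every item ends up in exactly one of the two parts, so the evaluation data is never seen while the chord discriminator is being trained.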

The chord discriminator learning unit 145 creates, for each gene, the chord discriminator corresponding to that gene by machine learning from the time-pitch data supplied from the preprocessing unit 143 and the learning data supplied from the teacher data dividing unit 144, and supplies the chord discriminator to the chord discrimination unit 146 and the evaluation unit 147.

The chord discrimination unit 146 determines the chord of each beat from the time-pitch data supplied from the preprocessing unit 143, using the chord discriminators supplied from the chord discriminator learning unit 145, and supplies the results to the evaluation unit 147. That is, for time-pitch data that was preprocessed using a given gene, the chord discrimination unit 146 determines the chords using the chord discriminator that was created from the time-pitch data preprocessed with that same gene.

The evaluation unit 147 obtains the chord discrimination accuracy using the evaluation data supplied from the teacher data dividing unit 144 and the chord discrimination results supplied from the chord discrimination unit 146, thereby evaluating each gene, and supplies the evaluation results to the next-generation generation unit 123. That is, for the audio data of a piece of music used as teacher data, the correct chord name, which is the name of the correct chord for each beat of the piece, is known in advance. The evaluation unit 147 therefore evaluates a gene by comparing the chords determined by the chord discriminator for the audio data corresponding to the evaluation data with the correct chord names known in advance.
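One simple way to turn the comparison described above into an evaluation value is the fraction of beats whose determined chord name matches the correct chord name. The Python sketch below assumes that definition; the description says only that the evaluation value indicates discrimination accuracy, and the function name `chord_accuracy` is hypothetical.

```python
from typing import Sequence


def chord_accuracy(predicted: Sequence[str], correct: Sequence[str]) -> float:
    """Fraction of beats for which the predicted chord name equals the correct one."""
    if len(predicted) != len(correct):
        raise ValueError("one predicted chord is required per labelled beat")
    if not correct:
        raise ValueError("at least one beat is required")
    hits = sum(p == c for p, c in zip(predicted, correct))
    return hits / len(correct)
```

Under this assumption, a gene whose chord discriminator labels three of four evaluation beats correctly would receive an evaluation value of 0.75.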

The evaluation unit 147 repeatedly evaluates several generations of genes. When the highest of the evaluation values for the generation evaluated last, that is, the evaluation value of the most highly evaluated gene, satisfies a predetermined condition, for example being equal to or greater than a predetermined threshold, the evaluation unit 147 supplies the gene with the highest evaluation value and the chord discriminator corresponding to that gene to the preprocessing unit 61 and the chord discrimination unit 93 of the signal processing device 52, respectively.

Using the genes subject to evaluation, that is, the initial-generation genes supplied from the initial-generation generation unit 121 or the genes generated by the next-generation generation unit 123 itself, the next-generation generation unit 123 generates next-generation genes on the basis of the evaluation results for those genes supplied from the evaluation unit 147, and supplies them to the preprocessing unit 143 and the evaluation unit 147.

The next-generation generation unit 123 includes a selection unit 151, a mutation processing unit 152, a crossover processing unit 153, and a random generation unit 154.

On the basis of the evaluation results for the genes of the generation last evaluated by the evaluation unit 147, the selection unit 151 of the next-generation generation unit 123 selects several highly evaluated genes from that generation and makes the selected genes next-generation genes.

On the basis of the evaluation results for the genes of the generation last evaluated by the evaluation unit 147, the mutation processing unit 152 applies mutation to several highly evaluated genes to generate next-generation genes. That is, for each of several highly evaluated genes, the mutation processing unit 152 generates a new gene by replacing part of the gene, namely some of the unit filters constituting it, with other randomly selected unit filters, and makes the generated gene a next-generation gene.

On the basis of the evaluation results for the genes of the generation last evaluated by the evaluation unit 147, the crossover processing unit 153 applies crossover to several highly evaluated genes to generate next-generation genes. That is, for each of several highly evaluated genes, the crossover processing unit 153 generates a new gene by replacing part of the gene, namely some of the unit filters constituting it, with unit filters constituting another gene of the same generation, and makes the generated gene a next-generation gene.

The random generation unit 154 generates new genes (preprocessing filters) by combining randomly selected unit filters, and makes the generated genes next-generation genes. The next-generation generation unit 123 supplies the next-generation genes generated in this way by the selection unit 151 through the random generation unit 154 to the evaluation unit 147 and the preprocessing unit 143.
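The four ways of producing next-generation genes (selection, mutation, crossover, random generation) can be sketched in Python by treating a gene as a list of unit filter names. This is an illustrative sketch, not the implementation of the next-generation generation unit 123: the contents of `UNIT_FILTERS`, the counts `n_keep`, `n_mut`, `n_cross`, and `n_rand`, the replacement of exactly one unit filter per mutation or crossover, and all function names are assumptions.

```python
import random
from typing import List, Tuple

Gene = List[str]  # a preprocessing filter as an ordered list of unit filter names

# Hypothetical pool of unit filters, named after the examples in FIG. 8A and FIG. 8B.
UNIT_FILTERS = ["Sqr", "Abs", "Time#LPF,0.5", "Freq#HPF,0.3",
                "Time#Differential", "Freq#Order"]


def select(scored: List[Tuple[Gene, float]], k: int) -> List[Gene]:
    """Selection: keep the k genes with the highest evaluation values."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [list(gene) for gene, _ in ranked[:k]]


def mutate(gene: Gene, rng: random.Random) -> Gene:
    """Mutation: replace one unit filter with a different, randomly chosen one."""
    child = list(gene)
    i = rng.randrange(len(child))
    child[i] = rng.choice([f for f in UNIT_FILTERS if f != child[i]])
    return child


def crossover(gene: Gene, other: Gene, rng: random.Random) -> Gene:
    """Crossover: replace one unit filter with one taken from another gene."""
    child = list(gene)
    child[rng.randrange(len(child))] = rng.choice(other)
    return child


def random_gene(rng: random.Random, max_len: int = 4) -> Gene:
    """Random generation: combine randomly chosen unit filters."""
    return [rng.choice(UNIT_FILTERS) for _ in range(rng.randint(1, max_len))]


def next_generation(scored: List[Tuple[Gene, float]], rng: random.Random,
                    n_keep: int = 2, n_mut: int = 2,
                    n_cross: int = 2, n_rand: int = 2) -> List[Gene]:
    """Build the next generation from the scored genes of the current one."""
    elite = select(scored, n_keep)
    population = [gene for gene, _ in scored]
    nxt = [list(g) for g in elite]
    nxt += [mutate(rng.choice(elite), rng) for _ in range(n_mut)]
    nxt += [crossover(rng.choice(elite), rng.choice(population), rng)
            for _ in range(n_cross)]
    nxt += [random_gene(rng) for _ in range(n_rand)]
    return nxt
```

Because every operator returns a list of unit filter names, any gene it produces is again a valid preprocessing filter that can be evaluated in the same way as its parents.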

FIG. 11 is a block diagram showing a more detailed configuration example of the chord discriminator learning unit 145 of FIG. 10. The chord discriminator learning unit 145 includes a per-beat feature extraction unit 181, an adding unit 182, a shift processing unit 183, and a chord discriminator generation unit 184.

The per-beat feature extraction unit 181 detects beats in the time-pitch data supplied from the preprocessing unit 143, extracts the chord discrimination features of the audio for each beat, and supplies them to the adding unit 182. The adding unit 182 adds the correct chord names known in advance and the chord discrimination features supplied from the per-beat feature extraction unit 181 to the teacher data supplied as learning data from the teacher data dividing unit 144, and supplies the result to the shift processing unit 183.

The shift processing unit 183 shifts the chord discrimination features and the correct chord names added to the teacher data supplied from the adding unit 182 by one note at a time, adds the shifted versions to the teacher data, and supplies the result to the chord discriminator generation unit 184. The chord discriminator generation unit 184 creates the chord discriminator corresponding to each gene by machine learning from the teacher data supplied as learning data from the shift processing unit 183, and supplies it to the chord discrimination unit 146 and the evaluation unit 147.
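One plausible reading of the shift processing is that each labelled example is transposed by every number of semitones, so that one observed chord yields training examples for all twelve roots. The Python sketch below follows that reading and should be taken as an assumption rather than the method of the shift processing unit 183; the function name `shifted_examples` and the tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[List[float], int, bool]  # (12-note feature, root index 0..11, is_major)


def shifted_examples(feature: Sequence[float], root_index: int,
                     is_major: bool) -> List[Example]:
    """Transpose one labelled example by 0..11 semitones.

    Shifting by s moves the energy of note k to note (k + s) % 12 and moves the
    root by the same amount, so the feature and its label stay consistent.
    """
    examples: List[Example] = []
    for s in range(12):
        rotated = [feature[(k - s) % 12] for k in range(12)]
        examples.append((rotated, (root_index + s) % 12, is_major))
    return examples
```

Under this reading, a single beat labelled C major also provides one training example each for C# major, D major, and so on, while the major or minor quality is unchanged.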

FIG. 12 is a block diagram showing a more detailed configuration example of the chord discrimination unit 146 of FIG. 10. The chord discrimination unit 146 includes a per-beat feature extraction unit 211 and a discrimination unit 212.

The per-beat feature extraction unit 211 detects beats in the time-pitch data supplied from the preprocessing unit 143, extracts the chord discrimination features of the audio for each beat, and supplies them to the discrimination unit 212. Using the chord discriminator supplied from the chord discriminator generation unit 184 of the chord discriminator learning unit 145, the discrimination unit 212 determines the chord of each beat of the audio data from the chord discrimination features supplied from the per-beat feature extraction unit 211, and supplies the result to the evaluation unit 147. Here, the discrimination unit 212 determines the chords from the chord discrimination features that were extracted from time-pitch data preprocessed with a given gene, together with the chord discriminator corresponding to that gene.

Next, the preprocessing filter generation process, in which the filter generation device 51 generates a preprocessing filter, will be described with reference to the flowchart of FIG. 13.

In step S41, the initial-generation generation unit 121 generates initial-generation genes and supplies them to the preprocessing unit 143 and the next-generation generation unit 123. That is, the initial-generation generation unit 121 combines one or more unit filters to generate several candidate preprocessing filters as genes, and makes each of the generated genes an initial-generation gene.

In step S42, the filter generation device 51 performs an evaluation process on each of the genes subject to evaluation. The evaluation process is described in detail later; in it, for each gene of the generation under evaluation, an evaluation value is obtained that indicates the chord discrimination accuracy achieved when that gene is used as the preprocessing filter.

When the evaluation process has been performed and the evaluation value of each gene has been obtained, in step S43 the evaluation unit 147 determines whether the evaluation value has stopped changing. For example, the evaluation unit 147 obtains the difference between the highest of the current evaluation values and the highest of the evaluation values obtained the previous time, that is, for the genes of the previous generation. If the difference obtained this time, the difference obtained the previous time, and the difference obtained the time before that are each equal to or less than a predetermined threshold, that is, if the evaluation value has almost ceased to change, the evaluation unit 147 determines that the evaluation value has stopped changing.

Alternatively, when the highest evaluation value among the genes is greater than or equal to a predetermined threshold, the evaluation value may be determined to have stopped changing, on the grounds that a gene capable of extracting chords with sufficient accuracy has been obtained.
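The two stopping criteria of step S43 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `has_converged`, the parameter names, and the default threshold are assumptions, and the history is assumed to hold the best evaluation value of each generation in order.

```python
def has_converged(best_history, diff_threshold=1e-3, target=None):
    """Decide whether the search may stop (step S43).

    best_history: best evaluation value of each generation, oldest first.
    diff_threshold: assumed threshold for "almost no change".
    target: optional absolute threshold for the best evaluation value.
    """
    # Alternative criterion: the best value already reaches the target.
    if target is not None and best_history and best_history[-1] >= target:
        return True
    # Three consecutive differences need four consecutive best values.
    if len(best_history) < 4:
        return False
    recent = best_history[-4:]
    diffs = [abs(later - earlier) for earlier, later in zip(recent, recent[1:])]
    return all(d <= diff_threshold for d in diffs)
```

Requiring three small differences in a row, rather than one, keeps a single flat generation from ending the search prematurely.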

If it is determined in step S43 that the evaluation value is still changing, that is, that a gene capable of extracting chords with sufficient accuracy has not yet been obtained, genes of the next generation are generated. The evaluation unit 147 therefore supplies the evaluation value of each gene to the next-generation generation unit 123, and the process proceeds to step S44.

In step S44, the selection unit 151 of the next-generation generation unit 123 selects, based on the evaluation values supplied from the evaluation unit 147, several of the evaluated genes that have high evaluation values, and makes the selected genes genes of the next generation.

For example, if the most recently evaluated genes are those of the initial generation, the selection unit 151 selects several genes with high evaluation values from the initial-generation genes supplied from the initial-generation generation unit 121 and makes them genes of the next generation. If the most recently evaluated genes are not those of the initial generation, that is, if they were generated by the next-generation generation unit 123, the selection unit 151 selects several genes with high evaluation values from among the genes generated last time by the selection unit 151 through the random generation unit 154 and makes them genes of the next generation.

In step S45, the mutation processing unit 152 applies mutation processing to several genes that had high evaluation values, based on the evaluation values supplied from the evaluation unit 147, to generate genes of the next generation. That is, for each of several genes with high evaluation values, the mutation processing unit 152 generates a new gene by replacing some of the unit filters constituting the gene with other, randomly selected unit filters, and makes it a gene of the next generation.

In step S46, the crossover processing unit 153 applies crossover processing to several genes that had high evaluation values, based on the evaluation values supplied from the evaluation unit 147, to generate genes of the next generation. That is, for each of several genes with high evaluation values, the crossover processing unit 153 generates a new gene by replacing some of the unit filters constituting the gene with unit filters constituting other genes of the same generation, and makes it a gene of the next generation.

In step S47, the random generation unit 154 generates new genes (candidate preprocessing filters) by combining randomly selected unit filters, and makes them genes of the next generation.

As in step S44, steps S45 and S46 use the genes that were evaluated most recently. If those are the genes of the initial generation, the next-generation genes are generated from the initial-generation genes supplied from the initial-generation generation unit 121. If they are genes generated by the next-generation generation unit 123, that is, by the selection unit 151 through the random generation unit 154, the next-generation genes are generated from those genes.

When the selection unit 151 through the random generation unit 154 have generated the next-generation genes in this way, the next-generation generation unit 123 supplies them to the evaluation unit 147 and the preprocessing unit 143. The process then returns to step S42, and each newly generated next-generation gene is evaluated.
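Steps S44 through S47 can be sketched as below. This is a simplified illustration under several assumptions: a gene is modeled as a list of unit-filter names, the names in `UNIT_FILTERS` are hypothetical, exactly one unit filter is changed per mutation or crossover, and the number of genes produced by each operation is an arbitrary parameter.

```python
import random

# Hypothetical unit-filter names; the actual unit filters are not listed here.
UNIT_FILTERS = ["lowpass", "highpass", "normalize", "smooth", "peak_hold"]

def random_gene(rng, max_len=4):
    """S47: combine randomly selected unit filters into a new gene."""
    return [rng.choice(UNIT_FILTERS) for _ in range(rng.randint(1, max_len))]

def mutate(gene, rng):
    """S45: replace one unit filter with a randomly chosen unit filter."""
    child = list(gene)
    child[rng.randrange(len(child))] = rng.choice(UNIT_FILTERS)
    return child

def crossover(gene, other, rng):
    """S46: replace one unit filter with one taken from another gene."""
    child = list(gene)
    child[rng.randrange(len(child))] = rng.choice(other)
    return child

def next_generation(genes, scores, rng, n_select=2, n_mutate=2, n_cross=2, n_random=2):
    """Build the next generation from evaluated genes and their scores."""
    order = sorted(range(len(genes)), key=lambda i: scores[i], reverse=True)
    ranked = [genes[i] for i in order]
    top = ranked[:n_select]
    new_genes = [list(g) for g in top]                                    # S44
    new_genes += [mutate(rng.choice(top), rng) for _ in range(n_mutate)]  # S45
    new_genes += [crossover(rng.choice(top), rng.choice(ranked), rng)
                  for _ in range(n_cross)]                                # S46
    new_genes += [random_gene(rng) for _ in range(n_random)]              # S47
    return new_genes

rng = random.Random(0)
genes = [random_gene(rng) for _ in range(6)]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.5]
new_genes = next_generation(genes, scores, rng)
```

In this sketch a mutation may happen to pick the same unit filter again, and a crossover partner may be the gene itself; a stricter version would exclude both cases.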

If it is determined in step S43 that the evaluation value has stopped changing, that is, that a gene capable of extracting chords with sufficient accuracy has been obtained, then in step S48 the evaluation unit 147 selects the gene with the highest evaluation value from among the most recently evaluated genes, and outputs the selected gene, that is, the preprocessing filter, together with the chord discriminator corresponding to that preprocessing filter.

That is, from among the genes of the last evaluated generation supplied from the next-generation generation unit 123, the evaluation unit 147 supplies the selected gene with the highest evaluation value (the preprocessing filter) to the preprocessing unit 61. From among the chord discriminators supplied from the chord discriminator learning unit 145 for the genes of the last evaluated generation, it also supplies to the chord extraction unit 62 the chord discriminator for discriminating chords from time-pitch data preprocessed with the selected gene, that is, the chord discriminator created from time-pitch data preprocessed with the selected gene. Once the selected preprocessing filter and chord discriminator have been output, the preprocessing filter generation processing ends.

In this way, the filter generation device 51 creates a preprocessing filter and a chord discriminator. An algorithm that extracts the chords of a piece of music from time-pitch data is constructed by a human, and the preprocessing filter and chord discriminator used in that algorithm are generated by the filter generation device 51. The result is an algorithm that can extract the chords of a piece of music accurately.

That is, by treating candidate preprocessing filters as genes, repeatedly generating and evaluating genes of the next generation, and adopting a gene with a high evaluation value as the preprocessing filter, it is possible to generate a preprocessing filter and a chord discriminator that extract the chords of a piece of music more accurately. Using the generated preprocessing filter and chord discriminator in an algorithm that was constructed from a small amount of teacher data by drawing on human prior knowledge improves the accuracy of chord extraction by the algorithm. As a result, an algorithm that extracts features more reliably can be obtained easily.

In the feature extraction system, the preprocessing filter and the chord discriminator used in the human-constructed algorithm are generated by the filter generation device 51. The filter generation device 51 can therefore be said to construct a highly accurate algorithm semi-automatically.

Next, the evaluation processing corresponding to step S42 of FIG. 13 is described with reference to the flowchart of FIG. 14. This evaluation processing is performed for each gene to be evaluated.

In step S81, the analysis unit 142 applies 12-tone analysis processing to the audio data that is the teacher data supplied from the teacher data holding unit 141, and supplies the resulting time-pitch data to the preprocessing unit 143. That is, the analysis unit 142 divides the sound represented by the audio data into components of a plurality of octaves and, in each octave, determines the energy of each of the 12 pitches of 12-tone equal temperament, thereby obtaining time-pitch data that indicates the energy of each of the 12 notes in each octave.

In step S82, the preprocessing unit 143 preprocesses the time-pitch data supplied from the analysis unit 142 using the genes supplied from the initial-generation generation unit 121 or the next-generation generation unit 123. The preprocessing unit 143 then supplies the preprocessed time-pitch data to the beat-by-beat feature quantity extraction unit 181 of the chord discriminator learning unit 145 and to the beat-by-beat feature quantity extraction unit 211 of the chord discrimination unit 146.

For example, when the genes of the initial generation are evaluated, the initial-generation generation unit 121 supplies those genes to the preprocessing unit 143, which preprocesses the data using them. When genes of a later generation, that is, genes generated by the next-generation generation unit 123, are evaluated, the next-generation generation unit 123 supplies the genes of the new generation to be evaluated to the preprocessing unit 143, which preprocesses the data using them.

More specifically, the preprocessing unit 143 preprocesses every item of time-pitch data supplied from the analysis unit 142 with each of the genes. For example, when gene G41 and gene G42 are supplied to the preprocessing unit 143, it preprocesses all of the time-pitch data using gene G41 and also using gene G42. Each item of time-pitch data therefore yields one version preprocessed with gene G41 and one version preprocessed with gene G42.
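The pairing of every gene with every item of time-pitch data can be sketched as follows. The sketch assumes that a gene is an ordered list of unit filters applied one after another, which the text does not state explicitly; the unit filters `halve` and `clip_low`, the flat-list data layout, and the function names are all hypothetical.

```python
def apply_gene(gene, data):
    """Apply the unit filters of one gene in order (assumed sequential composition)."""
    for unit_filter in gene:
        data = unit_filter(data)
    return data

def preprocess_all(genes, datasets):
    """Return {(gene_name, data_name): preprocessed data} for every pairing."""
    return {
        (gene_name, data_name): apply_gene(gene, data)
        for gene_name, gene in genes.items()
        for data_name, data in datasets.items()
    }

# Hypothetical unit filters acting on a flat list of energies.
def halve(xs):
    return [x / 2 for x in xs]

def clip_low(xs):
    return [x if x >= 1.0 else 0.0 for x in xs]

genes = {"G41": [halve], "G42": [halve, clip_low]}
datasets = {"D1": [4.0, 1.0], "D2": [2.0, 6.0]}
result = preprocess_all(genes, datasets)
```

With two genes and two data items, `result` holds four preprocessed versions, one per (gene, data) pair.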

In step S83, the teacher data dividing unit 144 divides the teacher data supplied from the teacher data holding unit 141 into learning data and evaluation data. That is, for each item of teacher data supplied from the teacher data holding unit 141, the teacher data dividing unit 144 randomly chooses whether to use it as learning data or as evaluation data.

The teacher data dividing unit 144 then supplies the teacher data designated as learning data to the adding unit 182 of the chord discriminator learning unit 145, and supplies the teacher data designated as evaluation data to the evaluation unit 147.
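The random division of step S83 can be sketched as below. The text only says that each item is assigned at random; the proportion `eval_ratio`, the `seed` parameter, and the function name are assumptions added for illustration.

```python
import random

def split_teacher_data(teacher_data, eval_ratio=0.3, seed=None):
    """Randomly assign each teacher data item to learning or evaluation data.

    eval_ratio is an assumed probability of assigning an item to evaluation data.
    """
    rng = random.Random(seed)
    learning, evaluation = [], []
    for item in teacher_data:
        if rng.random() < eval_ratio:
            evaluation.append(item)
        else:
            learning.append(item)
    return learning, evaluation

learning, evaluation = split_teacher_data(list(range(100)), seed=1)
```

Because each item is decided independently, the sizes of the two sets vary from run to run; every item still ends up in exactly one of them.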

In step S84, the beat-by-beat feature quantity extraction unit 181 of the chord discriminator learning unit 145 extracts chord discrimination feature quantities from the time-pitch data supplied from the preprocessing unit 143, and supplies them to the adding unit 182.

For example, the beat-by-beat feature quantity extraction unit 181 performs the same processing as the beat detection unit 91: it obtains the basic beat period and thereby detects the beat positions in the time-pitch data. It then performs the same processing as the beat-by-beat feature quantity extraction unit 92. For each beat range of the time-pitch data, it weights the energy of each of the 12 equal-temperament pitches in each octave and adds together the energies of the notes that share a note name across octaves, obtaining the energy of each of the 12 notes identified by note name. In this way it generates a root discrimination feature quantity and a major/minor discrimination feature quantity, which together serve as the chord discrimination feature quantities.
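The weighting and octave folding described above can be sketched as follows for a single beat range. The data layout (one list of 12 energies per octave, ordered C through B) and the function name are assumptions, and the weights are placeholders. How the weights differ between the root discrimination and major/minor discrimination feature quantities is not shown here, so the sketch computes one weighted fold.

```python
def fold_octaves(energies, weights):
    """Fold per-octave energies into 12 pitch-class energies.

    energies: list of octaves, each a list of 12 energies (C..B) for one beat range.
    weights:  same shape as energies; placeholder weighting values.
    """
    folded = [0.0] * 12
    for octave_energy, octave_weight in zip(energies, weights):
        for pitch_class in range(12):
            # Weight each note, then add notes with the same name across octaves.
            folded[pitch_class] += octave_weight[pitch_class] * octave_energy[pitch_class]
    return folded

energies = [[1.0] * 12, [2.0] * 12]   # two octaves
weights = [[1.0] * 12, [0.5] * 12]    # assumed weights
feature = fold_octaves(energies, weights)
```

The result always has 12 elements regardless of how many octaves the time-pitch data covers.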

In step S85, the adding unit 182 adds, to the teacher data serving as learning data, the chord discrimination feature quantities supplied from the beat-by-beat feature quantity extraction unit 181 and the correct chord name for each beat, which is known in advance for the learning data.

For example, as shown in FIG. 15, the adding unit 182 is informed in advance of the chord progression of each piece of music reproduced from the audio data serving as teacher data. FIG. 15 shows pieces 1 through 3, which are reproduced from the audio data serving as teacher data, together with their chord progressions. In the figure, a chord name further to the left is the correct chord name for a beat range closer to the start of the piece.

In the chord progression of piece 1, the chord for the first beat range is C, followed in order by E minor, A minor, C, F, E minor, D minor, and G.

In the chord progression of piece 2, the chord for the first beat range is D minor, followed in order by B♭ (B flat), C, A minor, B♭, F, G, and C. In the chord progression of piece 3, the chord for the first beat range is G, followed in order by D, F, C, D, G, D, F, C#, and E♭.

The adding unit 182 is also informed in advance, for each piece, of the start time and end time of each chord in the piece, that is, the start time and end time of each beat range. For example, the adding unit 182 is informed that the first chord of piece 1 (the chord named C) starts at the start time of piece 1 and ends 13 seconds after that start time.

To each supplied item of teacher data serving as learning data, the adding unit 182 adds the chord discrimination feature quantities of that teacher data supplied from the beat-by-beat feature quantity extraction unit 181, together with the correct chord names known in advance for that teacher data. The adding unit 182 performs this addition for each chord discrimination feature quantity extracted from teacher data serving as learning data. Teacher data designated as evaluation data is not supplied to the adding unit 182, so chord discrimination feature quantities extracted from evaluation data, although supplied to the adding unit 182, are not added to any teacher data.

For example, suppose that teacher data D1 and teacher data D2 are supplied to the adding unit 182 as learning data, together with the following chord discrimination feature quantities: F11, extracted from teacher data D1 preprocessed with gene G41 (more precisely, from the time-pitch data obtained from teacher data D1); F12, extracted from teacher data D2 preprocessed with gene G41; F21, extracted from teacher data D1 preprocessed with gene G42; and F22, extracted from teacher data D2 preprocessed with gene G42.

In this case, the adding unit 182 supplies the following to the shift processing unit 183: teacher data D1 with feature quantity F11 and the correct chord names added, teacher data D2 with feature quantity F12 and the correct chord names added, teacher data D1 with feature quantity F21 and the correct chord names added, and teacher data D2 with feature quantity F22 and the correct chord names added.

Returning to the flowchart of FIG. 14, once the chord discrimination feature quantities and correct chord names have been added to the learning data, in step S86 the shift processing unit 183 shifts, for each item of teacher data supplied from the adding unit 182, the per-beat chord discrimination feature quantities and the correct chord names by one note.

In step S87, the shift processing unit 183 adds the chord discrimination feature quantities and correct chord names shifted by one note to the teacher data.

In step S88, the shift processing unit 183 determines whether the processing of steps S86 and S87, that is, shifting the per-beat chord discrimination feature quantities and correct chord names by one note and adding them to the teacher data, has been repeated 11 times.

For example, as shown in FIG. 16, suppose that the correct chord name, that is, the name of the correct chord for the beat to which a per-beat chord discrimination feature quantity corresponds, is D. In that case the adding unit 182 adds to the teacher data, together with the correct chord name D, a root discrimination feature quantity and a major/minor discrimination feature quantity in which the data indicating the energies of the notes named C, C#, D, D#, E, F, F#, G, G#, A, A#, and B are arranged in that order.

The shift processing unit 183 then shifts the arrangement of the energy data in the root discrimination feature quantity and the major/minor discrimination feature quantity so that the data for the notes named C#, D, D#, E, F, F#, G, G#, A, A#, B, and C are arranged in that order, and shifts the correct chord name to C#. It adds the root discrimination feature quantity and major/minor discrimination feature quantity with this arrangement to the teacher data, together with the correct chord name C#.

The shift processing unit 183 then shifts the arrangement of the energy data in the root discrimination feature quantity and the major/minor discrimination feature quantity further, so that the data for the notes named D, D#, E, F, F#, G, G#, A, A#, B, C, and C# are arranged in that order, and shifts the correct chord name to D. It adds the root discrimination feature quantity and major/minor discrimination feature quantity with this arrangement to the teacher data, together with the correct chord name D.

The shift of the arrangement of the energy data in the root discrimination feature quantity and the major/minor discrimination feature quantity is repeated 11 times in this way. As a result, 12 data items are added to the teacher data from one root discrimination feature quantity, and 12 data items are added from one major/minor discrimination feature quantity.
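The shift of steps S86 through S88 can be sketched as a rotation of a 12-element feature vector. The sketch assumes that each shift moves the first element to the end and lowers the chord root by one semitone, which matches the first shift described above (D becomes C#). It handles only the root note name, not chord quality such as minor, and the function names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def shift_once(feature, root):
    """Rotate the 12 energies by one note and shift the root name to match."""
    shifted = feature[1:] + feature[:1]
    # Assumption: rotating the data by one position lowers the root by a semitone.
    new_root = NOTE_NAMES[(NOTE_NAMES.index(root) - 1) % 12]
    return shifted, new_root

def augment(feature, root):
    """Return the original sample plus its 11 shifted versions (12 in total)."""
    samples = [(list(feature), root)]
    for _ in range(11):
        feature, root = shift_once(feature, root)
        samples.append((feature, root))
    return samples

samples = augment(list(range(12)), "D")
```

One labeled beat thus yields 12 training samples, so that the discriminator sees every root even when the teacher data covers only a few keys.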

The adding unit 182 and the shift processing unit 183 add the chord discrimination feature quantities and correct chord names to the teacher data serving as learning data for every beat range of every supplied item of learning data.

Returning to the flowchart of FIG. 14, if it is determined in step S88 that the processing has not yet been repeated 11 times, the process returns to step S86, and the processing described above is repeated until it is determined to have been repeated 11 times.

If, on the other hand, it is determined in step S88 that the processing has been repeated 11 times, the shift processing unit 183 supplies the teacher data, with the chord discrimination feature quantities and correct chord names added, to the chord discriminator generation unit 184, and the process proceeds to step S89.

In step S89, the chord discriminator generation unit 184 generates a chord discriminator for each gene by machine learning, using the teacher data supplied from the shift processing unit 183, and supplies the generated chord discriminators to the discrimination unit 212 of the chord discrimination unit 146 and to the evaluation unit 147.

That is, from the supplied teacher data, the chord discriminator generation unit 184 takes the teacher data to which chord discrimination feature quantities extracted from time-pitch data preprocessed with the same gene have been added, and uses it to generate the chord discriminator corresponding to that gene, that is, a chord discriminator for extracting the correct chords from time-pitch data preprocessed with that gene.

For example, the chord discriminator generation unit 184 generates the chord discriminator by machine learning using kNN (k-Nearest Neighbor), SVM (Support Vector Machine), Naive Bayes, the Mahalanobis distance with the nearest chord taken as the answer, or a GMM (Gaussian Mixture Model) with the most probable chord taken as the answer.
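Of the listed methods, kNN is the simplest to illustrate. The following is a minimal sketch, assuming Euclidean distance, majority voting, and `k=3`; none of these choices is specified in the text, and the toy two-dimensional features stand in for real chord discrimination feature quantities.

```python
import math
from collections import Counter

def knn_predict(train, feature, k=3):
    """Predict a chord name by majority vote among the k nearest training samples.

    train: list of (feature_vector, chord_name) pairs.
    """
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], feature))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set with hypothetical 2-dimensional features.
train = [([1.0, 0.0], "C"), ([0.9, 0.1], "C"), ([0.0, 1.0], "G"), ([0.1, 0.9], "G")]
predicted = knn_predict(train, [0.8, 0.2])
```

In the setting described above, one such discriminator would be built per gene, from the teacher data preprocessed with that gene.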

In step S90, the beat-by-beat feature quantity extraction unit 211 of the chord discrimination unit 146 extracts chord discrimination feature quantities from the time-pitch data supplied from the preprocessing unit 143, and supplies them to the discrimination unit 212.

For example, the beat-by-beat feature quantity extraction unit 211 performs the same processing as the beat-by-beat feature quantity extraction unit 181. It obtains the basic beat period and thereby detects the beat positions in the time-pitch data. For each beat range of the time-pitch data, it then weights the energy of each of the 12 equal-temperament pitches in each octave and adds together the energies of the notes that share a note name across octaves, obtaining the energy of each of the 12 notes identified by note name. In this way it generates a root discrimination feature quantity and a major/minor discrimination feature quantity, which together serve as the chord discrimination feature quantities.

In step S91, the discrimination unit 212 uses the chord discriminator supplied from the chord discriminator generation unit 184 to discriminate the chord of each beat from the chord discrimination feature quantities supplied from the beat-by-beat feature quantity extraction unit 211.

For example, for chord discrimination feature quantities extracted from time-pitch data preprocessed with a given gene, the discrimination unit 212 discriminates the chord of each beat using the chord discriminator corresponding to that gene.

In step S92, the beat-by-beat feature quantity extraction unit 211 determines whether chords have been discriminated for all of the time-pitch data supplied from the preprocessing unit 143.

If it is determined in step S92 that chords have not yet been discriminated for all of the time-pitch data, the process returns to step S90, and the processing described above is repeated.

If, on the other hand, it is determined in step S92 that chords have been discriminated for all of the time-pitch data, the discrimination unit 212 supplies the per-beat chord names obtained by the discrimination to the evaluation unit 147, and the process proceeds to step S93.

In step S93, the evaluation unit 147 selects one of the chord names supplied from the discrimination unit 212 and determines whether that chord name was discriminated correctly.

The discrimination unit 212 supplies the evaluation unit 147 with chord names extracted from teacher data designated as learning data and chord names extracted from teacher data designated as evaluation data. Of these, the evaluation unit 147 uses the chord names extracted from the evaluation data to evaluate each gene.

For example, to evaluate a given gene, the evaluation unit 147 uses, from among the supplied chord names, those extracted from time-pitch data preprocessed with that gene. The evaluation unit 147 is informed in advance of the correct chords of the pieces in the teacher data serving as evaluation data. It therefore selects one chord name from among the chord names supplied from the discrimination unit 212 for evaluating the given gene, compares the selected chord name with the corresponding correct chord name, and determines whether the chord was discriminated correctly.

If it is determined in step S93 that the chord name was correctly discriminated, in step S94 the evaluation unit 147 increments by one the number of correct chord discriminations that it holds.

For example, for each gene, the evaluation unit 147 holds a correct-answer count, which indicates how many of the chord names supplied from the discrimination unit 212 and used to evaluate that gene were correctly discriminated, and an incorrect-answer count, which indicates how many were not. If it is determined in step S93 that the selected chord name was correctly discriminated, the evaluation unit 147 increments the correct-answer count that it holds.

On the other hand, if it is determined in step S93 that the chord name was not correctly discriminated, in step S95 the evaluation unit 147 increments by one the number of incorrect answers that it holds.

When the number of correct answers is incremented in step S94 or the number of incorrect answers is incremented in step S95, in step S96 the evaluation unit 147 determines whether all chord names used to evaluate the gene currently under evaluation have been evaluated, that is, whether it has judged, for all evaluation data used to evaluate the gene, whether the chord names were correctly discriminated.

If it is determined in step S96 that not all chord names have been evaluated, the process returns to step S93, and the next chord name supplied from the discrimination unit 212 is selected and evaluated. Note that the processing of steps S93 through S95 is performed for every gene to be evaluated.

On the other hand, if it is determined in step S96 that all chord names have been evaluated, in step S97 the evaluation unit 147 calculates an evaluation value for each gene based on that gene's number of correct answers and number of incorrect answers, and the process proceeds to step S43 in FIG. 13. For example, the evaluation unit 147 calculates the chord discrimination accuracy rate from the number of correct answers and the number of incorrect answers for a gene and uses it as that gene's evaluation value.
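The counting in steps S93 through S97 can be sketched as follows. This is a minimal illustration rather than the implementation described in the specification: the function name `evaluate_gene`, the representation of per-beat chord names as lists of strings, and the sample chord labels are all assumptions made for the example.

```python
from typing import Sequence


def evaluate_gene(predicted: Sequence[str], correct: Sequence[str]) -> float:
    """Return the chord discrimination accuracy rate for one gene.

    `predicted` holds the per-beat chord names discriminated from the
    evaluation data preprocessed with the gene; `correct` holds the
    corresponding correct chord names (both representations are assumed).
    """
    if len(predicted) != len(correct):
        raise ValueError("predicted and correct must have the same length")
    if not predicted:
        raise ValueError("at least one chord name is required")

    num_correct = 0    # correct-answer count (step S94)
    num_incorrect = 0  # incorrect-answer count (step S95)
    for chord, answer in zip(predicted, correct):  # loop of steps S93 to S96
        if chord == answer:
            num_correct += 1
        else:
            num_incorrect += 1

    # Step S97: the accuracy rate serves as the gene's evaluation value.
    return num_correct / (num_correct + num_incorrect)


# Hypothetical example: three of four beats are discriminated correctly.
score = evaluate_gene(["C", "G", "Am", "F"], ["C", "G", "Am", "Dm"])
print(score)
```

Keeping both counts mirrors the description, although the accuracy rate could equally be computed from the correct-answer count and the total number of chord names.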

In this way, the filter generation device 51 evaluates each of the generated genes. Because evaluating the generated genes allows the next generation of genes to be generated from highly evaluated genes, a preprocessing filter and a chord discriminator that extract the chords of music more reliably can be generated as the generations advance.
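The generational loop implied here (evaluate the candidates, keep the highly evaluated ones, derive new candidates by selection, mutation, crossover, and random generation, and stop once the best evaluation value satisfies a predetermined condition) might look like the sketch below. Everything concrete in it is an assumption for illustration: the gene is modeled as a flat list of floats, the `evaluate` callable stands in for the whole preprocess, learn, discriminate, and score pipeline, and the proportions, mutation width, and threshold are arbitrary.

```python
import random
from typing import Callable, List, Tuple

Gene = List[float]  # assumed representation: a flat list of filter coefficients


def evolve(
    evaluate: Callable[[Gene], float],
    gene_length: int,
    population_size: int,
    threshold: float,
    max_generations: int,
    rng: random.Random,
) -> Tuple[Gene, float, int]:
    """Generate candidate genes until the best evaluation value meets `threshold`."""
    if population_size < 2 or gene_length < 1:
        raise ValueError("need at least two candidates and one coefficient")

    # Initial generation: random candidates.
    population = [
        [rng.random() for _ in range(gene_length)] for _ in range(population_size)
    ]
    generation = 0
    while True:
        # Evaluate every candidate and rank from best to worst.
        ranked = sorted(population, key=evaluate, reverse=True)
        best = ranked[0]
        best_score = evaluate(best)
        if best_score >= threshold or generation >= max_generations:
            return best, best_score, generation

        # Selection: carry the top candidates over unchanged.
        parents = ranked[: max(2, population_size // 4)]
        children = [list(p) for p in parents]
        while len(children) < population_size:
            choice = rng.random()
            if choice < 0.4:
                # Mutation: perturb one coefficient of a parent.
                child = list(rng.choice(parents))
                child[rng.randrange(gene_length)] += rng.gauss(0.0, 0.1)
            elif choice < 0.8:
                # Crossover: splice two parents at a random point.
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, gene_length) if gene_length > 1 else 0
                child = a[:cut] + b[cut:]
            else:
                # Random generation: a brand-new candidate.
                child = [rng.random() for _ in range(gene_length)]
            children.append(child)
        population = children
        generation += 1


def toy_evaluate(gene: Gene) -> float:
    # Hypothetical stand-in score: closeness of every coefficient to 0.5.
    return 1.0 - sum(abs(x - 0.5) for x in gene) / len(gene)


best, score, generations = evolve(toy_evaluate, 4, 20, 0.95, 200, random.Random(0))
print(generations, round(score, 3))
```

Because the selected candidates are carried over unchanged, the best evaluation value in this sketch never decreases from one generation to the next, which is one simple way to realize the improvement over generations that the text describes.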

In other words, as the generations of genes advance, a preprocessing filter is obtained that more strongly emphasizes the components of the audio data that are useful for feature extraction and removes unnecessary components.

In the above description, preprocessing using the preprocessing filter has been described as processing that preserves the format of the data being processed, that is, the time-pitch data, before and after the processing. However, processing that changes the data format may also be used as the preprocessing. In that case, the filter generation device 51 generates a chord discriminator for extracting the chords of music from preprocessed time-pitch data whose format has changed.

Although an example of extracting chords from the time-pitch data of music has been described above, still images, moving images, text (character strings), and the like may instead be input to the signal processing device 12 as input data, and the signal processing device 12 may extract a desired feature from that input data. In that case, the filter generation device 11 generates a preprocessing filter and a discriminator for extracting the feature from the input data.

The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, the program constituting the software is installed from a program recording medium onto a computer built into dedicated hardware, or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed on it.

FIG. 17 is a block diagram showing an example of the configuration of a personal computer that executes the above-described series of processes by means of a program. A CPU (Central Processing Unit) 301 executes various processes in accordance with a program recorded in a ROM (Read Only Memory) 302 or a recording unit 308. A RAM (Random Access Memory) 303 stores, as appropriate, the programs executed by the CPU 301 and data. The CPU 301, the ROM 302, and the RAM 303 are connected to one another by a bus 304.

An input/output interface 305 is also connected to the CPU 301 via the bus 304. Connected to the input/output interface 305 are an input unit 306, which includes a keyboard, a mouse, a microphone, and the like, and an output unit 307, which includes a display, a speaker, and the like. The CPU 301 executes various processes in response to commands input from the input unit 306 and outputs the processing results to the output unit 307.

The recording unit 308 connected to the input/output interface 305 is, for example, a hard disk, and records the programs executed by the CPU 301 and various data. The communication unit 309 communicates with external devices via a network such as the Internet or a local area network.

The program may also be acquired via the communication unit 309 and recorded in the recording unit 308.

The drive 310 connected to the input/output interface 305 drives a removable medium 331, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, when one is loaded, and acquires the programs, data, and the like recorded on it. The acquired programs and data are transferred to the recording unit 308 and recorded as necessary.

As shown in FIG. 17, the program recording medium that stores the program to be installed on the computer and made executable by it is constituted by the removable medium 331, which is a package medium such as a magnetic disk (including a flexible disk), an optical disk (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disk, or a semiconductor memory; by the ROM 302, in which the program is stored temporarily or permanently; or by the hard disk constituting the recording unit 308. The program is stored on the program recording medium, as necessary, via the communication unit 309, which is an interface such as a router or a modem, using a wired or wireless communication medium such as a local area network, the Internet, or digital satellite broadcasting.

In this specification, the steps describing the program stored on the program recording medium include not only processes performed in time series in the described order but also processes that are executed in parallel or individually and not necessarily in time series.

The embodiments of the present invention are not limited to the embodiment described above, and various modifications can be made without departing from the gist of the present invention.

FIG. 1 is a diagram explaining a feature extraction system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the feature extraction system according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of the feature extraction system according to an embodiment of the present invention.
FIG. 4 is a block diagram showing a configuration example of the chord extraction unit.
FIG. 5 is a diagram explaining the per-beat chord names extracted from time-pitch data.
FIG. 6 is a flowchart explaining the chord discrimination process.
FIG. 7 is a diagram explaining time-pitch data.
FIG. 8 is a diagram explaining the preprocessing filter.
FIG. 9 is a diagram explaining chord discrimination using the chord discriminator.
FIG. 10 is a block diagram showing the configuration of the filter generation device.
FIG. 11 is a block diagram showing the configuration of the chord discriminator learning unit.
FIG. 12 is a block diagram showing the configuration of the chord discrimination unit.
FIG. 13 is a flowchart explaining the preprocessing filter generation process.
FIG. 14 is a flowchart explaining the evaluation process.
FIG. 15 is a diagram explaining the chord progression of a piece of music.
FIG. 16 is a diagram explaining the shifting of chord discrimination features and correct chords.
FIG. 17 is a block diagram showing the configuration of a personal computer.

Explanation of symbols

11 filter generation device, 12 signal processing device, 21 preprocessing unit, 22 feature extraction processing unit, 51 filter generation device, 52 signal processing device, 61 preprocessing unit, 62 chord extraction unit, 92 per-beat feature extraction unit, 93 chord discrimination unit, 121 initial generation creation unit, 122 gene evaluation unit, 123 next generation creation unit, 141 teacher data holding unit, 143 preprocessing unit, 145 chord discriminator learning unit, 146 chord discrimination unit, 147 evaluation unit, 151 selection unit, 152 mutation processing unit, 153 crossover processing unit, 154 random generation unit, 181 per-beat feature extraction unit, 182 addition unit, 183 shift processing unit, 184 chord discriminator generation unit, 211 per-beat feature extraction unit, 212 discrimination unit

Claims

An information processing apparatus comprising:
filter generation means for generating a candidate filter, which is a candidate for a preprocessing filter used in preprocessing applied to data from which a predetermined feature is to be extracted;
preprocessing means for performing the preprocessing, using the candidate filter, on teacher data used for evaluating the candidate filter;
extraction means for extracting the feature from the preprocessed teacher data; and
evaluation means for calculating an evaluation value of the candidate filter, the evaluation value indicating an evaluation of the feature extraction performed on the teacher data preprocessed using the candidate filter, based on the feature extracted by the extraction means and the feature to be extracted from the teacher data, which is obtained in advance.

The information processing apparatus according to claim 1, wherein the filter generation means further generates a plurality of new candidate filters using some of the candidate filters, based on the respective evaluation values of the plurality of candidate filters, and
the evaluation means calculates an evaluation value for each of the new candidate filters.

The information processing apparatus according to claim 2, wherein, when the highest of the most recently calculated evaluation values satisfies a predetermined condition, the evaluation means sets the candidate filter having that highest evaluation value as the preprocessing filter used for the preprocessing.

The information processing apparatus according to claim 3, wherein, when the highest of the most recently calculated evaluation values does not satisfy the condition, the filter generation means further generates a plurality of new candidate filters using some of the most recently generated candidate filters, based on the most recently calculated evaluation values.

The information processing apparatus according to claim 1, wherein the preprocessing filter is a filter that performs processing while maintaining a format of the data.

The information processing apparatus according to claim 1, further comprising discriminator creation means for creating, by machine learning, a discriminator for extracting the feature from data preprocessed using the candidate filter, based on the teacher data preprocessed using the candidate filter and the feature to be extracted from the teacher data, which is obtained in advance.

An information processing method comprising the steps of: generating a candidate filter, which is a candidate for a preprocessing filter used in preprocessing applied to data from which a predetermined feature is to be extracted;
performing the preprocessing, using the candidate filter, on teacher data used for evaluating the candidate filter;
extracting the feature from the preprocessed teacher data by extraction means; and
calculating an evaluation value of the candidate filter, the evaluation value indicating an evaluation of the feature extraction performed on the teacher data preprocessed using the candidate filter, based on the feature extracted by the extraction means and the feature to be extracted from the teacher data, which is obtained in advance.

A program for causing a computer to execute processing comprising the steps of: generating a candidate filter, which is a candidate for a preprocessing filter used in preprocessing applied to data from which a predetermined feature is to be extracted;
performing the preprocessing, using the candidate filter, on teacher data used for evaluating the candidate filter;
extracting the feature from the preprocessed teacher data by extraction means; and
calculating an evaluation value of the candidate filter, the evaluation value indicating an evaluation of the feature extraction performed on the teacher data preprocessed using the candidate filter, based on the feature extracted by the extraction means and the feature to be extracted from the teacher data, which is obtained in advance.