JP4613923B2

JP4613923B2 - Musical sound processing apparatus and program

Info

Publication number: JP4613923B2
Application number: JP2007091431A
Authority: JP
Inventors: 琢哉藤島; 温東儀
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2007-03-30
Filing date: 2007-03-30
Publication date: 2011-01-19
Anticipated expiration: 2027-03-30
Also published as: JP2008250008A

Description

The present invention relates to a technique for generating music data that reflects the musical character of a piece of music.

Techniques have been proposed for automatically generating a desired accompaniment sound by selectively using a plurality of automatic accompaniment data, each of which designates a mode of the accompaniment sound of a piece of music. For example, Patent Document 1 discloses a technique that selects automatic accompaniment data according to a beat point pattern (rhythm) designated by the user and outputs the data together with a chord detected from the keys pressed on a keyboard.
Patent Document 1: JP-A-10-222166

However, the technique of Patent Document 1 requires the user to perform the cumbersome task of designating a beat point pattern that matches the musical character of the piece. Moreover, when a user unfamiliar with music designates the beat point pattern, automatic accompaniment data that properly reflects the musical character of the piece may not be selected. Against this background, one object of the present invention is to eliminate the need for the user to designate beat points.

To solve the above problem, a musical sound processing apparatus according to the present invention comprises: storage means (for example, the storage circuit 20 in FIG. 1) that stores a plurality of accompaniment mode data, each designating a mode of an accompaniment sound; feature quantity extraction means that sequentially extracts feature quantities from a musical sound signal representing the musical sound of a piece of music; beat point detection means that detects beat points from the musical sound signal; section defining means that defines a plurality of unit sections based on the beat points detected by the beat point detection means; selection means that includes comparison means for comparing the feature quantity of the accompaniment sound indicated by each of the plurality of accompaniment mode data with the feature quantity that the feature quantity extraction means specified for the musical sound signal within a unit section, and that sequentially selects, for each unit section and based on the result of that comparison, the accompaniment mode data whose accompaniment sound is similar to the musical sound signal within the unit section from among the stored accompaniment mode data; and output means that outputs, for each unit section, unit data including the accompaniment mode data selected by the selection means or an identifier of that accompaniment mode data. With this configuration, the beat point detection means detects the beat points from the musical sound signal, so the user's effort in designating beat points is reduced. In addition, the accompaniment mode data is selected based on a comparison between the feature quantity of the accompaniment sound indicated by each accompaniment mode data and the feature quantity of the musical sound signal. Unit data for an accompaniment sound whose musical character resembles the musical sound signal can therefore be generated, in contrast to, for example, a configuration that selects accompaniment mode data based only on a comparison between the timing of the accompaniment sounds designated by the accompaniment mode data and the timing of the musical sounds in the musical sound signal.
In a preferred aspect of the present invention, the feature quantity extraction means sequentially extracts the feature quantity of the musical sound signal for each frame, and the comparison means compares the per-frame feature quantity of the accompaniment sound indicated by each of the plurality of accompaniment mode data with the feature quantity of each frame of the musical sound signal within the unit section. Also in a preferred aspect, the selection means includes accompaniment processing means that specifies a spectrogram of the accompaniment sound for each of the plurality of accompaniment mode data. Based on the result of the comparison means comparing the feature quantity of the spectrogram specified for each accompaniment mode data with the feature quantity of the spectrogram of the musical sound signal within the unit section, the selection means selects, from among the plurality of accompaniment mode data, the accompaniment mode data whose accompaniment sound spectrogram is similar to the spectrogram of the musical sound signal within the unit section.
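The per-section selection described above can be illustrated with a minimal Python sketch. It assumes that each accompaniment candidate has already been reduced to a list of per-frame feature vectors of the same length as the unit section, and it uses mean Euclidean distance as the similarity measure. The distance measure and the names `frame_distance` and `select_pattern` are illustrative assumptions, not taken from the source.

```python
def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_pattern(section_features, pattern_features):
    """Pick the accompaniment identifier most similar to one unit section.

    section_features: per-frame feature vectors of the musical sound signal.
    pattern_features: dict mapping an identifier to per-frame feature vectors.
    """
    def mean_distance(frames):
        total = sum(frame_distance(a, b) for a, b in zip(section_features, frames))
        return total / len(section_features)

    # The smallest mean distance is treated as "most similar".
    return min(pattern_features, key=lambda k: mean_distance(pattern_features[k]))


section = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "A": [[1.0, 0.0], [0.0, 1.0]],
    "B": [[0.0, 1.0], [1.0, 0.0]],
}
best = select_pattern(section, candidates)
```

Running the selection once per unit section yields one identifier per section, which corresponds to the unit data that the output means emits.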

A musical sound processing apparatus according to a preferred aspect of the present invention comprises pitch specifying means that specifies, for each unit section, a pitch (for example, a chord name or a bass note) corresponding to the musical sound signal, and the output means outputs unit data including the accompaniment mode data selected by the selection means, or an identifier of that accompaniment mode data, together with the pitch specified by the pitch specifying means. In this aspect, unit data including a pitch corresponding to the musical sound signal is output, so the musical sound (accompaniment sound) generated from the unit data can be brought close to the musical character of the musical sound signal.
In a further preferred aspect, the pitch specifying means includes chord specifying means that specifies the chord of each unit section and bass specifying means that specifies the bass note of each unit section, and the output means includes in the unit data the chord specified by the chord specifying means and the bass note specified by the bass specifying means. According to this aspect, unit data representing an accompaniment sound that fully reflects the musical character of the musical sound signal can be generated.

In a preferred aspect of the present invention, the beat point detection means detects, as a beat point, a time at which a component of the musical sound signal belonging to any of a plurality of frequency bands occurs, each band corresponding to the performance sound of a different type of musical instrument. According to this aspect, beat points can be detected with higher accuracy than in, for example, a configuration that treats as a beat point the time at which a component belonging to a single frequency band occurs.

In a preferred aspect of the present invention, the selection means includes accompaniment processing means that specifies the feature quantity of the accompaniment sound indicated by each of the plurality of accompaniment mode data, and comparison means that compares the feature quantity specified by the accompaniment processing means with the feature quantity of the musical sound signal within the unit section. Based on the result of the comparison, the selection means selects the accompaniment mode data whose accompaniment sound is similar to the musical sound signal within the unit section. In a further preferred aspect, the accompaniment processing means adjusts the positions on the time axis of the accompaniment sounds designated by the accompaniment mode data so that each accompaniment sound corresponds on the time axis to a beat point detected by the beat point detection means (for example, the expansion/contraction process in FIG. 6). Because the accompaniment mode data is adjusted in this way, the feature quantity of the accompaniment sound of the accompaniment mode data and the feature quantity of the musical sound signal can be compared precisely.

A musical sound processing apparatus according to a preferred aspect of the present invention comprises structure determination means that detects repeated sections of the piece in which the performance is repeated, and the selection means selects the same accompaniment mode data for each repeated section detected by the structure determination means. This aspect has the advantages that the processing load on the selection means for selecting accompaniment mode data for the repeated sections is reduced and that musical consistency across the repeated sections is maintained.

The musical sound processing apparatus according to the present invention may be realized by hardware (electronic circuits) such as a DSP (Digital Signal Processor) dedicated to each process, or by the cooperation of a general-purpose arithmetic processing unit such as a CPU (Central Processing Unit) and a program. The program according to the present invention causes a computer to execute: a feature quantity extraction process that sequentially extracts feature quantities from a musical sound signal representing the musical sound of a piece of music; a beat point detection process that detects beat points from the musical sound signal; a section defining process that defines a plurality of unit sections based on the beat points detected in the beat point detection process; a selection process that includes a comparison process for comparing the feature quantity of the accompaniment sound indicated by each of a plurality of accompaniment mode data, each designating a mode of an accompaniment sound, with the feature quantity specified in the feature quantity extraction process for the musical sound signal within a unit section, and that sequentially selects, for each unit section and based on the result of the comparison process, the accompaniment mode data whose accompaniment sound is similar to the musical sound signal within the unit section; and an output process that outputs, for each unit section, unit data including the accompaniment mode data selected in the selection process or an identifier of that accompaniment mode data. This program provides the same operation and effects as the musical sound processing apparatus according to the present invention. The program of the present invention may be provided to users in a form stored on a computer-readable recording medium and installed on a computer, or may be provided from a server device by distribution over a communication network and installed on a computer.

The present invention is also specified as a method for processing musical sounds. The musical sound processing method of the present invention uses storage means that stores a plurality of accompaniment mode data, each designating a mode of an accompaniment sound, and includes: a feature quantity extraction step of sequentially extracting feature quantities from a musical sound signal representing the musical sound of a piece of music; a beat point detection step of detecting beat points from the musical sound signal; a section defining step of defining a plurality of unit sections based on the beat points detected in the beat point detection step; a selection step that includes a comparison step of comparing the feature quantity of the accompaniment sound indicated by each of the plurality of accompaniment mode data with the feature quantity extracted for the musical sound signal within a unit section, and that sequentially selects, for each unit section and based on the result of the comparison step, the accompaniment mode data whose accompaniment sound is similar to the musical sound signal within the unit section from among the accompaniment mode data stored in the storage means; and an output step of outputting, for each unit section, unit data including the accompaniment mode data selected in the selection step or an identifier of that accompaniment mode data. This method provides the same operation and effects as the musical sound processing apparatus according to the present invention.

<A: Configuration of the musical sound processing apparatus>
FIG. 1 is a block diagram showing the configuration of a musical sound processing apparatus according to one embodiment of the present invention. As shown in the figure, a signal generation device 12 and an output processing device 14 are connected to the musical sound processing apparatus 100.

The signal generation device 12 is means for generating a musical sound signal V that represents the temporal waveform of the musical sound (performance sound) of a piece of music. A playback device that sequentially reads data from a recording medium on which the musical sound of the piece is recorded (for example, a music CD) and generates and outputs the musical sound signal V is suitable as the signal generation device 12. The musical sound signal V generated by the signal generation device 12 is supplied to the musical sound processing apparatus 100.

The musical sound processing apparatus 100 generates and outputs musical sound data DOUT from the musical sound signal V. The musical sound data DOUT is a data sequence that designates, in time series, accompaniment sounds similar to the piece represented by the musical sound signal V. In other words, the musical sound processing apparatus 100 of this embodiment functions as a device that encodes the musical sound signal V into the musical sound data DOUT.

The musical sound data DOUT generated by the musical sound processing apparatus 100 is supplied to the output processing device 14. The output processing device 14 is a device that emits sound waves corresponding to the musical sound data DOUT. The output processing device 14 may be connected to the musical sound processing apparatus 100 directly, or indirectly via a communication network such as the Internet. The output processing device 14 and the musical sound processing apparatus 100 may also be integrated into a single device.

As shown in FIG. 1, the musical sound processing apparatus 100 includes a storage circuit 20 and a control circuit 30. The storage circuit 20 stores the program executed by the control circuit 30 and the various data used by the control circuit 30. Any storage device, such as a semiconductor storage device or a magnetic storage device, may be employed as the storage circuit 20. The control circuit 30 is an arithmetic processing unit such as a CPU that functions as each element (functional block) in FIG. 1 by executing the program. The control circuit 30 may also be realized by an electronic circuit such as a DSP dedicated to processing the musical sound signal V. The elements of the control circuit 30 illustrated in FIG. 1 may also be distributed across a plurality of integrated circuits.

The feature quantity extraction unit 32 in FIG. 1 is means for extracting N types of feature quantities F (FS, FW) for each of a plurality of frames into which the musical sound signal V is divided on the time axis (N is a natural number). Successive frames overlap on the time axis. The feature quantity extraction unit 32 of this embodiment includes a frequency analysis unit 321, a spectral feature quantity extraction unit 323, and a waveform feature quantity extraction unit 325.
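As a minimal sketch of the overlapping framing mentioned above, the following Python function cuts a sample sequence into fixed-length frames that advance by a hop smaller than the frame length. The function name and the parameters `frame_len` and `hop` are illustrative assumptions; the source does not give frame sizes.

```python
def split_into_frames(samples, frame_len, hop):
    """Return overlapping frames; successive frames share frame_len - hop samples."""
    if not 0 < hop <= frame_len:
        raise ValueError("hop must be between 1 and frame_len")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


# Ten samples, frames of four samples, advancing by two samples each time.
frames = split_into_frames(list(range(10)), frame_len=4, hop=2)
```

With a hop of half the frame length, each sample (apart from those at the edges) falls into two frames, which is one common way to obtain the overlap the text describes.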

The frequency analysis unit 321 specifies the power spectrum Q of each frame of the musical sound signal V by executing a frequency analysis that includes FFT (Fast Fourier Transform) processing. The spectral feature quantity extraction unit 323 extracts feature quantities FS of the power spectrum Q specified by the frequency analysis unit 321. For example, as shown in FIG. 2, the spectral feature quantity extraction unit 323 extracts, as a feature quantity FS, the combinations S1 of each frequency at which a peak appears in the power spectrum Q and the intensity of that peak (hereinafter "frequency bins"). The spectral feature quantity extraction unit 323 also extracts an HPCP (Harmonic Pitch Class Profile) from the power spectrum Q. As shown in FIG. 2, the HPCP is a feature quantity FS that indicates the intensity of each of the twelve semitones (C, C#, D, ..., A#, B) that make up one octave.
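The following Python sketch shows one simplified way to fold spectral peaks (frequency and intensity pairs, like the frequency bins S1) into a twelve-element pitch-class profile. It is an illustration only: a full HPCP computation typically adds weighting and harmonic handling that are omitted here, and the 440 Hz reference and the function name are assumptions.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_profile(peaks, ref_a4=440.0):
    """Fold (frequency_hz, intensity) peaks into 12 pitch-class intensities (C..B).

    The result is normalised so that the strongest pitch class equals 1.0.
    """
    profile = [0.0] * 12
    for freq, intensity in peaks:
        if freq <= 0:
            continue  # skip invalid peaks
        midi = 69 + 12 * math.log2(freq / ref_a4)  # nearest MIDI note number
        profile[round(midi) % 12] += intensity
    strongest = max(profile)
    return [v / strongest for v in profile] if strongest > 0 else profile


# Peaks near C4, E4 and G4 (a C major triad).
profile = pitch_class_profile([(261.63, 1.0), (329.63, 0.5), (392.0, 0.5)])
```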

The waveform feature quantity extraction unit 325 in FIG. 1 extracts, for each frame, a feature quantity FW of the waveform of the musical sound signal V on the time axis. For example, as shown in FIG. 3, the waveform feature quantity extraction unit 325 calculates the intensity (instantaneous value) W2 of the envelope W1 of the musical sound signal V for each frame and differentiates the time series of the intensity W2 to obtain the gradient W3 in each frame as the feature quantity FW. A moving average of the intensity W2 of each frame may be calculated instead of the gradient W3.
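A rough Python sketch of this waveform feature follows. It assumes that the peak absolute amplitude of a frame can stand in for the envelope intensity W2, and that a first difference between successive frames can stand in for the gradient W3. Both choices and the function names are illustrative assumptions rather than the method of the source.

```python
def frame_intensity(frames):
    """Peak absolute amplitude per frame, a simple stand-in for the intensity W2."""
    return [max((abs(s) for s in frame), default=0.0) for frame in frames]


def gradient(values):
    """First difference of a per-frame series, a stand-in for the gradient W3.

    The first frame has no predecessor, so its gradient is set to 0.0.
    """
    return [0.0] + [b - a for a, b in zip(values, values[1:])]


w2 = frame_intensity([[0.0, 0.1], [0.9, -1.0], [0.5, 0.4]])
w3 = gradient(w2)
```

A large positive value in `w3` marks a frame where the envelope rises sharply, which is the kind of frame later treated as a candidate attack.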

The N types of feature quantities F (FS, FW) extracted by the feature quantity extraction unit 32 are stored in the feature quantity storage unit 34 in FIG. 1. FIG. 4 is a conceptual diagram showing how the feature quantities F are stored in the feature quantity storage unit 34. As shown in the figure, the N types of feature quantities F1 to FN extracted for each frame of the musical sound signal V over the entire piece are held in the feature quantity storage unit 34. Alternatively, only the feature quantities F of some frames of the piece may be held in the feature quantity storage unit 34.

The beat point detection unit 52 in FIG. 1 is means for detecting the beat points of the piece from the feature quantities F of the musical sound signal V. More specifically, the beat point detection unit 52 detects, as a beat point, a time at which the performance sound of a percussion instrument occurs. In this embodiment, beat points P are detected based on the performance sounds of the hi-hat cymbal (HH), the snare drum (SD), and the bass drum (BD).

FIG. 5 is a conceptual diagram showing a specific example of how the beat point detection unit 52 detects beat points. As shown in the figure, the beat point detection unit 52 identifies the prominent attack portions A (A1, A2, ...) of the musical sounds in the piece. For example, from among the frames in which the gradient W3 (feature quantity FW) illustrated in FIG. 3 rises steeply, frames that recur with a period within the beat point interval of typical music are selected as attack portions A. The range given as an example is 1 to 8 seconds, corresponding to 30 BPM (beats per minute) to 240 BPM in quadruple meter (four beats last 8 seconds at 30 BPM and 1 second at 240 BPM).

The beat point detection unit 52 then determines the beat points by analyzing the power spectrum Q (the frequency bins S1 included in the feature quantity FS) of each frame corresponding to an attack portion A. More specifically, from among the plurality of attack portions A, the beat point detection unit 52 detects as beat points those attack portions A that occur in a plurality of preselected frequency bands. Each of these frequency bands corresponds to the range of a different type of percussion instrument.

FIG. 5 schematically shows the spectrogram G0 of the musical sound signal V. In the frames corresponding to attack portions A3 and A7 in FIG. 5, the intensity within the frequency band BSD, which corresponds to the range of the snare drum, is high. The beat point detection unit 52 therefore determines that attack portions A3 and A7 result from the snare drum being played. Similarly, the beat point detection unit 52 determines that attack portions A1 and A5, which have high intensity in the frequency band BBD (a band including the range of the bass drum), result from the bass drum being played, and that attack portions A1 to A7, which have high intensity in the frequency band BHH (a band including the range of the hi-hat cymbal), result from the hi-hat cymbal being played.
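The band-based attribution of attacks to instruments might be sketched in Python as follows. The numeric band edges, the threshold, and the function name are hypothetical placeholders chosen for illustration; the source does not state numeric ranges for the bands BBD, BSD, or BHH.

```python
# Hypothetical band edges in Hz (not given in the source).
BANDS = {"BD": (40.0, 120.0), "SD": (150.0, 400.0), "HH": (6000.0, 12000.0)}


def instruments_at_attack(peaks, bands=BANDS, threshold=0.5):
    """Return the labels of all bands whose summed peak intensity reaches threshold.

    peaks: (frequency_hz, intensity) pairs for the frame of one attack portion.
    """
    energy = {name: 0.0 for name in bands}
    for freq, intensity in peaks:
        for name, (low, high) in bands.items():
            if low <= freq < high:
                energy[name] += intensity
    return {name for name, value in energy.items() if value >= threshold}


# An attack with strong low-band and high-band content, as at A1 in FIG. 5.
a1 = instruments_at_attack([(60.0, 0.9), (8000.0, 0.7)])
# An attack with strong mid-band and high-band content, as at A3 in FIG. 5.
a3 = instruments_at_attack([(200.0, 0.8), (9000.0, 0.6), (60.0, 0.1)])
```

Returning a set rather than a single label reflects the situation in FIG. 5, where one attack can coincide with several instruments at once.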

There is also a musical tendency relating the position of each beat point within a measure to the type of percussion instrument played at that beat point. In a piece in quadruple meter, for example, the snare drum tends to be played on the second and fourth beats, and the bass drum tends to be played on the first and third beats. The beat point detection unit 52 determines the beat points P from among the plurality of attack portions A on the condition that they follow such musical tendencies. For example, focusing on attack portions A3 and A7 in FIG. 5, which correspond to the snare drum, the beat point detection unit 52 determines attack portion A3 to be the second beat point P of the measure and attack portion A7 to be the fourth beat point P of the measure. Similarly, the beat point detection unit 52 determines attack portion A1 to be the first beat point P and attack portion A5 to be the third beat point P. The beat point detection unit 52 also sequentially specifies the reciprocal of the interval between successive beat points P as the instantaneous tempo TMP0.
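The instantaneous tempo defined above as the reciprocal of the beat interval can be computed with a few lines of Python. In this sketch the beat times are assumed to be given in seconds, and the reciprocal is multiplied by 60 so that the result is expressed in beats per minute; the unit conversion and the function name are assumptions made for readability.

```python
def instantaneous_tempo(beat_times):
    """Tempo in BPM for each pair of successive beat points.

    beat_times: ascending beat positions in seconds.
    """
    return [60.0 / (b - a) for a, b in zip(beat_times, beat_times[1:]) if b > a]


# Beat points spaced 0.5 s apart correspond to 120 BPM.
tempi = instantaneous_tempo([0.0, 0.5, 1.0, 1.5])
```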

The section defining unit 54 in FIG. 1 is means for dividing the piece into a plurality of unit sections T based on the beat points P detected by the beat point detection unit 52. For example, a section containing a predetermined number of beat points P detected by the beat point detection unit 52 is defined as a unit section T. As shown in FIG. 5, the section defining unit 54 of this embodiment defines one measure, containing the four beat points P from the first beat to the fourth beat, as a unit section T.
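A minimal Python sketch of this grouping is shown below. It assumes that the list of beat points starts on the first beat of a measure and simply drops an incomplete group at the end; both assumptions and the function name are illustrative.

```python
def unit_sections(beat_times, beats_per_section=4):
    """Group successive beat points into unit sections of beats_per_section beats.

    An incomplete trailing group is dropped.
    """
    count = len(beat_times) // beats_per_section
    return [
        beat_times[i * beats_per_section:(i + 1) * beats_per_section]
        for i in range(count)
    ]


# Nine beat points give two complete four-beat sections; the ninth is left over.
sections = unit_sections([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
```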

The pitch specifying unit 42 in FIG. 1 is means for specifying, for each unit section T of the musical sound signal V, a pitch (a chord name and a bass note) according to the feature quantities F. The pitch specifying unit 42 of this embodiment includes a chord specifying unit 421 and a bass specifying unit 423. The chord specifying unit 421 specifies the chord name N1 corresponding to the musical sound represented by the musical sound signal V in each unit section T. For example, the chord specifying unit 421 averages the HPCP (feature quantity FS) stored in the feature quantity storage unit 34 over the frames within the unit section T and refers to a table that associates HPCPs with chord names N1 to specify the chord name N1 corresponding to the averaged HPCP. The specification of a chord name N1 using the HPCP is disclosed in JP-A-2000-298475, an earlier application by the present applicant.
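The averaging and table lookup described above could be sketched in Python as follows. The table here is a hypothetical one containing only major and minor triads, and the match score (the sum of the averaged profile over the chord tones) is an assumed stand-in for the lookup; the actual table and matching rule of the referenced prior application are not reproduced.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_table():
    """Hypothetical table: chord name -> set of pitch-class indices (triads only)."""
    table = {}
    for root, name in enumerate(PITCH_CLASSES):
        table[name] = {root, (root + 4) % 12, (root + 7) % 12}        # major triad
        table[name + "m"] = {root, (root + 3) % 12, (root + 7) % 12}  # minor triad
    return table


def average_profile(profiles):
    """Average 12-element pitch-class profiles over the frames of one section."""
    return [sum(p[k] for p in profiles) / len(profiles) for k in range(12)]


def chord_name(profiles):
    """Return the chord whose tones carry the most averaged intensity."""
    avg = average_profile(profiles)
    table = chord_table()
    return max(table, key=lambda chord: sum(avg[k] for k in table[chord]))


frame = [0.0] * 12
frame[0], frame[4], frame[7] = 1.0, 0.8, 0.9  # C, E and G are strong
name = chord_name([frame, frame])
```

`chord_name` expects at least one profile per unit section; an empty list would raise a division error in `average_profile`.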

The bass specifying unit 423 specifies the bass note N2 corresponding to the musical sound represented by the musical sound signal V in each unit section T. For example, among the peaks whose intensity exceeds a threshold in the power spectrum Q represented by the frequency bins S1 of the feature quantity FS, the bass specifying unit 423 specifies the frequency of the lowest-frequency peak as the bass note N2. The bass specifying unit 423 may use any method to specify the bass note N2. For example, the bass note N2 may be specified by applying a filter that selectively extracts only the components of the musical sound signal V at or below 1 kHz and then calculating the period (pitch) of the zero crossings.
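The first method described above, taking the lowest peak whose intensity exceeds a threshold, is short enough to sketch directly in Python. The function name and the convention of returning `None` when no peak qualifies are assumptions.

```python
def bass_frequency(peaks, threshold):
    """Frequency of the lowest peak whose intensity exceeds threshold, or None.

    peaks: (frequency_hz, intensity) pairs, like the frequency bins S1.
    """
    strong = [freq for freq, intensity in peaks if intensity > threshold]
    return min(strong) if strong else None


# The 55 Hz peak is too weak, so the 110 Hz peak is taken as the bass note.
bass = bass_frequency([(55.0, 0.2), (110.0, 0.8), (220.0, 0.9)], threshold=0.5)
```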

The chord name N1 and the bass note N2 calculated by the pitch specifying unit 42 for each unit section T are stored in the pitch storage unit 44. A semiconductor storage device or a magnetic storage device may be employed as the feature quantity storage unit 34 and the pitch storage unit 44. The feature quantity storage unit 34 and the pitch storage unit 44 may be separate storage areas set up in a single storage device, or each may be a separate storage device. At least one of the feature quantity storage unit 34 and the pitch storage unit 44 may also be a storage area set up in the storage circuit 20.

The storage circuit 20 in FIG. 1 stores a plurality of accompaniment mode data DA, each corresponding to a different piece of music. One accompaniment mode data DA is data that designates the mode of the accompaniment sounds within one measure of a piece. The mode of the accompaniment sounds (the types of instruments that play the accompaniment sounds and the times at which the accompaniment sounds occur) differs for each accompaniment mode data DA. As shown in FIG. 1, one accompaniment mode data DA consists of an identifier AID uniquely assigned to that accompaniment mode data DA and a plurality of tracks TR, each of which designates the times at which a different percussion instrument is played. The accompaniment mode data DA of this embodiment includes a track TR_HH corresponding to the hi-hat cymbal, a track TR_SD corresponding to the snare drum, and a track TR_BD corresponding to the bass drum. One track TR is MIDI (Musical Instrument Digital Interface) data in which event data E, which instructs the performance of the percussion instrument, and time data Δ, which designates the interval between events (performances), are arranged in time series.
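As an illustration of this layout, the following Python sketch models one accompaniment mode data DA as an identifier plus per-instrument tracks, with each track reduced to a list of time values Δ that precede successive events. The class names, the tick unit, and the resolution of 480 ticks per beat are assumptions made for the example; the source specifies only that each track is MIDI data with events E and time data Δ in time series.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """One percussion track: each delta is the wait (in ticks) before an event."""
    instrument: str
    deltas: list

    def onset_ticks(self):
        """Absolute tick position of every event, accumulated from the deltas."""
        position, onsets = 0, []
        for delta in self.deltas:
            position += delta
            onsets.append(position)
        return onsets


@dataclass
class AccompanimentData:
    """Identifier (AID) plus tracks keyed by a label such as 'HH', 'SD', 'BD'."""
    aid: str
    tracks: dict = field(default_factory=dict)


# One 4/4 measure at an assumed 480 ticks per beat.
pattern = AccompanimentData(
    aid="rock-01",
    tracks={
        "BD": Track("bass drum", [0, 960]),     # beats 1 and 3
        "SD": Track("snare drum", [480, 960]),  # beats 2 and 4
        "HH": Track("hi-hat", [0, 480, 480, 480]),
    },
)
snare_onsets = pattern.tracks["SD"].onset_ticks()
```

Storing intervals rather than absolute times mirrors the event and delta arrangement of the track, and absolute positions are recovered by accumulation when needed.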

The selection unit 62 of FIG. 1 sequentially selects, for each unit section T, one of the plurality of accompaniment mode data DA stored in the storage circuit 20, based on the feature amount F that the feature amount extraction unit 32 specified for the musical tone signal V within that unit section T. The selection unit 62 of this embodiment includes an accompaniment processing unit 64 and a comparison unit 66.

FIG. 6 is a conceptual diagram for explaining the operation of the accompaniment processing unit 64 and the comparison unit 66. The figure shows a spectrogram G0 of the musical tone signal V within the unit section T that the selection unit 62 is processing. As shown in FIG. 6, the accompaniment processing unit 64 executes an expansion/contraction process and an accompaniment sound specifying process for each of the plurality of accompaniment mode data DA stored in the storage circuit 20.

The expansion/contraction process adjusts the position on the time axis of each accompaniment sound specified by the accompaniment mode data DA so that the time points of those sounds coincide on the time axis with the beat points P that the beat point detection unit 52 detected for the unit section T. For example, the accompaniment processing unit 64 multiplies each item of time data Δ in the accompaniment mode data DA by the relative ratio (TMP0/TMP) between the average of the tempos TMP0 that the beat point detection unit 52 detected for the beat points P in the unit section T and the tempo TMP of the accompaniment sounds specified by the accompaniment mode data DA.
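The scaling of the time data Δ described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function names are hypothetical, and it assumes that both tempos are expressed as beat periods (seconds per beat), so that a ratio TMP0/TMP below 1 compresses the pattern. If the tempos were expressed in beats per minute, the inverse ratio would be needed to obtain the same effect.

```python
def mean(values):
    """Average of the tempos detected for the beat points in one unit section."""
    return sum(values) / len(values)


def stretch_deltas(deltas, tmp0_period, tmp_period):
    """Multiply every time datum (delta) by the ratio TMP0/TMP.

    Assumption: tmp0_period and tmp_period are beat periods in seconds,
    not beats per minute.
    """
    if tmp0_period <= 0 or tmp_period <= 0:
        raise ValueError("tempo periods must be positive")
    ratio = tmp0_period / tmp_period
    return [d * ratio for d in deltas]


# Pattern authored at 0.5 s per beat, signal detected at about 0.4 s per beat.
stretched = stretch_deltas([0.0, 0.25, 0.25], mean([0.4, 0.4]), 0.5)
```

Because only the intervals are scaled, the order of the events E within each track TR is unchanged.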

The accompaniment sound specifying process specifies the feature amount FA of the accompaniment sounds specified by the accompaniment mode data DA. First, the accompaniment processing unit 64 specifies a spectrogram GA (a time series of power spectra) for those accompaniment sounds. For example, the accompaniment processing unit 64 holds a spectrogram g of the performance sound of each of the percussion instruments assumed as sound sources for the accompaniment sounds. It then builds the spectrogram GA by placing the spectrogram g of a percussion instrument at each time point where the accompaniment mode data DA instructs that single instrument to sound, and by placing the sum of the spectrograms g of the instruments at each time point where the accompaniment mode data DA instructs several instruments to sound together. Second, the accompaniment processing unit 64 extracts N types of feature amounts FA from the spectrogram GA for each frame by the same processing as the feature amount extraction unit 32. The feature amount extraction unit 32 may be shared for extracting the feature amounts FA. Although the above describes a configuration that derives the spectrogram GA and the feature amounts FA from the accompaniment mode data DA, the spectrogram GA or the feature amounts FA may instead be stored in the storage circuit 20 as the accompaniment mode data DA.
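The construction of the spectrogram GA from per-instrument spectrograms g can be sketched as follows. The data layout is an assumption made for illustration: events are given as (frame index, instrument name) pairs, each spectrogram g is a list of frames, and each frame is a list of power values per frequency bin. Sounds that overlap in time are summed, which covers the case where several instruments sound together.

```python
def build_pattern_spectrogram(events, templates, n_frames):
    """Place each instrument's spectrogram g at its event time and sum overlaps.

    events:    list of (start_frame, instrument_name), hypothetical layout
    templates: instrument_name -> list of frames (each a list of bin powers)
    Returns the spectrogram GA as n_frames lists of bin powers.
    """
    n_bins = len(next(iter(templates.values()))[0])
    ga = [[0.0] * n_bins for _ in range(n_frames)]
    for start, name in events:
        for offset, frame in enumerate(templates[name]):
            t = start + offset
            if t >= n_frames:
                break  # the template runs past the end of the measure
            for k, power in enumerate(frame):
                ga[t][k] += power
    return ga


templates = {"BD": [[1.0, 0.0], [0.5, 0.0]], "HH": [[0.0, 2.0]]}
ga = build_pattern_spectrogram([(0, "BD"), (0, "HH"), (2, "HH")], templates, 3)
```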

As shown in FIG. 6, for each of the plurality of accompaniment mode data DA stored in the storage circuit 20, the comparison unit 66 calculates a similarity index value C by comparing the feature amount F of the musical tone signal V within the unit section T with the feature amount FA that the accompaniment processing unit 64 specified from that accompaniment mode data DA. The similarity index value C is a numerical index of how similar the feature amount F is to the feature amount FA (that is, how similar the spectrogram G0 of FIG. 6 is to the spectrogram GA). For example, the similarity index value C is calculated as the Euclidean distance between the point in N-dimensional space whose coordinates are the N types of feature amounts F extracted from the musical tone signal V and the point whose coordinates are the N types of feature amounts FA extracted from the accompaniment mode data DA. In that case, the smaller the similarity index value C, the more similar the feature amount F is to the feature amount FA. A configuration that calculates a similarity index value C that instead increases with the similarity between F and FA may also be adopted.

Note that the musical tone signal V contains musical tones other than accompaniment sounds, whereas the accompaniment mode data DA specifies accompaniment sounds only. It is therefore desirable to calculate the similarity index value C so that components of the musical tone signal V other than accompaniment sounds have less influence on the comparison with the feature amount FA. For example, the similarity index value C may be calculated after removing, from the feature amounts F extracted from the musical tone signal V, the components dominated by tones other than accompaniment sounds (for example, the feature amounts F within a frequency band of the power spectrum Q where mainly non-accompaniment tones are present).
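One way to picture the removal of non-accompaniment components is a Euclidean distance restricted to a subset of the feature indices. The sketch below is illustrative only: the function name and the `keep` index list are hypothetical, and how the indices dominated by non-accompaniment tones are identified is left open, as in the text.

```python
import math


def masked_similarity_index(f, fa, keep):
    """Euclidean distance between F and FA over the indices in keep only.

    Indices absent from keep (assumed to be dominated by tones other than
    accompaniment sounds) do not contribute to the similarity index value C.
    """
    return math.sqrt(sum((f[i] - fa[i]) ** 2 for i in keep))


f = [0.0, 0.0, 100.0]   # third component assumed dominated by a melody tone
fa = [3.0, 4.0, 0.0]
c = masked_similarity_index(f, fa, keep=[0, 1])
```

With all three indices kept, the large third component would dominate the distance; with it excluded, only the accompaniment-related components are compared.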

The selection unit 62 selects the accompaniment mode data DA for which the similarity index value C calculated by the comparison unit 66 is smallest (the accompaniment mode data DA that specifies the accompaniment sounds most similar to the musical tone signal V), acquires the identifier AID contained in that accompaniment mode data DA from the storage circuit 20, and outputs it to the output unit 70 sequentially for each unit section T. In other words, the identifier AID is output for the accompaniment mode data DA whose accompaniment sounds approximate the musical mood of the musical tone signal V.
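The selection of the accompaniment mode data DA with the smallest similarity index value C amounts to a minimum search over the candidates. The following sketch assumes, for illustration, that the feature amounts FA are held in a dictionary keyed by the identifier AID and that C is the Euclidean distance; the function name and the identifiers are hypothetical.

```python
import math


def select_pattern(f, candidates):
    """Return (AID, C) for the candidate whose FA is closest to F.

    candidates: dict mapping an identifier AID to its feature vector FA.
    """
    if not candidates:
        raise ValueError("no accompaniment mode data to choose from")
    scored = ((aid, math.dist(f, fa)) for aid, fa in candidates.items())
    return min(scored, key=lambda pair: pair[1])


aid, c = select_pattern([1.0, 0.0], {"rock": [1.0, 0.1], "waltz": [0.0, 1.0]})
```

If a similarity index value C that grows with similarity were used instead, the minimum search would become a maximum search.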

The output unit 70 of FIG. 1 generates and outputs unit data U for each unit section T defined by the section defining unit 54. As shown in FIG. 1, the unit data U for one unit section T contains the identifier AID of the accompaniment mode data DA that the selection unit 62 selected for that unit section T, together with the chord name N1 and the bass note N2 that the pitch specifying unit 42 specified for that unit section T. The time series of unit data U, one per unit section T, is supplied to the output processing device 14 as musical tone data DOUT.

The output processing device 14 includes a control circuit 141, a storage circuit 143, a sound source circuit 145, and an output device 147. The storage circuit 143 sequentially stores the musical tone data DOUT supplied from the musical tone processing apparatus 100, and also stores a plurality of performance data DB, each corresponding to a separate item of accompaniment mode data DA. Like the accompaniment mode data DA, performance data DB specifies the mode of performance within one measure.

FIG. 7 is a conceptual diagram showing the structure of one item of performance data DB. As shown in FIG. 7, the performance data DB corresponding to one item of accompaniment mode data DA contains the same identifier AID as that accompaniment mode data DA and a plurality of tracks TR, each corresponding to a separate part. In addition to the tracks TR shared with the accompaniment mode data DA (TR_HH, TR_SD, TR_BD), the performance data DB of this embodiment includes a track TR_CD that specifies how chords are performed and a track TR_BS that specifies how the bass notes are performed. The track TR_CD is a data string in which event data E instructing the performance of a chord and time data Δ specifying the interval between events are arranged in time series. The event data E of the track TR_CD does not specify a chord name. Similarly, the track TR_BS is a data string of event data E instructing the performance of a bass note (without specifying which note) and time data Δ.

The control circuit 141 of FIG. 1 generates performance data DC by adding the chord name N1 and the bass note N2 contained in each unit data U of the musical tone data DOUT to the performance data DB. Specifically, it acquires the unit data U of the musical tone data DOUT from the storage circuit 143 one at a time in their order of arrangement and selects, from the plurality of performance data DB stored in the storage circuit 143, the performance data DB that contains the identifier AID in that unit data U. The control circuit 141 then generates the performance data DC by adding the chord name N1 in the unit data U to each event data E of the track TR_CD in the performance data DB and adding the bass note N2 in the unit data U to each event data E of the track TR_BS. The control circuit 141 further outputs the event data E of each track TR of the performance data DC to the sound source circuit 145 in sequence, at the time points specified by the time data Δ in that track TR.
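The generation of performance data DC from unit data U and performance data DB can be sketched as follows. The dictionary layout is an assumption for illustration only: each track is a list of (Δ, event) pairs, and the keys "chord" and "note" are hypothetical field names standing in for the chord name N1 and the bass note N2 added to the event data E.

```python
import copy


def make_performance_data(unit, performance_db):
    """Look up DB by the AID in unit data U and fill in N1 and N2.

    unit:           {"AID": ..., "N1": chord name, "N2": bass note}
    performance_db: AID -> {"TR_CD": [(delta, event), ...], "TR_BS": [...], ...}
    The stored DB is left untouched; DC is returned as a deep copy.
    """
    dc = copy.deepcopy(performance_db[unit["AID"]])
    for _delta, event in dc["TR_CD"]:
        event["chord"] = unit["N1"]   # chord name added to each chord event
    for _delta, event in dc["TR_BS"]:
        event["note"] = unit["N2"]    # bass note added to each bass event
    return dc


db = {"A1": {"TR_CD": [(0, {"type": "chord_on"}), (480, {"type": "chord_on"})],
             "TR_BS": [(0, {"type": "bass_on"})]}}
dc = make_performance_data({"AID": "A1", "N1": "C", "N2": "E"}, db)
```

Copying the performance data DB before filling it in lets the same DB be reused for later unit sections T that carry a different chord name N1 or bass note N2.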

The sound source circuit 145 is a MIDI sound source that generates a signal representing the waveform of the tone corresponding to each event data E supplied from the control circuit 141 and outputs it to the output device 147. The output device 147 emits sound waves corresponding to the signal output by the sound source circuit 145. The tones output from the output device 147 are accompaniment sounds (percussion sounds, chords, and bass notes) that approximate the musical mood of the musical tone signal V.

As described above, in this embodiment the musical tone signal V output by the signal generation device 12 is encoded into musical tone data DOUT, which has a small data amount. Because the positions of the beat points P are detected automatically from the feature amounts F during this encoding, the user is spared the effort of specifying beat points P for the musical tone signal V. In addition, because unit data U is generated (accompaniment mode data DA is selected and the chord name N1 and bass note N2 are added) in units of the unit section T (for example, one measure), unit data U for performance sounds with a more unified musical feel can be generated than in, for example, a configuration that encodes the musical tone signal V beat point by beat point.

<B: Modifications>
Various modifications can be made to the above embodiment. Specific examples are given below. Any two or more of the following aspects may also be combined.

(1) Modification 1
The above embodiment extracts the feature amounts F (FS, FW) of the musical tone signal V for each frame at a predetermined interval, but the timing at which the feature amounts F are extracted may be changed as appropriate. Because the feature amounts F of the accompaniment sounds within the musical tone signal V are what matter most for selecting the accompaniment mode data DA (there is little need to extract the feature amounts F of other tones), a configuration that extracts the feature amounts F only for sections in which an accompaniment sound is estimated to occur is preferable.

In one aspect, as shown in FIG. 3, the waveform feature amount extraction unit 325 sequentially identifies, from the feature amount FW of the musical tone signal V, each time point t at which the volume of the musical tone signal V rises sharply (a time point at which an accompaniment sound occurs). The frequency analysis unit 321 calculates the power spectrum Q only for the section of a predetermined length starting at each time point t identified by the waveform feature amount extraction unit 325. This configuration has the advantage of reducing the processing load of the frequency analysis unit 321 and the spectrum feature amount extraction unit 323. Furthermore, if the feature amount extraction unit 32 calculates the feature amounts F in units of sections of the musical tone signal V that correspond to the intervals between the beat points P detected by the beat point detection unit 52, the feature amounts F can be calculated for each beat point P.

(2) Modification 2
A configuration that selects the accompaniment mode data DA in consideration of the structure of the piece represented by the musical tone signal V may also be adopted. FIG. 8 is a block diagram showing the configuration of a musical tone processing apparatus 100 according to this modification. The structure determination unit 82 in the figure detects repeated sections, in which an equivalent performance recurs within the piece. FIG. 9 is a conceptual diagram showing the processing performed by the structure determination unit 82. As shown in the figure, the structure determination unit 82 calculates, based on the feature amounts F, the musical similarity of every combination of two sections chosen from the plurality of sections (for example, unit sections T) into which the piece is divided. In FIG. 9, the sequence of unit sections T from the start to the end of one piece is laid out along both the vertical axis and the horizontal axis. A black dot marks each point that corresponds to a pair of sections whose similarity exceeds a predetermined threshold. The black dots on the straight line L in FIG. 9 indicate that the similarity of a section with itself is maximal (a match). When a result such as that of FIG. 9 is obtained, the structure determination unit 82 determines that section a and section b are repeated sections (sections similar to each other).
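The pairwise comparison of sections described for FIG. 9 can be sketched as a search for diagonal runs of similar section pairs. Everything specific here is an assumption for illustration: each unit section T is represented by one feature vector, similarity is taken as 1/(1 + Euclidean distance), and a repeat is reported only when at least `min_len` consecutive sections match at the same offset.

```python
import math


def find_repeats(features, threshold, min_len=2):
    """Return (start_a, start_b, length) for runs of mutually similar sections.

    features: one feature vector per unit section, in playing order.
    A run at offset `lag` corresponds to black dots along a line parallel
    to the main diagonal of the section-by-section similarity plot.
    """
    n = len(features)

    def similar(i, j):
        return 1.0 / (1.0 + math.dist(features[i], features[j])) > threshold

    repeats = []
    for lag in range(1, n):
        run_start = None
        for i in range(n - lag + 1):  # the final step only flushes an open run
            hit = i < n - lag and similar(i, i + lag)
            if hit and run_start is None:
                run_start = i
            elif not hit and run_start is not None:
                if i - run_start >= min_len:
                    repeats.append((run_start, run_start + lag, i - run_start))
                run_start = None
    return repeats


sections = [[0.0], [1.0], [5.0], [0.0], [1.0], [9.0]]
repeats = find_repeats(sections, threshold=0.9)
```

In this toy input, sections 0 and 1 recur as sections 3 and 4, so the pair of two-section runs starting at 0 and 3 plays the role of section a and section b.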

The selection unit 62 selects the same accompaniment mode data DA within the repeated sections detected by the structure determination unit 82. In the case of FIG. 9, common accompaniment mode data DA is selected for the i-th unit section T (i is a natural number) in section a and the i-th unit section T in section b. With this configuration, the accompaniment mode data DA for section b, which repeats section a, can be selected without executing the processing of FIG. 6, which reduces the processing load of the selection unit 62. Moreover, in a configuration that executes the processing of FIG. 6 for every unit section T, different accompaniment mode data DA might be selected for sections that are in fact repetitions. With the configuration of FIG. 8, by contrast, the same accompaniment sounds are generated automatically for repeated sections, so musical consistency can be ensured.

The characteristic verification unit 84 of FIG. 8 verifies the musical appropriateness of the chord name N1 and bass note N2 specified by the pitch specifying unit 42, the beat points P detected by the beat point detection unit 52 (or the unit sections T defined by the section defining unit 54), and the accompaniment mode data DA selected by the selection unit 62, and reflects the result in the operation of each unit. For example, the characteristic verification unit 84 determines whether the chord names N1 across a plurality of unit sections T are musically appropriate and, if it judges the chord name N1 of some unit section T to be inappropriate, has the chord specifying unit 421 correct the chord name N1 of that unit section T. Suppose, for instance, that the chord name N1 is determined to be "Cm" in only one of many unit sections T and "C" in all the others; the characteristic verification unit 84 then instructs the chord specifying unit 421 to correct the chord name N1 (Cm→C) for the unit section T determined to be "Cm". Likewise, when the chord name N1 specified by the chord specifying unit 421 and the bass note N2 specified by the bass specifying unit 423 do not harmonize musically, the characteristic verification unit 84 instructs the pitch specifying unit 42 to correct the chord name N1 or the bass note N2. This configuration makes it possible to generate accompaniment sounds with less musical dissonance.
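The Cm→C correction example can be illustrated with a simple majority rule. The criterion used here is an assumption, since the text does not specify how musical appropriateness is judged: a chord name is treated as an outlier when a single other chord name accounts for at least a share `min_share` of the unit sections under consideration.

```python
from collections import Counter


def correct_outlier_chords(chords, min_share=0.8):
    """Replace isolated chord names with the overwhelmingly dominant one.

    chords: chord name N1 per unit section, in order.
    If no chord reaches min_share of the sections, nothing is changed.
    """
    if not chords:
        return []
    name, count = Counter(chords).most_common(1)[0]
    if count / len(chords) < min_share:
        return list(chords)
    return [name] * len(chords)


fixed = correct_outlier_chords(["C", "C", "C", "Cm", "C"])
```

A practical verifier would presumably look at a local window and at the key of the piece rather than at a global count; the rule above only mirrors the single example given in the text.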

(3) Modification 3
A peak corresponding to the sound of a percussion instrument may occur on the low-frequency side of the power spectrum Q. In the above embodiment, which takes the lowest frequency at which a peak occurs as the bass note N2, the sound of a percussion instrument may therefore be mistaken for the bass note N2. The bass note N2 that the bass specifying unit 423 should identify, however, is a voiced sound that is sustained longer than a percussion sound and has a harmonic structure in which overtones appear at integer multiples of the fundamental frequency. In a preferred aspect, therefore, the bass specifying unit 423 identifies as the bass note N2 a tone, among those corresponding to low-frequency peaks, that is sustained for longer than a predetermined duration and in which a harmonic structure is recognized. This configuration allows the bass note N2 to be identified with high accuracy even in a piece in which the bass note N2 and percussion sounds are mixed. That said, the method by which the bass specifying unit 423 identifies the bass note N2 is arbitrary in the present invention.
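The two conditions for the bass note N2 (sufficient duration and a recognizable harmonic structure) can be sketched as a filter over low-frequency peak candidates. The candidate layout, the 3% frequency tolerance, the number of overtones checked, and the default duration threshold are all assumptions chosen for illustration.

```python
def has_harmonics(freq, peaks, tolerance=0.03, needed=2):
    """True if peaks exist near at least `needed` of the multiples 2f, 3f, 4f."""
    found = 0
    for k in (2, 3, 4):
        target = k * freq
        if any(abs(p - target) <= tolerance * target for p in peaks):
            found += 1
    return found >= needed


def pick_bass(candidates, min_duration=0.2):
    """Return the lowest candidate frequency that is sustained and harmonic.

    candidates: list of {"freq": Hz, "duration": s, "peaks": [Hz, ...]}
    Returns None when no candidate satisfies both conditions.
    """
    sustained = [c for c in candidates
                 if c["duration"] > min_duration
                 and has_harmonics(c["freq"], c["peaks"])]
    if not sustained:
        return None
    return min(sustained, key=lambda c: c["freq"])["freq"]


kick = {"freq": 60.0, "duration": 0.05, "peaks": [60.0]}
bass = {"freq": 82.4, "duration": 0.6, "peaks": [82.4, 164.9, 247.0]}
note = pick_bass([kick, bass])
```

Here the lower-frequency kick-drum peak is rejected because it is too short and has no overtones, so the sustained tone at 82.4 Hz is chosen instead.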

(4) Modification 4
The method by which the chord specifying unit 421 identifies the chord name N1 from the feature amounts F is arbitrary in the present invention. The chord specifying unit 421 may also identify the order in which the notes of the chord are arranged (the voicing). For example, the chord specifying unit 421 determines the relative pitch of each note of the chord from the shape of the power spectrum Q (feature amount FS) and identifies the chord name N1 of the chord whose notes are arranged in that order of pitch. The chord specifying unit 421 may also identify the chord name N1 based on the bass note N2 identified by the bass specifying unit 423. For example, the chord specifying unit 421 identifies the chord name N1 of a slash chord (on-chord) that has the bass note N2 as its lowest note.

(5) Modification 5
The above embodiment extracts the feature amounts FA (by the expansion/contraction process and the accompaniment sound specifying process) for all the accompaniment mode data DA stored in the storage circuit 20, but the feature amounts FA may instead be extracted selectively for particular accompaniment mode data DA. For example, the plurality of accompaniment mode data DA in the storage circuit 20 may be classified by musical genre, and only the accompaniment mode data DA corresponding to the genre identified from the musical tone signal V or the genre specified by the user may be compared with the musical tone signal V as selection candidates. Accompaniment mode data DA that can be judged to be clearly different from the accompaniment sounds of the musical tone signal V (for example, accompaniment mode data DA whose accompaniment tempo differs greatly from that of the musical tone signal V) may also be excluded from the candidates. With these configurations, extraction of the feature amounts FA can be skipped for the accompaniment mode data DA excluded by the narrowing, which reduces the processing load of the selection unit 62.
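The narrowing of candidates by genre and tempo can be sketched as a pre-filter applied before any feature amounts FA are extracted. The record layout, the tempo unit (beats per minute), and the cutoff ratio of 1.5 are assumptions for illustration; the text only says that a tempo differing greatly from that of the musical tone signal V may exclude a candidate.

```python
def narrow_candidates(patterns, genre=None, tempo=None, max_tempo_ratio=1.5):
    """Keep only patterns matching the genre and roughly matching the tempo.

    patterns: list of {"AID": ..., "genre": ..., "tempo": BPM}
    genre / tempo: properties of the input signal; None disables that filter.
    """
    kept = []
    for p in patterns:
        if genre is not None and p["genre"] != genre:
            continue
        if tempo is not None:
            ratio = max(p["tempo"], tempo) / min(p["tempo"], tempo)
            if ratio > max_tempo_ratio:
                continue
        kept.append(p)
    return kept


patterns = [
    {"AID": "r1", "genre": "rock", "tempo": 120},
    {"AID": "r2", "genre": "rock", "tempo": 60},
    {"AID": "j1", "genre": "jazz", "tempo": 120},
]
candidates = narrow_candidates(patterns, genre="rock", tempo=110)
```

Only the surviving candidates would then go through the expansion/contraction process and the accompaniment sound specifying process.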

(6) Modification 6
The above embodiment includes the identifier AID of the accompaniment mode data DA in the unit data U output from the musical tone processing apparatus 100, but the content of the unit data U may be changed as appropriate. For example, a configuration that outputs the accompaniment mode data DA itself as unit data U, together with the chord name N1 and the bass note N2, has the advantage that the output processing device 14 need not hold the percussion tracks TR (TR_HH, TR_SD, TR_BD) of the performance data DB.

Alternatively, data obtained by adding the chord track TR_CD and the bass track TR_BS to the percussion tracks TR (the performance data DB of FIG. 7) may be stored in the storage circuit 20 as the accompaniment mode data DA. The output unit 70 generates performance data DC by inserting the chord name N1 into the event data E of the chord track TR_CD and the bass note N2 into the event data E of the bass track TR_BS of the accompaniment mode data DA, and outputs it to the output processing device 14. This configuration has the advantage that the output processing device 14 need not hold the performance data DB.

Furthermore, files in which the musical tone signal V of each unit section T is appended to the unit data U of that unit section T may be transmitted sequentially from the musical tone processing apparatus 100 to the output processing device 14. With this configuration, the output processing device 14 can not only reproduce the accompaniment sounds based on the unit data U but also use the musical tone signal V corresponding to those accompaniment sounds as material for creating various pieces of music.

Note that the configuration of FIG. 1, which outputs the identifier AID in the unit data U, reduces the amount of data transmitted from the musical tone processing apparatus 100 to the output processing device 14 compared with a configuration that outputs the data of each track TR in the unit data U. Accordingly, in a configuration where, for example, the musical tone processing apparatus 100 and the output processing device 14 are connected via a communication network, transmitting unit data U that contains the identifier AID (and no tracks TR) to the output processing device 14 is particularly suitable from the viewpoint of reducing communication traffic.

(7) Modification 7
The types of feature amounts F extracted from the musical tone signal V (or feature amounts FA extracted from the accompaniment mode data DA) are not limited to the examples above. For example, the spectrum feature amount extraction unit 323 may extract various feature amounts F such as the slope of a straight line approximating the power spectrum Q, the centroid of the power spectrum Q, mel-frequency cepstral coefficients (MFCC), or the ratio of the mean intensity at the peaks of the power spectrum Q to the mean intensity elsewhere. In short, any feature amount F may be applied that can be used for the identification of the chord name N1 and bass note N2 by the pitch specifying unit 42, the detection of the beat points P by the beat point detection unit 52 (the definition of the unit sections T by the section defining unit 54), and the selection of the accompaniment mode data DA by the selection unit 62 (the comparison between the accompaniment sounds represented by the musical tone signal V and those represented by the accompaniment mode data DA).

(8) Modification 8
The method for detecting the beat points P may be changed as appropriate. For example, the above embodiment illustrates a configuration in which the beat points P are detected based on the tendency of the type of percussion instrument played at each beat point P; however, among the plurality of attack portions A, an attack portion A in which the intensity of the component within a specific frequency band corresponding to a percussion instrument (for example, band BSD in FIG. 5) is high may be specified as a beat point P. Further, since the performance sound of a percussion instrument is momentary (it does not continue), a configuration may also be adopted in which, among the plurality of attack portions A, an attack portion A whose performance sound does not continue is specified as a beat point P.
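The two alternative criteria of this modification can be sketched as simple filters over already-detected attack portions. This is an illustration only: the `Attack` record, its fields, and both threshold values are hypothetical, and the text does not state how "high" intensity or "does not continue" is to be quantified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attack:
    """Hypothetical summary of one attack portion A."""
    time: float          # onset time in seconds
    band_energy: float   # intensity inside the percussion band (e.g. band BSD)
    total_energy: float  # intensity over the whole spectrum at the onset
    energy_after: float  # intensity a short time after the onset


def beats_by_band(attacks: List[Attack], min_band_share: float = 0.5) -> List[float]:
    """Keep attacks whose percussion-band share of the energy is high."""
    return [
        a.time for a in attacks
        if a.total_energy > 0 and a.band_energy / a.total_energy >= min_band_share
    ]


def beats_by_decay(attacks: List[Attack], max_sustain_ratio: float = 0.2) -> List[float]:
    """Keep attacks whose sound does not continue (energy drops quickly)."""
    return [
        a.time for a in attacks
        if a.total_energy > 0 and a.energy_after / a.total_energy <= max_sustain_ratio
    ]


# Toy attacks (made-up values): two drum-like hits and one sustained note.
attacks = [
    Attack(time=0.0, band_energy=8.0, total_energy=10.0, energy_after=1.0),
    Attack(time=0.25, band_energy=1.0, total_energy=10.0, energy_after=9.0),
    Attack(time=0.5, band_energy=7.0, total_energy=10.0, energy_after=1.5),
]

print(beats_by_band(attacks))
print(beats_by_decay(attacks))
```

Here the band criterion is expressed as a share of the total energy rather than an absolute level; either choice is an assumption, and an absolute threshold on `band_energy` would fit the description equally well.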

(9) Modification 9
The signal generation device 12 and the musical sound processing device 100 may constitute a single integrated device. Further, the source of the musical sound signal V is not limited to the signal generation device 12. For example, a configuration in which a musical sound signal V obtained by A/D conversion of an analog signal picked up by a microphone is supplied to the musical sound processing device 100, or a configuration in which the musical sound signal V is supplied to the musical sound processing device 100 via a communication network such as the Internet, may also be adopted.

Block diagram showing the configuration of a musical sound processing device according to one embodiment of the present invention.
Conceptual diagram for explaining the operation of the spectrum feature quantity extraction unit.
Conceptual diagram for explaining the operation of the waveform feature quantity extraction unit.
Conceptual diagram showing how the feature quantity storage unit stores feature quantities.
Conceptual diagram showing a specific example of the operation by which the beat point detection unit detects beat points.
Conceptual diagram for explaining the operation of the accompaniment processing unit and the comparison unit.
Conceptual diagram showing the structure of the performance data DB.
Block diagram showing the configuration of a musical sound processing device according to a modification.
Conceptual diagram for explaining the operation of the configuration determination unit.

Explanation of symbols

100 ... musical sound processing device, 12 ... signal generation device, 14 ... output processing device, 141 ... control circuit, 143 ... storage circuit, 145 ... sound source circuit, 147 ... output device, 20 ... storage circuit, 30 ... control circuit, 32 ... feature quantity extraction unit, 321 ... frequency analysis unit, 323 ... spectrum feature quantity extraction unit, 325 ... waveform feature quantity extraction unit, 34 ... feature quantity storage unit, 42 ... pitch specifying unit, 421 ... chord specifying unit, 423 ... bass specifying unit, 44 ... pitch storage unit, 52 ... beat point detection unit, 54 ... section defining unit, 62 ... selection unit, 64 ... accompaniment processing unit, 66 ... comparison unit, 70 ... output unit.

Claims

A musical tone processing apparatus comprising:
storage means for storing a plurality of accompaniment mode data, each specifying a mode of an accompaniment sound;
feature amount extraction means for sequentially extracting feature amounts from a musical tone signal indicating the musical tone of a piece of music;
beat point detecting means for detecting beat points from the musical tone signal;
section defining means for defining a plurality of unit sections based on the beat points detected by the beat point detecting means;
selection means that includes comparison means for comparing the feature amount of the accompaniment sound indicated by each of the plurality of accompaniment mode data with the feature amount specified by the feature amount extraction means for the musical tone signal in a unit section, and that sequentially selects, for each unit section and based on the result of the comparison by the comparison means, the accompaniment mode data of an accompaniment sound similar to the musical tone signal in the unit section from among the plurality of accompaniment mode data stored in the storage means; and
output means for outputting, for each unit section, unit data including the accompaniment mode data selected by the selection means or an identifier of that accompaniment mode data.

2. The musical tone processing apparatus according to claim 1, wherein the feature amount extraction means sequentially extracts the feature amount of the musical tone signal for each frame, and
the comparison means compares the feature amount of each frame of the accompaniment sound indicated by each of the plurality of accompaniment mode data with the feature amount of each frame of the musical tone signal in the unit section.

The musical tone processing apparatus according to claim 1 or 2, wherein the selection means includes accompaniment processing means for specifying a spectrogram of the accompaniment sound for each of the plurality of accompaniment mode data, and selects, from among the plurality of accompaniment mode data, the accompaniment mode data whose accompaniment sound spectrogram is similar to the spectrogram of the musical tone signal in the unit section, based on the result of the comparison by the comparison means between the feature amount of the spectrogram specified by the accompaniment processing means for each accompaniment mode data and the feature amount of the spectrogram of the musical tone signal in the unit section.

Any of the above musical tone processing apparatuses, wherein the beat point detecting means detects, as a beat point, a time point at which a component belonging to each of a plurality of frequency bands corresponding to performance sounds of different types of musical instruments is generated.

A program for causing a computer to execute:
a feature amount extraction process for sequentially extracting feature amounts from a musical tone signal indicating the musical tone of a piece of music;
a beat point detection process for detecting beat points from the musical tone signal;
a section defining process for defining a plurality of unit sections based on the beat points detected by the beat point detection process;
a selection process that includes a comparison process for comparing the feature amount of the accompaniment sound indicated by each of a plurality of accompaniment mode data, each specifying a mode of an accompaniment sound, with the feature amount specified in the feature amount extraction process for the musical tone signal in a unit section, and that sequentially selects, for each unit section and based on the result of the comparison process, the accompaniment mode data of an accompaniment sound similar to the musical tone signal in the unit section from among the plurality of accompaniment mode data; and
an output process for outputting, for each unit section, unit data including the accompaniment mode data selected in the selection process or an identifier of that accompaniment mode data.