JP5541008B2

JP5541008B2 - Data correction apparatus and program

Info

Publication number: JP5541008B2
Application number: JP2010194703A
Authority: JP
Inventors: 典昭阿瀬見; 満春佳山; 誠司黒川
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2010-08-31
Filing date: 2010-08-31
Publication date: 2014-07-09
Anticipated expiration: 2030-08-31
Also published as: JP2012053204A

Description

The present invention relates to a data correction apparatus and a program that correct musical score data so that the output sounds defined in the musical score data match the musical sounds constituting a target musical piece.

In recent years, automatic performance apparatuses have become known that play a musical piece according to musical score data (so-called MIDI data), which represents the score of a piece simulating a target musical piece and in which at least the pitch and output timing of each output sound are defined in advance.

In such automatic performance apparatuses, the timbre of the output sound differs from model to model. Consequently, even when a single piece of MIDI data corresponding to the same musical piece is played, the impression of the played piece differs depending on the model of the automatic performance apparatus that plays it.

To solve this problem, a data correction apparatus has been proposed that corrects the frequency characteristics of the output sounds constituting the MIDI data according to information on the model of the automatic performance apparatus, so that a single piece of MIDI data based on the same target musical piece is played with the same timbre (see, for example, Patent Document 1).

Patent Document 1: Japanese Patent No. 3650526

The MIDI data generally used in automatic performance apparatuses is created by a person (hereinafter, the producer) who listens to the target musical piece played with audio equipment or musical instruments and then defines each parameter of the MIDI data, such as the pitch and output timing of each output sound, so as to reproduce the musical sounds of that piece. MIDI data created in this way strongly reflects the producer's sensibility, and deviations in the pitch and output timing of individual output sounds often exist between the target musical piece itself and the piece represented by the MIDI data.

Such deviations in pitch and output timing do not depend on the model of the automatic performance apparatus. Therefore, even if the data correction apparatus described in Patent Document 1 corrects the frequency characteristics of the output sounds constituting the MIDI data, the deviations remain, and the impression of the piece played according to the MIDI data still differs from that of the target musical piece itself.

An object of the present invention is therefore to provide a data correction apparatus that corrects the parameters constituting MIDI data so that the musical piece represented by the MIDI data approximates the target musical piece.

In the data correction apparatus of the present invention made to achieve the above object, musical sound transition acquisition means acquires a musical sound transition, that is, the change along the time axis of the sound pressure of the musical sounds constituting the target musical piece. Output sound transition acquisition means acquires an output sound transition, that is, the change along the time axis of the sound pressure of the output sounds, based on musical score data that represents the score of a piece simulating the target musical piece and defines at least the pitch and output timing of each output sound output from a sound source module.

Correction amount deriving means then causes at least one of pitch correction amount deriving means and time correction amount deriving means to derive a correction amount. It does so based on the result of comparing musical sound information, which represents characteristics of the musical sound transition and is extracted from the musical sound transition acquired by the musical sound transition acquisition means, with output sound information, which represents characteristics of the output sound transition and is extracted from the output sound transition acquired by the output sound transition acquisition means. Musical score data correction means corrects the musical score data by shifting the individual output sounds defined in the musical score data according to the derived correction amount.

Here, the pitch correction amount deriving means derives, as one of the correction amounts and based on the result of comparing the musical sound information with the output sound information, a pitch correction amount for the musical score data such that the pitch of each output sound matches the pitch of the corresponding musical sound. The time correction amount deriving means derives, as one of the correction amounts and based on the same comparison, a time correction amount for the musical score data such that the output timing of each output sound matches the performance start timing of the corresponding musical sound.

Furthermore, in the pitch correction amount deriving means of the present invention, musical sound distribution deriving means derives, as one item of the musical sound information, a musical sound pitch distribution that represents the frequencies contained in the musical sound transition acquired by the musical sound transition acquisition means and the intensity (that is, amplitude or power) at each frequency, normalized with respect to intensity. Output sound distribution deriving means derives, as one item of the output sound information, an output pitch distribution that represents the frequencies contained in the output sound transition acquired by the output sound transition acquisition means and the intensity at each frequency, normalized in the same way.

Pitch correlation deriving means then derives a pitch correlation value, which represents the correlation between the output pitch distribution and the musical sound pitch distribution, each time the output pitch distribution is shifted along the frequency axis from a predetermined specified position of the musical sound pitch distribution. The pitch correction amount deriving means may take, as the pitch correction amount, the shift amount along the frequency axis from the specified position that corresponds to the largest of the derived pitch correlation values.
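
The shift search described above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that both pitch distributions are lists of intensities over equally spaced pitch bins (for example, semitones), that a plain dot product serves as the correlation value, and that the specified position is a shift of zero. The function names and the `max_shift` parameter are hypothetical.

```python
import math


def normalize(dist):
    """Scale non-negative intensities so that they sum to 1."""
    total = sum(dist)
    return [v / total for v in dist] if total > 0 else list(dist)


def correlation_at_shift(reference, candidate, shift):
    """Dot product of reference[i] and candidate[i - shift] over overlapping bins."""
    score = 0.0
    for i, ref in enumerate(reference):
        j = i - shift
        if 0 <= j < len(candidate):
            score += ref * candidate[j]
    return score


def best_pitch_shift(tone_dist, output_dist, max_shift=12):
    """Return the bin shift of output_dist that maximizes the correlation."""
    tone = normalize(tone_dist)      # musical sound pitch distribution
    out = normalize(output_dist)     # output pitch distribution
    best_shift, best_score = 0, -math.inf
    for shift in range(-max_shift, max_shift + 1):
        score = correlation_at_shift(tone, out, shift)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift
```

Because both distributions are normalized before the comparison, the result does not depend on the overall loudness of either signal, which is the property the description attributes to normalization.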

In musical score data corrected by such a data correction apparatus (hereinafter, corrected musical score data), at least one of the pitch and the output timing of the output sounds defined in the musical score data approximates the pitch or the performance start timing of the musical sounds constituting the target musical piece.

In other words, the data correction apparatus of the present invention reduces deviations in pitch and output timing between the output sounds constituting the corrected musical score data and the musical sounds constituting the target musical piece.

Therefore, when an automatic performance apparatus or the like plays the piece represented by the corrected musical score data, a user listening to it is less likely to be bothered by deviations from the target musical piece, or to feel that the impression differs from that of the target musical piece itself.

Correcting the musical score data according to the pitch correction amount derived in this way brings the frequencies contained in the corrected output sound transition, and the ratio of intensities among those frequencies, closer to those of the musical sound transition.

In particular, the musical sound pitch distribution and the output pitch distribution derived by the data correction apparatus of the present invention are normalized with respect to intensity. Therefore, even if the amplitude of the musical sound transition differs greatly from that of the output sound transition, the output sound transition based on the corrected musical score data can be brought close to the musical sound transition.

In the present invention, what "represents the frequencies and the intensity at each frequency" of the output sound transition and the musical sound transition may be their amplitude spectra or their power spectra.

In the latter case, the musical sound distribution deriving means and the output sound distribution deriving means may derive the power spectrum of the entire musical sound transition and of the entire output sound transition, respectively, reduce the derived power spectrum to a representative value for each specified pitch range (frequency ranges defined so that their boundaries adjoin one another), and normalize the result to obtain the musical sound pitch distribution and the output pitch distribution.

Here, the representative value for each specified pitch range may be the average over the frequencies contained in that range, or the value corresponding to the center pitch of that range.

With this data correction apparatus, the musical sound pitch distribution and the output pitch distribution are each reduced to representative values per specified pitch range along the frequency axis of the power spectrum. If, for example, the specified pitch ranges are defined in semitone units, the agreement between the pitch of the output sound transition based on the corrected musical score data and the pitch of the musical sound transition can be further improved.
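
As an illustration of reducing a power spectrum to one representative value per semitone, the following Python sketch averages the power of all spectral bins that fall into the same semitone and then normalizes the result. The mapping of frequency to a MIDI-style note number with 440 Hz as note 69 is an assumption for the example, as are the function names. The patent text only requires adjoining pitch ranges and allows either the average or the center-pitch value as the representative.

```python
import math
from collections import defaultdict


def semitone_index(freq_hz, ref_hz=440.0):
    """Nearest semitone to freq_hz, numbered so that ref_hz maps to 69 (assumed convention)."""
    return round(69 + 12 * math.log2(freq_hz / ref_hz))


def pitch_distribution(spectrum):
    """Build a normalized per-semitone distribution.

    spectrum: iterable of (frequency_hz, power) pairs with frequency_hz > 0.
    Returns a dict mapping semitone index to normalized mean power.
    """
    bins = defaultdict(list)
    for freq, power in spectrum:
        bins[semitone_index(freq)].append(power)
    # Representative value: the mean power within each semitone range.
    means = {note: sum(p) / len(p) for note, p in bins.items()}
    total = sum(means.values())
    if total == 0:
        return means
    return {note: value / total for note, value in means.items()}
```

For instance, spectral bins at 440 Hz and 445 Hz land in the same semitone and are averaged, while a bin at 880 Hz lands twelve semitones higher.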

Also, in the time correction amount deriving means of the present invention, musical sound change deriving means extracts, from the musical sound transition acquired by the musical sound transition acquisition means, the musical sound non-harmonic, which is the non-harmonic component of that transition, and derives, as one item of the musical sound information, a musical sound change representing how the musical sound non-harmonic changes along the time axis. Output sound change deriving means likewise extracts, from the output sound transition acquired by the output sound transition acquisition means, the output sound non-harmonic, which is the non-harmonic component of that transition, and derives, as one item of the output sound information, an output sound change representing how the output sound non-harmonic changes along the time axis.

Time correlation deriving means then derives a time correlation value, which represents the correlation between the musical sound change and the output sound change, each time the output sound change is stretched or compressed along the time axis with the set positions defined in the musical sound change and the output sound change made to coincide, while sequentially moving the set position along the time axis within a specified range. The time correction amount deriving means may derive, as the time correction amount, the stretch ratio along the time axis and the set position of the output sound change that correspond to the largest of the derived time correlation values.
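
The search over stretch ratio and set position can be sketched as follows. This is a simplified illustration under several assumptions: both changes are lists of per-frame non-harmonic energy, stretching uses nearest-neighbor resampling, the set position is modeled as the frame offset at which the start of the stretched output sound change is placed in the musical sound change, and the correlation value is a cosine similarity. All function and parameter names are hypothetical.

```python
import math


def stretch(seq, ratio):
    """Resample seq to round(len(seq) * ratio) frames by nearest-neighbor lookup."""
    n = max(1, round(len(seq) * ratio))
    return [seq[min(len(seq) - 1, int(k / ratio))] for k in range(n)]


def norm(seq):
    """Euclidean norm of a sequence."""
    return math.sqrt(sum(v * v for v in seq))


def correlation_at_offset(reference, candidate, offset):
    """Cosine similarity with candidate[i] aligned to reference[offset + i]."""
    dot = 0.0
    for i, value in enumerate(candidate):
        j = offset + i
        if 0 <= j < len(reference):
            dot += reference[j] * value
    denom = norm(reference) * norm(candidate)
    return dot / denom if denom > 0 else 0.0


def best_time_correction(tone_change, output_change, ratios, max_offset):
    """Return the (stretch ratio, offset) pair with the largest correlation."""
    best_ratio, best_offset, best_score = 1.0, 0, -math.inf
    for ratio in ratios:
        stretched = stretch(output_change, ratio)
        for offset in range(-max_offset, max_offset + 1):
            score = correlation_at_offset(tone_change, stretched, offset)
            if score > best_score:
                best_ratio, best_offset, best_score = ratio, offset, score
    return best_ratio, best_offset
```

Dividing by the norms of the full sequences, rather than of the overlapping part only, keeps short partial overlaps from producing misleadingly high scores.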

In general, the non-harmonic components contained in a musical sound transition or output sound transition are abundant in percussion instruments (for example, drums and bass) and in the attack sounds of instruments, and their degree of correlation changes sharply with a time offset. Using the non-harmonic components therefore allows the time correction amount to be derived accurately on the basis of the correlation value.

Therefore, correcting the musical score data according to the time correction amount aligns the rhythm of the output sound transition based on the corrected musical score data with that of the musical sound transition, and in turn aligns the output timing of each output sound in the corrected musical score data with the performance start timing of the corresponding musical sound.

In the present invention, the output sound transition acquisition means may acquire an output sound transition based on corrected musical score data in which the musical score data correction means has shifted the frequency of each output sound defined in the musical score data according to the pitch correction amount (hereinafter, corrected sound transition).

In this case, the time correction amount deriving means may derive, as one item of the output sound information, an output sound change based on the corrected sound transition, and may derive the time correction amount according to the time correlation value between that output sound change and the musical sound change.

In this configuration, the methods of deriving the musical sound change, the output sound change, the time correlation value, and the time correction amount are the same as in the invention according to claim 3, except that the output sound change is derived from the corrected sound transition instead of the output sound transition.

In this data correction apparatus, the pitch of each output sound defined in the musical score data is corrected to match the pitch of the musical sounds constituting the target musical piece before the time correction amount is derived. This prevents the accuracy of the time correction amount from being degraded by a pitch deviation between the output sounds defined in the musical score data and the musical sounds constituting the target musical piece.

Moreover, with this data correction apparatus, the piece represented by corrected musical score data in which both the pitch and the output timing of the output sounds have been corrected can be matched more closely to the target musical piece.

In the data correction apparatus of the present invention, the musical sound change deriving means may derive the musical sound change for each target section, which is a section of the target musical piece in which the tempo is constant; the output sound change deriving means may derive the output sound change for each section corresponding to a target section; the time correlation deriving means may derive the time correlation value for each target section; and the time correction amount deriving means may derive the time correction amount for each target section.

With such a data correction apparatus, a time correction amount can be derived, and the output timing of the output sounds corrected, for each section of the target musical piece in which the tempo is constant. As a result, the output timing of each output sound in the corrected musical score data can be matched more accurately to the performance start timing of each musical sound in the target musical piece.

Furthermore, in the data correction apparatus of the present invention, musical sound amplitude deriving means derives, from the musical sound transition acquired by the musical sound transition acquisition means, a musical sound average amplitude representing the average amplitude of that transition, and output sound amplitude deriving means derives, from the output sound transition acquired by the output sound transition acquisition means, an output sound average amplitude representing the average amplitude of that transition.

Ratio deriving means may then derive a volume ratio, which is the ratio between the musical sound average amplitude derived by the musical sound amplitude deriving means and the output sound average amplitude derived by the output sound amplitude deriving means, and volume correction means may correct the volume of the output sounds by multiplying the sound pressure of each output sound defined in the musical score data by the volume ratio derived by the ratio deriving means.
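
The volume correction can be illustrated with a short Python sketch. It assumes that both transitions are lists of signed samples, that the average amplitude is the mean absolute sample value, and that the ratio is the musical sound average divided by the output sound average. Applying the ratio to MIDI-style velocities clamped to 0 through 127 is a simplification chosen for the example, since the text only speaks of multiplying the sound pressure of each output sound. All names are hypothetical.

```python
def mean_abs_amplitude(samples):
    """Average amplitude as the mean absolute sample value (an assumed definition)."""
    return sum(abs(s) for s in samples) / len(samples)


def volume_ratio(tone_samples, output_samples):
    """Ratio of the musical sound average amplitude to the output sound average amplitude."""
    output_avg = mean_abs_amplitude(output_samples)
    if output_avg == 0:
        raise ValueError("output sound transition is silent")
    return mean_abs_amplitude(tone_samples) / output_avg


def scale_velocities(velocities, ratio, max_velocity=127):
    """Multiply each velocity by the ratio and clamp to the valid range."""
    return [max(0, min(max_velocity, round(v * ratio))) for v in velocities]
```

If the target recording is twice as loud on average as the rendered score, the ratio is 2.0, and velocities that would exceed 127 after scaling are clamped.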

With such a data correction apparatus, the volume of each output sound defined in the musical score data can be approximated to the volume of the musical sounds of the target musical piece.

The present invention may also be a program for causing a computer to function as the data correction apparatus.

In this case, the program of the present invention must cause a computer to execute: a musical sound transition acquisition procedure that acquires a musical sound transition, that is, the change along the time axis of the sound pressure of the musical sounds constituting the target musical piece; an output sound transition acquisition procedure that acquires an output sound transition, that is, the change along the time axis of the sound pressure of the output sounds, based on musical score data that represents the score of a piece simulating the target musical piece and defines at least the pitch and output timing of each output sound output from a sound source module; a correction amount derivation procedure that causes at least one of a pitch correction amount derivation procedure and a time correction amount derivation procedure to derive a correction amount, based on the result of comparing musical sound information, which represents characteristics of the musical sound transition and is extracted from the musical sound transition acquired in the musical sound transition acquisition procedure, with output sound information, which represents characteristics of the output sound transition and is extracted from the output sound transition acquired in the output sound transition acquisition procedure; and a musical score data correction procedure that corrects the musical score data by shifting the individual output sounds defined in the musical score data according to the correction amount derived in the correction amount derivation procedure.

Here, the pitch correction amount derivation procedure derives, as one of the correction amounts and based on the result of comparing the musical sound information with the output sound information, a pitch correction amount for the musical score data such that the pitch of each output sound matches the pitch of the corresponding musical sound. The time correction amount derivation procedure derives, as one of the correction amounts and based on the same comparison, a time correction amount for the musical score data such that the output timing of each output sound matches the performance start timing of the corresponding musical sound.

Furthermore, the pitch correction amount derivation procedure of the present invention causes the computer to execute a musical sound distribution derivation procedure that derives the musical sound pitch distribution as one item of the musical sound information, an output sound distribution derivation procedure that derives the output pitch distribution as one item of the output sound information, and a pitch correlation derivation procedure that derives a pitch correlation value, which represents the correlation between the output pitch distribution and the musical sound pitch distribution, each time the output pitch distribution is shifted along the frequency axis from a predetermined specified position of the musical sound pitch distribution. The shift amount along the frequency axis from the specified position that corresponds to the largest of the pitch correlation values derived in the pitch correlation derivation procedure is derived as the pitch correction amount.

Also, the time correction amount derivation procedure of the present invention causes the computer to execute a musical sound change derivation procedure that derives the musical sound change as one item of the musical sound information, an output sound change derivation procedure that derives the output sound change as one item of the output sound information, and a time correlation derivation procedure that derives a time correlation value, which represents the correlation between the musical sound change and the output sound change, each time the output sound change is stretched or compressed along the time axis with the set positions defined in the musical sound change and the output sound change made to coincide, while sequentially moving the set position along the time axis within a specified range. The stretch ratio along the time axis and the set position of the output sound change that correspond to the largest of the time correlation values derived in the time correlation derivation procedure are derived as the time correction amount.

A program of the present invention configured in this way can be used, for example, by recording it on a computer-readable recording medium such as a DVD-ROM, CD-ROM, or hard disk and loading it into a computer to run it as needed, or by having a computer obtain it over a communication line and run it as needed. By causing a computer to execute each procedure, the computer can be made to function as the data correction apparatus described in claim 1 or claim 3.

A block diagram showing the schematic configuration of the data correction apparatus in the embodiment.
A flowchart showing the procedure of the data correction process.
A flowchart showing the procedure of the pitch correction process.
An explanatory diagram illustrating the content of the pitch correction process.
A flowchart showing the procedure of the time correction process.
An explanatory diagram illustrating the content of the time correction process.
A flowchart showing the procedure of the volume correction process.

Embodiments of the present invention are described below with reference to the drawings.

The data correction apparatus to which the present invention is applied corrects musical score data, which represents the score of a piece simulating a target musical piece (one of a number of previously created musical pieces), so that the output sounds defined in the musical score data match the musical sounds constituting the target musical piece. In this embodiment, the data correction apparatus is constituted by the information processing apparatus 10 shown in FIG. 1. The target musical piece in this embodiment is created so that musical sounds produced by a plurality of sound sources (for example, musical instruments and people) are superimposed.

<Configuration of the data correction apparatus>

As shown in FIG. 1, the information processing apparatus 10 includes a communication unit 11, an acoustic data reading unit 12, an input receiving unit 13, a display unit 14, an audio input unit 15, an audio output unit 16, a sound source module 17, a storage unit 18, and a control unit 20.

Of these, the communication unit 11 connects the information processing apparatus 10 to a network (for example, a dedicated line or a WAN) and communicates with external devices through that network.

The acoustic data reading unit 12 is a device (for example, a CD or DVD reader) that sequentially reads acoustic data stored on a storage medium along the time axis. The acoustic data is obtained by sampling an analog waveform in which the sound pressure of all the musical sounds constituting the target musical piece changes along the time axis.

The input receiving unit 13 is an input device (for example, a keyboard or a pointing device) that accepts information and commands entered through external operation. The display unit 14 is a display device (for example, a liquid crystal display or a CRT) that displays images. The audio input unit 15 is a device (a so-called microphone) that converts sound into an electrical signal and inputs it to the control unit 20. The audio output unit 16 is a device (a so-called speaker) that converts an electrical signal from the control unit 20 into sound and outputs it.

The sound source module 17 is a device that, based on the score data, outputs output sounds simulating the sounds of sound sources. In the present embodiment, the sound source module 17 is a well-known MIDI (Musical Instrument Digital Interface) sound source. The sound sources whose sounds the sound source module 17 simulates as output sounds (hereinafter, simulated sound sources) are registered in advance and include keyboard instruments (for example, piano and pipe organ), stringed instruments (for example, violin, viola, guitar, and koto), percussion instruments (for example, drums, cymbals, timpani, and xylophone), and wind instruments (for example, clarinet, trumpet, flute, and shakuhachi).

The score data includes at least identification data, which distinguishes the musical piece simulating the target musical piece (hereinafter, the corresponding musical piece), and score tracks, each of which represents the score for one simulated sound source used in the corresponding musical piece. The score data in this embodiment is expressed according to the well-known MIDI standard.

Each score track defines the period during which the sound source module 17 outputs each output sound (hereinafter, note length) and the pitch of each output sound (the so-called note number). Each score track also defines the strength of each output sound (so-called attack, velocity, decay, and the like) and the tempo in each section into which the corresponding musical piece is divided (for example, a verse or a chorus).

The note length in a score track is defined by two values: the output timing (the so-called note-on timing), which is the time from the start of the performance of the piece until output of the output sound begins, and the end timing (the so-called note-off timing), which is the time from the start of the performance until output of the output sound ends.
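
As a minimal sketch of how one output sound in a score track might be held in memory, the following uses a hypothetical `Note` record. The class and field names are illustrative assumptions and do not come from the embodiment or from the MIDI standard; times are assumed to be in seconds from the start of the performance.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical record for one output sound in a score track."""
    note_number: int  # pitch (MIDI note number)
    velocity: int     # strength of the output sound
    note_on: float    # output timing, seconds from the start of the performance
    note_off: float   # end timing, seconds from the start of the performance

    @property
    def length(self) -> float:
        # The note length is the span between the output timing and the end timing.
        return self.note_off - self.note_on
```

A note that starts at 1.0 s and ends at 1.5 s would then have a length of 0.5 s.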

The storage unit 18 is a non-volatile storage device (for example, a hard disk drive) whose contents can be read and written. The storage unit 18 stores at least the processing programs and the score data.

The control unit 20 is built around a well-known computer that includes at least a ROM 21, which stores processing programs and data that must be retained even when the power is turned off; a RAM 22, which temporarily stores processing programs and data; and a CPU 23, which executes various processes (operations) according to the processing programs stored in the ROM 21 and the RAM 22.

As one of the processing programs in this embodiment, a program is prepared in advance that causes the control unit 20 to execute a data correction process, which corrects the score data of the musical piece corresponding to the target musical piece so that the output sounds defined in that score data match the musical sounds constituting the target musical piece.
<Processing details of data correction processing>
Next, the data correction process executed by the control unit 20 will be described.

The data correction process starts when an activation command for it is entered via the input receiving unit 13.
As shown in FIG. 2, once started, the data correction process first acquires the score data corresponding to the musical piece designated by the information entered via the input receiving unit 13 (S110; S denotes a step).

Next, the acoustic data read by the acoustic data reading unit 12 is acquired as a musical sound transition, that is, a waveform showing how the musical sounds constituting the target musical piece change along the time axis (S120). It is assumed that, before the data correction process is started, a storage medium holding the acoustic data of the target musical piece corresponding to the score data acquired in S110 has been loaded into the acoustic data reading unit 12.

Then, based on the score data acquired in S110 and the musical sound transition acquired in S120, a pitch correction process is executed that corrects the score data so that the pitches of the output sounds match the pitches of the musical sounds constituting the target musical piece (S130). Hereinafter, score data whose output sounds have been corrected is referred to as corrected score data.

Next, a time correction process is executed that corrects the corrected score data so that the output timings of the output sounds whose pitches were corrected by the pitch correction process (hereinafter, corrected output sounds) match the performance start timings of the musical sounds (S150). Then a volume correction process is executed that corrects the corrected score data so that the strengths of the corrected output sounds, whose output timings were corrected by the time correction process, match the strengths (that is, the volumes) of the musical sounds (S170).

The data correction process then ends.
In other words, the data correction process generates corrected score data in which the pitch, output timing, and strength of each output sound defined in the score data have been corrected to match the pitch, performance start timing, and volume of the musical sounds constituting the target musical piece.
<Pitch correction processing details>
Next, the pitch correction process started in S130 of the data correction process will be described.

When started, the pitch correction process, as shown in FIG. 3, acquires an output sound transition, which is a waveform showing how all the output sounds change along the time axis, based on all the score tracks included in the score data acquired in S110 (S310). Specifically, in this embodiment the output sound transition is acquired by having the sound source module 17 output the individual output sounds defined in all the score tracks along the time axis of the score data and capturing them via the voice input unit 15.

Next, the acquired output sound transition is frequency-analyzed (by discrete Fourier transform in this embodiment) for each unit time set along the time axis, yielding a power spectrum that represents the frequencies contained in the output sound transition for that unit time and the intensity at each frequency (S320). From these power spectra, an average output sound spectrum is derived by taking, for each frequency, the arithmetic mean of the intensity along the time axis (S330). The intensities of the average output sound spectrum are then averaged over each of a set of predefined frequency ranges whose boundaries adjoin one another (for example, semitone-wide ranges; hereinafter, specified pitch ranges) to give one representative value per range (S340). Finally, the representative values obtained in S340 are normalized to a mean of 0 and a variance of 1, giving a normalized output sound spectrum (see FIG. 4(A)) (S350).
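
The four steps S320 to S350 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of MIDI note numbers with A4 = 440 Hz to place the semitone-wide ranges, and the default note range are all hypothetical, and a naive DFT from the standard library stands in for whatever transform an actual implementation would use.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of one unit-time frame, bins 0..N/2 (S320)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def average_spectrum(frames):
    """Arithmetic mean of the per-frame power spectra, bin by bin (S330)."""
    spectra = [power_spectrum(f) for f in frames]
    return [sum(col) / len(spectra) for col in zip(*spectra)]


def semitone_bands(avg, sample_rate, frame_len, low_note, high_note):
    """Mean power in each semitone-wide range (S340).

    Ranges are centred on MIDI note numbers low_note..high_note-1, assuming
    A4 (note 69) = 440 Hz. A range containing no DFT bin yields 0.0.
    """
    bands = []
    for note in range(low_note, high_note):
        lo = 440.0 * 2 ** ((note - 0.5 - 69) / 12)
        hi = 440.0 * 2 ** ((note + 0.5 - 69) / 12)
        vals = [p for k, p in enumerate(avg)
                if lo <= k * sample_rate / frame_len < hi]
        bands.append(sum(vals) / len(vals) if vals else 0.0)
    return bands


def normalize(values):
    """Rescale to mean 0 and variance 1 (S350)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid division by zero for a flat input
    return [(v - mean) / std for v in values]


def normalized_spectrum(frames, sample_rate, low_note=36, high_note=96):
    """Run S320 to S350 on equal-length frames of one waveform."""
    avg = average_spectrum(frames)
    bands = semitone_bands(avg, sample_rate, len(frames[0]), low_note, high_note)
    return normalize(bands)
```

In this sketch the frame length has to be long enough for the DFT bin spacing (sample rate divided by frame length) to be narrower than the lowest semitone range of interest; otherwise low ranges receive no bins.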

The representative value obtained in S340 need not be the average of the intensities at the frequencies within a specified pitch range; the intensity at the frequency corresponding to the center of the specified pitch range may be used instead. In that case, specifically, the value (power) at the frequency closest to each point of a 20-cent grid (that is, every one fifth of a semitone) is extracted.
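
The 20-cent alternative can be sketched as a nearest-bin lookup. The function name, the 440 Hz reference frequency, and the default span of two octaves on either side of it are assumptions made for illustration; the source only states that the value nearest each 20-cent grid point is extracted.

```python
def cent_grid_samples(avg, sample_rate, frame_len, f_ref=440.0,
                      lo_cents=-2400, hi_cents=2400, step=20):
    """Pick the power of the DFT bin nearest each point of a 20-cent grid.

    avg holds one power value per DFT bin (bin k is at k * sample_rate / frame_len Hz).
    """
    out = []
    for cents in range(lo_cents, hi_cents + 1, step):
        freq = f_ref * 2 ** (cents / 1200)           # grid frequency in Hz
        k = round(freq * frame_len / sample_rate)    # nearest DFT bin index
        k = min(len(avg) - 1, max(0, k))             # keep the index in range
        out.append(avg[k])
    return out
```

Because 100 cents make one semitone, a 20-cent step gives five samples per semitone.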

Next, the musical sound transition acquired in S120 is frequency-analyzed for each unit time set along the time axis, yielding a power spectrum that represents the frequencies contained in the musical sound transition for that unit time and the intensity at each frequency (S360). From these power spectra, an average musical sound spectrum is derived by taking, for each frequency, the arithmetic mean of the intensity along the time axis (S370). The intensities of the average musical sound spectrum are averaged over each specified pitch range to give one representative value per range (S380), and the representative values obtained in S380 are normalized to a mean of 0 and a variance of 1, giving a normalized musical sound spectrum (see FIG. 4(B)) (S390).

As with S340, the representative value obtained in S380 need not be the average of the intensities at the frequencies within a specified pitch range; the intensity at the frequency corresponding to the center of the specified pitch range may be used instead. In that case, specifically, the value (power) at the frequency closest to each point of a 20-cent grid (that is, every one fifth of a semitone) is extracted.

Then, as described in detail later, a correlation value between the normalized output sound spectrum and the normalized musical sound spectrum (hereinafter, pitch correlation value) is derived (S400). It is then determined whether the shift amount of the normalized output sound spectrum relative to the normalized musical sound spectrum is equal to or greater than a predefined upper limit (S410). If the shift amount is below the upper limit (S410: NO), the normalized output sound spectrum is shifted along the frequency axis by a predefined amount (S420), and the process returns to S400 to derive the pitch correlation value again.

That is, in S400 to S420 of this embodiment, as shown in FIG. 4(C), the normalized output sound spectrum is shifted along the frequency axis relative to the normalized musical sound spectrum from a lower limit up to the upper limit, and a pitch correlation value is derived each time it is shifted.

When the shift amount of the normalized output sound spectrum reaches or exceeds the upper limit (S410: YES), a correction amount for matching the pitches of the output sounds to the pitches of the musical sounds constituting the target musical piece (hereinafter, pitch correction amount) is derived (S430). Specifically, in S430 of this embodiment, the shift amount of the normalized output sound spectrum that corresponds to the largest of all the pitch correlation values derived in S400 is taken as the pitch correction amount.
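
The search in S400 to S430 amounts to trying every shift between the lower and upper limits and keeping the one with the highest correlation. The sketch below assumes both spectra are lists with one value per specified pitch range, that a shift is counted in whole ranges, and that the correlation is the mean of element-wise products; the source defers the exact correlation formula, so that choice and all names here are assumptions.

```python
def correlation(a, b):
    """Mean of element-wise products (assumed form of the pitch correlation value)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def shifted(values, shift):
    """Shift a spectrum along the frequency axis by `shift` ranges, zero-filling gaps."""
    n = len(values)
    return [values[i - shift] if 0 <= i - shift < n else 0.0 for i in range(n)]


def best_pitch_shift(output_spec, music_spec, lower=-12, upper=12, step=1):
    """Return the shift with the highest correlation (S400 to S430)."""
    best_shift, best_corr = lower, float("-inf")
    for shift in range(lower, upper + 1, step):                       # S410 / S420
        corr = correlation(shifted(output_spec, shift), music_spec)   # S400
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift                                                 # S430
```

With semitone-wide ranges, a returned value of 3 would mean that the output sounds are three semitones too low relative to the target musical piece.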

Next, corrected score data is generated by correcting the pitch of each output sound defined in every score track of the score data according to the derived pitch correction amount (S440). That is, in the corrected score data generated in S440 of this embodiment, the pitch of each output sound is shifted by the pitch correction amount from the pitch originally defined.
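
Applying the single pitch correction amount in S440 might look like the following sketch. It assumes the correction amount has been rounded to a whole number of semitones and that each note is a dictionary with a `note_number` key; both are illustrative assumptions. Clamping to 0 through 127 reflects the range of MIDI note numbers.

```python
def apply_pitch_correction(tracks, semitones):
    """Return new tracks with every note number shifted by `semitones` (S440).

    tracks is a list of score tracks, each a list of note dictionaries.
    The input is left unmodified.
    """
    return [
        [{**note, "note_number": max(0, min(127, note["note_number"] + semitones))}
         for note in track]
        for track in tracks
    ]
```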

The pitch correction process then ends, and processing returns to the data correction process.
In other words, the pitch correction process corrects the pitch of each output sound defined in every score track of the score data according to a single pitch correction amount, which is derived by comparing the normalized musical sound spectrum (musical sound information representing the characteristics of the musical sound transition) with the normalized output sound spectrum (output sound information representing the characteristics of the output sound transition).

<About time correction processing>
Next, the time correction process started in S150 of the data correction process will be described.
When started, the time correction process, as shown in FIG. 5, acquires a corrected sound transition, which is a waveform showing how all the corrected output sounds change along the time axis, based on all the score tracks included in the corrected score data generated in S440 (S510). In this embodiment, the corrected sound transition may be acquired in the same way as in S310.

Next, an output sound non-harmonic, which is the non-harmonic component of the acquired corrected sound transition, is derived from that transition (S520), and a musical sound non-harmonic, which is the non-harmonic component of the musical sound transition acquired in S120, is derived from that transition (S530). These non-harmonic components may be derived by passing the corrected sound transition or the musical sound transition through a filter prepared in advance.

The output sound non-harmonic and the musical sound non-harmonic are then each divided into specific blocks, each a span of time defined along the time axis (S540). For the output sound non-harmonic, each specific block is a constant-tempo section (corresponding to the target section of the present invention), that is, a section of the corresponding musical piece in which the tempo is constant. The constant-tempo sections are determined from the tempo defined in the score data by taking the times at which the tempo changes as the start and end times of the sections. For the musical sound non-harmonic, the specific blocks are determined after those of the output sound non-harmonic: the times, measured from the start of the performance of the target musical piece, that correspond to the start and end times of each specific block of the output sound non-harmonic are taken as the start and end times of the corresponding specific block of the musical sound non-harmonic.
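
Deriving the constant-tempo sections from the tempo change times can be sketched as below. The function name and the representation of tempo changes as a plain list of times in seconds are assumptions for illustration.

```python
def constant_tempo_sections(tempo_change_times, total_duration):
    """Split [0, total_duration] into (start, end) sections at each tempo change.

    Change times outside the open interval (0, total_duration) are ignored,
    and duplicates collapse into one boundary.
    """
    inner = [t for t in tempo_change_times if 0.0 < t < total_duration]
    bounds = sorted({0.0, float(total_duration), *inner})
    return list(zip(bounds[:-1], bounds[1:]))
```

For a 120-second piece whose tempo changes at 30 s and 75 s, this gives three sections: 0 to 30 s, 30 to 75 s, and 75 to 120 s.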

One pair of specific blocks is then selected from those obtained in S540 (S550), and for that pair, unit data representing the change along the time axis is generated for both the musical sound non-harmonic and the output sound non-harmonic (S560). As shown in FIGS. 6(A) and 6(B), the unit data in this embodiment is generated by summing the amplitude values of the non-harmonic component within each specified section, which is shorter than a specific block, and then normalizing the per-section sums. Hereinafter, the unit data for the output sound non-harmonic is called output sound unit data (corresponding to the output sound change of the present invention), and the unit data for the musical sound non-harmonic is called musical sound unit data (corresponding to the musical sound change of the present invention).
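
Generating unit data for one specific block (S560) can be sketched as follows. The source says only that amplitude values are summed per specified section and the sums are normalized; summing absolute values and normalizing to mean 0 and variance 1 are assumptions made here, as are the names.

```python
import math


def unit_data(samples, section_len):
    """Sum |amplitude| over each fixed-length section, then normalize the sums.

    samples is the non-harmonic component of one specific block;
    section_len is the number of samples per specified section.
    """
    sums = [sum(abs(s) for s in samples[i:i + section_len])
            for i in range(0, len(samples), section_len)]
    mean = sum(sums) / len(sums)
    std = math.sqrt(sum((v - mean) ** 2 for v in sums) / len(sums)) or 1.0
    return [(v - mean) / std for v in sums]
```

The result has one value per specified section, so it is much shorter than the waveform it summarizes.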

An output sound setting position defined on the time axis of the output sound unit data is aligned with a musical sound setting position defined on the time axis of the musical sound unit data, and a correlation value between the output sound unit data and the musical sound unit data (hereinafter, time correlation value) is derived (S570). It is then determined whether the expansion/contraction rate of the output sound unit data relative to the musical sound unit data is equal to or greater than a predefined upper limit (the upper limit of the expansion/contraction rate) (S580). If the rate is below that upper limit (S580: NO), the output sound unit data is stretched along the time axis by a predefined amount (S590), and the process returns to S570.

If the expansion/contraction rate has reached its upper limit (S580: YES), it is determined whether the shift amount of the output sound unit data along the time axis relative to the musical sound unit data is equal to or greater than a predefined upper limit (the upper limit of the shift amount) (S600). If the shift amount is below that upper limit (S600: NO), the setting position of the output sound unit data is shifted by a predefined time (S610), the expansion/contraction rate of the output sound unit data is reset to its lower limit, and the process returns to S570.

That is, in S570 to S610 of this embodiment, as shown in FIG. 6(C), a time correlation value is derived each time the output sound unit data is stretched relative to the musical sound unit data, until the expansion/contraction rate reaches its upper limit. This derivation is repeated while the output sound unit data is shifted along the time axis relative to the musical sound unit data, until the shift amount reaches its upper limit.

If, on the other hand, the determination in S600 finds that the shift amount of the output sound unit data is equal to or greater than its upper limit (S600: YES), a correction amount for matching the output timings of the corrected output sounds to the performance start timings of the musical sounds constituting the target musical piece (hereinafter, time correction amount) is derived (S620). Specifically, in S620 of this embodiment, the expansion/contraction rate and shift amount of the output sound unit data that correspond to the largest of all the time correlation values derived in S570 for the pair of specific blocks are taken as the time correction amount for the specific block selected in S550.
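
The nested search in S570 to S620 can be sketched as a grid search with the shift in the outer loop and the expansion/contraction rate in the inner loop. The nearest-neighbour stretching, the mean-of-products correlation over the overlapping part, and all names are assumptions for illustration; the candidate rates and offsets are passed in as plain lists.

```python
def resample(values, rate):
    """Stretch a sequence along the time axis by `rate` (nearest-neighbour)."""
    n = max(1, round(len(values) * rate))
    return [values[min(len(values) - 1, int(i / rate))] for i in range(n)]


def correlation_at(output_unit, music_unit, offset):
    """Mean product over the overlap when output_unit starts `offset` sections in."""
    pairs = [(output_unit[i], music_unit[i + offset])
             for i in range(len(output_unit))
             if 0 <= i + offset < len(music_unit)]
    if not pairs:
        return float("-inf")
    return sum(a * b for a, b in pairs) / len(pairs)


def best_time_correction(output_unit, music_unit, rates, offsets):
    """Return the (rate, offset) pair with the highest time correlation value."""
    best_corr, best_rate, best_offset = float("-inf"), None, None
    for offset in offsets:        # outer loop: shift (S600 / S610)
        for rate in rates:        # inner loop: stretch, restarted per shift (S580 / S590)
            corr = correlation_at(resample(output_unit, rate), music_unit, offset)  # S570
            if corr > best_corr:
                best_corr, best_rate, best_offset = corr, rate, offset
    return best_rate, best_offset  # S620
```

Averaging over the overlap only is one possible design choice; with very short overlaps it can favour spurious matches, so a practical version would probably bound the shift range accordingly.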

Corrected score data in which the output timing of each output sound has been corrected is then generated according to the derived time correction amount (S630). In S630 of this embodiment, the start time and end time of the specific block in the corrected score data (the score data whose output sound pitches have already been corrected) are corrected based on the shift amount and the expansion/contraction rate of the output sound unit data derived as the time correction amount for the specific block selected in S550. The output timings of the output sounds are then stretched or compressed to fit the period defined by the corrected start and end times, so that the ratios between the intervals of the output timings before correction are preserved. This yields corrected score data in which the output timing of each output sound in the specific block has been corrected. In S630 of this embodiment, the end timings of the output sounds are also corrected, using the same method as for the output timings.
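
Preserving the interval ratios in S630 is equivalent to a linear remapping of every timing in the block from the old span to the corrected span. In the sketch below, how the shift amount and rate translate into the corrected span (start moved by the shift, length multiplied by the rate) is an assumption, as are the names and the dictionary representation of notes.

```python
def corrected_span(old_start, old_end, shift, rate):
    """Assumed mapping from a time correction amount to the corrected block span."""
    new_start = old_start + shift
    return new_start, new_start + (old_end - old_start) * rate


def retime_block(notes, old_start, old_end, new_start, new_end):
    """Linearly remap note-on and note-off times into the corrected span (S630)."""
    scale = (new_end - new_start) / (old_end - old_start)

    def remap(t):
        return new_start + (t - old_start) * scale

    return [{**n, "note_on": remap(n["note_on"]), "note_off": remap(n["note_off"])}
            for n in notes]
```

Because the mapping is linear, the ratio between any two intervals inside the block is the same before and after correction.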

Next, it is determined whether a time correction amount has been derived for every specific block obtained in S540 (S640). If not (S640: NO), the process returns to S550, where a new specific block is selected, and the steps up to S630 are executed again. In S550, specific blocks are selected in descending order of time length, and a time correction amount is derived for each. However, for a specific block adjacent to one whose time correction amount has already been derived, the corrected start time or end time of that already-processed block is used as the corresponding value for the block itself.

If, on the other hand, the determination in S640 finds that a time correction amount has been derived for every specific block (S640: YES), the time correction process ends, and processing returns to the data correction process.

In other words, the time correction process corrects the output timing of each output sound defined in every score track of the score data according to the time correction amounts, which are derived by comparing the musical sound unit data (musical sound information representing the characteristics of the musical sound transition) with the output sound unit data (output sound information representing the characteristics of the corrected sound transition).

<Volume correction processing details>
Next, the volume correction process started in S170 of the data correction process will be described.
When started, the volume correction process, as shown in FIG. 7, acquires a corrected sound transition, which is a waveform showing how all the corrected output sounds change along the time axis, based on all the score tracks included in the corrected score data generated in S640 (S710). In this embodiment, the corrected sound transition may be acquired in the same way as in S310.

The amplitude of the corrected sound transition acquired in S710 is averaged over the entire time axis (the full duration) to derive an output sound average amplitude (S720), and the amplitude of the musical sound transition acquired in S120 is likewise averaged over the entire time axis to derive a musical sound average amplitude (S730). A volume ratio, which is the ratio between the output sound average amplitude derived in S720 and the musical sound average amplitude derived in S730, is then derived (S740).

The strength of each corrected output sound in the corrected score data generated in S640 (that is, a value such as velocity that represents sound pressure) is multiplied by the volume ratio, generating corrected score data in which the volume of the corrected output sounds matches the volume of the musical sounds (S750).
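
Steps S720 to S750 can be sketched as follows. The source does not state the direction of the ratio, so this sketch assumes it is the musical sound average amplitude divided by the output sound average amplitude, which is the direction that scales the output toward the target. Averaging absolute sample values, rounding, and clamping velocities to 1 through 127 are further assumptions, as are the names.

```python
def mean_amplitude(samples):
    """Average absolute amplitude over the whole waveform (S720 / S730)."""
    return sum(abs(s) for s in samples) / len(samples)


def scale_velocities(velocities, output_wave, music_wave):
    """Multiply each velocity by the volume ratio and clamp to 1..127 (S740 / S750)."""
    ratio = mean_amplitude(music_wave) / mean_amplitude(output_wave)
    return [max(1, min(127, round(v * ratio))) for v in velocities]
```

A single ratio is applied to every output sound, so relative dynamics between notes are kept except where clamping applies.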

The volume correction process then ends, and processing returns to the data correction process.
[Effect of the embodiment]
As described above, in the corrected score data produced by the data correction apparatus 1 of this embodiment, the pitch, output timing, and strength (volume) of the output sounds defined in the score data match the pitch, performance start timing, and volume of the musical sounds constituting the target musical piece.

This reduces deviations in pitch, output timing, and volume between the corrected output sounds constituting the corrected score data and the musical sounds constituting the target musical piece.
As a result, when an automatic performance apparatus or the like plays the corresponding musical piece according to the corrected score data, a user listening to the performance can be kept from feeling that something is off because of deviations from the target musical piece, or that the impression differs from that of the target musical piece itself.

In particular, the pitch correction process of this embodiment derives the pitch correction amount by comparing the normalized output sound spectrum and the normalized musical sound spectrum, which are obtained by normalizing the intensity at each frequency in the power spectra of the output sound transition and the musical sound transition.

Therefore, if the pitch of each output sound defined in the score data is corrected using the pitch correction amount derived in this way, the corrected sound transition based on the corrected score data can be brought close to the musical sound transition even when the amplitude of the musical sound transition differs greatly from that of the output sound transition.

Furthermore, in the present embodiment, the normalized output sound spectrum and the normalized musical sound spectrum are reduced to a representative value for each specified pitch range along the frequency axis, so extraneous sound other than the intended waveform (frequency noise) is absorbed into the frequencies of the musical sound and the output sound. This reduces the error in the correlation value between the normalized output sound spectrum and the normalized musical sound spectrum and improves the accuracy with which the pitch correction amount is derived. Consequently, the pitch of the output sound transition based on the corrected score data can be matched more closely to the pitch of the musical sound transition.

In particular, because the normalized output sound spectrum and the normalized musical sound spectrum are reduced to a representative value for each specified pitch range rather than computed for every individual value, both the time needed for and the number of correlation calculations can be reduced, which speeds up processing.
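The pitch correction search described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the semitone-wide pitch ranges starting at an assumed 55 Hz, the use of the peak as the representative value, and the inner product as the correlation measure are all assumptions made for illustration.

```python
import math

def bin_spectrum(freqs, powers, f_min=55.0, bins_per_octave=12, n_bins=48):
    """Reduce a power spectrum to one representative value per pitch range.

    Assumption: each pitch range is one semitone wide and the representative
    value is the largest power that falls inside the range.
    """
    binned = [0.0] * n_bins
    for f, p in zip(freqs, powers):
        if f <= 0:
            continue
        idx = int(round(bins_per_octave * math.log2(f / f_min)))
        if 0 <= idx < n_bins:
            binned[idx] = max(binned[idx], p)
    return binned

def normalize(values):
    """Scale intensities so the strongest pitch range has value 1."""
    peak = max(values)
    return [v / peak for v in values] if peak > 0 else list(values)

def correlation_at_shift(music, output, shift):
    """Inner product of the music distribution and the output distribution
    moved up by `shift` pitch ranges along the frequency axis."""
    total = 0.0
    for i, m in enumerate(music):
        j = i - shift
        if 0 <= j < len(output):
            total += m * output[j]
    return total

def pitch_correction(music, output, max_shift=12):
    """Return the shift (in pitch ranges) with the largest correlation."""
    shifts = range(-max_shift, max_shift + 1)
    return max(shifts, key=lambda s: correlation_at_shift(music, output, s))

# Example: the output sound is two semitones flat and 100 times stronger.
music = normalize(bin_spectrum([110.0, 220.0], [4.0, 2.0]))
flat = 2 ** (-2 / 12)
output = normalize(bin_spectrum([110.0 * flat, 220.0 * flat], [400.0, 200.0]))
correction = pitch_correction(music, output)
```

Because both distributions are normalized before comparison, the large amplitude difference between the two inputs in the example does not affect which shift is selected, and the binning keeps the number of correlation terms small.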

In addition, the data correction process of the present embodiment executes the time correction process after the pitch correction process has corrected the pitch of each output sound defined in the score data to match the pitch of the musical sounds constituting the target musical piece. Therefore, the data correction device 1 can prevent the accuracy of deriving the time correction amount from being degraded by a deviation between the pitch of the output sound defined in the score data and the pitch of the musical sounds constituting the target musical piece.

In particular, the time correction process of the present embodiment derives the time correction amount for each section of the target musical piece in which the tempo is constant. Correcting the output timing of the output sounds with the time correction amount derived in this way allows the output timing of each output sound in the corrected score data to be matched more accurately to the performance start timing of each musical sound in the target musical piece.
[Other Embodiments]
Although an embodiment of the present invention has been described above, the present invention is not limited to that embodiment and can be implemented in various forms without departing from the gist of the present invention.

For example, in S310 of the pitch correction process of the above embodiment, the output sound transition is acquired by having the sound source module 17 output the individual output sounds defined in all score tracks along the time axis of the score data and receiving them through the audio input unit 15; however, the method of acquiring the output sound transition is not limited to this. That is, when the information processing apparatus 10 is configured so that the sound source module 17 generates an acoustic signal (electric signal) representing the waveform of the output sound along the time axis and the audio output unit 16 sounds according to the generated acoustic signal, the acoustic signal generated by the sound source module 17 may be acquired as the output sound transition.

In the time correction process of the above embodiment, the time correction amount is derived for each specific block; however, a single time correction amount may instead be derived for the entire musical piece.
Also, in the time correction process of the above embodiment, the output sound unit data that is compared with the musical sound unit data when deriving the time correction amount is generated from the corrected sound transition acquired based on the corrected score data in which the pitch of the output sound has been corrected; however, the signal used to generate the output sound unit data may be, for example, the output sound transition acquired based on the score data before the pitch of the output sound is corrected.
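Whether it runs once per block or once per piece, the time correction search looks for the expansion/contraction rate and the position along the time axis that maximize a correlation value. The following is a minimal sketch under stated assumptions: the two inputs are already-extracted change sequences (for example, onset strengths of the non-harmonic components), the candidate rates are a small hypothetical grid, stretching uses linear interpolation, and the correlation is normalized by the overall signal energies. None of these choices is taken from the embodiment.

```python
import math

def stretch(signal, rate, length):
    """Expand or contract `signal` along the time axis by `rate`,
    using linear interpolation, and return `length` samples."""
    out = []
    for i in range(length):
        pos = i / rate
        k = int(pos)
        if k + 1 < len(signal):
            frac = pos - k
            out.append((1 - frac) * signal[k] + frac * signal[k + 1])
        elif k < len(signal):
            out.append(signal[k])
        else:
            out.append(0.0)
    return out

def similarity(a, b, offset):
    """Normalized correlation between a[i] and b[i - offset]."""
    dot = sum(a[i] * b[i - offset]
              for i in range(len(a)) if 0 <= i - offset < len(b))
    norm = math.sqrt(sum(x * x for x in a) * sum(x * x for x in b))
    return dot / norm if norm > 0 else 0.0

def time_correction(music_change, output_change, rates, max_offset):
    """Return the (rate, offset) pair with the largest correlation value."""
    best = None
    for rate in rates:
        stretched = stretch(output_change, rate, len(music_change))
        for offset in range(-max_offset, max_offset + 1):
            score = similarity(music_change, stretched, offset)
            if best is None or score > best[0]:
                best = (score, rate, offset)
    return best[1], best[2]

# Example: the music is the output change played at half speed, 3 samples late.
output_change = [0, 0, 1, 3, 1, 0, 0, 0, 2, 5, 2, 0, 0, 0, 0, 0]
music_change = [0.0, 0.0, 0.0] + stretch(output_change, 2.0, 29)
rate, offset = time_correction(music_change, output_change, [0.5, 1.0, 2.0], 5)
```

Deriving one correction per constant-tempo block would amount to calling `time_correction` on each block's pair of sequences, while the single-correction variant calls it once on the whole piece.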

Furthermore, in the volume correction process of the above embodiment, the corrected score data is generated after deriving the volume ratio from the corrected sound transition acquired based on the corrected score data; however, in the present invention, the volume ratio may instead be derived using the output sound transition based on the score data.
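The volume ratio can be derived the same way from either kind of transition. The sketch below is illustrative only: it assumes the average amplitude is the mean absolute sample value, that the ratio is the musical sound average divided by the output sound average, and that the volume of each output sound is stored as a MIDI-style velocity in the range 0 to 127. The function names are hypothetical.

```python
def average_amplitude(samples):
    """Mean absolute value of a waveform (assumed definition of average amplitude)."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0

def volume_ratio(music_transition, output_transition):
    """Ratio of the musical sound average amplitude to the output sound average amplitude."""
    out_avg = average_amplitude(output_transition)
    if out_avg == 0:
        raise ValueError("output sound transition is silent")
    return average_amplitude(music_transition) / out_avg

def correct_velocities(velocities, ratio, max_value=127):
    """Multiply each output sound's volume by the ratio, clamped to the valid range."""
    return [max(0, min(max_value, round(v * ratio))) for v in velocities]

# Example: the target music is twice as loud as the synthesized output.
music_transition = [0.5, -0.5, 0.5, -0.5]
output_transition = [0.25, -0.25, 0.25, -0.25]
ratio = volume_ratio(music_transition, output_transition)
corrected = correct_velocities([40, 60, 100], ratio)
```

Passing the uncorrected output sound transition or the corrected sound transition as `output_transition` corresponds to the two alternatives discussed above.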

Although the data correction process of the above embodiment executes the volume correction process in addition to the pitch correction process and the time correction process, the volume correction process need not be included in the data correction process. In other words, the score data parameters corrected by the data correction process may be only the pitch and the output timing of the output sound.

Moreover, although the data correction process of the above embodiment executes both the pitch correction process and the time correction process, the data correction process may execute at least one of the pitch correction process and the time correction process.
[Correspondence between Embodiment and Claims]
Finally, the relationship between the description of the above embodiment and the recitations of the claims is described.

S120 in the data correction process of the above embodiment corresponds to the musical sound transition acquisition means of the present invention, and S310 in the pitch correction process, S510 in the time correction process, and S710 in the volume correction process correspond to the output sound transition acquisition means of the present invention. Further, S320 to S430 in the pitch correction process and S520 to S620 in the time correction process correspond to the correction amount deriving means of the present invention; of these, the former corresponds to the pitch correction amount deriving means and the latter to the time correction amount deriving means. S440 in the pitch correction process and S630 in the time correction process correspond to the score data correcting means of the present invention.

In addition, S320 to S350 in the pitch correction process correspond to the output sound distribution deriving means of the present invention, S360 to S390 to the musical sound distribution deriving means, and S400 to S420 to the pitch correlation deriving means. In the time correction process, S520 and S540 to S560 correspond to the output sound change deriving means of the present invention, S530 to S560 to the musical sound change deriving means, and S570 to S610 to the time correlation deriving means.

Furthermore, in the volume correction process, S720 corresponds to the output sound amplitude deriving means of the present invention, S730 to the musical sound amplitude deriving means, S740 to the ratio deriving means, and S750 to the volume correcting means.

DESCRIPTION OF SYMBOLS: 1 ... data correction device; 10 ... information processing apparatus; 11 ... communication unit; 12 ... acoustic data reading unit; 13 ... input receiving unit; 14 ... display unit; 15 ... audio input unit; 16 ... audio output unit; 17 ... sound source module; 18 ... storage unit; 20 ... control unit

Claims

Musical sound transition acquisition means for acquiring a musical sound transition in which the sound pressure of the musical sounds constituting a target musical piece changes along the time axis;
output sound transition acquisition means for acquiring an output sound transition, in which the sound pressure of the output sound changes along the time axis, based on score data that represents the score of a musical piece simulating the target musical piece and in which at least the pitch and output timing of each output sound output from a sound source module are defined;
correction amount deriving means that includes pitch correction amount deriving means and time correction amount deriving means and causes at least one of them to derive a correction amount, the pitch correction amount deriving means deriving, as one of the correction amounts, a pitch correction amount of the score data so that the pitch of the output sound matches the pitch of the musical sound corresponding to that output sound, based on a result of comparing musical sound information, which represents a feature of the musical sound transition extracted from the musical sound transition acquired by the musical sound transition acquisition means, with output sound information, which represents a feature of the output sound transition extracted from the output sound transition acquired by the output sound transition acquisition means, and the time correction amount deriving means deriving, as one of the correction amounts, a time correction amount of the score data so that the output timing of the output sound matches the performance start timing of the musical sound corresponding to that output sound, based on the result of comparing the musical sound information with the output sound information; and
score data correcting means for correcting the score data by shifting the individual output sounds defined in the score data according to the correction amount derived by the correction amount deriving means;
The pitch correction amount deriving means includes:
musical sound distribution deriving means for deriving, as one item of the musical sound information, a musical sound pitch distribution that represents the frequencies contained in the musical sound transition acquired by the musical sound transition acquisition means and the intensity at each frequency, and that is normalized with respect to the intensity at each frequency;
output sound distribution deriving means for deriving, as one item of the output sound information, an output pitch distribution that represents the frequencies contained in the output sound transition acquired by the output sound transition acquisition means and the intensity at each frequency, and that is normalized with respect to the intensity at each frequency; and
pitch correlation deriving means for deriving a pitch correlation value, which is a correlation value between the output pitch distribution derived by the output sound distribution deriving means and the musical sound pitch distribution derived by the musical sound distribution deriving means, each time the output pitch distribution is shifted along the frequency axis from a predefined specified position of the musical sound pitch distribution.
The data correction apparatus is characterized in that, among the pitch correlation values derived by the pitch correlation deriving means, the shift amount along the frequency axis from the specified position that corresponds to the maximum pitch correlation value is derived as the pitch correction amount.

The data correction apparatus according to claim 1, wherein:
the musical sound distribution deriving means derives the power spectrum of the entire musical sound transition and derives the musical sound pitch distribution by reducing the derived power spectrum to a representative value for each specified pitch range, the specified pitch ranges being frequency ranges defined so that their boundaries adjoin, and normalizing the result; and
the output sound distribution deriving means derives the power spectrum of the entire output sound transition and derives the output pitch distribution by reducing the derived power spectrum to a representative value for each specified pitch range and normalizing the result.

  Musical sound transition acquisition means for acquiring a musical sound transition in which the sound pressure of the musical sounds constituting a target musical piece changes along the time axis;
  output sound transition acquisition means for acquiring an output sound transition, in which the sound pressure of the output sound changes along the time axis, based on score data that represents the score of a musical piece simulating the target musical piece and in which at least the pitch and output timing of each output sound output from a sound source module are defined;
  correction amount deriving means that includes pitch correction amount deriving means and time correction amount deriving means and causes at least one of them to derive a correction amount, the pitch correction amount deriving means deriving, as one of the correction amounts, a pitch correction amount of the score data so that the pitch of the output sound matches the pitch of the musical sound corresponding to that output sound, based on a result of comparing musical sound information, which represents a feature of the musical sound transition extracted from the musical sound transition acquired by the musical sound transition acquisition means, with output sound information, which represents a feature of the output sound transition extracted from the output sound transition acquired by the output sound transition acquisition means, and the time correction amount deriving means deriving, as one of the correction amounts, a time correction amount of the score data so that the output timing of the output sound matches the performance start timing of the musical sound corresponding to that output sound, based on the result of comparing the musical sound information with the output sound information; and
  score data correcting means for correcting the score data by shifting the individual output sounds defined in the score data according to the correction amount derived by the correction amount deriving means;
  With
  The time correction amount derivation means includes:
  A musical tone non-harmonic that is a non-harmonic component of the musical tone transition is extracted from the musical tone transition acquired by the musical tone transition acquisition means, and a musical tone change that represents a change in musical non-harmonic along the time axis is extracted. Musical sound change deriving means derived as one piece of information;
  The output sound non-harmonic, which is a non-harmonic component of the output sound transition, is extracted from the output sound transition acquired by the output sound transition acquisition means, and an output representing the change of the output sound non-harmonic along the time axis Output sound change deriving means for deriving a sound change as one of the output sound information;
  A time correlation value representing a correlation value between the musical sound change derived by the musical sound change deriving means and the output sound change derived by the output sound change deriving means is set in the musical sound change and the output sound change. A time correlation deriving unit for deriving each time the output sound change is made to expand and contract along the time axis by matching the set position, and for sequentially changing the set position along the time axis within a specified range;
  The data correction apparatus is characterized in that, among the time correlation values derived by the time correlation deriving means, the expansion/contraction rate along the time axis of the output sound change and the set position that correspond to the maximum time correlation value are derived as the time correction amount.

The data correction apparatus according to claim 1 or claim 2, wherein:
the output sound transition acquisition means acquires a corrected sound transition, which is the output sound transition based on corrected score data obtained by the score data correcting means shifting the frequency of each output sound defined in the score data according to the pitch correction amount derived by the pitch correction amount deriving means;
the time correction amount deriving means includes:
musical sound change deriving means for extracting, from the musical sound transition acquired by the musical sound transition acquisition means, a musical sound non-harmonic that is a non-harmonic component of the musical sound transition, and deriving, as one item of the musical sound information, a musical sound change representing the change of the musical sound non-harmonic along the time axis;
output sound change deriving means for extracting, from the corrected sound transition acquired by the output sound transition acquisition means, an output sound non-harmonic that is a non-harmonic component of the corrected sound transition, and deriving, as one item of the output sound information, an output sound change representing the change of the output sound non-harmonic along the time axis; and
time correlation deriving means for deriving a time correlation value, which is a correlation value between the musical sound change derived by the musical sound change deriving means and the output sound change derived by the output sound change deriving means, each time the output sound change is expanded or contracted along the time axis with the set positions set in the musical sound change and the output sound change aligned, while sequentially changing the set position along the time axis within a specified range; and
the time correction amount deriving means derives, as the time correction amount, the expansion/contraction rate along the time axis of the output sound change and the set position that correspond to the maximum of the time correlation values derived by the time correlation deriving means.

The data correction apparatus according to claim 3 or claim 4, wherein:
the musical sound change deriving means derives the musical sound change for each target section, which is a section of the target musical piece in which the tempo is constant;
the output sound change deriving means derives the output sound change for each section corresponding to the target section;
the time correlation deriving means derives the time correlation value for each target section; and
the time correction amount deriving means derives the time correction amount for each target section.

The data correction apparatus according to any one of claims 1 to 5, further comprising:
musical sound amplitude deriving means for deriving, from the musical sound transition acquired by the musical sound transition acquisition means, a musical sound average amplitude representing the average amplitude of the musical sound transition;
output sound amplitude deriving means for deriving, from the output sound transition acquired by the output sound transition acquisition means, an output sound average amplitude representing the average amplitude of the output sound transition;
ratio deriving means for deriving a volume ratio, which is the ratio between the musical sound average amplitude derived by the musical sound amplitude deriving means and the output sound average amplitude derived by the output sound amplitude deriving means; and
volume correcting means for correcting the volume of the output sound by multiplying the sound pressure of each output sound defined in the score data by the volume ratio derived by the ratio deriving means.

A program that causes a computer to execute:
a musical sound transition acquisition procedure for acquiring a musical sound transition in which the sound pressure of the musical sounds constituting a target musical piece changes along the time axis;
an output sound transition acquisition procedure for acquiring an output sound transition, in which the sound pressure of the output sound changes along the time axis, based on score data that represents the score of a musical piece simulating the target musical piece and in which at least the pitch and output timing of each output sound output from a sound source module are defined;
a correction amount derivation procedure for executing at least one of a pitch correction amount derivation procedure and a time correction amount derivation procedure, the pitch correction amount derivation procedure deriving, as one of the correction amounts, a pitch correction amount of the score data so that the pitch of the output sound matches the pitch of the musical sound corresponding to that output sound, based on a result of comparing musical sound information, which represents a feature of the musical sound transition extracted from the musical sound transition acquired in the musical sound transition acquisition procedure, with output sound information, which represents a feature of the output sound transition extracted from the output sound transition acquired in the output sound transition acquisition procedure, and the time correction amount derivation procedure deriving, as one of the correction amounts, a time correction amount of the score data so that the output timing of the output sound matches the performance start timing of the musical sound corresponding to that output sound, based on the result of comparing the musical sound information with the output sound information; and
a score data correction procedure for correcting the score data by shifting the individual output sounds defined in the score data according to the correction amount derived in the correction amount derivation procedure,
The pitch correction amount derivation procedure is as follows:
A musical sound distribution derivation that represents a frequency included in the musical sound transition acquired in the musical sound transition acquisition procedure and an intensity at each frequency, and a musical tone pitch distribution normalized with respect to the intensity at the frequency is derived as one of the musical sound information. Procedure and
Represents the frequency included in the output sound transition acquired in the output sound transition acquisition procedure and the intensity at each frequency, and derives the output pitch distribution normalized for the intensity at the frequency as one of the output sound information. Output sound distribution derivation procedure,
A pitch correlation value representing a correlation value between the output pitch distribution derived in the output sound distribution derivation procedure and the musical tone pitch distribution derived in the musical sound distribution derivation procedure is calculated in advance from the musical pitch distribution. A pitch correlation deriving procedure for deriving each time the output pitch distribution is shifted along a frequency axis from a prescribed position,
The program is characterized in that, in the pitch correction amount derivation procedure, among the pitch correlation values derived in the pitch correlation derivation procedure, the shift amount along the frequency axis from the specified position that corresponds to the maximum pitch correlation value is derived as the pitch correction amount.

A program that causes a computer to execute:
a musical sound transition acquisition procedure for acquiring a musical sound transition in which the sound pressure of the musical sounds constituting a target musical piece changes along the time axis;
an output sound transition acquisition procedure for acquiring an output sound transition, in which the sound pressure of the output sound changes along the time axis, based on score data that represents the score of a musical piece simulating the target musical piece and in which at least the pitch and output timing of each output sound output from a sound source module are defined;
a correction amount derivation procedure for executing at least one of a pitch correction amount derivation procedure and a time correction amount derivation procedure, the pitch correction amount derivation procedure deriving, as one of the correction amounts, a pitch correction amount of the score data so that the pitch of the output sound matches the pitch of the musical sound corresponding to that output sound, based on a result of comparing musical sound information, which represents a feature of the musical sound transition extracted from the musical sound transition acquired in the musical sound transition acquisition procedure, with output sound information, which represents a feature of the output sound transition extracted from the output sound transition acquired in the output sound transition acquisition procedure, and the time correction amount derivation procedure deriving, as one of the correction amounts, a time correction amount of the score data so that the output timing of the output sound matches the performance start timing of the musical sound corresponding to that output sound, based on the result of comparing the musical sound information with the output sound information; and
a score data correction procedure for correcting the score data by shifting the individual output sounds defined in the score data according to the correction amount derived in the correction amount derivation procedure,
The time correction amount derivation procedure includes:
The musical tone non-harmonic, which is a non-harmonic component of the musical tone transition, is extracted from the musical tone transition acquired in the musical tone transition acquisition procedure, and the musical tone change representing the musical non-harmonic change along the time axis is extracted from the musical tone transition. Musical sound change derivation procedure derived as one of information,
The output sound non-harmonic, which is a non-harmonic component of the output sound transition, is extracted from the output sound transition acquired in the output sound transition acquisition procedure, and the output represents the change of the output sound non-harmonic along the time axis. An output sound change derivation procedure for deriving a sound change as one of the output sound information;
a time correlation derivation procedure for deriving a time correlation value, which is a correlation value between the musical sound change derived in the musical sound change derivation procedure and the output sound change derived in the output sound change derivation procedure, each time the output sound change is expanded or contracted along the time axis with the set positions set in the musical sound change and the output sound change aligned, while sequentially changing the set position along the time axis within a specified range.
The program is characterized in that, in the time correction amount derivation procedure, among the time correlation values derived in the time correlation derivation procedure, the expansion/contraction rate along the time axis of the output sound change and the set position that correspond to the maximum time correlation value are derived as the time correction amount.