JP2015148663A

JP2015148663A - Musical composition processing device

Info

Publication number: JP2015148663A
Application number: JP2014020216A
Authority: JP
Inventors: Makoto Tachibana (橘 誠); Masashi Yoshida (吉田 雅史); Shuzo Baba (馬場 修三)
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2014-02-05
Filing date: 2014-02-05
Publication date: 2015-08-20
Anticipated expiration: 2034-02-05
Also published as: JP6295691B2

Abstract

PROBLEM TO BE SOLVED: To set, for each note of a musical piece, a natural variation amount consistent with actual performance tendencies. SOLUTION: A variation amount setting unit sets the variation amount of the sounding time of each note according to a random number λ generated from a probability distribution D whose dispersion σ depends on the position of the note in the target piece. For example, when the sounding time of a note lies inside a beat range R containing a beat point B of the piece, the variation amount setting unit sets the variation amount according to a random number λ generated from the probability distribution D with the dispersion σ set to a value d1; when the sounding time of the note lies outside the beat range R, it sets the variation amount according to a random number λ generated from the probability distribution D with the dispersion σ set to a value d2 greater than d1.

Description

The present invention relates to a technique for controlling each note of a musical piece.

In actual singing or instrumental performance, the sounding time of each note of a piece deviates somewhat from its exact position on the score. To generate natural-sounding audio that reflects this tendency, Patent Document 1, for example, discloses a technique that shifts the sounding time of each note of a piece forward or backward from its prescribed position on the score.

Patent Document 1: JP 2009-258291 A

In actual singing and performance, however, the degree to which the sounding time of each note varies tends to differ depending on the position of the note within the piece. For example, a singer or performer can easily judge the timing of a note sounded near a beat point of the piece, so the sounding time of such a note tends to vary less than that of a note whose sounding time is far from a beat point. Consequently, in a configuration that sets the variation amount of each note's sounding time without regard to the note's position in the piece, the variation of the sounding times may diverge from actual tendencies. Although the description above focuses on the sounding time of each note for convenience, the same applies to the note-off time of each note and to various kinds of musical information of each note (for example, vibrato characteristics). In view of these circumstances, an object of the present invention is to set, for each note of a musical piece, a natural variation amount consistent with actual tendencies.

To solve the above problem, the music processing device of the present invention comprises variation amount setting means that sets, for each note of a musical piece, a variation amount of the sounding time of the note according to a random number generated from a probability distribution whose dispersion depends on the position of the note in the piece. Because the variation amount of the sounding time is set according to a random number generated from a probability distribution whose dispersion depends on the position of the note, this configuration, compared with one that sets the variation amount without regard to the position of each note in the piece (for example, Patent Document 1), has the advantage of setting natural variation amounts that reflect the relationship, observed in actual singing and performance, between the variation (error) of each note's sounding time and the note's position in the piece.

In a preferred aspect of the present invention, the variation amount setting means sets the variation amount of the sounding time of a note according to a random number generated from a probability distribution of a first dispersion when the sounding time of the note lies inside a beat range containing a beat point of the piece, and according to a random number generated from a probability distribution of a second dispersion greater than the first dispersion when the sounding time of the note lies outside the beat range. In this aspect, the dispersion of the probability distribution changes depending on whether the sounding time of the note falls within the beat range, so it is possible to set natural variation amounts reflecting the tendency for the variation (error) of the sounding time to change with the distance between a beat point of the piece and the sounding time of each note. In addition, because the dispersion takes one of two values, the process of setting the variation amount is simpler than in, for example, a configuration in which the dispersion changes continuously.

In a preferred aspect of the present invention, the variation amount setting means sets the variation amount of the sounding time of a note according to a random number generated from a probability distribution in which the random number corresponding to a variable correction value has the highest probability of occurrence. In this aspect, the random number with the highest probability of occurrence (for example, the center of the probability distribution) is set variably according to the correction value, so it is possible to set natural variation amounts reflecting a tendency for each note to be sounded early or late (playing ahead of or behind the beat). A specific example of this aspect is described later as, for example, the second embodiment.
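By way of illustration only, the shifted distribution of this aspect might be sketched as follows. The sketch assumes a normal distribution (the aspect itself does not fix the distribution type) and assumes that the dispersion is given as a variance, so its square root is passed to the generator. The function name `draw_biased_delta` and its parameters are hypothetical.

```python
import math
import random


def draw_biased_delta(correction: float, variance: float,
                      rng: random.Random) -> float:
    """Draw a variation amount from a normal distribution whose peak sits at
    `correction` instead of zero.

    Assumption: a positive correction biases notes late (behind the beat) and
    a negative correction biases them early (ahead of the beat).
    """
    # random.gauss expects a standard deviation, hence the square root.
    return rng.gauss(correction, math.sqrt(variance))


rng = random.Random(1)
late_feel = [draw_biased_delta(0.02, 1e-4, rng) for _ in range(5000)]
print(sum(late_feel) / len(late_feel))  # sample mean, close to the correction value
```

With a correction value of zero this reduces to a zero-mean distribution, so the biased variant can be treated as a generalization rather than a separate code path.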

In a preferred aspect of the present invention, the variation amount setting means sets the variation amount of the sounding time of a note so that the adjusted sounding time, after the variation amount is applied, lies before the note-off time of that note and after the sounding time of the immediately preceding note. In this aspect, the range of the variation amount of each note is limited so that the adjusted sounding time of the note lies between the sounding time of the immediately preceding note and the note-off time of the note itself. This has the advantage of preventing the note, or the immediately preceding note, from disappearing (a breakdown of the musical content) because of variation in the sounding time. A specific example of this aspect is described later as, for example, the third embodiment.
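One possible way to impose this limit, sketched here under stated assumptions, is to clamp a candidate variation amount to the permitted interval. Clamping is only one option (redrawing the random number until it falls in range would be another), and the function name `clamp_delta`, the `margin` parameter, and the use of seconds as the time unit are all hypothetical.

```python
import math
from typing import Optional


def clamp_delta(delta: float, onset: float, offset: float,
                prev_onset: Optional[float] = None,
                margin: float = 1e-3) -> float:
    """Limit `delta` so that onset + delta stays after the preceding note's
    sounding time and before this note's own note-off time.

    `margin` (an assumed safety gap) keeps the adjusted sounding time strictly
    inside the interval. `prev_onset` is None for the first note of a part.
    """
    upper = (offset - onset) - margin
    lower = -math.inf if prev_onset is None else (prev_onset - onset) + margin
    return max(lower, min(upper, delta))


# A note from 1.0 s to 1.5 s whose predecessor starts at 0.5 s:
print(clamp_delta(0.9, 1.0, 1.5, prev_onset=0.5, margin=0.0))   # limited to 0.5
print(clamp_delta(-0.9, 1.0, 1.5, prev_onset=0.5, margin=0.0))  # limited to -0.5
```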

In a preferred aspect of the present invention, when the sounding period of a first note of the piece and the sounding period of the immediately following second note are contiguous on the time axis, the variation amount setting means sets the variation amount of the note-off time of the first note according to the variation amount set for the sounding time of the second note. In this aspect, because the variation amount of the note-off time of the first note is set according to the variation amount of the second note, the first and second notes remain contiguous even after the variation amounts are applied. A specific example of this aspect is described later as, for example, the fourth embodiment.
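A minimal sketch of this aspect follows, assuming the simplest rule that satisfies it: the note-off variation of the first note is set equal to the sounding-time variation of the second note whenever the two notes touch. The equality rule, the tolerance `tol`, and the function name `offset_delta_for` are assumptions made for illustration.

```python
def offset_delta_for(first_offset: float, second_onset: float,
                     second_delta: float, tol: float = 1e-9) -> float:
    """Return the note-off variation for the first of two consecutive notes.

    If the notes are contiguous (the first ends where the second begins,
    within `tol`), reuse the second note's sounding-time variation so that
    the pair stays contiguous after adjustment. Otherwise leave the note-off
    time unchanged (zero variation).
    """
    if abs(second_onset - first_offset) <= tol:
        return second_delta
    return 0.0


print(offset_delta_for(1.0, 1.0, 0.03))  # contiguous notes: 0.03
print(offset_delta_for(1.0, 1.2, 0.03))  # gap between notes: 0.0
```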

In a preferred aspect of the present invention, the variation amount setting means sets the variation amount of the sounding time of each note individually for each of a plurality of different voice parts of the piece. In this aspect, a variation amount of the sounding time of each note is set for each of the voice parts, which makes it possible to spread the sounding times of the notes of the respective voice parts along the time axis. The note sequences (melodies) corresponding to the voice parts may be the same or different.

Each of the above aspects can also be applied to setting variation amounts of various kinds of musical information set for notes. A music processing device according to a preferred aspect of the present invention comprises variation amount setting means that sets, for each note of a musical piece, a variation amount of musical information of the note according to a random number generated from a probability distribution whose dispersion depends on the position of the note in the piece.

The music processing device according to each of the above aspects may be realized by hardware (electronic circuitry) such as a DSP (Digital Signal Processor), or by cooperation between a general-purpose arithmetic processing device such as a CPU (Central Processing Unit) and a program. The program of the present invention may be provided in a form stored on a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a good example, but any known type of recording medium, such as a semiconductor recording medium or a magnetic recording medium, may be used. The program of the present invention may also be provided by distribution over a communication network and installed on a computer. The present invention is also specified as a method of operating the music processing device according to each of the above aspects (a music processing method).

A configuration diagram of the speech synthesizer in the first embodiment.
A schematic diagram of the music data.
A schematic diagram of the editing screen.
A flowchart of the variation amount setting process.
An explanatory diagram of the variation amount setting process.
A flowchart of the variation amount setting process in the second embodiment.
An explanatory diagram of the variation amount setting process in the second embodiment.
An explanatory diagram of the variation amount setting process in the third embodiment.
An explanatory diagram of the variation amount setting process in the fourth embodiment.
An explanatory diagram of the variation amount setting process in the fourth embodiment.
A flowchart of the process of setting the variation amount of the note-off time of each note in the fourth embodiment.

<First Embodiment>
FIG. 1 is a configuration diagram of a speech synthesizer 100 according to a first embodiment of the present invention. The speech synthesizer 100 is a signal processing device that generates an acoustic signal V of a singing voice for an arbitrary musical piece (hereinafter the "target piece"). It is realized by a computer system (for example, an information processing device such as a portable information terminal or a personal computer) comprising an arithmetic processing device 10, a storage device 12, a display device 14, an input device 16, and a sound emitting device 18.

The display device 14 (for example, a liquid crystal display panel) displays images as instructed by the arithmetic processing device 10. The input device 16 is an operating device that the user operates to give various instructions to the speech synthesizer 100, and includes, for example, a plurality of controls operated by the user. A touch panel formed integrally with the display device 14 may also be used as the input device 16. The sound emitting device 18 (for example, a loudspeaker or headphones) reproduces sound corresponding to the acoustic signal V. The D/A converter that converts the acoustic signal V from digital to analog is omitted from the drawing for convenience.

The storage device 12 stores the program executed by the arithmetic processing device 10 and various data used by the arithmetic processing device 10. Any known recording medium, such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be used as the storage device 12. The storage device 12 of the first embodiment stores a plurality of speech unit groups L and music data M of the target piece.

Each speech unit group L is a set of speech units (a voice library) collected in advance from the voice of a specific speaker. The acoustic characteristics of the speech units differ from one speech unit group L to another; specifically, each speech unit group L corresponds to a different speaker. Each speech unit is either a phoneme (for example, a vowel or a consonant), which is the smallest unit that distinguishes linguistic meaning, or a phoneme chain (for example, a diphone or a triphone) in which several phonemes are concatenated.

The music data M is data specifying a time series of the notes constituting the target piece and, as illustrated in FIG. 2, includes a plurality of pieces of music data MP corresponding to different voice parts of the target piece. The music data MP of any one voice part specifies the time series of the notes of that voice part in the target piece. The first embodiment assumes the case (unison) of generating an acoustic signal V of a choral voice in which a plurality of singers sing a common melody of the target piece. That is, the note sequences (melodies) specified by the plurality of pieces of music data MP in the music data M are identical, and a separate speech unit group L is used for each voice part as material for speech synthesis. The note sequence specified by the music data MP may, however, differ from one voice part to another. It is also possible to give the synthesized voice a natural thickness (a doubling effect) by using a common speech unit group L for the plurality of voice parts corresponding to the respective music data MP.

As illustrated in FIG. 2, the music data MP corresponding to any one voice part specifies note information QN and control information QC for each note of that voice part. The note information QN specifies a note and, in the first embodiment, includes a pitch X1, a sounding period X2, and a voice code X3. The pitch X1 is, for example, a note number conforming to the MIDI (Musical Instrument Digital Interface) standard. The sounding period X2 is the period during which the note is sustained, and is specified by, for example, the sounding time of the note (the time at which the note starts sounding) and the note-off time (the time at which the note stops sounding). The sounding period X2 may instead be specified by the sounding time and the duration (note value) of the note. The voice code X3 specifies the content uttered by the synthesized voice (that is, the lyrics of the target piece).
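For orientation, the per-note record described above might be modelled as in the following sketch. The class and field names, the use of seconds as the time unit, and the helper `note_from_duration` are assumptions introduced for illustration; only the items X1, X2, X3 and the variation amount held in QC come from the description.

```python
from dataclasses import dataclass


@dataclass
class NoteInfo:
    """Note information QN of one note."""
    pitch: int     # X1: MIDI note number
    onset: float   # X2: sounding time (assumed to be in seconds)
    offset: float  # X2: note-off time (assumed to be in seconds)
    lyric: str     # X3: text sung on this note

    @property
    def duration(self) -> float:
        return self.offset - self.onset


@dataclass
class Note:
    info: NoteInfo      # note information QN
    delta: float = 0.0  # control information QC: sounding-time variation


def note_from_duration(pitch: int, onset: float, duration: float,
                       lyric: str) -> Note:
    """Build a note when X2 is given as sounding time plus duration."""
    return Note(NoteInfo(pitch, onset, onset + duration, lyric))


n = note_from_duration(60, 1.0, 0.5, "la")
print(n.info.offset, n.delta)  # 1.5 0.0
```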

The control information QC, on the other hand, is information for controlling musical characteristics of the synthesized voice (for example, the singing tendencies or expression of the singer). The control information QC of the first embodiment includes a variation amount Δ of the sounding time of the note. The variation amount Δ is a temporal error (deviation) relative to the sounding time of the sounding period X2 specified by the note information QN of each note (that is, the intended sounding time prescribed on the score). The variation amount Δ may be positive or negative. For example, when the variation amount Δ is positive the sounding time shifts later on the time axis, and when the variation amount Δ is negative the sounding time shifts earlier on the time axis.

By executing the program stored in the storage device 12, the arithmetic processing device 10 of FIG. 1 realizes a plurality of functions (a display control unit 22, an information management unit 24, a speech synthesis unit 26, and a variation amount setting unit 28) for editing the music data M and generating the acoustic signal V. A configuration in which the functions of the arithmetic processing device 10 are distributed across several devices, or one in which a dedicated electronic circuit (for example, a DSP) realizes some of the functions of the arithmetic processing device 10, may also be adopted. The display control unit 22 and the information management unit 24 are realized by music editing software (an editor), and the speech synthesis unit 26 is realized by speech synthesis software (a speech synthesis engine). The variation amount setting unit 28 is realized by, for example, plug-in software for the music editing software or the speech synthesis software. How the functions are divided among pieces of software is arbitrary, however; for example, the function of the variation amount setting unit 28 may be built into the music editing software as one of its functions.

The display control unit 22 causes the display device 14 to display various images. The display control unit 22 of the first embodiment causes the display device 14 to display the editing screen 40 of FIG. 3, with which the user checks and edits the content of the target piece specified by the music data M. The editing screen 40 is a piano-roll image in which note images 44, each representing a note specified by the music data MP of the music data M, are arranged in a score area 42 defined by a time axis and a pitch axis that cross each other. The position of a note image 44 along the pitch axis is set according to the pitch X1 of the note, and its position and display length along the time axis are set according to the sounding period X2 of the note. Specifically, as illustrated in FIG. 3, the leading edge (left end) of the note image 44 is placed at the sounding time of the sounding period X2 specified by the note information QN plus the variation amount Δ of the control information QC, and the trailing edge (right end) of the note image 44 is placed according to the note-off time specified by the note information QN. The characters of the voice code X3 of each note are added to the corresponding note image 44. In the first embodiment, the note images 44 of a plurality of voice parts are displayed on the editing screen 40. A configuration in which the display form (color, shape, and so on) of the note images 44 differs from one voice part to another is preferable, so that the user can visually distinguish the note images 44 of each voice part.

The information management unit 24 of FIG. 1 manages the music data M stored in the storage device 12. Specifically, the information management unit 24 updates the music data MP (pitch X1, sounding period X2, voice code X3) of the music data M in response to instructions given by the user through the input device 16. The information management unit 24 of the first embodiment generates the music data MP of a plurality of voice parts corresponding to different speech unit groups L (that is, the music data M) by duplicating the music data MP that represents the time series of notes specified by the user for one voice part.
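The duplication step can be sketched as follows, assuming that each voice part is identified by the name of its speech unit group and that a part is held as a list of note records. Both the representation and the function name `make_unison_parts` are hypothetical; the point of the sketch is only that each part receives an independent copy, so that variation amounts can later differ between parts.

```python
import copy
from typing import Dict, List


def make_unison_parts(notes: List[dict],
                      library_names: List[str]) -> Dict[str, List[dict]]:
    """Duplicate one user-entered note sequence into one part per library.

    A deep copy is used so that editing a note in one part (for example,
    storing its variation amount) does not affect the other parts.
    """
    return {name: copy.deepcopy(notes) for name in library_names}


melody = [{"pitch": 60, "onset": 0.0, "offset": 0.5, "lyric": "la", "delta": 0.0}]
parts = make_unison_parts(melody, ["library_A", "library_B"])
parts["library_A"][0]["delta"] = 0.02
print(parts["library_B"][0]["delta"])  # still 0.0
```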

The speech synthesis unit 26 generates the acoustic signal V by speech synthesis processing using the speech unit groups L and the music data M stored in the storage device 12. The speech synthesis unit 26 of the first embodiment generates, for each voice part of the target piece, a base signal representing the singing voice specified by the corresponding music data MP of the music data M, and generates the acoustic signal V by mixing the base signals of the voice parts. Specifically, the speech synthesis unit 26 sequentially selects, from the speech unit group L of each voice part, the speech units corresponding to the voice code X3 that the music data MP of that voice part specifies for each note, adjusts each speech unit to the pitch X1, adjusts its duration according to the sounding period X2 and the variation amount Δ, and concatenates the adjusted speech units on the time axis to generate the base signal of the voice part. The sounding time of each note of each voice part is adjusted from the sounding time of the sounding period X2 specified by the note information QN of the note according to the variation amount Δ specified by the control information QC of the note. Specifically, the sounding time specified by the note information QN is shifted by the variation amount Δ. The acoustic signal V generated by the speech synthesis unit 26 is supplied to the sound emitting device 18, which reproduces the choral voice of the target piece.
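The timing adjustment applied to each note can be illustrated with the short sketch below. It assumes, in line with the editing screen of the first embodiment, that only the sounding time moves while the note-off time stays as specified, so the duration absorbs the shift. The function name `adjusted_span` is hypothetical.

```python
from typing import Tuple


def adjusted_span(onset: float, offset: float,
                  delta: float) -> Tuple[float, float]:
    """Return (adjusted sounding time, adjusted duration) of one note.

    The sounding time is shifted by `delta` (positive = later, negative =
    earlier); the note-off time is assumed to remain unchanged.
    """
    start = onset + delta
    return start, offset - start


print(adjusted_span(1.0, 2.0, 0.25))  # (1.25, 0.75): starts late, shorter
print(adjusted_span(1.0, 2.0, -0.5))  # (0.5, 1.5): starts early, longer
```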

The variation amount setting unit 28 of FIG. 1 sets the variation amount Δ of the sounding time (the control information QC) for each note of each voice part of the target piece. Specifically, the variation amount Δ is set for each note according to a random number λ. The variation amount setting unit 28 of the first embodiment sets the variation amount Δ according to a random number λ generated from a probability distribution D whose dispersion (variance) is set variably according to the position of the note in the target piece. In actual singing, a note whose sounding time is near a beat point of the target piece tends to show less variation in its sounding time (a smaller timing error) than a note whose sounding time is far from a beat point. Taking this tendency into account, the first embodiment sets the variation amount Δ of the sounding time of each note so that the spread of sounding times of notes near a beat point of the target piece is narrower than the spread of sounding times of notes far from a beat point (that is, so that the variance of the sounding time is suppressed).

FIG. 4 is a flowchart of the process by which the variation amount setting unit 28 sets the variation amount Δ (the variation amount setting process), and FIG. 5 is an explanatory diagram of a specific example of the variation amount setting process. The variation amount setting process of FIG. 4 is executed, once the music data M of the target piece has been generated in response to the user's instructions on the editing screen 40, when the user gives an instruction through the input device 16 (an instruction to run the plug-in software for setting the variation amount Δ). When the variation amount setting process begins, the variation amount Δ of each note is set to an initial value (for example, zero). FIG. 5 shows a note N1 and a note N2 of the target piece in the same format as the editing screen 40 described above, together with a beat point B of the target piece for convenience.

When the variation amount setting process starts, the variation amount setting unit 28 selects one piece of music data MP (one voice part) from the music data M stored in the storage device 12 (SA1), and selects one of the notes specified by the note information QN of that music data MP (hereinafter the "selected note") (SA2).

The variation amount setting unit 28 determines whether the sounding time of the selected note specified by the note information QN lies inside a beat range R of the target piece (SA3). As illustrated in FIG. 5, the beat range R is a range on the time axis that contains a beat point B of the target piece. Specifically, a range of predetermined width centered on the beat point B is set as the beat range R. As illustrated in FIG. 5, the sounding time of the note N1 of the target piece lies inside the beat range R (SA3: YES), and the sounding time of the note N2 lies outside the beat range R (SA3: NO). As the above description shows, the determination in step SA3 corresponds to determining whether the sounding time of the selected note is near a beat point B of the target piece (that is, determining the position of the selected note in the target piece).
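The determination of step SA3 might be sketched as follows, under the simplifying assumption that beat points are evenly spaced (constant tempo), so that the distance to the nearest beat point can be obtained with a modulo operation. The function name `in_beat_range`, its parameters, and the treatment of the range boundary as inside are all assumptions.

```python
def in_beat_range(onset: float, beat_interval: float, half_width: float,
                  first_beat: float = 0.0) -> bool:
    """Return True if `onset` lies within `half_width` of the nearest beat.

    Beat points are assumed to fall at first_beat + n * beat_interval, and
    the beat range R is the interval of width 2 * half_width centered on
    each beat point.
    """
    phase = (onset - first_beat) % beat_interval
    distance = min(phase, beat_interval - phase)
    return distance <= half_width


# Beats every 0.5 s, range R of +/- 0.05 s around each beat:
print(in_beat_range(1.02, 0.5, 0.05))  # True  (just after a beat)
print(in_beat_range(0.98, 0.5, 0.05))  # True  (just before a beat)
print(in_beat_range(1.25, 0.5, 0.05))  # False (midway between beats)
```

For a piece with tempo changes, the modulo would need to be replaced by a search over an explicit list of beat times.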

As illustrated in FIG. 5, the variation amount setting unit 28 sets the variation amount Δ of the selected note in accordance with a random number λ generated under a probability distribution D (SA4 to SA7). The first embodiment uses a normal distribution with a mean of zero as the probability distribution D. The setting of the variation amount Δ in accordance with the probability distribution D is described in detail below. The probability distribution D is not limited to a normal distribution.

The variation amount setting unit 28 variably sets the variance (degree of dispersion) σ of the probability distribution D according to whether the sounding time of the selected note designated by the note information QN lies inside the beat point range R. Specifically, when the sounding time of the selected note lies inside the beat point range R (SA3: YES), that is, when the selected note is close to a beat point B, the variation amount setting unit 28 sets the variance σ of the probability distribution D to a numerical value d1 (first degree of dispersion) (SA4). When the selected note lies outside the beat point range R (SA3: NO), that is, when the selected note is away from the beat points B, the variation amount setting unit 28 sets the variance σ to a numerical value d2 (second degree of dispersion) (SA5). As FIG. 5 shows, the value d2 exceeds the value d1 (d2 > d1). For example, d2 may be calculated by multiplying a predetermined value d1 by a coefficient k (k > 1), or d1 may be calculated by dividing a predetermined value d2 by the coefficient k.

As illustrated in FIG. 5, the variation amount setting unit 28 generates a random number λ (that is, a normal random number) under the probability distribution D with the variance σ set by the above procedure (SA6). Any known technique may be used to generate the random number λ; for example, a normal random number λ can be generated by applying the Box-Muller method to uniform random numbers, in which every value occurs with equal probability. Because the probability distribution D of the first embodiment is a normal distribution with a mean of zero, the random number λ may be positive or negative.
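
As a rough illustration of step SA6, the Box-Muller method mentioned above can be written as below. The function name `normal_random` and its parameters are hypothetical. The sketch also assumes that the dispersion value passed in is used as the standard deviation of the zero-mean normal distribution; the specification calls σ a variance (degree of dispersion) and does not fix this detail.

```python
import math
import random


def normal_random(sigma, rng=random):
    """Draw one zero-mean normal random number via the Box-Muller method.

    Assumption: `sigma` is used as the standard deviation of the distribution.
    """
    # rng.random() returns a value in [0, 1); 1 - value lies in (0, 1],
    # which keeps log() away from zero.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    # Standard normal sample built from two uniform samples.
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return sigma * z
```

In practice the standard library call `random.gauss(0.0, sigma)` yields an equivalent sample; the explicit form is shown only because the text names the Box-Muller method.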

The variation amount setting unit 28 sets the variation amount Δ of the sounding time in accordance with the random number λ (SA7). Specifically, the random number λ may be adopted as the variation amount Δ as is, or Δ may be calculated by a predetermined operation that takes λ as a variable. In other words, the variation amount setting unit 28 of the first embodiment sets the variation amount Δ of the sounding time of the selected note in accordance with a random number λ generated under a probability distribution D whose variance σ depends on the position of the selected note in the target music. Because d1 is smaller than d2, a note whose sounding time designated by the note information QN lies inside the beat point range R has its adjusted sounding time (after Δ is applied) dispersed over a narrower range than a note whose sounding time lies outside the beat point range R. This reproduces the tendency of actual singing in which the sounding times of notes near a beat point B are less likely to vary than those of notes away from the beat points B.

After setting the variation amount Δ for the selected note, the variation amount setting unit 28 determines whether the variation amount Δ has been set for all notes designated by the music data MP currently being processed (that is, all notes of one voice part) (SA8). If a note remains for which Δ has not been set (SA8: NO), the variation amount setting unit 28 selects the next note as the selected note (SA2) and sets its variation amount Δ (SA3 to SA7). If Δ has been set for all notes (SA8: YES), the variation amount setting unit 28 determines whether all music data MP in the music data M have been processed (SA9); if unprocessed music data MP remain (SA9: NO), it selects one of them (SA1) and executes the above processing (SA2 to SA8). The variation amount setting process ends when all music data MP contained in the music data M have been processed (SA9: YES). As this description shows, the variation amount Δ of each note is set separately for each piece of music data MP (each voice part of the target music). Consequently, even when the sounding time of a note designated by the note information QN is common to several voice parts, the sounding time of that note in the acoustic signal V can differ from one voice part to another. The result is natural synthesized speech in which the sounding times of the voice parts are moderately dispersed.
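
The overall loop of FIG. 4 (SA1 to SA9) might be sketched as follows. This is an illustrative reading only: the data layout (each voice part as a list of onset ticks), the function name `set_variations`, the beat grid parameters, and the use of `random.gauss` in place of an explicit Box-Muller generator are all assumptions, and d1 and d2 are treated as standard deviations.

```python
import random


def set_variations(parts, beat_interval, half_width, d1, d2, seed=None):
    """Return one list of variation amounts per voice part.

    parts: list of voice parts, each a list of note onsets in ticks
           (a hypothetical stand-in for the music data MP).
    """
    rng = random.Random(seed)
    result = []
    for onsets in parts:                      # SA1 / SA9: each voice part
        deltas = []
        for onset in onsets:                  # SA2 / SA8: each note
            offset = onset % beat_interval
            near_beat = min(offset, beat_interval - offset) <= half_width  # SA3
            sigma = d1 if near_beat else d2   # SA4 / SA5 (d2 > d1)
            lam = rng.gauss(0.0, sigma)       # SA6: zero-mean normal number
            deltas.append(lam)                # SA7: adopt lambda as delta
        result.append(deltas)
    return result
```

Because each voice part draws its own random numbers, two parts with identical written onsets receive different variation amounts, which is the per-part dispersion described above.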

As described above, in the first embodiment the variation amount Δ of the sounding time of each note is set in accordance with a random number λ generated under a probability distribution D whose variance σ is variably set according to the position of the note in the target music. Compared with the configuration of Patent Document 1, in which the variation amount of the sounding time is set regardless of the position of each note in the target music, this has the advantage that a natural variation amount Δ can be set that reflects the relationship, observed in actual singing and performance, between the variation (error) in the sounding time of each note and the position of the note within the music.

For example, in the first embodiment the variance σ of the probability distribution D changes according to whether the sounding time designated by the note information QN is contained in the beat point range R. A natural variation amount Δ can therefore be set that reproduces the tendency of actual singing in which the sounding times of notes away from the beat points B vary more readily than those of notes near a beat point B. In addition, because the variance σ takes only two values (d1, d2), the variation amount setting process is simpler than in a configuration that, for example, changes the variance σ continuously according to the distance between the beat point B and the sounding time of the note.

<Second Embodiment>
A second embodiment of the present invention is described below. In each of the embodiments illustrated below, elements whose operation and function are the same as in the first embodiment are denoted by the reference signs used in the description of the first embodiment, and their detailed description is omitted as appropriate.

FIG. 6 is a partial flowchart of the variation amount setting process executed by the variation amount setting unit 28 of the second embodiment, and FIG. 7 is an explanatory diagram of the variation amount setting process in the second embodiment. As illustrated in FIG. 6, the variation amount setting unit 28 of the second embodiment variably sets a correction value A for each piece of music data MP (each voice part) (SA11). Specifically, the correction value A can be set to a positive or negative number for each voice part, for example in accordance with an instruction from the user via the input device 16.

As FIG. 7 shows, the variation amount setting unit 28 controls the center (mean) of the probability distribution D, whose variance σ depends on the position of the selected note, in accordance with the correction value A. Specifically, the variation amount setting unit 28 generates the random number λ under a probability distribution D of variance σ whose mean is the correction value A (that is, a probability distribution D in which the correction value A is the most probable value) (SA6). The setting of the variation amount Δ in accordance with the random number λ (SA7) is the same as in the first embodiment. As this description shows, the correction value A corresponds to an offset of the probability distribution D. In the normal distribution exemplified above, the center of the probability distribution D (the value with the highest probability) coincides with the mean, but in other kinds of probability distribution (for example, a log-normal distribution) the most probable value (the mode) and the mean can differ.

When the correction value A is set to a positive number, the adjusted sounding time of each note (after Δ is applied) is dispersed around a time point later than the initial sounding time designated by the note information QN. This reproduces the singing tendency in which each note sounds later than the intended time (the sounding time on the score), that is, singing behind the beat (so-called "ato-nori"). Conversely, when the correction value A is set to a negative number, the adjusted sounding time of each note is dispersed around a time point earlier than the initial sounding time designated by the note information QN. This reproduces the singing tendency in which each note sounds earlier than the intended time, that is, singing ahead of the beat (so-called "mae-nori").
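
The effect of the correction value A can be illustrated with a short sketch. The function name `draw_variation` is hypothetical, `random.gauss` stands in for whichever normal random generator is used, and `sigma` is assumed to be a standard deviation in the same time unit as the correction value.

```python
import random


def draw_variation(sigma, correction, rng):
    """Draw a variation amount from a normal distribution centered on A.

    A positive `correction` biases onsets later (behind the beat);
    a negative one biases them earlier (ahead of the beat).
    """
    return rng.gauss(correction, sigma)
```

Averaged over many notes, the drawn variation amounts cluster around the correction value, so a voice part given A = +20 ticks tends to sound about 20 ticks late while still varying from note to note.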

The second embodiment achieves the same effects as the first embodiment. In addition, because the position of the center (the mean) of the probability distribution D is variably set in accordance with the correction value A, a variation amount Δ can be set that reflects a tendency for the sounding time of each note to lead or lag (singing ahead of or behind the beat). Furthermore, because the correction value A is set individually for each voice part of the target music, the tendency to lead or lag can be made to differ from one voice part (one singer) to another.

<Third Embodiment>
FIG. 8 is an explanatory diagram of the operation of the variation amount setting unit 28 (the variation amount setting process) in the third embodiment. FIG. 8 illustrates a note N1 contained in one voice part of the target music and a note N2 immediately following the note N1.

As FIG. 8 shows, the variation amount setting unit 28 of the third embodiment limits the variation amount Δ of the note N2 so that the adjusted sounding time of N2 (after Δ is applied) lies inside a predetermined range P (hereinafter the "allowable variation range"). The leading end p1 of the allowable variation range P is set, for example, to the sounding time of the note N1 immediately preceding the note N2 (the adjusted sounding time after the variation amount Δ of N1 is applied). In other words, the variation amount setting unit 28 limits the variation amount Δ of the note N2 so that the adjusted sounding time of N2 lies later on the time axis than the sounding time of the immediately preceding note N1.

The trailing end p2 of the allowable variation range P is set, for example, to the muting time of the note N2 designated by the note information QN. In other words, the variation amount setting unit 28 limits the variation amount Δ of the note N2 so that the adjusted sounding time of N2 lies earlier on the time axis than the muting time of N2.

With this configuration, the variation amount Δ of the sounding time of each note is limited so that the adjusted sounding time lies within the allowable variation range P. The sounding time of each note can therefore be adjusted without causing notes of the target music to disappear (that is, without breaking the content of the music).

In the example of FIG. 8, the end p1 of the allowable variation range P is set to the sounding time of the note N1, but a time point a predetermined length (for example, the length of a sixteenth note) after the sounding time of N1 may instead be set as the end p1. This has the advantage that a predetermined duration is secured for the preceding note N1 even when the sounding time of the note N2 is shifted excessively early. Similarly, in the example of FIG. 8 the end p2 of the allowable variation range P is set to the muting time of the note N2, but a time point a predetermined length (for example, the length of a sixteenth note) before the muting time of N2 may instead be set as the end p2. This has the advantage that a predetermined duration is secured for the note N2 even when its sounding time is shifted excessively late.
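
One way to realize this limitation is to clamp the adjusted sounding time to the allowable variation range P, as sketched below. The specification does not say how the limitation is performed (clamping, redrawing the random number, or something else), so clamping is an assumption here, as are the function name `clamp_variation` and the single `margin` parameter applied to both ends. Note that clamping can place the onset exactly on an end of P; a positive margin keeps it strictly after the preceding onset and before the muting time.

```python
def clamp_variation(onset, delta, prev_adjusted_onset, note_off, margin=0):
    """Limit `delta` so that onset + delta stays inside the range P.

    p1 = adjusted onset of the preceding note plus `margin`
    p2 = muting time (note-off) of this note minus `margin`
    """
    lower = prev_adjusted_onset + margin   # leading end p1
    upper = note_off - margin              # trailing end p2
    if lower > upper:
        raise ValueError("allowable variation range P is empty")
    adjusted = min(max(onset + delta, lower), upper)
    return adjusted - onset                # limited variation amount
```

For a note written at tick 480 that ends at tick 960, with the preceding note adjusted to tick 100, a drawn variation of -600 would be limited to -380, and with a margin of 120 ticks (a sixteenth note at 480 ticks per quarter note) to -260.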

<Fourth Embodiment>
FIG. 9 and FIG. 10 are explanatory diagrams of the operation of the variation amount setting unit 28 (the variation amount setting process) in the fourth embodiment. FIG. 9 and FIG. 10 illustrate consecutive notes N1 and N2 in one voice part of the target music. FIG. 9 assumes that the initial sounding periods X2 designated by the note information QN are contiguous between the note N1 and the note N2 (the muting time of N1 coincides with the sounding time of the immediately following N2), and FIG. 10 assumes that the initial sounding periods X2 of N1 and N2 are separated by a gap.

The variation amount setting unit 28 of the fourth embodiment sets the variation amount Δ of the sounding time of each note of the target music by the same variation amount setting process as in the first embodiment, and additionally sets a variation amount ΔE for the muting time of each note. Specifically, each time it sets the variation amount Δ of the sounding time of the selected note (SA7), the variation amount setting unit 28 of the fourth embodiment executes the process of FIG. 11 to set the variation amount ΔE of the muting time of the note immediately preceding the selected note (hereinafter the "preceding note").

On starting the process of FIG. 11, the variation amount setting unit 28 determines whether the selected note (note N2) and the preceding note (note N1) are contiguous on the time axis (SB1). If N1 and N2 are contiguous (SB1: YES), the variation amount setting unit 28 sets the variation amount ΔE of the muting time of the preceding note N1 in accordance with the variation amount Δ set for the sounding time of N2, as illustrated in FIG. 9 (SB2). Specifically, ΔE is set to the same value as Δ, so that N1 and N2 remain contiguous after the sounding time of N2 is adjusted by Δ and the muting time of N1 is adjusted by ΔE.

If, on the other hand, the notes N1 and N2 are separated by a gap on the time axis (SB1: NO), the variation amount setting unit 28 sets the variation amount ΔE of the muting time of N1 in accordance with a predetermined probability distribution DE, as illustrated in FIG. 10 (SB3). That is, as with the variation amount Δ of the sounding time, the variation amount setting unit 28 sets ΔE in accordance with a random number λ generated under the probability distribution DE. The variance of the probability distribution DE, however, is fixed to a predetermined value. The variation amount ΔE that the variation amount setting unit 28 sets for each note is included, together with the variation amount Δ of that note, in the control information QC of the music data MP. The variance of the probability distribution DE may also be set variably when N1 and N2 are separated from each other.
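
The branch of FIG. 11 (SB1 to SB3) can be sketched as below. The function name `note_off_variation`, the parameter names, and the choice of a zero-mean normal distribution for DE (with `sigma_e` as its fixed standard deviation) are assumptions made for illustration; the specification only states that DE is a predetermined distribution with fixed variance.

```python
import random


def note_off_variation(prev_note_off, next_onset, next_delta, sigma_e, rng):
    """Return the muting-time variation (delta E) of the preceding note N1.

    prev_note_off: written muting time of N1
    next_onset:    written sounding time of the selected note N2
    next_delta:    variation amount already set for the onset of N2
    """
    if prev_note_off == next_onset:       # SB1: YES, notes are contiguous
        return next_delta                 # SB2: move both ends together
    return rng.gauss(0.0, sigma_e)        # SB3: independent random draw
```

When the notes are contiguous, adding the returned value to the muting time of N1 and `next_delta` to the onset of N2 leaves the two still touching, which is the relationship the fourth embodiment aims to preserve.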

In the speech synthesis process that generates the acoustic signal V, the speech synthesis unit 26 adjusts the muting time of the sounding period X2 designated by the note information QN of each note in accordance with the variation amount ΔE designated by the control information QC of that note. For example, the speech synthesis unit 26 shifts the muting time of the note designated by the note information QN by the variation amount ΔE on the time axis.

The fourth embodiment achieves the same effects as the first embodiment. In addition, when the sounding periods X2 designated by the note information QN are contiguous between the notes N1 and N2, the variation amount ΔE of the muting time of N1 is set in accordance with the variation amount Δ of the sounding time of N2. The relationship between N1 and N2 (contiguous or separated) can therefore be maintained before and after the adjustment. When the sounding periods X2 of N1 and N2 are separated from each other, the variation amount ΔE of the muting time of N1 is set in accordance with the probability distribution DE. Natural synthesized speech in which the muting times of the notes are moderately dispersed can therefore be generated.

<Modifications>
Each of the above embodiments can be modified in various ways. Specific modifications are illustrated below. Two or more modifications freely selected from the following examples may be combined as appropriate.

(1) In each of the above embodiments, the variation amount Δ of the sounding time of each note is set in accordance with the random number λ generated under the probability distribution D. However, the kind of variable (music information) whose variation amount is set using a probability distribution D with a variance σ variably set according to the position of the note in the target music is arbitrary. For example, the variation amount ΔE of the muting time of each note may be set in accordance with a random number λ generated under a probability distribution D whose variance σ depends on the position of that note in the target music (for example, on whether the muting time is contained in the beat point range R). Also, as illustrated below, the variation amount of a variable set for each note other than the end points of the sounding period X2 (the sounding time and the muting time) may be set in accordance with a random number λ generated under a probability distribution D whose variance σ depends on the position of the note.

For example, the variation amount of a variable (hereinafter the "decay variable") that controls the characteristics of the decay section of the sound of each note, in which the intensity attenuates over time, may be set, as in the first embodiment, in accordance with a random number λ of a probability distribution D whose variance σ depends on the position of the note in the target music. The decay variable is a variable for controlling the temporal change in intensity (the amount of attenuation per unit time) in the decay section of the sound of each note. Notes that are contiguous with the immediately following note and notes with suitably long note values tend to show little variation in decay characteristics. In view of this tendency, a preferable configuration sets the variation amount of the decay variable in accordance with the random number λ of the probability distribution D for notes that are not contiguous with the immediately following note in the target music and for notes whose note value is below a predetermined value (for example, notes shorter than a sixteenth note). For notes that do not meet these conditions (notes contiguous with the following note, or notes with suitably long note values), the variation amount of the decay variable may be set to zero, or may be set in accordance with a random number λ of a probability distribution D with a small variance σ.

The variation amount of a variable for controlling the musical expression of each note (an expression variable) may also be set, as in the first embodiment, in accordance with a random number λ of a probability distribution D whose variance σ depends on the position of the note in the target music. Examples of expression variables include volume (velocity, dynamics), variables that control pitch bend, and variables that control vibrato (for example, vibrato rate and depth).

As these examples show, the variation amount setting unit 28 in each of the above embodiments is comprehensively expressed as an element (variation amount setting means) that, for each note of the target music, sets the variation amount of music information of that note in accordance with a random number λ generated under a probability distribution D whose variance σ depends on the position of the note in the target music. The music information may include, besides the sounding time and muting time of a note, the decay variable and expression variables described above.

(2) In each of the above embodiments, the variance σ of the probability distribution D is controlled according to whether the sounding time of each note designated by the note information QN is contained in the beat point range R, but the method of setting the variance σ is not limited to this example. Specifically, the variance σ of the probability distribution D may be changed over time. For example, the variance σ may be changed continuously or stepwise with a period of a predetermined number of beats of the target music. The variance σ may also be changed independently of the beat points of the target music. For example, the variance σ may be increased or decreased from the start point to the end point of the target music, or may be made to differ between a specific section of the target music (for example, the chorus section) and the other sections.
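
As one possible reading of the configuration in which σ increases or decreases from the start point to the end point of the target music, the sketch below interpolates linearly between two values. The linear shape, the function name `dispersion_at`, and all parameter names are assumptions; the text above does not prescribe how σ changes.

```python
def dispersion_at(position, song_length, sigma_start, sigma_end):
    """Return a dispersion value that changes linearly over the song.

    position:    time of the note measured from the start of the song
    song_length: total length of the song in the same unit (must be > 0)
    """
    if song_length <= 0:
        raise ValueError("song_length must be positive")
    # Fraction of the song elapsed, limited to the range 0..1.
    ratio = min(max(position / song_length, 0.0), 1.0)
    return sigma_start + (sigma_end - sigma_start) * ratio
```

A stepwise or periodic variant would replace the `ratio` computation, for example by taking the position modulo a cycle of several beats.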

(3) Each of the above embodiments assumes that an acoustic signal V of a singing voice is generated (speech synthesis), but the present invention applies in the same way when an acoustic signal V of the sound of the target music performed on a musical instrument is generated (automatic performance). Also, each of the above embodiments illustrates the case where the variation amount Δ of each note is set separately for each of a plurality of voice parts, but the present invention also applies when the variation amount Δ of each note is set for a single voice part.

(4) Each of the above embodiments illustrates a speech synthesizer 100 that both edits the music data M and generates the acoustic signal V, but the present invention may also be specified as a music processing device that omits the editing of the music data M (the display control unit 22 and the information management unit 24) and the generation of the acoustic signal V (the speech synthesis unit 26). The music processing device is a device (variation amount setting device) that sets the variation amount of music information (for example, the variation amount Δ of the sounding time) for each note of the target music, and it includes the variation amount setting unit 28 illustrated in the above embodiments. The speech synthesizer 100 of each of the above embodiments can be understood as a device in which the functions of editing the music data M and generating the acoustic signal V are added to the music processing device.

(5) In the above embodiments the width of the beat point range R is fixed to a predetermined value, but the width (time length) of the beat point range R can also be made variable. For example, the width may be set according to the genre or performance tempo of the target piece (for example, narrowing the beat point range R as the performance tempo becomes faster), or according to the variance values (d1, d2). The width of the beat point range R can also be set according to an instruction given by the user through the input device 16.
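
As a minimal sketch of the tempo-dependent configuration in (5), the following Python function narrows the beat point range R as the tempo rises. The function name `beat_range_width`, the parameter `fraction_of_beat`, and the rule that the width is a fixed fraction of one beat are assumptions made for illustration; the source states only that a faster tempo may give a narrower range.

```python
def beat_range_width(tempo_bpm: float, fraction_of_beat: float = 0.25) -> float:
    """Return a width in seconds for the beat point range R.

    Assumption: the width is a fixed fraction of one beat, so it shrinks
    as the tempo (beats per minute) increases.
    """
    if tempo_bpm <= 0:
        raise ValueError("tempo_bpm must be positive")
    if not 0.0 <= fraction_of_beat <= 1.0:
        raise ValueError("fraction_of_beat must lie in [0, 1]")
    beat_seconds = 60.0 / tempo_bpm  # duration of one beat
    return fraction_of_beat * beat_seconds
```

With the default fraction, a tempo of 60 BPM gives a width of 0.25 seconds and a tempo of 120 BPM gives 0.125 seconds.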

(6) In the above embodiments the variation amount Δ of the sounding time of each note is set for all of the music data MP (all voice parts), but for a specific voice part among the plurality of voice parts, the sounding time may be kept at the time specified by the music data MP (that is, no variation amount is set; Δ = 0). This makes it possible to reproduce the tendency for a chorus to sound cohesive as a whole when a large chorus includes singers whose pitch and sounding times are accurate.

(7) Variables can also be set individually for each of the plurality of voice parts (music data MP). For example, a configuration in which variables such as the values (d1, d2) of the variance σ of the probability distribution D and the width of the beat point range R are set individually for each voice part is preferable.
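
One way to hold the per-voice variables of (7), together with the option in (6) of leaving a specific voice part unvaried, is sketched below in Python. The class name `VoiceSettings`, its field names, the voice part names, and every numeric value are hypothetical; the source gives no concrete data layout or numbers. The check that d2 exceeds d1 follows the relation stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Per-voice variables from items (6) and (7); all names are hypothetical."""
    d1: float           # variance of distribution D inside beat point range R
    d2: float           # variance of distribution D outside R (exceeds d1)
    range_width: float  # width of R in seconds
    vary: bool = True   # False keeps the written sounding times (delta = 0)

    def __post_init__(self) -> None:
        if self.d1 < 0 or self.range_width < 0:
            raise ValueError("d1 and range_width must be non-negative")
        if self.d2 <= self.d1:
            raise ValueError("d2 must exceed d1")


# Illustrative values only; the source gives no concrete numbers.
VOICE_SETTINGS = {
    "soprano": VoiceSettings(d1=1e-4, d2=4e-4, range_width=0.10),
    "alto": VoiceSettings(d1=2e-4, d2=6e-4, range_width=0.12),
    "lead": VoiceSettings(d1=1e-4, d2=4e-4, range_width=0.10, vary=False),
}
```

Keeping the settings immutable and keyed by voice part lets a variation amount setting step look up the variables for the part it is processing without touching the other parts.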

(8) The music processing device (speech synthesizer 100) can also be realized as a server device that communicates with a terminal device via a communication network such as a mobile communication network or the Internet. Specifically, the music processing device sets the variation amount Δ of the sounding time for each note by executing the processing described in the above embodiments on music data M received from the terminal device via the communication network, and transmits to the terminal device, via the communication network, data specifying the variation amount Δ of each note in time series, or music data M in which the sounding time of the sounding period X2 specified by each piece of note information QN has been adjusted according to the variation amount Δ.

DESCRIPTION OF SYMBOLS 100 ... music processing device, 10 ... arithmetic processing device, 12 ... storage device, 14 ... display device, 16 ... input device, 18 ... sound emitting device, 22 ... display control unit, 24 ... information management unit, 26 ... speech synthesis unit, 28 ... variation amount setting unit.

Claims

A music processing device comprising a variation amount setting unit that, for each note of a piece of music, sets a variation amount of the sounding time of the note according to a random number generated from a probability distribution whose degree of spread depends on the position of the note in the piece.

The music processing device according to claim 1, wherein the variation amount setting means sets the variation amount of the sounding time of one note according to a random number generated from a probability distribution having a first degree of spread when the sounding time of the one note lies inside a beat point range including a beat point of the piece, and according to a random number generated from a probability distribution having a second degree of spread exceeding the first degree of spread when the sounding time of the one note lies outside the beat point range.
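
For illustration only, the beat-point-dependent spread recited above can be sketched in Python as follows. The sketch assumes a zero-mean normal distribution, a beat point range R centered on each beat point, and d1 and d2 interpreted as variances; none of these choices is fixed by the claim, and the function and parameter names are hypothetical.

```python
import math
import random
from typing import Sequence


def variation_amount(onset: float, beat_points: Sequence[float], range_width: float,
                     d1: float, d2: float, rng: random.Random) -> float:
    """Draw a variation amount (seconds) for a note whose written onset is `onset`.

    Assumptions: R is symmetric about each beat point, and the distribution
    is normal with mean 0 and variance d1 (inside R) or d2 (outside R).
    """
    half_width = range_width / 2.0
    inside = any(abs(onset - beat) <= half_width for beat in beat_points)
    variance = d1 if inside else d2
    # random.Random.gauss takes a standard deviation, hence the square root.
    return rng.gauss(0.0, math.sqrt(variance))
```

Passing a seeded `random.Random` instance makes the drawn variation amounts reproducible, which is convenient when comparing settings.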

The music processing device according to claim 1 or 2, wherein the variation amount setting means sets the variation amount of the sounding time of the note according to a random number generated from a probability distribution in which the probability of generating a random number corresponding to a variable correction value is highest.

The music processing device according to any one of claims 1 to 3, wherein the variation amount setting means sets the variation amount of the sounding time of one note such that the sounding time of the one note, after adjustment according to the variation amount, lies before the mute time of the one note and after the sounding time of another note immediately preceding the one note.
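
The ordering constraint recited above could be enforced in several ways. The Python sketch below clamps a drawn variation amount into the allowed interval, which is an assumption: the claim requires only that the adjusted sounding time fall between the two limits, and redrawing the random number would satisfy it equally. The function name, the `margin` parameter, and the choice of which preceding onset (written or already adjusted) the caller passes in are all hypothetical.

```python
def constrain_variation(onset: float, delta: float, prev_onset: float,
                        mute_time: float, margin: float = 1e-3) -> float:
    """Clamp `delta` so that prev_onset < onset + delta < mute_time.

    `margin` (seconds) keeps the adjusted onset strictly inside the limits.
    """
    lower = prev_onset + margin - onset  # smallest admissible variation
    upper = mute_time - margin - onset   # largest admissible variation
    if lower > upper:
        raise ValueError("no admissible variation amount for this note")
    return min(max(delta, lower), upper)
```

For a note written at 1.0 s, preceded by an onset at 0.5 s and muted at 1.5 s, a margin of 0.125 s limits the variation amount to the interval from -0.375 s to 0.375 s.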

The music processing device according to any one of claims 1 to 4, wherein, when the sounding period of a first note and the sounding period of a second note immediately following it in the piece are contiguous on the time axis, the variation amount setting means sets a variation amount of the mute time of the first note according to the variation amount of the sounding time set for the second note.
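
As a hedged sketch of the contiguous-note case recited above, the Python function below gives the mute time of the first note the same variation amount as the sounding time of the second note, so that the two sounding periods remain contiguous after adjustment. Using an identical amount is an assumption (the claim says only "according to"), and the function name and the `tolerance` parameter used to detect contiguity are hypothetical.

```python
def mute_variation(first_mute: float, second_onset: float,
                   second_delta: float, tolerance: float = 1e-9) -> float:
    """Return the variation amount to apply to the first note's mute time.

    If the two sounding periods are contiguous (the first note's mute time
    coincides with the second note's onset within `tolerance`), reuse the
    second note's variation amount; otherwise leave the mute time unchanged.
    """
    contiguous = abs(second_onset - first_mute) <= tolerance
    return second_delta if contiguous else 0.0
```

When the notes are separated by a rest, the function returns 0.0 and the first note's mute time stays as written.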