JP2002221978A

JP2002221978A - Vocal data forming device, vocal data forming method and singing tone synthesizer

Info

Publication number: JP2002221978A
Application number: JP2001019141A
Authority: JP
Inventors: Hidenori Kenmochi (劔持 秀紀)
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2001-01-26
Filing date: 2001-01-26
Publication date: 2002-08-09

Abstract

PROBLEM TO BE SOLVED: To realize natural singing by a virtual singer in time with the accompaniment by uttering the consonant phonemes of each syllable so that the syllable matches the sounding timing of its note. SOLUTION: Vocal data, including sounding timing data for each syllable of the lyrics, is stored in advance. When reproduction processing is started and the syllable 'sa' assigned to the note 'do' is to be uttered, utterance of the consonant 's' is started before the sounding timing of the note, so that the sounding timing of the vowel 'a' coincides with the sounding timing of the note 'do'. As a result, the virtual singer sings naturally without lagging behind the accompaniment.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a vocal data generating apparatus, a vocal data generating method, and a singing sound synthesizing apparatus suitable for use in, for example, computer music on a personal computer.

[0002]

[Prior Art] Computer music has developed to the point where it is now used in many fields, such as commercial karaoke, television commercials, and accompaniment for pop songs, and its performance quality approaches that of an orchestra or a live band. Computer music is also called DTM (desktop music). A DTM system is equipped with a MIDI sound source, which the user can use to imitate hit songs or, acting as producer or arranger, to change timbres and melodies and enjoy creating and playing original music. Some DTM music systems can also make a virtual singer (a fictional singer) sing lyrics by using vocal data in which the syllables of the lyrics are assigned to the notes of the melody part.

[0003] This DTM system uses MIDI data, which consists of a data sequence of note-on signals and note-off signals. A note-on signal starts a sound of a given pitch (note number), and a note-off signal stops that sound. Note numbers are defined by the MIDI standard; for example, note number "C3" is "do", note number "D3" is "re", and so on. The DTM system also generates syllables artificially with a singing sound source or the like and utters each syllable in association with a note-on signal. In this way, the DTM system can make a virtual singer sing lyrics based on the vocal data.
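
As a rough illustration of the note-on and note-off signals described above, the following Python sketch builds the corresponding three-byte MIDI channel messages. The status bytes 0x90 (note-on) and 0x80 (note-off) come from the MIDI standard; the mapping of "C3" to note number 60 is an assumption (some devices label that note C4), and the function names are hypothetical.

```python
# Minimal sketch: raw MIDI note-on / note-off messages.
NOTE_ON = 0x90   # status byte for note-on, channel in the low 4 bits
NOTE_OFF = 0x80  # status byte for note-off, channel in the low 4 bits

# Assumed naming convention: C3 = 60 (other conventions call this note C4).
C3, D3, E3 = 60, 62, 64


def note_on(note, velocity=64, channel=0):
    """Return the three bytes that start a note of the given pitch."""
    return bytes([NOTE_ON | channel, note, velocity])


def note_off(note, channel=0):
    """Return the three bytes that stop the note."""
    return bytes([NOTE_OFF | channel, note, 0])


# "do" then "re": each note is started and later stopped.
stream = note_on(C3) + note_off(C3) + note_on(D3) + note_off(D3)
```

In the conventional system described here, the arrival of each such note-on message is also what triggers the utterance of the assigned syllable.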

[0004]

[Problems to Be Solved by the Invention] When a person sings, the singer voices the lyrics attached to notes that each have a given pitch and length. In doing so, the singer does not produce the sound of the lyric exactly at the timing of the note, but unconsciously begins preparing to sing it before that timing. For a syllable that begins with a consonant, the singer unconsciously moves the start of the consonant earlier, so that the vowel, which is what is actually heard as the sung sound, is produced at the timing of the note.

[0005] In the conventional DTM described above, however, the virtual singer utters the syllables of the lyrics using vocal data in which syllables are assigned to notes, so each syllable is uttered with the note as its reference point. Consequently, a syllable consisting only of a vowel is heard slightly after the note, whereas a syllable beginning with a consonant is heard considerably later than the sounding timing of the note, because the syllable is perceived only once the vowel is uttered.

[0006] The delay of the lyrics relative to the accompaniment is explained here using the opening "saita saita" (咲いた咲いた, "bloomed, bloomed") of the song "Tulip", shown in FIGS. 10 and 11. FIG. 10 shows the MIDI data corresponding to the notes and the utterance of the syllables, with the note-on signals denoted KON1, KON2, KON3, ... and the note-off signals denoted KOFF1, KOFF2, KOFF3, .... FIG. 11 shows the utterance of the syllables relative to the note lengths. FIGS. 10 and 11 show only the start of each syllable's utterance; the end of the utterance is omitted.

[0007] As described above, when a person sings "saita" and wants "sa" to sound when the "do" of the first beat occurs, the person shifts the phoneme "s" of the syllable "sa" ahead of the position of "do" so that the vowel "a" of the syllable is uttered at the position of the note "do". The DTM system, however, starts the processing for uttering the syllable "sa" in response to the note-on KON1 that generates the "do" of the first beat. As a result, the sounding timing of the consonant "s" is delayed by a time Δt1 relative to the sounding timing of "do", and the sounding timing of the vowel "a" is delayed by a time Δt2 relative to the sounding timing of "do" (see FIG. 11). For convenience, the time taken to change from the consonant to the vowel in a syllable beginning with a consonant is assumed to be a constant time t0. Next, the DTM system starts the processing for uttering the syllable "i" in response to the note-on KON2 that generates the "re" of the second beat, so the sounding timing of the vowel "i" is delayed by Δt1 relative to the sounding timing of "re" (see FIG. 11). The DTM system then starts the processing for uttering the syllable "ta" in response to the note-on KON3 that generates the "mi" of the third beat, so the sounding timing of the consonant "t" is delayed by Δt1 relative to the sounding timing of "mi", and the sounding timing of the vowel "a" is delayed by Δt2 relative to the sounding timing of "mi" (see FIG. 11).
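
The paragraph above names three quantities, Δt1, Δt2 and t0, without writing out how they relate. One plausible reading, offered here as an interpretation of the description rather than a formula stated in the text, is that the vowel of a consonant-initial syllable starts one consonant duration after the consonant does:

```latex
\begin{align*}
\Delta t_2 &= \Delta t_1 + t_0
  && \text{(consonant-initial syllable, e.g. ``sa'', ``ta'')} \\
\text{vowel delay} &= \Delta t_1
  && \text{(vowel-only syllable, e.g. ``i'')}
\end{align*}
```

Under this reading, the audible part of "sa" and "ta" lags the note by t0 more than the audible part of "i" does, which is the uneven delay the following paragraphs set out to remove.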

[0008] Thus, in the conventional DTM system, when the virtual singer is made to sing lyrics along with the accompaniment, the syllables of the lyrics are uttered later than the notes of the melody part. The delay is especially pronounced for syllables beginning with a consonant, and the lyrics fall out of step with the accompaniment.

[0009] One way to bring the lyrics in line with the accompaniment is to adjust the sounding timing of the syllables. However, if the sounding timing is adjusted to suit syllables beginning with a consonant, a syllable consisting only of a vowel is reproduced earlier than the sounding timing of its note. Adjusting the sounding timing uniformly therefore cannot produce natural singing by the virtual singer.
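
The failure of a uniform adjustment can be checked with a few lines of Python. This is a sketch under stated assumptions: the vowel of a consonant-initial syllable is taken to lag the note by Δt1 + t0 and a vowel-only syllable by Δt1 (a reading of the description, not a formula it states), the numeric values are invented, and `vowel_offset` is a hypothetical helper.

```python
VOWELS = frozenset("aiueo")


def vowel_offset(phonemes, dt1, t0, advance):
    """Vowel onset relative to the note: positive = late, negative = early.

    dt1:     assumed processing delay after the note-on
    t0:      assumed constant consonant-to-vowel time
    advance: uniform amount by which every syllable is started earlier
    """
    delay = dt1 + (t0 if phonemes[0] not in VOWELS else 0)
    return delay - advance


dt1, t0 = 10, 50      # hypothetical values in milliseconds
advance = dt1 + t0    # uniform shift tuned for consonant-initial syllables

for syllable in (["s", "a"], ["i"], ["t", "a"]):
    print("+".join(syllable), vowel_offset(syllable, dt1, t0, advance))
```

With the shift tuned for "sa" and "ta", their vowels land on the note, while the vowel-only syllable "i" comes out early by t0, which is the mismatch the paragraph describes.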

[0010] The present invention has been made in view of the above problems, and its object is to provide a vocal data generating apparatus, a vocal data generating method, and a singing sound synthesizing apparatus for generating vocal data with which a virtual singer can sing naturally in time with the accompaniment.

[0011]

[Means for Solving the Problems] To achieve the above object, the invention according to claim 1 is a vocal data generating apparatus that generates vocal data corresponding to a melody and lyrics, comprising: lyric assigning means for assigning the syllables of the lyrics to the notes corresponding to the melody; sounding timing data generating means for dividing each syllable into phonemes and generating sounding timing data for each syllable so that, among the phonemes constituting the syllable, the sounding timing of the vowel phoneme matches the sounding timing of the corresponding note; and data output means for generating, as vocal data, the phoneme string data of the syllables, the sounding timing data, and pitch data corresponding to the syllables, and outputting the vocal data.

[0012] The invention according to claim 2 is a vocal data generating apparatus that generates vocal data corresponding to a melody and lyrics, comprising: lyric assigning means for assigning the syllables of the lyrics to the notes corresponding to the melody; sounding timing data generating means for dividing each syllable into phonemes and generating sounding timing data for each syllable so that, among the phonemes constituting the syllable, the sounding timing of the vowel phoneme matches the sounding timing of the corresponding note; and data output means for generating, as vocal data, the phoneme string data of the syllables, the sounding timing data, and pitch data corresponding to the syllables, and outputting the vocal data embedded in a System Exclusive message.
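
To make the idea of carrying vocal data inside a System Exclusive message concrete, here is a minimal Python sketch. Only the framing is standard MIDI: a message starts with 0xF0, ends with 0xF7, and every byte in between must be below 0x80. The payload layout (note number, two 7-bit timing bytes, then an ASCII phoneme string) is purely hypothetical, since this section does not specify one, and the manufacturer ID 0x7D is the value the MIDI standard reserves for non-commercial use, not the one an actual product would use.

```python
SYSEX_START = 0xF0
SYSEX_END = 0xF7
NON_COMMERCIAL_ID = 0x7D  # reserved by the MIDI standard for non-commercial use


def to_7bit(value, n_bytes):
    """Split a non-negative integer into n_bytes 7-bit groups, MSB first."""
    if not 0 <= value < (1 << (7 * n_bytes)):
        raise ValueError("value does not fit in the requested number of bytes")
    return [(value >> (7 * i)) & 0x7F for i in reversed(range(n_bytes))]


def pack_vocal_event(note, onset_ticks, phonemes):
    """Pack one syllable into a SysEx message (hypothetical layout)."""
    if not 0 <= note <= 127:
        raise ValueError("note number must be 0..127")
    text = "+".join(phonemes).encode("ascii")  # e.g. b"s+a"; ASCII is < 0x80
    payload = [note] + to_7bit(onset_ticks, 2) + list(text)
    return bytes([SYSEX_START, NON_COMMERCIAL_ID] + payload + [SYSEX_END])


message = pack_vocal_event(60, 480, ["s", "a"])
```

A receiver that does not recognize the ID can skip the whole message, which is one reason a System Exclusive container is a natural fit for data that ordinary MIDI sound sources should ignore.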

[0013] The invention according to claim 3 is the vocal data generating apparatus according to claim 1 or 2, wherein the data output means outputs the vocal data divided chorus by chorus.

[0014] The invention according to claim 4 is the vocal data generating apparatus according to claim 1 or 2, wherein the data output means outputs the vocal data divided phrase by phrase.

[0015] The invention according to claim 5 is the vocal data generating apparatus according to claim 1 or 2, wherein the data output means outputs the vocal data divided into breath sections.

[0016] The invention according to claim 6 is the vocal data generating apparatus according to claim 1 or 2, wherein the data output means outputs the vocal data bar by bar.
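
Dividing the vocal data into units before output, as in claims 3 to 6, can be sketched for the bar-by-bar case as follows. The event representation, the tick resolution, and the function name `split_by_bar` are all assumptions made for illustration; the text does not describe how the division is implemented.

```python
from itertools import groupby


def split_by_bar(events, ticks_per_bar):
    """Group (note_ticks, syllable) events into one list per bar.

    note_ticks is the nominal position of the note, not the possibly
    earlier start of the consonant, so a syllable stays with its own bar.
    """
    ordered = sorted(events, key=lambda e: e[0])
    return [
        list(group)
        for _, group in groupby(ordered, key=lambda e: e[0] // ticks_per_bar)
    ]


# Hypothetical timing: 480 ticks per beat, 4 beats per bar.
events = [(0, "sa"), (480, "i"), (960, "ta"), (1920, "sa")]
bars = split_by_bar(events, ticks_per_bar=1920)
```

Division by chorus, phrase, or breath section would follow the same pattern with a different grouping key.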

[0017] The invention according to claim 7 is a vocal data generating method for generating vocal data corresponding to a melody and lyrics, comprising: a lyric assigning step of assigning the syllables of the lyrics to the notes corresponding to the melody; a sounding timing data generating step of dividing each syllable into phonemes and generating sounding timing data for each syllable so that, among the phonemes constituting each voice, the sounding timing of the vowel phoneme matches the sounding timing of the corresponding note; and a data output step of generating, as vocal data, the phoneme string data of the syllables, the sounding timing data, and pitch data corresponding to the syllables, and outputting the vocal data.

[0018] The invention according to claim 8 is a vocal data generating method for generating vocal data corresponding to a melody and lyrics, comprising: a lyric assigning step of assigning the syllables of the lyrics to the notes corresponding to the melody; a sounding timing data generating step of dividing each syllable into phonemes and generating sounding timing data for each syllable so that, among the phonemes constituting each voice, the sounding timing of the vowel phoneme matches the sounding timing of the corresponding note; and a data output step of generating, as vocal data, the phoneme string data of the syllables, the sounding timing data, and pitch data corresponding to the syllables, and outputting the vocal data embedded in a System Exclusive message.

[0019] The invention according to claim 9 is a singing sound synthesizing apparatus that makes a virtual singer sing lyrics along with an accompaniment, comprising: data output means for outputting accompaniment data and vocal data, the vocal data including sounding timing data in which lyrics are assigned to the notes of melody data and the sounding timing of the vowel of each syllable is matched to the sounding timing of the corresponding note, phoneme string data of the syllables, and pitch data corresponding to the syllables; data control means for transmitting these data at the time of reproduction; an accompaniment sound source that receives the accompaniment data transmitted by the data control means and generates accompaniment sound; and a singing sound source that receives the vocal data transmitted by the data control means and, based on the vocal data, sounds the syllables at the pitches corresponding to the notes.

[0020] The invention according to claim 10 is a singing sound synthesizing apparatus that makes a virtual singer sing lyrics along with an accompaniment, comprising: data output means for outputting accompaniment data together with vocal data embedded in a System Exclusive message, the vocal data including sounding timing data in which lyrics are assigned to the notes of melody data and the sounding timing of the vowel of each syllable is matched to the sounding timing of the corresponding note, phoneme string data of the syllables, and pitch data corresponding to the syllables; data control means for transmitting these data at the time of reproduction; an accompaniment sound source that receives the accompaniment data transmitted by the data control means and generates accompaniment sound; and a singing sound source that receives the vocal data in the System Exclusive message transmitted by the data control means and, based on the vocal data, sounds the syllables at the pitches corresponding to the notes.

[0021]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0022] A. Configuration of the Singing Sound Synthesizing Apparatus
FIG. 1 shows an example of the system configuration of a singing sound synthesizing apparatus according to the present invention (hereinafter referred to as DTM). This DTM system broadly consists of a personal computer 1 (hereinafter, PC 1), a tone generator 31, a speaker 50, and a MIDI keyboard 60. FIG. 2 shows the configuration of the PC 1 and the tone generator 31 and how they are connected. The PC 1 may instead be a sequencer, and the tone generator 31 includes a voice sounding device.

[0023] A-1. Configuration of the PC 1
The PC 1 comprises a PC main body 10 equipped with a CPU 13, a RAM 11, a ROM 12, and a hard disk 14 (see FIG. 2); a monitor 27 that receives image signals from the PC main body 10 and displays various information; and a keyboard 28 and a mouse 29 for entering lyrics, commands, and other information into the PC main body 10.

[0024] As shown in FIG. 2, a detection circuit 16, a detection circuit 17, a display circuit 18, an interface 19, the hard disk 14, the RAM 11, the ROM 12, the CPU 13, and so on are connected to a bus 15 of the PC main body 10.

[0025] The detection circuit 16 detects key input (numeric keys, character keys, and so on) on the keyboard 28 and generates a key signal. The detection circuit 17 detects movement and switch operations of the mouse 29 and generates a mouse signal. The operator uses the keyboard 28 or the mouse 29 to create accompaniment data and edit vocal data.

[0026] The display circuit 18 is connected to the monitor 27. The monitor 27 can display an editing screen and the like, and the operator creates accompaniment data and edits vocal data while referring to the editing screen on the monitor 27.

[0027] The ROM 12 stores various parameters, such as accompaniment data and vocal data, and a control program. The RAM 11 stores flags, buffers, and the like, and can also hold a control program supplied from the hard disk 14. The CPU 13 performs computation and control for editing vocal data and so on in accordance with the control program stored in the RAM 11 or the ROM 12. The control program stored in the ROM 12 or elsewhere carries out the accompaniment data generation processing, instrument sound generation processing, vocal data generation processing, song reproduction processing, display processing, and so on described later. A timer 20 is connected to the CPU 13, and the CPU 13 performs interrupt processing at predetermined time intervals according to the time information from the timer 20.

[0028] The interface 19 is a MIDI interface and is connected to the interface 38 of the tone generator 31 by a MIDI cable. Through the interface 19, the PC 1 transmits the accompaniment data, vocal data, and so on to the tone generator 31.

[0029] A-2. Configuration of the Tone Generator 31
The tone generator 31 has a MIDI sound source 32 and a singing sound source 33. The MIDI sound source 32, an effect circuit 37, the singing sound source 33, a detection circuit 35, a display circuit 36, an interface 38, a RAM 39, a ROM 40, a CPU 41, and so on are connected to a bus 34.

[0030] The detection circuit 35 detects operation of an operator 42 or the MIDI keyboard 60 and generates an operator signal. The operator 42 is a performance control such as a switch. The display circuit 36 displays the operating state and the like of the tone generator 31 on a display panel 43.

[0031] The ROM 40 stores formant data for synthesizing voice, various other data, and a control program. The RAM 39 stores the accompaniment data, vocal data, and so on transmitted from the PC 1. The control program may be stored in the RAM 39 instead of the ROM 40. The CPU 41 reproduces a song by performing computation and control in accordance with the control program stored in the ROM 40.

[0032] The CPU 41 obtains time information from a connected timer 44 and reproduces the accompaniment data or the vocal data according to this time information. The CPU 41 generates tone parameters and effect parameters based on note-on signals and the like and supplies them to the MIDI sound source 32 and the effect circuit 37, respectively. The MIDI sound source 32 generates a tone signal according to the supplied tone parameters. The effect circuit 37 applies effects such as delay or reverb to the tone signal generated by the MIDI sound source 32 according to the supplied effect parameters and supplies the result to a DA conversion circuit 45. The DA conversion circuit 45 converts the supplied digital tone signal to analog form, and the accompaniment is output from the speaker 50. The MIDI sound source 32 may use any method, such as waveform memory, FM, physical modeling, harmonic synthesis, or a VCO + VCF + VCA analog synthesizer method.

[0033] The interface 38 is a MIDI interface. For example, when the interface 38 of the tone generator 31 and the interface 19 of the PC 1 are connected by a MIDI cable, the tone generator 31 and the PC 1 communicate via MIDI.

[0034] The CPU 41 also receives vocal data from the PC 1 through the interface 38 and stores it in the RAM 39. The vocal data includes note data, phoneme string data, and sounding timing data, and is described in detail later.

[0035] Further, the CPU 41 reads the vocal data stored in the RAM 39 and, based on the formant data stored in the ROM 40, supplies this formant data to the singing sound source 33. The formant data consists of, for example, formant center frequency data (the center frequency of each peak forming a formant), formant bandwidth data (the bandwidth of each peak forming a formant), formant level data (the peak level of each peak forming a formant), and so on, as required to form the formants corresponding to each phoneme. The singing sound source 33 generates a voice signal according to the supplied formant data. The voice signal corresponds to a singing voice having a predetermined pitch. The singing sound source 33 may be a formant-synthesis sound source or may use another method. The formant synthesis method is described in, for example, Japanese Unexamined Patent Publication No. H3-200299.
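
The formant data described above can be pictured as a per-phoneme table of peaks, each with a center frequency, a bandwidth, and a level. The Python sketch below shows one possible shape for such a table; the field names, the lookup function, and all numeric values are placeholders invented for illustration and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formant:
    center_hz: float     # center frequency of the formant peak
    bandwidth_hz: float  # bandwidth of the peak
    level_db: float      # peak level


# Placeholder values only; a real table would hold measured data per phoneme.
FORMANT_TABLE = {
    "a": [Formant(800.0, 80.0, 0.0), Formant(1200.0, 90.0, -6.0)],
    "i": [Formant(300.0, 60.0, 0.0), Formant(2300.0, 100.0, -12.0)],
}


def formants_for(phoneme):
    """Return the formant parameters to hand to the singing sound source."""
    return FORMANT_TABLE[phoneme]
```

In this picture, the role given to the CPU 41 is the lookup: it reads a phoneme from the vocal data and passes the matching parameter set to the singing sound source 33.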

[0036] The effect circuit 37 applies an effect such as delay to the voice signal generated by the singing sound source 33 according to the supplied effect parameters and supplies the result to the DA conversion circuit 45. The DA conversion circuit 45 converts the supplied digital voice signal to analog form, and the singing voice is output from the speaker 50.

[0037] The singing sound source 33 and the MIDI sound source 32 are not limited to dedicated hardware; they may be implemented with a DSP and microprograms, or with a CPU and software. Furthermore, multiple sounding channels may be formed by using a single singing sound source or sound source circuit in a time-division manner, or by using multiple singing sound sources or sound source circuits, one per sounding channel.

[0038] B. Control Processing
Next, the control processing of the DTM is described. The control program causes the CPU 13 or the CPU 41 to perform accompaniment data generation processing, instrument sound generation processing, vocal data generation processing, song reproduction processing, display processing, and so on.

[0039] Each of these processes is outlined below.

Accompaniment data generation processing: The user writes notes on the staff in a note input window displayed on the monitor 27, using the keyboard 28 (or the mouse 29) or the MIDI keyboard 60, and the notes are stored on the hard disk 14 as accompaniment data in MIDI format. The accompaniment data is stored separately for each part (for example, each instrument).

Instrument sound generation processing: For each part written by the user in the song creation processing, an instrument (for example, drums, guitar, or Electone) is selected and arrangement, effects, and so on are set; these settings are stored on the hard disk 14 as setting data.

Vocal data generation processing: Vocal data corresponding to the melody and lyrics is generated and stored on the hard disk 14.

Song reproduction processing: The MIDI sound source 32 and the singing sound source 33 of the tone generator 31 are used to output musical sound based on the vocal data and the accompaniment data from the speaker 50.

Display processing: Various screens are displayed on the monitor 27.

[0040] Of these, the accompaniment data generation processing, the instrument sound generation processing, and the display processing are techniques conventionally used in DTM, and their detailed description is omitted. The vocal data generation processing and the song reproduction processing are described in detail below.

[0041] B-1. Vocal Data Generation Processing
First, the vocal data generation processing mentioned above is described with reference to the flowchart in FIG. 3. The CPU 13 first performs lyric assignment processing based on the control program that generates vocal data (step S1).

[0042] Here, lyric assignment processing means assigning the syllables of the lyrics to the notes of the melody data; for example, the number of syllable data items is matched to the number of note data items and the syllables are allotted in order from the beginning.
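
The simplest form of the lyric assignment described above, pairing syllables with notes in order once their counts match, might look like the following Python sketch. The function name and the choice to raise an error on a count mismatch are assumptions; the note name "E3" for "mi" is inferred from the "C3" = "do", "D3" = "re" examples given earlier.

```python
def assign_lyrics(syllables, notes):
    """Pair each note with one syllable, in order from the beginning."""
    if len(syllables) != len(notes):
        # The text only says the counts are matched; how a mismatch is
        # resolved is not described, so this sketch simply rejects it.
        raise ValueError("number of syllables must equal number of notes")
    return list(zip(notes, syllables))


# "sa-i-ta" on do-re-mi, the opening of "Tulip".
assigned = assign_lyrics(["sa", "i", "ta"], ["C3", "D3", "E3"])
```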

[0043] The vocal data table generated by this lyrics assignment process (see FIG. 4) is stored in the hard disk 14. FIG. 4 shows, as an example, the table for the portion "saita saita" ("bloomed, bloomed") of the lyrics of the song "Tulip". This data table includes note data, sounding timing data, lyrics, and phoneme string data. The note data is a note number, which indicates a pitch. The phoneme string data includes data on each phoneme and data expressing breath (that is, the breath a person takes when producing a voice). The phoneme string data corresponds to each syllable of the lyrics (a kana character in the present embodiment).

[0044] For example, the syllable forming the lyric "sa" (さ) in the first row is replaced with the phoneme string data "s+a" and is sounded at the pitch (note) C3, and the lyric "i" (い) in the second row is replaced with the phoneme string data "i" and is sounded at the pitch D3.
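The replacement of kana syllables by phoneme string data can be sketched as a table lookup. The mapping `KANA_TO_PHONEMES` and the function `phoneme_string` are hypothetical names; the entry for "た" ("t+a") is written by analogy with the "s+a" notation given in the description and is an assumption.

```python
# Hypothetical kana-to-phoneme table containing only the entries used here.
KANA_TO_PHONEMES = {
    "さ": ["s", "a"],
    "い": ["i"],
    "た": ["t", "a"],   # assumed by analogy with "s+a"
}


def phoneme_string(kana):
    """Return the phoneme string for one kana syllable, e.g. "s+a"."""
    return "+".join(KANA_TO_PHONEMES[kana])


# One (lyric, phoneme string) row per syllable of "さいた".
rows = [(kana, phoneme_string(kana)) for kana in "さいた"]
```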

[0045] The sounding timing data indicates the sounding timing of the first phoneme. As shown in FIG. 9, for the syllable assigned to a note, it is set so that the sounding timing of the vowel in the phoneme string data coincides with the sounding timing of the note. For a syllable consisting only of a vowel, the sounding timing of the syllable substantially coincides with the sounding timing of the note; for a syllable that begins with a consonant, the sounding timing of the syllable is earlier than the sounding timing of the note.
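The rule in this paragraph can be sketched as a small function. The names `VOWELS`, `syllable_onset`, and `consonant_lead` are hypothetical, as are the millisecond values; the description only states that a consonant-initial syllable starts a time t0 before the note and that a vowel-only syllable starts at roughly the note timing.

```python
VOWELS = set("aiueo")


def syllable_onset(phonemes, note_onset, consonant_lead):
    """Return the time at which the first phoneme of a syllable should start.

    phonemes        -- list such as ["s", "a"] or ["i"]
    note_onset      -- sounding timing of the note (milliseconds)
    consonant_lead  -- lead time t0 given to a leading consonant (milliseconds)
    """
    if phonemes[0] in VOWELS:
        # Vowel-only syllable: start together with the note.
        return note_onset
    # Consonant first: start early so that the vowel lands on the note.
    return note_onset - consonant_lead


t0 = 80  # assumed consonant lead in milliseconds
onsets = [
    syllable_onset(["s", "a"], 1000, t0),
    syllable_onset(["i"], 1500, t0),
    syllable_onset(["t", "a"], 2000, t0),
]
```

A single lead time is used for every consonant here, which matches the example in which both "s" and "t" start t0 before the note; whether the lead could differ per consonant is not stated.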

[0046] Returning to FIG. 3, the CPU 13 creates a system exclusive message containing the vocal data of FIG. 4 (step S3). In general, a system exclusive message is information used to operate functions unique to a MIDI sound source, and is a transmit/receive message that a manufacturer defines independently for its own MIDI instruments and the like. This message is sent to the sound source at the very beginning of the song data, so that tone colors and effects are set using the functions unique to that MIDI sound source. Using the system exclusive message allows the MIDI sound source to realize more advanced musical tone expression.

[0047] The system exclusive message according to the present embodiment is configured as shown in FIG. 5. At the head is the exclusive status "F0", which indicates a system exclusive message; the next position holds the manufacturer ID; and the third position holds an identification number (device ID) for transmitting and receiving the exclusive message. These are followed by a phoneme string data message, a sounding timing message, and a syllable pitch message, which together represent one syllable. These three items are repeated for each syllable to form the vocal data for one song, and the end of exclusive "F7" is placed at the last position.
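The layout of FIG. 5 can be illustrated with a minimal encoder and decoder. The byte-level encoding of the three per-syllable items is not given in the description, so everything inside the F0 ... F7 frame below is an assumption: a length-prefixed ASCII phoneme string, a 21-bit timing value split into three 7-bit bytes, and one MIDI note number. The manufacturer ID 0x43 is Yamaha's registered MIDI ID, the device ID 0x10 is arbitrary, and the note numbers assume the convention in which C3 is note number 60.

```python
def encode_sysex(rows, manufacturer_id=0x43, device_id=0x10):
    """Pack (phoneme string, timing in ms, note number) rows into one message."""
    body = bytearray()
    for phonemes, timing_ms, pitch in rows:
        text = phonemes.encode("ascii")
        if len(text) > 0x7F or not 0 <= timing_ms < (1 << 21) or not 0 <= pitch <= 0x7F:
            raise ValueError("value out of range for 7-bit data bytes")
        body.append(len(text))            # phoneme string data (length + text)
        body += text
        body += bytes([(timing_ms >> 14) & 0x7F,   # sounding timing, 3 x 7 bits
                       (timing_ms >> 7) & 0x7F,
                       timing_ms & 0x7F])
        body.append(pitch)                # syllable pitch
    return bytes([0xF0, manufacturer_id, device_id]) + bytes(body) + bytes([0xF7])


def decode_sysex(message):
    """Rebuild the list of rows from a message produced by encode_sysex."""
    if message[0] != 0xF0 or message[-1] != 0xF7:
        raise ValueError("not a system exclusive message")
    data = message[3:-1]                  # skip F0, manufacturer ID, device ID
    rows, i = [], 0
    while i < len(data):
        n = data[i]
        phonemes = data[i + 1:i + 1 + n].decode("ascii")
        j = i + 1 + n
        timing_ms = (data[j] << 14) | (data[j + 1] << 7) | data[j + 2]
        rows.append((phonemes, timing_ms, data[j + 3]))
        i = j + 4
    return rows


table = [("s+a", 920, 60), ("i", 1500, 62), ("t+a", 1920, 64)]
message = encode_sysex(table)
```

Every byte between F0 and F7 is kept below 0x80 because MIDI reserves the high bit for status bytes; that constraint is the reason the timing value is split into 7-bit pieces.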

[0048] The CPU 13 then stores the system exclusive message generated in step S3 in the hard disk 14 and ends this process.

[0049] B-2. Music Playback Process
Next, the music playback process is described with reference to the flowchart of FIG. 6. First, in response to a signal from the keyboard 28 or the mouse 29, the CPU 13 of the personal computer 1 transmits the accompaniment data stored in the hard disk 14, together with the vocal data expressed as a system exclusive message, to the tone generator 31 via the interfaces 19 and 37. On receiving the accompaniment data and the system exclusive message, the CPU 41 of the tone generator 31 starts the music playback process (step S11).

[0050] Based on the control program, the CPU 41 stores the received accompaniment data and system exclusive message in the RAM 39 and generates vocal data from the stored system exclusive message (step S12). In doing so, it reads the note data, sounding timing data, lyrics, and phoneme string data for each syllable from the messages written in the respective parts of the system exclusive message, and creates a vocal data table like the one shown in FIG. 4.

[0051] Next, the CPU 41 starts the accompaniment playback process based on the accompaniment data (step S13). In this process, each part is played back with its instrument. Since this process does not differ from the prior art, its details are omitted. The CPU 41 further starts the process of playing back the lyrics with the virtual singer from the vocal data table (step S14), so that the virtual singer sings the lyrics in time with the performance.

[0052] The flow of the lyrics playback process is described with reference to the flowchart of FIG. 7. The CPU 41 starts the timer 44 (step S21), monitors whether the count value of the timer has reached a sounding timing t in the data table, and sounds the syllables in order each time a sounding timing is reached (step S22). The CPU 41 then monitors whether all of the vocal data has been sounded (step S23) and repeats this process until it has.
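Steps S21 to S23 can be sketched as a polling loop over a simulated timer. The function `play_lyrics`, the 10 ms tick, and the table values are hypothetical; a real tone generator would drive a singing sound source instead of appending to a list, and the table is assumed to be sorted by sounding timing.

```python
def play_lyrics(table, tick_ms=10):
    """Simulate the lyrics playback loop and return (time, syllable) events."""
    events = []
    count = 0                       # step S21: timer starts counting from zero
    index = 0
    while index < len(table):       # step S23: repeat until all are sounded
        syllable, timing = table[index]
        if count >= timing:         # step S22: sounding timing reached
            events.append((count, syllable))
            index += 1
        else:
            count += tick_ms        # otherwise let the timer advance
    return events


events = play_lyrics([("sa", 920), ("i", 1500), ("ta", 1920)])
```

Because the loop only compares the timer count against the stored timing, the early start of consonant-initial syllables is decided entirely by the data table; the playback side needs no knowledge of phonemes to keep the vowel on the note.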

[0053] The sounding timing of notes and syllables is described with a specific example in FIGS. 8 and 9. The example shows the opening "saita saita" ("bloomed, bloomed") of the song "Tulip". As described above, the sounding timing data is data in which, for the syllable assigned to a note, the sounding timing of the vowel in the phoneme string data is aligned with the sounding timing of the note. Accordingly, to sound the syllable "sa" corresponding to the note-on KON1 that sounds "do" on the first beat, the CPU 41 starts sounding the consonant "s", a phoneme of the syllable "sa", a time t0 before the sounding timing of the note, that is, t1 after the start of the timer 44. As a result, the sounding timing of the vowel "a", a phoneme of the syllable "sa", substantially coincides with the sounding timing of "do".

[0054] Likewise, to sound the syllable "i" corresponding to the note-on KON2 that sounds "re" on the second beat, the CPU 41 starts sounding the vowel "i" t2 after the start of the timer 44. As a result, the sounding timing of the vowel "i" substantially coincides with the sounding timing of "re".

[0055] Further, to sound the syllable "ta" corresponding to the note-on KON3 that sounds "mi" on the third beat, the CPU 41 starts sounding the consonant "t", a phoneme of the syllable "ta", a time t0 before the sounding timing of the note, that is, t3 after the start of the timer 44. As a result, the sounding timing of the vowel "a", a phoneme of the syllable "ta", substantially coincides with the sounding timing of "mi".

[0056] In this way, the DTM system according to the present embodiment can play back the syllables of the lyrics without lagging behind the notes of the melody and can have the virtual singer sing the lyrics. This makes it possible to have the lyrics sung in time with the accompaniment.

[0057] C. Effects of the Embodiment
In the present embodiment, sounding timing data set so that the sounding timing of each melody note coincides with the sounding timing of the vowel of the syllable corresponding to the lyrics is transmitted to the tone generator 31 in advance of playback, so the virtual singer can sing in real time. As a result, the DTM system can realize a singing onset timing close to that of natural singing.

[0058] D. Modifications
(1) In the above embodiment, the case where the vocal data for one song is included in the system exclusive message was described as an example. The present invention is not limited to this; the unit may be one chorus, one phrase, the interval between breaths, or one bar, as long as the data is transmitted to the tone generator 31 in some unit in advance of the sounding timing of the notes.
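The modification of sending the vocal data in smaller units can be sketched for the one-bar case. The functions `split_by_bar` and `send_deadline`, the dictionary keys, and all time values are hypothetical; the description only requires that each unit reach the tone generator before the sounding timing of its notes.

```python
def split_by_bar(table, bar_ms):
    """Group rows by the bar that contains each note's sounding timing."""
    chunks = {}
    for row in table:
        chunks.setdefault(row["note_onset"] // bar_ms, []).append(row)
    return [chunks[bar] for bar in sorted(chunks)]


def send_deadline(chunk):
    """Latest time by which a chunk must have been sent: its earliest syllable start."""
    return min(row["syllable_onset"] for row in chunk)


# Hypothetical table: two bars of 2000 ms, consonants leading the note by 80 ms.
table = [
    {"syllable": "sa", "note_onset": 2000, "syllable_onset": 1920},
    {"syllable": "i",  "note_onset": 2500, "syllable_onset": 2500},
    {"syllable": "ta", "note_onset": 3000, "syllable_onset": 2920},
    {"syllable": "sa", "note_onset": 4000, "syllable_onset": 3920},
    {"syllable": "i",  "note_onset": 4500, "syllable_onset": 4500},
    {"syllable": "ta", "note_onset": 5000, "syllable_onset": 4920},
]
chunks = split_by_bar(table, bar_ms=2000)
deadlines = [send_deadline(chunk) for chunk in chunks]
```

In this sketch the deadline of the second chunk falls before the bar line at 4000 ms, because its first syllable begins with a consonant; a sender that waited for the bar line would already be late.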

[0059] (2) The performance data may be data distributed over the Internet or the like. In this case, existing lyrics or original lyrics can be assigned to the melody part of the performance data, and a virtual singer configured by the vocal information can be made to sing them.

[0060] (3) A note number or the like may be stored in the messages of the system exclusive message. The pitch used when playing back the lyrics may be written in the system exclusive message, or the pitch contained in the note-on signal may be used.

[0061] (4) In the above embodiment, the case where the singing device is applied to DTM was described. The present invention is not limited to this and may be used in electronic musical instruments and voice response devices capable of outputting a singing sound, or in amusement equipment such as game machines and karaoke machines.

[0062]

[Effects of the Invention] As described above, according to the present invention, among the phonemes that make up a syllable, the phoneme that pairs with the consonant (that is, the vowel) is uttered in time with the sounding timing of the note, so natural singing by a virtual singer in time with the accompaniment can be realized.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing the configuration of a DTM system according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the electrical configuration of the DTM system of the embodiment.

FIG. 3 is a flowchart showing the vocal data generation process according to the embodiment.

FIG. 4 is a diagram showing the vocal data table used in the embodiment.

FIG. 5 is a diagram showing the structure of the system exclusive message used in the embodiment.

FIG. 6 is a flowchart showing the music playback process according to the embodiment.

FIG. 7 is a flowchart showing the lyrics playback process according to the embodiment.

FIG. 8 is a diagram showing MIDI data corresponding to notes and the utterance of syllables in a specific example.

FIG. 9 is a diagram showing the utterance of syllables relative to note length in a specific example.

FIG. 10 is a diagram showing MIDI data corresponding to notes and the utterance of syllables in the prior art.

FIG. 11 is a diagram showing the utterance of syllables relative to note length in the prior art.

[Explanation of symbols]

1 ... personal computer, 28 ... keyboard, 29 ... mouse, 31 ... tone generator, 32 ... MIDI sound source, 33 ... singing sound source, 50 ... speaker, 60 ... MIDI keyboard

Claims

[Claims]

1. A vocal data generating apparatus for generating vocal data corresponding to a melody and lyrics, comprising: lyrics assigning means for assigning a syllable of the lyrics to a note corresponding to the melody; sounding timing data generating means for dividing the syllable into phonemes and generating sounding timing data corresponding to each syllable so as to match the sounding timing of the vowel phoneme, among the phonemes that make up the syllable, to the sounding timing corresponding to the note; and data output means for generating, as vocal data, the phoneme string data of the syllable, the sounding timing data, and pitch data corresponding to the syllable, and outputting the vocal data.

2. A vocal data generating apparatus for generating vocal data corresponding to a melody and lyrics, comprising: lyrics assigning means for assigning a syllable of the lyrics to a note corresponding to the melody; sounding timing data generating means for dividing the syllable into phonemes and generating sounding timing data corresponding to each syllable so as to match the sounding timing of the vowel phoneme, among the phonemes that make up the syllable, to the sounding timing corresponding to the note; and data output means for generating, as vocal data, the phoneme string data of the syllable, the sounding timing data, and pitch data corresponding to the syllable, and outputting the vocal data included in a system exclusive message.

3. The vocal data generation device according to claim 1, wherein said data output means outputs said vocal data separately for each chorus.

4. The vocal data generation device according to claim 1, wherein said data output means outputs said vocal data divided for each phrase.

5. The vocal data generation device according to claim 1, wherein said data output means outputs said vocal data separately for each breathing section.

6. The vocal data generating device according to claim 1, wherein said data output means outputs said vocal data for each bar.

7. A vocal data generating method for generating vocal data corresponding to a melody and lyrics, comprising: a lyrics assigning step of assigning a syllable of the lyrics to a note corresponding to the melody; a sounding timing data generating step of dividing the syllable into phonemes and generating sounding timing data corresponding to each syllable so as to match the sounding timing of the vowel phoneme, among the phonemes that make up the syllable, to the sounding timing corresponding to the note; and a data output step of generating, as vocal data, the phoneme string data of the syllable, the sounding timing data, and pitch data corresponding to the syllable, and outputting the vocal data.

8. A vocal data generating method for generating vocal data corresponding to a melody and lyrics, comprising: a lyrics assigning step of assigning a syllable of the lyrics to a note corresponding to the melody; a sounding timing data generating step of dividing the syllable into phonemes and generating sounding timing data corresponding to each syllable so as to match the sounding timing of the vowel phoneme, among the phonemes that make up the syllable, to the sounding timing corresponding to the note; and a data output step of generating, as vocal data, the phoneme string data of the syllable, the sounding timing data, and pitch data corresponding to the syllable, and outputting the vocal data included in a system exclusive message.

9. A singing sound synthesizer for having a virtual singer sing lyrics in time with an accompaniment, comprising: data output means for outputting accompaniment data and vocal data, the vocal data being data in which lyrics are assigned to the notes of melody data and including sounding timing data for sounding the vowel of each syllable at the timing of the note, phoneme string data of the syllable, and pitch data corresponding to the syllable; data control means for transmitting these data during reproduction; an accompaniment sound source for generating an accompaniment sound upon receiving the accompaniment data transmitted by the data control means; and a singing sound source for receiving the vocal data transmitted by the data control means and sounding the syllable at a pitch corresponding to the note based on the vocal data.

10. A singing sound synthesizer for having a virtual singer sing lyrics in time with an accompaniment, comprising: data output means for outputting accompaniment data and vocal data, the vocal data being data in which lyrics are assigned to the notes of melody data and including sounding timing data for sounding the vowel of each syllable at the timing of the note, phoneme string data of the syllable, and pitch data corresponding to the syllable, the vocal data being output included in a system exclusive message; data control means for transmitting these data during reproduction; an accompaniment sound source for generating an accompaniment sound upon receiving the accompaniment data transmitted by the data control means; and a singing sound source for receiving the vocal data in the system exclusive message transmitted by the data control means and sounding the syllable at a pitch corresponding to the note based on the vocal data.