JP3265995B2

JP3265995B2 - Singing voice synthesis apparatus and method

Info

Publication number: JP3265995B2
Application number: JP21220896A
Authority: JP
Inventors: 康善中嶋
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 1996-07-24
Filing date: 1996-07-24
Publication date: 2002-03-18
Anticipated expiration: 2016-07-24
Also published as: JPH1039896A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention: The present invention relates to a singing voice synthesizing apparatus and method for singing a song in a human voice.

[0002]

2. Description of the Related Art: Various speech synthesis techniques have been proposed. For example, a speech synthesizer using a formant synthesis method, as disclosed in Japanese Patent Application Laid-Open No. 3-200300, is known.

[0003] A musical tone synthesizer is also known in which parameter data is stored in advance over a plurality of steps so that the formant characteristics change in the same manner as those of actual instrument sounds or human voices, and the stored parameter data is read out sequentially to perform formant synthesis, thereby synthesizing natural musical tones or human voices (Japanese Patent Application Laid-Open No. 4-251297).

[0004] When singing voice synthesis is performed using the conventional techniques described above, for example when the English lyric "hit" is sounded for one quarter note, one of two methods is adopted. In the first (hereinafter the "first conventional method"), sounding times T(h), T(i), and T(t) are assigned to "h", "i", and "t" in absolute time, and parameter data is stored in advance so that T(h) + T(i) + T(t) equals the sounding time of the quarter note. In the second (hereinafter the "second conventional method"), T(h) + T(i) + T(t) is set shorter than the sounding time of the quarter note, and either sounding ends when the sounding time of the final "t" ends, or the final "t" is held until the end of the quarter note.

[0005]

Problems to Be Solved by the Invention: The first conventional method, however, can sing only at a predetermined tempo. One conceivable remedy is to define the sounding time of each phoneme in relative time, but if the sounding time of unvoiced sounds (consonants) such as "h" and "t" in particular is changed according to the tempo, the singing becomes unnatural.

[0006] In the second conventional method, on the other hand, the singing sounds unnatural and awkward both when sounding ends at the end of the "t" and when the "t" is held.

[0007] The present invention has been made in view of the above points, and its object is to provide a singing voice synthesizing apparatus and method capable of natural singing even when the tempo of a song is changed.

[0008]

Means for Solving the Problems: To achieve the above object, the singing voice synthesizing apparatus according to claim 1 sequentially synthesizes voices by controlling sounding control means based on singing data that includes a plurality of lyric data indicating lyrics to be sounded and lyric sounding time data that corresponds to the lyric data and indicates, in relative time, the sounding time of the lyrics indicated by the lyric data. Each lyric data consists of phoneme symbol data indicating a phoneme of the lyric and phoneme sounding time data designating the sounding time of that phoneme. When the phoneme is a voiced sound, the phoneme sounding time data consists of either first data, which designates the sounding time of the phoneme in absolute time, or second data, which designates sounding until the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound; when the phoneme is not a voiced sound, the phoneme sounding time data consists only of the first data. When the phoneme sounding time data consists of the first data, the sounding control means causes the phoneme to be sounded for the sounding time that the first data designates in absolute time; when the phoneme sounding time data consists of the second data, the sounding control means causes the voiced phoneme to be sounded until the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound.

[0009] The singing voice synthesizing apparatus according to claim 2 is the singing voice synthesizing apparatus of claim 1, wherein the sounding control means performs control so that a phoneme following a voiced phoneme whose phoneme sounding time data in the lyric data consists of the second data is sounded after the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound.

[0010] The singing voice synthesizing method according to claim 3 sequentially synthesizes voices by controlling sounding control means based on singing data that includes a plurality of lyric data indicating lyrics to be sounded and lyric sounding time data that corresponds to the lyric data and indicates, in relative time, the sounding time of the lyrics indicated by the lyric data. Each lyric data consists of phoneme symbol data indicating a phoneme of the lyric and phoneme sounding time data designating the sounding time of that phoneme. When the phoneme is a voiced sound, the phoneme sounding time data consists of either first data, which designates the sounding time of the phoneme in absolute time, or second data, which designates sounding until the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound; when the phoneme is not a voiced sound, the phoneme sounding time data consists only of the first data. When the phoneme sounding time data consists of the first data, the sounding control means is controlled so that the phoneme is sounded for the sounding time that the first data designates in absolute time; when the phoneme sounding time data consists of the second data, the sounding control means is controlled so that the voiced phoneme is sounded until the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound. The singing voice synthesizing method according to claim 4 is the singing voice synthesizing method of claim 3, wherein the sounding control means is controlled so that a phoneme following a voiced phoneme whose phoneme sounding time data in the lyric data consists of the second data is sounded after the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound.

[0011] According to the singing voice synthesizing apparatus of claim 1 or the singing voice synthesizing method of claim 3, when the phoneme sounding time data consists of the second data, that is, data which, when the phoneme is a voiced sound, designates sounding until the end of the sounding time indicated by the lyric sounding time data corresponding to the voiced sound, the sounding control means is controlled so that the voiced phoneme is sounded until the end of the sounding time indicated by the phoneme sounding time data corresponding to the voiced sound.

[0012]

Embodiments of the Invention: Embodiments of the present invention are described below with reference to the drawings.

[0013] FIG. 1 is a block diagram showing the configuration of a singing electronic apparatus according to an embodiment of the present invention. The apparatus comprises: a CPU 1 that controls the entire apparatus; a ROM 2 that stores the programs executed by the CPU 1, the tables and the like needed to execute them, and formant data for timbre synthesis; a RAM 3 that is used as the working area of the CPU 1 and stores data in the middle of computation and the like; a data memory 4 that stores singing data and accompaniment data for voice synthesis; a display unit 5 that displays various parameters, the operation mode of the apparatus, and the like; a performance operator 6, such as a keyboard, on which a player performs; a setting operator 7 for designating a performance mode and the like; a formant sound source 8 that synthesizes voices or musical tones based on formant data; a DA converter 9 that converts the digital signal output from the formant sound source 8 into an analog signal; a sound system 10 that amplifies the output signal of the DA converter and outputs it from a speaker; and a bus 11 that interconnects components 1 to 8.

[0014] The formant sound source 8 has a plurality of sound source channels 80, and each sound source channel 80 consists of four vowel formant generating sections VTG1 to VTG4 and four consonant formant generating sections UTG1 to UTG4. Providing four formant generating sections each for vowels and consonants in this way and adding their outputs to synthesize a voice is a known technique, as shown for example in the aforementioned Japanese Patent Application Laid-Open No. 3-200300.

[0015] FIG. 2 is a diagram showing the structure of the data stored in the ROM 2, the RAM 3, and the data memory 4.

[0016] The ROM 2 stores the programs executed by the CPU 1 and formant data PHDATA (FIG. 2(a)). The formant data PHDATA consists of data PHDATA[a], PHDATA[e], ..., PHDATA[z] corresponding to the Japanese and English phonemes (vowels (voiced sounds) and consonants), and each formant data PHDATA consists of parameters such as formant center frequency, formant level, and formant bandwidth. These parameters are organized as time-series data, and a time-varying formant is reproduced by reading them out sequentially at predetermined timings.

[0017] The RAM 3 provides a working area that the CPU 1 uses for computation and an area that functions as a song buffer into which performance sequence data is loaded (FIG. 2(b)).

[0018] The data memory 4 stores n song data SONG1, SONG2, ..., SONGn (FIG. 2(c)). As shown in FIG. 2(d), each song data SONG consists of song name data SONGNAME indicating the song title, tempo data TEMPO indicating the tempo of the song, data MISCDATA designating the time signature, timbre, and the like, singing data LYRICSEQDATA consisting of lyric data, pitch data, velocity data, duration data, and the like, and accompaniment data ACCOMPDATA indicating the performance data of the accompaniment.

[0019] Each singing data LYRICSEQDATA consists of m lyric note data LYRICNOTE and end data LYRICEND indicating the end of the singing data. Each lyric note data LYRICNOTE consists of lyric phoneme data LYPHDATA, key-on data KEYON, duration data DURATION, and key-off data KEYOFF. The lyric phoneme data LYPHDATA consists of phoneme symbol data LYPHONE, indicating each phoneme of the lyric (for example, "h", "i", and "t" for the lyric "hit"), and phoneme sounding time data PHONETIME, designating the sounding time of that phoneme, arranged in sounding order. The key-on data KEYON consists of pitch data (for example, C3) and velocity data V (for example, 64), which determine the pitch and the rising envelope. The duration data DURATION (for example, DUR96) indicates the sounding time in relative time and is converted into data corresponding to absolute time according to the tempo data TEMPO and the interrupt clock time. The key-off data KEYOFF indicates the end of sounding.
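The record layout described in this paragraph can be pictured with a short Python sketch. The class and field names (`Phoneme`, `LyricNote`, and so on) are illustrative and do not appear in the patent. The pitch, velocity, and duration values follow the examples given in the text, while the PHONETIME values are placeholders chosen for illustration; the special meaning of a PHONETIME of 0 is explained in paragraph [0020].

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    symbol: str      # LYPHONE: phoneme symbol such as "h"
    phonetime: int   # PHONETIME: sounding time in basic time units


@dataclass
class LyricNote:
    phonemes: List[Phoneme]  # LYPHDATA: phonemes in sounding order
    pitch: str               # KEYON pitch data, e.g. "C3"
    velocity: int            # KEYON velocity data, e.g. 64
    duration: int            # DURATION in relative time, e.g. 96


# One LYRICNOTE for the lyric "hit" (PHONETIME values are assumed).
hit = LyricNote(
    phonemes=[Phoneme("h", 5), Phoneme("i", 0), Phoneme("t", 5)],
    pitch="C3",
    velocity=64,
    duration=96,
)

# A LYRICSEQDATA is then a list of such notes; the end of the list
# plays the role of the end data LYRICEND in this sketch.
lyric_seq = [hit]
```

KEYOFF is not modeled as a separate field here; in this sketch the end of each `LyricNote` record stands in for it.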

[0020] FIG. 2(f) shows examples of lyric note data LYRICNOTE corresponding to the lyrics "hit" and "yuki". In principle, the phoneme sounding time data PHONETIME designates the sounding time in absolute time (in the figure, PHONETIME1 is set to "5", which corresponds to 8 msec × 5 = 40 msec if, for example, the basic time unit is 8 msec). When it is set to "0" (the "i" of "hit" and the "u" of "yuki"; hereinafter "zero designation"), however, it means that the vowel is sounded until the end of the duration, as described in detail later. The phonemes that follow (the "t" of "hit" and the "ki" of "yuki") are controlled so that they are sounded after the end of the duration.
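As a minimal sketch of how a PHONETIME value might be interpreted, the following Python function returns the absolute length in milliseconds, or `None` for zero designation. The function name and the use of `None` as a marker are assumptions made for illustration; the 8 msec basic time unit is the example value given in the text.

```python
from typing import Optional

BASE_UNIT_MS = 8  # example basic time unit from the text


def phoneme_length_ms(phonetime: int,
                      base_unit_ms: int = BASE_UNIT_MS) -> Optional[int]:
    """Interpret a PHONETIME value.

    A non-zero value is an absolute time in basic time units.
    Zero ("zero designation") means the vowel is held until the end
    of the note's duration, so no fixed length can be returned.
    """
    if phonetime == 0:
        return None
    return phonetime * base_unit_ms


print(phoneme_length_ms(5))  # 40, matching 8 msec x 5 = 40 msec
print(phoneme_length_ms(0))  # None: hold until the end of the duration
```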

[0021] FIG. 3 is a flowchart of the main program executed by the CPU 1. Execution of this program starts when the apparatus is powered on.

[0022] First, in step S1, various parameters are set, and then operation events on the performance operator 6 and the setting operator 7 are detected (step S2). In the following step S3, it is determined whether performance processing based on song data SONG is not in progress. If the performance has not started, it is determined whether there is a selection event for song data SONG (step S4). If there is no selection event, the process proceeds directly to step S6; if there is one, the selected song data SONG is first transferred from the data memory 4 to the song buffer in the RAM 3 (step S5), and the process then proceeds to step S6.

[0023] In step S6, it is determined whether there is song data SONG in the song buffer of the RAM 3. If there is none, the process returns to step S2; if there is, it is determined whether there has been a singing performance start operation event (step S7). If there has been no such event, the process returns immediately to step S2. If there has been one, singing performance processing is started, the following flags and the pointer i are initialized (step S8), and the process returns to step S2:

Key-on flag KEYONFLG: "1" indicates that sounding processing based on lyric note data LYRICNOTE is in progress.
Note-on flag NOTEONFLG: "1" indicates that the sounding time designated by the duration data DURATION (hereinafter the "duration time") is in progress.
Formant timer flag FTIMERFLG: "1" indicates that the sounding time designated by the phoneme sounding time data PHONETIME is in progress.
Zero-designation flag PHTIMEZEROFLG: "1" indicates that zero designation has been made.
Remaining-processing flag RESTFLG: "1" indicates that, with zero designation made, processing after the end of the duration time is in progress.

[0024] Once singing performance processing has started, the process proceeds from step S3 to step S9, where performance processing based on the song data SONG loaded into the song buffer of the RAM 3 (SONG performance processing, FIG. 4) is executed. It is then determined whether there has been a stop operation event for the singing performance processing (step S10). If there has been no such event, the process returns directly to step S2; if there has been one, singing performance stop processing is executed and the process then returns to step S2.

[0025] FIG. 4 is a flowchart of the SONG performance processing in step S9 of FIG. 3. This processing consists of performance processing based on the singing data LYRICSEQDATA (LYRICSEQDATA performance processing, step S21) and performance processing based on the accompaniment data ACCOMPDATA (ACCOMPDATA performance processing, step S22).

[0026] FIGS. 5, 6, and 7 are flowcharts of the LYRICSEQDATA performance processing in step S21 of FIG. 4.

[0027] In step S31, it is determined whether the key-on flag KEYONFLG is "0". Since KEYONFLG = 0 initially, the i-th lyric note data LYRICNOTEi is read (step S32), and it is determined whether the read data is not the end data LYRICEND (step S33). If it is the end data LYRICEND, singing performance end processing is executed (step S36) and this processing ends. If it is not, the duration data DURATION is converted into time data according to the tempo data TEMPO and the interrupt clock time (specifically, the execution interval of the TIMER interrupt processing shown in FIG. 8) and set in the note timer NOTETIMER (step S34). The value of this timer is decremented by "1" each time the processing of FIG. 8 is executed.
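The conversion in step S34 is not spelled out in the text, so the following Python sketch shows only one plausible form. It assumes that TEMPO is given in quarter notes per minute and that 96 units of DURATION correspond to one quarter note; both are assumptions, as is the function name. The 8 msec interrupt interval is the example value given for the processing of FIG. 8.

```python
def duration_to_note_timer(duration: int,
                           tempo_bpm: int,
                           units_per_quarter: int = 96,  # assumed resolution
                           interrupt_ms: int = 8) -> int:
    """Convert a relative DURATION into a NOTETIMER count.

    The result is the number of timer interrupts that fit into the
    note at the given tempo (rounded down, at least 1).
    """
    # Note length in ms is duration / units_per_quarter * 60000 / tempo.
    # Integer arithmetic keeps the result exact.
    ticks = (duration * 60_000) // (units_per_quarter * tempo_bpm * interrupt_ms)
    return max(1, ticks)


print(duration_to_note_timer(96, tempo_bpm=60))   # 125 (1000 ms / 8 ms)
print(duration_to_note_timer(96, tempo_bpm=120))  # 62  (500 ms / 8 ms, rounded down)
```

Under these assumptions, doubling the tempo roughly halves the note timer, whereas a consonant whose PHONETIME is 5 still occupies 5 interrupts at either tempo.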

[0028] In the following step S35, the pointer k is set to "1" and both the key-on flag KEYONFLG and the note-on flag NOTEONFLG are set to "1", and the process proceeds to step S41 in FIG. 6. In step S41, it is determined whether the remaining-processing flag RESTFLG is "0". Since RESTFLG = 0 initially, the process proceeds to step S42, where it is determined whether the note-on flag NOTEONFLG is "1". The note-on flag NOTEONFLG is returned from "1" to "0" when the duration time elapses and the note timer NOTETIMER reaches "0" (FIG. 8, steps S73 and S74), but since NOTEONFLG = 1 initially, the process proceeds to step S43.

[0029] In step S43, it is determined whether the zero-designation flag PHTIMEZEROFLG is "0". Since PHTIMEZEROFLG = 0 initially, the process proceeds to step S44, where it is determined whether the formant timer flag FTIMERFLG is "0". Since FTIMERFLG = 0 initially, the process proceeds to step S51 in FIG. 7, where the phoneme symbol data LYPHONE indicated by the pointer k is read. It is then determined whether the read phoneme symbol data LYPHONE is a vowel (step S52), and if it is not a vowel, whether it is a consonant (step S53).

[0030] For example, when the phoneme symbol data LYPHONE is "h", the process proceeds to step S54 via steps S52 and S53. If the answers in both steps S52 and S53 are negative (NO), it is judged that sounding of one lyric note data LYRICNOTE is complete, and the process proceeds to step S48 in FIG. 6.

[0031] In step S54, the formant timer FTIMER is set to the phoneme sounding time data PHONETIME indicated by the pointer k, the formant timer flag FTIMERFLG is set to "1", and the formant timer FTIMER is started. Like the note timer NOTETIMER, the formant timer FTIMER is decremented in the processing of FIG. 8, and when its value reaches "0", the formant timer flag FTIMERFLG is set to "0" (steps S76 to S78).

[0032] In the following step S55, the phoneme symbol data LYPHONEk is transferred to the consonant formant generating section UTG. Sounding is then started at the velocity designated by the key-on data KEYON (step S56), the pointer k is incremented by "1" (step S57), and this processing ends.

[0033] Thereafter, until the formant timer FTIMER reaches "0" in the processing of FIG. 8 and the formant timer flag FTIMERFLG becomes "0", the operation of ending this processing immediately from step S44 is repeated.

[0034] The processing of FIG. 8 is executed at predetermined intervals (for example, every 8 msec). In this processing, it is first determined in step S71 whether the key-on flag KEYONFLG is "1". If KEYONFLG = 0, the process proceeds immediately to step S75. If KEYONFLG = 1, the note timer NOTETIMER is decremented by "1" (step S72), and it is determined whether the timer value is "0" (step S73). While NOTETIMER > 0, the process proceeds immediately to step S75; when NOTETIMER = 0, the note-on flag NOTEONFLG is set to "0" (step S74) and the process proceeds to step S75.

[0035] In step S75, it is determined whether the formant timer flag FTIMERFLG is "1". If FTIMERFLG = 0, the process proceeds immediately to step S79. If FTIMERFLG = 1, the formant timer FTIMER is decremented by "1" (step S76), and it is determined whether the timer value is "0" (step S77). While FTIMER > 0, the process proceeds immediately to step S79; when FTIMER = 0, the formant timer flag FTIMERFLG is set to "0" (step S78) and the process proceeds to step S79.

[0036] In step S79, other interrupt processing is executed, and this processing ends.

[0037] In this way, the processing of FIG. 8 manages the duration time and the sounding time of each phoneme.
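The timer interrupt processing of FIG. 8 (steps S71 to S79) can be sketched in Python as follows. The `TimerState` class and its lower-case field names are illustrative stand-ins for the flags and timers named in the text, and the sketch models only the bookkeeping, not a real hardware interrupt.

```python
from dataclasses import dataclass


@dataclass
class TimerState:
    keyon_flag: int = 0    # KEYONFLG
    noteon_flag: int = 0   # NOTEONFLG
    note_timer: int = 0    # NOTETIMER
    ftimer_flag: int = 0   # FTIMERFLG
    ftimer: int = 0        # FTIMER


def timer_interrupt(s: TimerState) -> None:
    """One pass of the periodic processing (e.g. every 8 msec)."""
    # Steps S71 to S74: manage the duration time of the note.
    if s.keyon_flag == 1:
        s.note_timer -= 1
        if s.note_timer == 0:
            s.noteon_flag = 0
    # Steps S75 to S78: manage the sounding time of the current phoneme.
    if s.ftimer_flag == 1:
        s.ftimer -= 1
        if s.ftimer == 0:
            s.ftimer_flag = 0
    # Step S79: other interrupt processing would run here (omitted).


state = TimerState(keyon_flag=1, noteon_flag=1, note_timer=3,
                   ftimer_flag=1, ftimer=2)
for _ in range(3):
    timer_interrupt(state)
print(state.noteon_flag, state.ftimer_flag)  # 0 0
```

Keeping the two timers independent is what lets a consonant's absolute sounding time expire on its own schedule while the note's tempo-dependent duration runs in parallel.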

[0038] Returning to FIG. 6, when the formant timer flag FTIMERFLG becomes "0", the process proceeds from step S44 to step S51, where the next phoneme symbol data LYPHONEk is read.

[0039] If, in the following step S52, the phoneme symbol data LYPHONEk is a vowel (for example, the "i" of "hit"), it is determined whether the phoneme sounding time data PHONETIMEk is not "0", that is, whether zero designation has not been made (step S61). If zero designation has been made (for example, the phoneme "i" shown in FIG. 2(f)), the process proceeds to step S63, where it is determined whether the zero-designation flag PHTIMEZEROFLG is "0". Since PHTIMEZEROFLG = 0 initially, the flag PHTIMEZEROFLG is set to "1" and the process proceeds to step S67. Because a zero-designated vowel continues sounding until the end of the duration time, the formant timer FTIMER is not set.

[0040] If, on the other hand, zero designation has not been made, the formant timer FTIMER is set to the phoneme sounding time data PHONETIMEk indicated by the pointer k, the formant timer flag FTIMERFLG is set to "1", and the formant timer FTIMER is started (step S62); the process then proceeds to step S67.

[0041] In step S67, the phoneme symbol data LYPHONEk is transferred to the vowel formant generating section VTG. Sounding is then started at the pitch and velocity designated by the key-on data KEYON (step S68), the pointer k is incremented by "1" (step S69), and this processing ends.

[0042] In the example of the lyric "hit" shown in FIG. 2(f), "i" is zero-designated, so thereafter the operation of ending the processing immediately from step S43 is repeated. When the duration time ends, the note timer NOTETIMER reaches "0", and the note-on flag NOTEONFLG becomes "0", the process proceeds from step S42 to step S45, where it is determined whether the zero-designation flag PHTIMEZEROFLG is "1". In this example PHTIMEZEROFLG = 1, so the vowel being sounded ("i") is muted, the remaining-processing flag RESTFLG is set to "1" (step S46), and the process proceeds to step S51.

[0043] In step S51, the next phoneme symbol data LYPHONE ("t") is read, and steps S52 through S57 are executed. Thereafter, the process of proceeding directly from step S41 to step S44 is repeated. When the formant timer FTIMER reaches "0" and the formant timer flag FTIMERFLG becomes "0", the process proceeds via steps S51, S52, and S53 to step S48, where the key-on flag KEYONFLG, the formant timer flag FTIMERFLG, the note-on flag NOTEONFLG, the zero-designation flag PHTIMEZEROFLG, and the remaining-processing flag RESTFLG are set to "0", the pointer i is incremented by "1", and this processing ends.

【００４４】If the lyrics note data LYRICNOTE contains no zero-designated phoneme, then when the duration time ends the process proceeds from step S45 to step S47, mutes the vowel or consonant being sounded, and proceeds to step S48.
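
The branch taken when the duration time ends (steps S45 to S48) can be sketched roughly as follows. This is only an illustration: the flag names come from the description, but the `Flags` class, the function names, and the action strings are assumptions made for the example, and the real processing is spread across the flowcharts of FIGS. 5 to 7.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """Flag names follow the description; the class itself is hypothetical."""
    KEYONFLG: int = 1
    FTIMERFLG: int = 0
    NOTEONFLG: int = 0
    PHTIMEZEROFLG: int = 0
    RESTFLG: int = 0


def on_duration_end(flags: Flags) -> list[str]:
    """Return the actions taken at step S45 once NOTEONFLG has become 0."""
    if flags.PHTIMEZEROFLG == 1:
        # Step S46: mute the held vowel and note that phonemes remain.
        flags.RESTFLG = 1
        return ["mute held vowel", "go to S51"]
    # Step S47: no phoneme was zero-designated, so mute whatever is sounding.
    return ["mute current phoneme", "go to S48"]


def reset_flags(flags: Flags) -> None:
    """Step S48: clear all flags before moving on to the next note."""
    flags.KEYONFLG = flags.FTIMERFLG = flags.NOTEONFLG = 0
    flags.PHTIMEZEROFLG = flags.RESTFLG = 0
```

With `PHTIMEZEROFLG` set, the sketch leaves `RESTFLG` at 1 so that the remaining phonemes (such as "t" in "hit") are still processed before the flags are cleared.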

【００４５】If two or more zero designations are made in one piece of lyrics note data LYRICNOTE, the answer at step S63 in FIG. 7 is negative (NO), and the process proceeds to step S65, where the value of the pointer k is incremented by "1". The formant timer flag FTIMERFLG is then set to "0" (step S66), and the process returns to step S51. As a result, the second and subsequent zero-designated vowels in one piece of lyrics note data are controlled so that they are not sounded.
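
The effect of this rule can be illustrated with a small sketch. It assumes, purely for the example, that a note's phonemes are held as `(symbol, time)` pairs in which a time of 0 stands for a zero designation; the function name and the sample values are hypothetical.

```python
def drop_extra_zero_vowels(phonemes):
    """Keep the first zero-designated phoneme and skip any later ones.

    phonemes: list of (symbol, time) pairs; time == 0 means zero-designated.
    """
    kept = []
    seen_zero = False
    for symbol, time in phonemes:
        if time == 0:
            if seen_zero:
                # Second or later zero designation: not sounded.
                continue
            seen_zero = True
        kept.append((symbol, time))
    return kept


# Hypothetical note data with two zero-designated vowels
print(drop_extra_zero_vowels([("k", 40), ("a", 0), ("i", 0)]))
# [('k', 40), ('a', 0)]
```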

【００４６】FIG. 9 is a diagram for explaining the process of sounding the lyrics "hit", for which the phoneme sounding time data PHONETIME is set as shown in FIG. 2(f), in correspondence with a quarter note of pitch C3. Sounding of the phoneme "h" starts at the key-on timing (time t1), and when the sounding time designated by the phoneme sounding time data PHONETIME1 has elapsed (time t2), sounding of the phoneme "i" starts. At this time, the sounding level of the phoneme "h" decays according to a predetermined decay characteristic. Because the phoneme "i" is zero-designated, it is sounded until the end of the duration time (time t3), after which the phoneme "t" is sounded for its designated sounding time.
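
The timeline of FIG. 9 can be reproduced with a short sketch. The millisecond values below are invented for illustration (the description does not state them), the `schedule` function is hypothetical, and the decay of "h" after t2 is not modeled.

```python
def schedule(phonemes, key_on, duration):
    """Return (symbol, start, end) for each phoneme of one note.

    phonemes: (symbol, time) pairs; time == 0 means "hold until the
    note's duration ends". Assumes at most one zero-designated phoneme.
    """
    events = []
    now = key_on
    for symbol, time in phonemes:
        end = key_on + duration if time == 0 else now + time
        events.append((symbol, now, end))
        now = end
    return events


# Hypothetical values in ms: the note lasts 500, "h" takes 30, "t" takes 40
for event in schedule([("h", 30), ("i", 0), ("t", 40)], key_on=0, duration=500):
    print(event)
# ('h', 0, 30)
# ('i', 30, 500)
# ('t', 500, 540)
```

Here t1 = 0, t2 = 30, and t3 = 500, so "t" starts only after the duration time has ended, matching the order described above.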

【００４７】In the example where the lyrics are "yuki" (FIG. 2(f), lower part), the phoneme "u" is zero-designated, so this vowel is sounded until the end of the duration time, and the phonemes "k" and "i" are sounded after that.

【００４８】Thus, in this embodiment, a vowel phoneme that is zero-designated in the lyrics note data LYRICNOTE is sounded until the end of the duration time, so natural singing can be performed even if the tempo of the song is changed.
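
A small calculation shows why the tempo can change without distorting the consonants. The sketch assumes that the relative duration is counted in ticks and converted with a resolution of 480 ticks per quarter note; both the resolution and the 30 ms consonant length are illustrative values, not figures from the description.

```python
def note_ms(ticks, bpm, ppq=480):
    """Convert a relative duration in ticks to milliseconds at a given tempo."""
    return ticks * 60000 / (bpm * ppq)


def held_vowel_ms(ticks, bpm, preceding_ms, ppq=480):
    """Time left for the zero-designated vowel after the fixed-length phonemes."""
    return note_ms(ticks, bpm, ppq) - preceding_ms


# A quarter note (480 ticks) with a 30 ms consonant before the held vowel
print(held_vowel_ms(480, bpm=120, preceding_ms=30))  # 470.0
print(held_vowel_ms(480, bpm=60, preceding_ms=30))   # 970.0
```

Halving the tempo doubles the note length, but only the held vowel stretches; the consonant keeps its absolute length.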

【００４９】When long lyrics are assigned to a single note, changing which vowel is zero-designated (for example, changing "ko-nnichiwa" (こーんにちは) to "konnichi-wa" (こんにちーわ)) changes the feel of the singing and improves singing expressiveness.
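
The two renderings in this example differ only in which vowel carries the zero designation. The sketch below makes that visible by marking the held vowel with "-"; the phoneme split, the 50 ms placeholder time, and both helper functions are assumptions made for illustration.

```python
def with_hold_at(symbols, index, fixed_ms=50):
    """Give every phoneme a fixed time except the one at `index`, which gets 0."""
    return [(s, 0 if i == index else fixed_ms) for i, s in enumerate(symbols)]


def render_hold(phonemes):
    """Join the phoneme symbols, adding '-' after the first zero-designated one."""
    out = []
    marked = False
    for symbol, time in phonemes:
        out.append(symbol)
        if time == 0 and not marked:
            out.append("-")
            marked = True
    return "".join(out)


KONNICHIWA = ["k", "o", "n", "n", "i", "ch", "i", "w", "a"]

print(render_hold(with_hold_at(KONNICHIWA, 1)))  # ko-nnichiwa
print(render_hold(with_hold_at(KONNICHIWA, 6)))  # konnichi-wa
```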

【００５０】The present invention is not limited to the embodiment described above and can be implemented in various forms. For example, in the embodiment described above the song data SONG is stored in the data memory 4, but a MIDI interface may be provided so that the song data is supplied from an external device.

【００５１】The speech synthesis method is not limited to the formant synthesis method, and other methods may be employed. The CPU may also be made to execute the speech synthesis processing itself.

【００５２】[0052]

[Effects of the Invention]

As described in detail above, according to the singing voice synthesizing apparatus of claim 1 or the singing voice synthesizing method of claim 3, when the phoneme sounding time data consists of the second data, that is, data specifying that a phoneme that is a voiced sound is to be sounded until the end of the sounding time indicated by the lyrics sounding time data corresponding to that voiced sound, the sounding control means is controlled so that the voiced phoneme is sounded until the end of the sounding time indicated by the phoneme sounding time data corresponding to the voiced sound. Consequently, natural singing can be performed even when the tempo of the song is changed, and singing expressiveness can be improved.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a singing electronic device according to an embodiment of the present invention.

FIG. 2 is a diagram showing the structure of the data stored in each memory of FIG. 1.

FIG. 3 is a flowchart of the main program executed by the CPU of FIG. 1.

FIG. 4 is a flowchart of the singing performance process.

FIG. 5 is a flowchart showing in detail the singing data (LYRICSEQDATA) performance process of FIG. 4.

FIG. 6 is a flowchart showing in detail the singing data (LYRICSEQDATA) performance process of FIG. 4.

FIG. 7 is a flowchart showing in detail the singing data (LYRICSEQDATA) performance process of FIG. 4.

FIG. 8 is a flowchart of the timer interrupt process.

FIG. 9 is a diagram for explaining the singing data performance process.

[Explanation of symbols]

1 CPU; 2 ROM; 3 RAM; 4 data memory; 8 formant sound source; 9 DA converter; 10 sound system

Continued from the front page: (58) Fields searched (Int. Cl.⁷, DB name): G10L 13/06; G10H 7/02; G10K 15/04 302; G10L 13/00; JICST file (JOIS)

Claims

(57) [Claims]

1. A singing voice synthesizing apparatus that sequentially synthesizes voice by controlling sounding control means on the basis of singing data including a plurality of lyrics data indicating lyrics to be sounded and lyrics sounding time data that corresponds to the lyrics data and indicates, in relative time, the sounding time of the lyrics indicated by the lyrics data, wherein each of the lyrics data includes phoneme symbol data indicating the phonemes of the lyrics and phoneme sounding time data specifying the sounding time of each of the phonemes; when the phoneme is a voiced sound, the phoneme sounding time data consists of either first data specifying the sounding time of the phoneme in absolute time or second data specifying that the phoneme is to be sounded until the end of the sounding time indicated by the lyrics sounding time data corresponding to the voiced sound, and when the phoneme is not a voiced sound, the phoneme sounding time data consists of the first data only; and the sounding control means is controlled so that, when the phoneme sounding time data consists of the first data, the phoneme is sounded only for the sounding time that the first data specifies in absolute time, and, when the phoneme sounding time data consists of the second data, the voiced phoneme is sounded until the end of the sounding time indicated by the lyrics sounding time data corresponding to the voiced sound.

2. The singing voice synthesizing apparatus according to claim 1, wherein the sounding control means is controlled so that a phoneme following a voiced phoneme whose phoneme sounding time data in the lyrics data consists of the second data is sounded after the end of the sounding time indicated by the lyrics sounding time data corresponding to the voiced sound.

3. A singing voice synthesizing method for sequentially synthesizing voice by controlling sounding control means on the basis of singing data including a plurality of lyrics data indicating lyrics to be sounded and lyrics sounding time data that corresponds to the lyrics data and indicates, in relative time, the sounding time of the lyrics indicated by the lyrics data, wherein each of the lyrics data includes phoneme symbol data indicating the phonemes of the lyrics and phoneme sounding time data specifying the sounding time of each of the phonemes; when the phoneme is a voiced sound, the phoneme sounding time data consists of either first data specifying the sounding time of the phoneme in absolute time or second data specifying that the phoneme is to be sounded until the end of the sounding time indicated by the lyrics sounding time data corresponding to the voiced sound, and when the phoneme is not a voiced sound, the phoneme sounding time data consists of the first data only; and the sounding control means is controlled so that, when the phoneme sounding time data consists of the first data, the phoneme is sounded only for the sounding time that the first data specifies in absolute time, and, when the phoneme sounding time data consists of the second data, the voiced phoneme is sounded until the end of the sounding time indicated by the lyrics sounding time data corresponding to the voiced sound.

4. The singing voice synthesizing method according to claim 3, wherein the sounding control means is controlled so that a phoneme following a voiced phoneme whose phoneme sounding time data in the lyrics data consists of the second data is sounded after the end of the sounding time indicated by the lyrics sounding time data corresponding to the voiced sound.