JP6760457B2

JP6760457B2 - Electronic musical instruments, control methods for electronic musical instruments, and programs

Info

Publication number: JP6760457B2
Application number: JP2019164117A
Authority: JP
Inventors: 厚士中村; 克瀬戸口; 真段城; 文章太田
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2019-09-10
Filing date: 2019-09-10
Publication date: 2020-09-23
Anticipated expiration: 2038-04-16
Also published as: JP2020013145A

Description

The present invention relates to an electronic musical instrument that reproduces a singing voice in response to operation of an operating element such as a keyboard, to a control method for the electronic musical instrument, and to a program.

Electronic musical instruments that reproduce a singing voice (vocal) in response to operation of an operating element such as a keyboard are known in the art (see, for example, Patent Document 1). This conventional technique comprises a keyboard operator for designating a pitch, storage means storing lyric data, instruction means for instructing that lyric data be read from the storage means, reading means for sequentially reading lyric data from the storage means when so instructed by the instruction means, and a sound source that generates a singing voice at the pitch designated by the keyboard operator with a tone color corresponding to the lyric data read by the reading means.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H06-332449

However, suppose that, with the conventional technique described above, a singing voice corresponding to the lyrics is to be output in step with the progress of accompaniment data output by the electronic musical instrument. If the singing voice corresponding to the lyrics is output in sequence every time a key is designated, regardless of which key the performer designates, then depending on how the performer designates keys, the output singing voice and the progress of the accompaniment data fall out of step. For example, when one bar contains four notes, the lyrics run ahead of the accompaniment data if the performer designates four or more pitches within that bar, and the lyrics lag behind the accompaniment data if the performer designates three or fewer pitches within that bar.

Thus, if the lyrics advance every time the performer designates a pitch on the keyboard or the like, the lyrics may run too far ahead of the accompaniment or, conversely, lag too far behind it.

An electronic musical instrument according to one example aspect executes:
a singing voice output process that, when a first pitch to be designated in accordance with a first timing is designated, outputs singing voice data of the first pitch corresponding to a first character associated with the first timing, and that, when a second pitch to be designated in accordance with a second timing following the first timing is designated before the second timing, outputs singing voice data of the second pitch corresponding to a second character associated with the second timing without waiting for the second timing to arrive; and
a pitch change process that, when a third pitch other than the first pitch and the second pitch is designated before the second timing while the singing voice data of the first pitch corresponding to the first character is being output, changes the pitch of the singing voice data being output for the first character to the third pitch.
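The two processes above can be sketched as a single decision function. This is a minimal illustration only: the `LyricEvent` record, the function name, and the action labels are hypothetical and do not come from the patent, and the handling of a repeated first pitch is an assumption, since the text does not address that case.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LyricEvent:
    """A lyric character and the pitch expected at its timing (hypothetical layout)."""
    char: str
    pitch: str


def handle_key_before_second_timing(
    first: LyricEvent, second: LyricEvent, pressed: str
) -> Tuple[str, str, str]:
    """Decide what to voice for a key pressed while `first` is sounding
    and before the timing of `second` has arrived.

    Returns (action, character, pitch).
    """
    if pressed == second.pitch:
        # Singing voice output process: the next expected pitch arrived early,
        # so voice the second character without waiting for its timing.
        return ("advance", second.char, pressed)
    if pressed != first.pitch:
        # Pitch change process: a third pitch re-pitches the character that
        # is already sounding; the lyrics do not advance.
        return ("repitch", first.char, pressed)
    # Assumption: pressing the first pitch again leaves the output unchanged.
    return ("hold", first.char, first.pitch)
```

With a first event ("kle", E4) and a second event ("twin", B4), pressing B4 early advances to "twin", while pressing G4 keeps "kle" sounding at G4.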

According to the present invention, an electronic musical instrument that satisfactorily controls the progress of lyrics can be provided.

A diagram showing an example of the external appearance of an embodiment of an electronic keyboard instrument.
A block diagram showing an example hardware configuration of an embodiment of a control system of the electronic keyboard instrument.
A block diagram showing an example configuration of a voice synthesis LSI.
A diagram explaining the operation of the voice synthesis LSI.
A diagram explaining the lyric control technique.
A diagram showing an example data configuration of the present embodiment.
A main flowchart showing an example of control processing of the electronic musical instrument in the present embodiment.
A flowchart showing detailed examples of initialization processing, tempo change processing, and song start processing.
A flowchart showing a detailed example of switch processing.
A flowchart showing a detailed example of automatic performance interrupt processing.
A flowchart showing a detailed example of a first embodiment of song playback processing.
A flowchart showing a detailed example of a second embodiment of song playback processing.
A flowchart showing a detailed example of next song event search processing.
A diagram showing an example configuration of lyric control data in MusicXML format.
A diagram showing an example of a musical score display based on lyric control data in MusicXML format.

Embodiments for carrying out the present invention are described in detail below with reference to the drawings.

FIG. 1 is a diagram showing an example of the external appearance of an embodiment 100 of an electronic keyboard instrument. The electronic keyboard instrument 100 includes a keyboard 101 consisting of a plurality of keys serving as performance operators; a first switch panel 102 for specifying various settings such as volume, song playback tempo, song playback start, and accompaniment playback; a second switch panel 103 for selecting songs, accompaniments, tone colors, and the like; and an LCD 104 (liquid crystal display) that displays lyrics, a musical score, and various setting information during song playback. Although not specifically illustrated, the electronic keyboard instrument 100 also includes a speaker, provided on its underside, side, rear, or the like, that emits the musical sounds generated by the performance.

FIG. 2 is a diagram showing an example hardware configuration of an embodiment of a control system 200 of the electronic keyboard instrument 100 of FIG. 1. In FIG. 2, the control system 200 includes a CPU (central processing unit) 201, a ROM (read-only memory) 202, a RAM (random access memory) 203, a sound source LSI (large-scale integrated circuit) 204, a voice synthesis LSI 205, a key scanner 206 to which the keyboard 101, the first switch panel 102, and the second switch panel 103 of FIG. 1 are connected, and an LCD controller 208 to which the LCD 104 of FIG. 1 is connected, each of which is connected to a system bus 209. A timer 210 for controlling the automatic performance sequence is connected to the CPU 201. Musical sound output data 218 and singing voice output data 217, output from the sound source LSI 204 and the voice synthesis LSI 205 respectively, are converted by D/A converters 211 and 212 into an analog musical sound output signal and an analog singing voice output signal, respectively. The two analog signals are mixed by a mixer 213, and the mixed signal is amplified by an amplifier 214 and then output from a speaker or an output terminal, neither of which is specifically illustrated.

The CPU 201 executes the control operations of the electronic keyboard instrument 100 of FIG. 1 by running a control program stored in the ROM 202 while using the RAM 203 as working memory. In addition to the control program and various fixed data, the ROM 202 stores song data including lyric data and accompaniment data.

The CPU 201 is equipped with the timer 210 used in the present embodiment, which, for example, counts the progress of the automatic performance on the electronic keyboard instrument 100.

In accordance with sound generation control instructions from the CPU 201, the sound source LSI 204 reads musical sound waveform data from, for example, a waveform ROM (not specifically illustrated) and outputs it to the D/A converter 211. The sound source LSI 204 is capable of sounding up to 256 voices simultaneously.

When the CPU 201 supplies the voice synthesis LSI 205 with singing voice data 215 comprising lyric text data and information on pitch, note length, and start frame, the voice synthesis LSI 205 synthesizes the corresponding singing voice audio data and outputs it to the D/A converter 212.
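As a rough illustration of what such a request might carry, the sketch below models the four pieces of information named above as a record. The field names, the use of MIDI note numbers for pitch, and the tick-based note length are all assumptions made for this example; the patent does not specify the encoding of the singing voice data 215.

```python
from dataclasses import dataclass

# Semitone offsets of the natural notes within an octave.
_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name: str) -> int:
    """Convert a note name such as 'E4' or 'F#3' to a MIDI note number (C4 = 60)."""
    letter, rest = name[0].upper(), name[1:]
    offset = 0
    if rest.startswith("#"):
        offset, rest = 1, rest[1:]
    elif rest.startswith("b"):
        offset, rest = -1, rest[1:]
    return 12 * (int(rest) + 1) + _SEMITONES[letter] + offset


@dataclass(frozen=True)
class SingingVoiceData:
    """Hypothetical layout of one synthesis request sent by the CPU."""
    lyric_text: str      # lyric character(s) to sing
    midi_pitch: int      # pitch, here as a MIDI note number (assumption)
    duration_ticks: int  # note length in sequencer ticks (assumption)
    start_frame: int     # frame at which synthesis should begin


# Example: the first character of the song at pitch E4 for one quarter note,
# assuming 480 ticks per quarter note.
request = SingingVoiceData("ki", note_to_midi("E4"), 480, 0)
```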

The key scanner 206 continuously scans the key press/release state of the keyboard 101 of FIG. 1 and the switch operation states of the first switch panel 102 and the second switch panel 103, and interrupts the CPU 201 to report any state change.

The LCD controller 208 is an IC (integrated circuit) that controls the display state of the LCD 104.

FIG. 3 is a block diagram showing an example configuration of the voice synthesis LSI 205 of FIG. 2. The voice synthesis LSI 205 receives the singing voice data 215 specified by the CPU 201 of FIG. 2 through the song playback process described later, and synthesizes and outputs the singing voice output data 217 based on, for example, the "statistical speech synthesis based on deep learning" technique described in the following reference.

(Reference)
Kei Hashimoto and Shinji Takaki, "Statistical Speech Synthesis Based on Deep Learning," Journal of the Acoustical Society of Japan, Vol. 73, No. 1 (2017), pp. 55-62

The voice synthesis LSI 205 includes a voice learning unit 301 and a voice synthesis unit 302. The voice learning unit 301 includes a learning text analysis unit 303, a learning acoustic feature extraction unit 304, and a model learning unit 305.

The learning text analysis unit 303 receives and analyzes learning singing voice data 311, which includes lyric text, pitches, and note lengths. From this analysis, the learning text analysis unit 303 estimates and outputs a learning language feature sequence 313, a sequence of discrete values expressing the phonemes, parts of speech, words, pitches, and so on corresponding to the learning singing voice data 311.

The learning acoustic feature extraction unit 304 receives and analyzes learning singing voice audio data 312, recorded via a microphone or the like as a singer sings the above lyric text. From this analysis, it extracts and outputs a learning acoustic feature sequence 314 representing the features of the voice in the learning singing voice audio data 312.

In accordance with equation (1) below, the model learning unit 305 estimates, by machine learning, the acoustic model $\hat{\lambda}$ that maximizes the probability $P(\boldsymbol{o} \mid \boldsymbol{l}, \lambda)$ that the learning acoustic feature sequence 314 (denoted $\boldsymbol{o}$) is generated from the learning language feature sequence 313 (denoted $\boldsymbol{l}$) and an acoustic model (denoted $\lambda$). In other words, the relationship between the language feature sequence, which is text, and the acoustic feature sequence, which is voice, is expressed by a statistical model called an acoustic model.

$$\hat{\lambda} = \arg\max_{\lambda} P(\boldsymbol{o} \mid \boldsymbol{l}, \lambda) \tag{1}$$

The model learning unit 305 outputs, as a learning result 315, the model parameters expressing the acoustic model $\hat{\lambda}$ obtained by the machine learning of equation (1), and sets them in an acoustic model unit 306 in the voice synthesis unit 302.

The voice synthesis unit 302 includes a text analysis unit 307, the acoustic model unit 306, and a vocalization model unit 308. The voice synthesis unit 302 performs statistical voice synthesis: it synthesizes the singing voice output data 217 corresponding to the singing voice data 215, which includes lyric text, by predicting it with the statistical model (the acoustic model) set in the acoustic model unit 306.

The text analysis unit 307 receives and analyzes the singing voice data 215, which is specified by the CPU 201 of FIG. 2 as a result of the performer playing along with the automatic performance and which includes lyric text data and information on pitch, note length, and start frame. From this analysis, the text analysis unit 307 outputs a language feature sequence 316 expressing the phonemes, parts of speech, words, and so on corresponding to the singing voice data 215.

The acoustic model unit 306 receives the language feature sequence 316 and estimates and outputs the corresponding acoustic feature sequence 317. Specifically, in accordance with equation (2) below, the acoustic model unit 306 estimates the value $\hat{\boldsymbol{o}}$ of the acoustic feature sequence 317 that maximizes the probability $P(\boldsymbol{o} \mid \boldsymbol{l}, \hat{\lambda})$ that the acoustic feature sequence 317 (again denoted $\boldsymbol{o}$) is generated, given the language feature sequence 316 input from the text analysis unit 307 (again denoted $\boldsymbol{l}$) and the acoustic model $\hat{\lambda}$ set as the learning result 315 by the machine learning in the model learning unit 305.

$$\hat{\boldsymbol{o}} = \arg\max_{\boldsymbol{o}} P(\boldsymbol{o} \mid \boldsymbol{l}, \hat{\lambda}) \tag{2}$$

The vocalization model unit 308 receives the acoustic feature sequence 317 and generates the singing voice output data 217 corresponding to the singing voice data 215, which includes the lyric text specified by the CPU 201. The singing voice output data 217 is output from the D/A converter 212 of FIG. 2 via the mixer 213 and the amplifier 214, and is emitted from a speaker (not specifically illustrated).

The acoustic features represented by the learning acoustic feature sequence 314 and the acoustic feature sequence 317 include spectral information modeling the human vocal tract and sound source information modeling the human vocal cords. Mel-cepstrum or line spectral pairs (LSP), for example, can be used as the spectral parameters. The fundamental frequency (F0), which indicates the pitch frequency of the human voice, can be used as the sound source information.

The vocalization model unit 308 includes a sound source generation unit 309 and a synthesis filter unit 310. The sound source generation unit 309 receives the sequence of sound source information 319 from the acoustic model unit 306 and generates a sound source signal consisting of, for example, a pulse train that repeats periodically at the fundamental frequency (F0) contained in the sound source information 319 and has the power value contained in the sound source information 319 (for voiced phonemes), or white noise having the power value contained in the sound source information 319 (for unvoiced phonemes). The synthesis filter unit 310 forms a digital filter that models the vocal tract based on the sequence of spectral information 318 input from the acoustic model unit 306 and, using the sound source signal from the sound source generation unit 309 as an excitation signal, generates and outputs the singing voice output data 217 as a digital signal.
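The excitation described for the sound source generation unit can be illustrated with a short sketch. It is not the patent's implementation: the function name, the 44.1 kHz sample rate, the convention that F0 of zero marks an unvoiced frame, and the pulse scaling (chosen so that the mean power of the pulse train equals the requested power) are all assumptions. The vocal tract filter of the synthesis filter unit is not shown.

```python
import math
import random
from typing import List, Optional, Tuple

SAMPLE_RATE_HZ = 44100.0  # assumed output sample rate


def excitation_frame(
    f0_hz: float,
    power: float,
    n_samples: int,
    phase: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], float]:
    """Generate one frame of excitation and return (samples, new phase).

    f0_hz > 0 : voiced frame, a pulse train repeating at the fundamental frequency.
    f0_hz <= 0: unvoiced frame, white Gaussian noise (assumed convention).
    `phase` counts samples since the last pulse so pulses stay periodic across frames.
    """
    if f0_hz <= 0.0:
        rng = rng or random.Random()
        sigma = math.sqrt(power)  # noise variance equals the requested power
        return [rng.gauss(0.0, sigma) for _ in range(n_samples)], phase

    period = SAMPLE_RATE_HZ / f0_hz          # pulse spacing in samples
    amplitude = math.sqrt(power * period)    # one pulse per period -> mean power == power
    samples = []
    for _ in range(n_samples):
        phase += 1.0
        if phase >= period:
            phase -= period
            samples.append(amplitude)
        else:
            samples.append(0.0)
    return samples, phase
```

Carrying `phase` from one call to the next keeps the pulse spacing constant across frame boundaries, which matters because a frame rarely contains a whole number of pitch periods.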

In the present embodiment, the acoustic model unit 306 is implemented as a deep neural network (DNN) in order to predict the acoustic feature sequence 317 from the language feature sequence 316. Correspondingly, the model learning unit 305 in the voice learning unit 301 learns model parameters representing the nonlinear transformation functions of the neurons in the DNN that map language features to acoustic features, and outputs those model parameters as the learning result 315 to the DNN of the acoustic model unit 306 in the voice synthesis unit 302.

Acoustic features are normally computed in units of frames, for example 5.1 msec (milliseconds) wide, whereas language features are computed in units of phonemes. The two therefore have different time units. Because the acoustic model unit 306, being a DNN, is a model expressing a one-to-one correspondence between the input language feature sequence 316 and the output acoustic feature sequence 317, the DNN cannot be trained on input/output pairs whose time units differ. In the present embodiment, therefore, the correspondence between the frame-level acoustic feature sequence and the phoneme-level language feature sequence is established in advance, and frame-level pairs of acoustic features and language features are generated.

FIG. 4 is a diagram explaining the operation of the voice synthesis LSI 205 and illustrates the correspondence described above. For example, suppose the singing voice phoneme sequence "/k/" "/i/" "/r/" "/a/" "/k/" "/i/" (FIG. 4(b)) has been obtained as the language feature sequence corresponding to the lyric character string "ki" "ra" "ki" (FIG. 4(a)) at the start of the children's song "Kirakira Boshi" ("Twinkle, Twinkle, Little Star"). These language features are associated with the frame-level acoustic feature sequence (FIG. 4(c)) in a one-to-many relationship (the relationship between (b) and (c) in FIG. 4). Because the language features are used as input to the DNN in the acoustic model unit 306, they must be expressed as numerical data. The language feature sequence is therefore prepared as numerical data obtained by concatenating binary (0 or 1) or continuous-valued answers to context questions such as "Is the preceding phoneme /a/?" and "How many phonemes does the current word contain?"
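The one-to-many alignment and the numeric encoding of context questions can be sketched as follows. The function names and the per-phoneme frame counts are hypothetical, and treating the whole phoneme list as a single word is a simplification made only for this example.

```python
from typing import List, Sequence


def expand_to_frames(phonemes: Sequence[str], frames_per_phoneme: Sequence[int]) -> List[str]:
    """Repeat each phoneme once per frame it spans (one-to-many alignment)."""
    if len(phonemes) != len(frames_per_phoneme):
        raise ValueError("need one frame count per phoneme")
    return [p for p, n in zip(phonemes, frames_per_phoneme) for _ in range(n)]


def context_features(phonemes: Sequence[str], index: int) -> List[float]:
    """Answer two example context questions for the phoneme at `index`."""
    previous = phonemes[index - 1] if index > 0 else None
    return [
        1.0 if previous == "a" else 0.0,  # binary: is the preceding phoneme /a/?
        float(len(phonemes)),             # continuous: phonemes in the current word
    ]


phonemes = ["k", "i", "r", "a", "k", "i"]
# Hypothetical durations in frames for each phoneme.
frame_labels = expand_to_frames(phonemes, [2, 3, 1, 3, 2, 3])
```

Each entry of `frame_labels` would then be paired with the acoustic feature vector of the same frame, giving the DNN inputs and targets that share one time unit.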

As indicated by the group of dashed arrows 401 in FIG. 4, the model learning unit 305 in the voice learning unit 301 of FIG. 3 trains the DNN of the acoustic model unit 306 by feeding it, frame by frame, pairs consisting of the phoneme sequence of the learning language feature sequence 313 (corresponding to FIG. 4(b)) and the learning acoustic feature sequence 314 (corresponding to FIG. 4(c)). As indicated by the group of gray circles in FIG. 4, the DNN in the acoustic model unit 306 includes groups of neurons forming an input layer, one or more intermediate layers, and an output layer.

During voice synthesis, on the other hand, the phoneme sequence of the language feature sequence 316 (corresponding to FIG. 4(b)) is input to the DNN of the acoustic model unit 306 frame by frame. As a result, the DNN of the acoustic model unit 306 outputs the acoustic feature sequence 317 frame by frame, as indicated by the group of thick solid arrows 402 in FIG. 4. In the vocalization model unit 308 as well, therefore, the sound source information 319 and the spectral information 318 contained in the acoustic feature sequence 317 are supplied frame by frame to the sound source generation unit 309 and the synthesis filter unit 310, respectively, and voice synthesis is performed.

As a result, the vocalization model unit 308 outputs the singing voice output data 217 frame by frame, for example 225 samples per frame, as indicated by the group of thick solid arrows 403 in FIG. 4. Because a frame is 5.1 msec long, one sample lasts 5.1 msec ÷ 225 ≈ 0.0227 msec, and the sampling frequency of the singing voice output data 217 is therefore 1/0.0227 ≈ 44 kHz (kilohertz).
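The arithmetic above can be checked in a few lines. The constant names are chosen for this example only; the frame width and the samples per frame are the values stated in the text.

```python
FRAME_MS = 5.1            # frame width in milliseconds, from the text
SAMPLES_PER_FRAME = 225   # samples output per frame, from the text

# Duration of one sample in milliseconds.
sample_period_ms = FRAME_MS / SAMPLES_PER_FRAME

# Samples per millisecond is numerically the sampling frequency in kHz.
sampling_rate_khz = 1.0 / sample_period_ms

print(f"sample period  = {sample_period_ms:.4f} msec")
print(f"sampling rate  = {sampling_rate_khz:.2f} kHz")
```

The result is slightly above 44.1 kHz, which suggests that the 5.1 msec frame width is itself a rounded figure: 225 samples at the common 44.1 kHz rate last about 5.102 msec.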

The DNN is trained using frame-level pairs of acoustic features and language features, according to the squared-error minimization criterion computed by equation (3) below.

$$\hat{\lambda} = \arg\min_{\lambda} \frac{1}{2} \sum_{t=1}^{T} \left\| \boldsymbol{o}_t - g_{\lambda}(\boldsymbol{l}_t) \right\|^2 \tag{3}$$

Here, $\boldsymbol{o}_t$ and $\boldsymbol{l}_t$ are the acoustic feature and the language feature in the $t$-th frame, respectively, $\hat{\lambda}$ is the model parameter of the DNN of the acoustic model unit 306, and $g_{\lambda}(\cdot)$ is the nonlinear transformation function represented by the DNN. The model parameters of the DNN can be estimated efficiently by error backpropagation. Taking into account the correspondence with the processing of the model learning unit 305 in the statistical voice synthesis expressed by equation (1) above, DNN training can be expressed as equation (4) below.

$$\hat{\lambda} = \arg\max_{\lambda} P(\boldsymbol{o} \mid \boldsymbol{l}, \lambda) = \arg\max_{\lambda} \prod_{t=1}^{T} \mathcal{N}\!\left(\boldsymbol{o}_t \mid \tilde{\boldsymbol{\mu}}_t, \tilde{\boldsymbol{\Sigma}}_t\right) \tag{4}$$

Here, equation (5) below holds.

$$\tilde{\boldsymbol{\mu}}_t = g_{\lambda}(\boldsymbol{l}_t) \tag{5}$$

As equations (4) and (5) show, the relationship between acoustic features and language features can be expressed by a normal distribution $\mathcal{N}(\boldsymbol{o}_t \mid \tilde{\boldsymbol{\mu}}_t, \tilde{\boldsymbol{\Sigma}}_t)$ whose mean vector is the output of the DNN. Statistical voice synthesis using a DNN normally uses a covariance matrix that does not depend on the language feature $\boldsymbol{l}_t$, that is, the same covariance matrix $\tilde{\boldsymbol{\Sigma}}_g$ in every frame. When the covariance matrix $\tilde{\boldsymbol{\Sigma}}_g$ is the identity matrix, equation (4) describes a training process equivalent to equation (3).
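The stated equivalence can be checked numerically: with an identity covariance, the negative log-likelihood of a Gaussian differs from the half squared error only by a constant that does not depend on the model output, so both criteria have the same minimizer. The sketch below uses plain Python lists; the function names and the sample values are made up for illustration.

```python
import math
from typing import Sequence

Frames = Sequence[Sequence[float]]


def half_squared_error(observed: Frames, predicted: Frames) -> float:
    """Squared-error criterion: 0.5 * sum over frames of ||o_t - mu_t||^2."""
    return 0.5 * sum(
        (o - m) ** 2
        for o_vec, m_vec in zip(observed, predicted)
        for o, m in zip(o_vec, m_vec)
    )


def gaussian_nll_identity(observed: Frames, predicted: Frames) -> float:
    """Negative log of the product of N(o_t | mu_t, I) over all frames."""
    total = 0.0
    for o_vec, m_vec in zip(observed, predicted):
        dim = len(o_vec)
        total += 0.5 * dim * math.log(2.0 * math.pi)  # normalizer, independent of mu_t
        total += 0.5 * sum((o - m) ** 2 for o, m in zip(o_vec, m_vec))
    return total


# Two frames of two-dimensional features (arbitrary example values).
observed = [[1.0, 2.0], [0.5, -1.0]]
predicted = [[0.0, 2.0], [1.0, 1.0]]

gap = gaussian_nll_identity(observed, predicted) - half_squared_error(observed, predicted)
```

Here `gap` equals the number of frames times half the feature dimension times log(2π), whatever values `predicted` holds, which is why maximizing the likelihood and minimizing the squared error select the same model parameters.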

As explained with reference to FIG. 4, the DNN of the acoustic model unit 306 estimates the acoustic feature sequence 317 independently for each frame. The resulting acoustic feature sequence 317 therefore contains discontinuities that degrade the quality of the synthesized voice. In the present embodiment, the quality of the synthesized voice can be improved by, for example, using a parameter generation algorithm that employs dynamic features.

The operation of the present embodiment, which has the example configurations of FIGS. 1, 2, and 3, is described in detail below. FIG. 5 is a diagram explaining the lyric control technique. FIG. 5(a) shows the relationship between the lyric text and the melody as they progress with the automatic performance. For example, at the start of the children's song "Kirakira Boshi" mentioned above, the song data contains the lyric characters (lyric information) "ki/Twin" (first character), "ra/kle" (second character), "ki/twin" (third character), and "ra/kle" (fourth character); timing information t1, t2, t3, and t4 at which the respective lyric characters are output; and pitch information giving the melody pitch of each lyric character, such as E4 (first pitch), E4 (second pitch), B4 (third pitch), and B4 (fourth pitch). The timings t5, t6, and t7 that follow t4 are associated with the lyric characters "hi/lit" (fifth character), "ka/tle" (sixth character), and "ru/star" (seventh character).

For example, the timings t1, t2, t3, and t4 in FIG. 5(b) correspond to the original utterance timings t1, t2, t3, and t4 in FIG. 5(a). Suppose that, at timings t1 and t2, which correspond to the original utterance timings, the performer correctly presses, twice, the key on the keyboard 101 of FIG. 1 for pitch E4, the same as the first pitch E4 contained in the song data. In this case, at timings t1 and t2 the CPU 201 of FIG. 2 outputs to the voice synthesis LSI 205 of FIG. 2 singing voice data 215 containing the lyrics "ki/Twin" (first character) and "ra/kle" (second character), information indicating the pitch E4 designated by the performer at each of timings t1 and t2, and information indicating a duration of, for example, a quarter note for each (obtained from at least one of the song data and the performer's performance). As a result, the voice synthesis LSI 205 outputs, at timings t1 and t2, quarter-note-length singing voice output data 217 corresponding to the lyrics "ki/Twin" (first character) and "ra/kle" (second character) at the first pitch (= designated pitch) E4 and the second pitch (= designated pitch) E4, respectively. The "○" marks for timings t1 and t2 indicate that the utterance was produced correctly in accordance with the pitch and lyric information contained in the song data.

Next, when the performer presses a key on the keyboard 101 of FIG. 1 to designate a pitch at one of the original utterance timings and the designated pitch does not match the pitch that should be designated, the following control is performed. The CPU 201 of FIG. 2 controls the progress of the lyrics and of the automatic accompaniment so that, at that timing, the singing voice for the character following the character associated with the pitch to be designated is not output. For example, suppose the pitch to be designated at a given timing is a first pitch and the pitch to be designated at the next timing is a second pitch. If a pitch other than the first pitch is designated at the given timing, the singing voice for the first character associated with the first pitch may or may not be output, but the singing voice for the second character associated with the second pitch is not output. Thereafter, when the pitch designated by the performer's key press matches the pitch that should be designated, the CPU 201 resumes the progress of the lyrics and of the automatic accompaniment.

For example, in FIG. 5(b), suppose the pitch G4 that the performer specifies by pressing a key of the keyboard 101 of FIG. 1 at timing t3, which corresponds to an original utterance timing, does not match the third pitch B4 that should be uttered at timing t3. In this case, the CPU 201 of FIG. 2 may output "ki/twin (third character)" at timing t3, but it controls the progress of the lyrics and of the automatic accompaniment so that the singing voice for "ra/kle (fourth character)" and subsequent characters is not output, where "ra/kle (fourth character)" corresponds to the fourth pitch to be specified after the third pitch B4 (at timing t4). Alternatively, the CPU 201 may perform control so that "ki/twin (third character)" is not sounded at timing t3 of FIG. 5(b). In that case, although a key was pressed at the timing to be specified, the key that should have been specified was not (a wrong key was pressed), so no singing voice is output.

While the progress of the lyrics and of the automatic accompaniment is being controlled as described above so that the singing voice is not output, if a pitch specified by the performer's key press matches the pitch that should be specified, the CPU 201 resumes the progress of the lyrics and of the automatic accompaniment. For example, after the progress of the lyrics and of the automatic accompaniment stops at timing t3 of FIG. 5(b), if the performer specifies, at timing t3', the pitch B4 matching the third pitch B4 that should be specified, the CPU 201 outputs the singing voice "i/in (third character')" based on the lyric information "ki/twin (third character)" corresponding to the third pitch B4 that should have been sounded at timing t3, and resumes the progress of the lyrics and of the automatic accompaniment.

When the progress of the lyrics and of the automatic accompaniment is resumed as described above, the utterance timing of the next lyric shifts accordingly. For example, the lyric information "ra/kle (fourth character)" is to be uttered after the singing voice "i/in (third character')", which is based on the lyric information "ki/twin (third character)" resumed at timing t3' of FIG. 5(b). Its utterance timing slides from the original utterance timing t4 to t4', by the same amount that the resumed utterance timing shifted from t3 to t3'.

Here, the case where the specified pitch does not match the pitch that should be specified can be regarded as including the case where no key was pressed at the timing to be specified. That is, although not shown in FIG. 5, even when no key of the keyboard 101 of FIG. 1 is pressed and no pitch is specified at one of the original utterance timings, the progress of the lyrics and of the automatic accompaniment is controlled so that the singing voice for the character (second character) following the character (first character) corresponding to the pitch to be uttered is not output.

Timing t3 in FIG. 5(c) illustrates the control operation that would result, assuming the above-described control of the present embodiment were not performed, when the pitch G4 specified by the performer's key press at timing t3, which corresponds to an original utterance timing, does not match the third pitch B4 that should be specified at timing t3. Without the control of the present embodiment, at timing t3 of FIG. 5(c), "ra/kle (fourth character)", which follows the "ki/twin (third character)" that should be uttered, is uttered instead, and unlike in the present embodiment the lyric progression becomes inappropriate. That is, "ra/kle (fourth character)", which should be sounded at timing t4, is sounded at timing t3, earlier than timing t4.

As described above, if the control operation of the present embodiment were not executed, then whenever the performer failed to specify, at an original utterance timing, the correct pitch matching the pitch that should be specified, the progress of the lyrics would fall out of step with that of the automatic accompaniment, and the performer would have to correct the lyric progression each time. In the present embodiment, by contrast, the progress of the lyrics and of the automatic accompaniment is stopped until the performer specifies the correct pitch matching the pitch that should be specified, so natural lyric progression that follows the performer's playing becomes possible.
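The on-timing behavior described above can be pictured as a small gate that only releases the next syllable when the expected pitch is pressed. The following is a minimal sketch under stated assumptions, not the patented implementation: the class name `LyricGate`, its fields, and the use of MIDI note numbers (E4 = 64, G4 = 67, B4 = 71) are all hypothetical, the fourth pitch is assumed to be B4 purely for illustration, and the sketch takes the variant in which nothing is sung on a mismatch.

```python
from dataclasses import dataclass, field


@dataclass
class LyricGate:
    """Toy model of on-timing pitch gating (all names are hypothetical)."""
    song: list                      # [(expected_pitch, syllable), ...]
    index: int = 0                  # position of the next syllable to sing
    paused: bool = False            # True while lyrics/accompaniment are held
    sung: list = field(default_factory=list)

    def on_timing_key(self, pitch):
        """Handle a key press at (or while waiting on) an utterance timing.

        Passing pitch=None models "no key pressed", which the text treats
        as a mismatch as well.
        """
        if self.index >= len(self.song):
            return None             # song finished, nothing left to sing
        expected, syllable = self.song[self.index]
        if pitch != expected:
            # Wrong (or missing) key: hold progression so that the *next*
            # syllable is never emitted early.
            self.paused = True
            return None
        # Correct key: sing the pending syllable and resume progression.
        self.paused = False
        self.sung.append((syllable, pitch))
        self.index += 1
        return syllable


# Usage mirroring FIG. 5(b): E4, E4, then a wrong G4, then the correct B4.
gate = LyricGate(song=[(64, "ki"), (64, "ra"), (71, "ki"), (71, "ra")])
gate.on_timing_key(64)   # t1: sings "ki"
gate.on_timing_key(64)   # t2: sings "ra"
gate.on_timing_key(67)   # t3: mismatch, progression is held
gate.on_timing_key(71)   # t3': match, sings "ki" and resumes
```

Because the gate advances only on a match, the shift from t3 to t3' automatically carries over to every later syllable, which corresponds to the sliding of t4 to t4' described earlier.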

Next, suppose the performer presses any key (operator) of the keyboard 101 of FIG. 1 at a timing at which none of the original utterance timings has arrived, and the pitch thereby specified does not match the pitch to be specified next. In this case, the CPU 201 of FIG. 2 outputs to the voice synthesis LSI 205 of FIG. 2 singing voice data 215 instructing that the pitch of the singing voice of the singing voice output data 217 being output by the voice synthesis LSI 205 be changed to the pitch specified by the performance operation. As a result, at that timing, at which none of the original utterance timings has arrived, the voice synthesis LSI 205 of FIG. 2 or FIG. 3 changes the pitch of the singing voice output data 217 being uttered to the pitch specified by the CPU 201.

For example, in FIG. 5(b), suppose the performer presses the key of pitch G4 on the keyboard 101 of FIG. 1 at timing t1', at which none of the original utterance timings t1, t2, t3, and t4 has arrived. In this case, the CPU 201 determines that the specified pitch G4 does not match the second pitch E4 corresponding to the lyric information "ra/kle (second character)" to be specified next (at timing t2). The CPU 201 therefore outputs to the voice synthesis LSI 205 of FIG. 2 singing voice data 215 instructing that the pitch of the singing voice output data 217 for the lyric information "ki/Twin (first character)" being output by the voice synthesis LSI 205 be changed from the first pitch E4 used so far to the pitch G4 specified by the performance operation, and that the utterance be continued. As a result, at timing t1', the voice synthesis LSI 205 of FIG. 2 or FIG. 3 changes the pitch of the singing voice output data 217 of "i/in (first character')", based on the lyric information "ki/Twin (first character)" being uttered, from the first pitch E4 to the pitch G4 specified by the CPU 201, and continues the utterance.

Timing t1' in FIG. 5(c) illustrates the control operation that would result, assuming the above-described control of the present embodiment were not performed, when the performer presses a key of the keyboard 101 of FIG. 1 at timing t1', which is not an original utterance timing. Without the control of the present embodiment, the lyric information "ra/kle (second character)", which should be uttered at the next utterance timing t2, is uttered at timing t1' of FIG. 5(c).

As described above, when the control operation of the present embodiment is not executed, the lyrics keep advancing even if the pitch specified by the performer at a timing other than an original utterance timing does not match the pitch to be specified, which sounds unnatural. In the present embodiment, by contrast, the pitch of the singing voice being uttered according to the singing voice output data 217 that started at the original timing immediately preceding that timing is changed to the pitch specified by the performer, and the utterance continues. In this case, for example, the singing voice output data 217 corresponding to the lyric information "ki/Twin (first character)" uttered at the original song playback timing t1 in FIG. 5(b) is heard without interruption, with its pitch changing continuously, at key-press timing t1', to the pitch of the new key press. The present embodiment can thereby achieve natural lyric progression.

Although not shown in FIG. 5, suppose the performer specifies, at a timing at which none of the original utterance timings has arrived, the same pitch as the pitch corresponding to the lyric information to be uttered next. In this case, the CPU 201 may perform control so as to advance (jump) the progress of the lyrics and of the automatic accompaniment at once to the timing of the singing voice to be uttered next.

Note that when the performer performs a performance operation at a timing other than an original utterance timing and the pitch specified at that time does not match the pitch to be specified at the next timing, the utterance based on the already-output singing voice output data 217 may be output again (with the pitch changed). In this case, for example, following the singing voice output data 217 corresponding to the lyric information "ki/Twin (first character)" uttered at the original song playback timing t1 of FIG. 5(b), the singing voice output data 217 corresponding to "ki/Twin (first character)" is heard as a separate utterance triggered by the new key press at key-press timing t1'. Alternatively, control may be performed so that the singing voice output data 217 is not uttered at timings other than the utterance timings.
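The off-timing cases discussed above reduce to one decision: compare the pressed pitch with the pitch expected at the next utterance timing. The sketch below is illustrative only. The names `Action` and `off_timing_action` and the flag `allow_jump` are hypothetical, pitches are MIDI note numbers, and falling back to a pitch change when jumping is disabled is an assumption of this sketch, since the text leaves that case open.

```python
from enum import Enum


class Action(Enum):
    REPITCH = "repitch"  # keep the current syllable sounding, change its pitch
    JUMP = "jump"        # advance lyrics and accompaniment to the next timing


def off_timing_action(next_expected_pitch, pressed_pitch, allow_jump=True):
    """Decide what a key press between utterance timings should do.

    Returns (action, pitch). allow_jump models the optional behaviour of
    jumping ahead when the performer anticipates the next expected pitch.
    """
    if allow_jump and pressed_pitch == next_expected_pitch:
        return Action.JUMP, pressed_pitch
    # Mismatch (or jumping disabled, an assumption of this sketch):
    # the syllable already sounding continues at the newly pressed pitch.
    return Action.REPITCH, pressed_pitch


# FIG. 5(b), timing t1': "ki" is sounding at E4 (64), next expected is E4,
# and the performer presses G4 (67), so the sounding syllable is repitched.
action, pitch = off_timing_action(next_expected_pitch=64, pressed_pitch=67)
```

The other variants mentioned in the text (re-uttering the current syllable, or staying silent) would be additional `Action` members chosen by configuration rather than by the pitch comparison.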

図６は、本実施形態において、図２のＲＯＭ２０２からＲＡＭ２０３に読み込まれる曲データのデータ構成例を示す図である。このデータ構成例は、ＭＩＤＩ（ＭｕｓｉｃａｌＩｎｓｔｒｕｍｅｎｔＤｉｇｉｔａｌＩｎｔｅｒｆａｃｅ）用ファイルフォーマットの一つであるスタンダードＭＩＤＩファイルのフォーマットに準拠している。この曲データは、チャンクと呼ばれるデータブロックから構成される。具体的には、曲データは、ファイルの先頭にあるヘッダチャンクと、それに続く歌詞パート用の歌詞データが格納されるトラックチャンク１と、伴奏パート用の演奏データが格納されるトラックチャンク２とから構成される。 FIG. 6 is a diagram showing a data configuration example of song data read from ROM 202 to RAM 203 in FIG. 2 in the present embodiment. This data structure example conforms to the standard MIDI file format, which is one of the MIDI (Musical Instrument Digital Interface) file formats. This song data is composed of data blocks called chunks. Specifically, the song data consists of a header chunk at the beginning of the file, a track chunk 1 in which the lyrics data for the following lyrics part is stored, and a track chunk 2 in which the performance data for the accompaniment part is stored. It is composed.

The header chunk consists of five values: ChunkID, ChunkSize, FormatType, NumberOfTrack, and TimeDivision. ChunkID is the 4-byte ASCII code "4D 54 68 64" (hexadecimal) corresponding to the four single-byte characters "MThd", which indicate a header chunk. ChunkSize is 4-byte data indicating the data length of the FormatType, NumberOfTrack, and TimeDivision portion of the header chunk, that is, excluding ChunkID and ChunkSize; this length is fixed at 6 bytes: "00 00 00 06" (hexadecimal). In the present embodiment, FormatType is the 2-byte data "00 01" (hexadecimal), meaning format 1, which uses multiple tracks. In the present embodiment, NumberOfTrack is the 2-byte data "00 02" (hexadecimal), indicating that two tracks are used, corresponding to the lyrics part and the accompaniment part. TimeDivision is data indicating the time base value, that is, the resolution per quarter note; in the present embodiment it is the 2-byte data "01 E0" (hexadecimal), representing 480 in decimal.
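As a concrete check of the byte layout just described, the 14 header bytes can be unpacked in one call. This is a minimal sketch using Python's standard `struct` module; the function name `parse_header_chunk` and the returned dictionary are illustrative choices, not part of the described instrument.

```python
import struct


def parse_header_chunk(data: bytes) -> dict:
    """Unpack the 14-byte header chunk described above (big-endian fields)."""
    # 4-byte ID, 4-byte size, then three 2-byte values.
    chunk_id, chunk_size, format_type, number_of_track, time_division = (
        struct.unpack(">4sIHHH", data[:14])
    )
    if chunk_id != b"MThd":
        raise ValueError("not a header chunk")
    if chunk_size != 6:
        raise ValueError("unexpected header chunk size")
    return {
        "FormatType": format_type,
        "NumberOfTrack": number_of_track,
        "TimeDivision": time_division,
    }


# The exact byte values quoted in the text.
header_bytes = bytes.fromhex("4D 54 68 64 00 00 00 06 00 01 00 02 01 E0")
header = parse_header_chunk(header_bytes)
# header["FormatType"] is 1, header["NumberOfTrack"] is 2,
# and header["TimeDivision"] is 480 (hex 01 E0).
```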

Track chunks 1 and 2 each consist of ChunkID, ChunkSize, and performance data pairs made up of DeltaTime_1[i] and Event_1[i] (for track chunk 1, the lyrics part) or DeltaTime_2[i] and Event_2[i] (for track chunk 2, the accompaniment part), where 0 ≦ i ≦ L for track chunk 1 (lyrics part) and 0 ≦ i ≦ M for track chunk 2 (accompaniment part). ChunkID is the 4-byte ASCII code "4D 54 72 6B" (hexadecimal) corresponding to the four single-byte characters "MTrk", which indicate a track chunk. ChunkSize is 4-byte data indicating the data length of each track chunk excluding ChunkID and ChunkSize.

DeltaTime_1[i] is variable-length data of 1 to 4 bytes indicating the waiting time (relative time) from the execution time of the immediately preceding Event_1[i-1]. Similarly, DeltaTime_2[i] is variable-length data of 1 to 4 bytes indicating the waiting time (relative time) from the execution time of the immediately preceding Event_2[i-1]. Event_1[i] is a meta event that specifies the utterance timing and pitch of a lyric in track chunk 1 (the lyrics part). Event_2[i] is a MIDI event that specifies note-on or note-off, or a meta event that specifies the time signature, in track chunk 2 (the accompaniment part). For track chunk 1 (the lyrics part), the lyric utterance progresses as, for each performance data pair DeltaTime_1[i] and Event_1[i], Event_1[i] is executed after waiting DeltaTime_1[i] from the execution time of the immediately preceding Event_1[i-1]. Likewise, for track chunk 2 (the accompaniment part), the automatic accompaniment progresses as, for each performance data pair DeltaTime_2[i] and Event_2[i], Event_2[i] is executed after waiting DeltaTime_2[i] from the execution time of the immediately preceding Event_2[i-1].
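The 1-to-4-byte variable-length waiting time can be decoded as follows. This sketch assumes the usual Standard MIDI File encoding, in which each byte carries 7 data bits and a set top bit means another byte follows; the text only states that the format conforms to the Standard MIDI File format, so the encoding detail and the function name `read_delta_time` are assumptions of this example.

```python
def read_delta_time(data: bytes, pos: int = 0):
    """Decode one variable-length delta time starting at data[pos].

    Returns (value_in_ticks, position_of_next_byte).
    """
    value = 0
    for n in range(4):                       # at most 4 bytes, as in the text
        byte = data[pos + n]
        value = (value << 7) | (byte & 0x7F)  # append the low 7 bits
        if byte & 0x80 == 0:                 # top bit clear: last byte
            return value, pos + n + 1
    raise ValueError("delta time longer than 4 bytes")


# With a time base of 480, a quarter-note wait is stored as two bytes:
# 480 = (3 << 7) + 96, so the bytes are 0x83 (continuation bit set) and 0x60.
ticks, next_pos = read_delta_time(bytes([0x83, 0x60]))
```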

図７は、本実施形態における電子楽器の制御処理例を示すメインフローチャートである。この制御処理は例えば、図２のＣＰＵ２０１が、ＲＯＭ２０２からＲＡＭ２０３にロードされた制御処理プログラムを実行する動作である。 FIG. 7 is a main flowchart showing an example of control processing of an electronic musical instrument according to the present embodiment. This control process is, for example, an operation in which the CPU 201 of FIG. 2 executes a control process program loaded from the ROM 202 into the RAM 203.

ＣＰＵ２０１は、まず初期化処理を実行した後（ステップＳ７０１）、ステップＳ７０２からＳ７０８の一連の処理を繰り返し実行する。 The CPU 201 first executes the initialization process (step S701), and then repeatedly executes a series of processes from steps S702 to S708.

この繰返し処理において、ＣＰＵ２０１はまず、スイッチ処理を実行する（ステップＳ７０２）。ここでは、ＣＰＵ２０１は、図２のキースキャナ２０６からの割込みに基づいて、図１の第１のスイッチパネル１０２又は第２のスイッチパネル１０３のスイッチ操作に対応する処理を実行する。 In this iterative process, the CPU 201 first executes the switch process (step S702). Here, the CPU 201 executes a process corresponding to the switch operation of the first switch panel 102 or the second switch panel 103 of FIG. 1 based on the interrupt from the key scanner 206 of FIG.

Next, the CPU 201 executes keyboard processing, in which it determines, based on an interrupt from the key scanner 206 of FIG. 2, whether any key of the keyboard 101 of FIG. 1 has been operated, and processes the result (step S703). Here, in response to the performer pressing or releasing any key, the CPU 201 outputs to the sound source LSI 204 of FIG. 2 sound control data 216 instructing it to start or stop sounding.

次に、ＣＰＵ２０１は、図１のＬＣＤ１０４に表示すべきデータを処理し、そのデータを、図２のＬＣＤコントローラ２０８を介してＬＣＤ１０４に表示する表示処理を実行する（ステップＳ７０４）。ＬＣＤ１０４に表示されるデータとしては、例えば演奏される歌声音声出力データ２１７に対応する歌詞とその歌詞に対応するメロディの楽譜や、各種設定情報がある（後述する図１３及び図１４を参照）。 Next, the CPU 201 processes data to be displayed on the LCD 104 of FIG. 1, and executes a display process of displaying the data on the LCD 104 via the LCD controller 208 of FIG. 2 (step S704). The data displayed on the LCD 104 includes, for example, the lyrics corresponding to the singing voice output data 217 to be played, the score of the melody corresponding to the lyrics, and various setting information (see FIGS. 13 and 14 described later).

次に、ＣＰＵ２０１は、ソング再生処理を実行する（ステップＳ７０５）。この処理においては、ＣＰＵ２０１が、演奏者の演奏に基づいて図５で説明した制御処理を実行し、歌声データ２１５を生成して音声合成ＬＳＩ２０５に出力する。 Next, the CPU 201 executes the song playback process (step S705). In this process, the CPU 201 executes the control process described with reference to FIG. 5 based on the performance of the performer, generates singing voice data 215, and outputs the singing voice data 215 to the speech synthesis LSI 205.

続いて、ＣＰＵ２０１は、音源処理を実行する（ステップＳ７０６）。音源処理において、ＣＰＵ２０１は、音源ＬＳＩ２０４における発音中の楽音のエンベロープ制御等の制御処理を実行する。 Subsequently, the CPU 201 executes sound source processing (step S706). In the sound source processing, the CPU 201 executes control processing such as envelope control of the musical sound being sounded in the sound source LSI 204.

続いて、ＣＰＵ２０１は、音声合成処理を実行する（ステップＳ７０７）。音声合成処理において、ＣＰＵ２０１は、音声合成ＬＳＩ２０５による音声合成の実行を制御する。 Subsequently, the CPU 201 executes the speech synthesis process (step S707). In the speech synthesis process, the CPU 201 controls the execution of speech synthesis by the speech synthesis LSI 205.

最後にＣＰＵ２０１は、演奏者が特には図示しないパワーオフスイッチを押してパワーオフしたか否かを判定する（ステップＳ７０８）。ステップＳ７０８の判定がＮＯならば、ＣＰＵ２０１は、ステップＳ７０２の処理に戻る。ステップＳ７０８の判定がＹＥＳならば、ＣＰＵ２０１は、図７のフローチャートで示される制御処理を終了し、電子鍵盤楽器１００の電源を切る。 Finally, the CPU 201 determines whether or not the performer has pressed a power-off switch (not shown) to power off (step S708). If the determination in step S708 is NO, the CPU 201 returns to the process in step S702. If the determination in step S708 is YES, the CPU 201 ends the control process shown in the flowchart of FIG. 7, and turns off the power of the electronic keyboard instrument 100.
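The main flow of FIG. 7 is a one-time initialization followed by a fixed sequence of processing steps repeated until power-off. The sketch below only mirrors that ordering. The handler names, the `STEPS` tuple, and the `power_off_requested` callback are hypothetical stand-ins for firmware routines that the text describes but does not list as code.

```python
# Order of the per-iteration steps S702 to S707 (names are illustrative).
STEPS = (
    "switch",           # S702
    "keyboard",         # S703
    "display",          # S704
    "song_playback",    # S705
    "sound_source",     # S706
    "voice_synthesis",  # S707
)


def run_main_loop(handlers, power_off_requested):
    """Run init once (S701), then repeat S702..S707 until power-off (S708)."""
    handlers["init"]()
    while True:
        for step in STEPS:
            handlers[step]()
        if power_off_requested():   # S708: YES ends the control process
            break


# Usage: record the call order for two iterations, then "power off".
calls = []
handlers = {name: (lambda name=name: calls.append(name))
            for name in ("init",) + STEPS}
answers = iter([False, True])
run_main_loop(handlers, lambda: next(answers))
```

Keeping the step order in a single tuple makes it easy to see that song playback (S705) runs before the sound source and voice synthesis steps in every pass.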

FIGS. 8(a), (b), and (c) are flowcharts showing detailed examples of, respectively, the initialization process of step S701 of FIG. 7, the tempo change process of step S902 of FIG. 9 (described later) within the switch process of step S702 of FIG. 7, and the song start process of step S906 of FIG. 9, also within that switch process.

まず、図７のステップＳ７０１の初期化処理の詳細例を示す図８（ａ）において、ＣＰＵ２０１は、ＴｉｃｋＴｉｍｅの初期化処理を実行する。本実施形態において、歌詞の進行及び自動伴奏は、ＴｉｃｋＴｉｍｅという時間を単位として進行する。図６の曲データのヘッダチャンク内のＴｉｍｅＤｉｖｉｓｉｏｎ値として指定されるタイムベース値は４分音符の分解能を示しており、この値が例えば４８０ならば、４分音符は４８０ＴｉｃｋＴｉｍｅの時間長を有する。また、図６の曲データのトラックチャンク内の待ち時間ＤｅｌｔａＴｉｍｅ＿１［ｉ］値及びＤｅｌｔａＴｉｍｅ＿２［ｉ］値も、ＴｉｃｋＴｉｍｅの時間単位によりカウントされる。ここで、１ＴｉｃｋＴｉｍｅが実際に何秒になるかは、曲データに対して指定されるテンポによって異なる。今、テンポ値をＴｅｍｐｏ［ビート／分］、上記タイムベース値をＴｉｍｅＤｉｖｉｓｉｏｎとすれば、ＴｉｃｋＴｉｍｅの秒数は、次式により算出される。 First, in FIG. 8A showing a detailed example of the initialization process of step S701 of FIG. 7, the CPU 201 executes the ticktime initialization process. In the present embodiment, the progress of the lyrics and the automatic accompaniment proceed in units of time called TickTime. The timebase value specified as the TimeDivision value in the header chunk of the song data of FIG. 6 indicates the resolution of the quarter note, and if this value is, for example, 480, the quarter note has a time length of 480TickTime. Further, the waiting time DeltaTime_1 [i] value and the DeltaTime_2 [i] value in the track chunk of the song data of FIG. 6 are also counted by the time unit of TickTime. Here, how many seconds 1 Tick Time actually becomes depends on the tempo specified for the song data. If the tempo value is Tempo [beat / minute] and the time base value is Time Division, the number of seconds of Tick Time is calculated by the following equation.

TickTime [seconds] = 60 / Tempo / TimeDivision    (6)

Accordingly, in the initialization process illustrated in the flowchart of FIG. 8(a), the CPU 201 first calculates TickTime [seconds] by arithmetic processing corresponding to equation (6) above (step S801). As the tempo value Tempo, a predetermined value, for example 60 [beats/minute], is assumed to be stored in the ROM 202 of FIG. 2 in the initial state. Alternatively, the tempo value at the time of the previous shutdown may be stored in a non-volatile memory.
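Equation (6) can be evaluated directly. The helper below is a small sketch with a hypothetical function name, taking the tempo in beats per minute as defined for equation (6) and the time base as ticks per quarter note.

```python
def tick_time_seconds(tempo_bpm: float, time_division: int) -> float:
    """Equation (6): seconds per tick = 60 / Tempo / TimeDivision."""
    if tempo_bpm <= 0 or time_division <= 0:
        raise ValueError("tempo and time division must be positive")
    return 60.0 / tempo_bpm / time_division


# At 60 beats per minute with a time base of 480, one quarter note lasts
# one second, so one tick lasts 1/480 s (about 2.08 ms).
default_tick = tick_time_seconds(60, 480)

# Doubling the tempo halves the tick period (about 1.04 ms).
fast_tick = tick_time_seconds(120, 480)
```

The same calculation applies whenever the tempo changes, which is why the tempo change process recomputes TickTime and reprograms the timer.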

次に、ＣＰＵ２０１は、図２のタイマ２１０に対して、ステップＳ８０１で算出したＴｉｃｋＴｉｍｅ［秒］によるタイマ割込みを設定する（ステップＳ８０２）。この結果、タイマ２１０において上記ＴｉｃｋＴｉｍｅ［秒］が経過する毎に、ＣＰＵ２０１に対して歌詞進行及び自動伴奏のための割込み（以下「自動演奏割込み」と記載）が発生する。従って、この自動演奏割込みに基づいてＣＰＵ２０１で実行される自動演奏割込み処理（後述する図１０）では、１ＴｉｃｋＴｉｍｅ毎に歌詞進行及び自動伴奏を進行させる制御処理が実行されることになる。 Next, the CPU 201 sets a timer interrupt according to the TickTime [seconds] calculated in step S801 with respect to the timer 210 of FIG. 2 (step S802). As a result, every time the TickTime [seconds] elapses in the timer 210, an interrupt for lyrics progression and automatic accompaniment (hereinafter referred to as “automatic performance interrupt”) is generated for the CPU 201. Therefore, in the automatic performance interrupt process (FIG. 10 described later) executed by the CPU 201 based on this automatic performance interrupt, the control process for advancing the lyrics progress and the automatic accompaniment is executed for each TickTime.

続いて、ＣＰＵ２０１は、図２のＲＡＭ２０３の初期化等のその他初期化処理を実行する（ステップＳ８０３）。その後、ＣＰＵ２０１は、図８（ａ）のフローチャートで例示される図７のステップＳ７０１の初期化処理を終了する。 Subsequently, the CPU 201 executes other initialization processing such as initialization of the RAM 203 of FIG. 2 (step S803). After that, the CPU 201 ends the initialization process of step S701 of FIG. 7 illustrated by the flowchart of FIG. 8A.

図８（ｂ）及び（ｃ）のフローチャートについては、後述する。図９は、図７のステップＳ７０２のスイッチ処理の詳細例を示すフローチャートである。 The flowcharts of FIGS. 8B and 8C will be described later. FIG. 9 is a flowchart showing a detailed example of the switch processing in step S702 of FIG.

ＣＰＵ２０１はまず、図１の第１のスイッチパネル１０２内のテンポ変更スイッチにより歌詞進行及び自動伴奏のテンポが変更されたか否かを判定する（ステップＳ９０１）。その判定がＹＥＳならば、ＣＰＵ２０１は、テンポ変更処理を実行する（ステップＳ９０２）。この処理の詳細は、図８（ｂ）を用いて後述する。ステップＳ９０１の判定がＮＯならば、ＣＰＵ２０１は、ステップＳ９０２の処理はスキップする。 First, the CPU 201 determines whether or not the tempo of the lyrics progression and the automatic accompaniment has been changed by the tempo change switch in the first switch panel 102 of FIG. 1 (step S901). If the determination is YES, the CPU 201 executes the tempo change process (step S902). Details of this process will be described later with reference to FIG. 8 (b). If the determination in step S901 is NO, the CPU 201 skips the process in step S902.

次に、ＣＰＵ２０１は、図１の第２のスイッチパネル１０３において何れかのソング曲が選曲されたか否かを判定する（ステップＳ９０３）。その判定がＹＥＳならば、ＣＰＵ２０１は、ソング曲読込み処理を実行する（ステップＳ９０４）。この処理は、図６で説明したデータ構造を有する曲データを、図２のＲＯＭ２０２からＲＡＭ２０３に読み込む処理である。これ以降、図６に例示されるデータ構造内のトラックチャンク１又は２に対するデータアクセスは、ＲＡＭ２０３に読み込まれた曲データに対して実行される。ステップＳ９０３の判定がＮＯならば、ＣＰＵ２０１は、ステップＳ９０４の処理はスキップする。 Next, the CPU 201 determines whether or not any song song has been selected on the second switch panel 103 of FIG. 1 (step S903). If the determination is YES, the CPU 201 executes the song song reading process (step S904). This process is a process of reading the song data having the data structure described in FIG. 6 from the ROM 202 of FIG. 2 into the RAM 203. From then on, data access to track chunks 1 or 2 in the data structure illustrated in FIG. 6 is performed on the song data read into RAM 203. If the determination in step S903 is NO, the CPU 201 skips the process in step S904.

続いて、ＣＰＵ２０１は、図１の第１のスイッチパネル１０２においてソング開始スイッチが操作されたか否かを判定する（ステップＳ９０５）。その判定がＹＥＳならば、ＣＰＵ２０１は、ソング開始処理を実行する（ステップＳ９０６）。この処理の詳細は、図８（ｃ）を用いて後述する。ステップＳ９０５の判定がＮＯならば、ＣＰＵ２０１は、ステップＳ９０６の処理はスキップする。 Subsequently, the CPU 201 determines whether or not the song start switch has been operated on the first switch panel 102 of FIG. 1 (step S905). If the determination is YES, the CPU 201 executes the song start process (step S906). Details of this process will be described later with reference to FIG. 8 (c). If the determination in step S905 is NO, the CPU 201 skips the process in step S906.

Finally, the CPU 201 determines whether any other switch has been operated on the first switch panel 102 or the second switch panel 103 of FIG. 1, and executes the process corresponding to each switch operation (step S907). The CPU 201 then ends the switch process of step S702 of FIG. 7, illustrated in the flowchart of FIG. 9.
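The switch process of FIG. 9 is a chain of independent checks, each followed by an optional action. A minimal sketch of that chain follows; the switch names, the `handlers` dictionary, and the returned list of executed step labels are hypothetical conveniences for illustration, not identifiers from the text.

```python
def switch_process(pressed: set, handlers: dict) -> list:
    """Run the checks S901, S903, S905 in order, then the catch-all S907.

    pressed holds the names of switches operated since the last pass.
    Returns the labels of the steps that actually executed.
    """
    executed = []
    if "tempo" in pressed:            # S901
        handlers["tempo_change"]()    # S902
        executed.append("S902")
    if "song_select" in pressed:      # S903
        handlers["load_song"]()       # S904
        executed.append("S904")
    if "song_start" in pressed:       # S905
        handlers["song_start"]()      # S906
        executed.append("S906")
    # S907: everything else is handled in one place.
    handlers["other"](pressed - {"tempo", "song_select", "song_start"})
    executed.append("S907")
    return executed


# Usage: a song is selected and started in the same pass.
log = []
handlers = {
    "tempo_change": lambda: log.append("tempo_change"),
    "load_song": lambda: log.append("load_song"),
    "song_start": lambda: log.append("song_start"),
    "other": lambda rest: log.append(("other", sorted(rest))),
}
steps = switch_process({"song_select", "song_start", "transpose"}, handlers)
```

Each check is skipped independently when its switch was not operated, matching the "if NO, skip" branches of the flowchart.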

図８（ｂ）は、図９のステップＳ９０２のテンポ変更処理の詳細例を示すフローチャートである。前述したように、テンポ値が変更されるとＴｉｃｋＴｉｍｅ［秒］も変更になる。図８（ｂ）のフローチャートでは、ＣＰＵ２０１は、このＴｉｃｋＴｉｍｅ［秒］の変更に関する制御処理を実行する。 FIG. 8B is a flowchart showing a detailed example of the tempo change process in step S902 of FIG. As mentioned above, when the tempo value is changed, the TickTime [seconds] is also changed. In the flowchart of FIG. 8B, the CPU 201 executes the control process related to the change of the TickTime [seconds].

First, in the same manner as in step S801 of FIG. 8(a), which is executed in the initialization process of step S701 of FIG. 7, the CPU 201 calculates TickTime [seconds] by arithmetic processing corresponding to equation (6) above (step S811). The tempo value Tempo is assumed to be the value after the change made with the tempo change switch in the first switch panel 102 of FIG. 1, stored in the RAM 203 or the like.

Next, in the same manner as in step S802 of FIG. 8(a), which is executed in the initialization process of step S701 of FIG. 7, the CPU 201 sets, in the timer 210 of FIG. 2, a timer interrupt with the period TickTime [seconds] calculated in step S811 (step S812). The CPU 201 then ends the tempo change process of step S902 of FIG. 9, illustrated in the flowchart of FIG. 8(b).

図８（ｃ）は、図９のステップＳ９０６のソング開始処理の詳細例を示すフローチャートである。 FIG. 8C is a flowchart showing a detailed example of the song start process of step S906 of FIG.

First, the CPU 201 initializes to 0 both of the variables DeltaT_1 (track chunk 1) and DeltaT_2 (track chunk 2) on the RAM 203, which count, in units of TickTime, the relative time elapsed since the occurrence of the immediately preceding event during automatic performance. Next, the CPU 201 initializes to 0 both the variable AutoIndex_1 on the RAM 203, which designates the index i of the performance data pairs DeltaTime_1[i] and Event_1[i] (1 ≦ i ≦ L-1) in track chunk 1 of the song data illustrated in FIG. 6, and the variable AutoIndex_2 on the RAM 203, which likewise designates the index i of the performance data pairs DeltaTime_2[i] and Event_2[i] (1 ≦ i ≦ M-1) in track chunk 2 (the above is step S821). Consequently, in the example of FIG. 6, the first performance data pair DeltaTime_1[0] and Event_1[0] in track chunk 1 and the first performance data pair DeltaTime_2[0] and Event_2[0] in track chunk 2 are referenced in the initial state.

次に、ＣＰＵ２０１は、現在のソング位置を指示するＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘの値を０に初期設定する（ステップＳ８２２）。 Next, the CPU 201 initially sets the value of the variable SongIndex on the RAM 203 that indicates the current song position to 0 (step S822).

更に、ＣＰＵ２０１は、歌詞及び伴奏の進行をするか（＝１）しないか（＝０）を示すＲＡＭ２０３上の変数ＳｏｎｇＳｔａｒｔの値を１（進行する）に初期設定する（ステップＳ８２３）。 Further, the CPU 201 initializes the value of the variable SongStart on the RAM 203 indicating whether the lyrics and accompaniment progress (= 1) or not (= 0) to 1 (progress) (step S823).

その後、ＣＰＵ２０１は、演奏者が、図１の第１のスイッチパネル１０２により歌詞の再生に合わせて伴奏の再生を行う設定を行っているか否かを判定する（ステップＳ８２４）。 After that, the CPU 201 determines whether or not the performer has set the first switch panel 102 of FIG. 1 to reproduce the accompaniment in accordance with the reproduction of the lyrics (step S824).

ステップＳ８２４の判定がＹＥＳならば、ＣＰＵ２０１は、ＲＡＭ２０３上の変数Ｂａｎｓｏｕの値を１（伴奏有り）に設定する（ステップＳ８２５）。逆に、ステップＳ８２４の判定がＮＯならば、ＣＰＵ２０１は、変数Ｂａｎｓｏｕの値を０（伴奏無し）に設定する（ステップＳ８２６）。ステップＳ８２５又はＳ８２６の処理の後、ＣＰＵ２０１は、図８（ｃ）のフローチャートで示される図９のステップＳ９０６のソング開始処理を終了する。 If the determination in step S824 is YES, the CPU 201 sets the value of the variable Bansou on the RAM 203 to 1 (with accompaniment) (step S825). On the contrary, if the determination in step S824 is NO, the CPU 201 sets the value of the variable Bansou to 0 (no accompaniment) (step S826). After the process of step S825 or S826, the CPU 201 ends the song start process of step S906 of FIG. 9 shown in the flowchart of FIG. 8 (c).
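The song start process of steps S821 to S826 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PlaybackState` class, its field names, and the `accompaniment_enabled` argument (standing in for the setting on the first switch panel 102) are assumptions introduced here, and the Null value of SongIndex is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlaybackState:
    """Hypothetical container for the RAM 203 variables named in the text."""
    delta_t_1: int = 0                 # DeltaT_1
    delta_t_2: int = 0                 # DeltaT_2
    auto_index_1: int = 0              # AutoIndex_1
    auto_index_2: int = 0              # AutoIndex_2
    song_index: Optional[int] = None   # SongIndex (None stands in for Null)
    song_start: int = 0                # SongStart
    bansou: int = 0                    # Bansou


def song_start(state: PlaybackState, accompaniment_enabled: bool) -> None:
    # Step S821: reset the tick counters and event indices of both tracks.
    state.delta_t_1 = 0
    state.delta_t_2 = 0
    state.auto_index_1 = 0
    state.auto_index_2 = 0
    # Step S822: the current song position starts at 0.
    state.song_index = 0
    # Step S823: lyrics and accompaniment are allowed to progress.
    state.song_start = 1
    # Steps S824 to S826: record whether accompaniment is played back.
    state.bansou = 1 if accompaniment_enabled else 0
```

Keeping all counters in one object makes the later save and restore of the progress position a matter of copying four fields.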

FIG. 10 is a flowchart showing a detailed example of the automatic performance interrupt process, which is executed based on the interrupt generated every TickTime [seconds] by the timer 210 of FIG. 2 (see step S802 of FIG. 8(a) or step S812 of FIG. 8(b)). The following processing is executed on the performance data sets of track chunks 1 and 2 of the music data illustrated in FIG. 6.

まず、ＣＰＵ２０１は、トラックチャンク１に対応する一連の処理（ステップＳ１００１からＳ１００６）を実行する。始めにＣＰＵ２０１は、ＳｏｎｇＳｔａｒｔ値が１であるか否か、即ち歌詞及び伴奏の進行が指示されているか否かを判定する（ステップＳ１００１）。 First, the CPU 201 executes a series of processes (steps S1001 to S1006) corresponding to the track chunk 1. First, the CPU 201 determines whether or not the SongStart value is 1, that is, whether or not the progress of the lyrics and accompaniment is instructed (step S1001).

When the CPU 201 determines that progress of the lyrics and the accompaniment is not instructed (the determination in step S1001 is NO), it ends the automatic performance interrupt process illustrated in the flowchart of FIG. 10 without advancing the lyrics or the accompaniment.

When the CPU 201 determines that progress of the lyrics and the accompaniment is instructed (the determination in step S1001 is YES), it determines whether the DeltaT_1 value, which indicates the relative time since the previous event of track chunk 1, matches the wait time DeltaTime_1[AutoIndex_1] of the performance data set that is about to be executed, as indicated by the AutoIndex_1 value (step S1002).

ステップＳ１００２の判定がＮＯならば、ＣＰＵ２０１は、トラックチャンク１に関して、前回のイベントの発生時刻からの相対時刻を示すＤｅｌｔａＴ＿１値を＋１インクリメントさせて、今回の割込みに対応する１ＴｉｃｋＴｉｍｅ単位分だけ時刻を進行させる（ステップＳ１００３）。その後、ＣＰＵ２０１は、後述するステップＳ１００７に移行する。 If the determination in step S1002 is NO, the CPU 201 increments the DeltaT_1 value indicating the relative time from the occurrence time of the previous event by +1 with respect to the track chunk 1, and advances the time by 1 TickTime unit corresponding to the current interrupt. (Step S1003). After that, the CPU 201 shifts to step S1007, which will be described later.

When the determination in step S1002 becomes YES, the CPU 201 executes, for track chunk 1, the event Event_1[AutoIndex_1] of the performance data set indicated by the AutoIndex_1 value (step S1004). This event is a song event that includes lyrics data.

続いて、ＣＰＵ２０１は、トラックチャンク１内の次に実行すべきソングイベントの位置を示すＡｕｔｏＩｎｄｅｘ＿１値を、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘに格納する（ステップＳ１００４）。 Subsequently, the CPU 201 stores the AutoIndex_1 value indicating the position of the next song event to be executed in the track chunk 1 in the variable SongIndex on the RAM 203 (step S1004).

更に、ＣＰＵ２０１は、トラックチャンク１内の演奏データ組を参照するためのＡｕｔｏＩｎｄｅｘ＿１値を＋１インクリメントする（ステップＳ１００５）。 Further, the CPU 201 increments the AutoIndex_1 value for referencing the performance data set in the track chunk 1 by +1 (step S1005).

また、ＣＰＵ２０１は、トラックチャンク１に関して今回参照したソングイベントの発生時刻からの相対時刻を示すＤｅｌｔａＴ＿１値を０にリセットする（ステップＳ１００６）。その後、ＣＰＵ２０１は、ステップＳ１００７の処理に移行する。 Further, the CPU 201 resets the DeltaT_1 value indicating the relative time from the occurrence time of the song event referred to this time with respect to the track chunk 1 to 0 (step S1006). After that, the CPU 201 shifts to the process of step S1007.
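The per-tick handling of track chunk 1 (steps S1001 to S1006) can be sketched as below. This is an illustrative model only: `Track1State`, `tick_track1`, and the boolean return value are hypothetical names, the Null value is modeled as `None`, and the end-of-track guard is an added assumption that the text does not describe.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Track1State:
    """Hypothetical subset of the RAM 203 variables used for track chunk 1."""
    delta_t_1: int = 0
    auto_index_1: int = 0
    song_index: Optional[int] = None
    song_start: int = 1


def tick_track1(state: Track1State, delta_time_1: List[int]) -> bool:
    """Run one timer tick for track chunk 1.

    Returns False when progress is stopped (S1001 is NO) and the whole
    interrupt handler should end; returns True when the handler should
    continue with the track chunk 2 processing (S1007).
    """
    if state.song_start != 1:                       # S1001: NO
        return False
    if state.auto_index_1 >= len(delta_time_1):     # assumption: end of track
        return True
    if state.delta_t_1 != delta_time_1[state.auto_index_1]:  # S1002: NO
        state.delta_t_1 += 1                        # S1003: advance one tick
        return True
    # S1004: the song event is due; remember its position in SongIndex.
    state.song_index = state.auto_index_1
    state.auto_index_1 += 1                         # S1005
    state.delta_t_1 = 0                             # S1006
    return True
```

Because the handler only records the due song event in `song_index`, the decision whether to actually vocalize it is left to the song playback process that reads that variable.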

次に、ＣＰＵ２０１は、トラックチャンク２に対応する一連の処理（ステップＳ１００７からＳ１０１３）を実行する。始めにＣＰＵ２０１は、トラックチャンク２に関する前回のイベントの発生時刻からの相対時刻を示すＤｅｌｔａＴ＿２値が、ＡｕｔｏＩｎｄｅｘ＿２値が示すこれから実行しようとする演奏データ組の待ち時間ＤｅｌｔａＴｉｍｅ＿２［ＡｕｔｏＩｎｄｅｘ＿２］に一致したか否かを判定する（ステップＳ１００７）。 Next, the CPU 201 executes a series of processes (steps S1007 to S1013) corresponding to the track chunk 2. First, the CPU 201 determines whether or not the DeltaT_2 value indicating the relative time from the occurrence time of the previous event regarding the track chunk 2 matches the waiting time DeltaTime_2 [AutoIndex_2] of the performance data set to be executed, which is indicated by the AutoIndex_2 value. Is determined (step S1007).

If the determination in step S1007 is NO, the CPU 201 increments by 1 the DeltaT_2 value, which indicates the relative time since the previous event of track chunk 2, thereby advancing the time by the one TickTime unit that corresponds to the current interrupt (step S1008). The CPU 201 then ends the automatic performance interrupt process shown in the flowchart of FIG. 10.

If the determination in step S1007 is YES, the CPU 201 determines whether the value of the variable Bansou on the RAM 203, which instructs accompaniment playback, is 1 (with accompaniment) (step S1009) (see steps S824 to S826 of FIG. 8(c)).

If the determination in step S1009 is YES, the CPU 201 executes the accompaniment event Event_2[AutoIndex_2] of track chunk 2 indicated by the AutoIndex_2 value (step S1010). If the event Event_2[AutoIndex_2] executed here is, for example, a note-on event, a command to sound an accompaniment musical tone is issued to the sound source LSI 204 of FIG. 2 with the key number and velocity specified by that note-on event. If, on the other hand, the event Event_2[AutoIndex_2] is, for example, a note-off event, a command to silence the accompaniment musical tone currently sounding is issued to the sound source LSI 204 of FIG. 2 with the key number and velocity specified by that note-off event.

一方、ステップＳ１００９の判定がＮＯならば、ＣＰＵ２０１は、ステップＳ１０１０をスキップすることにより、今回の伴奏に関するイベントＥｖｅｎｔ＿２［ＡｕｔｏＩｎｄｅｘ＿２］は実行せずに、歌詞に同期した進行のために、次のステップＳ１０１１の処理に進んで、イベントを進める制御処理のみを実行する。 On the other hand, if the determination in step S1009 is NO, the CPU 201 skips step S1010, so that the event Event_2 [AutoIndex_2] related to this accompaniment is not executed, and the progress is synchronized with the lyrics, so that the next step S1011 Proceed to the process of, and execute only the control process that advances the event.

ステップＳ１０１０の後又はステップＳ１００９の判定がＮＯの場合に、ＣＰＵ２０１は、トラックチャンク２上の伴奏データのための演奏データ組を参照するためのＡｕｔｏＩｎｄｅｘ＿２値を＋１インクリメントする（ステップＳ１０１１）。 After step S1010 or when the determination in step S1009 is NO, the CPU 201 increments the AutoIndex_2 value for referencing the performance data set for accompaniment data on track chunk 2 by +1 (step S1011).

また、ＣＰＵ２０１は、トラックチャンク２に関して今回実行したイベントの発生時刻からの相対時刻を示すＤｅｌｔａＴ＿２値を０にリセットする（ステップＳ１０１２）。 Further, the CPU 201 resets the DeltaT_2 value indicating the relative time from the occurrence time of the event executed this time with respect to the track chunk 2 to 0 (step S1012).

Then, the CPU 201 determines whether the wait time DeltaTime_2[AutoIndex_2] of the performance data set on track chunk 2 to be executed next, as indicated by the AutoIndex_2 value, is 0, that is, whether that event is to be executed simultaneously with the current event (step S1013).

If the determination in step S1013 is NO, the CPU 201 ends the current automatic performance interrupt process shown in the flowchart of FIG. 10.

If the determination in step S1013 is YES, the CPU 201 returns to step S1009 and repeats the control processing for the event Event_2[AutoIndex_2] of the performance data set to be executed next on track chunk 2, as indicated by the AutoIndex_2 value. The CPU 201 repeats the processing of steps S1009 to S1013 once for each event that is to be executed simultaneously this time. This processing sequence is executed when a plurality of note-on events are sounded at the same timing, as in a chord.
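The track chunk 2 processing (steps S1007 to S1013), including the loop that handles simultaneous events such as a chord, might look like the following sketch. The `Track2State` class, the `(kind, key number, velocity)` event tuple, and the `send` callback (standing in for commands to the sound source LSI 204) are all hypothetical, and the end-of-track guards are assumptions not described in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical event shape: (kind, key number, velocity).
Event = Tuple[str, int, int]


@dataclass
class Track2State:
    """Hypothetical subset of the RAM 203 variables used for track chunk 2."""
    delta_t_2: int = 0
    auto_index_2: int = 0
    bansou: int = 1


def tick_track2(state: Track2State, delta_time_2: List[int],
                event_2: List[Event], send: Callable[[Event], None]) -> None:
    n = len(delta_time_2)
    if state.auto_index_2 >= n:                     # assumption: end of track
        return
    if state.delta_t_2 != delta_time_2[state.auto_index_2]:  # S1007: NO
        state.delta_t_2 += 1                        # S1008
        return
    while True:
        if state.bansou == 1:                       # S1009
            send(event_2[state.auto_index_2])       # S1010
        # With Bansou == 0 the event is skipped, but the position still
        # advances so the accompaniment stays aligned with the lyrics.
        state.auto_index_2 += 1                     # S1011
        state.delta_t_2 = 0                         # S1012
        # S1013: keep looping only while the next event has wait time 0.
        if state.auto_index_2 >= n or delta_time_2[state.auto_index_2] != 0:
            return
```

With wait times `[0, 0, 0, 3]`, a single tick emits the first three events (a three-note chord) and leaves the fourth for later ticks.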

FIG. 11 is a flowchart showing a detailed example of a first embodiment of the song playback process of step S705 of FIG. 7. This process executes the control processing according to the present embodiment described with reference to FIG. 5.

First, the CPU 201 determines whether a value has been set in the variable SongIndex on the RAM 203 by step S1004 of the automatic performance interrupt process of FIG. 10, so that it is no longer the Null value (step S1101). The SongIndex value indicates whether the current timing is a playback timing of the singing voice.

When the determination in step S1101 is YES, that is, when the current time is a song playback timing (t1, t2, t3, t4, t5, t6, t7, and so on in the example of FIG. 5), the CPU 201 determines whether the keyboard processing of step S703 of FIG. 7 has detected a new key press by the performer on the keyboard 101 of FIG. 1 (step S1102).

ステップＳ１１０２の判定がＹＥＳならば、ＣＰＵ２０１は、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘが示すＲＡＭ２０３上の曲データのトラックチャンク１上のソングイベントＥｖｅｎｔ＿１［ＳｏｎｇＩｎｄｅｘ］から音高を読出し、演奏者による押鍵により指定された指定音高が読み出した音高と一致するか否かを判定する（ステップＳ１１０３）。 If the determination in step S1102 is YES, the CPU 201 reads the pitch from the song event Event_1 [SongIndex] on the track chunk 1 of the song data on the RAM 203 indicated by the variable SongIndex on the RAM 203, and is designated by the key press by the performer. It is determined whether or not the designated pitch matches the read pitch (step S1103).

ステップＳ１１０３の判定がＹＥＳならば、ＣＰＵ２０１は、演奏者による押鍵により指定された指定音高を、発声音高として特には図示しないレジスタ又はＲＡＭ２０３上の変数にセットする（ステップＳ１１０４）。 If the determination in step S1103 is YES, the CPU 201 sets the designated pitch designated by the key press by the performer in a register (not particularly shown) or a variable on the RAM 203 as the vocal pitch (step S1104).

続いて、ＣＰＵ２０１は、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘが示すＲＡＭ２０３上の曲データのトラックチャンク１上のソングイベントＥｖｅｎｔ＿１［ＳｏｎｇＩｎｄｅｘ］から、歌詞文字列を読み出す。ＣＰＵ２０１は、読み出した歌詞文字列に対応する歌声音声出力データ２１７を、ステップＳ１１０４で設定された押鍵に基づく指定音高がセットされた発声音高で発声させるための歌声データ２１５を生成し、音声合成ＬＳＩ２０５に対して発声処理を指示する（ステップＳ１１０５）。 Subsequently, the CPU 201 reads the lyrics character string from the song event Event_1 [SongIndex] on the track chunk 1 of the song data on the RAM 203 indicated by the variable SongIndex on the RAM 203. The CPU 201 generates singing voice data 215 for uttering the singing voice output data 217 corresponding to the read lyrics character string at the voicing pitch set with the designated pitch based on the key press set in step S1104. Instruct the voice synthesis LSI 205 to perform vocalization processing (step S1105).

以上のステップＳ１１０４とＳ１１０５の処理は、図５（ｂ）のソング再生タイミングｔ１、ｔ２、ｔ３′、ｔ４に関して前述した制御処理に対応する。 The processing of steps S1104 and S1105 described above corresponds to the control processing described above with respect to the song playback timings t1, t2, t3', and t4 of FIG. 5 (b).

ステップＳ１１０５の処理の後、ＣＰＵ２０１は、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘが示す再生を行ったソング位置を、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘ＿ｐｒｅに記憶させる（ステップＳ１１０６）。 After the process of step S1105, the CPU 201 stores the reproduced song position indicated by the variable SongIndex on the RAM 203 in the variable SongIndex_pre on the RAM 203 (step S1106).

次に、ＣＰＵ２０１は、変数ＳｏｎｇＩｎｄｅｘの値をＮｕｌｌ値にクリアして、これ以降のタイミングをソング再生のタイミングでない状態にする（ステップＳ１１０７）。 Next, the CPU 201 clears the value of the variable SongIndex to the Null value, and sets the timing after that to a state other than the timing of song playback (step S1107).

Further, the CPU 201 sets the value 1, which instructs progress, in the variable SongStart on the RAM 203, which controls the progress of the lyrics and the automatic accompaniment (step S1108). The CPU 201 then ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 11.

As described for timing t3 of FIG. 5(b), when the pitch designated by the performance (key press) at timing t3′ matches the pitch read from the music data while the progress of the lyrics and the automatic accompaniment is stopped, the determinations in steps S1101, S1102, and S1103 are all YES, the singing voice indicated by Event_1[SongIndex] is vocalized in step S1105, and the value of the variable SongStart is then set to 1 in step S1108 as described above. As a result, the determination in step S1001 of the automatic performance interrupt process of FIG. 10 becomes YES, and the progress of the singing voice and the automatic accompaniment resumes.

If the determination in step S1103 described above is NO, that is, if the pitch designated by the performer's key press does not match the pitch read from the music data, the CPU 201 sets the value 0, which instructs progress to stop, in the variable SongStart on the RAM 203, which controls the progress of the lyrics and the automatic accompaniment (step S1109). The CPU 201 then ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 11.

As described for timing t3 of FIG. 5(b), when the pitch designated by the performance (key press) at the vocalization timing t3 does not match the pitch read from the music data, the determinations in steps S1101 and S1102 are YES and the determination in step S1103 is NO, so the value of the variable SongStart is set to 0 in step S1109 as described above. As a result, the determination in step S1001 of the automatic performance interrupt process of FIG. 10 becomes NO, and the progress of the singing voice and the automatic accompaniment stops.

When the determination in step S1101 described above is NO, that is, when the current time is not a song playback timing, the CPU 201 determines whether the keyboard processing of step S703 of FIG. 7 has detected a new key press by the performer on the keyboard 101 of FIG. 1 (step S1110).

If the determination in step S1110 is NO, the CPU 201 simply ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 11.

If the determination in step S1110 is YES, the CPU 201 generates singing voice data 215 instructing that the pitch of the singing voice output data 217 currently being vocalized by the voice synthesis LSI 205, which corresponds to the lyric character string of the song event Event_1[SongIndex_pre] on track chunk 1 of the music data on the RAM 203 indicated by the variable SongIndex_pre on the RAM 203, be changed to the designated pitch based on the performer's key press detected in step S1110, and outputs it to the voice synthesis LSI 205 (step S1111). At this time, in the singing voice data 215, the frame at which the latter phoneme of the lyric already being vocalized begins is set as the start position of the change to the designated pitch; for the lyric character string "ki", for example, this is the frame at which the latter "/i/" of its phoneme sequence "/k/" "/i/" begins (see FIGS. 4(b) and 4(c)).
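As a small illustration of choosing the start position of the pitch change, the sketch below returns the first frame of the final phoneme of a lyric. The representation of a lyric as a list of `(phoneme, frame count)` pairs and the function name are assumptions made here for illustration; the actual frame layout is the one shown in FIGS. 4(b) and 4(c), and treating "the latter phoneme" as the last phoneme in the list is a simplification that fits the two-phoneme example "ki".

```python
from typing import List, Tuple


def pitch_change_start_frame(phonemes: List[Tuple[str, int]]) -> int:
    """Return the index of the first frame of the final phoneme.

    `phonemes` is a hypothetical list of (phoneme, number_of_frames) pairs
    in vocalization order, e.g. [("k", 3), ("i", 10)] for the lyric "ki".
    """
    if not phonemes:
        raise ValueError("empty phoneme sequence")
    # Frames of all phonemes before the last one precede the start position.
    return sum(frames for _, frames in phonemes[:-1])
```

For `[("k", 3), ("i", 10)]` the pitch change would begin at frame 3, so the consonant keeps its original pitch and only the vowel is re-pitched.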

Through the processing of step S1111 above, the singing voice output data 217 that has been vocalized since the original timing immediately preceding the current key press timing, for example t1 of FIG. 5(b), has its pitch changed to the designated pitch specified by the performer, and its vocalization can be continued at the current key press timing, for example t1′ of FIG. 5(b).

After the processing of step S1111, the CPU 201 ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 11.
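The first embodiment of the song playback process (steps S1101 to S1111) can be summarized by the sketch below. It is a simplified model under stated assumptions: `SongEvent`, `State`, and the `log` list (standing in for commands sent to the voice synthesis LSI 205) are hypothetical, `pressed_pitch` is `None` when no new key press was detected, and the Null value is modeled as `None`. The text does not spell out the NO branch of step S1102; this sketch assumes it also stops progress (step S1109), by analogy with the second embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SongEvent:
    pitch: int   # pitch stored in the music data
    lyric: str   # lyric character string


@dataclass
class State:
    song_index: Optional[int] = None       # SongIndex (None = Null)
    song_index_pre: Optional[int] = None   # SongIndex_pre
    song_start: int = 1                    # SongStart
    log: List[Tuple[str, str, int]] = field(default_factory=list)


def song_playback_v1(state: State, events: List[SongEvent],
                     pressed_pitch: Optional[int]) -> None:
    if state.song_index is not None:                       # S1101: YES
        event = events[state.song_index]
        if pressed_pitch is not None and pressed_pitch == event.pitch:
            # S1102 and S1103 are YES: vocalize at the pressed pitch.
            state.log.append(("sing", event.lyric, pressed_pitch))  # S1104, S1105
            state.song_index_pre = state.song_index        # S1106
            state.song_index = None                        # S1107
            state.song_start = 1                           # S1108
        else:
            # S1103 is NO (or, by assumption, S1102 is NO): stop progress.
            state.song_start = 0                           # S1109
    elif pressed_pitch is not None:                        # S1110: YES
        if state.song_index_pre is not None:               # assumption: guard
            lyric = events[state.song_index_pre].lyric
            state.log.append(("repitch", lyric, pressed_pitch))     # S1111
```

A wrong key at a playback timing leaves `song_index` set, so the same lyric is offered again on the next key press while the lyrics and accompaniment wait.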

FIG. 12 is a flowchart showing a detailed example of a second embodiment of the song playback process of step S705 of FIG. 7. This process executes the other control processing according to the present embodiment described with reference to FIG. 5.

まず、ＣＰＵ２０１は、図７のステップＳ７０３の鍵盤処理により演奏者による図１の鍵盤１０１上で新たな押鍵が検出されているか否かを判定する（ステップＳ１２０１）。 First, the CPU 201 determines whether or not a new key press is detected on the keyboard 101 of FIG. 1 by the performer by the keyboard processing of step S703 of FIG. 7 (step S1201).

If the determination in step S1201 is YES, the CPU 201 determines whether a value has been set, by step S1004 of the automatic performance interrupt process of FIG. 10, in the variable SongIndex on the RAM 203, which indicates whether the current timing is a playback timing of the singing voice, so that it is no longer the Null value (step S1202).

When the determination in step S1202 is YES, that is, when the current time is a song playback timing (t1, t2, t3, t4, and so on in the example of FIG. 5), the CPU 201 sets the designated pitch specified by the performer's key press as the vocalization pitch in a register (not shown) or a variable on the RAM 203 (step S1203).

Subsequently, the CPU 201 reads the lyric character string from the song event Event_1[SongIndex] on track chunk 1 of the music data on the RAM 203 indicated by the variable SongIndex on the RAM 203. The CPU 201 generates singing voice data 215 for vocalizing the singing voice output data 217 corresponding to the read lyric character string at the vocalization pitch, to which the designated pitch based on the key press was set in step S1203, and instructs the voice synthesis LSI 205 to perform vocalization processing (step S1204).

その後、ＣＰＵ２０１は、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘが示すＲＡＭ２０３上の曲データのトラックチャンク１上のソングイベントＥｖｅｎｔ＿１［ＳｏｎｇＩｎｄｅｘ］から音高を読出し、演奏者による押鍵により指定された指定音高が曲データから読出した音高と一致するか否かを判定する（ステップＳ１２０５）。 After that, the CPU 201 reads the pitch from the song event Event_1 [SongIndex] on the track chunk 1 of the song data on the RAM 203 indicated by the variable SongIndex on the RAM 203, and the designated pitch specified by the key press by the performer is the song data. It is determined whether or not the pitch matches the pitch read from (step S1205).

ステップＳ１２０５の判定がＹＥＳならば、ＣＰＵ２０１は、ステップＳ１２０６に進む。この処理は、図５（ｂ）のソング再生タイミングｔ１、ｔ２、ｔ３′、ｔ４に関して前述した制御処理に対応する。 If the determination in step S1205 is YES, the CPU 201 proceeds to step S1206. This process corresponds to the control process described above for the song playback timings t1, t2, t3', and t4 in FIG. 5 (b).

ステップＳ１２０６において、ＣＰＵ２０１は、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘが示す再生を行ったソング位置を、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘ＿ｐｒｅに記憶させる。 In step S1206, the CPU 201 stores the reproduced song position indicated by the variable SongIndex on the RAM 203 in the variable SongIndex_pre on the RAM 203.

次に、ＣＰＵ２０１は、変数ＳｏｎｇＩｎｄｅｘの値をＮｕｌｌ値にクリアして、これ以降のタイミングをソング再生のタイミングでない状態にする（ステップＳ１２０７）。 Next, the CPU 201 clears the value of the variable SongIndex to the Null value, and sets the timing after that to a state other than the timing of song playback (step S1207).

Further, the CPU 201 sets the value 1, which instructs progress, in the variable SongStart on the RAM 203, which controls the progress of the lyrics and the automatic accompaniment (step S1208). The CPU 201 then ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 12.

As a result of step S1208, the determination in step S1001 of the automatic performance interrupt process of FIG. 10 becomes YES and the progress of the singing voice and the automatic accompaniment resumes, as in the case of step S1108 of FIG. 11.

If the determination in step S1205 described above is NO, that is, if the pitch designated by the performer's key press does not match the pitch read from the music data, the CPU 201 sets the value 0, which instructs progress to stop, in the variable SongStart on the RAM 203, which controls the progress of the lyrics and the automatic accompaniment (step S1210). The CPU 201 then ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 12. As a result, the determination in step S1001 of the automatic performance interrupt process of FIG. 10 becomes NO and the progress of the singing voice and the automatic accompaniment stops, as in the case of step S1109 of FIG. 11.

この処理は、図５（ｂ）のソング再生タイミングｔ３に関して前述した制御処理に対応する。 This process corresponds to the control process described above with respect to the song reproduction timing t3 of FIG. 5 (b).

When the determination in step S1202 is NO after the determination in step S1201 described above is YES, that is, when a performance (key press) by the performer occurs at a timing other than one at which the singing voice should be vocalized, the following control processing is executed.

まず、ＣＰＵ２０１は、ＲＡＭ２０３上の変数ＳｏｎｇＳｔａｒｔに値０をセットして歌声及び自動伴奏の進行をいったん停止させる（ステップＳ１２１１）（図１０のステップＳ１００１を参照）。 First, the CPU 201 sets a value 0 in the variable SongStart on the RAM 203 to temporarily stop the progress of the singing voice and the automatic accompaniment (step S1211) (see step S1001 in FIG. 10).

Next, the CPU 201 saves the values of the variables DeltaT_1, DeltaT_2, AutoIndex_1, and AutoIndex_2 on the RAM 203, which relate to the current progress position of the singing voice and the automatic accompaniment, in the variables DeltaT_1_now, DeltaT_2_now, AutoIndex_1_now, and AutoIndex_2_now, respectively (step S1212).

その後、ＣＰＵ２０１は、次ソングイベント検索処理を実行する（ステップＳ１２１３）。ここでは、次に到来する歌声に関するイベント情報を指示するＳｏｎｇＩｎｄｅｘ値が検索される。この処理の詳細については、後述する。 After that, the CPU 201 executes the next song event search process (step S1213). Here, the SongIndex value that indicates the event information regarding the next singing voice is searched. The details of this process will be described later.

ステップＳ１２１３の検索処理の後、ＣＰＵ２０１は、検索されたＳｏｎｇＩｎｄｅｘ値が示すＲＡＭ２０３上の曲データのトラックチャンク１上のソングイベントＥｖｅｎｔ＿１［ＳｏｎｇＩｎｄｅｘ］から音高を読出し、演奏者による押鍵により指定された指定音高がその読出した音高と一致するか否かを判定する（ステップＳ１２１４）。 After the search process of step S1213, the CPU 201 reads the pitch from the song event Event_1 [SongIndex] on the track chunk 1 of the song data on the RAM 203 indicated by the searched SongIndex value, and is designated by the key press by the performer. It is determined whether or not the designated pitch matches the read pitch (step S1214).

If the determination in step S1214 is YES, the CPU 201 advances the control processing through step S1203, step S1204, a YES determination in step S1205, and then steps S1206, S1207, and S1208.

Through the series of control processing described above, when the performer presses, before the original vocalization timing has arrived, a key with the same pitch as the singing voice to be vocalized next, the CPU 201 can advance (jump) the progress of the lyrics and the automatic accompaniment at once to the timing of the singing voice to be vocalized next.

If the determination in step S1214 described above is NO, the CPU 201 restores the values of the variables DeltaT_1_now, DeltaT_2_now, AutoIndex_1_now, and AutoIndex_2_now on the RAM 203, saved in step S1212, to the variables DeltaT_1, DeltaT_2, AutoIndex_1, and AutoIndex_2, respectively, thereby returning the position advanced by the search process of step S1213 to the original progress position before the search (step S1215).

その後、ＣＰＵ２０１は、現在音声合成ＬＳＩ２０５が発声処理中の、ＲＡＭ２０３上の変数ＳｏｎｇＩｎｄｅｘ＿ｐｒｅが示すＲＡＭ２０３上の曲データのトラックチャンク１上のソングイベントＥｖｅｎｔ＿１［ＳｏｎｇＩｎｄｅｘ＿ｐｒｅ］の歌詞文字列に対応する歌声音声出力データ２１７の音高を、ステップＳ１２０１で検出された演奏者の押鍵に基づく指定音高に変更することを指示する歌声データ２１５を生成し、音声合成ＬＳＩ２０５に出力する（ステップＳ１２１６）。 After that, the CPU 201 is singing voice output data corresponding to the lyrics character string of the song event Event_1 [SongIndex_pre] on the track chunk 1 of the song data on the RAM 203 indicated by the variable SongIndex_pre on the RAM 203, which is currently being uttered by the voice synthesis LSI 205. Singing voice data 215 instructing to change the pitch of 217 to a designated pitch based on the key press of the performer detected in step S1201 is generated and output to the speech synthesis LSI 205 (step S1216).

The processing of step S1216 is the same as that of step S1111 of FIG. 11: the singing voice output data 217 that has been vocalized since the original timing immediately preceding the current key press timing, for example t1 of FIG. 5(b), has its pitch changed to the designated pitch played by the performer, and its vocalization can be continued at the current key press timing, for example t1′ of FIG. 5(b).

After the processing of step S1216, the CPU 201 sets the value 1 in the variable SongStart on the RAM 203, thereby resuming the progress of the lyrics and the automatic accompaniment that was temporarily stopped in step S1211 (step S1208). The CPU 201 then ends the song playback process of step S705 of FIG. 7 shown in the flowchart of FIG. 12.

When the determination in step S1201 described above is NO, that is, when no performance (key press) by the performer has occurred, the CPU 201 determines whether a value has been set, by step S1004 of the automatic performance interrupt process of FIG. 10, in the variable SongIndex on the RAM 203, which indicates whether the current timing is a playback timing of the singing voice, so that it is no longer the Null value (step S1209).

If the determination in step S1209 is NO, the CPU 201 simply ends the song reproduction process of step S705 in FIG. 7, which is shown in the flowchart of FIG. 12.

If the determination in step S1209 is YES, the CPU 201 sets the variable SongStart in the RAM 203, which controls the progress of the lyrics and the automatic accompaniment, to 0, the value that instructs the progress to stop (step S1210). The CPU 201 then ends the song reproduction process of step S705 in FIG. 7, which is shown in the flowchart of FIG. 12. As a result, the determination in step S1001 of the automatic performance interrupt process of FIG. 10 becomes NO and the progress of the singing voice and the automatic accompaniment stops, as in the case of step S1109 in FIG. 11.
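The two branches described above can be sketched in Python as follows. This is only an illustrative sketch, not the actual firmware: the class `SongState`, the functions `continue_lyric_at_pitch` and `on_no_key_press`, and the `sent` list standing in for data handed to the voice synthesis LSI 205 are all hypothetical names. The conditions under which the flowchart of FIG. 12 reaches step S1216 are described elsewhere and are not modelled here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SongState:
    """Hypothetical stand-in for the variables kept in the RAM 203."""
    song_start: int = 1                    # SongStart: 1 = progressing, 0 = stopped
    song_index: Optional[int] = None       # SongIndex: None plays the role of Null
    song_index_pre: Optional[int] = None   # SongIndex_pre: lyric currently being uttered
    # Stand-in for singing voice data 215 sent to the voice synthesis LSI 205.
    sent: List[Tuple[str, Optional[int], int]] = field(default_factory=list)


def continue_lyric_at_pitch(state: SongState, pressed_pitch: int) -> None:
    """Steps S1216 and S1208: re-pitch the current lyric, then resume progress."""
    # S1216: request a pitch change for the lyric indexed by SongIndex_pre.
    state.sent.append(("change_pitch", state.song_index_pre, pressed_pitch))
    # S1208: SongStart = 1 resumes the lyrics and the automatic accompaniment.
    state.song_start = 1


def on_no_key_press(state: SongState) -> None:
    """Steps S1209 and S1210: stop progress when a lyric is due but no key is pressed."""
    # S1209: has SongIndex been set (no longer Null)?
    if state.song_index is not None:
        # S1210: SongStart = 0 stops the lyrics and the automatic accompaniment.
        state.song_start = 0
```

In this sketch, stopping and resuming are expressed only through the `song_start` flag, mirroring the way the description has the interrupt process of FIG. 10 consult SongStart in step S1001.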

FIG. 13 is a flowchart showing a detailed example of the next song event search process of step S1213 in FIG. 12. In the flowchart of FIG. 13, parts given the same step numbers as in the automatic performance interrupt process of FIG. 10 indicate the same processing as in FIG. 10. The process of FIG. 13 basically follows the same control flow as the series of steps S1002 to S1013 in the automatic performance interrupt process of FIG. 10, except that the steps that execute events are omitted.

That is, in FIG. 13, the CPU 201 increments the DeltaT_1 value in step S1003 to advance the singing voice until it determines in step S1002 that the DeltaT_1 value, which indicates the time elapsed since the previous event of track chunk 1, matches the wait time DeltaTime_1[AutoIndex_1] of the singing voice performance data set that is indicated by the AutoIndex_1 value and is about to be executed.

In addition, each time the DeltaT_2 value, which indicates the time elapsed since the previous event of track chunk 2, matches the wait time DeltaTime_2[AutoIndex_2] of the automatic accompaniment performance data set that is indicated by the AutoIndex_2 value and is about to be executed, so that the determination in step S1007 is YES, the CPU 201 advances the AutoIndex_2 value. If the determination in step S1007 is NO, the CPU 201 increments the DeltaT_2 value to advance the automatic accompaniment. The CPU 201 then returns to the control process of step S1002.

When the determination in step S1002 becomes YES during the repetition of the above series of control processes, the CPU 201 stores the AutoIndex_1 value in the variable SongIndex in the RAM 203 and ends the next song event search process of step S1213 in FIG. 12, which is shown in the flowchart of FIG. 13.
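The search loop of FIG. 13 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the flowchart itself: the names `PlaybackState` and `search_next_song_event` are hypothetical, the reset of DeltaT_2 to 0 after an accompaniment event is skipped is an assumption (this passage does not state it), and the bounds check on track chunk 2 is added only so that the example terminates safely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlaybackState:
    """Hypothetical stand-in for the playback variables in the RAM 203."""
    delta_time_1: List[int]   # DeltaTime_1[]: wait times of track chunk 1 (lyrics)
    delta_time_2: List[int]   # DeltaTime_2[]: wait times of track chunk 2 (accompaniment)
    auto_index_1: int = 0     # AutoIndex_1
    auto_index_2: int = 0     # AutoIndex_2
    delta_t_1: int = 0        # DeltaT_1: time elapsed since the last track 1 event
    delta_t_2: int = 0        # DeltaT_2: time elapsed since the last track 2 event
    song_index: Optional[int] = None  # SongIndex


def search_next_song_event(s: PlaybackState) -> None:
    """Fast-forward both tracks to the next lyric event without executing events."""
    # S1002: repeat until DeltaT_1 reaches the wait time of the next lyric event.
    # "<" is used instead of "!=" so the loop cannot overshoot and run forever.
    while s.delta_t_1 < s.delta_time_1[s.auto_index_1]:
        s.delta_t_1 += 1  # S1003: advance the singing voice by one tick
        # S1007: has the next accompaniment event come due?
        if (s.auto_index_2 < len(s.delta_time_2)
                and s.delta_t_2 == s.delta_time_2[s.auto_index_2]):
            s.auto_index_2 += 1  # skip the event; it is not executed here
            s.delta_t_2 = 0      # assumption: elapsed time restarts after an event
        else:
            s.delta_t_2 += 1     # advance the accompaniment by one tick
    # S1002 is YES: remember which lyric event comes next.
    s.song_index = s.auto_index_1
```

For example, with `delta_time_1=[0, 4]`, `auto_index_1=1`, `delta_t_1=1` and `delta_time_2=[2, 2, 2]`, the call leaves `song_index` at 1 and has skipped one accompaniment event.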

FIG. 14 shows a configuration example of song data when the song data illustrated by the data structure of FIG. 6 is implemented in the MusicXML format. Such a data structure can hold both the lyric character strings and the score data of the melody. When the CPU 201 parses such song data, for example in the display process of step S704 in FIG. 7, it can provide a function that lights the key on the keyboard 101 of FIG. 1 corresponding to the melody of the lyric character string currently being played, guiding the performer to press the key that corresponds to that lyric character string. At the same time, the lyric character string currently being played and the corresponding score can be displayed on the LCD 104 of FIG. 1, as in the display example shown in FIG. 15.
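As an illustration of parsing such data, the following Python sketch extracts each lyric and the key to be lit from a small MusicXML fragment. The fragment `SAMPLE` is made up for this example and is not the data of FIG. 14, the function name `lyric_notes` is hypothetical, and converting pitches to MIDI note numbers (C4 = 60) is an assumption about how keys might be identified.

```python
import xml.etree.ElementTree as ET

# Made-up MusicXML fragment: two notes, each carrying one lyric syllable.
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><text>la</text></lyric>
      </note>
      <note>
        <pitch><step>E</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><text>li</text></lyric>
      </note>
    </measure>
  </part>
</score-partwise>"""

# Semitone offset of each note name within an octave.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def lyric_notes(xml_text):
    """Return (lyric text, MIDI note number) pairs for every sung note."""
    root = ET.fromstring(xml_text)
    pairs = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        text = note.findtext("lyric/text")
        if pitch is None or text is None:
            continue  # rests and notes without lyrics are skipped
        step = pitch.findtext("step")
        octave = int(pitch.findtext("octave"))
        # <alter> is optional; this sketch assumes whole-semitone alterations.
        alter = int(pitch.findtext("alter", default="0"))
        pairs.append((text, 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter))
    return pairs


print(lyric_notes(SAMPLE))  # [('la', 60), ('li', 64)]
```

A guide function could then light the key for each pair in turn while the display shows the matching lyric.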

In the embodiment described above, the acoustic model unit 306 is implemented by a DNN (deep neural network) in order to predict the acoustic feature sequence 317 from the language feature sequence 316. Alternatively, the acoustic model unit 306 may be implemented by an HMM (Hidden Markov Model) for this prediction. In this case, the model learning unit 305 in the speech learning unit 301 learns a context-aware model in order to model the acoustic features of speech accurately. To model the acoustic features in detail, it considers not only the immediately preceding and following phonemes but also factors such as accent, part of speech, and sentence length. However, because the number of context combinations is enormous, it is difficult to prepare speech data from which a context-dependent model can be learned accurately for every combination. To solve this problem, the model learning unit 305 can adopt decision-tree-based context clustering. In decision-tree-based context clustering, context-dependent models are classified using questions about the context, such as "Is the preceding phoneme /a/?", and the model parameters of similar contexts are set in the acoustic model unit 306 as the learning result 315. Because the contexts taken into account depend on the structure of the decision tree, selecting an appropriate tree structure makes it possible to estimate context-dependent models that are both accurate and able to generalize well. The acoustic model unit 306 in the speech synthesis unit 302 of FIG. 3 concatenates the context-dependent HMMs according to the language feature sequence 316 extracted from the singing voice data 215 by the text analysis unit 307, and predicts the acoustic feature sequence 317 that maximizes the output probability.
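The idea that similar contexts share model parameters can be illustrated with a toy Python sketch. It shows only how an already built decision tree maps a context to a shared parameter set. It does not show how the tree is learned, and the tree, the context keys `prev_phoneme` and `accent`, and the cluster labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Context = Dict[str, str]


@dataclass
class Leaf:
    params: str  # label standing in for a shared set of HMM parameters


@dataclass
class Node:
    question: Callable[[Context], bool]  # yes/no question about the context
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def lookup(tree: Union[Node, Leaf], ctx: Context) -> str:
    """Answer questions from the root down and return the shared parameters."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.question(ctx) else tree.no
    return tree.params


# Hypothetical tree: the root asks "Is the preceding phoneme /a/?".
TREE = Node(
    question=lambda c: c["prev_phoneme"] == "a",
    yes=Leaf("cluster_A"),
    no=Node(
        question=lambda c: c["accent"] == "high",
        yes=Leaf("cluster_B"),
        no=Leaf("cluster_C"),
    ),
)
```

Two contexts that differ only in accent but both follow /a/ reach the same leaf in this tree, so a context that never appeared in the training data can still be assigned parameters.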

Although the embodiment described above applies the present invention to an electronic keyboard instrument, the present invention can also be applied to other electronic musical instruments such as electronic stringed instruments.

In addition, the present invention is not limited to the embodiment described above and can be modified in various ways at the implementation stage without departing from its gist. The functions executed in the embodiment described above may also be combined as appropriate wherever possible. The embodiment described above includes various stages, and various inventions can be extracted by appropriately combining the disclosed constituent elements. For example, even if some constituent elements are removed from all those shown in the embodiment, the configuration without them can be extracted as an invention as long as the effect is obtained.

Regarding the above embodiments, the following additional notes are further disclosed.
(Appendix 1)
An electronic musical instrument that executes:
a determination process of determining whether a first pitch to be specified and a specified pitch match; and
a singing voice output process of outputting a singing voice corresponding to a first character corresponding to the first pitch when the determination process determines a match with the first pitch, and, when the determination process determines a mismatch with the first pitch, not outputting a singing voice corresponding to a second character, which corresponds to a second pitch to be specified next after the first pitch, until the determination process determines a match with the first pitch.
(Appendix 2)
The electronic musical instrument according to Appendix 1, which executes an accompaniment data output process of outputting accompaniment data in accordance with the singing voice output process.
(Appendix 3)
The electronic musical instrument according to Appendix 1 or 2, which executes a pitch change process of, when a pitch specified after the singing voice output process has output the singing voice corresponding to the first character corresponding to the first pitch and before the timing at which the second pitch should be specified does not match the second pitch, changing the pitch of the output singing voice corresponding to the first character to the specified pitch and outputting it.
(Appendix 4)
The electronic musical instrument according to Appendix 2 or 3, which executes a control process of, when a pitch specified after the singing voice output process has output the singing voice corresponding to the first character corresponding to the first pitch and before the timing at which the second pitch should be specified matches the second pitch, outputting the singing voice corresponding to the second character at the second pitch in response to the specification, before that timing arrives, and advancing the accompaniment data output by the accompaniment data output process in step with the output of the singing voice corresponding to the second character.
(Appendix 5)
The electronic musical instrument according to any one of Appendices 1 to 4, which executes a voice synthesis process of synthesizing a singing voice corresponding to a certain singer based on a trained model generated by machine learning from singing voice data sung by that singer and lyrics data,
wherein the singing voice output process outputs, in accordance with the timing at which an operator is operated, the singing voice synthesized by the voice synthesis process at the pitch specified by the operated operator.
(Appendix 6)
The electronic musical instrument according to any one of Appendices 1 to 5, which executes a display process of displaying an identifier indicating the pitch to be specified, in accordance with the timing at which the pitch should be specified by operating an operator.
(Appendix 7)
A method that causes a computer of an electronic musical instrument to execute:
a determination process of determining whether a first pitch to be specified and a specified pitch match; and
a singing voice output process of outputting a singing voice corresponding to a first character corresponding to the first pitch when the determination process determines a match with the first pitch, and, when the determination process determines a mismatch with the first pitch, not outputting a singing voice corresponding to a second character, which corresponds to a second pitch to be specified next after the first pitch, until the determination process determines a match with the first pitch.
(Appendix 8)
A program that causes a computer of an electronic musical instrument to execute:
a determination process of determining whether a first pitch to be specified and a specified pitch match; and
a singing voice output process of outputting a singing voice corresponding to a first character corresponding to the first pitch when the determination process determines a match with the first pitch, and, when the determination process determines a mismatch with the first pitch, not outputting a singing voice corresponding to a second character, which corresponds to a second pitch to be specified next after the first pitch, until the determination process determines a match with the first pitch.

100 Electronic keyboard instrument
101 Keyboard
102 First switch panel
103 Second switch panel
104 LCD
200 Control system
201 CPU
202 ROM
203 RAM
204 Sound source LSI
205 Voice synthesis LSI
206 Key scanner
208 LCD controller
209 System bus
210 Timer
211, 212 D/A converters
213 Mixer
214 Amplifier
301 Speech learning unit
302 Speech synthesis unit
303 Learning text analysis unit
304 Learning acoustic feature extraction
305 Model learning unit
306 Acoustic model unit
307 Text analysis unit
308 Utterance model unit
309 Sound source generation unit
310 Synthesis filter unit
311 Learning singing voice data
312 Learning singing voice audio data
313 Learning language feature sequence
314 Learning acoustic feature sequence
315 Learning result
316 Language feature sequence
317 Acoustic feature sequence
318 Spectrum information
319 Sound source information

Claims

An electronic musical instrument that executes:
a singing voice output process of, when a first pitch to be specified in accordance with a first timing is specified, outputting singing voice data of the first pitch corresponding to a first character corresponding to the first timing, and, when a second pitch to be specified in accordance with a second timing following the first timing is specified before the second timing, outputting singing voice data of the second pitch corresponding to a second character corresponding to the second timing without waiting for the second timing to arrive; and
a pitch change process of, when a third pitch other than the first pitch and the second pitch is specified before the second timing while the singing voice data of the first pitch corresponding to the first character corresponding to the first timing is being output, changing the pitch of the singing voice data of the first pitch corresponding to the first character being output to the third pitch.

The electronic musical instrument according to claim 1, which executes a voice synthesis process of synthesizing a singing voice corresponding to a certain singer based on a trained model generated by machine learning using singing voice data and lyrics data of that singer,
wherein the singing voice output process outputs, in accordance with the timing at which an operator is operated, the singing voice synthesized by the voice synthesis process at the pitch specified by the operated operator.

The electronic musical instrument according to claim 1 or 2, wherein a display process for displaying an identifier indicating a pitch to be specified is executed before the timing at which the pitch should be specified by operating the operator.

A method that causes a computer of an electronic musical instrument to execute:
a singing voice output process of, when a first pitch to be specified in accordance with a first timing is specified, outputting singing voice data of the first pitch corresponding to a first character corresponding to the first timing, and, when a second pitch to be specified in accordance with a second timing following the first timing is specified before the second timing, outputting singing voice data of the second pitch corresponding to a second character corresponding to the second timing without waiting for the second timing to arrive; and
a pitch change process of, when a third pitch other than the first pitch and the second pitch is specified before the second timing while the singing voice data of the first pitch corresponding to the first character corresponding to the first timing is being output, changing the pitch of the singing voice data of the first pitch corresponding to the first character being output to the third pitch.

A program that causes a computer of an electronic musical instrument to execute:
a singing voice output process of, when a first pitch to be specified in accordance with a first timing is specified, outputting singing voice data of the first pitch corresponding to a first character corresponding to the first timing, and, when a second pitch to be specified in accordance with a second timing following the first timing is specified before the second timing, outputting singing voice data of the second pitch corresponding to a second character corresponding to the second timing without waiting for the second timing to arrive; and
a pitch change process of, when a third pitch other than the first pitch and the second pitch is specified before the second timing while the singing voice data of the first pitch corresponding to the first character corresponding to the first timing is being output, changing the pitch of the singing voice data of the first pitch corresponding to the first character being output to the third pitch.