JPWO2019003349A1

JPWO2019003349A1 - Sound generator and method

Info

Publication number: JPWO2019003349A1
Application number: JP2019526038A
Authority: JP
Inventors: 一輝柏瀬; 桂三濱野
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2017-06-28
Filing date: 2017-06-28
Publication date: 2020-01-16
Anticipated expiration: 2037-06-28
Also published as: CN110720122B; JP6787491B2; WO2019003349A1; CN110720122A

Abstract

Provided is a sound generating device that can reduce the sense of incongruity when the sounding section is switched. When a designation start operation (pressing of the advance operator 34 or the return operator 35) is detected, the CPU 10 stops the sound being output, sets the sounding target phrase to an undetermined state, and automatically generates a dummy sound and starts outputting it. Thereafter, when a designation end operation (release of the advance operator 34 or the return operator 35) is detected, the CPU 10 determines the next sounding target phrase. For example, when the CPU 10 detects a release after detecting a press of the advance operator 34, it determines the phrase immediately after the current phrase as the sounding target phrase and stops the dummy sound.

Description

The present invention relates to a sound generating device and method for sounding a singing voice based on singing data.

Sound generating devices that use voice synthesis technology to sound a singing voice based on singing data are known. For example, the device of Patent Document 1 below has the user input a plurality of types of synthesis information (phonological information and prosody information) for each note, and performs singing synthesis in real time. Because a mismatch between the input timings of the phonological information and the prosody information gives the user a sense of incongruity, the device of Patent Document 1 sounds a dummy sound from the input of the earliest synthesis information until output of the audio signal corresponding to that earliest synthesis information starts. This reduces the sense of incongruity when singing one syllable at a time in a fixed order.

Patent Document 1: Japanese Patent No. 6044284

In general, the lyrics of a song are made up of a plurality of coherent units (sections) such as phrases. A performer may therefore want to move on to the next phrase partway through singing a certain phrase. If the device is configured so that phrases can be switched, processing is needed to determine the destination phrase and to move the sounding position to a syllable in that phrase. If determining the destination phrase or performing the actual switch takes time, the sounding of syllables based on the original singing instruction is interrupted every time the phrase is switched, which may give a sense of incongruity. This is particularly noticeable when an accompaniment is being played back at the same time.

An object of the present invention is to provide a sound generating device and method that can reduce the sense of incongruity when the sounding section is switched.

To achieve the above object, the present invention provides a sound generating device comprising: a data acquisition unit that acquires singing data consisting of a plurality of consecutive sections and including syllable information on which sounding is based; a detection unit that detects a section designation operation designating the next sounding target section in the singing data acquired by the data acquisition unit; and a sounding control unit that, in response to the detection unit detecting the section designation operation, sounds a predetermined singing sound different from the singing sound based on a singing instruction.

Note that the reference signs in parentheses above are examples.

According to the present invention, the sense of incongruity when the sounding section is switched can be reduced.

A schematic diagram of the sound generating device. A block diagram of the electronic musical instrument. A diagram showing the main part of the display unit. A flowchart showing an example of the flow of processing when a performance is given. A diagram showing an example of lyrics text data. A diagram showing an example of the types of speech unit data. Part of a flowchart showing an example of the flow of processing when a performance is given.

Embodiments of the present invention are described below with reference to the drawings.

(First Embodiment)
FIG. 1 is a schematic diagram of a sound generating device according to a first embodiment of the present invention. As an example, the sound generating device is configured as an electronic musical instrument 100 that is a keyboard instrument, and has a main body 30 and a neck 31. The main body 30 has a first surface 30a, a second surface 30b, a third surface 30c, and a fourth surface 30d. The first surface 30a is the keyboard surface, on which a keyboard KB consisting of a plurality of keys is arranged. The second surface 30b is the back surface and is provided with hooks 36 and 37. A strap (not shown) can be attached between the hooks 36 and 37, and the performer normally hangs the strap over the shoulder to play, for example by operating the keyboard KB. Accordingly, when the instrument is worn on the shoulder, and in particular when the scale direction of the keyboard KB (the direction in which the keys are arranged) runs left to right, the first surface 30a and the keyboard KB face the listener, and the third surface 30c and the fourth surface 30d face generally downward and upward, respectively. The neck 31 extends from a side of the main body 30. Various operators, including an advance operator 34 and a return operator 35, are arranged on the neck 31. A display unit 33 composed of a liquid crystal display or the like is arranged on the fourth surface 30d of the main body 30.

The electronic musical instrument 100 is an instrument that simulates singing in response to operation of the performance operators. Here, simulating singing means outputting, by singing synthesis, a voice that imitates a human voice. The white and black keys of the keyboard KB are arranged in pitch order, and each key is associated with a different pitch. To play the electronic musical instrument 100, the user presses the desired keys of the keyboard KB. The electronic musical instrument 100 detects the key operated by the user and sounds a singing voice at the pitch corresponding to that key. The order of the syllables of the singing voice to be sounded is determined in advance.

FIG. 2 is a block diagram of the electronic musical instrument 100. The electronic musical instrument 100 includes a CPU (Central Processing Unit) 10, a timer 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a data storage unit 14, performance operators 15, other operators 16, parameter value setting operators 17, the display unit 33, a sound source 19, an effect circuit 20, a sound system 21, a communication I/F (Interface) 22, and a bus 23.

The CPU 10 is a central processing unit that controls the entire electronic musical instrument 100. The timer 11 is a module that measures time. The ROM 12 is a nonvolatile memory that stores the control program, various data, and the like. The RAM 13 is a volatile memory used as the work area of the CPU 10 and as various buffers. The display unit 33 is a display module such as a liquid crystal display panel or an organic EL (Electro-Luminescence) panel. The display unit 33 displays the operating state of the electronic musical instrument 100, various setting screens, messages to the user, and the like.

The performance operators 15 are a module that accepts performance operations, mainly those designating a pitch. In the present embodiment, the keyboard KB, the advance operator 34, and the return operator 35 are included in the performance operators 15. As an example, when the performance operator 15 is a keyboard, it outputs performance information such as note-on and note-off, based on the on/off state of the sensor corresponding to each key, and the strength of the key press (speed, velocity). This performance information may be in MIDI (musical instrument digital interface) message format.
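As a rough illustration of performance information in MIDI message form, the sketch below builds the standard three-byte note-on and note-off channel messages. The function names and the default channel are assumptions made for this example; the source does not say how the instrument encodes its messages internally.

```python
def note_on(note: int, velocity: int, channel: int = 0) -> bytes:
    """Build a MIDI note-on message: status 0x9n, note number, velocity."""
    if not (0 <= note <= 127 and 0 <= velocity <= 127 and 0 <= channel <= 15):
        raise ValueError("note and velocity must be 0-127, channel 0-15")
    return bytes([0x90 | channel, note, velocity])


def note_off(note: int, channel: int = 0) -> bytes:
    """Build a MIDI note-off message: status 0x8n, note number, velocity 0."""
    if not (0 <= note <= 127 and 0 <= channel <= 15):
        raise ValueError("note must be 0-127, channel 0-15")
    return bytes([0x80 | channel, note, 0])


# Middle C (note number 60) pressed with velocity 100, then released.
pressed = note_on(60, 100)
released = note_off(60)
```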

The other operators 16 are operation modules, such as buttons and knobs, for making settings other than performance, for example settings of the electronic musical instrument 100 itself. The parameter value setting operators 17 are operation modules, such as buttons and knobs, used mainly to set parameters for the attributes of the singing voice. These parameters include, for example, Harmonics, Brightness, Resonance, and Gender Factor. Harmonics sets the balance of the overtone components contained in the voice. Brightness sets how bright or dark the voice is and changes its tone. Resonance sets the timbre and dynamics of the singing voice or instrument sound. Gender Factor sets the formants and changes the thickness and texture of the voice toward feminine or masculine. The external storage device 3 is, for example, an external device connected to the electronic musical instrument 100, such as a device that stores audio data. The communication I/F 22 is a communication module that communicates with external devices. The bus 23 transfers data between the units of the electronic musical instrument 100.

The data storage unit 14 stores singing data 14a. The singing data 14a includes lyrics text data, a phoneme information database, and the like. The lyrics text data is data describing the lyrics. In the lyrics text data, the lyrics of each song are described divided into syllables. That is, the lyrics text data holds character information in which the lyrics are divided into syllables, and this character information also serves as display information corresponding to the syllables. Here, a syllable is a unit of sound output in response to one performance operation. The phoneme information database stores speech unit data. Speech unit data is data representing a speech waveform and includes, for example, spectrum data of a sample sequence of the speech unit as waveform data. The speech unit data also includes unit pitch data indicating the pitch of the speech unit's waveform. The lyrics text data and the speech unit data may each be managed by a database.

The sound source 19 is a module having a plurality of sounding channels. Under the control of the CPU 10, one sounding channel of the sound source 19 is allocated according to the user's performance. When sounding a singing voice, the sound source 19 reads the speech unit data corresponding to the performance from the data storage unit 14 and generates singing voice data on the allocated sounding channel. The effect circuit 20 applies the acoustic effect specified with the parameter value setting operators 17 to the singing voice data generated by the sound source 19. The sound system 21 converts the singing voice data processed by the effect circuit 20 into an analog signal with a digital-to-analog converter. The sound system 21 then amplifies the singing voice converted to the analog signal and outputs it from a speaker or the like.

FIG. 3 is a diagram showing the main part of the display unit 33. The display unit 33 has, as display areas, a first main area 41, a second main area 42, a first sub area 43, and a second sub area 44. The overall display area has two rows: the first main area 41 and the first sub area 43 are arranged in the first (upper) row, and the second main area 42 and the second sub area 44 in the second (lower) row. In each of the main areas 41 and 42, a plurality of display frames 45 (45-1, 45-2, 45-3, ...) are arranged in series along the longitudinal direction of the display unit 33. Starting from the leftmost display frame 45-1 in FIG. 3, characters corresponding to the syllables are displayed in the order in which they are to be sounded. The main areas 41 and 42 are used mainly for displaying lyrics.

Next, the operation is described with a focus on the singing order and the lyrics display. The lyrics text data included in the singing data 14a includes at least character information associated with each of the syllables of the selected song. The lyrics text data is data to be sung by the singing unit (the sound source 19, the effect circuit 20, and the sound system 21). The lyrics text data is divided in advance into a plurality of consecutive sections, each of which is called a "phrase". A phrase is a coherent unit delimited by meaning so that it is easy for the user to recognize, but the definition of a section is not limited to this. When a song is selected, the CPU 10 acquires the lyrics text data already divided into a plurality of phrases. Each phrase contains one or more syllables and the character information corresponding to those syllables.

When the electronic musical instrument 100 is started, the CPU 10 displays the character information of the first of the phrases of the selected song in the first main area 41 (FIG. 3) of the display unit 33. The first character of the first phrase is displayed in the leftmost display frame 45-1, and as many characters as the first main area 41 can hold are displayed. For the second phrase, as many characters as the second main area 42 can hold are displayed there. The keyboard KB serves as an instruction acquisition unit that acquires singing instructions. When a singing instruction is acquired through operation of the keyboard KB or the like, the CPU 10 causes the singing unit to sing the next syllable and advances the characters displayed in the first main area 41 as the syllables progress. The characters advance toward the left in FIG. 3, and characters that could not be displayed initially appear from the rightmost display frame 45 as the singing progresses. The cursor position indicates the syllable to be sung next and points to the syllable corresponding to the character displayed in the display frame 45-1 of the first main area 41. The lyrics displayed on the display unit 33 are updated in response to operation of the keyboard KB.

Note that one character does not necessarily correspond to one syllable. For example, for "だ" (da), which carries a voiced sound mark (dakuten), the two characters "た" (ta) and the dakuten mark together correspond to one syllable. The lyrics may also be in English. For example, the lyric "september" consists of the three syllables "sep", "tem", and "ber". "sep" is one syllable, and its three characters "s", "e", and "p" correspond to that one syllable. Because the character display advances in units of syllables, singing "だ" advances the display by two characters. In this way, the lyrics are not limited to Japanese and may be in other languages.

When all the syllables of the phrase displayed in the first main area 41 have been sounded, the CPU 10 displays in the first main area 41 the character information of the phrase following the one that was displayed there, and displays in the second main area 42 the character information of the phrase following the one that was displayed there. If no phrase follows the one that was displayed in the second main area 42, no characters are displayed in the second main area 42 (all display frames 45 are blank).

The advance operator 34 shown in FIG. 1 is an operator for moving the display forward by one phrase. Pressing and then releasing the advance operator 34 is an example of a phrase advance operation. The return operator 35 is an operator for moving the display back by one phrase. Pressing and then releasing the return operator 35 is an example of a phrase return operation. The phrase advance operation with the advance operator 34 and the phrase return operation with the return operator 35 are phrase designation operations (section designation operations) that designate the next sounding target phrase (sounding target section).

When the CPU 10 detects a phrase designation operation, it determines the next sounding target phrase. For example, when the CPU 10 detects a press of the advance operator 34 followed by its release, it determines the phrase immediately after the current phrase as the sounding target phrase. When it detects a press of the return operator 35 followed by its release, it determines the phrase immediately before the current phrase as the sounding target phrase. Pressing the advance operator 34 or the return operator 35 is the designation start operation of the phrase designation operation. Releasing the advance operator 34 or the return operator 35 is the designation end operation of the phrase designation operation.

In conjunction with determining the sounding target phrase, the CPU 10 executes lyrics display processing as follows. This lyrics display processing is executed according to a separate flowchart (not shown). When the CPU 10 detects a phrase advance operation, it moves the phrase display forward so that the determined sounding target phrase is displayed in the first main area 41. For example, the CPU 10 displays in the first main area 41 the character string that had been displayed in the second main area 42, and displays the character string of the following phrase in the second main area 42. If no phrase follows the one that was displayed in the second main area 42, no characters are displayed in the second main area 42 (all display frames 45 are blank). When the CPU 10 detects a phrase return operation, it moves the phrase display back so that the determined sounding target phrase is displayed in the first main area 41. For example, the CPU 10 displays in the first main area 41 the character information of the phrase immediately before the one that had been displayed there, and displays in the second main area 42 the character information of the phrase immediately before the one that had been displayed there.
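The two-row display update described above can be sketched as follows. This is a minimal illustration, not the flowchart the source refers to: the function names are hypothetical, and clamping the index at the first and last phrase is an assumption, since the source does not say what happens at the ends of the song.

```python
def move_phrase(phrase_count: int, current: int, step: int) -> int:
    """Return the new target phrase index; step is +1 (advance) or -1 (return).

    Clamping to the valid range is an assumption made for this sketch.
    """
    return max(0, min(phrase_count - 1, current + step))


def visible_phrases(phrases: list[str], current: int) -> tuple[str, str]:
    """Return the text for the first and second main areas.

    The first area shows the target phrase and the second shows the phrase
    after it, or an empty string when no further phrase exists.
    """
    first = phrases[current]
    second = phrases[current + 1] if current + 1 < len(phrases) else ""
    return first, second


phrases = ["phrase one", "phrase two", "phrase three"]
index = move_phrase(len(phrases), 0, +1)   # phrase advance operation
upper, lower = visible_phrases(phrases, index)
```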

Determining the sounding target phrase may take long enough for the user to notice. Because the next syllable cannot be sounded until the sounding target phrase is determined, this may cause a sense of incongruity. In the present embodiment, therefore, when a phrase advance or return operation (a section designation operation indicating the start of designation) is detected, the CPU 10 starts sounding a dummy sound (a predetermined singing sound) and continues it at least until the next sounding target phrase is determined. The dummy sound is a singing sound produced by singing synthesis, such as "ru"; its type does not matter, and the syllable information on which it is based is stored in the ROM 12 in advance. The syllable information for the dummy sound may instead accompany the singing data 14a. In the singing data 14a, syllable information for a dummy sound may also be attached to each phrase, so that a dummy sound corresponding to the current or the next sounding target phrase is generated. Alternatively, several pieces of syllable information for dummy sounds may be stored, and the dummy sound may be generated based on the singing sound that was sounded immediately before.
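The press and release handling described above can be pictured as a small state holder. The sketch below is an illustration under stated assumptions: the class and attribute names are hypothetical, the index is clamped at the ends of the song, and the dummy sound is stopped at the moment the phrase is determined, whereas the source only requires that it continue at least until then.

```python
from dataclasses import dataclass


@dataclass
class PhraseSelector:
    """Tracks the target phrase and whether a dummy sound is playing."""
    phrase_count: int
    current: int = 0
    determined: bool = True
    dummy_sounding: bool = False
    pending_step: int = 0

    def press(self, step: int) -> None:
        """Designation start: step is +1 (advance) or -1 (return)."""
        self.determined = False      # target phrase becomes undetermined
        self.dummy_sounding = True   # start the dummy sound
        self.pending_step = step

    def release(self) -> None:
        """Designation end: fix the next target phrase."""
        if self.determined:
            return                   # no designation in progress
        target = self.current + self.pending_step
        self.current = max(0, min(self.phrase_count - 1, target))
        self.determined = True
        self.dummy_sounding = False  # assumption: stop the dummy sound here


selector = PhraseSelector(phrase_count=3)
selector.press(+1)                   # advance operator pressed
state_while_held = (selector.determined, selector.dummy_sounding)
selector.release()                   # advance operator released
```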

FIG. 4 is a flowchart showing an example of the flow of processing when a performance is given on the electronic musical instrument 100. The processing described here covers the case where the user selects a song and then plays the selected song. To simplify the description, it is assumed that only a single note is output even when several keys are operated at the same time. In that case, only the highest of the pitches of the simultaneously operated keys may be processed, or only the lowest. The processing described below is realized, for example, by the CPU 10 executing a program stored in the ROM 12 or the RAM 13. In the processing shown in FIG. 4, the CPU 10 serves as the data acquisition unit, the detection unit, the sounding control unit, and the determination unit.
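The single-note simplification can be expressed in a few lines. This is a sketch only; the function name and the representation of pitches as MIDI note numbers are assumptions made for the example.

```python
from typing import Optional


def select_pitch(pressed: list[int], prefer_highest: bool = True) -> Optional[int]:
    """Pick one pitch from the keys held at the same time.

    Pitches are given as note numbers; returns None when no key is held.
    """
    if not pressed:
        return None
    return max(pressed) if prefer_highest else min(pressed)


chord = [60, 64, 67]                       # three keys held together
top = select_pitch(chord)                  # highest-note priority
bottom = select_pitch(chord, prefer_highest=False)
```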

When the power is turned on, the CPU 10 waits until an operation selecting the song to be played is received from the user (step S101). If no song is selected within a certain time, the CPU 10 may determine that the default song has been selected. On accepting the selection, the CPU 10 reads the lyrics text data of the singing data 14a for the selected song, and sets the cursor position to the first syllable described in the lyrics text data (step S102). Here, the cursor is a virtual index indicating the position of the syllable to be sounded next. Next, the CPU 10 determines whether a note-on based on operation of the keyboard KB has been detected (step S103). If no note-on is detected, the CPU 10 determines whether a note-off has been detected (step S109). If a note-on is detected, that is, if a new key press is detected, the CPU 10 stops any sound that is currently being output (step S104). The sound in this case may be a dummy sound. The CPU 10 then determines whether the next sounding target phrase is in the determined state (step S105). During normal operation, in which the sung syllables are advanced one by one as singing instructions (note-ons) are acquired, the sounding target phrase is in the determined state. In that case, the CPU 10 executes output sound generation processing, which sounds the singing voice corresponding to the note-on (step S107).

The output sound generation processing is as follows. The CPU 10 first reads the speech unit data (waveform data) of the syllable at the cursor position and outputs the sound of the waveform indicated by that data at the pitch corresponding to the note-on. Specifically, the CPU 10 obtains the difference between the pitch indicated by the unit pitch data included in the speech unit data and the pitch corresponding to the operated key, and shifts the spectrum distribution indicated by the waveform data along the frequency axis by the frequency corresponding to this difference. This allows the electronic musical instrument 100 to output the singing voice at the pitch corresponding to the operated key. Next, the CPU 10 updates the cursor position (read position) (step S108) and advances the processing to step S109.
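The pitch adjustment described above, taking the difference between the unit's pitch and the key's pitch and moving the spectrum along the frequency axis by that amount, might look like the following simplified sketch. Several things here are assumptions and not taken from the source: the spectrum is modeled as a list of magnitudes on evenly spaced frequency bins, key pitches are converted from note numbers using equal temperament with A4 = 440 Hz, and bins shifted outside the range are simply dropped.

```python
def midi_to_hz(note: float) -> float:
    """Convert a note number to frequency (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def shift_spectrum(spectrum: list[float], bin_hz: float,
                   unit_hz: float, target_hz: float) -> list[float]:
    """Shift magnitude bins along the frequency axis by (target_hz - unit_hz).

    bin_hz is the spacing between adjacent bins. The shift is rounded to a
    whole number of bins, and bins that leave the range are discarded.
    """
    shift = round((target_hz - unit_hz) / bin_hz)
    shifted = [0.0] * len(spectrum)
    for index, magnitude in enumerate(spectrum):
        new_index = index + shift
        if 0 <= new_index < len(shifted):
            shifted[new_index] = magnitude
    return shifted


# A unit recorded at 100 Hz is moved up to a target of 120 Hz (10 Hz bins).
result = shift_spectrum([0.0, 1.0, 0.0, 0.0], bin_hz=10.0,
                        unit_hz=100.0, target_hz=120.0)
```

A practical synthesizer would interpolate between bins and preserve the formant envelope; the sketch only shows the difference-and-shift step named in the text.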

The determination of the cursor position and the sounding of the singing sound in steps S107 and S108 are now described using a specific example. The update of the cursor position is described first. FIG. 5 shows an example of the lyrics text data. In the example of FIG. 5, the lyrics text data describes lyrics of five syllables c1 to c5. The characters "ha" (は), "ru" (る), "yo" (よ), "ko" (こ), and "i" (い) are each a single Japanese hiragana character, and each character corresponds to one syllable. The CPU 10 updates the cursor position in units of syllables. For example, when the cursor is located at syllable c3, the CPU 10 reads out the speech unit data corresponding to "yo" from the data storage unit 14 and sounds the singing sound "yo". When the sounding of "yo" ends, the CPU 10 moves the cursor position to the next syllable c4. In this way, the CPU 10 moves the cursor position to the next syllable in sequence in response to each note-on.
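The cursor behavior in this example can be sketched as follows. The class name `LyricCursor` and its method are hypothetical; only the five syllables and the one-syllable-per-note-on advance come from the description.

```python
from dataclasses import dataclass


@dataclass
class LyricCursor:
    syllables: list
    position: int = 0  # index of the syllable to be sounded next

    def sing_next(self):
        """Return the syllable at the cursor, then advance the cursor by one syllable."""
        if self.position >= len(self.syllables):
            return None  # assumption: nothing is sung past the last syllable
        syllable = self.syllables[self.position]
        self.position += 1
        return syllable


# Syllables c1 to c5 of FIG. 5, with the cursor placed on c3 ("yo").
cursor = LyricCursor(["ha", "ru", "yo", "ko", "i"], position=2)
```

Calling `cursor.sing_next()` here returns "yo" and leaves the cursor on c4.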

Next, the sounding of the singing sound is described. FIG. 6 shows an example of the types of speech unit data. To sound the syllable corresponding to the cursor position, the CPU 10 extracts the speech unit data corresponding to that syllable from the phonological information database. There are two types of speech unit data: phoneme chain data and stationary part data. Phoneme chain data represents a speech unit in which the sound changes, such as "silence (#) to consonant", "consonant to vowel", or "vowel to consonant or vowel (of the next syllable)". Stationary part data represents a speech unit in which the sounding of a vowel continues. For example, when the cursor position is set to "ha" of syllable c1, the sound source 19 selects the phoneme chain data "#-h" corresponding to "silence → consonant h", the phoneme chain data "h-a" corresponding to "consonant h → vowel a", and the stationary part data "a" corresponding to "vowel a". Then, when the performance starts and a key press is detected, the CPU 10 outputs a singing sound based on the phoneme chain data "#-h", the phoneme chain data "h-a", and the stationary part data "a" at the pitch corresponding to the operated key and with the velocity corresponding to the operation. The cursor position is determined and the singing sound is sounded in this way.
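The selection of speech units for one syllable can be sketched as below. This is an illustration only: it assumes the syllable is given in romanized form, contains a vowel from "aiueo", and is sung after silence. The handling of a vowel-only syllable such as "i" is an assumption, since the description only works through "ha".

```python
from __future__ import annotations

VOWELS = "aiueo"


def select_speech_units(syllable: str) -> list[str]:
    """Return the names of the phoneme chain data and stationary part data for a syllable."""
    # Split the romanized syllable at its first vowel.
    first_vowel = next(i for i, ch in enumerate(syllable) if ch in VOWELS)
    consonant, vowel = syllable[:first_vowel], syllable[first_vowel:]
    if consonant:
        # silence -> consonant, then consonant -> vowel
        chains = [f"#-{consonant}", f"{consonant}-{vowel}"]
    else:
        # assumed case: silence -> vowel directly
        chains = [f"#-{vowel}"]
    # The stationary part sustains the vowel.
    return chains + [vowel]
```

For "ha" this yields "#-h", "h-a", and "a", matching the example in the description.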

If, on the other hand, the determination in step S105 shows that the next sounding target phrase is in the undetermined state, the CPU 10 generates the output sound of a dummy sound at the pitch of the note-on detected in step S103 and outputs the dummy sound (step S106). At this point, a dummy sound has already been output in step S115, described later, in response to the designation start operation. Accordingly, if the pitch of the dummy sound being output differs from the pitch of the note-on detected in step S103, the CPU 10 generates the output sound of the dummy sound so that the dummy sound being output is corrected to the pitch of the note-on detected in step S103. After the dummy sound is output, the performer can therefore correct its pitch by pressing a key until the next phrase is determined. The processing then proceeds to step S109.
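The two branches taken on a note-on (steps S104 to S108) can be sketched as follows. The class `Player`, the tuple representation of the current sound, and the dummy syllable "ru" are assumptions made for illustration; the guard for a cursor past the last syllable is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DUMMY_SYLLABLE = "ru"  # assumed dummy syllable


@dataclass
class Player:
    syllables: list
    cursor: int = 0
    phrase_determined: bool = True
    sounding: Optional[Tuple[str, str, int]] = None  # (kind, syllable, pitch)

    def note_on(self, pitch: int) -> None:
        # S104: stop whatever is sounding, including a dummy sound.
        self.sounding = None
        if self.phrase_determined:
            if self.cursor >= len(self.syllables):
                return  # assumption: nothing left to sing
            # S107: sing the syllable at the cursor at the note-on pitch.
            self.sounding = ("sing", self.syllables[self.cursor], pitch)
            # S108: advance the cursor.
            self.cursor += 1
        else:
            # S106: regenerate the dummy sound at the note-on pitch.
            self.sounding = ("dummy", DUMMY_SYLLABLE, pitch)
```

While the phrase is undetermined, repeated key presses only change the pitch of the dummy sound and leave the cursor where it is.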

If no note-off is detected in step S109 of FIG. 4, the CPU 10 advances the processing to step S112. If a note-off is detected, the CPU 10 determines whether the next sounding target phrase is in the determined state (step S110). During normal operation, in which the singing syllable is advanced in order each time a singing instruction (note-on) is acquired, the sounding target phrase is in the determined state. In this case, therefore, the CPU 10 stops any sound currently being output (step S111) and advances the processing to step S112. If the determination in step S110 shows that the next sounding target phrase is in the undetermined state, the CPU 10 advances the processing to step S112 without stopping the sound. In step S112, the CPU 10 determines whether a designation start operation (a press of the advance operator 34 or the return operator 35) has been detected. If no designation start operation is detected, the CPU 10 determines whether a designation end operation (a release of the advance operator 34 or the return operator 35) has been detected (step S116). If no designation end operation is detected, the CPU 10 advances the processing to step S121.

If the determination in step S112 shows that a designation start operation has been detected, the CPU 10 stops any sound currently being output (step S113) and sets the sounding target phrase to the undetermined state (step S114). The CPU 10 manages the undetermined and determined states of the sounding target phrase by, for example, setting a predetermined flag to 0 or 1. Next, the CPU 10 automatically generates a dummy sound and starts outputting it (step S115). Sounding of the dummy sound thus starts in response to the designation start operation. The processing then proceeds to step S116.

If the determination in step S116 shows that a designation end operation has been detected, the CPU 10 determines the next sounding target phrase based on the designation start operation detected in step S112 and the designation end operation (step S117). For example, as described above, if the CPU 10 detects a press of the advance operator 34 in step S112 and then detects a release of the advance operator 34 in step S116, it determines the phrase immediately after the current phrase as the sounding target phrase. Next, the CPU 10 updates the read position, that is, it moves the cursor position to the first syllable of the determined sounding target phrase (step S118). As a result, when a singing instruction is acquired in step S103 after the next sounding target phrase has been determined, the syllable at the head of that phrase is sung, so singing can move on to the determined phrase immediately. The cursor position in the determined sounding target phrase may be updated to any predetermined position and need not be the head position. The CPU 10 then sets the sounding target phrase to the determined state (step S119) and stops the dummy sound being output (step S120). Sounding of the dummy sound thus ends once the sounding target phrase has been determined. The processing then proceeds to step S121.
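The press and release handling in steps S113 to S120 can be sketched as a small state holder. The class `PhraseSelector`, the operator names "advance" and "return", and the clamping of the phrase index at both ends of the song are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhraseSelector:
    phrase_count: int
    current_phrase: int = 0
    cursor: int = 0                 # syllable index within the current phrase
    determined: bool = True         # the flag managed by the CPU 10
    dummy_on: bool = False
    pressed: Optional[str] = None   # "advance" or "return"

    def press(self, operator: str) -> None:
        """Designation start operation (S113 to S115)."""
        self.pressed = operator
        self.determined = False     # S114: phrase becomes undetermined
        self.dummy_on = True        # S115: dummy sound starts

    def release(self, operator: str) -> None:
        """Designation end operation (S117 to S120)."""
        if self.pressed != operator:
            return
        step = 1 if operator == "advance" else -1
        # S117: determine the next phrase (clamped; an assumption of this sketch).
        target = self.current_phrase + step
        self.current_phrase = max(0, min(self.phrase_count - 1, target))
        self.cursor = 0             # S118: cursor to the head syllable
        self.determined = True      # S119
        self.dummy_on = False       # S120: dummy sound stops
        self.pressed = None
```

The dummy sound is on for exactly the interval between `press` and `release`, which is the period in which the target phrase is undetermined.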

In step S121, the CPU 10 executes other processing. For example, if sounding of the dummy sound has continued for a certain period of time or longer, the CPU 10 generates and outputs the same dummy sound again. Thus, for example, when the dummy sound "ru" continues for a long time, the same syllable can be sounded repeatedly, as in "ru-ru-ru". The CPU 10 then determines whether the performance has ended (step S122) and, if it has not, returns the processing to step S103. If the performance has ended, the CPU 10 stops any sound currently being output (step S123) and ends the processing shown in FIG. 4. The CPU 10 can determine whether the performance has ended based on, for example, whether the last syllable of the selected song has been sounded or whether an operation to end the performance has been performed with the other operators 16.
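The repetition of the dummy sound can be illustrated with a short sketch. The function name, the fixed retrigger period, and the hyphen-joined text representation are assumptions; the description only states that the same dummy sound is generated again after a certain period.

```python
def dummy_lyric(elapsed_s: float, retrigger_s: float, syllable: str = "ru") -> str:
    """Text of the dummy sound after elapsed_s seconds, retriggered every retrigger_s seconds."""
    if retrigger_s <= 0:
        raise ValueError("retrigger_s must be positive")
    # One initial sounding plus one more for each full period that has elapsed.
    repeats = 1 + int(elapsed_s // retrigger_s)
    return "-".join([syllable] * repeats)
```

With a one-second period, a dummy sound held for 2.5 seconds would read as "ru-ru-ru".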

According to the present embodiment, a dummy sound (a predetermined singing sound) different from the singing sound based on a singing instruction is sounded in response to detection of a phrase designation operation. Even though sounding of the syllable based on the original singing instruction stops each time the phrase is switched, the dummy sound is sounded instead, which reduces the sense of discomfort when the sounding section is switched. In particular, sounding of the dummy sound starts in response to detection of the designation start operation and continues at least until the next sounding target section is determined, so silence at the time the sounding section is switched is avoided. In addition, because the sounding target phrase is determined by the designation end operation, sounding of the dummy sound can continue while the user is performing the phrase designation operation.

In addition, when an instruction designating a pitch is acquired while the dummy sound is sounding, the CPU 10 changes the sounding pitch of the dummy sound to the designated pitch (step S106). Correcting the pitch of the dummy sound in this way further reduces the sense of discomfort.

(Second embodiment)
In the first embodiment, the dummy sound is stopped as soon as the sounding target phrase enters the determined state. In the second embodiment of the present invention, by contrast, a dummy sound that has started sounding continues until the first note-on after the sounding target phrase is determined. To achieve this, step S120 in FIG. 4 is simply omitted. The dummy sound being output is then stopped in step S104 by the first note-on after the sounding target phrase enters the determined state. The dummy sound, once started, can therefore continue without interruption until the note-on that follows determination of the sounding target phrase.
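The difference between the first and second embodiments can be seen by replaying the same events with and without step S120. This is a sketch under assumptions: the event names and the function `dummy_trace` are hypothetical, and only the dummy sound's on/off state is modeled.

```python
def dummy_trace(events, with_step_s120: bool):
    """Return whether the dummy sound is on after each event.

    events is a sequence of "press", "release", and "note_on".
    """
    dummy_on = False
    determined = True
    trace = []
    for event in events:
        if event == "press":
            # S114/S115: phrase undetermined, dummy sound starts.
            dummy_on, determined = True, False
        elif event == "release":
            # S119: phrase determined; S120 stops the dummy only if the step is kept.
            determined = True
            if with_step_s120:
                dummy_on = False
        elif event == "note_on" and determined:
            # S104: the note-on stops any sound, including the dummy sound.
            dummy_on = False
        trace.append(dummy_on)
    return trace
```

For the sequence press, release, note-on, the first embodiment gives on, off, off, while the second embodiment gives on, on, off, so the dummy sound bridges the gap up to the next sung note.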

The present embodiment is effective, for example, in a design in which a single operation completes both the designation start operation and the designation end operation. For example, the present invention may be applied to a design in which simply pressing the advance operator 34 or the return operator 35 indicates both the designation start operation and the designation end operation, and the release has no meaning.

(Third embodiment)
In the first embodiment, when a note-on occurs after the dummy sound has started, the dummy sound is sounded again at a changed pitch so that its pitch is corrected to the pitch of the note-on (step S106). In the third embodiment of the present invention, by contrast, the dummy sound is neither regenerated nor sounded again even if a note-on occurs after the dummy sound has started.

FIG. 7 is a part of a flowchart showing an example of the flow of processing when a performance is given on the electronic musical instrument 100 according to the third embodiment of the present invention. The processing before step S103 and after step S109 is the same as in the flowchart of FIG. 4 and is therefore not shown. Steps S105 and S106 are eliminated.

When a note-on is detected in step S103, the CPU 10 determines whether a dummy sound is being sounded (step S201). If no dummy sound is being sounded, the CPU 10 executes steps S104, S107, and S108 and advances the processing to step S109. The sound being produced for the previous note-on is therefore stopped, and the singing sound for the current note-on is sounded. Note that the dummy sound being stopped means that the sounding target phrase has been determined. If, on the other hand, a dummy sound is being sounded, the CPU 10 advances the processing to step S109. Thus, while a dummy sound is being sounded, a note-on produces no sound of its own, and the dummy sound continues without any pitch correction.
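The note-on handling of the third embodiment can be sketched as below. The dictionary keys and the function name are hypothetical; the sketch assumes the cursor points at a valid syllable.

```python
def note_on_third_embodiment(state: dict, pitch: int) -> dict:
    """Handle a note-on as in FIG. 7 and return the new state."""
    if state["dummy_on"]:
        # S201 yes: the note-on is ignored and the dummy sound continues unchanged.
        return state
    new_state = dict(state)
    # S104 and S107: replace the previous sound with the syllable at the cursor.
    new_state["sounding"] = (state["syllables"][state["cursor"]], pitch)
    # S108: advance the cursor.
    new_state["cursor"] = state["cursor"] + 1
    return new_state
```

Compared with the first embodiment, the only state that matters on a note-on is whether the dummy sound is on, so no separate check of the determined flag is needed.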

The form of the phrase designation operation is not limited to the examples given, and many variations are possible. For example, as mentioned in the second embodiment, a single press of a designation operator such as the advance operator 34 or the return operator 35 may indicate both the designation start operation and the designation end operation and thereby determine the sounding target phrase. The phrase reached by one set of operations is not limited to an adjacent phrase; the CPU 10 may skip over a plurality of phrases when determining the sounding target phrase. The designation start operation and the designation end operation may also be completed by holding down a designation operator for a certain period of time. In that case, the CPU 10 may determine the destination phrase according to the duration of the long press. The CPU 10 may also determine the destination sounding target phrase according to the number of times the designation operator is pressed and released within a certain period of time. Alternatively, the destination sounding target phrase may be designated by a combination of operations of a designation operator and another operator. Further, operating a designation operator in a predetermined manner may cause the first phrase of the selected song to be determined as the sounding target phrase regardless of the current phrase.

The determination of the sounding target phrase and the setting of the cursor position within the determined phrase may also be handled as follows. For example, when a phrase designation operation is performed with the advance operator 34 during the last phrase of the selected song, the CPU 10 may determine the first phrase of the selected song as the sounding target phrase and set the cursor to the first syllable of that phrase. Likewise, when a phrase designation operation is performed with the return operator 35 during the first phrase, the CPU 10 may determine the first phrase of the selected song as the sounding target phrase and set the cursor to the first syllable of that phrase.
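The boundary handling described above can be sketched as a single function. The function name and the operator labels "advance" and "return" are hypothetical; phrases are indexed from 0.

```python
def next_phrase(current: int, phrase_count: int, operator: str) -> int:
    """Index of the sounding target phrase after one phrase designation operation."""
    if operator == "advance":
        # Advancing from the last phrase goes to the first phrase.
        return 0 if current == phrase_count - 1 else current + 1
    if operator == "return":
        # Returning from the first phrase stays on the first phrase.
        return 0 if current == 0 else current - 1
    raise ValueError(f"unknown operator: {operator}")
```

In both boundary cases the result is phrase 0, and the cursor would then be set to the first syllable of that phrase.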

The singing data 14a of the selected song only needs to be acquired in a form divided into a plurality of phrases; it need not be acquired song by song and may instead be acquired phrase by phrase. The manner in which the singing data 14a is stored in the data storage unit 14 is likewise not limited to song units. The source of the singing data 14a is not limited to the storage unit; it may be an external device accessed through the communication I/F 22. The singing data 14a may also be acquired by the CPU 10 as a result of the user editing or creating it on the electronic musical instrument 100.

The present invention has been described in detail above based on preferred embodiments. However, the present invention is not limited to these specific embodiments, and various forms that do not depart from the gist of the invention are also included in the present invention.

The same effects may be obtained by having the instrument read a storage medium that stores a control program, in the form of software, for achieving the present invention. In that case, the program code read from the storage medium itself realizes the novel functions of the present invention, and the non-transitory computer-readable recording medium storing the program code constitutes the present invention. The program code may also be supplied via a transmission medium or the like, in which case the program code itself constitutes the present invention. In addition to a ROM, the storage medium in these cases may be a floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, or the like. The "non-transitory computer-readable recording medium" also includes media that hold a program for a certain period of time, such as volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system that serves as a server or client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.

10 CPU (data acquisition unit, detection unit, sounding control unit, determination unit)
14a Singing data

Claims

A sound generating device comprising:
a data acquisition unit that acquires singing data composed of a plurality of continuous sections each including syllable information on which sounding is based;
a detection unit that detects a section designation operation designating a next sounding target section in the singing data acquired by the data acquisition unit; and
a sounding control unit that sounds a predetermined singing sound, different from a singing sound based on a singing instruction, in response to detection of the section designation operation by the detection unit.

The sound generating device according to claim 1, further comprising a determination unit that determines the next sounding target section based on the section designation operation detected by the detection unit,
wherein the sounding control unit starts sounding the predetermined singing sound in response to detection by the detection unit of a section designation operation indicating designation start, and continues sounding the predetermined singing sound at least until the next sounding target section is determined by the determination unit.

The sound generating device according to claim 2, wherein the determination unit determines the next sounding target section in response to detection of a section designation operation indicating designation completion by the detection unit.

The sound generating device according to any one of claims 1 to 3, further comprising an instruction acquisition unit that acquires the singing instruction,
wherein the sounding control unit sings, in a predetermined order, syllable information among the plurality of pieces of syllable information in the singing data in response to the singing instruction being acquired by the instruction acquisition unit.

The sound generating device according to claim 4, wherein, after the next sounding target section is determined, the sounding control unit sings syllable information corresponding to a predetermined position in the next sounding target section in response to the singing instruction being acquired by the instruction acquisition unit.

The sound generating device according to claim 5, wherein, after the next sounding target section is determined, the sounding control unit continues sounding the predetermined singing sound until singing of the syllable information corresponding to the predetermined position is started in response to the singing instruction being acquired by the instruction acquisition unit.

The sound generating device according to any one of claims 1 to 6, wherein, when an instruction designating a pitch is acquired while the predetermined singing sound is sounding, the sounding control unit changes the sounding pitch of the predetermined singing sound to the designated pitch.

A sound generation method comprising:
a data acquisition step of acquiring singing data composed of a plurality of continuous sections each including syllable information on which sounding is based;
a detection step of detecting a section designation operation designating a next sounding target section in the singing data acquired in the data acquisition step; and
sounding a predetermined singing sound, different from a singing sound based on a singing instruction, in response to detection of the section designation operation in the detection step.