JPH1031496A

JPH1031496A - Musical sound generating device

Info

Publication number: JPH1031496A
Application number: JP8184679A
Authority: JP
Inventors: Takeshi Terao; 健寺尾
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 1996-07-15
Filing date: 1996-07-15
Publication date: 1998-02-03

Abstract

PROBLEM TO BE SOLVED: To provide a musical sound generating device that can generate musical sound in which a synthesized human voice resembles a natural singing voice. SOLUTION: A CPU 4 sets musical sound information consisting of note-on and note-off information, a pitch frequency, and a velocity in a DSP 9 according to externally supplied MIDI data, reads the speech synthesis parameters of the syllable corresponding to the text (one word) associated with the musical sound to be turned on out of a database in a data ROM 8, and sets them in a RAM 10. Consequently, the DSP 9 sounds a human voice, produced by PARCOR (partial autocorrelation) synthesis according to the given speech synthesis parameters, as natural singing in accordance with the musical sound information.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech synthesis device, and more particularly to a musical sound generating device capable of generating a natural singing voice.

[0002]

2. Description of the Related Art

Channel vocoders, linear prediction, and PARCOR (partial autocorrelation) have long been known as techniques for synthesizing a human voice from feature parameters extracted by speech analysis. These speech synthesis techniques focus on how to convert the analyzed speech into as little information as possible. That is, they analyze speech, convert it into feature parameters, and compress the amount of information by removing redundant components unrelated to the meaning of the words. They were not conceived for high-quality speech synthesis or for applying the synthesized human voice to musical sound formation.

[0003] Among these, the channel vocoder has a simple structure and is well suited to real-time analysis and synthesis, so it was applied to musical sound generating devices that synthesize musical sounds based on the power spectrum envelope of speech extracted by a filter bank. However, the channel vocoder could not achieve high-quality speech synthesis because of the limited number of band-pass filter stages in the filter bank and problems such as the inability to synthesize consonants, and it eventually fell out of use.

[0004]

Problems to Be Solved by the Invention

A conventional musical sound generating device of the waveform-memory readout type, on the other hand, stores a sampled human voice in waveform memory. Reading it out and reproducing it at the pitch at which it was sampled yields a high-quality human voice in the simplest possible way. If, however, the voice is read out and reproduced at a pitch different from the sampling pitch, the formant frequencies of the human voice shift according to the amount of pitch conversion, so a natural singing voice cannot be generated.

[0005] Accordingly, an object of the present invention is to provide a musical sound generating device capable of forming a speech-synthesized human voice into musical sound as a natural singing voice.

[0006]

Means for Solving the Problems

To achieve the above object, the invention according to claim 1 comprises: parameter storage means for storing speech synthesis parameters for each of a plurality of syllables; music information generating means for generating musical sound information representing each musical sound of a piece of music and utterance information representing the lyrics associated with each musical sound; parameter generating means for reading out from the parameter storage means, in accordance with the utterance information generated by the music information generating means, the speech synthesis parameters of the syllable matching the lyrics, and outputting them; and speech synthesis means for sounding a human voice, synthesized on the basis of the speech synthesis parameters output by the parameter generating means, as a singing voice in accordance with the musical sound information.

[0007] According to the invention of claim 2, which depends on claim 1, the parameter storage means includes, in the speech synthesis parameters of each syllable, an identifier indicating the end of the data, and the speech synthesis means, upon detecting this identifier and determining that the speech synthesis parameters have reached the end of the data, sustains the singing voice being synthesized.

[0008] Similarly, according to the invention of claim 3, which depends on claim 1, the speech synthesis means variably controls the tone quality of the synthesized human voice in accordance with the velocity included in the musical sound information.

[0009] According to the invention of claim 4, which depends on claim 1, the parameter generating means expresses the utterance information as an exclusive message of MIDI data.

[0010] Further, according to the invention of claim 5, which depends on claim 1, the speech synthesis means controls the sounding and silencing of the singing voice by the note-on and note-off included in the musical sound information.

[0011] In the present invention, speech synthesis parameters for each of a plurality of syllables are stored in the parameter storage means. When the music information generating means generates musical sound information representing each musical sound of a piece of music and utterance information representing the lyrics associated with each musical sound, the parameter generating means reads out, in accordance with the utterance information, the speech synthesis parameters of the syllable matching the lyrics from the parameter storage means. The speech synthesis means then synthesizes a human voice on the basis of the speech synthesis parameters output from the parameter generating means and sounds it as a singing voice in accordance with the musical sound information. This makes it possible to form the speech-synthesized human voice into musical sound as a natural singing voice.

[0012]

BEST MODE FOR CARRYING OUT THE INVENTION

The musical sound generating device according to the present invention can be applied to electronic musical instruments as well as to devices that provide voice guidance with a human voice. A musical sound generating device according to an embodiment of the present invention is described below, as an example, with reference to the drawings.

[0013]

A. Configuration of the embodiment

(1) Overall configuration

FIG. 1 is a block diagram showing the overall configuration of a musical sound generating device according to one embodiment of the present invention. In the figure, reference numeral 1 denotes a keyboard unit, which generates performance information consisting of a key-on/key-off signal, a key code, and a velocity signal in response to key press and release operations. Reference numeral 2 denotes a group of panel switches arranged on the operation panel. The panel switch group 2 includes a power switch for turning the power on and off, a mode switch for selecting the operation mode, and the like, and generates operation signals corresponding to the operation of these switches.

[0014] The operation mode referred to here is either a keyboard mode, in which the speech-synthesized human voice is sounded as a singing voice according to the performance information output from the keyboard unit 1 in response to key press and release operations, or a MIDI mode, in which the singing voice is sounded according to MIDI data supplied from an external MIDI device via the MIDI interface circuit 7 (described later). This embodiment describes the MIDI mode.

[0015] Next, reference numeral 3 denotes a display unit composed of a liquid crystal panel or the like, which displays the settings and operating states corresponding to operation of the panel switch group 2. Reference numeral 4 denotes a CPU that controls each part of the device; its characteristic operation is described in detail later. The CPU 4 transfers the speech synthesis program stored in the ROM 5 to the DSP 9 (described later). It also reads the speech synthesis parameters corresponding to the MIDI data supplied from outside via the MIDI interface circuit 7 out of the data ROM 8 (described later) and transfers them to the RAM 10.

[0016] The ROM 5 stores the various control programs and control data loaded into the CPU 4, as well as the speech synthesis program executed by the DSP 9 (described later). The ROM 5 also stores a conversion table TBL for converting note numbers in MIDI data into pitch frequencies. As shown in FIG. 2, the conversion table TBL outputs the corresponding pitch frequency PF when a note number is used as the read address.
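As a rough illustration of the conversion table TBL, the sketch below builds a 128-entry lookup indexed by MIDI note number. The patent does not state the tuning, so 12-tone equal temperament with note 69 at 440 Hz is an assumption here, as are the names `build_tbl`, `TBL`, and `note_to_pf`.

```python
def build_tbl(a4_hz: float = 440.0) -> list[float]:
    """Return a table whose index is a MIDI note number (0-127) and whose
    value is a pitch frequency PF in Hz.
    Assumption: 12-tone equal temperament with note 69 tuned to a4_hz."""
    return [a4_hz * 2.0 ** ((note - 69) / 12.0) for note in range(128)]


TBL = build_tbl()


def note_to_pf(note_number: int) -> float:
    # The note number acts as the read address into the table.
    return TBL[note_number]
```

Precomputing the table mirrors the ROM lookup described in the text: the CPU only indexes the table at note-on time and never evaluates the exponential.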

[0017] Reference numeral 6 denotes a RAM that serves as the work area of the CPU 4 and temporarily stores various registers and flag data. The RAM 6 also has a buffer area that temporarily stores the serial-format MIDI data supplied from outside via the MIDI interface circuit 7. Under the direction of the CPU 4, the MIDI interface circuit 7 takes in the MIDI data supplied from an external MIDI device and stores it in the buffer area of the RAM 6 via the bus B. The MIDI interface circuit 7 also sends out the MIDI data stored in the buffer area of the RAM 6 in serial format.

[0018] Speech synthesis parameters for each syllable are registered as a database in the data ROM 8. The speech synthesis parameters consist of the residual Z and the PARCOR coefficients K1 to Kn obtained by PARCOR analysis. The data ROM 8 stores, in table form, the speech synthesis parameters for each syllable, that is, for each kana character.

[0019] The data structure of the speech synthesis parameters stored in the data ROM 8 is now described with reference to FIG. 3. In the figure, DB[m, n] is a header representing a sound number m and a coefficient set number n. This header DB[m, n] makes it possible to look up the residual Z and PARCOR coefficients K1 to Kn of the desired syllable. The sound number m in the header DB[m, n] specifies a syllable (one word). The coefficient set number n specifies the PARCOR coefficients K1 to Kn and the residual Z used in synthesizing the syllable specified by the sound number m.

[0020] For example, DB[0, 0] to DB[0, n-1] shown in FIG. 3 hold, in time series, the values of the residual Z and PARCOR coefficients K1 to Kn obtained by PARCOR analysis of the syllable "a" (あ), which is assigned sound number 0. Normally, the coefficient sets are values obtained by PARCOR analysis at time intervals of about 5 to 20 msec. If the PARCOR coefficients K1 to Kn and residual Z specified by the coefficient set number n are set in the synthesis filter (described later) at this same interval, the same formant sound as the analyzed one is synthesized. A distinctive feature of the speech synthesis parameters in this database is that the final entry DB[0 to m, n] of each sound number m carries an identifier, a residual Z of "-1", that marks the end of the data. Its purpose is described later.
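The layout described above can be sketched as a mapping from sound number m to a time-ordered list of coefficient sets closed by the end-of-data identifier. This is only an illustrative model: the class name `CoefficientSet`, the helper `frames`, and all numeric values are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

END_OF_DATA = -1.0  # residual value used as the end-of-data identifier


@dataclass(frozen=True)
class CoefficientSet:
    z: float              # residual Z for this analysis frame
    k: tuple[float, ...]  # PARCOR coefficients K1..Kn for this frame


# Hypothetical contents: sound number 0 with two frames, then the terminator.
DB = {
    0: [
        CoefficientSet(0.8, (0.6, -0.2)),
        CoefficientSet(0.7, (0.5, -0.1)),
        CoefficientSet(END_OF_DATA, ()),
    ],
}


def frames(db, m):
    """Yield the coefficient sets of sound number m up to, but not
    including, the end-of-data identifier."""
    for entry in db[m]:
        if entry.z < 0:
            return
        yield entry
```

Using a negative residual as the sentinel works because a residual gain is never negative, so no separate length field is needed per syllable.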

[0021] Returning to FIG. 1, the description of the configuration continues. In FIG. 1, the DSP 9 generates a human voice by performing PARCOR synthesis based on the speech synthesis program and speech synthesis parameters transferred from the CPU 4. The PARCOR synthesis algorithm is described later. Reference numeral 10 denotes a RAM used as the work area of the DSP 9, in which the speech synthesis parameters (residual Z and PARCOR coefficients K1 to Kn) read from the data ROM 8 by the CPU 4 are set.

[0022] The speech data synthesized by the DSP 9 is converted into an analog audio signal by the D/A converter 11 in the next stage. The audio signal output from the D/A converter 11 is filtered by the amplifier 12 to remove unwanted noise and the like, then amplified and emitted from the speaker SP as a natural singing voice.

[0023]

(2) PARCOR synthesis algorithm (functional configuration) of the DSP 9

Next, with reference to FIG. 4, the PARCOR synthesis algorithm (functional configuration) of the DSP 9, which forms a singing voice from a human voice in accordance with the musical sound information supplied from the CPU 4, is described. The musical sound information referred to here consists of the note-on NON, note-off NOF, and velocity data VEL that the CPU 4 extracts from the MIDI data, together with the pitch frequency PF obtained by converting the note number with the conversion table TBL described above.

[0024] In FIG. 4, reference numeral 20 denotes a pulse generator, which outputs a pulse waveform PW at a period corresponding to the pitch frequency PF. Reference numeral 21 denotes a noise generator that generates white noise WN. SEL is a selector that selects either the pulse waveform PW or the white noise WN depending on whether the speech to be synthesized is voiced or unvoiced. Specifically, the selector SEL selects the pulse waveform PW to synthesize a voiced sound when the given PARCOR coefficient K1 is greater than or equal to a predetermined constant J (about 0.3), and selects the white noise WN to synthesize an unvoiced sound when K1 is smaller than J.
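The selection rule can be sketched per sample as follows. The 8 kHz sample rate, the unit-impulse pulse shape, and the uniform noise distribution are assumptions made for illustration, and the function names are hypothetical; only the threshold test on K1 against J comes from the text.

```python
import random

J = 0.3  # voicing threshold; the text gives "about 0.3"


def pulse_sample(n: int, pf_hz: float, sample_rate: int = 8000) -> float:
    """Pulse waveform PW: one unit pulse per pitch period (assumed shape)."""
    period = max(1, round(sample_rate / pf_hz))
    return 1.0 if n % period == 0 else 0.0


def noise_sample(rng: random.Random) -> float:
    """White noise WN, drawn uniformly from [-1, 1] (assumed distribution)."""
    return rng.uniform(-1.0, 1.0)


def select_excitation(k1: float, n: int, pf_hz: float, rng: random.Random) -> float:
    """Selector SEL: pulse for voiced frames (K1 >= J), noise otherwise."""
    if k1 >= J:
        return pulse_sample(n, pf_hz)
    return noise_sample(rng)
```

Because the pulse period is derived from PF alone, changing the note changes the pitch of voiced sounds without touching the filter coefficients that carry the formants.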

[0025] Reference numeral 22 denotes an envelope generator, which generates an envelope waveform ENV whose attack portion, release portion, and amplitude are controlled according to the note-on NON, note-off NOF, and velocity data VEL. Reference numeral 23 denotes a coefficient multiplier, which multiplies whichever of the pulse waveform PW or white noise WN is supplied through the selector SEL by the envelope waveform ENV and outputs the result.
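A minimal sketch of such an envelope, assuming linear attack and release ramps measured in samples and a peak level proportional to MIDI velocity. The ramp shapes, the default lengths, and the function names are illustrative assumptions; the patent specifies only that attack, release, and amplitude follow NON, NOF, and VEL.

```python
def envelope(n, note_on_at, note_off_at, vel, attack=80, release=160):
    """Envelope ENV at sample n. vel is a MIDI velocity (0-127);
    note_off_at is None while the note is still held."""
    peak = vel / 127.0
    if n < note_on_at:
        return 0.0
    if note_off_at is None or n < note_off_at:
        # Attack: ramp linearly up to the velocity-scaled peak, then hold.
        return peak * min(1.0, (n - note_on_at) / attack)
    # Release: ramp linearly from the level reached at note-off down to zero.
    level_at_off = peak * min(1.0, (note_off_at - note_on_at) / attack)
    return level_at_off * max(0.0, 1.0 - (n - note_off_at) / release)


def multiplier_23(excitation_sample, env_value):
    # Coefficient multiplier 23: scale PW or WN by the envelope ENV.
    return excitation_sample * env_value
```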

[0026] The selector SEL described above switches between the pulse waveform PW and the white noise WN according to whether the sound is voiced or unvoiced, but the arrangement is not limited to this. The pulse waveform PW and the white noise WN may instead be crossfaded according to voicing, which makes the transition from voiced to unvoiced sound, or from unvoiced to voiced sound, more natural.
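One way to picture the crossfade variant is to replace the hard threshold with a weight that rises smoothly around J. The patent does not say how the crossfade weight is derived, so the linear ramp over a band of width 0.1 centred on J is purely an assumption, as are the function names.

```python
def voicing_weight(k1: float, j: float = 0.3, width: float = 0.1) -> float:
    """Map K1 to a weight in [0, 1]: 0 = fully unvoiced, 1 = fully voiced.
    The weight rises linearly across [j - width/2, j + width/2]."""
    w = (k1 - (j - width / 2.0)) / width
    return min(1.0, max(0.0, w))


def crossfade(pw: float, wn: float, k1: float) -> float:
    """Blend pulse waveform PW and white noise WN instead of hard switching."""
    w = voicing_weight(k1)
    return w * pw + (1.0 - w) * wn
```

Outside the transition band the result is identical to the hard selector, so the blend only affects frames near the voiced/unvoiced boundary.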

[0027] Reference numeral 24 denotes a coefficient multiplier, which multiplies the output of the coefficient multiplier 23 by the residual Z and outputs the result. Reference numerals 25-1 to 25-n denote lattice filters that synthesize speech, based on the PARCOR coefficients K1 to Kn respectively, by the inverse of the PARCOR analysis process. These cascaded lattice filters simulate the vocal tract characteristics, and each consists of a delay circuit 25a, coefficient multipliers 25b and 25c, an adder 25d, and a subtractor 25e. If the delay circuit 25a uses the same sampling delay as in the PARCOR analysis process, the output has the same formants as the analyzed speech signal. Accordingly, to deliberately alter the formants as a special effect during speech synthesis, a sampling delay different from that used in PARCOR analysis may be used.
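The cascade can be sketched with the textbook all-pole lattice recursion, in which each stage uses one delay, two multiplies, one subtraction, and one addition. The sign convention below is one common choice and may differ from the one in FIG. 4, which is not reproduced in the text; the function names are hypothetical.

```python
def lattice_synthesize(excitation, k):
    """All-pole lattice synthesis filter.
    excitation: iterable of input samples (already scaled by ENV and Z).
    k: PARCOR coefficients [K1, ..., Kn]."""
    order = len(k)
    b = [0.0] * (order + 1)  # b[i] holds the delayed backward signal of stage i
    out = []
    for x in excitation:
        f = x
        for i in range(order, 0, -1):
            f = f - k[i - 1] * b[i - 1]        # multiplier + subtractor
            b[i] = k[i - 1] * f + b[i - 1]     # multiplier + adder
        b[0] = f                               # delay element feeding stage 1
        out.append(f)
    return out


def multiplier_24(sample, z):
    # Coefficient multiplier 24: scale the enveloped excitation by residual Z.
    return sample * z
```

With a single coefficient of 0.5, a unit impulse yields 1, -0.5, 0.25, -0.125, the impulse response of 1 / (1 + 0.5 z^-1) under this sign convention. Keeping every |Ki| below 1 keeps the filter stable, which is one reason PARCOR coefficients are convenient to store and update frame by frame.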

[0028] With this PARCOR synthesis algorithm, a human voice is generated as a singing voice by synthesizing syllables according to the speech synthesis parameters (residual Z and PARCOR coefficients K1 to Kn) stored in the RAM 10, that is, the speech synthesis parameters that the CPU 4 has read from the data ROM 8 in response to the MIDI data.

[0029]

B. Operation of the embodiment

Next, the operation of the embodiment configured as above is described with reference to FIGS. 5 to 7. The processing of the main routine is described first as the overall operation, followed by the processing of the MIDI processing routine called from the main routine and of the timer interrupt routine.

[0030]

(1) Operation of the main routine

When the musical sound generating device of this embodiment is powered on, the CPU 4 reads a predetermined control program from the ROM 5 and loads it, then executes the main routine shown in FIG. 5. In step SA1, it initializes the various registers and flags provided in the RAM 6 and RAM 10 and instructs the DSP 9 to reset its internal registers to zero.

[0031] Once initialization is complete, the CPU 4 proceeds to step SA2 and transfers the speech synthesis program stored in the ROM 5 to the DSP 9. The DSP 9 is then ready for speech synthesis. Next, in step SA3, the value of the counter register i is reset to zero. The value of the counter register i is referenced when the speech synthesis parameters (residual Z and PARCOR coefficients K1 to Kn) are read from the data ROM 8 and set in the RAM 10, and when the speech synthesis parameters stored in the RAM 10 are set in the DSP 9.

[0032] In step SA4, MIDI processing (described later) is performed. The speech synthesis parameters corresponding to the content of the MIDI data input via the MIDI interface circuit 7 are read from the data ROM 8 and set in the RAM 10, and the musical sound information (note-on NON, note-off NOF, velocity data VEL, and pitch frequency PF) corresponding to the input MIDI data is generated and supplied to the DSP 9. Next, in step SA5, the panel switch group 2 is scanned to detect switch operations, and processing corresponding to each detected switch operation is performed. Steps SA4 and SA5 are then repeated until the power switch is turned off.

[0033]

(2) Operation of the MIDI processing routine

When the MIDI processing routine is executed via step SA4 of the main routine described above, the CPU 4 proceeds to step SB1 shown in FIG. 6 and determines whether there is MIDI input. If there is no MIDI input, the result is "NO", the routine ends, and processing returns to the main routine. If there is MIDI input, the result is "YES" and processing proceeds to step SB2. In step SB2, it is determined whether the status of the input MIDI data is "note-on".

[0034] In this embodiment, the status of the input MIDI data is one of "exclusive", "note-on", or "note-off". An "exclusive" message specifies the lyrics associated with each note of the piece by means of the sound number m in the speech synthesis parameters described above. That is, the syllable of the singing voice is first specified by sound number m with an "exclusive" message, the pitch and volume of that singing voice are then specified with "note-on", and the voice is silenced with "note-off".

[0035] Accordingly, the first MIDI input is an "exclusive" message, so the result of step SB2 is "NO" and processing proceeds to step SB3. In step SB3, it is determined whether the MIDI data is "exclusive". In this case the result is "YES" and processing proceeds to step SB4. In step SB4, the CPU 4 extracts the sound number m associated with the lyrics (one word) from the exclusive message in the MIDI data, sets it in the register exc, and ends this routine for the time being.

[0036] Once the sound number m has been set in the register exc, the CPU 4 reads the speech synthesis parameters (residual Z and PARCOR coefficients K1 to Kn) corresponding to sound number m from the data ROM 8 through the timer interrupt routine described later, and sets them in the RAM 10.

[0037] Next, when a "note-on" MIDI input arrives, the result of step SB2 is "YES" and processing proceeds to step SB5. In step SB5, the note number contained in the "note-on" MIDI data is converted into the pitch frequency PF by the conversion table TBL described above (see FIG. 2) and set in the DSP 9. Then, in step SB6, the CPU 4 extracts the velocity value that follows the note number in the MIDI data and sets it in the DSP 9 as the velocity data VEL. In the DSP 9, the pitch frequency PF is thereby set in the pulse generator 20 and the velocity data VEL is set in the envelope generator 22.

[0038] In step SB7, the counter register i is reset to zero, and in the following step SB8 the envelope generator 22 is instructed to form an attack. As a result, the DSP 9 performs PARCOR synthesis based on the synthesis parameters set in the RAM 10, that is, the residual Z and PARCOR coefficients K1 to Kn of the syllable specified by the header DB[exc, i], and generates a human voice at the pitch given by the pitch frequency PF corresponding to the note number and at the volume given by the velocity data VEL.

[0039] After the human voice for one word of the lyrics has been sounded as a singing voice in this way, a "note-off" MIDI input arrives. When "note-off" MIDI data is supplied, processing proceeds via step SB3 to step SB9. For "note-off", the result of step SB9 is "YES" and processing proceeds to step SB10. In step SB10, the envelope generator 22 is instructed to form a release, silencing the lyrics being voiced.

[0040] Thus, in the MIDI processing routine, when "exclusive" MIDI data is input, the sound number m associated with the lyrics (one word) is extracted from the exclusive message in the MIDI data and set in the register exc, passing the sound number m to the timer interrupt routine described later. When "note-on" MIDI data is input, the pitch frequency PF corresponding to the note number is set in the pulse generator 20 of the DSP 9 and the velocity data VEL is set in the envelope generator 22 of the DSP 9, so that the lyrics are sounded as a singing voice. When "note-off" MIDI data is subsequently input, the singing voice is released.
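The branch structure of steps SB1 to SB10 can be sketched as a small dispatcher over complete MIDI messages. The status bytes 0x90, 0x80, and 0xF0 are standard MIDI. The exclusive payload layout `F0 7D m F7` is hypothetical, because the patent does not define how the sound number is packed, and `VoiceState`, `handle_midi`, and the equal-temperament stand-in for TBL are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass
class VoiceState:
    exc: int = 0        # register exc: sound number m of the current syllable
    i: int = 0          # counter register i: index of the current coefficient set
    pf: float = 0.0     # pitch frequency PF for pulse generator 20
    vel: int = 0        # velocity data VEL for envelope generator 22
    gate: bool = False  # True between attack (SB8) and release (SB10)


def note_to_pf(note: int) -> float:
    # Stand-in for conversion table TBL (equal temperament, A4 = 440 Hz assumed).
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def handle_midi(state: VoiceState, msg: bytes) -> None:
    """One pass of the MIDI processing routine for one complete message."""
    if not msg:                                # SB1: no MIDI input
        return
    status = msg[0]
    if (status & 0xF0) == 0x90:                # SB2: note-on
        state.pf = note_to_pf(msg[1])          # SB5: note number -> PF
        state.vel = msg[2]                     # SB6: velocity -> VEL
        state.i = 0                            # SB7: restart the syllable
        state.gate = True                      # SB8: start the attack
    elif status == 0xF0:                       # SB3: exclusive
        state.exc = msg[2]                     # SB4: hypothetical layout F0 7D m F7
    elif (status & 0xF0) == 0x80:              # SB9: note-off
        state.gate = False                     # SB10: start the release
```

A typical sequence for one sung syllable is then an exclusive message selecting the syllable, a note-on giving pitch and loudness, and a note-off ending it.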

[0041]

(3) Operation of the timer interrupt routine

The CPU 4 releases the interrupt mask every 20 msec, for example, and executes the timer interrupt routine shown in FIG. 7. In step SC1, it determines whether the residual Z indicated by the header DB[exc, i] is smaller than "0", that is, whether the speech synthesis parameters of the syllable specified by sound number m have been exhausted.

[0042] If the residual Z is not "-1", the result is "NO" and processing proceeds to step SC2. In step SC2, the speech synthesis parameters (residual Z and PARCOR coefficients K1 to Kn) corresponding to the header DB[exc, i] are read from the data ROM 8 and set in the RAM 10. The DSP 9 then performs PARCOR synthesis based on the residual Z and PARCOR coefficients K1 to Kn set in the RAM 10.

[0043] Next, in step SC3, the counter register i is incremented, and the interrupt processing ends for the time being. Step SC1 is executed again at the next interrupt timing. When the residual Z read out in correspondence with the header DB[exc, i] is "-1", that is, when the end of the data has been reached, the determination result of step SC1 is "YES", and the routine ends without performing any processing. In this case the speech synthesis parameters on the DSP 9 side are no longer updated, so the lattice filters 25-1 to 25-n that perform PARCOR synthesis (see FIG. 4) stop changing over time, and the synthesized singing voice enters a sustain state.
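The timer interrupt flow of steps SC1 to SC3 can be sketched as follows. This is only an illustrative model, not the patent's firmware: the class name `SyllableFrameReader`, the frame layout (a residual value followed by a list of PARCOR coefficients), and the use of a Python attribute in place of the RAM 10 are all assumptions made for the example.

```python
END_OF_DATA = -1  # residual value assumed to mark the end of a syllable's data


class SyllableFrameReader:
    """Hypothetical model of the timer interrupt routine (steps SC1 to SC3)."""

    def __init__(self, frames):
        # frames: list of (residual_z, [k1, ..., kn]); the last entry is the end marker
        self.frames = frames
        self.i = 0               # counter register i
        self.dsp_params = None   # stands in for the parameters set in the RAM 10

    def timer_interrupt(self):
        """Run one interrupt; return True if the parameters were updated."""
        z, k = self.frames[self.i]
        if z < 0:
            # SC1 is "YES": end of data, do nothing so the last parameters are held
            return False
        self.dsp_params = (z, list(k))  # SC2: hand the frame to the synthesizer
        self.i += 1                     # SC3: advance the counter register
        return True


reader = SyllableFrameReader([(0.5, [0.1, 0.2]), (0.3, [0.3, 0.4]), (END_OF_DATA, [])])
updates = [reader.timer_interrupt() for _ in range(4)]
```

Once the end marker is reached, every later interrupt leaves `dsp_params` untouched, which corresponds to the sustain state described above.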

[0044] As described above, in this embodiment, musical tone information consisting of note-on/note-off, pitch frequency, and velocity is set in the DSP 9 based on MIDI data supplied from outside, and the speech synthesis parameters of the syllable corresponding to the lyric (one word) associated with the musical tone to be turned on are read from the database in the data ROM 8 and set in the RAM 10. As a result, the DSP 9 sounds the human voice, PARCOR-synthesized according to the given speech synthesis parameters, as a singing voice in accordance with the musical tone information.

[0045] In particular, in this embodiment, the speech synthesis parameters of each syllable include an identifier that marks the end of data by setting the residual Z to "-1". When the end of data is reached, the PARCOR coefficients K1 to Kn are held without changing over time, so the synthesized singing voice enters a sustain state and becomes a prolonged singing voice (for example, "aah"). A musical tone can therefore be formed as an even more natural singing voice.

[0046] Also, in this embodiment, the multiplier 23 (see FIG. 4) is provided so that the amplitude of the excitation waveform can be controlled by the velocity data VEL. It is therefore possible to variably control the sound quality (the strength of the tone of voice) in addition to the formants obtained by the lattice filters 25-1 to 25-n.
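As a rough illustration of how a velocity-scaled excitation might pass through a chain of lattice filter stages, the sketch below uses a textbook all-pole lattice recursion. The sign convention of the recursion, the linear velocity-to-gain mapping, and the function names `velocity_gain`, `lattice_step`, and `synthesize` are assumptions for the example; the patent text does not specify them.

```python
def velocity_gain(velocity):
    """Map a MIDI velocity (0 to 127) to a linear gain in [0, 1] (assumed mapping)."""
    return max(0, min(127, velocity)) / 127.0


def lattice_step(sample, k, state):
    """Process one sample through an n-stage all-pole lattice.

    k[i] is the reflection (PARCOR) coefficient of stage i+1, and state[i]
    holds the delayed backward signal feeding that stage.
    """
    f = sample
    n = len(k)
    for i in range(n - 1, -1, -1):
        f = f - k[i] * state[i]                  # forward path of stage i+1
        if i + 1 < n:
            state[i + 1] = state[i] + k[i] * f   # backward path, stored for the next sample
    if n:
        state[0] = f
    return f


def synthesize(excitation, k, velocity):
    """Scale the excitation by velocity (in the role of multiplier 23), then filter it."""
    state = [0.0] * len(k)
    gain = velocity_gain(velocity)
    return [lattice_step(gain * x, k, state) for x in excitation]


impulse_response = synthesize([1.0, 0.0, 0.0], [0.5], 127)
```

With a single coefficient of 0.5 the recursion reduces to y[t] = x[t] - 0.5 * y[t-1], which gives a quick way to check the structure by hand.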

[0047] Furthermore, in the embodiment described above, the lyric (one word) associated with the musical tone to be turned on is defined by an exclusive message of the MIDI data, so a singing voice can be generated in real time.

[0048] In addition, in this embodiment, the envelope is superimposed by the multiplier 23, so that, besides multiplication by the residual Z as a coefficient, the volume can be controlled by note-on/note-off. The sounding and muting of the singing voice corresponding to the lyrics can therefore be controlled easily.

[0049] In the embodiment described above, the pulse generator 20, which outputs a pulse train with a period corresponding to the pitch frequency PF, is used as the excitation source for PARCOR synthesis of voiced sounds. Instead, the waveform generation configuration shown in FIG. 8 may be used, for example. This configuration consists of waveform generators 30 and 31, which generate mutually different waveforms a and b at a pitch corresponding to the pitch frequency PF, and an interpolator 35, which is made up of multipliers 32 and 33 and an adder 34 and interpolates between the waveforms a and b according to the velocity data VEL.

[0050] With this configuration, the waveform a and the waveform b are interpolated according to the velocity data VEL, so the timbre can be varied according to the velocity value. Waveforms rich in overtone components, such as triangular waves or waveforms with different pulse widths, are effective as the waveforms a and b. Beyond this, for example, simultaneously generating and mixing a plurality of detuned waveform signals for a note-on also makes chorus-like voice synthesis possible.
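A minimal sketch of the velocity-controlled interpolation between the two waveforms follows. It assumes complementary linear weights derived from a 0 to 127 velocity, which the text does not state; the function name `interpolate_waveforms` is likewise hypothetical.

```python
def interpolate_waveforms(wave_a, wave_b, velocity):
    """Blend two equal-length waveforms; velocity 0 gives a, velocity 127 gives b.

    The two products play the role of multipliers 32 and 33, and the sum
    plays the role of adder 34. The linear weighting is an assumption.
    """
    w = max(0, min(127, velocity)) / 127.0
    return [(1.0 - w) * a + w * b for a, b in zip(wave_a, wave_b)]


triangle = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]   # one period of waveform a
pulse = [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0]   # one period of waveform b
soft = interpolate_waveforms(triangle, pulse, 0)
hard = interpolate_waveforms(triangle, pulse, 127)
```

Intermediate velocities give a weighted mix of the two shapes, so harder playing shifts the excitation, and hence the timbre, toward waveform b.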

【００５１】[0051]

[Effects of the Invention] According to the invention of claim 1, the parameter storage means stores speech synthesis parameters for each of a plurality of syllables. When the music information generating means generates musical tone information representing each musical tone of a piece of music and utterance information representing the lyrics associated with each musical tone, the parameter generating means reads out, from the parameter storage means, the speech synthesis parameters of the syllable matching the lyrics in accordance with the utterance information. The voice synthesizing means then synthesizes a human voice based on the speech synthesis parameters output from the parameter generating means and sounds it as a singing voice in accordance with the musical tone information, so the synthesized human voice can be formed into a musical tone as a natural singing voice. According to the invention of claim 2, the parameter storage means includes an identifier indicating the end of data in the speech synthesis parameters for each syllable, and the voice synthesizing means, upon detecting this identifier and determining that the speech synthesis parameters have reached the end of data, sustains the singing voice being synthesized, so a musical tone can be formed as an even more natural singing voice. According to the invention of claim 3, the voice synthesizing means variably controls the sound quality of the synthesized human voice according to the velocity included in the musical tone information, so variations such as the strength of the tone of voice can be expressed. According to the invention of claim 4, the parameter generating means expresses the utterance information as an exclusive message of MIDI data, so a singing voice can be generated in real time very easily. According to the invention of claim 5, the voice synthesizing means controls the sounding and muting of the singing voice by the note-on/note-off included in the musical tone information, so the sounding and muting of the singing voice corresponding to the lyrics can be controlled easily.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a musical sound generating device according to the present invention.

FIG. 2 is a diagram for explaining the contents of a conversion table TBL that converts a note number into a pitch frequency PF.

FIG. 3 is a diagram for explaining the data structure of the speech synthesis parameters registered in the database in the data ROM 8.

FIG. 4 is a functional block diagram showing the PARCOR synthesis algorithm of the DSP 9.

FIG. 5 is a flowchart showing the operation of the main routine.

FIG. 6 is a flowchart showing the operation of the MIDI processing routine.

FIG. 7 is a flowchart showing the operation of the timer interrupt routine.

FIG. 8 is a block diagram showing a modification of the pulse generator 20.

[Description of Reference Numerals] 1: keyboard unit; 2: panel switch group; 3: display unit; 4: CPU (parameter generating means, voice synthesizing means); 5: ROM; 6: RAM; 7: MIDI interface circuit (music information generating means); 8: data ROM (parameter storage means); 9: DSP (voice synthesizing means); 10: RAM (voice synthesizing means); 11: D/A converter; 12: amplifier

Continued from the front page: (51) Int. Cl.⁶, identification code, internal reference number, FI, technical indication location: G10L 5/04 G10H 7/00 513Z

Claims

[Claims]

1. A musical sound generating device comprising: parameter storage means for storing speech synthesis parameters for each of a plurality of syllables; music information generating means for generating musical tone information representing each musical tone of a piece of music and utterance information representing lyrics associated with each musical tone; parameter generating means for reading out, from the parameter storage means, and outputting the speech synthesis parameters of the syllable matching the lyrics in accordance with the utterance information generated by the music information generating means; and voice synthesizing means for synthesizing a human voice based on the speech synthesis parameters output from the parameter generating means and sounding the human voice as a singing voice in accordance with the musical tone information.

2. The musical sound generating device according to claim 1, wherein the parameter storage means includes an identifier indicating the end of data in the speech synthesis parameters for each syllable, and the voice synthesizing means, upon detecting the identifier and determining that the speech synthesis parameters have reached the end of data, continues to sound the singing voice being synthesized.

3. The musical sound generating device according to claim 1, wherein the voice synthesizing means variably controls the sound quality of the synthesized human voice according to the velocity included in the musical tone information.

4. The musical sound generating device according to claim 1, wherein the parameter generating means expresses the utterance information as an exclusive message of MIDI data.

5. The musical sound generating device according to claim 1, wherein the voice synthesizing means controls the sounding and muting of the singing voice according to the note-on and note-off included in the musical tone information.