JPH0812559B2

JPH0812559B2 - Singing voice generator

Info

Publication number: JPH0812559B2
Application number: JP59203463A
Authority: JP
Inventors: 美昭田中
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 1984-09-28
Filing date: 1984-09-28
Publication date: 1996-02-07
Anticipated expiration: 2011-02-07
Also published as: JPS6180299A

Description

[Detailed description of the invention]

Field of Industrial Application

The present invention relates to a singing voice generator, and more particularly to a singing voice generator that synthesizes formant frequencies and outputs the result as a singing voice.

Prior Art

Sound synthesis systems that perform music include systems for synthesizing instrument sounds and systems for synthesizing singing voices. A known example of the latter is "A Singing Voice Synthesis System for Personal Computers Using a Formant-Type Speech Synthesis LSI" (Ishihara, Fushikida, Mitome, Inoue: 1984 IEICE General National Conference, 6-198). That system takes a voice character string and a musical score character string as input. It decomposes the voice character string into parameters whose basic units are consonants and vowels, and reads pitch parameters from a table according to the scale data in the score character string. Then, following the table created by the consonant and vowel decomposition, it combines the formant parameters and amplitudes retrieved from a data ROM with the pitch parameters of the scale to generate synthesis parameters, and it creates a time-length table according to the note-length data in the score character string. The synthesis parameters are transferred sequentially to a speech synthesis LSI (large-scale integrated circuit) at each predetermined frame period according to the time-length table, and the LSI generates the synthesized waveform at a 10 kHz sampling rate.

Problems to Be Solved by the Invention

However, the conventional singing voice system described above can do no more than generate a singing voice. It is difficult to add musical variations such as vibrato, so the musical expression is poor.

There is also a device, disclosed in JP-A-55-77799, that generates a human voice at a designated pitch. Because that device approximates consonants with white noise, it cannot accurately reproduce rapid changes in formant frequency, and the voice it generates is therefore unclear and unnatural.

Accordingly, an object of the present invention is to provide a singing voice generator that solves the above problems, in particular by connecting to a synthesizer through an interface conforming to the MIDI (Musical Instrument Digital Interface) standard.

Means for Solving the Problems

The singing voice generator according to the present invention comprises table creating means, pitch parameter conversion means, synthesis parameter creating means, data conversion means, data transfer means, a synthesizer, and interface means. The table creating means decomposes input lyrics data into parameters in units of consonants and vowels to create a first table. The pitch parameter conversion means reads pitch parameters according to the scale data in the input score data, for example from a second table prepared in advance for the scale. The synthesis parameter creating means combines the formant parameters, read from memory according to the first table, with the pitch parameters to generate edited and interpolated synthesis parameters. It also creates a time-length table according to the note value data in the score data, and temporarily stores both.

The data conversion means converts the synthesis parameters into data of a specific standard. Of the converted data, the data transfer means transfers the data relating to the plurality of formant frequencies, and outputs the pitch frequency data only during the vowel utterance period, for a duration based on the time-length table. The interface means supplies the formant frequency data and the pitch frequency data, from among the data transferred by the data transfer means, separately as control signals to the plurality of variable frequency oscillators in the synthesizer.

Operation

The variable frequency oscillators in the synthesizer each output a signal at one of the formant frequencies that make up the consonants and vowels. During the vowel utterance period they also output a signal at the pitch frequency that sets the scale step given by the score data, so a singing voice is produced from the synthesizer's speaker. Because the synthesizer has means for applying musical effects to the sound it produces, those means can be used to apply musical effects to the singing voice. Furthermore, according to the present invention, the synthesis parameter creating means edits and interpolates the parameters, so a singing voice with fine nuances can be generated. The present invention is described in detail below together with an embodiment.

Embodiment

Fig. 1 is a block diagram of one embodiment of the device of the present invention. The lyrics keyboard 1 is constructed, for example as shown in Fig. 2, with a plurality of alphabet keys 17 arranged in three rows on a keyboard body 16, the keys for the vowels being placed in the bottom row. The keys 17 are pressed selectively according to the lyrics (voice character string), which are a sequence of characters written in Roman letters. For example, the character "ビ" (bi) is entered by pressing the two corresponding keys. As shown in Fig. 2, the lyrics keyboard 1 also has a female voice key 18 and a child voice key 19 in the bottom row. Pressing key 18 selects a female voice for the lyrics, pressing key 19 selects a child's voice, and pressing neither key selects a male voice. The lyrics data entered on the lyrics keyboard 1 (character voice data and voice type data) is supplied to a central processing unit (CPU) 3 through an I/O (input/output) interface 2.

The other keyboards are a note value keyboard 4, a control keyboard 5, and the keyboard inside the synthesizer 10 described later. The note value keyboard 4 is constructed, for example as shown in Fig. 3, with a plurality of key groups 21 indicating note values (durations) arranged in a matrix. The control keyboard 5 is used to specify the pitch of a note within a range of several octaves, for example by a number from 1 to 88, and to specify the loudness of the note. The output data of both the note value keyboard 4 and the control keyboard 5 is supplied to the CPU 3 through the I/O interface 2 as score data (musical score character string).

A random access memory (RAM) 6, a read-only memory (ROM) 7, and a memory 8 (for example, a magnetic disk) are each connected to the CPU 3 through a bidirectional bus. The RAM 6 is the data storage and working memory of the CPU 3, and it also holds in advance the pitch parameters corresponding to the scale, described later. The ROM 7 holds in advance the control program of the CPU 3 and a formant parameter table for the formant frequencies. The memory 8 is a memory circuit for saving input or processed data. The I/O interface 2 is connected through a bidirectional bus to a MIDI interface 9, described later, and the MIDI interface 9 is connected through a bidirectional bus to the synthesizer 10.

The lyrics data and score data input to the CPU 3 are written through a video controller 11 into a video random access memory (V.RAM) 12, converted into a video signal by the video controller 11, separated into three primary color signals by a three-primary-color signal separation circuit 13, and supplied to a CRT (cathode ray tube) 14. Three primary color signals based on data generated by the CPU 3 are also supplied to the CRT 14. The operator can therefore operate the keyboards 1, 4, and 5 to check and correct the input data while viewing a table displayed on the CRT 14 that shows, for example, the note values, scale steps, lyrics, and other necessary items.

The CPU 3 is configured as shown in Fig. 4 and operates according to the flowchart shown in Fig. 5. The CPU 3 is first initialized (step 40 in Fig. 5). The lyrics data and score data receiving unit 25 in Fig. 4 then receives and takes in the lyrics data and the score data entered from the keyboards 1, 4, and 5 through the I/O interface 2 (step 41 in Fig. 5). The table creating means 26 in the CPU 3, shown in Fig. 4, decomposes the lyrics data into first parameters in units of consonants and vowels, creates the first table from them, and stores it in the RAM 6 (step 42 in Fig. 5). Most Japanese syllables are known to consist of a combination of a consonant, in the broad sense, and a vowel, and the sound form of a word can be regarded as a sequence of such consonant and vowel units. The first parameters can therefore represent the individual syllables of the lyrics.
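The decomposition in step 42 can be pictured with the following minimal sketch. It assumes the lyrics arrive as a list of romanized syllables that each end in one of the five vowels; the function names and the tuple layout of the table are hypothetical and are not taken from the disclosure.

```python
VOWELS = set("aiueo")


def split_syllable(syllable: str) -> tuple[str, str]:
    """Split one romanized syllable into (consonant, vowel).

    The consonant part may be empty, as for a bare vowel such as "a".
    Assumption: every syllable ends in a single vowel letter.
    """
    s = syllable.lower()
    if not s or s[-1] not in VOWELS:
        raise ValueError(f"expected a syllable ending in a vowel: {syllable!r}")
    return s[:-1], s[-1]


def build_first_table(syllables: list[str]) -> list[tuple[str, str]]:
    """Build a stand-in for the first table: one (consonant, vowel) entry per syllable."""
    return [split_syllable(s) for s in syllables]
```

Syllables that do not end in a vowel, such as the syllabic "n", are rejected by this sketch; the text says only that most syllables follow the consonant plus vowel pattern.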

The pitch parameter conversion means 27 reads, from a table stored in advance in the RAM 6 and according to the scale data in the score data, the pitch parameter indicating the fundamental frequency (pitch frequency) F₀ of the vowel, which sets the scale step. In other words, it converts the scale data into a pitch parameter (step 43 in Fig. 5). The synthesis parameter creating means 28 in the CPU 3, shown in Fig. 4, reads the formant parameters from the ROM 7 while referring to the first table that the table creating means 26 created and stored in the RAM 6. It combines them with the pitch parameters obtained by the pitch parameter conversion means 27, and edits and interpolates the parameters (for example, segmenting the sound so that it changes smoothly) to create the synthesis parameters. It also creates the time-length table according to the note value data in the score data, and temporarily stores both in the RAM 6 (step 44 in Fig. 5).
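The pitch lookup in step 43 can be sketched as below. The text gives only the input range (for example 1 to 88) and does not state the tuning, so this sketch assumes piano key numbering in equal temperament with key 49 at 440 Hz; `key_to_f0` and `PITCH_TABLE` are hypothetical names.

```python
def key_to_f0(key: int, a4_key: int = 49, a4_hz: float = 440.0) -> float:
    """Return the pitch frequency F0 in Hz for a key number from 1 to 88.

    Assumption: equal temperament, with key number a4_key tuned to a4_hz.
    """
    if not 1 <= key <= 88:
        raise ValueError(f"key number out of range: {key}")
    return a4_hz * 2.0 ** ((key - a4_key) / 12.0)


# Precomputed table, standing in for the pitch parameter table held in RAM.
PITCH_TABLE = {key: key_to_f0(key) for key in range(1, 89)}
```

Reading `PITCH_TABLE[key]` corresponds to the table lookup, and calling `key_to_f0(key)` directly corresponds to computing the value instead of reading it.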

A singing voice consists of the individual syllables of the lyrics and the musical tone produced with them. The syllables are represented as consonants and vowels by the first parameters, and are known to show formant frequency versus time characteristics like those in Fig. 6. That is, speech is considered to be identified on the basis of the fundamental frequency (pitch frequency) of the speech waveform and a plurality of formants. Let the first, second, and third formants (hereinafter also called formant frequencies) be F₁, F₂, and F₃. In one syllable, the consonant part is uttered first, with gradually changing formant frequencies, and the vowel part is then uttered at substantially constant formant frequencies. During the vowel utterance period, the pitch frequency is generated together with the formant frequencies F₁ to F₃.

The changes in the formant frequencies F₁ to F₃ in the consonant part differ from consonant to consonant. For example, it is generally known that the formant frequencies of the consonants g, d, and b change as shown in Fig. 7(A), (B), and (C). The formant parameters in the synthesis parameters described above are the parameters that determine these three formant frequencies F₁ to F₃, and the pitch parameter determines the pitch frequency.

The musical tone produced with the syllables (lyrics) is determined by its pitch, loudness, and length, which are input to the CPU 3 as score data as described above. Depending on the scale step, the formant frequency versus time characteristic shown in Fig. 6 behaves as if shifted parallel to the vertical axis, and the pitch parameter (pitch frequency) determines the amount of that shift (the pitch of the note). The length of the note is obtained from the duration of the vowel part. The loudness is input from the control keyboard 5 and is included in the synthesis parameters as a volume parameter.

The MIDI data conversion means 29 converts the synthesis parameters into data conforming to the well-known MIDI standard (hereinafter called MIDI data); this is step 45 in Fig. 5. The MIDI standard converts the various operations performed when playing an instrument (pressing a key, turning a volume control, and so on) into a few bytes of data each, so that they can be sent and received among multiple instruments or the machines that control them. MIDI data is transmitted as 2 to 3 bytes of serial data. The first byte is a status byte, a message indicating what function the following data bytes have. The data length including the status is variable, from 1 to 3 bytes. Status and data are distinguished by the leading (most significant) bit of each byte: "1" indicates status and "0" indicates data. Messages are broadly divided into channel messages, which carry individual operation data, and system messages, which control the network.
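The status and data distinction described above can be illustrated with a short sketch. It only groups a byte stream into messages by testing the most significant bit; it does not handle running status or system real-time bytes, and the function names are hypothetical.

```python
def is_status_byte(b: int) -> bool:
    """Return True when the most significant bit of the byte is 1 (a status byte)."""
    if not 0 <= b <= 0xFF:
        raise ValueError(f"not a byte: {b}")
    return (b & 0x80) != 0


def split_messages(stream: bytes) -> list[list[int]]:
    """Group a byte stream into messages, each starting at a status byte."""
    messages: list[list[int]] = []
    for b in stream:
        if is_status_byte(b):
            messages.append([b])        # a status byte opens a new message
        elif messages:
            messages[-1].append(b)      # a data byte joins the current message
        # Data bytes that arrive before any status byte are dropped in this sketch.
    return messages
```

For example, the six bytes `90 3C 64 80 3C 00` (hexadecimal) split into two three-byte messages, one beginning with status `0x90` and one with `0x80`.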

The converted MIDI data is supplied to the data transfer means 30, which transfers it from the CPU 3 to the synthesizer 10 through the I/O interface 2 and the MIDI interface 9, following steps 46 to 56 in Fig. 5. The data transfer means 30 sets a variable J to 1 (step 46) and a variable I to 0 (step 47). It then determines, from the data arriving from the synthesizer 10 through the MIDI interface 9 and the I/O interface 2, whether the synthesizer 10 can receive, and waits until it can (step 48). Once the synthesizer 10 is ready to receive (enabled), the first data of the MIDI data, which relates to the consonant formant frequencies F₁ to F₃, the volume, and so on, is transferred to the synthesizer 10 through the I/O interface 2 and the MIDI interface 9, and the variable I is incremented by 1 (steps 49 and 50). Steps 49 and 50 are repeated until the incremented value of I reaches N or more (step 51).

Here, N is the number of frequency changes to be made, at a fixed interval (for example 0.33 ms), for each of the first to third formant frequencies F₁ to F₃ during the consonant utterance period. In steps 49 to 51, three bytes of first data are transferred for each formant frequency so that the changes of the formant frequencies F₁ to F₃ during the consonant utterance period, as shown in Figs. 6 and 7, are approximated by straight lines. The formant frequency data is thus rewritten about every 3 ms (≈ 3 × 3 × 320 μs).
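The arithmetic behind the update interval, and the straight-line approximation in N steps, can be checked with the sketch below. The 320 μs per byte figure is taken from the text; it matches 10 bits per byte at the MIDI serial rate of 31.25 kbit/s. The start and end frequencies in the usage note are made-up values, and the function names are hypothetical.

```python
MIDI_BYTE_US = 320  # microseconds per transmitted byte, as given in the text


def update_interval_us(formants: int = 3, bytes_per_formant: int = 3,
                       byte_us: int = MIDI_BYTE_US) -> int:
    """Time needed to resend every formant once, in microseconds."""
    return formants * bytes_per_formant * byte_us


def linear_steps(start_hz: float, end_hz: float, n: int) -> list[float]:
    """Return n successive frequency values moving linearly from start_hz to end_hz.

    The last value equals end_hz; start_hz itself is not included.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [start_hz + (end_hz - start_hz) * i / n for i in range(1, n + 1)]
```

With the defaults, `update_interval_us()` gives 2880 μs, which is the "about 3 ms" in the text. A hypothetical first formant rising from 200 Hz to 500 Hz in three steps would be sent as 300, 400, and 500 Hz.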

When the value of I reaches N or more, the second data of the MIDI data, which relates to the vowel formant frequencies F₁ to F₃, the pitch frequency F₀, the volume, and so on, is transferred to the synthesizer 10, and the pitch frequency F₀ and the other values are set (step 52). A time count is then run to check the duration of the vowel part, so that it lasts for the time given by the note value in the time-length table; when the predetermined duration is reached, output of the second data is turned off (step 53). The data transfer means 30 then clears the first data (step 54) and increments the variable J by 1 (step 55). Steps 47 to 55 are repeated until the incremented value of J exceeds the total number m of notes to be produced, and the operation ends when J exceeds m (step 56).
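The control flow of steps 46 to 56 can be summarized in the following sketch. `Note`, `RecordingSynth`, and `transfer` are hypothetical stand-ins: the synthesizer object merely logs what it is sent, and the wait function is injected so that the loop can be exercised without real delays. One deliberate difference from the flowchart is noted in the comments.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    consonant_frames: list   # first data: the N formant updates of the consonant part
    vowel_frame: dict        # second data: vowel formants F1-F3, pitch F0, volume
    vowel_duration: float    # seconds, taken from the time-length table


@dataclass
class RecordingSynth:
    """Stand-in for the synthesizer: always ready, logs every transfer."""
    log: list = field(default_factory=list)

    def ready(self) -> bool:
        return True

    def send(self, kind: str, payload) -> None:
        self.log.append((kind, payload))


def transfer(notes, synth, wait=lambda seconds: None) -> None:
    j = 1                                         # step 46
    while j <= len(notes):                        # step 56: stop once J exceeds m
        note = notes[j - 1]
        i = 0                                     # step 47
        while not synth.ready():                  # step 48: wait until receivable
            wait(0.0)
        n = len(note.consonant_frames)
        # Steps 49-51. The flowchart tests I >= N after sending; a pre-test loop
        # is used here so that a note without a consonant sends no first data.
        while i < n:
            synth.send("first", note.consonant_frames[i])
            i += 1
        synth.send("second", note.vowel_frame)    # step 52
        wait(note.vowel_duration)                 # step 53: hold for the note value
        synth.send("second_off", None)            # step 53: second data turned off
        # Step 54 (clearing the first data) needs no action in this sketch.
        j += 1                                    # step 55
```

Passing `wait=time.sleep` would hold each vowel for its real duration; the default does nothing.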

The MIDI data output in this way is supplied through the I/O interface 2 to the MIDI interface 9, and from there to the synthesizer 10. Fig. 8 is a circuit diagram of one example of the MIDI interface 9. The MIDI data (first and second data) from the I/O interface 2 is supplied to the data input terminals D₀ to D₇ of a signal processing device 60, where it undergoes parallel-to-serial conversion and other processing, and is transferred to the synthesizer 10 through inverters 61 and 62, a resistor 63, a connector 64, and so on. Data from the synthesizer 10 is supplied to the signal processing device 60 through a connector 65 and a photocoupler 66, and is output in parallel from the data input/output terminals D₀ to D₇. The signal processing device 60 is supplied with a clock of, for example, 500 kHz from a clock module IC 67.

Fig. 9 is a block diagram of one embodiment of the synthesizer 10. The MIDI data from the MIDI interface 9 is input sequentially to input terminals 70₁ to 70₆ through an input/output interface (not shown) in the synthesizer 10. The data at input terminals 70₁ to 70₄ is supplied as control signals to voltage-controlled oscillators (VCOs) 71 to 74. The VCOs 71, 72, and 73 output signals at the first, second, and third formant frequencies, respectively, and the VCO 74 outputs a signal at the pitch frequency F₀. The data arriving at input terminal 70₅ is supplied to a low-frequency oscillator (LFO) 75 and controls its oscillation. The output signal of the LFO 75 is supplied to the VCOs 71 to 74 to vary their output oscillation frequencies, which produces a vibrato effect. The output signal of the LFO 75 is also supplied to a voltage-controlled filter (VCF) 78 and a voltage-controlled amplifier (VCA) 79, which makes it possible to emphasize only particular overtones or to obtain a tremolo effect.
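The effect of the LFO on an oscillator frequency and on the amplifier gain can be modeled roughly as follows. This is a numerical illustration only, not the analog circuit of Fig. 9; the sinusoidal LFO shape, the 6 Hz rate, and the depth values are assumptions chosen for the example.

```python
import math


def vibrato_frequency(f_hz: float, t: float,
                      lfo_hz: float = 6.0, depth: float = 0.01) -> float:
    """Oscillator frequency at time t (seconds) when an LFO modulates it.

    depth is the peak relative deviation; 0.01 means plus or minus 1 percent.
    """
    return f_hz * (1.0 + depth * math.sin(2.0 * math.pi * lfo_hz * t))


def tremolo_gain(t: float, lfo_hz: float = 6.0, depth: float = 0.2) -> float:
    """Amplifier gain at time t when the same LFO modulates the amplitude."""
    return 1.0 + depth * math.sin(2.0 * math.pi * lfo_hz * t)
```

At a quarter of the LFO period the sine reaches its peak, so a 440 Hz pitch with these defaults is raised to about 444.4 Hz.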

The output signals of the VCOs 71 to 74 are each supplied to a ring modulator 76, where they are converted into signals at the sum or difference of the frequencies, and then supplied to a mixer 77, where they are mixed with the output signals of the VCOs 71 to 74. The ring modulator 76 is used when a particular timbre is wanted, and its operation is controlled by the data from terminal 70₆. The mixed output signal of the mixer 77 passes through the VCF 78, the VCA 79, and an amplifier 80, and is supplied to a speaker 81 as a signal with a waveform like that shown in Fig. 10. In Fig. 10, T denotes the pitch period.
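The sum and difference frequencies mentioned above follow from the product-to-sum identity, if the ring modulator is idealized as a multiplier of two sinusoids. That idealization is an assumption of this sketch; the text does not describe the circuit. The two functions below compute the same value in two ways.

```python
import math


def ring_mod(f_a: float, f_b: float, t: float) -> float:
    """Product of two sinusoids at f_a and f_b (an idealized ring modulator)."""
    return math.sin(2.0 * math.pi * f_a * t) * math.sin(2.0 * math.pi * f_b * t)


def sum_difference(f_a: float, f_b: float, t: float) -> float:
    """The same signal written as components at (f_a - f_b) and (f_a + f_b)."""
    return 0.5 * (math.cos(2.0 * math.pi * (f_a - f_b) * t)
                  - math.cos(2.0 * math.pi * (f_a + f_b) * t))
```

Because the two expressions agree at every instant, the product contains only the difference and sum frequencies and neither of the original ones.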

During the consonant utterance period, the predetermined data is applied only to terminals 70₁ to 70₃. The VCOs 71, 72, and 73 therefore output the first, second, and third formant frequencies F₁, F₂, and F₃ simultaneously, with their frequency values changed successively in the N steps described above, while the VCO 74 outputs no signal. During the vowel utterance period, part of the second data causes the VCO 74 to begin outputting the pitch frequency F₀ that sets the scale step, and the VCOs 71 to 73 continue to output the signals of the first to third formant frequencies F₁ to F₃ at constant values. As a result, the speaker 81 produces the lyrics entered on the lyrics keyboard 1 as a human voice, following the score entered on the note value keyboard 4 and the control keyboard 5. Musical variations such as vibrato and tremolo can also be applied to this human voice, that is, the singing voice, which adds nuance and yields a more pleasing singing voice. The synthesizer 10 shown in Fig. 9 can also produce instrument sounds alone, as in conventional use, by means of its keyboard 82. The pitch parameter conversion means is not limited to reading values from a table; the values may instead be calculated. Furthermore, the number of formants is not limited to three, and the present invention can be applied with any other plural number of formants.

Effects of the Invention

As described above, according to the present invention a singing voice can be generated using a synthesizer. Because formant frequencies and the like are synthesized, a more natural and clearer singing voice can be generated than with the conventional device that uses white noise for consonants. Because a synthesizer is used, musical variations can easily be applied to the singing voice. These are distinctive advantages not found in the prior art. In addition, because the parameters are edited and interpolated, continuous sound can be produced and a singing voice that expresses fine nuances can be generated.

[Brief description of drawings]

Fig. 1 is a block diagram showing one embodiment of the device of the present invention.
Figs. 2 and 3 are diagrams showing the schematic construction of the keyboards in the block diagram of Fig. 1.
Fig. 4 is a block diagram showing one embodiment of the main part of the device of the present invention.
Fig. 5 is a flowchart explaining the operation of the main part of the device of the present invention.
Fig. 6 is a diagram showing the relationship between the consonant and vowel parts of synthesized speech and the formant frequencies.
Fig. 7 is a diagram showing the relationship between the formant frequencies of each consonant and time.
Fig. 8 is a circuit diagram of one example of the MIDI interface in the block diagram of Fig. 1.
Fig. 9 is a block diagram showing one embodiment of the synthesizer in the block diagram of Fig. 1.
Fig. 10 is a diagram showing one example of the synthesized voice signal waveform input to the speaker in the block diagram of Fig. 9.

1: lyrics keyboard; 2: I/O interface; 3: central processing unit (CPU); 4: note value keyboard; 5: control keyboard; 6: random access memory (RAM); 7: read-only memory (ROM); 9: MIDI interface; 10: synthesizer; 11: video controller; 12: video random access memory (V.RAM); 25: lyrics data and score data receiving unit; 26: table creating means; 27: pitch parameter conversion means; 28: synthesis parameter creating means; 29: MIDI data conversion means; 30: data transfer means; 70₁ to 70₆: input terminals; 71 to 74: voltage-controlled oscillators (VCO); 75: low-frequency oscillator (LFO); 77: mixer.

Claims

[Claims]

1. A singing voice generator comprising: table creating means for decomposing input lyrics data into parameters in units of consonants and vowels to create a first table; pitch parameter conversion means for converting scale data in input score data into pitch parameters; synthesis parameter creating means for combining formant parameters, read from a memory according to the created first table, with the pitch parameters to generate edited and interpolated synthesis parameters, for creating a time-length table according to note value data in the score data, and for temporarily storing them; data conversion means for converting the synthesis parameters into data of a specific standard; data transfer means for transferring, of the converted data, data relating to a plurality of formant frequencies, and for outputting pitch frequency data relating to the pitch parameters only during a vowel utterance period, for a duration based on the time-length table; a synthesizer that outputs a mixed signal of the output signals of a plurality of variable frequency oscillators; and interface means for separately supplying the data relating to the plurality of formant frequencies and the pitch frequency data, from among the data transferred by the data transfer means, as control signals to the plurality of variable frequency oscillators in the synthesizer.