JPH0449957B2

Info

Publication number: JPH0449957B2
Application number: JP58167535A
Authority: JP
Inventors: Makoto Morito; Takashi Yato; Takashi Miki
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1983-09-13
Filing date: 1983-09-13
Publication date: 1992-08-12
Also published as: JPS6059399A

Description

[Detailed description of the invention]

(Technical Field)

The present invention relates to a speech synthesizer that reads waveform-domain information on phoneme waveforms from a storage area and synthesizes speech, and in particular to a speech synthesizer that obtains its speech output by adding together, and thereby superimposing, a plurality of ADPCM-encoded phoneme waveforms.

(Prior Art)

In voiced sounds, the human voice is produced when the airflow from the lungs is turned into a quasi-periodic impulse flow by the vocal cords and resonates in the cavity resonator called the vocal tract. FIG. 1 shows the waveform of a human voice.

As FIG. 1 shows, almost the same waveform is repeated in the human voice at every period, which is called the "pitch". As stated above, this is because the voice is produced by a quasi-periodic impulse flow, and the pitch period is equal to the interval between the impulses.

Several methods exist for synthesizing such speech, distinguished by the form in which the speech information is stored.

In one method, waveforms of one pitch period of speech (called phoneme waveforms) are stored in a storage area for various speech sounds, and speech is output by concatenating these phoneme waveforms according to control information. FIG. 2 shows a configuration based on this method.

In FIG. 2, 1 is a storage area that holds one-pitch-period phoneme waveforms for various speech sounds, 2 is a storage area for control information, such as pitch and amplitude scaling, used to concatenate the phoneme waveforms, and 3 is a synthesis section that concatenates the speech segments. The synthesis section 3 synthesizes speech by concatenating the phoneme waveforms stored in storage area 1 according to the control information stored in storage area 2. To produce highly natural synthesized speech, however, both the pitch and the loudness of the voice must be controlled appropriately. Loudness is controlled by multiplying the phoneme in the storage area by a constant factor given by the amplitude scaling information in the storage area. Pitch is controlled by the pitch information in storage area 2. However, each phoneme waveform in storage area 1 has the length of the pitch period of the speech from which it was extracted, and this does not necessarily match the pitch given by the control information.
Therefore, when the pitch given by the control information is shorter than the speech segment length, the tail of the segment is cut off, and when it is longer, the last value of the segment is extended, so that the phoneme has the same pitch length as the control information (see FIG. 3). However, if a phoneme waveform stored in storage area 1 is cut partway and has not yet decayed sufficiently, a discontinuity arises between it and the phoneme waveform joined to it, which adversely affects the synthesized sound. Conversely, if the phoneme waveform is lengthened, its spectrum is distorted and the sound quality deteriorates.

(Object and Summary of the Invention)

The object of the present invention is to eliminate these drawbacks. In a speech synthesizer that superimposes ADPCM-encoded phoneme waveforms on a plurality of channels, each offset from the previous one by a pitch period, the invention decodes the ADPCM codes by time-multiplexed processing and uses the cumulative adder/subtractor inside the ADPCM decoder also as the adder for superimposing the phoneme waveforms. This realizes a speech synthesizer that synthesizes natural, high-quality speech with a simple circuit configuration. The details are described below.

(Premise of the Invention)

As stated above, speech is produced by the resonance of the impulse flow from the vocal cords, and this process can be replaced by an electric circuit. That is, speech can be regarded as a superposition of output waveforms of a resonant filter, each corresponding to one impulse emitted by an excitation circuit that plays the role of the vocal cords (these output waveforms are hereinafter called phoneme waveforms). This is explained with reference to FIG. 4.

In FIG. 4, 10 is the impulse train emitted by the excitation circuit; the time interval between impulses is the pitch period. 11 is the synthesized speech output waveform obtained by driving the resonant filter with the impulse train. 12 is the phoneme waveform obtained when the resonant filter is driven by impulse P₁ of the impulse train. Likewise, 13 to 17 are the phoneme waveforms obtained when the resonant filter is driven by impulses P₂ to P₆.

Because the driving impulse train 10 emitted by the excitation circuit is the sum of impulses P₁ to P₆, the superposition theorem gives the synthesized speech output waveform 11 as the sum of the phoneme waveforms 12 to 17.

The phoneme waveform 12 shown in FIG. 4 is 0 before time t₁. After t₁ it decays with time and becomes 0 at infinite time. In practice, for a speech waveform, the phoneme waveform can be regarded as almost 0 once 16 milliseconds have elapsed from the excitation point (t₁). Reproducing phoneme waveform 12 for 16 milliseconds from the excitation point (t₁) is therefore sufficient. Accordingly, the present invention uses phoneme waveforms whose initial value is 0 and whose segment length is set sufficiently longer than the pitch period (for example, 16 milliseconds) so that the waveform decays sufficiently.

However, to start reproducing phoneme waveform 13 at excitation time t₂, several reproduction processes must run concurrently, because the reproduction of phoneme waveform 12 has not yet finished. The required reproduction multiplicity n is given by equation (1):

    n = 16 ms / (minimum pitch period)    (1)

For ordinary speech, n of about 4 is sufficient. The following description therefore assumes n = 4.

Next, an ADPCM code, which has high coding efficiency, is used as the code from which the phoneme waveforms are reproduced. A regenerator that reproduces a phoneme waveform from a series of ADPCM codes has been proposed in Japanese Patent Application No. Sho 55-109800, so a detailed description is omitted here.

In differential decoding such as ADPCM decoding, if reproduction is terminated after 16 milliseconds (128 sample periods), the final output value is held. The phoneme waveform must satisfy two conditions:

(Condition 1) it is 0 before the excitation time point;
(Condition 2) it is 0 at infinite time.

To satisfy them, both the initial reproduced value and the final reproduced output value must be 0. Even if the initial value is set to 0, the final reproduced value is not normally 0. A value that brings the final reproduced value to 0 (called the offset PCM code) must therefore be added at the end.

(Embodiment of the Invention)

FIG. 5 shows one embodiment of the present invention. The word "channel" or "ch" is used here to number the processes that are executed by time division within one sample period.

The parts in FIG. 5 are as follows. 100 is the first-channel input register, 101 the second-channel input register, 102 the third-channel input register, and 103 the fourth-channel input register; 104 and 105 are selectors; 106 is a register that stores the ADPCM code data Ln; 107 is a pointer movement amount memory that outputs the pointer movement amount Dn, using the value of register 106 as its address; 108 is an adder; 109 and 110 are selectors; 111, 112, 113 and 114 are the first- to fourth-channel pointer registers; 115 is a selector; 116 is a pointer limiter that restricts the pointer value Pn to a specified range and outputs it as the limited pointer value Pn′; 117 is a quantization memory that stores the quantization step values X; 118 is a shift register; 119 is an adder/subtractor; 120 is a register; 121 is a register that outputs the synthesized speech; 122 is an output terminal; 123 is a selector; 135, 136, 137 and 138 are the first- to fourth-channel down counters; 139, 140, 141 and 142 are the first- to fourth-channel decoders; 143 is the BUSY signal of each channel; 144 is the END signal, which marks the final point of reproduction; 145 is the start signal for waveform reproduction; and 150 is a controller. The controller 150 is connected to each of the circuits listed above, from the first-channel input register 100 and the second-channel input register 101 through the third-channel decoder 141 and the fourth-channel decoder 142, and controls their operation. These connections are omitted from FIG. 5 to keep the drawing from becoming too complicated.

FIG. 6 shows the data used in the present invention to reproduce one phoneme waveform. It consists of a pointer initial value DPn, 128 ADPCM codes Ln₁, Ln₂, Ln₃, …, Ln₁₂₈, and an offset PCM code. The pointer initial value DPn is stored in the pointer register (111 to 114) of the corresponding channel when waveform reproduction starts. Each pointer register is updated with, and stores as its new pointer value, the sum formed by adder 108 of the pointer movement amount Dn corresponding to the ADPCM code data and the immediately preceding pointer value. The offset PCM code is read after the reproduction processing of the 128 ADPCM codes and is chosen so that the final reproduced waveform value becomes 0.

FIG. 7 shows the relationship between time and the reproduction processing assigned to each channel. In FIG. 7, 200 is the synthesized output waveform; 201 is the phoneme waveform reproduced by the processing assigned to channel 1; 202 is the channel 1 start signal ST1; 203 is the channel 1 BUSY signal BUSY1; and 204 is the channel 1 END signal END1. Likewise, 205, 209 and 213 are the phoneme waveforms reproduced by the processing assigned to channels 2, 3 and 4; 206, 210 and 214 are the start signals ST2, ST3 and ST4 of channels 2, 3 and 4; 207, 211 and 215 are the BUSY signals BUSY2, BUSY3 and BUSY4; and 208, 212 and 216 are the END signals END2, END3 and END4.

The embodiment is now described in detail with reference to FIGS. 5 to 7.

At time t₁ a start signal ST1 is applied to the controller 150 from outside, and channel 1 begins reproducing the phoneme waveform corresponding to 12 in FIG. 4.

At this time the controller 150 sets the down counter 135 to 128. This value corresponds to the 16 milliseconds mentioned above (with an output sampling period of 125 microseconds, 16 milliseconds is 128 sampling periods).

The decoder 139 outputs "1" as the BUSY1 signal whenever the output of the down counter 135 is other than -1, and sets the END1 signal to "1" when the output of the down counter 135 reaches 0 (see FIG. 7). Thereafter, at every sampling period (125 microseconds), waveform reproduction data is read from the input register 100 and waveform reproduction using the ADPCM code is performed. The down counter 135 is decremented by 1 each time one reproduction step finishes. When the 128th reproduction step finishes, the down counter 135 reaches 0 and the END1 signal becomes "1". In the next sampling period, because END1 is "1", the offset PCM code from the input register 100 is passed through selectors 104, 105 and 123 to the adder/subtractor 119. The down counter 135 is then decremented by 1 to -1, whereupon the END1 and BUSY1 signals output by the decoder 139 both become "0".

Table 1 shows the ADPCM waveform reproduction arithmetic (excluding the pointer movement arithmetic) that the controller 150 carries out in the register 120 and the adder/subtractor 119, using the ADPCM code Ln stored in the input registers 100, 101, 102 and 103 of the channels and supplied through the selector 104, together with the values shifted down by the shift register 118 ([X], [1/2 X], [1/4 X], [1/8 X], where [X] is the output of the quantization memory 117).
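The down counter and decoder sequence described above for channel 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, and plain signed differences stand in for the decoded ADPCM steps, because the actual ADPCM arithmetic (Table 1 and the cited application) is not reproduced in this text. The sketch shows the 128 reproduction steps, the single offset step taken while END is "1", and the return to the idle state at -1.

```python
from dataclasses import dataclass
from typing import List, Sequence

SAMPLES_PER_SEGMENT = 128  # 16 ms at a 125 microsecond sampling period
IDLE = -1                  # counter value at which BUSY is "0"


@dataclass
class ChannelSequencer:
    """Down counter plus decoder of one channel (135 and 139 for channel 1)."""
    counter: int = IDLE

    def start(self) -> None:
        # The controller loads 128 when the start signal arrives.
        self.counter = SAMPLES_PER_SEGMENT

    @property
    def busy(self) -> bool:
        # BUSY is "1" whenever the counter is other than -1.
        return self.counter != IDLE

    @property
    def end(self) -> bool:
        # END is "1" only while the counter is 0.
        return self.counter == 0

    def tick(self) -> None:
        # One decrement per finished reproduction step.
        if self.busy:
            self.counter -= 1


def offset_code(deltas: Sequence[int]) -> int:
    """Value that brings the accumulated waveform back to 0 (assumed form)."""
    return -sum(deltas)


def reproduce(deltas: Sequence[int]) -> List[int]:
    """Run one channel: 128 difference steps, then one offset step."""
    if len(deltas) != SAMPLES_PER_SEGMENT:
        raise ValueError("expected 128 difference values")
    seq = ChannelSequencer()
    seq.start()
    offset = offset_code(deltas)
    acc = 0          # stands in for register 120, initial value 0
    step = 0
    out: List[int] = []
    while seq.busy:
        if seq.end:
            acc += offset          # offset PCM code applied while END is "1"
        else:
            acc += deltas[step]    # stand-in for one ADPCM reproduction step
            step += 1
        out.append(acc)
        seq.tick()
    return out


if __name__ == "__main__":
    example = [5] * 10 + [-3] * 118   # hypothetical difference values
    wave = reproduce(example)
    print(len(wave), wave[127], wave[-1])
```

In this sketch the channel stays busy for 129 sampling periods: 128 reproduction steps plus the offset step that returns the output to 0.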

[Table 1]

Similarly, at time t₂ a start signal ST2 is applied to the controller 150 from outside, and channel 2 begins reproducing the phoneme waveform corresponding to 13 in FIG. 4.

At this time the value 128 is set in the down counter 136 and the BUSY2 signal becomes "1". Reproduction then proceeds as for channel 1; after 128 sampling periods the down counter 136 reaches 0 and the BUSY2 signal becomes "0".

The same control is performed on channel 3 from time t₃, on channel 4 from time t₄, and on channel 1 again from time t₅.

In this way the four channels carry out reproduction and a synthesized speech output waveform is output at every sampling period; the reproduction processing of the channels is performed by time division within one sample period. FIG. 8a shows the timing of the processing within one sampling period, and FIGS. 8b and 8c show the corresponding flowchart.

The processing within one sampling period is divided into five cycles, which are executed in sequence. Most of the circuits needed for the ADPCM reproduction processing in each cycle are shared: among the components in FIG. 6, every component whose name carries no channel number is used in common by all cycles, and these are collectively called the "shared section".

(Cycle 1) The waveform reproduction processing assigned to channel 1 is performed using the input register 100, the pointer register 111, the down counter 135, the decoder 139 and the shared section, and the result is stored in the register 120.

(Cycle 2) The waveform reproduction processing assigned to channel 2 is performed using the input register 101, the pointer register 112, the down counter 136, the decoder 140 and the shared section, and the result is stored in the register 120.

(Cycle 3) The waveform reproduction processing assigned to channel 3 is performed using the input register 102, the pointer register 113, the down counter 137, the decoder 141 and the shared section, and the result is stored in the register 120.

(Cycle 4) The waveform reproduction processing assigned to channel 4 is performed using the input register 103, the pointer register 114, the down counter 138, the decoder 142 and the shared section, and the result is stored in the register 120.

(Cycle 5) The value of the register 120 is stored in the register 121 and becomes the PCM data for the synthesized speech output.

The control in these five cycles is shown in the flowcharts of FIGS. 8b and 8c.

Thus, in this embodiment, the phoneme waveform reproduction processing using the ADPCM codes assigned to each channel is carried out by time-division processing, using the per-channel components together with the commonly used shared section. In addition, the successive addition means (register 120 and adder/subtractor 119) used for waveform reproduction in each channel also serves as the addition means for superimposing the phoneme waveforms, so the synthesizer can be realized with a small circuit. Furthermore, the pitch period of the synthesized speech output can be changed easily by changing the interval between the start signals of the channels. Finally, because every phoneme waveform starts at 0 and ends at 0, adding the phonemes causes no sound quality degradation such as waveform discontinuities.

(Effects of the Invention)

As described above, the present invention makes it possible to build, with a simple circuit configuration, a speech synthesis circuit that allows the pitch period to be controlled and outputs high-quality synthesized speech.
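The five-cycle time-division processing described above can be illustrated with a small simulation. It is a sketch under several assumptions, not the circuit itself: function names are hypothetical, plain signed differences stand in for the decoded ADPCM steps, register 120 is read as a running accumulator shared by all channels, start signals are assigned to channels in rotation, and equation (1) is rounded up to a whole number of channels. The sketch checks that accumulating every channel's differences in one shared register gives the same output as reproducing each phoneme waveform separately and adding the results.

```python
import math
from typing import List, Sequence

N_CHANNELS = 4   # multiplicity n = 4, as assumed in the description
SEGMENT = 128    # reproduction steps per phoneme waveform (16 ms at 125 microseconds)
IDLE = -1        # down counter value of an idle channel


def required_channels(min_pitch_samples: int) -> int:
    """Equation (1) expressed in samples; rounding up is an assumption."""
    return math.ceil(SEGMENT / min_pitch_samples)


def synthesize(deltas: Sequence[int], starts: Sequence[int],
               n_periods: int) -> List[int]:
    """Time-division reproduction with one shared accumulator."""
    if len(deltas) != SEGMENT:
        raise ValueError("expected 128 difference values")
    offset = -sum(deltas)            # stand-in for the offset PCM code
    counters = [IDLE] * N_CHANNELS   # per-channel down counters (135 to 138)
    steps = [0] * N_CHANNELS         # per-channel read position
    start_times = set(starts)
    next_channel = 0                 # channels started in rotation
    acc = 0                          # shared accumulator (register 120)
    output: List[int] = []
    for t in range(n_periods):
        if t in start_times:         # start signal applied from outside
            if counters[next_channel] != IDLE:
                raise ValueError("pitch too short for the number of channels")
            counters[next_channel] = SEGMENT
            steps[next_channel] = 0
            next_channel = (next_channel + 1) % N_CHANNELS
        for ch in range(N_CHANNELS):     # cycles 1 to 4
            if counters[ch] == IDLE:
                continue
            if counters[ch] == 0:        # END is "1": add the offset code
                acc += offset
            else:
                acc += deltas[steps[ch]]
                steps[ch] += 1
            counters[ch] -= 1
        output.append(acc)               # cycle 5: register 120 to register 121
    return output


def overlap_add(deltas: Sequence[int], starts: Sequence[int],
                n_periods: int) -> List[int]:
    """Reference: reproduce one waveform, then add shifted copies."""
    wave, level = [], 0
    for d in deltas:
        level += d
        wave.append(level)
    wave.append(0)                       # value after the offset code
    total = [0] * n_periods
    for s in starts:
        for i, v in enumerate(wave):
            if s + i < n_periods:
                total[s + i] += v
    return total


if __name__ == "__main__":
    example = [6] * 8 + [-4] * 12 + [2] * 108   # hypothetical differences
    pitch = 40                                  # hypothetical pitch, in samples
    starts = [k * pitch for k in range(6)]
    print(required_channels(pitch))
    print(synthesize(example, starts, 400) == overlap_add(example, starts, 400))
```

Changing the spacing of the entries in `starts` changes the pitch of the output without touching the stored data, which mirrors the point made above about varying the interval between start signals. Note that in this sketch a channel is occupied for 129 periods including the offset step, so a pitch of exactly 32 samples would already be too short for four channels.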

[Brief explanation of drawings]

FIG. 1 is a diagram showing a speech waveform; FIG. 2 is a block diagram of a conventional speech synthesizer; FIG. 3 is a diagram showing the waveforms used when the pitch is changed; FIG. 4 is a diagram explaining the superposition of phoneme waveforms; FIG. 5 is a diagram showing one embodiment of the present invention; FIG. 6 is a diagram showing the data format of a phoneme waveform; FIG. 7 is a diagram showing the timing relationship among the phoneme waveforms and the ST, BUSY and END signals; FIG. 8a is a diagram showing the timing of the processing within one sample period; and FIGS. 8b and 8c are flowcharts showing the processing within one sample period.

100...first-channel input register, 101...second-channel input register, 102...third-channel input register, 103...fourth-channel input register, 104...selector, 105...selector, 106...register, 107...pointer movement amount memory, 108...adder, 109...selector, 110...selector, 111...first-channel pointer register, 112...second-channel pointer register, 113...third-channel pointer register, 114...fourth-channel pointer register, 115...selector, 116...pointer limiter, 117...quantization memory, 118...shift register, 119...adder/subtractor, 120...register, 121...register, 122...output terminal, 123...selector, 135...first-channel down counter, 136...second-channel down counter, 137...third-channel down counter, 138...fourth-channel down counter, 139...first-channel decoder, 140...second-channel decoder, 141...third-channel decoder, 142...fourth-channel decoder, 143...BUSY signal, 144...END signal, 145...start signal, 150...controller.

Claims

[Claims] A speech synthesizer in which, for each of a plurality of channels, unit data for phoneme waveform reproduction, each unit comprising a pointer initial value, a plurality of ADPCM code data and offset PCM code data, are input sequentially to reproduction means, a phoneme waveform is reproduced for each channel, and the phoneme waveforms of the channels, offset from one another by a pitch period, are added by addition means to obtain a decoded output, characterized in that:

the phoneme waveform input to each channel is a phoneme waveform whose initial value is 0 and whose segment length is set sufficiently longer than the pitch period;

the offset PCM code data is code data set so as to make the final value of the reproduced phoneme waveform 0;

the reproduction means comprises a plurality of input registers (100 to 103) that store the ADPCM code data of the phoneme waveform for the respective channels; a pointer movement amount memory (107) that outputs a pointer movement amount according to the ADPCM code data; a plurality of pointer registers (111 to 114), one per channel, each of which stores a pointer value, holds the pointer initial value as its initial value, and is updated with, and stores as a new pointer value, the result of adding in an adder (108) the pointer movement amount corresponding to the ADPCM code data and the immediately preceding pointer value; a quantization memory (117) that stores a plurality of predetermined quantization steps and outputs the quantization step value corresponding to the pointer value; a shift register (118) that can store and shift the quantization step value; and means (119, 120) for cumulatively adding and subtracting the output of the shift register (118); the reproduction means performing the phoneme waveform reproduction processing of the channels by time division within one sampling period; and

the cumulative addition/subtraction means (119, 120) also serves as the addition means that adds the phoneme waveforms of the channels offset by a pitch period.