JPH0449956B2

Info

Publication number: JPH0449956B2
Application number: JP58167534A
Authority: JP
Inventors: Makoto Morito; Takashi Yato; Takashi Miki
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1983-09-13
Filing date: 1983-09-13
Publication date: 1992-08-12
Also published as: JPS6059398A

Description

[Detailed description of the invention]

(Technical Field) The present invention relates to a speech synthesizer that reads waveform-domain information on phoneme waveforms from a storage area and synthesizes speech, and in particular to a speech synthesizer that obtains a speech output by adding together, and thereby superimposing, a plurality of ADPCM-encoded symmetric phoneme waveforms.

(Prior Art) In voiced speech, the airflow from the lungs is turned by the vocal cords into a quasi-periodic stream of impulses, and the sound is produced when this stream resonates in the cavity resonator called the vocal tract. FIG. 1 shows the waveform of a human voice. As FIG. 1 shows, almost the same waveform repeats in every period, called the "pitch". As noted above, this is because the human voice is produced by a quasi-periodic impulse stream; the pitch period equals the interval between the impulses.

Several methods of synthesizing such speech exist, differing in the form in which the speech information is stored. One method stores waveforms of one pitch period (called phoneme waveforms) for various speech sounds in a storage area and outputs speech by joining these phoneme waveforms according to control information. FIG. 2 shows the configuration of this method. 1 is a storage area that holds one-pitch-period phoneme waveforms for various speech sounds, 2 is a storage area for the control information, such as pitch and amplitude scaling, used to join the phoneme waveforms, and 3 is a synthesis section that joins the speech segments.

The synthesis section 3 synthesizes speech by joining the waveforms stored in storage area 1 according to the control information stored in storage area 2. To produce natural-sounding synthesized speech, the pitch and the loudness of the voice must be controlled appropriately. Loudness is controlled by multiplying the stored phoneme waveform by a constant factor given by the amplitude scaling information in the storage area. Pitch is controlled by the pitch information in storage area 2. However, each phoneme waveform in storage area 1 has the length of the pitch period of the speech from which it was extracted, and this does not necessarily match the pitch given by the control information.
Therefore, when the pitch given by the control information is shorter than the speech segment, the tail of the segment is cut off, and when it is longer, the last value of the segment is extended, so that the segment has the same pitch length as the control information (see FIG. 3). However, cutting a phoneme waveform from storage area 1 partway has the drawback that, if the waveform has not decayed sufficiently, a discontinuity arises at the junction with the next phoneme waveform and degrades the synthesized sound. Conversely, lengthening a phoneme waveform has the drawback that it distorts the spectrum of the waveform and degrades sound quality.

(Object and Summary of the Invention) The object of the present invention is to eliminate these drawbacks. In a speech synthesizer that superimposes multiple channels of ADPCM-encoded symmetric phoneme waveforms, each shifted from the next by the pitch period, the ADPCM code reproduction is time-multiplexed, and the adder used to superimpose the symmetric phoneme waveforms is shared with the cumulative addition/subtraction means inside the ADPCM code reproducer. This yields a speech synthesizer that produces natural, high-quality speech with a simple circuit configuration, as described in detail below.

(Premise of the Invention) As noted above, speech is produced when the impulse stream from the vocal cords resonates, and this process can be modeled by an electrical circuit. That is, speech can be regarded as the superposition of the output waveforms of a resonant filter (hereinafter called phoneme waveforms), each corresponding to one impulse emitted by an excitation circuit that plays the role of the vocal cords. This is explained with reference to FIG. 4. In FIG. 4, 10 is the impulse train emitted by the excitation circuit; the interval between successive impulses is the pitch period. 11 is the synthesized speech output waveform obtained by driving the resonant filter with this impulse train. 12 is the phoneme waveform obtained when the resonant filter is driven by impulse P₁ of the impulse train alone. Likewise, 13 to 17 are the phoneme waveforms obtained when the resonant filter 4 is driven by impulses P₂ to P₆.
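As an illustration of this superposition model, the following Python sketch adds copies of one decaying waveform, each delayed by one pitch period. The damped-sine shape, the function names, and every numeric value are assumptions chosen for the demonstration; none of them comes from the patent.

```python
import math

def impulse_response(length, decay=0.05, freq=0.3):
    """Hypothetical resonant-filter response: a decaying sinusoid starting at 0."""
    return [math.exp(-decay * k) * math.sin(freq * k) for k in range(length)]

def superimpose(response, excitation_times, total_length):
    """Add one copy of `response` at each excitation time (overlap-add)."""
    output = [0.0] * total_length
    for start in excitation_times:
        for k, value in enumerate(response):
            if start + k < total_length:
                output[start + k] += value
    return output

response = impulse_response(128)                 # one phoneme waveform, 128 samples
pitch_period = 40                                # assumed pitch period in samples
starts = [i * pitch_period for i in range(6)]    # impulses P1 to P6
speech = superimpose(response, starts, 400)
```

Before the second impulse arrives, `speech` is just the first response; after it, each output sample is the sum of all responses still in progress, which is why several reproductions have to run at once when the waveform is longer than the pitch period.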
Since the driving impulse train 10 emitted by the excitation circuit is the sum of impulses P₁ to P₆, the superposition theorem gives the synthesized speech output waveform 11 as the sum of the phoneme waveforms 12 to 17.

The phoneme waveform 12 in FIG. 4 is 0 before time t₁. After t₁ it decays with time and becomes 0 at infinite time. In practice, a speech phoneme waveform can be regarded as essentially 0 once 16 milliseconds have elapsed from the excitation point (t₁). The present invention uses even-symmetric phoneme segments whose values at both ends are approximately 0. To create such segments, two conditions are imposed: (1) every phoneme segment has the same length, and (2) as FIG. 4 also shows, the segment length is set sufficiently long relative to the pitch period (for example, 16 milliseconds). With the segment length set this long, the waveform of a phoneme segment computed from the inverse Fourier transform of the spectral envelope has values of approximately 0 at both ends. Reproducing phoneme waveform 12 for 16 milliseconds from the excitation time (t₁) is therefore sufficient.

However, reproduction of phoneme waveform 13 must start at excitation time t₂, when reproduction of phoneme waveform 12 has not yet finished, so multiplexed reproduction is required. The required waveform reproduction multiplicity n is given by equation (1):

n = 16 milliseconds / (minimum pitch period)    (1)

For ordinary speech, n of about 4 is sufficient, so n = 4 is assumed in the following description.

ADPCM codes, which have high coding efficiency, are used as the codes for reproducing the phoneme waveforms, and even-symmetric waveforms, which need only half the information, are used as the phoneme waveforms. A reproducer that reproduces a symmetric phoneme waveform from a series of ADPCM codes is proposed in Japanese Patent Application No. Sho 56-185489, so a detailed description is omitted here.

In differential decoding such as ADPCM code reproduction, if reproduction is stopped after 16 milliseconds (128 sample periods), the final output value is held. Therefore, to satisfy the phoneme waveform conditions, namely (Condition 1) the waveform is 0 before the excitation time and (Condition 2) the waveform is 0 at infinite time, both the initial reproduction value and the final reproduction output value must be 0. Because the present invention handles even-symmetric waveforms, setting the initial reproduction value to 0 makes the final reproduction value 0 as well, and the conditions are satisfied.

(Embodiment of the Invention) FIG. 5 shows one embodiment of the present invention. Here the term "channel" or "ch" is used to number the processes that are executed by time division within one sample period.

The parts in FIG. 5 are as follows. 100 is the first-channel input register, 101 is the second-channel input register, 102 is the third-channel input register, 103 is the fourth-channel input register, 104 and 105 are selectors, 106 is a register that stores the ADPCM code data Ln, 107 is a pointer movement amount memory that outputs a pointer movement amount Dn using the value of register 106 as an address, 108 is an adder/subtractor, 109 and 110 are selectors, 111 is the first-channel pointer register, 112 is the second-channel pointer register, 113 is the third-channel pointer register, 114 is the fourth-channel pointer register, 115 is a selector, 116 is a pointer limiter that restricts the pointer value Pn to a specific range and outputs it as the pointer limiter value Pn′, 117 is a quantization memory that stores the quantization step values X, 118 is a shift register, 119 is an adder/subtractor, 120 is a register, 121 is a register that outputs the synthesized speech, 122 is an output terminal, 135 is the first-channel down counter, 136 is the second-channel down counter, 137 is the third-channel down counter, 138 is the fourth-channel down counter, 139 is the first-channel decoder, 140 is the second-channel decoder, 141 is the third-channel decoder, 142 is the fourth-channel decoder, 143 is the BUSY signal of each channel, 144 is the LEFT signal, which indicates that the left half of the symmetric waveform is being reproduced, 145 is the start signal for waveform reproduction, and 150 is a controller. The controller 150 is connected to each of these circuits, from the first-channel input register 100 and second-channel input register 101 through the third-channel decoder 141 and fourth-channel decoder 142, and controls their operation; the connections are omitted from FIG. 5 to keep the drawing from becoming too complicated.

FIG. 6 shows the data used in the present invention to reproduce an even-symmetric phoneme waveform. The data consists of a pointer initial value DPn and 64 ADPCM codes Ln₁, Ln₂, Ln₃, ..., Ln₆₄. As also described in Japanese Patent Application No. Sho 56-185489, 64 ADPCM codes are needed to reproduce an even-symmetric waveform of 16 milliseconds (128 sample periods). These 64 ADPCM codes are reproduced in the forward direction and then in the reverse direction to reproduce a phoneme waveform of 128 sample periods. The pointer initial value DPn is the data loaded into the pointer when reproduction of the symmetric waveform starts.

FIG. 7 shows the relationship between time and the reproduction process assigned to each channel. In FIG. 7, 200 is the synthesized output waveform, 201 is the phoneme waveform reproduced by the process assigned to channel 1, 202 is the start signal ST1 for channel 1, 203 is the BUSY signal BUSY1 for channel 1, and 204 is the LEFT signal LEFT1 for channel 1. Likewise, 205, 209, and 213 are the phoneme waveforms reproduced by the processes assigned to channels 2, 3, and 4; 206, 210, and 214 are the start signals ST2, ST3, and ST4 for channels 2, 3, and 4; 207, 211, and 215 are the BUSY signals BUSY2, BUSY3, and BUSY4 for channels 2, 3, and 4; and 208, 212, and 216 are the LEFT signals LEFT2, LEFT3, and LEFT4 for channels 2, 3, and 4.

The embodiment is now described in detail with reference to FIGS. 5 to 7. At time t₁, the start signal ST1 is applied to the controller 150 from outside, and channel 1 starts reproducing the phoneme waveform corresponding to 12 in FIG. 4. At this time the controller 150 sets the down counter 135 to 128. This value corresponds to the 16 milliseconds mentioned above (with an output sampling period of 125 microseconds, 16 milliseconds is 128 sampling periods). The decoder 139 outputs "1" as the BUSY1 signal whenever the output of the down counter 135 is not 0, and outputs "1" as the LEFT1 signal whenever that output is 65 or more.
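The counter and decoder behavior just described can be sketched in a few lines of Python. The function and variable names are hypothetical; only the thresholds (nonzero for BUSY, 65 or more for LEFT) and the initial value 128 are taken from the text.

```python
SEGMENT_SAMPLES = 128   # 16 ms at a 125 microsecond sampling period

def decode(counter):
    """Return (BUSY, LEFT) for a down-counter value, as the decoder is described."""
    busy = 1 if counter != 0 else 0
    left = 1 if counter >= 65 else 0
    return busy, left

def run_channel():
    """Step one channel from start to finish, recording (BUSY, LEFT) per step."""
    counter = SEGMENT_SAMPLES
    trace = []
    while counter != 0:
        trace.append(decode(counter))
        counter -= 1            # decremented after each reproduction step
    return trace, decode(counter)

trace, idle = run_channel()
```

In this sketch the first 64 steps see LEFT = 1 (counter values 128 down to 65), the next 64 see LEFT = 0, and the channel then reports BUSY = 0.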
FIG. 7 shows the timing of these signals. Thereafter, at every sampling period (125 microseconds), waveform reproduction data is read from the input register 100 and symmetric waveform reproduction using the ADPCM codes is performed. Each time one reproduction step finishes, the down counter 135 is decremented by 1. Once the 64 ADPCM codes have been reproduced in order from the first in the forward direction, that is, from the 65th reproduction step onward, the LEFT1 signal becomes "0" and the 64 ADPCM codes are reproduced in reverse order from the last (ADPCM reverse reproduction). When the 128th reproduction step finishes, the down counter 135 reaches 0 and the BUSY1 signal becomes "0".

Table 1 shows the ADPCM waveform reproduction computation (excluding the pointer movement computation) that the controller 150 carries out in the register 120 and the adder/subtractor 119, using the ADPCM code Ln stored in the per-channel input registers 100, 101, 102, and 103 and input through the selector 104, the values shifted down by the shift register 118 ([X], [1/2 X], [1/4 X], [1/8 X], where [X] is the output of the quantization memory 117), and the LEFT signal 144.

[Table 1]
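Table 1 is not reproduced in this text. The Python sketch below therefore does not implement its contents; it only illustrates why a symmetric reproduction that starts at 0 also ends at 0. It assumes the codes have already been turned into signed differences, and it assumes the reverse pass subtracts those same differences in reverse order. Real ADPCM step-size adaptation (pointer, quantization memory, shift register) is left out, and all names and values are hypothetical.

```python
def reproduce_symmetric(differences, initial=0):
    """Accumulate differences forward, then undo them in reverse order.

    Simplified stand-in for ADPCM reproduction: `differences` are already
    decoded signed steps (an assumption; real ADPCM adapts the step size).
    """
    value = initial
    samples = []
    for d in differences:             # left half (LEFT = 1): add
        value += d
        samples.append(value)
    for d in reversed(differences):   # right half (LEFT = 0): subtract
        value -= d
        samples.append(value)
    return samples

diffs = [3, 5, 2, -1, -4, 1, 2, -3]   # hypothetical steps (8 instead of 64)
wave = reproduce_symmetric(diffs)
```

Because every addition in the first half is cancelled by a matching subtraction in the second half, the last sample returns to the initial value, and the samples mirror each other around the midpoint.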

Similarly, at time t₂ the start signal ST2 is applied to the controller 150 from outside, and channel 2 starts reproducing the phoneme waveform corresponding to 13 in FIG. 4. At this time the down counter 136 is set to 128 and the BUSY2 signal becomes "1". Reproduction then proceeds as for channel 1, and after 128 sampling periods the down counter 136 reaches 0.
The BUSY2 signal then becomes "0". From time t₃ the same control is performed on channel 3, from time t₄ on channel 4, and from time t₅ on channel 1 again.

Reproduction is thus carried out by four channels, and the synthesized speech output waveform is output at every sampling period; the reproduction for each channel is performed by time division within one sampling interval. FIG. 8a shows the timing of the processing within one sampling period, and FIGS. 8b and 8c show the corresponding flowcharts.

Processing within one sampling period is divided into five cycles, which are executed in sequence. Most of the circuitry needed for the ADPCM reproduction performed in each cycle is shared: every component in FIG. 5 whose name does not include a channel number is used in common by all cycles, and these components are collectively called the "shared section".

(Cycle 1) The waveform reproduction assigned to channel 1 is performed using the input register 100, pointer register 111, down counter 135, decoder 139, and the shared section, and the result is stored in the register 120.

(Cycle 2) The waveform reproduction assigned to channel 2 is performed using the input register 101, pointer register 112, down counter 136, decoder 140, and the shared section, and the result is stored in the register 120.

(Cycle 3) The waveform reproduction assigned to channel 3 is performed using the input register 102, pointer register 113, down counter 137, decoder 141, and the shared section, and the result is stored in the register 120.

(Cycle 4) The waveform reproduction assigned to channel 4 is performed using the input register 103, pointer register 114, down counter 138, decoder 142, and the shared section, and the result is stored in the register 120.

(Cycle 5) The value of the register 120 is stored in the register 121 as the PCM data for the synthesized speech output.

The flowcharts of FIGS. 8b and 8c show the control in these five cycles.

Thus, in this embodiment, symmetric phoneme waveform reproduction from the ADPCM codes assigned to each channel is carried out by time division, using the per-channel components together with the shared section. In addition, the sequential addition means used for waveform reproduction in each channel (register 120 and adder/subtractor 119) also serves as the addition means for superimposing the phoneme waveforms, so the design can be realized with a small circuit. Furthermore, the pitch period of the synthesized speech output can be changed easily by changing the interval between the channels' start signals. Finally, because every phoneme waveform starts at 0 and ends at 0, adding the phoneme waveforms does not cause sound quality degradation such as waveform discontinuities.

(Effects of the Invention) As described above, the present invention makes it possible to build, with a simple circuit configuration, a speech synthesis circuit that allows the pitch period to be controlled and outputs high-quality synthesized speech.
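One way to read the shared accumulator is sketched below in Python. It assumes that the accumulator is never reset between samples and that each busy channel adds its next signed difference to it once per sampling period; because each symmetric waveform's differences sum to 0, a finished channel leaves no residue. This reading, the rounding-up in `multiplicity`, the round-robin start logic, and all names are assumptions for illustration, and ADPCM step adaptation is again omitted.

```python
import math

def multiplicity(segment_ms, min_pitch_ms):
    """Equation (1), rounded up to a whole number of channels (an assumption)."""
    return math.ceil(segment_ms / min_pitch_ms)

def step_sequence(diffs):
    """Signed steps of one symmetric waveform: add forward, subtract in reverse."""
    return list(diffs) + [-d for d in reversed(diffs)]

def synthesize(diffs, pitch, n_samples, n_channels=4):
    """Time-division reproduction with one accumulator shared by all channels."""
    steps = step_sequence(diffs)
    counters = [0] * n_channels       # per-channel down counters (0 = idle)
    accumulator = 0                   # shared across channels and samples
    output = []
    started = 0
    for t in range(n_samples):
        if t % pitch == 0:            # external start signal, round-robin
            counters[started % n_channels] = len(steps)
            started += 1
        for ch in range(n_channels):  # one cycle per channel
            if counters[ch] != 0:
                accumulator += steps[len(steps) - counters[ch]]
                counters[ch] -= 1
        output.append(accumulator)    # last cycle: latch the PCM sample
    return output

diffs = [3, 5, 2, -1, -4, 1, 2, -3]   # hypothetical steps (8 instead of 64)
out = synthesize(diffs, pitch=5, n_samples=40)
channels_needed = multiplicity(16, 4)
```

Changing `pitch` changes only the spacing of the start signals, so the pitch of the output varies while the stored waveform data stays the same. The sketch requires `pitch * n_channels` to be at least the waveform length, so that a channel is never restarted while it is still busy.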

[Brief description of the drawings]

FIG. 1 shows a speech waveform. FIG. 2 is a block diagram of a conventional speech synthesizer. FIG. 3 shows the waveforms used when the pitch is changed. FIG. 4 illustrates the superposition of phoneme waveforms. FIG. 5 shows one embodiment of the present invention. FIG. 6 shows the data format of a phoneme waveform. FIG. 7 shows the timing relationship among the phoneme waveforms and the ST, BUSY, and LEFT signals. FIG. 8a shows the timing of the processing within one sample period, and FIGS. 8b and 8c are flowcharts of the processing within one sample period.

100 ... first-channel input register, 101 ... second-channel input register, 102 ... third-channel input register, 103 ... fourth-channel input register, 104 ... selector, 105 ... selector, 106 ... register, 107 ... pointer movement amount memory, 108 ... adder/subtractor, 109 ... selector, 110 ... selector, 111 ... first-channel pointer register, 112 ... second-channel pointer register, 113 ... third-channel pointer register, 114 ... fourth-channel pointer register, 115 ... selector, 116 ... pointer limiter, 117 ... quantization memory, 118 ... shift register, 119 ... adder/subtractor, 120 ... register, 121 ... register, 122 ... output terminal, 135 ... first-channel down counter, 136 ... second-channel down counter, 137 ... third-channel down counter, 138 ... fourth-channel down counter, 139 ... first-channel decoder, 140 ... second-channel decoder, 141 ... third-channel decoder, 142 ... fourth-channel decoder, 143 ... BUSY signal, 144 ... LEFT signal, 145 ... start signal, 150 ... controller.

Claims

[Scope of Claims] 1. A speech synthesizer in which items of phoneme waveform reproduction data, each comprising a pointer initial value and a plurality of multi-bit ADPCM code data, are sequentially input to a reproduction means in correspondence with a plurality of channels, an even-symmetric phoneme waveform is reproduced for each channel, the phoneme waveforms of the channels, each shifted from the next by the pitch period, are added by an adding means, and the addition result is obtained as a decoded output, characterized in that:

the phoneme waveform of each channel is an even-symmetric phoneme waveform whose length is set sufficiently longer than the pitch period, which has a constant time length long enough for the final value to become 0 through waveform attenuation, and whose initial and final values are 0;

the reproduction means comprises a plurality of input registers, one for each channel, that store the ADPCM code data; a pointer movement amount memory (107) that outputs a pointer movement amount according to each ADPCM code data; an adder/subtractor (108) that adds or subtracts the pointer movement amount and the previous pointer value; a plurality of pointer registers, one for each channel, each of which stores a pointer value, holds the pointer initial value as its initial value, and, in response to the input of ADPCM code data, is updated to store the result of the addition or subtraction by the adder/subtractor (108) as the new pointer value; a quantization memory (117) that outputs a quantization step value; a shift register (118) that receives and stores the quantization step value and can shift it; and means (119, 120) for cumulatively adding and subtracting the output of the shift register (118);

the reproduction means performs the phoneme waveform reproduction for each channel by time division within the sampling time; and

the cumulative addition/subtraction means (119, 120) also serves as the adding means that adds the phoneme waveforms of the channels shifted by the pitch period.