EP0144724A1

EP0144724A1 - Speech synthesizing apparatus

Info

Publication number: EP0144724A1
Application number: EP84113148A
Authority: EP
Inventors: Kazuo Sumita
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1983-11-04
Filing date: 1984-10-31
Publication date: 1985-06-19
Also published as: JPS6098498A

Abstract

A speech synthesizing apparatus including a periodic sound signal generator circuit (10) for generating a voiced sound signal synchronously with a clock pulse in accordance with pitch data corresponding to a speech to be synthesized; and a digital filter (1) for filter-processing the voiced sound signal generated from the periodic sound signal generator circuit (10) in accordance with synthesis parameters which are generated in correspondence to a speech to be synthesized. This periodic sound signal generator circuit (10) includes a memory (101) which has stored therein a predetermined number of sampling values within a predetermined time range of a continuous wave signal which is obtained by developing the voiced sound signal based on an interpolation function; and a processing unit (100, 102) for sequentially reading out the sampling values of the continuous wave signal within the predetermined time range that are synchronized with the pitch period represented by the input pitch data from the memory (101) in response to the clock pulse.

Description

The present invention relates to a speech synthesizing apparatus for generating synthetic speech with excellent quality.
Recently, various kinds of speech synthesizing apparatuses for generating desired synthetic speech have been developed. For example, as shown in Fig. 1, this kind of speech synthesizing apparatus usually comprises a vocal-tract-approximating digital filter 1 for setting the features of the speech to be synthesized in accordance with synthesis parameters; a random noise generator 2; a periodic sound generator 3 for generating a sound signal at a pitch interval that depends upon the fundamental frequency of the speech to be synthesized; a selector 4 for selectively supplying output signals from the generators 2 and 3 to the digital filter 1 depending upon whether the speech to be synthesized is voiced or unvoiced; and a control circuit 5 for supplying synthesis parameters to the digital filter 1 and pitch data to the periodic sound generator 3 in accordance with the speech to be synthesized, and for supplying sampling pulses to these circuits. The digital filter 1 receives linear predictive coefficients, line spectral pair (LSP) parameters, cepstrum coefficients, etc. as synthesis parameters depending upon the speech synthesizing method, e.g., linear predictive coding, the LSP method, the cepstrum method, etc., and generates the speech data corresponding to the speech to be synthesized. Where the speech to be synthesized is periodic voiced speech, a voiced sound signal such as an impulse, triangular wave, or glottal pulse wave, which is generated from the periodic sound generator 3 at the pitch interval corresponding to the above-mentioned pitch data, is supplied to the digital filter 1 through the selector 4. On the other hand, where the speech to be synthesized is unvoiced, random noise from the random noise generator 2 is supplied to the digital filter 1 through the selector 4.
The digital filter 1 filters the voiced sound signal or random noise that is selectively supplied through the selector 4 in accordance with the synthesis parameters from the control circuit 5, thereby generating the synthetic speech signal. The synthesis parameters are updated, for example, at every frame of about 10 msec or at every time interval synchronized with the pitch period. Since the synthetic speech signal which is generated from the digital filter 1 is a discrete signal, it is converted to an analog signal by a D/A converter 6, and thereafter it is supplied to an electric-acoustic converter 7 through a low-pass filter 8 having a cut-off frequency which is less than half of the frequency of the sampling pulse which is supplied from the control circuit 5.
In synthesizing voiced speech, the voiced sound signal is generated from the periodic sound generator 3 synchronously with the sampling pulse at an interval substantially equal to the interval determined by the input pitch data. It is then filtered in the digital filter 1 synchronously with the sampling pulse. For instance, where impulses are used as the voiced sound signal, as shown in Fig. 2A, the periodic sound generator 3 generates impulses spaced by the integer multiple of the sampling period that most closely approximates the interval determined by the input pitch data. Each impulse is supplied through the selector 4 to the digital filter 1, where it is filtered, generating the speech as shown in Fig. 2B.
Generally, since the pitch of a man's voice is low and its pitch period is long, a pitch period equal to N times the sampling period (N being an integer) can be set with a relatively high degree of accuracy in accordance with the input pitch data. However, where the pitch is high, the pitch period is short, and the pitch period varies frequently, as in the voice of a woman or child, it is difficult to approximate the pitch period corresponding to the input pitch data by N times the sampling period. Furthermore, a pitch period constrained to N times the sampling period cannot follow variations in the pitch period smoothly. In such a case, a vibrato-like component is mixed into the synthetic speech, causing the sound quality to deteriorate. In addition, such a discontinuous change of frequency in the synthetic speech is perceived by listeners as an unpleasant sound.
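The effect of rounding the pitch period to a whole number of sampling periods can be illustrated numerically. The following is a minimal sketch and not part of the original disclosure: the function name and the period of 80.4 sampling periods for a low voice are assumptions chosen for illustration, while 25.4 sampling periods is the value used later in this description.

```python
def rounding_error(pitch_period_in_samples):
    """Relative pitch error when the period is rounded to a whole number of samples."""
    nearest = round(pitch_period_in_samples)
    return abs(nearest - pitch_period_in_samples) / pitch_period_in_samples


low_voice = rounding_error(80.4)   # long pitch period (assumed value)
high_voice = rounding_error(25.4)  # short pitch period (value used in the example below)

print(f"low voice:  {low_voice:.2%}")
print(f"high voice: {high_voice:.2%}")
```

The absolute error is 0.4 sampling periods in both cases, but relative to the pitch period it is roughly three times larger for the shorter period, which is why the rounding is more audible for high voices.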
To solve this kind of problem, a method is considered whereby the interval determined by the input pitch data is set to have a length which is N times the sampling period with a high degree of approximation by increasing the frequency of the sampling pulse which is used. However, in this case, in order to impart the proper synthesis parameter to the digital filter from the control circuit 5 in accordance with each sampling pulse, it is required to store a great amount of synthesis parameters into a memory (not shown) in the control circuit 5, causing the readout control of these synthesis parameters to be complicated.
It is an object of the present invention to provide a speech synthesizing apparatus for generating synthetic speech with excellent quality irrespective of the length and variation of the pitch period.
This object is accomplished by a speech synthesizing apparatus comprising a control circuit for generating clock pulses, synthesis parameters and pitch data corresponding to a speech to be synthesized; a memory which has stored therein a predetermined number of sampling data corresponding to a predetermined number of sampling values within a predetermined time range of a continuous wave which is obtained by developing a voiced sound signal by use of an interpolation function; a readout control circuit for sequentially reading out the sampling data of the continuous wave within the predetermined time range that is synchronized with a pitch period represented by the input pitch data from the memory in response to the clock pulse; a digital filter for filtering-processing the sampling data read out synchronously with the clock pulse from that memory in accordance with the synthesis parameters generated from the control circuit; and a speech generator circuit for generating speech corresponding to the digital data from this digital filter.
In this invention, the sampling values corresponding to the generation timings of the clock pulse of the continuous wave which is generated within a predetermined time range synchronously with the pitch period are sequentially generated. Therefore, irrespective of the length of the pitch period, a voiced sound signal corresponding to the continuous wave which is always synchronized with the pitch period is supplied to the digital filter so that a synthesized speech of excellent quality can be obtained.
This invention can be more fully understood from the following detailed description when taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a block diagram showing a conventional speech synthesizing apparatus;
Figs. 2A and 2B are signal waveform diagrams for explaining the operation of the speech synthesizing apparatus of Fig. 1;
Fig. 3 is a block diagram of a speech synthesizing apparatus of the present invention;
Figs. 4A to 4D are signal waveform diagrams for explaining the operation of the speech synthesizing apparatus shown in Fig. 3;
Fig. 5 is a flow chart explaining the operation of the voiced sound generator circuit shown in Fig. 3;
Fig. 6 is a signal waveform diagram explaining the operation of the voiced sound generator circuit shown in Fig. 3; and
Fig. 7 is a block diagram of the voiced sound generator circuit of a speech synthesizing apparatus according to another embodiment of the invention.

Fig. 3 is a block diagram of a speech synthesizing apparatus according to one embodiment of the present invention. This speech synthesizing apparatus is constituted similarly to the speech synthesizing apparatus shown in Fig. 1 except that it has a periodic sound generator circuit 10 in place of the periodic sound generator 3. The periodic sound generator circuit 10 comprises a central processing unit (CPU) 100; a read only memory (ROM) 101; a random access memory (RAM) 102; an I/O port 103 which receives the pitch data from the control circuit 5; and an I/O port 104 which supplies the voiced sound data to the selector 4.
The fundamental operation of the periodic sound generator circuit 10 will be explained hereinbelow. It is now assumed that the pitch data representative of the pitch period which is equal to m (integer) times the sampling period T is supplied from the control circuit 5 and that the periodic sound generator circuit 10 generates the voiced sound data indicative of the impulse x₀(n) at the n-th sampling timing as shown in Fig. 4A in response to this pitch data. A continuous wave signal x_a(t) which is obtained by developing this impulse x₀(n) by the interpolation function based on the sampling theorem is given by the following equation:
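As a sketch of equation (1), assuming the interpolation function is the sinc kernel of the sampling theorem (the standard form of that theorem, given here as an assumption rather than as a quotation of the original equation), the continuous wave signal would read:

```latex
x_a(t) = \sum_{n=-\infty}^{\infty} x_0(n)\,
         \frac{\sin\{\pi (t - nT)/T\}}{\pi (t - nT)/T}
\qquad (1)
```

Under that assumption, a single impulse of amplitude x₀ at n = 0 gives x_a(t) = x₀ · sin(πt/T)/(πt/T), which is zero at every nonzero integer multiple of T.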
As this interpolation function, it is possible to use, for instance, a Lagrangean polynomial, a spline function or the like.
As shown in Fig. 4B, this continuous wave signal x_a(t) extends over the time interval from -∞ to +∞; however, the signal component x_b(t) in a predetermined time range around time 0 can be regarded as substantially equivalent to the continuous wave signal x_a(t). For instance, when using a time window ω(t), such as a square window or a Hamming window, which becomes 0 in the ranges t ≤ -2T and t ≥ 2T and becomes 1 in the range -2T < t < 2T, the signal component x_b(t) shown in Fig. 4C is obtained. This signal component x_b(t) is given by the following equation:
$$x_b(t) = \omega(t)\, x_a(t) \qquad (2)$$
By sampling this signal component x_b(t) at intervals of T/N over the time range -2T ≤ t ≤ 2T, as shown in Fig. 4D, 2N items of sampling data are obtained on each side of time 0, giving SD(-2N) to SD(0) for the range -2T ≤ t ≤ 0 and SD(0) to SD(2N) for the range 0 ≤ t ≤ 2T. These sampling data SD(i) (i is an integer in the range -2N ≤ i ≤ 2N) are given by the following equation:
$$SD(i) = x_b\!\left(i \cdot \frac{T}{N}\right) \qquad (3)$$
Namely, the sampling data SD(-2N)...SD(0)...SD(2N) correspond to the sampling data at the sampling points -2N...0...2N in Fig. 4D. In Fig. 4D, although N is set to be 5, N can be set to another value.
The items of sampling data SD(-2N)...SD(0)...SD(2N) are respectively stored in memory areas M(-2N)...M(0)...M(2N) of the ROM 101. These memory areas M(-2N)...M(0)...M(2N) are respectively designated by address data A[0]...A[2N]...A[4N].
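The table stored in the ROM 101 can be sketched as follows. This is an illustration under stated assumptions, not the patented circuit: it assumes a unit-amplitude impulse, a sinc interpolation function, and a square window that is zero for |t| ≥ 2T. The function name `build_rom` and its parameters are hypothetical.

```python
import math


def build_rom(n_sub=5, half_width=2):
    """Return a list rom where rom[a] holds SD(a - 2N), so A[0] maps to SD(-2N).

    n_sub is N (sub-steps per sampling period T); half_width is the window
    half-width in units of T (2 in the embodiment). Both names are assumptions.
    """
    rom = []
    for address in range(2 * half_width * n_sub + 1):  # addresses 0 .. 4N
        i = address - half_width * n_sub               # sampling point -2N .. 2N
        t = i / n_sub                                  # time in units of T
        if abs(t) >= half_width:
            value = 0.0                                # square window is 0 at |t| >= 2T
        elif i == 0:
            value = 1.0                                # impulse amplitude (assumed 1)
        else:
            value = math.sin(math.pi * t) / (math.pi * t)  # assumed sinc kernel
        rom.append(value)
    return rom


rom = build_rom()
print(len(rom), rom[10])
```

With N = 5 the table has 4N + 1 = 21 entries, and the entries at addresses 0, 5, 15 and 20 are zero (to within floating-point rounding) because the assumed sinc kernel vanishes at nonzero multiples of T.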
The operation of the periodic sound generator circuit 10 shown in Fig. 3 will now be explained with reference to the flow chart shown in Fig. 5. It is now assumed that a pitch period PX of the voiced sound signal that should be synthesized is given by the following equation:
$$PX = P1 \times T + P2 \times \frac{T}{N} \qquad (4)$$
where P1 and P2 are integers and 0 ≦ P2 < N.
In the initial state, the contents of the memory areas MR1 to MR4 are all cleared.
When the pitch data indicative of an interval PX is supplied to the I/O port 103 in this state, a check is made in STEP 1, synchronously with the sampling pulse, to see if pitch count data PCD stored in the memory area MR3 is 0 or less. When it is detected that the pitch count data PCD is 0 or less, deviation time data DT2 stored in the memory area MR2 is stored in the memory area MR1 as deviation time data DT1. At the same time, new deviation time data DT2 which is given by the following equation is stored in the memory area MR2:
$$DT2 = DT1 + T - \left(PX - T \cdot R_e\{PX/T\}\right) \qquad (5)$$
where R_e{X} is a function representing a value which is equal to an integer portion of X. This equation (5) denotes that the deviation time data DT2 is obtained by the sum of the time that is given by the deviation time data DT1, i.e., the time between the starting time of the pitch period in the present cycle and the leading edge of the sampling pulse generated simultaneously with or immediately after the start of this pitch period, and the time difference between the fraction interval (interval corresponding to P2 x (T/N) in equation (4)) of the pitch interval expressed by the pitch data given in the next operation cycle and one sampling period T. DT2/T has a value that is equal to or greater than 0 but less than 2.
Thereafter, in STEP 2, a check is made to see if this deviation time data DT2 is T or more. When it is detected that DT2/T is less than 1, namely, where the deviation time data DT2 represents the time difference between the end of the pitch period of the present cycle and the sampling pulse generated simultaneously with or immediately after this end of the pitch period, the pitch count data PCD which is given by the following equation is written in the memory area MR3:
$$PCD = R_e\{PX/T\} + 1 \qquad (6)$$
On the other hand, in STEP 2, when it is detected that DT2/T is 1 or more, i.e., that the deviation time data DT2 indicates the time difference between the end of the pitch period in the present cycle and the second sampling pulse generated thereafter, the pitch count data PCD which is given by the following equation is written in the memory area MR3:
$$PCD = R_e\{PX/T\} \qquad (7)$$
In this case, the value obtained by subtracting one sampling period T from this deviation time data DT2 is written in the memory area MR2 as new deviation time data. After the pitch count data PCD has been written in the memory area MR3 in this way, the contents of the memory area MR4 are cleared.
Thereafter, in STEP 3, a check is made to see if count data CD of the memory area MR4 has a predetermined value Z or less. This predetermined value Z determines the amount of sampling data which is read out from the ROM 101 in each operation cycle and is set to 3 in this embodiment, so that four items are read out for CD = 0 to 3. In STEP 3, when it is detected that the count data CD has the predetermined value Z or less, one of the address data A[0]...A[2N]...A[4N] corresponding to the sum of this count data CD and 1/T of the deviation time data DT1 stored in the memory area MR1 is supplied to the ROM 101. The corresponding one of the sampling data SD(-2N)...SD(0)...SD(2N) is read out from this ROM 101 and is supplied to the digital filter 1 through the I/O port 104 and selector 4. Thereafter, the count data CD in the memory area MR4 is increased by one count. Then the pitch count data PCD in the memory area MR3 is decreased by one count, and the apparatus stands by until the next sampling pulse is supplied.
On the other hand, in STEP 3, when it is detected that the count data CD has become larger than the predetermined value Z, "0" is supplied to the digital filter 1, and the pitch count data PCD is decreased by one count without increasing the count data CD. Then, the apparatus waits for the input of the next sampling pulse. When the next sampling pulse is supplied in this state, STEP 1 is again executed. When the pitch count data PCD is determined to be larger than 0 in STEP 1, STEP 3 is executed directly.
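The per-pulse procedure of STEP 1 to STEP 3 can be sketched in software as follows. This is a minimal illustration rather than the flow chart of Fig. 5 itself: the function name `readout_addresses` is hypothetical, the deviation times are held as integers in units of T/N (an assumption made to avoid rounding), the address is taken as N times the sum of CD and DT1/T as in the worked example of this description, and stopping when no pitch data remains is also an assumption.

```python
from collections import deque


def readout_addresses(pitch_periods, n_samples, n_sub=5, z=3):
    """Return one entry per sampling pulse: a ROM address, or None for a zero output.

    pitch_periods is a list of (P1, P2) pairs, meaning P1*T + P2*T/N each.
    """
    pending = deque(pitch_periods)
    dt1 = dt2 = 0  # MR1, MR2: deviation times in units of T/N (assumed representation)
    pcd = 0        # MR3: pitch count data
    cd = 0         # MR4: count data
    out = []
    for _ in range(n_samples):
        # STEP 1: start a new pitch cycle when the pitch count has run out.
        if pcd <= 0:
            if not pending:
                break                     # assumption: stop when no pitch data is left
            p1, p2 = pending.popleft()
            dt1 = dt2
            dt2 = dt1 + n_sub - p2        # DT1 + T - fractional part of the pitch period
            # STEP 2: choose the pitch count and wrap DT2 into [0, T).
            if dt2 < n_sub:
                pcd = p1 + 1
            else:
                pcd = p1
                dt2 -= n_sub
            cd = 0
        # STEP 3: read the table for the first Z + 1 pulses, then output zeros.
        if cd <= z:
            out.append(n_sub * cd + dt1)  # address index N * (CD + DT1/T)
            cd += 1
        else:
            out.append(None)
        pcd -= 1
    return out


addresses = readout_addresses([(25, 2), (25, 1), (25, 0)], 55)
print(addresses[0:4], addresses[26:30], addresses[51:55])
```

Run on the pitch periods 25.4T and 25.2T used in the example below (the third period, 25.0T, is an arbitrary assumed value), the sketch yields the address sequences 0, 5, 10, 15 at pulses 0 to 3, then 3, 8, 13, 18 at pulses 26 to 29, then 2, 7, 12, 17 at pulses 51 to 54.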
It is now assumed that the first and second pitch data representative of the pitch periods PX1 and PX2 of 25.4T (P1 = 25, P2 = 2) and 25.2T (P1 = 25, P2 = 1) are supplied to the CPU 100 through the I/O port 103 in this sequence.
The CPU 100 executes STEP 1 in response to the first sampling pulse (time t = 0) which is input immediately after the first pitch data is received. At this stage, since the pitch count data PCD stored in the memory area MR3 is 0, the deviation time data DT2 which is given by the following equation is stored in the memory area MR2:
$$DT2 = 0 + T - 0.4T = 0.6T$$
Since the deviation time data DT2 of 0.6T stored in the memory area MR2 is smaller than T, the pitch count data PCD which is given by the following equation is stored in the memory area MR3:
$$PCD = R_e\{25.4T/T\} + 1 = 26$$
Thereafter, the contents of the memory area MR4 are cleared, and it is detected in STEP 3 that the count data CD is smaller than the predetermined value Z (= 3). Due to this, the address data A[X1] which is given by the following equation is given to the ROM 101:
$$X1 = N \times \left(CD + \frac{DT1}{T}\right) = 5 \times (0 + 0) = 0$$
Due to this, the sampling data SD(-2N) is read out from the memory area M(-2N) of the ROM 101 and is supplied to the digital filter 1 through the I/O port 104 and selector 4. Next, the content of the memory area MR4 is changed from 0 to 1, and the content of the memory area MR3 is changed from 26 to 25.
The CPU 100 executes STEP 1 in response to the next sampling pulse. Since the pitch count data PCD is now 25, STEP 3 is executed. Since the count data CD is 1, the address data A[X2] is given by the following equation:
$$X2 = N \times \left(CD + \frac{DT1}{T}\right) = 5 \times (1 + 0) = 5$$
Thereafter, the count data CD of the memory area MR4 is increased by one count, while the pitch count data PCD of the memory area MR3 is decreased by one count.
After that, a similar operation is repeatedly executed, so that address data A[10] and A[15] are supplied to the ROM 101 and the corresponding sampling data are read out from the ROM 101. After the sampling data designated by the address data A[15] has been read out, the count data CD becomes 4 and the pitch count data PCD becomes 22. In the following cycle, it is detected in STEP 3 that the count data CD is larger than 3. Therefore, 0 is supplied to the digital filter 1, and the pitch count data PCD is decreased by one count each time a sampling pulse is supplied. This operation is executed until the pitch count data PCD becomes 0.
As described above, after the first pitch data is input, the address data A[0], A[5], A[10], and A[15] are sequentially given to the ROM 101 in response to the first four sampling pulses, and the sampling data SD(-10), SD(-5), SD(0), and SD(5) are sequentially read out from the ROM 101 and given to the digital filter 1. As shown in Fig. 6, the items of sampling data SD(-10), SD(-5) and SD(5) are 0, while the sampling data SD(0) corresponds to an impulse of a predetermined amplitude x₀.
The pitch count data PCD becomes 0 after the 26th sampling pulse is generated at time 25T. When it is detected in STEP 1 that the pitch count data PCD is 0 or less in response to the sampling pulse generated at time 26T, the CPU 100 transfers the contents of the memory area MR2 to the memory area MR1, and the contents DT2 of the memory area MR2 are updated in accordance with the following equation:
$$DT2 = 0.6T + T - 0.2T = 1.4T$$
Since the deviation time data DT2 is larger than T, the deviation time data of 0.4T is written in the memory area MR2. The pitch count data PCD which is given by the following equation is stored in the memory area MR3:
$$PCD = R_e\{25.2T/T\} = 25$$
Thereafter, the address data A[X26] is obtained in accordance with the following equation in a similar manner as mentioned above:
$$X26 = N \times \left(CD + \frac{DT1}{T}\right) = 5 \times (0 + 0.6) = 3$$
Thus, the sampling data SD(-2N + 3) is read out from the memory area M(-2N + 3) of the ROM 101.
In a similar manner, the address data A[8], A[13] and A[18] are sequentially generated whenever the sampling pulses are supplied at times 27T to 29T. In response to the address data A[3], A[8], A[13], and A[18], the four items of sampling data corresponding to the impulses at the sampling points -7, -2, 3, and 8 (where N = 5) in Fig. 4D are read out from the ROM 101. Namely, the sampling data corresponding to the four impulses that are generated at times 26T to 29T shown in Fig. 6 is supplied to the digital filter 1.
A similar operation is executed whenever new pitch data is supplied. For instance, in the next operation cycle, as shown at the right of Fig. 6, the four items of sampling data corresponding to the impulses at the sampling points -8, -3, 2, and 7 in Fig. 4D are read out from the ROM 101 at the sampling timings 51T to 54T.
The synthesized impulse of the four impulses corresponding to the four items of sampling data that are generated at the sampling timings 26T to 29T in Fig. 6 corresponds to the impulse having the amplitude x₀, as indicated by the broken line in the diagram. This impulse is generated at time 27.4T, that is, at a time 25.4T after the sampling timing 2T at which the impulse having the amplitude x₀ was generated in the first cycle. On the other hand, the synthesized impulse of the four impulses corresponding to the four items of sampling data which are generated at the sampling timings 51T to 54T corresponds to the impulse having the amplitude x₀ which is generated at time 52.6T, that is, at a time 25.2T after the time 27.4T.
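The impulse times quoted above can be checked from the first address of each readout burst. The helper below is a hypothetical illustration, not part of the disclosure: it assumes that address 2N corresponds to the centre of the stored waveform and that adjacent addresses are T/N apart, and it uses exact fractions so that 27.4T and 52.6T compare exactly.

```python
from fractions import Fraction


def impulse_time(first_pulse, first_address, n_sub=5):
    """Time, in units of T, of the impulse represented by one readout burst.

    first_pulse is the index of the sampling pulse that reads first_address.
    The centre of the table (address 2N) lies (2N - first_address) sub-steps later.
    """
    return first_pulse + Fraction(2 * n_sub - first_address, n_sub)


t1 = impulse_time(0, 0)    # first cycle: A[0] read at time 0
t2 = impulse_time(26, 3)   # second cycle: A[3] read at time 26T
t3 = impulse_time(51, 2)   # third cycle: A[2] read at time 51T
print(float(t1), float(t2), float(t3))
print(float(t2 - t1), float(t3 - t2))
```

Under these assumptions the three impulses fall at 2T, 27.4T and 52.6T, so successive impulses are separated by exactly 25.4T and 25.2T even though every readout is synchronized with the sampling pulses.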
As described above, in this embodiment, one or a predetermined number of impulses which are equivalent to the voiced speech signal that is generated synchronously with the pitch interval determined by the input pitch data are generated synchronously with the sampling pulses. Therefore, even if the pitch period is set to be short, or even if the pitch period rapidly changes, synthetic speech of relatively excellent quality can be generated.
Fig. 7 shows a block diagram of a periodic sound generator circuit of a speech synthesizing apparatus according to another embodiment of the present invention. This periodic sound generator circuit comprises registers 110 to 112 for respectively storing the pitch data PX and deviation time data DT1 and DT2; a divider 113 for dividing the pitch data PX by the sampling period T; a register 114 for storing the output data of the divider 113; an integer detector 115 for detecting the integer part of the output data of the register 114; a DT2 calculation circuit 116 for calculating new deviation time data DT2 on the basis of the output data of the registers 110 and 111; a divider 117 for dividing the output data of the calculation circuit 116 by T; a fraction detector 118 for detecting the fraction part of the output data of the divider 117; and a comparator 119 for comparing the output data of the divider 117 with a constant 1. The comparator 119 generates output data "1" when it is detected that the output data of the divider 117 is equal to or larger than 1. On the other hand, the comparator 119 generates output data "0" when this output data is detected to be smaller than 1. A subtracter 120 subtracts the output data of the comparator 119 from the output data generated from the integer detector 115 and sets the result into a PCD down-counter 121. This PCD down-counter 121 executes the down-counting operation in response to the sampling pulse. When the contents of the down-counter 121 become 0, it sets the CD up-counter 122 to 0. The CD up-counter 122 executes the up-counting operation in response to the sampling pulse. A comparator 123 checks the output data of the up-counter 122 to see if it is three or less. When it is detected that the output data is three or less, the comparator 123 generates an output pulse.
An address calculation circuit 124 calculates the address data on the basis of the output data of the register 111 and of the up-counter 122 in response to the output pulse from the comparator 123, supplies this calculated address data to a memory 125, and reads out the corresponding sampling data from the memory 125. The memory 125 stores the sampling data SD(-2N)... SD(0)...SD(2N) as in the ROM 101 shown in Fig. 3.
When the pitch data representative of the pitch period PX (= P1 × T + P2 × T/N) is input during the initial state, the data (P1 + P2/N) which is obtained by dividing the pitch data by T in the divider 113 is stored in the register 114. Therefore, output data of P1 is generated from the integer detector 115. On the other hand, the DT2 calculation circuit 116 calculates the new deviation time data DT2 in accordance with equation (5). The output data of the calculation circuit 116 is divided by T in the divider 117. Thereafter, it is supplied to the fraction detector 118 and comparator 119. After the output data of the fraction detector 118 is multiplied by T in the multiplier 126, it is stored in the DT2 register 112. In addition, since the output data of the divider 117 is less than 1 in the first cycle, output data of 0 is generated from the comparator 119. Thus, the output data P1 of the integer detector 115 is directly set into the PCD down-counter 121 in the first operation cycle. Although not shown, where the contents of the PCD down-counter 121 are one or more, the down-counter 121 supplies inhibition signals to the registers 110 to 112, calculation circuit 116 and subtracter 120, thereby inhibiting the operations thereof.
Since the contents of the DT1 register 111 and counter 122 are both 0, the address data A[0] is generated in the address calculation circuit 124 in accordance with the following equation:
$$N \times \left(CD + \frac{DT1}{T}\right) = 5 \times (0 + 0) = 0$$
The corresponding sampling data is read out from the memory 125 in accordance with this address data A[0]. In a similar manner, whenever the sampling pulse is input, the address data A[5], A[10] and A[15] are generated from the address calculation circuit 124, while the corresponding sampling data is read out from the memory 125. In response to the next sampling pulse, the contents of the CD up-counter 122 become 4, so that a "0" output signal is generated from the comparator 123. Thus, the address calculation circuit 124 does not execute the calculation.
On the other hand, the PCD down-counter 121 executes the down-counting operation in response to the sampling pulse until the contents thereof become 0. When the contents of the down-counter 121 become 0, the CD up-counter 122 is set to 0, and at the same time the registers 110 to 112, calculation circuit 116 and subtracter 120 are set into the operational states. Therefore, the pitch data is stored in the PX register 110 in response to the next sampling pulse, and at the same time the deviation time data from the DT2 register 112 is stored in the DT1 register 111.
In a similar manner, the circuit shown in Fig. 7 generates the voiced sound signal corresponding to the input pitch data in response to the sampling pulse.
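The values loaded at the start of each operation cycle in the circuit of Fig. 7 can be sketched as follows. This is an illustrative model only: the function name `cycle_load_values` is hypothetical, the deviation times are represented as integers in units of T/N (an assumption), and the update of DT2 follows the verbal description of equation (5), namely DT1 plus one sampling period minus the fractional part of the pitch period.

```python
def cycle_load_values(p1, p2, dt1_sub, n_sub=5):
    """Return (value set into the PCD down-counter, new DT2 in units of T/N).

    p1, p2 describe the pitch period P1*T + P2*T/N; dt1_sub is DT1 in units of T/N.
    """
    dt2_sub = dt1_sub + n_sub - p2               # DT2 calculation circuit 116
    carry = 1 if dt2_sub >= n_sub else 0         # comparator 119: is DT2/T >= 1?
    fraction = dt2_sub % n_sub                   # fraction detector 118 (times T in multiplier 126)
    return p1 - carry, fraction                  # subtracter 120, DT2 register 112


first = cycle_load_values(25, 2, 0)              # pitch period 25.4T, DT1 = 0
second = cycle_load_values(25, 1, first[1])      # pitch period 25.2T, DT1 = 0.6T
print(first, second)
```

In this model the first cycle loads P1 = 25 into the down-counter with DT2 = 0.6T (3 sub-steps), and the second cycle loads 24 with DT2 = 0.4T (2 sub-steps). The counts are one lower than in the embodiment of Fig. 3, which is consistent with the new pitch data being stored in response to the sampling pulse after the down-counter reaches 0.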
Although the present invention has been described with respect to the embodiments, the invention is not limited to these embodiments. For example, in the flow chart shown in Fig. 5, although the determination regarding the generation of the address data is made by checking the count data CD to see if it is three or less in STEP 3, it is also possible to check whether or not the following condition is satisfied in STEP 3:
Further, it is possible to use, as the time window ω(t), a time window which takes the value 1 in the range from -T to T. Also, other memory areas M(-N)...M(0)...M(N) for storing the sampling data SD(-N)...SD(0)...SD(N) may be provided in the ROM 101.

Claims

1. A speech synthesizing apparatus comprising:

a control circuit (5) for generating synthesis parameters, pitch data and clock pulses corresponding to speech to be synthesized; a periodic sound signal generating means (100 to 102; 110 to 125) for generating digital data representative of a voiced sound signal synchronously with said clock pulse; a digital filter circuit (1) for filter-processing the digital data supplied from said periodic sound signal generating means (100 to 102; 110 to 125) synchronously with said clock pulse in accordance with said synthesis parameters from said control circuit (5); and speech generating means (6 to 8) for generating the speech corresponding to the digital data from said digital filter circuit (1),

characterized in that said periodic sound signal generating means comprises memory means (101; 125) for storing a predetermined number of sampling data corresponding to a predetermined number of sampling values within a predetermined time range of a continuous wave signal which is obtained by developing the voiced sound signal based on an interpolation function; and readout control means (100, 102; 110 to 124) for sequentially reading out the sampling data of the continuous wave signal within said predetermined time range that are synchronized with the pitch period represented by said input pitch data from said memory means (101; 125) in response to said clock pulse and for supplying said sampling data to said digital filter circuit (1).

2. A speech synthesizing apparatus according to claim 1, characterized in that said readout control means comprises: a first memory (MR1) for storing deviation time data representative of the time between the start point of the pitch interval and a clock pulse which is generated simultaneously with or immediately after said start point of said pitch interval; and a data processing circuit (100, MR2) for sequentially reading out the sampling data at the sampling points corresponding to the sum of said deviation time data and the integer times the period of said clock pulse from said memory means (101) synchronously with said clock pulse.

3. A speech synthesizing apparatus according to claim 2, characterized in that said data processing circuit comprises: a second memory (MR2); and a data processing unit (100) for sequentially counting up the contents of said second memory (MR2) in response to the clock pulse which is generated simultaneously with or immediately after the start point of said pitch interval and for sequentially reading out the sampling data from said memory means (101) in response to the contents of said first and second memories (MR1 and MR2).

4. A speech synthesizing apparatus according to claim 1, 2 or 3, characterized in that said readout control means (100, 102; 110 to 124) produces a selection signal according to a speech to be synthesized, and characterized by further comprising a random noise generator (2) and a sound selector (4) for permitting one of output signals from said memory means (101) and random noise generator (2) to be selectively transferred to said digital filter circuit (1) in accordance with the selection signal.