CA1162650A - Integrated circuit phoneme-based speech synthesizer - Google Patents

Integrated circuit phoneme-based speech synthesizer

Info

Publication number
CA1162650A
Authority
CA
Canada
Prior art keywords
signal
output
phoneme
parameter
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
CA000374573A
Other languages
French (fr)
Inventor
Carl L. Ostrowski
Bertram J. White
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Federal Screw Works
Original Assignee
Federal Screw Works
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Federal Screw Works
Priority to CA000432956A (patent CA1171179A)
Application granted
Publication of CA1162650A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output

Abstract

INTEGRATED CIRCUIT PHONEME-BASED
SPEECH SYNTHESIZER
ABSTRACT OF THE DISCLOSURE
A phoneme-based speech synthesizer that is particularly adapted for implementation on a single integrated circuit chip. The vocal tract is comprised of a fixed resonant filter and a plurality of tunable resonant filters whose resonant frequencies are controlled in accordance with the values of certain control parameters. The vocal tract is implemented utilizing a capacitive switching technique which eliminates the need for large valued components to achieve the relatively low frequencies of human speech.
In addition, a novel digital transition circuit is included which gradually transitions the values of the vocal tract control parameters as they change from phoneme to phoneme. A unique glottal source generator is also disclosed which is adapted to digitally generate a glottal pulse signal in a manner which readily permits the glottal pulse to be spectrally shaped in any desired configuration.

Description

The present invention relates to speech synthesis and in particular to a phoneme-based speech synthesizer that is particularly adapted for implementation in a single encapsulated integrated circuit.
Known phoneme-based speech synthesizers have principally contained vocal tracts comprised of a plurality of resonant filters. It has heretofore generally been considered impractical to produce vocal tracts of this type in integrated circuit form for several significant reasons. First of all, tunable resonant filters of the type commonly used in vocal tracts require resistors and capacitors having relatively large values to produce resonant frequencies in the relatively low frequency range of the human voice. Large value components substantially increase the size of an integrated circuit. Secondly, vocal tract resonant filters are high precision filters which are difficult to produce in integrated circuit form within the required tolerance limits.

The present invention is used in a phoneme-based speech synthesizer including parameter storage means for producing, for each phoneme, on a first data bus a plurality of multiplexed digital control parameters defining target values for the control parameters, and a vocal tract model that is controlled in accordance with the current values of the control parameters. The invention relates to the improvement comprising digital transition means for sequentially transitioning the control parameters so that the values of the control parameters are gradually changed from the current values toward the target values, including: output means for providing on a second data bus the multiplexed current values of the control parameters; demultiplexer means for demultiplexing the signal on the second data bus and producing a corresponding plurality of parallel digital output signals comprising the current values of the plurality of control parameters; and arithmetic circuit means for calculating a factor related to a predetermined percentage of the difference between the target value signal on the first data bus and the current value signal on the second data bus, adding the factor to the current value signal at a predetermined rate, and providing the resulting value signal to the output means.
The present invention utilizes a novel capacitive switching technique to implement the vocal tract, as well as additional parameter controlled functions, which eliminates the above noted problems and thus makes the speech synthesizer according to the present invention particularly adapted for implementation as a single integrated circuit silicon "chip".

The capacitive switching technique employed not only eliminates the requirement of large valued components in the vocal tract, but also eliminates the requirement that the values, and hence the size, of the tuning components in the vocal tract be accurately controlled. Rather, as will subsequently be seen, with the capacitive switching technique of the present invention, it is only important that the ratio of the tuning component values be accurately controlled, thus making it substantially easier to maintain the high accuracy levels required during production.
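As a rough illustration of why only the capacitor ratio matters, the following Python sketch models a switched capacitor as an equivalent resistance R = 1/(f·C) and forms a time constant with a second, fixed capacitor. The function names, the 20 kHz switching rate and the picofarad values are illustrative assumptions, not values taken from the circuit diagrams.

```python
import math

def equivalent_resistance(switch_hz: float, switched_cap_f: float) -> float:
    """Resistance simulated by a capacitor toggled at switch_hz: R = 1 / (f * C)."""
    return 1.0 / (switch_hz * switched_cap_f)

def time_constant(switch_hz: float, switched_cap_f: float, fixed_cap_f: float) -> float:
    """RC time constant when the simulated resistance works against fixed_cap_f.

    Algebraically this is (fixed_cap_f / switched_cap_f) / switch_hz, so it
    depends only on the capacitor ratio and the clock frequency.
    """
    return equivalent_resistance(switch_hz, switched_cap_f) * fixed_cap_f

# Assumed example values: 1 pF switched at 20 kHz against a 10 pF capacitor.
nominal = time_constant(20e3, 1e-12, 10e-12)
# Both capacitors come out 20% large (same ratio), as process variation might cause.
shifted = time_constant(20e3, 1.2e-12, 12e-12)

print(f"simulated resistance: {equivalent_resistance(20e3, 1e-12):.3g} ohms")
print(f"time constants: {nominal:.6g} s and {shifted:.6g} s")
print("unchanged by common scaling:", math.isclose(nominal, shifted))
```

In this model a small on-chip capacitor stands in for a resistance of tens of megohms, and an absolute error shared by both capacitors cancels out of the time constant.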
In addition, the present speech synthesizer includes a unique digital transition circuit which gradually transitions the values of certain control signal parameters between the different steady-state values assigned for different phonemes. In this manner, adjacent phonetic sounds are properly integrated to produce natural sounding speech.
The speech synthesizer of the present invention also includes a novel glottal source circuit which digitally generates the glottal pulse signal in a manner which readily permits the waveform of the glottal pulse signal to be spectrally shaped in any manner desired.
In general, the present speech synthesizer system as disclosed herein comprises a single encapsulated silicon chip which phonetically synthesizes continuous speech of unlimited vocabulary from low data input rates. The system includes a parameter storage ROM containing parameter values defining 64 different phonemes which are accessed by a 6-bit command word.
Two additional input bits are provided for varying the pitch or inflection of voiced phonemes. The control parameters are generated by the storage ROM in a multiplexed fashion on an 8-bit parallel output bus.
The control parameters which are used to control the vocal tract are initially provided to a novel digital transition circuit which serves to gradually transition the variations in the steady-state values of the parameters which occur from phoneme to phoneme. As will subsequently be seen, the digital transition circuit performs this function in a unique manner by continuously adding one eighth of the difference between the target parameter value and the current parameter value to the current parameter value, and using the result as the new current parameter value. In the preferred embodiment, the transition circuit is clocked at a rate which results in a parameter attaining approximately 70% of its target value within a span of 33 milliseconds.
The transitioned control signal parameters from the digital transition circuit are provided to the vocal tract to control the resonant frequencies of the F1, F2 and F3 resonant filters, and to control the injection of vocal and fricative excitation energy into the vocal tract. In addition, the "Q" or bandwidth of the F2 resonant filter is separately controllable for producing nasal phonemes as is conventional. The various parameter controlled functions are implemented by utilizing the 4-bit parallel digital parameter signals to selectively control the capacitance ratio of capacitor networks in the controlled circuits. The capacitor networks are then switched on and off at a predetermined frequency so that the controlled capacitor networks effectively simulate a controlled variable resistance element.
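The one-eighth update rule of the digital transition circuit described above can be sketched in a few lines of Python. This is a minimal numerical sketch, assuming floating-point arithmetic and a parameter normalized so that it starts at 0 and targets 1; the function names are hypothetical and the fixed-point details of the actual circuit are not modeled.

```python
def transition_step(current: float, target: float) -> float:
    """Add one eighth of the remaining difference to the current value."""
    return current + (target - current) / 8.0

def fraction_covered(updates: int) -> float:
    """Fraction of a 0-to-1 step covered after the given number of updates."""
    current, target = 0.0, 1.0
    for _ in range(updates):
        current = transition_step(current, target)
    return current

for updates in (1, 4, 9, 16):
    print(updates, round(fraction_covered(updates), 4))
```

Under these assumptions about nine updates cover roughly 70% of the step, which would match the 33 millisecond figure if the circuit performs about nine updates per parameter in that span; the actual update rate is not stated in this passage.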
The glottal source generator circuit produces a glottal pulse signal having a fundamental frequency that varies in accordance with the setting of the two inflection control bits. In addition, a degree of automatic inflection control is provided by also varying the fundamental frequency of the glottal signal inversely with respect to movement in the resonant frequency of the F1 resonant filter in the vocal tract. The spectral shape of the glottal pulse is controlled by selectively presetting the analog d.c. signal levels applied to the parallel inputs of a multiplexer. The selector inputs of the multiplexer are connected to the output of a counter which is clocked at a predetermined rate. The waveform of the glottal signal produced at the serial output of the multiplexer therefore comprises a segmented approximation of an analog glottal pulse signal with the levels of the various segments determined by the preset d.c. levels.
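The counter-and-multiplexer arrangement described above can be modeled as a lookup into a table of preset levels. The sketch below is illustrative only: the eight level values, the segment count and the clock rate are assumptions chosen for the example, and the function names are hypothetical.

```python
# Assumed preset levels, one per multiplexer input; the trailing zeros stand
# in for a flat portion of the pulse.
PRESET_LEVELS = [0.0, 0.4, 0.8, 1.0, 0.6, 0.2, 0.0, 0.0]

def glottal_waveform(levels, clock_ticks):
    """Return the stepped output: the counter value selects one level per tick."""
    return [levels[tick % len(levels)] for tick in range(clock_ticks)]

def fundamental_hz(counter_clock_hz, segments):
    """One glottal period spans every segment once, so f0 = clock / segments."""
    return counter_clock_hz / segments

two_periods = glottal_waveform(PRESET_LEVELS, 2 * len(PRESET_LEVELS))
print(two_periods)
print(fundamental_hz(800.0, len(PRESET_LEVELS)))
```

In this model, editing the table reshapes the pulse without touching the counter, while changing the counter clock shifts the pitch.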

Additional features and advantages of the present invention will become apparent from a reading of the detailed description of the preferred embodiment which makes reference to the following set of drawings in which:
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is an overall block diagram of the speech synthesizer system of the present invention;
Figures 2-8 comprise a circuit diagram of the speech synthesizer system of Figure 1;
Figure 9 comprises a resistor equivalent of the portion of the vocal tract circuit illustrated in Figure 4;
Figure 10 is a sample waveform of a glottal pulse signal;
Figure 11 is a timing diagram illustrating the timing relationship between various clock signals and also illustrating the multiplexing arrangement in which the control parameters are generated by the parameter storage ROM;
Figure 12 is a diagrammatical view of the parameter storage ROM indicating the manner in which the control parameter values are stored in the ROM;
Figure 13 is a circuit model illustrating the operating principles of the capacitive switching technique employed in the present invention; and
Figure 14 is a timing diagram illustrating the timing relationship between the φ1 and φ2 non-overlapping clock signals utilized in the capacitive switching circuitry.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Referring to Figure 1, an overall block diagram of a phoneme-based speech synthesizer 20 according to the present invention is shown. The illustrated system is adapted to be driven by an 8-bit digital input command word. Six of the input bits 22 from the digital command word are used for phoneme selection and the remaining two bits 56 for varying the inflection level or pitch of the audio output. The six phoneme select bits 22 are latched by a strobe signal on strobe line 24 into a six bit latch 26 whose six parallel outputs are provided to the six high order address inputs of a parameter storage ROM 40. The strobe signal on line 24 also resets an output latch 28 which forces the acknowledge/request (A/R) output line 30 LO to acknowledge receipt of the new data. The LO signal on A/R line 30 is also provided to the phoneme timing, delay and pause timing network 44 to initialize the phoneme timing counter, as will subsequently be described in greater detail.
The parameter storage ROM 40 contains data defining 64 different phonemes which are accessed by the six phoneme select bits 22. For each of the 64 different phonemes, the parameter storage ROM 40 contains 12 control signal parameters which electronically define the phoneme. Each control signal parameter stored in ROM 40 preferably comprises four bits of resolution, thus providing sixteen different values for each parameter, except for the phoneme timing control parameter which comprises seven bits of resolution and therefore has 128 possible values. The parameter storage ROM 40 is adapted to provide the appropriate set of control signal parameter values defining the particular phoneme identified by the phoneme select bits 22 on its eight data output lines in a multiplexed fashion, such that two 4-bit control parameters are present on the eight data output lines at any given point in time, except again for the phoneme timing control parameter which uses seven of the eight data outputs.
The parameter storage ROM 40 is additionally accessed by three parameter select bits 48 which are provided from the output of the timing circuit 38 to the three low order address inputs of storage ROM 40.
The timing circuit 38 comprises an internal master clock circuit 32 which is adapted to produce a master clock signal having a frequency that is determined in accordance with the values of an external RC network connected to input terminal MCRC. Alternatively, the MCRC input terminal may be grounded and an external clock signal provided directly to the additional clock input terminal MCX. The master clock signal from the output of the master clock circuit 32 is provided to a ripple counter 34 which is adapted to produce eight clock signal outputs at varying predetermined frequencies. Three of the clock signals comprise the parameter selector bits which are provided on line 48 to the low order address inputs of the parameter storage ROM 40. The outputs from the ripple counter 34 are then provided to a latch and two-phase non-overlapping clock circuit 36 which is adapted to develop the 12 timing signals which are utilized in all sections of the system.
To summarize, therefore, the six phoneme select bits 22 identify the particular phoneme desired and the three parameter select bits 48 determine which of the 12 control signal parameters defining the identified phoneme are present on the data outputs of the parameter storage ROM 40 at any given point in time during the phoneme period. Certain of the control signal parameters which are to be provided to the vocal tract require transitioning to smooth the abrupt variations which occur in the values of the control parameters from one phoneme to the next. Accordingly, the data output lines from the parameter storage ROM carrying these control signal parameters are provided to a digital transition circuit 50 which digitally implements the desired transition function. The data output lines from the parameter storage ROM carrying the remaining control signal parameters which relate to various timing functions and do not require transitioning are provided directly to the phoneme timing, vocal delay, closure delay and pause timing circuit 44. As will subsequently be described in greater detail, the phoneme timing circuit 44 includes a counter which is clocked at a frequency determined by the phoneme timing control parameter to control the duration of each phoneme. More particularly, the count rate of the counter is determined by the frequency of the clock signal on line 43 from the output of the sub-phoneme clock timing circuit 42, which essentially comprises an oscillator whose frequency is controlled by the phoneme timing control parameter. When the counter attains a predetermined count, an output signal is produced on line 45 which sets output latch 28, thereby forcing the A/R output line 30 HI to indicate that new phoneme data is required.
Timing circuit 44 also includes a vocal delay and closure delay network which essentially comprises a magnitude comparator which is adapted to compare the count output of the phoneme timing counter with the values of the vocal delay and closure delay control signal parameters when presented at the data outputs of the parameter storage ROM 40. When an equivalency condition is detected by the magnitude comparator, the delay period is terminated. The purpose of the vocal delay control signal parameter is to delay the transmission of the vocal amplitude control signal, and hence delay or retain the injection of vocal excitation energy into the vocal tract, for a predetermined period of time less than the duration of a single phoneme time interval, during certain fricative-to-vowel phonetic transitions wherein the amplitude of the fricative constituent is rapidly decaying at the same time the amplitude of the vocal constituent is rapidly increasing. The closure delay control signal serves a similar function and is adapted to delay the transmission of the fricative amplitude and closure control signals. Implementation of the delay function is performed whenever a vocal delay or closure delay control signal parameter is present, by effectively "freezing" the digital transition circuitry 50 during the period of transmission for the vocal amplitude and fricative amplitude control signal parameters, respectively, for a period specified by the value of the vocal delay and closure delay control parameters. As will also be described subsequently in greater detail, this function is performed by a freeze transition circuit 46 which effectively clamps the read/write (R/W) line 47 to a storage RAM in the digital transition circuit 50.
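The strobe and acknowledge/request handshake described above can be modeled as a small state machine. The sketch below is a toy model under stated assumptions: the terminal count of 16, the initial HI state of the A/R line and all class and method names are hypothetical, and the vocal delay, closure delay and freeze logic are not modeled.

```python
class PhonemeHandshake:
    """Toy model of the strobe and acknowledge/request (A/R) handshake."""

    def __init__(self, terminal_count: int) -> None:
        self.terminal_count = terminal_count  # assumed; not given in the text
        self.count = 0
        self.ar_high = True   # HI: new phoneme data is requested
        self.phoneme = None

    def strobe(self, phoneme: int) -> None:
        """Latch a 6-bit phoneme code, drive A/R LO and reset the counter."""
        self.phoneme = phoneme & 0x3F
        self.ar_high = False
        self.count = 0

    def tick(self) -> None:
        """One sub-phoneme clock: count up and raise A/R at the terminal count."""
        if self.ar_high:
            return
        self.count += 1
        if self.count >= self.terminal_count:
            self.ar_high = True

link = PhonemeHandshake(terminal_count=16)
link.strobe(0x2A)
for _ in range(16):
    link.tick()
print(link.ar_high, link.count)
```

Because the terminal count is fixed in this model, the phoneme duration is set by how fast `tick` is called, which mirrors the way the phoneme timing control parameter sets the sub-phoneme clock frequency rather than the count.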
The transitioned control signal parameters from the output of the digital transition circuit 50 are demultiplexed by a latching circuit 52 in accordance with appropriate timing signals from the timing circuit 38 which are provided through a synchronize clock circuit 54 for controlling the clocking of the latches 52. In addition, the glottal source circuit 58 produces a glottal sync signal at the beginning of each glottal pulse period which is also provided to the synchronize clock circuit 54 on line 55 to latch the F1, F2, F2Q and F3 control signal parameters from the transition circuit 50.
The vocal excitation signal is digitally generated by a glottal source circuit 58 in accordance with the two inflection select bits 56 from the eight bit digital input command word, which control the fundamental frequency of the glottal signal. In addition, it will be noted that a degree of automatic inflection control is provided by also varying the fundamental frequency of the vocal excitation signal in accordance with the inverse of the F1 control signal parameter (F̅1̅). The resulting vocal excitation signal is then provided to a vocal amplitude circuit 62 which modulates the amplitude of the vocal excitation signal in accordance with the vocal amplitude control signal parameter before injection of the vocal excitation signal into the vocal tract 60.
The fricative excitation signal is supplied by a white noise generator 64. During voiced fricatives (e.g., z, v) when fricative and vocal excitation energy are both present, the white noise generator 64 is gated on only during the latter or "rest" portion of the glottal pulse signal under the control of the FGATE signal on line 65 from the glottal source circuit 58. The resulting white noise signal from the output of the generator circuit 64 is provided to a fricative amplitude and high-pass noise shaping network 66 which is adapted to filter the fricative excitation signal and modulate its amplitude in accordance with the fricative amplitude control signal parameter before injection into the vocal tract 60 under the control of the fricative control parameter (FC) and the inverse of the fricative control parameter (F̅C̅).
The vocal tract 60 in the preferred embodiment comprises four serially connected resonant filters designated F1, F2, F3 and F5. The resonant frequencies of the F1-F3 resonant filters are controllable in accordance with the F1, F2 and F3 control signal parameters. The bandwidth or "Q" of the second order resonant filter (F2) in the vocal tract 60 is also controlled by the F2Q control signal parameter. Finally, the output from the vocal tract 60 is provided to the closure circuit 68 which is adapted to abruptly modulate the amplitude of the audio output signal in accordance with the closure control signal (CL).
Referring now to Figures 2-8, and in particular to Figure 2, a circuit diagram of the speech synthesizer 20 according to the present invention is shown. As previously noted in connection with the description of the block diagram in Figure 1, the present speech synthesizer 20 is adapted to be driven by an 8-bit digital input command word. The six phoneme select bits 22 (P0-P5) are provided in parallel to the data (D) inputs of a 6-bit latch 26. The data present on phoneme select lines (P0-P5) is clocked into latch 26 by a strobe signal produced on line 24 which also serves to reset output latch 28, thereby forcing the acknowledge/request line 30 LO to acknowledge receipt of the new phoneme data. The LO output signal on A/R line 30 also serves to reset the phoneme timing counter 84, the purpose of which will be subsequently described. The Q outputs from latch 26 are provided to the high order address inputs (A3-A8) of the parameter storage ROM 40. The remaining three low order address inputs (A0-A2) of ROM 40 are connected to the parameter select bits 48 from the output of the timing circuit 38. As previously noted, the control signal parameters defining the phoneme identified by phoneme select bits 22 are produced at the data outputs (D0-D7) of parameter storage ROM 40 in a multiplexed fashion in accordance with the three parameter select bits 48.
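The addressing scheme described above can be sketched as follows. The sketch assumes that P0 is the least significant phoneme bit and A0 the least significant address bit, which the text does not state explicitly; the function names and the sample data byte are hypothetical.

```python
def rom_address(phoneme: int, parameter_select: int) -> int:
    """Form the 9-bit address: phoneme on A3-A8, parameter select on A0-A2."""
    if not 0 <= phoneme < 64:
        raise ValueError("phoneme code must fit in six bits")
    if not 0 <= parameter_select < 8:
        raise ValueError("parameter select must fit in three bits")
    return (phoneme << 3) | parameter_select

def split_data_byte(data: int) -> tuple:
    """Split an 8-bit ROM output into its (D0-D3, D4-D7) 4-bit fields."""
    return data & 0x0F, (data >> 4) & 0x0F

print(rom_address(phoneme=5, parameter_select=0b011))
print(split_data_byte(0xA5))
```

With 64 phonemes and eight parameter select states, the address space in this model is 512 bytes.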
Although known in the art, the functions of the various control signal parameters generated by parameter storage ROM 40 will be briefly summarized to provide a better understanding of the operation of the present system.
The F1, F2, and F3 control parameters determine the locations of the resonant frequency poles in the first three variable resonant filters in the vocal tract.
The fricative control parameter (FC) replaces two control parameters normally provided in synthesizers of this type; i.e., the fricative frequency and fricative low pass control parameters. Specifically, it has been determined that, in general, when a fricative phoneme requires low frequency fricative energy in the range of the F2 formant, it does not also require a substantial amount of high frequency fricative energy in the range of the F5 formant, and vice versa. Thus, the present invention utilizes a single fricative control (FC) parameter, and the inverse of the FC control parameter (F̅C̅), to control the parallel injection of both low and high frequency fricative energy into the vocal tract.
The F2Q control parameter varies the "Q" or bandwidth of the second order resonant filter (F2) in the vocal tract and is used primarily in connection with the production of the nasal phonemes "n", "m" and "ng". Nasal phonemes typically exhibit a higher amount of energy at the first formant (F1) and substantially lower and broader energy content at the higher formants. Thus, during the presence of nasal phonemes, the F2Q control parameter is generated to reduce the Q or increase the bandwidth of the F2 resonant filter which, due to the cascaded arrangement of the resonant filters in the vocal tract, prevents significant amounts of energy from reaching the higher formants.
The vocal amplitude control parameter (VA) is generated whenever a phoneme having a voiced component is present. The vocal amplitude control parameter controls the intensity of the voiced component in the audio output. The fricative amplitude control parameter (FA) is generated whenever a phoneme having an unvoiced component is present and is used to control the intensity of the unvoiced component in the audio output.
The closure parameter (CL) is used to simulate the phoneme interaction which occurs, for example, during the production of the phoneme "b" followed by the phoneme "e". In particular, the closure control parameter, when provided to the closure circuit 68, is adapted to cause an abrupt amplitude modulation in the audio output that simulates the build-up and sudden release of energy that occurs during the pronunciation of such phoneme combinations.
The vocal delay control parameter (VD) is utilized primarily during certain fricative-to-vowel phonetic transitions wherein the amplitude of the fricative constituent would otherwise be rapidly decaying at the same time the amplitude of the vocal constituent is rapidly increasing. The vocal delay control parameter thus serves to delay the transmission of the vocal amplitude (VA) control signal under such circumstances. The closure delay control parameter (CLD) is similarly utilized primarily during certain vowel-to-fricative phonetic transitions wherein it is desirable to delay the transmission of the closure (CL) and fricative amplitude (FA) control parameters in the same manner as that discussed in connection with the vocal delay control parameter.
The pause control parameter (PAC) is generated whenever a pause phoneme is selected to insert a period of silence into the speech pattern. However, because the articulation pattern of the vocal tract is "frozen" during a pause phoneme until all of the excitation energy in the vocal tract is completely dissipated, an additional important function is also provided by the pause phoneme. In particular, certain words whose endings tend to "trail off", such as those ending in nasal phonemes, sound as if an additional phoneme has been included when the articulation pattern of the last phoneme is abruptly changed before the excitation energy in the vocal tract has completely dissipated. For example, the word "sun" may sound more like "suna". This is due to the fact that the residual excitation energy in the vocal tract is vocalized as something other than an "n" after the duration of the "n" phoneme period. Insertion of a pause phoneme following the "n" phoneme in this example will therefore improve speech recognition by freezing the articulation pattern of the "n" phoneme until all fricative and vocal excitation energy is dissipated.
The final control parameter is the phoneme timing control parameter which is generated for each phoneme and is used to establish the period of production for the phoneme.
The parameter storage ROM 40 in the preferred embodiment is configured as shown in Figure 12. In particular, the ROM 40 has stored therein twelve control signal parameter values for each of the 64 phonemes identifiable by phoneme select bits 22. Each control signal parameter has four bits of resolution except for the phoneme timing control signal parameter which has seven bits of resolution. Thus for example, as indicated in Figure 12, when the parameter select bits 48 are equal to 011, the closure control signal parameter (CL) will be present at the D0-D3 data outputs of ROM 40 and the F3 control signal parameter will be present at the D4-D7 data outputs of ROM 40. Similarly, when parameter bits 48 are equal to 100, then the closure delay (CLD) and the F2Q control signal parameters will be present at the D0-D3 and D4-D7 data outputs, respectively, of parameter ROM 40. In the preferred embodiment, the frequencies of the parameter select bits 48 are 40KHz, 20KHz, and 10KHz. Thus, since phoneme durations vary from approximately 50-250 milliseconds, it will be appreciated that each control signal parameter will be generated on the data output lines D0-D7 of parameter storage ROM 40 a minimum of approximately 500 times during the period of each phoneme. A timing diagram illustrating the relationship between the parameter select bits 48 at the A0-A2 address inputs of ROM 40 and the data outputs D0-D7 of ROM 40 is shown in Figure 11.
The parameter select signals on lines 48 are generated at the Q4-Q6 outputs of a 10-bit ripple counter 34 which is clocked by the master clock signal, which in the preferred embodiment is set at 1.28MHz.
As will subsequently be appreciated by those skilled in the art, variation of the master clock frequency will vary the overall pitch and frequency composition of the audio output. In addition, by varying the master clock frequency above or below that required for normal speech ranges, the present system can be utilized to produce highly textured sound effects.
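The clock figures quoted above can be cross-checked with a short calculation. The sketch assumes that each ripple counter stage halves the frequency of the previous one, with Q0 at half the master clock; that assumption reproduces the 40KHz, 20KHz and 10KHz parameter select rates given earlier for the Q4-Q6 outputs. The function names are hypothetical.

```python
MASTER_CLOCK_HZ = 1_280_000  # master clock of the preferred embodiment

def tap_hz(stage: int, master_hz: float = MASTER_CLOCK_HZ) -> float:
    """Frequency at ripple counter output Q<stage>, assuming Q0 = master / 2."""
    return master_hz / 2 ** (stage + 1)

def cycles_per_phoneme(duration_ms: float) -> float:
    """Complete passes through all eight parameter select states in one phoneme.

    The slowest select bit (Q6) completes one period per pass.
    """
    return duration_ms * tap_hz(6) / 1000.0

print([tap_hz(q) for q in (4, 5, 6)])
print(cycles_per_phoneme(50), cycles_per_phoneme(250))
```

Under this assumption the shortest phoneme of about 50 milliseconds sees about 500 passes through the parameter set, in line with the minimum quoted in the description.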
The Q0-Q3 outputs of ripple counter 34 are provided to the data (D) inputs of a first 4-bit latch 70 and the Q4-Q6 and Q9 outputs of counter 34 are provided to the data (D) inputs of a second 4-bit latch 72. The three high order Q outputs from latch 70 comprise the BIT 0, BIT 1 and BIT 2 clock signals which are provided to the digital transition circuit 50 to be subsequently described. The remaining Q output signal from latch 70 is provided on line 73 to an R-S flip-flop 74 so as to produce at its SET output terminal a clock signal (~1) at twice the frequency of the BIT 0 clock signal and opposite in phase relative thereto. The ~1 clock signal is also utilized in the digital transition circuit 50. The three low order Q outputs from latch 72 comprise the A0-A2 clock signals which are utilized in various sections of the system to monitor which control parameter is present and to demultiplex the transitional control parameters. The A0-A2 clock signals have the same frequencies as the parameter select signals provided on lines 48 to A0-A2 address inputs of ROM 40, only delayed slightly relative thereto.
In addition, it will be noted that the Q3 output signal from ripple counter 34 is also provided to a pair of R-S flip-flops 77 and 78 so as to produce at the RESET output terminals thereof slightly non-overlapping clock signals φ1 and φ2. As shown in Figure 14, the φ1 and φ2 clock signals are opposite in phase and in the preferred embodiment have a frequency of 20KHz. The φ1 and φ2 clock signals are utilized principally in connection with the implementation of the unique capacitive switching parameter control technique to be subsequently described, although the φ1 and φ2 clock signals are also utilized simply as convenient clock signals. Similarly, the Q~ output signal from ripple counter 34 is additionally provided to a pair of R-S flip-flops 75 and 76 so as to produce at the RESET output terminals thereof a second pair of slightly non-overlapping clock signals P1 and P2 which are also opposite in phase in the same manner as clock signals φ1 and φ2, and have a frequency of 5KHz in the preferred embodiment. The P1 and P2 clock signals are utilized solely in the sub-phoneme clock circuit 42 (Figure 7) to be described subsequently in greater detail.
The parameter values stored in ROM 40 which are generated on the high order data output lines D4-D7 comprise vocal tract control parameters which require dynamic transitioning and are therefore provided to the digital transition circuitry 50 to be subsequently described. The parameter values stored in ROM 40 which are generated on the low order data output lines D0-D3, however, are essentially on/off signals or timing signals which require no transitioning and are therefore provided directly to the phoneme timing, pause timing and delay circuitry.
The operation of the phoneme timing circuit will now be explained. The seven least significant bits D0-D6 from the data output of parameter storage ROM 40 are provided to the data (D) inputs of a 7-bit latch 80 which is clocked by a signal received on line 85 from the No. 7 output channel of an 8-channel multiplexor 86. The A, B and C binary control inputs of multiplexor 86 are tied to the A0-A2 clock signals from the output of latch 70 in timing circuit 38. Thus, it will be appreciated that when parameter selector bits A0-A2 are equal to 111, indicating that the phoneme timing control signal parameter is present on the data outputs (D0-D7) of parameter storage ROM 40, the No. 7 output channel of multiplexor 86 will go HI, thereby clocking into latch 80 the value of the phoneme timing control signal parameter.
With additional reference to Figure 7, the seven parallel outputs (PH0-PH6) from latch 80 are provided to the sub-phoneme clock circuit 42, which effectively comprises an oscillator circuit whose frequency is controlled by the value of the seven output bits from latch 80. In particular, the outputs from latch 80 (PH0-PH6) control the on/off state of seven analog switches, as indicated, which are individually connected in series with one of seven binary weighted, parallel connected capacitors. As will subsequently be described in greater detail in connection with the description of the vocal tract, by rapidly switching the capacitor network 79 in the manner shown under the control of the P1 and P2 non-overlapping clock signals, a resistance value is effectively simulated which is equal to the inverse of the product of the switching frequency (P1) and the resulting capacitance value of capacitor network 79. In other words, the simulated R value in the preferred embodiment is determined by the following:
$$R = \frac{1}{P1 \cdot C_{79}}$$
where P1 is the switching frequency and $C_{79}$ is the resulting capacitance value of capacitor network 79.
Therefore, the frequency of the output signal from the oscillator 42 on line 88 is determined by the product of the resulting resistance value (R) and the fixed capacitance value (C) in the RC timing network of the oscillator 42.
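The simulated resistance of the binary weighted network can be checked numerically. The following is a minimal sketch rather than the circuit of Figure 7: the unit capacitance `UNIT_CAP_F` and both function names are assumptions made for illustration, since the text states only the 5KHz P1 rate and the binary weighting of the seven capacitors.

```python
P1_HZ = 5_000.0       # P1 switching frequency stated for the preferred embodiment
UNIT_CAP_F = 1e-12    # hypothetical value of the smallest capacitor (assumption)


def network_capacitance(ph_code: int, unit_cap: float = UNIT_CAP_F) -> float:
    """Capacitance selected by the 7-bit code PH0-PH6."""
    if not 0 <= ph_code <= 0x7F:
        raise ValueError("ph_code must fit in 7 bits")
    # Parallel capacitors add, and bit k switches in 2**k unit capacitors,
    # so the total is simply ph_code unit capacitors.
    return ph_code * unit_cap


def simulated_resistance(ph_code: int, f_switch: float = P1_HZ) -> float:
    """R = 1 / (f_switch * C) for the switched capacitor network."""
    c = network_capacitance(ph_code)
    if c == 0.0:
        return float("inf")  # nothing switched in, so no charge is transferred
    return 1.0 / (f_switch * c)
```

Under these assumptions, doubling the latched code halves the simulated resistance, which is how the 7-bit phoneme timing value sets the oscillator rate.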
Returning to Figure 2, the output signal on line 88 from sub-phoneme clock circuit 42 is provided to the clock input of the phoneme timing counter 84 and thus determines the count rate of the counter. The duration of the phoneme is determined by the period of time it takes for the counter 84 to attain a count of 16. In particular, when the count outputs (Q0-Q3) of counter 84 are equal to 0000, the output of NOR-gate 90 will go HI, thereby providing a HI signal to the data input of flip-flop 92. After a brief delay -- specifically, two pulses from clock signal P2 -- the output of NAND-gate 94 will go LO to thereby set output latch 28 via inverter 96 and force the A/R line 30 HI to signal that new phoneme data is required.
The operation of the vocal delay and closure delay circuit will now be described. As previously mentioned, the purpose of the vocal delay and closure delay control signal parameters is to delay for a predetermined portion of the phoneme period the transmission of the vocal amplitude and fricative amplitude control signals, respectively, during certain vowel/fricative phonetic transitions. This is accomplished in the following manner. The four count outputs (Q0-Q3) from the phoneme timing counter 84 are provided through inverters 100 to the B inputs of a 4-bit magnitude comparator 82. The A inputs of magnitude comparator 82 are tied to the D0-D3 data outputs of the parameter ROM 40. The A=B output of magnitude comparator 82 is provided on line 102 to a pair of NAND-gates 104 and 106. The other input to NAND-gate 104 is connected via line 110 to the No. 5 output channel of multiplexor 86 and the other input to NAND-gate 106 is connected via line 108 to the No. 4 output channel of multiplexor 86. Thus, it will be appreciated that when the A2-A0 clock signals provided to the binary control inputs A, B and C of multiplexor 86 are equal to 101, indicating that the vocal delay (VD) control signal parameter is present at the D0-D3 data outputs of parameter storage ROM 40, and the count outputs (Q0-Q3) of phoneme timing counter 84 are equal to the parameter value of the vocal delay control signal, both inputs to NAND-gate 104 will be HI, thereby forcing the output of NAND-gate 104 LO to set flip-flop 112. Similarly, when the A2-A0 clock signals are equal to 100, the No. 4 output channel of multiplexor 86 on line 108 will go HI, indicating that the closure delay control signal parameter is present on the D0-D3 data outputs of parameter storage ROM 40. Consequently, when the count output of phoneme counter 84 simultaneously attains a count equal to the parameter value of the closure delay control signal, the A=B output of magnitude comparator 82 on line 102 will also go HI, thereby providing HI signals to both inputs of NAND-gate 106 and forcing its output LO to set flip-flop 114. As will subsequently be seen, the output signals from flip-flops 112 and 114 are provided to the freeze transition circuit 46 (Figure 3), which effectively inhibits the transition process in the digital transition circuit 50 during transitioning of the vocal amplitude and fricative amplitude control signal parameters. Flip-flops 112 and 114 are reset at the end of each phoneme period by a LO RESET pulse on line 124 from the output of AND-gate 122. The output of AND-gate 122 is forced LO at the end of each phoneme period by the HI RESET pulse produced by phoneme counter 84 on line 30, which is inverted by inverter 120 and provided to the input of AND-gate 122. Note, the "equal to" output of magnitude comparator 82 is also reset to a LO level at the beginning of each new phoneme period when the count output (Q0-Q3) of phoneme counter 84 is equal to 0000 by the resulting HI signal produced at the output of NOR-gate 90, which is inverted by inverter 98.
In addition, it will also be noted that the closure delay control signal parameter (CLD) is also utilized to inhibit the transmission of the closure (CL) control signal parameter to the closure circuit 68 (Figure 1). In particular, during the closure delay period the output of flip-flop 114 is LO, and hence a LO signal is provided on line 126 to one of the inputs of a three-input NAND-gate 128. The remaining two inputs of NAND-gate 128 are connected to line 116, which is tied to the No. 3 output channel of multiplexor 86, and to line 115, which is connected to the D3 data output of the parameter storage ROM 40.
When a closure signal is present, the D0-D2 data outputs from ROM 40 are not utilized since the closure control signal is simply an on/off type signal. Therefore, only the state of data output D3 from ROM 40 on line 115 is monitored during the closure parameter period when the A2-A0 clock signals are equal to 011 and the No. 3 output channel from multiplexer 86 on line 116 is HI. However, even if HI signals are presented on both lines 115 and 116, the output of NAND-gate 128 will not go LO to reset flip-flop 132 and produce a LO closure signal at the output of AND-gate 136 until flip-flop 114 is set upon termination of the closure delay period and a HI signal is presented on line 126. When the closure delay function is not present and the output of flip-flop 114 is HI, a LO closure signal will be produced at the output of AND-gate 136 when HI signals are coincidentally detected on both line 115 from the D3 data output of ROM 40 and line 116 from the No. 3 output channel of multiplexer 86. The closure signal is terminated when the output of NAND-gate 130 goes LO to set flip-flop 132, which occurs when the No. 3 output channel from multiplexer 86 on line 116 is HI and the D3 data output from ROM 40 is LO, indicating that the closure function is no longer desired. This typically will occur at the beginning of the following phoneme period if the following phoneme does not also require the closure function, since the output from the closure delay flip-flop 114 on line 126 will always be at least momentarily HI at the beginning of each phoneme period.
In addition, it will be noted that a closure control signal is also produced whenever the values of both the vocal amplitude (VA) and fricative amplitude (FA) control parameters are equal to zero. In other words, when both the VA and FA signals from NOR-gates 190 and 192 (Figure 3) are HI, the output of NAND-gate 134 will go LO, thereby producing a LO closure signal at the output of AND-gate 136. This serves to completely silence the vocal tract during periods when no excitation energy is present in the vocal tract to prevent buzzing and other forms of noise from being produced during pause periods.
The remaining non-transitional control parameter produced at the D0-D3 data outputs of parameter ROM 40 is the pause control parameter (PAC). As previously noted, the pause control parameter is generated whenever a silent phoneme is desired, and also serves to freeze the formant positions of the vocal tract until all excitation energy is dissipated. The pause control parameter is similar to the closure control parameter (CL) in that it is an on/off type parameter and therefore requires only a single data bit. For convenience, the D3 data output of parameter storage ROM 40 is again selected and the D0-D2 data outputs are not used.
The D3 data output from ROM 40 on line 115 is provided to the input of a NAND-gate 138 which has its other input connected to the No. 6 output channel of multiplexer 86. Accordingly, when the A2-A0 clock signals are equal to 110, causing the No. 6 output channel of multiplexer 86 to go HI, indicating that the pause control parameter is present at the D0-D3 data outputs of ROM 40, and the D3 data output on line 115 is also HI, the output of NAND-gate 138 will go LO and set flip-flop 142. The output of flip-flop 142 is in turn provided to the input of an AND-gate 146 which has its other input connected to the output of NAND-gate 144.
The inputs to NAND-gate 144 are connected to the VA and FA signals from the outputs of NOR-gates 190 and 192 in Figure 3. Therefore, since vocal and/or fricative excitation energy are always present during non-silent phonemes, the output of NAND-gate 144 will normally be HI. Consequently, when flip-flop 142 is set at the beginning of a pause phoneme, a HI signal will be produced at the output of AND-gate 146, which is provided to the freeze transition circuit 46 (Figure 3) and, as will subsequently be seen, is effective to inhibit the digital transition circuit to thereby hold the transitional control parameters at their current values. The HI "freeze formants" signal at the output of AND-gate 146 will be terminated, however, as soon as all of the vocal and fricative excitation energy has been completely dissipated, or at the end of the pause phoneme period, whichever occurs first. In particular, when both the VA and FA signals go HI, the output of NAND-gate 144 will go LO and thereby force the output of AND-gate 146 LO. Conversely, if the output of NAND-gate 144 is still HI at the end of the pause phoneme period when the D3 data output of ROM 40 on line 115 goes LO, then the resetting of flip-flop 142 caused by the resulting LO signal produced at the output of NAND-gate 140 will force the output of AND-gate 146 LO.
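The gating that produces the "freeze formants" signal reduces to two gates and can be restated as a short truth function. This is only an illustrative sketch: the function and argument names are invented here, and the flip-flop 142 state is passed in as a boolean rather than modeled.

```python
def nand(a: bool, b: bool) -> bool:
    """Two-input NAND gate."""
    return not (a and b)


def freeze_formants(pause_latched: bool, va_zero: bool, fa_zero: bool) -> bool:
    """Output of AND-gate 146 as described in the text.

    pause_latched: flip-flop 142 is set (a pause phoneme is in progress).
    va_zero / fa_zero: the VA and FA signals from NOR-gates 190 and 192,
    which are HI when the corresponding amplitude value is zero.
    """
    energy_present = nand(va_zero, fa_zero)   # NAND-gate 144
    return pause_latched and energy_present   # AND-gate 146
```

The signal is therefore HI only while a pause is latched and at least one of the two amplitudes is still non-zero, matching the "whichever occurs first" behavior described above.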
With particular reference to Figure 3, the operation of the digital transition circuit 50 will now be explained. As previously noted, the purpose of the digital transition circuit 50 is to provide a gradual change in the values of the vocal tract control parameters from the old or "current" values to the new or "target" values. The four parameter lines T0-T3 from the D4-D7 data outputs of parameter storage ROM 40 are provided to the 1-4 input channels of an 8-channel multiplexer 150. The A, B and C binary control inputs of multiplexer 150 are connected to the BIT 0, BIT 1, and BIT 2 transition clock signals, respectively, from the timing circuit 38 (Figure 2). The timing relationship between the BIT 0 - BIT 2 transition clock signals and the A0-A2 parameter clock signals is illustrated in Figure 11. As can readily be seen from the timing diagram in Figure 11, the BIT 0 - BIT 2 clock signals define eight states within each state defined by clock signals A0-A2. It will be recalled that clock signals A0-A2 define which parameter is present on the four parameter lines T0-T3 from the D4-D7 data outputs of ROM 40.
Multiplexer 150 thus serves to convert the parallel input data on parameter lines T0-T3 to serial data on output line 151 in preparation for the serial arithmetic functions to be subsequently performed.
In addition, however, because the parallel output from the digital transition circuit is ultimately taken off the four most significant bits in the 8-bit parallel output signal from output latches 168 and 170, and the parameter input lines T0-T3 to the transition circuit are connected to input channels 1-4 of multiplexer 150 and are hence shifted three bits down with respect to the output, multiplexer 150 also serves to effectively divide the "target" parameter value by 2^3 or eight.
The serial output from multiplexer 150, which is therefore equal to 1/8 of the target parameter value, is provided on line 151 to the B input of a single bit full adder 158. The A input of adder 158 is connected to the sum output (Σ) of a second single bit full adder 156. The A input of adder 156 is connected to the serial output of a second 8-channel multiplexer 152 which merely serves to convert the resulting parallel output signal from the digital transition circuit 50 back to a serial signal. Thus, the signal present on line 153 from the serial output of multiplexer 152 represents the most recent or current value of the control parameter. The B input of adder 156 is connected to the serial output of a third 8-channel multiplexer 154 which also serves to convert the parallel output signal from the digital transition circuit 50 to a serial signal. However, because the four most significant bits in the output signal from latch 168 are shifted down three bits and connected to the 1-4 input channels of multiplexer 154, multiplexer 154 also serves to effectively divide the current parameter value by eight. Thus, since the serial output from multiplexer 154 on line 155 is provided to the B input of adder 156 through an inverter 172, it will be appreciated that adder 156 effectively subtracts 1/8 of the current parameter value from the current parameter value and provides the total to the A input of adder 158. Adder 158 then adds to the total from adder 156 1/8 of the target parameter value. The value of the resulting signal at the sum output (Σ) of adder 158 on line 165, which represents the "new" current parameter value, is therefore given by the following equation:
$$\left(\text{Current} - \tfrac{1}{8}\,\text{Current}\right) + \tfrac{1}{8}\,\text{Target} = \text{New Current}$$
The serial signal on line 165 from the sum output (Σ) of adder 158 is converted back to a parallel signal by a pair of Hex D flip-flops 160 and 162, and the resulting 8-bit signal is provided to the eight data inputs (D0-D3) of a pair of temporary storage RAMs 164 and 166. The address inputs (A0-A2) of the RAMs 164 and 166 are connected to the parameter clock lines A0-A2. Thus, it will be appreciated that as long as the R/W inputs of RAMs 164 and 166 remain properly enabled, each successive new current parameter value will be properly stored in RAMs 164 and 166 in the address locations identified by parameter clock signals A0-A2.
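The update rule can be sketched in integer arithmetic as follows. This is an illustration under stated assumptions, not the serial adder circuit itself: the current value is modeled as an 8-bit word whose four most significant bits are the parameter, the 1/8 terms are formed from four bits placed on channels 1-4 as the text describes, and carry handling in the bit-serial subtraction is not modeled. The function name is hypothetical.

```python
def transition_step(current: int, target4: int) -> int:
    """One pass of New = (Current - Current/8) + Target/8 on an 8-bit word.

    current: 8-bit stored value; bits 4-7 are the 4-bit parameter output.
    target4: 4-bit target parameter from ROM lines T0-T3.
    """
    if not 0 <= current <= 0xFF or not 0 <= target4 <= 0xF:
        raise ValueError("current must be 8 bits and target4 must be 4 bits")
    # Four bits placed on input channels 1-4 are worth (value << 1),
    # i.e. one eighth of the same four bits sitting at positions 4-7.
    eighth_of_target = target4 << 1
    eighth_of_current = (current >> 4) << 1   # low four bits are dropped
    return (current - eighth_of_current + eighth_of_target) & 0xFF


# Example: walk a parameter from 0 toward a target of 0xF.
value = 0
history = []
for _ in range(40):
    value = transition_step(value, 0xF)
    history.append(value >> 4)   # the 4-bit output seen by the vocal tract
```

In this model each step moves the stored value by one eighth of the remaining difference between the 4-bit output and the target, so the steps shrink as the target is approached and stop once the four most significant bits equal the target.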
The read/write (R/W) inputs of both RAMs 164 and 166 are connected to the output of an OR-gate 172 which has one of its inputs connected to the serial output of a multiplexer 174 and the other of its inputs tied to the φ2 clock line through an inverter 173. The A, B, and C binary control inputs of multiplexer 174 are also tied to the A0-A2 parameter clock lines. The five low order input channels of multiplexer 174 (Nos. 0-4) are connected to the output of a first NAND-gate 176, the No. 5 input channel is connected to the output of a second NAND-gate 178, and the No. 6 input channel is connected to the output of a third NAND-gate 180. One of the inputs to NAND-gate 176 is tied through an inverter 182 to the "freeze formants" (FF) signal line, such that when a HI freeze signal is produced, the output of NAND-gate 176 will go HI. Similarly, one of the inputs to each of NAND-gates 178 and 180 is connected to the vocal delay (VD) and closure delay (CLD) signal lines, respectively, such that when a LO vocal delay or closure delay signal is produced, the outputs of NAND-gates 178 and 180, respectively, will go HI.
Thus, it will be appreciated that, absent a vocal delay (VD), closure delay (CLD), or freeze formants (FF) signal, the serial output of multiplexer 174 will remain LO and the R/W inputs of RAMs 164 and 166 will be clocked under the control of the φ2 clock signal. Accordingly, new current values for each parameter will be written into RAMs 164 and 166 and then subsequently read out onto the data outputs Q0-Q3 of RAMs 164 and 166. With the frequency of the φ2 clock signal in the preferred embodiment set at 20KHz, a parameter will normally transition to approximately 70% of its new target position in 33 msec.
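The 70% figure can be cross-checked against the ideal form of the recurrence, in which 7/8 of the remaining distance survives each update. The sketch below is a rough estimate only. It assumes that each parameter is updated once every 64 cycles of the 20KHz clock (eight parameter slots of eight bit times each, one bit time per clock cycle), which the text does not state explicitly, and it ignores the truncation performed by the actual circuit.

```python
import math

PHI2_HZ = 20_000.0          # clock frequency stated for the preferred embodiment
CLOCKS_PER_UPDATE = 8 * 8   # assumption: 8 parameter slots x 8 bit times


def steps_to_reach(fraction: float, retain: float = 7.0 / 8.0) -> int:
    """Smallest n such that 1 - retain**n >= fraction."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - fraction) / math.log(retain))


n_updates = steps_to_reach(0.70)
elapsed_ms = n_updates * CLOCKS_PER_UPDATE / PHI2_HZ * 1e3
```

Under these assumptions ten updates are needed, taking about 32 msec, which is in line with the approximate 33 msec quoted above.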
However, when a "freeze formants" (FF) signal is produced, the serial output of multiplexer 174 will go HI to inhibit the R/W inputs of RAMs 164 and 166 during the periods when parameter bits A2-A0 are equal to 000, 001, 010, 011, and 100, which corresponds to the periods of production for the F1, F2, FC, F3, and F2Q control parameters. Consequently, when this occurs, new values for these parameters will not be written in RAMs 164 and 166 and the values of these control parameters will effectively be frozen at their current values for as long as the FF signal is produced.
Similarly, the presence of a vocal delay (VD) signal will cause the serial output of multiplexer 174 to go HI and inhibit the R/W line 47 to RAMs 164 and 166 during the "101" or vocal amplitude (VA) parameter period and prevent new values for the vocal amplitude parameter from being written into RAMs 164 and 166 until the vocal delay signal is terminated. Finally, when a closure delay signal (CLD) is generated, the serial output of multiplexer 174 will go HI during the "110" or fricative amplitude (FA) parameter period and thereby hold the value of the fricative amplitude parameter until the closure delay signal is terminated.
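The write-inhibit behavior of multiplexer 174 can be summarized as a lookup on the parameter address. The following sketch uses invented names and treats each control signal as a boolean meaning "active", regardless of its electrical polarity. The behavior for address 111 is not described in the text and is assumed here to be "never inhibited".

```python
# Transitional parameter produced in each A2-A0 slot, per the text.
PARAMETER_SLOTS = {0: "F1", 1: "F2", 2: "FC", 3: "F3", 4: "F2Q", 5: "VA", 6: "FA"}


def write_inhibited(address: int, ff: bool, vd: bool, cld: bool) -> bool:
    """True when the RAM write for this parameter slot is blocked.

    ff:  freeze formants signal is active
    vd:  vocal delay signal is active
    cld: closure delay signal is active
    """
    if not 0 <= address <= 7:
        raise ValueError("address must be a 3-bit value")
    if address <= 4:      # F1, F2, FC, F3, F2Q are held by freeze formants
        return ff
    if address == 5:      # vocal amplitude is held by vocal delay
        return vd
    if address == 6:      # fricative amplitude is held by closure delay
        return cld
    return False          # address 7: assumed never inhibited


frozen = [PARAMETER_SLOTS[a] for a in PARAMETER_SLOTS
          if write_inhibited(a, ff=True, vd=False, cld=False)]
```

With only the freeze formants signal active, `frozen` lists the five formant-related parameters while VA and FA continue to transition.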
The eight parallel data outputs Q0-Q3 from RAMs 164 and 166 are latched into a pair of 4-bit output latches 168 and 170 under the control of the φ1 clock signal. The four most significant bits, as previously noted, or the Q outputs from output latch 168, are then de-multiplexed by a plurality of latches 194-200 to provide the resulting F1, F2, FC, F̄C̄, F3, F2Q, VA, and FA transitional control signals which are provided to the various sections of the vocal tract 60. In particular, the four Q outputs from output latch 168 are connected in parallel to the data (D) inputs of each of the de-multiplexing latches 194-200. Latches 194-200 are clocked under the control of the A0-A2 parameter clock signals which are connected to the A, B and C inputs of a 3-to-8 line decoder 210. Output channels 2, 5 and 6 of decoder 210 are tied directly to the clock inputs (CK) of latches 196, 199 and 200, respectively. Output channels 0, 1, 3 and 4, however, are AND'ed with the glottal sync pulse on line 55 by AND-gates 202-208 before connection to the clock inputs (CK) of latches 194, 195, 197 and 198. Thus, it will be appreciated that the transitioned values for parameters FC, F̄C̄, VA, and FA are clocked into latches 196, 199 and 200, respectively, immediately upon updating, while the transitioned values for parameters F1, F2, F3, and F2Q are clocked into latches 194, 195, 197 and 198, respectively, in synchronization with the glottal pulse from the glottal source 58 (Figure 6).
In addition, it will be noted that de-multiplexing latch 195, which produces the transitioned F2 control signal parameter, is also provided with the fifth most significant bit (1/2 LSB) in the 8-bit output signal from digital transition circuit 50 so that the transitioned value for the F2 control parameter has five bits of resolution. This is done to increase the step resolution of the F2 control parameter due to the greater frequency span of the F2 resonant filter in the vocal tract 60 (Figure 4).
Turning now to Figure 6, the unique glottal source circuit 58 of the present invention will now be explained. The period of the glottal pulse signal is determined by the time it takes for an 8-bit counter, comprised of cascaded 4-bit jam counters 220 and 222, to count from a preset count to all ones. Specifically, counters 220 and 222 are clocked by the 20KHz φ2 clock signal. The three most significant data inputs (L1-L3) of counter 222 are connected to the inverse of the three least significant bits (F10-F12) in the transitioned F1 control parameter from the output of latch 194 (Figure 3). The inverse of the most significant bit (F13) in the transitioned F1 control parameter and the two inflection control bits (I1, I2) 56 from the 8-bit input command word are provided to the three least significant data inputs (L0-L2) of counter 220. The data present at the inputs (L0-L3) of counters 220 and 222 is loaded into the counters to preset the counters at the end of each glottal pulse period when a HI signal is produced at the carry output (TC) of counter 220, which is inverted by inverter 228 and provided on line 230 to the load inputs (LD) of counters 220 and 222. Accordingly, since the frequency of the carry out signal from counter 220 determines the frequency of the glottal pulse signal, it will be appreciated that the fundamental frequency of the glottal pulse signal is controlled by the setting of the inflection control bits 56 and, to a lesser degree, by the value of the F1 control parameter. Since the F1 control parameter is inverted, the fundamental frequency of the glottal signal will vary inversely with respect thereto. In other words, when the value of the F1 parameter decreases, the pitch of the audio output will increase. This serves to provide a degree of automatic inflection control in the audio output in addition to the programmable changes in pitch available via the inflection control bits 56.
It will be noted, however, that since the two inflection control bits 56 are provided to higher order data inputs of counters 220 and 222 than the F1 parameter bits, the automatic inflection changes which result from movement in the resonant frequency of the F1 resonant filter will have a less pronounced effect on the pitch of the audio output than the programmed changes made via the inflection control bits 56.
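The dependence of pitch on the F1 parameter and the inflection bits can be illustrated with a small model of the preset counter. Several details below are assumptions rather than facts from the text: the two data inputs that the description leaves unspecified (L0 of counter 222 and L3 of counter 220) are taken as 0, I1 is taken to drive L1 and I2 to drive L2 of counter 220, and one glottal period is taken to last `256 - preset` clock cycles. The function names are hypothetical.

```python
PHI2_HZ = 20_000.0   # clock rate of counters 220 and 222 stated in the text


def glottal_preset(f1: int, i1: int, i2: int) -> int:
    """8-bit preset word loaded into the cascaded counters.

    Bits 1-4 carry the inverted 4-bit F1 value; bits 5 and 6 carry the
    inflection bits. Bits 0 and 7 are assumed to be 0.
    """
    if not 0 <= f1 <= 0xF or i1 not in (0, 1) or i2 not in (0, 1):
        raise ValueError("f1 must be 4 bits; i1 and i2 must be 0 or 1")
    inverted_f1 = ~f1 & 0xF
    return (i2 << 6) | (i1 << 5) | (inverted_f1 << 1)


def glottal_frequency(f1: int, i1: int, i2: int) -> float:
    """Approximate fundamental frequency in Hz under the stated assumptions."""
    clocks_per_period = 256 - glottal_preset(f1, i1, i2)
    return PHI2_HZ / clocks_per_period
```

In this model a lower F1 value gives a larger preset and therefore a higher pitch, and toggling an inflection bit shifts the pitch by more than the full swing of F1 does, consistent with the behavior described above.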
In addition, it will be noted that a third 4-bit jam counter 224 is provided which is loaded with the same data and enabled simultaneously with counter 220. The only difference is that counter 224 is clocked by the A0 clock signal and therefore counts twice as fast as counter 220. The carry output (TC) of counter 224 is connected to the SET input of an R-S flip-flop 226 and the RESET input of flip-flop 226 is connected to the carry output (TC) of counter 220. Thus, the output of flip-flop 226 on line 65 will be set LO at the beginning of each glottal pulse period and go HI halfway through the glottal pulse period. The signal on line 65, referred to as the FGATE signal, is provided to the white noise generator circuit 64 (Figure 8) to inhibit the white noise signal during the initial half of the glottal pulse period for voiced fricative phonemes when both vocal and fricative excitation energy are present at the same time.
The waveform of the glottal pulse signal is generated by a 4-bit counter 234 and a pair of 8-to-1 analog multiplexers 242 and 244. Counter 234 is clocked by the Q1 output of another 4-bit counter 236 which is in turn clocked by the 20KHz φ2 clock signal. Thus, counter 236 effectively serves to divide the frequency of the 20KHz φ2 clock signal by four so that the frequency of the clock signal provided to counter 234 is 5KHz. The three least significant count outputs (Q0-Q2) from counter 234 are connected in parallel to the A, B and C binary control inputs of multiplexers 242 and 244. The most significant count output (Q3) from counter 234 is connected to the inhibit input (INH) of multiplexer 242 and through an inverter 245 to the inhibit input (INH) of multiplexer 244, so that for the first eight counts (0-7) of counter 234, multiplexer 244 is disabled and during the second eight counts (8-15) of counter 234, multiplexer 242 is disabled. The parallel inputs (0-7) of both multiplexers 242 and 244 are each shown tied to a variable resistor connected between a voltage source (Vp) and ground, which is intended to represent a presettable d.c. signal level. As will be seen, the d.c. levels are preset to appropriate values to provide the desired glottal waveform approximation.
The serial outputs of multiplexers 242 and 244 are tied in common and provide the glottal output signal on line 246. The d.c. analog level produced on output line 246 is therefore determined by the count output of counter 234 which is provided to the A, B and C binary control inputs of multiplexers 242 and 244. In other words, each of the sixteen counts from the output of counter 234 uniquely identifies one of the sixteen inputs to multiplexers 242 and 244. For example, when the count output of counter 234 is equal to 0110, the d.c. level present at the No. 6 input channel of multiplexer 242 will be produced on output line 246. Similarly, when the count output of counter 234 is equal to 1101, the d.c. level present at the No. 5 input channel of multiplexer 244 will be produced on output line 246. In the preferred embodiment, wherein the φ2 clock signal is set at 20KHz, the 5KHz count rate of counter 234 results in a 0.2 msec. segment in the glottal output signal on line 246 for each count of counter 234. Thus, it will be appreciated that by properly presetting the d.c. signal levels provided to the inputs of multiplexers 242 and 244, any desired glottal waveform may be generated. In the preferred embodiment, it was determined that an 8-segment glottal pulse of the type illustrated in Figure 10 was adequate, and therefore only a single 8-to-1 multiplexer was used.
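The segment lookup performed by counter 234 and the two multiplexers amounts to indexing a table of preset levels every 0.2 msec and then holding the last level. The sketch below illustrates this; the level values in `EXAMPLE_LEVELS` are made up for demonstration and are not the levels of Figure 10, and the function name is hypothetical.

```python
from typing import Sequence

SEGMENT_S = 0.2e-3   # duration of one count of counter 234 at 5KHz

# Hypothetical d.c. levels for the sixteen multiplexer inputs (illustration only).
EXAMPLE_LEVELS = [0.0, 0.2, 0.5, 0.8, 1.0, 0.9, 0.6, 0.3,
                  0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]


def glottal_level(t_in_period: float, levels: Sequence[float]) -> float:
    """Level on output line 246 at time t (seconds) after the glottal sync.

    The counter selects one level per 0.2 msec segment and then stops at
    its final count, so the last level is held for the rest of the period.
    """
    if t_in_period < 0.0:
        raise ValueError("time within the period must be non-negative")
    index = min(int(t_in_period / SEGMENT_S), len(levels) - 1)
    return levels[index]
```

For instance, `glottal_level(1.25e-3, EXAMPLE_LEVELS)` falls in the seventh segment (count 0110) and returns the level assigned to input No. 6, while any time past the sixteenth segment returns the final held level.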
The carry output (TC) from counter 234 is returned to its enable (EP) input through an inverter 238 to disable the counter once the counter has attained a count of 1111 and prevent it responding to additional clock pulses. The purpose of this is to hold counter 234 at its last count so that the last d.c. level produced on output line 246 will be maintained for the duration of the glottal period. Specifically, it will be noted that the carry output of counter 220, which determines the glottal period, is provided through inverter 228 on line 230 to the input of an OR-gate 232, which has its other input tied through an inverter 231 to the φ1 clock signal. The output from OR-gate 232 is connected to the clear inputs (CLR) of both counters 234 and 236. Thus, it will be appreciated that at the end of the glottal pulse period when a HI signal is produced at the carry output (TC) of counter 220, a LO signal is provided to the CLR inputs of counters 234 and 236, thereby setting all four outputs (Q0-Q3) of each counter to zero to initiate a new glottal pulse.
When the Q0-Q3 count outputs of counter 234 are reset to zero at the beginning of each glottal pulse, a HI output pulse is produced on glottal sync line 55 at the output of NOR-gate 240, which has its four inputs connected to the four outputs of counter 234. As previously noted, the glottal sync pulse on line 55 is provided to the transition circuitry (Figure 3) to latch the F1, F2, F3 and F2Q output latches 194, 195, 197 and 198, respectively, in synchronization with the beginning of each glottal pulse. The purpose of synchronizing the transitioning of the F1, F2, F3 and F2Q control parameter values with the beginning of the glottal pulse is to prevent the production of audible random noise which would be produced if the CMOS switches in the variable capacitance networks in the vocal tract were permitted to switch during the "rest" period of the glottal pulse signal when no excitation energy is present in the vocal tract.
Referring now to Figure 8, the fricative excitation signal is produced by a white noise generator comprised of a jam counter 250 and an 18-stage static shift register 252, which generates a random white noise output signal on line 256. Both jam counter 250 and shift register 252 are clocked by the P1 clock signal which is provided to their clock inputs (CK) through a NOR-gate 254. The other input of NOR-gate 254 is connected to the inverse of the fricative amplitude control parameter (FA) from the output of NOR-gate 192 (Figure 3). Thus, when no fricative excitation energy is needed, the noise generator is disabled to avoid any unnecessary interference to the remainder of the system. The white noise output signal on line 256 is provided to the input of a NAND-gate 258 which has its other input connected to the output of an OR-gate 260. The inputs to OR-gate 260 are tied to the FGATE signal line 65 and the VA signal line from the output of NOR-gate 190 (Figure 3). During voiced fricative phonemes which require both vocal and fricative excitation energy, both VA and FA signals will be LO. Thus, it will be appreciated that the FGATE signal on line 65, which, it will be recalled, goes HI during the latter half or "rest" portion of the glottal signal period, will enable NAND-gate 258 and effectively gate on the white noise signal during the latter inactive portion of the glottal signal period.
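The gating of the noise signal by FGATE and the VA signal can be written as a two-gate truth function. This is a sketch with invented names; it models only the OR-gate and NAND-gate described above and not the shift register that generates the noise bits.

```python
def gated_noise(noise_bit: bool, fgate: bool, va_zero: bool) -> bool:
    """Output of NAND-gate 258 for one noise sample.

    noise_bit: current output of the white noise generator on line 256.
    fgate:     FGATE signal, HI during the latter half of the glottal period.
    va_zero:   VA signal from NOR-gate 190, HI when vocal amplitude is zero.
    """
    enable = fgate or va_zero             # OR-gate 260
    return not (noise_bit and enable)     # NAND-gate 258


# Voiced fricative (va_zero False): noise passes only while FGATE is HI.
first_half = [gated_noise(bit, fgate=False, va_zero=False) for bit in (True, False)]
second_half = [gated_noise(bit, fgate=True, va_zero=False) for bit in (True, False)]
```

During the first half of the glottal period the output stays at a constant HI level whatever the noise bit is, while in the second half it follows the inverted noise bit. When the vocal amplitude is zero the noise is gated on for the whole period.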
The white noise signal from the output of NAND-gate 258 is then provided to the fricative amplitude control circuit 260 which controls the amplitude of the white noise signal in accordance with the value of the fricative amplitude control parameter (FA). The resulting white noise signal is then filtered by a high pass noise shaping filter before injection into the vocal tract 60 under the control of the fricative control signal parameter (FC) and its inverse (F̄C̄).
The operation of the fricative amplitude control circuit 260 and the high pass noise shaping circuit 262, which utilize the same capacitive switching technique employed in the vocal tract 60, will become readily apparent from the following description of the vocal tract 60.
With reference now to Figures 4 and 5, a circuit diagram of the novel vocal tract 60 of the present invention is shown. The vocal tract 60 is principally comprised of four cascaded resonant filters, designated F1, F2, F3 and F5. The resonant frequencies of the F1, F2 and F3 resonant filters are variable and are controlled in accordance with the F1, F2 and F3 control parameters, whereas the resonant frequency of the F5 resonant filter is fixed. The glottal source or vocal excitation signal is provided through the vocal amplitude control circuit 62, which controls the amplitude of the glottal signal in accordance with the vocal amplitude (VA) control parameter, and is then injected serially into the F1 resonant filter of the vocal tract. The fricative excitation signal is injected in parallel into the F2 and F5 resonant filters of the vocal tract 60 under the control of the fricative control parameter (FC) and the inverse of the fricative control parameter (F̄C̄), respectively. In addition, it will be noted the "Q" or bandwidth of the F2 resonant filter is also controlled by the F2Q control parameter, which as previously explained is used principally during nasal phonemes to reduce the Q and thus increase the bandwidth of the F2 resonant filter. The unique manner in which the parameter control functions are implemented in the present system will now be explained.
As previously noted, the preferred embodiment of the present invention is particularly adapted to be implemented in a single integrated circuit utilizing complementary metal oxide semiconductor (CMOS) technology. In view of the desire to design a complete speech synthesizer which is capable of being constructed on a single silicon "chip", a unique approach was taken in the manner in which the parameter control functions are implemented. In particular, rather than utilizing time-weighted duty cycle control signals as in many previous speech systems, the present invention employs a capacitive switching technique to control the tuning of the vocal tract, as well as the other parameter controlled functions.
With particular reference to Figure 13, a circuit model illustrating the theory of operation of the capacitive switching technique employed is shown.
In Figure 13, the current ($I$) into the negative input of the operational amplifier is determined by the charge on the capacitor ($C_r$) and the frequency at which it is switched back and forth ($F_{\phi 1}$). Expressed in equation form, therefore:

$$I = V_i \, C_r \, F_{\phi 1}$$

Since current ($I$) is, of course, also equal to ($V_i/R$), the following relationship is presented:

$$R = \frac{1}{C_r F_{\phi 1}}$$

Thus, it can be seen that a capacitor that is switched in the manner illustrated in Figure 13 is essentially equivalent to a resistor. In addition, since the time constant ($T$) of the circuit is given by the following:

$$T = R C_f = \left(\frac{1}{C_r F_{\phi 1}}\right) C_f = \frac{C_f}{C_r F_{\phi 1}}$$

the frequency response ($F$) of the circuit is equal to:

$$F = F_{\phi 1}\left(\frac{C_r}{2\pi C_f}\right)$$

Consequently, it will be appreciated that the time constant and frequency response of an RC circuit simulated by the above capacitive switching technique is dependent not only upon the switching frequency ($F_{\phi 1}$), but also, significantly, upon the capacitor ratio of $C_r$ and $C_f$. As a result, in order to achieve a frequency response in the low frequency range of the human voice, it is not necessary to use large capacitors, but simply capacitors having the proper ratio. Thus, for example, with a switching frequency ($F_{\phi 1}$) equal to 20 kHz, values for $C_r$ = 1 pF and $C_f$ = 3.183 pF will provide a frequency response of 1000 Hz. Thus, as will be appreciated by those skilled in the art, by eliminating the need for large capacitors, the physical size of the silicon chip can be minimized. In addition, since the frequency response is dependent upon the capacitor ratio and not their actual physical size, the tolerance between production batches of the silicon chip can be readily maintained at high accuracy levels.
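The relationships above can be checked numerically. The short sketch below evaluates the equivalent resistance, time constant and frequency response for the example values given in the text (20 kHz switching, 1 pF and 3.183 pF); the function names are chosen for illustration only.

```python
import math

def equivalent_resistance(c_r, f_switch):
    """R = 1 / (Cr * F_phi1): resistance simulated by the switched capacitor."""
    return 1.0 / (c_r * f_switch)

def time_constant(c_r, c_f, f_switch):
    """T = R * Cf = Cf / (Cr * F_phi1)."""
    return equivalent_resistance(c_r, f_switch) * c_f

def frequency_response(c_r, c_f, f_switch):
    """F = F_phi1 * Cr / (2 * pi * Cf), i.e. 1 / (2 * pi * T)."""
    return f_switch * c_r / (2.0 * math.pi * c_f)

# Example values from the text: F_phi1 = 20 kHz, Cr = 1 pF, Cf = 3.183 pF.
F_SWITCH = 20e3
C_R = 1e-12
C_F = 3.183e-12

r_equiv = equivalent_resistance(C_R, F_SWITCH)    # about 50 megohms
f_corner = frequency_response(C_R, C_F, F_SWITCH) # about 1000 Hz

# Only the ratio Cr/Cf matters: scaling both capacitors leaves F unchanged.
f_scaled = frequency_response(10 * C_R, 10 * C_F, F_SWITCH)
```

With picofarad capacitors the simulated resistance is on the order of 50 megohms, which illustrates why no large capacitor is needed to reach a 1000 Hz response.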
As can be seen from the circuit diagrams in Figures 4, ~, ~ and 8, the above-described capacitive switching technique is utilized in the preferred embodiment to implement the F1, F2, F3, F2Q, FC, $\overline{FC}$, VA, FA and phoneme timing parameter controlled functions. In each instance the control signal parameter is utilized to control the value of the capacitor ratio of the particular circuit involved, and hence the effective resistance value of the circuit, by controlling the on/off state of a plurality of CMOS switches which are individually connected in series with one of a corresponding plurality of binary-weighted, parallel connected capacitors.
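As a rough illustration of the binary-weighted capacitor bank, the sketch below maps a digital control code to the total switched capacitance and to the resistance it would simulate. It assumes that the bank forms the switched capacitor of the Figure 13 model and that a code of zero leaves no capacitor connected; the unit capacitance and the function names are hypothetical.

```python
import math

def bank_capacitance(code, unit_cap_f, bits=4):
    """Total capacitance of the binary-weighted capacitors enabled by `code`.

    Bit i of the code closes the switch in series with a capacitor of value
    unit_cap_f * 2**i; the enabled capacitors are connected in parallel.
    """
    if not 0 <= code < (1 << bits):
        raise ValueError("control code out of range")
    total = 0.0
    for bit in range(bits):
        if (code >> bit) & 1:
            total += unit_cap_f * (1 << bit)
    return total

def effective_resistance(code, unit_cap_f, f_switch_hz, bits=4):
    """Resistance simulated by the enabled bank, R = 1 / (C * F_switch)."""
    c = bank_capacitance(code, unit_cap_f, bits)
    return math.inf if c == 0.0 else 1.0 / (c * f_switch_hz)

# Hypothetical 1 pF unit capacitor switched at 20 kHz.
resistances = [effective_resistance(code, 1e-12, 20e3) for code in range(16)]
```

A four-bit code selects one of sixteen capacitance values, so the effective resistance is set digitally without any time-weighted duty cycle signal.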
The switching frequency ($F_{\phi 1}$) is set at 20 kHz as established by the φ1 and φ2 clock signals from the timing circuit 38, except in the sub-phoneme clock circuit 42 (Figure 7), which utilizes the 5 kHz P1 and P2 clock signals from timing circuit 38. The φ1 and φ2 clock signals comprise digital, two-phase clock signals, both having a frequency of 20 kHz. As shown in Figure 14, the φ1 and φ2 clock signals are opposite in phase and slightly non-overlapping. The purpose of the second non-overlapping clock signal (φ2) is to eliminate parasitic capacitances due to the operational amplifier and the circuit layout.
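The two-phase, non-overlapping clocking can be pictured with a small sampled model. The sketch below is an assumption-laden illustration, not a reproduction of Figure 14: the number of samples per period and the width of the dead time between phases are arbitrary, and `two_phase_clock` is a name invented for the example.

```python
def two_phase_clock(n_periods, samples_per_period=20, gap=1):
    """Return sampled phi1 and phi2 waveforms that are never high together.

    Each phase is high for just under half a period; `gap` samples of dead
    time follow each phase so that the two never overlap.
    """
    half = samples_per_period // 2
    if not 0 < gap < half:
        raise ValueError("gap must be positive and shorter than half a period")
    phi1, phi2 = [], []
    for _ in range(n_periods):
        for k in range(samples_per_period):
            phi1.append(1 if k < half - gap else 0)
            phi2.append(1 if half <= k < samples_per_period - gap else 0)
    return phi1, phi2

phi1, phi2 = two_phase_clock(3)
```

In each period the first phase goes low before the second goes high, and the reverse, which is the "slightly non-overlapping" property the text describes.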
The operation of the capacitive switching parameter control circuitry is probably best understood by comparing the circuit diagram of the vocal amplitude circuit 62 and the F1 and F2 resonant filters from the vocal tract 60 in Figure 4 with the resistor equivalence of this circuitry in Figure 9. As can readily be seen, the VA, F1, F2, F2Q and FC control signal parameters each control the effective resistance value of a variable resistance circuit equivalent by setting the capacitance ratio thereof to one of sixteen discrete values. The sixteen different values are determined, of course, by the state of the four bits in each control signal parameter. This is true of all the parameters except the F2 control parameter, which controls the frequency movement of the F2 resonant filter. Although the F2 control signal parameter contains four bits of resolution defining sixteen different target positions as with the other control parameters, a fifth resolution bit is added to the F2 control parameter during the transition process, as previously described, to reduce the discrete incremental step movement in the fundamental frequency of the F2 resonant filter as it is dynamically transitioned to its new target position. The fifth resolution bit in the F2 control parameter is provided in the preferred embodiment because the frequency span of the F2 resonant filter is approximately twice that of the F1 resonant filter, which also uses four bits of resolution to provide sixteen incremental steps.
Consequently, since it is desirable to make the incremental changes small enough so that transitional step movement is perceived as a gradual change by the human ear, it was deemed necessary to add a fifth resolution bit to the F2 control parameter to reduce the amount of movement in each discrete step.
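The effect of the fifth resolution bit can be illustrated with simple arithmetic. The sketch below assumes evenly spaced steps across each filter's frequency span, which is a simplification, and uses a hypothetical span value; only the ratio of roughly two between the F2 and F1 spans comes from the text.

```python
def step_size_hz(span_hz, bits):
    """Size of one incremental step when `span_hz` is divided into 2**bits steps."""
    return span_hz / (1 << bits)

F1_SPAN_HZ = 600.0            # hypothetical span, for illustration only
F2_SPAN_HZ = 2 * F1_SPAN_HZ   # the text states F2 spans about twice the F1 range

f1_step = step_size_hz(F1_SPAN_HZ, 4)         # sixteen steps across the F1 span
f2_step_4bit = step_size_hz(F2_SPAN_HZ, 4)    # sixteen steps: twice as coarse as F1
f2_step_5bit = step_size_hz(F2_SPAN_HZ, 5)    # thirty-two steps: matches F1 again
```

Under these assumptions, doubling the span doubles the step size at four bits, and the extra bit halves it again, bringing the F2 step back to the size of the F1 step.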
While the above description constitutes the preferred embodiment of the present invention, it will be appreciated that the invention is susceptible to modification, variation and change without departing from the proper scope or fair meaning of the accompanying claims.

Claims (2)

THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. In a phoneme-based speech synthesizer including parameter storage means for producing, for each phoneme, on a first data bus a plurality of multiplexed digital control parameters defining target values for said control parameters, and a vocal tract model that is controlled in accordance with the current values of said control parameters; the improvement comprising digital transition means for sequentially transitioning said control parameters so that the values of said control parameters are gradually changed from said current values toward said target values, including:
output means for providing on a second data bus the multiplexed current values of said control parameters;
demultiplexer means for demultiplexing the signal on said second data bus and producing a corresponding plurality of parallel digital output signals comprising the current values of said plurality of control parameters; and
arithmetic circuit means for calculating a factor related to a predetermined percentage of the difference between the target value signal on said first data bus and the current value signal on said second data bus, adding said factor to said current value signal at a predetermined rate, and providing the resulting value signal to said output means.
2. The speech synthesizer of Claim 1 wherein said arithmetic circuit means comprises first circuit means connected to said first data bus for producing a first signal equal to said predetermined percentage of said target value signal, second circuit means connected to said second data bus for producing a second signal equal to said predetermined percentage of said current value signal, third circuit means for subtracting said second signal from said current value signal and producing a third signal equal to the difference thereof, and fourth circuit means for adding said first signal to said third signal and providing the resultant signal to said output means.
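The arithmetic recited in the two claims can be compared with a short sketch. It is an illustration only: the percentage of one eighth, the starting and target values, and the function names are assumptions, and exact fractions are used in place of whatever word length the actual circuit employs.

```python
from fractions import Fraction

def transition_claim1(current, target, pct):
    """Claim 1 form: add a percentage of (target - current) to the current value."""
    return current + pct * (target - current)

def transition_claim2(current, target, pct):
    """Claim 2 form: the same update built from four separate signals."""
    first = pct * target        # percentage of the target value
    second = pct * current      # percentage of the current value
    third = current - second    # current value minus its percentage
    return first + third        # resulting value passed to the output means

def run_transition(current, target, pct, steps):
    """Apply the update once per step, as it would be at a predetermined rate."""
    values = [current]
    for _ in range(steps):
        current = transition_claim1(current, target, pct)
        values.append(current)
    return values

PCT = Fraction(1, 8)   # hypothetical "predetermined percentage"
trajectory = run_transition(Fraction(4), Fraction(12), PCT, 10)
```

Both forms give the same result at every step, and repeated application moves the current value toward the target in progressively smaller increments without overshooting it.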
CA000374573A 1980-06-04 1981-04-03 Integrated circuit phoneme-based speech synthesizer Expired CA1162650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA000432956A CA1171179A (en) 1980-06-04 1983-07-21 Integrated circuit phoneme-based speech synthesizer

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15648380A 1980-06-04 1980-06-04
US156,483 1980-06-04

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CA000432956A Division CA1171179A (en) 1980-06-04 1983-07-21 Integrated circuit phoneme-based speech synthesizer

Publications (1)

Publication Number Publication Date
CA1162650A true CA1162650A (en) 1984-02-21

Family

ID=22559768

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000374573A Expired CA1162650A (en) 1980-06-04 1981-04-03 Integrated circuit phoneme-based speech synthesizer

Country Status (8)

Country Link
JP (1) JPS5936276B2 (en)
BR (1) BR8103517A (en)
CA (1) CA1162650A (en)
DE (1) DE3114189A1 (en)
FR (1) FR2484117B1 (en)
GB (1) GB2077558B (en)
IT (1) IT1138314B (en)
MX (1) MX149882A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0085209B1 (en) * 1982-01-29 1986-07-30 International Business Machines Corporation Audio response terminal for use with data processing systems
JPS61103715A (en) * 1984-10-26 1986-05-22 Ube Ind Ltd Scale removing method for cylindrical billet

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE1524194B2 (en) * 1966-08-17 1971-03-18 Licentia Patent-Verwaltungs-Gmbh, 6000 Frankfurt ARRANGEMENT FOR INTERPOLATE A PATH CURVE
JPS5236406B2 (en) * 1972-01-17 1977-09-16
GB1449812A (en) * 1972-12-22 1976-09-15 Electronic Music Studios Londo Interpolators

Also Published As

Publication number Publication date
JPS5726900A (en) 1982-02-13
FR2484117A1 (en) 1981-12-11
IT1138314B (en) 1986-09-17
MX149882A (en) 1984-01-25
GB2077558B (en) 1985-01-09
DE3114189A1 (en) 1982-01-28
IT8121532A0 (en) 1981-05-06
GB2077558A (en) 1981-12-16
FR2484117B1 (en) 1985-07-12
BR8103517A (en) 1982-02-24
JPS5936276B2 (en) 1984-09-03

Legal Events

Date Code Title Description
MKEX Expiry