US3892919A - Speech synthesis system - Google Patents

Speech synthesis system

Info

Publication number
US3892919A
US3892919A (application US414746A)
Authority
US
United States
Prior art keywords
speech
pitch
syllable
segments
synthesized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US414746A
Inventor
Akira Ichikawa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Application granted granted Critical
Publication of US3892919A publication Critical patent/US3892919A/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/06: Elementary speech units used in speech synthesisers; Concatenation rules
    • G10L13/07: Concatenation rules

Definitions

  • Each speech segment waveform is obtained by sampling the speech sound waveform shown in FIG. 1 at 8 kHz, and each sampled value is coded into an 8-bit signal.
  • The period for which one speech segment, i.e., the wave portion within T1, T2, T3 or T4 in FIG. 1, is recorded is 10 to 20 msec; namely, the period is set equal to the maximum of the pitch periods of the speech sounds to be synthesized.
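The storage figures above (8 kHz sampling, 8-bit samples, one fixed-length slot per segment equal to the maximum pitch period of 20 ms) imply the following back-of-the-envelope sizing. The syllable count comes from the text; the segments-per-syllable figure is a hypothetical value chosen only for illustration.

```python
# Sizing sketch for the speech segment memory 32 described above:
# 8 kHz sampling, 8 bits per sample, one slot per segment long enough
# for the longest pitch period (20 ms). SEGMENTS_PER_SYLLABLE is a
# hypothetical number, not taken from the patent.

SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
MAX_PITCH_PERIOD_S = 0.020   # 20 ms, longest pitch period to synthesize

samples_per_segment = int(SAMPLE_RATE_HZ * MAX_PITCH_PERIOD_S)   # samples per slot
bytes_per_segment = samples_per_segment * BITS_PER_SAMPLE // 8   # 8-bit coding

SYLLABLES = 100              # roughly all Japanese syllables, per the text
SEGMENTS_PER_SYLLABLE = 30   # hypothetical: depends on syllable duration

total_bytes = SYLLABLES * SEGMENTS_PER_SYLLABLE * bytes_per_segment

print(samples_per_segment)   # 160
print(total_bytes)           # 480000
```

At these rates a core memory of a few hundred kilobytes would hold the whole syllable inventory, which is consistent with the patent's emphasis on a small stored-segment count.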
  • A series of code signals, each representing one syllable of the speech to be synthesized, is received at a terminal 9 and fed through an input-output control circuit 10 to a data processing circuit 11.
  • For example, code signals corresponding to the syllables YO, KO, HA and MA, constituting the name of a famous port city of Japan, are applied to the circuit 11.
  • The device that generates such code signals is not within the scope of this invention and is not shown in the figure; it is equivalent to a conventional automatic response system, designed to form data for answers to preset questions and to connect the code signals according to the arrangement of words corresponding to those answers.
  • The data processing circuit 11 interprets the code signals according to a predetermined program and generates signals instructing and controlling the operations of the respective parts of the speech synthesizing apparatus described later.
  • Judging from the series of code signals, the circuit 11 generates speech segment information, pitch information and syllable time information according to a reference table.
  • The speech segment information is, for example, the address of the first speech segment of a syllable stored in the speech segment memory 32 described above;
  • the pitch information is the information indicated by the dotted curve 8 in FIG. 5, that is, the number indicating how many samples, counted from the first one, of each stored speech segment are to be read out;
  • the syllable time information is the time information representing the intervals t1, t2, etc. in FIG. 5, that is, the number of samples to be read out within the time of one syllable.
  • The data processing circuit performing such processing may be designed especially for the present invention, but a general purpose computer can also be used as such a circuit, so the details thereof are omitted.
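The three control streams described above can be sketched as a short routine. Only the three-stream structure and the YO-KO-HA-MA example come from the text; the reference-table contents (addresses, pitch sample counts, per-step sample counts) are invented for illustration.

```python
# Hypothetical sketch of the data processing circuit's job: turn a
# series of syllable codes into three time-sequential control streams
# (segment addresses, pitch time data, syllable time data). All table
# values below are invented, not taken from the patent.

REFERENCE_TABLE = {
    # syllable -> address of its first speech segment in memory 32
    "YO": 0, "KO": 4800, "HA": 9600, "MA": 14400,
}

def control_streams(syllables, pitch_samples_per_step, samples_per_step):
    """Return the three streams fed to the buffer memories 14, 15 and 16."""
    addresses = [REFERENCE_TABLE[s] for s in syllables]
    return addresses, list(pitch_samples_per_step), list(samples_per_step)

# "YOKOHAMA", with pitch periods around 100 samples (12.5 ms at 8 kHz)
# and 400 samples (50 ms) per constant-pitch step -- hypothetical numbers.
addr, pitch, times = control_streams(
    ["YO", "KO", "HA", "MA"], [100, 96, 92, 100], [400, 400, 400, 400])
print(addr)   # [0, 4800, 9600, 14400]
```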
  • The three kinds of information are respectively stored as time-sequential signals in a syllable address buffer memory 14, a pitch time buffer memory 15 and a syllable time buffer memory 16 of a speech synthesizing apparatus 13.
  • The speech synthesizing apparatus 13 consists of a part to select the speech segments necessary to synthesize connected speech according to the speech segment information, a part to determine the pitch periods of the speech segments according to the pitch information, and a part to determine the times allotted to syllables according to the syllable time information.
  • The address data of the syllable address buffer memory 14 are transferred one by one to a segment address memory 17 in response to an external signal, and simultaneously the data in the syllable address buffer memory 14 are shifted forward to bring the address of the next syllable to the head position.
  • The memory 14 and the memory 17 may therefore be considered to form a shift register.
  • The combination of the pitch time buffer memory 15 and a pitch time memory, or of the syllable time buffer memory 16 and a syllable time memory, may likewise be considered to form a shift register.
  • The address signal of the first speech segment of a syllable, stored in the segment address memory 17, is applied to a read-out circuit 29, so that the series of sampled values constituting the segment is sequentially read out in synchronism with clock pulses from a clock signal generator 20.
  • The number of read-out samples is detected by counting the clock pulses with a pitch counter 22.
  • When the content of the pitch counter 22 coincides with the pitch time data set in the pitch time memory 18, a coincidence circuit 25 detects the instant of coincidence and delivers a coincidence pulse.
  • The coincidence pulse serves not only to reset the pitch counter 22 but also to shift a segment address counter 21 step by step.
  • The output of the shifted segment address counter 21 is applied to the segment address memory 17 to read out the next speech segment from the speech segment memory 32 in the same manner as described above; thereafter, the same operation of reading out the sampled values is repeated.
  • The coincidence pulse also resets a time counter 23 at the same time.
  • The time counter 23 also counts the clock pulses, and when the content of the time counter 23 coincides with the syllable time data (that is, the number of sampling points occurring within a time during which the pitch frequency in one syllable remains the same, as described above) set in the syllable time memory 19, a coincidence circuit 26 detects the instant of coincidence and delivers a coincidence pulse at that instant.
  • The coincidence pulse serves not only to transfer the foremost pitch time data of the pitch time buffer memory 15 to the pitch time memory 18, but also to shift a syllable counter 24 step by step.
  • When the content of the syllable counter 24 coincides with the step number stored in a step number memory, a coincidence circuit 27 detects the instant of coincidence and delivers a coincidence pulse.
  • This coincidence pulse resets the counter 24 and is also applied to the syllable address buffer memory 14 and the syllable time buffer memory 16, so that the control information for the syllable to be synthesized next, i.e., the segment address and time data for that syllable, is transferred to the memory 17 and the memory 19, respectively.
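The counter-and-coincidence machinery above can be paraphrased as a short loop. This is a software analogue of the circuit, not the circuit itself: the stored waveform data, pitch values and step lengths are toy numbers, and the loop merely mimics the roles of the pitch counter 22, time counter 23 and segment address counter 21.

```python
# Software analogue of the read-out machinery: each stored segment is
# long enough for the longest pitch period, and the pitch counter cuts
# the read-out short after `pitch` samples -- that truncation is what
# sets the synthesized pitch period. Pitch changes only between steps,
# giving the stepwise control of FIG. 5. All data here is synthetic.

def synthesize(segments, pitch_per_step, samples_per_step):
    """segments: stored segment waveforms (lists of samples).
    pitch_per_step: pitch period, in samples, for each constant-pitch step.
    samples_per_step: how many output samples each step lasts."""
    out = []
    seg_idx = 0                       # role of segment address counter 21
    for pitch, step_len in zip(pitch_per_step, samples_per_step):
        emitted = 0                   # role of time counter 23
        while emitted < step_len:
            seg = segments[seg_idx % len(segments)]
            # pitch counter 22 / coincidence circuit 25: stop after
            # `pitch` samples, padding with silence if the stored
            # segment is shorter than the requested pitch period.
            chunk = seg[:pitch] + [0] * max(0, pitch - len(seg))
            out.extend(chunk)
            emitted += pitch
            seg_idx += 1              # coincidence pulse shifts counter 21
    return out

stored = [[1, 2, 3, 4, 5, 6]] * 8     # toy 6-sample segments
voice = synthesize(stored, [4, 6], [8, 12])
print(len(voice))  # 20
```

Within each step every segment is read with the same period, so no per-segment pitch computation is needed, matching the simplification the patent claims.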
  • The step number stored in the step number memory is the number of steps occurring within the time of one syllable when the pitch frequency is changed stepwise as shown in FIG. 5.
  • In the case illustrated, the number of steps is three.
  • The number of steps need not necessarily be limited to three, but may range from one to four; that is, the pitch frequency of the synthesized speech sounds may be varied at intervals of a quarter of a syllable to a full syllable.
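The relation stated above between the step number and the pitch-change interval is simple arithmetic: with n steps per syllable, the pitch is held constant for 1/n of the syllable, so n from 1 to 4 spans "a full syllable" down to "a quarter of a syllable". The syllable duration below is hypothetical.

```python
# With n constant-pitch steps per syllable, each step lasts 1/n of the
# syllable. The 1..4 range mirrors the quarter- to full-syllable
# intervals stated in the text; 200 ms is a hypothetical duration.

def step_interval_ms(syllable_ms, n_steps):
    """Length of each constant-pitch interval within one syllable."""
    if not 1 <= n_steps <= 4:
        raise ValueError("pitch varies at quarter- to full-syllable intervals")
    return syllable_ms / n_steps

print(step_interval_ms(200, 1))  # 200.0 -- full syllable
print(step_interval_ms(200, 4))  # 50.0  -- quarter syllable
```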
  • The output signal obtained from the read-out circuit 29 as a result of the operations described above is equivalent to a signal obtained by subjecting the signal waveform shown in FIG. 4A or 4B to pulse code modulation, since the speech synthesizing circuit 13 consists of digital circuits.
  • The signal is then converted to an analog signal through a digital-to-analog converter 30, and the analog signal is finally converted to a speech sound signal, or audible voice, through an electroacoustic transducer 31.
  • The digital-to-analog converter 30 and the electroacoustic transducer 31 are connected by a transmission line, such as a telephone line, which electrically connects a remote subscriber with the central information service system.
  • The speech synthesis system shown in FIG. 6 has been described as applied to the case where speech sounds for only one channel are synthesized. It is, however, a matter of course that, since the whole system is composed of digital signal processing circuits and the speech segments are stored in a memory capable of high-speed access, such as a core memory, the system can easily be given a multichannel arrangement, as is known in the art.
  • The speech segments stored in the speech segment memory may be obtained either by directly extracting them from the natural human voice or by artificially processing the waveforms of human speech sounds.

I claim:
  • 1. A speech synthesis system comprising:
  • a speech segment memory which stores a plurality of speech segments;
  • a synthesizing apparatus coupled to said speech segment memory, including first means for selecting desired speech segments from said speech segment memory;
  • second means, coupled to said first means, for controlling the pitch period of each of said desired speech segments so as to change the pitch frequency of the synthesized speech stepwise; and
  • third means, coupled to said first means and said second means, for connecting the desired pitch-controlled speech segments together.
  • 2. A speech synthesis system according to claim 1, wherein said second means includes means for adjusting the intervals of the stepwise change between a quarter of a syllable and a full syllable.
  • 3. A speech synthesis system comprising:
  • a. a data processing circuit to convert code signals representative of the syllables of words to be synthesized into control signals for speech synthesis;
  • b. a speech segment memory to store speech segments each having a waveform of about a pitch length;
  • c. a speech synthesizing apparatus, coupled to said data processing circuit and to said speech segment memory, and including first means for selecting desired speech segments in said speech segment memory; second means, coupled to said first means, for controlling the read time of each of the selected speech segments so as to change the pitch period of the synthesized speech stepwise at intervals of a quarter of a syllable to a full syllable; and third means, coupled to said first means and to said second means, for synthesizing speech sound waveform signals by connecting the selected and pitch-controlled speech segment signals together, in response to the control signals from the data processing circuit; and
  • d. an electro-acoustic converting device, coupled to said speech segment memory and said speech synthesizing apparatus, for converting the speech sound waveform signals into corresponding speech sounds.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

In a system in which a plurality of previously recorded waveforms corresponding to phonetic elements separately picked up from natural voice and having a pitch length, are connected to form any required speech, the degradation in the quality of the synthesized speech due to the discontinuity in the waveform of the synthesized speech is prevented by so controlling the period of reading out each phonetic element as to change the period stepwise at intervals of several phonetic elements (i.e., pitch lengths).

Description

United States Patent [11] 3,892,919
Ichikawa [45] July 1, 1975

[54] SPEECH SYNTHESIS SYSTEM
[75] Inventor: Akira Ichikawa, Kokubunji, Japan
[73] Assignee: Hitachi, Ltd., Japan
[22] Filed: 1973
[21] Appl. No.: 414,746
[30] Foreign Application Priority Data: Nov. 13, 1972, Japan

[52] U.S. Cl.: 179/1 SM
[51] Int. Cl.: G10l 1/00
[58] Field of Search: 179/1 SM; 340/143, 152

Primary Examiner: Kathleen H. Claffy
Assistant Examiner: E. S. Matt Kemeny
Attorney, Agent, or Firm: Craig & Antonelli

[56] References Cited, UNITED STATES PATENTS:
3,369,077 2/1968 French 179/1.5 M
2,771,509 11/1956 Dudley 179/1.5 M

3 Claims, 7 Drawing Figures

SPEECH SYNTHESIS SYSTEM

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech synthesis system, and more particularly to a system in which a sound wave extracted from natural voice and having about a pitch length is used as a phonetic segment, or speech segment, and in which previously stored phonetic segments are selectively connected at periods controlled by control signals corresponding to a required word or sentence to be synthesized.
2. Description of the Prior Art

In recent years, information service systems which connect data processing devices, such as electronic computers, with communication lines, such as telephones, have been developed. In such a system, a remote subscriber's question sent through a communication line is received by a central signal processing device which stores a large amount of information; the device prepares an answer to the question and sends it back to the subscriber, the answer being in the form of sound like the human voice.

In this system, the most important component is the speech synthesis part, which produces the answer in the form of voice. The requirements for the speech synthesis part are as follows: the synthesized speech must be as near the human voice as possible; the production cost must be low; and the system incorporating the part must permit multiple uses, that is, the part must be able to generate a plurality of speech sounds at a time.
In a conventional speech synthesis system which is rather satisfactory from the standpoint of the above-mentioned requirements, a plurality of sound waveforms, each having a pitch length, are prepared in advance to be used as speech sound waveforms, i.e., speech segments, and the speech segments are selectively connected in accordance with control signals corresponding to the words or sentences to be synthesized.

This conventional system is rather cheap, since any desired speech can be synthesized by connecting speech segments each having a waveform of a pitch length, so that the number of stored speech segments is relatively small. The speech segments can be read out rapidly, that is, the access time is very short, so that multiple synthesis of speech is possible.

Moreover, the read-out time of a speech segment, that is, the length of the waveform of the speech segment, can be controlled, so that the pitch of the synthesized speech can also be controlled.

Although the conventional system has the several merits mentioned above, the inventor's experiments have also revealed that the speech synthesized by the conventional system suffers from hoarse noises and that its vocal quality is very poor. The cause of this drawback is as follows. In this speech synthesis system, connected speech is formed by connecting the waveforms of speech segments, and therefore a discontinuity, i.e., a rapid change in amplitude, arises at the junction between any two adjacent speech segment waveforms; such discontinuities appear every pitch period (equal to the fundamental period of speech and lying in the audible range of frequencies) and generate hoarse noises in the synthesized speech.
SUMMARY OF THE INVENTION

One object of the present invention is to improve the quality of the synthesized speech produced by a speech synthesis system in which a plurality of speech sound waveforms, each having a pitch length, are recorded to be used as speech segments and these speech segments are selectively connected to form synthesized speech.

Another object of the present invention is to provide a speech synthesis system in which a plurality of speech sound waveforms, each having a pitch length, are recorded to be used as speech segments and these speech segments are selectively connected to form synthesized speech, and in which the pitch control of the speech sounds is simplified so that the system can be economically fabricated without deterioration in the vocal quality of the synthesized speech.

According to the present invention, which has been made to attain the above objects, in a speech synthesizing system in which speech segments, each having a pitch length, are selectively connected to synthesize desired speech, the time of reading out each speech segment, that is, the wavelength of each speech segment of the synthesized speech, is changed stepwise at intervals of several speech segments. Namely, the waveforms of the speech segments read out are changed at intervals of one fifth of a syllable to a full syllable. Therefore, the system according to the present invention can produce synthesized speech softer to the ear than that produced by a conventional speech synthesis system in which the length of the waveform of every speech segment is controlled individually.
Other objects, features and advantages of the present invention will be made apparent when one reads the following part of the specification with the aid of the attached drawings.
BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is an oscillographic representation of a monosyllable speech sound waveform.

FIG. 2 shows the modes of variation in the pitch frequency of monosyllable speech sounds in various pronunciations.
FIG. 3 shows the variations in the pitch frequency of one word.
FIGS. 4A and 4B show waveforms illustrating the discontinuities resulting from the connection of separate speech segments.
FIG. 5 shows the variation in pitch frequency of the synthesized speech formed by the speech synthesizing system according to the present invention.
FIG. 6 is a block diagram of a speech synthesis system embodying the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In FIG. 1, the waveform of a monosyllable speech sound is shown in a rectangular coordinate system in which the abscissa represents the time base and the ordinate gives the amplitude of the waveform. As seen from FIG. 1, the waveform of the monosyllable speech sound consists of an irregular portion C, like that of a consonant, and a periodical portion V, like that of a vowel. In particular, every syllable of Japanese speech is composed of a single consonant followed by a single vowel, or of a single vowel alone, and about one hundred different syllables can make up all the speech sounds covering the entire vocabulary of the Japanese language. Of the portions of the waveform shown in FIG. 1, the more important is the periodical portion V, which occupies most of the monosyllable speech sound waveform and determines the pitch, intonation and tone (indicating the kind of syllable) of the speech sound.

Namely, the pitch or intonation of the speech sound depends mainly on the repetition periods T1, T2, T3, ..., i.e., the pitch period, while the tone is determined by the frequency characteristic of the periodical portion V. The pitch period is usually 10 to 20 milliseconds.
FIG. 2 shows the variation with time of the pitch frequency (defined as the reciprocal of the pitch period) of the monosyllable speech sound shown in FIG. 1. In FIG. 2, the abscissa and the ordinate respectively represent the time base and the pitch frequency. When a monosyllable speech sound is pronounced individually, it has a characteristic curve 1, convex upward, as shown in FIG. 2. However, when the same speech sound is pronounced in a word or sentence, it may assume characteristic curves 2, 3 and 4, corresponding respectively to level, rising and falling intonation, depending upon the position it assumes in the word or sentence or upon the kind of word or sentence.

Accordingly, when the connected speech sounds corresponding to a desired word or sentence are formed by connecting together the prerecorded speech segments, i.e., the speech sound waveforms obtained by dividing the waveform of the monosyllable speech sound shown in FIG. 1, pronounced in a manner corresponding to the curve 1 in FIG. 2, into units each having a pitch length, discontinuities are formed at the junction points between the unit waveforms, i.e., the speech segment waveforms, the discontinuities being the portions where the amplitude of the waveform changes rapidly.
Such discontinuities will be described in further detail. FIG. 3 shows the variation with time of the pitch frequency of the speech sound corresponding to a word, in which the abscissa and the ordinate respectively represent the time base and the pitch frequency. In FIG. 3, curve 5 indicates the mode of the variation in pitch frequency of the natural speech sound corresponding to a word to be synthesized, while curve 6 shows the mode of the variation in pitch frequency of the monosyllable speech sound corresponding to the curve 1 in FIG. 2. The abscissa is divided into pronunciation intervals t1, t2, etc. of the monosyllable sounds.

Accordingly, in order that the speech sound having a pitch frequency characteristic corresponding to the curve 5 may be composed of speech segments obtained from the natural voice having a pitch frequency characteristic corresponding to the curve 6, the length of the waveform of each speech segment, i.e., the pitch period, has to be controlled. Therefore, if the waveforms of the speech segments having pitch periods T1, T2, T3 as in FIG. 1 are connected and synthesized into connected speech having pitch periods longer or shorter than those periods, then discontinuities 7 are formed at the junctions of the respective speech segment waveforms, as shown in FIGS. 4A and 4B. FIG. 4A corresponds to the case where the synthesized speech has a pitch frequency higher, and a pitch period shorter, than that of the original natural voice from which the speech segment waveforms were obtained. FIG. 4B, on the other hand, corresponds to the case where the synthesized speech has a pitch frequency lower, and a pitch period longer, than that of the original natural voice. The discontinuities 7 thus produced deteriorate the vocal quality of the synthesized speech and also generate hoarse noises.
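The origin of the discontinuities can be seen numerically with a toy example. Nothing here comes from the patent beyond the idea that cutting a one-pitch-period waveform short makes the junction amplitude jump; the sine segment and sample counts are invented for illustration.

```python
# Numeric illustration of the discontinuities 7 in FIGS. 4A and 4B,
# using a synthetic sine wave sampled over one pitch period of 8
# samples. If the read-out is cut short of a full period, the next
# segment restarts at phase zero and the waveform jumps; reading the
# exact period leaves the junction continuous.

import math

PERIOD = 8
segment = [round(math.sin(2 * math.pi * n / PERIOD), 3) for n in range(PERIOD)]

def phase_jump(read_len):
    """Amplitude mismatch at the junction: the value the waveform would
    naturally have reached vs. the restart value segment[0]."""
    return abs(segment[read_len % PERIOD] - segment[0])

print(phase_jump(8))  # 0.0 -- full pitch period: continuous junction
print(phase_jump(6))  # 1.0 -- shortened read-out (FIG. 4A): abrupt jump
```

The lengthened-period case of FIG. 4B behaves analogously: the padded-out tail of the segment no longer matches the phase at which the next segment begins.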
In order to eliminate the influence of the discontinuities, a special treatment of the waveforms must be introduced. According to the present invention, the degradation of the vocal quality due to the discontinuities can be prevented because the manner of pitch control in the speech synthesis system is improved; moreover, a system can be realized in which the pitch control is further simplified by making the best use of the merits of a speech synthesizing system in which speech segments are connected to form synthesized speech.
Namely, as shown in FIG. 5, the pitch frequency or the pitch period of the synthesized speech is changed stepwise at intervals of a quarter of a syllable to a full syllable. It has been empirically verified that synthesized speech having a pitch frequency characteristic corresponding to the stair-like curve 8, indicated by the dotted line in FIG. 5, has a vocal quality superior to that having the pitch frequency characteristic indicated by the solid curve 5 in FIG. 5. In this case, it is needless to perform pitch control for every speech segment, and since the pitch periods of successive speech segments are all the same, the pitch control system of the speech synthesis system is simplified.
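The stepwise contour can be sketched as a quantization of the smooth target contour. In this hypothetical illustration (function and variable names are not from the patent), one pitch value is held constant within each step, as along the dotted curve 8:

```python
def staircase_pitch(contour, steps):
    """Quantize a smooth per-syllable pitch contour (solid curve 5) into a
    stepwise one (dotted curve 8): one pitch value is held per step."""
    n = len(contour)
    bounds = [round(i * n / steps) for i in range(steps + 1)]
    out = []
    for i in range(steps):
        chunk = contour[bounds[i]:bounds[i + 1]]
        level = sum(chunk) / len(chunk)   # pitch held constant within the step
        out.extend([level] * len(chunk))
    return out

# A rise-and-fall contour (Hz) over one syllable, quantized into three
# steps per syllable.
contour = [120, 130, 140, 150, 150, 140, 130, 120, 110]
stepped = staircase_pitch(contour, 3)
```

With three steps, every speech segment inside a step gets the same pitch period, which is what makes the per-segment pitch control unnecessary.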
In the following, the present invention will be described by way of a preferred embodiment.
FIG. 6 is a block diagram of a concrete structure of a speech synthesis system embodying the present invention.
First, a speech segment memory 32 is described for convenience' sake. In the memory 32, the speech sound waveforms of all the syllables necessary for the speech synthesis are stored in a high-speed memory device such as a core memory. Each syllable in the memory consists of time-sequentially arranged speech segments constituting a waveform as shown in FIG. 1, and the waveform of each speech segment has an address allotted to indicate its location in the memory. Within a monosyllable, serial numbers are allotted to the addresses of the speech segment waveforms arranged in time sequence. Therefore, the first address is used as a syllable address to represent the syllable.
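The addressing scheme can be modeled as follows. This is an assumed sketch (dictionaries stand in for the core memory; the syllables and sample values are made up): segments of a syllable occupy consecutive, serially numbered addresses, and the first address doubles as the syllable address.

```python
# address -> segment waveform (list of samples)
segment_memory = {}
# syllable -> address of its first segment (= the syllable address)
syllable_address = {}

addr = 0
for syllable, segments in [("YO", [[1, 2], [3, 4], [5, 6]]),
                           ("KO", [[7, 8], [9, 10]])]:
    syllable_address[syllable] = addr   # first address represents the syllable
    for seg in segments:
        segment_memory[addr] = seg      # serial numbering within the syllable
        addr += 1
```

Reading a syllable then amounts to starting at its syllable address and stepping through consecutive addresses.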
Each speech segment waveform is obtained by sampling the speech sound waveform shown in FIG. 1 at 8 kHz, and each sampled value is coded into an 8-bit signal. The period over which one speech segment, i.e. the wave portion within T1, T2, T3 or T4 in FIG. 1, is recorded is 10 to 20 msec; namely, the period is set equal to the maximum of the pitch periods of the speech sounds to be synthesized.
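These figures fix the segment size. A quick check of the arithmetic (the variable names are illustrative): at 8 kHz, a 10 to 20 msec recording period gives 80 to 160 samples, and at 8 bits per sample the same number of bytes.

```python
fs = 8000            # sampling rate, Hz
bits = 8             # bits per coded sample
seg_ms = (10, 20)    # recording period of one segment, msec

samples_per_segment = [fs * t // 1000 for t in seg_ms]            # 80 to 160 samples
bytes_per_segment = [n * bits // 8 for n in samples_per_segment]  # 80 to 160 bytes
```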
A series of code signals, each representing one syllable, which constitute the speech to be synthesized, are received at a terminal 9 and fed through an input-output control circuit 10 to a data processing circuit 11. For example, code signals corresponding to the syllables YO, KO, HA and MA, constituting the name of a famous port city of Japan, are applied to the circuit 11. The device generating such code signals is not within the scope of this invention and is not shown in the figure, but it is equivalent to a conventional automatic response system, being designed to form data for answers to preset questions and to connect the code signals according to the arrangement of words corresponding to those answers.
The data processing circuit 11 interprets the code signals according to a predetermined program and generates signals instructing and controlling the operations of the respective parts of the speech synthesizing apparatus described later.
The operation of the circuit 11 will be described in further detail. From the series of code signals, the circuit 11 generates speech segment information, pitch information and syllable time information according to a reference table.
The speech segment information is, for example, the address of the first speech segment of a syllable stored in the speech segment memory 32 described above; the pitch information is the information indicated by the dotted curve 8 in FIG. 5, that is, the number indicating how many samples, counted from the first one, of each speech segment stored in the memory 32 are to be read out; and the syllable time information is the time information representing t1 to t3 in FIG. 5, that is, the number of samples to be read out within the time of one syllable.
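The three kinds of information can be pictured as one control record per syllable. The following is a hypothetical sketch (the record layout and names are assumptions, not taken from the patent); it also shows the relation between the pitch information and the resulting pitch frequency at the 8 kHz sampling rate:

```python
def control_record(segment_address, pitch_time, syllable_time):
    """Bundle the three kinds of control information derived by circuit 11."""
    return {
        "segment_address": segment_address,  # first segment of the syllable in memory 32
        "pitch_time": pitch_time,            # samples read per segment = pitch period
        "syllable_time": syllable_time,      # samples within one constant-pitch step
    }

# Reading 100 samples of each segment at 8 kHz gives a pitch period of
# 12.5 msec, i.e. a pitch frequency of 80 Hz.
rec = control_record(0, 100, 800)
pitch_hz = 8000 / rec["pitch_time"]
```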
The data processing circuit performing such processing may be designed especially for the present invention, but a general-purpose computer can also be used as such a circuit, so the details thereof are omitted.
The three kinds of information are respectively stored as time-sequential signals in a syllable address buffer memory 14, a pitch time buffer memory 15 and a syllable time buffer memory 16 of a speech synthesizing apparatus 13. The speech synthesizing apparatus 13 consists of a part to select the speech segments necessary to synthesize connected speech according to the speech segment information, a part to determine the pitch periods of the speech segments according to the pitch information, and a part to determine the times allotted to the syllables according to the syllable time information.
Next, the operations of the respective components of the speech synthesizing apparatus 13 will be described.
The address data of the syllable address buffer memory 14 are transferred one by one to a segment address memory 17 in response to an external signal, and simultaneously the data in the syllable address buffer memory 14 are shifted forward to bring the address of the next syllable to the head position. Namely, the memory 14 and the memory 17 may be considered to form a shift register. Likewise, the combination of the pitch time buffer memory 15 and a pitch time memory 18, or of the syllable time buffer memory 16 and a syllable time memory 19, may be considered to form a shift register.
With the circuit arrangement described above, the address signal of the first speech segment of a syllable stored in the segment address memory 17 is applied to a read-out circuit 29, so that the series of sampled values constituting the segment is sequentially read out in synchronism with clock pulses from a clock signal generator 20. The number of read-out samples is detected by counting the clock pulses with a pitch counter 22. When the content of the pitch counter 22 coincides with the pitch time data set in the pitch time memory 18, a coincidence circuit 25 detects the instant of coincidence and delivers a coincidence pulse. The coincidence pulse serves not only to reset the pitch counter 22 but also to shift a segment address counter 21 step by step. The output of the shifted segment address counter 21 is applied to the segment address memory 17 to read out the next speech segment from the speech segment memory 32 in the same manner as described above. Thereafter, the same read-out operation on the sampled values is repeated. The coincidence pulse also resets the counter 23 at the same time.
On the other hand, the time counter 23 also counts the clock pulses, and when the content of the time counter 23 coincides with the syllable time data (that is, the number of sampling points occurring within a time during which the pitch frequency in one syllable remains the same, as described above) set in the syllable time memory 19, a coincidence circuit 26 detects the instant of coincidence and delivers a coincidence pulse.
The coincidence pulse serves not only to transfer, or shift, the foremost pitch time data of the pitch time buffer memory 15 to the pitch time memory 18, but also to shift a syllable counter 24 step by step. When the content of the syllable counter 24 coincides with the step number recorded in a step number memory, a coincidence circuit 27 detects the instant of coincidence and delivers a coincidence pulse. This coincidence pulse resets the counter 24 and is also applied to the syllable address buffer memory 14 and the syllable time buffer memory 16, so that the control information for the syllable to be synthesized next, i.e. the segment address and time data for that syllable, is transferred to the memory 17 and the memory 19, respectively. The step number stored in the step number memory is the number of steps occurring within the time of one syllable when the pitch frequency is changed stepwise as shown in FIG. 5; in the case of FIG. 5, the number of steps is three. The experiments of the inventor have revealed that it is when the number of steps is three that the deterioration of the vocal quality of the synthesized speech due to the waveform discontinuities is reduced to the minimum. However, the number of steps need not necessarily be limited to three, but may range from one to four; that is, the pitch frequency of the synthesized speech may be varied at intervals of a quarter of a syllable to a full syllable.
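The joint action of the counters and coincidence circuits can be modeled in software. The sketch below is a deliberately simplified assumption (it fixes the number of segments per pitch step and uses invented names; the real circuit is driven by clock pulses and coincidence detection): within each pitch step, every segment is read for `pitch_time` samples, a shorter `pitch_time` truncating the stored waveform and a longer one zero-padding it, which stands in for pitch counter 22 and coincidence circuit 25; stepping to the next `pitch_time` entry stands in for time counter 23 and coincidence circuit 26.

```python
def read_out(memory, addresses, pitch_times, segs_per_step):
    """Simplified software model of the counter/coincidence read-out."""
    out = []
    addr = iter(addresses)
    for pitch_time in pitch_times:      # one entry per stepwise pitch value
        for _ in range(segs_per_step):
            seg = memory[next(addr)]
            # read exactly pitch_time samples: truncate or zero-pad the segment
            out.extend((seg + [0] * pitch_time)[:pitch_time])
    return out

memory = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9], 3: [1, 1, 1]}
# Two pitch steps: first a short (high-pitch) period of 2 samples,
# then a longer (low-pitch) period of 4 samples; two segments per step.
voice = read_out(memory, [0, 1, 2, 3], [2, 4], 2)
```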
The output signal obtained from the read-out circuit 29 as a result of the operations described above is equivalent to a signal obtained by subjecting the signal waveform shown in FIG. 4A or 4B to pulse code modulation, since the speech synthesizing apparatus 13 consists of digital circuits. The signal is then converted to an analog signal through a digital-to-analog converter 30, and the analog signal is finally converted to a speech sound signal, or audible voice, through an electro-acoustic transducer 31. In this case, the digital-to-analog converter 30 and the electro-acoustic transducer 31 are connected by a transmission line, such as a telephone line, which electrically connects a remote subscriber with the central information service system.
The speech synthesis system shown in FIG. 6 has been described as applied to the case where speech sounds for only one channel are synthesized. It is, however, a matter of course that, since the whole system is composed of digital signal processing circuits and the speech segments are stored in a memory capable of high-speed access such as a core memory, the system can easily be constructed in a multichannel arrangement, as is known in the art.
Namely, such a multichannel arrangement can be realized if the input-output control circuit 10, the data processing circuit 11 and the speech segment memory 32 are used in common, and if the number of speech synthesizing apparatuses 13 is increased according to the number of channels required.
Moreover, the speech segments stored in the speech segment memory may be obtained by directly extracting them from the natural human voice or by artificially treating the waveforms of human speech sounds.

I claim:
1. A speech synthesis system comprising:
a speech segment memory which stores a plurality of speech segments;
a synthesizing apparatus, coupled to said speech segment memory, including first means for selecting desired speech segments from said speech segment memory,
second means, coupled to said first means, for controlling the pitch period of each of said desired speech segments, so as to change the pitch frequency of the synthesized speech step-wise, and
third means, coupled to said first means and said second means, for connecting the desired pitch controlled speech segments together.
2. A speech synthesis system according to claim 1, wherein said second means includes means for adjusting the intervals of the stepwise change between a quarter of a syllable and a full syllable.
3. A speech synthesis system comprising: a. a data processing circuit to convert code signals representative of the syllables of words to be synthesized into control signals for speech synthesis; b. a speech segment memory to store speech segments each having a waveform of about a pitch length; c. a speech synthesizing apparatus, coupled to said data processing circuit and to said speech segment memory, and including first means for selecting desired speech segments in said speech segment memory, second means, coupled to said first means, for controlling the read time of each of the selected speech segments, so as to change the pitch period of the synthesized speech stepwise at intervals of a quarter of a syllable to a full syllable, and
third means, coupled to said first means and to said second means, for synthesizing speech sound waveform signals by connecting the selected and pitch controlled speech segment signals together, in response to the control signals from the data processing circuit; and
d. an electro-acoustic converting device, coupled to said speech segment memory and said speech synthesizing apparatus, for converting the speech sound waveform signals into corresponding speech sounds.

US414746A 1972-11-13 1973-11-12 Speech synthesis system Expired - Lifetime US3892919A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP11299572A JPS5331323B2 (en) 1972-11-13 1972-11-13

Publications (1)

Publication Number Publication Date
US3892919A true US3892919A (en) 1975-07-01

Family

ID=14600773

Family Applications (1)

Application Number Title Priority Date Filing Date
US414746A Expired - Lifetime US3892919A (en) 1972-11-13 1973-11-12 Speech synthesis system

Country Status (2)

Country Link
US (1) US3892919A (en)
JP (1) JPS5331323B2 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS55118098A (en) * 1979-03-06 1980-09-10 Nippon Electric Co Waveform producer for answering voice

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2771509A (en) * 1953-05-25 1956-11-20 Bell Telephone Labor Inc Synthesis of speech from code signals
US3369077A (en) * 1964-06-09 1968-02-13 Ibm Pitch modification of audio waveforms
US3641496A (en) * 1969-06-23 1972-02-08 Phonplex Corp Electronic voice annunciating system having binary data converted into audio representations


Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4060848A (en) * 1970-12-28 1977-11-29 Gilbert Peter Hyatt Electronic calculator system having audio messages for operator interaction
US3998045A (en) * 1975-06-09 1976-12-21 Camin Industries Corporation Talking solid state timepiece
US4383462A (en) * 1976-04-06 1983-05-17 Nippon Gakki Seizo Kabushiki Kaisha Electronic musical instrument
US4384170A (en) * 1977-01-21 1983-05-17 Forrest S. Mozer Method and apparatus for speech synthesizing
US4214125A (en) * 1977-01-21 1980-07-22 Forrest S. Mozer Method and apparatus for speech synthesizing
US4420813A (en) * 1978-02-08 1983-12-13 Sharp Kabushiki Kaisha Operation sequence instruction by synthetic speech
WO1979000892A1 (en) * 1978-04-06 1979-11-15 Western Electric Co Voice synthesizer
DE2945413C1 (en) * 1978-04-06 1984-06-28 Western Electric Co., Inc., New York, N.Y. Method and device for synthesizing speech
US4163120A (en) * 1978-04-06 1979-07-31 Bell Telephone Laboratories, Incorporated Voice synthesizer
US4516260A (en) * 1978-04-28 1985-05-07 Texas Instruments Incorporated Electronic learning aid or game having synthesized speech
US4489433A (en) * 1978-12-11 1984-12-18 Hitachi, Ltd. Speech information transmission method and system
US4473904A (en) * 1978-12-11 1984-09-25 Hitachi, Ltd. Speech information transmission method and system
US4700393A (en) * 1979-05-07 1987-10-13 Sharp Kabushiki Kaisha Speech synthesizer with variable speed of speech
US4374302A (en) * 1980-01-21 1983-02-15 N.V. Philips' Gloeilampenfabrieken Arrangement and method for generating a speech signal
US4296279A (en) * 1980-01-31 1981-10-20 Speech Technology Corporation Speech synthesizer
EP0058130A2 (en) * 1981-02-11 1982-08-18 Eberhard Dr.-Ing. Grossmann Method for speech synthesizing with unlimited vocabulary, and arrangement for realizing the same
EP0058130A3 (en) * 1981-02-11 1982-09-08 Heinrich-Hertz-Institut Fur Nachrichtentechnik Berlin Gmbh Method for speech synthesizing with unlimited vocabulary, and arrangement for realizing the same
US4398059A (en) * 1981-03-05 1983-08-09 Texas Instruments Incorporated Speech producing system
US4443859A (en) * 1981-07-06 1984-04-17 Texas Instruments Incorporated Speech analysis circuits using an inverse lattice network
DE3246712A1 (en) * 1981-12-17 1983-06-30 Matsushita Electric Industrial Co., Ltd., Kadoma, Osaka METHOD FOR COMPOSING A VOICE ANALYSIS
US4601052A (en) * 1981-12-17 1986-07-15 Matsushita Electric Industrial Co., Ltd. Voice analysis composing method
US4470150A (en) * 1982-03-18 1984-09-04 Federal Screw Works Voice synthesizer with automatic pitch and speech rate modulation
US4802223A (en) * 1983-11-03 1989-01-31 Texas Instruments Incorporated Low data rate speech encoding employing syllable pitch patterns
US4799261A (en) * 1983-11-03 1989-01-17 Texas Instruments Incorporated Low data rate speech encoding employing syllable duration patterns
US4797930A (en) * 1983-11-03 1989-01-10 Texas Instruments Incorporated constructed syllable pitch patterns from phonological linguistic unit string data
EP0181339A1 (en) * 1984-04-10 1986-05-21 First Byte Real-time text-to-speech conversion system
EP0181339A4 (en) * 1984-04-10 1986-12-08 First Byte Real-time text-to-speech conversion system.
US5163110A (en) * 1990-08-13 1992-11-10 First Byte Pitch control in artificial speech
US5832431A (en) * 1990-09-26 1998-11-03 Severson; Frederick E. Non-looped continuous sound by random sequencing of digital sound records
US5740320A (en) * 1993-03-10 1998-04-14 Nippon Telegraph And Telephone Corporation Text-to-speech synthesis by concatenation using or modifying clustered phoneme waveforms on basis of cluster parameter centroids
US5745651A (en) * 1994-05-30 1998-04-28 Canon Kabushiki Kaisha Speech synthesis apparatus and method for causing a computer to perform speech synthesis by calculating product of parameters for a speech waveform and a read waveform generation matrix
US5715368A (en) * 1994-10-19 1998-02-03 International Business Machines Corporation Speech synthesis system and method utilizing phenome information and rhythm imformation
US5802250A (en) * 1994-11-15 1998-09-01 United Microelectronics Corporation Method to eliminate noise in repeated sound start during digital sound recording
US20020133335A1 (en) * 2001-03-13 2002-09-19 Fang-Chu Chen Methods and systems for celp-based speech coding with fine grain scalability
US6996522B2 (en) * 2001-03-13 2006-02-07 Industrial Technology Research Institute Celp-Based speech coding for fine grain scalability by altering sub-frame pitch-pulse
US20090204395A1 (en) * 2007-02-19 2009-08-13 Yumiko Kato Strained-rough-voice conversion device, voice conversion device, voice synthesis device, voice conversion method, voice synthesis method, and program
US8898062B2 (en) * 2007-02-19 2014-11-25 Panasonic Intellectual Property Corporation Of America Strained-rough-voice conversion device, voice conversion device, voice synthesis device, voice conversion method, voice synthesis method, and program
US20120072208A1 (en) * 2010-09-17 2012-03-22 Qualcomm Incorporated Determining pitch cycle energy and scaling an excitation signal
US8862465B2 (en) * 2010-09-17 2014-10-14 Qualcomm Incorporated Determining pitch cycle energy and scaling an excitation signal
US20180018957A1 (en) * 2015-03-25 2018-01-18 Yamaha Corporation Sound control device, sound control method, and sound control program
US10504502B2 (en) * 2015-03-25 2019-12-10 Yamaha Corporation Sound control device, sound control method, and sound control program

Also Published As

Publication number Publication date
JPS5331323B2 (en) 1978-09-01
JPS4971802A (en) 1974-07-11

Similar Documents

Publication Publication Date Title
US3892919A (en) Speech synthesis system
EP0140777B1 (en) Process for encoding speech and an apparatus for carrying out the process
US5113449A (en) Method and apparatus for altering voice characteristics of synthesized speech
CA1105621A (en) Voice synthesizer
GB1592473A (en) Method and apparatus for synthesis of speech
FR2523786A1 (en) Music transmission system using telephone lines - including keyboard controlled transmitter sending binary signals to electrical sound generator via telephone line
KR980700637A (en) METHOD AND DEVICE FOR ENHANCING THE RECOGNITION OF SPEECH AMONG SPEECH-IMPAIRED INDIVIDUALS
CN111246469B (en) Artificial intelligence secret communication system and communication method
US3836717A (en) Speech synthesizer responsive to a digital command input
EP1265226B1 (en) Device for generating announcement information
US4384170A (en) Method and apparatus for speech synthesizing
US3539726A (en) System for storing cochlear profiles
US5163110A (en) Pitch control in artificial speech
JPS61177496A (en) Voice synthesization module
JP2734028B2 (en) Audio recording device
Clapper Automatic word recognition
US3530248A (en) Synthesis of speech from code signals
US3536837A (en) System for uniform printing of intelligence spoken with different enunciations
KR940017622A (en) Speech processing device and method of audiotex device for converting Korean characters into speech
JP2605680B2 (en) Audio noise generation circuit
SU1408450A1 (en) Method and apparatus for synthesis of speech signals
CA1180815A (en) Digital tone generator
JPS58117598A (en) Voice synthesizer
Beddoes et al. Direct sample interpolation (DSI) speech synthesis: An interpolation technique for digital speech data compression and speech synthesis
KR970031720A (en) Voice guidance device and method of electronic exchange