EP0157903B1 - Method and device for speech synthesis (Procédé et dispositif pour la synthèse de la parole) - Google Patents

Info

Publication number
EP0157903B1
EP0157903B1 EP19840109492 EP84109492A EP0157903B1 EP 0157903 B1 EP0157903 B1 EP 0157903B1 EP 19840109492 EP19840109492 EP 19840109492 EP 84109492 A EP84109492 A EP 84109492A EP 0157903 B1 EP0157903 B1 EP 0157903B1
Authority
EP
European Patent Office
Prior art keywords
vowel
consonant
stored
uniform
vowels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
EP19840109492
Other languages
German (de)
English (en)
Other versions
EP0157903A1 (fr)
Inventor
Christian Deforeit
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Matth Hohner AG
Original Assignee
Matth Hohner AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matth Hohner AG
Publication of EP0157903A1
Application granted
Publication of EP0157903B1
Expired

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/08Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Definitions

  • the invention relates to a method for speech synthesis and an arrangement for carrying it out.
  • the object of the invention is to significantly reduce the amount of memory required, so that the method can be applied even to mass consumer goods, e.g. toys (dolls' voices and the like).
  • Claim 1 defines the method according to the invention. Each consonant is stored only in combination with a single unit vowel, referred to here and below as "&", which sounds approximately like the "e" in the German word "alle", the English word "the", or the French word "le". It has been shown that, when the two are read out with an appropriate time offset, the & is masked by the actual vowel then reproduced, either completely or at least to such an extent that the resulting syllable comes close to natural pronunciation.
  • & unit vowel
  • Claim 5 defines an arrangement according to the invention with which speech synthesis can be carried out.
  • Fig. 1: the upper diagram shows the envelope curve of the consonant channel and the lower one that of the vowel channel, with the simple word "DATO" chosen as an example. The & parts of the consonant channel and the vowels are reproduced at the same time, and this alone strongly masks the weak & sound. The masking can be reinforced by further measures.
  • Fig. 2 shows a first such measure. The frequency spectra of consonants and vowels are known to differ; for a male voice, for example, the maxima of the consonants lie in the range of approximately 600 to 3000 Hz and those of the vowels in the range of approximately 200 to 1000 Hz. Accordingly, filters with the passbands shown in Fig. 2 are assigned to the two channels; the filtering can be done either during recording or during playback.
  • Fig. 3 shows schematically the format for the storage.
  • the sounds are digitized, that is, amplitude-sampled at a clock rate of, e.g., 10 kHz or more, and the data thus obtained are stored in successive memory locations for serial readout.
  • two memory locations are reserved for command data, namely a "continue" command and an "end" command.
  • the "continue" command marks the point in time at which the other channel is to continue reading; in the consonant data this command is located where the actual consonant sound changes to the & part, while in the vowel data it is located close to the end of the data string.
  • the "end" command is self-explanatory, but it is necessary because the individual phonemes have different durations.
  • the "continue" command can also be used to increase the masking effect: on its occurrence, the envelope of the channel just read is damped, as indicated in Fig. 4, for which a conventional analog attenuator consisting of a diode, a resistor and a capacitor can be used.
  • Fig. 5 shows in block form an embodiment of a speech synthesizer which - as can be seen - is extremely simple.
  • the phonemes to be reproduced are selected by external means, for example a microprocessor, which do not form part of the subject matter of the present invention; an external control circuit is therefore only indicated here as block 1.
  • the arrangement comprises two mutually identical channels, only one of which is described below.
  • a memory address counter 2 is set by the control circuit 1 to a specific phoneme start address.
  • a phoneme memory 3 contains all the phonemes required for a given language, with thirty-six phonemes being sufficient for many languages. After filtering during recording (as explained above), the phonemes are digitized and stored in the format shown in Fig. 3; for example, the codes "0" and "1" can be reserved for the commands "continue" and "end", respectively.
  • a clock generator 5 generates the read clock of, e.g., 10 kHz for both channels. The data read out go to a decoder 4, which determines whether they are data or one of the commands "continue" or "end". Data pass through a digital-to-analog converter 6 and a multiplier 7 to a summing element 8 and from there to an amplifier-loudspeaker unit 9.
  • on the "continue" command, a phoneme request flip-flop 11 is set for the other channel; it is reset by the external control circuit when the next start address is entered. Furthermore, the "continue" command switches a damping flip-flop 12, which connects an envelope generator 13 to one channel via its output F and to the other channel via its complementary output F̄; the envelope generator acts on the multiplier 7, so that the output of the channel in question is damped gently, without producing a "click" noise.
  • the outputs of both channels are combined in the summing element 8.
  • the current state of the flip-flop 12 is also transmitted to the external control circuit in order to signal to the latter which of the two channels can be occupied, for example at the beginning of a readout cycle after the circuit has been started up.
  • the memory expenditure can be halved if only one phoneme memory 3 is provided for both channels and the reading is done in time division multiplex.
  • the multiplier 7 is already included in certain commercially available digital-to-analog converters, so that the output of the envelope generator 13 need only be connected to the corresponding input of the converter.
  • the circuit can also be largely implemented in a microprocessor, in which case either the two envelope generators and the two converters remain outside or only a single, common converter, while all other processes are carried out digitally by the microprocessor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)

Claims (8)

1. Method for speech synthesis, in which combinations each formed of a consonant and a vowel following the consonant are stored and read out when required, characterized in that
- each consonant occurring in speech is stored in association with a weak unit vowel "&",
- all vowels existing in the language are stored individually,
- the desired consonant-vowel combination is formed by reading out a consonant and a vowel in two channels, offset in time, with the unit vowel "&" being masked by the vowel read out.
2. Method according to claim 1, characterized in that, when the combinations formed of a consonant and a unit vowel are stored, frequencies typical of vowels are attenuated and, when the vowels are stored, frequencies typical of consonants are attenuated.
3. Method according to claim 1 or 2, characterized in that the amplitudes of the unit vowels "&" are attenuated during readout.
4. Method according to claim 1, 2 or 3, characterized in that the readout is performed with a variable clock frequency.
5. Device for carrying out speech synthesis, consisting of a device for storing consonants and vowels and a readout circuit for reading out combinations each formed of a consonant and a vowel following that consonant, characterized in that
- each consonant existing in the language is stored in the storage device in association with a unit vowel "&",
- all vowels existing in the language are stored individually in the storage device,
- the readout circuit consists of two readout channels that can be activated alternately and a switching circuit that can be controlled by a command read in one channel and serves to activate the other channel, and the desired consonant-vowel combination is read out from the two channels with a time offset, and
- for each channel an envelope generator is provided which can be activated by a switching command and which effects an amplitude attenuation in order to mask the unit vowel "&" by means of the vowel read out.
6. Device according to claim 5, characterized in that each channel comprises a memory for all the required sounds (phonemes and diphonemes).
7. Device according to claim 5 or 6, in which the sounds (phonemes and diphonemes) are stored digitally in each memory and can be read out sequentially, characterized in that, at least for the diphonemes, the switching command is stored in the readout sequence between the consonant interval and the unit vowel interval.
8. Device according to claim 5, in which the sounds (phonemes and diphonemes) are stored digitally in each memory and can be read out sequentially, characterized in that an "end" command can be read at the end of each readout sequence, by means of which a subsequent readout operation can be started.
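
The memory saving that motivates claim 1 can be illustrated with a simple count of stored strings. The symbols C and V and the numbers in the example are assumptions introduced here for illustration; the patent text gives no such figures, and the count ignores the differing lengths of the individual strings.

```latex
% Requires amsmath for \text.
% C = number of consonants, V = number of vowels (symbols assumed here).
% Storing every consonant-vowel combination, as in the preamble of claim 1,
% versus storing each consonant once with "&" plus each vowel individually:
\[
  N_{\text{combinations}} = C \cdot V
  \qquad\text{versus}\qquad
  N_{\text{claim 1}} = C + V
\]
% Hypothetical example with C = 20 and V = 10:
\[
  20 \cdot 10 = 200
  \qquad\text{versus}\qquad
  20 + 10 = 30
\]
```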
EP19840109492 1984-02-23 1984-08-09 Procédé et dispositif pour la synthèse de la parole Expired EP0157903B1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE19843406540 DE3406540C1 (de) 1984-02-23 1984-02-23 Verfahren und Anordnung fuer die Sprachsynthese
DE3406540 1984-02-23

Publications (2)

Publication Number Publication Date
EP0157903A1 (fr) 1985-10-16
EP0157903B1 (fr) 1988-01-13

Family

ID=6228601

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19840109492 Expired EP0157903B1 (fr) 1984-02-23 1984-08-09 Procédé et dispositif pour la synthèse de la parole

Country Status (3)

Country Link
EP (1) EP0157903B1 (fr)
JP (1) JPS60211499A (fr)
DE (1) DE3406540C1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4111781A1 (de) * 1991-04-11 1992-10-22 Ibm Computersystem zur spracherkennung
IT1263756B (it) * 1993-01-15 1996-08-29 Alcatel Italia Metodo automatico per implementazione di curve intonative su messaggi vocali codificati con tecniche che permettono l'assegnazione del pitch
CN105895076B (zh) * 2015-01-26 2019-11-15 科大讯飞股份有限公司 一种语音合成方法及系统

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4658424A (en) * 1981-03-05 1987-04-14 Texas Instruments Incorporated Speech synthesis integrated circuit device having variable frame rate capability
US4597318A (en) * 1983-01-18 1986-07-01 Matsushita Electric Industrial Co., Ltd. Wave generating method and apparatus using same

Also Published As

Publication number Publication date
EP0157903A1 (fr) 1985-10-16
JPS60211499A (ja) 1985-10-23
DE3406540C1 (de) 1985-09-05

Similar Documents

Publication Publication Date Title
DE2740520A1 (de) Verfahren und anordnung zur synthese von sprache
EP1184839A2 (fr) Conversion graphème-phonème
DE3046338A1 (de) Elektronische uhr mit aufzeichnungsfunktion
DE3036680A1 (de) Sprachsynthesizer mit dehnbarer und komprimierbarer sprachzeit
DE2050512B2 (de) Vorrichtung zur Ableitung von Sprachparametern und zur Erzeugung synthetischer Sprache
DE2753707A1 (de) Einrichtung zur erkennung des auftretens eines kommandowortes aus einer eingangssprache
DE2343158A1 (de) Frequenzbandumsetzer
DE2850286A1 (de) Elektronische schlagwerksuhr
DE3228756A1 (de) Verfahren und vorrichtung zur zeitabhaengigen komprimierung und synthese von stimmlosen hoerbaren signalen
DE1965480A1 (de) Geraet fuer kuenstliche Erzeugung von Worten durch Umwandlung eines in Buchstaben gedruckten Textes in Aussprache
EP0157903B1 (fr) Procédé et dispositif pour la synthèse de la parole
DE1811040C3 (de) Anordnung zum Synthetisieren von Sprachsignalen
DE69232964T2 (de) Informationsansageeinrichtung
DE1937464B2 (de) Sprachanalysiergeraet
DE2854401C2 (de) Anrufbeantworter
AT391035B (de) System zur spracherkennung
DE3232835C2 (fr)
DE4441906C2 (de) Anordnung und Verfahren für Sprachsynthese
DE3928664C2 (de) Videorekorder mit Spracheingabe/-ausgabefunktion
DE2657430A1 (de) Einrichtung zum synthetisieren der menschlichen sprache
DE2335818A1 (de) Anordnung zur erzeugung von antwortstimmen bzw. -sprache
AT311077B (de) Einrichtung zur Synthetisierung von Audio-Informationen
EP0094681B1 (fr) Dispositif pour la synthèse électronique de la parole
DE2016572A1 (de) Verfahren und Einrichtung zur Sprachsynthese
DE1762336C (de) Schaltungsanordnung zur Sprachanalyse und Sprachsynthese nach Art eines Vocoders

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Designated state(s): FR GB IT NL

17P Request for examination filed

Effective date: 19851102

17Q First examination report despatched

Effective date: 19870202

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): FR GB IT NL

ET Fr: translation filed
ITF It: translation for a ep patent filed

Owner name: STUDIO TORTA SOCIETA' SEMPLICE

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Effective date: 19890809

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Effective date: 19900301

GBPC Gb: european patent ceased through non-payment of renewal fee
NLV4 Nl: lapsed or annulled due to non-payment of the annual fee
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Effective date: 19900427

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST