EP0157903B1 - Method and device for speech synthesis - Google Patents
- Publication number
- EP0157903B1
- Authority
- EP
- European Patent Office
- Prior art keywords
- vowel
- consonant
- stored
- uniform
- vowels
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Definitions
- The invention relates to a method for speech synthesis and to an arrangement for carrying it out.
- The object of the invention is to reduce the memory requirement significantly, so that the method can be applied even to mass consumer goods, e.g. toys (dolls' voices and the like).
- Claim 1 defines the method according to the invention. Each consonant is stored only in combination with a single unit vowel, referred to here and below as "&", which sounds approximately like the "e" in the German word "alle", the English word "the", or the French word "le". It has been found that, when reading out with an appropriate time delay, the & is masked completely, or at least to such an extent, by the actual vowel then reproduced that the resulting syllable comes close to natural pronunciation.
- & unit vowel
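The memory saving can be illustrated by simple counting. The following is a minimal sketch, not taken from the patent: it compares a hypothetical scheme that stores every consonant-vowel syllable separately with the scheme described above, where each consonant is stored once with the unit vowel & and each vowel is stored once. The function name `stored_units` and the inventory sizes (20 consonants, 8 vowels) are assumptions chosen only for illustration.

```python
def stored_units(num_consonants: int, num_vowels: int) -> dict:
    """Count the recordings needed under two storage schemes."""
    # Hypothetical baseline: one recording per consonant-vowel syllable.
    full_syllable_table = num_consonants * num_vowels
    # Scheme from the text: each consonant once (with "&"), each vowel once.
    unit_vowel_scheme = num_consonants + num_vowels
    return {
        "full_syllable_table": full_syllable_table,
        "unit_vowel_scheme": unit_vowel_scheme,
    }


# Assumed inventory sizes, for illustration only.
counts = stored_units(num_consonants=20, num_vowels=8)
print(counts)
```

With these assumed sizes the unit-vowel scheme needs 28 recordings instead of 160; the actual saving depends on the language and on the baseline it is compared with.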
- Claim 5 defines an arrangement according to the invention with which speech synthesis can be carried out.
- In Fig. 1, the upper diagram shows the envelope curve of the consonant channel and the lower one that of the vowel channel, with the simple word "DATO" chosen as an example. The & parts of the consonant channel and the vowels are reproduced at the same time, and this alone strongly masks the weak & sound. The masking can be reinforced by further measures.
- Fig. 2 shows a first such measure. It is known that the frequency spectra of consonants and vowels differ; for a male voice, for example, the maxima of the consonants lie in the range of approximately 600 to 3000 Hz and those of the vowels in the range of approximately 200 to 1000 Hz. Accordingly, filters with the passbands shown in Fig. 2 are assigned to the two channels; the filtering can be done either during recording or during playback.
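The effect of the two passbands can be sketched numerically. The band edges below come from the text; everything else is an assumption, since the patent only shows passbands and does not specify a filter type. Here each band-pass is modelled as a first-order high-pass cascaded with a first-order low-pass, and `bandpass_gain` is a hypothetical helper name.

```python
import math

# Band edges from the text (male voice, approximate values).
CONSONANT_BAND_HZ = (600.0, 3000.0)
VOWEL_BAND_HZ = (200.0, 1000.0)


def bandpass_gain(freq_hz: float, band: tuple) -> float:
    """Magnitude response of an assumed first-order high-pass at band[0]
    cascaded with a first-order low-pass at band[1]."""
    low_edge, high_edge = band
    ratio_hp = freq_hz / low_edge
    high_pass = ratio_hp / math.sqrt(1.0 + ratio_hp ** 2)
    low_pass = 1.0 / math.sqrt(1.0 + (freq_hz / high_edge) ** 2)
    return high_pass * low_pass


# Compare how strongly each channel passes a few test frequencies.
for f in (100.0, 450.0, 2000.0, 5000.0):
    print(f, round(bandpass_gain(f, CONSONANT_BAND_HZ), 3),
          round(bandpass_gain(f, VOWEL_BAND_HZ), 3))
```

Because the two bands overlap between roughly 600 and 1000 Hz, the filters separate the channels only partially; a steeper filter would sharpen the separation, but that choice is outside what the text states.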
- Fig. 3 shows schematically the format for the storage.
- The sounds are digitized, i.e. amplitude-sampled at a clock rate of, e.g., 10 kHz or more, and the data thus obtained are stored in successive memory locations for serial readout.
- Two memory locations are kept free for command data, namely a "continue" command and an "end" command.
- The "continue" command marks the point in time at which the other channel is to continue the readout; in the consonant data it sits where the actual consonant sound changes to the & part, while in the vowel data it sits close to the end of the data string.
- The "end" command is self-explanatory, but it is necessary because the individual phonemes have different durations.
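The storage format described above can be sketched as follows. This is an illustrative model under stated assumptions: samples are taken to be 8-bit values, the codes 0 and 1 are reserved for the commands (the description later mentions these codes as an example, but which code means which command is assumed here), and ordinary samples are clamped so they never collide with a command code. The names `pack_phoneme` and `decode` are hypothetical.

```python
CONTINUE, END = 0, 1            # reserved command codes (assumed mapping)
MIN_SAMPLE, MAX_SAMPLE = 2, 255  # assumption: 8-bit samples, clear of the codes


def clamp(sample: int) -> int:
    """Keep a sample inside the range that cannot be mistaken for a command."""
    return max(MIN_SAMPLE, min(MAX_SAMPLE, sample))


def pack_phoneme(samples, continue_at: int) -> list:
    """Build the serial memory image for one phoneme: the samples, with a
    "continue" command inserted before index continue_at and an "end"
    command appended."""
    data = [clamp(s) for s in samples]
    return data[:continue_at] + [CONTINUE] + data[continue_at:] + [END]


def decode(word: int) -> tuple:
    """Classify one word read from memory as a command or as sample data."""
    if word == CONTINUE:
        return ("command", "continue")
    if word == END:
        return ("command", "end")
    return ("data", word)


image = pack_phoneme([128, 200, 0, 90, 130], continue_at=3)
print(image)
print([decode(w) for w in image])
```

Note that the raw sample 0 is stored as 2, so that only the inserted command words carry the reserved codes.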
- The "continue" command can also be used to increase the masking effect: when it occurs, the envelope of the channel just read is damped, as indicated in Fig. 4. A conventional analog attenuator consisting of a diode, a resistor and a capacitor can be used for this.
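The damping triggered by the "continue" command can be modelled digitally. This is a sketch under assumptions, not the analog attenuator itself: the discharge of a resistor-capacitor stage is approximated by an exponential decay, and the time constant of 5 ms and the 10 kHz sample rate are illustrative values. The function name `damped_envelope` is hypothetical.

```python
import math


def damped_envelope(num_samples: int, continue_index: int,
                    sample_rate_hz: float = 10_000.0,
                    tau_s: float = 0.005) -> list:
    """Return one gain value per sample: 1.0 up to and including the
    sample at continue_index, then an exponential decay with time
    constant tau_s (assumed RC-like behaviour)."""
    decay_per_sample = math.exp(-1.0 / (sample_rate_hz * tau_s))
    gain = 1.0
    gains = []
    for n in range(num_samples):
        if n > continue_index:
            gain *= decay_per_sample
        gains.append(gain)
    return gains


gains = damped_envelope(num_samples=200, continue_index=50)
print(gains[50], gains[51], gains[100])
```

Multiplying the channel's samples by these gain values lowers its level smoothly instead of cutting it off abruptly.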
- Fig. 5 shows in block form an embodiment of a speech synthesizer which - as can be seen - is extremely simple.
- The phonemes to be reproduced are selected by external means, for example a microprocessor, which do not form part of the subject matter of the present invention; an external control circuit is therefore only indicated here as block 1.
- The arrangement comprises two identical channels, only one of which is described below.
- A memory address counter 2 is set by the control circuit 1 to a specific phoneme start address.
- A phoneme memory 3 contains all the phonemes required for a given language; thirty-six phonemes are sufficient for many languages. After filtering during recording (as explained above), the phonemes are digitized and stored in the format shown in Fig. 3; for example, the codes "0" and "1" can be reserved for the commands "continue" and "end".
- A clock generator 5 generates the read clock of, e.g., 10 kHz for both channels. The data read out go to a decoder 4, which determines whether each item is data or one of the commands "continue" or "end". Data pass through a digital-to-analog converter 6 and a multiplier 7 to a summing element 8 and from there to an amplifier-loudspeaker unit 9.
- On the "continue" command, a phoneme request flip-flop 11 is set for the other channel; it is reset by the external control circuit when the next start address is entered. The "continue" command also switches a damping flip-flop 12 which, with its output F for one channel and its complementary output F̄ for the other, activates an envelope generator 13 acting on the multiplier 7, so that the output of the channel in question is damped gently, without a "click" noise.
- The outputs of both channels are combined in the summing element 8.
- The current state of the flip-flop 12 is also transmitted to the external control circuit to signal which of the two channels can be occupied, for example at the beginning of a readout cycle after the circuit has been started up.
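The interplay of the two channels can be sketched as a small simulation. This is an illustrative model, not the patented circuit: the class `Channel`, the function `synthesize`, the per-sample `decay` factor, and the choice to emit no output sample on ticks where only command words are read are all assumptions. The command codes 0 and 1 follow the example in the text, but which code means which command is assumed. Each phoneme image is assumed to contain one "continue" and to finish with "end".

```python
CONTINUE, END = 0, 1  # reserved command codes (assumed mapping)


class Channel:
    """One readout channel: address counter plus damping state."""

    def __init__(self):
        self.image = None    # phoneme data being read, or None when idle
        self.address = 0     # memory address counter (block 2)
        self.damped = False  # becomes True once "continue" has been read
        self.gain = 1.0      # envelope value fed to the multiplier (block 7)

    def load(self, image):
        self.image, self.address = image, 0
        self.damped, self.gain = False, 1.0


def synthesize(phoneme_images, decay=0.5):
    """Read the phonemes alternately on two channels and sum the outputs."""
    channels = [Channel(), Channel()]
    queue = list(phoneme_images)
    request = [False, False]  # phoneme request flags (flip-flop 11)
    if queue:
        channels[0].load(queue.pop(0))
    output = []
    while any(ch.image is not None for ch in channels):
        total, got_sample = 0.0, False
        for i, ch in enumerate(channels):
            if ch.image is None:
                continue
            word = ch.image[ch.address]
            ch.address += 1
            if word == CONTINUE:
                ch.damped = True        # start damping this channel
                request[1 - i] = True   # ask for a phoneme on the other one
            elif word == END:
                ch.image = None         # channel becomes free
            else:
                if ch.damped:
                    ch.gain *= decay
                total += word * ch.gain  # multiplier, then summing element
                got_sample = True
        if got_sample:
            output.append(total)
        # Serve pending requests once the requested channel is free.
        for i, ch in enumerate(channels):
            if request[i] and ch.image is None:
                request[i] = False
                if queue:
                    ch.load(queue.pop(0))
    return output


consonant = [10, 10, CONTINUE, 4, 4, END]   # consonant, then its "&" part
vowel = [20, 20, 20, CONTINUE, 20, END]
result = synthesize([consonant, vowel], decay=0.5)
print(result)
```

In this toy run the damped & samples of the consonant channel overlap with the first vowel samples, which is the simultaneous reproduction that the description relies on for masking.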
- The memory expenditure can be halved if only one phoneme memory 3 is provided for both channels and the readout is time-division multiplexed.
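Time-division multiplexed readout from a single memory can be sketched as below. The model is an assumption made for illustration: the two channels keep separate address counters and take turns on alternate clock ticks, so the shared memory is read twice per output sample period. The function name `multiplexed_read` and the memory contents are hypothetical.

```python
def multiplexed_read(memory, start_a: int, start_b: int, count: int):
    """Read `count` words for each of two channels from one shared memory,
    alternating between the channels on successive ticks."""
    address = [start_a, start_b]  # one address counter per channel
    words = ([], [])
    for tick in range(2 * count):
        channel = tick % 2        # even ticks: channel A, odd ticks: channel B
        words[channel].append(memory[address[channel]])
        address[channel] += 1
    return words


# Assumed memory contents, for illustration only.
shared_memory = list(range(100, 120))
channel_a, channel_b = multiplexed_read(shared_memory, start_a=0, start_b=10, count=3)
print(channel_a, channel_b)
```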
- The multiplier 7 is already included in certain commercially available digital-to-analog converters, so that the output of the envelope generator 13 need only be connected to the corresponding input of the converter.
- The circuit can also be largely implemented in a microprocessor, in which case either the two envelope generators and the two converters, or only a single common converter, remain outside, while all other processes are carried out digitally by the microprocessor.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrophonic Musical Instruments (AREA)
Claims (8)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19843406540, DE3406540C1 (de) | 1984-02-23 | 1984-02-23 | Method and arrangement for speech synthesis |
DE3406540 | 1984-02-23 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0157903A1 (fr) | 1985-10-16 |
EP0157903B1 (fr) | 1988-01-13 |
Family
ID=6228601
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19840109492 (Expired), EP0157903B1 (fr) | 1984-02-23 | 1984-08-09 | Method and device for speech synthesis |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP0157903B1 (fr) |
JP (1) | JPS60211499A (fr) |
DE (1) | DE3406540C1 (fr) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE4111781A1 (de) * | 1991-04-11 | 1992-10-22 | Ibm | Computer system for speech recognition |
IT1263756B (it) * | 1993-01-15 | 1996-08-29 | Alcatel Italia | Automatic method for implementing intonation curves on voice messages coded with techniques that allow pitch assignment |
CN105895076B (zh) * | 2015-01-26 | 2019-11-15 | 科大讯飞股份有限公司 | Speech synthesis method and system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4658424A (en) * | 1981-03-05 | 1987-04-14 | Texas Instruments Incorporated | Speech synthesis integrated circuit device having variable frame rate capability |
US4597318A (en) * | 1983-01-18 | 1986-07-01 | Matsushita Electric Industrial Co., Ltd. | Wave generating method and apparatus using same |
1984
- 1984-02-23 DE DE19843406540 patent/DE3406540C1/de not_active Expired
- 1984-08-09 EP EP19840109492 patent/EP0157903B1/fr not_active Expired
1985
- 1985-02-22 JP JP3425585A patent/JPS60211499A/ja active Pending
Also Published As
Publication number | Publication date |
---|---|
EP0157903A1 (fr) | 1985-10-16 |
JPS60211499A (ja) | 1985-10-23 |
DE3406540C1 (de) | 1985-09-05 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Designated state(s): FR GB IT NL |
17P | Request for examination filed | Effective date: 19851102 |
17Q | First examination report despatched | Effective date: 19870202 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): FR GB IT NL |
ET | Fr: translation filed | |
ITF | It: translation for a ep patent filed | Owner name: STUDIO TORTA SOCIETA' SEMPLICE |
GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) | |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Effective date: 19890809 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: NL; Effective date: 19900301 |
GBPC | Gb: european patent ceased through non-payment of renewal fee | |
NLV4 | Nl: lapsed or annulled due to non-payment of the annual fee | |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR; Effective date: 19900427 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST |