EP1184838B1 - Phonetic transcription for speech synthesis - Google Patents

Phonetic transcription for speech synthesis

Info

Publication number
EP1184838B1
EP1184838B1 EP01113053A EP01113053A EP1184838B1 EP 1184838 B1 EP1184838 B1 EP 1184838B1 EP 01113053 A EP01113053 A EP 01113053A EP 01113053 A EP01113053 A EP 01113053A EP 1184838 B1 EP1184838 B1 EP 1184838B1
Authority
EP
European Patent Office
Prior art keywords
found
subword
phonetic transcription
database
phonetic
Legal status
Expired - Lifetime
Application number
EP01113053A
Other languages
German (de)
English (en)
Other versions
EP1184838A3 (fr)
EP1184838A2 (fr)
Inventor
Horst-Udo Hain
Current Assignee
Siemens AG
Original Assignee
Siemens AG
Priority date
2000-08-31
Filing date
2001-05-28
Publication date
2005-08-31
Application filed by Siemens AG
Publication of EP1184838A2
Publication of EP1184838A3
Application granted
Publication of EP1184838B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/08: Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

Definitions

  • The invention relates to a method, an arrangement and a computer program product for speech synthesis by means of grapheme-phoneme conversion.
  • Speech processing methods are known, for example, from US 6,029,135, US 5,732,388, DE 19636739 C1 and DE 19719381 C1.
  • Text stored in non-spoken form can be output as speech by means of speech synthesis. To do this, the individual words of the text are usually looked up in a database that contains the phonetic transcriptions of numerous words. The phonetic transcriptions of the words found in the database are put together and can be output as speech.
  • The invention is based on the object of improving speech synthesis by relying, in an alternative way, to a greater extent on phonetic transcriptions of words recorded in a database, so that OOV (out-of-vocabulary) treatments have to be used only to a lesser extent.
  • With the method, the arrangement or the computer program product, it is possible to use the phonetic transcriptions of the subwords of a given word even if the given word cannot be assembled completely from subwords contained in the database.
  • The essential idea is that, for the first time, a hybrid approach is used in which both the phonetic transcription of complete subwords and an OOV treatment are applied to the same given word.
  • For a given word, a database that contains the phonetic transcriptions of words is searched for subwords of the given word. At least one subword of the given word is found in the database, and the phonetic transcription recorded for it in the database is selected. The database is then searched for further subwords of the given word. At least one further subword of the given word is found in the database, and the phonetic transcription recorded for it in the database is selected. A further component of the given word is located between the found subword and the further found subword. An OOV treatment is carried out for the phonetic transcription of this further component, depending on the phonetic transcription of the found subword and on the phonetic transcription of the further found subword. Finally, the phonetic transcription of the found subword, the phonetic transcription of the further found subword and the phonetic transcription of the further component are put together.
  • The search for subwords in the database can be optimized by various measures. For example, only subwords that have a given minimum length may be searched for. In practice, a minimum length of 5 letters has proven suitable, while under other conditions, for example for another language, minimum lengths of 3, 4 or 6 letters can also be useful.
  • The search result is improved if, when searching for a word part of the given word, the search is not aborted immediately after the first matching subword is found, but continues to look for other possible subwords. This can be done, for example, by appending letters to the word part. As a rule, this procedure gives the best result if, of several subwords found, the longest one is selected. However, a shorter subword can also be chosen if this shorter subword, combined with a second subword that is found in the database and contained in the given word, represents a larger part of the given word than the longer subword does on its own, in the case where the longer subword cannot be combined with the second subword.
  • The OOV treatment for the phonetic transcription of the further component can be performed by means of a neural network.
  • The OOV treatment can also be performed by means of a second database that contains the phonetic transcriptions of filler particles commonly used in compound words.
  • These include, in particular, genitive endings, which in compound words are appended to the preceding word.
  • In step S1, a database containing phonetic transcriptions of words is searched for subwords of the given word. Because the minimum length is set to five letters, the search starts with the word "Train". In a German-language database, this word is not found. If the database also contains English words, the first subword of the given word has already been found at this point. Preferably, however, the search is continued not only in the first case but also in the second. This is done by searching for the word "Traini". This letter combination is not found in the database. The same applies to the letter combination "Trainin".
  • In step S3, the phonetic transcription recorded in the database is selected for the found subword "Training".
  • In step S4, it is determined that the given word "Trainingslager" contains, besides the found subword "Training", a further component "slager" that is not recorded in the database.
  • This further component "slager" is then phonetically transcribed in step S5 by means of an OOV treatment.
  • The OOV treatment is preferably based on converting the individual graphemes of the further component "slager" into phonemes by means of a neural network.
  • The phonemes are selected and combined by the neural network such that the best possible speech synthesis results for the further component taken on its own.
  • The OOV treatment for the phonetic transcription of the further component "slager" is carried out depending on the phonetic transcription, selected from the database, of the found subword "Training".
  • In the chosen example, the found subword "Training", or rather its phonetic transcription, reliably defines the left phonetic context of the further component "slager". The neural network used for the OOV treatment of the further component "slager" can therefore start from a reliable result for the syllables of the given word that precede the further component, and deliver a correspondingly reliable result for the phonetic transcription of the further component.
  • In step S6 of the method for speech synthesis, the phonetic transcription of the found subword "Training" and the phonetic transcription of the further component "slager" are finally put together.
  • The OOV treatment can also be performed by a search in a further database that contains the phonetic transcriptions of filler particles commonly used in compound words.
  • The genitive "s" of the present example is such a commonly used filler particle. It would therefore be found in the second database, and the associated phonetic transcription could be selected.
  • The OOV treatment can also use rule-based methods and DTW methods.
  • With the DTW method, a better phonetic transcription of the further component is to be expected when the OOV treatment for the phonetic transcription of the further component takes into account the phonetic transcriptions of several or all of the subwords found. This is of course particularly the case if the further component is located in the word between two found subwords.
  • The arrangement according to the invention can be realized in the form of a computer system that is programmed to carry out a corresponding method.
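
The hybrid procedure described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy lexicon, its placeholder transcriptions, the filler table and all function names (`longest_subword`, `naive_oov`, `transcribe`) are assumptions made for illustration. In particular, `naive_oov` merely spells out letters and stands in for the neural network, rule-based or DTW treatment, which the description does not specify in detail.

```python
from typing import Callable, Optional

# Toy lexicon: spelling -> phonetic transcription. The transcriptions are
# illustrative placeholders, not taken from the patent or a real dictionary.
LEXICON = {"training": "t r E: n I N", "lager": "l a: g 6"}

# Hypothetical second database of filler particles used in compound words.
FILLERS = {"s": "s"}

MIN_LEN = 5  # minimum subword length named in the description


def longest_subword(word: str, start: int, lexicon: dict,
                    min_len: int = MIN_LEN) -> Optional[str]:
    """Grow the candidate letter by letter and keep the longest lexicon hit."""
    best = None
    for end in range(start + min_len, len(word) + 1):
        candidate = word[start:end]
        if candidate in lexicon:
            best = candidate  # do not stop at the first hit
    return best


def naive_oov(component: str, left: Optional[str], right: Optional[str]) -> str:
    """Stand-in OOV treatment (assumption): one symbol per letter.

    A real system would use the left/right phonetic context here."""
    return " ".join(component)


def transcribe(word: str, lexicon: dict = LEXICON, fillers: dict = FILLERS,
               oov: Callable = naive_oov) -> str:
    word = word.lower()
    segments = []  # (text, transcription) pairs; None marks an OOV component
    gap, i = "", 0
    while i < len(word):
        hit = longest_subword(word, i, lexicon)
        if hit is None:
            gap += word[i]  # letter belongs to a further component
            i += 1
            continue
        if gap:
            segments.append((gap, None))
            gap = ""
        segments.append((hit, lexicon[hit]))
        i += len(hit)
    if gap:
        segments.append((gap, None))

    parts = []
    for k, (text, phones) in enumerate(segments):
        if phones is None:
            # Neighbouring subword transcriptions serve as phonetic context.
            left = segments[k - 1][1] if k > 0 else None
            right = segments[k + 1][1] if k + 1 < len(segments) else None
            phones = fillers.get(text) or oov(text, left, right)
        parts.append(phones)
    return " ".join(parts)
```

With this toy data, `transcribe("Trainingslager")` finds "training" and "lager", resolves the "s" between them through the filler table, and joins the three transcriptions. If "lager" is missing from the lexicon, the whole remainder "slager" is handed to the OOV stand-in with only a left context, as in the worked example of the description.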

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Claims (8)

  1. Method for speech synthesis by means of grapheme-phoneme conversion, in which
    for a given word, a database containing phonetic transcriptions of words is searched for subwords of the given word,
    at least one subword of the given word is found in the database,
    a phonetic transcription recorded in the database is selected for the found subword,
    for the given word, the database is searched for further subwords of the word,
    at least one further subword of the given word is found in the database,
    a phonetic transcription recorded in the database is selected for this further found subword,
    a further component is located between the found subword and the further found subword in the given word,
    an OOV treatment is carried out for the phonetic transcription of the further component as a function of the phonetic transcription of the found subword and of the phonetic transcription of the further found subword,
    the phonetic transcription of the found subword, the phonetic transcription of the further found subword and the phonetic transcription of the further component are put together.
  2. Method for speech synthesis according to at least one of the preceding claims, in which
    only subwords having at least a predefined minimum length are searched for.
  3. Method for speech synthesis according to at least one of the preceding claims, in which
    when several subwords are found for the same word part of the given word, the longest of these subwords is selected.
  4. Method for speech synthesis according to at least one of the preceding claims, in which
    the OOV treatment for the phonetic transcription of the further component is performed by means of a neural network.
  5. Method for speech synthesis according to at least one of the preceding claims, in which
    the OOV treatment for the phonetic transcription of the further component is performed by means of a rule-based method.
  6. Method for speech synthesis according to at least one of the preceding claims, in which
    the OOV treatment for the phonetic transcription of the further component is performed by means of a second database containing the phonetic transcriptions of filler particles commonly used in compound words.
  7. Arrangement for speech synthesis by means of grapheme-phoneme conversion, designed such that
    for a given word, a database containing phonetic transcriptions of words can be searched for subwords of the given word,
    at least one subword of the given word can be found in the database,
    a phonetic transcription recorded in the database can be selected for the found subword,
    for the given word, the database can be searched for further subwords of the word,
    at least one further subword of the given word can be found in the database,
    a phonetic transcription recorded in the database can be selected for this further found subword,
    a further component is located between the found subword and the further found subword in the given word,
    an OOV treatment can be carried out for the phonetic transcription of the further component as a function of the phonetic transcription of the found subword and of the phonetic transcription of the further found subword,
    the phonetic transcription of the found subword, the phonetic transcription of the further found subword and the phonetic transcription of the further component can be put together.
  8. Computer program product for speech synthesis by means of grapheme-phoneme conversion, in which, when it is executed on at least one processor unit,
    for a given word, a database containing phonetic transcriptions of words is searched for subwords of the given word,
    at least one subword of the given word is found in the database,
    a phonetic transcription recorded in the database is selected for the found subword,
    for the given word, the database is searched for further subwords of the word,
    at least one further subword of the given word is found in the database,
    a phonetic transcription recorded in the database is selected for this further found subword,
    a further component is located between the found subword and the further found subword in the given word,
    an OOV treatment is carried out for the phonetic transcription of the further component as a function of the phonetic transcription of the found subword and of the phonetic transcription of the further found subword,
    the phonetic transcription of the found subword, the phonetic transcription of the further found subword and the phonetic transcription of the further component are put together.
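
The subword selection in claims 2 and 3 interacts with a remark in the description: a shorter subword may be preferable when, combined with a second subword, it covers more of the given word than the longest match does alone. The following sketch shows one way to express that trade-off. It is an assumption-laden illustration, not the claimed method: the dynamic-programming formulation, the function name `best_cover`, the example word "Staubecken" and the three-entry lexicon are all hypothetical, and a minimum length of 4 is used because the description mentions that lengths of 3, 4 or 6 can also be useful.

```python
def best_cover(word: str, lexicon: set, min_len: int = 4) -> list:
    """Pick non-overlapping lexicon subwords that cover the most letters."""
    word = word.lower()
    n = len(word)
    # covered[i]: maximum number of letters of word[i:] covered by subwords
    covered = [0] * (n + 1)
    # choice[i]: end index of the subword chosen at i, or None to skip a letter
    choice = [None] * (n + 1)
    for i in range(n - 1, -1, -1):
        covered[i] = covered[i + 1]  # default: leave letter i uncovered
        for end in range(i + min_len, n + 1):
            if word[i:end] in lexicon:
                score = (end - i) + covered[end]
                if score >= covered[i]:  # ties go to the longer subword
                    covered[i] = score
                    choice[i] = end
    result, i = [], 0
    while i < n:
        if choice[i] is None:
            i += 1  # letter is part of a further (OOV) component
        else:
            result.append(word[i:choice[i]])
            i = choice[i]
    return result
```

For the hypothetical lexicon `{"stau", "staub", "becken"}`, `best_cover("Staubecken", ...)` returns `["stau", "becken"]`, because the shorter "stau" combines with "becken" to cover all ten letters. With only `{"stau", "staub"}` available, the longest match "staub" is selected, in line with claim 3.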
EP01113053A 2000-08-31 2001-05-28 Phonetic transcription for speech synthesis Expired - Lifetime EP1184838B1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE10042942 2000-08-31
DE10042942A DE10042942C2 (de) 2000-08-31 2000-08-31 Verfahren zur Sprachsynthese

Publications (3)

Publication Number Publication Date
EP1184838A2 EP1184838A2 (fr) 2002-03-06
EP1184838A3 EP1184838A3 (fr) 2003-02-05
EP1184838B1 true EP1184838B1 (fr) 2005-08-31

Family

ID=7654521

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01113053A Expired - Lifetime EP1184838B1 (fr) 2000-08-31 2001-05-28 Phonetic transcription for speech synthesis

Country Status (4)

Country Link
US (1) US7333932B2 (fr)
EP (1) EP1184838B1 (fr)
DE (2) DE10042942C2 (fr)
ES (1) ES2244523T3 (fr)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4072718B2 (ja) * 2002-11-21 2008-04-09 ソニー株式会社 音声処理装置および方法、記録媒体並びにプログラム
TWI233589B (en) * 2004-03-05 2005-06-01 Ind Tech Res Inst Method for text-to-pronunciation conversion capable of increasing the accuracy by re-scoring graphemes likely to be tagged erroneously
US7869999B2 (en) * 2004-08-11 2011-01-11 Nuance Communications, Inc. Systems and methods for selecting from multiple phonectic transcriptions for text-to-speech synthesis
TWI340330B (en) * 2005-11-14 2011-04-11 Ind Tech Res Inst Method for text-to-pronunciation conversion
DE102011118059A1 (de) 2011-11-09 2013-05-16 Elektrobit Automotive Gmbh Technik zur Ausgabe eines akustischen Signals mittels eines Navigationssystems
CN105206259A (zh) * 2015-11-03 2015-12-30 常州工学院 一种语音转换方法
CN110619866A (zh) * 2018-06-19 2019-12-27 普天信息技术有限公司 语音合成方法及装置

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5283833A (en) * 1991-09-19 1994-02-01 At&T Bell Laboratories Method and apparatus for speech processing using morphology and rhyming
ES2139066T3 (es) * 1993-03-26 2000-02-01 British Telecomm Conversion de texto a una forma de onda.
US5651095A (en) * 1993-10-04 1997-07-22 British Telecommunications Public Limited Company Speech synthesis using word parser with knowledge base having dictionary of morphemes with binding properties and combining rules to identify input word class
DE4440598C1 (de) * 1994-11-14 1996-05-23 Siemens Ag Durch gesprochene Worte steuerbares Hypertext-Navigationssystem, Hypertext-Dokument für dieses Navigationssystem und Verfahren zur Erzeugung eines derartigen Dokuments
DE19500494C2 (de) * 1995-01-10 1997-01-23 Siemens Ag Merkmalsextraktionsverfahren für ein Sprachsignal
DE19636739C1 (de) * 1996-09-10 1997-07-03 Siemens Ag Verfahren zur Mehrsprachenverwendung eines hidden Markov Lautmodelles in einem Spracherkennungssystem
DE19719381C1 (de) * 1997-05-07 1998-01-22 Siemens Ag Verfahren zur Spracherkennung durch einen Rechner
US5913194A (en) * 1997-07-14 1999-06-15 Motorola, Inc. Method, device and system for using statistical information to reduce computation and memory requirements of a neural network based speech synthesis system
US6108627A (en) * 1997-10-31 2000-08-22 Nortel Networks Corporation Automatic transcription tool
US6076060A (en) * 1998-05-01 2000-06-13 Compaq Computer Corporation Computer method and apparatus for translating text to sound
US6188984B1 (en) * 1998-11-17 2001-02-13 Fonix Corporation Method and system for syllable parsing
US6208968B1 (en) * 1998-12-16 2001-03-27 Compaq Computer Corporation Computer method and apparatus for text-to-speech synthesizer dictionary reduction
DE10042944C2 (de) * 2000-08-31 2003-03-13 Siemens Ag Graphem-Phonem-Konvertierung

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DAELEMANS W.: "GRAFON: A Grapheme-to-Phoneme Conversion System for Dutch", PROC. 12TH INT. CONF. ON COMPUTATIONAL LINGUISTICS COLING-88, 1988, BUDAPEST, HUNGARY, pages 133 - 138 *

Also Published As

Publication number Publication date
US20020026313A1 (en) 2002-02-28
EP1184838A3 (fr) 2003-02-05
ES2244523T3 (es) 2005-12-16
EP1184838A2 (fr) 2002-03-06
DE10042942C2 (de) 2003-05-08
DE50107259D1 (de) 2005-10-06
DE10042942A1 (de) 2002-03-28
US7333932B2 (en) 2008-02-19

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17P Request for examination filed

Effective date: 20030303

AKX Designation fees paid

Designated state(s): DE ES FR GB IT

17Q First examination report despatched

Effective date: 20030918

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE ES FR GB IT

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REF Corresponds to:

Ref document number: 50107259

Country of ref document: DE

Date of ref document: 20051006

Kind code of ref document: P

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 20050926

REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2244523

Country of ref document: ES

Kind code of ref document: T3

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20060601

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: ES

Payment date: 20120621

Year of fee payment: 12

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20130529

Year of fee payment: 13

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140528

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 15

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20150511

Year of fee payment: 15

REG Reference to a national code

Ref country code: ES

Ref legal event code: FD2A

Effective date: 20150731

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20150513

Year of fee payment: 15

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140529

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20150720

Year of fee payment: 15

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 50107259

Country of ref document: DE

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20160528

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20170131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20160531

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20161201

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20160528