EP1224531B1 - Method for determining the temporal course of a fundamental frequency of a speech output to be synthesized - Google Patents
- Publication number
- EP1224531B1 (application EP00984858A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- fundamental
- frequency
- macrosegment
- fundamental frequency
- sequences
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Definitions
- The invention relates to a method for determining the temporal course of a fundamental frequency of a speech output to be synthesized.
- The invention is therefore based on the object of providing a method for determining the temporal course of a fundamental frequency of a speech output to be synthesized which gives the speech output a natural sound, very similar to a human voice.
- The present invention is based on the finding that, when the course of a fundamental frequency is determined by means of a neural network, a macrostructure of the temporal course is generated which is very similar to the course of the fundamental frequency of natural speech, and that fundamental frequency sequences stored in a database reproduce the microstructure of the fundamental frequency of natural speech very closely.
- In this way an optimal determination of the course of the fundamental frequency is achieved which, both in its macrostructure and in its microstructure, is much more similar to that of natural speech than a fundamental frequency generated with previously known methods. A considerable approximation of synthetic speech to natural speech is thereby achieved.
- The synthetic speech created in this way is very similar to natural speech and can hardly be distinguished from it.
- The deviation between the replica macro segment and the default macro segment is preferably determined by means of a cost function which is weighted such that for small deviations from the fundamental frequency of the default macro segment only a small deviation value is determined, whereas when predetermined limit frequency differences are exceeded the determined deviation values increase strongly until a saturation value is reached.
- Deviations are weighted less strongly the closer they are located to the edge of a syllable.
- The default macro segment is preferably reproduced by providing multiple fundamental frequency sequences for each microprosodic unit, combinations of fundamental frequency sequences being rated both with respect to their deviation from the default macro segment and with respect to their pairwise matching. Depending on the outcome of these two ratings (deviation from the default macro segment, matching between adjacent fundamental frequency sequences), an appropriate combination of fundamental frequency sequences is selected.
- In the pairwise matching, in particular the transitions between adjacent fundamental frequency sequences are evaluated; larger jumps should be avoided here.
- The syllable nucleus is decisive for the auditory impression.
- Fig. 6 shows, in a flow chart, a method for synthesizing speech in which a text is converted into a series of acoustic signals.
- This process is implemented in the form of a computer program that is started with step S1.
- In step S2, a text is entered in the form of an electronically readable text file.
- In step S3, a sequence of phonemes, also called a sound sequence, is created from the individual graphemes of the text, i.e. from the one or more letters to which one phoneme each is assigned.
- The phonemes assigned to the individual graphemes are thus determined, which yields the phoneme sequence.
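As a minimal illustration of this grapheme-to-phoneme step, using a hypothetical toy inventory (the patent does not specify the mapping or its representation):

```python
# Illustrative sketch only: one phoneme is assigned to each grapheme,
# here for the graphemes of the word "stop" used in the figures.
GRAPHEME_TO_PHONEME = {  # hypothetical toy inventory
    "st": "S-T",
    "o": "O",
    "p": "P",
}

def to_phoneme_sequence(graphemes):
    """Map a list of graphemes to the corresponding phoneme sequence."""
    return [GRAPHEME_TO_PHONEME[g] for g in graphemes]

print(to_phoneme_sequence(["st", "o", "p"]))  # ['S-T', 'O', 'P']
```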
- In step S4, a stress structure is determined, i.e. it is determined how strongly the individual phonemes are to be emphasized.
- The stress structure is represented in Fig. 1a by means of a time line for the word "stop". Accordingly, the grapheme "st" is assigned stress level 1, the grapheme "o" stress level 0.3, and the grapheme "p" stress level 0.5.
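The stress structure of Fig. 1a can be represented, e.g., as a simple mapping from grapheme to stress level (the values come from the text; the data structure itself is an illustrative assumption):

```python
# Stress structure for the word "stop" as described for Fig. 1a.
STRESS = {"st": 1.0, "o": 0.3, "p": 0.5}

def most_stressed(stress):
    """Return the grapheme carrying the highest stress level."""
    return max(stress, key=stress.get)

print(most_stressed(STRESS))  # st
```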
- The duration of the individual phonemes is determined next (S5).
- In step S6, the temporal course of the fundamental frequency is determined, which is described in detail below.
- The generated wave file is converted into acoustic signals by means of an acoustic output unit and a loudspeaker (S8), which ends the speech output (S9).
- The temporal course of the fundamental frequency of the speech to be synthesized is generated by means of a neural network in combination with fundamental frequency sequences stored in a database.
- Step S6 of Fig. 6 is shown in more detail in a flow chart in Fig. 5.
- This method for determining the temporal course of the fundamental frequency is a subroutine of the program shown in Fig. 6.
- The subroutine is started with step S10.
- In step S11, a default macro segment of the fundamental frequency is determined by means of a neural network, which is shown in schematically simplified form in Fig. 4.
- The neural network has, at an input layer I, nodes for entering a phonetic linguistic unit PE of the text to be synthesized and a context Kl, Kr to the left and right of the phonetic linguistic unit.
- The phonetic linguistic unit consists, e.g., of a phrase, a word or a syllable of the text to be synthesized, for which the default macro segment of the fundamental frequency is to be determined.
- The left context Kl and the right context Kr each represent a section of text to the left and to the right of the phonetic linguistic unit PE.
- The data entered with the phonetic unit include the corresponding phoneme sequence, the stress structure and the duration of the individual phonemes.
- The information entered with the left or right context includes at least the phoneme sequence; it can also be useful to enter the stress structure and/or the phoneme durations.
- The length of the left and right context can correspond to the length of the phonetic linguistic unit PE, i.e. again a phrase, a word or a syllable. However, it can also be useful to provide a longer context of, e.g., two or three words as the left or right context.
- These inputs Kl, PE and Kr are processed in a hidden layer VS and output at an output layer O as the default macro segment VG of the fundamental frequency.
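A schematic sketch of such a network can be given as follows; layer sizes, feature encodings and the (random) weights are illustrative assumptions not taken from the patent, and only the data flow I -> VS -> O of Fig. 4 is modeled:

```python
import math
import random

# Input layer I receives feature vectors for the left context Kl, the
# phonetic linguistic unit PE and the right context Kr; a hidden layer VS
# feeds an output layer O that emits the default macro segment VG as a
# sampled fundamental frequency contour.
random.seed(0)
N_FEAT, N_HID, N_OUT = 2, 8, 10   # 2 features per input block, 10 contour samples
W1 = [[random.gauss(0.0, 1.0) for _ in range(3 * N_FEAT)] for _ in range(N_HID)]
W2 = [[random.gauss(0.0, 1.0) for _ in range(N_HID)] for _ in range(N_OUT)]

def matvec(weights, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def default_macro_segment(kl, pe, kr):
    """Forward pass: concatenated inputs -> hidden layer VS -> F0 contour."""
    x = kl + pe + kr                                   # input layer I
    hidden = [math.tanh(v) for v in matvec(W1, x)]     # hidden layer VS
    out = matvec(W2, hidden)                           # output layer O
    return [100.0 + 50.0 * math.tanh(v) for v in out]  # scaled to roughly 50-150 Hz

vg = default_macro_segment([0.1, 0.5], [1.0, 0.3], [0.2, 0.4])
print(len(vg))  # 10
```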
- Fig. 1b shows such a default macro segment for the word "stop".
- This default macro segment has a typical triangular course, which begins with a rise and ends with a somewhat shorter fall.
- In step S12, a database in which fundamental frequency sequences assigned to graphemes are stored is read out; as a rule, a large number of fundamental frequency sequences are available for each grapheme. Fig. 1c schematically shows such fundamental frequency sequences for the graphemes "st", "o" and "p"; to simplify the drawing, only a small number of fundamental frequency sequences are shown.
- In step S13, a cost factor Kf is calculated using the following cost function: Kf = Σ_j [lok(k_ij) + Ver(k_ij, k_n,j+1)].
- The cost function has two terms, a local cost function lok(k_ij) and a linking cost function Ver(k_ij, k_n,j+1).
- The local cost function serves to evaluate the deviation of the i-th fundamental frequency sequence of the j-th phoneme from the default macro segment.
- The linking cost function evaluates the matching between the i-th fundamental frequency sequence of the j-th phoneme and the n-th fundamental frequency sequence of the (j+1)-th phoneme.
- The local cost function has, for example, the following form: lok(k_ij) = ∫_ta^te (f_v(t) − f_ij(t))² dt.
- The local cost function is thus an integral, over the time range from the beginning ta of a phoneme to its end te, of the square of the difference between the fundamental frequency f_v specified by the default macro segment and the i-th fundamental frequency sequence f_ij of the j-th phoneme.
- This local cost function thus yields a positive value for the deviation between the respective fundamental frequency sequence and the fundamental frequency of the default macro segment. Moreover, this cost function is very easy to implement and, owing to its parabolic characteristic, produces a rating resembling that of human hearing, because small deviations from the default course f_v are rated low, whereas larger deviations are rated progressively.
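A minimal numerical sketch of this local cost function, using the sum representation that the text permits for numerical reasons (the time step dt and the sample values are illustrative):

```python
def local_cost(f_default, f_seq, dt=0.01):
    """Discrete approximation of lok(k_ij): the squared difference between
    the default contour f_v and the i-th fundamental frequency sequence of
    the j-th phoneme, summed over time steps of width dt."""
    return sum((fv - fij) ** 2 for fv, fij in zip(f_default, f_seq)) * dt

# Identical contours cost nothing; small deviations cost little,
# larger deviations are rated progressively (quadratically).
print(local_cost([100, 110, 120], [100, 112, 118]))
```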
- The local cost function is provided with a weighting term, which leads to the function curve shown in Fig. 2.
- The diagram in Fig. 2 shows the value of the local cost function lok(f_ij) as a function of the logarithm of the frequency f_ij of the i-th fundamental frequency sequence of the j-th phoneme. It can be seen from the diagram that deviations from the default frequency f_v within certain limit frequencies GF1, GF2 are rated only slightly, whereas a further deviation causes a steep rise up to a saturation value SW. Such a weighting corresponds to human hearing, which hardly perceives slight frequency deviations but, above certain frequency differences, registers them as a clear difference.
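The weighting of Fig. 2 might be sketched as follows. The exact curve, the limit value gf (standing in for GF1/GF2, assumed symmetric in log frequency) and the saturation value sw are illustrative assumptions; only the qualitative behavior (flat within the limits, steep rise, saturation) follows from the text:

```python
import math

def weighted_deviation(f_ij, f_v, gf=0.05, sw=1.0):
    """Rate a frequency deviation in the log domain: nearly free inside the
    limit +-gf, then a steep linear rise capped at the saturation value sw."""
    d = abs(math.log(f_ij) - math.log(f_v))   # log-frequency deviation
    if d <= gf:                               # within GF1..GF2: barely rated
        return 0.1 * (d / gf) ** 2
    return min(sw, 0.1 + (d - gf) * 10.0)     # steep rise, saturating at SW

print(weighted_deviation(200.0, 100.0))  # 1.0  (an octave off: fully saturated)
```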
- The linking cost function evaluates how well two successive fundamental frequency sequences are matched to each other. In particular, the frequency difference at the junction of the two fundamental frequency sequences is rated: the greater the difference between the frequency at the end of the preceding fundamental frequency sequence and the frequency at the beginning of the subsequent fundamental frequency sequence, the larger the output value of the linking cost function.
- In addition, other parameters can be taken into account which, e.g., reflect the continuity of the transition or the like.
- Preferably, the output value of the linking cost function is weighted less strongly the closer the respective junction of two neighboring fundamental frequency sequences is located to the edge of a syllable. This corresponds to human hearing, which analyzes acoustic signals at the edge of a syllable less intensely than in the central region of the syllable; the central region is therefore perceptually dominant.
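A minimal sketch of such a linking cost with syllable-edge weighting; the linear weighting and the distance parameters are assumptions, since the text only describes the qualitative behavior:

```python
def link_cost(prev_seq, next_seq, edge_distance, max_distance):
    """Sketch of Ver(k_ij, k_n,j+1): penalize the frequency jump at the
    junction of two successive sequences, weighted less the closer the
    junction lies to the edge of a syllable."""
    jump = abs(prev_seq[-1] - next_seq[0])  # difference at the junction
    weight = edge_distance / max_distance   # 0 at the syllable edge, 1 in the core
    return jump * weight

# The same 5 Hz jump costs nothing at the syllable edge, fully in the core:
print(link_cost([100, 105], [110, 112], edge_distance=4, max_distance=4))  # 5.0
```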
- Using the cost function Kf, the values of the local cost function and the linking cost function are determined and summed for each combination of fundamental frequency sequences of the phonemes of a linguistic unit for which a default macro segment has been determined. From the set of combinations of fundamental frequency sequences, the combination is selected for which the cost function Kf has yielded the smallest value, since this combination of fundamental frequency sequences forms a fundamental frequency course for the corresponding linguistic unit which is called the replica macro segment and is very similar to the default macro segment.
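This selection of the minimum-cost combination can be sketched as follows; this is a minimal illustration, not the patent's implementation, and the simplified local and linking costs as well as all data are hypothetical:

```python
from itertools import product

def local_cost(vg_slice, seq):
    # Simplified squared deviation from the default macro segment slice.
    return sum((a - b) ** 2 for a, b in zip(vg_slice, seq))

def link_cost(prev_seq, next_seq):
    # Simplified linking cost: frequency jump at the junction.
    return abs(prev_seq[-1] - next_seq[0])

def best_combination(vg_slices, candidates):
    """Evaluate Kf for every combination of candidate sequences (one list of
    candidates per phoneme) and return the combination with the smallest sum."""
    best, best_cost = None, float("inf")
    for combo in product(*candidates):
        kf = sum(local_cost(vg, seq) for vg, seq in zip(vg_slices, combo))
        kf += sum(link_cost(a, b) for a, b in zip(combo, combo[1:]))
        if kf < best_cost:
            best, best_cost = list(combo), kf
    return best, best_cost

vg_slices = [[100, 110], [120, 115]]        # default macro segment, per phoneme
candidates = [[[100, 110], [90, 95]],       # stored sequences for phoneme 1
              [[120, 115], [130, 140]]]     # stored sequences for phoneme 2
combo, kf = best_combination(vg_slices, candidates)
print(combo, kf)  # [[100, 110], [120, 115]] 10
```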
- This means that the default macro segments generated by the neural network are reproduced by means of individual fundamental frequency sequences stored in a database, and the fundamental frequency course is adapted to them. A very natural macrostructure is thereby ensured, which also has the detailed microstructure of the fundamental frequency sequences.
- After the selection of the combinations of fundamental frequency sequences for reproducing the default macro segment is completed, it is checked in step S14 whether a temporal course of the fundamental frequency must be generated for a further phonetic linguistic unit. If this query in step S14 results in "yes", the program flow jumps back to step S11; otherwise the program flow branches to step S15, in which the individual replica macro segments of the fundamental frequency are composed.
- The junctions of the individual replica macro segments are matched to one another, as shown in Fig. 3.
- At the junctions V, the frequency f_l on the left and the frequency f_r on the right are matched to one another, the end regions of the replica macro segments preferably being changed such that the frequencies f_l and f_r have the same value.
- The transition can also be smoothed and/or made continuous in the region of the junction.
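The matching at a junction can be sketched, e.g., by moving both end points to their mean; the exact adjustment of the end regions is not specified in the text, so this is one illustrative choice:

```python
def match_junction(left_segment, right_segment):
    """Match a junction V as in Fig. 3: change the end regions so that the
    frequency f_l at the end of the left replica macro segment and f_r at
    the start of the right one take the same value (here: their mean)."""
    target = (left_segment[-1] + right_segment[0]) / 2.0
    left = left_segment[:-1] + [target]
    right = [target] + right_segment[1:]
    return left, right

left, right = match_junction([100, 110], [120, 130])
print(left, right)  # [100, 115.0] [115.0, 130]
```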
- In this way, a course of a fundamental frequency can be generated which is very similar to the fundamental frequency of natural speech, because the neural network can easily capture and evaluate larger context areas (macrostructure) while, at the same time, the finest structures of the fundamental frequency curve corresponding to natural speech can be generated by means of the fundamental frequency sequences stored in the database (microstructure). This permits speech output with a substantially more natural sound than previously known methods allow.
- The order in which the fundamental frequency sequences are read from the database and the neural network creates the default macro segments can be varied. It is, e.g., also possible first to generate default macro segments for all phonetic linguistic units and only then to read out, combine, evaluate and select individual fundamental frequency sequences. Within the scope of the invention, different cost functions can also be used, as long as they take into account a deviation between a default macro segment of the fundamental frequency and microsegments of the fundamental frequencies. For numerical reasons, the integral of the local cost function described above can also be represented as a sum.
Description
- Figs. 1a to 1d
- the construction and composition of the temporal course of a fundamental frequency in four steps,
- Fig. 2
- a function for weighting a cost function for determining the deviation between a replica macro segment and a default macro segment,
- Fig. 3
- the course of a fundamental frequency composed of several macro segments,
- Fig. 4
- a schematically simplified structure of a neural network,
- Fig. 5
- the method according to the invention in a flow chart, and
- Fig. 6
- a method for synthesizing speech that is based on the method according to the invention.
Claims (13)
- A method for determining the temporal course of a fundamental frequency of a speech output to be synthesized, comprising the steps of: determining default macro segments of the fundamental frequency of a phonetic linguistic unit of a text to be synthesized (S11) by means of a neural network, and determining microsegments corresponding to the respective default macro segment (S12, S13) by means of fundamental frequency sequences stored in a database, the fundamental frequency sequences being selected from the database such that the respective default macro segment is reproduced with the smallest possible deviation by the successive fundamental frequency sequences.
- Method according to claim 1, characterized in that the default macro segments cover a time range corresponding to a phonetic linguistic unit of the speech, such as a phrase, a word or a syllable.
- Method according to claim 1 or 2, characterized in that the fundamental frequency sequences of the microsegments each represent the fundamental frequencies of one phoneme.
- Method according to one of claims 1 to 3, characterized in that the fundamental frequency sequences of the microsegments that lie within a time range of one of the default macro segments are composed into a replica macro segment, the deviation of the replica macro segment from the respective default macro segment being determined, and the fundamental frequency sequences being optimized such that the deviation is as small as possible.
- Method according to claim 4, characterized in that several fundamental frequency sequences can be selected for each individual microsegment, those combinations of fundamental frequency sequences being selected that yield the smallest deviation between the respective replica macro segment and the respective default macro segment.
- Method according to claim 4 or 5, characterized in that the deviation between the replica macro segment and the default macro segment is determined by means of a cost function which is weighted such that for small deviations from the fundamental frequency of the default macro segment only a small deviation value is determined, the determined deviation values increasing strongly, until a saturation value is reached, when predetermined limit frequency differences are exceeded.
- Method according to one of claims 4 to 6, characterized in that the deviation between the replica macro segment and the default macro segment is determined by means of a cost function with which a multiplicity of deviations distributed over the macro segments are rated, the deviations being weighted less strongly the closer they are located to the edge of a syllable.
- Method according to one of claims 4 to 7, characterized in that, when the fundamental frequency sequences are selected, the individual fundamental frequency sequences are matched with the respectively following or preceding fundamental frequency sequences according to predetermined criteria, and only combinations of fundamental frequency sequences that fulfill the criteria are admitted for composition into a replica macro segment.
- Method according to claim 8, characterized in that the assessment of neighboring fundamental frequency sequences is carried out by means of a cost function which generates an output value, to be minimized, for a junction of neighboring fundamental frequency sequences, this value being the larger, the greater the difference between the frequency at the end of the preceding fundamental frequency sequence and the frequency at the beginning of the following fundamental frequency sequence.
- Method according to claim 9, characterized in that the output value is weighted less strongly the closer the respective junction is located to the edge of a syllable.
- Method according to one of claims 1 to 10, characterized in that the individual macro segments are concatenated with one another, the fundamental frequencies being matched to one another at the junctions of the macro segments.
- Method according to one of claims 1 to 11, characterized in that the neural networks determine the default segments for a predetermined section of a text on the basis of this text section and of a text section preceding and/or following this text section.
- A method for synthesizing speech, in which a text is converted into a sequence of acoustic signals, comprising the following steps: converting the text into a sequence of phonemes, generating a stress structure, determining the duration of the individual phonemes, determining the temporal course of a fundamental frequency according to the method of one of claims 1 to 12, and generating the acoustic signals representing the speech on the basis of the determined sequence of phonemes and the determined fundamental frequency.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19952051 | 1999-10-28 | ||
DE19952051 | 1999-10-28 | ||
PCT/DE2000/003753 WO2001031434A2 (de) | 1999-10-28 | 2000-10-24 | Verfahren zum bestimmen des zeitlichen verlaufs einer grundfrequenz einer zu synthetisierenden sprachausgabe |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1224531A2 EP1224531A2 (de) | 2002-07-24 |
EP1224531B1 true EP1224531B1 (de) | 2004-12-15 |
Family
ID=7927243
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP00984858A Expired - Lifetime EP1224531B1 (de) | 1999-10-28 | 2000-10-24 | Verfahren zum bestimmen des zeitlichen verlaufs einer grundfrequenz einer zu synthetisierenden sprachausgabe |
Country Status (5)
Country | Link |
---|---|
US (1) | US7219061B1 (de) |
EP (1) | EP1224531B1 (de) |
JP (1) | JP4005360B2 (de) |
DE (1) | DE50008976D1 (de) |
WO (1) | WO2001031434A2 (de) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AT6920U1 (de) | 2002-02-14 | 2004-05-25 | Sail Labs Technology Ag | Verfahren zur erzeugung natürlicher sprache in computer-dialogsystemen |
DE10230884B4 (de) * | 2002-07-09 | 2006-01-12 | Siemens Ag | Vereinigung von Prosodiegenerierung und Bausteinauswahl bei der Sprachsynthese |
JP4264030B2 (ja) * | 2003-06-04 | 2009-05-13 | 株式会社ケンウッド | 音声データ選択装置、音声データ選択方法及びプログラム |
JP2005018036A (ja) * | 2003-06-05 | 2005-01-20 | Kenwood Corp | 音声合成装置、音声合成方法及びプログラム |
JP3812848B2 (ja) * | 2004-06-04 | 2006-08-23 | 松下電器産業株式会社 | 音声合成装置 |
US10453479B2 (en) * | 2011-09-23 | 2019-10-22 | Lessac Technologies, Inc. | Methods for aligning expressive speech utterances with text and systems therefor |
US10109014B1 (en) | 2013-03-15 | 2018-10-23 | Allstate Insurance Company | Pre-calculated insurance premiums with wildcarding |
CN105357613B (zh) * | 2015-11-03 | 2018-06-29 | 广东欧珀移动通信有限公司 | 音频输出设备播放参数的调整方法及装置 |
CN106653056B (zh) * | 2016-11-16 | 2020-04-24 | 中国科学院自动化研究所 | 基于lstm循环神经网络的基频提取模型及训练方法 |
CN108630190B (zh) * | 2018-05-18 | 2019-12-10 | 百度在线网络技术(北京)有限公司 | 用于生成语音合成模型的方法和装置 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0710378A4 (de) | 1994-04-28 | 1998-04-01 | Motorola Inc | Verfahren und vorrichtung zur umwandlung von text in audiosignale unter verwendung eines neuralen netzwerks |
US5787387A (en) * | 1994-07-11 | 1998-07-28 | Voxware, Inc. | Harmonic adaptive speech coding method and system |
JPH10153998A (ja) * | 1996-09-24 | 1998-06-09 | Nippon Telegr & Teleph Corp <Ntt> | 補助情報利用型音声合成方法、この方法を実施する手順を記録した記録媒体、およびこの方法を実施する装置 |
BE1011892A3 (fr) * | 1997-05-22 | 2000-02-01 | Motorola Inc | Methode, dispositif et systeme pour generer des parametres de synthese vocale a partir d'informations comprenant une representation explicite de l'intonation. |
US5913194A (en) * | 1997-07-14 | 1999-06-15 | Motorola, Inc. | Method, device and system for using statistical information to reduce computation and memory requirements of a neural network based speech synthesis system |
US6064960A (en) * | 1997-12-18 | 2000-05-16 | Apple Computer, Inc. | Method and apparatus for improved duration modeling of phonemes |
US6078885A (en) * | 1998-05-08 | 2000-06-20 | At&T Corp | Verbal, fully automatic dictionary updates by end-users of speech synthesis and recognition systems |
AU772874B2 (en) * | 1998-11-13 | 2004-05-13 | Scansoft, Inc. | Speech synthesis using concatenation of speech waveforms |
US7222075B2 (en) * | 1999-08-31 | 2007-05-22 | Accenture Llp | Detecting emotions using voice signal analysis |
-
2000
- 2000-10-24 US US10/111,695 patent/US7219061B1/en not_active Expired - Fee Related
- 2000-10-24 EP EP00984858A patent/EP1224531B1/de not_active Expired - Lifetime
- 2000-10-24 JP JP2001533505A patent/JP4005360B2/ja not_active Expired - Fee Related
- 2000-10-24 WO PCT/DE2000/003753 patent/WO2001031434A2/de active IP Right Grant
- 2000-10-24 DE DE50008976T patent/DE50008976D1/de not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
JP4005360B2 (ja) | 2007-11-07 |
US7219061B1 (en) | 2007-05-15 |
WO2001031434A3 (de) | 2002-02-14 |
EP1224531A2 (de) | 2002-07-24 |
JP2003513311A (ja) | 2003-04-08 |
WO2001031434A2 (de) | 2001-05-03 |
DE50008976D1 (de) | 2005-01-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the european phase | Original code: 0009012 |
20020404 | 17P | Request for examination filed | |
 | AK | Designated contracting states | Kind code of ref document: A2; designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
 | AX | Request for extension of the european patent | AL; LT; LV; MK; RO; SI |
 | RBV | Designated contracting states (corrected) | Designated state(s): DE FR GB IT |
 | GRAP | Despatch of communication of intention to grant a patent | Original code: EPIDOSNIGR1 |
 | RIC1 | Information provided on IPC code assigned before grant | IPC: 7G 10L 11/04 A; 7G 06F 3/16 B |
 | RIC1 | Information provided on IPC code assigned before grant | IPC: 7G 06F 3/16 B; 7G 10L 13/08 B; 7G 10L 11/04 A |
 | GRAS | Grant fee paid | Original code: EPIDOSNIGR3 |
 | GRAA | (expected) grant | Original code: 0009210 |
 | AK | Designated contracting states | Kind code of ref document: B1; designated state(s): DE FR GB IT |
 | REG | Reference to a national code | Country: GB; legal event code: FG4D; free format text: NOT ENGLISH |
 | REG | Reference to a national code | Country: IE; legal event code: FG4D; free format text: GERMAN |
20050120 | REF | Corresponds to | Ref document number: 50008976; country: DE; kind code: P |
20050211 | GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) | |
 | PLBE | No opposition filed within time limit | Original code: 0009261 |
 | STAA | Information on the status of an ep patent application or granted ep patent | Status: no opposition filed within time limit |
 | ET | Fr: translation filed | |
20050916 | 26N | No opposition filed | |
20131219 | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Country: DE; year of fee payment: 14 |
 | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | FR: payment date 20141017, year 15; GB: payment date 20141013, year 15 |
20141029 | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Country: IT; year of fee payment: 15 |
 | REG | Reference to a national code | Country: DE; legal event code: R119; ref document number: 50008976 |
20150501 | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Country: DE; lapse because of non-payment of due fees |
20151024 | GBPC | Gb: european patent ceased through non-payment of renewal fee | |
20151024 | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Countries: IT, GB; lapse because of non-payment of due fees |
20160630 | REG | Reference to a national code | Country: FR; legal event code: ST |
20151102 | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Country: FR; lapse because of non-payment of due fees |