EP0750778B1 - Speech synthesis - Google Patents
Speech synthesis
- Publication number
- EP0750778B1 (application EP95911420A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- speech
- pitch
- excitation
- synthesis apparatus
- speech synthesis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Links
- 238000003786 synthesis reaction Methods 0.000 title claims description 33
- 230000015572 biosynthetic process Effects 0.000 title claims description 31
- 230000005284 excitation Effects 0.000 claims description 31
- 238000004458 analytical method Methods 0.000 claims description 23
- 230000003595 spectral effect Effects 0.000 claims description 16
- 230000006870 function Effects 0.000 claims description 9
- 230000002123 temporal effect Effects 0.000 claims description 7
- 230000001755 vocal effect Effects 0.000 claims description 7
- 230000006835 compression Effects 0.000 claims description 6
- 238000007906 compression Methods 0.000 claims description 6
- 230000004044 response Effects 0.000 claims description 6
- 230000001360 synchronised effect Effects 0.000 claims description 5
- 230000001419 dependent effect Effects 0.000 claims description 3
- 238000000034 method Methods 0.000 description 26
- 238000012952 Resampling Methods 0.000 description 10
- 230000008569 process Effects 0.000 description 10
- 238000001914 filtration Methods 0.000 description 6
- 238000012545 processing Methods 0.000 description 5
- 238000005070 sampling Methods 0.000 description 5
- 238000001228 spectrum Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000013507 mapping Methods 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 230000008859 change Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 238000009499 grossing Methods 0.000 description 2
- 238000007781 pre-processing Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000005279 excitation period Effects 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000010183 spectrum analysis Methods 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
Definitions
- the present invention is concerned with the automated generation of speech (for example from a coded text input). More particularly it concerns analysis-synthesis methods where the "synthetic" speech is generated from stored speech waveforms derived originally from a human speaker (as opposed to “synthesis by rule” systems). In order to produce natural-sounding speech it is necessary to produce, in the synthetic speech, the same kind of context-dependent (prosodic) variation of intonation that occurs in human speech. This invention presupposes the generation of prosodic information defining variations of pitch that are to be made, and addresses the problem of processing speech signals to achieve such pitch variation.
- a waveform portion to be used is divided into overlapping segments using a Hamming window having a length equal to three times the pitch period.
- a global spectral envelope is obtained for the waveform, and a short term spectral envelope obtained using a Discrete Fourier transform; a "source component" is obtained which is the short term spectrum divided by the spectral envelope.
- the source component then has its pitch modified by a linear interpolation process and it is then recombined with the envelope information. After preprocessing in this way the segments are concatenated by an overlap-add process to give a desired fundamental pitch.
- a time-domain overlap-add process may be applied to an excitation component, for example by using LPC analysis to produce a residual signal (or a parametric representation of it) and applying the overlap-add process to the residual prior to passing it through an LPC synthesis filter (see "Pitch-synchronous Waveform Processing Techniques for Text-to-Speech Synthesis using Diphones", F. Charpentier and E. Moulines, European Conference on Speech Communications and Technology, Paris, 1989, vol. II, pp. 13-19).
- The basic principle of the overlap-add process is shown in Figure 1, where a speech signal S is shown with pitch marks P centred on the excitation peaks; it is separated into overlapping segments by multiplication by windowing waveforms W (only two of which are shown).
- the synthesised waveform is generated by adding the segments together with time shifting to raise or lower the pitch with a segment being respectively occasionally omitted or repeated.
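The overlap-add principle just described can be sketched in a few lines. This is a minimal illustration rather than the method of any cited reference: the function names, the Hann window shape, the two-period window width and the assumption of a roughly constant original pitch period are all choices made here for the example, and no amplitude normalisation is applied.

```python
import math


def hann(length):
    """Raised-cosine window with zero-valued end points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (length - 1)) for i in range(length)]


def overlap_add(signal, pitch_marks, new_period):
    """Rebuild `signal` so that pitch marks fall every `new_period` samples.

    Assumes a roughly constant original pitch period (illustrative simplification).
    """
    period = round((pitch_marks[-1] - pitch_marks[0]) / (len(pitch_marks) - 1))
    window = hann(2 * period + 1)      # two pitch periods wide, centred on a mark
    out = [0.0] * len(signal)
    t = pitch_marks[0]                 # first synthesis instant
    while t <= pitch_marks[-1]:
        # Use the original mark nearest to the synthesis instant: segments are
        # repeated or skipped as needed, so duration is kept while spacing changes.
        m = min(pitch_marks, key=lambda p: abs(p - t))
        for k in range(-period, period + 1):
            src, dst = m + k, t + k
            if 0 <= src < len(signal) and 0 <= dst < len(out):
                out[dst] += window[k + period] * signal[src]
        t += new_period
    return out


# Impulse train with a 100-sample pitch period, raised in pitch to an 80-sample period.
signal = [0.0] * 1000
marks = list(range(100, 1000, 100))
for mark in marks:
    signal[mark] = 1.0
shifted = overlap_add(signal, marks, 80)
```

In this toy case the excitation impulses of `shifted` sit 80 samples apart while the signal still spans the same interval, because some segments are used twice.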
- a speech synthesis apparatus including means controllable to vary the pitch of speech signals synthesised thereby, having:
- the invention provides a speech synthesis apparatus including means controllable to vary the pitch of speech signals synthesised thereby, having:
- the invention provides a speech synthesis apparatus as claimed in claim 8.
- portions of digital speech waveform S are stored in a store 100, each with corresponding pitchmark timing information P, as explained earlier.
- Waveform portions are read out under control of a text-to-speech driver 101 which produces the necessary store addresses; the operation of the driver 101 is conventional and it will not be described further except to note that it also produces pitch information PP.
- the excitation and vocal tract components of a waveform portion read out from the store 100 are separated by an LPC analysis unit 102 which periodically produces the coefficients of a synthesis filter having a frequency response resembling the frequency spectrum of the speech waveform portion.
- This drives an analysis filter 103 which is the inverse of the synthesis filter and produces at its output a residual signal R.
- the LPC analysis and inverse filtering operation is synchronous with the pitchmarks P, as will be described below.
- the next step in the process is that of modifying the pitch of the residual signal.
- This is (for voiced speech segments) performed by a multiple-window method in which the residual is separated into segments in a processing unit 104 by multiplying by a series of overlapping window functions, at least two per pitch period; five are shown in Figure 3, which shows one trapezoidal window centred on the pitch period and four intermediate triangular windows.
- the pitch period windows are somewhat wider than the intermediate ones to avoid duplication of the main excitation when lowering the pitch.
- to raise the pitch, the windowed segments are added together with a reduced temporal spacing, as shown in the lower part of Figure 3; to lower the pitch, the temporal spacing is increased.
- the relative window widths are chosen to give overlap of the sloping flanks (i.e. 50% overlap on the intermediate windows) during synthesis to ensure the correct signal amplitude.
- the temporal adjustment is controlled by the signals PP. Typical widths for the intermediate windows are 2 ms whilst the width of the windows located on the pitch marks will depend on the pitch period of the particular signal but is likely to be in the range 2 to 10ms. The use of multiple windows is thought to reduce phase distortion compared with the use of one window per pitch period.
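As a rough illustration of the window layout described above, the sketch below builds one trapezoidal window on each pitch mark and four triangular intermediate windows, and checks that their sloping flanks overlap so that the weights sum to one at the original spacing. The 8 kHz sampling rate, the 10 ms pitch period and all function names are assumptions made for this example; the flat-top width is simply whatever is left after the intermediate windows are fitted in.

```python
def trapezoid(t, centre, flat, flank):
    """Unit-height trapezoid: `flat` samples wide at the top, linear flanks of `flank` samples."""
    d = abs(t - centre) - flat / 2
    return 1.0 if d <= 0 else max(0.0, 1.0 - d / flank)


def triangle(t, centre, flank):
    """Unit-height triangle with a base of 2 * flank samples."""
    return max(0.0, 1.0 - abs(t - centre) / flank)


def window_weights(t, period, flank, n_intermediate):
    """Weights of every window covering offset t (0 <= t < period) from a pitch mark."""
    flat = period - (n_intermediate + 1) * flank
    if flat < 0:
        raise ValueError("intermediate windows do not fit in one pitch period")
    centres = [flat / 2 + (i + 1) * flank for i in range(n_intermediate)]
    weights = [trapezoid(t, 0, flat, flank)]              # window on this pitch mark
    weights += [triangle(t, c, flank) for c in centres]   # intermediate windows
    weights.append(trapezoid(t, period, flat, flank))     # window on the next pitch mark
    return weights


# Assumed figures: 8 kHz sampling, 10 ms pitch period, 2 ms wide intermediate windows.
PERIOD = 80      # samples
FLANK = 8        # samples (1 ms), so each triangle is 2 ms wide
sums = [sum(window_weights(t, PERIOD, FLANK, 4)) for t in range(PERIOD)]
pitch_window_width = (PERIOD - 5 * FLANK) + 2 * FLANK   # 56 samples, i.e. 7 ms
```

With these figures the pitch-mark window is 7 ms wide, inside the 2 to 10 ms range quoted above, and every entry of `sums` equals one to within rounding.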
- the store 100 also contains a voiced/unvoiced indicator for each waveform portion, and unvoiced portions are processed by a pitch unit 104' identical to the unit 104, but bypassing the LPC analysis and synthesis. Switching between the two paths is controlled at 106. Alternatively, the unvoiced portions could follow the same route as the voiced ones; in either case, arbitrary positions are taken for the pitch marks.
- Resampling is achieved by mapping each sample instant at the original sampling rate to a new position on the time axis.
- the signal amplitude at each sampling instant for the resampled signal is then estimated by linear interpolation between the two nearest mapped samples.
- Linear interpolation is not ideal for resampling, but is simple to implement and should at least give an indication of how useful the technique could be.
- When downsampling to reduce the pitch period, the signal must be low-pass filtered to avoid aliasing. Initially, a separate filter has been designed for each pitch period using the window design method. Eventually, these could be generated by table lookup to reduce computation.
- the resampling factor varies smoothly over the segment to be processed to avoid a sharp change in signal characteristics at the boundaries. Without this, the effective sampling rate of the signal would undergo step changes. A sinusoidal function is used, and the degree of smoothing is controllable.
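The resampling steps above (map each sample instant to a new position, then interpolate linearly at the output instants) might look like the following sketch. The squared-sine weighting, the `smoothing` parameter and the function names are assumptions chosen to illustrate a sinusoidally varying resampling factor; the source does not give the exact function. The anti-aliasing low-pass filter mentioned above is deliberately left out.

```python
import math


def warp_positions(n_in, n_out, smoothing=1.0):
    """Map input sample instants 0..n_in-1 onto the time axis 0..n_out-1.

    smoothing=0 gives a uniform resampling factor; smoothing=1 concentrates the
    compression or expansion in the middle of the segment (assumed sin^2 shape).
    """
    shape = [(1.0 - smoothing) + smoothing * math.sin(math.pi * (i + 0.5) / (n_in - 1)) ** 2
             for i in range(n_in - 1)]
    gain = ((n_out - 1) - (n_in - 1)) / sum(shape)
    steps = [1.0 + gain * s for s in shape]
    if min(steps) <= 0:
        raise ValueError("compression too strong for this smoothing setting")
    positions = [0.0]
    for step in steps:
        positions.append(positions[-1] + step)
    return positions


def resample(segment, n_out, smoothing=1.0):
    """Estimate each output sample by linear interpolation between the two nearest mapped samples."""
    pos = warp_positions(len(segment), n_out, smoothing)
    out, j = [], 0
    for m in range(n_out):
        while j < len(pos) - 2 and pos[j + 1] < m:
            j += 1
        frac = (m - pos[j]) / (pos[j + 1] - pos[j])
        frac = min(max(frac, 0.0), 1.0)    # guard against rounding at the last sample
        out.append((1.0 - frac) * segment[j] + frac * segment[j + 1])
    return out


# Shorten a 100-sample segment to 80 samples (raising the pitch).
ramp = [float(i) for i in range(100)]
shortened = resample(ramp, 80)
```

Setting `smoothing` below 1 moves the mapping back towards a constant resampling factor, which is one way to make the degree of smoothing controllable.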
- An alternative implementation involves resampling of the whole signal rather than a selected part of each pitch period. This presents no problems for pitch raising provided that appropriate filtering is applied to prevent aliasing, since the harmonic structure still occupies the whole frequency range. When lowering pitch, however, interpolation leaves a gap at the high end of the spectrum. In a practical system aimed at telephony applications, this effect could be minimised by storing and processing the speech at a higher bandwidth than 4kHz (6kHz for example). The "lost" high frequencies would then be mostly out of the telephony band, and hence not relevant.
- the LPC analysis is synchronous with the pitch markings. More particularly, one set of LPC parameters is required for each pitchmark in the speech signal. As part of the speech modification process, a mapping is performed between original and modified pitchmarks. The appropriate LPC parameters can then be selected for each modified pitchmark to resynthesise speech from the residual.
- LPC parameters are interpolated at the speech sampling rate in both analysis and synthesis phases.
- each set of LPC parameters would be obtained for a section of the speech portion (analysis frame) of length equal to the pitch period (centred on the midpoint of the pitch period rather than on the pitch mark); alternatively, longer, overlapping sections might be used, which has the advantage of permitting the use of an analysis frame of fixed length irrespective of pitch.
- a windowed analysis frame is preferred, as shown in Figure 4.
- each parameter set is referenced to the period centre rather than the pitchmark.
- the frame length is fixed, as this was found to give more consistent results than a pitch-dependent value.
- the stabilised covariance method would be preferable in terms of accuracy.
- the autocorrelation method is preferred as it is computationally efficient and guaranteed to give a stable synthesis filter.
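For readers unfamiliar with the autocorrelation method, the sketch below computes LPC coefficients for one analysis frame with the Levinson-Durbin recursion. The function names, the Hamming window and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions of this example; the source does not state which window shape is used. Reflection coefficients with magnitude below one indicate a stable synthesis filter.

```python
import math


def hamming(length):
    """Hamming window (assumed window shape for the analysis frame)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1)) for i in range(length)]


def autocorrelation(frame, order):
    """Autocorrelation of the frame at lags 0..order."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, reflection_coefficients, prediction_error) with a[0] == 1.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    refl = []
    for i in range(1, order + 1):
        if err <= 0.0:                  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        refl.append(k)
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, refl, err


def lpc_frame(frame, order):
    """Window a fixed-length analysis frame and return its LPC coefficients."""
    windowed = [w * x for w, x in zip(hamming(len(frame)), frame)]
    coeffs, _, _ = levinson_durbin(autocorrelation(windowed, order), order)
    return coeffs


# A decaying exponential is the impulse response of 1 / (1 - 0.9 z^-1),
# so an unwindowed first-order analysis should give a[1] close to -0.9.
decay = [0.9 ** n for n in range(200)]
a, refl, err = levinson_durbin(autocorrelation(decay, 2), 2)
```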
- the next step is to inverse filter the speech on a pitch-synchronous basis.
- the parameters are interpolated to minimise transients due to large changes in parameter values at frame boundaries.
- the filter corresponds exactly to that obtained from the analysis.
- the filter is a weighted combination of the two filters obtained from the analysis.
- the interpolation is applied directly to the filter coefficients. This has been shown to produce less spectral distortion than interpolating other parameters (LARs, LSPs etc.), but is not guaranteed to give a stable interpolated filter. No instability problems have been encountered in practice.
- α_n is the value of a weighting function at sample n; a_l and a_r represent the parameter sets referenced to the nearest left and right period centres.
- the filter coefficients for the re-synthesis filter 105 are calculated in the same way as for inverse filtering. Modifications to pitch and durations mean that the sequence of filters and the period values will be different from those used in the analysis, but the interpolation still ensures a smooth variation in filter coefficients from sample-to-sample.
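A small sketch of sample-by-sample coefficient interpolation follows. It assumes a linear weighting between period centres and the combination alpha * a_left + (1 - alpha) * a_right, with alpha equal to one at the left centre; the shape of the weighting function, which term carries the weight, and all function names are assumptions of this example. Using the same interpolated coefficients in the inverse filter and in the synthesis filter reproduces the input exactly when the residual is left unmodified.

```python
import bisect
import math


def coeffs_at(n, centres, coeff_sets):
    """Filter coefficients at sample n, interpolated between the nearest period centres."""
    if n <= centres[0]:
        return list(coeff_sets[0])
    if n >= centres[-1]:
        return list(coeff_sets[-1])
    right = bisect.bisect_right(centres, n)    # first centre strictly after n
    left = right - 1
    # Assumed linear weighting: 1 at the left centre, 0 at the right centre.
    alpha = (centres[right] - n) / (centres[right] - centres[left])
    return [alpha * al + (1.0 - alpha) * ar
            for al, ar in zip(coeff_sets[left], coeff_sets[right])]


def inverse_filter(speech, centres, coeff_sets):
    """Residual r[n] = sum_k a_n[k] * s[n - k], with a_n[0] == 1."""
    residual = []
    for n in range(len(speech)):
        a = coeffs_at(n, centres, coeff_sets)
        residual.append(sum(a[k] * speech[n - k] for k in range(len(a)) if n - k >= 0))
    return residual


def synthesis_filter(residual, centres, coeff_sets):
    """All-pole resynthesis s[n] = r[n] - sum_{k>=1} a_n[k] * s[n - k]."""
    out = []
    for n in range(len(residual)):
        a = coeffs_at(n, centres, coeff_sets)
        past = sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(residual[n] - past)
    return out


# Two illustrative first-order parameter sets referenced to period centres at samples 10 and 20.
centres = [10, 20]
coeff_sets = [[1.0, -0.5], [1.0, -0.9]]
speech = [math.sin(0.3 * n) for n in range(40)]
residual = inverse_filter(speech, centres, coeff_sets)
rebuilt = synthesis_filter(residual, centres, coeff_sets)
```

In the apparatus described here the residual is pitch-modified between the two filters, so the centres and coefficient sequence passed to the synthesis stage would follow the modified pitchmarks rather than the original ones.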
- a single-window overlap-add process may be used, but with a window width of less than two pitch periods' duration (preferably less than 1.7 times the pitch period, e.g. in the range 1.25 to 1.6 times).
- the window function necessarily has a flat top; moreover, it is preferably asymmetrically located relative to the pitch marks (preferably embracing a complete period between two pitchmarks).
- a typical window function is shown in Figure 5, with a flat top having a length equal to the synthesis pitch period and flanks of raised half-cosine or linear shape.
- This form of window is beneficial because a smaller temporal portion of the signal is constructed by the overlap-add process than with a longer window, and the asymmetric form places the overlap-add distortion towards the end of the pitch period where the speech energy is lower than immediately after the glottal excitation.
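A flat-topped window of the kind described can be generated as below. The sample counts (an 80-sample pitch period, a 16-sample flank) and the function name are illustrative assumptions only, and the sketch does not model the asymmetric placement relative to the pitch marks. Raised half-cosine flanks have the convenient property that a rising flank and a falling flank overlapped sample for sample sum to one.

```python
import math


def flat_top_window(flat, flank):
    """Window with `flat` samples at unity and raised half-cosine flanks of `flank` samples."""
    rising = [0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / flank) for i in range(flank)]
    return rising + [1.0] * flat + rising[::-1]


# Assumed figures: pitch period of 80 samples, flanks of 16 samples each.
PITCH_PERIOD = 80
window = flat_top_window(flat=PITCH_PERIOD, flank=16)
duration_in_periods = len(window) / PITCH_PERIOD     # 1.4 pitch periods

# Overlapping a falling flank with the next window's rising flank gives unity gain.
falling = window[-16:]
rising = window[:16]
flank_sums = [f + r for f, r in zip(falling, rising)]
```

With these figures the window lasts 1.4 pitch periods, inside the 1.25 to 1.6 range given above.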
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Signal Processing Not Specific To The Method Of Recording And Reproducing (AREA)
- Electrophonic Musical Instruments (AREA)
Claims (12)
- A speech synthesis apparatus including means controllable to vary the pitch of speech signals synthesised by the apparatus, comprising: (i) means for separating the speech signals into a spectral component and an excitation component; (ii) means for multiplying the excitation component by a series of overlapping window functions which, in the case of voiced speech, is synchronous with pitch timing mark information corresponding at least approximately to instants of vocal excitation, so as to divide it into windowed segments; (iii) means for applying a controllable time shift to the segments and adding them together; and (iv) means for recombining the spectral component and the excitation component; wherein the multiplying means provides at least two windows per pitch period and each window has a duration of less than one pitch period.
- A speech synthesis apparatus according to claim 1, in which the windows consist of first windows, one per pitch period, which encompass the pitch timing mark positions, and a plurality of intermediate windows.
- A speech synthesis apparatus according to claim 2, in which each intermediate window has a width smaller than the width of the first windows.
- A speech synthesis apparatus including means controllable to vary the pitch of synthesised speech signals, comprising: (i) means for separating the speech signals into a spectral component and an excitation component; (ii) means for temporal compression/expansion of the excitation component by interpolating new samples from input samples; and (iii) means for recombining the spectral component and the excitation component.
- A speech synthesis apparatus according to claim 4, in which the compression/expansion means is operable in dependence on pitch timing mark information corresponding at least approximately to instants of vocal excitation, to vary the degree of compression/expansion synchronously therewith such that the excitation signal is compressed/expanded less in the vicinity of the pitch timing marks than in the middle of the pitch period between two such consecutive pitch timing marks.
- A speech synthesis apparatus according to any one of claims 1 to 5, comprising: (a) a store containing items of data each defining a portion of a speech signal waveform and including pitch timing mark information corresponding at least approximately to the peak of vocal excitation; and (b) driver means responsive to input signals to produce addresses for reading items of data out of the store and to produce pitch signals representing context-dependent pitch changes, from which speech is generated.
- A speech synthesis apparatus according to any one of claims 1 to 6, in which the means for separating the speech signals into a spectral component and an excitation component comprises: (a) analysis means for receiving the synthesised speech and generating parameters for a filter whose frequency response resembles the spectral content of the speech, and for a filter producing an inverse response; and (b) a filter arranged to receive the parameters in order to filter the speech and produce a residual signal; these filters being provided in the recombining means; (c) an inverse filter arranged to receive the parameters and to filter the residual signal in accordance with the inverse response.
- A speech synthesis apparatus including means for controlling the pitch of an input signal by multiplying the signal by a series of overlapping windows to divide it into segments and recombining the segments after subjecting them to a time shift, the windows being synchronous with pitch timing marks representing instants of peaks of vocal excitation, wherein each window has a duration of less than twice the pitch period between the pitch timing marks and is asymmetric about the pitch timing mark.
- A speech synthesis apparatus according to claim 8, including: means for separating a speech signal into a spectral component and an excitation component, the pitch control means being arranged to receive the excitation component as its input signal; and means for recombining the spectral component and the excitation component whose pitch has been adjusted.
- A speech synthesis apparatus according to claim 8 or 9, in which each window has a duration of less than 1.7 times the pitch period between the pitch timing marks.
- A speech synthesis apparatus according to claim 10, in which each window has a duration in the range of 1.25 to 1.6 times the pitch period between the pitch timing marks.
- A speech synthesis apparatus according to claim 8 or 9, in which each window encompasses a complete period between two pitch marks.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG1996003308A SG43076A1 (en) | 1994-03-18 | 1994-03-18 | Speech synthesis |
EP95911420A EP0750778B1 (de) | 1994-03-18 | 1995-03-17 | Speech synthesis |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG1996003308A SG43076A1 (en) | 1994-03-18 | 1994-03-18 | Speech synthesis |
EP94301953 | 1994-03-18 | ||
PCT/GB1995/000588 WO1995026024A1 (en) | 1994-03-18 | 1995-03-17 | Speech synthesis |
EP95911420A EP0750778B1 (de) | 1994-03-18 | 1995-03-17 | Speech synthesis |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0750778A1 (de) | 1997-01-02 |
EP0750778B1 (de) | 2000-10-11 |
Family
ID=26136991
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP95911420A Expired - Lifetime EP0750778B1 (de) | 1994-03-18 | 1995-03-17 | Speech synthesis |
Country Status (10)
Country | Link |
---|---|
EP (1) | EP0750778B1 (de) |
JP (1) | JPH09510554A (de) |
CN (1) | CN1144008A (de) |
AU (1) | AU692238B2 (de) |
CA (1) | CA2185134C (de) |
DE (1) | DE69519086T2 (de) |
ES (1) | ES2152390T3 (de) |
NZ (1) | NZ282012A (de) |
SG (1) | SG43076A1 (de) |
WO (1) | WO1995026024A1 (de) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3093113B2 (ja) * | 1994-09-21 | 2000-10-03 | 日本アイ・ビー・エム株式会社 | Speech synthesis method and system |
WO1996016533A2 (en) * | 1994-11-25 | 1996-06-06 | Fink Fleming K | Method for transforming a speech signal using a pitch manipulator |
EP1019906B1 (de) * | 1997-01-27 | 2004-06-16 | Entropic Research Laboratory Inc. | A system and method for prosody adaptation |
CN104205213B (zh) * | 2012-03-23 | 2018-01-05 | 西门子公司 | Speech signal processing method and device, and hearing aid using the same |
JP6446993B2 (ja) * | 2014-10-20 | 2019-01-09 | ヤマハ株式会社 | Voice control device and program |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5163110A (en) * | 1990-08-13 | 1992-11-10 | First Byte | Pitch control in artificial speech |
-
1994
- 1994-03-18 SG SG1996003308A patent/SG43076A1/en unknown
-
1995
- 1995-03-17 NZ NZ282012A patent/NZ282012A/en not_active IP Right Cessation
- 1995-03-17 DE DE69519086T patent/DE69519086T2/de not_active Expired - Lifetime
- 1995-03-17 WO PCT/GB1995/000588 patent/WO1995026024A1/en active IP Right Grant
- 1995-03-17 CA CA002185134A patent/CA2185134C/en not_active Expired - Fee Related
- 1995-03-17 JP JP7524461A patent/JPH09510554A/ja not_active Ceased
- 1995-03-17 EP EP95911420A patent/EP0750778B1/de not_active Expired - Lifetime
- 1995-03-17 CN CN95192141A patent/CN1144008A/zh active Pending
- 1995-03-17 AU AU18995/95A patent/AU692238B2/en not_active Ceased
- 1995-03-17 ES ES95911420T patent/ES2152390T3/es not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
CA2185134A1 (en) | 1995-09-28 |
AU1899595A (en) | 1995-10-09 |
NZ282012A (en) | 1997-05-26 |
EP0750778A1 (de) | 1997-01-02 |
AU692238B2 (en) | 1998-06-04 |
CN1144008A (zh) | 1997-02-26 |
JPH09510554A (ja) | 1997-10-21 |
ES2152390T3 (es) | 2001-02-01 |
DE69519086T2 (de) | 2001-05-10 |
WO1995026024A1 (en) | 1995-09-28 |
SG43076A1 (en) | 1997-10-17 |
CA2185134C (en) | 2001-04-24 |
DE69519086D1 (de) | 2000-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5787398A (en) | Apparatus for synthesizing speech by varying pitch | |
Charpentier et al. | Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. | |
Moulines et al. | Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones | |
Moulines et al. | Non-parametric techniques for pitch-scale and time-scale modification of speech | |
Stylianou | Applying the harmonic plus noise model in concatenative speech synthesis | |
US8706496B2 (en) | Audio signal transforming by utilizing a computational cost function | |
Moulines et al. | Time-domain and frequency-domain techniques for prosodic modification of speech | |
Rao et al. | Prosody modification using instants of significant excitation | |
JP4641620B2 (ja) | Refinement of pitch detection | |
US8280724B2 (en) | Speech synthesis using complex spectral modeling | |
JPH03501896A (ja) | Processing device for speech synthesis by overlap-add of waveforms | |
EP0813184B1 (de) | Method for sound synthesis | |
Stylianou et al. | Diphone concatenation using a harmonic plus noise model of speech. | |
O'Brien et al. | Concatenative synthesis based on a harmonic model | |
KR100457414B1 (ko) | Speech synthesis method, speech synthesis apparatus, and recording medium | |
EP0750778B1 (de) | Speech synthesis | |
Bonada | Wide-band harmonic sinusoidal modeling | |
Bonada | High quality voice transformations based on modeling radiated voice pulses in frequency domain | |
Ferreira | An odd-DFT based approach to time-scale expansion of audio signals | |
EP1500080B1 (de) | Method for synthesising speech | |
Edgington et al. | Residual-based speech modification algorithms for text-to-speech synthesis | |
US5911170A (en) | Synthesis of acoustic waveforms based on parametric modeling | |
KR100417092B1 (ko) | Speech synthesis method | |
JP3089940B2 (ja) | Speech synthesis apparatus | |
Gigi et al. | A mixed-excitation vocoder based on exact analysis of harmonic components |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 19960905 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): BE CH DE DK ES FR GB IT LI NL SE |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
17Q | First examination report despatched | Effective date: 19990312 |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
RIC1 | Information provided on ipc code assigned before grant | Free format text: 7G 10L 13/02 A |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): BE CH DE DK ES FR GB IT LI NL SE |
REG | Reference to a national code | Ref country code: CH; Ref legal event code: EP |
REF | Corresponds to: | Ref document number: 69519086; Country of ref document: DE; Date of ref document: 20001116 |
ITF | It: translation for a ep patent filed | Owner name: JACOBACCI & PERANI S.P.A. |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DK; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20010111 |
ET | Fr: translation filed | |
REG | Reference to a national code | Ref country code: CH; Ref legal event code: NV; Representative's name: JACOBACCI & PERANI S.A. |
REG | Reference to a national code | Ref country code: ES; Ref legal event code: FG2A; Ref document number: 2152390; Country of ref document: ES; Kind code of ref document: T3 |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: IF02 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: CH; Payment date: 20030214; Year of fee payment: 9 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: BE; Payment date: 20030305; Year of fee payment: 9 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LI; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20040331. Ref country code: CH; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20040331. Ref country code: BE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20040331 |
BERE | Be: lapsed | Owner name: BRITISH *TELECOMMUNICATIONS P.L.C.; Effective date: 20040331 |
REG | Reference to a national code | Ref country code: CH; Ref legal event code: PL |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: ES; Payment date: 20080325; Year of fee payment: 14 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: SE; Payment date: 20080218; Year of fee payment: 14. Ref country code: NL; Payment date: 20080214; Year of fee payment: 14. Ref country code: IT; Payment date: 20080216; Year of fee payment: 14 |
EUG | Se: european patent has lapsed | |
NLV4 | Nl: lapsed or anulled due to non-payment of the annual fee | Effective date: 20091001 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: NL; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20091001 |
REG | Reference to a national code | Ref country code: ES; Ref legal event code: FD2A; Effective date: 20090318 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: ES; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20090318 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20090317 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20090318 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR; Payment date: 20120403; Year of fee payment: 18 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 20120323; Year of fee payment: 18 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST; Effective date: 20131129 |
REG | Reference to a national code | Ref country code: DE; Ref legal event code: R119; Ref document number: 69519086; Country of ref document: DE; Effective date: 20131001 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20131001. Ref country code: FR; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20130402 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 20140319; Year of fee payment: 20 |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: PE20; Expiry date: 20150316 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION; Effective date: 20150316 |