EP0076233B1 - Procédé et dispositif pour traitement digital de la parole réduisant la redondance - Google Patents
- Publication number
- EP0076233B1 EP0076233B1 EP82810390A EP82810390A EP0076233B1 EP 0076233 B1 EP0076233 B1 EP 0076233B1 EP 82810390 A EP82810390 A EP 82810390A EP 82810390 A EP82810390 A EP 82810390A EP 0076233 B1 EP0076233 B1 EP 0076233B1
- Authority
- EP
- European Patent Office
- Prior art keywords
- speech
- process according
- energy
- decision
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
Definitions
- the invention relates to a redundancy-reducing digital speech processing method that works according to the linear prediction method and to a corresponding device according to the preamble of claim 1 and claim 33.
- LPC vocoders are not yet fully satisfactory. Although the speech resynthesized after analysis is usually still reasonably intelligible, it is distorted and sounds artificial. A main cause lies above all in the difficulty of deciding with sufficient certainty whether a speech section is voiced or unvoiced. Other causes include inaccurate determination of the pitch period and of the sound-forming filter parameters.
- the present invention is primarily concerned with the first of these difficulties and aims to improve a digital speech processing method or system of the type defined at the outset in such a way that it makes more accurate and more reliable voiced-unvoiced decisions and thus leads to an improvement in the quality of the synthesized speech.
- a number of decision criteria are known for the voiced-unvoiced classification, which are used individually or partly in combination. Common criteria are, e.g., the energy of the speech signal, the number of its zero crossings within a certain time period, the normalized residual error energy, i.e. the ratio of the energy of the prediction error signal to that of the speech signal, and the level of the second maximum of the autocorrelation function of the speech signal or of the prediction error signal. Furthermore, it is also common to carry out a cross-comparison with one or more neighboring speech sections. A clear and comparative presentation of the most important classification criteria and methods can be found, for example, in the publication by L. R. Rabiner et al.
- a common feature of all of these known methods and criteria is that two-sided decisions are always made: the speech section is definitely assigned to one or the other of the two options, depending on whether the criterion or criteria in question are met or not. With a suitable selection and, if necessary, a combination of the decision criteria, a relatively high degree of accuracy can be achieved in this way; however, as practice shows, wrong decisions still occur relatively often and significantly degrade the quality of the synthesized speech.
- a main reason for this lies in the fact that, in spite of all their redundancy, speech signals generally have a non-stationary character, owing to which it is simply not possible to set the decision thresholds used in the respective criteria in such a way that a reliable statement can be made in both directions. A certain degree of uncertainty always remains and must be accepted.
- the invention now departs from this previously used principle of two-sided decisions and instead uses a strategy in which only one-sided, but practically absolutely safe, decisions are made.
- a speech section is only definitely classified as voiced or unvoiced if a certain criterion is met. If the criterion is not met, however, the speech section is not immediately judged to be unvoiced or voiced, but is subjected to a further classification criterion. This in turn makes a safe decision in one direction only if the relevant criterion is met; otherwise the decision procedure continues in an analogous manner. This goes on until a safe classification is possible. Extensive studies have shown that, with a suitable selection and ordering of the criteria, usually at most about six to seven decision steps are required.
- decisive for the degree of certainty of the individual decisions are the thresholds applied in the respective criteria. The more extreme these decision thresholds are set, the more selective the criteria and the safer the decisions. However, with increasing selectivity of the individual criteria, the maximum number of necessary decision operations also increases. In practice, however, it is easily possible to set the thresholds in such a way that practically absolute (one-sided) decision certainty is achieved without the total number of criteria or decision operations rising above the level stated above.
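The one-sided decision strategy described above can be sketched as a simple cascade. This is an illustrative skeleton, not the patent's implementation: the function and verdict names are assumptions, and the stand-in tests merely show the control flow (each criterion either makes a safe verdict or abstains).

```python
def classify_frame(tests):
    """One-sided decision cascade: each test returns 'voiced', 'unvoiced',
    or None (undecided); the first definite verdict ends the cascade.

    `tests` is an ordered list of zero-argument callables."""
    for test in tests:
        verdict = test()
        if verdict is not None:  # one-sided: only a *safe* decision stops here
            return verdict
    # fallback if no criterion fired (in practice the last test always decides)
    return "unvoiced"

# Illustrative use with constant stand-in criteria:
result = classify_frame([
    lambda: None,      # e.g. energy below EL? here: no safe verdict
    lambda: None,      # e.g. too many zero crossings? here: no safe verdict
    lambda: "voiced",  # e.g. autocorrelation maximum above RU -> voiced
])
```

The point of the ordering is that every criterion only ever fires in the direction in which it is practically certain, so a wrong two-sided cut never occurs.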
- the analog speech signal originating from some source, e.g. a microphone 1, is band-limited in a filter 2 and then sampled and digitized in an A/D converter 3.
- the sampling rate is about 6 to 16 kHz, preferably about 8 kHz.
- the resolution is about 8 to 12 bit.
- the pass band of the filter 2 usually extends from approximately 80 Hz to approximately 3.1-3.4 kHz in the case of so-called broadband speech, and from approximately 300 Hz to 3.1-3.4 kHz in the case of telephone speech.
- the digital speech signal s_n is divided into successive, preferably overlapping speech sections, so-called frames.
- the speech section length can be approximately 10 to 30 msec, preferably approximately 20 msec.
- the frame rate, i.e. the number of frames per second, is approximately 30 to 100, preferably approximately 45 to 70.
- speech sections as short as possible and correspondingly high frame rates are desirable; however, this is opposed on the one hand by the limited performance of the computer used for real-time processing and on the other hand by the demand for the lowest possible bit rates during transmission.
- the analysis is essentially divided into two main procedures: firstly, the calculation of the amplification factor or volume parameter and of the coefficients or filter parameters of the underlying vocal tract model filter, and secondly the voiced-unvoiced decision and the determination of the pitch period in the voiced case.
- the filter coefficients are obtained in a parameter calculator 4 by solving the system of equations which results when the energy of the prediction error, i.e. the energy of the difference between the actual samples and the samples estimated on the basis of the model assumption in the interval under consideration (speech section), is minimized as a function of the coefficients.
- the system of equations is preferably solved using the autocorrelation method with an algorithm according to Durbin (cf., for example, L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals", Prentice-Hall Inc., Englewood Cliffs, NJ 1978, pp. 411-413).
- the so-called reflection coefficients (k_j) also result; these are transforms of the filter coefficients (a_j) that are less sensitive to quantization.
- the magnitude of the reflection coefficients is always less than 1 for stable filters and, moreover, decreases with increasing index. Because of these advantages, the reflection coefficients (k_j) are preferably transmitted instead of the filter coefficients (a_j).
- the volume parameter G results from the algorithm as a by-product.
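The Durbin (Levinson-Durbin) recursion referred to above can be sketched in a few lines. This is a minimal pure-Python version under the standard formulation, not the patent's own code: it takes the autocorrelation values r[0..p], returns the prediction coefficients, the reflection coefficients k_j (which fall out of the recursion step by step, with |k_j| < 1 for stable filters), and the final prediction-error energy, from which the volume parameter can be derived.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.
    r: autocorrelation values r[0..order].
    Returns (a, k, E): predictor coefficients a_1..a_p, reflection
    coefficients k_1..k_p, and the residual (prediction error) energy E."""
    a = [0.0] * (order + 1)
    k_list = []
    E = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / E                 # i-th reflection coefficient
        k_list.append(k)
        a_prev = a[:]
        a[i] = k
        for j in range(1, i):       # update lower-order coefficients
            a[j] = a_prev[j] - k * a_prev[i - j]
        E *= (1.0 - k * k)          # error energy shrinks each step
    return a[1:], k_list, E

# AR(1) example: r[m] = 0.5**m, so the true predictor is a_1 = 0.5:
coeffs, refl, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```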
- the digital speech signal s_n is first temporarily stored in a buffer 5 until the filter parameters (a_j) have been calculated.
- the signal then passes through an inverse filter 6 set with the parameters (a_j), whose transfer function is the inverse of that of the vocal tract model filter.
- the result of this inverse filtering is a prediction error signal e_n, which resembles the excitation signal x_n multiplied by the gain factor G.
- This prediction error signal e_n is fed, directly in the case of telephone speech or via a low-pass filter 7 in the case of broadband speech, to an autocorrelation stage 8, which forms the autocorrelation function (ACF) normalized to the zero-order autocorrelation maximum. From this, the pitch period p is determined in a pitch extraction stage 9 in a known manner as the distance between the second autocorrelation maximum RXX and the first (zero-order) maximum; an adaptive search method is preferably used.
- the low-pass filter 7 will be explained further below. At this point it should only be mentioned that it can be bypassed for telephone speech by means of a switch 10 and could also be arranged in front of the inverse filter 6.
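The normalized ACF and the pick-off of the pitch period as the lag of the second maximum can be sketched as follows. The search bounds are illustrative assumptions (lags 20-160 samples correspond to roughly 400-50 Hz at an 8 kHz sampling rate), not values from the text, and a plain maximum search stands in for the adaptive method mentioned above.

```python
def normalized_acf(e, max_lag):
    """Autocorrelation of the residual e, normalized to the zero-lag value."""
    r0 = sum(x * x for x in e)
    return [sum(e[n] * e[n - lag] for n in range(lag, len(e))) / r0
            for lag in range(max_lag + 1)]

def pitch_from_acf(acf, lag_min=20, lag_max=160):
    """Pitch period as the lag IP of the largest normalized ACF value RXX
    in a plausible range (assumed bounds: ~50-400 Hz at 8 kHz)."""
    lags = range(lag_min, min(lag_max + 1, len(acf)))
    ip = max(lags, key=lambda lag: acf[lag])
    return ip, acf[ip]

# A perfectly periodic residual (impulse train, period 40 samples):
residual = [1.0 if n % 40 == 0 else 0.0 for n in range(320)]
acf = normalized_acf(residual, 160)
ip, rxx = pitch_from_acf(acf)  # second maximum sits at lag 40
```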
- the speech section under consideration is classified as voiced or unvoiced according to the decision procedure of the invention, explained in more detail below, in a decision stage 11, which is supported by an energy determination stage 12 and a zero-crossing determination stage 13.
- in the unvoiced case, the pitch parameter p is set to zero.
- the parameter calculator described above determines a set of filter parameters for each speech section (frame).
- the filter parameters could also be determined differently, for example continuously by means of adaptive inverse filtering or another known method, the filter parameters being readjusted continuously with each sampling cycle but only being provided for further processing or transmission at the times determined by the frame rate.
- the invention is in no way restricted in this regard. It is only essential that a set of filter parameters exists for each speech section.
- the recovery or synthesis of the speech signal from the parameters takes place in a known manner: the parameters, first decoded in a decoder 15, are fed to a pulse-noise generator 16, an amplifier 17 and a vocal tract model filter 18; the output signal of the model filter 18 is brought into analog form by means of a D/A converter 19 and then, after the usual filtering 20, is made audible by a playback device, e.g. a loudspeaker 21.
- the volume parameter G controls the amplification factor of the amplifier 17; the filter parameters (k_j) define the transfer function of the sound-forming or vocal tract model filter 18.
- An example of such a system is shown in Fig. 2 as a block diagram.
- the multi-processor system shown essentially comprises four functional blocks, namely a main processor 50, two secondary processors 60 and 70 and an input / output unit 80. It implements both analysis and synthesis.
- the input / output unit 80 contains the stages designated 81 for analog signal processing, such as amplifiers, filters and automatic gain control, as well as the A / D converter and the D / A converter.
- the main processor 50 carries out the actual speech analysis or synthesis. On the analysis side this comprises the determination of the filter parameters and the volume parameter (parameter calculator 4), the determination of the energy and zero crossings of the speech signal (stages 12 and 13), the voiced-unvoiced decision (stage 11) and the determination of the pitch period (stage 9); on the synthesis side it comprises the generation of the excitation signal (stage 16), its volume variation (stage 17) and its filtering in the speech model filter (filter 18).
- the main processor 50 is supported by the secondary processor 60, which carries out the intermediate storage (buffer 5), inverse filtering (stage 6), optionally the low-pass filtering (stage 7) and the autocorrelation (stage 8).
- the secondary processor 70, finally, deals exclusively with the coding and decoding of the speech parameters and with the data traffic with, e.g., …
- the voiced-unvoiced decision-making procedure is explained in more detail below.
- a longer analysis interval is preferably used as a basis for the voiced-unvoiced decision and the determination of the pitch period than for the determination of the filter coefficients.
- for the voiced-unvoiced decision, the analysis interval is the same as the speech section under consideration; for pitch extraction, on the other hand, the analysis interval extends on both sides of the speech section into the respectively adjacent speech section, for example up to about half of it. In this way, a more reliable and less erratic pitch extraction can be carried out.
- where the energy of a signal is referred to in the following, this always means the relative energy of the signal in the analysis interval, that is to say normalized to the dynamic range of the A/D converter 3.
- FIGS. 3 and 4 show the flow diagrams of two particularly expedient decision processes according to the invention: FIG. 3 shows a variant for broadband speech and FIG. 4 one for telephone speech.
- an energy test is carried out as the first decision criterion.
- the (relative, normalized) energy Es of the speech signal s_n is compared with a minimum energy threshold EL, which is set so low that the speech section can certainly be called unvoiced if the energy Es does not lie above this threshold.
- Practical values for this minimum energy threshold EL are 1.1·10⁻⁴ to 1.4·10⁻⁴, preferably about 1.2·10⁻⁴. These values apply in the event that all digital sample values are represented in the standard format (range ±1). For other signal formats, the values must be multiplied by the corresponding factors.
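The first criterion can be sketched as follows. One detail is an assumption on my part: the text does not specify whether Es is a sum or a mean of squared samples, so the sketch uses the mean square of full-scale-normalized samples, which is consistent with the stated "relative energy".

```python
def relative_energy(samples):
    """Mean-square energy of the frame, for samples already normalized
    to the A/D full scale, i.e. in the range [-1, 1) (assumed reading
    of 'relative energy')."""
    return sum(x * x for x in samples) / len(samples)

def safely_unvoiced_by_energy(samples, el=1.2e-4):
    """First criterion: energy at or below EL (~1.2e-4 per the text)
    safely marks the frame unvoiced; otherwise no verdict is made."""
    return relative_energy(samples) <= el
```

A near-silent frame trips the test; any frame with ordinary speech amplitude does not, so the decision is one-sided as required.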
- the next criterion is a zero-crossing test.
- the number of zero crossings of the digital speech signal in the analysis interval is determined and compared with a maximum number ZCU. If the number is greater than this maximum number, the speech section is definitely rated as unvoiced; otherwise a further decision criterion is applied.
- the maximum number ZCU is approximately 105 to 120, preferably approximately 110 zero crossings for an analysis interval length of 256 samples.
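The zero-crossing criterion can be sketched directly; counting sign changes between consecutive samples is one common convention (an assumption here, since the text does not fix the exact counting rule), with ZCU = 110 for a 256-sample analysis interval as stated above.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    return sum(1 for prev, cur in zip(samples, samples[1:])
               if (prev >= 0) != (cur >= 0))

def safely_unvoiced_by_zc(samples, zcu=110):
    """Second criterion: more than ZCU (~110 per 256 samples) zero
    crossings safely marks the frame unvoiced; otherwise no verdict."""
    return zero_crossings(samples) > zcu

# A noise-like alternating frame crosses zero at every sample:
noisy = [(-1.0) ** n for n in range(256)]
slow = [1.0] * 256
```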
- next, the normalized autocorrelation function (ACF) of the low-pass filtered prediction error signal e_n is used: the normalized autocorrelation maximum RXX, which lies at the distance from the zero-order maximum identified by the index IP, is compared with a threshold value RU, and the speech section is rated as voiced if this threshold value is exceeded; otherwise the procedure moves on to the next criterion.
- Practically favorable values for the threshold value are 0.55 to 0.75, preferably about 0.6.
- next, the ratio of the energy of the low-pass filtered prediction error signal e_n to the energy of the speech signal is examined. If this energy ratio V₀ is smaller than a first, lower ratio threshold VL, the speech section is rated as voiced. Otherwise there is a further comparison with a second, higher ratio threshold VU, the decision being unvoiced if the energy ratio V₀ lies above this higher threshold VU. This second comparison may also be omitted.
- Suitable values for the two ratio thresholds VL and VU are 0.05 to 0.15 and 0.6 to 0.75, preferably about 0.1 and 0.7.
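This three-way test is the first step in the cascade that can decide in either direction, while still abstaining in the middle band. A sketch with the thresholds from the text (the function name and the reading of V₀ as residual energy over speech energy are assumptions consistent with the normalized-residual-error criterion mentioned earlier):

```python
def residual_ratio_test(residual_energy, speech_energy, vl=0.1, vu=0.7):
    """V0 = energy of low-pass filtered residual / energy of speech signal.
    Below VL (~0.1): safely voiced (the predictor removed almost everything,
    as for strongly correlated voiced speech). Above VU (~0.7): safely
    unvoiced. In between: no verdict, fall through to the next criterion."""
    v0 = residual_energy / speech_energy
    if v0 < vl:
        return "voiced"
    if v0 > vu:
        return "unvoiced"
    return None
```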
- the next decision criterion is yet another energy test, whereby the energy Es of the speech signal is compared with a second, higher minimum energy threshold EU; this time the decision is voiced if the energy Es of the speech signal exceeds this threshold EU.
- Practical values for this higher minimum energy threshold EU are 1.3·10⁻³ to 1.8·10⁻³, preferably about 1.5·10⁻³.
- the autocorrelation maximum RXX is first compared with a second, lower threshold value RM. If this threshold is exceeded, the decision is voiced. Otherwise, a cross-comparison with the two (possibly also only one) immediately preceding speech sections is carried out as the last criterion: the speech section is rated as unvoiced only if one of the (or the single) preceding speech sections was also unvoiced; otherwise the final decision is voiced. Suitable values for the threshold value RM are 0.35 to 0.45, preferably approximately 0.42.
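The final step of the cascade can be sketched as follows. The fallback to "voiced" when neither RXX nor the cross-comparison yields unvoiced is my reading of the garbled source sentence, so treat it as an assumption; the RM value is from the text.

```python
def final_criterion(rxx, prev_verdicts, rm=0.42):
    """Last step of the cascade: RXX above the lower threshold RM (~0.42)
    decides voiced. Otherwise a cross-comparison with the preceding one or
    two frames is made: unvoiced only if a preceding frame was unvoiced,
    else voiced (assumed reading of the final fallback)."""
    if rxx > rm:
        return "voiced"
    if "unvoiced" in prev_verdicts:
        return "unvoiced"
    return "voiced"
```

Unlike the earlier criteria, this one always returns a verdict, so the cascade terminates here at the latest.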
- in the case of broadband speech, the prediction error signal e_n is low-pass filtered.
- This low-pass filtering causes the frequency distributions of the autocorrelation maximum values of unvoiced and voiced speech sections to separate, and thus makes it easier to set the decision threshold while at the same time reducing the error frequency. It also enables better pitch extraction, i.e. determination of the pitch period.
- An essential condition for this, however, is that the low-pass filtering is carried out with an extremely steep slope of approximately 150 to 180 dB/octave.
- the (digital) filter used should have an elliptical characteristic, the cut-off frequency should be in the range of 700-1200 Hz, preferably 800 to 900 Hz.
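For scale, the specified slope can be translated into a filter order. A Butterworth low-pass rolls off asymptotically at about 6 dB/octave per order, so 150-180 dB/octave would demand roughly a 25th- to 30th-order Butterworth; an elliptic characteristic, as called for here, reaches a comparably steep transition at much lower order, which is presumably why it is preferred. A small sanity-check helper (illustrative, not part of the patent):

```python
import math

def min_butterworth_order(slope_db_per_octave):
    """Minimum Butterworth order whose asymptotic roll-off (6 dB/octave
    per order) meets the given slope; an upper bound for comparison,
    since an elliptic design achieves the same steepness at lower order."""
    return math.ceil(slope_db_per_octave / 6.0)

order_low = min_butterworth_order(150)   # lower end of the stated slope
order_high = min_butterworth_order(180)  # upper end
```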
- the decision process for telephone speech shown in FIG. 4 largely corresponds to that for broadband speech. Only the sequence of the second energy test and the second zero-crossing test is reversed (not mandatory), and the second test of the autocorrelation maximum RXX is omitted, since it would not work for telephone speech.
- the individual decision thresholds differ in part, owing to the differences between telephone speech and broadband speech. Practical values are shown in the table below.
Claims (34)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT82810390T ATE15563T1 (de) | 1981-09-24 | 1982-09-20 | Verfahren und vorrichtung zur redundanzvermindernden digitalen sprachverarbeitung. |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CH6167/81 | 1981-09-24 | ||
CH616781 | 1981-09-24 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0076233A1 EP0076233A1 (fr) | 1983-04-06 |
EP0076233B1 true EP0076233B1 (fr) | 1985-09-11 |
Family
ID=4305323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP82810390A Expired EP0076233B1 (fr) | 1981-09-24 | 1982-09-20 | Procédé et dispositif pour traitement digital de la parole réduisant la redondance |
Country Status (6)
Country | Link |
---|---|
US (1) | US4589131A (fr) |
EP (1) | EP0076233B1 (fr) |
JP (1) | JPS5870299A (fr) |
AT (1) | ATE15563T1 (fr) |
CA (1) | CA1184657A (fr) |
DE (1) | DE3266204D1 (fr) |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8400728A (nl) * | 1984-03-07 | 1985-10-01 | Philips Nv | Digitale spraakcoder met basisband residucodering. |
US5208861A (en) * | 1988-06-16 | 1993-05-04 | Yamaha Corporation | Pitch extraction apparatus for an acoustic signal waveform |
US4972474A (en) * | 1989-05-01 | 1990-11-20 | Cylink Corporation | Integer encryptor |
IT1229725B (it) * | 1989-05-15 | 1991-09-07 | Face Standard Ind | Metodo e disposizione strutturale per la differenziazione tra elementi sonori e sordi del parlato |
US5680508A (en) * | 1991-05-03 | 1997-10-21 | Itt Corporation | Enhancement of speech coding in background noise for low-rate speech coder |
US5280525A (en) * | 1991-09-27 | 1994-01-18 | At&T Bell Laboratories | Adaptive frequency dependent compensation for telecommunications channels |
US5361379A (en) * | 1991-10-03 | 1994-11-01 | Rockwell International Corporation | Soft-decision classifier |
FR2684226B1 (fr) * | 1991-11-22 | 1993-12-24 | Thomson Csf | Procede et dispositif de decision de voisement pour vocodeur a tres faible debit. |
JP2746033B2 (ja) * | 1992-12-24 | 1998-04-28 | 日本電気株式会社 | 音声復号化装置 |
US5471527A (en) | 1993-12-02 | 1995-11-28 | Dsc Communications Corporation | Voice enhancement system and method |
TW271524B (fr) * | 1994-08-05 | 1996-03-01 | Qualcomm Inc | |
US5970441A (en) * | 1997-08-25 | 1999-10-19 | Telefonaktiebolaget Lm Ericsson | Detection of periodicity information from an audio signal |
US6381570B2 (en) * | 1999-02-12 | 2002-04-30 | Telogy Networks, Inc. | Adaptive two-threshold method for discriminating noise from speech in a communication signal |
US6980950B1 (en) * | 1999-10-22 | 2005-12-27 | Texas Instruments Incorporated | Automatic utterance detector with high noise immunity |
GB2357683A (en) * | 1999-12-24 | 2001-06-27 | Nokia Mobile Phones Ltd | Voiced/unvoiced determination for speech coding |
KR101008022B1 (ko) * | 2004-02-10 | 2011-01-14 | 삼성전자주식회사 | 유성음 및 무성음 검출방법 및 장치 |
US8694308B2 (en) * | 2007-11-27 | 2014-04-08 | Nec Corporation | System, method and program for voice detection |
DE102008042579B4 (de) * | 2008-10-02 | 2020-07-23 | Robert Bosch Gmbh | Verfahren zur Fehlerverdeckung bei fehlerhafter Übertragung von Sprachdaten |
CN101859568B (zh) * | 2009-04-10 | 2012-05-30 | 比亚迪股份有限公司 | 一种语音背景噪声的消除方法和装置 |
US9454976B2 (en) | 2013-10-14 | 2016-09-27 | Zanavox | Efficient discrimination of voiced and unvoiced sounds |
CN112885380B (zh) * | 2021-01-26 | 2024-06-14 | 腾讯音乐娱乐科技(深圳)有限公司 | 一种清浊音检测方法、装置、设备及介质 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US2908761A (en) * | 1954-10-20 | 1959-10-13 | Bell Telephone Labor Inc | Voice pitch determination |
US3102928A (en) * | 1960-12-23 | 1963-09-03 | Bell Telephone Labor Inc | Vocoder excitation generator |
US3083266A (en) * | 1961-02-28 | 1963-03-26 | Bell Telephone Labor Inc | Vocoder apparatus |
US4004096A (en) * | 1975-02-18 | 1977-01-18 | The United States Of America As Represented By The Secretary Of The Army | Process for extracting pitch information |
US4074069A (en) * | 1975-06-18 | 1978-02-14 | Nippon Telegraph & Telephone Public Corporation | Method and apparatus for judging voiced and unvoiced conditions of speech signal |
US4281218A (en) * | 1979-10-26 | 1981-07-28 | Bell Telephone Laboratories, Incorporated | Speech-nonspeech detector-classifier |
-
1982
- 1982-09-20 AT AT82810390T patent/ATE15563T1/de not_active IP Right Cessation
- 1982-09-20 DE DE8282810390T patent/DE3266204D1/de not_active Expired
- 1982-09-20 EP EP82810390A patent/EP0076233B1/fr not_active Expired
- 1982-09-22 CA CA000411900A patent/CA1184657A/fr not_active Expired
- 1982-09-23 US US06/421,883 patent/US4589131A/en not_active Expired - Fee Related
- 1982-09-24 JP JP57165153A patent/JPS5870299A/ja active Pending
Also Published As
Publication number | Publication date |
---|---|
DE3266204D1 (en) | 1985-10-17 |
CA1184657A (fr) | 1985-03-26 |
EP0076233A1 (fr) | 1983-04-06 |
ATE15563T1 (de) | 1985-09-15 |
US4589131A (en) | 1986-05-13 |
JPS5870299A (ja) | 1983-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0076233B1 (fr) | Procédé et dispositif pour traitement digital de la parole réduisant la redondance | |
EP0076234B1 (fr) | Procédé et dispositif pour traitement digital de la parole réduisant la redondance | |
DE69926851T2 (de) | Verfahren und Vorrichtung zur Sprachaktivitätsdetektion | |
DE69938374T2 (de) | Verfahren und Vorrichtung zur Spracherkennung mittels sowohl eines neuralen Netzwerks als auch verborgener Markov-Modelle | |
DE3244476C2 (fr) | ||
DE69726235T2 (de) | Verfahren und Vorrichtung zur Spracherkennung | |
DE69816177T2 (de) | Sprache/Pausen-Unterscheidung mittels ungeführter Adaption von Hidden-Markov-Modellen | |
DE60023517T2 (de) | Klassifizierung von schallquellen | |
DE69830017T2 (de) | Verfahren und Vorrichtung zur Spracherkennung | |
EP1386307B2 (fr) | Procede et dispositif pour determiner un niveau de qualite d'un signal audio | |
DE102007001255A1 (de) | Tonsignalverarbeitungsverfahren und -vorrichtung und Computerprogramm | |
DE2326517A1 (de) | Verfahren und schaltungsanordnung zum erkennen von gesprochenen woertern | |
EP0815553B1 (fr) | Methode de detection d'une pause de signal entre deux modeles presents dans un signal de mesure variable en temps | |
DE69918635T2 (de) | Vorrichtung und Verfahren zur Sprachverarbeitung | |
DE19942178C1 (de) | Verfahren zum Aufbereiten einer Datenbank für die automatische Sprachverarbeitung | |
DE3043516C2 (de) | Verfahren und Vorrichtung zur Spracherkennung | |
EP3291234B1 (fr) | Procede d'estimation de la qualite de la voix d'une personne qui parle | |
DE3733659C2 (fr) | ||
DE2636032B2 (de) | Elektrische Schaltungsanordnung zum Extrahieren der Grundschwingungsperiode aus einem Sprachsignal | |
DE69922769T2 (de) | Vorrichtung und Verfahren zur Sprachverarbeitung | |
DE60018690T2 (de) | Verfahren und Vorrichtung zur Stimmhaft-/Stimmlos-Entscheidung | |
DE19920501A1 (de) | Wiedergabeverfahren für sprachgesteuerte Systeme mit textbasierter Sprachsynthese | |
DE60110541T2 (de) | Verfahren zur Spracherkennung mit geräuschabhängiger Normalisierung der Varianz | |
EP0803861B1 (fr) | Procédé d'extraction de caractéristiques d'un signal de parole | |
WO2005069278A1 (fr) | Procede et dispositif pour traiter un signal vocal pour la reconnaissance vocale robuste |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 19820922 |
|
AK | Designated contracting states |
Designated state(s): AT CH DE FR GB IT LI NL SE |
|
ITF | It: translation for a ep patent filed |
Owner name: SOCIETA' ITALIANA BREVETTI S.P.A. |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Designated state(s): AT CH DE FR GB IT LI NL SE |
|
REF | Corresponds to: |
Ref document number: 15563 Country of ref document: AT Date of ref document: 19850915 Kind code of ref document: T |
|
REF | Corresponds to: |
Ref document number: 3266204 Country of ref document: DE Date of ref document: 19851017 |
|
ET | Fr: translation filed | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: AT Payment date: 19860825 Year of fee payment: 5 |
|
26N | No opposition filed | ||
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: NL Payment date: 19870930 Year of fee payment: 6 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PUE Owner name: OMNISEC AG |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: TP |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: 732 |
|
ITPR | It: changes in ownership of a european patent |
Owner name: CESSIONE;OMNISEC AG |
|
NLS | Nl: assignments of ep-patents |
Owner name: OMNISEC AG TE REGENSDORF, ZWITSERLAND. |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Effective date: 19880920 Ref country code: AT Effective date: 19880920 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LI Effective date: 19880930 Ref country code: CH Effective date: 19880930 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Effective date: 19890401 |
|
NLV4 | Nl: lapsed or anulled due to non-payment of the annual fee | ||
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 19890531 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee | ||
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Effective date: 19890601 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: SE Payment date: 19890921 Year of fee payment: 8 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SE Effective date: 19900921 |
|
EUG | Se: european patent has lapsed |
Ref document number: 82810390.3 Effective date: 19910527 |