EP0110467B1 - Arrangement for detecting pauses in speech signals - Google Patents
Arrangement for detecting pauses in speech signals
- Publication number
- EP0110467B1 (application EP83201638A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- value
- short
- arrangement
- estimate
- time mean
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/87—Detection of discrete points within a voice signal
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
- G10L2025/786—Adaptive threshold
Definitions
- The invention relates to an arrangement for recognizing speech pauses in a speech signal, according to the preamble of patent claim 1.
- Such arrangements are, for example, a prerequisite for suppressing interference signals when calling from an acoustically disturbed environment.
- Characteristic parameters of the interference signal are measured and used to filter the interference out of the signal to be transmitted as completely as possible by means of adaptive filters.
- A circuit arrangement for recognizing speech pauses in a speech signal is known in which a short-term mean value is determined at certain clock instants.
- The circuit arrangement known therefrom has a fixed threshold and two adaptively tracked thresholds; the sign of the respective slope of the speech signal is used in tracking the thresholds.
- The adaptive noise thresholds are changed by constant amounts, so they are not determined as a function of their own values at previous clock instants.
- Such a circuit arrangement is preferably used for recognizing speech pauses in a speech signal on which only weak interference signals are superimposed.
- This pause detection does not take into account, among other things, that unvoiced sounds, for example, lead to a drop in power in the speech signal, so that the speech sections in question are incorrectly regarded as speech pauses. Such errors occur in the known arrangement all the more often, the more strongly the speech signal is overlaid with interference signals.
- The arrangement is also intended to enable speech pause recognition even if the average noise level changes only slowly.
- At the sampling times kT0, an analog-to-digital converter A/D obtains samples x(k) from the disturbed speech signal applied to terminal E, where k is a natural number and 1/T0 is the sampling frequency.
- The samples are passed on to an averager M.
- At all clock instants T(n), spaced at the time interval mT0, the averager M generates a so-called short-term mean value G(n) from the absolute values of m consecutive samples.
- The arithmetic mean of the absolute values of the samples is used as the mean value, since it takes less circuitry than forming, for example, the quadratic mean.
- Each short-term mean value G(n) is approximately a measure of the average power of the disturbed speech signal over a period of approximately 100 ms. This specification and the sampling frequency together determine the number m of samples required for one short-term mean value G(n). If, for example, the disturbed speech signal is sampled at 10 kHz, m must be about 1000. Each of the quantities G(1), G(2), ... thus results from approximately a thousand consecutive samples.
- The unit GL of FIG. 1 smooths the sequence of short-term mean values G(n). The purpose and manner of the smoothing are described further below.
- Block PA of FIG. 1 converts the short-term mean values into an estimated value P(n) for the average noise power, i.e. for the average power of the interference signal. More details about the estimate P(n) are also given below.
- A comparator V in FIG. 1 compares a threshold S, which depends on the estimated value P(n), with the smoothed short-term mean values GG(n). If the smoothed short-term mean value GG(n) is less than the threshold S, a signal is forwarded to a unit EN. If the unit EN has received such a signal at, for example, two consecutive clock instants T(n-1) and T(n), it in turn indicates the presence of a speech pause by a signal of its own at terminal A.
- Diagram a) of FIG. 2 shows a possible output signal AM of the averager M, i.e. a possible sequence of the short-term mean values G(1), G(2), ...
- The output signal AM is normalized so that its absolute maximum assumes the value 1.
- The amplitude thresholds entered are the estimated value P(n) (lower threshold, shown in broken lines) and the threshold S (upper threshold, solid line).
- Diagram b) schematically shows the associated speech signal S with its true pauses P. If pauses were determined simply from the signal falling below the upper amplitude threshold in diagram a) (this pause determination is shown in diagram c)), a large number of wrong decisions would result, as a comparison of diagrams b) and c) shows.
- Shifting the upper threshold downward would mean that the power drops in diagram c) that are not due to speech pauses would no longer be indicated, but the statement about the length of the pauses would then be significantly falsified.
- The output signal AM is therefore smoothed before the pause decision, either with a linear digital filter, which obtains a value GG(n) of the smoothed signal from three successive short-term mean values G(n), G(n-1) and G(n-2), or with a median filter.
- FIG. 3 shows how the output signal of the averager M looks after smoothing with a linear digital filter.
- In diagram b) the true speech sections and the true pauses of the speech signal are again plotted, and diagram c) shows the speech sections and speech pauses as they result analogously to diagram c) in FIG. 2. Due to the linear smoothing, the number of wrong decisions has decreased considerably, as the comparison of FIGS. 2 and 3 shows. Smoothing with a median filter also reduces the number of incorrect decisions, as can be seen from diagram c) in FIG. 4.
- A drop in power is regarded as a speech pause only if the signal falls below the upper amplitude threshold twice in FIG. 2, 3 or 4.
- The amplitude thresholds shown in FIGS. 2, 3 and 4 are, as already indicated above, determined by the unit PA in FIG. 1: the estimated value P(n) of the noise power is first determined for each clock instant T(n).
- This quantity is intended to be an approximate measure of the average power of the interference signal, the averaging time being of the order of one second.
- The arrangement according to the invention still delivers good results if the above-mentioned average power of the interference signal changes only slowly, i.e. if it can be regarded as stationary over time intervals of one or two seconds.
- The estimated value P(n) is redefined, according to the equation, as a linear combination of the previous estimated value P(n-1) and the short-term mean value G(n).
- The value of the constant a appearing in this equation lies between zero and one.
- a threshold D in magnitude. If, for example, the inequality is satisfied K times in succession, this is taken to indicate a longer speech pause, and the new estimated value P(n) is determined according to the equation given above.
- The threshold D is chosen proportional to the short-term mean value G(n), so that the same decisions are reached if, for example, the levels of all signals were doubled.
- The proportionality factor y and the number K are to be determined experimentally such that the arrangement makes as few incorrect decisions as possible. Typical values are
- The constant c is to be chosen so that the estimated value, when it grows unimpeded, reaches the full-scale (modulation) limit in one to two seconds. If, on the other hand, the existing estimated value P(n-1) lies above the current short-term mean value G(n), the new estimated value P(n) is lowered relative to the existing one, according to the equation that represents the new estimated value as a linear combination of the previous estimated value and the current short-term mean value G(n). Values around 0.5 have proven favorable for the constant in this equation.
- The threshold S, which is used for the pause decision, is proportional to the estimated value P(n).
- A typical relationship between the threshold S and the estimated value P(n) is S = 1.1 P(n).
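
The front end described above (short-term means of sample magnitudes, followed by smoothing) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented circuit: the function names, the filter weights (0.25, 0.5, 0.25), the three-value median window and the treatment of the first values are all hypothetical. The text only states that a linear filter combines three successive short-term means, or that a median filter is used.

```python
from statistics import median


def block_length(sampling_rate_hz, window_ms=100):
    """Number m of samples per short-term mean for a window of about 100 ms."""
    return sampling_rate_hz * window_ms // 1000


def short_term_means(samples, m):
    """Return G(1), G(2), ...: mean of |x(k)| over consecutive blocks of m samples."""
    n_blocks = len(samples) // m  # assumption: an incomplete last block is dropped
    return [
        sum(abs(x) for x in samples[i * m:(i + 1) * m]) / m
        for i in range(n_blocks)
    ]


def smooth_linear(g, weights=(0.25, 0.5, 0.25)):
    """GG(n) from G(n), G(n-1), G(n-2); the weights are illustrative only."""
    w0, w1, w2 = weights
    out = []
    for n in range(len(g)):
        g1 = g[n - 1] if n >= 1 else g[n]  # assumption: repeat the first value
        g2 = g[n - 2] if n >= 2 else g1
        out.append(w0 * g[n] + w1 * g1 + w2 * g2)
    return out


def smooth_median(g):
    """Median over the current and up to two previous short-term means."""
    return [median(g[max(0, n - 2):n + 1]) for n in range(len(g))]


if __name__ == "__main__":
    g = short_term_means([1, -1, 2, -2, 0, 0, 4, -4], m=2)
    print(block_length(10_000), g, smooth_linear(g), smooth_median(g))
```

With a 10 kHz sampling rate, `block_length` gives m = 1000, matching the figure quoted in the text. The median variant suppresses a single isolated dip in G(n) entirely, whereas the linear filter only attenuates it.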
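
The tracking of the noise estimate P(n) can be sketched as below. The equations themselves are not reproduced in this text, so the sketch is one plausible reading rather than the patent's exact rule. Assumptions: the inequality tested against the threshold D is |G(n) - G(n-1)| < D with D = d_factor * G(n); the constant a weights the previous estimate; growth is additive by the constant c and capped at a full-scale value of 1; and all default parameter values are hypothetical.

```python
def track_noise_estimate(g, p0, a=0.5, lowering=0.5, c=0.05,
                         d_factor=0.1, k_required=3, full_scale=1.0):
    """Return the sequence P(n) for short-term means g, starting from p0."""
    p = p0
    run = 0  # how many times in succession the inequality has held
    estimates = []
    for n, g_n in enumerate(g):
        # Assumed inequality: successive short-term means differ by less than D.
        if n > 0 and abs(g_n - g[n - 1]) < d_factor * g_n:
            run += 1
        else:
            run = 0

        if run >= k_required:
            # Longer pause assumed: linear combination of P(n-1) and G(n).
            p = a * p + (1 - a) * g_n
        elif p > g_n:
            # Estimate lies above the current mean: lower it.
            p = lowering * p + (1 - lowering) * g_n
        else:
            # Otherwise let the estimate grow by c (additive form is an assumption).
            p = min(p + c, full_scale)
        estimates.append(p)
    return estimates


if __name__ == "__main__":
    g = [0.5, 0.5, 0.5, 1.0, 0.25]
    print(track_noise_estimate(g, p0=0.25, c=0.125, d_factor=0.25, k_required=2))
```

With short-term means every 100 ms, reaching full scale in one to two seconds corresponds to 10 to 20 unimpeded growth steps, which is how a value for c would be chosen under the additive assumption.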
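
The pause decision made by the comparator V and the unit EN can be sketched as follows. The threshold factor 1.1 and the requirement of two consecutive clock instants come from the text; the function name, the argument layout and the choice to keep signalling a pause for as long as the condition persists are assumptions of this sketch.

```python
def detect_pauses(gg, p, factor=1.1, consecutive=2):
    """Return one boolean per clock instant: True where a speech pause is signalled.

    gg: smoothed short-term means GG(n)
    p:  noise estimates P(n), same length as gg
    """
    flags = []
    below = 0  # number of consecutive instants with GG(n) < S
    for gg_n, p_n in zip(gg, p):
        s = factor * p_n  # threshold S proportional to the estimate P(n)
        below = below + 1 if gg_n < s else 0
        flags.append(below >= consecutive)
    return flags


if __name__ == "__main__":
    gg = [0.5, 0.1, 0.1, 0.1, 0.6]
    p = [0.2] * 5
    print(detect_pauses(gg, p))
```

Requiring two consecutive sub-threshold values means a single short power drop, such as an unvoiced sound, is not reported as a pause.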
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Noise Elimination (AREA)
- Analogue/Digital Conversion (AREA)
- Telephone Function (AREA)
Claims (8)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE3243231 | 1982-11-23 | | |
DE19823243231 (DE3243231A1, de) | 1982-11-23 | 1982-11-23 | Verfahren zur Erkennung von Sprachpausen (Method for recognizing speech pauses) |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0110467A1 (fr) | 1984-06-13 |
EP0110467B1 (fr) | 1987-08-12 |
EP0110467B2 (fr) | 1991-06-19 |
Family ID: 6178780
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP83201638A (EP0110467B2, fr; Expired - Lifetime) | 1982-11-23 | 1983-11-17 | Arrangement for detecting pauses in speech signals |
Country Status (6)
Country | Link |
---|---|
US (1) | US4700394A (fr) |
EP (1) | EP0110467B2 (fr) |
JP (1) | JPS59105695A (fr) |
AU (1) | AU561076B2 (fr) |
CA (1) | CA1203627A (fr) |
DE (2) | DE3243231A1 (fr) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT1160148B (it) * | 1983-12-19 | 1987-03-04 | Cselt Centro Studi Lab Telecom | Dispositivo per la verifica del parlatore |
EP0167364A1 (fr) * | 1984-07-06 | 1986-01-08 | AT&T Corp. | Détection parole-silence avec codage par sous-bandes |
AU583871B2 (en) * | 1984-12-31 | 1989-05-11 | Itt Industries, Inc. | Apparatus and method for automatic speech recognition |
JPH0748695B2 (ja) * | 1986-05-23 | 1995-05-24 | 株式会社日立製作所 | 音声符号化方式 |
DE3626862A1 (de) * | 1986-08-08 | 1988-02-11 | Philips Patentverwaltung | Mehrstufige sender- antennenkoppeleinrichtung |
DE3739681A1 (de) * | 1987-11-24 | 1989-06-08 | Philips Patentverwaltung | Verfahren zum bestimmen von anfangs- und endpunkt isoliert gesprochener woerter in einem sprachsignal und anordnung zur durchfuehrung des verfahrens |
FR2631147B1 (fr) * | 1988-05-04 | 1991-02-08 | Thomson Csf | Procede et dispositif de detection de signaux vocaux |
JP2573352B2 (ja) * | 1989-04-10 | 1997-01-22 | 富士通株式会社 | 音声検出装置 |
US5305422A (en) * | 1992-02-28 | 1994-04-19 | Panasonic Technologies, Inc. | Method for determining boundaries of isolated words within a speech signal |
DE4220524A1 (de) * | 1992-06-23 | 1992-10-22 | Matzner Rolf Dipl Ing | Verfahren und vorrichtung zur getrennten schaetzung der einzelleistungen zweier stochastischer prozesse aus der beobachtung des durch additive ueberlagerung entstandenen summenprozesses |
US5459814A (en) * | 1993-03-26 | 1995-10-17 | Hughes Aircraft Company | Voice activity detector for speech signals in variable background noise |
DE4405723A1 (de) * | 1994-02-23 | 1995-08-24 | Daimler Benz Ag | Verfahren zur Geräuschreduktion eines gestörten Sprachsignals |
DE19730518C1 (de) * | 1997-07-16 | 1999-02-11 | Siemens Ag | Verfahren und Einrichtung zum Erkennen einer Sprechpause |
GB0103242D0 (en) * | 2001-02-09 | 2001-03-28 | Radioscape Ltd | Method of analysing a compressed signal for the presence or absence of information content |
DE10120231A1 (de) * | 2001-04-19 | 2002-10-24 | Deutsche Telekom Ag | Verfahren und Anordnung zur einkanaligen Geräuschreduktion für gestörte Sprachsignale |
CN1867965B (zh) * | 2003-10-16 | 2010-05-26 | Nxp股份有限公司 | 使用自适应噪声基底跟踪的语音活动检测 |
US8543061B2 (en) | 2011-05-03 | 2013-09-24 | Suhami Associates Ltd | Cellphone managed hearing eyeglasses |
CN104658546B (zh) * | 2013-11-19 | 2019-02-01 | 腾讯科技(深圳)有限公司 | 录音处理方法和装置 |
RU2691603C1 (ru) * | 2018-08-22 | 2019-06-14 | Акционерное общество "Концерн "Созвездие" | Способ разделения речи и пауз путем анализа значений корреляционной функции помехи и смеси сигнала и помехи |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT1044353B (it) * | 1975-07-03 | 1980-03-20 | Telettra Lab Telefon | Metodo e dispositivo per il rico noscimento della presenza e.o assenza di segnale utile parola parlato su linee foniche canali fonici |
US4052568A (en) * | 1976-04-23 | 1977-10-04 | Communications Satellite Corporation | Digital voice switch |
US4025721A (en) * | 1976-05-04 | 1977-05-24 | Biocommunications Research Corporation | Method of and means for adaptively filtering near-stationary noise from speech |
US4028496A (en) * | 1976-08-17 | 1977-06-07 | Bell Telephone Laboratories, Incorporated | Digital speech detector |
FR2451680A1 (fr) * | 1979-03-12 | 1980-10-10 | Soumagne Joel | Discriminateur parole/silence pour interpolation de la parole |
JPS56104399A (en) * | 1980-01-23 | 1981-08-20 | Hitachi Ltd | Voice interval detection system |
JPS56135898A (en) * | 1980-03-26 | 1981-10-23 | Sanyo Electric Co | Voice recognition device |
CA1147071A (fr) * | 1980-09-09 | 1983-05-24 | Northern Telecom Limited | Methode et appareil de detection de paroles dans un signal de voie telephonique |
US4357491A (en) * | 1980-09-16 | 1982-11-02 | Northern Telecom Limited | Method of and apparatus for detecting speech in a voice channel signal |
JPS5852695A (ja) * | 1981-09-25 | 1983-03-28 | 日産自動車株式会社 | 車両用音声検出装置 |
US4531228A (en) * | 1981-10-20 | 1985-07-23 | Nissan Motor Company, Limited | Speech recognition system for an automotive vehicle |
- 1982-11-23: DE, application DE19823243231, publication DE3243231A1, status: Granted
- 1983-11-17: CA, application CA000441366A, publication CA1203627A, status: Expired
- 1983-11-17: DE, application DE8383201638T, publication DE3373037D1, status: Expired
- 1983-11-17: EP, application EP83201638A, publication EP0110467B2, status: Expired - Lifetime
- 1983-11-17: US, application US06/552,998, publication US4700394A, status: Expired - Fee Related
- 1983-11-21: AU, application AU21545/83A, publication AU561076B2, status: Ceased
- 1983-11-22: JP, application JP58220467A, publication JPS59105695A, status: Pending
Also Published As
Publication number | Publication date |
---|---|
EP0110467A1 (fr) | 1984-06-13 |
DE3243231A1 (de) | 1984-05-24 |
JPS59105695A (ja) | 1984-06-19 |
DE3373037D1 (en) | 1987-09-17 |
US4700394A (en) | 1987-10-13 |
CA1203627A (fr) | 1986-04-22 |
AU561076B2 (en) | 1987-04-30 |
DE3243231C2 (fr) | 1987-07-02 |
AU2154583A (en) | 1984-05-31 |
EP0110467B2 (fr) | 1991-06-19 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Designated state(s): BE DE FR GB IT SE |
17P | Request for examination filed | Effective date: 19840718 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): BE DE FR GB IT SE |
REF | Corresponds to: | Ref document number: 3373037; Country of ref document: DE; Date of ref document: 19870917 |
ITF | It: translation for a ep patent filed | Owner name: ING. C. GREGORJ S.P.A. |
ET | Fr: translation filed | |
PLBI | Opposition filed | Free format text: ORIGINAL CODE: 0009260 |
26 | Opposition filed | Opponent name: SIEMENS AKTIENGESELLSCHAFT, BERLIN UND MUENCHEN; Effective date: 19880502 |
PLAB | Opposition data, opponent's data or that of the opponent's representative modified | Free format text: ORIGINAL CODE: 0009299OPPO |
R26 | Opposition filed (corrected) | Opponent name: SIEMENS AKTIENGESELLSCHAFT, BERLIN UND MUENCHEN; Effective date: 19880809 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: BE; Payment date: 19891114; Year of fee payment: 7 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR; Payment date: 19891121; Year of fee payment: 7 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: SE; Payment date: 19891128; Year of fee payment: 7 |
ITTA | It: last paid annual fee | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 19891130; Year of fee payment: 7 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 19900125; Year of fee payment: 7 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Effective date: 19901117 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE; Effective date: 19901118 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: BE; Effective date: 19901130 |
PUAH | Patent maintained in amended form | Free format text: ORIGINAL CODE: 0009272 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: PATENT MAINTAINED AS AMENDED |
BERE | Be: lapsed | Owner name: N.V. PHILIPS' GLOEILAMPENFABRIEKEN; Effective date: 19901130 |
27A | Patent maintained in amended form | Effective date: 19910619 |
AK | Designated contracting states | Kind code of ref document: B2; Designated state(s): BE DE FR GB IT SE |
GBPC | Gb: european patent ceased through non-payment of renewal fee | |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR; Effective date: 19910731 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE; Effective date: 19910801 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST |
EN3 | Fr: translation not filed ** decision concerning opposition | |
EUG | Se: european patent has lapsed | Ref document number: 83201638.0; Effective date: 19910705 |