EP0110467B2 - Arrangement for detecting speech pauses (Anordnung zur Erkennung von Sprachpausen) - Google Patents

Publication number
EP0110467B2
EP0110467B2 EP83201638A EP83201638A EP0110467B2 EP 0110467 B2 EP0110467 B2 EP 0110467B2 EP 83201638 A EP83201638 A EP 83201638A EP 83201638 A EP83201638 A EP 83201638A EP 0110467 B2 EP0110467 B2 EP 0110467B2
Authority
EP
European Patent Office
Prior art keywords
value
short
estimate
arrangement
time mean
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP83201638A
Other languages
German (de)
English (en)
French (fr)
Other versions
EP0110467A1 (de)
EP0110467B1 (de)
Inventor
Bernd Dipl.-Ing. Selbach
Peter Dr. Ing. Vary
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Philips Kommunikations Industrie AG
Koninklijke Philips NV
Original Assignee
Philips Kommunikations Industrie AG
Philips Gloeilampenfabrieken NV
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1982-11-23
Filing date
1983-11-17
Publication date
1991-06-19
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=6178780&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=EP0110467(B2) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Philips Kommunikations Industrie AG, Philips Gloeilampenfabrieken NV, Koninklijke Philips Electronics NV
Publication of EP0110467A1
Application granted
Publication of EP0110467B1
Publication of EP0110467B2
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L2025/783 Detection of presence or absence of voice signals based on threshold decision
    • G10L2025/786 Adaptive threshold

Definitions

  • The invention relates to an arrangement for recognizing speech pauses in a speech signal, according to the preamble of patent claim 1.
  • Such arrangements are, for example, the prerequisite for suppressing interference signals when calling from an acoustically disturbed environment.
  • Characteristic parameters of the interference signal are measured and used to filter the interference out of the signal to be transmitted as completely as possible by means of adaptive filters.
  • A circuit arrangement for recognizing speech pauses in a speech signal is known in which a short-term mean value is determined at certain clock instants.
  • The known circuit arrangement has a fixed threshold and two adaptively tracked thresholds, the sign of the respective slope of the speech signal being used in tracking the thresholds.
  • The adaptive noise thresholds are changed by constant amounts, so they are not determined as a function of their own values at previous clock instants.
  • Such a circuit arrangement is preferably used for recognizing speech pauses in a speech signal on which only weak interference signals are superimposed.
  • This pause detection does not take into account that, for example, unvoiced sounds lead to a drop in power in the speech signal, so that the relevant speech sections are incorrectly regarded as speech pauses. Such errors occur in the known arrangement all the more often, the more strongly interference signals are superimposed on the speech signal.
  • The arrangement is also intended to enable speech pause recognition even if the average noise level changes only slowly.
  • Samples x(k) are obtained from the disturbed speech signal applied to a terminal E by an analog-to-digital converter A/D at sampling times kT_0, where k is a natural number and 1/T_0 is the sampling frequency.
  • The samples are passed on to an averager M.
  • At all clock instants T(n), spaced at the time interval mT_0, the averager M generates a so-called short-term mean value G(n) from the magnitudes of m consecutive samples.
  • n = 1, 2, 3, ...
  • The arithmetic mean of the magnitudes of the sampled values is used as the mean value, since it requires less component effort than, for example, forming the quadratic mean.
  • Each short-term mean value G(n) is approximately a measure of the average power of the disturbed speech signal over a period of approximately 100 ms. This specification and the sampling frequency also determine the number m of samples required to determine one short-term mean value G(n). For example, if the disturbed speech signal is sampled at 10 kHz, m must be about 1000.
  • Each of the quantities G(1), G(2), ... thus results from approximately a thousand consecutive samples.
  • The unit GL of FIG. 1 smooths the sequence of the short-term mean values G(n). More about the purpose and manner of smoothing is given below.
  • The unit PA of FIG. 1 determines from the short-term mean values an estimated value P(n) for the average noise power, i.e. for the average power of the interference signal. More details about the estimate P(n) are also given below.
  • A comparator V in FIG. 1 compares a threshold S, which depends on the estimated value P(n), with the smoothed short-term mean values GG(n). If the smoothed short-term mean value GG(n) is less than the threshold S, a signal is forwarded to a unit EN. If, for example, the unit EN has received such a signal at two successive clock instants T(n-1) and T(n), it in turn indicates the presence of a speech pause by a signal of its own at terminal A.
  • Diagram a) of FIG. 2 shows a possible output signal AM of the averager M, i.e. a possible sequence of the short-term mean values G(1), G(2), ...
  • The output signal AM is normalized so that its absolute maximum assumes the value 1.
  • The amplitude thresholds entered are the estimated value P(n) (lower threshold, shown in broken lines) and the threshold S (upper threshold, solid line).
  • Diagram b) schematically shows the associated speech signal S with its true pauses P. If a pause were declared whenever the signal fell below the upper amplitude threshold in diagram a) (this pause determination is shown in diagram c)), a large number of incorrect decisions would result, as a comparison of diagrams b) and c) shows. Shifting the upper threshold downward would remove from diagram c) the power drops that do not correspond to speech pauses, but the indicated length of the pauses would then be significantly falsified.
  • The output signal AM is smoothed before the pause decision, either with the aid of a linear digital filter, by which a value GG(n) of the smoothed signal is obtained from three short-term mean values G(n), G(n-1) and G(n-2), or using a median filter.
  • FIG. 3 shows how the output signal of the averager M looks after smoothing with a linear digital filter.
  • In diagram b) the true speech sections and the true pauses of the speech signal are again plotted, and diagram c) shows the speech sections and speech pauses obtained in the same way as in diagram c) of FIG. 2. Due to the linear smoothing, the number of wrong decisions has decreased considerably, as the comparison of FIGS. 2 and 3 shows. Smoothing with a median filter also reduces the number of incorrect decisions, as can be seen from diagram c) in FIG. 4.
  • A further measure to avoid misinterpreting shorter power drops in the disturbed speech signal as pauses is, for example, to regard a power drop as a speech pause only when the signal falls below the upper amplitude threshold in FIG. 2, 3 or 4 twice in succession.
  • As already indicated above, the amplitude thresholds shown in FIGS. 2, 3 and 4 are determined by the unit PA in FIG. 1, which first determines the estimated value P(n) of the noise power for each clock instant T(n).
  • This variable is intended to be an approximate measure of the average power of the interference signal, the averaging time being of the order of one second.
  • The arrangement according to the invention still delivers good results even if the above-mentioned average power of the interference signal changes only slowly, i.e. if it can be regarded as stationary over time intervals of the order of one or two seconds.
  • The estimated value P(n) is redefined according to an equation that forms a linear combination of the previous estimated value P(n-1) and the short-term mean value G(n).
  • The value of the constant a appearing in this equation is between zero and one.
  • a threshold D in terms of magnitude. If, for example, the inequality is satisfied K times in succession, this is regarded as a longer speech pause and the new estimated value P(n) is determined according to the equation given above.
  • The threshold D is chosen proportional to the short-term mean value G(n), so that the same decisions are reached if, for example, the levels of all signals were doubled.
  • The proportionality factor y and the number K are to be determined experimentally so that the arrangement makes as few incorrect decisions as possible. Typical values are
  • The constant c is to be chosen so that, with unimpeded increase, the estimated value reaches the modulation limit in one to two seconds. If, on the other hand, the existing estimated value P(n-1) lies above the current short-term mean value G(n), the new estimated value P(n) is lowered relative to the existing one, according to an equation that represents the new estimated value as a linear combination of the previous estimated value and the current short-term mean value G(n). Values around 0.5 have proven favorable for the constant of this linear combination.
  • The threshold S, which is used for the pause decision, is proportional to the estimated value P(n).
  • A typical relationship between the threshold S and the estimated value P(n) is S = 1.1 · P(n).
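
The processing chain described above (averager M, smoothing unit GL, estimator PA, comparator V, decision unit EN) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation. The update equations and the inequality did not survive in the text above, so the forms used here are assumptions: the convex combination (1 - a)·P(n-1) + a·G(n), the test |G(n) - G(n-1)| < D, and the additive increase by c. The function names, the name b for the second weighting constant, the equal-weight three-point filter, the start value of the estimate, and the defaults a = 0.5, y = 0.1 and K = 3 are likewise hypothetical. Only the factor 1.1, the weight of about 0.5 for lowering the estimate, and m = 1000 at 10 kHz come from the description. The default c = 0.05 assumes a signal normalized to a maximum of 1 and a 100 ms clock interval, so that 20 steps take two seconds.

```python
from statistics import median


def short_term_means(samples, m):
    """G(n): arithmetic mean of |x(k)| over consecutive blocks of m samples."""
    return [sum(abs(x) for x in samples[i:i + m]) / m
            for i in range(0, len(samples) - m + 1, m)]


def smooth(G, use_median=False):
    """GG(n) from G(n), G(n-1), G(n-2); equal filter weights are an assumption."""
    GG = []
    for n in range(len(G)):
        window = G[max(0, n - 2):n + 1]  # shorter window at the very start
        GG.append(median(window) if use_median else sum(window) / len(window))
    return GG


def detect_pauses(samples, m=1000, a=0.5, b=0.5, y=0.1, K=3, c=0.05, factor=1.1):
    """Return one boolean per clock instant T(n): True where a pause is reported."""
    G = short_term_means(samples, m)
    GG = smooth(G)
    P = G[0] if G else 0.0  # assumed start value for the noise estimate
    steady = 0              # how often in a row the assumed inequality held
    below_prev = False
    pauses = []
    for n in range(len(G)):
        if n > 0:
            D = y * G[n]  # threshold D proportional to G(n)
            # Assumed inequality: |G(n) - G(n-1)| < D
            steady = steady + 1 if abs(G[n] - G[n - 1]) < D else 0
            if steady >= K:
                # Longer pause: assumed form of the first linear combination
                P = (1 - a) * P + a * G[n]
            elif P < G[n]:
                # Assumed: unimpeded increase by the constant c per clock instant
                P = P + c
            else:
                # Estimate at or above G(n): lower it (assumed form, b about 0.5)
                P = (1 - b) * P + b * G[n]
        below = GG[n] < factor * P           # comparator V with S = 1.1 * P(n)
        pauses.append(below and below_prev)  # unit EN: two instants in a row
        below_prev = below
    return pauses
```

For a signal sampled at 10 kHz, calling `detect_pauses(samples)` with the default m = 1000 yields one decision per 100 ms clock interval. Passing `use_median=True` to `smooth` corresponds to the median-filter variant mentioned for FIG. 4.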

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Noise Elimination (AREA)
  • Analogue/Digital Conversion (AREA)
  • Telephone Function (AREA)
EP83201638A 1982-11-23 1983-11-17 Anordnung zur Erkennung von Sprachpausen Expired - Lifetime EP0110467B2 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE3243231 1982-11-23
DE19823243231 DE3243231A1 (de) 1982-11-23 1982-11-23 Verfahren zur erkennung von sprachpausen

Publications (3)

Publication Number Publication Date
EP0110467A1 (de) 1984-06-13
EP0110467B1 (de) 1987-08-12
EP0110467B2 (de) 1991-06-19

Family

ID=6178780

Family Applications (1)

Application Number Title Priority Date Filing Date
EP83201638A Expired - Lifetime EP0110467B2 (de) 1982-11-23 1983-11-17 Anordnung zur Erkennung von Sprachpausen

Country Status (6)

Country Link
US (1) US4700394A
EP (1) EP0110467B2
JP (1) JPS59105695A
AU (1) AU561076B2
CA (1) CA1203627A
DE (2) DE3243231A1

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT1160148B (it) * 1983-12-19 1987-03-04 Cselt Centro Studi Lab Telecom Dispositivo per la verifica del parlatore
EP0167364A1 (en) * 1984-07-06 1986-01-08 AT&T Corp. Speech-silence detection with subband coding
AU583871B2 (en) * 1984-12-31 1989-05-11 Itt Industries, Inc. Apparatus and method for automatic speech recognition
JPH0748695B2 (ja) * 1986-05-23 1995-05-24 株式会社日立製作所 音声符号化方式
DE3626862A1 (de) * 1986-08-08 1988-02-11 Philips Patentverwaltung Mehrstufige sender- antennenkoppeleinrichtung
DE3739681A1 (de) * 1987-11-24 1989-06-08 Philips Patentverwaltung Verfahren zum bestimmen von anfangs- und endpunkt isoliert gesprochener woerter in einem sprachsignal und anordnung zur durchfuehrung des verfahrens
FR2631147B1 (fr) * 1988-05-04 1991-02-08 Thomson Csf Procede et dispositif de detection de signaux vocaux
JP2573352B2 (ja) * 1989-04-10 1997-01-22 富士通株式会社 音声検出装置
US5305422A (en) * 1992-02-28 1994-04-19 Panasonic Technologies, Inc. Method for determining boundaries of isolated words within a speech signal
DE4220524A1 (de) * 1992-06-23 1992-10-22 Matzner Rolf Dipl Ing Verfahren und vorrichtung zur getrennten schaetzung der einzelleistungen zweier stochastischer prozesse aus der beobachtung des durch additive ueberlagerung entstandenen summenprozesses
US5459814A (en) * 1993-03-26 1995-10-17 Hughes Aircraft Company Voice activity detector for speech signals in variable background noise
DE4405723A1 (de) * 1994-02-23 1995-08-24 Daimler Benz Ag Verfahren zur Geräuschreduktion eines gestörten Sprachsignals
DE19730518C1 (de) * 1997-07-16 1999-02-11 Siemens Ag Verfahren und Einrichtung zum Erkennen einer Sprechpause
GB0103242D0 (en) * 2001-02-09 2001-03-28 Radioscape Ltd Method of analysing a compressed signal for the presence or absence of information content
DE10120231A1 (de) * 2001-04-19 2002-10-24 Deutsche Telekom Ag Verfahren und Anordnung zur einkanaligen Geräuschreduktion für gestörte Sprachsignale
EP1676261A1 (en) * 2003-10-16 2006-07-05 Koninklijke Philips Electronics N.V. Voice activity detection with adaptive noise floor tracking
RU2436173C1 (ru) * 2010-06-15 2011-12-10 Государственное образовательное учреждение высшего профессионального образования "Рязанский государственный радиотехнический университет" Способ обнаружения пауз в речевых сигналах и устройство его реализующее
US8543061B2 (en) 2011-05-03 2013-09-24 Suhami Associates Ltd Cellphone managed hearing eyeglasses
CN104658546B (zh) * 2013-11-19 2019-02-01 腾讯科技(深圳)有限公司 录音处理方法和装置
RU2691603C1 (ru) * 2018-08-22 2019-06-14 Акционерное общество "Концерн "Созвездие" Способ разделения речи и пауз путем анализа значений корреляционной функции помехи и смеси сигнала и помехи

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT1044353B (it) * 1975-07-03 1980-03-20 Telettra Lab Telefon Metodo e dispositivo per il rico noscimento della presenza e.o assenza di segnale utile parola parlato su linee foniche canali fonici
US4052568A (en) * 1976-04-23 1977-10-04 Communications Satellite Corporation Digital voice switch
US4025721A (en) * 1976-05-04 1977-05-24 Biocommunications Research Corporation Method of and means for adaptively filtering near-stationary noise from speech
US4028496A (en) * 1976-08-17 1977-06-07 Bell Telephone Laboratories, Incorporated Digital speech detector
FR2451680A1 (fr) * 1979-03-12 1980-10-10 Soumagne Joel Discriminateur parole/silence pour interpolation de la parole
JPS56104399A (en) * 1980-01-23 1981-08-20 Hitachi Ltd Voice interval detection system
JPS56135898A (en) * 1980-03-26 1981-10-23 Sanyo Electric Co Voice recognition device
CA1147071A (en) * 1980-09-09 1983-05-24 Northern Telecom Limited Method of and apparatus for detecting speech in a voice channel signal
US4357491A (en) * 1980-09-16 1982-11-02 Northern Telecom Limited Method of and apparatus for detecting speech in a voice channel signal
JPS5852695A (ja) * 1981-09-25 1983-03-28 日産自動車株式会社 車両用音声検出装置
US4531228A (en) * 1981-10-20 1985-07-23 Nissan Motor Company, Limited Speech recognition system for an automotive vehicle

Also Published As

Publication number Publication date
DE3243231C2 1987-07-02
DE3243231A1 (de) 1984-05-24
EP0110467A1 (de) 1984-06-13
CA1203627A (en) 1986-04-22
JPS59105695A (ja) 1984-06-19
US4700394A (en) 1987-10-13
AU561076B2 (en) 1987-04-30
DE3373037D1 (en) 1987-09-17
AU2154583A (en) 1984-05-31
EP0110467B1 (de) 1987-08-12

Similar Documents

Publication Publication Date Title
EP0110467B2 (de) Anordnung zur Erkennung von Sprachpausen
EP0111947A1 (de) Anordnung zur Erkennung von Sprachpausen
DE3101851C2 (de) Vorrichtung zum Erkennen von Sprache
DE3612347C2
DE68910859T2 (de) Detektion für die Anwesenheit eines Sprachsignals.
DE19736669C1 (de) Verfahren und Vorrichtung zum Erfassen eines Anschlags in einem zeitdiskreten Audiosignal sowie Vorrichtung und Verfahren zum Codieren eines Audiosignals
EP0076233B1 (de) Verfahren und Vorrichtung zur redundanzvermindernden digitalen Sprachverarbeitung
DE2233872A1 (de) Signalanalysator
DE3012771C2
DE69028428T2 (de) Vorrichtung zum Erfassen eines Sprachsignals
WO1998035715A1 (de) Verfahren für das schalten in die ein- oder ausatmungsphase bei der cpap-therapie
DE3235279A1 (de) Spracherkennungseinrichtung
DE2636032B2 (de) Elektrische Schaltungsanordnung zum Extrahieren der Grundschwingungsperiode aus einem Sprachsignal
EP0584388A1 (de) Verfahren zum Erzeugen eines dem Atemzeitvolumen eines Patienten entsprechenden Signals
DE69325053T2 (de) Verfahren zur Verbesserung der Empfindlichkeit und des Sprachschutzes eines Mehrfrequenzempfängers
DE69203186T2 (de) Verarbeitungsgerät für die menschliche Sprache zum Detektieren des Schliessens der Stimmritze.
EP0560047B1 (de) Sicherheitseinrichtung für motorisch verschliessbare Öffnungen
EP0574464B1 (de) Verfahren und vorrichtung zum herausfiltern von grundlinienschwankungen aus einem elektrokardiogramm
DE69018840T2 (de) Verfahren und Vorrichtung zur Bestimmung des mittleren arteriellen Blutdruckes.
WO2000010633A1 (de) Verfahren und vorrichtung für das schalten in die ein- oder ausatmungsphase bei der cpap-therapie
DE69511508T2 (de) Sprachaktivitätsdetektion
EP1458216B1 (de) Vorrichtung und Verfahren zur Adaption von Hörgerätemikrofonen
EP0775348B1 (de) Verfahren zur erkennung von signalen mittels fuzzy-klassifikation
DE3017623C2 (de) Sensor zur Verkehrserfassung von aus Analogsignalen bestehenden Nachrichtenströmen auf Fernmeldeleitungen
DE19854341A1 (de) Verfahren und Schaltungsanordnung zur Sprachpegelmessung in einem Sprachsignalverarbeitungssystem

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase (Free format text: ORIGINAL CODE: 0009012)
AK Designated contracting states (Designated state(s): BE DE FR GB IT SE)
17P Request for examination filed (Effective date: 19840718)
GRAA (expected) grant (Free format text: ORIGINAL CODE: 0009210)
AK Designated contracting states (Kind code of ref document: B1; Designated state(s): BE DE FR GB IT SE)
REF Corresponds to: (Ref document number: 3373037; Country of ref document: DE; Date of ref document: 19870917)
ITF It: translation for a ep patent filed
ET Fr: translation filed
PLBI Opposition filed (Free format text: ORIGINAL CODE: 0009260)
26 Opposition filed (Opponent name: SIEMENS AKTIENGESELLSCHAFT, BERLIN UND MUENCHEN; Effective date: 19880502)
PLAB Opposition data, opponent's data or that of the opponent's representative modified (Free format text: ORIGINAL CODE: 0009299OPPO)
R26 Opposition filed (corrected) (Opponent name: SIEMENS AKTIENGESELLSCHAFT, BERLIN UND MUENCHEN; Effective date: 19880809)
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo] (Ref country code: BE; Payment date: 19891114; Year of fee payment: 7)
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo] (Ref country code: FR; Payment date: 19891121; Year of fee payment: 7)
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo] (Ref country code: SE; Payment date: 19891128; Year of fee payment: 7)
ITTA It: last paid annual fee
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo] (Ref country code: GB; Payment date: 19891130; Year of fee payment: 7)
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo] (Ref country code: DE; Payment date: 19900125; Year of fee payment: 7)
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo] (Ref country code: GB; Effective date: 19901117)
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo] (Ref country code: SE; Effective date: 19901118)
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo] (Ref country code: BE; Effective date: 19901130)
PUAH Patent maintained in amended form (Free format text: ORIGINAL CODE: 0009272)
STAA Information on the status of an ep patent application or granted ep patent (Free format text: STATUS: PATENT MAINTAINED AS AMENDED)
BERE Be: lapsed (Owner name: N.V. PHILIPS' GLOEILAMPENFABRIEKEN; Effective date: 19901130)
27A Patent maintained in amended form (Effective date: 19910619)
AK Designated contracting states (Kind code of ref document: B2; Designated state(s): BE DE FR GB IT SE)
GBPC Gb: european patent ceased through non-payment of renewal fee
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo] (Ref country code: FR; Effective date: 19910731)
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo] (Ref country code: DE; Effective date: 19910801)
REG Reference to a national code (Ref country code: FR; Ref legal event code: ST)
EN3 Fr: translation not filed ** decision concerning opposition
EUG Se: european patent has lapsed (Ref document number: 83201638.0; Effective date: 19910705)