EP1005016A2 - Method and circuit arrangement for speech level measurement in a speech signal processing system - Google Patents

Method and circuit arrangement for speech level measurement in a speech signal processing system

Info

Publication number
EP1005016A2
Authority
EP
European Patent Office
Prior art keywords
speech
detector
speech signal
signal
pause
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP99440312A
Other languages
German (de)
English (en)
Other versions
EP1005016A3 (fr)
Inventor
Michael Walker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel CIT SA
Alcatel Lucent SAS
Original Assignee
Alcatel CIT SA
Alcatel SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alcatel CIT SA, Alcatel SA
Publication of EP1005016A2
Publication of EP1005016A3
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L2025/783 Detection of presence or absence of voice signals based on threshold decision

Definitions

  • The current speech level is used, for example, for scaling signals, for threshold decisions, for speech pause detection and/or for automatic gain setting.
  • Speech level measurement is particularly important for successful echo cancellation in telecommunication systems, for noise cancellation in noisy environments, for example in military vehicles, and in speech recognition and in speech coding and decoding devices.
  • The mean value SL takes on the value of the idle noise for a time determined by the number N of samples.
  • An averager needs a time determined by the number N of samples to determine the speech level.
  • Averaging over a time interval of 125 ms requires data storage of 1000 data words at a sampling rate of 8 kHz. Apart from the considerable computing and storage effort, simple averaging carries the risk that, with a short averaging time, interference causes errors in determining the speech level.
  • Incorrect measurements of the speech level occur when the speech level changes.
  • The method of linear prediction (linear predictive coding, LPC) is known, with which distinguishing features of speech and noise can, in principle, also be determined.
  • LPC: linear predictive coding
  • LPC analysis is very accurate and can be carried out very quickly. It is a powerful method with which, among other things, the fundamental frequency, the spectrum and the formants of a speech signal can be determined, cf. Eppinger, Herter: Sprachverarbeitung, Munich, Vienna: Hanser 1983, pages 73-77. For commercial reasons, however, such an elaborate method is not suitable for mass products such as telecommunication terminals.
  • The essence of the invention is that a measured speech level value is admitted for further processing in a speech signal processing system only when characteristic features of speech have been recognized and interference signals and speech pauses have been excluded from the measurement.
  • The circuit arrangement consists essentially of a speech pause detector 1, a speech detector 2, an averager 3, a memory 4 and a circuit 5 for forming an absolute value.
  • The sampling function x(k) of the speech signal is applied to the circuit input, and the value of a speech level SL is output at the circuit output. If a speech pause is recognized (output signal P of the speech pause detector 1) and no speech is recognized (output signal F of the speech detector 2), a first switch S1, a second switch S2 and a third switch S3 are in the positions drawn in FIG. 1.
  • If a speech signal in the form of the sampling function x(k) is present, i.e. a speech pause P is not recognized, the speech detector 2 is activated via the closed first switch S1, and averaging with the averager 3 is initiated via the circuit 5 and the closed second switch S2.
  • If the output signal F of the speech detector 2 indicates speech, the third switch S3 is closed and the output signal SAM(x) of the averager 3 is transferred to the memory 4 via the third switch S3.
  • During speech pauses, the last measured speech level SL is transferred from the memory 4 to the averager 3 via the second switch S2.
  • The short-term mean SAM(x) (Short Average Magnitude) is formed so that its time behavior is largely adapted to the subjective perception function of the human ear.
  • A dynamic jump from soft to loud tones is tracked with a small time constant τs, for example less than 6.5 ms.
  • A dynamic jump from loud to soft tones is tracked, in accordance with the post-masking effect of the human ear, with a large time constant τl, for example 65 ms to 300 ms. Briefly spoken vowels are captured well in this way. Nasal sounds or consonants, whose level is lower than that of vowels, are largely suppressed by the large time constant τl when the level falls.
  • Over the course of the signal, the short-term mean SAM(x) thus adapts quickly to the current peak value of the short-term level of the speech signal. This peak value of the short-term level determines the relative speech level regardless of the speech content.
  • FIG. 2 shows the time behavior of the samples for three functions.
  • The input function x(k) of the speech level measuring circuit according to FIG. 1 is shown as function curve 6 of a speech sample.
  • Function curve 7 shows the short-term mean SAM(x(k)), SAM(x) for short, taking into account the effect of the different time constants τs, τl as described above.
  • For comparison, a third function curve 8 reproduces the effect of a simple low pass. It follows that a low pass is unsuitable for determining the current speech level quickly and precisely.
  • The detailed view of the averager 3 shows that it contains a recursive filter, an IIR (Infinite Impulse Response) filter 9 known per se, and a circuit arrangement 10 for switching between the time constants τs, τl.
  • The circuit 5 for forming the absolute value corresponds to the circuit shown in FIG. 1.
  • The time constants τs, τl must be switched according to the following equation G2:
  • A method is used that evaluates the temporal behavior of the sampling function x(k) of the speech signal.
  • The short-term mean SAM(x) of the sampling function x(k) is compared with a long-term minimum value that is determined over a time interval from a number of short-term means SAM(x).
  • The time constants τs, τl of the averager 3 can be varied to obtain a speech level SL adapted to the respective application.
  • The formation of a short-term mean SAM(x) described in the embodiment is advantageously used in very noisy environments, for example in a tank. If the speakers are indistinct, it is more favorable to use a medium-term mean MAM(x) (Medium Average Magnitude), obtained by increasing the small time constant τs and reducing the large time constant τl of the averager 3.
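
The asymmetric averaging described in the definitions above (small time constant τs for rising levels, large time constant τl for falling levels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the circuit of the application: the function names, the default values (5 ms and 200 ms, picked from the ranges quoted above), and the mapping from a time constant to a filter coefficient are all hypothetical, and the switching rule stands in for equation G2, which is not reproduced in this text.

```python
import math


def smoothing_coeff(tau: float, fs: float) -> float:
    """Per-sample coefficient of a one-pole recursive average.

    Assumed mapping from a time constant tau (seconds) at sampling
    rate fs (Hz); the source text does not give this formula.
    """
    return 1.0 - math.exp(-1.0 / (tau * fs))


def short_average_magnitude(x, fs=8000.0, tau_s=0.005, tau_l=0.2):
    """Return a SAM(x)-style track for the samples in x.

    tau_s (small) is applied while the magnitude lies above the
    current mean, tau_l (large) while it lies at or below it.
    """
    rise = smoothing_coeff(tau_s, fs)
    fall = smoothing_coeff(tau_l, fs)
    sam = 0.0
    track = []
    for sample in x:
        mag = abs(sample)                    # absolute value (circuit 5)
        coeff = rise if mag > sam else fall  # assumed switching rule
        sam += coeff * (mag - sam)           # recursive (IIR) update
        track.append(sam)
    return track
```

With these assumed defaults the track climbs towards a loud burst within a few milliseconds and decays only slowly afterwards, which is the behavior the text attributes to function curve 7 as opposed to the simple low pass of function curve 8.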
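
The comparison of the short-term mean SAM(x) with a long-term minimum, as mentioned in the definitions above, might look like the sketch below. It assumes that the comparison serves to flag speech pauses, which the text here does not state explicitly. The class name, the window length and the threshold factor are hypothetical illustration values.

```python
from collections import deque


class PauseDetector:
    """Flags a pause when SAM(x) stays close to its long-term minimum."""

    def __init__(self, window: int = 4000, factor: float = 2.0):
        # window: number of recent SAM values that define the interval
        # factor: assumed margin above the minimum that still counts as pause
        self.history = deque(maxlen=window)
        self.factor = factor

    def update(self, sam: float) -> bool:
        """Add one SAM value and return True if it looks like a pause."""
        self.history.append(sam)
        # Linear scan for clarity; a real-time version would track the
        # minimum incrementally.
        long_term_min = min(self.history)
        return sam <= self.factor * long_term_min
```

The long-term minimum acts as an estimate of the idle noise level, so the decision adapts to the background instead of relying on a fixed threshold.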
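
The interplay of the detectors, the averager 3 and the memory 4 described above can be modelled per sample as shown below. This is one possible reading of the switch description, offered as a sketch: the class, its field names and the two smoothing coefficients are hypothetical, and the detector decisions P and F are passed in as plain booleans rather than computed.

```python
from dataclasses import dataclass


@dataclass
class SpeechLevelMeter:
    rise: float = 0.03         # assumed coefficient for rising levels
    fall: float = 0.001        # assumed coefficient for falling levels
    sam: float = 0.0           # state of the averager 3
    stored_level: float = 0.0  # content of the memory 4

    def step(self, x: float, pause: bool, speech: bool) -> float:
        """Process one sample and return the speech level SL."""
        if pause:
            # Pause: the averager is reloaded with the last stored level
            # (path via switch S2), so noise does not pull it down.
            self.sam = self.stored_level
        else:
            mag = abs(x)  # circuit 5
            coeff = self.rise if mag > self.sam else self.fall
            self.sam += coeff * (mag - self.sam)
            if speech:
                # Speech confirmed: accept the new mean (switch S3).
                self.stored_level = self.sam
        return self.stored_level
```

A signal that is neither a pause nor confirmed speech, such as a burst of interference, still moves the averager in this model but never reaches the stored level, which matches the stated aim of admitting a measured value only once speech features have been recognized.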

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Noise Elimination (AREA)
  • Telephone Function (AREA)
EP99440312A 1998-11-25 1999-11-12 Method and circuit arrangement for speech level measurement in a speech signal processing system Withdrawn EP1005016A3 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE19854341 1998-11-25
DE19854341A DE19854341A1 (de) 1998-11-25 1998-11-25 Method and circuit arrangement for speech level measurement in a speech signal processing system

Publications (2)

Publication Number Publication Date
EP1005016A2 (fr) 2000-05-31
EP1005016A3 (fr) 2000-11-29

Family

ID=7888949

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99440312A Withdrawn EP1005016A3 (fr) 1998-11-25 1999-11-12 Method and circuit arrangement for speech level measurement in a speech signal processing system

Country Status (3)

Country Link
US (1) US6539350B1 (fr)
EP (1) EP1005016A3 (fr)
DE (1) DE19854341A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1278185A2 (fr) * 2001-07-13 2003-01-22 Alcatel Method for improving noise reduction in voice transmission

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19939102C1 (de) * 1999-08-18 2000-10-26 Siemens Ag Method and arrangement for recognizing speech
KR100406307B1 (ko) * 2001-08-09 2003-11-19 삼성전자주식회사 Voice registration method and voice registration system, and voice recognition method and voice recognition system based thereon
EP1429314A1 (fr) * 2002-12-13 2004-06-16 Sony International (Europe) GmbH Energy correction as an input parameter for speech processing
EP2560410B1 (fr) * 2011-08-15 2019-06-19 Oticon A/s Output modulation control in a hearing instrument
US8255218B1 (en) * 2011-09-26 2012-08-28 Google Inc. Directing dictation into input fields
US8543397B1 (en) 2012-10-11 2013-09-24 Google Inc. Mobile device voice activation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4696039A (en) * 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with silence suppression
JPH07326981A (ja) * 1994-05-31 1995-12-12 Japan Radio Co Ltd VOX-controlled communication device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4032710A (en) * 1975-03-10 1977-06-28 Threshold Technology, Inc. Word boundary detector for speech recognition equipment
US4481593A (en) * 1981-10-05 1984-11-06 Exxon Corporation Continuous speech recognition
DE3276731D1 (en) * 1982-04-27 1987-08-13 Philips Nv Speech analysis system
DE3276732D1 (en) * 1982-04-27 1987-08-13 Philips Nv Speech analysis system
DE3230391A1 (de) * 1982-08-14 1984-02-16 Philips Kommunikations Industrie AG, 8500 Nürnberg Method for improving the signal of disturbed speech signals
US4625083A (en) * 1985-04-02 1986-11-25 Poikela Timo J Voice operated switch
FR2631147B1 (fr) * 1988-05-04 1991-02-08 Thomson Csf Procede et dispositif de detection de signaux vocaux
US5204906A (en) * 1990-02-13 1993-04-20 Matsushita Electric Industrial Co., Ltd. Voice signal processing device
US5216702A (en) * 1992-02-27 1993-06-01 At&T Bell Laboratories Nonintrusive speech level and dynamic noise measurements
US5305422A (en) * 1992-02-28 1994-04-19 Panasonic Technologies, Inc. Method for determining boundaries of isolated words within a speech signal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BAUER B B ET AL: "THE MEASUREMENT OF LOUDNESS LEVEL", JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, US, AMERICAN INSTITUTE OF PHYSICS, NEW YORK, vol. 50, no. 2, part 01, August 1971 (1971-08), pages 405-414, XP000795762, ISSN: 0001-4966 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1278185A2 (fr) * 2001-07-13 2003-01-22 Alcatel Method for improving noise reduction in voice transmission
EP1278185A3 (fr) * 2001-07-13 2005-02-09 Alcatel Method for improving noise reduction in voice transmission

Also Published As

Publication number Publication date
DE19854341A1 (de) 2000-06-08
EP1005016A3 (fr) 2000-11-29
US6539350B1 (en) 2003-03-25

Similar Documents

Publication Publication Date Title
DE69535709T2 (de) Method and apparatus for selecting the coding rate in a variable-rate vocoder
DE69926851T2 (de) Method and apparatus for voice activity detection
DE60009206T2 (de) Noise suppression by means of spectral subtraction
DE69816610T2 (de) Method and device for noise reduction, in particular in hearing aids
DE69913262T2 (de) Apparatus and method for adapting the noise threshold for voice activity detection in a non-stationary noise environment
DE112009000805B4 (de) Noise reduction
EP0690436B1 (fr) Detection of the beginning and end of words for word recognition
DE3233637C2 (de) Device for determining the duration of speech signals
EP1088300B1 (fr) Method for performing an automated evaluation of the transmission quality of audio signals
EP0698986A2 (fr) Method for adaptive echo compensation
EP0747880B1 (fr) Speech recognition system
DE69918635T2 (de) Apparatus and method for speech processing
EP1103956B1 (fr) Exponential reduction of noise and echo during speech pauses
DE69635141T2 (de) Method for generating speech feature signals and apparatus for carrying it out
EP0938831A1 (fr) Hearing-adapted quality assessment of audio signals
DE19715126A1 (de) Speech signal coding device
DE102015207706B3 (de) Method for frequency-dependent noise suppression of an input signal
DE3243231A1 (de) Method for detecting speech pauses
DE69922769T2 (de) Apparatus and method for speech processing
EP1005016A2 (fr) Method and circuit arrangement for speech level measurement in a speech signal processing system
DE2021126A1 (de) Speech recognition device
EP1382034B1 (fr) Method for determining characteristic intensity values of background noise in speech pauses of speech signals
EP1202253B1 (fr) Active noise level estimator
EP1453355A1 (fr) Signal processing in a hearing aid
EP0902416B1 (fr) Method and device for recognizing a speech input during the playback of an announcement

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 11/00 A, 7G 01H 3/12 B

17P Request for examination filed

Effective date: 20001129

AKX Designation fees paid

Free format text: AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20031218