EP2235720A1 - Method for instantaneous peak level management and speech clarity enhancement - Google Patents
- Publication number
- EP2235720A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- wave form
- rate
- clipping
- amplitude change
- form amplitude
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
Definitions
- The present invention relates generally to audio signal processing. More particularly, it relates to an improved system and method for instantaneous dynamic adjustment of audio signal peaks that improves the audibility of consonants while preserving the sound quality of vowels, and that eliminates potentially damaging acoustic impulse transients to benefit hearing preservation.
- U.S. Patent Nos. 4,208,548 and 5,168,526, also issued to Orban, more specifically propose methods for controlling clipping in analog voltage amplification systems, but they also employ high-frequency filter methods to remove undesired distortion. It should be noted that high-frequency filtering does not remove low-frequency intermodulation distortion components in complex signals. The present invention has several distinguishable properties of detection and does not require filter techniques to remove perceptual distortions.
- U.S. Patent No. 5,815,532, issued to Bhattacharya et al., discloses a method for processing radio broadcast signals in which carrier frequencies can be
- The processing method of the present invention overcomes these and other problems not solved by the prior art by abandoning the commonly used feedback loop and providing an innovative method of controlled peak clipping and signal detection.
- This method introduces precisely calculated amplification of soft and medium sounds to the benefit of auditory detail perception and, especially, speech understanding. It simultaneously reduces short-duration, high-level impulse spikes on an instantaneous basis. This effectively attenuates stress on the crucial hair cilia of the cochlea, thus providing a valuable hearing conservation benefit to the listener.
- The combination of high-level outputs and extended listening time for entertainment, telecommunication, and other electronic audio devices is well understood to cause permanent sensorineural hearing impairment.
- Figure 1 is a flow diagram of the processing stages of the present invention.
- Figure 2 is a graphic representation of the acoustic pattern of an example of a recorded passage of music illustrating that the average energy distribution lies at 10 dB below the peak energy values (32% of peak);
- Figure 3 is an enlarged view of the acoustic pattern Figure 2 illustrating that the contribution to total power by excursions over 10 dB is less than half the power contributed by remaining signals;
- Figure 4 illustrates the waveform of Figure 2 after excision of 10 dB of the peak power;
- Figure 5 illustrates the signal of Figures 2-4 amplified after clipping (or 'overdriven' by 10 dB);
- Figure 6 illustrates the classical temporal integration pattern for human listeners, showing the steep fall-off of detection ability as a function of duration; loudness does not fully integrate until signal duration reaches approximately 100 milliseconds;
- Figure 7 illustrates the averaged spectrum of a single sentence speech sample without the processing of the present invention.
- the low frequencies are
- Figure 8 illustrates the speech sentence shown in Figure 7 following processing by the present invention showing that the averaged spectrum is flattened without the undesired consequence of biasing the frequency response by filtering the low frequency region;
- Figure 9.a illustrates the acoustic waveform of a female speaker's utterance of the word "Intuition";
- Figure 9.b illustrates the waveform of Figure 9.a following processing by the present invention, showing that soft consonants have been intensified, rendering an audible clarity improvement;
- Figure 10.a illustrates the acoustic waveform of a male speaker's utterance of a sentence simultaneously overlaid with a series of sharp, high-intensity impulses. After processing by the present invention (Fig. 10.b), the impulse spikes are clearly removed. Simultaneously, soft speech has been intensified to the advantage of greater clarity.
- Figure 10.b illustrates the waveform of Figure 10.a following processing by the present invention, showing the removal of the impulse spikes accompanied by soft speech intensification and audible sound clarity improvement.
- The present invention exploits the psychoacoustic property of temporal integration in the human auditory system. This is a crucial aspect of the method. It is known that the loudness of signals is integrated within a time window of approximately 100 milliseconds. Hence, shorter-duration impulse spikes sound considerably softer and are often imperceptible. An illustration of this is shown in Figures 3 and 4. In that example, the dynamic amplitude pattern of a music passage is shown with its amplitude peaks reduced by 10 dB by the present invention; the net loudness reduction is only 0.2 dB because of the psychoacoustically determined temporal integration.
- In FIG. 5, the audio signal of Figures 3 and 4 is shown amplified after clipping, or "overdriven", by 10 dB.
- The average levels of long-duration signals are increased, which results in increased loudness for soft and mid-level sounds, the net effect of which is to enhance the detail and clarity of the signal.
- High-level impulses that are extremely fast, i.e., less than 2 msec, are instantaneously adjusted downward by the third stage shown in Figure 1, which applies controlled clipping with no time delay.
- The extreme brevity of these signals reduces the distortion associated with the clipping to generally imperceptible levels, owing to the temporal integration roll-off illustrated in Figure 6 and explained previously.
- Speech clarity in audio systems and especially noisy input environments is often compromised by the greater intensity of low frequency, higher energy vowels which tend to mask the higher frequency, lower intensity consonants.
- Traditional approaches often apply filter techniques to attenuate the low frequency noise and voice components. In some cases the approach is to bias the spectrum in favor of the high frequencies. Both have the effect of creating an undesirable tinny sound and a negative perceptual effect on voice quality.
- The present invention avoids this problem by boosting all soft and mid-level sounds without filtering or frequency biasing.
- The range of the applied gain value is between approximately 1 dB and 40 dB.
- A train of impulses (or peaks in a continuous sinusoidal or complex signal) is treated as a Long Term signal. Because the attack and release are exponential functions, recovery on termination of a vowel in speech is relatively fast, which permits almost full amplification of consonants or other low-level sounds, e.g., in music.
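
The clip-then-amplify idea described in the excerpts above (excising roughly 10 dB of brief peaks, then "overdriving" the result by the same amount) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented processing chain: the function names, the hard-clip rule, and the synthetic test signal are all hypothetical, and the detection stages and exponential attack/release behaviour mentioned in the text are not modelled.

```python
import math


def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def clip_peaks(samples, ceiling):
    """Memoryless hard clip: each sample is limited on its own,
    so no look-ahead, delay, or feedback loop is involved."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def rms_db(samples):
    """Average power of the signal in dB (relative to full scale = 1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def clip_and_overdrive(samples, clip_db=10.0):
    """Clip clip_db below the observed peak, then amplify by clip_db.

    Assumption: the ceiling is derived from the peak of the whole
    buffer, which is a simplification chosen for this sketch.
    """
    peak = max(abs(s) for s in samples)
    ceiling = peak * db_to_amplitude_ratio(-clip_db)
    gain = db_to_amplitude_ratio(clip_db)
    return [s * gain for s in clip_peaks(samples, ceiling)]


# Hypothetical test signal: a soft 200 Hz tone with five 1 ms spikes.
RATE = 8000
signal = [0.1 * math.sin(2 * math.pi * 200 * n / RATE) for n in range(RATE)]
for start in range(1000, 6000, 1000):
    for n in range(start, start + RATE // 1000):
        signal[n] = 1.0

processed = clip_and_overdrive(signal, clip_db=10.0)
print("peak before:", max(abs(s) for s in signal))
print("peak after: ", max(abs(s) for s in processed))
print("average level change (dB):", rms_db(processed) - rms_db(signal))
```

A 10 dB reduction corresponds to an amplitude ratio of about 0.316, which agrees with the "32% of peak" figure quoted for Figure 2. In this sketch the output peak stays at the original peak level while the soft tone is raised by the full 10 dB, which is the effect the text attributes to amplifying after clipping.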
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Circuit For Audible Band Transducer (AREA)
- Tone Control, Compression And Expansion, Limiting Amplitude (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Measurement Of Current Or Voltage (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US2485808P | 2008-01-30 | 2008-01-30 | |
US12/361,508 US20090192793A1 (en) | 2008-01-30 | 2009-01-28 | Method for instantaneous peak level management and speech clarity enhancement |
PCT/US2009/032449 WO2009097437A1 (en) | 2008-01-30 | 2009-01-29 | Method for instantaneous peak level management and speech clarity enhancement |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2235720A1 (en) | 2010-10-06 |
EP2235720A4 (en) | 2012-01-25 |
Family
ID=40900108
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
EP09706215A | Ceased | EP2235720A4 (en) | 2008-01-30 | 2009-01-29 | Method for instantaneous peak level management and speech clarity enhancement |
Country Status (8)
Country | Link |
---|---|
US (1) | US20090192793A1 (en) |
EP (1) | EP2235720A4 (en) |
JP (1) | JP5345638B2 (en) |
CN (1) | CN102144257A (en) |
AU (1) | AU2009209090B2 (en) |
CA (1) | CA2718968A1 (en) |
NZ (1) | NZ587052A (en) |
WO (1) | WO2009097437A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
BRPI0913987A2 (en) * | 2008-06-30 | 2015-10-20 | Able Planet Inc | signal processing method and system |
EP2518723A4 (en) * | 2009-12-21 | 2012-11-28 | Fujitsu Ltd | Voice control device and voice control method |
RU2568281C2 (en) * | 2013-05-31 | 2015-11-20 | Александр Юрьевич Бредихин | Method for compensating for hearing loss in telephone system and in mobile telephone apparatus |
CN109979475A (en) | 2017-12-26 | 2019-07-05 | 深圳Tcl新技术有限公司 | Solve method, system and the storage medium of echo cancellor failure |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4208548A (en) * | 1977-07-19 | 1980-06-17 | Orban Associates, Inc. | Apparatus and method for peak-limiting audio frequency signals |
US4249042A (en) * | 1979-08-06 | 1981-02-03 | Orban Associates, Inc. | Multiband cross-coupled compressor with overshoot protection circuit |
US4928311A (en) * | 1986-01-03 | 1990-05-22 | Trompler Lyle D | Noise limiting circuit for earmuffs |
ATE79495T1 (en) * | 1986-04-03 | 1992-08-15 | Motorola Inc | FM RECEIVER WITH NOISE REDUCTION WHEN RECEIVING SIGNALS WITH ''RALEIGH'' FADER. |
JPS63203097A (en) * | 1987-02-18 | 1988-08-22 | Nippon Telegr & Teleph Corp <Ntt> | Video conference system |
US4926144A (en) * | 1988-09-29 | 1990-05-15 | General Electric Company | Multi-function modulation and center frequency control port for voltage controlled oscillator |
US5168526A (en) * | 1990-10-29 | 1992-12-01 | Akg Acoustics, Inc. | Distortion-cancellation circuit for audio peak limiting |
JP3295443B2 (en) * | 1991-10-09 | 2002-06-24 | パイオニア株式会社 | Signal processing circuit in audio equipment |
US5459814A (en) * | 1993-03-26 | 1995-10-17 | Hughes Aircraft Company | Voice activity detector for speech signals in variable background noise |
JPH07104788A (en) * | 1993-10-06 | 1995-04-21 | Technol Res Assoc Of Medical & Welfare Apparatus | Voice emphasis processor |
US5448646A (en) * | 1993-11-01 | 1995-09-05 | Unex Corporation | Headset interface assembly |
JPH08161704A (en) * | 1994-12-07 | 1996-06-21 | Pioneer Electron Corp | Automatic bias control method and apparatus |
US5631968A (en) * | 1995-06-06 | 1997-05-20 | Analog Devices, Inc. | Signal conditioning circuit for compressing audio signals |
US5862238A (en) * | 1995-09-11 | 1999-01-19 | Starkey Laboratories, Inc. | Hearing aid having input and output gain compression circuits |
US5815532A (en) * | 1996-05-01 | 1998-09-29 | Glenayre Electronics, Inc. | Method and apparatus for peak-to-average ratio control in an amplitude modulation paging transmitter |
US5737434A (en) * | 1996-08-26 | 1998-04-07 | Orban, Inc. | Multi-band audio compressor with look-ahead clipper |
KR100213073B1 (en) * | 1996-11-09 | 1999-08-02 | 윤종용 | Frequency response compensation apparatus of audio signal in playback mode |
JPH10163775A (en) * | 1996-12-02 | 1998-06-19 | Eiden Kk | Limiting amplifier |
US6610917B2 (en) * | 1998-05-15 | 2003-08-26 | Lester F. Ludwig | Activity indication, external source, and processing loop provisions for driven vibrating-element environments |
US6757396B1 (en) * | 1998-11-16 | 2004-06-29 | Texas Instruments Incorporated | Digital audio dynamic range compressor and method |
US7027981B2 (en) * | 1999-11-29 | 2006-04-11 | Bizjak Karl M | System output control method and apparatus |
GB2359177A (en) * | 2000-02-08 | 2001-08-15 | Nokia Corp | Orientation sensitive display and selection mechanism |
US6731768B1 (en) * | 2000-07-26 | 2004-05-04 | Etymotic Research, Inc. | Hearing aid having switched release automatic gain control |
EP1405424A1 (en) * | 2001-06-28 | 2004-04-07 | Koninklijke Philips Electronics N.V. | Narrowband speech signal transmission system with perceptual low-frequency enhancement |
FR2831961B1 (en) * | 2001-11-07 | 2004-07-23 | Inst Francais Du Petrole | METHOD FOR PROCESSING SEISMIC DATA OF WELLS IN ABSOLUTE PRESERVED AMPLITUDE |
US6741844B2 (en) * | 2001-11-27 | 2004-05-25 | Motorola, Inc. | Receiver for audio enhancement and method therefor |
EP1599992B1 (en) * | 2003-02-27 | 2010-01-13 | Telefonaktiebolaget L M Ericsson (Publ) | Audibility enhancement |
JP4048499B2 (en) * | 2004-02-27 | 2008-02-20 | ソニー株式会社 | AGC circuit and AGC circuit gain control method |
US7391875B2 (en) * | 2004-06-21 | 2008-06-24 | Waves Audio Ltd. | Peak-limiting mixer for multiple audio tracks |
US20060206320A1 (en) * | 2005-03-14 | 2006-09-14 | Li Qi P | Apparatus and method for noise reduction and speech enhancement with microphones and loudspeakers |
-
2009
- 2009-01-28 US US12/361,508 patent/US20090192793A1/en not_active Abandoned
- 2009-01-29 CA CA2718968A patent/CA2718968A1/en not_active Abandoned
- 2009-01-29 CN CN2009801037049A patent/CN102144257A/en active Pending
- 2009-01-29 WO PCT/US2009/032449 patent/WO2009097437A1/en active Application Filing
- 2009-01-29 NZ NZ587052A patent/NZ587052A/en not_active IP Right Cessation
- 2009-01-29 AU AU2009209090A patent/AU2009209090B2/en not_active Ceased
- 2009-01-29 JP JP2010545165A patent/JP5345638B2/en not_active Expired - Fee Related
- 2009-01-29 EP EP09706215A patent/EP2235720A4/en not_active Ceased
Non-Patent Citations (2)
Title |
---|
No further relevant documents disclosed * |
See also references of WO2009097437A1 * |
Also Published As
Publication number | Publication date |
---|---|
CA2718968A1 (en) | 2009-08-06 |
NZ587052A (en) | 2013-04-26 |
AU2009209090A1 (en) | 2009-08-06 |
CN102144257A (en) | 2011-08-03 |
AU2009209090B2 (en) | 2013-05-02 |
US20090192793A1 (en) | 2009-07-30 |
WO2009097437A1 (en) | 2009-08-06 |
JP5345638B2 (en) | 2013-11-20 |
JP2011511964A (en) | 2011-04-14 |
EP2235720A4 (en) | 2012-01-25 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20100830 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK TR |
AX | Request for extension of the european patent | Extension state: AL BA RS |
DAX | Request for extension of the european patent (deleted) | |
A4 | Supplementary search report drawn up and despatched | Effective date: 20111229 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G10L 21/02 20060101AFI20111222BHEP |
17Q | First examination report despatched | Effective date: 20121109 |
REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
18R | Application refused | Effective date: 20130711 |