WO1999017278A1 - Method and apparatus for improving speech intelligibility - Google Patents
- Publication number
- WO1999017278A1 (PCT/GB1998/002890)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- intelligibility
- word
- threshold
- speech
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
Definitions
- The present invention relates to a method and apparatus for improving the intelligibility of the spoken word.
- The intelligibility of the spoken word is determined by a number of different factors. It has been known from an examination of the temples of Ancient Egypt and the amphitheatres of Greece and Rome that structural changes to buildings and spaces can improve the intelligibility of the spoken word. The impetus towards a more scientific approach to the problem came with the advent of the telephone, which resulted in a large volume of scientific work, though mainly with the aim of solving problems of distortion, bandwidth and transducer design in telephone systems.
- Knudsen published a work entitled "On Hearing in Auditoriums" where he postulated that what he termed percentage articulation was a function of reverberation, noise, room shape and echo.
- STI: speech transmission index
- The factors affecting speech intelligibility can be grouped into four major areas, namely those associated with the talker, the listener, the space and the transmission system.
- The list of factors can be greatly reduced by assuming a perfect talker, a normal listener, a space free from anomalies and a perfect transmission system. These assumptions reduce the list of influencing factors to those dependent on direct sound pressure level, reverberant sound pressure level, reverberation time and noise. If we understand that the direct sound pressure level is the signal (speech) and wanted component, then this list further reduces to two ratios as follows: the direct-to-reverberant ratio, i.e. the ratio of wanted to unwanted sound, and the signal-to-noise ratio i.e.
- Speech intelligibility is a function of the product of all three factors mentioned above. As previously mentioned, the detrimental effects and the limitations imposed by one dependent variable may not be fully compensated by another. The intelligibility of speech can be improved by making structural alterations to the space where the listener is present, e.g. by reducing the reverberation time within the space by the introduction of acoustic absorption. In certain instances, however, it is not economic to make such material changes, and there is a need for a simpler and more economic way of improving the intelligibility of the spoken word in public areas. Since it is now commonplace for spoken words to be amplified by electrical apparatus, we have looked at ways in which the problem can be solved electronically.
- The present invention provides apparatus for broadcasting speech into an acoustic space through one or more loudspeakers, which comprises means for compressing the higher amplitude portions of the spoken words and expanding the thus compressed signal, thereby emphasising the lower amplitude parts of the spoken words.
- The compression is preferably in the range 1:1 to 10:1 and usually 2:1 to 4:1. In most cases a compression ratio of 3:1 will suffice.
- The present invention also provides a method for broadcasting public address announcements into closed acoustic spaces which comprises the use of the compression/expansion apparatus.
- The threshold at which compression commences is preferably selected depending on the speech characteristics of the person enunciating the words. As an alternative, the speech signals can be normalised prior to transmission to the compressor/expander apparatus, in which case the threshold can be preset to a specific value depending on the output from the normalisation circuitry.
- Fig. 1 shows a graph of STI versus word score
- Fig. 2 shows a graph of input against output for a compressor according to the present invention
- Fig. 3 shows a block diagram of an arrangement according to the present invention.
- Fig. 4 shows the amplitude waveforms of various words before and after processing by apparatus according to the present invention.
- Speech has a dynamic range of some 30-40 dB or, in pressure terms, around 100:1, i.e. the quietest parts of our speech (at a normal level) are around one hundredth of the loudest.
- Vowel sounds: 100 Hz - 1 kHz
- These large vowel sounds tend to mask the weaker consonants which are vital and play a far more important role in intelligibility.
- The vowel sounds also enhance the reverberant sound, thereby reducing the direct-to-reverberant ratio.
- Fig. 4a shows a waveform diagram of the word "drop" in its original, uncompressed form. After compression, as shown in Fig. 4b, it will be seen that the difference between the amplitudes of the loudest and quietest parts of the word has been reduced. Thus, on expansion or amplification, the very quiet consonant "p" has been enhanced with respect to the vowel sound "o" and therefore the intelligibility of the word "drop" has been improved.
- In Fig. 3, a diagram of a suitable apparatus is shown in which the user speaks into a microphone 10. The output from the microphone is then passed through a normaliser 11, which processes the input signal and provides a normalised sound output. The output from the normaliser 11 is fed to a compression and expansion circuit 12, sometimes known as a compander, which applies amplitude compression to the input signal if the amplitude exceeds a pre-set threshold. The compander 12 is arranged to start compression at a threshold which is set relative to the magnitude of the speech in the signal chain.
- The threshold should be set at a value less than halfway up the dynamic range so that the majority of the speech signal is subject to compression. It has been found that a threshold at a value 5 - 6 dB above the level of the quietest part of the speech is adequate. Another way of expressing this is to look at the average value of the peak amplitudes of the speech signal, in which case the threshold should be in the range 28 dB to 22 dB below the average of the peak levels of the speech. Typically, the threshold is set at 25 dB below the average of the peaks.
- The amount of compression is usually in the range 2:1 to 10:1 but might be as high as 20:1.
- The output from the compander 12 is then fed to an electro-acoustic transducer in the form of a loudspeaker system 13 for broadcast to the listener, who is in an acoustic space.
- The above apparatus can be used to good effect in public address systems for all public spaces including, but not limited to, stations, theatres and cinemas. It also has application in other areas where ambient noise levels are high and speech intelligibility is important, such as in aircraft for in-flight announcements. It can also be applied to induction loops and hearing aids for persons with impaired hearing, since it has been found that those who suffer from impaired hearing due to age can have their understanding of spoken words improved if the aforementioned technique is utilised.
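
The compression and expansion scheme described above can be illustrated numerically. The following is a minimal Python sketch, not the circuit of Fig. 3: it assumes a hard-knee static compressor operating on levels in dB, and it models the "expansion" stage as a plain make-up gain that restores the loudest level. All function names, and the per-sound levels assigned to the word "drop", are hypothetical values chosen for illustration. Only the 3:1 ratio and the threshold 25 dB below the average of the peaks are taken from the text.

```python
import math


def pressure_ratio_to_db(ratio):
    """Convert a sound-pressure (amplitude) ratio to a level difference in dB."""
    return 20 * math.log10(ratio)


def threshold_from_peaks(peak_levels_db, offset_db=25.0):
    """Place the threshold offset_db below the average of the peak levels."""
    return sum(peak_levels_db) / len(peak_levels_db) - offset_db


def compress_level(level_db, threshold_db, ratio=3.0):
    """Hard-knee static curve (an assumption): levels at or below the
    threshold pass unchanged; above it, every `ratio` dB of input
    produces 1 dB of output."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def compand(levels_db, threshold_db, ratio=3.0):
    """Compress each level, then add make-up gain so that the loudest
    output equals the loudest input (a simple stand-in for expansion)."""
    compressed = [compress_level(l, threshold_db, ratio) for l in levels_db]
    makeup_db = max(levels_db) - max(compressed)
    return [c + makeup_db for c in compressed]


# A 100:1 pressure ratio expressed in dB.
range_db = pressure_ratio_to_db(100)

# Hypothetical peak levels (dB) of three utterances; their average is 0 dB.
threshold_db = threshold_from_peaks([0.0, -1.0, 1.0])

# Hypothetical levels (dB) for the sounds of "drop": d, r, o, p.
sounds = ["d", "r", "o", "p"]
before = [-12.0, -8.0, 0.0, -28.0]
after = compand(before, threshold_db, ratio=3.0)

# Level difference between the vowel "o" and the consonant "p".
gap_before = before[2] - before[3]
gap_after = after[2] - after[3]

for name, b, a in zip(sounds, before, after):
    print(f"{name}: {b:6.1f} dB -> {a:6.1f} dB")
print(f"vowel-to-consonant gap: {gap_before:.1f} dB -> {gap_after:.1f} dB")
```

In this sketch the vowel stays at its original level while the quiet consonant is raised, so the gap between them narrows. A real compander would also need attack and release time constants, which the text does not specify and which are omitted here.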
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0006872A GB2344982A (en) | 1997-09-26 | 1998-09-24 | Method and apparatus for improving speech intelligibility |
AU91772/98A AU9177298A (en) | 1997-09-26 | 1998-09-24 | Method and apparatus for improving speech intelligibility |
EP98944106A EP1018108A1 (en) | 1997-09-26 | 1998-09-24 | Method and apparatus for improving speech intelligibility |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9720544.7 | 1997-09-26 | ||
GB9720544A GB9720544D0 (en) | 1997-09-26 | 1997-09-26 | Method and apparatus for improving speech intelligibility |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1999017278A1 (en) | 1999-04-08 |
Family
ID=10819714
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB1998/002890 WO1999017278A1 (en) | 1997-09-26 | 1998-09-24 | Method and apparatus for improving speech intelligibility |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1018108A1 (en) |
AU (1) | AU9177298A (en) |
GB (1) | GB9720544D0 (en) |
WO (1) | WO1999017278A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0076687A1 (en) * | 1981-10-05 | 1983-04-13 | Signatron, Inc. | Speech intelligibility enhancement system and method |
EP0279451A2 (en) * | 1987-02-20 | 1988-08-24 | Fujitsu Limited | Speech coding transmission equipment |
US5506899A (en) * | 1993-08-20 | 1996-04-09 | Sony Corporation | Voice suppressor |
US5737719A (en) * | 1995-12-19 | 1998-04-07 | U S West, Inc. | Method and apparatus for enhancement of telephonic speech signals |
- 1997-09-26: GB9720544A (GB9720544D0), not active, ceased
- 1998-09-24: EP98944106A (EP1018108A1), not active, withdrawn
- 1998-09-24: PCT/GB1998/002890 (WO1999017278A1), not active, application discontinuation
- 1998-09-24: AU91772/98A (AU9177298A), not active, abandoned
Also Published As
Publication number | Publication date |
---|---|
EP1018108A1 (en) | 2000-07-12 |
AU9177298A (en) | 1999-04-23 |
GB9720544D0 (en) | 1997-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bronkhorst | The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions | |
EP0796489B1 (en) | Method for transforming a speech signal using a pitch manipulator | |
US5737719A (en) | Method and apparatus for enhancement of telephonic speech signals | |
CN109065067A (en) | A kind of conference terminal voice de-noising method based on neural network model | |
Pollack et al. | Masking of speech by noise at high sound levels | |
US20030216907A1 (en) | Enhancing the aural perception of speech | |
JP2017538146A (en) | Systems, methods, and devices for intelligent speech recognition and processing | |
Kennedy et al. | Consonant–vowel intensity ratios for maximizing consonant recognition by hearing-impaired listeners | |
Nejime et al. | Evaluation of the effect of speech-rate slowing on speech intelligibility in noise using a simulation of cochlear hearing loss | |
US20060126859A1 (en) | Sound system improving speech intelligibility | |
Nábělek | Performance of hearing‐impaired listeners under various types of amplitude compression | |
US20060239472A1 (en) | Sound quality adjusting apparatus and sound quality adjusting method | |
JP4876245B2 (en) | Consonant processing device, voice information transmission device, and consonant processing method | |
JP3367592B2 (en) | Automatic gain adjustment device | |
JP2000152394A (en) | Hearing aid for moderately hard of hearing, transmission system having provision for the moderately hard of hearing, recording and reproducing device for the moderately hard of hearing and reproducing device having provision for the moderately hard of hearing | |
WO2009001035A2 (en) | Transmission of audio information | |
Arai et al. | Effective speech processing for various impaired listeners | |
KR20090082605A (en) | Creation Method of channel of digital hearing-aid and Multi-channel digital hearing-aid | |
JPH09311696A (en) | Automatic gain control device | |
US7123732B2 (en) | Process to adapt the signal amplification in a hearing device as well as a hearing device | |
Kusumoto et al. | Modulation enhancement of speech as a preprocessing for reverberant chambers with the hearing-impaired | |
EP1018108A1 (en) | Method and apparatus for improving speech intelligibility | |
Yanick et al. | Signal processing to improve intelligibility in the presence of noise for persons with a ski-slope hearing impairment | |
Vaughan et al. | Time-expanded speech and speech recognition in older adults. | |
RU2589298C1 (en) | Method of increasing legible and informative audio signals in the noise situation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
ENP | Entry into the national phase |
Ref country code: GB Ref document number: 200006872 Kind code of ref document: A Format of ref document f/p: F |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1998944106 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: KR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 09509431 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 1998944106 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
NENP | Non-entry into the national phase |
Ref country code: CA |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1998944106 Country of ref document: EP |