WO2008075305A1 - Method and apparatus to address source of lombard speech

Method and apparatus to address source of lombard speech

Publication number
WO2008075305A1
Authority
WO
WIPO (PCT)
Prior art keywords
lombard
audio
adapted
apparatus
stress
Prior art date
Application number
PCT/IB2007/055241
Other languages
French (fr)
Inventor
Xavier Chabot
Daniel Willem Elisabeth Schobben
Original Assignee
Nxp B.V.
Priority date
Filing date
Publication date
Priority to US87627306P
Priority to US60/876,273
Application filed by Nxp B.V.
Publication of WO2008075305A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • G10L2021/03646 Stress or Lombard effect

Abstract

A method of reducing the impact of the Lombard Effect includes identifying audio stress. The method also includes taking a remedial action to reduce the audio stress. An apparatus adapted to reduce the impact of the Lombard Effect is described.

Description

Method and Apparatus to Address Source of Lombard Speech

Background

The Lombard Effect relates to the alteration of speech in response to stress, which may be induced by ambient audio sources. For example, if a person is watching a television and has to take a telephone call, the ambient sound from the television may distract the person, or interfere with the person's ability to hear the other conversant on the telephone, or both.

Lombard speech often manifests as a variety of modifications to speech made in an effort to improve speech communication in the presence of the ambient source of stress. For example, ambient noise from a social gathering often results in Lombard speech manifest as increased volume (audio intensity) of speech by the affected person. In normal conversation, participants are more likely to move closer together than to raise their voices. However, a rule-of-thumb relationship is that every 1 dB increase in interfering noise above 45 dB results in an average rise in output speech level of 0.5 dB to 0.6 dB. This induced rise in output speech level does not normally occur at lower ('softer') speech levels, as these are more likely to occur in close-range/face-to-face conversations.
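By way of illustration only, the rule-of-thumb relationship noted above may be expressed as a short routine. The function name, the parameter names and the default slope of 0.55 dB per dB (the midpoint of the quoted 0.5 dB to 0.6 dB range) are assumptions made for this sketch and are not part of the disclosure.

```python
def lombard_speech_rise_db(noise_db, threshold_db=45.0, slope_db_per_db=0.55):
    """Estimate the average rise in output speech level, in dB.

    Below the threshold no rise is assumed; above it, the rise grows
    linearly with the excess noise level (rule of thumb only).
    """
    excess_db = max(0.0, noise_db - threshold_db)
    return excess_db * slope_db_per_db


if __name__ == "__main__":
    for level in (40.0, 55.0, 65.0):
        print(level, lombard_speech_rise_db(level))
```

The linear form is only a restatement of the quoted averages; individual speakers may depart from it considerably.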

In addition to increased output speech level (volume), Lombard speech can be manifest as an alteration of spectral characteristics (e.g., pitch) of the affected person's speech.

Methods have been developed to attempt to reduce Lombard speech. Unfortunately, in known methods to improve sound reproduction, the awareness of the context is often quite limited. For example, known methods may attempt to account for background noise levels, but do not provide adequate solutions that take into account the stress that a user may feel when raising his or her voice over the background noise. There is a need, therefore, for a method and system that overcomes at least the shortcomings described above.

Summary

In accordance with an illustrative embodiment, a method of reducing the impact of the Lombard Effect includes identifying audio stress and taking a remedial action to reduce the audio stress. In accordance with another illustrative embodiment, an apparatus adapted to reduce the impact of the Lombard Effect includes: a voice feature extractor adapted to obtain at least one feature of a user's voice; a Lombard speech detector adapted to determine a presence or absence of Lombard speech based on a Lombard speech classifier; a controller adapted to map the Lombard speech classifier to a sound reproduction control parameter; and a sound modification device adapted to alter a sound parameter to reduce audio stress of the user.

Brief Description of the Drawings

The invention is best understood from the following detailed description when read with the accompanying drawing figures. It is emphasized that the various features are not necessarily drawn to scale. In fact, the dimensions may be arbitrarily increased or decreased for clarity of discussion. Wherever practical, like reference numerals refer to like elements in the drawing figures.

Fig. 1 is a conceptual diagram showing a source of audio stress and an apparatus adapted to reduce the audio stress in accordance with a representative embodiment.

Fig. 2 is a block diagram of an apparatus according to a representative embodiment.

Fig. 3 is a flow-chart of a method of reducing the impact of the Lombard Effect in accordance with a representative embodiment.

Fig. 4 is a conceptual view of a device in accordance with a representative embodiment.

Defined Terminology

As used herein, the terms 'a' and 'an' mean one or more; and the term 'plurality' means two or more. As used herein, audio stress means stress a user feels when trying to make himself/herself heard in a conversation and which is due to an audio source.

Detailed Description

In the following detailed description, for purposes of explanation and not limitation, representative embodiments disclosing specific details are set forth in order to provide a thorough understanding of the present teachings. However, it will be apparent to one having ordinary skill in the art having had the benefit of the present disclosure that other embodiments depart from the specific details disclosed herein. Moreover, descriptions of well-known devices, methods, systems and protocols may be omitted so as to not obscure the description of the representative embodiments. Nonetheless, such devices, methods, systems and protocols that are within the purview of one of ordinary skill in the art may be used in accordance with the representative embodiments. In addition, the detailed description which follows presents methods that may be embodied by routines and symbolic representations of operations of data bits within a computer readable medium, associated processors, microprocessors, entertainment electronic equipment including televisions and audio equipment, communications devices including telephones, mobile phones, radios, and personal computers, to name only a few. In general, a method herein is conceived to be a sequence of steps or actions leading to a desired result, and as such, encompasses such terms of art as "routine," "program," "objects," "functions," "subroutines," and "procedures." The method(s) may also be embodied in an analog device or a hardware device the operation of which is not typically embodied in a 'program'.

With respect to the software useful in the embodiments described herein, those of ordinary skill in the art will recognize that there exist a variety of platforms and languages for creating software for performing the procedures outlined herein. Certain illustrative embodiments can be implemented using any of a number of varieties of operating systems (OS) and programming languages. For example, the OS may be a commercially available OS from Microsoft Corporation, Seattle, Washington, USA, or a Linux OS. The programming language may be a C-programming language, such as C++, or Java.

Fig. 1 is a conceptual diagram showing a source of audio stress and an apparatus adapted to reduce the audio stress in accordance with a representative embodiment. In the diagram, a device 101 includes an apparatus 102 adapted to detect the existence of Lombard speech, to determine the source of the stress and to take a remedial measure to reduce the audio stress. In certain representative embodiments, the device 101 is a television (TV), a radio or other electronic musical device; and the apparatus 102 is provided in the device 101. Finally, and as described more fully herein, the device 101 optionally includes a detector 103 adapted to detect other signs of Lombard stress. Alternatively, the detector 103 may be a component of the apparatus 102.

In the diagram, a user 104 is engaging in a conversation on a telephone 105 or other device into which the user provides a voice input (e.g., a voice recorder). Notably, the user 104 may have been watching the TV or listening to a radio when a telephone call is received.

It is emphasized that the scenario of Fig. 1 is merely illustrative and that other applications of the representative embodiments are contemplated. For example, in a face-to-face conversation in a noisy environment audio stress may occur if two people attempting to engage in conversation cannot get close enough to improve their communication.

As will be appreciated, the TV or radio provides ambient noise that may result in audio stress to the user 104. The apparatus 102, or the detector 103, or both, are adapted to detect Lombard speech. Moreover, the apparatus 102 is adapted to identify the source of the audio stress. In its simplest and most common form, the device 101 maintains a current setting of its output audio intensity and provides this to the apparatus 102. With this information, the apparatus 102 may readily identify the source of the audio stress, for example, by comparing the audio intensity to stored data. Alternatively, or additionally, the source of the audio stress may be determined, for example, by incorporating a microphone into the apparatus 102 with feedback to a controller (not shown) of the apparatus 102.
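The comparison of a reported audio intensity against stored data may be sketched as follows. This is a minimal illustration only: the device names, the threshold values and the function name are hypothetical, and the disclosure does not specify the form of the stored data.

```python
def identify_stress_sources(reported_levels_db, stored_thresholds_db):
    """Return the names of devices whose reported output level exceeds
    the stored threshold for that device (all values hypothetical)."""
    sources = []
    for name, level_db in reported_levels_db.items():
        threshold_db = stored_thresholds_db.get(name)
        # Devices without stored data are skipped rather than guessed at.
        if threshold_db is not None and level_db > threshold_db:
            sources.append(name)
    return sorted(sources)


if __name__ == "__main__":
    reported = {"tv": 70.0, "radio": 40.0}
    stored = {"tv": 55.0, "radio": 55.0}
    print(identify_stress_sources(reported, stored))
```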

In another representative embodiment, control signals may be sent to the apparatus 102 to take preventive or remedial action. These control signals may result from an identification of the presence of Lombard stress, or may be generated when phone calls arrive. In the former scenario, a microphone (not shown) may receive speech at high audio levels. Notably, the microphone may be a component of the telephone 105, or located elsewhere (e.g., a component of the apparatus 102). Once such Lombard speech is detected, the control signals are generated and transmitted to the apparatus 102 to take action. In the latter scenario, and based on user-specified presets, the apparatus 102 may adjust the audio intensity of the device 101 to mitigate audio stress. Notably, the control signals may be transmitted via a wired or wireless communication link (not shown).

As alluded to above, after identifying the Lombard stress, the apparatus 102 takes remedial action. The action may be, for example, to reduce the volume (audio intensity) of the device 101; or to reduce the audio output over a certain portion of the audio spectrum; or to take a user-defined preset step to reduce the source of audio stress. The identification of Lombard stress and the remedial action are described more fully in conjunction with other representative embodiments.

In the embodiment of Fig. 1, the detector 103 may be useful in determining the presence of other symptoms indicating Lombard speech. For example, the facial expressions of the user 104 may indicate emotional stress indicative of Lombard speech. These expressions may be generated in a training sequence by the user 104 and stored in a memory (not shown) in the apparatus 102. The detector 103 may receive images of emotional expressions during normal operation. Upon recognizing the identity of the user 104 and determining that an expression characteristic of the user 104 when subjected to audio stress is present, data from the detector 103 are provided to the apparatus 102 for remedial action. Beneficially, the visual recognition function serves to make the operation of the apparatus 102 user specific, and to add information useful for Lombard detection. Notably, in such an embodiment, the detector 103 may be an imaging device such as a video recorder or a camera. The detector 103 can also use control messages sent by other devices in the environment. For instance, the phone system can transmit a signal informing the detector 103 that there is a phone call in progress. This control message may trigger the detector 103 to commence audio stress detection, optionally with the help of video surveillance of the user 104.

In certain embodiments, the apparatus 102 may be adapted to determine multiple sources of audio stress and to provide commands to the sources to reduce or eliminate the source of audio stress. For instance, the device 101 may be a television that is maintained in the room. However, the source of audio stress may be a whole-house stereo system having a speaker 106 but a receiver in another portion of the house. Alternatively, the speaker 106 may be a component of a stereo or radio in the room. In either case, the apparatus 102 may be adapted to transmit a control signal to the device controlling the speaker 106 in order to effect a remedial response. Notably, the transmission may be via a wired or wireless communication link (not shown).

In another representative embodiment, the apparatus 102 may be integrated into the telephone 105 or other device accessed by the user; or may be a separate device. The apparatus 102 may be linked to the device 101, or the device controlling the speaker 106, by a wireless or wired link to effect remedial action to reduce the source of the audio stress. For example, the device 101, the telephone 105 and the controller of the speaker 106 may be adapted to function under the Bluetooth Standard, or IEEE 802.11 or its progeny, or IEEE 802.22. Of course, the noted standards are provided only to illustrate some useful methods and systems to communicate between the apparatus 102 and the device 101 and speaker 106 to effect remedial action thereto.

Fig. 2 is a simplified block diagram of the apparatus 102 in accordance with a representative embodiment. Notably, the components described presently may be instantiated in hardware, software and firmware, and combinations thereof. For example, the apparatus 102 may include a known microprocessor having selected software provided therein to provide desired functionality. Many details of the hardware, software and firmware are well within the purview of one of ordinary skill in the art and are not repeated in order to avoid obscuring the description of the representative embodiments.

The apparatus 102 includes a voice feature extractor (VFE) 201. The VFE 201 is adapted to detect voice activity from a user. In a representative embodiment, the VFE 201 is a voice recognition device and may be adapted to extract pitch contour and jitter, formant frequencies and bandwidth, amplitude and spectral centroid from a user input. As is known, a formant is a peak in an acoustic transfer function, which results from the resonant frequencies of any acoustical system. It is most commonly invoked in phonetics or acoustics involving the resonant frequencies of vocal tracts or musical instruments. However, it is equally valid to talk about the formant frequencies of the air in a room. Formants are the distinguishing characteristic of human speech and of singing. By definition, the information that humans require to distinguish between vowels can be represented purely quantitatively by the frequency content of the vowel sounds, which in turn depends on the formant structure of the vocal tract that produces that sound. Thus, voice recognition relies on the detection of the number of formants and each formant characteristic (e.g., frequency position, bandwidth, gain, and spectral tilt). In other embodiments, the audio spectral centroid, which is a measure of the average frequency, weighted by amplitude, of an audio spectrum, is used in voice recognition. In cognition applications, the spectral centroid is usually averaged over time.

Moreover, the VFE 201 may be useful in determining representations such as Mel-Cepstral Coefficients (MCC). As is known, a cepstrum is the result of taking the Fourier transform (FT) of the decibel spectrum as if it were a signal. Its name was derived by reversing the first four letters of "spectrum". There is a complex cepstrum and a real cepstrum. The MCCs or Mel Frequency Cepstral Coefficients (MFCCs) are coefficients that represent audio. They are derived from a type of cepstral representation of the audio clip. The basic difference between the cepstrum and the MFCC is that in the MFCC, the frequency bands are positioned logarithmically (on the mel scale), which approximates the human auditory system's response more closely than the linearly-spaced frequency bands obtained directly from the discrete FT (DFT) or discrete cosine transform (DCT). This can allow for better processing of data, for example, in audio compression. The determination of the MCCs is known and such representations are common in known voice recognition systems.
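For purposes of illustration, the spectral centroid described above (the amplitude-weighted average frequency of a spectrum) may be computed for one short frame as sketched below. The function names are assumptions, and a naive DFT is used only to keep the sketch self-contained; a practical VFE would more likely use an FFT routine and average the result over successive frames.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid_hz(frame, sample_rate_hz):
    """Amplitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, report 0
    bin_hz = sample_rate_hz / len(frame)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


if __name__ == "__main__":
    # A 1 kHz tone sampled at 8 kHz falls exactly on a DFT bin for N = 64.
    tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(64)]
    print(round(spectral_centroid_hz(tone, 8000), 3))
```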

Output of the VFE 201 is provided to a Lombard speech detector 202. The detector 202 detects the presence or absence of Lombard speech based on the extracted features of the VFE 201. In one representative embodiment, the detector determines whether the speech is or is not Lombard speech. Thus, the detector 202 may provide a simple logic output based on the presence or absence of Lombard speech. In other embodiments, the detector 202 may comprise a more sophisticated Lombard speech recognizer that provides one or several continuous parameters that can be used with the control functions described herein.

The Lombard speech detector 202 may be effected by one of a variety of methods and variations thereof. Notably, the detector 202 may be a speech recognition engine such as described in U.S. Patent Publication 20030236672 to A. Aaron, et al. and entitled "Apparatus and Method for Testing Speech Recognition"; and in U.S. Patent Publication 20020173959 to Y. Gong and entitled "Method and Apparatus of Speech Recognition with Compensation for both Channel Distortion and Background Noise." The disclosures of these publications are specifically incorporated herein by reference.

In one illustrative embodiment, more sophisticated Lombard speech recognition capability includes a degree or scale factor of Lombard speech adapted to discern the degree of the Lombard speech. For instance, if the user were attempting to carry on a telephone conversation with comparatively very loud music from device 101, the degree of Lombard speech would be great. By contrast, if background music were played at a comparatively low volume from a desk-top radio at a user's place of employment, the degree of Lombard speech would be comparably slight. In certain embodiments, the detector 202 may provide a continuous scale from no Lombard speech to extreme Lombard speech. Alternatively, there may be discrete gradations of the Lombard speech, such as: no Lombard speech; low Lombard speech; medium Lombard speech; high Lombard speech; and very high Lombard speech.

The presence or degree of Lombard speech may be determined in a variety of ways. For instance, the detector 202 may be instantiated in software that is 'trained' to recognize certain characteristic features of Lombard speech, according to many of the same principles of training speech recognition engines. For example, for non-specific users, the test users may wear headphones with background noise at different volumes and speak into recording devices. Different noise files may be prepared for different volumes and noise sources to create the baseline characteristics of the Lombard speech.
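One way a continuous degree of Lombard speech might be reduced to the five discrete levels listed above is sketched below. The normalized range of 0 to 1 and the evenly spaced boundaries are assumptions chosen for illustration; the disclosure does not specify how the levels are delimited.

```python
import bisect

# Labels follow the levels named in the description; boundaries are hypothetical.
LEVELS = ("none", "low", "medium", "high", "very high")
BOUNDARIES = (0.2, 0.4, 0.6, 0.8)


def lombard_level(degree):
    """Map a continuous degree in [0, 1] to one of five discrete labels."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree must lie in [0, 1]")
    # bisect_right counts how many boundaries the degree has reached.
    return LEVELS[bisect.bisect_right(BOUNDARIES, degree)]


if __name__ == "__main__":
    for d in (0.0, 0.3, 0.5, 0.7, 1.0):
        print(d, lombard_level(d))
```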

Alternatively, user-specific Lombard speech recognition may be provided. Illustratively, before implementing the apparatus 102, the user would speak into a microphone with varying degrees of background noise/audio stress sources in order to create a user-specific profile.

In another embodiment, the detector 202 is integrated into the infrastructure of an intelligent digital home. Information collected by a variety of sensors is processed to aid in the self training and intelligent ambient-dependent control of the apparatus 102. Such information includes user identification, habits and behavior, and user preferences, for example.

Regardless of the speech recognition engine, or the degree of its sophistication, the detector 202 provides classifications of the Lombard speech. These classifications are then provided to a controller 203 adapted to map the classifications to sound reproduction control parameters adapted to effect changes to mitigate the impact of the source of audio stress (e.g., device 101).
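The mapping of a classification to a sound reproduction control parameter may be sketched as a look-up table, optionally overridden by stored user preferences. The class name, the label set and every attenuation value below are hypothetical and serve only to illustrate the mapping step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlCommand:
    """Hypothetical command sent to the source of audio stress."""
    mute: bool
    attenuation_db: float


# Illustrative defaults only; a real table would be tuned or learned.
DEFAULT_TABLE = {
    "none": ControlCommand(False, 0.0),
    "low": ControlCommand(False, 3.0),
    "medium": ControlCommand(False, 6.0),
    "high": ControlCommand(False, 12.0),
    "very high": ControlCommand(True, 0.0),
}


def map_classification(label, user_table=None):
    """Return the command for a classification, preferring user entries."""
    table = dict(DEFAULT_TABLE)
    if user_table:
        table.update(user_table)
    return table[label]


if __name__ == "__main__":
    print(map_classification("medium"))
    print(map_classification("low", {"low": ControlCommand(False, 10.0)}))
```

A table keeps the controller logic trivial and lets user-specific entries replace defaults one label at a time.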

The controller 203 may be instantiated in software, firmware, or digital or analog hardware to provide logic as needed to effect changes to the source of audio stress. The functionality contemplated varies. In particular, the controller 203 may receive a simple classification confirming the presence of Lombard speech, or a classification that there is no Lombard speech. If there is Lombard speech, the controller 203 may provide a command that eliminates the source of the audio stress (e.g., mutes the television). In other embodiments, the functionality is more sophisticated, including user-specific training for automatic creation of user preferences. Illustratively, for a certain user it may be found that he/she requires the volume of the television below a certain level in order to be able to speak on the phone. Once the controller 203 has the user preferences stored (e.g., in a look-up table), the command to make the required adjustment can be made.

The apparatus 102 optionally includes a sound modification/processing device 204 therewith. Alternatively, the device 204 may be provided with the device 101 and adjusted by the controller 203. Illustratively, the device 204 may be an amplifier, an equalizer, a compressor, a noise cancelling device or a microphone, or a combination of two or more of the noted devices. The parameters of the sound processing device 204 are geared towards reducing sound features that are known to induce the Lombard effect in the human voice, including but not limited to: the overall audio intensity level (loudness); higher-frequency energy levels; and the distance between the 'disturbance' spectral shape and a typical speech spectrum. For instance, the device 101 may be adapted to provide reduced power in, or to filter sound in, the voice frequency range of approximately 300 Hz to approximately 3400 Hz.
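Reducing power in the voice frequency range may be illustrated, under simplifying assumptions, as a set of per-bin gains applied to a frequency-domain representation of the output audio. The function name and the hard band edges are assumptions of this sketch; a practical equalizer would use smoother filter transitions.

```python
def voice_band_gains(bin_freqs_hz, attenuation_db, low_hz=300.0, high_hz=3400.0):
    """Return one linear gain per frequency bin.

    Bins inside [low_hz, high_hz] are attenuated by attenuation_db;
    bins outside the band are left unchanged (gain 1.0).
    """
    reduced = 10.0 ** (-attenuation_db / 20.0)
    return [reduced if low_hz <= f <= high_hz else 1.0 for f in bin_freqs_hz]


if __name__ == "__main__":
    freqs = [100.0, 1000.0, 3000.0, 5000.0]
    print(voice_band_gains(freqs, attenuation_db=20.0))
```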

In another representative embodiment, a user may be listening to music over headphones (not shown) when a conversation begins with another person. In this situation the lack of feedback of the user's voice will typically induce the Lombard effect. The remedy according to one aspect of the present teachings is to make the headphones substantially acoustically transparent. To this end, in one representative embodiment, external microphones in combination with apparatus 102 may be used to detect that the user's voice has Lombard speech characteristics in a manner such as described previously. In this case, the audio intensity of the headphones may be reduced via the device 204 as described previously. Notably, however, the music from the headphones is only one source of audio stress.

Another source lies in the reduced 'feedback' of one's voice when speaking. In particular, with the headphones on, the user is less able to hear himself/herself speaking. This causes the user to increase his/her voice level and Lombard speech occurs. In a representative embodiment, this is mitigated by capturing the user's speech with a microphone, and adding this to the audio signal delivered to the user so the user can hear his/her voice. This serves to reduce the Lombard speech caused by the muffling by the headphones. Notably, this function may be instantiated in the apparatus 102 or other device provided in the headset.
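The addition of the captured speech to the audio signal may be sketched as a simple sample-by-sample mix. The function name, the normalized sample range of -1.0 to 1.0 and the default gain are assumptions of this illustration; latency and acoustic feedback, which a practical headset would have to manage, are ignored.

```python
def mix_sidetone(program, mic, sidetone_gain=0.5):
    """Add a scaled copy of the user's captured speech to the program audio.

    Samples are assumed normalized to [-1.0, 1.0]; the sum is clipped
    to that range to avoid overflow at the audio output.
    """
    if len(program) != len(mic):
        raise ValueError("program and mic frames must have equal length")
    mixed = []
    for p, m in zip(program, mic):
        s = p + sidetone_gain * m
        mixed.append(max(-1.0, min(1.0, s)))
    return mixed


if __name__ == "__main__":
    print(mix_sidetone([0.0, 0.5, 0.9], [1.0, 1.0, 1.0]))
```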

Fig. 3 is a flow-chart of a method of reducing the impact of the Lombard Effect in accordance with a representative embodiment. Many of the details of the embodiments described in connection with Figs. 1 and 2 are common to the method described presently. Such details are not repeated to avoid obscuring the description of the present embodiments.

At step 301, the method commences with identifying audio stress. This identifying may include extracting voice features, evaluating the presence of Lombard speech and determining the source of audio stress as described previously. Once the presence and source of audio stress are determined, at step 302 the method continues with the taking of remedial action to reduce the source of the stress. As described previously, the remedial action includes altering the output of the source of the audio stress to reduce the stress on the user, or the nullifying of the audio stress, or both. The former action may be reducing the volume or muting the device 101, for example. The latter action may be providing sound cancelling acoustic signals to negate the Lombard stress-inducing audio signals. The nullifying of the sound may be effected by known methods and is described more fully in connection with the embodiments of Fig. 4.

Fig. 4 is a conceptual view of a device 400 in accordance with a representative embodiment. Many of the details of the embodiments described in connection with Figs. 1-3 are common to the device described presently. Such details are not repeated to avoid obscuring the description of the present embodiments.

The device 400 may be a mobile telephone or other communication device, by way of illustration. The device 400 comprises the apparatus 102 as described previously. Additionally, the device 400 optionally includes a noise cancelling module 401, a display 403 and an audio input 402. In the presently described embodiment, the noise cancelling module 401 may be an integral component of the device 400. In other embodiments, the module 401 may be a component of another device (not shown). For example, the module 401 may be a component of an automobile that is adapted to reduce background noise from an engine or road noise by transmitting noise cancelling signals through the speakers of the automobile when the user engages a mobile phone in the automobile. Alternatively, the module 401 may be a component of an automobile that is adapted to transmit noise cancelling signals to a headset or speaker (audio output) of a mobile phone via a Bluetooth or other wireless link.

The module 401 is adapted to receive the output from the apparatus 102 and to take a remedial action to reduce the audio stress. In an embodiment, ambient noise (e.g., noise on a plane or train) may be received by the audio input 402. The apparatus 102 determines the presence (or absence) of Lombard speech in a manner described previously herein. The controller 203 provides commands to the module 401, which in turn provides a noise cancelling signal to an audio output of the device 400.
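The principle behind a noise cancelling signal may be illustrated in idealized form: a phase-inverted copy of the noise estimate, when summed with the noise, leaves no residual. The function names are assumptions, and the sketch presumes a perfect, zero-delay noise estimate; known noise cancelling methods rely on adaptive filtering to approximate this in practice.

```python
def anti_noise(noise_estimate):
    """Return the phase-inverted copy of the estimated noise samples."""
    return [-x for x in noise_estimate]


def residual(noise, cancelling_signal):
    """Sum the noise and the cancelling signal sample by sample."""
    return [n + c for n, c in zip(noise, cancelling_signal)]


if __name__ == "__main__":
    noise = [0.2, -0.4, 0.1, 0.3]
    print(residual(noise, anti_noise(noise)))
```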

In the representative embodiments described herein, a method and an apparatus are adapted to reduce the impact of ambient noise that causes Lombard speech. As will be appreciated by one of ordinary skill in the art, many variations that are in accordance with the present teachings are possible and remain within the scope of the appended claims. These and other variations would become clear to one of ordinary skill in the art after inspection of the specification, drawings and claims herein. The invention therefore is not to be restricted except within the spirit and scope of the appended claims.

Claims

1. A method of reducing the impact of the Lombard Effect, the method comprising: identifying audio stress; and taking a remedial action to reduce the audio stress.
2. A method as claimed in claim 1, further comprising, before the identifying: receiving an audio input; and determining a presence of stress in the audio input.
3. A method as claimed in claim 2, wherein the determining further comprises classifying the audio input as Lombard speech or non-Lombard speech.
4. A method as claimed in claim 1, wherein the taking the remedial action further comprises reducing an intensity of a source of audio stress.
5. A method as claimed in claim 1, wherein the taking the remedial action further comprises filtering an audio output of a source of audio stress over a selected frequency band.
6. A method as claimed in claim 1, wherein the taking the remedial action further comprises providing an audio signal adapted to substantially nullify an audio output of a source of audio stress.
7. A method as claimed in claim 3, wherein the classifying further comprises: viewing activity of a person; and determining a presence of stress based on the viewing.
8. A method as claimed in claim 3, wherein the classifying further comprises: providing voice input from a user that is Lombard speech and that is non-Lombard speech and identifying the Lombard speech and the non-Lombard speech.
9. A method as claimed in claim 3, wherein the classifying further comprises providing a scaling factor to the Lombard speech.
10. An apparatus adapted to reduce the impact of the Lombard Effect, comprising: a voice feature extractor adapted to obtain at least one feature of a user's voice; a Lombard speech detector adapted to determine a presence or absence of Lombard speech based on a Lombard speech classifier; a controller adapted to map the Lombard speech classifier to a sound reproduction control parameter; and a sound modification device adapted to alter a sound parameter to reduce audio stress of the user.
11. An apparatus as claimed in claim 10, wherein the Lombard speech detector is adapted to provide a positive indication of Lombard speech and a negative indication of Lombard speech.
12. An apparatus as claimed in claim 10, wherein the Lombard speech detector is adapted to provide a degree of the presence of Lombard speech.
13. An apparatus as claimed in claim 10, wherein the Lombard speech detector is improved by user identification and user-specific criteria.
14. An apparatus as claimed in claim 10, wherein the controller is instantiated in one or more of hardware, or software or firmware.
15. An apparatus as claimed in claim 10, wherein the sound parameter is a sound intensity.
16. An apparatus as claimed in claim 15, wherein the sound parameter is an audio spectral output.
17. An apparatus as claimed in claim 15, wherein the sound modification device is adapted to provide a sound cancellation signal that substantially nullifies a source of the audio stress.
18. An electronic device, comprising: an apparatus adapted to reduce the impact of the Lombard Effect, the apparatus further comprising: a voice feature extractor adapted to obtain at least one feature of a user's voice; a Lombard speech detector adapted to determine a presence or absence of Lombard speech based on a Lombard speech classifier; and a controller adapted to map the Lombard speech classifier to a sound reproduction control parameter.
19. An electronic device as claimed in claim 18, further comprising a noise cancelling module adapted to receive an output from the apparatus and to take a remedial action to reduce the audio stress.
20. An electronic device as claimed in claim 18, further comprising: a sound modification device adapted to alter a sound parameter to reduce audio stress of the user.
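The signal chain recited in claims 10 to 20 (feature extractor, Lombard speech detector, controller, sound modification device) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The claims do not fix the voice feature, the classifier, or the mapping, so the RMS-level feature, the per-user baseline_db, the 12 dB span, the 0.5 threshold, the 10 dB maximum cut, and all function names below are assumptions chosen purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class VoiceFeatures:
    # Assumed feature: RMS level of one frame, in dB relative to full scale.
    level_db: float


def extract_features(frame):
    """Voice feature extractor: RMS level of a frame of samples in [-1, 1]."""
    if not frame:
        raise ValueError("frame must contain at least one sample")
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    # Floor the RMS so that a silent frame does not produce log10(0).
    return VoiceFeatures(level_db=20.0 * math.log10(max(rms, 1e-12)))


def lombard_degree(features, baseline_db, span_db=12.0):
    """Lombard speech detector giving a degree in [0, 1] (compare claim 12).

    baseline_db is an assumed user-specific quiet-speech level (compare
    claim 13); span_db is the assumed excess at which the degree saturates.
    """
    excess_db = features.level_db - baseline_db
    return min(1.0, max(0.0, excess_db / span_db))


def is_lombard(degree, threshold=0.5):
    """Positive or negative indication of Lombard speech (compare claim 11)."""
    return degree >= threshold


def playback_gain_db(degree, max_cut_db=10.0):
    """Controller: map the classifier output to a playback gain in dB."""
    return -max_cut_db * degree


def apply_gain(samples, gain_db):
    """Sound modification device: alter the sound intensity (compare claim 15)."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]
```

Under these assumptions, a voice frame 6 dB above the user's baseline yields a degree of 0.5 and a playback gain of -5 dB, so the reproduced audio that is presumed to cause the audio stress is attenuated in proportion to the detected vocal effort.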
PCT/IB2007/055241 2006-12-20 2007-12-19 Method and apparatus to address source of lombard speech WO2008075305A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US87627306P 2006-12-20 2006-12-20
US60/876,273 2006-12-20

Publications (1)

Publication Number Publication Date
WO2008075305A1 (en) 2008-06-26

Family

ID=39321548

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2007/055241 WO2008075305A1 (en) 2006-12-20 2007-12-19 Method and apparatus to address source of lombard speech

Country Status (1)

Country Link
WO (1) WO2008075305A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993026085A1 (en) * 1992-06-05 1993-12-23 Noise Cancellation Technologies Active/passive headset with speech filter
JPH11126092A (en) * 1997-10-22 1999-05-11 Toyota Motor Corp Voice recognition device and on-vehicle voice recognition device
EP1267590A2 (en) * 2001-06-11 2002-12-18 Pioneer Corporation Contents presenting system and method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103730122A (en) * 2012-10-12 2014-04-16 三星电子株式会社 Voice converting apparatus and method for converting user voice thereof
EP2720224A3 (en) * 2012-10-12 2014-06-18 Samsung Electronics Co., Ltd Voice Converting Apparatus and Method for Converting User Voice Thereof
US9564119B2 (en) 2012-10-12 2017-02-07 Samsung Electronics Co., Ltd. Voice converting apparatus and method for converting user voice thereof
US10121492B2 (en) 2012-10-12 2018-11-06 Samsung Electronics Co., Ltd. Voice converting apparatus and method for converting user voice thereof
EP2860730A1 (en) * 2013-10-10 2015-04-15 Nokia Corporation Speech processing
US9530427B2 (en) 2013-10-10 2016-12-27 Nokia Technologies Oy Speech processing
US9959888B2 (en) 2016-08-11 2018-05-01 Qualcomm Incorporated System and method for detection of the Lombard effect

Similar Documents

Publication Publication Date Title
Wölfel et al. Distant speech recognition
Li et al. Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction
Soli et al. Assessment of speech intelligibility in noise with the Hearing in Noise Test
US8538749B2 (en) Systems, methods, apparatus, and computer program products for enhanced intelligibility
US5991277A (en) Primary transmission site switching in a multipoint videoconference environment based on human voice
KR100382024B1 (en) Device and method for processing speech
Lu et al. Speech production modifications produced by competing talkers, babble, and stationary noise
AU2011261756B2 (en) User-specific noise suppression for voice quality improvements
US20120101819A1 (en) System and a method for providing sound signals
US20090299742A1 (en) Systems, methods, apparatus, and computer program products for spectral contrast enhancement
CN100583238C (en) System and method enabling acoustic barge-in
CN204029371U (en) Communication facilities
CN101313483B (en) Configuration of echo cancellation
KR100750440B1 (en) Reverberation estimation and suppression system
KR20140019023A (en) Generating a masking signal on an electronic device
US8917894B2 (en) Method and device for acute sound detection and reproduction
KR101444100B1 (en) Noise cancelling method and apparatus from the mixed sound
US20070203698A1 (en) Method and apparatus for speech disruption
EP2396958B1 (en) Controlling an adaptation of a behavior of an audio device to a current acoustic environmental condition
US7986791B2 (en) Method and system for automatically muting headphones
Ortega-Garcia et al. AHUMADA: A large speech corpus in Spanish for speaker characterization and identification
US20050129252A1 (en) Audio presentations based on environmental context and user preferences
TWI442384B (en) Microphone-array-based speech recognition system and method
KR101884709B1 (en) Method and apparatus for adjusting volume of user terminal, and terminal
US6411927B1 (en) Robust preprocessing signal equalization system and method for normalizing to a target environment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07859465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07859465

Country of ref document: EP

Kind code of ref document: A1