EP1696695A1

EP1696695A1 - Acoustic device and method for treating an audio signal

Info

Publication number: EP1696695A1
Application number: EP05004148A
Authority: EP
Inventors: Martin Schönle; Bruno Tramblay
Original assignee: Siemens AG
Current assignee: Siemens AG
Priority date: 2005-02-25
Filing date: 2005-02-25
Publication date: 2006-08-30

Abstract

The acoustic shock protection for acoustic devices shall be improved. In order to solve this problem there is provided a method for treating an audio signal by detecting the amplitude of the audio signal to be treated and increasing a gain applied onto the audio signal from a start level (SL) in the form of a ramp over time, at most until the amplitude reaches a second amplitude value.

This measure controls the sound pressure level at the loudspeaker dependent on the properties of the audio signal, and the energy of the audio signal is limited to a pregiven value.

Description

The present invention relates to a method for treating audio signals used on multimedia devices with acoustic output (e.g. mobile phones, MP3 players, PDAs, PCs used with headphones,...). Moreover, the present invention relates to an acoustic device for processing such audio signals. Specifically, the present invention deals with the problem of acoustic shock protection.
The protection of users of acoustic devices against the risk of acoustic shock has become essential due to several reasons:

Mobile phones / PDAs with telephone functionality:
- Market requirements for higher sound pressure levels (SPL) at the loudspeaker, especially for ringing tones, but also for speech, music, sounds in all hands-free modes;
- Growing importance of hands-free modes, e.g. for telephone calls, video mode, car-kit mode, gaming mode;
- Usage of ringing tones and signaling tones in all operation modes, e.g. incoming call during gaming mode is commonly supported;
- User-configurable ringing and signaling tones, i.e. these signals can be downloaded (e.g. Internet, Jamba ...) or even self-made and no control by acoustic experts is possible on their audio properties (dangerous or not).
PC with headphones:
- User is listening to soundfiles using some standard multimedia player at a convenient loudness. Any application running in parallel is able to superimpose its own sound output at a different volume level (email programs playing very loud notification signals, when an email has arrived).

Acoustic shock is defined in the ITU standard [ITU-T P.10: Vocabulary of terms on telephone transmission quality and telephone sets, 12/98] as "any temporary or permanent disturbance of the functioning of the ear, or of the nervous system, which may be caused to the user of a telephone earphone by a sudden sharp rise in the acoustic pressure produced by it."
In the ETSI technical report [ETSI TR 101 800: Acoustic Safety of Terminal Equipment; An investigation on standards and approval documents, V1.1.1, 07/2000] the following causes of damage to the ear are given:

"Damage may be caused to hearing either by long-term exposure to high levels of sound or by high levels of acoustic shock"

Given the requirements above, a confusion of operation modes (e.g. the user does not remember the current operation mode) becomes more likely, and the risk of putting a device that is working in some hands-free mode with high SPLs directly to the ear increases. In addition, no control of the acoustic quality and safety of the sounds used for ringing and signaling is possible.
Several typical scenarios and risks can be deduced:

The terminal is in hands-free mode, e.g. lying on a table during a call. The user has forgotten about the mode (the SPL provided at the loudspeaker for the speech signal may be too low or the conversation was interrupted for any reason) and takes the phone to his ear by habit. In this moment a signaling tone with a much higher sound pressure level (e.g. SMS arrived, battery empty warning) may be applied occasionally.
The ringing tone (e.g. a piece of music) contains several longer pauses. During one such pause the user tries to accept the call, but presses the wrong key without noticing and, as usual, puts the device to his ear. Without protective measures the ringing tone continues at full sound pressure level.
Ringing and signaling tones downloaded from the Internet or "home-made" by some users may contain sections with properties dangerous to the human ear (e.g. high volumes over a long time period exceeding the resulting damage threshold of the SPL at the loudspeaker) or to the electro-acoustic parts of the device (e.g. a high DC portion in the signal).

Three different ways to overcome the mentioned risks are known: pure hardware solutions, pure software solutions, and a mix of both.
Pure hardware solutions are typically based on a two-speaker concept, i.e. using an extra loudspeaker for signaling and ringing tones which remains far enough from the ear when the terminal is used normally. These solutions are expensive in terms of material costs, space requirements, power consumption. From a security point of view these are the best solutions.
Mixed solutions are normally based on a one-speaker concept using an infrared sensor and a control software. If the signal provided by the sensor indicates that a certain temperature threshold is exceeded, the SPL at the loudspeaker is reduced. These solutions are also expensive in terms of material costs. They are not robust in every situation (e.g. the sensor cannot differentiate if a mobile device is located in the trouser pocket or near the earlap of a user). Malfunctions of these sensors are known. The market penetration of these concepts seems not to be very high.
A different mixed approach without an infrared sensor was chosen by Siemens Mobile Phones: ramping of signals performed by an analogue amplifier at the end of the audio processing chain, just before the loudspeaker. The amplifier is programmable by memory-mapped registers. The duration and level increments of each ramping step are controlled by the Digital Signal Processor (DSP) firmware. The drawback of this solution is that the ramping functionality is only applied at the beginning of a sound signal, independent of the signal content. For example, the following scenario may arise: the signal contains a section of silence right at the beginning. The ramp starts with the signal and runs through the silence; when the actual signal content starts, the ramp has already reached a high level, resulting in a high SPL at the loudspeaker.
In view of that, the object of the present invention is to provide an improved acoustic shock protection.
According to the present invention this object is solved by a method for treating an audio signal by detecting the amplitude or an amplitude based quantity of the audio signal to be treated and increasing a gain applied onto the audio signal, from a predefined first amplitude value in the form of a ramp over the time, at most until the amplitude or the amplitude based quantity, respectively, reaches a predefined second amplitude value, if the amplitude or the amplitude based quantity, respectively, lies beyond a predefined threshold.
Furthermore, according to the present invention, there is provided an acoustic device comprising detecting means for detecting the amplitude or an amplitude based quantity of an audio signal and processing means suitable for increasing a gain applied onto the audio signal, from a predefined first amplitude value in the form of a ramp over the time, at most until the amplitude or the amplitude based quantity, respectively, reaches a predefined second amplitude value, if the amplitude or the amplitude based quantity, respectively, lies beyond a predefined threshold.
Thereby, the amplitude based quantity can be mapped to a respective energy level or to a resulting sound pressure level at the loudspeaker or any other electro-acoustical transducer. Furthermore, the first amplitude value and the second amplitude value relate to the amplitude or the amplitude based quantity of the audio signal.
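As a rough illustration of the method just described, the following Python sketch ramps a linear gain from a start level towards an end level and advances the ramp only while the detected amplitude lies above the threshold. The function name `intelligent_ramp`, the per-sample absolute-value detector and all numeric values are assumptions made for this illustration and are not taken from the disclosure.

```python
def intelligent_ramp(samples, start_level, end_level, threshold, level_increment):
    """Apply a gain ramp that advances only while the signal is active.

    All names and the per-sample amplitude test are illustrative assumptions.
    """
    gain = start_level
    out = []
    for x in samples:
        # Apply the current gain first, so the first active sample
        # is played at the start level.
        out.append(gain * x)
        # Detect the amplitude; the ramp advances only above the threshold
        # and never beyond the end level.
        if abs(x) > threshold:
            gain = min(gain + level_increment, end_level)
        # Below the threshold the gain stays frozen at its current value.
    return out


# Leading silence does not consume the ramp.
signal = [0.0] * 4 + [1.0] * 4
ramped = intelligent_ramp(signal, start_level=0.25, end_level=1.0,
                          threshold=0.05, level_increment=0.25)
```

With these example values the four silent samples leave the gain untouched, and the four active samples are scaled by 0.25, 0.5, 0.75 and 1.0 in turn.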
Preferably, the audio signal is filtered by a high pass filter before the amplitude of the audio signal is detected. This enables compensation of DC portions in the signal.
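The disclosure does not specify the type of high pass filter. One common choice for removing DC portions is a first-order DC-blocking filter, sketched below in Python; the function name `dc_block` and the pole value `r` are assumptions for illustration only.

```python
def dc_block(samples, r=0.995):
    """First-order DC-blocking high pass: y[n] = x[n] - x[n-1] + r * y[n-1].

    The filter type and the pole value r are assumptions of this sketch.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A constant (pure DC) input decays towards zero at the output.
filtered = dc_block([1.0] * 2000)
```

A value of `r` closer to 1 lowers the cut-off frequency but makes the DC portion decay more slowly.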
According to a further preferred embodiment the energy of a chosen section of the audio signal is limited to a pregiven value. This energy limitation is an additional limitation to that obtained by increasing the gain of the audio signal at most until the amplitude reaches the second amplitude value defined above. Thus, the energy limitation further improves the acoustic shock protection.
The increasing of the gain of the audio signal may start after a predefined hold time from the start of the audio signal. Thus, the user has additional time to react after the start of playback of the acoustic signal.
According to a further improvement the gain of the audio signal is reduced to the start level, if the amplitude of the audio signal to be treated lies below a pregiven ramping threshold, for a pregiven period of time. Due to this a re-ramping is provided.
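Hold time and re-ramping can be pictured as two counters around the ramp. The Python sketch below is only one possible reading: it works on per-block activity flags (whether a block lies above the ramping threshold) and counts the hold time and the silence period in blocks. The function name `gain_trajectory` and the parameters `hold_blocks` and `reramp_blocks` are hypothetical.

```python
def gain_trajectory(active, start_level, end_level, level_increment,
                    hold_blocks, reramp_blocks):
    """Return the gain used for each block, given per-block activity flags.

    active[i] is True when block i lies above the ramping threshold.
    Counting in blocks rather than in seconds is an assumption of this sketch.
    """
    gain = start_level
    silent_run = 0
    gains = []
    for i, is_active in enumerate(active):
        gains.append(gain)
        if not is_active:
            silent_run += 1
            # Re-ramping: a long enough pause resets the gain to the start level.
            if silent_run >= reramp_blocks:
                gain = start_level
            continue
        silent_run = 0
        # Hold time: keep the start level for the first hold_blocks blocks.
        if i >= hold_blocks:
            gain = min(gain + level_increment, end_level)
    return gains


flags = [True] * 5 + [False] * 3 + [True] * 2
gains = gain_trajectory(flags, start_level=0.25, end_level=1.0,
                        level_increment=0.25, hold_blocks=2, reramp_blocks=3)
```

In this example the gain stays at the start level during the two hold blocks, then ramps up, stays frozen during the pause, and starts again from the start level once the pause has lasted three blocks.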
The gain of the audio signal may be increased in the form of ramped volume steps. These ramped steps guarantee a smooth "crescendo mode".
According to another embodiment the slope of the ramp may be variable. Thus, the acoustic device can be adapted to the reaction time of the user.
If the audio signal is a stereo signal including two channels, both channels may be treated equally depending on only the signal amplitude in one channel or the signals in both channels. This means that the ramping is performed on both channels in the same manner.
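The two stereo variants can be sketched as a single activity decision shared by both channels, followed by a common gain. The Python below is an illustration only; the function names and the mode strings `"both"` and `"either"` are assumptions.

```python
def stereo_active(left_energy, right_energy, threshold, mode="either"):
    """Decide whether the shared ramp may advance for a stereo block.

    mode="both":   both channels must lie above the threshold.
    mode="either": one channel above the threshold is sufficient.
    """
    left_on = left_energy > threshold
    right_on = right_energy > threshold
    if mode == "both":
        return left_on and right_on
    if mode == "either":
        return left_on or right_on
    raise ValueError("mode must be 'both' or 'either'")


def apply_common_gain(left, right, gain):
    """Apply the same gain to both channels, leaving their balance unchanged."""
    return [gain * x for x in left], [gain * x for x in right]
```

For a block in which only the left channel carries a signal, `stereo_active(0.2, 0.0, 0.1, "either")` lets the ramp advance, whereas the `"both"` mode keeps it frozen.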
Furthermore, the start level for presenting the audio signal before it is subjected to a ramp-like gain function may be adjustable. This allows the start level to be adapted to the gain of the front end chosen by the user.
If the audio signal is processed digitally, several samples of the audio signal may be grouped to a sample group, which is processed as one single unit. Such grouping guarantees reduced processing time.
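A minimal Python sketch of such grouping is given below, assuming that one gain value is determined per group and then applied to every sample of that group. The function name `process_in_groups` and the callback `gain_for_group` are hypothetical.

```python
def process_in_groups(samples, group_size, gain_for_group):
    """Apply one gain value per group of group_size samples.

    gain_for_group is a hypothetical callback that maps a group of samples
    to a single gain; it is called once per group, not once per sample.
    """
    out = []
    for start in range(0, len(samples), group_size):
        group = samples[start:start + group_size]
        gain = gain_for_group(group)
        out.extend(gain * x for x in group)
    return out


calls = []


def half_gain(group):
    # Record the group length to show how often the gain is evaluated.
    calls.append(len(group))
    return 0.5


result = process_in_groups([1.0] * 10, group_size=4, gain_for_group=half_gain)
```

Ten samples with a group size of four lead to only three gain evaluations here; a larger group size lowers the computation load at the cost of a coarser gain update.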
In the following, the present invention will be explained in more detail along with the attached drawings showing in:

FIG 1: a principle of digital ramping;
FIG 2: blind versus intelligent ramping according to the present invention;
FIG 3: a crescendo behavior with ramped steps;
FIG 4: a crescendo behavior during pauses;
FIG 5: a system for acoustic shock protection for a mono sound signal and
FIG 6: a system for acoustic shock protection for a stereo sound signal.

In the following passages preferred embodiments of the present invention are described.
The embodiment proposed here is a pure SW solution for mobile phones consisting of a system of digital signal processing algorithms. A software solution is the only way to minimize the risk of acoustic shock, if specific additional hardware like a second loudspeaker or an infrared sensor is not available. In terms of material costs, space requirements and power consumption it is the most efficient solution.
The system is based on the following algorithms:

1. High pass filter
2. Intelligent digital ramping algorithm
3. Energy limitation algorithm.

Items 2 and 3 may be changed in their sequence.
Their combination leads to a large reduction of the risk of acoustic shock. FIG 5 shows an example for realizing these algorithms. A mono sound signal is input into a high pass filter HP. The filtered signal is subjected to ramping R and energy limitation EL. The resulting signal is fed into the audio front end FE and subsequently to a loudspeaker L. The audio front end FE represents all components after a digital/analogue converter not shown in FIG 5.
The three algorithms are now described in more detail:

1. The high pass filter compensates DC portions in the signal. This may be essential for following digital processing steps like the ramping.
2. The ramping algorithm multiplies the input signal by an increasing envelope factor. The slope of the envelope, i.e. the gradient of the ramp, is defined by two parameters: a time increment TI and a level increment LI (compare FIG 1). The limits of this factor are defined by a start level SL, which can be adapted to the volume level VL1, VL2, VL3, VL4 selected on the terminal, and an end level MAX which must be adapted according to the maximum allowed SPL at the loudspeaker. The start level SL has to be greater than or equal to a minimum level MIN. The values MAX and MIN are pregiven by the respective standards.
The ramping algorithm provides the following features:
- a) Pause detection: In a first step the input signal is analysed. The signal is smoothed and the energy of the resulting samples is computed. However, the processing order may be changed. If the energy is below a tuneable threshold (RT), this part of the signal is marked as a silence period (see FIG 2).
  The ramping function is applied only to those parts of the signal that are not silence sections; during silence sections the ramping algorithm is internally put on hold. The ramping only starts when the signal level is above a certain threshold. Thus the disadvantage of unintelligent ramping at the beginning of a signal is avoided. If a pause occurs in the signal, the ramping algorithm stops increasing the loudness during this period and the current volume level is frozen. This minimizes the risk of being exposed to a high SPL when the signal resumes after the pause.
  In FIG 2, the resulting function, here called "intelligent ramping", is marked with IR. Additionally, the curve of "blind ramping" BR known from the prior art is depicted in FIG 2. It is readily apparent that the loudness at the beginning of the first sound block is much higher for blind ramping BR than for intelligent ramping IR according to the present invention. This means that the risk of the user suffering damage from acoustic shock is increased when blind ramping is applied and the playback signal begins with a silence period. In contrast, with intelligent ramping IR the first sound block starts at a low level, so the user is protected from such a shock.
- b) Hold time: At the beginning of a signal, a hold time period HT can be inserted that delays the ramping start and keeps the signal to a constant low level (see FIG 1). This provides the user with additional time to react and remove the terminal from the ear if necessary. In addition the hold-time can be regarded as a warning to the user who is then aware that a signal with a higher volume will follow. For the sake of simplicity, a hold time HT is not shown in FIG 2.
- c) Re-ramping: It is possible to initiate a new ramping phase in the case that a silence period is getting too long. By this feature long periods of silence (e.g. in a hands-free telephone call) are detected and any re-appearing signal will start with low volume level.
- d) Crescendo mode support: Some terminals offer a specific ringing tone mode, the so called crescendo mode. In this mode the ringing tone is amplified stepwise (Step_1 to Step_4) from low signal levels to high signal levels. Typically these amplification steps are coarse, each with a duration of about 1 second. (See dotted staircase curve TC in FIG 3). The continual curve depicts the improvement of crescendo mode by the ramping algorithm: the amplitude and time increments are much smaller; this gives the impression of a quasi 'continual' ramped signal, the ramped crescendo RC in FIG 3. In the case of FIG 3 the start level SL is higher than the ramping threshold RT, so that the ramping is activated.
  If the ringing tone contains pauses, the scheme of smoothly ramping from one volume step to the next needs to be circumvented. Otherwise the volume evolution would be as depicted by the dotted staircase curve (typical crescendo TC) in FIG 4 and the volume difference before and after the pause could be too big (several volume steps). This can be dangerous for the user. With the proposed solution (intelligent crescendo IC) the ramping level is frozen to the last actual amplitude level and the algorithm resumes ramping from this level (compare continuous line in FIG 4). Therefore this solution lowers the acoustic shock probability for a user in this mode also.
- e) Stereo support: This invention provides a ramping solution also for stereo signals (see FIG 6). Two possible solutions are provided to deal with asymmetric channels:
  - The ramping is performed only when both channels do contain a signal with energy above the ramping threshold.
  - The ramping is performed as soon as a signal with energy above the ramping threshold is present on one of the channels.
  FIG 6 depicts a possible combination of the algorithms for stereo mode. A stereo sound signal is processed in a left channel and a right channel. The structure of both channels is similar to that of the system of FIG 5. The left channel includes a high pass filter HP1, a ramping block R1 and an energy limiter EL1. The right channel includes a high pass filter HP2, a ramping block R2 and an energy limiter EL2. Both channels lead to a common audio front end CFE, which feeds two loudspeakers L1 and L2. The mono and the stereo scenarios differ only in that way that there is a communication C between the two ramping algorithms R1 and R2, since the ramping gains are applied simultaneously on both channels and since each channel must know about an eventual silence period present on the other channel.
- f) Adaptation of ramping initial level: The ramping initial level SL (compare FIG 1) is set according to the volume setting VL1 to VL4 selected by the user, to achieve a constant initial loudness at the loudspeaker. Without this feature, the signal may stay inaudible at the lowest volume setting.
- g) Scalability of the algorithm: It is possible to gather samples and apply a common gain to a respective group of samples at the same time. This allows a trade-off between computation load and audio quality without any impact to security.
3. The energy limitation algorithm is provided optionally to improve the acoustic shock protection. The ramping algorithm is a first measure against acoustic shock, but it cannot deal with sudden sharp rises in the sound signal, e.g. after the ramping phase is over. For such cases the energy limiter is required in addition.

The energy limitation algorithm limits the signal energy so that the corresponding SPL at the loudspeaker stays below the threshold at which ear damage can occur. The peak energy is computed and, after smoothing, the result is compared to a tuneable threshold. If the energy level is above this threshold, an attenuation factor (tuneable by a characteristic curve) is applied.

a) Stereo support: In case of stereo signals, each channel is processed independently from the other (see item 2e) above).
b) Scalability of the algorithm: It is possible to gather samples and apply a common gain to several samples at once. This allows a trade-off between computation load and audio quality without impact to security (see item 2g) above).
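The energy limitation steps (peak energy, smoothing, threshold comparison, attenuation) might be sketched as follows in Python. The exponential smoother, the square-root attenuation curve and the names `limit_energy` and `smoothing` are assumptions of this sketch; the disclosure only states that the attenuation is tuneable by a characteristic curve.

```python
import math


def limit_energy(blocks, threshold, smoothing=0.5):
    """Attenuate blocks whose smoothed peak energy exceeds the threshold."""
    smoothed = 0.0
    out = []
    for block in blocks:
        # Peak energy of the block (0.0 for an empty block).
        peak_energy = max((x * x for x in block), default=0.0)
        # Exponential smoothing of the peak energy (assumed smoother).
        smoothed = smoothing * smoothed + (1.0 - smoothing) * peak_energy
        if smoothed > threshold:
            # Assumed characteristic curve: scale the smoothed energy
            # back down to the threshold.
            factor = math.sqrt(threshold / smoothed)
        else:
            factor = 1.0
        out.append([factor * x for x in block])
    return out


quiet = [0.1, -0.1]
loud = [2.0, -2.0]
limited = limit_energy([quiet, loud, loud], threshold=1.0)
```

Because the comparison uses the smoothed energy, the attenuation in this sketch builds up over successive loud blocks; a smaller `smoothing` value makes the limiter react faster.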

Generally, the system may support the stereo mode as indicated. Additionally, the system provides support of different audio front-ends and audio modes: As the resulting SPL at the loudspeaker is influenced by the whole loudspeaker path hardware and is dependent on the audio mode the system is tuneable to any kind of audio front-end. For instance, the SPL is not the same for handset mode as for integrated hands-free mode for the same digital input energy.
The system can be adapted to any level and any sampling rate used within every possible audio mode, and to any audio front-end. Therefore the invention does not only apply to mobile phones and may be extended to any acoustic terminal like PDAs, PCs, MP3-player etc.
In summary, there is provided a software (SW) solution (system of digital signal processing algorithms) to minimize the risks of ear damage introduced by the frequent usage of multimedia features on acoustic devices, by user-configurable sounds for ringing and signaling, and by complex control of audio output signals for example on multitasking systems.

List of reference signs

HT: hold time period
LI: level increment
SL: start level
TI: time increment
VL1, VL2, VL3, VL4: volume level
BR: blind ramping
IR: intelligent ramping
RC: ramped crescendo
RT: ramping threshold
TC: typical crescendo
IC: intelligent crescendo
EL: energy limiter
FE: audio front end
HP, HP1, HP2: high pass filter
L, L1, L2: loudspeakers
R, R1, R2: ramping block
CFE: common audio front end
EL1, EL2: energy limiter

Claims

1. Method for treating an audio signal
characterized by
- detecting the amplitude or an amplitude based quantity of the audio signal to be treated and
- increasing a gain applied onto the audio signal, from a predefined first amplitude value (SL) in the form of a ramp over the time, at most until the amplitude or the amplitude based quantity, respectively, reaches a predefined second amplitude value, if the amplitude or the amplitude based quantity, respectively, lies beyond a predefined threshold (RT).
2. Method according to claim 1, wherein the audio signal is filtered by a high pass filter (HP, HP1, HP2) before the amplitude of the audio signal is detected.
3. Method according to claim 1 or 2, wherein the energy of a chosen section of the audio signal is limited (EL, EL1, EL2) to a pre-given energy value.
4. Method according to one of the preceding claims, wherein the increasing of the gain of the audio signal starts after a pre-defined hold time (HT) from the start of the audio signal.
5. Method according to one of the preceding claims, wherein the gain of the audio signal is reset to the start level (SL), if the amplitude of the audio signal to be treated lies below a predefined ramping threshold (RT) for a pre-given period of time.
6. Method according to one of the preceding claims, wherein the gain of the audio signal is increased in the form of ramped steps (RC, IC).
7. Method according to one of the preceding claims, wherein the slope of the ramp is variable.
8. Method according to one of the preceding claims, wherein the audio signal is a stereo signal including two channels, and wherein both channels are treated equally according to a method of the preceding claims depending on only the signal amplitude in one channel or on the signal amplitudes in both channels.
9. Method according to one of the preceding claims, wherein the start level (SL) is adjustable.
10. Method according to one of the preceding claims, wherein the audio signal is processed digitally and several samples of the audio signal are grouped to a sample group, which is processed as one single unit.
11. Acoustic device
characterized by
- detecting means for detecting the amplitude or an amplitude based quantity of an audio signal and
- processing means suitable for increasing a gain applied onto the audio signal, from a predefined first amplitude value (SL) in the form of a ramp over the time, at most until the amplitude or the amplitude based quantity, respectively, reaches a predefined second amplitude value, if the amplitude or the amplitude based quantity, respectively, lies beyond a predefined threshold (RT).
12. Acoustic device according to claim 11, further including a high pass filter (HP, HP1, HP2) connected in front of the detecting means.
13. Acoustic device according to claim 11 or 12, wherein the processing means includes energy limitation means (EL, EL1, EL2) for limiting the energy of a chosen section of the audio signal to a pre-given energy value.