EP1238389A1 - Audio processing, e.g. for discouraging vocalisation or the production of complex sounds - Google Patents
- Publication number
- EP1238389A1 (application EP00979810A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- audio
- output
- time
- loud
- ambient audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/16—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/175—Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
- G10K11/1752—Masking
- G10K11/1754—Speech masking
Definitions
- Audio Processing e.g. for Discouraging Vocalisation or the Production of Complex Sounds
- This invention relates to audio processing methods and apparatus, particularly (but not exclusively) for use in discouraging vocalisation or the production of complex sounds.
- the term 'vocalisation' includes not only speech but also other sounds uttered by human beings and animals, and the term 'complex sounds' includes other sounds and noises, such as music, whether generated live or replayed from a recording.
- the term 'ambient audio' implies an ensemble of sounds in a volume, which are not necessarily produced for the purpose of detection by a sensor, and whose sources are not necessarily in close physical proximity to such a sensor. This is in contrast to localised audio, which implies sounds (perhaps just one specific sound) that may be produced for the express purpose of detection by a sensor, whose sources may be in close physical proximity to the sensor. Detection of ambient audio generally requires much greater amplifier sensitivity than detection of localised audio.
- the present invention is concerned with discouraging such vocalisation and/or production of other ambient audio.
- Some methods described herein may be said to 'interfere' with undesirable spoken words, since they produce ambient audio at the same time as the undesirable spoken words.
- Other methods may be said to 'interrupt' a speaker, since they reflect spoken words back to the speaker just after the end of an undesirable word, in the same way that a person would normally interrupt another person.
- US-A-4,464,119 discloses an invention for preventing stammering.
- the device permits a person to hear a delayed version of his own voice. The delay may be adjusted.
- Input audio is detected and used to activate the output.
- US-A-5,749,324 discloses a device for preventing vocalisation of animals, particularly dogs. The emphasis is on recognising sounds, and producing a stimulus as a result of recognising those sounds. Those sounds are animal noises (barking) and words spoken by humans. The device can correlate animal noises with human spoken words, and cause an output noise to be made in response.
- an audio processing method for example for discouraging vocalisation or the production of complex sounds, the method comprising the steps, performed in a repeating cycle, of: receiving ambient audio; detecting when the received ambient audio is loud; and broadcasting a burst of output audio so as to mix with the ambient audio, the burst of output audio being timed in dependence upon the detection of loud ambient audio.
- the method is particularly, but not exclusively, intended to be used in circumstances in which the ambient audio (at least at the point of reception) is relatively quiet and yet the output audio is relatively loud.
- the production of output audio may be dependent upon some or all of the following methods and events:
- the method preferably further comprises the step, in each cycle, of deciding whether or not to perform the broadcasting step in dependence upon at least one parameter related to the received ambient audio and/or the broadcast output audio.
- the decision may be made in dependence upon the length of time for which the received ambient audio is loud. For example, the decision may be made not to perform the broadcasting step if the received ambient audio is loud for less than a first predetermined period of time. This can assist in preventing mistriggering of output audio in response to an extraneous transient noise. Additionally or alternatively, in the deciding step, the decision may be made in dependence upon the length of time since the preceding broadcast of such a burst of output audio. For example, the decision may be made not to perform the broadcasting step if the length of time since the preceding broadcast of such a burst of output audio is less than a second predetermined period of time.
- the decision may be made to perform the broadcasting step if the received ambient audio is loud for more than said first predetermined period of time and the length of time since the preceding broadcast of such a burst of output audio is more than said second predetermined period of time. Additionally or alternatively, the decision may be made to perform the broadcasting step if the received ambient audio is loud for more than a third predetermined period of time. This can assist in preventing the method locking up.
- the method further comprises the step of ignoring any detection of loud ambient audio for a period of time after the broadcasting step, for example a fourth predetermined period of time.
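The decision rules built on the first to fourth predetermined periods can be sketched as a single per-cycle test. This is only an illustration: the function name, the `BurstTiming` container and every numeric value are assumptions, since the text gives no figures for the periods.

```python
from dataclasses import dataclass


@dataclass
class BurstTiming:
    # All values are illustrative, in seconds; the source gives no numbers.
    min_loud: float = 0.15      # "first period": shorter loud bursts are ignored
    min_gap: float = 0.50       # "second period": minimum time since the last burst
    max_loud: float = 2.00      # "third period": force a burst to avoid lock-up
    ignore_after: float = 0.10  # "fourth period": detections ignored after a burst


def should_broadcast(loud_for: float, since_last_burst: float, t: BurstTiming) -> bool:
    """Decide, once per cycle, whether to broadcast a burst of output audio."""
    if since_last_burst < t.ignore_after:
        return False   # still inside the post-broadcast ignore window
    if loud_for >= t.max_loud:
        return True    # continuously loud input: broadcast anyway
    return loud_for >= t.min_loud and since_last_burst >= t.min_gap
```

In this sketch the third period overrides the second, which is one possible reading of "additionally or alternatively"; other orderings of the tests would also fit the text.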
- the broadcasting step may be commenced immediately that loud ambient audio is detected.
- the ambient audio can be assessed to determine whether it should trigger output audio, and in some embodiments can be processed in order to generate the output audio.
- the broadcasting step may be commenced substantially immediately that the ambient audio ceases to be detected as loud, and in the case where the ambient audio is detected as loud for said fifth predetermined period of time, the broadcasting step may be commenced substantially immediately at the end of said fifth predetermined period of time. Accordingly, a short burst of ambient audio will trigger a burst of output audio immediately after the burst of ambient audio, whereas a long burst of ambient audio will trigger a burst of output audio a predetermined time after the start of the burst of ambient audio or after the start of the cycle.
- the method further comprises the step of making one of the following decisions: • whether or not the received incident audio is loud at substantially the beginning of each cycle;
- the period of time for which the output audio is broadcast may be determined differently in the two modes. Additionally or alternatively, said fifth predetermined period of time (mentioned above) is preferably different in the two modes, so as, in general, to achieve the interrupting effect and the interfering effect.
- the method may further comprise the step, in each cycle, of generating the output audio at least in part from the received ambient audio.
- the method preferably includes the step of automatically controlling the level of the output audio, for example by detecting the level of the received audio, and applying a gain in generating the output audio dependent on the detected level so that the peak level of each burst of output audio is substantially predetermined.
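A minimal sketch of the level control just described: measure the peak of the received burst and choose a gain that brings it to a fixed target. The names `agc_gain` and `apply_agc`, the target of 0.9 and the gain ceiling are assumptions made for illustration.

```python
def agc_gain(detected_peak: float, target_peak: float = 0.9,
             max_gain: float = 1000.0) -> float:
    """Gain that maps the detected peak onto the target peak, with a ceiling."""
    if detected_peak <= 0.0:
        return max_gain  # nothing detected: fall back to the ceiling
    return min(target_peak / detected_peak, max_gain)


def apply_agc(samples, target_peak: float = 0.9, max_gain: float = 1000.0):
    """Scale one burst so that its peak level is substantially predetermined."""
    peak = max((abs(s) for s in samples), default=0.0)
    gain = agc_gain(peak, target_peak, max_gain)
    return [s * gain for s in samples]
```

The ceiling keeps near-silent input from being amplified without limit, a design choice of this sketch rather than something the text requires.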
- the level of the received audio is preferably ignored while each broadcasting step is being performed, and preferably also for a period of time immediately after each broadcasting step has been performed. It may also be conditionally ignored for a first period of time or temporarily ignored for a second period of time.
- the content of the output audio may be produced at least in part from the substantially current content of the received ambient audio or at least in part from delayed content of the received ambient audio.
- the content of the output audio may be produced at least in part from a source independent of the incident ambient audio, such as a white noise generator, a coloured-noise generator, or an oscillatory-signal generator.
- the step of detecting loud ambient audio preferably comprises comparing the level of the received audio with at least one threshold.
- the method preferably further comprises the step of automatically adjusting the threshold, or at least one of the thresholds, in dependence upon an average value of the level of the received ambient audio.
- adjustment of the threshold(s) is independent of the level of the received ambient audio while each broadcasting step is being performed, and preferably also for a period of time immediately after each broadcasting step has been performed. It may also be conditional for a first period of time or be temporarily delayed for a second period of time.
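The adaptive threshold can be sketched as a running average of the received level, scaled by a fixed ratio and held constant whenever a `frozen` flag is set (during a broadcast and for a time afterwards). The class name, the ratio of 4 and the smoothing factor are illustrative assumptions.

```python
class AdaptiveThreshold:
    """Loudness threshold derived from an average of the received level."""

    def __init__(self, initial_average: float = 0.01, ratio: float = 4.0,
                 smoothing: float = 0.05):
        self.average = initial_average  # running average of the input level
        self.ratio = ratio              # threshold = average * ratio (assumed rule)
        self.smoothing = smoothing      # weight given to each new measurement

    @property
    def threshold(self) -> float:
        return self.average * self.ratio

    def update(self, level: float, frozen: bool = False) -> bool:
        """Return True if `level` is loud; adapt the average only when not frozen."""
        loud = level > self.threshold
        if not frozen:
            self.average += self.smoothing * (level - self.average)
        return loud
```

Passing `frozen=True` while output audio is being broadcast keeps the loud local signal from dragging the threshold upwards.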
- Determining the presence and/or absence of loud ambient audio may involve some or all of the following: ignoring bursts of loud ambient audio that are shorter than typical spoken words; setting automatic gain control (AGC) and/or threshold detection levels, not altering AGC and/or threshold detection levels while broadcasting output audio, not altering AGC and/or threshold detection levels for a first time after broadcasting output audio, conditionally adapting AGC and/or threshold detection levels for a second time after broadcasting output audio; not adapting threshold detection levels with data obtained between the said first time and a second time until the said second time; ignoring incident ambient audio while broadcasting output audio, ignoring ambient audio for a first time after broadcasting output audio, conditionally ignoring incident audio for a second time after broadcasting output audio, where the second time is longer than the first time.
- the previous methods may be combined with a further method, such that desired audio may be broadcast instead of output audio produced according to the previous methods. This has the effect of providing a conventional loud hailer when desired audio is detected.
- the invention also provides an audio processing apparatus adapted to perform the method described above.
- a main objective of an embodiment of the present invention is the prevention of recognition of broadcast output audio as new ambient audio from a desirable or undesirable source.
- An apparatus that mistriggers in such a way is likely to oscillate and may be ineffective at responding to original input audio.
- Another objective is the obstruction of offensive human speech.
- the timing attributes of the apparatus are preferably matched to the characteristic timings of human speech, in order that apparatus is able to respond to a spoken word in time to obstruct that spoken word.
- Second, that deadtime is preferably minimised. In this context, deadtime is a period or periods where apparatus does not respond to a spoken word and (obviously) fails to obstruct that spoken word.
- any significant deadtime obviously enables offensive words to be spoken during that deadtime, and significantly reduces the effectiveness of the invention
- the apparatus preferably mimics the response of a human.
- undesirable spoken words should be interfered with when it is desired to be more assertive, such as when production of undesirable audio fails to stop.
- Undesirable spoken words should be interrupted when it is desired to be less assertive, such as when production of undesirable audio is infrequent.
- Mistriggering can also be prevented when the characteristics of broadcast output audio are those that are recognised as input audio. There are several ways that such mistriggering might be prevented.
- the audio input should detect the audio output. This requires (1) that there is sufficient loop gain from the output actuator to the input detector and (2) that the input detector is enabled when there is sufficient loop gain from the output actuator to the input detector and output audio is present.
- ambient audio derived from output audio can still be present in the form of echoes after active production of output audio has ceased. Such echoes may be similar in strength to that of original ambient audio, but may be psychoacoustically overwhelmed by original ambient audio.
- threshold parameters and AGC parameters may be distorted if the input detector is searching for distant (weak) sounds, but is exposed to local (strong) sounds when output audio is produced.
- the first requirement depends on the volume of the output audio, the physical position of the audio sensor relative to the audio output actuator, and the amount of amplification in the audio input.
- the second requirement depends on various timing elements.
- a first approach to maintaining stability would be to use an output stage that is responsive to output audio, without an input stage that is responsive to output audio.
- Such a device will typically have several timing elements, including:
- the input stage can have a "sensitivity time constant", such that loud signals whose duration is less than the sensitivity time constant are rejected. This is provided to eliminate mistriggering caused by arbitrary inoffensive noises.
- the apparatus can have an "activation-hold time constant", which determines the length of time for which the apparatus input remains in the active state, after a loud sound that is longer than the sensitivity time constant has been detected.
- the output stage can have a "disable time constant", which is the time for which the audio output is disabled, at the end of a period where the audio output has been active.
- Stability may require a sensitivity time constant that is longer than echoes of output audio, in order that echoes are not detected, but a very long sensitivity time constant may prevent desired detection of original ambient audio.
- Stability may require an activation-hold time constant that is shorter than the disable time constant, in order that active output audio does not cause reactivation, but a short activation-hold time constant may cause inappropriate detection of the end of original ambient audio.
- Stability may require a disable time constant that is longer than the duration of echoes plus the activation-hold time constant, in order that active output audio does not cause reactivation, but the disable time constant is pure dead time, where the apparatus is unable to respond to original ambient audio.
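The three stability conditions for this first approach can be checked mechanically. The sketch below simply restates them as inequalities; the function name and the example figures are assumptions, and all arguments are in the same time unit.

```python
def first_approach_violations(sensitivity: float, activation_hold: float,
                              disable: float, echo_duration: float) -> list:
    """Return the stability conditions that a set of time constants breaks."""
    violations = []
    if sensitivity <= echo_duration:
        violations.append("sensitivity must exceed echo duration")
    if activation_hold >= disable:
        violations.append("activation-hold must be shorter than disable")
    if disable <= echo_duration + activation_hold:
        violations.append("disable must exceed echo duration plus activation-hold")
    return violations
```

For example, with 0.3 s echoes, `first_approach_violations(0.4, 0.2, 0.6, 0.3)` returns an empty list, but the 0.6 s disable time is then pure dead time, which is the compromise the text objects to.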
- if original input audio is present at the end of active output audio, it may not be necessary or even desirable to ignore the audio input. The compromises may be further worsened if the apparatus uses inaccurate timings, such as those provided by analogue circuitry.
- threshold parameters and AGC parameters may be distorted if the input adapts to local (strong) sounds when output audio is produced, but normally is adapted to distant (weak) sounds. In such cases, the apparatus may respond in an undesirable fashion (or not respond at all) for a period after the output of audio.
- This first approach is not well suited to the objectives of the present invention and is therefore preferably not used. There are a number of design compromises, there is inevitable dead time, it is incompatible with quiet inputs and loud outputs, and it does not cope with original ambient audio during echoes. Apparatus according to this first approach may be less than fully successful in responding to offensive human speech.
- a second approach to maintaining stability (used in the present invention) has an input stage that is responsive to output audio, without an output stage that is responsive to output audio.
- This approach decouples the constraints on time constants, practically eliminates deadtime, enables acoustic sensitivity and detection while permitting loud output audio, and can cope with original ambient audio during echoes. This enables greater design freedom to meet the timing requirements of human speech.
- Figure 1 schematically illustrates the relationship between the input ambient audio and the output broadcast audio according to the present invention
- Figure 2 schematically illustrates a first example of a method according to the present invention
- Figure 3 schematically illustrates a second example of a method according to the present invention
- Figure 4 is a block diagram of an apparatus for performing the first example
- Figure 5 is a state diagram to illustrate the operation of the apparatus of figure 4, and
- Figure 6 is a second state diagram to illustrate the operation of the apparatus of figure 4.
- Figure 1 illustrates the relationship between the ambient audio 10, the audio 12 from an undesirable source 14, and the broadcast output audio 16 produced by an apparatus 18.
- the ambient audio 10 is converted by microphone 20 into a signal that is amplified to usable levels by preamplifier 22.
- the signal is processed by a processing block 24, which produces an output signal that is amplified by a power amplifier 26 and broadcast by loudspeaker 28.
- the broadcast audio 16 mixes with audio 12 from the undesirable source 14 to form the ambient audio 10
- the undesirable source 14 is typically relatively distant from the apparatus 18, because of the difficulty and possible hazard of positioning the apparatus 18 close to the source 14 of undesirable audio 12
- the signal produced by microphone 20 from the undesirable component of the ambient audio 10 is therefore of relatively small size, and requires significant amplification by preamplifier 22 to reach a usable level.
- the actual distance between the apparatus 18 and the source 14 of undesirable audio 12 is generally unpredictable, and the size of the undesirable audio is generally unpredictable.
- when the processing block 24 uses undesirable input audio as a component of output audio 16, processing block 24 applies variable amounts of amplification to its input signal to produce a component of consistent peak or average amplitude. Such AGC methods are well known to those skilled in the art of audio processing.
- Processing block 24 also uses its input signal to derive an adaptive threshold in order to distinguish between loud undesirable input audio and silence.
- Such methods are well known to those skilled in the art of audio processing, and may be combined with methods of AGC.
- the broadcast audio 16 produced by loudspeaker 28 must be relatively loud in order to carry to the distant source 14 of undesirable audio 12 and disrupt the undesirable audio 12 at that source 14. Since the microphone 20 is normally adjacent the loudspeaker 28, the size of the audio input to processing block 24 caused by broadcast output ambient audio is generally significantly larger than that caused by undesirable ambient audio. Such a difference can significantly disturb the AGC and threshold parameters of processing block 24, such that undesirable audio is not properly detected for a time after the production of broadcast output audio, until the AGC and threshold parameters readjust.
- Methods to perform such disabling are well known to those skilled in the art of audio processing, and may include holding constant the voltage on a capacitor or locking the contents of digital memory, for example.
- the AGC parameters and threshold parameters may be frozen and also conditionally accepted until retrospectively rolled-back at the end of the echo period.
- the threshold parameters may also be frozen until retrospectively updated at the end of the echo period.
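One way to picture "conditionally accepted until retrospectively rolled-back" is a snapshot taken when a broadcast starts and restored at the end of the echo period unless new ambient audio was confirmed. The classes, field names and default values below are hypothetical and exist only to illustrate the idea.

```python
import copy


class DetectorParams:
    """AGC gain and detection threshold held by the processing block (assumed fields)."""

    def __init__(self, agc_gain: float = 100.0, threshold: float = 0.04):
        self.agc_gain = agc_gain
        self.threshold = threshold


class ParamGuard:
    """Snapshot parameters at the start of a broadcast; roll back if needed."""

    def __init__(self, params: DetectorParams):
        self.params = params
        self._snapshot = None

    def begin_broadcast(self) -> None:
        self._snapshot = copy.copy(self.params)  # remember pre-broadcast values

    def end_echo_period(self, new_audio_confirmed: bool) -> None:
        # Keep conditionally accepted values only if new audio was confirmed.
        if self._snapshot is not None and not new_audio_confirmed:
            self.params.agc_gain = self._snapshot.agc_gain
            self.params.threshold = self._snapshot.threshold
        self._snapshot = None
```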
- FIG. 2 shows in greater detail elements of the apparatus 18 used in performing the method.
- Ambient audio 10 is converted by the microphone 20 into a signal that is amplified to a usable level by preamplifier 22.
- the signal is connected to a combiner/switch 30 and to a control input 32 of processing block 24.
- an algorithmic generator 34 that produces a signal according to an algorithm.
- a pattern generator 36 that produces a signal according to a stored pattern, which may be an artificially created pattern or a recording of a real audio signal.
- the switch connects some combination of one or more of the outputs of the preamplifier 22, the algorithmic generator 34, and the pattern generator 36 to a signal input 38 of the processing block 24.
- the output of the preamplifier 22 controls the processing block 24 via its control input 32 to produce an output signal that is amplified by power amplifier 26 and broadcast by loudspeaker 28.
- the broadcast audio 16 mixes with audio 12 from the undesirable audio source 14 to form the ambient audio 10.
- the output of combiner/switch 30 could therefore include a component from a pseudorandom source (such as that produced by algorithmic generator 34) or from a stored repetitive waveform source (such as that produced by the pattern generator 36). Such sources are well known per se.
- the processing block 24 may act to encourage ambient audio oscillation or may act to prevent ambient audio oscillation. Oscillation will occur if processing block 24 introduces sufficient loop gain. Oscillation may be prevented if processing block 24 ignores input audio while output audio is being broadcast. Then the apparatus may be said to operate in a 'record-or-replay' mode, since the processing block 24 gathers incident audio or outputs audio, but never does both simultaneously. Oscillation may also be prevented if processing block 24 uses 'echo cancellation' techniques to remove broadcast output audio from an input signal that includes both new incident audio and broadcast output audio.
- the apparatus may be said to operate in a 'record-while-replay' mode, since output audio can be broadcast while new incident audio is being gathered.
- Such 'echo cancellation' techniques are well known per se to one skilled in the art, and will not be mentioned further here except to note that such techniques require 'training' to learn the characteristics of the path between the output and input of the processing block 24.
- Such training necessarily requires the production of output audio in the absence of significant new incident audio. This may be done by deliberately producing a specific training signal. Training may be done while processing block 24 executes a 'record-or-replay' method.
- Oscillation may also be prevented when broadcast output audio is constrained to a given bandwidth or bandwidths.
- processing block 24 can use standard filtering techniques to reduce amplitudes in such bandwidths to an acceptable level in signals derived from ambient audio. This is also a 'record-while-replay' technique.
- Since the control input 32 of the processing block 24 is derived from incident ambient audio, instability will occur if output audio is mistaken for genuinely new ambient audio. Such instability is avoided by applying the 'record-or-replay' or 'record-while-replay' techniques (described above) to control input 32 of the processing block 24 as well as to the signal input to the processing block 24.
- the processing block 24 examines the signal presented at control input 32 so that loud ambient audio and quiet ambient audio may be differentiated and detected. This may be done in many ways, which will be apparent, in the light of this specification, to those skilled in the art of audio processing.
- the type of output produced by processing block 24 depends on the presence of loud ambient audio, detected via the signal at control input 32. If loud audio has been detected, the processing block outputs a signal that represents the audio that will obstruct production of ambient audio.
- otherwise, processing block 24 outputs a signal that represents silence or some other audio that will not obstruct production of ambient audio.
- An obstructing output signal is produced by processing block 24 from its input signal after the detection of loud ambient audio via control input 32.
- the combiner/switch 30 and processing block 24 operate to produce an output signal which represents output audio that discourages the production of ambient audio.
- the processing block 24 rejects the signals at its signal input 38 and its control input 32 for a short time to allow the amplitude of ambient echoes of output audio to decay below the level that overwhelms original ambient audio.
- the processing block 24 then conditionally accepts larger signals at control input 32 as being caused by new original loud ambient audio provided that the signal is large for longer than a certain time. This time must be shorter than the delay between the detection of loud ambient audio via control input 32 and the decision to create an output signal.
- the result of such unconditional and conditional rejection is that production of new obstructing audio caused by old output audio is much reduced, if not eliminated.
- the louder (earlier) loud echoes of output audio are simply ignored.
- the quieter (later) echoes are rejected if they are not masked by new loud ambient incident audio
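The two-stage rejection of echoes can be sketched as a small predicate: an unconditional reject window straight after the output ends, followed by a conditional window in which only sustained loudness is accepted. The function name and the three time values are assumptions; the text gives no figures.

```python
def accept_as_new_audio(since_output_end: float, loud_for: float,
                        reject_time: float = 0.05,
                        conditional_time: float = 0.30,
                        confirm_time: float = 0.08) -> bool:
    """Should a loud signal at the control input count as new ambient audio?

    All times are in seconds and purely illustrative.
    """
    if since_output_end < reject_time:
        return False                     # early, loud echoes: ignored outright
    if since_output_end < conditional_time:
        return loud_for >= confirm_time  # later echoes: only sustained loudness passes
    return True                          # echoes have decayed: normal detection applies
```

Following the text, `confirm_time` would need to stay shorter than the delay between detecting loud audio and deciding to create an output signal.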
- Such delays could be simply pre-programmed, or be the
- if a 'record-while-replay' method is unable to reduce the level of broadcast output audio to an acceptable level in signals derived from ambient audio, there is an advantage in combining the 'record-while-replay' method with features of the 'record-or-replay' method. For example, such a 'record-while-replay' method may need to use the 'record-or-replay' technique of rejecting input audio after producing output audio, but probably for a shorter time than for a pure 'record-or-replay' method.
- the output signal is processed to maintain a uniformly high mean level of output audio.
- a minimum amount of incident loud audio is detected at control input 32 of the processing block before processing block 24 produces output audio. Otherwise, the incident loud audio is rejected. This is to eliminate activation by spurious bursts of noise. Preferably, bursts of noise that are less than a typical short spoken word are rejected.
- processing block 24 outputs interfering audio before the end of that loud ambient audio, in order to interfere with the loud ambient audio.
- a delay between detection of loud ambient audio and output of an interfering signal is necessary to enable the processing block 24 to reject signals at control input 32 that arise from bursts of ambient noise, as explained previously.
- the delay is also necessary if the output audio is to contain a delayed copy of input audio, since time is required to gather that input audio.
- the delay may also be necessary to enable detection of loud ambient audio via control input 32 (depending on the method used).
- the delay may also be necessary to enable determination of the characteristics of the control signal that indicate loudness and quietness.
- the delay may also be necessary to determine the recent peak amplitudes of the input signal to the processing block 24, which may be temporarily stored for future use in automatic-gain-control. If the 'record-or-replay' method is in use, the delay may also be necessary to reject unwanted echoes of previous output audio, as explained previously.
- the action of the processing block 24 when producing an interfering output signal is to amplify its input signal into an output signal that produces output audio with consistently loud mean output amplitude. If the output signal of the combiner/switch 30 is independent of the preamplifier 22, the output signal from the processing block 24 is simply amplified. If the output signal of the combiner/switch 30 is dependent on the preamplifier 22, the output signal from the processing block 24 is adjusted according to stored peak amplitudes of the signal input and new peak amplitudes of the signal input. Methods of applying automatic-gain-control will be apparent, in the light of this specification, to those skilled in the art of audio processing.
- the interfering output signal does not overdrive the power amplifier 26 or the loudspeaker 28.
- the processing block 24 assumes that it cannot differentiate between signals at its control input 32 that were caused by original ambient audio and those that were caused by output audio. So the processing block 24 freezes detailed interpretation of its control input 32.
- the processing block 24 also freezes detailed interpretation of its signal input 38, except as previously noted when preamplifier 22 contributes to the signal source.
- a second mode of operation of the arrangement shown in Figure 2 'interrupts' speech during gaps in that speech.
- processing block 24 starts the output of interrupting audio just after a break in the incident undesired audio. If the combiner/switch 30 is operated to produce its output from the preamplifier 22, this second mode reflects essentially whole spoken words back to a speaker, either almost immediately after that word was finished, or a short time later.
- Processing block 24 acts to prevent oscillation by applying either the 'record-or-replay' or 'record-while-replay' methods described above to both its signal input 38 and its control input 32, to isolate genuinely new ambient audio. If the processing block 24 is executing the 'record-or-replay' method, stability is achieved simply by the act of ignoring input signals while producing interrupting output audio. If the combiner/switch 30 is operated to produce its output from the preamplifier 22, the overall effect is that processing block 24 detects new loud ambient audio, stores that audio until it becomes quiet, replays that stored audio and simultaneously ignores ambient audio, and then returns to searching for new loud ambient audio.
- If the processing block 24 is executing the 'record-while-replay' method, stability is achieved by removing the output signal from input signals. If the combiner/switch 30 is operated to produce its output from the preamplifier 22, the processing block 24 isolates new ambient audio from its input signal and stores it in temporary memory. The processing block 24 isolates new ambient audio at its control input 32 and detects the start of new loud ambient audio. When new isolated quiet ambient audio is detected via control input 32 after new isolated loud ambient audio, the processing block 24 outputs the stored input signal from temporary memory, from the start of the new isolated loud ambient audio to the start of the new isolated quiet ambient audio.
- processing block 24 automatically starts replay of stored audio when a preset maximum amount of audio has been stored. This is to eliminate lockup in the presence of continuously loud new ambient audio.
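The store-then-replay behaviour of the interrupting mode, including the forced replay at a preset maximum, can be sketched with a small buffer class. Frames are treated as opaque items and loudness is assumed to have been decided elsewhere; `InterruptRecorder` and `max_frames` are hypothetical names.

```python
class InterruptRecorder:
    """Store loud audio frames; release them when quiet returns or the store is full."""

    def __init__(self, max_frames: int = 200):
        self.max_frames = max_frames  # preset maximum amount of stored audio
        self.stored = []

    def push(self, frame, loud: bool):
        """Feed one frame; return the frames to replay, or None to keep listening."""
        if loud:
            self.stored.append(frame)
            if len(self.stored) >= self.max_frames:
                return self._flush()  # continuously loud input: avoid lock-up
            return None
        if self.stored:
            return self._flush()      # gap after loud audio: interrupt now
        return None

    def _flush(self):
        out, self.stored = self.stored, []
        return out
```

In 'record-or-replay' operation the caller would stop feeding frames while the returned audio is being replayed, so the replay is never re-recorded.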
- Interrupting is a modest form of assertion, sufficient to dissuade some but not all individuals from speaking, while interfering is a more robust form of assertion that dissuades more individuals from speaking.
- The method will continue interrupting if interruption is effective. Otherwise, it will use interference.
- Another variation is to activate the method depending on the time of day, the relative occurrence of loud ambient audio, and so on.
- Another variation is to add at least one sensor that detects desirable audio.
- The detection of loud audio at that sensor takes precedence over the detection of undesired audio and causes desired audio to be broadcast from a, or the, loudspeaker instead of obstructing audio.
- Desirable audio will be localised audio, such as words spoken directly into a microphone, instead of ambient audio. This is because it must be possible to distinguish desired audio from undesired ambient audio. It is, however, possible for desired audio to originate at a distant source.
- In Figure 3, an audio sensor 40 produces an input signal from ambient desired audio 42 and another audio sensor 20 produces an input signal from ambient undesired audio 10.
- A loudspeaker 28 is driven by the output of decision circuit 44.
- Obstructer circuit 46 produces an obstructing signal using one of the methods previously described.
- Decision circuit 44 outputs a signal derived from audio sensor 40 when desired audio is active, and otherwise outputs an obstructing signal from obstructer circuit 46.
- The output signal is subtracted from the desired input and also from the undesired input using subtractors 48, 50, such that any trace of the output signal is at an acceptably low level. It may also be necessary to remove the clean desired signal from the clean undesired signal using a subtractor 52, such that any trace of the clean desired signal is at an acceptably low level.
- If the desired signal is produced using a non-audio transducer 40, such as a throat microphone, the desired signal will not include the output signal, so the stage of removing output audio from desired audio is not needed. This eliminates subtractor 48.
- FIG. 4 illustrates the preferred physical architecture of an apparatus for performing the methods described above.
- An electret microphone-insert 20 converts ambient audio into an electrical signal that is amplified by amplifiers 22a (such as the National Semiconductor LM358 set for a gain of 2) and 22b (such as the National Semiconductor LM386 bypassed for maximum gain).
- The output of amplifier 22b is the audio input to a codec 54 (such as the Texas Instruments TCM320AC36).
- The codec 54 is driven by control signals generated by a microcontroller 56 (such as a Microchip PIC16C64).
- The codec 54 converts the incident analogue audio to digital and compresses it to an 8-bit word (using μ-law coding in this example).
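As a rough illustration of the 8-bit compression step, the following Python sketch encodes one linear sample with the common G.711-style μ-law procedure (bias 0x84, clip at 32635). In the apparatus this is done in hardware by the codec 54. The function name `linear_to_ulaw` and the signed 16-bit input range are assumptions made for illustration and are not taken from the specification.

```python
BIAS = 0x84    # 132, added to the magnitude before the segment is found
CLIP = 32635   # largest magnitude that still fits after the bias is added


def linear_to_ulaw(sample: int) -> int:
    """Compress one signed 16-bit linear sample to an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Segment 0..7: position of the highest set bit, offset so 132 maps to 0.
    segment = magnitude.bit_length() - 8
    # Four-bit amplitude taken from the bits just below the leading bit.
    amplitude = (magnitude >> (segment + 3)) & 0x0F
    # G.711 conventionally transmits the byte with all bits inverted.
    return ~(sign | (segment << 4) | amplitude) & 0xFF
```

The byte layout (one sign bit, three segment bits, four amplitude bits) is the same layout referred to later when a pseudo-random generator is used to build μ-law samples directly.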
- The microcontroller 56 controls the codec 54 via reset, data, clock and sync signals 58 such that the codec 54 sends the compressed data to the microcontroller 56, which manipulates the data according to its stored program.
- The microcontroller 56 has insufficient internal temporary memory, and therefore uses RAM 60 (8k x 8 industry standard type 6264) to store the compressed data samples.
- The microcontroller 56 produces address signals 62 and control signals 64 to drive the RAM 60.
- The microcontroller 56 exchanges data with the RAM 60 via data signals 66.
- When the microcontroller 56 has finished its processing, it sends a compressed digital version of the output audio to the codec 54 using signals 58.
- The codec 54 converts the digital data to an analogue waveform that is amplified by the power amplifier 26 (such as Analog Devices SSM2211), which drives the loudspeaker 28 (such as a 1.5W loudspeaker).
- The microcontroller 56 derives its timebase from a crystal 68 (preferably 20MHz).
- The crystal 68 also drives a counter 70 (such as the industry standard HC4024) that produces a reference clock 72 for the codec 54.
- The microcontroller 56 continually drives the RAM 60 so that compressed input data is continually written to the RAM 60. New data overwrites the oldest data when the RAM 60 is full.
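The continual overwrite of the oldest data can be pictured as a circular buffer. The sketch below is a minimal Python model of that behaviour, assuming one byte per compressed sample; the class name `CircularStore` and its methods are hypothetical and only illustrate the idea, not the prototype's firmware.

```python
class CircularStore:
    """Fixed-size store in which new bytes overwrite the oldest ones."""

    def __init__(self, size: int = 8192):  # 8k x 8, as for the 6264 RAM
        self.data = bytearray(size)
        self.write_pos = 0   # next address to be written
        self.count = 0       # number of valid bytes held so far

    def write(self, byte: int) -> None:
        self.data[self.write_pos] = byte & 0xFF
        self.write_pos = (self.write_pos + 1) % len(self.data)
        self.count = min(self.count + 1, len(self.data))

    def latest(self, n: int) -> bytes:
        """Return the most recent n bytes, oldest first."""
        n = min(n, self.count)
        start = (self.write_pos - n) % len(self.data)
        return bytes(self.data[(start + i) % len(self.data)] for i in range(n))
```

With an 8192-byte store and one byte per sample, the amount of audio retained depends on the sample rate, which this extract does not state.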
- The microcontroller 56 also continually inspects input data to detect contiguous loud audio. There are many ways of determining when loud audio is present, all of which will be apparent, in the light of this specification, to one skilled in the art. In a prototype, time was divided into arbitrary contiguous intervals of 20ms or so, the peak value in each interval was noted, and the last nine peak values were recorded in a FIFO. An upper threshold is set to an appropriate proportion, such as a half, or more preferably a quarter, of the median value in the peak FIFO.
- When the input exceeds the upper threshold, a retriggerable 'upper-monostable' of 20ms or so is set.
- A lower threshold is set to an appropriate proportion, such as an eighth, or more preferably a sixteenth, of the median value in the peak FIFO.
- When the input exceeds the lower threshold, a retriggerable 'lower-monostable' of 20ms or so is set. If the prototype's state is 'audio absent', the state changes to 'audio present' when the 'upper-monostable' is active. If the prototype's state is 'audio present', the state remains 'audio present' as long as the 'lower-monostable' is active.
- The actual start of contiguous audio is taken to be 20ms or so before the state changes to 'audio present'.
- The actual end of contiguous audio is taken to be 20ms or so after the state changed to 'audio absent', once the state has been 'audio absent' for 80ms or so. It will be appreciated that this is just one method of determining the presence or absence of spoken words, that the values quoted here can be varied, and that there are other methods.
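The detection scheme described above (interval peaks, a nine-entry peak FIFO, median-derived thresholds and two retriggerable monostables) can be sketched in Python as follows. This is an illustrative model rather than the prototype's code: the class name `LoudAudioDetector`, the 8000 Hz sample rate and the small `floor` value that stops the thresholds collapsing to zero in silence are all assumptions. The 20ms start and end offsets and the 80ms hold are omitted for brevity.

```python
from collections import deque
from statistics import median


class LoudAudioDetector:
    """Hysteresis detector: upper threshold to start, lower to continue."""

    def __init__(self, sample_rate: int = 8000, interval_ms: int = 20,
                 floor: float = 8):
        self.interval = sample_rate * interval_ms // 1000  # samples per interval
        self.peaks = deque(maxlen=9)     # FIFO of the last nine interval peaks
        self.floor = floor               # assumed minimum threshold
        self.upper = self.lower = floor
        self.peak = 0                    # peak within the current interval
        self.n = 0                       # samples seen in the current interval
        self.upper_left = 0              # samples left on the upper monostable
        self.lower_left = 0              # samples left on the lower monostable
        self.present = False             # True means 'audio present'

    def feed(self, sample: int) -> bool:
        level = abs(sample)
        # Retrigger the monostables whenever a threshold is exceeded.
        if level > self.upper:
            self.upper_left = self.interval
        if level > self.lower:
            self.lower_left = self.interval
        # 'absent' -> 'present' needs the upper monostable to be active;
        # 'present' is held for as long as the lower monostable is active.
        if not self.present and self.upper_left > 0:
            self.present = True
        elif self.present and self.lower_left == 0:
            self.present = False
        self.upper_left = max(0, self.upper_left - 1)
        self.lower_left = max(0, self.lower_left - 1)
        # Track the interval peak and refresh thresholds at interval end.
        self.peak = max(self.peak, level)
        self.n += 1
        if self.n == self.interval:
            self.peaks.append(self.peak)
            m = median(self.peaks)
            self.upper = max(m / 4, self.floor)    # a quarter of the median
            self.lower = max(m / 16, self.floor)   # a sixteenth of the median
            self.peak = 0
            self.n = 0
        return self.present
```

Using separate upper and lower thresholds gives hysteresis, so a word that briefly dips in level is still treated as one contiguous stretch of loud audio.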
- Figure 5 is an illustration of a state-machine that is implemented as a program in the microcontroller 56 in the preferred implementation.
- The program in the microcontroller 56 examines the samples representing incident ambient audio.
- If the program has spent more than a short time T1 in the QUIESCENT state 74, the program executes the interrupting method 76, where entire spoken words are replayed as soon as they have finished. Then the program returns to the QUIESCENT state 74. On the other hand, if the program spends less than that short time T1 in the QUIESCENT state 74, the program executes the interfering method 78 and then returns to the QUIESCENT state 74.
- The state changes to GATHER1 state 80.
- In GATHER1 state 80, the amplitude of detected audio is examined so as to temporarily record the peak levels of the audio, and the characteristics of loud audio are updated. If audio becomes quiet, the state changes from the GATHER1 state 80 to TEST1 state 82.
- In TEST1 state 82, the time since the broadcast of output audio is measured, and the duration of the loud audio is examined. If the time since broadcast of output audio is less than a predetermined time T2 (the prototype used a duration of 140ms), or the duration of the loud audio is less than a predetermined time T3 (the prototype used a duration of 180ms), the audio is rejected and the state returns to the QUIESCENT state 74. (If in a specific instance, T2 is less than T3, then obviously the test using T2 is redundant.) Otherwise, the state changes to OUTPUT1 state 84.
- In OUTPUT1 state 84, audio is generated from a signal, and is broadcast.
- Ts the prototype used a duration of 180ms
- In the ECHO1 state 86, all ambient audio is ignored. When the time spent in ECHO1 state 86 reaches a limit TO (the prototype used a duration of 20ms), the state returns to QUIESCENT state 74.
- The preferred implementation uses incident audio as the signal that is converted to audio and broadcast.
- The audio sample that has just been gathered is amplified by an automatic gain control to produce a consistently loud mean output amplitude without clipping.
- The microcontroller does this by noting the maximum sample amplitude during the GATHER1 state 80, and amplifying all samples by the same amount so that the maximum sample amplitude during replay is the peak desired value. If feedback causes larger input samples that would be clipped by this process, the amount of amplification is reduced so as to avoid clipping.
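The gain calculation described above amounts to peak normalisation with a safeguard against clipping. A minimal Python sketch, working on linear sample values for clarity, is given below; the function name `replay_with_agc` and the default `target_peak` and `full_scale` values are illustrative assumptions, and the prototype would have operated on μ-law data instead.

```python
def replay_with_agc(samples, gathered_peak, target_peak=8000, full_scale=8031):
    """Scale samples so the peak noted while gathering maps to target_peak.

    If a sample larger than gathered_peak turns up (for example because of
    feedback), the gain is reduced so that it just reaches full_scale.
    """
    if gathered_peak <= 0:
        return list(samples)           # nothing to normalise against
    gain = target_peak / gathered_peak
    out = []
    for s in samples:
        if abs(s) * gain > full_scale:
            gain = full_scale / abs(s)  # back off to avoid clipping
        out.append(int(s * gain))
    return out
```

For example, `replay_with_agc([100, -200, 50], 200, target_peak=1000, full_scale=2000)` applies a gain of 5 to every sample, while an unexpectedly large sample later in the sequence lowers the gain for that sample and those after it.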
- An alternative implementation could use a signal derived from an algorithmic generator.
- One example is the use of pseudo-random generators to produce apparently random noise.
- A description of pseudo-random generators is in 'Pseudo-Random Sequences and Arrays', MacWilliams and Sloane, Proc. IEEE vol. 64 no. 12, December 1976.
- A suitable polynomial is x^15 + x + 1, since it has few taps but has a cycle length of a few seconds when incremented once per sample period.
- The contents of the generator could be repeatedly exclusive-ORed with audio samples during the start of the GATHER1 state 80 to provide a variable start position when the time comes to provide output audio, provided that steps are taken to detect the all-zero lockup state and exit it.
- An audio sample could be produced from the generator by incrementing it every sample period.
- The six least significant bits of the generator are used to produce a varying audio output.
- Four bits are used as the amplitude part of a μ-law sample, another bit as the least significant bit of the segment value of that sample, and another bit as the sign bit.
- The two most significant bits in the segment value should be set to 1, to ensure a large amplitude output. This produces 'white' noise audio, which may be acceptable for interrupting certain speakers.
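The generator and the bit packing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the shift direction, which of the six low bits feeds which field, and the function names `lfsr_step`, `noise_byte` and `noise_bytes` are all choices made here, and the byte is left uninverted although some codecs expect inverted μ-law bytes.

```python
def lfsr_step(state: int) -> int:
    """Advance a 15-bit shift register for x^15 + x + 1 by one sample period."""
    new_bit = (state ^ (state >> 1)) & 1   # two taps only
    return (state >> 1) | (new_bit << 14)


def noise_byte(state: int) -> int:
    """Build a large-amplitude mu-law style byte from the six low bits."""
    amplitude = state & 0x0F                  # four amplitude bits
    segment = 0b110 | ((state >> 4) & 1)      # top two segment bits forced to 1
    sign = (state >> 5) & 1                   # one sign bit
    return (sign << 7) | (segment << 4) | amplitude


def noise_bytes(seed: int, count: int) -> list:
    """Generate count noise bytes, escaping the all-zero lockup state."""
    state = (seed & 0x7FFF) or 1              # zero would never change
    out = []
    for _ in range(count):
        state = lfsr_step(state)
        out.append(noise_byte(state))
    return out
```

The register visits all 32767 non-zero states before repeating, which is consistent with a cycle of a few seconds at typical telephony sample rates (about 4 seconds at an assumed 8000 samples per second).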
- Another alternative implementation could use a signal derived from a primitive pattern stored in non-volatile memory. At each sample period, a successive value of the pattern is converted to audio. When the end of the pattern is reached, the method cycles back to using the start of the pattern, and the process repeats.
- Some such patterns (such as a sine wave, or more complex cyclic signals) may be generated by algorithms, while others (such as a stored version of actual positive audio feedback) may be stored versions of actual audio signals.
- The state changes to GATHER2 state 88.
- In GATHER2 state 88, the amplitude of detected audio is examined so as to temporarily record the peak levels of the audio, the characteristics of loud audio are updated, and detected audio is temporarily stored. If audio becomes quiet, the state changes from GATHER2 state 88 to TEST2 state 90.
- In TEST2 state 90, the time since the broadcast of output audio is measured, and the duration of the loud audio is examined. If the time since broadcast of output audio is less than a predetermined time T7 (the prototype used a duration of 140ms), or the duration of the loud audio is less than a predetermined time T8 (the prototype used a duration of 180ms), the audio is rejected and the state returns to QUIESCENT state 74. (If in a specific instance, T7 is less than T8, then obviously the test using T7 is redundant.) Otherwise, the state changes to OUTPUT2 state 92.
- In the ECHO2 state 94, all ambient audio is ignored. When the time spent in ECHO2 state 94 reaches a limit T10 (the prototype used a duration of 20ms), the state returns to QUIESCENT state 74.
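Both branches of the state machine follow the same gather, test, output and echo pattern, which can be sketched as a single transition function. The Python below is a simplified model, not the prototype's program: the names `State` and `next_state` are hypothetical, the trigger for leaving QUIESCENT and the condition for leaving OUTPUT are assumptions, and the timing constants reuse the prototype values of 140ms, 180ms and 20ms quoted above.

```python
from enum import Enum, auto


class State(Enum):
    QUIESCENT = auto()
    GATHER = auto()
    TEST = auto()
    OUTPUT = auto()
    ECHO = auto()


T_SINCE_OUTPUT_MS = 140   # minimum time since the last broadcast (T2 / T7)
T_MIN_LOUD_MS = 180       # minimum duration of loud audio (T3 / T8)
T_ECHO_MS = 20            # time spent ignoring echoes of the output


def next_state(state, *, loud=False, since_output_ms=0, loud_ms=0,
               output_done=False, echo_ms=0):
    """Return the state that follows `state` for the given observations."""
    if state is State.QUIESCENT:
        # Assumption: loud audio is what starts gathering.
        return State.GATHER if loud else State.QUIESCENT
    if state is State.GATHER:
        return State.GATHER if loud else State.TEST
    if state is State.TEST:
        too_soon = since_output_ms < T_SINCE_OUTPUT_MS
        too_short = loud_ms < T_MIN_LOUD_MS
        return State.QUIESCENT if (too_soon or too_short) else State.OUTPUT
    if state is State.OUTPUT:
        # Assumption: the echo period starts once the broadcast has finished.
        return State.ECHO if output_done else State.OUTPUT
    if state is State.ECHO:
        return State.QUIESCENT if echo_ms >= T_ECHO_MS else State.ECHO
    raise ValueError(state)
```

The rejection test in TEST guards against two cases: audio that is probably an echo of the apparatus's own output, and bursts too short to be a spoken word.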
- Figure 6 illustrates the processing of input parameters such as age settings and threshold level settings.
- In QUIESCENT3 state 95, input parameters are adjusted according to the level of the received ambient audio. If the apparatus enters OUTPUT1 state 84 or OUTPUT2 state 92, the parameter processing enters OUTPUT3 state 96 and stays there until the apparatus leaves OUTPUT1 state 84 or OUTPUT2 state 92. During OUTPUT3 state 96, input parameters are not changed. If the apparatus enters ECHO1 state 86 or ECHO2 state 94, the parameter processing enters ECHO3 state 97 and stays there until the apparatus leaves ECHO1 state 86 or ECHO2 state 94. During ECHO3 state 97, input parameters are not changed.
- One implementation may then follow path-a, while another implementation may follow path-b.
- The parameter processing enters CONDITIONAL4 state 101, during which input parameters are not changed but pending changes due to the level of the received ambient audio are noted.
- After a time T11 (the prototype used a duration of 140ms) in CONDITIONAL4 state 101, OBSERVE4 stage 102 observes whether ambient audio is loud, or is loud and has recently been loud. If loud ambient audio is present, the pending changes are applied to the input parameters in ROLL_FORWARD state 103, and the apparatus then returns to QUIESCENT3 state 95. If no new ambient audio is present, the pending changes are abandoned and the apparatus returns directly to QUIESCENT3 state 95.
- The parameter processing enters CONDITIONAL5 state 98, during which input parameters are adjusted according to the level of the received ambient audio but those changes are temporarily recorded.
- After a time T12 (the prototype used a duration of 140ms) in CONDITIONAL5 state 98, OBSERVE5 stage 99 observes whether ambient audio is loud, or is loud and has recently been loud. If loud ambient audio is present, the apparatus returns directly to QUIESCENT3 state 95. If no new ambient audio is present, the changes are removed from the input parameters in ROLL_BACK state 100, and the apparatus then returns to QUIESCENT3 state 95.
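The two conditional paths differ only in when a parameter change takes effect: one holds the change back and rolls it forward later, the other applies it at once and rolls it back if it proves unjustified. The Python sketch below models both for a single parameter; the class `ThresholdParams` and its method names are hypothetical and stand in for whatever parameters the apparatus actually tracks.

```python
class ThresholdParams:
    """One input parameter with roll-forward and roll-back handling."""

    def __init__(self, level):
        self.level = level       # live value used by the detector
        self._pending = None     # change noted but not yet applied
        self._snapshot = None    # value to restore on roll-back

    # CONDITIONAL4 style: note the change, apply it only on ROLL_FORWARD.
    def note_pending(self, new_level):
        self._pending = new_level

    def observe4(self, loud_present: bool):
        if loud_present and self._pending is not None:
            self.level = self._pending   # ROLL_FORWARD
        self._pending = None             # otherwise the change is abandoned

    # CONDITIONAL5 style: apply at once, undo on ROLL_BACK.
    def adjust_tentatively(self, new_level):
        if self._snapshot is None:
            self._snapshot = self.level  # remember the value to restore
        self.level = new_level

    def observe5(self, loud_present: bool):
        if not loud_present and self._snapshot is not None:
            self.level = self._snapshot  # ROLL_BACK
        self._snapshot = None
```

In both cases the aim is the same: changes made while echoes of the output may still be present only survive if new loud ambient audio confirms them.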
- The processing block 206 need merely activate and deactivate a common buzzer, and combiner/switch 205 is redundant.
- Many such buzzers are much more efficient than a loudspeaker at converting electricity into sound, and may produce much more directional sound than a loudspeaker. These properties may be useful in portable equipment, for example.
- RAM 60, power amplifier 26, and loudspeaker 28 are redundant.
- It may also be feasible to replace the codec 54 with pure analogue circuitry that derives the amplitude of incoming audio, its mean peak value, various thresholds, and the size of incoming audio relative to those thresholds.
- The amplitude can be derived using a rectifier circuit.
- The mean peak value (rather than the median value used for simplicity in the microcontroller implementation) can be derived by peak-detecting and filtering the rectified audio.
- The mean peak value can be divided to produce a high threshold and a low threshold.
- A silence threshold can be derived from a fixed voltage.
- The microcontroller produces timing waveforms that cause the mean peak circuitry to accept, ignore, conditionally accept or roll back incoming audio.
- One convenient method is to use duplicate low pass filters, each filtering the peak-detected signal.
- The input to the first duplicate filter is enabled only during the period when quiet echoes of output audio are present.
- The input to the second duplicate filter is disabled at certain times, depending on the desired effect, and the output of the first filter is added to or subtracted from the output of the second filter as required.
- The resultant signal is the mean peak value.
Abstract
The invention relates to an audio processing method and apparatus for discouraging vocalisation or the production of complex sounds. The method comprises, in repeating cycles, receiving (74) ambient audio; detecting (74) whether the ambient audio is loud; and broadcasting (84, 92) a burst of output audio so that it mixes with the ambient audio, the burst of output audio being timed in dependence on the detection of loud ambient audio.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9929520A GB2357411A (en) | 1999-12-15 | 1999-12-15 | Audio processing, e.g. for discouraging vocalisation or the production of complex sounds |
GB9929519A GB2357410A (en) | 1999-12-15 | 1999-12-15 | Audio processing, e.g. for discouraging vocalisation or the production of complex sounds |
GB0007329A GB0007329D0 (en) | 1999-12-15 | 2000-03-28 | Audio processing e.g. for discouraging vocalisation or the production of complex sounds |
PCT/GB2000/004645 WO2001045082A1 (fr) | 1999-12-15 | 2000-12-04 | Audio processing, e.g. for discouraging vocalisation or the production of complex sounds |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1238389A1 (fr) | 2002-09-11 |
Family
ID=27255623
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP00979810A Withdrawn EP1238389A1 (fr) | 1999-12-15 | 2000-12-04 | Audio processing, e.g. for discouraging vocalisation or the production of complex sounds |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1238389A1 (fr) |
AU (1) | AU1719401A (fr) |
GB (1) | GB2364492B (fr) |
WO (1) | WO2001045082A1 (fr) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB0023207D0 (en) * | 2000-09-21 | 2000-11-01 | Royal College Of Art | Apparatus for acoustically improving an environment |
GB9927131D0 (en) | 1999-11-16 | 2000-01-12 | Royal College Of Art | Apparatus for acoustically improving an environment and related method |
US8964997B2 (en) | 2005-05-18 | 2015-02-24 | Bose Corporation | Adapted audio masking |
US8218783B2 (en) * | 2008-12-23 | 2012-07-10 | Bose Corporation | Masking based gain control |
US8229125B2 (en) | 2009-02-06 | 2012-07-24 | Bose Corporation | Adjusting dynamic range of an audio system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3823691A (en) * | 1973-05-10 | 1974-07-16 | M Morgan | Animal training device |
GB2071889A (en) * | 1980-03-07 | 1981-09-23 | Steel M | Anti-barking device |
US4438526A (en) * | 1982-04-26 | 1984-03-20 | Conwed Corporation | Automatic volume and frequency controlled sound masking system |
US4603317A (en) * | 1982-11-08 | 1986-07-29 | Electronic Controls Co. | Electrically-operated backup alarm |
FR2549990B1 (fr) * | 1983-07-25 | 1987-05-22 | Proactive Systems Inc | Device for emitting a subliminal sound message and method for reducing shoplifting in a shop |
- 2000-12-04 AU: application AU17194/01A (AU1719401A), not active, Abandoned
- 2000-12-04 WO: application PCT/GB2000/004645 (WO2001045082A1), not active, Application Discontinuation
- 2000-12-04 EP: application EP00979810A (EP1238389A1), not active, Withdrawn
- 2000-12-04 GB: application GB0127819A (GB2364492B), not active, Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See references of WO0145082A1 * |
Also Published As
Publication number | Publication date |
---|---|
GB2364492A (en) | 2002-01-23 |
GB2364492B (en) | 2002-07-24 |
AU1719401A (en) | 2001-06-25 |
GB0127819D0 (en) | 2002-01-09 |
WO2001045082A1 (fr) | 2001-06-21 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20020615 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
AX | Request for extension of the European patent | Free format text: AL;LT;LV;MK;RO;SI |
17Q | First examination report despatched | Effective date: 20030819 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20031230 |