CA1227573A - Adaptive speech detector system - Google Patents
Adaptive speech detector system
- Publication number
- CA1227573A, CA000483355A
- Authority
- CA
- Canada
- Prior art keywords
- signal
- speech
- detector system
- speech detector
- noise
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B1/00—Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
- H04B1/38—Transceivers, i.e. devices in which transmitter and receiver form a structural unit and in which at least one part is used for functions of transmitting and receiving
- H04B1/40—Circuits
- H04B1/44—Transmit/receive switching
- H04B1/46—Transmit/receive switching by voice-frequency signals; by pilot signals
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computer Networks & Wireless Communication (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Control Of Amplification And Gain Control (AREA)
- Alarm Systems (AREA)
- Interface Circuits In Exchanges (AREA)
- Radio Relay Systems (AREA)
Abstract
A speech detector circuit for selectively transmitting speech signals while suppressing ambient sound related signals comprises an input from a microphone producing a speech signal and an input from a microphone producing a noise signal. The circuit includes a comparator which compares the speech signal with peak values of the noise signal to produce a pulse signal which indicates when the speech signal exceeds a threshold, and a discriminator responsive to the pulse signal to produce a detected speech signal when the pulse signal occurs within a selected period which is spaced from a preceding pulse signal by a selected period.
Description
This invention relates to speech detector systems. The invention is particularly suitable for use in combination with a transmitter for voice communication which is equipped with a dual-microphone amplifier system arranged so as selectively to transmit speech signals whilst suppressing ambient-sound-related signals.
There are a number of applications for speech detector systems. For example, wireless transmitters, especially of the portable type, are commonly required to have a VOX
facility, that is, automatic keying of the transmitter in response to speech signals.
In previously-known VOX systems which utilise a single microphone channel, it has been difficult to combine effective recognition of the operator's speech with adequate rejection of unwanted background noise or speech. Various techniques have been employed to improve the voice discrimination, including band-separation filtering and the use of noise-cancelling microphones coupled with automatic gain control, abbreviated as "AGC". However, where a wide amplitude-dynamic-range is achieved by the use of AGC it is difficult to avoid false-triggering of the VOX circuit. In addition, the AGC system tends to amplify background noise signals and circuit noise to an objectionable level during periods when the speech is interrupted. This is especially evident when a transmitter is switched to a manual-keying or press-to-talk mode, sometimes abbreviated as "PTT".
A system which has been used successfully to overcome the foregoing disadvantages of VOX and AGC systems comprises a two-channel dual-microphone arrangement in which one microphone receives the operator's speech, superimposed on ambient noise, and the other principally receives ambient noise. Not only is noise cancellation possible, over at least the lower part of the frequency range, but the ambient noise signal may be used to control the speech channel gain in the absence of speech and, by this method, prevent the noise signal at the speech channel output from rising above, for example, a level 10 dB below the nominal speech signal output level. A significant advantage of this system when combined with a speech detector is that, at least in the steady state, there is a well-defined difference in speech and noise levels which is easily discriminated in a simple comparator circuit.
Some constraints and disadvantages of the dual-microphone system are:
A time delay must be included in the speech detector response to take account of the AGC attack time which can be, for example, up to 5 ms.
The two microphones must be closely matched in sensitivity and frequency response.
Direct noise cancellation is usually only possible at frequencies up to about 500 Hz because of the transit time difference for a sound pressure wave travelling to each of the two microphones, this varying with the direction of the sound source.
There is a variation of the automatic-gain-controlled output level of either channel with input signal level, depending on the loop gain of the AGC loop.
There is an uncertainty in the output reference level of the AGC system because of production tolerances in components.
There is a need for precise tracking, or equality of gain, in the control elements of the speech and noise channels.
There is an uncertainty in the speech detector reference level owing to production tolerances in components.
A number of the constraints and disadvantages listed above can produce effects which are additive. This can give rise to a substantial error in the effective speech detector threshold. Even with the inclusion in the circuit of a speech detector threshold adjustment means to take account of static mismatch with the preceding microphone and AGC
amplifier system, dynamic uncertainties may still give rise to false-triggering by the speech detector.
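The transit-time limit on direct noise cancellation listed above can be illustrated numerically. The following is a minimal sketch, assuming an idealised model in which the noise channel is simply subtracted from the speech channel and a sinusoidal sound wave reaches the second microphone after a delay equal to the microphone spacing divided by the speed of sound (the worst case, with the source on the axis joining the microphones). The function name, the 20 mm spacing and the 343 m/s speed of sound are illustrative assumptions and do not come from the specification.

```python
import math

def cancellation_residual(freq_hz, spacing_m, speed_of_sound=343.0):
    """Relative amplitude left after subtracting a delayed copy of a sinusoid.

    Assumed model: residual = |1 - exp(-j*2*pi*f*delay)| = 2*|sin(pi*f*delay)|,
    where delay = spacing / speed of sound. A value of 0 means perfect
    cancellation and a value of 1 means no benefit at all.
    """
    delay_s = spacing_m / speed_of_sound
    return 2.0 * abs(math.sin(math.pi * freq_hz * delay_s))

if __name__ == "__main__":
    # Hypothetical 20 mm microphone spacing.
    for f in (100.0, 500.0, 2000.0):
        r = cancellation_residual(f, 0.02)
        print(f"{f:6.0f} Hz: residual amplitude {r:.3f}")
```

Under these assumptions the residual grows with frequency, which is consistent with cancellation being useful only over the lower part of the frequency range.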
An object of the present invention is to provide circuit means which at least ameliorate some of the above disadvantages of the prior art and which in preferred embodiments reduce if not eliminate the effects of variations in AGC levels, mismatch between speech detector and AGC
reference levels, tracking errors of AGC elements and transient signals during the AGC attack time. In principle this is achieved by combining the functions of AGC and speech detection to eliminate mismatch occurring in separate circuits and, further, by dynamically comparing the detected speech level with the detected noise level instead of with a constant threshold.
This invention consists in a speech detector system comprising means for producing a first (speech) signal representative of speech superimposed on ambient noise and a second (noise) signal representative of said ambient noise;
comparator means to determine a speech signal threshold according to peak values of said second signal and to produce a third (pulse) signal which provides an indication of when said first signal exceeds said speech signal threshold; and discriminator means responsive to an indication by said third signal to produce a fourth signal indicating detected speech when an indication occurs within a second selected period which is spaced from a preceding indication by a first selected period.
For preference the speech signal and noise signal are subjected to automatic gain control and noise cancellation prior to input to the comparison means. For preference also the ratio of the noise cancelled speech signal to the noise signal fed to the comparison means is greater than 1:1.
Other aspects of the invention will be apparent from the description which follows.
An embodiment of the invention will now be described by way of example only with reference to the accompanying drawings wherein:
Figure 1 is a schematic block diagram showing part of a wireless transmitter speech detector system having a two channel audio system with speech and noise inputs.
Figure 2 shows a first circuit suitable for use as the peak comparator shown as a block in Fig. 1.
Figure 3 shows a circuit suitable for use as the pulse discriminator shown as a block in Fig. 1.
Figure 4 shows a preferred circuit which combines the peak comparator and pulse discriminator in a single circuit.
Figure 5 shows a further preferred circuit which combines the peak comparator and pulse discriminator.
The block diagram of Figure 1 shows schematically a wireless transmitter speech detector system having a two channel audio system with speech and noise inputs, and AGC.
The rectified speech and noise signals are compared to detect the presence of speech.
The AGC system operates in substantially the known manner to control the peak noise level at the output of the noise channel to a predetermined value equal to VREF/√10.
The high-frequency components of noise in the speech channel, which are not removed by the low-frequency noise-cancellation circuit, will generally be of similar amplitude to the output of the noise channel. The foregoing control mode is over-ridden when the level of signal in the speech channel output exceeds VREF, thus suppressing the noise further and enabling an effective signal-to-noise ratio for speech of 10 dB to be maintained in noisy environments.
It has been found that, in a practical system, it is desirable to have the VOX control switched at a speech-to-noise ratio of approximately 6 dB, based on peak values. In previously-known speech detectors it has been usual, therefore, to detect, by means of a comparator amplifier, the instances when a signal in the speech channel exceeds a predetermined value of 2VREF/√10.
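The decibel figures quoted above can be related to peak-amplitude ratios with a short calculation. This is a minimal sketch, assuming the figures are amplitude (voltage) ratios expressed as 20·log10 of the ratio, which is the usual convention for peak values; the helper names are illustrative only. On that assumption a level 10 dB below a reference corresponds to dividing by about 3.16 (the square root of 10), and a 6 dB margin corresponds to roughly doubling the amplitude.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert decibels to an amplitude ratio (assumes 20*log10 convention)."""
    return 10.0 ** (db / 20.0)

def amplitude_ratio_to_db(ratio):
    """Convert an amplitude ratio to decibels (assumes 20*log10 convention)."""
    return 20.0 * math.log10(ratio)

if __name__ == "__main__":
    print("10 dB as an amplitude ratio:", db_to_amplitude_ratio(10.0))
    print("ratio of 2 in dB:", amplitude_ratio_to_db(2.0))
```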
In the present invention, the peak value of the rectified noise-channel signal is itself used, after amplification, as the comparator reference as shown in Fig. 2.
Figure 2 shows a peak comparator circuit for use in the arrangement shown in Figure 1. Rectified speech signal Is representative of speech superimposed on ambient noise (a first signal) and rectified noise signal nIN representative of ambient noise (a second signal) are fed to the inputs as shown after amplification by amplifiers shown in Figure 1, where n is the ratio of the amplification of the noise channel to the amplification of the speech channel.
The current signals Is and nIN respectively generate voltages Vs and VN across the input resistors R1 and R2.
Voltage VN generated by signal nIN appears at the base of transistor Q1 which has its collector connected to a positive power supply Vcc. The emitter of transistor Q1 is connected to an R-C combination of resistor R4 and capacitor C2. Transistor Q1, capacitor C2 and resistor R4 comprise peak detecting means. The transistor Q1 operates as a voltage follower when VN is greater than the voltage on capacitor C2 and is switched off when VN is less than the voltage on capacitor C2. Current to charge capacitor C2 is drawn from supply Vcc. The charging current is therefore independent of the current drawn by resistor R2, thereby avoiding response lag. Resistor R4 is in parallel with capacitor C2 and the decay time constant R4C2 is long enough for acceptable smoothing of random noise.
Voltage Vs generated by signal Is charges capacitor C1 through diode D1 only when the voltage stored across capacitor C1 is less than Vs, that is, when a positive pulse or peak occurs in the speech signal Is. The time constant of resistor R1 and capacitor C1 is short enough to allow C1 to be charged by peaks in Is corresponding to speech "glottal" pulses, but long enough to filter out transient noise. Resistor R3 is in parallel with capacitor C1 and its value sets the decay time of the charge on capacitor C1. The decay time constant of resistor R3 and capacitor C1 is made equal to the time constant of resistor R4 and capacitor C2 to provide good dynamic tracking.
The voltages appearing across capacitors C1 and C2 are fed to the inputs of a differential amplifier OP1 which produces a positive output when the voltage across C1 exceeds the voltage across C2 and a negative output in the reverse situation. That is, the voltage across capacitor C2 constitutes a speech signal threshold and the output of amplifier OP1 comprises a third signal which provides an indication of when the speech signal exceeds the speech signal threshold.
The rectified noise signal amplification in Figure 1 is usually twice the rectified speech signal amplification so that a positive peak in the speech signal of at least twice the level of ambient noise is required to produce a positive output from differential amplifier OP1.
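The behaviour of the peak comparator described above can be sketched in software. The following is a simplified model under stated assumptions, not the circuit itself: both peak detectors are treated as ideal (instant attack, exponential decay with a shared time constant), so the R1-C1 attack filtering and the diode and transistor voltage drops are ignored. The sample interval, the 20 ms decay time constant and the function names are hypothetical; only the noise-channel gain ratio n = 2 is taken from the text.

```python
import math

def peak_track(samples, dt_s, decay_tau_s):
    """Ideal peak detector: follows rises at once, otherwise decays exponentially."""
    decay = math.exp(-dt_s / decay_tau_s)
    held = 0.0
    out = []
    for x in samples:
        held = max(x, held * decay)
        out.append(held)
    return out

def peak_comparator(speech, noise, n=2.0, dt_s=1e-4, decay_tau_s=0.02):
    """Return True wherever the held speech peak exceeds the noise-derived threshold.

    speech, noise: rectified sample sequences of equal length.
    n: ratio of noise-channel to speech-channel amplification (2 in the text).
    dt_s, decay_tau_s: assumed sample interval and decay time constant.
    """
    speech_peak = peak_track(speech, dt_s, decay_tau_s)
    threshold = peak_track([n * x for x in noise], dt_s, decay_tau_s)
    return [s > th for s, th in zip(speech_peak, threshold)]

if __name__ == "__main__":
    noise = [1.0] * 100
    speech = [1.0] * 50 + [2.5] * 50   # speech peak rises above twice the noise level
    out = peak_comparator(speech, noise)
    print("first True at sample", out.index(True))
```

Because the threshold is derived from the noise channel rather than a fixed reference, scaling both inputs by the same factor leaves the comparator output unchanged in this model.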
Fig. 3 shows one kind of pulse discriminator circuit, the principal purpose of which is to generate an output VOX control pulse only if a series of speech pulses is received but not when a single pulse or short burst of pulses is received. This is achieved by discriminating between glottal pulses comprising speech, which typically occur at 6-8 ms intervals, and single pulses or short bursts of pulses separated by less than 3 ms generated by noise. In this way the system provides immunity to transient noise such as that caused by an impact, which is typically too fast for AGC response to be effective.
In the system shown, a bistable latch is used to generate a VOX control or "speech detected" signal. When a trigger pulse is received from the peak comparator, a timer comprising a double pulse generator generates two control pulses, A and B, with a first selected period T1 and a second selected period T2 respectively. Control pulse A and the trigger pulse from the peak comparator are fed to an AND gate. The output of the AND gate is connected to the SET input of the bistable latch. A delay is included in the trigger pulse connection to prevent a trigger pulse reaching the AND gate before control pulse A. In this way control pulse A inhibits the setting of the bistable latch by either the initial trigger pulse or any subsequent pulse occurring within the period T1, typically 5 ms. However, a trigger pulse occurring within the period T2 is able to set the bistable latch as well as re-starting the timing period T2 of control pulse B. The latter function ensures that the output VOX control pulse has at least a period equal to the initial or minimum value of T2, 10 ms for example. This period is determined by the requirements of any ensuing transmitter circuit. At the end of the period T2 after the last trigger pulse, the bistable latch is reset and the control pulse A trigger is enabled.
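The timing logic of the pulse discriminator can be sketched as a small state machine. This is one reading of the description under explicit assumptions, not a definitive model of Fig. 3: T1 is measured from the trigger that starts the timer, every trigger received while the timer is running restarts T2, and a trigger arriving after T2 has expired is treated as a new initial trigger. The function name and the return format are hypothetical; the default periods of 5 ms and 10 ms are the example values given in the text.

```python
def discriminate(trigger_times_ms, t1_ms=5.0, t2_ms=10.0):
    """Return the trigger times at which the latch SET input would be asserted.

    Assumptions (see lead-in): the first trigger starts control pulses A and B
    and is itself inhibited; later triggers are inhibited while A is active
    (less than t1_ms after the starting trigger); each trigger restarts the
    T2 window; once T2 expires the latch is reset and the timer is idle again.
    """
    set_times = []
    start = None      # time of the trigger that started control pulse A
    deadline = None   # end of the current T2 window (control pulse B)
    for t in sorted(trigger_times_ms):
        if start is None or t > deadline:
            # Timer idle or T2 expired: this trigger only starts the timer.
            start = t
        elif t - start >= t1_ms:
            # Control pulse A has ended and B is still running: set the latch.
            set_times.append(t)
        deadline = t + t2_ms
    return set_times

if __name__ == "__main__":
    print(discriminate([0, 7, 14, 21]))  # pulses at a glottal-like 7 ms spacing
    print(discriminate([0, 2, 4]))       # short burst of closely spaced pulses
```

With these assumptions, pulses spaced at 7 ms set the latch from the second pulse onward, whereas a single pulse or a three-pulse burst at 2 ms spacing never sets it.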
Fig. 4 shows a system which combines the peak comparator and pulse-discriminator functions in one circuit, resulting in a substantial saving in components which can be important in the envisaged applications.
The transistors Q1 and Q2 act as peak detecting means for the noise and speech rectified inputs respectively, and because transistors Q1 and Q2 share a common connection to ground via the parallel combination of resistor R4 and a storage element comprising capacitor C2, the collector-emitter current of either transistor on input signal peaks is dependent on the previous peak value to which capacitor C2 has been charged.
By suitable choice of the decay time of capacitor C2 via resistor R4, the transistor pair Q1, Q2 can therefore also act to suppress second and subsequent peaks separated by less than the normal "glottal" period.
This is achieved because the total charge flowing into the collector of transistor Q2 during a detected speech pulse input Is is dependent on the instantaneous charge of capacitor C2, which is arranged to decay via resistor R4. It follows that, if a rapid burst of pulses of similar amplitude occurs in Is, capacitor C2 will charge up rapidly and only one or two large pulses of charge will flow into the collector of transistor Q2, followed by small charge pulses which are sufficient to keep capacitor C2 charged. In contrast, speech signals typically have large impulses, or glottal pulses, spaced apart at 6-8 ms intervals with smaller amplitude pulses in between. By choice of decay time, the charge on capacitor C2 decays enough for each glottal pulse to produce a large current impulse in the collector lead of transistor Q2. It is therefore quite a simple matter to discriminate between speech and either continuous or impact noise by integrating the collector charge impulses and comparing this voltage with a suitable reference value using a Schmitt trigger circuit.
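The dependence of the collector charge impulses on the stored charge of capacitor C2 can be illustrated with a simplified model. This is a sketch under stated assumptions rather than a circuit simulation: each impulse is taken to be proportional to the amount by which an input peak exceeds the voltage currently held on C2, the held voltage decays exponentially between pulses, and the contribution of the noise input via Q1 is ignored. The function name and the 5 ms decay time constant are hypothetical.

```python
import math

def collector_impulses(pulses, decay_tau_ms=5.0):
    """Return the relative size of the charge impulse produced by each pulse.

    pulses: list of (time_ms, amplitude) in time order.
    decay_tau_ms: assumed R4-C2 decay time constant.
    Assumed model: impulse = max(0, amplitude - held voltage on C2).
    """
    held = 0.0
    last_t = None
    out = []
    for t, amp in pulses:
        if last_t is not None:
            held *= math.exp(-(t - last_t) / decay_tau_ms)
        out.append(max(0.0, amp - held))
        held = max(held, amp)
        last_t = t
    return out

if __name__ == "__main__":
    burst = collector_impulses([(0, 1.0), (1, 1.0), (2, 1.0)])      # 1 ms spacing
    glottal = collector_impulses([(0, 1.0), (7, 1.0), (14, 1.0)])   # 7 ms spacing
    print("burst:  ", [round(x, 3) for x in burst])
    print("glottal:", [round(x, 3) for x in glottal])
```

In this model a rapid burst yields one large impulse followed by small top-up impulses, while pulses at a glottal-like spacing each yield a large impulse, which matches the qualitative behaviour described above.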
Integration is achieved in the circuit of Figure 4 by means of the R-C combination comprising resistor R5 and capacitor C3. Charge impulses from the collector lead of transistor Q2 result in a voltage appearing across capacitor C3 with respect to constant voltage supply Vcc. The reference voltage is chosen so that it is exceeded by the voltage across C3 when three consecutive pulses occur in the collector lead of transistor Q2 without substantial decay of the charge on capacitor C3 between pulses. This is achieved by suitable choice of resistor R5 so that "staircase"
integration of current pulses is obtained for pulses occurring within a selected period.
The Schmitt trigger is preferred to a comparator, for example, to ensure clean switching and an output pulse width in excess of some arbitrary value, dependent on the needs of ensuing circuitry. This is related partly to the decay time-constant of capacitor C3 which is controlled by resistor R5. The decay time-constant is principally determined by the need for recovery between typical impact-noise occurrences, but must be long enough for effective "staircase" integration of glottal pulses. A resistor R6 is included in the collector lead of transistor Q2 to limit the amplitude of charge impulses in the event of the AGC system being overloaded momentarily by a large impact-noise signal, thus ensuring that effective discrimination is maintained.
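The "staircase" integration and Schmitt trigger action can be sketched as a leaky integrator followed by a comparator with hysteresis. This is a minimal illustration under stated assumptions: the voltage on C3 is modelled as decaying exponentially between impulses and stepping up by the impulse size, and the trigger state is evaluated only at impulse times. The function name, the 30 ms decay time constant and the two threshold values are hypothetical and were chosen so that three consecutive impulses are needed, as the text describes.

```python
import math

def schmitt_integrator(impulses, tau_ms=30.0, v_on=1.6, v_off=0.8):
    """Return (time_ms, state) after each impulse for a leaky integrator
    followed by a Schmitt trigger.

    impulses: list of (time_ms, size) in time order.
    tau_ms: assumed R5-C3 decay time constant.
    v_on, v_off: assumed upper and lower Schmitt thresholds.
    """
    v = 0.0
    last_t = None
    state = False
    trace = []
    for t, size in impulses:
        if last_t is not None:
            v *= math.exp(-(t - last_t) / tau_ms)
        # Hysteresis: release only once the voltage has fallen below v_off.
        if state and v < v_off:
            state = False
        v += size
        if not state and v >= v_on:
            state = True
        trace.append((t, state))
        last_t = t
    return trace

if __name__ == "__main__":
    speech_like = [(0, 0.75), (7, 0.75), (14, 0.75)]
    burst_like = [(0, 1.0), (1, 0.18), (2, 0.18)]
    print(schmitt_integrator(speech_like))
    print(schmitt_integrator(burst_like))
```

With these illustrative values, three large impulses at 7 ms spacing raise the integrated voltage past the upper threshold on the third impulse, whereas one large impulse followed by small ones does not.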
As will be apparent to those skilled in the art, the amplification of detected noise inputs IN by a greater factor n than speech input Is further enhances the discrimination of the circuit by, firstly, generating a comparator threshold proportional to the ambient noise level and thereby reducing the frequency of Q2 collector impulses resulting from random noise and, secondly, because of the previously-discussed properties of the two-microphone system, permitting operator speech to be distinguished from more distant "ambient" speech, the effectiveness of the approach being improved because of the reduced reliance on AGC control accuracy.
Figure 5 shows a further circuit combining the peak comparator and pulse discriminator functions. The circuit operates in substantially the same manner as the circuit of Figure 4; however, the transistors Q4 and Q5 act as a current mirror in a known manner to produce a current flowing through the collector of transistor Q5 proportional to the collector current of transistor Q2. This allows the R-C combination of resistor R5 and capacitor C3 to be connected to ground, thereby eliminating the need for maintaining the voltage supply Vcc in Figure 4 constant.
It will be apparent that resistor R5 and capacitor C3 effectively integrate charge impulses flowing from transistor Q5 collector in substantially the same manner as described above to control the voltage at the input of Schmitt trigger Sl which generates a VOX control or "speech detected" signal when a predetermined threshold voltage is reached.
The amplitude of charge impulses from the collector of Q5 is limited by the shunting action of Q3 in response to the voltage developed across R7 by the current flowing into Q2 collector.
As will be apparent to those skilled in the art, the invention hereof may be embodied in other circuits which function in an equivalent or analogous manner and such embodiments are within the scope hereof.
There are a number of applications for speech detector systems. For example, wireless transmitters, especially of the portable type, are commonly required to have a VOX
facility, that is, automatic keying of thy transmitter in response to speech signals.
In previously-known VOX systems which utilise a single microphone channel, it has been difficult to combine effective recognition of the operator's speech with adequate rejection of unwanted background noise or speech. Various techniques have been employed to improve the voice discrimination, including band-separation filtering and the use of noise-cancelling microphones coupled with automatic gain control, abbreviated as "AGC". However, where a wide amplitude-dynamic-range is achieved by the use of AGC it is difficult to avoid false-triggering of the VOX circuit. In addition, the AGC system tends to amplify background noise signals and circuit noise to an objectionable level during periods when the speech is interrupted. This is especially evident when a transmitter is switched to a manual-keying or ' press-to-talk mode, sometimes abbreviated as "PTT".
i A system which has been used successfully to overcome the foregoing disadvantages of VOX and AGC systems compriaea f ~L2~:~5~3 a two-channel dual-~icrophone arrangement in which one microphone receives the operator's speech, superimposed on ambient noise, and the other principally receives ambient noise. jot only is noise cancellation possible, over at least the lower part of the frequency range, but the ambient noise signal may be used to control the speech channel gain in the absence of speech and, by this method, prevent the noise signal at the speech channel output from rising above, for example, a level lOdB below the nominal speech signal l output level. A significant advantage of this system when combined with a speech detector is that, at least in the steady state, there is a well-defined difference in speech and noise levels which is easily discriminated in a simple comparator circuit.
Some constraints and disadvantages of the dual-microphone system are:
A time delay must be included in the speech detector response to take account oi the AGC attack time which can be, for example, up to 5 ms.
The two microphones must be closely matched in sensitivity and frequency response.
Direct noise cancellation is usually only possible at frequencies up to about 500 Hz because of the transit time difference for a sound pressure wave travelling to each of the two microphones, this varying with the direction of the sound source.
There is a variation of the automatic-gain-controlled ouput level of either channel with input signal level, ~IL2;~5~3 depending on the loop gain of the AGC loop.
There is an uncertainty in the output reference level of the GO system because of production tolerances in components.
There is a need for precise tracking, or equality of gain, in the control elements of the speech and noise channels.
There is an uncertainty in the speech detector reference level owing to production tolerances in components.
A number of the constraints and disadvantages listed above can produce effects which are additive. This can give rise to a substantial error in the effective speech detector threshold. Even with the inclusion in the circuit of a speech detector threshold adjustment means to take account of static mismatch with the preceding microphone and AGC
amplifier system, dynamic uncertainties may still give rise to false-triggering by the speech detector.
An object of the present invention is to provide circuit means which at least ameliorate some of the above disadvantages of the prior art and which in preferred embodiments reduce if not eliminate the effects of variations in AGC levels, mismatch between speech detector and AGC
reference levels, tracking errors of AGC elements and transient signals during the AGC attack time. In principle this is achieved my combining the functions of AGC and speech detection to eliminate mismatch occurring in separate circuits and, further, by dynamically comparing the detected speech level with the detected noise level instead of with a : - 4 -constant threshold.
This invention consists in a speech detector system comprising means for producing a first (speech) signal representative of speech superimposed on ambient noise and a second (noise) signal representative of said ambient noise;
comparator means to determine a speech signal threshold according to peak values of said second signal and to produce a third (pulse) signal which provides an indication of when said first signal exceeds said speech signal threshold and discriminator means responsive to an indication by said third signal to produce a fourth signal indicating detected speech when an indication occurs within a second .selected period which is spaced from a preceeding indication by a first selected period.
For preference the speech signal and noise signal are subjected to automatic gain control and noise cancellation prior to input to the comparison means. For preference also the ratio of the noise cancelled speech signal to the noise signal fed to the comparison means is greater than 1:1.
Other aspects of the invention will be apparent from the description which follows.
An embodiment of the invention will now be described by way of example only with reference to the accompanying drawings wherein:
Figure 1 is a schematic block diagram showing part of a wireless transmitter speech detector system having a two channel audio system with speech and noise inputs.
Figure 2 shows a first circuit suitable for use as the ~7~3 peak comparator shown as a block in Fig. 1.
Figure 3 shows a circuit suitable for use as the pulse discriminator shown as a block in Fig. 1.
Figure 4 shows a preferred circuit which combines the peak comparator and pulse discriminator in a single circuit.
Figure 5 shows a further preferred circuit which combine the peak comparator and pulse discriminator.
The block diagram of Figure 1 shows schematically a wireless transmitter speech detector system having a two channel audio system with speech and noise inputs, and AGC.
The rectified speech and noise signals are compared to detect the presence of speech.
The AGC system operates in substantially the known manner to control the peak noise level at the output of the noise channel, to a predetermined value equal to VREF/ .
The high-frequency components of noise in the speech channel, which are not removed by the low-frequency noise-cancellation circuit, will generally be o similar amplitude to the output of the noise channel. The foregoing control mode is over-ridden when the level of signal in the speech channel output exceeds VREF, thus suppressing the noise further and enabling an effective signal-to-noise ratio for speech of lOdB to be maintained in noisy environments.
It has been found that, in a practical system, it is desirable to have the VOX control switched at a speech-to-noise ratio of approximately 6dB, based on peak values. In previously-known speech detectors it has been usual, therefore, to detect, by means of a comparator amplifier, the 757~3 instances when a signal in the speech channel exceeds a predetermined value ox 2VREF/ ~0.
In the present invention, the peak value of the rectified noise-channel signal is itself used, after amplification, as the comparator reference as shown in Fig. 2.
Figure 2 shows a peak comparator circuit for use in the arrangement shown in Figure 1. Rectified speech signal Is representative of speech superimposed on ambient noise (a first signal) and rectified noise signal nIN representative of ambient noise pa second signal) are fed to the inputs as shown after amplification by amplifiers shown in figure 1, where n is the ratio of the amplification of the noise channel to the amplification of the speech channel.
The current signals Is and nIN respectively generate voltages Vs and VN across the input resistors Rl and R2 .
Voltage VN generated by signal nIN appears at the base of transistor Ql which has its collector connected to a positive power supply V . The emitter of transistor Ql is connected to an R-C combination of resistor R4 and capacitor C2. Transistor Ql, capacitor C2 and resistor R4 comprise peak detecting means. The transistor Ql operates as a voltage follower when VN is greater than the voltage on capacitor C2 and is switched off when VN is less than the voltage on capacitor C2. Current to charge capacitor C2 is drawn from supply Vcc. The charging current is therefore independent of the current drawn by resistor R2 thereby avoiding response lag. Resistor R4 is in parallel with capacitor C2 and the decay time constant ~4C2 is long enough for acceptable smoothing of random noise.
Voltage Vs generated by signal Is charges capacitor Cl through diode Dl only when the voltage stored across capacitor Cl is less than Vs. That is, when a positive pulse or peak occurs in the speech signal Is. The time constant of resistor Rl and Capacitor Cl is short enough to allow Cl to be charged by peaks in Is corresponding to speech "glottal" pulses, but long enough to filter out transient noise. Resistor R3 is in parallel to capacitor Cl and its value sets the decay time of the charge on capacitor Cl. The decay time constant resistor R3 and capacitor Cl is made equal to the time constant o resistor R4 and capacitor C2 to provide good dynamic tracking.
The voltages appearing across capacitors C1 and C2 are fed to the inputs of a differential amplifier OP1 which produces a positive output when the voltage across C1 exceeds the voltage across C2 and a negative output in the reverse situation. That is, the voltage across capacitor C2 constitutes a speech signal threshold and the output of amplifier OP1 comprises a third signal which provides an indication of when the speech signal exceeds the speech signal threshold.
The rectified noise signal amplification in Figure 1 is usually twice the rectified speech signal amplification so that a positive peak in the speech signal of at least twice the level of ambient noise is required to produce a positive output from differential amplifier OP1.
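The behaviour of the Figure 2 peak comparator can be illustrated with a minimal discrete-time sketch. This is not the patented circuit: it assumes ideal peak detectors (instant charging, no diode or base-emitter drops, no R1C1 filtering), and the function name, the sample period `dt` and the decay constant `tau_decay` are hypothetical values chosen only for illustration.

```python
import math


def peak_comparator(speech, noise, n=2.0, dt=1e-4, tau_decay=0.05):
    """Idealised model of the Figure 2 peak comparator (illustrative only).

    speech, noise: equal-length sequences of rectified, non-negative samples.
    n: gain of the noise channel relative to the speech channel.
    dt, tau_decay: sample period and shared decay time constant in seconds
        (R3*C1 == R4*C2 in the text); both values are assumptions.
    Returns a list of booleans, True where the held speech peak (C1)
    exceeds the held, amplified noise peak (C2), i.e. the threshold.
    """
    decay = math.exp(-dt / tau_decay)  # per-sample decay of both capacitors
    v_c1 = 0.0  # voltage held on C1 (speech peaks)
    v_c2 = 0.0  # voltage held on C2 (speech signal threshold)
    indications = []
    for s, x in zip(speech, noise):
        # Each capacitor either follows a new peak or decays through its resistor.
        v_c1 = max(s, v_c1 * decay)
        v_c2 = max(n * x, v_c2 * decay)
        indications.append(v_c1 > v_c2)  # polarity of the OP1 output
    return indications
```

With n = 2 and steady noise of 0.4, the threshold sits at 0.8, so a speech peak of 1.0 gives a positive indication; raising the noise to 0.6 moves the threshold to 1.2 and the same speech peak no longer does.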
Fig. 3 shows one kind of pulse discriminator circuit, the principal purpose of which is to generate an output VOX control pulse only if a series of speech pulses is received but not when a single pulse or short burst of pulses is received. This is achieved by discriminating between glottal pulses comprising speech, which typically occur at 6-8 ms intervals, and single pulses or short bursts of pulses separated by less than 3 ms generated by noise. In this way the system provides immunity to transient noise such as that caused by an impact, which is typically too fast for AGC response to be effective.
In the system shown, a bistable latch is used to generate a VOX control or "speech detected" signal. When a trigger pulse is received from the peak comparator, a timer comprising a double pulse generator generates two control pulses, A and B, with a first selected period T1 and a second selected period T2 respectively. Control pulse A and the trigger pulse from the peak comparator are fed to an AND gate. The output of the AND gate is connected to the SET input of the bistable latch. A delay is included in the trigger pulse connection to prevent a trigger pulse reaching the AND gate before control pulse A. In this way control pulse A inhibits the setting of the bistable latch by either the initial trigger pulse or any subsequent pulse occurring within the period T1, typically 5 ms. However, a trigger pulse occurring within the period T2 is able to set the bistable latch as well as re-starting the timing period T2 of control pulse B. The latter function ensures that the output VOX control pulse has at least a period equal to the initial or minimum value of T2, 10 ms for example. This period is determined by the requirements of any ensuing transmitter circuit. At the end of the period T2 after the last trigger pulse, the bistable latch is reset and control pulse A trigger is enabled.
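The timing logic of the Fig. 3 discriminator can be sketched as a small event-driven model. Several points are assumptions rather than statements from the text: both control pulses are taken to start at the initial trigger, every later trigger is taken to restart T2 (including one that falls inside T1), and the trigger delay is ignored. The function name and the interval-based return format are hypothetical.

```python
def discriminate(trigger_times_ms, t1=5.0, t2=10.0):
    """Illustrative model of the Fig. 3 pulse discriminator.

    trigger_times_ms: times (ms) of trigger pulses from the peak comparator.
    t1, t2: durations (ms) of control pulses A and B.
    Returns a list of (set_time, reset_time) intervals during which the
    bistable latch, and hence the VOX control output, is set.
    """
    intervals = []
    a_end = b_end = None  # None means the double pulse generator is idle
    set_time = None       # time the latch was set, or None while reset
    for t in sorted(trigger_times_ms):
        if b_end is not None and t >= b_end:
            # T2 ran out before this trigger: reset latch, re-enable pulse A.
            if set_time is not None:
                intervals.append((set_time, b_end))
            a_end = b_end = set_time = None
        if b_end is None:
            # Initial trigger: start A and B; A inhibits SET for this pulse.
            a_end, b_end = t + t1, t + t2
        else:
            # A later trigger sets the latch only once pulse A has ended.
            if t >= a_end and set_time is None:
                set_time = t
            b_end = t + t2  # assumed: every trigger restarts T2
    if set_time is not None:
        intervals.append((set_time, b_end))
    return intervals
```

Under these assumptions, triggers at 0, 7, 14 and 21 ms (glottal spacing) set the latch at 7 ms and hold it until 31 ms, whereas a burst at 0, 2 and 4 ms never sets it.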
Fig. 4 shows a system which combines the peak comparator and pulse-discriminator functions in one circuit, resulting in a substantial saving in components which can be important in the envisaged applications.
The transistors Q1 and Q2 act as peak detecting means for the noise and speech rectified inputs respectively, and because transistors Q1 and Q2 share a common connection to ground via the parallel combination of resistor R4 and a storage element comprising capacitor C2, the collector-emitter current of either transistor on input signal peaks is dependent on the previous peak value to which capacitor C2 has been charged.
By suitable choice of the decay time of capacitor C2 via resistor R4, the transistor pair Q1, Q2 can therefore also act to suppress second and subsequent peaks separated by less than the normal "glottal" period.
This is achieved because the total charge flowing into the collector of transistor Q2 during a detected speech pulse input Is is dependent on the instantaneous charge of capacitor C2, which is arranged to decay via resistor R4. It follows that, if a rapid burst of pulses of similar amplitude occurs in Is, capacitor C2 will charge up rapidly and only one or two large pulses of charge will flow into the collector of transistor Q2, followed by small charge pulses which are sufficient to keep capacitor C2 charged. In contrast, speech signals typically have large impulses, or glottal pulses, spaced apart at 6-8 ms intervals with smaller amplitude pulses in between. By choice of decay time, the charge on capacitor C2 decays enough for each glottal pulse to produce a large current impulse in the collector lead of transistor Q2. It is therefore quite a simple matter to discriminate between speech and either continuous or impact noise by integrating the collector charge impulses and comparing this voltage with a suitable reference value using a Schmitt trigger circuit.
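The dependence of the Q2 collector charge on the state of capacitor C2 can be illustrated numerically. The sketch below is a rough idealisation, not the circuit itself: it assumes each charge impulse is proportional to the amount by which a pulse exceeds the voltage held on C2, omits the noise input through Q1, and uses a hypothetical decay constant `tau_ms` standing in for R4C2.

```python
import math


def charge_impulses(pulses, tau_ms=4.0):
    """Illustrative model of Q2 collector charge impulses in Figure 4.

    pulses: list of (time_ms, amplitude) input peaks, in time order.
    tau_ms: assumed decay time constant of C2 through R4, in ms.
    Returns one normalised charge impulse per input pulse.
    """
    v_c2 = 0.0     # voltage currently held on C2
    last_t = None  # time of the previous pulse
    impulses = []
    for t, amplitude in pulses:
        if last_t is not None:
            # C2 discharges through R4 between pulses.
            v_c2 *= math.exp(-(t - last_t) / tau_ms)
        # Only the part of the pulse above the held voltage draws charge.
        impulses.append(max(0.0, amplitude - v_c2))
        v_c2 = max(v_c2, amplitude)  # C2 is topped up to the new peak
        last_t = t
    return impulses
```

With these assumed values, equal pulses 1 ms apart give one full-size impulse followed by impulses of roughly a fifth of that size, while equal pulses 7 ms apart each give an impulse of more than four fifths of full size.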
Integration is achieved in the circuit of Figure 4 by means of the R-C combination comprising resistor R5 and capacitor C3. Charge impulses from the collector lead of transistor Q2 result in a voltage appearing across capacitor C3 with respect to constant voltage supply Vcc. The reference voltage is chosen so that it is exceeded by the voltage across C3 when three consecutive pulses occur in the collector lead of transistor Q2 without substantial decay of the charge on capacitor C3 between pulses. This is achieved by suitable choice of resistor R5 so that "staircase" integration of current pulses is obtained for pulses occurring within a selected period.
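The "staircase" integration can be sketched in the same idealised way. The decay constant `tau_ms` (standing in for R5C3), the reference level `v_ref` and the normalisation in which each impulse adds its own value to the voltage on C3 are all assumptions chosen so that three closely spaced unit impulses are needed; the Schmitt trigger hysteresis is not modelled.

```python
import math


def staircase_detect(impulses, tau_ms=20.0, v_ref=2.0):
    """Illustrative model of the R5-C3 integrator and reference comparison.

    impulses: list of (time_ms, charge) collector impulses, in time order.
    tau_ms: assumed decay time constant of C3 through R5, in ms.
    v_ref: assumed reference voltage in the same normalised units.
    Returns the time at which the voltage on C3 first reaches v_ref,
    or None if it never does.
    """
    v_c3 = 0.0
    last_t = None
    for t, charge in impulses:
        if last_t is not None:
            # C3 leaks through R5 between impulses.
            v_c3 *= math.exp(-(t - last_t) / tau_ms)
        v_c3 += charge  # each impulse adds one step of the staircase
        last_t = t
        if v_c3 >= v_ref:
            return t
    return None
```

With these assumed values, unit impulses at 0, 7 and 14 ms cross the reference on the third impulse, two such impulses are not enough, and impulses 50 ms apart decay too far between steps to ever reach it.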
The Schmitt trigger is preferred to a comparator, for example, to ensure clean switching and an output pulse width in excess of some arbitrary value, dependent on the needs of ensuing circuitry. This is related partly to the decay time-constant of capacitor C3 which is controlled by resistor R5. The decay time-constant is principally determined by the need for recovery between typical impact-noise occurrences, but must be long enough for effective "staircase" integration of glottal pulses. A resistor R6 is included in the collector lead of transistor Q2 to limit the amplitude of charge impulses in the event of the AGC system being overloaded momentarily by a large impact-noise signal, thus ensuring that effective discrimination is maintained.
As will be apparent to those skilled in the art, the amplification of detected noise inputs IN by a greater factor n than speech input Is further enhances the discrimination of the circuit by, firstly, generating a comparator threshold proportional to the ambient noise level and thereby reducing the frequency of Q2 collector impulses resulting from random noise and, secondly, because of the previously-discussed properties of the two-microphone system, permitting operator speech to be distinguished from more distant "ambient" speech, the effectiveness of the approach being improved because of the reduced reliance on AGC control accuracy.
Figure 5 shows a further circuit combining the peak comparator and pulse discriminator functions. The circuit operates in substantially the same manner as the circuit of Figure 4; however, the transistors Q4 and Q5 act as a current mirror in a known manner to produce a current flowing through the collector of transistor Q5 proportional to the collector current of transistor Q2. This allows the R-C combination of resistor R5 and capacitor C3 to be connected to ground, thereby eliminating the need, present in Figure 4, to keep the voltage supply Vcc constant.
It will be apparent that resistor R5 and capacitor C3 effectively integrate charge impulses flowing from transistor Q5 collector in substantially the same manner as described above to control the voltage at the input of Schmitt trigger Sl which generates a VOX control or "speech detected" signal when a predetermined threshold voltage is reached.
The amplitude of charge impulses from the collector of Q5 is limited by the shunting action of Q3 in response to the voltage developed across R7 by the current flowing into Q2 collector.
As will be apparent to those skilled in the art, the invention hereof may be embodied in other circuits which function in an equivalent or analogous manner and such embodiments are within the scope hereof.
Claims (15)
1. A speech detector system comprising means for producing a first (speech) signal representative of speech superimposed on ambient noise and a second (noise) signal representative of said ambient noise;
comparator means to determine a speech signal threshold according to peak values of said second signal and to produce a third (pulse) signal which provides an indication of when said first signal exceeds said speech signal threshold; and discriminator means responsive to an indication by said third signal to produce a fourth signal indicating detected speech when an indication occurs within a second selected period which is spaced from a preceding indication by a first selected period.
2. A speech detector system as claimed in claim 1 wherein said speech signal threshold is determined from said second signal by peak detecting means including a storage element.
3. A speech detector system as claimed in claim 2 wherein said first signal is compared with said speech signal threshold by means of a differential amplifier, the output of the differential amplifier producing said third signal.
4. A speech detector system as claimed in claim 2 wherein said comparator means comprises a three terminal amplifier to generate said third signal, said amplifier having the terminal common to input and output connected to the storage element of said peak detector means.
5. A speech detector system as claimed in claim 4 wherein current flow through the output terminal of said three terminal amplifier comprises said third signal.
6. A speech detector system as claimed in claim 5 wherein said 3-terminal amplifier is a transistor and said terminal common to input and output is the emitter.
7. A speech detector system as claimed in claim 4 or claim 5 further comprising means to limit said third signal amplitude.
8. A speech detector system as claimed in claim 6 further comprising a current mirror to produce a mirrored current flow proportional to said third signal current flow and wherein said discriminator means is responsive to the mirrored current flow.
9. A speech detector as claimed in claim 8 further comprising transistor current shunting means to limit said mirrored current flow.
10. A speech detector system as claimed in claim 1, wherein said discriminator means comprises a control pulse generator which in response to an indication in said third signal generates a first control pulse having a duration equal to said first selected period and a second control pulse having a duration equal to said second selected period, said first control pulse acting to inhibit the production of a fourth signal during the duration thereof and said second control pulse acting to permit production of a fourth signal only in response to pulse indications in said third signal occurring during the duration of said second control pulse.
11. A speech detector system as claimed in claim 10 wherein a bistable latch generates said fourth signal and said first and second control pulses act to control operation of the bistable latch by said third signal.
12. A speech detector system as claimed in claim 1, wherein said discriminator means comprises an integrator the output of which controls production of said fourth signal, said second selected period being determined by the integration constant.
13. A speech detector system as claimed in claim 12 wherein the output of the integrator is compared with a reference voltage to produce said fourth signal.
14. A speech detector system as claimed in claim 12, wherein the integrator comprises a capacitor and the integration constant is determined by the decay time of the capacitor through a parallel connected resistance.
15. A speech detector system as claimed in any of claims 1 to 3 wherein said first signal and said second signal are amplified and the second signal is amplified by a greater amount than the first signal.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AUPG542284 | 1984-06-08 | ||
AUPG5422 | 1984-06-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
CA1227573A true CA1227573A (en) | 1987-09-29 |
Family
ID=3770635
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA000483355A Expired CA1227573A (en) | 1984-06-08 | 1985-06-06 | Adaptive speech detector system |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP0186671A4 (en) |
JP (1) | JPS61502368A (en) |
AU (1) | AU584904B2 (en) |
CA (1) | CA1227573A (en) |
NZ (1) | NZ212331A (en) |
WO (1) | WO1986000133A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3742929C1 (en) * | 1987-12-18 | 1988-09-29 | Daimler Benz Ag | Method for improving the reliability of voice controls of functional elements and device for carrying it out |
GB2243274A (en) * | 1990-02-20 | 1991-10-23 | Switchtoll Limited | Subtracting ambient noise from total noise during recording or broadcasting |
US6480823B1 (en) * | 1998-03-24 | 2002-11-12 | Matsushita Electric Industrial Co., Ltd. | Speech detection for noisy conditions |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3814856A (en) * | 1973-02-22 | 1974-06-04 | D Dugan | Control apparatus for sound reinforcement systems |
US3962553A (en) * | 1973-03-29 | 1976-06-08 | Motorola, Inc. | Portable telephone system having a battery saver feature |
FR2305909A1 (en) * | 1975-03-28 | 1976-10-22 | Dassault Electronique | Microphones and associated equipment - include one unshielded microphone and one masked microphone, and electronics system to minimise noise background |
GB1516100A (en) * | 1975-12-17 | 1978-06-28 | Secr Defence | Audio signal processing apparatus |
DE2731971B2 (en) * | 1977-07-15 | 1980-05-14 | Dieter 4300 Essen Eller | Method and device for controlling or regulating a useful sound source |
CA1116300A (en) * | 1977-12-28 | 1982-01-12 | Hiroaki Sakoe | Speech recognition system |
US4215241A (en) * | 1978-10-16 | 1980-07-29 | Frank L. Eppenger | Sound operated control device |
DE2849938A1 (en) * | 1978-11-17 | 1980-05-29 | Kiepe Bahn Elektrik Gmbh | Individual monitoring security alarm system - requires individual to respond to and interact with, pre-alarm and main alarm cycles |
DE2931604C2 (en) * | 1979-08-03 | 1982-04-29 | Siemens AG, 1000 Berlin und 8000 München | Noise-compensated microphone circuit |
US4484344A (en) * | 1982-03-01 | 1984-11-20 | Rockwell International Corporation | Voice operated switch |
JPS58156236A (en) * | 1982-03-12 | 1983-09-17 | Matsushita Electric Ind Co Ltd | Vox circuit |
EP0156826B1 (en) * | 1983-09-14 | 1988-11-09 | Peiker, Andreas | Telephone transmission installation |
-
1985
- 1985-06-06 WO PCT/AU1985/000121 patent/WO1986000133A1/en not_active Application Discontinuation
- 1985-06-06 JP JP50255785A patent/JPS61502368A/en active Pending
- 1985-06-06 CA CA000483355A patent/CA1227573A/en not_active Expired
- 1985-06-06 EP EP19850902424 patent/EP0186671A4/en not_active Withdrawn
- 1985-06-06 AU AU44393/85A patent/AU584904B2/en not_active Ceased
- 1985-06-07 NZ NZ21233185A patent/NZ212331A/en unknown
Also Published As
Publication number | Publication date |
---|---|
EP0186671A1 (en) | 1986-07-09 |
AU584904B2 (en) | 1989-06-08 |
EP0186671A4 (en) | 1988-11-16 |
WO1986000133A1 (en) | 1986-01-03 |
JPS61502368A (en) | 1986-10-16 |
AU4439385A (en) | 1986-01-10 |
NZ212331A (en) | 1988-04-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US4625083A (en) | Voice operated switch | |
US5170434A (en) | Hearing aid with improved noise discrimination | |
EP0066110B1 (en) | Noise removing apparatus in an fm receiver | |
US4975657A (en) | Speech detector for automatic level control systems | |
US4352962A (en) | Tone responsive disabling circuit | |
US4514703A (en) | Automatic level control system | |
CA1126825A (en) | Automatic gain control device for a single-sideband receiver | |
CA1227573A (en) | Adaptive speech detector system | |
US6516026B1 (en) | Data transmission equipment | |
US5036540A (en) | Speech operated noise attenuation device | |
US3873925A (en) | Audio frequency squelch system | |
JPS6035835A (en) | Squelch circuit disposition | |
US4694500A (en) | Control voltage generating circuit for activating a noise reduction circuit in an FM stereo receiver | |
US4176286A (en) | Signal translator with squelch | |
EP0516220B1 (en) | Electroacoustic amplifier arrangement and microphone arrangement to be used in the electroacoustic amplifier arrangement | |
US2543807A (en) | Voice operated relay | |
US5297213A (en) | System and method for reducing noise | |
US4965835A (en) | Signal sensing circuit | |
US2081422A (en) | Transmission control in signaling systems | |
US4347406A (en) | Pulse detector circuit | |
KR900004793Y1 (en) | Over output preventing & howling preventing circuit of amp | |
SU580660A1 (en) | Device for automatic turning-on of transmitting portion of electro-acoustic channel by voice | |
GB2076261A (en) | Speech direction detection circuits | |
JPS5824493Y2 (en) | automatic volume adjustment device | |
JPS61222329A (en) | Optical input interruption detecting circuit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MKEX | Expiry |