GB2113054A - Speech detector - Google Patents

Speech detector

Info

Publication number
GB2113054A
Authority
GB
United Kingdom
Prior art keywords
speech
channel switch
delay
time
time delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB8138393A
Inventor
Louis David Thomas
Vivian Jones Phillips
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
VOICE MICROSYSTEMS Ltd
Original Assignee
VOICE MICROSYSTEMS Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by VOICE MICROSYSTEMS Ltd
Priority to GB8138393A
Publication of GB2113054A
Legal status: Withdrawn

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B15/00 Driving, starting or stopping record carriers of filamentary or web form; Driving both such record carriers and heads; Guiding such record carriers or containers therefor; Control thereof; Control of operating function
    • G11B15/02 Control of operating function, e.g. switching from recording to reproducing
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/22 Means responsive to presence or absence of recorded information signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04J MULTIPLEX COMMUNICATION
    • H04J3/00 Time-division multiplex systems
    • H04J3/17 Time-division multiplex systems in which the transmission channel allotted to a first user may be taken away and re-allotted to a second user if the first user becomes inactive, e.g. TASI
    • H04J3/175 Speech activity or inactivity detectors
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 Record carriers by type
    • G11B2220/90 Tape-like record carriers

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

In a system incorporating an input line 10, a channel switch 12, and a speech detector 14 for closing the channel switch 12 in the presence of speech, and in order to allow the speech detector sufficient time to determine the presence of speech without the commencement of an utterance being clipped, a delay line 40 is arranged between the input line and the channel switch. The time delay of the delay line 40 is controlled such that it is long in the period of silence between utterances and decreases gradually upon commencement of an utterance and after an initial hold-off period so as to minimize any echo effect at the end of an utterance.

Description

SPECIFICATION
Speech detector
The present invention relates to a speech detector, that is to say, a circuit which detects the presence of sound or speech on a channel. Such circuits are used, for example, in tape recorders to switch off the tape recorder automatically when no sound is being recorded. The circuits also find use in telephony systems where several lines time-share a single channel over which signals are transmitted only when there is speech on the individual line.
A conventional form of speech detector and channel switch is shown in Figure 1 of the accompanying drawings. The input line 10 is connected to a channel switch 12 and additionally to a speech detector which is generally designated 14. The speech detector 14 consists of a full wave rectifier 16 followed by a low-pass filter 18.
Whenever sound is present on the line 10, a positive voltage is developed at the output of the rectifier 16, which voltage is smoothed by the low-pass filter 18 to produce a DC voltage indicative of the sound level. This DC voltage is compared in a comparator 22 with a reference voltage received over line 24, and when the loudness exceeds the threshold set by the reference voltage, the comparator 22 produces an output signal to close the channel switch 12.
The low-pass filter 18 has a slow rise time and the channel switch will therefore not close until some time after the sound has begun. This results in the beginning of bursts of sound being clipped.
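The trade-off between filter time constant and onset clipping can be illustrated with a minimal discrete-time sketch of the conventional detector. This is not the circuit of Figure 1 itself: the one-pole filter, the 8 kHz sample rate, the reference value of 0.3 and the constant-amplitude test burst are all assumptions chosen for illustration.

```python
import math

def detect_speech(samples, sample_rate, time_constant_s, reference):
    """Return one boolean per sample: True while the smoothed level exceeds the reference."""
    # One-pole low-pass coefficient for the chosen time constant (assumed filter shape).
    alpha = 1.0 - math.exp(-1.0 / (sample_rate * time_constant_s))
    level = 0.0
    decisions = []
    for x in samples:
        rectified = abs(x)                     # role of the full wave rectifier 16
        level += alpha * (rectified - level)   # role of the low-pass filter 18
        decisions.append(level > reference)    # role of the comparator 22
    return decisions

def switch_on_delay_ms(time_constant_s, sample_rate=8000, reference=0.3):
    """Milliseconds between the start of a steady burst and the switch closing."""
    burst = [1.0] * sample_rate  # one second of unit-amplitude sound (hypothetical input)
    decisions = detect_speech(burst, sample_rate, time_constant_s, reference)
    return 1000.0 * decisions.index(True) / sample_rate

# A longer time constant closes the switch later, so more of the onset is clipped.
short_tc = switch_on_delay_ms(0.005)
long_tc = switch_on_delay_ms(0.020)
```

In this sketch the sound that arrives before the first `True` decision is what would be lost at the channel switch.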
To reduce this clipping, it is known to incorporate a differentiator 21 and a full wave rectifier 23 at the output of the low-pass filter 18 and to add the output of the full wave rectifier 23 to the output of the low-pass filter 18 in an adder 20 to produce the voltage for comparison with the reference voltage. The effect of the differentiator, the full wave rectifier and the adder is to make the value of the voltage fed to the comparator dependent not only upon the level of the sound but also on the rate of change of level, thereby improving the response. The comparator 22 drives the channel switch 12 by way of a hangover timer 26 which incorporates a retriggerable monostable multivibrator having a period of approximately 200 milliseconds. The effect of the hangover timer is to maintain the channel switch in a closed position during brief pauses in a burst of sound.
The hangover timer increases the sensitivity of the system to noise, since a short noise burst could cause the channel switch to close for a period of 200 milliseconds. In order to avoid the release of the hangover timer for this period by noise peaks that might last up to 50 milliseconds, a deferred hangover timer 28 is used. The deferred hangover timer ensures that noise peaks that last less than 50 milliseconds will not release the hangover timer, while those which last more than 50 milliseconds are indistinguishable from speech and do release the hangover timer.
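The interaction of the hangover timer and the deferred hangover timer can be sketched as a small state machine. The sketch assumes that the switch is closed whenever the comparator output is high or the hangover timer is running, and that only activity lasting at least 50 milliseconds triggers the 200 millisecond hangover. The function name, the 1 millisecond step and this particular combination logic are illustrative assumptions, not details taken from the specification.

```python
def apply_hangover(comparator, step_ms=1, hangover_ms=200, defer_ms=50):
    """Turn raw comparator decisions into channel-switch states.

    comparator: list of booleans, one per step_ms of real time.
    """
    switch = []
    active_ms = 0     # length of the current run of comparator-high steps
    hang_left_ms = 0  # time remaining on the hangover timer
    for high in comparator:
        if high:
            active_ms += step_ms
            # Deferred hangover: only activity that has lasted at least
            # defer_ms (re)triggers the hangover timer.
            if active_ms >= defer_ms:
                hang_left_ms = hangover_ms
            switch.append(True)
        else:
            active_ms = 0
            switch.append(hang_left_ms > 0)
            hang_left_ms = max(0, hang_left_ms - step_ms)
    return switch

# A 20 ms noise peak holds the switch closed only for its own duration,
# while a 60 ms burst also earns the full 200 ms hangover.
noise_closed_ms = sum(apply_hangover([True] * 20 + [False] * 300))
speech_closed_ms = sum(apply_hangover([True] * 60 + [False] * 300))
```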
Reliable speech detection for a range of talker loudness will occur when the reference is set below the average level of the softest talker. A safety margin of about 5 dB must be included to allow for ripple in the average level.
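As a worked example of the margin, the reference can be derived from the softest talker's average level. The sketch assumes that the level is an amplitude-like quantity (the smoothed rectified voltage), so that decibels convert with a factor of 20; the function name and the unit level used in the example are hypothetical.

```python
def reference_level(softest_talker_level, margin_db=5.0):
    """Reference set margin_db below the softest talker's average level.

    Assumes an amplitude-like level, so dB = 20 * log10(ratio).
    """
    return softest_talker_level * 10.0 ** (-margin_db / 20.0)

# For a softest-talker level of 1.0 (arbitrary units) the reference
# comes out a little above half that level.
ref = reference_level(1.0)
```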
The desired rapid response and high noise immunity present conflicting requirements on the value of the time constant of the smoothing circuit constituted by the low-pass filter 18. A short time constant gives a rapid response and reduces the possibility of clipping of the commencement of a speech sound. However, it also increases the ripple of the mean modulus and the probability of false operation caused by noise spikes. On the other hand, a long time constant, while providing better noise immunity, increases the possibility of clipping of the commencement of a speech sound.
Consequently, in the design of a speech detector as set out in Figure 1, the value of the time constant of the low-pass filter has always been a compromise taking into consideration the conflicting requirements. It has been found in practice that the performance is such that a 5 dB reduction in loudness from the optimum, that is to say the maximum input, begins to cause noticeable clipping of the commencement of a speech sound. The present invention seeks to overcome these limitations and to improve the performance of the speech detector.
In accordance with the present invention, there is provided a system comprising an input line connected by way of a channel switch to an output line and a speech detector for closing the channel switch in the presence of speech on the input line, wherein a time delay element is included in the line connecting the input line to the channel switch for delaying signals on the speech channel for a time at least equal to the response time of the speech detector.
The insertion of the delay element has the advantage of allowing the filter time constant to be increased thereby increasing the noise immunity and also the ripple safety margin while still allowing complete capture of the commencement of a speech sound. The delay also has the effect of allowing a greater range of talker loudness variation to be accommodated while retaining satisfactory performance.
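The benefit of the delay element can be shown with a short sketch in which the detector responds late but the signal reaching the switch is delayed by the same amount. The sample values, the two-sample detector latency and the function name are assumptions made purely for illustration.

```python
from collections import deque

def gate_with_delay(samples, decisions, delay_samples):
    """Pass samples through a fixed delay, then through the channel switch.

    decisions[i] is the detector's switch state at sample time i (undelayed).
    """
    line = deque([0.0] * delay_samples)  # delay element, initially holding silence
    output = []
    for x, closed in zip(samples, decisions):
        line.append(x)
        delayed = line.popleft()
        output.append(delayed if closed else 0.0)
    return output

# Hypothetical utterance starting at index 2; the detector closes the
# switch two samples late, at index 4, and then holds it closed.
samples = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
decisions = [False] * 4 + [True] * 6

clipped = gate_with_delay(samples, decisions, delay_samples=0)
intact = gate_with_delay(samples, decisions, delay_samples=2)
```

With no delay the first two samples of the utterance are lost; with a delay equal to the detector's response time the whole utterance passes, two samples later.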
It should be mentioned that though the term speech detector is commonly used in the art and the term "speech" has accordingly been used in the present specification, the term is not used only in the sense of a human voice but in the sense of sound having an information content as opposed to background noise.
The inclusion of a delay element in this manner has significant instrumental advantages, but from transmission considerations it is not desirable. In the case of a simplex connection, the delay has the result of reducing the information rate, while in a duplex connection the delay results in enhancement of echo effects. With a view to reducing these disadvantages, in accordance with a preferred feature of the invention, means are provided for reducing the time delay introduced by the time delay element into the speech channel after each closing of the channel switch. Since the effects of echo are most discernible at the end of an utterance, that is to say when the talker is inactive, the reduction of the time delay after speech has been detected renders the echo enhancement inconsequential. The rate at which the delay is reduced affects the frequency or pitch of the transmitted speech, and it is preferred that this rate, expressed as the change in delay per unit of real time, should be less than 0.15. Thus, for example, if the initial delay is 30 milliseconds and the final delay 0.3 milliseconds, the delay may be removed in 198 milliseconds of real time. Experiments have shown that if the delay is removed at this rate it is not noticeable.
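The figures in the example can be checked with a few lines of arithmetic. The rate is computed as the change in delay divided by the real time over which it is removed. The pitch factor is not stated in the specification; it follows from an assumed idealised model in which the output is y(t) = x(t - d(t)), so that a delay shrinking at rate r raises every frequency by a factor of 1 + r.

```python
def delay_reduction_rate(initial_delay_ms, final_delay_ms, removal_time_ms):
    """Change in delay per unit of real time (dimensionless)."""
    return (initial_delay_ms - final_delay_ms) / removal_time_ms

# Figures from the example: 30 ms reduced to 0.3 ms over 198 ms.
rate = delay_reduction_rate(30.0, 0.3, 198.0)  # approximately 0.15

# Assumed model y(t) = x(t - d(t)): instantaneous frequency scales by 1 + rate.
pitch_factor = 1.0 + rate
```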
It has been found in practice that it is additionally desirable to build a circuit to hold off the delay reduction for 15 milliseconds in order to take into account the multiple transitions which may occur at the speech detector output when an utterance commences.
The invention will now be described further, by way of example, with reference to the accompanying drawings, in which: Figure 1 shows, as previously described, a prior art speech detector and channel switch, Figure 2 shows a speech detector and channel switch of the invention, Figure 3 is a waveform diagram to explain the operation of the circuit shown in Figure 2, and Figure 4 is a block diagram showing in more detail a preferred embodiment of the invention.
Figure 2 employs all the circuit blocks already described in Figure 1 and, to avoid repetition, the blocks have been allocated the same reference numerals and will not be described further. The embodiment of the invention differs from the prior art by the provision of a delay element 40 which is controlled by the input signal to the channel switch 12. The operation of the delay element 40 is to introduce a long delay period which is effective in the periods of silence between utterances and which is reduced gradually after the commencement of an utterance so that the effects of echo are not accentuated at the end of an utterance. Figure 3 shows in the upper graph a rectangular waveform indicating the on and off states of the channel switch, and the lower graph shows, on the same time scale, the delay introduced by the delay element 40. It can be seen that the delay reduces gradually while the switch is closed and rises rapidly at the end of an utterance.
Referring now to Figure 4, which describes the circuit of the invention in detail, it is seen that the output from the hangover timer 26 is fed to an OR gate 42, the output of which controls the channel switch 12. The output of the OR gate is also fed to a hold-off timer 44 which has the effect of holding off the reduction in the delay for approximately 15 milliseconds following the commencement of an utterance in order to take into account the multiple transitions which may occur at the speech detector output when an utterance commences.
The output from the hold-off timer 44 is fed to a ramp generator 46 followed by a low-pass filter 48. The ramp generator integrates the output signal of the hold-off timer to produce a ramp function for controlling the delay of the delay line, while the low-pass filter 48 modifies the ramp slope such that it commences slowly and then increases more rapidly so as to cause the variation of the delay with time to be non-linear. The output of the low-pass filter 48 is fed to the delay element 40, which can be seen from Figure 4 to consist of a delay line 40a, a voltage-controlled oscillator 40c, the frequency of which determines the amount of delay introduced by the delay line 40a, and a flip-flop 40b which provides the delay line with clock signals of the desired phase relationship to one another. The voltage-controlled oscillator 40c has its frequency controlled by the output of the low-pass filter so that the total delay introduced follows the desired relationship in time.
Figure 4 also shows that an input amplifier 15 is introduced prior to the full wave rectifier 16 and a low-pass filter 50 is provided before the channel switch 12. The delay line 40 operates on a sampling principle and the purpose of the low-pass filter 50 is to eliminate the output signal component at the sampling frequency.
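The control chain described above (hold-off timer, ramp generator, low-pass filter, delay) can be sketched numerically. The sketch maps the filtered ramp directly onto a delay value instead of modelling the oscillator and clocked delay line. The 20 millisecond smoothing constant, the 1 millisecond step and the one-pole filter are hypothetical choices, while the 15, 30, 0.3 and 198 millisecond figures are taken from the description.

```python
import math

def delay_profile_ms(duration_ms, hold_off_ms=15.0, ramp_ms=198.0,
                     max_delay_ms=30.0, min_delay_ms=0.3,
                     smoothing_ms=20.0, step_ms=1.0):
    """Delay applied at each step after the channel switch closes at t = 0."""
    alpha = 1.0 - math.exp(-step_ms / smoothing_ms)  # assumed one-pole filter
    smoothed = 0.0
    profile = []
    for n in range(int(duration_ms / step_ms)):
        t = n * step_ms
        # Hold-off timer 44: the ramp input stays at zero for the first hold_off_ms.
        enabled = 1.0 if t >= hold_off_ms else 0.0
        # Ramp generator 46: integral of the hold-off output, clamped at 1.
        ramp = min(1.0, enabled * (t - hold_off_ms) / ramp_ms)
        # Low-pass filter 48: smooths the ramp so that it starts slowly.
        smoothed += alpha * (ramp - smoothed)
        # Direct mapping from control value to delay (stands in for 40a to 40c).
        profile.append(max_delay_ms - smoothed * (max_delay_ms - min_delay_ms))
    return profile

profile = delay_profile_ms(600.0)
```

In this sketch the delay stays at its maximum during the hold-off period, then falls slowly at first and more quickly afterwards, settling near the minimum delay.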

Claims (4)

1. A system comprising an input line connected by way of a channel switch to an output line and a speech detector for closing the channel switch in the presence of speech on the input line, wherein a time delay element is included in the line connecting the input line to the channel switch for delaying signals on the speech channel for a time at least equal to the response time of the speech detector.
2. A system as claimed in Claim 1, wherein the time delay introduced by the time delay element is variable.
3. A system as claimed in Claim 2, comprising means for varying the time delay introduced by the time delay element such that the time delay has a maximum value in the absence of speech on the channel and the delay is gradually reduced after the commencement of an utterance.
4. A system constructed, arranged and adapted to operate substantially as hereinbefore described with reference to and as illustrated in Figures 2, 3 and 4 of the accompanying drawings.
GB8138393A 1981-12-21 1981-12-21 Speech detector Withdrawn GB2113054A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB8138393A GB2113054A (en) 1981-12-21 1981-12-21 Speech detector

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB8138393A GB2113054A (en) 1981-12-21 1981-12-21 Speech detector

Publications (1)

Publication Number Publication Date
GB2113054A true GB2113054A (en) 1983-07-27

Family

ID=10526745

Family Applications (1)

Application Number Title Priority Date Filing Date
GB8138393A Withdrawn GB2113054A (en) 1981-12-21 1981-12-21 Speech detector

Country Status (1)

Country Link
GB (1) GB2113054A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2139054A (en) * 1983-04-22 1984-10-31 Gen Electric Co Plc Loudspeaking telephone instruments
EP0526925A1 (en) * 1991-07-08 1993-02-10 Koninklijke Philips Electronics N.V. Information recording device
EP0551422A1 (en) * 1990-10-01 1993-07-21 Motorola Inc. Automatic length-reducing audio delay line
EP0658878A2 (en) * 1993-12-13 1995-06-21 Philips Patentverwaltung GmbH System for transmitting a speech signal

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2139054A (en) * 1983-04-22 1984-10-31 Gen Electric Co Plc Loudspeaking telephone instruments
EP0551422A1 (en) * 1990-10-01 1993-07-21 Motorola Inc. Automatic length-reducing audio delay line
EP0551422A4 (en) * 1990-10-01 1995-01-25 Motorola Inc Automatic length-reducing audio delay line
EP0526925A1 (en) * 1991-07-08 1993-02-10 Koninklijke Philips Electronics N.V. Information recording device
EP0658878A2 (en) * 1993-12-13 1995-06-21 Philips Patentverwaltung GmbH System for transmitting a speech signal
EP0658878A3 (en) * 1993-12-13 1996-04-17 Philips Patentverwaltung System for transmitting a speech signal.
AU681458B2 (en) * 1993-12-13 1997-08-28 Koninklijke Philips Electronics N.V. Method and arrangement for transmitting speech signals

Similar Documents

Publication Publication Date Title
US5007046A (en) Computer controlled adaptive speakerphone
EP0077574B1 (en) Speech recognition system for an automotive vehicle
US5583969A (en) Speech signal processing apparatus for amplifying an input signal based upon consonant features of the signal
US6826525B2 (en) Method and device for detecting a transient in a discrete-time audio signal
US4597098A (en) Speech recognition system in a variable noise environment
US4610023A (en) Speech recognition system and method for variable noise environment
JPH0247142B2 (en)
US4525856A (en) Amplifier arrangement for acoustic signals, provided with means for suppressing (undersired) spurious signals
KR100302370B1 (en) Speech interval detection method and system, and speech speed converting method and system using the speech interval detection method and system
US5187741A (en) Enhanced acoustic calibration procedure for a voice switched speakerphone
EP0376587B1 (en) Acoustic calibration arrangement for a voice switched speakerphone
US4887288A (en) Self calibration arrangement for a voice switched speakerphone
US5940499A (en) Voice switch used in hands-free communications system
US4979163A (en) Echo suppression arrangement for an adaptive speakerphone
US4688256A (en) Speech detector capable of avoiding an interruption by monitoring a variation of a spectrum of an input signal
US4151471A (en) System for reducing noise transients
EP0376588B1 (en) Computer controlled speakerphone for adapting to a communication line
US4187396A (en) Voice detector circuit
GB2151887A (en) Microphone arrangement having suppressed amplitude peaks
GB2113054A (en) Speech detector
EP0160788B1 (en) Apparatus and method for controlling a digital speech filing and retrieval system during playback mode
US6141426A (en) Voice operated switch for use in high noise environments
IE781966L (en) Half-echo suppressor
US6516068B1 (en) Microphone expander
US2866848A (en) Method of improving intelligence under random noise interference

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)