CA1147071A - Method of and apparatus for detecting speech in a voice channel signal

Info

Publication number: CA1147071A
Application number: CA000359968A
Authority: CA
Inventors: Fouad Daaboul; Tiu Le Van
Original assignee: Northern Telecom Ltd
Current assignee: Nortel Networks Corp
Priority date: 1980-09-09
Filing date: 1980-09-09
Publication date: 1983-05-24
Also published as: JPS5781733A; EP0047589A1; EP0047589B1; DE3164171D1; JPH0311139B2

Abstract

METHOD OF AND APPARATUS FOR DETECTING
SPEECH IN A VOICE CHANNEL SIGNAL

Abstract of the Disclosure

In a digital speech detector, a first detector produces an output signal if any sample of a voice channel signal has a magnitude exceeding a fixed threshold above noise levels. A second detector produces a second threshold which is adaptively adjusted to noise levels on the channel by being set to a level above the current sample magnitude each time that this does not exceed the preceding sample's magnitude. When the sample magnitude rises above the second threshold, and for each subsequent sample of successively increasing magnitude, the second detector produces an output signal. A speech decision for the voice channel is reached if either detector produces its output signal. The output signals of the first and second detectors are maintained for fixed and variable hangover periods, respectively, to maintain the speech decision during intersyllabic pauses in speech. The samples are supplied to the speech detector by an offset remover and averaging circuit.


Claims

THE EMBODIMENTS OF THE INVENTION IN WHICH
AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED
AS FOLLOWS:-

1. A method of detecting the presence of speech in a sampled voice channel signal, comprising the steps of:-
setting a threshold, to a level which is greater than and is dependent upon the magnitude of the current sample, whenever the magnitude of the current sample is not greater than that of the preceding sample; and
providing an indication of the presence of speech whenever the magnitude of the current sample is greater than that of the preceding sample and exceeds said threshold level.
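
As an illustration only, the two steps of claim 1 can be sketched in a few lines of Python. The function name `adaptive_speech_flags`, the additive `margin` (in the manner of claim 9), and the treatment of the very first sample as "not greater" are assumptions made for this sketch and are not taken from the claim.

```python
def adaptive_speech_flags(magnitudes, margin=3):
    """Return one speech flag per sample magnitude (sketch of claim 1).

    `margin` is a hypothetical fixed amount added to the current
    magnitude when the threshold is reset.
    """
    flags = []
    prev = None        # magnitude of the preceding sample
    threshold = None   # adaptive threshold
    for m in magnitudes:
        if prev is None or m <= prev:
            # Not rising: reset the threshold just above the current sample.
            threshold = m + margin
            flags.append(False)
        else:
            # Rising: speech is indicated only above the threshold.
            flags.append(m > threshold)
        prev = m
    return flags


print(adaptive_speech_flags([2, 2, 3, 4, 6, 9, 5, 5]))
```

With these inputs the flags are raised only for the samples 6 and 9, which rise above the threshold of 5 set while the signal was flat.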

2. A method as claimed in claim 1 and including maintaining said indication in respect of a number of samples following each sample whose magnitude is greater than that of the preceding sample.

3. A method as claimed in claim 2 and including the step of determining the number of samples in respect of which said indication is maintained in dependence upon previous sample magnitudes, said number being increased, up to a maximum number, for each sample whose magnitude is greater than that of the preceding sample and being decreased, down to a minimum number, for each sample whose magnitude is not greater than that of the preceding sample.
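
The variable hangover of claims 2 and 3 might be sketched as follows. The names `n_min`, `n_max`, `step_up`, `step_down` and `margin` are hypothetical parameters, and the choice to reload the hangover counter on every detected sample is one reading of claim 2 rather than a statement of the patented implementation.

```python
def flags_with_variable_hangover(magnitudes, margin=3, n_min=1, n_max=4,
                                 step_up=1, step_down=1):
    """Adaptive detection with a hangover whose length tracks the signal."""
    flags = []
    prev = None
    threshold = None
    hang_len = n_min   # current hangover length, in samples
    remaining = 0      # hangover samples still to be flagged
    for m in magnitudes:
        rising = prev is not None and m > prev
        if rising:
            # Lengthen the hangover for each rising sample, up to n_max.
            hang_len = min(n_max, hang_len + step_up)
            detected = m > threshold
        else:
            # Shorten the hangover for each non-rising sample, down to n_min.
            hang_len = max(n_min, hang_len - step_down)
            threshold = m + margin
            detected = False
        if detected:
            remaining = hang_len
            flags.append(True)
        elif remaining > 0:
            remaining -= 1
            flags.append(True)   # indication maintained during hangover
        else:
            flags.append(False)
        prev = m
    return flags


print(flags_with_variable_hangover([2, 2, 3, 4, 6, 9, 5, 5, 5, 5, 5, 5]))
```

In this example the indication raised at samples 6 and 9 is held for four further samples before it is released.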

4. A method as claimed in claim 1, 2, or 3 and including the step of providing an indication of the presence of speech in respect of each sample whose magnitude exceeds a fixed threshold level.

5. A method of detecting the presence of speech signals in a sampled voice channel signal, comprising the steps of:-
producing a first signal state whenever the magnitude of a signal sample exceeds a first threshold level;
comparing the magnitude of each sample with that of the preceding sample;
whenever the magnitude of a sample is not greater than that of the preceding sample, setting a second threshold to a level which is greater than and is dependent upon the magnitude of the current sample;
whenever the magnitude of a sample is greater than that of the preceding sample, producing a second signal state if the magnitude of the current sample exceeds the second threshold level; and
in response to each of the first and the second signal states, producing a signal, representing the presence of speech, at least in respect of the current sample.
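
A minimal sketch of how the two signal states of claim 5 could be combined is given below. The values of `fixed_threshold` and `margin` are arbitrary illustrative assumptions, as is the function name `speech_decision`.

```python
def speech_decision(magnitudes, fixed_threshold=8, margin=3):
    """Per-sample speech decision: first state OR second state."""
    decisions = []
    prev = None
    second_threshold = None
    for m in magnitudes:
        # First signal state: fixed-threshold comparison.
        first_state = m > fixed_threshold
        if prev is None or m <= prev:
            # Not rising: re-derive the second threshold from this sample.
            second_threshold = m + margin
            second_state = False
        else:
            # Rising: second signal state if above the second threshold.
            second_state = m > second_threshold
        decisions.append(first_state or second_state)
        prev = m
    return decisions


print(speech_decision([2, 2, 6, 9, 9, 10, 3]))
```

Here the sample 6 is caught only by the adaptive comparison, while the repeated 9 and the following 10 are caught only by the fixed threshold, which shows why the claim responds to each state.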

6. A method as claimed in claim 5 and including the steps of:-
whenever the magnitude of a sample does not exceed the first threshold level and the first signal state was produced in respect of the preceding sample, producing a third signal state in respect of a first predetermined number of consecutive samples commencing with the current sample;
whenever the magnitude of a sample is not greater than that of the preceding sample and the second signal state was produced in respect of said preceding sample, producing a fourth signal state in respect of a second number of consecutive samples commencing with the current sample; and
producing the signal representing the presence of speech in response to each of the third and fourth signal states.
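
The fixed hangover associated with the first signal state might be sketched as below. The helper `fixed_hangover` and the value `hold=3` are hypothetical; the sketch takes the sequence of first signal states as input and returns the third signal state for each sample.

```python
def fixed_hangover(first_states, hold=3):
    """Third signal state: held for `hold` samples after the first state ends."""
    third = []
    remaining = 0
    prev = False
    for s in first_states:
        if prev and not s:
            # First state just dropped: start the hangover at this sample.
            remaining = hold
        if remaining > 0:
            third.append(True)
            remaining -= 1
        else:
            third.append(False)
        prev = s
    return third


first = [False, True, True, False, False, False, False, False]
print(fixed_hangover(first))
```

The fourth signal state could be produced in the same way from the second signal state, with the hold length supplied per sample instead of being fixed.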

7. A method as claimed in claim 6 and including the step of determining said second number in dependence upon previous sample magnitudes, said second number being increased by a predetermined amount, up to a maximum number, for each sample in respect of which the second signal state is produced, and being decreased, down to a minimum number, for each sample whose magnitude is not greater than the magnitude of the preceding sample.

8. A method as claimed in claim 6 or 7 and including the steps of:-
whenever the magnitude of a sample exceeds that of the preceding sample, and in respect of said preceding sample the fourth signal state was produced but the second signal state was not produced, producing the second signal state in respect of the current sample if its magnitude exceeds a third threshold level; and
setting the third threshold level equal to the magnitude of the preceding sample whenever the second signal state was produced in respect of said preceding sample and the magnitude of the current sample is not greater than the magnitude of said preceding sample.

9. A method as claimed in claim 5 wherein, each time that the second threshold level is set, it is set to be greater than the magnitude of the current sample by a predetermined amount.

10. A method as claimed in claim 5, 6, or 7 wherein each signal sample is constituted by an average of a plurality of individual samples of the voice channel signal, the method further comprising the step of producing each signal sample by removing d.c. offsets from and averaging a plurality of individual samples of the voice channel signal.
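
One possible reading of the offset removal and averaging step is sketched below. The claim does not say how the d.c. offset is estimated or how many individual samples are averaged, so the use of the overall mean as the offset, the averaging of absolute values, and `block_size=4` are all assumptions of this sketch.

```python
def averaged_magnitudes(samples, block_size=4):
    """Remove an estimated d.c. offset, then average magnitudes per block."""
    # Assumption: the d.c. offset is estimated as the mean of all samples.
    offset = sum(samples) / len(samples)
    out = []
    # Only complete blocks are processed; a trailing partial block is dropped.
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        out.append(sum(abs(x - offset) for x in block) / block_size)
    return out


print(averaged_magnitudes([10, 12, 10, 12, 6, 16, 6, 16]))
```

Each value returned would then serve as one "signal sample" for the threshold comparisons of claims 5 to 7.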

11. A speech detector comprising one or more read-only memories programmed and arranged to carry out the method of claim 1, 2, or 5.

12. A speech detector for detecting the presence of speech signals in a sampled voice channel signal, comprising:-
means for producing a first signal state whenever the magnitude of a signal sample exceeds a first threshold level;
means for generating a second threshold;
means for delaying each sample until the next sample;
means for comparing the magnitude of each sample with that of the preceding sample delayed by said delaying means;
means, responsive to said comparing means determining that the magnitude of a sample is not greater than that of the preceding sample, for setting the second threshold to a level which is greater than and is dependent upon the magnitude of the current sample;
means, responsive to said comparing means determining that the magnitude of a sample is greater than that of the preceding sample, for producing a second signal state if the magnitude of the current sample exceeds the second threshold level; and
means responsive to each of the first and second signal states for producing a signal, representing the presence of speech, at least in respect of the current sample.

13. A speech detector as claimed in claim 12 and including means for producing each signal sample by removing d.c. offsets from and averaging a plurality of individual samples of the voice channel signal.