US3513260A - Speech presence detector - Google Patents

Speech presence detector

Info

Publication number
US3513260A
Authority
US
United States
Prior art keywords
speech
transistor
capacitor
signal
presence detector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US675230A
Inventor
George A Hellwarth
Gardner D Jones
Albert C Ruocchio
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Application granted
Publication of US3513260A
Anticipated expiration
Expired - Lifetime

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93Discriminating between voiced and unvoiced parts of speech signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Noise Elimination (AREA)
  • Measurement Of Current Or Voltage (AREA)
  • Prepayment Telephone Systems (AREA)

Description

[Drawings: 3 sheets, filed Oct. 13, 1967, patented May 19, 1970. Sheet 2 carries FIG. 3A and Sheet 3 carries FIG. 3B.]
United States Patent Office 3,513,260
SPEECH PRESENCE DETECTOR
George A. Hellwarth, Gardner D. Jones, and Albert C. Ruocchio, Raleigh, N.C., assignors to International Business Machines Corporation, Armonk, N.Y., a corporation of New York
Filed Oct. 13, 1967, Ser. No. 675,230
Int. Cl. Gl 1/04
U.S. Cl. 179-1
3 Claims
ABSTRACT OF THE DISCLOSURE
FIELD OF INVENTION
This invention relates to speech processing in general and more particularly to speech presence detectors for indicating the beginnings and continuances of speech signals.
PRIOR ART
Speech presence detection is useful in many speech processing areas. One area where it has found extensive use is in recording speech for use in data processing systems where limited storage requires the elimination of many and prolonged periods of silence which occur in natural speech. The use to which the recorded speech is put is immaterial to this application and it may be utilized for one or more purposes.
Several techniques have been employed in the prior art for detecting the beginning and continuance of speech. In one prior art system, a detector was utilized to derive the envelope of the incoming signal and this envelope was passed through a low pass filter to determine the presence of speech. The noise envelope was generally D.C. and contributed nothing to the filter output. This system has one serious drawback. The use of suitable linear filtering resulted in a time delay in detecting the onset of speech, thereby distorting the initial portion of the signal. This problem is overcome through the use of analog delay and the employment of a parallel signal path such that signal detection can be made to coincide with the start of the delayed signal. The solution of this problem has not, however, been considered generally successful due to its cost.
Another prior art speech presence detector works satisfactorily provided the noise and speech signal levels remain within predetermined narrow limits. Therefore, such detectors are unsuitable for use in conjunction with a telephone distribution system, since the signal to noise ratio in the average telephone network may vary from 13 to 40 dB and this amount of variation exceeds the satisfactory operating range of the circuit.
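To give a feel for the 13 to 40 dB range quoted above, the following minimal Python sketch converts decibel figures to linear ratios using the standard definitions. The function names are hypothetical and are not taken from the patent; the patent does not say whether it has amplitude or power ratios in mind, so both conversions are shown.

```python
def db_to_amplitude_ratio(db):
    """Convert a decibel figure to a linear amplitude (voltage) ratio."""
    return 10.0 ** (db / 20.0)


def db_to_power_ratio(db):
    """Convert a decibel figure to a linear power ratio."""
    return 10.0 ** (db / 10.0)


# Range of signal to noise ratios quoted in the text (values from the patent).
LOW_SNR_DB = 13.0
HIGH_SNR_DB = 40.0

# Spread of the range in dB, and the same spread as an amplitude ratio.
spread_db = HIGH_SNR_DB - LOW_SNR_DB
spread_amplitude = db_to_amplitude_ratio(spread_db)
```

Under these definitions, a detector with a fixed threshold would see the speech-to-noise amplitude ratio change by a factor of more than twenty across the quoted range, which is consistent with the text's point that narrow-range detectors are unsuitable.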
SUMMARY OF INVENTION
One object of this invention is to provide a speech presence detector which is fast in operation so as to prevent the loss of the beginning of a speech signal.
Another object of the invention is to provide a speech presence detector which is capable of operation in an environment having a large variation in signal to noise ratio.
A further object of the invention is to provide a speech presence detector which is not adversely affected by large long-term variations in the noise level.
Yet another object of the invention is to provide a speech presence detector which is capable of operation in the presence of wide variations in the amplitude of the speech signal which is to be detected.
The invention contemplates a speech presence detector suitable for connection to a telephone speech distribution system and comprises a peak detecting means having a time constant large enough to convert periodic signals having a frequency above the basic speech pitch frequency range and steady state noise to direct current components and said time constant also being selected so as to follow initial pitch period variations in the speech signal, means for differentiating the peak detector output to accentuate the alternating current component and block the direct current component, and means responsive to the differentiated signal for indicating the presence of speech in response thereto and for continuing said indications provided the period of the alternating component remains within a predetermined range.
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF THE PREFERRED EMBODIMENT
In FIG. 1 an intermittent speech signal superimposed on steady state long-term amplitude variable background noise is applied to an input terminal 11. The speech signal is graphically illustrated in curve A of FIG. 3 and includes a plurality of pitch periods in succession followed by a period of silence in which only the background noise is present at the input. The received signal is amplified in amplifier 12 and applied to a peak detecting circuit 14 which provides the output illustrated graphically in curve B of FIG. 3.
The output of peak detector 14 is applied to a pair of series connected differentiating circuits 15 which alter the input waveforms from peak detector 14 as illustrated in curves C and D of FIG. 3. The output of the second differentiating circuit illustrated in curve D of FIG. 3 comprises a plurality of sharply defined voltage spikes coinciding with the sharp voltage rise occurring at the beginning of each pitch period. The output from the second stage of differentiating circuit 15 is applied to [...] periods following the termination of speech are rejected.
The length of the time period determines the time after an utterance before the detector indicates the absence of speech.
FIG. 2 shows the specific details of peak detector 14, differentiating circuits 15 and timing and level detector 16. The output of amplifier 12 is applied to the cathode of a diode 20 which has its anode connected to the common junction of a capacitor 21 and a resistor 22 which are both returned to ground. In addition, the common junction is connected to the base of amplifying transistor 23 which has its emitter current supplied by a positive voltage source E through a resistor 24 and its collector connected directly to a negative voltage source E.
Initially capacitor 21 is at ground potential and diode 20 is nonconductive as long as the output of amplifier 12 is above ground. Upon the first negative excursion of the output of amplifier 12, capacitor 21 charges to the peak negative value and discharges via resistor 22. The time constant of resistor 22 and capacitor 21 is selected so that it is large enough to convert periodic signals having a frequency above the basic speech pitch frequency range and steady state noise to direct current components and simultaneously low enough to follow initial pitch period variations in the speech signals as shown in curve B of FIG. 3. Diode 20 remains reverse biased until capacitor 21 discharges to a point where its voltage applied to the anode of diode 20 is more positive than the input voltage on the cathode. This condition will occur generally at the onset of the next successive pitch period; however, voltage spikes within a pitch period may forward bias diode 20 to charge capacitor 21 as shown after the second and third pitch periods. The pulses thus generated will not, however, change operation as will be explained later. Minor variations in the noise level do not produce pulses large enough to trigger the detector, because of the particular time constant selected for capacitor 21 and resistor 22.
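The behaviour of the peak detector described above can be approximated in discrete time. The following Python sketch is an illustration only, not the patented circuit: it assumes an ideal diode with no forward drop, ignores amplifier 12 and transistor 23, and uses hypothetical names (`peak_detect`, `sample_rate_hz`, `time_constant_s`). The patent gives no component values, so any time constant passed in is the caller's assumption.

```python
import math


def peak_detect(samples, sample_rate_hz, time_constant_s):
    """Track negative peaks with an exponential (RC) decay toward ground.

    Models an ideal diode feeding a capacitor that discharges through a
    resistor; time_constant_s stands in for the product of R22 and C21.
    """
    # Fraction of the held voltage that survives one sample period.
    decay = math.exp(-1.0 / (sample_rate_hz * time_constant_s))
    held = 0.0  # capacitor starts at ground potential
    output = []
    for x in samples:
        held *= decay      # capacitor discharges through the resistor
        if x < held:       # diode conducts only on a more negative input
            held = x       # capacitor charges to the new negative peak
        output.append(held)
    return output
```

With a time constant that is long compared with the period of high-frequency components but short compared with a pitch period, the held value stays nearly constant for steady noise and drops sharply at each pitch-period onset, which is the shape the text attributes to curve B.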
The emitter of transistor 23 is connected to the base of an amplifying transistor 26 by a series connected capacitor 27 and a resistor 28 which differentiate the output of transistor 23 to produce the waveform illustrated in curve C of FIG. 3. Series connected resistors 30, 31 and 32 connected between positive source E and negative source E provide collector and base bias potential for transistor 26 which has its emitter connected to ground.
The amplified output at the collector of transistor 26 is again differentiated by capacitor 34 and resistor 39. The twice differentiated pulses illustrated in curve D of FIG. 3 are applied to the base of transistor 37 and cause the transistor to conduct when they exceed a given amplitude determined by the base-emitter contact potential of transistor 37.
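The two differentiation stages and the fixed conduction threshold can be sketched as two cascaded first-order RC high-pass filters followed by a comparison. This is a simplified model under stated assumptions: the gain and inversion of transistor 26 are ignored, the threshold is applied to the magnitude of the result rather than to one polarity, both stages share one time constant, and all names and parameters (`rc_differentiate`, `onset_pulses`, `tau_s`, `threshold`) are hypothetical.

```python
def rc_differentiate(samples, sample_rate_hz, time_constant_s):
    """Discrete first-order RC high-pass: blocks DC, passes fast edges."""
    dt = 1.0 / sample_rate_hz
    alpha = time_constant_s / (time_constant_s + dt)
    output = []
    prev_in = samples[0] if samples else 0.0
    prev_out = 0.0
    for x in samples:
        # Standard high-pass recurrence: y[n] = alpha * (y[n-1] + x[n] - x[n-1])
        prev_out = alpha * (prev_out + x - prev_in)
        prev_in = x
        output.append(prev_out)
    return output


def onset_pulses(peak_output, sample_rate_hz, tau_s, threshold):
    """Differentiate twice, then flag samples whose magnitude exceeds threshold."""
    once = rc_differentiate(peak_output, sample_rate_hz, tau_s)
    twice = rc_differentiate(once, sample_rate_hz, tau_s)
    return [abs(v) > threshold for v in twice]
```

A constant input produces no flags at all, which mirrors the text's point that the direct current component is blocked, while an abrupt step such as the onset of a pitch period produces a flag at the step.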
The collector of transistor 37 is connected by a current limiting resistor 40 to the common junction of a resistor 41 and capacitor 42 which are each connected to positive voltage source E. Capacitor 42 will charge through resistor 41 to the voltage of source E. Each time transistor 37 conducts, i.e. driven by a sufficiently large pulse from differentiating circuit 15, capacitor 42 discharges via current limiting resistor 40 through transistor 37. In this manner, the maximum voltage at the junction of resistor 41 and capacitor 42 illustrated by curve E of FIG. 3 is a function of the number of pulses above the threshold applied in a given time to the base of transistor 37.
The junction of resistor 41 and capacitor 42 is connected to the base of a transistor 44 which in conjunction with another transistor 45 comprises a comparator. The emitters of transistors 44 and 45 are connected to positive source E by a resistor 46 and the collectors to negative source E by identical load resistors 47 and 48, respectively. A pair of resistors 50 and 51 connected between positive source E and ground provide a reference potential at their common junction which is connected to the base of transistor 45.
As long as the voltage at the common junction of resistor 41 and capacitor 42 remains below the reference potential applied to the base of transistor 45, transistor 44 conducts and transistor 45 is cut off. This condition causes the collector potential of transistor 45 to go to the negative potential of source E as indicated in curve F of FIG. 3 while the collector of transistor 44 assumes a more positive potential determined by the voltage across resistor 47.
As soon as capacitor 42 charges to a voltage more positive than the reference voltage applied to the base of transistor 45, the voltage conditions at the collectors of transistors 44 and 45 reverse to indicate the termination of speech. The bi-polar outputs provided at the collectors of transistors 44 and 45 may be utilized for any purpose such as controlling the recording of the signal on the line in a storage medium to thus eliminate periods of silence to conserve storage.
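The timing and level detector behaves like a retriggerable timer: each onset pulse pulls the junction voltage down, the junction then recharges toward the supply, and speech is indicated for as long as the junction stays below the reference. The Python sketch below models that idea under simplifying assumptions that are not in the patent: each pulse discharges the junction fully to 0 V (the real circuit discharges through current limiting resistor 40, so the discharge may be partial), the comparator is ideal, and the names `speech_flags`, `hold_time_s`, `tau_s`, `supply_v` and `ref_v` are hypothetical.

```python
import math


def speech_flags(pulses, sample_rate_hz, tau_s, supply_v, ref_v):
    """Return one boolean per sample: True while speech is indicated.

    tau_s stands in for the product of R41 and C42; ref_v stands in for
    the reference potential set by the R50/R51 divider.
    """
    decay = math.exp(-1.0 / (sample_rate_hz * tau_s))
    v = supply_v  # junction fully charged: no speech indicated
    flags = []
    for pulse in pulses:
        if pulse:
            v = 0.0  # assumption: a pulse discharges the junction completely
        else:
            # Exponential recharge toward the supply voltage.
            v = supply_v - (supply_v - v) * decay
        flags.append(v < ref_v)  # comparator: below reference means speech
    return flags


def hold_time_s(tau_s, supply_v, ref_v):
    """Time for the junction to recharge from 0 V up to the reference."""
    return tau_s * math.log(supply_v / (supply_v - ref_v))
```

In this model the hold-over time after the last pitch pulse depends only on the recharge time constant and the ratio of reference to supply voltage. It should be chosen longer than the longest expected pitch period so that the indication does not drop out between pulses, and it sets how long after an utterance the detector takes to indicate the absence of speech.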
While the invention has been particularly shown and described with reference to a preferred embodiment thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.
What is claimed is:
1. A speech presence detector for rapidly detecting the beginning and continuance of speech superposed on noise subject to long term variations comprising,
peak detecting means having a time constant large enough to convert periodic signals having a frequency above the basic speech pitch frequency range and steady state noise to direct current components and low enough to follow the initial pitch period variations in the speech signal,
means responsive to the output of the peak detecting means for differentiating the output signal to accentuate the alternating current components and block the direct current components, and
means responsive to the differentiated signal for indicating the presence of speech as long as the alternating component recurs within a preselected time interval.
2. A speech presence detector as set forth in claim 1 in which said differentiating means includes,
a first differentiating circuit means responsive to the output of the peak detecting means, and
a second differentiating circuit means responsive to first differentiating circuit means for performing a second differentiation of the signal from the peak detecting means for further accentuating the alternating components of the received signal and blocking the direct current components received.
3. A speech presence detector as set forth in claim 1 in which the means responsive to the differentiated signal includes,
a two input comparator means,
a reference voltage source connected to one of the comparator inputs,
a capacitor,
a charging circuit connected to said capacitor for charging it to a voltage exceeding the reference voltage connected to the comparator,
means connecting the capacitor to the other input of the comparator which provides an output indicative of which input is greater, and
means responsive to the differentiated signal for discharging the capacitor in accordance with the accentuated alternating components.
References Cited
UNITED STATES PATENTS
3,286,031 11/1966 Geddes 179-1
OTHER REFERENCES
Ives, Music Pulse Analyzer, Electronics, Apr. 1, 1957, pp. 183-184.
KATHLEEN H. CLAFFY, Primary Examiner
D. W. OLMS, Assistant Examiner
US675230A 1967-10-13 1967-10-13 Speech presence detector Expired - Lifetime US3513260A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US67523067A 1967-10-13 1967-10-13

Publications (1)

Publication Number Publication Date
US3513260A true US3513260A (en) 1970-05-19

Family

ID=24709581

Family Applications (1)

Application Number Title Priority Date Filing Date
US675230A Expired - Lifetime US3513260A (en) 1967-10-13 1967-10-13 Speech presence detector

Country Status (4)

Country Link
US (1) US3513260A (en)
DE (1) DE1802502A1 (en)
FR (1) FR1579311A (en)
GB (1) GB1183569A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3985956A (en) * 1974-04-24 1976-10-12 Societa Italiana Telecomunicazioni Siemens S.P.A. Method of and means for detecting voice frequencies in telephone system
US5134658A (en) * 1990-09-27 1992-07-28 Advanced Micro Devices, Inc. Apparatus for discriminating information signals from noise signals in a communication signal

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE1963082C2 (en) * 1969-12-16 1984-08-02 Heinz Dipl.-Phys. 7801 Umkirch Kusch Coding system for speech recognition - uses several successive ratios of extreme values for coding or correlation using resistance matrix
CA1184506A (en) * 1980-04-21 1985-03-26 Akira Komatsu Method and system for discriminating human voice signal
DE3810068A1 (en) * 1988-03-25 1989-10-05 Telefonbau & Normalzeit Gmbh METHOD FOR DETECTING VOICE SIGNALS
DE4405465C2 (en) * 1994-02-21 1996-01-18 Marantec Antrieb Steuerung Drive device for an object that is guided to move back and forth, in particular a door leaf

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3196031A (en) * 1961-10-13 1965-07-20 American Can Co Bonding of topcoatings to printed surfaces

Also Published As

Publication number Publication date
GB1183569A (en) 1970-03-11
DE1802502A1 (en) 1969-05-14
FR1579311A (en) 1969-08-22

Similar Documents

Publication Publication Date Title
US4112384A (en) Controlled recovery automatic gain control amplifier
US3985954A (en) DC level control circuit
US3902123A (en) Digital circuit for determining if signal source consists primarily of noise or contains information
US3602826A (en) Adaptive signal detection system
US2466705A (en) Detector system
US4296277A (en) Electronic voice detector
US3140446A (en) Communication receiver with noise blanking
US3328705A (en) Peak detector
US4718097A (en) Method and apparatus for determining the endpoints of a speech utterance
US3513260A (en) Speech presence detector
GB1351993A (en) Disc file agc circuit
US3812432A (en) Tone detector
US4386239A (en) Multifrequency tone detector
US4373139A (en) Detectors
US3053996A (en) Circuit for the conversion of amplitude pulses to time duration pulses
US3944753A (en) Apparatus for distinguishing voice and other noise signals from legitimate multi-frequency tone signals present on telephone or similar communication lines
US3786358A (en) Method and apparatus for detecting the beginning of data block
US4626788A (en) Circuit for reconstructing noise-affected signals
US3679986A (en) Non-linear feedback gain control and peak detector system
US4370620A (en) Efficient high performance envelope detector with controlled decay envelope detection
US3543050A (en) Peak polarity selector
US3125723A (en) shaver
US3140445A (en) Communication receiver with noise blanking
US3652944A (en) Pulse-characteristic modifying circuit
US3701857A (en) Multifrequency signal receiving circuit