GB1487291A - Determining the boundaries of a speech utterance - Google Patents

Determining the boundaries of a speech utterance

Info

Publication number
GB1487291A
GB1487291A GB12245/75A GB1224575A GB1487291A GB 1487291 A GB1487291 A GB 1487291A GB 12245/75 A GB12245/75 A GB 12245/75A GB 1224575 A GB1224575 A GB 1224575A GB 1487291 A GB1487291 A GB 1487291A
Authority
GB
United Kingdom
Prior art keywords
speech
energy
threshold
expression
utterance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
GB12245/75A
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AT&T Corp
Original Assignee
Western Electric Co Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1974-03-29
Filing date
1975-03-24
Publication date
1977-09-28
Application filed by Western Electric Co Inc

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S52/00 Static structures, e.g. buildings
    • Y10S52/13 Hook and loop type fastener

Abstract

1487291 Speech boundary detectors WESTERN ELECTRIC CO Inc 24 March 1975 [29 March 1974] 12245/75 Heading H4R

The beginning and/or the end of a speech utterance is determined by adaptively encoding the speech (Fig. 1) and determining when the energy of the encoded speech signals exceeds a threshold. The speech input is compared with the level predicted from the quantized level in loop 14, 15 to produce a differential signal, which is input to the quantizer 12. The quantizer output, at e.g. 16 levels, is encoded at 13 to produce a series of digital output code words C(i). Logic network 16 automatically adjusts the quantizing steps to take account of the signal variations.

It is said that the code word energy, which is equivalent to the number of code word adaptations per unit time, indicates where the boundaries of an utterance lie far better than the speech waveform energy does. Voiced and unvoiced utterances show no significant difference in code word energy.

An expression of this energy, E(n), is taken as the sum over a window of 101 sampling instants of the squares of the individual code words C(i), with the average code word level (the D.C. level) subtracted from each C(i) before the energy expression is evaluated. The D.C. level here is 7.5, but since this cannot be expressed easily in digital form, the squared terms are replaced by a(i) = [2C(i) - 15]^2 in the expression for E(n). This leads to the equation E(n) = E(n - 1) + a(n + 50) - a(n - 50), where a(n + 50) and a(n - 50) represent the newest and oldest values of a.

The circuit of Fig. 4 performs this computation to generate E(n), which then passes to a logic circuit (Fig. 5, not shown) that compares E(n) with a stored threshold lying between the values of E associated with "silence" and with average measured speech. When the threshold is first exceeded, a counter begins to increment and continues for as long as the threshold remains exceeded, until a count of 320 samples, equivalent to 50 msec of speech, triggers an output pulse confirming that the original input is speech. Similarly, when the threshold is crossed in the downward direction, a count of 1024 samples (160 msec) produces an output confirming that speech has ceased. These outputs may be used to indicate to an operator that a speech utterance has occurred, or to gate a register that temporarily stores the code words before permanent storage without gaps between utterances.
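
The energy computation and counter logic described in the abstract can be illustrated with a minimal Python sketch. It is not the circuit of Fig. 4 or Fig. 5. All function and constant names are hypothetical, the threshold is left to the caller, and the window is treated as the 101 most recent code words (a trailing window), which is an assumption about indexing rather than something the abstract specifies. The step-size adaptation of the encoder is not modelled; the sketch starts from already-encoded 16-level code words.

```python
from collections import deque

WINDOW = 101         # sampling instants per energy window (from the abstract)
ONSET_COUNT = 320    # samples above threshold needed to confirm speech
OFFSET_COUNT = 1024  # samples below threshold needed to confirm silence


def centred_square(c):
    """a(i) = [2C(i) - 15]^2 for a 16-level code word C(i) in 0..15."""
    return (2 * c - 15) ** 2


def codeword_energy(codes):
    """Yield the running sum of a(i) over the most recent WINDOW code words.

    Recursive update: add the newest a, subtract the one leaving the window.
    Until WINDOW code words have arrived, the sum covers a partial window.
    """
    window = deque()
    energy = 0
    for c in codes:
        a = centred_square(c)
        window.append(a)
        energy += a
        if len(window) > WINDOW:
            energy -= window.popleft()
        yield energy


def detect_boundaries(energies, threshold):
    """Return (event, index) pairs, where index is the confirmation instant.

    A "start" is confirmed after ONSET_COUNT consecutive samples above the
    threshold, an "end" after OFFSET_COUNT consecutive samples not above it.
    The counter resets whenever the run is interrupted.
    """
    events = []
    in_speech = False
    run = 0
    for n, e in enumerate(energies):
        above = e > threshold
        if not in_speech:
            run = run + 1 if above else 0
            if run == ONSET_COUNT:
                events.append(("start", n))
                in_speech = True
                run = 0
        else:
            run = run + 1 if not above else 0
            if run == OFFSET_COUNT:
                events.append(("end", n))
                in_speech = False
                run = 0
    return events
```

Calling `detect_boundaries(codeword_energy(codes), threshold)` on a sequence of code words returns the confirmation instants. Each reported index trails the actual threshold crossing by the corresponding count, so a caller wanting the crossing itself would subtract `ONSET_COUNT - 1` or `OFFSET_COUNT - 1`.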
GB12245/75A 1974-03-29 1975-03-24 Determining the boundaries of a speech utterance Expired GB1487291A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US456027A US3909532A (en) 1974-03-29 1974-03-29 Apparatus and method for determining the beginning and the end of a speech utterance

Publications (1)

Publication Number Publication Date
GB1487291A (en) 1977-09-28

Family

ID=23811146

Family Applications (1)

Application Number Title Priority Date Filing Date
GB12245/75A Expired GB1487291A (en) 1974-03-29 1975-03-24 Determining the boundaries of a speech utterance

Country Status (3)

Country Link
US (1) US3909532A (en)
CA (1) CA1036271A (en)
GB (1) GB1487291A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2256997A (en) * 1991-05-31 1992-12-23 Kokusai Electric Co Ltd Voice coding communication system and apparatus

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4032711A (en) * 1975-12-31 1977-06-28 Bell Telephone Laboratories, Incorporated Speaker recognition arrangement
US4351983A (en) * 1979-03-05 1982-09-28 International Business Machines Corp. Speech detector with variable threshold
US4275270A (en) * 1979-11-29 1981-06-23 The Regents Of The University Of California Speech detector for use in an adaptive hybrid circuit
USRE32172E (en) * 1980-12-19 1986-06-03 At&T Bell Laboratories Endpoint detector
US4370521A (en) * 1980-12-19 1983-01-25 Bell Telephone Laboratories, Incorporated Endpoint detector
US4454586A (en) * 1981-11-19 1984-06-12 At&T Bell Laboratories Method and apparatus for generating speech pattern templates
US4587670A (en) * 1982-10-15 1986-05-06 At&T Bell Laboratories Hidden Markov model speech recognition arrangement
USRE33597E (en) * 1982-10-15 1991-05-28 Hidden Markov model speech recognition arrangement
US4704696A (en) * 1984-01-26 1987-11-03 Texas Instruments Incorporated Method and apparatus for voice control of a computer
US4821325A (en) * 1984-11-08 1989-04-11 American Telephone And Telegraph Company, At&T Bell Laboratories Endpoint detector
DE3630518C2 (en) * 1985-09-06 1996-05-02 Ricoh Kk Device for loudly identifying a speech pattern
US4833713A (en) * 1985-09-06 1989-05-23 Ricoh Company, Ltd. Voice recognition system
US4802224A (en) * 1985-09-26 1989-01-31 Nippon Telegraph And Telephone Corporation Reference speech pattern generating method
US4829572A (en) * 1987-11-05 1989-05-09 Andrew Ho Chung Speech recognition system
US4989246A (en) * 1989-03-22 1991-01-29 Industrial Technology Research Institute, R.O.C. Adaptive differential, pulse code modulation sound generator
JPH07281689A (en) * 1994-04-08 1995-10-27 Matsushita Electric Ind Co Ltd Audio signal transmission device
HU229538B1 (en) 1995-12-07 2014-01-28 Koninkl Philips Electronics Nv A method and device for encoding, transferring and decoding a non-pcm bitstream a digital versatile disc device and a multi-channel reproduction apparatus
US6012027A (en) * 1997-05-27 2000-01-04 Ameritech Corporation Criteria for usable repetitions of an utterance during speech reference enrollment
US6003004A (en) 1998-01-08 1999-12-14 Advanced Recognition Technologies, Inc. Speech recognition method and system using compressed speech data
US6480823B1 (en) * 1998-03-24 2002-11-12 Matsushita Electric Industrial Co., Ltd. Speech detection for noisy conditions
ATE489702T1 (en) 2000-01-27 2010-12-15 Nuance Comm Austria Gmbh VOICE DETECTION DEVICE WITH TWO SWITCH-OFF CRITERIA
AU2000278920B2 (en) * 2000-05-17 2006-11-30 Symstream Technology Holdings No.2 Pty Ltd Octave pulse data method and apparatus
US7072828B2 (en) * 2002-05-13 2006-07-04 Avaya Technology Corp. Apparatus and method for improved voice activity detection
GB0414420D0 (en) * 2004-06-28 2004-07-28 Cambridge Silicon Radio Ltd Speech activity detection
EP2277326A4 (en) 2008-04-17 2012-07-18 Cochlear Ltd Sound processor for a medical implant
CN112669880B (en) * 2020-12-16 2023-05-02 北京读我网络技术有限公司 Method and system for adaptively detecting voice ending

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3750024A (en) * 1971-06-16 1973-07-31 Itt Corp Nutley Narrow band digital speech communication system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2256997A (en) * 1991-05-31 1992-12-23 Kokusai Electric Co Ltd Voice coding communication system and apparatus
GB2256997B (en) * 1991-05-31 1995-05-31 Kokusai Electric Co Ltd Voice coding communication system and apparatus

Also Published As

Publication number Publication date
CA1036271A (en) 1978-08-08
US3909532A (en) 1975-09-30
AU7926675A (en) 1976-09-23

Similar Documents

Publication Publication Date Title
GB1487291A (en) Determining the boundaries of a speech utterance
US4449190A (en) Silence editing speech processor
JP2650201B2 (en) How to derive pitch related delay values
US5734789A (en) Voiced, unvoiced or noise modes in a CELP vocoder
US4142066A (en) Suppression of idle channel noise in delta modulation systems
Markel et al. A linear prediction vocoder simulation based upon the autocorrelation method
US20020052734A1 (en) Apparatus and quality enhancement algorithm for mixed excitation linear predictive (MELP) and other speech coders
KR950702730A (en) Transmission system comprising at least a coder
CA2327627A1 (en) Process for processing at least one coded binary audio flux organized into frames
CA1241116A (en) Method of and device for speech signal coding and decoding by vector quantization techniques
GB1012765A (en) Apparatus for the analysis of waveforms
US4379949A (en) Method of and means for variable-rate coding of LPC parameters
KR19980070294A (en) Improved multimodal code-excited linear prediction (CELPL) coder and method
JPH0341838B2 (en)
CA2156558C (en) Speech-coding parameter sequence reconstruction by classification and contour inventory
Mumolo et al. Adaptive predictive coding of speech by means of Volterra predictors
JPS63282795A (en) Multi-pulse voice encoder
FR1301696A (en) Apparatus for encoding pitch-related information in a voice encoder system
CH549849A (en) PROCEDURE FOR DETERMINING THE INTERVAL CORRESPONDING TO THE PERIOD OF THE EXCITATION FREQUENCY OF THE VOICE RANGES.
US5016279A (en) Speech analyzing and synthesizing apparatus using reduced number of codes
US5570455A (en) Method and apparatus for encoding sequences of data
Rheem et al. A nonuniform sampling method of speech signal and its application to speech coding
CN101908888B (en) Dequantization processing method and device
KR0140777B1 (en) Data transformation circuit
Despotovic et al. Low-order volterra long-term predictors

Legal Events

Date Code Title Description
PS Patent sealed [section 19, patents act 1949]
PE20 Patent expired after termination of 20 years

Effective date: 19950323