WO1995005655B1 - Method for recognizing a spoken word in the presence of interfering speech

Info

Publication number
WO1995005655B1
Authority
WO
WIPO (PCT)
Prior art keywords
spoken word
speech
speech recognition
echo
residual signal
Prior art date
Application number
PCT/US1994/009353
Other languages
French (fr)
Other versions
WO1995005655A3 (en)
WO1995005655A2 (en)
Filing date
1994-08-15
Publication date
1995-05-18
Priority claimed from US08/106,072 (US5475791A)
Priority to AU75273/94A (AU687089B2)
Priority to EP94925293A (EP0713597B1)
Priority to DE69424172T (DE69424172T2)
Priority to AT94925293T (ATE192258T1)
Publication of WO1995005655A2
Publication of WO1995005655A3
Publication of WO1995005655B1

Abstract

A method for recognizing a spoken word in the presence of interfering speech begins by echo cancelling the voice prompt and any detected speech signal to produce a residual signal (60). Portions of the residual signal that have been most recently echo-cancelled are then continuously stored in a buffer (62). The energy in the residual signal is also continuously processed to determine onset of the spoken word (64). Upon detection of word onset, the portion of the residual signal then currently in the buffer is retained, the voice prompt is terminated, and the recognizer begins realtime recognition of subsequent portions of the residual signal (66). Upon detection of word completion (68), the method retrieves the portion of the residual signal that was retained in the buffer upon detection of word onset (70) and performs recognition of that portion (72).
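
The buffering and onset-detection steps of the abstract can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the names `frame_energy` and `split_word`, the fixed energy threshold, the buffer length, and the rule that the first quiet frame marks word completion are all assumptions made for the example.

```python
from collections import deque


def frame_energy(frame):
    """Mean squared amplitude of one frame of residual samples."""
    return sum(s * s for s in frame) / len(frame)


def split_word(residual_frames, threshold, buffer_len):
    """Split a residual signal into the frames retained at word onset and
    the frames that follow, up to word completion.

    `threshold` and `buffer_len` are illustrative parameters, not values
    taken from the patent. Returns (retained, realtime); `retained` is
    None if no onset is detected.
    """
    ring = deque(maxlen=buffer_len)  # most recently echo-cancelled frames (62)
    retained = None
    realtime = []
    for frame in residual_frames:
        loud = frame_energy(frame) >= threshold
        if retained is None:
            ring.append(frame)
            if loud:                   # word onset detected (64)
                retained = list(ring)  # keep what is in the buffer now (66)
                # A real system would also terminate the voice prompt here.
        elif loud:
            realtime.append(frame)     # recognized as it arrives (66)
        else:
            break                      # assumed word completion (68)
    return retained, realtime
```

In this sketch the retained frames would be handed to the recognizer only after the later frames have been processed, which corresponds to steps (70) and (72).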

Claims

AMENDED CLAIMS
[received by the International Bureau on 30 March 1995 (30.03.95); original claim 5 cancelled; remaining claims amended (4 pages)]
1. A method for recognizing a spoken word in the presence of a voice message generated by a voice processing system, the voice processing system having a speech recognizer, comprising the steps of:
(a) echo cancelling the voice message and any detected speech signal to produce a residual signal;
(b) continuously storing a portion of the residual signal that has been most recently processed;
(c) processing the residual signal to detect a first portion of the spoken word;
(d) upon detection of the first portion of the spoken word, storing the portion of the residual signal including the first portion of the spoken word that has been most recently processed at the time of such detection, stopping echo cancelling of the voice message and any detected speech signal and initiating speech recognition of a second portion of the spoken word;
(e) thereafter initiating speech recognition of the stored first portion of the spoken word; and
(f) combining results of the speech recognition effected in steps (c) and (d) to determine the spoken word.
2. The method as described in Claim 1 further including the step of ceasing the voice message upon detection of the first portion of the spoken word.
3. The method as described in Claim 1 further including the step of detecting completion of the spoken word prior to initiating speech recognition of the stored first portion of the spoken word.
4. The method as described in Claim 1 wherein the recognition of the second portion of the spoken word occurs.
6. The method as described in Claim 1 wherein the step of echo cancelling further includes the steps of estimating an energy level in the residual signal and comparing the estimated energy level to a predetermined threshold energy level.
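
Claim 6 names an energy estimate and a threshold comparison but does not say how the energy is estimated. One possible reading, sketched here in Python, uses an exponentially smoothed squared amplitude. The function name `detect_onset`, the smoothing factor `alpha`, and the threshold values are assumptions chosen for illustration only.

```python
def detect_onset(residual, threshold, alpha=0.1):
    """Return the index of the first sample at which the smoothed energy
    of `residual` reaches `threshold`, or None if it never does.

    The exponential smoothing is an assumed estimator; the claim only
    requires some energy estimate compared against a preset threshold.
    """
    energy = 0.0
    for i, sample in enumerate(residual):
        # Blend the previous estimate with the new squared sample.
        energy = (1.0 - alpha) * energy + alpha * sample * sample
        if energy >= threshold:
            return i
    return None
```

A smaller `alpha` makes the estimate slower to react, which delays detection but reduces false triggers on short bursts of uncancelled echo.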
7. A method for recognizing a spoken word in the presence of a voice message generated by a voice processing system, the voice processing system having a speech recognizer, comprising the steps of:
(a) echo cancelling the voice message and any detected speech signal to produce a residual signal;
(b) continuously storing a portion of the residual signal that has been most recently echo cancelled;
(c) processing the residual signal to detect a first portion of the spoken word;
(d) upon detection of the first portion of the spoken word, retaining the stored portion of the residual signal including the first portion of the spoken word that has been most recently processed at the time of such detection, ceasing the voice message, stopping echo cancelling of the voice message and any detected speech signal and initiating speech recognition of spoken word;
(e) thereafter initiating speech recognition of the first portion of the spoken word retained upon detection of the first portion of the spoken word; and
(f) combining results of the recognition effected in steps (d) and (e) to determine the spoken word.
8. The method as described in Claim 7 further including the step of detecting completion of the spoken word prior to initiating speech recognition of the retained first portion of the spoken word.
9. The method as described in Claim 7 wherein the recognition of the second portion of the spoken word occurs in realtime.
10. A method, using a single digital signal processor, for recognizing a spoken word in the presence of interfering speech, comprising the steps of:
(a) echo cancelling the interfering speech and any detected speech signal with the single digital signal processor to produce a residual signal;
(b) continuously storing a portion of the residual signal that has been most recently echo cancelled;
(c) processing the residual signal to detect a first portion of the spoken word;
(d) upon detection of the first portion of the spoken word, retaining the portion of the residual signal including the first portion of the spoken word that has been most recently processed at the time of such detection, ceasing the interfering speech and switching the single digital signal processor from echo cancelling of the interfering speech to speech recognition of a second portion of the spoken word;
(e) detecting completion of the spoken word;
(f) upon detection of completion of the spoken word, initiating speech recognition of the retained first portion of the spoken word; and
(g) combining results of the speech recognition effected in steps (d) and (f) to determine the spoken word.
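
The switching of a single processor between tasks in Claim 10 can be pictured as a small state machine in which the processor is in exactly one mode at a time. The Python sketch below is an illustration under assumptions: the mode names, the event labels (`"onset"`, `"completion"`, `"retained_done"`), and the helper functions are hypothetical and do not come from the claims.

```python
from enum import Enum, auto


class Mode(Enum):
    ECHO_CANCEL = auto()         # steps (a)-(c): cancel, buffer, watch for onset
    RECOGNIZE_LIVE = auto()      # step (d): recognize the second portion
    RECOGNIZE_RETAINED = auto()  # step (f): recognize the retained first portion
    DONE = auto()                # step (g): results combined


# Event names are hypothetical labels for the detections named in the claim.
TRANSITIONS = {
    (Mode.ECHO_CANCEL, "onset"): Mode.RECOGNIZE_LIVE,
    (Mode.RECOGNIZE_LIVE, "completion"): Mode.RECOGNIZE_RETAINED,
    (Mode.RECOGNIZE_RETAINED, "retained_done"): Mode.DONE,
}


def next_mode(mode, event):
    """Return the next mode; events that do not apply leave the mode unchanged."""
    return TRANSITIONS.get((mode, event), mode)


def run(events, mode=Mode.ECHO_CANCEL):
    """Feed a sequence of events through the state machine and return the trace."""
    trace = [mode]
    for event in events:
        mode = next_mode(mode, event)
        trace.append(mode)
    return trace
```

Because no transition ever pairs echo cancelling with recognition, the sketch reflects the point of the claim: the two tasks never compete for the processor at the same moment.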
STATEMENT UNDER ARTICLE 19
In response to the Notification of Transmittal of The International Search Report Or The Declaration, the Applicant has amended Claims 1-4 and 6-10 and cancelled Claim 5. Thus Claims 1-4 and 6-10 are now pending in the Application. The Applicant has adopted the language of the Abstract as suggested by the Examiner.
The Examiner rejected Claims 1-10 as being unpatentable over Hartwell, Johnson and Noso. The Claims have been amended to more particularly point out how, upon detection of a first portion of a spoken word, the portion of the residual signal including the first portion of the spoken word is retained, echo cancellation is stopped and speech recognition of a second portion of the spoken word is initiated. After the second portion of the spoken word is subjected to speech recognition, the stored first portion of the spoken word is then subjected to speech recognition. The first and second portions are finally combined to determine the spoken word. Neither the Hartwell nor the Johnson reference describes a method which first detects a portion of a spoken word within interfering speech and then stops echo cancelling to initiate speech recognition of a second portion of the detected word. This procedure enables an echo canceller and voice recognizer to be utilized with fewer system requirements than would otherwise be necessary.
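
Neither the claims nor this statement specify how the echo cancelling itself is performed. As an illustration only, the Python sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, a common technique that is assumed here and not attributed to the Applicant. The function name `nlms_echo_cancel` and the parameters `taps`, `mu`, and `eps` are hypothetical.

```python
def nlms_echo_cancel(prompt, line, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the prompt echo from `line`.

    `prompt` is the outgoing voice message and `line` is the incoming
    signal containing its echo. Returns the residual signal, one sample
    per input sample. NLMS is an assumed choice of algorithm.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent prompt samples, newest first
    residual = []
    for x, d in zip(prompt, line):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        e = d - estimate    # residual: incoming signal minus estimated echo
        norm = sum(h * h for h in history) + eps
        # Move the weights toward the echo path, scaled by input power.
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual
```

Once the filter has adapted, the residual contains little of the prompt, so a rise in residual energy is more likely to indicate the caller's speech than leftover echo.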
The Hartwell reference discloses an apparatus which first echo cancels throughout a received spoken word for a predetermined period of time. After the predetermined time period expires, the echo cancelled spoken signal is subjected to speech recognition. Thus, the reference does not disclose stopping echo cancellation to initiate speech recognition upon detection of a spoken word; it discloses continuous echo cancellation of the signal throughout a predetermined time period that includes the duration of the spoken word, followed by speech recognition of the echo cancelled signal including the spoken word. Additionally, the Hartwell, et al. reference does not disclose speech recognition of a second portion of a detected spoken word and then going back to recognize the first portion of the spoken word to determine the complete spoken word.
The Johnson, et al. reference describes a system wherein the cancellation procedure is begun after the initiation of a prompt or announcement and continues throughout the playing of the announcement. Even when a speech signal is detected within the echo cancelled incoming signal, the echo cancellation continues as the received signal is recorded. In the present invention, once the speech signal is detected, echo cancellation ceases and speech recognition begins. Furthermore, the Johnson, et al. reference describes the recording of an incoming speech signal, and not the voice recognition thereof as claimed by the Applicant.
The Applicant respectfully submits that the claims are allowable over the Noso, et al. reference for reasons similar to those discussed with respect to the Hartwell and Johnson references.
Furthermore, the Applicant respectfully submits that the Noso, et al. reference fails to disclose the use of a cancellation/recognition system wherein echo cancellation is stopped and speech recognition initiated upon the detection of a spoken word within a residual signal.
Submitted concurrently herewith are substitute pages 13-17 for pages 13-16 originally submitted with the Application. Upon review of the amended claims, it will be evident that the Claims are now fully patentable over the prior art.
PCT/US1994/009353 1993-08-13 1994-08-15 Method for recognizing a spoken word in the presence of interfering speech WO1995005655A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU75273/94A AU687089B2 (en) 1993-08-13 1994-08-15 Method for recognizing a spoken word in the presence of interfering speech
EP94925293A EP0713597B1 (en) 1993-08-13 1994-08-15 Method for recognizing a spoken word in the presence of interfering speech
DE69424172T DE69424172T2 (en) 1993-08-13 1994-08-15 METHOD FOR RECOGNIZING A SPOKEN WORD IN THE PRESENCE OF DISTURBING LANGUAGE
AT94925293T ATE192258T1 (en) 1993-08-13 1994-08-15 METHOD FOR RECOGNIZING A SPOKEN WORD IN THE PRESENCE OF DISRUPTIVE LANGUAGE

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/106,072 1993-08-13
US08/106,072 US5475791A (en) 1993-08-13 1993-08-13 Method for recognizing a spoken word in the presence of interfering speech

Publications (3)

Publication Number Publication Date
WO1995005655A2 (en) 1995-02-23
WO1995005655A3 (en) 1995-03-23
WO1995005655B1 (en) 1995-05-18

Family

ID=22309326

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1994/009353 WO1995005655A2 (en) 1993-08-13 1994-08-15 Method for recognizing a spoken word in the presence of interfering speech

Country Status (8)

Country Link
US (1) US5475791A (en)
EP (1) EP0713597B1 (en)
AT (1) ATE192258T1 (en)
AU (1) AU687089B2 (en)
CA (1) CA2169447A1 (en)
DE (1) DE69424172T2 (en)
ES (1) ES2145148T3 (en)
WO (1) WO1995005655A2 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5852695A (en) * 1981-09-25 1983-03-28 日産自動車株式会社 Voice detector for vehicle
US4645883A (en) * 1984-05-09 1987-02-24 Communications Satellite Corporation Double talk and line noise detector for a echo canceller
US4914692A (en) * 1987-12-29 1990-04-03 At&T Bell Laboratories Automatic speech recognition using echo cancellation
US5125024A (en) * 1990-03-28 1992-06-23 At&T Bell Laboratories Voice response unit
US5155760A (en) * 1991-06-26 1992-10-13 At&T Bell Laboratories Voice messaging system with voice activated prompt interrupt
