WO1995005655B1 - Method for recognizing a spoken word in the presence of interfering speech
- Publication number
- WO1995005655B1 (PCT/US1994/009353)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- spoken word
- speech
- speech recognition
- echo
- residual signal
Abstract
A method for recognizing a spoken word in the presence of interfering speech begins by echo cancelling the voice prompt and any detected speech signal to produce a residual signal (60). Portions of the residual signal that have been most recently echo-cancelled are then continuously stored in a buffer (62). The energy in the residual signal is also continuously processed to determine onset of the spoken word (64). Upon detection of word onset, the portion of the residual signal then currently in the buffer is retained, the voice prompt is terminated, and the recognizer begins realtime recognition of subsequent portions of the residual signal (66). Upon detection of word completion (68), the method retrieves the portion of the residual signal that was retained in the buffer upon detection of word onset (70) and performs recognition of that portion (72).
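The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the frame format, the threshold value, the buffer length, and the names `recognize_with_barge_in`, `recognize` and `stop_prompt` are all assumptions introduced here, and the recognizer is a stand-in callable rather than a real speech recognizer.

```python
from collections import deque


def frame_energy(frame):
    """Mean squared amplitude of one frame of residual samples."""
    return sum(s * s for s in frame) / len(frame)


def recognize_with_barge_in(residual_frames, recognize, stop_prompt=lambda: None,
                            buffer_frames=3, threshold=0.01):
    """Hypothetical sketch of steps 62-72 of the abstract.

    residual_frames: iterable of frames (lists of floats), already echo cancelled.
    recognize: placeholder recognizer mapping a list of frames to a string.
    stop_prompt: placeholder callback that terminates the voice prompt.
    Returns the combined result, or None if no word onset was detected.
    """
    ring = deque(maxlen=buffer_frames)   # step 62: keep only the newest residual
    retained = None                      # frozen copy of the buffer at word onset
    live_results = []                    # results for the later part of the word

    for frame in residual_frames:
        energy = frame_energy(frame)
        if retained is None:
            ring.append(frame)
            if energy >= threshold:      # step 64: word onset detected
                retained = list(ring)    # step 66: retain the buffer contents
                stop_prompt()            # step 66: terminate the voice prompt
        elif energy < threshold:         # step 68: word completion detected
            break
        else:
            live_results.append(recognize([frame]))  # step 66: real-time pass

    if retained is None:
        return None
    first_part = recognize(retained)     # steps 70-72: deferred pass on the buffer
    return first_part + "".join(live_results)


# Toy usage: the "recognizer" just labels each frame as speech (S) or silence (-).
quiet, loud = [0.0] * 4, [0.5] * 4
toy = lambda frames: "".join("S" if frame_energy(f) >= 0.01 else "-" for f in frames)
print(recognize_with_barge_in([quiet] * 4 + [loud] * 3 + [quiet], toy))  # prints "--SSS"
```

The point of the retained buffer is visible in the toy output: the first loud frame, which triggered onset detection before the recognizer was running, still contributes to the final result because it is recognized afterwards and prepended.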
Claims
AMENDED CLAIMS
[received by the International Bureau on 30 March 1995 (30.03.95); original claim 5 cancelled; remaining claims amended (4 pages)]
1. A method for recognizing a spoken word in the presence of a voice message generated by a voice processing system, the voice processing system having a speech recognizer, comprising the steps of:
(a) echo cancelling the voice message and any detected speech signal to produce a residual signal;
(b) continuously storing a portion of the residual signal that has been most recently processed;
(c) processing the residual signal to detect a first portion of the spoken word;
(d) upon detection of the first portion of the spoken word, storing the portion of the residual signal including the first portion of the spoken word that has been most recently processed at the time of such detection, stopping echo cancelling of the voice message and any detected speech signal and initiating speech recognition of a second portion of the spoken word;
(e) thereafter initiating speech recognition of the stored first portion of the spoken word; and
(f) combining results of the speech recognition effected in steps (c) and (d) to determine the spoken word.
2. The method as described in Claim 1 further including the step of ceasing the voice message upon detection of the first portion of the spoken word.
3. The method as described in Claim 1 further including the step of detecting completion of the spoken word prior to initiating speech recognition of the stored first portion of the spoken word.
4. The method as described in Claim 1 wherein the recognition of the second portion of the spoken word occurs.
6. The method as described in Claim 1 wherein the step of echo cancelling further includes the steps of estimating an energy level in the residual signal and comparing the estimated energy level to a predetermined threshold energy level.
7. A method for recognizing a spoken word in the presence of a voice message generated by a voice processing system, the voice processing system having a speech recognizer, comprising the steps of:
(a) echo cancelling the voice message and any detected speech signal to produce a residual signal;
(b) continuously storing a portion of the residual signal that has been most recently echo cancelled;
(c) processing the residual signal to detect a first portion of the spoken word;
(d) upon detection of the first portion of the spoken word, retaining the stored portion of the residual signal including the first portion of the spoken word that has been most recently processed at the time of such detection, ceasing the voice message, stopping echo cancelling of the voice message and any detected speech signal and initiating speech recognition of spoken word;
(e) thereafter initiating speech recognition of the first portion of the spoken word retained upon detection of the first portion of the spoken word; and
(f) combining results of the recognition effected in steps (d) and (e) to determine the spoken word.
8. The method as described in Claim 7 further including the step of detecting completion of the spoken word prior to initiating speech recognition of the retained first portion of the spoken word.
9. The method as described in Claim 7 wherein the recognition of the second portion of the spoken word occurs in realtime.
10. A method, using a single digital signal processor, for recognizing a spoken word in the presence of interfering speech, comprising the steps of:
(a) echo cancelling the interfering speech and any detected speech signal with the single digital signal processor to produce a residual signal;
(b) continuously storing a portion of the residual signal that has been most recently echo cancelled;
(c) processing the residual signal to detect a first portion of the spoken word;
(d) upon detection of the first portion of the spoken word, retaining the portion of the residual signal including the first portion of the spoken word that has been most recently processed at the time of such detection, ceasing the interfering speech and switching the single digital signal processor from echo cancelling of the interfering speech to speech recognition of a second portion of the spoken word;
(e) detecting completion of the spoken word;
(f) upon detection of completion of the spoken word, initiating speech recognition of the retained first portion of the spoken word; and
(g) combining results of the speech recognition effected in steps (d) and (f) to determine the spoken word.
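The claims do not say how the echo canceller or the energy estimate of Claim 6 is built. As a rough illustration only, the sketch below assumes a normalized LMS (NLMS) adaptive filter, a common echo-cancelling technique that the claims do not name, and a mean-square energy estimate compared against a fixed threshold. The function names, tap count, step size and threshold values are all hypothetical.

```python
def nlms_residual(reference, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `reference` from `mic`.

    reference: samples of the outgoing voice message (the echo source).
    mic: samples of the incoming line signal (echo plus any caller speech).
    Returns the residual signal, one sample per input sample.
    """
    weights = [0.0] * taps
    history = [0.0] * taps               # recent reference samples, newest first
    residual = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        e = d - estimate                 # residual after echo cancellation
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def energy_exceeds(samples, threshold):
    """Estimate mean-square energy and compare it to a preset threshold."""
    energy = sum(s * s for s in samples) / len(samples)
    return energy > threshold


# Toy usage: the line carries only a two-tap echo of the prompt, so once the
# filter has adapted the residual energy should stay under a small threshold.
prompt = [((7 * n) % 13 - 6) / 6.0 for n in range(500)]
line = [0.6 * prompt[n] + (0.3 * prompt[n - 1] if n else 0.0) for n in range(500)]
print(energy_exceeds(nlms_residual(prompt, line)[-50:], 1e-4))
```

In a barge-in setting the threshold comparison on the residual is what would flag the caller's speech: energy that the filter cannot explain as echo of the prompt remains in the residual.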
STATEMENT UNDER ARTICLE 19
In response to the Notification of Transmittal of The International Search Report Or The Declaration, the Applicant has amended Claims 1-4 and 6-10 and cancelled Claim 5. Thus Claims 1-4 and 6-10 are now pending in the Application. The Applicant has adopted the language of the Abstract as suggested by the Examiner.
The Examiner rejected Claims 1-10 as being unpatentable over Hartwell, Johnson and Noso. The Claims have been amended to more particularly point out how upon detection of a first portion of a spoken word the portion of the residual signal including the first portion of a spoken word is retained, echo cancellation is stopped and speech recognition of a second portion of the spoken word is initiated. After the second portion of the spoken word is subjected to speech recognition, the stored first portion of the spoken word is then subjected to speech recognition. The first and second portions are finally combined to determine the spoken word. Neither the Hartwell nor the Johnson reference describes a method which first detects a portion of a spoken word within interfering speech and then stops echo cancelling to initiate speech recognition of a second portion of the detected word. This procedure enables an echo canceller and voice recognizer to be utilized with fewer system requirements than would otherwise be necessary.
The Hartwell reference discloses an apparatus which first echo cancels throughout a received spoken word for a predetermined period of time. After the predetermined time period expires, the echo cancelled spoken signal is subjected to speech recognition. Thus, the reference does not disclose stopping echo cancellation to initiate speech recognition upon detection of a spoken word, but rather the continuous echo cancellation of the signal throughout a predetermined time period including the duration of the spoken word and then the speech recognition of the echo cancelled signal including the spoken word. Additionally, the Hartwell, et al. reference does not disclose speech recognition of a second portion of a detected spoken word and then going back to speech recognize the first portion of the spoken word to determine the complete spoken word.
The Johnson, et al. reference describes a system wherein the cancellation procedure is begun after the initiation of a prompt or announcement and continues throughout the playing of the announcement. Even when a speech signal is detected within the echo cancelled incoming signal, the echo cancellation continues as the received signal is recorded. In the present invention, once the speech signal is detected, echo cancellation ceases and speech recognition begins. Furthermore, the Johnson, et al. reference describes the recording of the incoming speech signal, and not the voice recognition thereof as claimed by the Applicant.
The Applicant respectfully submits that the claims are allowable over the Noso, et al. reference for reasons similar to those discussed with respect to the Hartwell and Johnson references.
Furthermore, the Applicant respectfully submits that the Noso, et al. reference fails to disclose the use of a cancellation/recognition system wherein echo cancellation is stopped and speech recognition initiated upon the detection of a spoken word within a residual signal.
Submitted concurrently herewith are substitute pages 13-17 for pages 13-16 originally submitted with the Application. Upon review of the amended claims, it will be evident that the Claims are now fully patentable over the prior art.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU75273/94A AU687089B2 (en) | 1993-08-13 | 1994-08-15 | Method for recognizing a spoken word in the presence of interfering speech |
EP94925293A EP0713597B1 (en) | 1993-08-13 | 1994-08-15 | Method for recognizing a spoken word in the presence of interfering speech |
DE69424172T DE69424172T2 (en) | 1993-08-13 | 1994-08-15 | METHOD FOR RECOGNIZING A SPOKEN WORD IN THE PRESENCE OF DISTURBING LANGUAGE |
AT94925293T ATE192258T1 (en) | 1993-08-13 | 1994-08-15 | METHOD FOR RECOGNIZING A SPOKEN WORD IN THE PRESENCE OF DISRUPTIVE LANGUAGE |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/106,072 US5475791A (en) | 1993-08-13 | 1993-08-13 | Method for recognizing a spoken word in the presence of interfering speech |
Publications (3)
Publication Number | Publication Date |
---|---|
WO1995005655A2 (en) | 1995-02-23 |
WO1995005655A3 (en) | 1995-03-23 |
WO1995005655B1 (en) | 1995-05-18 |
Family
ID=22309326
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US1994/009353 WO1995005655A2 (en) | 1993-08-13 | 1994-08-15 | Method for recognizing a spoken word in the presence of interfering speech |
Country Status (8)
Country | Link |
---|---|
US (1) | US5475791A (en) |
EP (1) | EP0713597B1 (en) |
AT (1) | ATE192258T1 (en) |
AU (1) | AU687089B2 (en) |
CA (1) | CA2169447A1 (en) |
DE (1) | DE69424172T2 (en) |
ES (1) | ES2145148T3 (en) |
WO (1) | WO1995005655A2 (en) |
Families Citing this family (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2212658C (en) * | 1995-02-15 | 2002-01-22 | British Telecommunications Public Limited Company | Voice activity detection using echo return loss to adapt the detection threshold |
DE19533541C1 (en) * | 1995-09-11 | 1997-03-27 | Daimler Benz Aerospace Ag | Method for the automatic control of one or more devices by voice commands or by voice dialog in real time and device for executing the method |
JP2921472B2 (en) * | 1996-03-15 | 1999-07-19 | 日本電気株式会社 | Voice and noise elimination device, voice recognition device |
US5765130A (en) * | 1996-05-21 | 1998-06-09 | Applied Language Technologies, Inc. | Method and apparatus for facilitating speech barge-in in connection with voice recognition systems |
JP3998724B2 (en) * | 1996-11-28 | 2007-10-31 | ブリティッシュ・テレコミュニケーションズ・パブリック・リミテッド・カンパニー | Interactive device |
US5953411A (en) * | 1996-12-18 | 1999-09-14 | Intel Corporation | Method and apparatus for maintaining audio sample correlation |
US5848130A (en) * | 1996-12-31 | 1998-12-08 | At&T Corp | System and method for enhanced intelligibility of voice messages |
US6775264B1 (en) | 1997-03-03 | 2004-08-10 | Webley Systems, Inc. | Computer, internet and telecommunications based network |
JPH10257583A (en) * | 1997-03-06 | 1998-09-25 | Asahi Chem Ind Co Ltd | Voice processing unit and its voice processing method |
GB2325110B (en) | 1997-05-06 | 2002-10-16 | Ibm | Voice processing system |
GB2325112B (en) | 1997-05-06 | 2002-07-31 | Ibm | Voice processing system |
DE19722784C1 (en) * | 1997-05-30 | 1999-01-14 | Deutsche Telekom Ag | Method and arrangement for a voice-controlled communication terminal with acoustic operator guidance |
WO1999018566A2 (en) * | 1997-10-07 | 1999-04-15 | Koninklijke Philips Electronics N.V. | A method and device for activating a voice-controlled function in a multi-station network through using both speaker-dependent and speaker-independent speech recognition |
US7274928B2 (en) | 1998-10-02 | 2007-09-25 | Telespree Communications | Portable cellular phone system having automatic initialization |
US6167251A (en) * | 1998-10-02 | 2000-12-26 | Telespree Communications | Keyless portable cellular phone system having remote voice recognition |
US6665645B1 (en) * | 1999-07-28 | 2003-12-16 | Matsushita Electric Industrial Co., Ltd. | Speech recognition apparatus for AV equipment |
US7117149B1 (en) * | 1999-08-30 | 2006-10-03 | Harman Becker Automotive Systems-Wavemakers, Inc. | Sound source classification |
US6868385B1 (en) * | 1999-10-05 | 2005-03-15 | Yomobile, Inc. | Method and apparatus for the provision of information signals based upon speech recognition |
US6963759B1 (en) | 1999-10-05 | 2005-11-08 | Fastmobile, Inc. | Speech recognition technique based on local interrupt detection |
US6937977B2 (en) * | 1999-10-05 | 2005-08-30 | Fastmobile, Inc. | Method and apparatus for processing an input speech signal during presentation of an output audio signal |
GB9928011D0 (en) * | 1999-11-27 | 2000-01-26 | Ibm | Voice processing system |
US7516190B2 (en) | 2000-02-04 | 2009-04-07 | Parus Holdings, Inc. | Personal voice-based information retrieval system |
US6721705B2 (en) | 2000-02-04 | 2004-04-13 | Webley Systems, Inc. | Robust voice browser system and voice activated device controller |
US6744885B1 (en) * | 2000-02-24 | 2004-06-01 | Lucent Technologies Inc. | ASR talkoff suppressor |
WO2001075555A2 (en) * | 2000-03-06 | 2001-10-11 | Conita Technologies, Inc. | Personal virtual assistant |
AU2001286450A1 (en) * | 2000-08-12 | 2002-02-25 | Georgia Tech Research Corporation | A system and method for capturing an image |
US6725193B1 (en) * | 2000-09-13 | 2004-04-20 | Telefonaktiebolaget Lm Ericsson | Cancellation of loudspeaker words in speech recognition |
US20020173333A1 (en) * | 2001-05-18 | 2002-11-21 | Buchholz Dale R. | Method and apparatus for processing barge-in requests |
DE10158583A1 (en) * | 2001-11-29 | 2003-06-12 | Philips Intellectual Property | Procedure for operating a barge-in dialog system |
US7328159B2 (en) * | 2002-01-15 | 2008-02-05 | Qualcomm Inc. | Interactive speech recognition apparatus and method with conditioned voice prompts |
US8046581B2 (en) | 2002-03-04 | 2011-10-25 | Telespree Communications | Method and apparatus for secure immediate wireless access in a telecommunications network |
US7197301B2 (en) | 2002-03-04 | 2007-03-27 | Telespree Communications | Method and apparatus for secure immediate wireless access in a telecommunications network |
US20030229491A1 (en) * | 2002-06-06 | 2003-12-11 | International Business Machines Corporation | Single sound fragment processing |
JP3727927B2 (en) * | 2003-02-10 | 2005-12-21 | 株式会社東芝 | Speaker verification device |
US7496387B2 (en) * | 2003-09-25 | 2009-02-24 | Vocollect, Inc. | Wireless headset for use in speech recognition environment |
US20050071158A1 (en) * | 2003-09-25 | 2005-03-31 | Vocollect, Inc. | Apparatus and method for detecting user speech |
US20060146652A1 (en) * | 2005-01-03 | 2006-07-06 | Sdi Technologies, Inc. | Sunset timer |
JP4630876B2 (en) * | 2005-01-18 | 2011-02-09 | 富士通株式会社 | Speech speed conversion method and speech speed converter |
US20070055514A1 (en) * | 2005-09-08 | 2007-03-08 | Beattie Valerie L | Intelligent tutoring feedback |
US8417185B2 (en) | 2005-12-16 | 2013-04-09 | Vocollect, Inc. | Wireless headset and method for robust voice data communication |
US7885419B2 (en) * | 2006-02-06 | 2011-02-08 | Vocollect, Inc. | Headset terminal with speech functionality |
US7773767B2 (en) | 2006-02-06 | 2010-08-10 | Vocollect, Inc. | Headset terminal with rear stability strap |
US8046221B2 (en) * | 2007-10-31 | 2011-10-25 | At&T Intellectual Property Ii, L.P. | Multi-state barge-in models for spoken dialog systems |
EP2107553B1 (en) * | 2008-03-31 | 2011-05-18 | Harman Becker Automotive Systems GmbH | Method for determining barge-in |
EP2148325B1 (en) * | 2008-07-22 | 2014-10-01 | Nuance Communications, Inc. | Method for determining the presence of a wanted signal component |
USD605629S1 (en) | 2008-09-29 | 2009-12-08 | Vocollect, Inc. | Headset |
US8160287B2 (en) | 2009-05-22 | 2012-04-17 | Vocollect, Inc. | Headset with adjustable headband |
US8438659B2 (en) | 2009-11-05 | 2013-05-07 | Vocollect, Inc. | Portable computing device and headset interface |
WO2013187932A1 (en) | 2012-06-10 | 2013-12-19 | Nuance Communications, Inc. | Noise dependent signal processing for in-car communication systems with multiple acoustic zones |
DE112012006876B4 (en) | 2012-09-04 | 2021-06-10 | Cerence Operating Company | Method and speech signal processing system for formant-dependent speech signal amplification |
WO2014070139A2 (en) | 2012-10-30 | 2014-05-08 | Nuance Communications, Inc. | Speech enhancement |
CN109903758B (en) | 2017-12-08 | 2023-06-23 | 阿里巴巴集团控股有限公司 | Audio processing method and device and terminal equipment |
CN111048096B (en) * | 2019-12-24 | 2022-07-26 | 大众问问(北京)信息科技有限公司 | Voice signal processing method and device and terminal |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5852695A (en) * | 1981-09-25 | 1983-03-28 | 日産自動車株式会社 | Voice detector for vehicle |
US4645883A (en) * | 1984-05-09 | 1987-02-24 | Communications Satellite Corporation | Double talk and line noise detector for a echo canceller |
US4914692A (en) * | 1987-12-29 | 1990-04-03 | At&T Bell Laboratories | Automatic speech recognition using echo cancellation |
US5125024A (en) * | 1990-03-28 | 1992-06-23 | At&T Bell Laboratories | Voice response unit |
US5155760A (en) * | 1991-06-26 | 1992-10-13 | At&T Bell Laboratories | Voice messaging system with voice activated prompt interrupt |
- 1993
  - 1993-08-13 US US08/106,072 (US5475791A): not active, Expired - Lifetime
- 1994
  - 1994-08-15 WO PCT/US1994/009353 (WO1995005655A2): active, IP Right Grant
  - 1994-08-15 AU AU75273/94A (AU687089B2): not active, Ceased
  - 1994-08-15 DE DE69424172T (DE69424172T2): not active, Expired - Lifetime
  - 1994-08-15 AT AT94925293T (ATE192258T1): not active, IP Right Cessation
  - 1994-08-15 CA CA002169447A (CA2169447A1): not active, Abandoned
  - 1994-08-15 ES ES94925293T (ES2145148T3): not active, Expired - Lifetime
  - 1994-08-15 EP EP94925293A (EP0713597B1): not active, Expired - Lifetime
Similar Documents
Publication | Title |
---|---|
WO1995005655B1 (en) | Method for recognizing a spoken word in the presence of interfering speech |
JP4098842B2 (en) | Prompt interrupt system with voice activated prompt interrupt function and adjustable echo cancellation method |
US6785365B2 (en) | Method and apparatus for facilitating speech barge-in in connection with voice recognition systems |
EP0713597B1 (en) | Method for recognizing a spoken word in the presence of interfering speech |
JP2538176B2 (en) | Eco-control device |
AU707896B2 (en) | Voice activity detection |
EP0901267B1 (en) | The detection of the speech activity of a source |
US4914692A (en) | Automatic speech recognition using echo cancellation |
CA2358044C (en) | Method for handling far-end speech effects in hands-free telephony systems on acoustic beamforming |
US6449361B1 (en) | Control method and device for echo canceller |
CA2018836A1 (en) | Training method for an echo canceller for use in a voice conference system |
US6606595B1 (en) | HMM-based echo model for noise cancellation avoiding the problem of false triggers |
EP0806759B1 (en) | Canceler of speech and noise, and speech recognition apparatus |
US7085715B2 (en) | Method and apparatus of controlling noise level calculations in a conferencing system |
JP2002041073A (en) | Speech recognition device |
KR100194765B1 (en) | Speech recognition system using echo cancellation and method |
Heitkamper et al. | Adaptive gain control and echo cancellation for hands-free telephone systems |
JP3357284B2 (en) | Double talk detection control device and double talk detection control method |
JPH07264103A (en) | Method and device for detecting superimposed voice and voice input and output device using the detector |
JPH09149100A (en) | Telephone set |
JPH036712B2 (en) |