EP0974205A1 - Echo reducing phone with state machine controlled switches

Echo reducing phone with state machine controlled switches

Info

Publication number
EP0974205A1
EP0974205A1 EP98909895A EP98909895A EP0974205A1 EP 0974205 A1 EP0974205 A1 EP 0974205A1 EP 98909895 A EP98909895 A EP 98909895A EP 98909895 A EP98909895 A EP 98909895A EP 0974205 A1 EP0974205 A1 EP 0974205A1
Authority
EP
European Patent Office
Prior art keywords
state machine
signal
microphone
flag
passed
Prior art date
Legal status
Withdrawn
Application number
EP98909895A
Other languages
German (de)
French (fr)
Inventor
Johan Gnosspelius
Current Assignee
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M9/00: Arrangements for interconnection not involving centralised switching
    • H04M9/08: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Telephone Function (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Interconnected Communication Systems, Intercoms, And Interphones (AREA)
  • Selective Calling Equipment (AREA)

Abstract

The purpose of the present invention is to reduce the echo introduced by cross-talk. The problem of how to reduce this echo is solved by introducing, at the microphone and at the speaker, switches controlled by a state machine which takes as input the signal energy of the signal from the microphone, a VAD flag of the signal from the microphone, the signal energy of the signal to the speaker and a VAD flag of the signal to the speaker.

Description

ECHO REDUCING PHONE WITH STATE MACHINE CONTROLLED SWITCHES
TECHNICAL FIELD OF INVENTION
The present invention relates to telecommunication in general and to speech processing for voice communication over the Internet in particular.
DESCRIPTION OF RELATED ART
A typical Internet phone uses a PC with a sound board, a microphone and two loudspeakers. The microphone and the loudspeakers are often placed next to each other on the desk. Such a configuration causes a considerable amount of cross-talk, heard as an echo at the receiver end. This echo must be suppressed to make the Internet phone usable.
In GSM it is known to use VAD (Voice Activity Detection) to detect whether or not a user of a mobile phone is talking. This information is used to decrease the bandwidth when transmitting the voice. In discontinuous speech coding according to the VOX principle (Voice Operated Transmission) a VAD unit is responsible for detecting whether or not a received sound sequence represents human speech. The VAD unit can take two different states, where a first state indicates that the sound sequence was human voice and the other state indicates that the sound sequence was not human voice.
If the VAD unit detects that a given sound sequence represents human voice, the VAD unit will issue a first state signal to a speech coding unit, which will encode the sound sequence in a speech frame. If, on the other hand, a given sound sequence represents something other than human speech, the VAD unit will issue a second state signal to a SID (Silence Descriptor) unit. Said SID unit will deliver a SID frame every Nth frame. During the remaining N-1 possible occasions to send frames nothing will be sent. A SID frame comprises information about estimated background noise and estimated noise spectra on the sending side. With this procedure battery power and radio bandwidth can be saved.
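The frame scheduling described above can be illustrated with a minimal sketch. The function name `schedule_frames`, the parameter `sid_period` (standing in for N) and the frame labels are hypothetical and not taken from the GSM specifications. The text does not say which of the N occasions carries the SID frame; the sketch assumes it is the first frame of each silent period.

```python
from typing import Iterable, List


def schedule_frames(vad_flags: Iterable[bool], sid_period: int) -> List[str]:
    """Return the kind of frame sent for each input frame.

    "SPEECH" is sent whenever the VAD flag is set. During non-speech,
    one "SID" frame is sent per sid_period frames and "NONE" otherwise.
    """
    if sid_period < 1:
        raise ValueError("sid_period must be at least 1")
    sent = []
    silent_run = 0  # consecutive non-speech frames seen so far
    for is_speech in vad_flags:
        if is_speech:
            silent_run = 0
            sent.append("SPEECH")
        else:
            silent_run += 1
            # Assumption: the SID frame occupies the first slot of each period.
            is_sid_slot = (silent_run - 1) % sid_period == 0
            sent.append("SID" if is_sid_slot else "NONE")
    return sent
```

With `sid_period=3`, a run of four silent frames yields one SID frame, two empty slots, and a second SID frame, so only the SID slots consume transmission capacity during silence.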
When the VAD unit changes from generating the first state signal to generating the second state signal, that is, from detecting speech to detecting non-speech, a time interval, a so-called hangover, is normally applied, during which the speech encoding unit will continue to deliver speech frames as if the received sound sequence had been human speech. If, after the hangover time, the VAD unit still detects non-speech, a SID frame is generated. The reason for this procedure is that short pauses between words in human speech shall not be interpreted as non-speech; the speech frame generator shall remain active during such pauses.
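A hangover of this kind can be sketched as a small filter over the raw VAD decisions. The name `apply_hangover` and the choice to count the hangover in whole frames are assumptions made for illustration; the actual GSM hangover rules are more elaborate.

```python
from typing import Iterable, List


def apply_hangover(vad_flags: Iterable[bool], hangover_frames: int) -> List[bool]:
    """Extend every speech burst by hangover_frames additional frames.

    The returned flags tell the encoder whether to keep producing
    speech frames, so that short pauses between words are bridged.
    """
    remaining = 0  # hangover frames still available after the last speech frame
    extended = []
    for is_speech in vad_flags:
        if is_speech:
            remaining = hangover_frames  # re-arm the hangover on every speech frame
            extended.append(True)
        elif remaining > 0:
            remaining -= 1
            extended.append(True)  # still treated as speech during the hangover
        else:
            extended.append(False)
    return extended
```

A pause shorter than the hangover is therefore never seen by the encoder as non-speech, while a longer pause is reported once the hangover has run out.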
SUMMARY OF THE INVENTION
The present invention discloses a method and an apparatus for reduction of echoes introduced by cross-talk.
The purpose of the present invention is thus to reduce the echo introduced by cross-talk.
The problem described above, of how to reduce the echo introduced by cross-talk, is solved by introducing, at the microphone and at the speaker, switches controlled by a state machine which takes as input the signal energy of the signal from the microphone, a VAD flag of the signal from the microphone, the signal energy of the signal to the speaker and a VAD flag of the signal to the speaker. One of the advantages of the present invention is that the echo introduced by cross-talk is significantly reduced without requiring too much computational power.
Other advantages will be obvious to a man skilled in the art in the light of the detailed description given below.
Further scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the preferred embodiments of the invention are given by way of illustration only, since various changes and modifications within the scope of the invention will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF DRAWINGS
Figure 1 shows a block diagram of one embodiment of the invention.
Figure 2 shows a Finite State Diagram.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
In figure 1 a microphone 101 is connected to a GSM encoder 102. Before the signal arrives at the GSM encoder 102 it has been sampled and digitised according to known technology, which is not shown in figure 1. From the GSM encoder 102 the encoded signal is transmitted to a receiver (not shown in the figure), first passing a switch 103 which can enable or disable the transmission. From the GSM encoder 102 an autocorrelation coefficient ACFE is passed to a VAD unit 104. A long term predictor lag value NE from the GSM frames is also passed to the VAD unit 104. From the VAD unit 104 a value PE, representing the energy of the signal, is passed to a finite state machine 105. The VAD unit 104 also computes a flag FE indicating whether the VAD unit 104 has detected human speech. The flag FE is passed to the finite state machine 105. The flag FE will be true if human voice has been detected.
Further in figure 1, a sampled, encoded voice signal is received from a sender (not shown) and passed to a GSM decoder 106. From the GSM decoder 106 the decoded, sampled voice signal is passed to a speaker 107, first passing a switch 108 which can allow or prevent the voice signal reaching the speaker. For the speaker to function properly a D/A conversion is needed, according to known technology but not shown in figure 1. From the received sampled, encoded voice signal a long term predictor lag value ND is derived and passed to a VAD unit 109.
Since the decoding of GSM frames normally does not involve a VAD unit, the GSM decoder lacks the parameters necessary for calculating the ACF. To be able to calculate the ACF, an autocorrelation unit 110 receives data from the GSM decoder 106 and calculates the ACFD, which is passed to the VAD unit 109. The autocorrelation unit 110 is a part of the GSM encoder as described in the standards. A value PD indicating the energy in the voice signal to the speaker is passed from the VAD unit 109 to the finite state machine 105. From the VAD unit 109 a flag FD is also passed to said finite state machine, indicating whether the VAD unit has detected human voice.
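As an illustration of what an autocorrelation unit such as unit 110 computes, the following sketch evaluates short-term autocorrelation coefficients over one frame of decoded samples. The function names are hypothetical, the default of lags 0 to 8 is an assumption to be checked against the codec specification, and using the lag-0 coefficient directly as an energy value is a simplification; the patent only states that the energy values PE and PD are delivered by the VAD units.

```python
from typing import List, Sequence


def autocorrelation(frame: Sequence[float], max_lag: int = 8) -> List[float]:
    """Return ACF[k] = sum over n of frame[n] * frame[n - k], for k = 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i - k] for i in range(k, n))
        for k in range(max_lag + 1)
    ]


def frame_energy(frame: Sequence[float]) -> float:
    """Simplified energy estimate: the lag-0 autocorrelation coefficient."""
    return autocorrelation(frame, max_lag=0)[0]
```

Because the decoder side only has the reconstructed samples, recomputing these coefficients from the decoded frame gives the second VAD unit the same kind of input that the encoder side obtains as a by-product of encoding.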
The finite state machine 105 comprises functionality for setting the switches 103 and 108 in dependence on the values input to the finite state machine.
In figure 2 the states and the possible transitions are shown for the finite state machine in figure 1. The transitions between states are made according to the description below. The following definitions are used:
• FE: VAD flag when encoding
• FD: VAD flag when decoding
• PE: Signal energy when encoding
• PD: Signal energy when decoding
• Hangover: The time from the decision to switch direction until the switch is made. This time must be long enough to compensate for the room echo.
201. (FE = 1 AND FD = 0) OR (FE = 1 AND PE > PD), hangover = 0
202. FE = 0, hangover = 600 ms
203. (FD = 1 AND FE = 0) OR (FD = 1 AND PD > PE), hangover = 0
204. FD = 0, hangover = 600 ms
205. FD = 1 AND PD > PE, hangover = 600 ms
206. FE = 1 AND PE > PD, hangover = 600 ms
In the TRANSMITTING state 207, the switch controlling the transmission of the voice signal from the microphone is enabled and the switch controlling the transmission of the voice signal to the speaker is disabled. In the RECEIVING state 208, the switch controlling the transmission of the voice signal from the microphone is disabled and the switch controlling the transmission to the speaker is enabled. In the IDLE state 209 both switches are disabled.
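The state machine of figure 2 might be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation. The source and target state of each numbered transition are taken from the order in which claim 6 lists them. The class and method names are hypothetical. The sketch further assumes one update per 20 ms speech frame, that a transition condition must hold continuously for the whole hangover before the switch is made, and that a direct change of direction (205, 206) takes priority over falling back to IDLE (202, 204) when both conditions hold.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

FRAME_MS = 20      # assumption: the machine is updated once per 20 ms frame
HANGOVER_MS = 600  # hangover stated for transitions 202, 204, 205 and 206


class State(Enum):
    TRANSMITTING = 207
    RECEIVING = 208
    IDLE = 209


@dataclass
class EchoSwitchFsm:
    state: State = State.IDLE
    pending: Optional[State] = None  # target of the transition being timed
    waited_ms: int = 0               # how long the pending condition has held

    def _candidate(self, fe: bool, fd: bool, pe: float,
                   pd: float) -> Optional[Tuple[State, int]]:
        """Return (target state, hangover in ms) of the matching transition."""
        if self.state is State.IDLE:
            if fe and (not fd or pe > pd):
                return State.TRANSMITTING, 0            # transition 201
            if fd and (not fe or pd > pe):
                return State.RECEIVING, 0               # transition 203
        elif self.state is State.TRANSMITTING:
            if fd and pd > pe:
                return State.RECEIVING, HANGOVER_MS     # transition 205
            if not fe:
                return State.IDLE, HANGOVER_MS          # transition 202
        elif self.state is State.RECEIVING:
            if fe and pe > pd:
                return State.TRANSMITTING, HANGOVER_MS  # transition 206
            if not fd:
                return State.IDLE, HANGOVER_MS          # transition 204
        return None

    def step(self, fe: bool, fd: bool, pe: float, pd: float) -> State:
        """Feed one frame's VAD flags and energies; return the new state."""
        candidate = self._candidate(fe, fd, pe, pd)
        if candidate is None:
            # Assumption: an interrupted condition restarts the hangover.
            self.pending, self.waited_ms = None, 0
            return self.state
        target, hangover_ms = candidate
        if target is not self.pending:
            self.pending, self.waited_ms = target, 0
        if self.waited_ms >= hangover_ms:
            self.state, self.pending, self.waited_ms = target, None, 0
        else:
            self.waited_ms += FRAME_MS
        return self.state

    def switches(self) -> Tuple[bool, bool]:
        """Return (switch 103 enabled, switch 108 enabled)."""
        return (self.state is State.TRANSMITTING,
                self.state is State.RECEIVING)
```

With these assumptions the machine leaves IDLE on the first frame in which one side is clearly talking, but stays in TRANSMITTING or RECEIVING for 30 further frames (600 ms) after the opposite condition first appears, which gives the room echo time to die out before the direction is switched.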
The invention being thus described, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention, and all such modifications as would be obvious to a man skilled in the art are intended to be included within the scope of the following claims.

Claims

1. A method for reducing the echo when transmitting voice in a telephone application, said telephone application comprising a speaker and a microphone, characterised in that a finite state machine affects said speaker and microphone to be on or off in dependence on characteristics of the signal from said microphone and characteristics of the signal to said speaker.
2. A method according to claim 1 where said telephone application comprises at least one VAD unit, one GSM encoder and one GSM decoder, characterised in that a first VAD flag of the signal from the microphone is passed to said finite state machine, that a first value representing the signal energy in the signal from the microphone is passed to said finite state machine, that a second VAD flag of the signal to the speaker is passed to said finite state machine, that a second value representing the energy in the signal to the speaker is passed to said finite state machine, that said finite state machine affects a first switch controlling the transmission of said signal from said microphone, that said finite state machine affects a second switch controlling the transmission of said second signal to said speaker in dependence on the values of said first VAD flag, said second VAD flag, said first value and said second value.
3. A method according to claim 2, characterised in that a first sampled voice signal from said microphone is passed to said GSM encoder, that a first long term predictor lag value is passed to a first VAD unit, that a first autocorrelation coefficient is passed from said first GSM encoder to said first VAD unit, that a first boolean flag is passed from said first VAD unit to said finite state machine, that a first value representing the energy of the signal from said microphone is passed from said first VAD unit to said finite state machine, that a second sampled, encoded voice signal is received, that said second voice signal is passed to a GSM decoder, that a second long term predictor lag value from said second voice signal is passed to a second VAD unit, that a second autocorrelation coefficient is calculated and passed to said second VAD unit, that a second value representing the energy in said second voice signal is passed from said second VAD unit to said finite state machine, that a second boolean flag is passed from said VAD unit to said finite state machine and that said finite state machine controls a first switch affecting the transmission of said first sampled, encoded voice signal from said microphone, a second switch affecting the transmission of said second, decoded voice signal to a speaker in dependence on the values of said first boolean flag, said second boolean flag, said first value and said second value.
4. A method according to claim 2, characterised in that if said finite state machine takes a first state said first switch controlling the transmission from said microphone is set to allow such transmission and that said second switch controlling the transmission to the speaker is set not to allow such transmission, that if said finite state machine takes a second state said first switch controlling the transmission from said microphone is set not to allow such transmission and that said second switch controlling the transmission to the speaker is set to allow such transmission.
5. A method according to claim 4, characterised in that if said finite state machine takes a third state said first and second switch are both set to the same state.
6. A method according to claim 5, characterised in that said finite state machine switches from said third state to said first state if said first flag is true and said second flag is false or if said first flag is true and said first value is larger than said second value, that said finite state machine switches from said first state to said third state if said first flag is false and a hangover time has elapsed, that said finite state machine switches from said third state to said second state if said second flag is true and said first flag is false or if said second flag is true and said second value is larger than said first value, that said finite state machine switches from said second state to said third state if said second flag is false and said hangover time has passed, that said finite state machine switches from said first state to said second state if said second flag is true and said second value is larger than said first value and said hangover time has passed, that said finite state machine switches from said second to said first state if said first flag is true and said first value is larger than said second value and said hangover time has passed.
7. A method according to claim 6, characterised in that said hangover time is 600 ms.
8. An apparatus for reducing the echo when transmitting voice in a telephone application, said telephone application comprising a speaker and a microphone, characterised in that said telephone application comprises a finite state machine arranged to affect said speaker and microphone to be on or off in dependence on characteristics of the signal from said microphone and characteristics of the signal to said speaker.
9. A personal computer arranged for transmitting and receiving voice with a telephone application comprising an apparatus for reducing the echo of said voice, said telephone application comprising a speaker and a microphone, characterised in that said telephone application comprises a finite state machine arranged to affect said speaker and microphone to be on or off in dependence on characteristics of the signal from said microphone and characteristics of the signal to said speaker.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
SE9700873 1997-03-11
SE9700873A SE511650C2 (en) 1997-03-11 1997-03-11 Method and apparatus for reducing echo in a telephone application
PCT/SE1998/000332 WO1998040974A1 (en) 1997-03-11 1998-02-24 Echo reducing phone with state machine controlled switches

Publications (1)

Publication Number Publication Date
EP0974205A1 (en) 2000-01-26

Family

ID=20406109

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98909895A Withdrawn EP0974205A1 (en) 1997-03-11 1998-02-24 Echo reducing phone with state machine controlled switches

Country Status (9)

Country Link
EP (1) EP0974205A1 (en)
JP (1) JP2001514823A (en)
CN (1) CN1255255A (en)
AU (1) AU735505B2 (en)
BR (1) BR9808240A (en)
CA (1) CA2283590A1 (en)
SE (1) SE511650C2 (en)
TW (1) TW407435B (en)
WO (1) WO1998040974A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002539719A (en) * 1999-03-15 2002-11-19 ヴォーカルテク・コミュニケーションズ・リミテッド Echo suppression device and method for performing echo suppression
US6754337B2 (en) * 2002-01-25 2004-06-22 Acoustic Technologies, Inc. Telephone having four VAD circuits
US7020257B2 (en) * 2002-04-17 2006-03-28 Texas Instruments Incorporated Voice activity identiftication for speaker tracking in a packet based conferencing system with distributed processing
CN101145803B (en) * 2007-09-06 2012-09-05 杭州华三通信技术有限公司 A method and device for separating echo reflection

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US4897832A (en) * 1988-01-18 1990-01-30 Oki Electric Industry Co., Ltd. Digital speech interpolation system and speech detector
GB2256351B (en) * 1991-05-25 1995-07-05 Motorola Inc Enhancement of echo return loss
FI110826B (en) * 1995-06-08 2003-03-31 Nokia Corp Eliminating an acoustic echo in a digital mobile communication system

Non-Patent Citations (1)

Title
See references of WO9840974A1 *

Also Published As

Publication number Publication date
WO1998040974A1 (en) 1998-09-17
AU6426498A (en) 1998-09-29
SE9700873L (en) 1998-09-12
AU735505B2 (en) 2001-07-12
CA2283590A1 (en) 1998-09-17
TW407435B (en) 2000-10-01
CN1255255A (en) 2000-05-31
JP2001514823A (en) 2001-09-11
BR9808240A (en) 2000-05-16
SE511650C2 (en) 1999-11-01
SE9700873D0 (en) 1997-03-11

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19990913

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Withdrawal date: 20021014