USRE49462E1 - Adaptive noise cancellation for multiple audio endpoints in a shared space - Google Patents


Info

Publication number
USRE49462E1
Authority
US
United States
Prior art keywords
audio
endpoint
microphone
audio endpoint
endpoints
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US17/325,875
Inventor
Lennart Burenius
Oystein BIRKENES
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cisco Technology Inc
Original Assignee
Cisco Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cisco Technology Inc filed Critical Cisco Technology Inc
Priority to US17/325,875
Assigned to CISCO TECHNOLOGY, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BIRKENES, OYSTEIN; BURENIUS, LENNART
Application granted
Publication of USRE49462E1
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/04 Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1083 Reduction of ambient noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2227/00 Details of public address [PA] systems covered by H04R27/00 but not provided for in any of its subgroups
    • H04R2227/001 Adaptation of signal processing in PA systems in dependence of presence of noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2227/00 Details of public address [PA] systems covered by H04R27/00 but not provided for in any of its subgroups
    • H04R2227/003 Digital PA systems using, e.g. LAN or internet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2227/00 Details of public address [PA] systems covered by H04R27/00 but not provided for in any of its subgroups
    • H04R2227/009 Signal processing in [PA] systems to enhance the speech intelligibility
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2410/00 Microphones
    • H04R2410/05 Noise reduction with a separate noise microphone

Definitions

  • the present disclosure relates to telecommunications audio endpoints.
  • Audio endpoints may often be located in a shared space or common location. In these shared spaces, background noise caused by audio endpoints is often captured by the microphones of other audio endpoints at the common location. This background noise may then be transmitted to a far-end or remote audio endpoint that is participating in a telecommunication session with one of the audio endpoints. Receiving this background noise at the far-end can cause a loss of intelligibility and fatigue to participants in the telecommunication session.
  • FIG. 1 is a diagram of a shared space including multiple audio endpoints in which techniques for adaptive noise cancellation may be implemented, according to an example embodiment.
  • FIG. 2 is a flowchart of a method of locating audio endpoints in a shared space and identifying a target noise source, according to an example embodiment.
  • FIG. 3 is a diagram illustrating a technique for implementing adaptive noise cancellation for multiple audio endpoints in a shared space, according to an example embodiment.
  • FIG. 4 is a diagram illustrating a technique for implementing adaptive noise cancellation at an audio endpoint, according to an example embodiment.
  • FIG. 5 is a diagram illustrating a technique for implementing adaptive noise cancellation for multiple audio endpoints in a shared space, according to another example embodiment.
  • FIG. 6 is a diagram illustrating a technique for implementing adaptive noise cancellation at an audio endpoint, according to another example embodiment.
  • FIG. 7 is a flowchart of a method for implementing adaptive noise cancellation at an audio endpoint, according to an example embodiment.
  • FIG. 8 is a block diagram of an audio endpoint configured to implement techniques for adaptive noise cancellation for multiple audio endpoints in a shared space, according to an example embodiment.
  • a method of adaptive noise cancellation for multiple audio endpoints in a shared space includes detecting, by a first audio endpoint, one or more audio endpoints co-located with the first audio endpoint at a first location. The method also includes identifying a selected audio endpoint of the one or more audio endpoints as a target noise source and obtaining, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint.
  • the method includes removing the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint and providing the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.
  • VUI: voice user interface
  • VUIs: voice user interfaces
  • the example embodiments described herein provide techniques for adaptive noise cancellation across multiple devices or audio endpoints in an acoustically shared space to reduce the amount and extent of unwanted/unrelated background noise that is sent to far-end or remote audio endpoint participants and to improve the performance of VUIs.
  • FIG. 1 is a diagram of a shared space 100 including multiple audio endpoints in which techniques for adaptive noise cancellation may be implemented, according to an example embodiment.
  • a plurality of audio endpoints may be co-located within an acoustically shared space 100 .
  • acoustically shared space 100 may be an open office environment, a conference room, a public area, or other common location where multiple audio endpoints are physically present within acoustic proximity to each other.
  • shared space 100 includes a first audio endpoint 102 , a second audio endpoint 104 , a third audio endpoint 106 , and additional audio endpoints up to an nth audio endpoint 108 .
  • one or more of the multiple audio endpoints 102 , 104 , 106 , 108 may be engaged in separate telecommunication sessions with a remote audio endpoint or other far-end participant.
  • multiple remote audio endpoints including a first remote audio endpoint 110 , a second remote audio endpoint 112 , a third remote audio endpoint 114 , and up to an nth remote audio endpoint 116 are physically located remotely from shared space 100 and multiple audio endpoints 102 , 104 , 106 , 108 . That is, remote audio endpoints 110 , 112 , 114 , 116 are not within acoustic proximity to audio endpoints 102 , 104 , 106 , 108 .
  • Audio endpoints including any of audio endpoints 102 , 104 , 106 , 108 and/or remote audio endpoints 110 , 112 , 114 , 116 , may include various types of devices having at least audio or acoustic telecommunication capabilities.
  • audio endpoints may include conference phones, video conferencing devices, tablets, computers with audio input and output components, electronic personal/home assistants, hands-free/smart speakers (i.e., speakers with voice controls), devices or programs controlled with VUIs, and/or other devices that include at least one speaker and at least one microphone.
  • an audio endpoint in shared space 100 may implement techniques for adaptive noise cancellation to remove background noise associated with one or more of the other audio endpoints (e.g., second audio endpoint 104 , third audio endpoint 106 , and/or nth audio endpoint 108 ) that are also co-located within shared space 100 .
  • first audio endpoint 102 detects one or more audio endpoints that are co-located with first audio endpoint 102 within shared space 100 and are connected to a common local area network (LAN).
  • LAN: local area network
  • audio endpoints 102 , 104 , 106 , 108 may communicate with each other, remote audio endpoints 110 , 112 , 114 , 116 , or any other devices by accessing LAN through a LAN access point (AP) 120 .
  • LAN access point 120 may provide a connection to a network, such as the internet, public switched telephone network (PSTN), or any other wired or wireless network, including LANs and wide-area networks (WANs), to permit audio endpoints 102 , 104 , 106 , 108 to engage in a telecommunication session.
  • PSTN: public switched telephone network
  • WANs: wide-area networks
  • the presence of other audio endpoints within shared space 100 may be detected or determined by first audio endpoint 102 using an ultrasonic signal obtained from one or more of the other audio endpoints (e.g., second audio endpoint 104 , third audio endpoint 106 , and/or nth audio endpoint 108 ).
  • audio endpoints 104 , 106 , 108 may transmit or provide an ultrasonic proximity signal that broadcasts each audio endpoint's Internet Protocol (IP) address in the high-frequency audio spectrum (e.g., above 16-17 kHz).
  • IP: Internet Protocol
  • first audio endpoint 102 may receive a first ultrasonic signal 122 from second audio endpoint 104 , a second ultrasonic signal 124 from third audio endpoint 106 , and a third ultrasonic signal 126 from nth audio endpoint 108 .
  • each audio endpoint 102 , 104 , 106 , 108 may use an ultrasonic encoding technique that permits multiple concurrent broadcasts, or a “first-come, first-served” method, to transmit its ultrasonic signal to the other endpoints so that each of audio endpoints 102 , 104 , 106 , 108 can be located in shared space 100 .
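The ultrasonic proximity broadcast can be sketched in code. The following Python sketch is illustrative only: the binary-FSK scheme, tone frequencies, and bit duration are assumptions, not the encoding the patent specifies (the description only requires broadcasting the IP address in the high-frequency audio spectrum, above roughly 16-17 kHz).

```python
import math

SAMPLE_RATE = 48_000          # Hz; assumed converter rate
F0, F1 = 18_000.0, 19_000.0   # hypothetical "0"/"1" tones above the audible band
BIT_SECONDS = 0.01            # illustrative 10 ms per bit

def ip_to_bits(ip):
    """Flatten a dotted-quad IPv4 address into 32 bits, MSB first per octet."""
    bits = []
    for octet in ip.split("."):
        bits.extend((int(octet) >> i) & 1 for i in range(7, -1, -1))
    return bits

def ultrasonic_broadcast(ip):
    """Render the address as a binary-FSK tone burst for the loudspeaker."""
    samples = []
    n = int(SAMPLE_RATE * BIT_SECONDS)
    for bit in ip_to_bits(ip):
        f = F1 if bit else F0
        samples.extend(math.sin(2 * math.pi * f * t / SAMPLE_RATE)
                       for t in range(n))
    return samples

sig = ultrasonic_broadcast("192.168.1.42")
```

A receiving endpoint would band-pass its microphone signal around F0/F1 and decode the bit stream; the received level of this burst doubles as the proximity estimate used later in method 200.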
  • detecting or locating each of audio endpoints 102 , 104 , 106 , 108 in shared space 100 may be set up manually.
  • first audio endpoint 102 may synchronize clocks and establish a low-delay LAN connection with each of second audio endpoint 104 , third audio endpoint 106 , and/or nth audio endpoint 108 .
  • the network delay may be short compared to the acoustical delay in shared space 100 .
  • Clock synchronization between the analog-to-digital (ADC) and digital-to-analog (DAC) converters associated with the audio transducers inside each of audio endpoints 102 , 104 , 106 , 108 may be accomplished according to known techniques, for example, using the Precision Time Protocol (PTP) standard defined by Institute of Electrical and Electronics Engineers (IEEE) 1588 and/or the Audio Video Bridging (AVB) and Time Synchronized Networking (TSN) standards, the specifications of which are hereby incorporated by reference in their entirety.
  • PTP: Precision Time Protocol
  • IEEE: Institute of Electrical and Electronics Engineers
  • AVB: Audio Video Bridging
  • TSN: Time Synchronized Networking
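The two-way time-transfer arithmetic underlying PTP can be summarized in a few lines. This sketch shows only the textbook offset/delay estimate from the four exchange timestamps, not the full IEEE 1588 protocol machinery the endpoints would actually run.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Textbook IEEE 1588 two-way exchange:
    t1 = master sends Sync, t2 = slave receives it,
    t3 = slave sends Delay_Req, t4 = master receives it.
    Assumes a symmetric network path between the endpoints."""
    offset = ((t2 - t1) - (t4 - t3)) / 2.0   # slave clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2.0    # one-way mean path delay
    return offset, delay
```

For example, with t1=0, t2=5, t3=10, t4=11, the slave clock is estimated to run 2 units ahead of the master over a 3-unit mean path delay; applying the offset aligns the ADC/DAC sample clocks across endpoints.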
  • first audio endpoint 102 may next identify a selected audio endpoint as a target noise source, as will be described in more detail below.
  • in some arrangements, computational and/or network resources may be limited. Accordingly, a method 200 of detecting audio endpoints in shared space 100 and identifying a target noise source may be used to select the audio endpoint associated with the worst or highest anticipated noise level. In other embodiments, where additional computational and/or network resources are available, additional audio endpoints may be identified as target noise sources for adaptive noise cancellation techniques according to the example embodiments described herein.
  • method 200 may begin at an operation 202 where each audio endpoint in shared space 100 plays or emits an ultrasonic signal from its loudspeaker.
  • a subject audio endpoint (e.g., first audio endpoint 102 ) listens for or obtains ultrasonic signals from one or more of the other audio endpoints (e.g., second audio endpoint 104 , third audio endpoint 106 , and/or nth audio endpoint 108 ) in shared space 100 using its microphone.
  • first audio endpoint 102 may establish low-delay LAN connections with each detected audio endpoint, for example, second audio endpoint 104 , third audio endpoint 106 , and/or nth audio endpoint 108 .
  • first audio endpoint 102 may also establish clock synchronization with each detected audio endpoint.
  • Method 200 may proceed to operations 206 , 208 to obtain information for determining associated noise levels of each of the detected audio endpoints.
  • the information may be obtained from operation 206 where first audio endpoint 102 determines an ultrasonic signal receive level (i.e., a higher receive level indicates a closer proximity to first audio endpoint 102 ) for each located audio endpoint (e.g., second audio endpoint 104 , third audio endpoint 106 , and/or nth audio endpoint 108 ).
  • the information may also be obtained from operation 208 where loudspeaker volume settings and/or call status (i.e., whether or not an audio endpoint is currently participating in a telecommunication session) are obtained by first audio endpoint 102 for each of the other audio endpoints 104 , 106 , 108 .
  • first audio endpoint 102 may compute or determine an anticipated noise level for each other audio endpoint 104 , 106 , 108 .
  • Anticipated noise level may be determined using a variety of factors and/or information obtained from each other audio endpoint 104 , 106 , 108 .
  • some of the factors and/or information that may be used by first audio endpoint 102 to determine the anticipated noise levels include: the ultrasonic signal receive level (e.g., obtained from operation 206 ), metadata obtained over the low-delay LAN connections (e.g., loudspeaker volume settings, call status, and other signal levels obtained from operation 208 ), cross-correlations of received microphone signals with local microphone signals, and distance and/or direction information (e.g., which may be obtained using triangulation techniques from a microphone array).
  • method 200 may proceed to an operation 212 where first audio endpoint 102 may assemble or determine a ranked list of detected audio endpoints 104 , 106 , 108 that is prioritized based on the determined anticipated noise levels from operation 210 . For example, audio endpoints having higher anticipated noise levels are ranked higher on the list than those with lower anticipated noise levels.
  • first audio endpoint 102 picks or selects one or more of the audio endpoints associated with the highest ranked anticipated noise levels from operation 212 .
  • first audio endpoint 102 may identify a selected audio endpoint associated with the highest ranked anticipated noise level from operation 212 as a target noise source for the purposes of implementing techniques for adaptive noise cancellation to remove background noise associated with the selected audio endpoint.
  • a single audio endpoint may be selected as being associated with the worst or highest anticipated noise level for adaptive noise cancellation.
  • two or more audio endpoints may be identified as selected audio endpoints associated with target noise sources for adaptive noise cancellation.
  • audio endpoints associated with an anticipated noise level that exceeds a predetermined threshold may be identified as selected audio endpoints associated with target noise sources for adaptive noise cancellation.
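The ranking and selection steps of operations 210-214 can be sketched as follows. The scoring weights, field names, and threshold here are hypothetical, chosen only to illustrate combining the ultrasonic receive level, loudspeaker volume metadata, and call status into an anticipated noise level and selecting the highest-ranked endpoints.

```python
def rank_noise_sources(endpoints, threshold=0.2):
    """Rank detected endpoints by anticipated noise level and keep those
    above a cancellation threshold (weights are illustrative, not from
    the patent)."""
    def anticipated_noise(ep):
        score = 0.6 * ep["ultrasonic_rx_level"]   # higher => closer to our mic
        score += 0.3 * ep["loudspeaker_volume"]   # metadata over the LAN link
        if ep["in_call"]:                         # active sessions emit audio
            score += 0.1
        return score
    ranked = sorted(endpoints, key=anticipated_noise, reverse=True)
    return [ep["name"] for ep in ranked if anticipated_noise(ep) >= threshold]
```

An endpoint would then request loudspeaker reference signals only from the names this returns, keeping the number of adaptive filters within its compute budget.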
  • a technique 300 for implementing adaptive noise cancellation for multiple audio endpoints in shared space 100 is shown according to an example embodiment.
  • in this embodiment, two audio endpoints (e.g., first audio endpoint 102 and second audio endpoint 104 ) are present in shared space 100 .
  • a first user 302 is using first audio endpoint 102 to engage in a telecommunication session with a first remote audio endpoint 110 .
  • a second user 304 is using second audio endpoint 104 to engage in a separate audio or acoustical session that is independent from the telecommunication session between first audio endpoint 102 and first remote audio endpoint 110 .
  • second user 304 may be using second audio endpoint 104 to engage in a separate telecommunication session with a different remote audio endpoint, such as second remote audio endpoint 112 .
  • Second user 304 may alternatively or additionally be using second audio endpoint 104 to engage in some other type of separate audio or acoustical session.
  • second user 304 may be receiving calls or messages on second audio endpoint 104 that generate a ringtone, playing music on a loudspeaker associated with second audio endpoint 104 , and/or may be communicating with a VUI embedded or in communication with second audio endpoint 104 .
  • first audio endpoint 102 and second audio endpoint 104 are both connected to a network (e.g., a LAN via LAN AP 120 , shown in FIG. 1 ) to allow communication with other devices and/or participants. Additionally, as previously described above, first audio endpoint 102 and second audio endpoint 104 may be connected to each other via a low-delay LAN connection 306 .
  • LAN connection 306 allows first audio endpoint 102 and second audio endpoint 104 to exchange various information.
  • first audio endpoint 102 includes a microphone 310 , a loudspeaker 312 , and one or more signal processing components, including a first adaptive filter 314 , a second adaptive filter 316 that is part of an acoustic echo cancellation (AEC) module 410 (shown in FIG. 4 ), and a signal decoder/encoder 318 .
  • Second audio endpoint 104 has a similar configuration, including a microphone 320 , a loudspeaker 322 , and one or more signal processing components, including a first adaptive filter 324 , a second adaptive filter 326 that is part of an AEC module, and a signal decoder/encoder 328 .
  • microphone 310 of first audio endpoint 102 is receiving inputs from several different audio sources within shared space 100 .
  • microphone 310 receives a first audio input 330 from first user 302 who is using first audio endpoint 102 to conduct a telecommunication session with first remote audio endpoint 110 .
  • first audio input 330 is the intended audio content that first user 302 is providing to first remote audio endpoint 110 via a transmitted microphone signal 336 .
  • Microphone 310 also picks up or receives echo and/or noise from other audio sources within shared space 100 , including an echo source 332 output from loudspeaker 312 of first audio endpoint 102 and a first noise source 334 output from loudspeaker 322 of second audio endpoint 104 .
  • first audio endpoint 102 may implement adaptive noise cancellation of first noise source 334 output from loudspeaker 322 of second audio endpoint 104 by obtaining from second audio endpoint 104 a loudspeaker reference signal 338 that may then be removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using first adaptive filter 314 .
  • first audio endpoint 102 receives or obtains loudspeaker reference signal 338 from second audio endpoint 104 via low-delay LAN connection 306 .
  • loudspeaker reference signal 338 is the audio signal provided from signal decoder 328 of second audio endpoint 104 that is to be output from loudspeaker 322 .
  • loudspeaker reference signal 338 may be based on received audio signals from second remote audio endpoint 112 .
  • loudspeaker reference signal 338 is removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using first adaptive filter 314 . That is, loudspeaker reference signal 338 corresponds to first noise source 334 output from loudspeaker 322 of second audio endpoint 104 and picked up by microphone 310 of first audio endpoint 102 .
  • first adaptive filter 314 uses loudspeaker reference signal 338 to remove the contribution of first noise source 334 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 336 is provided or transmitted to first remote audio endpoint 110 .
  • first audio endpoint 102 may further include second adaptive filter 316 that removes the contribution of echo source 332 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 336 is provided or transmitted to first remote audio endpoint 110 .
  • each audio endpoint in shared space 100 may implement adaptive noise cancellation to remove noise sources from the other audio endpoints within shared space 100 .
  • microphone 320 of second audio endpoint 104 receives inputs from a first audio input 340 from second user 304 who is using second audio endpoint 104 to conduct a separate telecommunication or other audio/acoustical session with second remote audio endpoint 112 .
  • first audio input 340 is the intended audio content that second user 304 is providing to second remote audio endpoint 112 via a transmitted microphone signal 346 .
  • microphone 320 also picks up or receives echo and/or noise from other audio sources within shared space 100 , including an echo source 342 output from loudspeaker 322 of second audio endpoint 104 and a first noise source 344 output from loudspeaker 312 of first audio endpoint 102 .
  • a loudspeaker reference signal 348 is provided from first audio endpoint 102 via LAN connection 306 .
  • Loudspeaker reference signal 348 corresponds to first noise source 344 output from loudspeaker 312 of first audio endpoint 102 and picked up by microphone 320 of second audio endpoint 104 .
  • This loudspeaker reference signal 348 is removed from the microphone signal associated with microphone 320 of second audio endpoint 104 using first adaptive filter 324 .
  • second audio endpoint 104 may further include second adaptive filter 326 that removes the contribution of echo source 342 from the microphone signal associated with microphone 320 of second audio endpoint 104 before microphone signal 346 is provided or transmitted to second remote audio endpoint 112 .
  • microphone 310 of first audio endpoint 102 is associated with a microphone signal that includes multiple components from different audio sources.
  • the microphone signal includes first audio input 330 from first user 302 , echo source 332 (h 11 ) output from loudspeaker 312 of first audio endpoint 102 , and first noise source 334 (h 21 ) output from loudspeaker 322 of second audio endpoint 104 .
  • the microphone signal from microphone 310 is then provided to an analog-to-digital converter (ADC) 400 .
  • ADC: analog-to-digital converter
  • the digital microphone signal from ADC 400 passes to first adaptive filter module 314 , which removes first noise source 334 (h 21 ) from the microphone signal using loudspeaker reference signal 338 that is obtained from second audio endpoint 104 via LAN connection 306 .
  • second adaptive filter module 316 removes echo source 332 (h 11 ) from the microphone signal from loudspeaker 312 of first audio endpoint 102 .
  • Second adaptive filter module 316 may remove echo source 332 using a corresponding loudspeaker reference signal 414 from loudspeaker 312 of first audio endpoint 102 .
  • loudspeaker reference signal 414 may be obtained before the signal is provided to a digital-to-analog converter (DAC) 402 for output by loudspeaker 312 .
  • DAC: digital-to-analog converter
  • transmitted microphone signal 336 provided from encode/decode module 318 of first audio endpoint 102 to first remote audio endpoint 110 may have contributions from unwanted noise sources removed (e.g., echo source 332 and first noise source 334 ) so that first remote audio endpoint 110 receives the content of first audio input 330 from first user 302 in a clear manner.
  • echo at first audio endpoint 102 caused by first remote audio endpoint 110 may be suppressed using AEC module 410 .
  • AEC module 410 includes second filter module 316 , which may be a linear AEC portion, followed by a non-linear AEC portion (e.g., a Non-Linear Processing (NLP) module 412 ).
  • first adaptive filter module 314 may include a linear portion, without a non-linear (NLP) portion.
  • the linear portion of first adaptive filter module 314 may sufficiently attenuate background noise from co-workers and co-located audio endpoints in shared space 100 without using NLP, which can cause additional attenuation of microphone signal 336 as provided to first remote audio endpoint 110 and result in a less full-duplex experience for telecommunication session participants.
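The linear adaptive filtering stage can be sketched with a normalized-LMS update. This is a minimal time-domain illustration under assumed parameters (8 taps, step size 0.5, a toy acoustic path); a real endpoint would likely use frequency-domain or subband adaptive filters, and the patent does not specify the adaptation algorithm.

```python
import math

def nlms_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR estimate of the acoustic path from a neighbouring
    loudspeaker to the local microphone and subtract it from the microphone
    signal (linear stage only, no non-linear post-processing)."""
    w = [0.0] * taps
    out = []
    for n in range(len(mic)):
        x = [ref[n - k] if n >= k else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))       # noise estimate
        e = mic[n] - y                                 # residual passed on
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
        out.append(e)
    return out

# Demo: the microphone hears a delayed, attenuated copy of the reference.
ref = [math.sin(0.3 * n) for n in range(2000)]
mic = [0.8 * ref[n - 2] if n >= 2 else 0.0 for n in range(2000)]
residual = nlms_cancel(mic, ref)
```

After convergence the residual carries almost none of the reference energy, which is the behaviour first adaptive filter 314 relies on to remove first noise source 334 without NLP.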
  • techniques for implementing adaptive noise cancellation for audio endpoints may further include removing a microphone reference signal from other audio endpoints in shared space 100 .
  • referring to FIG. 5 , a technique 500 for implementing adaptive noise cancellation for multiple audio endpoints in shared space 100 is shown according to an example embodiment.
  • as in the embodiment of FIG. 3 , two audio endpoints (e.g., first audio endpoint 102 and second audio endpoint 104 ) are present in shared space 100 .
  • a first user 302 is using first audio endpoint 102 to engage in a telecommunication session with a first remote audio endpoint 110 .
  • a second user 304 is using second audio endpoint 104 to engage in a separate audio or acoustical session that is independent from the telecommunication session between first audio endpoint 102 and first remote audio endpoint 110 , as detailed above with reference to FIG. 3 .
  • first audio endpoint 102 and second audio endpoint 104 are both connected to a network (e.g., a LAN via LAN AP 120 , shown in FIG. 1 ) to allow communication with other devices and/or participants. Additionally, as previously described above, first audio endpoint 102 and second audio endpoint 104 may be connected to each other via low-delay LAN connection 306 .
  • first audio endpoint 102 includes microphone 310 , loudspeaker 312 , and one or more signal processing components, including first adaptive filter 314 , second adaptive filter 316 , a third adaptive filter 502 , and signal decoder/encoder 318 .
  • Second audio endpoint 104 has a similar configuration, including microphone 320 , loudspeaker 322 , and one or more signal processing components, including first adaptive filter 324 , second adaptive filter 326 , a third adaptive filter 504 , and signal decoder/encoder 328 .
  • microphone 310 of first audio endpoint 102 is receiving inputs from several different audio sources within shared space 100 .
  • microphone 310 receives a first audio input 510 from first user 302 who is using first audio endpoint 102 to conduct a telecommunication session with first remote audio endpoint 110 .
  • first audio input 510 is the intended audio content that first user 302 is providing to first remote audio endpoint 110 via a transmitted microphone signal 518 .
  • Microphone 310 also picks up or receives echo and/or noise from other audio sources within shared space 100 , including an echo source 512 output from loudspeaker 312 of first audio endpoint 102 , a first noise source 514 output from loudspeaker 322 of second audio endpoint 104 , and a second noise source 516 output from second user 304 .
  • first audio endpoint 102 may implement adaptive noise cancellation of first noise source 514 output from loudspeaker 322 of second audio endpoint 104 and second noise source 516 from second user 304 by obtaining from second audio endpoint 104 a loudspeaker reference signal 520 that corresponds to a signal to be output from loudspeaker 322 and a microphone reference signal 522 that corresponds to an audio stream that is input to microphone 320 of second audio endpoint 104 (e.g., a first audio input 530 from second user 304 ).
  • each of loudspeaker reference signal 520 and microphone reference signal 522 may be removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using corresponding adaptive filters 314 , 502 .
  • first adaptive filter 314 is configured to remove loudspeaker reference signal 520
  • third adaptive filter 502 is configured to remove microphone reference signal 522 .
  • first audio endpoint 102 receives or obtains loudspeaker reference signal 520 and microphone reference signal 522 from second audio endpoint 104 via low-delay LAN connection 306 .
  • loudspeaker reference signal 520 is the audio signal provided from signal decoder 328 of second audio endpoint 104 that is to be output from loudspeaker 322 .
  • loudspeaker reference signal 520 may be based on received audio signals from second remote audio endpoint 112 .
  • microphone reference signal 522 is the audio stream provided from microphone 320 of second audio endpoint 104 obtained from first audio input 530 provided by second user 304 .
  • loudspeaker reference signal 520 is removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using first adaptive filter 314 . That is, loudspeaker reference signal 520 corresponds to first noise source 514 output from loudspeaker 322 of second audio endpoint 104 and picked up by microphone 310 of first audio endpoint 102 . Additionally, in the embodiment of FIG. 5 , technique 500 further includes removing microphone reference signal 522 from the microphone signal associated with microphone 310 of first audio endpoint 102 using third adaptive filter 502 . That is, microphone reference signal 522 corresponds to second noise source 516 from second user 304 and picked up by microphone 310 of first audio endpoint 102 .
  • first adaptive filter 314 uses loudspeaker reference signal 520 to remove the contribution of first noise source 514 and third adaptive filter 502 uses microphone reference signal 522 to remove the contribution of second noise source 516 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 518 is provided or transmitted to first remote audio endpoint 110 .
  • first audio endpoint 102 may further include second adaptive filter 316 that removes the contribution of echo source 512 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 518 is provided or transmitted to first remote audio endpoint 110 .
  • each audio endpoint in shared space 100 may implement adaptive noise cancellation to remove noise sources from the other audio endpoints within shared space 100 .
  • microphone 320 of second audio endpoint 104 receives a first audio input 530 from second user 304 who is using second audio endpoint 104 to conduct a separate telecommunication or other audio/acoustical session with second remote audio endpoint 112 .
  • first audio input 530 is the intended audio content that second user 304 is providing to second remote audio endpoint 112 via a transmitted microphone signal 538 .
  • microphone 320 also picks up or receives echo and/or noise from other audio sources within shared space 100 , including an echo source 532 output from loudspeaker 322 of second audio endpoint 104 , a first noise source 534 output from loudspeaker 312 of first audio endpoint 102 , and a second noise source 536 output from first user 302 .
  • a loudspeaker reference signal 540 and a microphone reference signal 542 are provided from first audio endpoint 102 via LAN connection 306 .
  • Loudspeaker reference signal 540 corresponds to first noise source 534 output from loudspeaker 312 of first audio endpoint 102 and picked up by microphone 320 of second audio endpoint 104 and microphone reference signal 542 corresponds to second noise source 536 from first user 302 that is input to microphone 310 of first audio endpoint 102 (e.g., first audio input 510 from first user 302 ).
  • the loudspeaker reference signal 540 is removed from the microphone signal associated with microphone 320 of second audio endpoint 104 using first adaptive filter 324 , and the microphone reference signal 542 is removed from the microphone signal associated with microphone 320 of second audio endpoint 104 using third adaptive filter 504 .
  • second audio endpoint 104 may further include second adaptive filter 326 that removes the contribution of echo source 532 from the microphone signal associated with microphone 320 of second audio endpoint 104 before microphone signal 538 is provided or transmitted to second remote audio endpoint 112 .
  • microphone 310 of first audio endpoint 102 is associated with a microphone signal that includes multiple components from different audio sources.
  • the microphone signal includes first audio input 510 from first user 302 , echo source 512 (h 11 ) output from loudspeaker 312 of first audio endpoint 102 , first noise source 514 (h 21 ) output from loudspeaker 322 of second audio endpoint 104 , and second noise source 516 (h 2u1 ) output from second user 304 .
  • the microphone signal from microphone 310 is then provided to ADC 400 .
  • the digital microphone signal from ADC 400 passes to first adaptive filter module 314 , which removes first noise source 514 (h 21 ) from the microphone signal using loudspeaker reference signal 520 that is obtained from second audio endpoint 104 via LAN connection 306 .
  • third adaptive filter module 502 removes second noise source 516 (h 2u1 ) from the microphone signal using microphone reference signal 522 that is obtained from second audio endpoint 104 via LAN connection 306 .
  • AEC module 410 may be used to remove echo source 512 (h 11 ), including second adaptive filter module 316 that removes echo source 512 (h 11 ), output from loudspeaker 312 of first audio endpoint 102 , from the microphone signal.
  • Second adaptive filter module 316 may remove echo source 512 using a corresponding loudspeaker reference signal 600 associated with loudspeaker 312 of first audio endpoint 102 .
  • loudspeaker reference signal 600 may be obtained before the signal is provided to DAC 402 for output by loudspeaker 312 .
  • AEC module 410 also includes NLP module 412 that may be used to further remove echo source 512 from the microphone signal before it is provided to encode/decode module 318 .
  • transmitted microphone signal 518 provided from encode/decode module 318 of first audio endpoint 102 to first remote audio endpoint 110 may have contributions from unwanted noise sources removed (e.g., echo source 512 , first noise source 514 , and second noise source 516 ) so that first remote audio endpoint 110 receives the content of first audio input 510 from first user 302 in a clear manner.
  • Method 700 may be implemented by one or more audio endpoints within a shared space with other audio endpoints.
  • method 700 may be implemented by first audio endpoint 102 co-located with one or more other audio endpoints within shared space 100 , as described above.
  • method 700 may begin at an operation 702 where one or more audio endpoints are detected or located at a first location.
  • first audio endpoint 102 may detect one or more of audio endpoints 104 , 106 , 108 within shared space 100 using ultrasonic signals, as described in reference to FIG. 2 above. Additionally, in some embodiments, detecting the audio endpoints within shared space 100 may further include locating the audio endpoints relative to first audio endpoint 102 , for example, using information received from ultrasonic signals, metadata, and/or a microphone array.
  • method 700 includes an operation 704 where a selected audio endpoint is identified as a target noise source. For example, first audio endpoint 102 may use method 200 to identify a selected audio endpoint as a target noise source according to the techniques described above in reference to FIG. 2 .
  • a loudspeaker reference signal is obtained from the selected audio endpoint.
  • first audio endpoint 102 may obtain loudspeaker reference signal 338 from second audio endpoint 104 via low-delay LAN connection 306 .
  • method 700 further includes an operation 708 where the loudspeaker reference signal is removed from the microphone signal.
  • first audio endpoint 102 removes loudspeaker reference signal 338 from the microphone signal from microphone 310 before microphone signal 336 is provided to first remote audio endpoint 110 .
  • method 700 may also include operations (not shown) for obtaining a microphone reference signal from the selected audio endpoint and removing that microphone reference signal from the microphone signal before it is transmitted to a remote audio endpoint.
  • method 700 may further include operations (not shown) to remove echo noise components from the microphone signal before it is transmitted, for example, using AEC module 410 , including second adaptive filter 316 and/or NLP module 412 , described above in reference to FIGS. 3 - 6 .
  • Method 700 may end with an operation 710 where the filtered microphone signal is provided to a remote audio endpoint.
  • first audio endpoint 102 may provide or transmit microphone signal 336 that has been filtered to remove noise components to first remote audio endpoint 110 .
  • FIG. 8 is an electrical block diagram of first audio endpoint 102 , according to an example embodiment.
  • first audio endpoint 102 includes at least one microphone 310 and at least one loudspeaker 312 .
  • First audio endpoint 102 also includes a processor 800 , a memory 810 , and a network interface 820 comprising one or more ports.
  • the network interface 820 and associated one or more ports may be referred to collectively as a network interface unit.
  • Network interface 820 may be used by first audio endpoint 102 to establish a network connection via LAN AP 120 to conduct a telecommunication session with a remote audio endpoint.
  • Network interface 820 may also be used by first audio endpoint 102 to establish a low-delay LAN connection with other audio endpoints co-located within shared space 100 (e.g., LAN connection 306 ).
  • First audio endpoint 102 may also include a bus (not shown) to connect components of first audio endpoint 102 , including processor 800 , memory 810 , network interface 820 , microphone 310 and/or loudspeaker 312 .
  • Memory 810 may include software instructions that are configured to be executed by processor 800 for providing one or more of the functions or operations of first audio endpoint 102 described above in reference to FIGS. 1 - 7 .
  • memory 810 includes encode/decode logic 812 , adaptive filter module logic 814 , AEC module logic 816 , and/or ultrasonic signal processing logic 818 .
  • encode/decode logic 812 may be configured to provide functions associated with signal decoder/encoder 318 for first audio endpoint 102 , including at least analog-to-digital and digital-to-analog conversion, as well as signal processing functions, such as transmitting and/or receiving signals.
  • Adaptive filter module logic 814 may be configured, for example, to provide functions associated with first adaptive filter 314 for first audio endpoint 102 , as well as third adaptive filter 502 in relevant embodiments, to remove the corresponding loudspeaker reference signals (in the case of first adaptive filter 314 ) or microphone reference signals (in the case of third adaptive filter 502 ).
  • AEC module logic 816 may be configured to provide functions associated with AEC module 410 , including second adaptive filter 316 and/or NLP module 412 for first audio endpoint 102 , including at least filtering of the microphone signal to remove or cancel noise sources associated with loudspeaker 312 .
  • Ultrasonic signal processing logic 818 may be configured to provide functions associated with obtaining/receiving, providing/transmitting, and processing ultrasonic signals from one or more audio endpoints, for example, as may be used by first audio endpoint 102 to locate other audio endpoints within shared space 100 , as detailed in reference to FIG. 2 above.
  • Memory 810 may comprise read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible (e.g., non-transitory) memory storage devices.
  • the processor 800 is, for example, a microprocessor or microcontroller that executes instructions for operating first audio endpoint 102 .
  • the memory 810 may comprise one or more tangible (non-transitory) computer readable storage media (e.g., a memory device) encoded with software comprising computer executable instructions, in particular encode/decode logic 812 , adaptive filter module logic 814 , AEC module logic 816 , and/or ultrasonic signal processing logic 818 , that, when executed by the processor 800 , is operable to perform the operations described herein in connection with FIGS. 1 - 7 .
  • processor 800 may be implemented in hardware, software, or a combination of both. Additionally, processor 800 may include a plurality of processors.
  • the loudspeaker reference signals from other audio endpoints co-located within a shared space are pure noise sources with no contamination of the wanted or intended audio signal from a user, thereby improving performance.
  • using a microphone signal for the same purpose would degrade adaptive noise cancellation performance because the resulting noise reference would not be pure.
  • the techniques of the present embodiments also provide a mechanism that allows the noise signal to be obtained early in the signal processing chain to minimize delay.
  • the increased popularity of shared spaces and VUIs increases the occurrence of noise pollution from co-workers and other users within that shared space.
  • the principles of the example embodiments described herein provide techniques for adaptive noise cancellation across multiple audio endpoints within a shared space to greatly reduce the amount and/or degree of unwanted background noise that is sent to far-end or remote audio endpoint participants and can also improve the performance of VUIs.
  • a method comprising: detecting, by a first audio endpoint, one or more audio endpoints co-located with the first audio endpoint at a first location; identifying a selected audio endpoint of the one or more audio endpoints as a target noise source; obtaining, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint; removing the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint; and providing the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.
  • an apparatus comprising: a microphone; a loudspeaker; a processor in communication with the microphone and the loudspeaker, the processor configured to: detect one or more audio endpoints co-located with the apparatus at a first location; identify a selected audio endpoint of the one or more audio endpoints as a target noise source; obtain, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint; remove the loudspeaker reference signal from a microphone signal associated with the microphone; and provide the microphone signal to at least one of a voice user interface (VUI) or a remote audio endpoint, wherein the remote audio endpoint is located remotely from the first location.
  • one or more non-transitory computer readable storage media are provided that are encoded with instructions that, when executed by a processor of a first audio endpoint, cause the processor to: detect one or more audio endpoints co-located with the first audio endpoint at a first location; identify a selected audio endpoint of the one or more audio endpoints as a target noise source; obtain, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint; remove the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint; and provide the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.
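The core filtering step recited above, removing an obtained loudspeaker (or microphone) reference signal from the local microphone signal with an adaptive filter, is commonly realized with a normalized least-mean-squares (NLMS) update. The patent does not specify an adaptation algorithm, so the pure-Python sketch below is only illustrative: the name `nlms_cancel`, the filter order, and the step size `mu` are assumptions, not details from the disclosure.

```python
def nlms_cancel(mic, ref, order=8, mu=0.5, eps=1e-8):
    """Adaptively estimate the acoustic path from the reference signal
    (e.g., another endpoint's loudspeaker feed) into the local microphone,
    then subtract the estimate, leaving the near-end (wanted) signal."""
    w = [0.0] * order                  # adaptive filter taps
    out = []
    for n in range(len(mic)):
        # most recent `order` reference samples, zero-padded at the start
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        y = sum(wk * xk for wk, xk in zip(w, x))   # estimated noise component
        e = mic[n] - y                             # noise-cancelled output sample
        norm = eps + sum(xk * xk for xk in x)      # input energy for normalization
        w = [wk + (mu / norm) * e * xk for wk, xk in zip(w, x)]
        out.append(e)
    return out
```

In terms of the figures, `ref` plays the role of loudspeaker reference signal 520 (or microphone reference signal 522) obtained over the low-delay LAN connection, and the returned samples correspond to the filtered microphone signal before transmission.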


Abstract

Techniques for adaptive noise cancellation for multiple audio endpoints in a shared space are described. According to one example, a method includes detecting, by a first audio endpoint, one or more audio endpoints co-located with the first audio endpoint at a first location. A selected audio endpoint of the one or more audio endpoints is identified as a target noise source. The method includes obtaining, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint and removing the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint. The method also includes providing the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.

Description

TECHNICAL FIELD
The present disclosure relates to telecommunications audio endpoints.
BACKGROUND
Multiple audio endpoints may often be located in a shared space or common location. In these shared spaces, background noise caused by audio endpoints is often captured by the microphones of other audio endpoints at the common location. This background noise may then be transmitted to a far-end or remote audio endpoint that is participating in a telecommunication session with one of the audio endpoints. Receiving this background noise at the far-end can cause a loss of intelligibility and fatigue to participants in the telecommunication session.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram of a shared space including multiple audio endpoints in which techniques for adaptive noise cancellation may be implemented, according to an example embodiment.
FIG. 2 is a flowchart of a method of locating audio endpoints in a shared space and identifying a target noise source, according to an example embodiment.
FIG. 3 is a diagram illustrating a technique for implementing adaptive noise cancellation for multiple audio endpoints in a shared space, according to an example embodiment.
FIG. 4 is a diagram illustrating a technique for implementing adaptive noise cancellation at an audio endpoint, according to an example embodiment.
FIG. 5 is a diagram illustrating a technique for implementing adaptive noise cancellation for multiple audio endpoints in a shared space, according to another example embodiment.
FIG. 6 is a diagram illustrating a technique for implementing adaptive noise cancellation at an audio endpoint, according to another example embodiment.
FIG. 7 is a flowchart of a method for implementing adaptive noise cancellation at an audio endpoint, according to an example embodiment.
FIG. 8 is a block diagram of an audio endpoint configured to implement techniques for adaptive noise cancellation for multiple audio endpoints in a shared space, according to an example embodiment.
DESCRIPTION OF EXAMPLE EMBODIMENTS
Overview
Presented herein are techniques for implementing adaptive noise cancellation for multiple audio endpoints in a shared space. According to one example embodiment, a method of adaptive noise cancellation for multiple audio endpoints in a shared space includes detecting, by a first audio endpoint, one or more audio endpoints co-located with the first audio endpoint at a first location. The method also includes identifying a selected audio endpoint of the one or more audio endpoints as a target noise source and obtaining, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint. The method includes removing the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint and providing the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.
Example Embodiments
Noise pollution in an acoustically shared space, such as open offices or other common locations, can be caused by other people's conversations and/or by background noise from multiple audio endpoints or other devices being used within the same shared space at the same time. For example, hands-free communication devices, such as phones or video conferencing endpoints, may be used simultaneously in an acoustically shared space by different users on separate telecommunication sessions and devices may be operated within the shared space using voice user interfaces (VUIs), such as personal assistants or other voice-activated software or hardware.
Users within the acoustically shared space can use binaural cues to filter out other people's conversations and background noise to some extent. However, far-end or remote audio endpoint participants and many VUIs that listen to or receive the audio signals from a microphone of an audio endpoint in the shared space cannot use these binaural cues to filter out the noise pollution caused by the conversations and/or other noise in the background. As a result, the far-end or remote audio endpoint participants in a telecommunication session may experience poor speech comprehension caused by receiving an audio mix including the unrelated conversations and background noise, which can lead to a frustrating audio experience for these far-end or remote audio endpoint participants.
According to the principles of the present embodiments, techniques for implementing adaptive noise cancellation for multiple audio endpoints in a shared space are provided. With these techniques, audio signals provided to far-end or remote audio endpoint participants and/or to VUIs may be improved.
The example embodiments described herein provide techniques for adaptive noise cancellation across multiple devices or audio endpoints in an acoustically shared space to reduce the amount and extent of unwanted/unrelated background noise that is sent to far-end or remote audio endpoint participants and to improve the performance of VUIs.
FIG. 1 is a diagram of a shared space 100 including multiple audio endpoints in which techniques for adaptive noise cancellation may be implemented, according to an example embodiment. In an example embodiment, a plurality of audio endpoints may be co-located within an acoustically shared space 100. For example, acoustically shared space 100 may be an open office environment, a conference room, a public area, or other common location where multiple audio endpoints are physically present within acoustic proximity to each other. In this embodiment, shared space 100 includes a first audio endpoint 102, a second audio endpoint 104, a third audio endpoint 106, and additional audio endpoints up to an nth audio endpoint 108.
In some embodiments, one or more of the multiple audio endpoints 102, 104, 106, 108 may be engaged in separate telecommunication sessions with a remote audio endpoint or other far-end participant. In this embodiment, multiple remote audio endpoints, including a first remote audio endpoint 110, a second remote audio endpoint 112, a third remote audio endpoint 114, and up to an nth remote audio endpoint 116 are physically located remotely from shared space 100 and multiple audio endpoints 102, 104, 106, 108. That is, remote audio endpoints 110, 112, 114, 116 are not within acoustic proximity to audio endpoints 102, 104, 106, 108.
Audio endpoints, including any of audio endpoints 102, 104, 106, 108 and/or remote audio endpoints 110, 112, 114, 116, may include various types of devices having at least audio or acoustic telecommunication capabilities. For example, audio endpoints may include conference phones, video conferencing devices, tablets, computers with audio input and output components, electronic personal/home assistants, hands-free/smart speakers (i.e., speakers with voice controls), devices or programs controlled with VUIs, and/or other devices that include at least one speaker and at least one microphone.
In an example embodiment, an audio endpoint in shared space 100, for example, first audio endpoint 102, may implement techniques for adaptive noise cancellation to remove background noise associated with one or more of the other audio endpoints (e.g., second audio endpoint 104, third audio endpoint 106, and/or nth audio endpoint 108) that are also co-located within shared space 100. In one embodiment, first audio endpoint 102 detects one or more audio endpoints that are co-located with first audio endpoint 102 within shared space 100 and are connected to a common local area network (LAN). For example, audio endpoints 102, 104, 106, 108 may communicate with each other, remote audio endpoints 110, 112, 114, 116, or any other devices by accessing the LAN through a LAN access point (AP) 120. LAN access point 120 may provide a connection to a network, such as the Internet, the public switched telephone network (PSTN), or any other wired or wireless network, including LANs and wide-area networks (WANs), to permit audio endpoints 102, 104, 106, 108 to engage in a telecommunication session.
In one embodiment, the presence of other audio endpoints within shared space 100 may be detected or determined by first audio endpoint 102 using an ultrasonic signal obtained from one or more of the other audio endpoints (e.g., second audio endpoint 104, third audio endpoint 106, and/or nth audio endpoint 108). For example, audio endpoints 104, 106, 108 may transmit or provide an ultrasonic proximity signal that broadcasts each audio endpoint's Internet Protocol (IP) address in the high-frequency audio spectrum (e.g., above 16-17 kHz). As shown in FIG. 1 , first audio endpoint 102 may receive a first ultrasonic signal 122 from second audio endpoint 104, a second ultrasonic signal 124 from third audio endpoint 106, and a third ultrasonic signal 126 from nth audio endpoint 108.
In some embodiments, each audio endpoint 102, 104, 106, 108 may use an ultrasonic encoding technique that permits multiple concurrent broadcasts or using a “first-come, first-serve” method to transmit its ultrasonic signal to other endpoints to locate each of audio endpoints 102, 104, 106, 108 in shared space 100. In other embodiments, detecting or locating each of audio endpoints 102, 104, 106, 108 in shared space 100 may be set up manually.
Once each of audio endpoints 102, 104, 106, 108 has been detected or located within shared space 100, clock-synchronization and a low-delay LAN connection may be established between one or more of audio endpoints 102, 104, 106, 108. For example, as shown in FIG. 1 , first audio endpoint 102 may synchronize clocks and establish a low-delay LAN connection with each of second audio endpoint 104, third audio endpoint 106, and/or nth audio endpoint 108. The network delay may be short compared to the acoustical delay in shared space 100. Clock synchronization between the analog-to-digital (ADC) and digital-to-analog (DAC) converters associated with the audio transducers inside each of audio endpoints 102, 104, 106, 108 may be accomplished according to known techniques. For example, using Precision Time Protocol (PTP) standard defined by Institute of Electrical and Electronics Engineers (IEEE) 1588 and/or Audio Video Bridging (AVB) and Time Synchronized Networking (TSN) standards, the specifications of which standards are hereby incorporated by reference in their entirety.
After detecting each of the other audio endpoints in shared space 100 and setting up clock synchronization and the low-delay LAN connection, first audio endpoint 102 may next identify a selected audio endpoint as a target noise source, as will be described in more detail below. In some embodiments, computational network resources may be limited. Accordingly, a method 200 of detecting audio endpoints in shared space 100 and identifying a target noise source may be used to select the audio endpoint associated with the worst or highest anticipated noise level. In other embodiments, however, where additional computational network resources are available, additional audio endpoints may be identified as target noise sources for adaptive noise cancellation techniques according to the example embodiments described herein.
Referring now to FIG. 2 , a flowchart of method 200 of detecting audio endpoints in shared space 100 and identifying a target noise source is illustrated according to an example embodiment. In this embodiment, method 200 may begin at an operation 202 where each audio endpoint in shared space 100 plays or emits an ultrasonic signal from its loudspeaker. Next, at an operation 204, a subject audio endpoint (e.g., first audio endpoint 102) listens or obtains other ultrasonic signals from one or more of the other audio endpoints (e.g., second audio endpoint 104, third audio endpoint 106, and/or nth audio endpoint 108) in shared space 100 using its microphone.
After operation 204, first audio endpoint 102 may establish low-delay LAN connections with each detected audio endpoint, for example, second audio endpoint 104, third audio endpoint 106, and/or nth audio endpoint 108. Optionally, in some embodiments, first audio endpoint 102 may also establish clock synchronization with each detected audio endpoint. Method 200 may proceed to operations 206, 208 to obtain information for determining associated noise levels of each of the detected audio endpoints. For example, the information may be obtained from operation 206 where first audio endpoint 102 determines an ultrasonic signal receive level (i.e., a higher receive level indicates a closer proximity to first audio endpoint 102) for each located audio endpoint (e.g., second audio endpoint 104, third audio endpoint 106, and/or nth audio endpoint 108). The information may also be obtained from operation 208 where loudspeaker volume settings and/or call status (i.e., whether or not an audio endpoint is currently participating in a telecommunication session) is obtained by first audio endpoint 102 for each of the other audio endpoints 104, 106, 108.
At an operation 210, first audio endpoint 102 may compute or determine an anticipated noise level for each other audio endpoint 104, 106, 108. Anticipated noise level may be determined using a variety of factors and/or information obtained from each other audio endpoint 104, 106, 108. For example, some of the factors and/or information that may be used by first audio endpoint 102 to determine the anticipated noise levels include: the ultrasonic signal receive level (e.g., obtained from operation 206), metadata obtained over the low-delay LAN connections (e.g., loudspeaker volume settings, call status, and other signal levels obtained from operation 208), cross-correlations of received microphone signals with local microphone signals, and distance and/or direction information (e.g., which may be obtained using triangulation techniques from a microphone array).
Based on this information, method 200 may proceed to an operation 212 where first audio endpoint 102 may assemble or determine a ranked list of detected audio endpoints 104, 106, 108 that is prioritized based on the determined anticipated noise levels from operation 210. For example, audio endpoints having higher anticipated noise levels are ranked higher on the list than those with lower anticipated noise levels.
At an operation 214, first audio endpoint 102 picks or selects one or more of the audio endpoints associated with the highest ranked anticipated noise levels from operation 212. For example, at operation 214, first audio endpoint 102 may identify a selected audio endpoint associated with the highest ranked anticipated noise level from operation 212 as a target noise source for the purposes of implementing techniques for adaptive noise cancellation to remove background noise associated with the selected audio endpoint.
In one embodiment, a single audio endpoint may be selected as being associated with the worst or highest anticipated noise level for adaptive noise cancellation. In other embodiments, however, two or more audio endpoints may be identified as selected audio endpoints associated with target noise sources for adaptive noise cancellation. For example, audio endpoints associated with an anticipated noise level that exceeds a predetermined threshold may be identified as selected audio endpoints associated with target noise sources for adaptive noise cancellation.
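The ranking and selection of operations 210-214 can be sketched in code. The following Python sketch is illustrative only: the factor set, the scoring weights, and the threshold value are assumptions made for the example, not values specified by the disclosure.

```python
# Illustrative sketch of operations 210-214: score each detected endpoint,
# rank by anticipated noise level, and select targets for noise cancellation.
# The weights and threshold below are hypothetical assumptions.

def anticipated_noise_level(endpoint):
    """Combine per-endpoint factors into a single score (higher = noisier)."""
    score = 0.0
    # Ultrasonic receive level: a higher level implies closer proximity.
    score += 2.0 * endpoint["ultrasonic_rx_level"]
    # Loudspeaker volume setting reported over the low-delay LAN connection.
    score += 1.0 * endpoint["volume"]
    # An endpoint in an active call is more likely to emit loudspeaker audio.
    if endpoint["in_call"]:
        score += 1.5
    return score

def select_target_noise_sources(endpoints, threshold=2.5):
    """Rank endpoints by anticipated noise level; keep those above threshold."""
    ranked = sorted(endpoints, key=anticipated_noise_level, reverse=True)
    return [e for e in ranked if anticipated_noise_level(e) > threshold]

endpoints = [
    {"id": "ep104", "ultrasonic_rx_level": 0.9, "volume": 0.8, "in_call": True},
    {"id": "ep106", "ultrasonic_rx_level": 0.2, "volume": 0.5, "in_call": False},
    {"id": "ep108", "ultrasonic_rx_level": 0.6, "volume": 0.1, "in_call": True},
]
targets = select_target_noise_sources(endpoints)
```

With these assumed weights, the nearby in-call endpoints rank above the idle one, mirroring the prioritized list of operation 212.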
Referring now to FIG. 3 , a technique 300 for implementing adaptive noise cancellation for multiple audio endpoints in shared space 100 is shown according to an example embodiment. In this embodiment, two audio endpoints (e.g., first audio endpoint 102 and second audio endpoint 104) are shown in acoustically shared space 100. A first user 302 is using first audio endpoint 102 to engage in a telecommunication session with a first remote audio endpoint 110. Simultaneously within shared space 100, a second user 304 is using second audio endpoint 104 to engage in a separate audio or acoustical session that is independent from the telecommunication session between first audio endpoint 102 and first remote audio endpoint 110.
For example, second user 304 may be using second audio endpoint 104 to engage in a separate telecommunication session with a different remote audio endpoint, such as second remote audio endpoint 112. Second user 304 may alternatively or additionally be using second audio endpoint 104 to engage in some other type of separate audio or acoustical session. For example, second user 304 may be receiving calls or messages on second audio endpoint 104 that generate a ringtone, playing music on a loudspeaker associated with second audio endpoint 104, and/or may be communicating with a VUI embedded or in communication with second audio endpoint 104.
Within shared space 100, first audio endpoint 102 and second audio endpoint 104 are both connected to a network (e.g., a LAN via LAN AP 120, shown in FIG. 1 ) to allow communication with other devices and/or participants. Additionally, as previously described above, first audio endpoint 102 and second audio endpoint 104 may be connected to each other via a low-delay LAN connection 306. LAN connection 306 allows first audio endpoint 102 and second audio endpoint 104 to exchange various information. In this example, first audio endpoint 102 includes a microphone 310, a loudspeaker 312, and one or more signal processing components, including a first adaptive filter 314, a second adaptive filter 316 that is part of an acoustic echo cancellation (AEC) module 410 (shown in FIG. 4 ), and a signal decoder/encoder 318. Second audio endpoint 104 has a similar configuration, including a microphone 320, a loudspeaker 322, and one or more signal processing components, including a first adaptive filter 324, a second adaptive filter 326 that is part of an AEC module, and a signal decoder/encoder 328.
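Exchanging reference signals over low-delay LAN connection 306 implies some wire format for audio frames. The disclosure does not specify one; the sketch below assumes a minimal hypothetical framing (a sequence number, useful for delay alignment, followed by little-endian float32 samples) purely for illustration.

```python
import struct

def pack_reference_frame(seq, samples):
    """Serialize one reference-signal frame for transmission over the
    low-delay LAN connection: a uint32 sequence number plus float32 samples.
    This wire format is a hypothetical assumption, not from the disclosure."""
    return struct.pack(f"<I{len(samples)}f", seq, *samples)

def unpack_reference_frame(payload):
    """Parse a frame produced by pack_reference_frame."""
    (seq,) = struct.unpack_from("<I", payload)
    n = (len(payload) - 4) // 4
    samples = list(struct.unpack_from(f"<{n}f", payload, 4))
    return seq, samples

payload = pack_reference_frame(7, [0.5, -1.0, 0.25])
seq, samples = unpack_reference_frame(payload)
```

The sample values chosen here are exactly representable in float32, so the round trip is lossless; real audio samples would incur normal float32 quantization.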
Technique 300 for implementing adaptive noise cancellation for audio endpoints in shared space 100 may be described with reference to first audio endpoint 102. In this embodiment, microphone 310 of first audio endpoint 102 is receiving inputs from several different audio sources within shared space 100. For example, microphone 310 receives a first audio input 330 from first user 302 who is using first audio endpoint 102 to conduct a telecommunication session with first remote audio endpoint 110. In this example, first audio input 330 is the intended audio content that first user 302 is providing to first remote audio endpoint 110 via a transmitted microphone signal 336. Microphone 310 also picks up or receives echo and/or noise from other audio sources within shared space 100, including an echo source 332 output from loudspeaker 312 of first audio endpoint 102 and a first noise source 334 output from loudspeaker 322 of second audio endpoint 104.
The example embodiments presented herein provide a technique of implementing adaptive noise cancellation to remove these additional unwanted noise sources from microphone signal 336 provided to first remote audio endpoint 110 from first audio endpoint 102. In this embodiment, first audio endpoint 102 may implement adaptive noise cancellation of first noise source 334 output from loudspeaker 322 of second audio endpoint 104 by obtaining from second audio endpoint 104 a loudspeaker reference signal 338 that may then be removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using first adaptive filter 314. As shown in FIG. 3 , first audio endpoint 102 receives or obtains loudspeaker reference signal 338 from second audio endpoint 104 via low-delay LAN connection 306. In this embodiment, loudspeaker reference signal 338 is the audio signal provided from signal decoder 328 of second audio endpoint 104 that is to be output from loudspeaker 322. For example, loudspeaker reference signal 338 may be based on received audio signals from second remote audio endpoint 112.
At first audio endpoint 102, loudspeaker reference signal 338 is removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using first adaptive filter 314. That is, loudspeaker reference signal 338 corresponds to first noise source 334 output from loudspeaker 322 of second audio endpoint 104 and picked up by microphone 310 of first audio endpoint 102. With this arrangement, first adaptive filter 314 uses loudspeaker reference signal 338 to remove the contribution of first noise source 334 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 336 is provided or transmitted to first remote audio endpoint 110.
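A common way to realize an adaptive filter such as first adaptive filter 314 is the normalized least-mean-squares (NLMS) algorithm, which estimates the acoustic path h21 from the reference to the microphone and subtracts the estimated contribution. The disclosure does not prescribe a specific algorithm; this pure-Python sketch uses NLMS with illustrative tap-count and step-size parameters.

```python
import random

def nlms_cancel(mic, ref, num_taps=4, mu=0.5, eps=1e-6):
    """Normalized LMS: adapt a short FIR estimate of the path from the
    reference (the other endpoint's loudspeaker signal) to the local
    microphone, and subtract the estimated noise from the mic signal."""
    w = [0.0] * num_taps                          # adaptive filter coefficients
    residual = []
    for n in range(len(mic)):
        # Most recent num_taps reference samples (zero-padded at the start).
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))  # estimated noise at the mic
        e = mic[n] - y                            # residual after cancellation
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
        residual.append(e)
    return residual

# Simulate: microphone 310 picks up the reference through a short path h21.
random.seed(0)
ref = [random.uniform(-1, 1) for _ in range(2000)]
h21 = [0.5, 0.3]                                  # hypothetical impulse response
mic = [h21[0] * ref[n] + (h21[1] * ref[n - 1] if n > 0 else 0.0)
       for n in range(len(ref))]
residual = nlms_cancel(mic, ref)
```

Because the simulated path fits within the filter's taps and the excitation is broadband, the residual decays essentially to zero, which is the behavior the endpoint relies on to strip first noise source 334 from transmitted microphone signal 336.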
Additionally, in some embodiments, first audio endpoint 102 may further include second adaptive filter 316 that removes the contribution of echo source 332 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 336 is provided or transmitted to first remote audio endpoint 110.
Technique 300 for implementing adaptive noise cancellation for audio endpoints in shared space 100 may also be described with reference to second audio endpoint 104. That is, each audio endpoint in shared space 100 may implement adaptive noise cancellation to remove noise sources from the other audio endpoints within shared space 100. For example, microphone 320 of second audio endpoint 104 receives inputs from a first audio input 340 from second user 304 who is using second audio endpoint 104 to conduct a separate telecommunication or other audio/acoustical session with second remote audio endpoint 112. In this example, first audio input 340 is the intended audio content that second user 304 is providing to second remote audio endpoint 112 via a transmitted microphone signal 346. As in the previous example, microphone 320 also picks up or receives echo and/or noise from other audio sources within shared space 100, including an echo source 342 output from loudspeaker 322 of second audio endpoint 104 and a first noise source 344 output from loudspeaker 312 of first audio endpoint 102.
At second audio endpoint 104, a loudspeaker reference signal 348 is provided from first audio endpoint 102 via LAN connection 306. Loudspeaker reference signal 348 corresponds to first noise source 344 output from loudspeaker 312 of first audio endpoint 102 and picked up by microphone 320 of second audio endpoint 104. This loudspeaker reference signal 348 is removed from the microphone signal associated with microphone 320 of second audio endpoint 104 using first adaptive filter 324. Additionally, in some embodiments, second audio endpoint 104 may further include second adaptive filter 326 that removes the contribution of echo source 342 from the microphone signal associated with microphone 320 of second audio endpoint 104 before microphone signal 346 is provided or transmitted to second remote audio endpoint 112.
Referring now to FIG. 4 , a simplified representative diagram illustrates technique 300 for implementing adaptive noise cancellation at first audio endpoint 102, according to an example embodiment. As described above, microphone 310 of first audio endpoint 102 is associated with a microphone signal that includes multiple components from different audio sources. In this embodiment, the microphone signal includes first audio input 330 from first user 302, echo source 332 (h11) output from loudspeaker 312 of first audio endpoint 102, and first noise source 334 (h21) output from loudspeaker 322 of second audio endpoint 104. The microphone signal from microphone 310 is then provided to an analog-to-digital converter (ADC) 400.
As shown in FIG. 4 , the digital microphone signal from ADC 400 passes to first adaptive filter module 314, which removes first noise source 334 (h21) from the microphone signal using loudspeaker reference signal 338 that is obtained from second audio endpoint 104 via LAN connection 306. Additionally, second adaptive filter module 316 removes echo source 332 (h11) from the microphone signal from loudspeaker 312 of first audio endpoint 102. Second adaptive filter module 316 may remove echo source 332 using a corresponding loudspeaker reference signal (414) from loudspeaker 312 of first audio endpoint 102. In this embodiment, loudspeaker reference signal 414 may be obtained before the signal is provided to a digital-to-analog converter (DAC) 402 for output by loudspeaker 312. With this arrangement, transmitted microphone signal 336 provided from encode/decode module 318 of first audio endpoint 102 to first remote audio endpoint 110 may have contributions from unwanted noise sources removed (e.g., echo source 332 and first noise source 334) so that first remote audio endpoint 110 receives the content of first audio input 330 from first user 302 in a clear manner.
In an example embodiment, echo at first audio endpoint 102 caused by first remote audio endpoint 110 may be suppressed using AEC module 410. In one embodiment, AEC module 410 includes second filter module 316, which may be a linear AEC portion, followed by a non-linear AEC portion (e.g., a Non-Linear Processing (NLP) module 412). Additionally, in an example embodiment, first adaptive filter module 314 may include a linear portion, without a non-linear (NLP) portion. With this configuration, the linear portion of the first adaptive filter module 314 may sufficiently attenuate background noise from co-workers and co-located audio endpoints in shared space 100 without using NLP, which can cause additional attenuation of microphone signal 336 that is provided to first remote audio endpoint 110 and result in a less duplex experience for telecommunication session participants.
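The contrast between the linear-only filter 314 and the linear-plus-NLP chain of AEC module 410 can be illustrated with a toy NLP stage: a gate that strongly attenuates the residual whenever far-end power is high. The gate and attenuation values below are invented for the example; the point is that such a stage also attenuates any near-end speech present in the residual.

```python
def nlp_suppress(residual, far_end_power, gate=0.1, attenuation=0.05):
    """Toy non-linear processing (NLP) stage: when the far-end (reference)
    power in a frame exceeds the gate, heavily attenuate the residual to
    mask nonlinear echo left by the linear filter. Gate and attenuation
    values are illustrative assumptions, not from the disclosure."""
    return [e * attenuation if p > gate else e
            for e, p in zip(residual, far_end_power)]

# During far-end activity (power 0.5) the residual, including any near-end
# speech mixed into it, is suppressed; during silence it passes unchanged.
out = nlp_suppress([1.0, 1.0], [0.5, 0.0])
```

This illustrates why, per the embodiment above, first adaptive filter module 314 may be kept linear-only: an NLP stage applied to noise from co-located endpoints would also attenuate first user 302's speech whenever second audio endpoint 104 is active, degrading duplex behavior.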
In some embodiments, techniques for implementing adaptive noise cancellation for audio endpoints may further include removing a microphone reference signal from other audio endpoints in shared space 100. Referring now to FIG. 5 , a technique 500 for implementing adaptive noise cancellation for multiple audio endpoints in shared space 100 is shown according to an example embodiment. In this embodiment, two audio endpoints (e.g., first audio endpoint 102 and second audio endpoint 104) are shown in acoustically shared space 100. A first user 302 is using first audio endpoint 102 to engage in a telecommunication session with a first remote audio endpoint 110. Simultaneously within shared space 100, a second user 304 is using second audio endpoint 104 to engage in a separate audio or acoustical session that is independent from the telecommunication session between first audio endpoint 102 and first remote audio endpoint 110, as detailed above with reference to FIG. 3 .
Within shared space 100, first audio endpoint 102 and second audio endpoint 104 are both connected to a network (e.g., a LAN via LAN AP 120, shown in FIG. 1 ) to allow communication with other devices and/or participants. Additionally, as previously described above, first audio endpoint 102 and second audio endpoint 104 may be connected to each other via low-delay LAN connection 306. In this example, first audio endpoint 102 includes microphone 310, loudspeaker 312, and one or more signal processing components, including first adaptive filter 314, second adaptive filter 316, a third adaptive filter 502, and signal decoder/encoder 318. Second audio endpoint 104 has a similar configuration, including microphone 320, loudspeaker 322, and one or more signal processing components, including first adaptive filter 324, second adaptive filter 326, a third adaptive filter 504, and signal decoder/encoder 328.
Technique 500 for implementing adaptive noise cancellation for audio endpoints in shared space 100 may be described with reference to first audio endpoint 102. In this embodiment, microphone 310 of first audio endpoint 102 is receiving inputs from several different audio sources within shared space 100. For example, microphone 310 receives a first audio input 510 from first user 302 who is using first audio endpoint 102 to conduct a telecommunication session with first remote audio endpoint 110. In this example, first audio input 510 is the intended audio content that first user 302 is providing to first remote audio endpoint 110 via a transmitted microphone signal 518. Microphone 310 also picks up or receives echo and/or noise from other audio sources within shared space 100, including an echo source 512 output from loudspeaker 312 of first audio endpoint 102, a first noise source 514 output from loudspeaker 322 of second audio endpoint 104, and a second noise source 516 output from second user 304.
The example embodiments presented herein provide a technique for implementing adaptive noise cancellation to remove these additional unwanted noise sources from microphone signal 518 provided to first remote audio endpoint 110 from first audio endpoint 102. In this embodiment, first audio endpoint 102 may implement adaptive noise cancellation of first noise source 514 output from loudspeaker 322 of second audio endpoint 104 and second noise source 516 from second user 304 by obtaining from second audio endpoint 104 a loudspeaker reference signal 520 that corresponds to a signal to be output from loudspeaker 322 and a microphone reference signal 522 that corresponds to an audio stream that is input to microphone 320 of second audio endpoint 104 (e.g., a first audio input 530 from second user 304).
In this embodiment, each of loudspeaker reference signal 520 and microphone reference signal 522 may be removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using corresponding adaptive filters 314, 502. For example, first adaptive filter 314 is configured to remove loudspeaker reference signal 520 and third adaptive filter 502 is configured to remove microphone reference signal 522. As shown in FIG. 5 , first audio endpoint 102 receives or obtains loudspeaker reference signal 520 and microphone reference signal 522 from second audio endpoint 104 via low-delay LAN connection 306. In this embodiment, loudspeaker reference signal 520 is the audio signal provided from signal decoder 328 of second audio endpoint 104 that is to be output from loudspeaker 322. For example, loudspeaker reference signal 520 may be based on received audio signals from second remote audio endpoint 112. Also in this embodiment, microphone reference signal 522 is the audio stream provided from microphone 320 of second audio endpoint 104 obtained from first audio input 530 provided by second user 304.
At first audio endpoint 102, loudspeaker reference signal 520 is removed from the microphone signal associated with microphone 310 of first audio endpoint 102 using first adaptive filter 314. That is, loudspeaker reference signal 520 corresponds to first noise source 514 output from loudspeaker 322 of second audio endpoint 104 and picked up by microphone 310 of first audio endpoint 102. Additionally, in the embodiment of FIG. 5 , technique 500 further includes removing microphone reference signal 522 from the microphone signal associated with microphone 310 of first audio endpoint 102 using third adaptive filter 502. That is, microphone reference signal 522 corresponds to second noise source 516 from second user 304 and picked up by microphone 310 of first audio endpoint 102. With this arrangement, first adaptive filter 314 uses loudspeaker reference signal 520 to remove the contribution of first noise source 514 and third adaptive filter 502 uses microphone reference signal 522 to remove the contribution of second noise source 516 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 518 is provided or transmitted to first remote audio endpoint 110.
Additionally, in some embodiments, first audio endpoint 102 may further include second adaptive filter 316 that removes the contribution of echo source 512 from the microphone signal associated with microphone 310 of first audio endpoint 102 before microphone signal 518 is provided or transmitted to first remote audio endpoint 110.
Technique 500 for implementing adaptive noise cancellation for audio endpoints in shared space 100 may also be described with reference to second audio endpoint 104. That is, each audio endpoint in shared space 100 may implement adaptive noise cancellation to remove noise sources from the other audio endpoints within shared space 100. For example, microphone 320 of second audio endpoint 104 receives inputs from a first audio input 530 from second user 304 who is using second audio endpoint 104 to conduct a separate telecommunication or other audio/acoustical session with second remote audio endpoint 112. In this example, first audio input 530 is the intended audio content that second user 304 is providing to second remote audio endpoint 112 via a transmitted microphone signal 538. As in the previous example, microphone 320 also picks up or receives echo and/or noise from other audio sources within shared space 100, including an echo source 532 output from loudspeaker 322 of second audio endpoint 104, a first noise source 534 output from loudspeaker 312 of first audio endpoint 102, and a second noise source 536 output from first user 302.
At second audio endpoint 104, a loudspeaker reference signal 540 and a microphone reference signal 542 are provided from first audio endpoint 102 via LAN connection 306. Loudspeaker reference signal 540 corresponds to first noise source 534 output from loudspeaker 312 of first audio endpoint 102 and picked up by microphone 320 of second audio endpoint 104 and microphone reference signal 542 corresponds to second noise source 536 from first user 302 that is input to microphone 310 of first audio endpoint 102 (e.g., first audio input 510 from first user 302).
The loudspeaker reference signal 540 is removed from the microphone signal associated with microphone 320 of second audio endpoint 104 using first adaptive filter 324, and the microphone reference signal 542 is removed from the microphone signal associated with microphone 320 of second audio endpoint 104 using third adaptive filter 504. Additionally, in some embodiments, second audio endpoint 104 may further include second adaptive filter 326 that removes the contribution of echo source 532 from the microphone signal associated with microphone 320 of second audio endpoint 104 before microphone signal 538 is provided or transmitted to second remote audio endpoint 112.
Referring now to FIG. 6 , a simplified representative diagram illustrates technique 500 for implementing adaptive noise cancellation at first audio endpoint 102, according to an example embodiment. As described above, microphone 310 of first audio endpoint 102 is associated with a microphone signal that includes multiple components from different audio sources. In this embodiment, the microphone signal includes first audio input 510 from first user 302, echo source 512 (h11) output from loudspeaker 312 of first audio endpoint 102, first noise source 514 (h21) output from loudspeaker 322 of second audio endpoint 104, and second noise source 516 (h2u1) output from second user 304. The microphone signal from microphone 310 is then provided to ADC 400.
As shown in FIG. 6 , the digital microphone signal from ADC 400 passes to first adaptive filter module 314, which removes first noise source 514 (h21) from the microphone signal using loudspeaker reference signal 520 that is obtained from second audio endpoint 104 via LAN connection 306. Similarly, third adaptive filter module 502 removes second noise source 516 (h2u1) from the microphone signal using microphone reference signal 522 that is obtained from second audio endpoint 104 via LAN connection 306. Additionally, AEC module 410 may be used to remove echo source 512 (h11), including second adaptive filter module 316 that removes echo source 512 (h11) from the microphone signal from loudspeaker 312 of first audio endpoint 102. Second adaptive filter module 316 may remove echo source 512 using a corresponding loudspeaker reference signal (600) from loudspeaker 312 of first audio endpoint 102. In this embodiment, loudspeaker reference signal 600 may be obtained before the signal is provided to DAC 402 for output by loudspeaker 312. Additionally, AEC module 410 also includes NLP module 412 that may be used to further remove echo source 512 from the microphone signal before it is provided to encode/decode module 318. With this arrangement, transmitted microphone signal 518 provided from encode/decode module 318 of first audio endpoint 102 to first remote audio endpoint 110 may have contributions from unwanted noise sources removed (e.g., echo source 512, first noise source 514, and second noise source 516) so that first remote audio endpoint 110 receives the content of first audio input 510 from first user 302 in a clear manner.
Referring now to FIG. 7 , a flowchart of a method 700 for implementing adaptive noise cancellation at an audio endpoint according to an example embodiment is illustrated. Method 700 may be implemented by one or more audio endpoints within a shared space with other audio endpoints. For example, method 700 may be implemented by first audio endpoint 102 co-located with one or more other audio endpoints within shared space 100, as described above.
In this embodiment, method 700 may begin at an operation 702 where one or more audio endpoints are detected or located at a first location. For example, first audio endpoint 102 may detect one or more of audio endpoints 104, 106, 108 within shared space 100 using ultrasonic signals, as described in reference to FIG. 2 above. Additionally, in some embodiments, detecting the audio endpoints within shared space 100 may further include locating the audio endpoints relative to first audio endpoint 102, for example, using information received from ultrasonic signals, metadata, and/or a microphone array. Next, method 700 includes an operation 704 where a selected audio endpoint is identified as a target noise source. For example, first audio endpoint 102 may use method 200 to identify a selected audio endpoint as a target noise source according to the techniques described above in reference to FIG. 2 .
Next, at an operation 706, a loudspeaker reference signal is obtained from the selected audio endpoint. For example, as shown in FIG. 3 , first audio endpoint 102 may obtain loudspeaker reference signal 338 from second audio endpoint 104 via low-delay LAN connection 306. Upon obtaining the loudspeaker reference signal at operation 706, method 700 further includes an operation 708 where the loudspeaker reference signal is removed from the microphone signal. For example, as shown in FIG. 4 , first audio endpoint 102 removes loudspeaker reference signal 338 from the microphone signal from microphone 310 before microphone signal 336 is provided to first remote audio endpoint 110.
Optionally, as described with reference to FIGS. 5 and 6 above, method 700 may also include operations (not shown) for obtaining a microphone reference signal from the selected audio endpoint and removing that microphone reference signal from the microphone signal before it is transmitted to a remote audio endpoint.
Additionally, method 700 may further include operations (not shown) to remove echo noise components from the microphone signal before it is transmitted, for example, using AEC module 410, including second adaptive filter 316 and/or NLP module 412, described above in reference to FIGS. 3-6 .
Method 700 may end with an operation 710 where the filtered microphone signal is provided to a remote audio endpoint. For example, first audio endpoint 102 may provide or transmit microphone signal 336 that has been filtered to remove noise components to first remote audio endpoint 110.
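The overall flow of method 700 can be sketched as a small orchestration loop. All class and function names here are hypothetical, and the fixed-gain cancellation stands in for the adaptive filtering of FIGS. 3-6.

```python
class Neighbor:
    """Hypothetical stand-in for a co-located endpoint reachable over the
    low-delay LAN connection. Names and API are assumptions for this sketch."""
    def __init__(self, name, anticipated_noise, ref_frames):
        self.name = name
        self.anticipated_noise = anticipated_noise
        self._frames = iter(ref_frames)

    def loudspeaker_reference_frame(self):
        return next(self._frames)

def run_method_700(mic_frames, neighbors, cancel):
    """Operations 702-710: the neighbors list plays the role of detection
    (702); the noisiest neighbor becomes the target (704); per frame we
    obtain its loudspeaker reference (706), cancel it from the microphone
    signal (708), and collect the filtered frame for transmission (710)."""
    target = max(neighbors, key=lambda n: n.anticipated_noise)
    transmitted = []
    for mic in mic_frames:
        ref = target.loudspeaker_reference_frame()
        transmitted.append(cancel(mic, ref))
    return target.name, transmitted

def cancel(mic, ref):
    # Toy cancellation with a known coupling gain of 0.5; a real system
    # would estimate the gain adaptively, as in FIGS. 3-6.
    return [m - 0.5 * r for m, r in zip(mic, ref)]

neighbors = [
    Neighbor("ep106", anticipated_noise=0.9, ref_frames=[[0.0, 0.0]]),
    Neighbor("ep104", anticipated_noise=2.8, ref_frames=[[2.0, -2.0]]),
]
speech = [[0.25, 0.5]]                                 # intended audio input
mic_frames = [[s + 0.5 * r for s, r in zip(f, [2.0, -2.0])] for f in speech]
name, sent = run_method_700(mic_frames, neighbors, cancel)
```

With the assumed gain known exactly (and sample values exactly representable in binary floating point), the filtered output reproduces the intended speech frame.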
FIG. 8 is an electrical block diagram of first audio endpoint 102, according to an example embodiment. As described above, first audio endpoint 102 includes at least one microphone 310 and at least one loudspeaker 312. First audio endpoint 102 also includes a processor 800, a memory 810, and a network interface 820 comprising one or more ports. For simplicity, the network interface 820 and associated one or more ports may be referred to collectively as a network interface unit. Network interface 820 may be used by first audio endpoint 102 to establish a network connection via LAN AP 120 to conduct a telecommunication session with a remote audio endpoint. Network interface 820 may also be used by first audio endpoint 102 to establish a low-delay LAN connection with other audio endpoints co-located within shared space 100 (e.g., LAN connection 306). First audio endpoint 102 may also include a bus (not shown) to connect components of first audio endpoint 102, including processor 800, memory 810, network interface 820, microphone 310 and/or loudspeaker 312.
Memory 810 may include software instructions that are configured to be executed by processor 800 for providing one or more of the functions or operations of first audio endpoint 102 described above in reference to FIGS. 1-7 . In this embodiment, memory 810 includes encode/decode logic 812, adaptive filter module logic 814, AEC module logic 816, and/or ultrasonic signal processing logic 818. For example, encode/decode logic 812 may be configured to provide functions associated with signal decoder/encoder 318 for first audio endpoint 102, including at least analog-to-digital and digital-to-analog conversion, as well as signal processing functions, such as transmitting and/or receiving signals. Adaptive filter module logic 814 may be configured, for example, to provide functions associated with first adaptive filter 314 for first audio endpoint 102, as well as third adaptive filter 502 in relevant embodiments, to remove the corresponding loudspeaker reference signals (in the case of first adaptive filter 314) or microphone reference signals (in the case of third adaptive filter 502).
AEC module logic 816 may be configured to provide functions associated with AEC module 410, including second adaptive filter 316 and/or NLP module 412 for first audio endpoint 102, including at least filtering of the microphone signal to remove or cancel noise sources associated with loudspeaker 312. Ultrasonic signal processing logic 818 may be configured to provide functions associated with obtaining/receiving, providing/transmitting, and processing ultrasonic signals from one or more audio endpoints, for example, as may be used by first audio endpoint 102 to locate other audio endpoints within shared space 100, as detailed in reference to FIG. 2 above.
Memory 810 may comprise read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible (e.g., non-transitory) memory storage devices. The processor 800 is, for example, a microprocessor or microcontroller that executes instructions for operating first audio endpoint 102. Thus, in general, the memory 810 may comprise one or more tangible (non-transitory) computer readable storage media (e.g., a memory device) encoded with software comprising computer executable instructions; when the software, and in particular encode/decode logic 812, adaptive filter module logic 814, AEC module logic 816, and/or ultrasonic signal processing logic 818, is executed by the processor 800, it is operable to perform the operations described herein in connection with FIGS. 1-7 .
It should be understood that one or more functions of processor 800, including encode/decode logic 812, adaptive filter module logic 814, AEC module logic 816, and/or ultrasonic signal processing logic 818, or other components, may be configured in separate hardware, software, or a combination of both. Additionally, processor 800 may include a plurality of processors.
In accordance with the principles described herein, the loudspeaker reference signals from other audio endpoints co-located within a shared space are pure noise sources with no contamination of the wanted or intended audio signal from a user, thereby improving performance. In contrast, using a microphone for the same purposes would degrade the adaptive noise cancellation performance because the resulting noise source would not be pure. Additionally, the techniques of the present embodiments also provide a mechanism that allows the noise signal to be obtained early in the signal processing chain to minimize delay.
The growing popularity of shared spaces and VUIs increases the occurrence of noise pollution from co-workers and other users within those spaces. The principles of the example embodiments described herein provide techniques for adaptive noise cancellation across multiple audio endpoints within a shared space to greatly reduce the amount and/or degree of unwanted background noise that is sent to far-end or remote audio endpoint participants and can also improve the performance of VUIs.
To summarize, in one form, a method is provided comprising: detecting, by a first audio endpoint, one or more audio endpoints co-located with the first audio endpoint at a first location; identifying a selected audio endpoint of the one or more audio endpoints as a target noise source; obtaining, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint; removing the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint; and providing the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.
In another form, an apparatus is provided comprising: a microphone; a loudspeaker; a processor in communication with the microphone and the loudspeaker, the processor configured to: detect one or more audio endpoints co-located with the apparatus at a first location; identify a selected audio endpoint of the one or more audio endpoints as a target noise source; obtain, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint; remove the loudspeaker reference signal from a microphone signal associated with the microphone; and provide the microphone signal to at least one of a voice user interface (VUI) or a remote audio endpoint, wherein the remote audio endpoint is located remotely from the first location.
In yet another form, one or more non-transitory computer readable storage media are provided that are encoded with instructions that, when executed by a processor of a first audio endpoint, cause the processor to: detect one or more audio endpoints co-located with the first audio endpoint at a first location; identify a selected audio endpoint of the one or more audio endpoints as a target noise source; obtain, from the selected audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the selected audio endpoint; remove the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint; and provide the microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a second audio endpoint, wherein the second audio endpoint is located remotely from the first location.
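The method, apparatus, and storage-media forms above all recite the same detect / identify / obtain / remove / provide pipeline. A schematic sketch of that flow (the class, function names, and loudest-ultrasonic-signal selection policy are all hypothetical illustrations, not the patent's implementation) might look like:

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    ultrasonic_level: float        # receive level used to rank candidates
    loudspeaker_reference: list    # samples the endpoint is playing out

def select_target_noise_source(candidates):
    # Hypothetical policy: the co-located endpoint heard loudest over the
    # ultrasonic out-of-band channel is treated as the dominant noise source.
    return max(candidates, key=lambda e: e.ultrasonic_level)

def process(mic_signal, candidates):
    # Identify the target noise source among detected co-located endpoints.
    target = select_target_noise_source(candidates)
    # Obtain its loudspeaker reference signal.
    ref = target.loudspeaker_reference
    # Placeholder removal step: a real endpoint would run an adaptive
    # filter; here the reference is subtracted sample by sample.
    adjusted = [m - r for m, r in zip(mic_signal, ref)]
    # The adjusted signal would then go to a VUI or a remote endpoint.
    return target.name, adjusted
```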
Although the techniques are illustrated and described herein as embodied in one or more specific examples, they are nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made within the scope and range of the embodiments presented herein. In addition, features from any one of the embodiments discussed herein may be incorporated into any other embodiment. Accordingly, the appended claims should be construed broadly and in a manner consistent with the scope of the disclosure.

Claims (26)

What is claimed is:
1. A method comprising:
detecting, by a first audio endpoint, a second audio endpoint co-located with the first audio endpoint at a first location;
obtaining, from the second audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the second audio endpoint;
removing the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint to generate an adjusted microphone signal; and
providing the adjusted microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a third audio endpoint, wherein the third audio endpoint is located remotely from the first location.
2. The method of claim 1, wherein detecting the second audio endpoint comprises obtaining an ultrasonic signal from the second audio endpoint.
3. The method of claim 1, further comprising:
removing an echo from the microphone signal.
4. The method of claim 1, wherein the second audio endpoint is detected as a target noise source based at least in part upon obtaining at least one of a decibel level, an ultrasonic receive level, an indication of a distance and/or a direction, or metadata from the second audio endpoint.
5. The method of claim 4, further comprising:
using a microphone array to determine a distance to and/or a direction of the second audio endpoint from the first audio endpoint in the first location; and
using the obtained distance and/or direction to identify the second audio endpoint as the target noise source.
6. The method of claim 1, further comprising:
obtaining, from the second audio endpoint, a microphone reference signal associated with at least one microphone of the second audio endpoint, wherein the microphone reference signal corresponds to an audio stream that is input to the at least one microphone, and the adjusted microphone signal is also based at least in part upon the microphone reference signal.
7. The method of claim 6, further comprising:
providing the audio stream as a microphone reference signal to an adaptive filter at the first audio endpoint to remove the audio stream from the microphone signal of the first audio endpoint.
8. An apparatus comprising:
a microphone;
a loudspeaker;
a processor in communication with the microphone and the loudspeaker, the processor configured to:
detect an audio endpoint co-located with the apparatus at a first location;
obtain, from the audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the audio endpoint;
remove the loudspeaker reference signal from a microphone signal associated with the microphone to generate an adjusted microphone signal; and
provide the adjusted microphone signal to at least one of a voice user interface (VUI) or a remote audio endpoint, wherein the remote audio endpoint is located remotely from the first location.
9. The apparatus of claim 8, wherein the audio endpoint is detected based at least in part upon obtaining an ultrasonic signal from the audio endpoint.
10. The apparatus of claim 8, wherein the processor is further configured to:
remove an echo from the microphone signal.
11. The apparatus of claim 8, wherein the audio endpoint is detected as a target noise source based at least in part upon obtaining at least one of a decibel level, an ultrasonic receive level, an indication of a distance and/or a direction, or metadata from the audio endpoint.
12. The apparatus of claim 11, wherein the processor is further configured to:
use a microphone array to determine a distance to and/or a direction of the audio endpoint from the apparatus in the first location; and
use the obtained distance and/or direction to identify the audio endpoint as the target noise source.
13. The apparatus of claim 8, wherein the processor is further configured to obtain a microphone reference signal associated with at least one microphone of the audio endpoint, wherein the microphone reference signal corresponds to an audio stream that is input to the at least one microphone, and the adjusted microphone signal is also based at least in part upon the microphone reference signal.
14. The apparatus of claim 13, wherein the processor is further configured to:
provide the audio stream as a microphone reference signal to an adaptive filter to remove the audio stream from the microphone signal.
15. One or more non-transitory computer readable storage media encoded with instructions that, when executed by a processor of a first audio endpoint, cause the processor to:
detect a second audio endpoint co-located with the first audio endpoint at a first location;
obtain, from the second audio endpoint, a loudspeaker reference signal associated with a loudspeaker of the second audio endpoint;
remove the loudspeaker reference signal from a microphone signal associated with a microphone of the first audio endpoint to generate an adjusted microphone signal; and
provide the adjusted microphone signal from the first audio endpoint to at least one of a voice user interface (VUI) or a third audio endpoint, wherein the third audio endpoint is located remotely from the first location.
16. The one or more non-transitory computer readable storage media of claim 15, wherein the second audio endpoint is detected based at least in part upon obtaining an ultrasonic signal from the second audio endpoint.
17. The one or more non-transitory computer readable storage media of claim 15, wherein the instructions further cause the processor to:
remove an echo from the microphone signal.
18. The one or more non-transitory computer readable storage media of claim 15, wherein the second audio endpoint is detected as a target noise source based at least in part upon obtaining at least one of a decibel level, an ultrasonic receive level, an indication of a distance and/or a direction, or metadata from the second audio endpoint.
19. The one or more non-transitory computer readable storage media of claim 18, wherein the instructions further cause the processor to:
use a microphone array to determine a distance to and/or a direction of the second audio endpoint from the first audio endpoint in the first location; and
use the obtained distance and/or direction to identify the second audio endpoint as the target noise source.
20. The one or more non-transitory computer readable storage media of claim 15, wherein the instructions further cause the processor to:
obtain a microphone reference signal associated with at least one microphone of the second audio endpoint; and
provide the audio stream as a microphone reference signal to an adaptive filter to remove the audio stream from the microphone signal, wherein the microphone reference signal corresponds to an audio stream that is input to the at least one microphone, and the adjusted microphone signal is also based at least in part upon the microphone reference signal.
21. The method of claim 1, wherein the first audio endpoint and the second audio endpoint are connected to a common local area network (LAN).
22. The method of claim 1, wherein clock synchronization is established between the first audio endpoint and the second audio endpoint.
23. The apparatus of claim 8, wherein the apparatus and the audio endpoint are connected to a common local area network (LAN).
24. The apparatus of claim 8, wherein clock synchronization is established between the apparatus and the audio endpoint.
25. The one or more non-transitory computer readable storage media of claim 15, wherein the first audio endpoint and the second audio endpoint are connected to a common local area network (LAN).
26. The one or more non-transitory computer readable storage media of claim 15, wherein clock synchronization is established between the first audio endpoint and the second audio endpoint.

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/325,875 USRE49462E1 (en) 2018-06-15 2021-05-20 Adaptive noise cancellation for multiple audio endpoints in a shared space

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/009,652 US10297266B1 (en) 2018-06-15 2018-06-15 Adaptive noise cancellation for multiple audio endpoints in a shared space
US17/325,875 USRE49462E1 (en) 2018-06-15 2021-05-20 Adaptive noise cancellation for multiple audio endpoints in a shared space

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/009,652 Reissue US10297266B1 (en) 2018-06-15 2018-06-15 Adaptive noise cancellation for multiple audio endpoints in a shared space

Publications (1)

Publication Number Publication Date
USRE49462E1 true USRE49462E1 (en) 2023-03-14

Family

ID=66540981

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/009,652 Ceased US10297266B1 (en) 2018-06-15 2018-06-15 Adaptive noise cancellation for multiple audio endpoints in a shared space
US17/325,875 Active USRE49462E1 (en) 2018-06-15 2021-05-20 Adaptive noise cancellation for multiple audio endpoints in a shared space

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US16/009,652 Ceased US10297266B1 (en) 2018-06-15 2018-06-15 Adaptive noise cancellation for multiple audio endpoints in a shared space

Country Status (1)

Country Link
US (2) US10297266B1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3709194A1 (en) 2019-03-15 2020-09-16 Spotify AB Ensemble-based data comparison
EP3970355B1 (en) * 2019-05-12 2023-09-27 Plantronics, Inc. Ptp-based audio clock synchronization and alignment for acoustic echo cancellation in a conferencing system with ip-connected cameras, microphones and speakers
US11094319B2 (en) * 2019-08-30 2021-08-17 Spotify Ab Systems and methods for generating a cleaned version of ambient sound
US11328722B2 (en) 2020-02-11 2022-05-10 Spotify Ab Systems and methods for generating a singular voice audio stream
US11308959B2 (en) 2020-02-11 2022-04-19 Spotify Ab Dynamic adjustment of wake word acceptance tolerance thresholds in voice-controlled devices

Citations (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050207567A1 (en) * 2000-09-12 2005-09-22 Forgent Networks, Inc. Communications system and method utilizing centralized signal processing
US20070237336A1 (en) * 2006-04-11 2007-10-11 Diethorn Eric J Speech canceler-enhancer system for use in call-center applications
US20080232569A1 (en) * 2007-03-19 2008-09-25 Avaya Technology Llc Teleconferencing System with Multi-channel Imaging
US20090323925A1 (en) * 2008-06-26 2009-12-31 Embarq Holdings Company, Llc System and Method for Telephone Based Noise Cancellation
US7876890B2 (en) 2006-06-15 2011-01-25 Avaya Inc. Method for coordinating co-resident teleconferencing endpoints to avoid feedback
US8488745B2 (en) * 2009-06-17 2013-07-16 Microsoft Corporation Endpoint echo detection
US20130230152A1 (en) * 2008-12-02 2013-09-05 Cisco Technology, Inc. Echo mitigation in a conference call
US20150117626A1 (en) * 2013-10-31 2015-04-30 Citrix Systems, Inc. Using audio signals to identify when client devices are co-located
US9025762B2 (en) 2012-10-23 2015-05-05 Cisco Technology, Inc. System and method for clock synchronization of acoustic echo canceller (AEC) with different sampling clocks for speakers and microphones
US9241016B2 (en) 2013-03-05 2016-01-19 Cisco Technology, Inc. System and associated methodology for detecting same-room presence using ultrasound as an out-of-band channel
US9275625B2 (en) 2013-03-06 2016-03-01 Qualcomm Incorporated Content based noise suppression
US9319633B1 (en) 2015-03-19 2016-04-19 Cisco Technology, Inc. Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint
US9473580B2 (en) 2012-12-06 2016-10-18 Cisco Technology, Inc. System and associated methodology for proximity detection and device association using ultrasound
US9591148B2 (en) 2015-04-07 2017-03-07 Cisco Technology, Inc. Detecting proximity of devices based on transmission of inaudible sound signatures in the speech band
US9799330B2 (en) 2014-08-28 2017-10-24 Knowles Electronics, Llc Multi-sourced noise suppression
US20170346950A1 (en) * 2016-05-31 2017-11-30 Vonage Business Inc. Systems and methods for mitigating and/or avoiding feedback loops during communication sessions
US9877114B2 (en) * 2015-04-13 2018-01-23 DSCG Solutions, Inc. Audio detection system and methods
US9913026B2 (en) 2014-08-13 2018-03-06 Microsoft Technology Licensing, Llc Reversed echo canceller
US20180077205A1 (en) 2016-09-15 2018-03-15 Cisco Technology, Inc. Potential echo detection and warning for online meeting
US10003377B1 (en) 2016-12-19 2018-06-19 Cisco Technology, Inc. Spread spectrum acoustic communication techniques
US10089067B1 (en) * 2017-05-22 2018-10-02 International Business Machines Corporation Context based identification of non-relevant verbal communications
US10110994B1 (en) * 2017-11-21 2018-10-23 Nokia Technologies Oy Method and apparatus for providing voice communication with spatial audio
US10141973B1 (en) 2017-06-23 2018-11-27 Cisco Technology, Inc. Endpoint proximity pairing using acoustic spread spectrum token exchange and ranging information
US10473751B2 (en) 2017-04-25 2019-11-12 Cisco Technology, Inc. Audio based motion detection
US10491995B1 (en) 2018-10-11 2019-11-26 Cisco Technology, Inc. Directional audio pickup in collaboration endpoints
US10825460B1 (en) 2019-07-03 2020-11-03 Cisco Technology, Inc. Audio fingerprinting for meeting services
US20220024484A1 (en) * 2020-07-21 2022-01-27 Waymo Llc Identifying The Position Of A Horn Honk Or Other Acoustical Information Using Multiple Autonomous Vehicles
US20220318860A1 (en) * 2021-02-24 2022-10-06 Conversenowai Edge Appliance to Provide Conversational Artificial Intelligence Based Software Agents

Patent Citations (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050207567A1 (en) * 2000-09-12 2005-09-22 Forgent Networks, Inc. Communications system and method utilizing centralized signal processing
US20070237336A1 (en) * 2006-04-11 2007-10-11 Diethorn Eric J Speech canceler-enhancer system for use in call-center applications
US7876890B2 (en) 2006-06-15 2011-01-25 Avaya Inc. Method for coordinating co-resident teleconferencing endpoints to avoid feedback
US20080232569A1 (en) * 2007-03-19 2008-09-25 Avaya Technology Llc Teleconferencing System with Multi-channel Imaging
US20090323925A1 (en) * 2008-06-26 2009-12-31 Embarq Holdings Company, Llc System and Method for Telephone Based Noise Cancellation
US20130230152A1 (en) * 2008-12-02 2013-09-05 Cisco Technology, Inc. Echo mitigation in a conference call
US8488745B2 (en) * 2009-06-17 2013-07-16 Microsoft Corporation Endpoint echo detection
US9025762B2 (en) 2012-10-23 2015-05-05 Cisco Technology, Inc. System and method for clock synchronization of acoustic echo canceller (AEC) with different sampling clocks for speakers and microphones
US9473580B2 (en) 2012-12-06 2016-10-18 Cisco Technology, Inc. System and associated methodology for proximity detection and device association using ultrasound
US10177859B2 (en) 2012-12-06 2019-01-08 Cisco Technology, Inc. System and associated methodology for proximity detection and device association using ultrasound
US10491311B2 (en) 2013-03-05 2019-11-26 Cisco Technology, Inc. System and associated methodology for detecting same-room presence using ultrasound as an out-of-band channel
US20200059305A1 (en) 2013-03-05 2020-02-20 Cisco Technology, Inc. System and associated methodology for detecting same-room presence using ultrasound as an out-of-band channel
US10277332B2 (en) 2013-03-05 2019-04-30 Cisco Technology, Inc. System and associated methodology for detecting same room presence using ultrasound as an out-of-band channel
US9241016B2 (en) 2013-03-05 2016-01-19 Cisco Technology, Inc. System and associated methodology for detecting same-room presence using ultrasound as an out-of-band channel
US9275625B2 (en) 2013-03-06 2016-03-01 Qualcomm Incorporated Content based noise suppression
US20150117626A1 (en) * 2013-10-31 2015-04-30 Citrix Systems, Inc. Using audio signals to identify when client devices are co-located
US9913026B2 (en) 2014-08-13 2018-03-06 Microsoft Technology Licensing, Llc Reversed echo canceller
US9799330B2 (en) 2014-08-28 2017-10-24 Knowles Electronics, Llc Multi-sourced noise suppression
US9319633B1 (en) 2015-03-19 2016-04-19 Cisco Technology, Inc. Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint
US9591148B2 (en) 2015-04-07 2017-03-07 Cisco Technology, Inc. Detecting proximity of devices based on transmission of inaudible sound signatures in the speech band
US9877114B2 (en) * 2015-04-13 2018-01-23 DSCG Solutions, Inc. Audio detection system and methods
US20170346950A1 (en) * 2016-05-31 2017-11-30 Vonage Business Inc. Systems and methods for mitigating and/or avoiding feedback loops during communication sessions
US20180077205A1 (en) 2016-09-15 2018-03-15 Cisco Technology, Inc. Potential echo detection and warning for online meeting
US20200195297A1 (en) 2016-12-19 2020-06-18 Cisco Technology, Inc. Spread spectrum acoustic communication techniques
US10003377B1 (en) 2016-12-19 2018-06-19 Cisco Technology, Inc. Spread spectrum acoustic communication techniques
US10530417B2 (en) 2016-12-19 2020-01-07 Cisco Technology, Inc. Spread spectrum acoustic communication techniques
US20200116820A1 (en) 2017-04-25 2020-04-16 Cisco Technology, Inc. Audio based motion detection
US10473751B2 (en) 2017-04-25 2019-11-12 Cisco Technology, Inc. Audio based motion detection
US10089067B1 (en) * 2017-05-22 2018-10-02 International Business Machines Corporation Context based identification of non-relevant verbal communications
US10141973B1 (en) 2017-06-23 2018-11-27 Cisco Technology, Inc. Endpoint proximity pairing using acoustic spread spectrum token exchange and ranging information
US10110994B1 (en) * 2017-11-21 2018-10-23 Nokia Technologies Oy Method and apparatus for providing voice communication with spatial audio
US10491995B1 (en) 2018-10-11 2019-11-26 Cisco Technology, Inc. Directional audio pickup in collaboration endpoints
US10687139B2 (en) 2018-10-11 2020-06-16 Cisco Technology, Inc. Directional audio pickup in collaboration endpoints
US10825460B1 (en) 2019-07-03 2020-11-03 Cisco Technology, Inc. Audio fingerprinting for meeting services
US20220024484A1 (en) * 2020-07-21 2022-01-27 Waymo Llc Identifying The Position Of A Horn Honk Or Other Acoustical Information Using Multiple Autonomous Vehicles
US20220318860A1 (en) * 2021-02-24 2022-10-06 Conversenowai Edge Appliance to Provide Conversational Artificial Intelligence Based Software Agents

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Bao, et al., "Using an Ultrasound Signal for Clock and Timing Synchronization for Acoustic Echo Cancellation with Multiple Teleconference Devices in the Same Room", ip.com, IPCOM000247327D, Cisco Systems, Inc., Aug. 24, 2016, 16 pgs.
Sakanashi, et al., "Speech enhancement with ad-hoc microphone array using single source activity", IEEE, Signal and Information Processing Association Annual Summit and Conference (APSIPA), Oct. 2013, 6 pgs.
Tavakoli, et al., "A Framework for Speech Enhancement With Ad Hoc Microphone Arrays", IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 24, No. 6, Jun. 2016, 14 pgs.
Thubert, et al., "Virtual Noise Wall for Workspace Collaboration", ip.com, IPCOM000248272D, Cisco Systems, Inc., Nov. 14, 2016, 7 pgs.
Widrow, et al., "Adaptive Noise Cancelling: Principles and Applications", Proceedings of the IEEE, vol. 63, No. 12, Dec. 1975, 27 pgs.

Also Published As

Publication number Publication date
US10297266B1 (en) 2019-05-21

Similar Documents

Publication Publication Date Title
USRE49462E1 (en) Adaptive noise cancellation for multiple audio endpoints in a shared space
US8606249B1 (en) Methods and systems for enhancing audio quality during teleconferencing
CN105513596B (en) Voice control method and control equipment
US10574804B2 (en) Automatic volume control of a voice signal provided to a captioning communication service
KR101255404B1 (en) Configuration of echo cancellation
US7206404B2 (en) Communications system and method utilizing centralized signal processing
US20090046866A1 (en) Apparatus capable of performing acoustic echo cancellation and a method thereof
US9253331B2 (en) Call handling
EP3257236A1 (en) Nearby talker obscuring, duplicate dialogue amelioration and automatic muting of acoustically proximate participants
EP1973321A1 (en) Teleconferencing system with multiple channels at each location
EP2342884B1 (en) Method of controlling a system and signal processing system
US7983406B2 (en) Adaptive, multi-channel teleconferencing system
CN103312906A (en) Method and device for realizing teleconference
EP3920516B1 (en) Voice call method and apparatus, electronic device, and computer-readable storage medium
US9491306B2 (en) Signal processing control in an audio device
US8914007B2 (en) Method and apparatus for voice conferencing
CN104580764A (en) Ultrasound pairing signal control in teleconferencing system
US7171004B2 (en) Room acoustics echo meter for voice terminals
JP2008211526A (en) Voice input/output device and voice input/output method
KR20100030579A (en) Notification of dropped audio in a teleconference call
US20120140918A1 (en) System and method for echo reduction in audio and video telecommunications over a network
JPH09233198A (en) Method and device for software basis bridge for full duplex voice conference telephone system
TWI790718B (en) Conference terminal and echo cancellation method for conference
JP5745475B2 (en) Echo cancellation method, system and devices
Härmä Ambient telephony: Scenarios and research challenges

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: CISCO TECHNOLOGY, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BURENIUS, LENNART;BIRKENES, OYSTEIN;REEL/FRAME:056313/0286

Effective date: 20180605