US20160164577A1 - Utilizing mobile devices in physical proximity to create an ad-hoc microphone array - Google Patents


Info

Publication number
US20160164577A1
US20160164577A1
Authority
US
United States
Prior art keywords
stream
microphone
mobile device
logic
aggregate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US14/560,299
Other versions
US9369186B1 (en)
Inventor
Michael Gregory Rexroad
Neil Joshi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cisco Technology Inc
Original Assignee
Cisco Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cisco Technology Inc filed Critical Cisco Technology Inc
Priority to US14/560,299 priority Critical patent/US9369186B1/en
Assigned to CISCO TECHNOLOGY, INC. reassignment CISCO TECHNOLOGY, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: JOSHI, NEIL, REXROAD, MICHAEL GREGORY
Publication of US20160164577A1 publication Critical patent/US20160164577A1/en
Application granted granted Critical
Publication of US9369186B1 publication Critical patent/US9369186B1/en
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B5/00Near-field transmission systems, e.g. inductive or capacitive transmission systems
    • H04B5/06
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40Support for services or applications
    • H04L65/403Arrangements for multi-party communication, e.g. for conferences
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/75Media network packet handling
    • H04L65/765Media network packet handling intermediate
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W8/00Network data management
    • H04W8/005Discovery of network devices, e.g. terminals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W84/00Network topologies
    • H04W84/18Self-organising networks, e.g. ad-hoc networks or sensor networks

Definitions

  • the disclosure relates generally to capturing audio signals during a meeting. More particularly, the disclosure relates to utilizing microphones of mobile devices to create an ad-hoc microphone array for use during a meeting.
  • Many meetings involve audio and/or video components that are broadcast to remote participants.
  • many meetings may be audio and/or video conferences which include participants located at a physical location such as a conference room, and participants at a remote location to whom audio streams from the physical location may be broadcast or otherwise transmitted.
  • there is at least one fixed microphone at a physical location, e.g., a microphone on a speaker phone, into which participants may speak.
  • the quality of audio transmitted to remote participants in a meeting may be poor, particularly when a speaker is not positioned substantially directly in front of a microphone.
  • the quality of audio that is transmitted from meeting participants in a physical location such as a conference room to meeting participants participating virtually or remotely may generally be affected by many factors. Background noises such as microphone scuffing, breathing, background conversation, and room echo may adversely affect the quality of audio transmitted to remote participants in a meeting. Audio transmitted to remote participants in a meeting may be unintelligible, for example, when two participants in a conference room speak at substantially the same time. In addition, the volume or loudness of audio may be affected by the position of an active speaker and/or orientation relative to a microphone and, thus, the quality of audio transmitted to remote participants may be compromised.
  • FIG. 1 is a process flow diagram which illustrates a method of creating and utilizing an ad-hoc microphone array that includes microphones of a plurality of mobile devices in accordance with an embodiment.
  • FIG. 2 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices in accordance with an embodiment.
  • FIG. 3 is a diagrammatic representation of a managing mobile device in accordance with an embodiment.
  • FIG. 4 is a process flow diagram which illustrates a method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment.
  • FIG. 5 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment.
  • FIG. 6 is a diagrammatic representation of an ad-hoc microphone array that includes at least one mobile device and at least one microphone not included in a mobile device, e.g., a microphone arrangement of a speaker phone or a microphone arrangement of a television, in accordance with an embodiment.
  • FIGS. 7A and 7B are a process flow diagram which illustrates one method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices which may be added and removed from the ad-hoc microphone array in accordance with an embodiment.
  • FIG. 8 is a diagrammatic representation of a managing server in accordance with an embodiment.
  • a method, in one embodiment, includes determining when a first device and a second device are in proximity to each other, wherein the first device includes a first microphone and the second device includes a second microphone. The method also includes pairing the first device and the second device, and creating a model mapping of the physical relationship between the first device and the second device. Pairing the first device and the second device forms a microphone array that includes the first microphone and the second microphone. An aggregate stream is created using a first stream obtained from the first microphone and a second stream obtained from the second microphone. Creating the aggregate stream includes using the model mapping to determine when to use the first stream and when to use the second stream. Finally, the method includes transmitting the aggregate stream.
  • a meeting such as an audio conference, a video conference, or a multimedia conference generally involves providing or otherwise transmitting audio to remote participants.
  • a microphone used to capture audio during a meeting is typically at a fixed position in a physical location, unless an active speaker is positioned substantially directly in front of the microphone, the quality of the audio that is captured and transmitted may be compromised.
  • Parties who attend a meeting in person are often in possession of mobile devices, particularly mobile devices that include microphones.
  • a party who is physically present in a meeting room for a meeting may have his or her mobile phone, tablet, and/or laptop computer in his or her possession.
  • the ability to create an ad-hoc microphone array from the microphones such that the ad-hoc microphone array may be used to capture audio during the meeting may improve the quality of audio transmitted to remote participants in the meeting.
  • the microphones of the mobile devices may be identified for inclusion, e.g., pairing, in an ad-hoc microphone array. Mapping the precise location of each of the mobile devices and, hence, microphones included in the ad-hoc microphone array, as well as the orientation of each of the mobile devices allows a selection to be made as to which microphone provides the most desirable audio stream based on a current speaker.
  • the ability to relatively precisely identify locations and orientations of microphones included in an ad-hoc microphone array at a physical location associated with a meeting may enhance digital signal processing of audio streams obtained from the microphones and, hence, improve the quality of audio provided to remote participants in the meeting.
  • An ad-hoc microphone array may generally be an array of microphones effectively created from microphones included in various devices such as mobile devices.
  • an ad-hoc microphone array may include microphones of different cellular phones that are all located at a particular location.
  • Mobile devices may generally include, but are not limited to including, cellular or mobile phones, laptops, tablets, and headsets.
  • a mobile device may be substantially any portable device that includes a microphone and may be used to participate in a meeting, e.g., a telepresence meeting or a conference call.
  • a method 101 of creating and utilizing an ad-hoc microphone array begins at step 105 in which mobile devices that are in physical proximity to one another are identified or detected.
  • the mobile devices may be identified or detected when a meeting, as for example a meeting in which remote participants may participate, is initiated or is underway.
  • mobile devices at a particular physical location, e.g., in a conference room, may be identified as being in physical proximity to one another.
  • Any suitable method may generally be used to identify mobile devices that are in physical proximity to one another, as for example at a geographical location associated with a meeting. Suitable methods may include, but are not limited to including, utilizing Bluetooth 4.0 LE to determine physical proximity between devices, utilizing iBeacon to determine the presence of devices, and the like. Further, threshold distances used to assess whether devices are in physical proximity to one another may vary.
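As an illustration of one such method, a Bluetooth LE receiver can turn an RSSI reading into a rough distance using the log-distance path-loss model and compare it against a threshold. The calibration constants and the 5 m threshold below are assumptions for the sketch, not values from the patent:

```python
# Log-distance path-loss model: estimate distance (meters) from a BLE RSSI
# reading. tx_power_dbm is the expected RSSI at 1 m and n is the path-loss
# exponent; both are assumptions that real deployments calibrate per room.
def estimate_distance(rssi_dbm, tx_power_dbm=-59, n=2.0):
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * n))

# A device counts as "in physical proximity" when its estimated distance
# falls under a configurable threshold (5 m here, an illustrative value).
def in_proximity(rssi_dbm, threshold_m=5.0):
    return estimate_distance(rssi_dbm) <= threshold_m

def nearby_devices(rssi_readings, threshold_m=5.0):
    """rssi_readings: dict of device id -> last observed RSSI in dBm."""
    return [dev for dev, rssi in rssi_readings.items()
            if in_proximity(rssi, threshold_m)]
```

In practice the threshold would be tuned so that devices in the same conference room qualify while devices in adjacent rooms do not.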
  • the mobile devices are paired in step 109. Pairing the mobile devices effectively creates an ad-hoc microphone array from the microphones of the mobile devices.
  • mobile devices that are in physical proximity to one another may be automatically paired.
  • Methods or techniques used to pair mobile devices may include, but are not limited to including, Bluetooth techniques, WiFi techniques, and/or ultrasonic techniques. Other methods used to pair mobile devices may include methods which utilize meeting invitations to pair or otherwise associate mobile devices.
  • a model mapping of a physical relationship between the paired mobile devices is created in step 113.
  • Relatively precise physical positions of the paired mobile devices and/or the orientations of the paired mobile devices may be used to create a mapping of the paired mobile devices. For example, when iBeacons or substantially equivalent transmitters are in the vicinity of the paired mobile devices, the paired mobile devices may determine their physical positions relative to the iBeacons or transmitters.
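As a rough illustration of how a paired device might derive its position from known transmitter locations, the following performs 2-D trilateration from three beacon ranges. Real iBeacon ranging is noisy and would typically use least-squares fitting over many readings; the closed-form sketch below assumes exact distances, and all names are hypothetical:

```python
def trilaterate(beacons):
    """beacons: list of three ((x, y), distance) pairs, one per transmitter.
    Returns the (x, y) position consistent with all three ranges."""
    ((x1, y1), r1), ((x2, y2), r2), ((x3, y3), r3) = beacons
    # Subtracting the three circle equations pairwise eliminates the
    # quadratic terms, leaving two linear equations in x and y.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = r2**2 - r3**2 + x3**2 - x2**2 + y3**2 - y2**2
    det = a1 * b2 - a2 * b1  # nonzero when beacons are not collinear
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
```

The positions obtained this way, together with orientation data from each device's sensors, would feed the model mapping.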
  • a first mobile device, e.g., one of the paired mobile devices, is then identified to process streams, e.g., audio streams, and to create aggregate streams such as aggregate enhanced streams created from the streams.
  • the first mobile device that is identified may create an aggregate enhanced stream from the streams obtained from the mobile devices.
  • the first mobile device is also typically arranged to transmit or otherwise provide the aggregate enhanced stream to remote participants in a meeting, or virtual participants in a meeting.
  • the first mobile device may effectively function as a master device such as a WiFi-direct group owner that handles the aggregation of and the processing of audio streams. It should be understood that while a single mobile device may be identified for use in processing streams and creating an aggregate enhanced stream, more than one mobile device may be used in processing streams and creating an aggregate enhanced stream.
  • the first mobile device switches between streams, e.g., inbound streams, based on the location of an active speaker, and creates an aggregate stream using the streams in step 121.
  • the first mobile device may switch to a stream that is provided by a microphone closest to the active speaker such that the stream is chosen in part using the model mapping.
  • the first mobile device may obtain a stream from substantially the best microphone signal available for a current speaker.
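The switching described above can be sketched as a nearest-microphone lookup against the model mapping. The patent does not specify a particular selection algorithm, so the function and parameter names below are hypothetical:

```python
import math

def pick_stream(model_mapping, speaker_pos):
    """model_mapping: dict of device id -> (x, y) microphone position
    taken from the model mapping. Returns the id of the device whose
    microphone is closest to the active speaker, i.e. the stream the
    managing device should switch to."""
    return min(model_mapping,
               key=lambda dev: math.dist(model_mapping[dev], speaker_pos))
```

A fuller implementation would also weigh microphone orientation and signal quality, and would apply hysteresis so brief movements of the speaker do not cause rapid stream switching.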
  • the first mobile device may also perform digital signal processing on the streams to create an aggregate enhanced stream.
  • Audio streams from microphones of an ad-hoc microphone array that are not capturing audio from an active speaker may be used in digital signal processing to substantially filter out background sounds or other disturbances from an aggregate stream to create an aggregate enhanced stream.
  • proximity and location of microphones capturing audio that is not associated with an active speaker may be used in digital signal processing to facilitate the identification of sounds that are relevant to the active speaker.
  • Digital signal processing may further be enhanced when microphones in an ad-hoc microphone array use beam forming to determine a precise source location for various sounds and, thus, may enable filtering and/or muting of sounds or disturbances that are not associated with the active speaker.
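As one illustration of beam forming, a minimal delay-and-sum beamformer time-aligns each microphone's stream according to its distance from an assumed source position and averages the aligned samples, reinforcing sound from that position. This is a sketch only: delays are rounded to whole samples, and the names and constants are assumptions rather than details from the patent:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air

def delay_and_sum(streams, mic_positions, source_pos, sample_rate):
    """Align each microphone's stream by its propagation delay from the
    assumed source position, then average -- the simplest beamformer.
    streams: list of sample lists, one per microphone."""
    dists = [math.dist(p, source_pos) for p in mic_positions]
    ref = min(dists)
    # Samples to advance each stream so all arrivals line up.
    shifts = [round((d - ref) / SPEED_OF_SOUND * sample_rate)
              for d in dists]
    n = min(len(s) - k for s, k in zip(streams, shifts))
    return [sum(s[k + i] for s, k in zip(streams, shifts)) / len(streams)
            for i in range(n)]
```

Steering the same computation toward candidate positions and comparing output energy is one way an array can localize a sound source before filtering or muting it.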
  • digital signal processing may allow background noise to be removed from an outbound stream and loudness to be substantially normalized in the outbound stream.
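The loudness normalization mentioned above can be sketched as a simple RMS gain correction applied to each outbound block of samples; the target level here is an illustrative assumption, not a value from the patent:

```python
import math

TARGET_RMS = 0.1  # illustrative target level for the outbound stream

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_loudness(samples, target_rms=TARGET_RMS):
    """Scale a block so its RMS level matches the target, keeping the
    perceived volume steady as the active speaker or microphone changes."""
    level = rms(samples)
    if level == 0:
        return list(samples)  # silence: nothing to scale
    gain = target_rms / level
    return [s * gain for s in samples]
```

A production implementation would smooth the gain across blocks and limit peaks to avoid clipping, but the core idea is the same.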
  • digital signal processing may also allow metadata to be provided with an outbound stream such that a recipient of the outbound stream may enhance audio source separation.
  • the first mobile device transmits the aggregate enhanced stream, or the outbound stream. That is, the first mobile device transmits an outbound stream to remote, or virtual, participants in the meeting.
  • the method of creating and utilizing an ad-hoc microphone array is completed.
  • FIG. 2 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices in accordance with an embodiment.
  • An ad-hoc microphone array 200 may include multiple mobile devices 204a-c.
  • Each mobile device 204a-c includes a microphone 208a-c or, more generally, a sensor arrangement that is configured to capture sound.
  • Ad-hoc microphone array 200 may be created or otherwise formed when parties in possession of mobile devices 204a-c are at substantially the same physical location and are participating in a meeting, e.g., a telepresence meeting.
  • Mobile device 204a may be a managing mobile device arranged to effectively obtain sound captured by microphones 208a-c and to create an outbound stream 206, or an aggregate enhanced stream, for transmission to remote or virtual participants in the meeting. In one embodiment, mobile device 204a may also perform digital signal processing on the sound captured by microphones 208a-c to create outbound stream 206.
  • a managing mobile device 304 includes an input/output (I/O) interface 312, a processor 316, and a logic module 320.
  • I/O interface 312 generally allows mobile device 304 to communicate on a network, e.g., a wireless or cellular communications network, and includes a microphone 308. It should be appreciated that managing mobile device 304 may include more than one microphone 308.
  • Logic module 320 generally includes hardware and/or software logic. Processor 316 is configured to execute software logic included in logic module 320. In the described embodiment, logic module 320 includes proximity detection logic 324, pairing logic 328, model mapping logic 332, managing logic 336, and transmission logic 344.
  • Proximity detection logic 324 is configured to effectively detect or otherwise identify when there is at least one mobile device in proximity to mobile device 304 .
  • Proximity detection logic 324 may include, or may have access to, transmitters which may provide notifications which effectively identify mobile devices that are in proximity to the transmitters.
  • Pairing logic 328 is configured to pair mobile devices that are in proximity to each other. For example, pairing logic 328 may pair mobile device 304 to other mobile devices within its proximity to essentially create an ad-hoc microphone array.
  • Model mapping logic 332 is configured to map a physical relationship between paired mobile devices. Model mapping logic 332 may obtain information from sensing devices, e.g., transmitters, which identify mobile devices in proximity to the sensing devices. Model mapping logic 332 may also obtain information from mobile devices which derive their relative locations based on the information from sensing devices.
  • Managing logic 336 is configured to process streams, e.g., audio, obtained from microphone 308 and from other mobile devices such that an aggregate stream may be generated.
  • Managing logic 336 may include digital signal processing logic 340 that is arranged to process obtained streams to enhance the aggregate stream, or to create an enhanced aggregate stream. That is, managing logic 336 is generally arranged to handle the aggregation and processing of audio streams. Processing audio streams may include, but is not limited to including, substantially optimizing an aggregate stream based upon capabilities of a device intended to receive the aggregate stream.
  • Digital signal processing logic 340 may use location information, e.g., position and orientation information, relating to mobile device 304 and to other mobile devices paired to mobile device 304 when performing digital signal processing to create an enhanced aggregate stream that is arranged to be transmitted, as for example to remote participants in a meeting.
  • Transmission logic 344 is arranged to transmit an aggregate stream or an outbound stream created by managing logic 336 .
  • Transmission logic 344 may cause an aggregate stream such as an enhanced aggregate stream to be transmitted across a network to, or otherwise provided to, remote or virtual participants in a meeting.
  • the aggregate stream transmitted using transmission logic 344 may include metadata that may be used by a recipient of the aggregate stream to enhance source separation.
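As an illustration, such metadata might carry per-microphone positions alongside the outbound stream so that a recipient can improve source separation; the field names below are hypothetical, not taken from the patent:

```python
import json

def build_stream_metadata(model_mapping, active_device):
    """Package per-microphone position hints from the model mapping so a
    recipient of the aggregate stream can enhance audio source separation.
    model_mapping: dict of device id -> (x, y) position."""
    return json.dumps({
        "active_device": active_device,
        "microphones": [
            {"device": dev, "position": list(pos)}
            for dev, pos in sorted(model_mapping.items())
        ],
    })
```

The metadata would be refreshed whenever the model mapping changes, e.g., when a device moves or the active speaker switches.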
  • a central device or a managing server may instead provide management functionality for an ad-hoc microphone array.
  • a conference server that is located in a conference room may support an ad-hoc microphone array that includes mobile devices located in the conference room.
  • a server that is located outside a conference room, but is effectively in communication with the conference room may support an ad-hoc microphone array.
  • FIG. 4 is a process flow diagram which illustrates a method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment.
  • a method 401 of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices managed by a server begins at step 405 in which mobile devices in physical proximity to one another, and to a server, are identified.
  • the identification of the mobile devices may be made by the server.
  • the mobile devices may be identified when the server is used to initiate and to maintain a meeting.
  • the mobile devices are paired with the server in step 409.
  • a model mapping of a physical relationship between the mobile devices is then created in step 413.
  • the server may switch between streams provided by the mobile devices to create an aggregate enhanced stream in step 421 based on the identity of an active speaker. For example, the stream provided by a mobile device that is nearest to the active speaker may be a significant component of the aggregate enhanced stream. It should be appreciated that the server may also perform digital signal processing when creating the aggregate enhanced stream.
  • the server may transmit an aggregate enhanced stream, or an outbound stream, in step 425. Such a stream may be transmitted to remote participants in the meeting.
  • the method of creating and utilizing an ad-hoc microphone array is completed when the server transmits an aggregate enhanced stream.
  • FIG. 5 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment.
  • An ad-hoc microphone array 500 may include multiple mobile devices 504a-c. Each mobile device 504a-c includes a microphone 508a-c.
  • Ad-hoc microphone array 500 may be created or otherwise formed when parties in possession of mobile devices 504a-c are at substantially the same physical location and are participating in a meeting, e.g., a telepresence meeting, that is managed by a managing server 550.
  • Managing server 550, or a central device, is arranged to effectively obtain sound captured by microphones 508a-c and to create an outbound stream 506, or an aggregate enhanced stream, for transmission to remote or virtual participants in the meeting.
  • Managing server 550 may be located in proximity to mobile devices 504a-c, and arranged to detect when mobile devices 504a-c are within a particular range of managing server 550, or within a particular range of a sensing device (not shown) that is in communication with managing server 550.
  • Managing server 550 may also perform digital signal processing on the sound captured by microphones 508a-c to create the outbound stream.
  • managing server 550 may be considered to be part of ad-hoc microphone array 500 .
  • managing server 550 may be located at substantially the same physical location as ad-hoc microphone array 500 , although it should be understood that managing server 550 may instead be at a different physical location from ad-hoc microphone array 500 but in communication with ad-hoc microphone array 500 .
  • FIG. 6 is a diagrammatic representation of an ad-hoc microphone array that includes at least one mobile device and at least one microphone not included in a mobile device, e.g., a microphone arrangement of a non-mobile device, in accordance with an embodiment.
  • An ad-hoc microphone array 600 includes a mobile device 604 which includes a microphone 608, and a microphone arrangement 654 which includes a microphone 658.
  • Microphone arrangement 654 may be any suitable arrangement which includes microphone 658 .
  • microphone arrangement 654 may be a standalone acoustic microphone arrangement, a speaker phone, a computing device, and/or any other device which includes microphone 658 .
  • Managing server 650 may detect when mobile device 604 and microphone arrangement 654 are within proximity to each other and/or to managing server 650 , and form ad-hoc microphone array 600 .
  • Managing server 650 is arranged to obtain streams from mobile device 604 and microphone arrangement 654, and to create an outbound stream 606 that may be transmitted, e.g., to remote participants in a meeting.
  • managing server 650 may apply digital signal processing techniques to streams obtained from mobile device 604 and microphone arrangement 654.
  • microphones included in the ad-hoc microphone array may change.
  • one microphone originally included in an ad-hoc microphone array may be moved away from the physical location of the ad-hoc microphone array, and another microphone may move into the physical proximity of other microphones in the ad-hoc microphone array.
  • Referring next to FIGS. 7A and 7B, a method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices which may be added and removed from the ad-hoc microphone array will be described in accordance with an embodiment.
  • a method 701 of creating and utilizing an ad-hoc microphone array begins at step 705 in which a first mobile device and a second mobile device that are in physical proximity to one another at a particular location are identified.
  • the first mobile device and the second mobile device may generally be in the possession of participants in a meeting.
  • the first mobile device and the second mobile device are paired in step 709.
  • a model mapping of a physical relationship between the paired mobile devices is created.
  • a managing device is identified in step 717 to process and to transmit an aggregate enhanced stream to virtual participants in the meeting.
  • the managing device may be one of the paired mobile devices.
  • the managing device may be a managing server. It should be appreciated that in still another embodiment, managing functionality may be distributed between the paired mobile devices and, as such, the managing device may effectively be the set of paired mobile devices.
  • the managing device switches between streams obtained from the paired mobile devices based on an active speaker, and creates an aggregate enhanced stream in step 721.
  • the managing device may also perform digital signal processing when creating an aggregate enhanced stream for transmission to the virtual participants in the meeting. Once the aggregate enhanced stream, or the outbound stream, is created, the managing device transmits the aggregate enhanced stream in step 725.
  • A determination is made in step 729 as to whether a new mobile device is detected in proximity to the paired mobile devices. If the determination is that a new mobile device is detected in proximity to the paired mobile devices, the new mobile device is paired to the other mobile devices, e.g., the first mobile device and the second mobile device, in step 733. From step 733, process flow moves to step 713 in which a model mapping of the physical relationship between paired mobile devices is created.
  • it is determined in step 737 whether a paired mobile device is no longer detected at the physical location. That is, a determination is made in step 737 as to whether all of the paired mobile devices are still detected at the physical location. If it is determined that a paired mobile device is no longer detected at the physical location, the mobile device that is no longer detected at the physical location is unpaired from the other paired mobile devices in step 741. From step 741, process flow moves to step 713 in which a model mapping of the physical relationship between paired mobile devices is created.
  • if it is determined in step 737 that all paired mobile devices are still detected at the physical location, then process flow returns to step 721 in which the managing device continues to switch between streams based on an active speaker, and creates an aggregate enhanced stream. That is, if all paired mobile devices are still detected at the physical location, then the managing device continues to switch between streams and creates an aggregate enhanced stream.
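One pass of the membership checks of steps 729 through 741 can be sketched as a set comparison between the currently paired devices and the devices currently detected at the physical location. This is an illustrative sketch, not the patent's implementation:

```python
def update_array(paired, detected):
    """One pass of the FIG. 7 membership loop: pair newly detected
    devices, unpair devices no longer seen, and report whether the
    model mapping must be rebuilt (steps 729-741).
    Returns (new membership set, remap_needed)."""
    paired, detected = set(paired), set(detected)
    added = detected - paired      # new devices to pair (step 733)
    removed = paired - detected    # departed devices to unpair (step 741)
    return detected, bool(added or removed)
```

The managing device would call this on every detection cycle and recreate the model mapping whenever `remap_needed` is true, then resume stream switching.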
  • paired mobile devices may move relative to one another, while still being detected at a physical location. That is, the positioning and orientation of paired mobile devices at a physical location may change. It should be appreciated that when paired mobile devices move relative to one another at a physical location, a new model mapping of a physical relationship between the paired mobile devices may be created. When a change is detected in a location and/or an orientation of a paired mobile device at a physical location, a new model mapping may be created to further enhance the performance of an ad-hoc microphone array formed from paired mobile devices.
  • FIG. 8 is a diagrammatic representation of a managing server.
  • a managing server 850 includes a communications interface 848, a processor 816, and a logic module 820. It should be appreciated that managing server 850 may be physically located at substantially the same physical location as an ad-hoc microphone array that managing server 850 is managing, or managing server 850 may be located at a different location and in communication with an ad-hoc microphone array that managing server 850 is managing.
  • Communications interface 848 generally allows managing server 850 to communicate on a network, e.g., a wireless or cellular communications network. Communications interface 848 may be configured to allow managing server 850 to communicate with microphones of an ad-hoc microphone array during a meeting, and to communicate with remote participants in the meeting.
  • Logic module 820 generally includes hardware and/or software logic. Processor 816 is configured to execute software logic included in logic module 820. In the described embodiment, logic module 820 includes proximity detection logic 824, pairing logic 828, model mapping logic 832, managing logic 836, and transmission logic 844.
  • Proximity detection logic 824 is configured to effectively detect or otherwise identify when there are devices with microphones, e.g., mobile devices with microphones, in proximity to each other. In one embodiment, proximity detection logic 824 may determine when mobile devices are in proximity to each other and to managing server 850. Proximity detection logic 824 may include, or may have access to, transmitters which may provide notifications which effectively identify mobile devices that are in proximity to the transmitters.
  • Pairing logic 828 is configured to pair mobile devices that are in proximity to each other. That is, pairing logic 828 is arranged to pair mobile devices that are in proximity to each other and, in some instances, in proximity to managing server 850 such that an ad-hoc microphone array is created.
  • Model mapping logic 832 is configured to map a physical relationship between paired mobile devices. Model mapping logic 832 may obtain information from sensing devices, e.g., transmitters, which identify mobile devices in proximity to the sensing devices. Model mapping logic 832 may also obtain information from mobile devices which derive their relative locations based on the information from sensing devices. Such information may be used to map a physical relationship between paired mobile devices.
  • Managing logic 836 is configured to process streams, e.g., audio, obtained by managing server 850 from an ad-hoc microphone array that includes paired mobile devices to produce an aggregate stream.
  • Managing logic 836 may include digital signal processing logic 840 that is arranged to process streams obtained from microphones in an ad-hoc microphone array to enhance the aggregate stream, or to create an enhanced aggregate stream. That is, managing logic 836 is generally arranged to handle the aggregation and processing of audio streams. Processing audio streams may include, but is not limited to including, substantially optimizing an aggregate stream based upon capabilities of a device intended to receive the aggregate stream.
  • Digital signal processing logic 840 may use location information, e.g., position and orientation information, relating to mobile devices when performing digital signal processing to create an enhanced aggregate stream that is arranged to be transmitted, as for example to remote participants in a meeting.
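The location-aware aggregation described above can be illustrated with a toy delay-and-sum combination of microphone streams, where each per-microphone delay would be derived from the model mapping's position and orientation information. This is only a sketch under simplifying assumptions (integer sample delays, equal weighting); the names and structure are illustrative, not part of the disclosure.

```python
def delay_and_sum(streams, delays_samples):
    """Toy delay-and-sum combination of microphone streams.

    Each stream is shifted by its per-microphone delay (in whole
    samples) and the aligned streams are averaged. Real beamforming
    would use fractional delays and filtering.
    """
    # Only emit samples for which every delayed stream has data.
    length = min(len(s) - d for s, d in zip(streams, delays_samples))
    out = []
    for n in range(length):
        out.append(sum(s[n + d] for s, d in zip(streams, delays_samples))
                   / len(streams))
    return out
```

With two copies of the same signal offset by one sample, aligning and averaging reproduces the signal, which is the intuition behind reinforcing the active speaker while averaging down uncorrelated noise.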
  • Transmission logic 844 is arranged to transmit an aggregate stream or an outbound stream created by managing logic 836 .
  • Transmission logic 844 may cause an aggregate stream such as an enhanced aggregate stream to be transmitted across a network to, or otherwise provided to, remote or virtual participants in a meeting.
  • In one embodiment, the aggregate stream transmitted using transmission logic 844 may include metadata that may be used by a recipient of the aggregate stream to enhance source separation.
  • In general, any suitable apparatus or method may be used to determine when two or more mobile devices are in physical proximity to one another. Further, thresholds used to determine when a mobile device is in physical proximity to another mobile device may vary widely.
  • Mobile devices may be configured to automatically join an ad-hoc microphone array.
  • Alternatively, a mobile device may be added into an ad-hoc microphone array substantially manually, e.g., by accessing an application that is used to allow the mobile device to join the ad-hoc microphone array.
  • In other words, a mobile device with a microphone may join an ad-hoc microphone array either implicitly or explicitly.
  • By way of example, a mobile device known to be associated with a particular meeting attendee may be allowed to automatically join an ad-hoc microphone array during a meeting, while a mobile device that is not known to be associated with a particular meeting attendee may be required to undergo an authorization process before being allowed to join the ad-hoc microphone array.
  • A determination of which mobile device of an ad-hoc microphone array is to be used as a master device, or to process and to transmit an aggregate stream, may be based on a number of different factors. Factors used to identify a suitable mobile device for use as a managing device with respect to an ad-hoc microphone array may include, but are not limited to including, the capabilities of a mobile device and the resources available to the mobile device. For instance, a mobile device may be identified for use as a managing device based upon available processing, available memory, available network capabilities, available battery life, and/or power consumption considerations. In one embodiment, if multiple mobile devices are capable of serving as a managing device, the mobile devices may effectively share the role of a managing device such that power consumption burdens may be substantially shared.
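The factor-based selection of a managing device might be sketched as a weighted capability score. The field names, weights, and normalization constants below are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass

@dataclass
class DeviceCaps:
    """Illustrative capability snapshot for one candidate device."""
    name: str
    cpu_score: float      # relative processing capability
    free_memory_mb: int
    network_mbps: float
    battery_pct: float

def master_score(caps: DeviceCaps) -> float:
    # Weighted sum over the factors named in the text; a real system
    # would tune these weights and could rotate the role among
    # comparable devices to share the power consumption burden.
    return (0.4 * caps.cpu_score
            + 0.2 * (caps.free_memory_mb / 1024)
            + 0.2 * (caps.network_mbps / 100)
            + 0.2 * (caps.battery_pct / 100))

def pick_master(devices):
    """Return the candidate with the highest capability score."""
    return max(devices, key=master_score)
```

Ties or near-ties between candidates would be the case in which the role of managing device could be shared, as the paragraph above suggests.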
  • In general, digital signal processing may be used to generate an enhanced aggregate stream, or enhanced outbound stream.
  • Information such as a physical location of a microphone of a mobile device and an orientation of the microphone may generally be used to improve digital signal processing.
  • When positional information provided with respect to a microphone of a mobile device is considered to be relatively inaccurate, accurate positional and orientation information may be provided for purposes of digital signal processing using other methods.
  • For example, an ultrasonic ping may be used to provide accurate positioning and orientation information about a mobile device. Such an ultrasonic ping may also provide valuable metadata in real-time, and may reduce timing issues and out-of-band communications issues.
  • When mobile devices are within physical proximity to each other, the mobile devices may be located at certain distances from each other. For example, a first mobile device may be considered to be in physical proximity to a second mobile device if the first mobile device and the second mobile device are separated by less than a predetermined distance. Mobile devices may also be in physical proximity to each other if the mobile devices are all at a particular physical location, e.g., in a room or within a predefined set of boundaries.
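The two proximity criteria just described, a distance threshold and a predefined boundary, can be sketched as follows. The 5-meter default threshold, the 2-D coordinates, and the rectangular room boundary are illustrative assumptions.

```python
import math

def in_proximity(pos_a, pos_b, threshold_m=5.0):
    """Devices are 'in physical proximity' if separated by less than
    a predetermined distance (threshold_m is an illustrative default)."""
    dx, dy = pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]
    return math.hypot(dx, dy) < threshold_m

def in_room(pos, bounds):
    """Alternative criterion: the device lies within a predefined set
    of boundaries, here a rectangle given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bounds
    return xmin <= pos[0] <= xmax and ymin <= pos[1] <= ymax
```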
  • Meetings at which an ad-hoc microphone array is defined and used may vary widely. Meetings may generally include any meetings in which microphones are used, e.g., meetings that include remote or virtual attendees. Such meetings may include, but are not limited to including, multimedia meetings such as telepresence meetings, video meetings, and audio meetings.
  • the embodiments may be implemented as hardware, firmware, and/or software logic embodied in a tangible, i.e., non-transitory, medium that, when executed, is operable to perform the various methods and processes described above. That is, the logic may be embodied as physical arrangements, modules, or components.
  • a tangible medium may be substantially any computer-readable medium that is capable of storing logic or computer program code which may be executed, e.g., by a processor or an overall computing system, to perform methods and functions associated with the embodiments.
  • Such computer-readable mediums may include, but are not limited to including, physical storage and/or memory devices.
  • Executable logic may include, but is not limited to including, code devices, computer program code, and/or executable computer commands or instructions.
  • In some instances, a computer-readable medium may include transitory embodiments and/or non-transitory embodiments, e.g., signals such as signals embodied in carrier waves. That is, a computer-readable medium may be associated with non-transitory tangible media and transitory propagating signals.

Abstract

According to one aspect, a method includes determining when a first device and a second device are in proximity to each other, wherein the first device includes a first microphone and the second device includes a second microphone. The method also includes pairing the first device and the second device, and creating a model mapping of the physical relationship between the first device and the second device. Pairing the first device and the second device forms a microphone array that includes the first microphone and the second microphone. An aggregate stream is created using a first stream obtained from the first microphone and a second stream obtained from the second microphone. Creating the aggregate stream includes using the model mapping to determine when to use the first stream and when to use the second stream. Finally, the method also includes transmitting the aggregate stream.

Description

    TECHNICAL FIELD
  • The disclosure relates generally to capturing audio signals during a meeting. More particularly, the disclosure relates to utilizing microphones of mobile devices to create an ad-hoc microphone array for use during a meeting.
  • BACKGROUND
  • Many meetings involve audio and/or video components that are broadcast to remote participants. For example, many meetings may be audio and/or video conferences which include participants located at a physical location such as a conference room, and participants at a remote location to whom audio streams from the physical location may be broadcast or otherwise transmitted. At many meetings, there is at least one fixed microphone at a physical location, e.g., a microphone on a speaker phone, into which participants may speak. The quality of audio transmitted to remote participants in a meeting may be poor, particularly when a speaker is not positioned substantially directly in front of a microphone.
  • The quality of audio that is transmitted from meeting participants in a physical location such as a conference room to meeting participants participating virtually or remotely may generally be affected by many factors. Background noises such as microphone scuffing, breathing, background conversation, and room echo may adversely affect the quality of audio transmitted to remote participants in a meeting. Audio transmitted to remote participants in a meeting may be unintelligible, for example, when two participants in a conference room speak at substantially the same time. In addition, the volume or loudness of audio may be affected by the position and/or orientation of an active speaker relative to a microphone and, thus, the quality of audio transmitted to remote participants may be compromised.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The disclosure will be readily understood by the following detailed description in conjunction with the accompanying drawings in which:
  • FIG. 1 is a process flow diagram which illustrates a method of creating and utilizing an ad-hoc microphone array that includes microphones of a plurality of mobile devices in accordance with an embodiment.
  • FIG. 2 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices in accordance with an embodiment.
  • FIG. 3 is a diagrammatic representation of a managing mobile device in accordance with an embodiment.
  • FIG. 4 is a process flow diagram which illustrates a method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment.
  • FIG. 5 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment.
  • FIG. 6 is a diagrammatic representation of an ad-hoc microphone array that includes at least one mobile device and at least one microphone not included in a mobile device, e.g., a microphone arrangement of a speaker phone or a microphone arrangement of a television, in accordance with an embodiment.
  • FIGS. 7A and 7B are a process flow diagram which illustrates one method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices which may be added to and removed from the ad-hoc microphone array in accordance with an embodiment.
  • FIG. 8 is a diagrammatic representation of a managing server in accordance with an embodiment.
  • DESCRIPTION OF EXAMPLE EMBODIMENTS General Overview
  • In one embodiment, a method includes determining when a first device and a second device are in proximity to each other, wherein the first device includes a first microphone and the second device includes a second microphone. The method also includes pairing the first device and the second device, and creating a model mapping of the physical relationship between the first device and the second device. Pairing the first device and the second device forms a microphone array that includes the first microphone and the second microphone. An aggregate stream is created using a first stream obtained from the first microphone and a second stream obtained from the second microphone. Creating the aggregate stream includes using the model mapping to determine when to use the first stream and when to use the second stream. Finally, the method also includes transmitting the aggregate stream.
  • DESCRIPTION
  • A meeting such as an audio conference, a video conference, or a multimedia conference generally involves providing or otherwise transmitting audio to remote participants. As a microphone used to capture audio during a meeting is typically at a fixed position in a physical location, unless an active speaker is positioned substantially directly in front of the microphone, the quality of the audio that is captured and transmitted may be compromised.
  • Parties who attend a meeting in person, e.g., parties who attend a multimedia conference at a physical location such as a conference room, are often in possession of mobile devices, particularly mobile devices that include microphones. For example, a party who is physically present in a meeting room for a meeting may have his or her mobile phone, tablet, and/or laptop computer in his or her possession. As there may generally be multiple mobile devices and, hence, multiple associated microphones, in a meeting room during a meeting, the ability to create an ad-hoc microphone array from the microphones such that the ad-hoc microphone array may be used to capture audio during the meeting may improve the quality of audio transmitted to remote participants in the meeting.
  • By identifying mobile devices in physical proximity to one another at a physical location, the microphones of the mobile devices may be identified for inclusion, e.g., pairing, in an ad-hoc microphone array. Mapping the precise location of each of the mobile devices and, hence, microphones included in the ad-hoc microphone array, as well as the orientation of each of the mobile devices allows a selection to be made as to which microphone provides the most desirable audio stream based on a current speaker. The ability to relatively precisely identify locations and orientations of microphones included in an ad-hoc microphone array at a physical location associated with a meeting may enhance digital signal processing of audio streams obtained from the microphones and, hence, improve the quality of audio provided to remote participants in the meeting.
  • An ad-hoc microphone array may generally be an array of microphones effectively created from microphones included in various devices such as mobile devices. For example, an ad-hoc microphone array may include microphones of different cellular phones that are all located at a particular location.
  • Mobile devices may generally include, but are not limited to including, cellular or mobile phones, laptops, tablets, and headsets. In one embodiment, a mobile device may be substantially any portable device that includes a microphone and may be used to participate in a meeting, e.g., a telepresence meeting or a conference call.
  • Referring initially to FIG. 1, a method of creating and utilizing an ad-hoc microphone array that includes microphones of a plurality of mobile devices will be described in accordance with an embodiment. A method 101 of creating and utilizing an ad-hoc microphone array begins at step 105 in which mobile devices that are in physical proximity to one another are identified or detected. The mobile devices may be identified or detected when a meeting, as for example a meeting in which remote participants may participate, is initiated or is underway. In one embodiment, mobile devices at a particular physical location, e.g., in a conference room, may be identified as being in physical proximity to one another.
  • Any suitable method may generally be used to identify mobile devices that are in physical proximity to one another, as for example at a geographical location associated with a meeting. Suitable methods may include, but are not limited to including, utilizing Bluetooth 4.0 LE to determine physical proximity between devices, utilizing iBeacon to determine the presence of devices, and the like. Further, threshold distances used to assess whether devices are in physical proximity to one another may vary.
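As one hedged illustration of how a Bluetooth LE or iBeacon signal might be turned into a proximity estimate, the log-distance path-loss model is commonly used for beacon ranging. The calibrated 1-meter power (-59 dBm) and path-loss exponent below are illustrative defaults, not values from the disclosure.

```python
def estimate_distance_m(rssi_dbm: float, tx_power_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Rough log-distance path-loss estimate of device separation.

    tx_power_dbm is the calibrated RSSI measured at 1 m from the
    transmitter; path_loss_exponent is ~2 in free space and higher
    indoors. The result feeds a proximity threshold comparison.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))
```

An RSSI equal to the calibrated 1-meter power yields 1 m; each additional 20 dB of attenuation (at exponent 2) multiplies the estimated distance by 10. Real RSSI readings are noisy, so implementations typically smooth over several samples before comparing against a threshold.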
  • After identifying mobile devices that are in physical proximity to one another, the mobile devices are paired in step 109. Pairing the mobile devices effectively creates an ad-hoc microphone array from the microphones of the mobile devices. In one embodiment, mobile devices that are in physical proximity to one another may be automatically paired. Methods or techniques used to pair mobile devices may include, but are not limited to including, Bluetooth techniques, WiFi techniques, and/or ultrasonic techniques. Other methods used to pair mobile devices may include methods which utilize meeting invitations to pair or otherwise associate mobile devices.
  • Once mobile devices are paired, a model mapping of a physical relationship between the paired mobile devices is created in step 113. Relatively precise physical positions of the paired mobile devices and/or the orientations of the paired mobile devices may be used to create a mapping of the paired mobile devices. For example, when iBeacons or substantially equivalent transmitters are in the vicinity of the paired mobile devices, the paired mobile devices may determine their physical positions relative to the iBeacons or transmitters.
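The position determination relative to iBeacons or equivalent transmitters can be illustrated with basic 2-D trilateration from three known beacon positions. This is an exact-measurement sketch with illustrative names; real ranging is noisy and would use a least-squares fit over more beacons.

```python
def trilaterate(beacons, distances):
    """Locate a device from three beacon positions and ranged distances.

    beacons: [(x, y), (x, y), (x, y)]; distances: [d0, d1, d2].
    Linearizes the three circle equations by subtracting the first
    from the other two, then solves the resulting 2x2 linear system.
    """
    (x0, y0), (x1, y1), (x2, y2) = beacons
    d0, d1, d2 = distances
    a1, b1 = 2 * (x1 - x0), 2 * (y1 - y0)
    c1 = (x1**2 - x0**2) + (y1**2 - y0**2) - (d1**2 - d0**2)
    a2, b2 = 2 * (x2 - x0), 2 * (y2 - y0)
    c2 = (x2**2 - x0**2) + (y2**2 - y0**2) - (d2**2 - d0**2)
    det = a1 * b2 - a2 * b1  # zero when the beacons are collinear
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y
```

The recovered positions for each paired device are exactly the inputs the model mapping needs in order to relate the microphones to one another.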
  • From step 113, process flow moves to step 117 in which a first mobile device, e.g., one of the paired mobile devices, is identified for use to process streams, e.g., audio streams, obtained from the mobile devices and to transmit aggregate streams such as aggregate enhanced streams created from the streams. The first mobile device that is identified may create an aggregate enhanced stream from the streams obtained from the mobile devices. The first mobile device is also typically arranged to transmit or otherwise provide the aggregate enhanced stream to remote participants in a meeting, or virtual participants in a meeting. For example, the first mobile device may effectively function as a master device such as a WiFi-direct group owner that handles the aggregation of and the processing of audio streams. It should be understood that while a single mobile device may be identified for use in processing streams and creating an aggregate enhanced stream, more than one mobile device may be used in processing streams and creating an aggregate enhanced stream.
  • The first mobile device switches between streams, e.g., inbound streams, based on the location of an active speaker, and creates an aggregate stream using the streams in step 121. Typically, the first mobile device may switch to a stream that is provided by a microphone closest to the active speaker such that the stream is chosen in part using the model mapping. For example, to create an aggregate stream, the first mobile device may obtain a stream from substantially the best microphone signal available for a current speaker.
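The model-mapping-based stream switch just described might be sketched as selecting the stream whose microphone is nearest the active speaker. The stream identifiers and coordinate representation are illustrative assumptions.

```python
import math

def select_stream(mic_positions, speaker_position):
    """Pick the stream whose microphone is closest to the active speaker.

    mic_positions: {stream_id: (x, y)} taken from the model mapping;
    speaker_position: (x, y) estimate of the current talker. Returns
    the stream_id with the smallest Euclidean distance to the talker.
    """
    return min(mic_positions,
               key=lambda sid: math.dist(mic_positions[sid],
                                         speaker_position))
```

In practice a managing device would also apply hysteresis so the selection does not flap between two microphones that are nearly equidistant from the speaker.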
  • The first mobile device may also perform digital signal processing on the streams to create an aggregate enhanced stream. Audio streams from microphones of an ad-hoc microphone array that are not capturing audio from an active speaker may be used in digital signal processing to substantially filter out background sounds or other disturbances from an aggregate stream to create an aggregate enhanced stream. By way of example, proximity and location of microphones capturing audio that is not associated with an active speaker may be used in digital signal processing to facilitate the identification of sounds that are relevant to the active speaker. Digital signal processing may further be enhanced when microphones in an ad-hoc microphone array use beam forming to determine a precise source location for various sounds and, thus, may enable filtering and/or muting of sounds or disturbances that are not associated with the active speaker. In general, digital signal processing may allow background noise to be removed from an outbound stream and loudness to be substantially normalized in the outbound stream. As will be understood by those skilled in the art, digital signal processing may also allow metadata to be provided with an outbound stream such that a recipient of the outbound stream may enhance audio source separation.
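One of the processing steps mentioned above, normalizing loudness in the outbound stream, can be sketched as scaling a block of samples toward a target RMS level. The target level and block-based structure are illustrative assumptions; real DSP would smooth the gain across blocks and use a perceptual loudness model.

```python
def normalize_loudness(samples, target_rms=0.1):
    """Scale one block of audio samples toward a target RMS level.

    samples: iterable of floats in [-1.0, 1.0]. Silent blocks are
    returned unchanged to avoid dividing by zero.
    """
    samples = list(samples)
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    if rms == 0:
        return samples
    gain = target_rms / rms
    return [s * gain for s in samples]
```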
  • In step 125, the first mobile device transmits the aggregate enhanced stream, or the outbound stream. That is, the first mobile device transmits an outbound stream to remote, or virtual, participants in the meeting. Upon the first mobile device transmitting the aggregate enhanced stream, the method of creating and utilizing an ad-hoc microphone array is completed.
  • FIG. 2 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices in accordance with an embodiment. An ad-hoc microphone array 200 may include multiple mobile devices 204 a-c. Each mobile device 204 a-c includes a microphone 208 a-c or, more generally, a sensor arrangement that is configured to capture sound. Ad-hoc microphone array 200 may be created or otherwise formed when parties in possession of mobile devices 204 a-c are at substantially the same physical location and are participating in a meeting, e.g., a telepresence meeting.
  • Mobile device 204 a may be a managing mobile device arranged to effectively obtain sound captured by microphones 208 a-c and to create an outbound stream 206, or an aggregate enhanced stream, for transmission to remote or virtual participants in the meeting. In one embodiment, mobile device 204 a may also perform digital signal processing on the sound captured by microphones 208 a-c to create outbound stream 206.
  • In general, any mobile device which has a microphone included in an ad-hoc microphone array may serve as a managing mobile device. With reference to FIG. 3, one embodiment of a managing mobile device will be described. A managing mobile device 304 includes an input/output (I/O) interface 312, a processor 316, and a logic module 320.
  • I/O interface 312 generally allows mobile device 304 to communicate on a network, e.g., a wireless or cellular communications network, and includes a microphone 308. It should be appreciated that the managing mobile device 304 may include more than one microphone 308.
  • Logic module 320 generally includes hardware and/or software logic. Processor 316 is configured to execute software logic included in logic module 320. In the described embodiment, logic module 320 includes proximity detection logic 324, pairing logic 328, model mapping logic 332, managing logic 336, and transmission logic 344.
  • Proximity detection logic 324 is configured to effectively detect or otherwise identify when there is at least one mobile device in proximity to mobile device 304. Proximity detection logic 324 may include, or may have access to, transmitters which may provide notifications which effectively identify mobile devices that are in proximity to the transmitters.
  • Pairing logic 328 is configured to pair mobile devices that are in proximity to each other. For example, pairing logic 328 may pair mobile device 304 to other mobile devices within its proximity to essentially create an ad-hoc microphone array.
  • Model mapping logic 332 is configured to map a physical relationship between paired mobile devices. Model mapping logic 332 may obtain information from sensing devices, e.g., transmitters, which identify mobile devices in proximity to the sensing devices. Model mapping logic 332 may also obtain information from mobile devices which derive their relative locations based on the information from sensing devices.
  • Managing logic 336 is configured to process streams, e.g., audio, obtained from microphone 308 and from other mobile devices such that an aggregate stream may be generated. Managing logic 336 may include digital signal processing logic 340 that is arranged to process obtained streams to enhance the aggregate stream, or to create an enhanced aggregate stream. That is, managing logic 336 is generally arranged to handle the aggregation and processing of audio streams. Processing audio streams may include, but is not limited to including, substantially optimizing an aggregate stream based upon capabilities of a device intended to receive the aggregate stream. Digital signal processing logic 340 may use location information, e.g., position and orientation information, relating to mobile device 304 and to other mobile devices paired to mobile device 304 when performing digital signal processing to create an enhanced aggregate stream that is arranged to be transmitted, as for example to remote participants in a meeting.
  • Transmission logic 344 is arranged to transmit an aggregate stream or an outbound stream created by managing logic 336. Transmission logic 344 may cause an aggregate stream such as an enhanced aggregate stream to be transmitted across a network to, or otherwise provided to, remote or virtual participants in a meeting. In one embodiment, the aggregate stream transmitted using transmission logic 344 may include metadata that may be used by a recipient of the aggregate stream to enhance source separation.
  • In one embodiment, in lieu of a mobile device acting as a managing mobile device for an ad-hoc microphone array, a central device or a managing server may instead provide management functionality for an ad-hoc microphone array. For example, a conference server that is located in a conference room may support an ad-hoc microphone array that includes mobile devices located in the conference room. Alternatively, a server that is located outside a conference room, but is effectively in communication with the conference room, may support an ad-hoc microphone array. FIG. 4 is a process flow diagram which illustrates a method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment. A method 401 of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices managed by a server begins at step 405 in which mobile devices in physical proximity to one another, and with a server, are identified. The identification of the mobile devices may be made by the server. In general, the mobile devices may be identified when the server is used to initiate and to maintain a meeting.
  • Once mobile devices in physical proximity to one another are identified, the mobile devices are paired with the server in step 409. A model mapping of a physical relationship between the mobile devices is then created in step 413. After the model mapping is created, the server may switch between streams provided by the mobile devices to create an aggregate enhanced stream in step 421 based on the identity of an active speaker. For example, the stream provided by a mobile device that is nearest to the active speaker may be a significant component of the aggregate enhanced stream. It should be appreciated that the server may also perform digital signal processing when creating the aggregate enhanced stream.
  • The server may transmit an aggregate enhanced stream, or an outbound stream, in step 425. Such a stream may be transmitted to remote participants in the meeting. The method of creating and utilizing an ad-hoc microphone array is completed when the server transmits an aggregate enhanced stream.
  • FIG. 5 is a diagrammatic representation of an ad-hoc microphone array that includes a plurality of mobile devices managed by a server in accordance with an embodiment. An ad-hoc microphone array 500 may include multiple mobile devices 504 a-c. Each mobile device 504 a-c includes a microphone 508 a-c. Ad-hoc microphone array 500 may be created or otherwise formed when parties in possession of mobile devices 504 a-c are at substantially the same physical location and are participating in a meeting, e.g., a telepresence meeting, that is managed by a managing server 550.
  • Managing server 550, or a central device, is arranged to effectively obtain sound captured by microphones 508 a-c and to create an outbound stream 506, or an aggregate enhanced stream, for transmission to remote or virtual participants in the meeting. Managing server 550 may be located in proximity to mobile devices 504 a-c, and arranged to detect when mobile devices 504 a-c are within a particular range of managing server 550, or within a particular range of a sensing device (not shown) that is in communication with managing server 550. Managing server 550 may also perform digital signal processing on the sound captured by microphones 508 a-c to create the outbound stream. It should be appreciated that managing server 550 may be considered to be part of ad-hoc microphone array 500. In one embodiment, managing server 550 may be located at substantially the same physical location as ad-hoc microphone array 500, although it should be understood that managing server 550 may instead be at a different physical location from ad-hoc microphone array 500 but in communication with ad-hoc microphone array 500.
  • Some ad-hoc microphone arrays may include substantially only microphones of mobile devices. It should be appreciated, however, that other ad-hoc microphone arrays may include both microphones of mobile devices and other microphones, e.g., microphones of a speaker phone and/or standalone acoustic microphones. FIG. 6 is a diagrammatic representation of an ad-hoc microphone array that includes at least one mobile device and at least one microphone not included in a mobile device, e.g., a microphone arrangement of a non-mobile device, in accordance with an embodiment. An ad-hoc microphone array 600 includes a mobile device 604 which includes a microphone 608, and a microphone arrangement 654 which includes a microphone 658.
  • Microphone arrangement 654 may be any suitable arrangement which includes microphone 658. For example, microphone arrangement 654 may be a standalone acoustic microphone arrangement, a speaker phone, a computing device, and/or any other device which includes microphone 658.
  • Managing server 650 may detect when mobile device 604 and microphone arrangement 654 are within proximity to each other and/or to managing server 650, and form ad-hoc microphone array 600. Managing server 650 is arranged to obtain streams from mobile device 604 and microphone arrangement 654, and to create an outbound stream 606 that may be transmitted, e.g., to remote participants in a meeting. When managing server 650 creates or generates outbound stream 606, managing server 650 may apply digital signal processing techniques to streams obtained from mobile device 604 and microphone arrangement 654.
  • During the course of a meeting in which an ad-hoc microphone array is used to capture sounds, microphones included in the ad-hoc microphone array may change. By way of example, one microphone originally included in an ad-hoc microphone array may be moved away from the physical location of the ad-hoc microphone array, and another microphone may move into the physical proximity of other microphones in the ad-hoc microphone array. With reference to FIGS. 7A and 7B, a method of creating and utilizing an ad-hoc microphone array that includes a plurality of mobile devices which may be added to and removed from the ad-hoc microphone array will be described in accordance with an embodiment. A method 701 of creating and utilizing an ad-hoc microphone array begins at step 705 in which a first mobile device and a second mobile device that are in physical proximity to one another at a particular location are identified. The first mobile device and the second mobile device may generally be in the possession of participants in a meeting. Once identified, the first mobile device and the second mobile device are paired in step 709.
  • In step 713, a model mapping of a physical relationship between the paired mobile devices is created. After the model mapping is created, a managing device is identified in step 717 to process and to transmit an aggregate enhanced stream to virtual participants in the meeting. In one embodiment, the managing device may be one of the paired mobile devices. In another embodiment, the managing device may be a managing server. It should be appreciated that in still another embodiment, managing functionality may be distributed between the paired mobile devices and, as such, the managing device may effectively be the set of paired mobile devices.
  • The managing device switches between streams obtained from the paired mobile devices based on an active speaker, and creates an aggregate enhanced stream in step 721. The managing device may also perform digital signal processing when creating an aggregate enhanced stream for transmission to the virtual participants in the meeting. Once the aggregate enhanced stream, or the outbound stream, is created, the managing device transmits the aggregate enhanced stream in step 725.
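The active-speaker switching of step 721 may be illustrated with a brief sketch. The disclosure does not specify how the active speaker is detected; the sketch below assumes, purely for illustration, that the loudest current frame (by root-mean-square energy) approximates the microphone nearest the active speaker, and all function names are hypothetical.

```python
import math

def frame_energy(samples):
    """Root-mean-square energy of one audio frame (a list of floats)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def select_active_stream(frames_by_device):
    """Pick the device whose current frame is loudest, as a crude
    proxy for 'closest to the active speaker'."""
    return max(frames_by_device, key=lambda d: frame_energy(frames_by_device[d]))

def aggregate(frames_by_device):
    """One step of building the aggregate stream: emit the frame
    captured by the device nearest the active speaker."""
    active = select_active_stream(frames_by_device)
    return active, frames_by_device[active]
```

In practice the managing device would apply such a selection per frame, optionally cross-fading between streams and applying digital signal processing before transmission in step 725.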
  • A determination is made in step 729 as to whether a new mobile device is detected in proximity to the paired mobile devices. If the determination is that a new mobile device is detected in proximity to the paired mobile devices, the new mobile device is paired to the other mobile devices, e.g., the first mobile device and the second mobile devices, in step 733. From step 733, process flow moves to step 713 in which a model mapping of the physical relationship between paired mobile devices is created.
  • Returning to step 729, if the determination is that no new mobile device has been detected in proximity to the paired mobile devices, it is determined in step 737 whether a paired mobile device is no longer detected at the physical location. That is, a determination is made in step 737 as to whether all of the paired mobile devices are still detected at the physical location. If it is determined that a paired mobile device is no longer detected at the physical location, the mobile device that is no longer detected at the physical location is unpaired from the other paired mobile devices in step 741. From step 741, process flow moves to step 713 in which a model mapping of the physical relationship between paired mobile devices is created.
  • Alternatively, if it is determined in step 737 that all paired mobile devices are still detected at the physical location, then process flow returns to step 721 in which the managing device continues to switch between streams based on an active speaker, and creates an aggregate enhanced stream. That is, if all paired mobile devices are still detected at the physical location, then the managing device continues to switch between streams and creates an aggregate enhanced stream.
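The control flow of steps 713 through 741 may be sketched as a loop that remaps whenever array membership changes and otherwise keeps producing the aggregate enhanced stream. Step numbers below refer to FIGS. 7A and 7B, but the data structures and helper names are assumptions made for illustration only.

```python
def run_array(devices, detect_new, detect_departed, build_mapping,
              create_and_send_aggregate, max_iterations=100):
    """Sketch of the loop of FIGS. 7A-7B: remap on membership changes,
    otherwise continue producing the aggregate enhanced stream."""
    mapping = build_mapping(devices)                  # step 713
    for _ in range(max_iterations):
        create_and_send_aggregate(devices, mapping)   # steps 721 and 725
        new = detect_new(devices)                     # step 729
        if new is not None:
            devices.add(new)                          # step 733: pair new device
            mapping = build_mapping(devices)          # return to step 713
            continue
        departed = detect_departed(devices)           # step 737
        if departed is not None:
            devices.discard(departed)                 # step 741: unpair device
            mapping = build_mapping(devices)          # return to step 713
    return devices, mapping
```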
  • In one embodiment, paired mobile devices may move relative to one another, while still being detected at a physical location. That is, the positioning and orientation of paired mobile devices at a physical location may change. It should be appreciated that when paired mobile devices move relative to one another at a physical location, a new model mapping of a physical relationship between the paired mobile devices may be created. When a change is detected in a location and/or an orientation of a paired mobile device at a physical location, a new model mapping may be created to further enhance the performance of an ad-hoc microphone array formed from paired mobile devices.
  • One embodiment of a managing server will be described with respect to FIG. 8. FIG. 8 is a diagrammatic representation of a managing server. A managing server 850 includes a communications interface 848, a processor 816, and a logic module 820. It should be appreciated that managing server 850 may be physically located at substantially the same physical location as an ad-hoc microphone array that managing server 850 is managing, or managing server 850 may be located at a different location and in communication with an ad-hoc microphone array that managing server 850 is managing.
  • Communications interface 848 generally allows managing server 850 to communicate on a network, e.g., a wireless or cellular communications network. Communications interface 848 may be configured to allow managing server 850 to communicate with microphones of an ad-hoc microphone array during a meeting, and to communicate with remote participants in the meeting.
  • Logic module 820 generally includes hardware and/or software logic. Processor 816 is configured to execute software logic included in logic module 820. In the described embodiment, logic module 820 includes proximity detection logic 824, pairing logic 828, model mapping logic 832, managing logic 836, and transmission logic 844.
  • Proximity detection logic 824 is configured to effectively detect or otherwise identify when there are devices with microphones, e.g., mobile devices with microphones, in proximity to each other. In one embodiment, proximity detection logic 824 may determine when mobile devices are in proximity to each other and to managing server 850. Proximity detection logic 824 may include, or may have access to, transmitters which may provide notifications which effectively identify mobile devices that are in proximity to the transmitters.
  • Pairing logic 828 is configured to pair mobile devices that are in proximity to each other. That is, pairing logic 828 is arranged to pair mobile devices that are in proximity to each other and, in some instances, in proximity to managing server 850 such that an ad-hoc microphone array is created.
  • Model mapping logic 832 is configured to map a physical relationship between paired mobile devices. Model mapping logic 832 may obtain information from sensing devices, e.g., transmitters, which identify mobile devices in proximity to the sensing devices. Model mapping logic 832 may also obtain information from mobile devices which derive their relative locations based on the information from sensing devices. Such information may be used to map a physical relationship between paired mobile devices.
  • Managing logic 836 is configured to process streams, e.g., audio, obtained by managing server 850 from an ad-hoc microphone array that includes paired mobile devices to produce an aggregate stream. Managing logic 836 may include digital signal processing logic 840 that is arranged to process streams obtained from microphones in an ad-hoc microphone array to enhance the aggregate stream, or to create an enhanced aggregate stream. That is, managing logic 836 is generally arranged to handle the aggregation and processing of audio streams. Processing audio streams may include, but is not limited to including, substantially optimizing an aggregate stream based upon capabilities of a device intended to receive the aggregate stream. Digital signal processing logic 840 may use location information, e.g., position and orientation information, relating to mobile devices when performing digital signal processing to create an enhanced aggregate stream that is arranged to be transmitted, as for example to remote participants in a meeting.
  • Transmission logic 844 is arranged to transmit an aggregate stream or an outbound stream created by managing logic 836. Transmission logic 844 may cause an aggregate stream such as an enhanced aggregate stream to be transmitted across a network to, or otherwise provided to, remote or virtual participants in a meeting. In one embodiment, the aggregate stream transmitted using transmission logic 844 may include metadata that may be used by a recipient of the aggregate stream to enhance source separation.
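The division of responsibility among the logic modules of managing server 850 might be sketched as follows. Class and method names are illustrative assumptions, not taken from the disclosure, which permits each module to be hardware and/or software logic executed by processor 816.

```python
class ManagingServer:
    """Sketch of FIG. 8: the server delegates each stage of array
    management to a dedicated logic component."""

    def __init__(self, proximity, pairing, mapping, managing, transmission):
        self.proximity = proximity        # proximity detection logic 824
        self.pairing = pairing            # pairing logic 828
        self.mapping = mapping            # model mapping logic 832
        self.managing = managing          # managing logic 836 (incl. DSP logic 840)
        self.transmission = transmission  # transmission logic 844

    def run_once(self, environment, streams):
        """One pass through detection, pairing, mapping, aggregation,
        and transmission of the aggregate stream."""
        devices = self.proximity.detect(environment)
        array = self.pairing.pair(devices)
        model = self.mapping.map(array)
        aggregate = self.managing.aggregate(streams, model)
        self.transmission.transmit(aggregate)
        return aggregate
```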
  • Although only a few embodiments have been described in this disclosure, it should be understood that the disclosure may be embodied in many other specific forms without departing from the spirit or the scope of the present disclosure. By way of example, any suitable apparatus or method may be used to determine when two or more mobile devices are in physical proximity to one another. Further, thresholds used to determine when a mobile device is in physical proximity to another mobile device may vary widely.
  • Mobile devices may be configured to automatically join an ad-hoc microphone array. Alternatively, a mobile device may be added into an ad-hoc microphone array substantially manually, e.g., by accessing an application that is used to allow the mobile device to join the ad-hoc microphone array. In other words, a mobile device with a microphone may join an ad-hoc microphone array either implicitly or explicitly. In one embodiment, a mobile device known to be associated with a particular meeting attendee may be allowed to automatically join an ad-hoc microphone array during a meeting, while a mobile device that is not known to be associated with a particular meeting attendee may be required to undergo an authorization process before being allowed to join the ad-hoc microphone array.
  • A determination of which mobile device of an ad-hoc microphone array is to be used as a master device, or to process and to transmit an aggregate stream, may be based on a number of different factors. Factors used to identify a suitable mobile device for use as a managing device with respect to an ad-hoc microphone array may include, but are not limited to including, the capabilities of a mobile device and the resources available to the mobile device. For instance, a mobile device may be identified for use as a managing device based upon available processing, available memory, available network capabilities, available battery life, and/or power consumption considerations. In one embodiment, if multiple mobile devices are capable of serving as a managing device, the mobile devices may effectively share the role of a managing device such that power consumption burdens may be substantially shared.
  • As mentioned above, digital signal processing may be used to generate an enhanced aggregate stream, or enhanced outbound stream. Information such as a physical location of a microphone of a mobile device and an orientation of the microphone may generally be used to improve digital signal processing. When positional information provided with respect to a microphone of a mobile device is considered to be relatively inaccurate, accurate positional and orientation information may be provided for purposes of digital signal processing using other methods. By way of example, an ultrasonic ping may be used to provide accurate positioning and orientation information about a mobile device. Such an ultrasonic ping may also provide valuable metadata in real-time, and may reduce timing issues and out-of-band communications issues.
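One common way positional information improves digital signal processing is delay-and-sum beamforming: each microphone's stream is time-shifted to compensate for its extra acoustic travel distance from the source, and the aligned streams are averaged. The disclosure does not name a specific algorithm; the sketch below is one illustrative possibility, and it assumes sample-aligned streams and integer-sample delays.

```python
SPEED_OF_SOUND = 343.0  # metres per second, at roughly room temperature

def distance(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def delay_and_sum(streams, mic_positions, source_position, sample_rate):
    """Delay-and-sum beamforming sketch: align each microphone's stream
    to the microphone nearest the source, then average the samples.
    streams: device -> list of samples; mic_positions: device -> (x, y)."""
    dists = {d: distance(mic_positions[d], source_position) for d in streams}
    nearest = min(dists.values())
    # Integer-sample delay of each microphone relative to the nearest one.
    delays = {d: round((dists[d] - nearest) / SPEED_OF_SOUND * sample_rate)
              for d in streams}
    length = min(len(s) - delays[d] for d, s in streams.items())
    return [sum(streams[d][i + delays[d]] for d in streams) / len(streams)
            for i in range(length)]
```

When the aligned streams carry the same speech signal, the averaging reinforces that signal while uncorrelated noise partially cancels, which is the enhancement the positional and orientation information makes possible.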
  • When mobile devices are within physical proximity to each other, the mobile devices may be located at certain distances from each other. For example, a first mobile device may be considered to be in physical proximity to a second mobile device if the first mobile device and the second mobile device are separated by less than a predetermined distance. Mobile devices may also be in physical proximity to each other if the mobile devices are all at a particular physical location, e.g., in a room or within a predefined set of boundaries.
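The two proximity tests described above, a distance threshold and containment within a predefined boundary, might look like the following minimal sketch. The 3-meter threshold and the rectangular boundary are assumptions chosen for illustration; as noted elsewhere in the disclosure, such thresholds may vary widely.

```python
PROXIMITY_THRESHOLD_M = 3.0  # assumed threshold; the disclosure leaves this open

def in_physical_proximity(pos_a, pos_b, threshold=PROXIMITY_THRESHOLD_M):
    """True if two devices are separated by less than the threshold distance."""
    dist = sum((x - y) ** 2 for x, y in zip(pos_a, pos_b)) ** 0.5
    return dist < threshold

def all_in_room(positions, room_min, room_max):
    """Alternative test: every device lies within a predefined rectangular
    boundary, e.g., the walls of a meeting room."""
    return all(all(lo <= c <= hi for c, lo, hi in zip(p, room_min, room_max))
               for p in positions)
```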
  • Meetings at which an ad-hoc microphone array is defined and used may vary widely. Meetings may generally include any meetings in which microphones are used, e.g., meetings that include remote or virtual attendees. Such meetings may include, but are not limited to including, multimedia meetings such as telepresence meetings, video meetings, and audio meetings.
  • The embodiments may be implemented as hardware, firmware, and/or software logic embodied in a tangible, i.e., non-transitory, medium that, when executed, is operable to perform the various methods and processes described above. That is, the logic may be embodied as physical arrangements, modules, or components. A tangible medium may be substantially any computer-readable medium that is capable of storing logic or computer program code which may be executed, e.g., by a processor or an overall computing system, to perform methods and functions associated with the embodiments. Such computer-readable mediums may include, but are not limited to including, physical storage and/or memory devices. Executable logic may include, but is not limited to including, code devices, computer program code, and/or executable computer commands or instructions.
  • It should be appreciated that a computer-readable medium, or a machine-readable medium, may include transitory embodiments and/or non-transitory embodiments, e.g., signals or signals embodied in carrier waves. That is, a computer-readable medium may be associated with non-transitory tangible media and transitory propagating signals.
  • The steps associated with the methods of the present disclosure may vary widely. Steps may be added, removed, altered, combined, and reordered without departing from the spirit or the scope of the present disclosure. Therefore, the present examples are to be considered as illustrative and not restrictive, and the examples are not to be limited to the details given herein, but may be modified within the scope of the appended claims.

Claims (18)

What is claimed is:
1. A method comprising:
determining when a first device and a second device are in proximity to each other, wherein the first device includes a first microphone and the second device includes a second microphone;
pairing the first device and the second device, wherein pairing the first device and the second device forms a microphone array, the microphone array including the first microphone and the second microphone;
creating a model mapping of a physical relationship between the first device and the second device;
creating an aggregate stream using a first stream obtained from the first microphone and a second stream obtained from the second microphone, wherein creating the aggregate stream includes using the model mapping to determine when to use the first stream and when to use the second stream; and
transmitting the aggregate stream.
2. The method of claim 1 wherein creating the aggregate stream includes processing at least one of the first stream and the second stream using digital signal processing.
3. The method of claim 1 wherein the first device and the second device are paired during a meeting, the first device and the second device being located at a physical location associated with the meeting, and wherein transmitting the aggregate stream includes transmitting the aggregate stream to at least one virtual participant in the meeting.
4. The method of claim 1 wherein the first device is a first mobile device and the second device is a second mobile device, the method further including:
identifying the first mobile device as a managing device, wherein the first mobile device creates the aggregate stream.
5. The method of claim 4 wherein the first mobile device transmits the aggregate stream.
6. The method of claim 1 wherein the first device is a first mobile device and the second device is a second mobile device, and wherein a managing server creates the aggregate stream and transmits the aggregate stream.
7. A tangible, non-transitory computer-readable medium comprising computer program code, the computer program code, when executed, configured to:
determine when a first device and a second device are in proximity to each other, wherein the first device includes a first microphone and the second device includes a second microphone;
pair the first device and the second device, wherein the computer program code configured to pair the first device and the second device is configured to form a microphone array, the microphone array including the first microphone and the second microphone;
create a model mapping of a physical relationship between the first device and the second device;
create an aggregate stream using a first stream obtained from the first microphone and a second stream obtained from the second microphone, wherein the computer program code configured to create the aggregate stream is configured to use the model mapping to determine when to use the first stream and when to use the second stream; and
transmit the aggregate stream.
8. The tangible, non-transitory computer-readable medium comprising computer program code of claim 7 wherein the computer program code configured to create the aggregate stream is configured to process at least one of the first stream and the second stream using digital signal processing.
9. The tangible, non-transitory computer-readable medium comprising computer program code of claim 7 wherein the first device and the second device are paired during a meeting, the first device and the second device being located at a physical location associated with the meeting, and wherein the computer program code configured to transmit the aggregate stream is configured to transmit the aggregate stream to at least one virtual participant in the meeting.
10. The tangible, non-transitory computer-readable medium comprising computer program code of claim 7 wherein the first device is a first mobile device and the second device is a second mobile device, the computer program code further configured to:
identify the first mobile device as a managing device, wherein the first mobile device creates the aggregate stream.
11. The tangible, non-transitory computer-readable medium comprising computer program code of claim 10 wherein the aggregate stream is transmitted by the first mobile device.
12. The tangible, non-transitory computer-readable medium comprising computer program code of claim 7 wherein the first device is a first mobile device and the second device is a second mobile device, and the aggregate stream is created by a managing server and the aggregate stream is transmitted by the managing server.
13. An apparatus comprising:
a processor; and
logic configured to be executed by the processor, the logic including proximity detection logic, pairing logic, and managing logic, the proximity detection logic being configured to determine when a first device is in proximity to the apparatus at a first physical location, the first device being a first mobile device and including a first microphone, the pairing logic being configured to pair the first microphone and a second microphone to form a microphone array, wherein the managing logic is configured to create an aggregate stream using a first stream obtained from the first microphone and a second stream obtained from the second microphone.
14. The apparatus of claim 13 wherein the apparatus is a second mobile device, and wherein the pairing logic is configured to pair the first microphone and the second microphone by pairing the first mobile device and the second mobile device, the apparatus further including:
the second microphone.
15. The apparatus of claim 13 wherein the logic further includes model mapping logic, the model mapping logic being configured to create a model mapping of a physical relationship between the first mobile device and the apparatus.
16. The apparatus of claim 15 wherein the pairing logic is arranged to pair the first microphone and the second microphone during a meeting, and wherein the logic further includes transmission logic, the transmission logic being arranged to transmit the aggregate stream to a remote participant in the meeting.
17. The apparatus of claim 13 wherein the managing logic is configured to create the aggregate stream by applying digital signal processing to at least one of the first stream and the second stream.
18. The apparatus of claim 13 wherein the second microphone is included in a second device and the apparatus is a managing server, and wherein the pairing logic is configured to pair the first microphone and the second microphone by pairing the first mobile device and the second device.
US14/560,299 2014-12-04 2014-12-04 Utilizing mobile devices in physical proximity to create an ad-hoc microphone array Active 2034-12-30 US9369186B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/560,299 US9369186B1 (en) 2014-12-04 2014-12-04 Utilizing mobile devices in physical proximity to create an ad-hoc microphone array


Publications (2)

Publication Number Publication Date
US20160164577A1 true US20160164577A1 (en) 2016-06-09
US9369186B1 US9369186B1 (en) 2016-06-14

Family

ID=56095284

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/560,299 Active 2034-12-30 US9369186B1 (en) 2014-12-04 2014-12-04 Utilizing mobile devices in physical proximity to create an ad-hoc microphone array

Country Status (1)

Country Link
US (1) US9369186B1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10516707B2 (en) * 2016-12-15 2019-12-24 Cisco Technology, Inc. Initiating a conferencing meeting using a conference room device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8223187B2 (en) 2008-07-17 2012-07-17 Cisco Technology, Inc. Non-bandwidth intensive method for providing multiple levels of censoring in an A/V stream
US8487975B2 (en) 2009-01-27 2013-07-16 Lifesize Communications, Inc. Conferencing system utilizing a mobile communication device as an interface
CH702399B1 (en) 2009-12-02 2018-05-15 Veovox Sa Apparatus and method for capturing and processing the voice
US9313336B2 (en) 2011-07-21 2016-04-12 Nuance Communications, Inc. Systems and methods for processing audio signals captured using microphones of multiple devices
US9024998B2 (en) 2011-10-27 2015-05-05 Polycom, Inc. Pairing devices in conference using ultrasonic beacon

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9781106B1 (en) * 2013-11-20 2017-10-03 Knowles Electronics, Llc Method for modeling user possession of mobile device for user authentication framework
US20180191908A1 (en) * 2016-12-30 2018-07-05 Akamai Technologies, Inc. Collecting and correlating microphone data from multiple co-located clients, and constructing 3D sound profile of a room
US10291783B2 (en) * 2016-12-30 2019-05-14 Akamai Technologies, Inc. Collecting and correlating microphone data from multiple co-located clients, and constructing 3D sound profile of a room
US10524048B2 (en) * 2018-04-13 2019-12-31 Bose Corporation Intelligent beam steering in microphone array
US10721560B2 (en) 2018-04-13 2020-07-21 Bose Coporation Intelligent beam steering in microphone array
CN111917438A (en) * 2020-07-10 2020-11-10 北京搜狗科技发展有限公司 Voice acquisition method, device and system and voice acquisition equipment
US20220393896A1 (en) * 2021-06-08 2022-12-08 International Business Machines Corporation Multi-user camera switch icon during video call

Also Published As

Publication number Publication date
US9369186B1 (en) 2016-06-14

Similar Documents

Publication Publication Date Title
US9369186B1 (en) Utilizing mobile devices in physical proximity to create an ad-hoc microphone array
US9554091B1 (en) Identifying conference participants and active talkers at a video conference endpoint using user devices
US9787848B2 (en) Multi-beacon meeting attendee proximity tracking
US20190166424A1 (en) Microphone mesh network
KR101569863B1 (en) Muting participants in a communication session
US9319532B2 (en) Acoustic echo cancellation for audio system with bring your own devices (BYOD)
US8630208B1 (en) Muting of communication session participants
US9973561B2 (en) Conferencing based on portable multifunction devices
WO2015191788A1 (en) Intelligent device connection for wireless media in an ad hoc acoustic network
US9094524B2 (en) Enhancing conferencing user experience via components
US20150358767A1 (en) Intelligent device connection for wireless media in an ad hoc acoustic network
CN106663447B (en) Audio system with noise interference suppression
US10484544B2 (en) Method and system for adjusting volume of conference call
US20080101624A1 (en) Speaker directionality for user interface enhancement
KR20130063542A (en) System and method for providing conference information
US20150117674A1 (en) Dynamic audio input filtering for multi-device systems
US8914007B2 (en) Method and apparatus for voice conferencing
KR20170017381A (en) Terminal and method for operaing terminal
CN112887871A (en) Authority-based earphone voice playing method, earphone and storage medium
US9106717B2 (en) Speaking participant identification
US10250850B2 (en) Communication control method, communication control apparatus, telepresence robot, and recording medium storing a program
US11557296B2 (en) Communication transfer between devices
CN114531425A (en) Processing method and processing device
US20190149917A1 (en) Audio recording system and method
US20200112809A1 (en) Spatial Audio Capture & Processing

Legal Events

Date Code Title Description
AS Assignment

Owner name: CISCO TECHNOLOGY, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:REXROAD, MICHAEL GREGORY;JOSHI, NEIL;SIGNING DATES FROM 20141115 TO 20141204;REEL/FRAME:034376/0874

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8