WO2010115285A1 - Enhanced communication bridge - Google Patents
- Publication number
- WO2010115285A1 (application PCT/CA2010/000534)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- communication
- bridge
- communication session
- audio
- context
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/28—Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
- H04L12/46—Interconnection of networks
- H04L12/4604—LAN interconnection over a backbone network, e.g. Internet, Frame Relay
- H04L12/462—LAN interconnection over a bridge based backbone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/1066—Session management
- H04L65/1083—In-session procedures
- H04L65/1093—In-session procedures by adding participants; by removing participants
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/752—Media network packet handling adapting media to network capabilities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/765—Media network packet handling intermediate
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/22—Arrangements for supervision, monitoring or testing
- H04M3/2236—Quality of speech transmission monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M7/00—Arrangements for interconnection between switching centres
- H04M7/12—Arrangements for interconnection between switching centres for working between exchanges having different types of switching equipment, e.g. power-driven and step by step or decimal and non-decimal
- H04M7/1205—Arrangements for interconnection between switching centres for working between exchanges having different types of switching equipment, e.g. power-driven and step by step or decimal and non-decimal where the types of switching equipment comprises PSTN/ISDN equipment and switching equipment of networks other than PSTN/ISDN, e.g. Internet Protocol networks
Definitions
- This application relates to communication networks and, more particularly, to a method and apparatus for providing an enhanced communication bridge.
- Data communication networks may include various computers, servers, nodes, routers, switches, hubs, proxies, and other devices coupled to and configured to pass data to one another. These devices will be referred to herein as "network elements," and may provide a variety of network resources on the network.
- Data is communicated through data communication networks by passing protocol data units (such as packets, cells, frames, or segments) between the network elements over communication links on the network.
- a particular protocol data unit may be handled by multiple network elements and cross multiple communication links as it travels between its source and its destination over the network.
- Hosts such as computers, telephones, cellular telephones, Personal Digital Assistants, and other types of consumer electronics connect to and transmit/receive data over the communication network and, hence, are users of the communication services offered by the communication network.
- a telephone call may be established to connect two, three, or a small number of people and enable those individuals to talk with each other on a communication network.
- an audio bridge may be used.
- An audio bridge basically receives input from the participants, selects two, three, or another small number of signals to be mixed, and provides the mixed audio to each of the participants. This allows many people to simultaneously talk and listen to a given communication over the network. Audio bridges have been around for many years and are well known in the art.
- An enhanced communication bridge includes a context interface that enables the audio bridge to learn information about the type of Voice encoder, device, network connection, location, type of call (business vs. personal), identity and position of the individual, and other information about the context of the communication session itself as well as the context of each person joining the communication session.
- This context information is used to determine quality of experience targets for the communication as a whole, as well as how each individual contribution should be uniquely processed to attempt to meet the quality of experience targets.
- Business factors may influence the decision as to the type of processing to be implemented on each of the signals provided by the participants. Corrective action may also be implemented by the bridge on the client network devices in the embodiment.
- the bridge may be centralized or distributed.
- a video bridge may be implemented as well.
- FIG. 1 is a functional block diagram of an example of a communication network according to an embodiment of the invention.
- FIGs. 2A and 2B are functional block diagrams showing the flow of information between participants A-E and two types of communication bridges.
- FIG. 3 is a functional block diagram of an example enhanced communication bridge according to an embodiment of the invention.
- Fig. 1 illustrates an example communication network 10 on which a multi-party communication session may be implemented.
- the multi-party communication session may be an audio call and, optionally, may include video content as well.
- the term "communication bridge" will be used to refer to a device that is capable of connecting multiple parties during a communication session.
- the communication session may be an audio communication session or may be an audio/video communication session.
- the term communication bridge is used herein as a generic term that encompasses conventional audio-only bridges as well as bridges capable of handling both audio and video data.
- although parts of the description may refer to audio, the invention is not limited to audio-only bridges, as the same techniques may be used to handle audio on a multi-party audio-video communication session.
- Fig. 1 shows an example communication network over which a multi-party communication session may be established.
- the network 10 includes an enhanced communication bridge 12, an embodiment of which is described below in connection with Fig. 3.
- People may connect to the communication bridge 12 using many different access technologies. Since these connection technologies have different characteristics, according to an embodiment of the invention, the enhanced communication bridge is able to determine the context associated with the call itself as well as with each participant. The enhanced communication bridge will use the context information to adjust the audio processing for that particular participant in view of Quality of Experience and business metrics. This enables the communication bridge to adjust the processing that is applied to each of the participant audio streams, so that the output audio is consistent with expectations for the type of communication session.
- the communication bridge will use the business factors to conduct capacity vs. quality tradeoffs that minimize Operational Expenses (OpEx) and to determine which processing makes sense, from a business standpoint, so that revenue generation, key user quality of experience, and processing resources can be optimized by the communication bridge.
- a person on a communication session may be a client or customer who the other participants may want to hear during the call.
- the communication bridge may preferentially select the audio stream from that person to be included as one of the mixed output audio streams to enable the person to be heard.
- the classification of participants may be included as part of the business factors to enable different classifications to be provided to participants depending on the type of communication session.
- the communication bridge may deploy more processing to ensure high quality of experience.
- the communication bridge may bias the customer's line for best quality, and to ensure that the customer is able to break into the conversation easily. For bridges that don't generate revenue directly, use of the processing elements of the communication bridge may be optimized to ensure that the bridge can support as many simultaneous calls and users as possible.
- Fig. 1 shows several example access technologies that may be used to connect to a communication session.
- a person may talk on a cellular telephone 14 via a cellular access network, e.g. via base transceiver station 16, to join a communication session hosted by the communication bridge.
- another user may have a wireless IP phone 18 that the user may use to connect to a communication session via a wireless access point 20.
- Other users may have soft telephony clients loaded onto their laptop or desktop computers, PDAs, or other computing devices 22. These users may connect to a communication session over the Internet via gateway 24.
- Still other users may join the communication session from user equipment 26 (IP phone, soft client, etc.) connected to an enterprise network.
- where the communication bridge is located external to the enterprise network, the user may connect to the communication session via an enterprise gateway 28.
- Many other ways of connecting to the bridge may exist as well, or may be developed over time, and the selection shown in Fig. 1 is not intended to be limiting.
- the communication bridge determines the context information associated with each participant and uses the context information to process the signals from that participant as well as signals going to that participant. This enables the communication bridge to adapt to the particular way that the user has connected to the communication session to increase the clarity of the resultant mixed audio output by the communication bridge.
- Fig. 2A shows an example of how a communication bridge can operate to enable multiple people to talk to each other during a communication session.
- the communication bridge will receive input audio from each of the participants that are connecting to a particular communication session.
- a communication session may have hundreds of participants and the invention is not limited to this particular example.
- the communication bridge will select a subset of the inputs to be mixed together and presented to the participants.
- the communication bridge has selected the input from participants A, B, and E to be mixed together and provided as output audio on the communication session. Accordingly, each non-active participant will be provided with an output audio stream including the mixed input from participants A, B, and E.
- the active participants receive a mix that does not include their own voice, so A will receive B and E mixed, B will receive A and E mixed and E will receive A and B mixed.
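- The select-then-mix behaviour described for Fig. 2A can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names `select_active` and `mix_minus`, the use of a per-participant loudness level as the selection criterion, and the plain sample-by-sample summation are all assumptions made for the example.

```python
from typing import Dict, List


def select_active(levels: Dict[str, float], max_active: int = 3) -> List[str]:
    """Pick the loudest `max_active` participants (ties broken by name)."""
    ranked = sorted(levels, key=lambda name: (-levels[name], name))
    return sorted(ranked[:max_active])


def mix_minus(frames: Dict[str, List[float]], active: List[str]) -> Dict[str, List[float]]:
    """Build one output frame per participant, leaving out the listener's own audio."""
    frame_len = len(next(iter(frames.values())))
    outputs = {}
    for listener in frames:
        # Active talkers hear the other active talkers; everyone else hears all of them.
        sources = [name for name in active if name != listener]
        outputs[listener] = [
            sum(frames[src][i] for src in sources) for i in range(frame_len)
        ]
    return outputs
```

- With participants A, B, and E selected, a non-active participant such as C receives A+B+E, while A receives only B+E.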
- Fig. 2B shows another type of communication bridge in which the selection function is performed centrally by the communication bridge, but in which the mixing occurs in a distributed manner.
- the communication bridge 12 will determine which participants should be heard on the communication session and will output multiple voice streams to each of the participants.
- the communication bridge has selected participants A, B, and E to be heard on the communication session. Accordingly, the bridge has output voice stream A, voice stream B, and voice stream E, to each of the participants.
- the participants have a local mixing function that will mix these input voice streams so that the users may listen to the mixed audio.
- the aspects of the invention described herein may be applied to either type of bridge.
- Fig. 3 shows an example enhanced communication bridge 12 according to an embodiment of the invention.
- users connect to the communication bridge 12 via user devices 30A-30F.
- Each user may connect to the communication bridge using the same user device or may use a different user device. In general, it may be expected that users will connect to the communication bridge using whatever type of user device is convenient and available to that particular user.
- the communication bridge has an application interface 32 that users will interact with when initiating a communication session, joining a communication session, during the communication session, and optionally in connection with leaving a communication session.
- the communication bridge may be accessed by dialing a particular telephone number.
- the application interface may ask the user for a conference ID number, security access code, or other similar information.
- the application interface has an interactive voice response and/or DTMF signaling module that enables the user to interact with an automated system to initiate, join, modify, or terminate a communication session.
- the application interface enables the users to interact with the communication bridge and also enables the communication bridge to negotiate with the user device to determine how the user device will implement the communication session. For example, the application interface may implement control and signaling to select a vocoder to be used by the user device for the communication session, and to adjust the rate at which the user device and communication session communicate. Other features of the underlying connection may likewise be negotiated when the user device connects to the communication bridge.
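- One way the vocoder negotiation described above could work is for the bridge to walk its own preference list and pick the first codec the device also supports. The sketch below assumes that policy; the function name `negotiate_vocoder` and the codec labels used in the usage note are illustrative and do not come from the application.

```python
from typing import Optional, Sequence


def negotiate_vocoder(bridge_preferences: Sequence[str],
                      device_supported: Sequence[str]) -> Optional[str]:
    """Return the bridge's most preferred codec that the device supports, if any."""
    supported = set(device_supported)
    for codec in bridge_preferences:
        if codec in supported:
            return codec
    return None  # no common codec; the caller must fall back or reject the connection
```

- For example, `negotiate_vocoder(["G.722", "G.711", "AMR"], ["AMR", "G.711"])` selects `"G.711"`, because it is the highest-ranked entry on the bridge's list that the device also offers.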
- the API may instruct the user to take corrective action to improve signals being generated by the user device.
- the Bridge API is able to send information back to the participants.
- the API can transmit a message to the end user suggesting a corrective action to be taken by the end user that may enable the end user to help improve audio quality on the communication session.
- the API may instruct a participant on a noisy connection to mute their microphone to reduce the amount of noise on the conference call.
- this may be implemented by the API directly controlling the noisy participant's device on the noisy participant's behalf.
- the API may also remotely control and repair subscriber client problems such as audio and microphone gain. Where the participant is using a soft client implemented on a computer, for example, and the participant is talking on a headset, a separate microphone on the person's laptop may be simultaneously picking up the person's voice as well as picking up other ambient noise.
- the API can disable the laptop microphone or, alternatively, use the signal from the laptop for noise profiling and cancellation.
- the API can detect the audio level provided by a participant and signal the participant to talk louder or more softly, or to move the microphone away from a noise source to improve signal-to-noise ratio.
- the API can interact directly with the end device to adjust the signal level provided by the end device automatically. This may enable the API to mute the end device or adjust the audio gain at the end device to amplify the participant's voice if the participant is speaking softly, or decrease the amplification level if the participant is speaking loudly, to moderate the overall volume of each of the participants on the communication session.
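- The automatic level adjustment can be illustrated with a simple gain calculation. This is a sketch under stated assumptions: the bridge measures the RMS level of a participant's frame and asks the device for a gain that moves it toward a target, clamped to a safe range. The names `rms` and `gain_correction` and the default target and limits are hypothetical.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a frame of samples in the range [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_correction(samples: Sequence[float], target_rms: float = 0.1,
                    min_gain: float = 0.25, max_gain: float = 4.0) -> float:
    """Linear gain that would bring the frame to `target_rms`, clamped to [min_gain, max_gain]."""
    level = rms(samples)
    if level == 0.0:
        return 1.0  # silence: leave the gain unchanged rather than amplify nothing
    return max(min_gain, min(max_gain, target_rms / level))
```

- A soft talker yields a gain above 1.0 and a loud talker a gain below 1.0; the clamp keeps a near-silent line from being amplified without bound.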
- the API may also take other corrective action or implement other processing actions on the end user device.
- the context interface and inference engine 46 may determine processing to be performed on the signals provided from the user device 30A or on the signals provided to the user device 30A and instruct the user device 30A to implement all or part of these processes.
- codec selection, echo processing, noise cancellation, and other pre and post processing functions may be implemented at the user device under the instruction of the API.
- the user may also interact with the application interface to select particular features during the communication session.
- the user may have a local mute control or, alternatively, the communication bridge may provide mute control.
- the application interface may enable the users to control whether their audio stream is selected to be output on the conference call.
- the application interface may also enable the user to select features for the call.
- the application interface may also provide additional information to the participants during the communication session.
- the application interface may provide information about the current talker so that participants can follow along with who is speaking at a particular point in time.
- the application interface may also enable users to specify the volume of the audio on the communication session as a whole and, optionally, on a per-speaker basis.
- the bridge may assign locations of particular individuals on the call and mix the audio so that, to other participants, the sound appears to originate from the direction in which the individual is sitting.
- three dimensional audio mixing may include using phase and delay audio processing to enable a user to have a spatial perception that the audio is originating to the left/right, or from the front/back.
- Different ways of implementing three dimensional audio have been developed and may be developed in the future, and the audio bridge may use these processing techniques to adjust the location of the participants for each user of the audio bridge.
- the directionality of the audio may help people determine who is talking on the communication session.
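- A delay-only version of the left/right placement can be sketched as below. It assumes the simplest possible model, in which the ear farther from the talker hears the same signal a few samples later; real three dimensional audio processing would also shape level and phase. The function name `pan_with_delay` and the `azimuth` convention (-1.0 for fully left, +1.0 for fully right) are assumptions for the example.

```python
from typing import List, Sequence, Tuple


def pan_with_delay(mono: Sequence[float], azimuth: float,
                   max_delay_samples: int = 30) -> Tuple[List[float], List[float]]:
    """Return (left, right) channels; the far ear receives a delayed copy of the signal."""
    delay = int(round(abs(azimuth) * max_delay_samples))
    direct = list(mono) + [0.0] * delay    # near ear: signal first, padded to equal length
    delayed = [0.0] * delay + list(mono)   # far ear: same signal, arriving `delay` samples later
    if azimuth >= 0:
        return delayed, direct  # talker on the right: the left ear is the far ear
    return direct, delayed      # talker on the left: the right ear is the far ear
```

- Giving each participant a different `azimuth` per listener is one way the bridge could place talkers at distinct apparent positions.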
- the application interface may also enable the user device to provide information to the conference bridge that may then be passed to the context interface to enable the conference bridge to know more about the overall context of the communication session as well as the particular context of this user on the communication session.
- the application interface may detect the type of device connecting to the communication session, the type of client implemented on the device, and determine the types of features implemented on the device, such as whether the device will be employing any noise cancellation techniques during the communication session.
- the application interface may also detect the type of network connection (e.g. cellular, wireless IP, IP, POTS), and whether the caller is connecting from a residential connection or business connection.
- the application interface may also receive input from the user as to whether the call is being implemented outdoors or indoors, and may listen to the background noise levels during the initial connection (when the user is logging into the communication session) to determine the quality of the service being provided to the user and optionally the background noise level on the connection.
- Information collected by the application interface will be, in one embodiment, passed to a context interface 46. Although much of the context information may be collected by the application interface, the invention is not limited in this manner as other ways of collecting information for use by the context interface and inference engine may be implemented as well.
- the context interface 46 is discussed in greater detail below.
- the communication bridge also has an audio bridge 34 that implements communication sessions.
- the media path is illustrated using thick lines and the flow of control information is shown using thin lines.
- the audio bridge includes a control 36, an audio mixer 38, and a selector 40.
- the control 36 interacts with the application interface 32 to selectively admit participants to one or more communication sessions being implemented by the audio bridge.
- the audio mixer performs the function of mixing signals to be transmitted to the participants on the communication sessions.
- the selector 40 selects one, two, three, or other small number of audio inputs to be mixed by the audio mixer and output on the communication session.
- the application interface 32 will instruct the control 36 to add the user to a particular communication session that is to be started by the audio bridge 34 or to add the user to an already extant communication session being hosted by the audio bridge 34.
- the selector 40 will start to receive input from the user and, if appropriate, select audio by that user to be mixed into the output stream on the communication session.
- the audio mixer will also provide output audio from the communication session to the user once the user joins the communication session.
- the communication bridge 12 includes an audio enhancer 42 that processes each user's audio independently according to context information 44 received from a context interface and inference engine 46.
- the audio enhancer includes a control 48 that programs an audio processor 50 to apply particular audio processing algorithms to the signals selected by the selector 40.
- Each channel provided by the selector 40 to the audio processor will be processed individually using separate audio processing algorithms so that the individual channel may be optimized according to the context associated with that particular channel.
- although the selector 40 selects audio channels for processing by the audio processor 50, the invention is not limited in this regard, as the audio processor may implement the selecting function if desired.
- not all input audio channels will be mixed together by the audio mixer 38 for output to the users on the communication session.
- the selection process (whether implemented by selector 40 or audio processor 50) should be performed before audio processing so that only the relevant audio channels that will contribute to the communication session will be processed by the audio processor 50.
- a larger subset of the input audio channels will undergo some audio processing prior to the selection process. For example audio inputs from channels that are detected to have noise or gain issues may be preprocessed prior to the selection process in order to optimize the selection.
- the signals may undergo gain adjustment prior to selection to make it easier for a person who naturally speaks softly to break into the conversation being hosted by the communication bridge.
- the communication bridge may include a preprocessor 41 configured to receive the input audio and process the signals before the signals are input to the selector.
- the type of processing to be performed by the preprocessor may be controlled by the audio enhancer 42 to enable pre-processing of the audio signals to be coordinated with post-processing.
- the audio enhancer 42 may also provide input to the selector to help the selector determine which signals should preferentially be selected to be output on the communication session.
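- The enhancer's input to the selector could take the form of a per-participant weight applied to the measured level before ranking, so that a soft-spoken or high-priority participant can break in more easily. The sketch below assumes that form; `select_with_bias` and the multiplicative weighting are illustrative choices, not details given in the application.

```python
from typing import Dict, List


def select_with_bias(levels: Dict[str, float], bias: Dict[str, float],
                     max_active: int = 3) -> List[str]:
    """Rank participants by level times bias weight and keep the top `max_active`."""
    scores = {name: levels[name] * bias.get(name, 1.0) for name in levels}
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:max_active]
```

- With levels `{"A": 0.5, "B": 0.6, "C": 0.4, "customer": 0.35}` and two slots, the unbiased selection is B and A; a weight of 2.0 on `customer` moves that participant to the top of the ranking.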
- the pre-processor 41, selector 40, audio processor 50, and audio mixer 38 are shown as separate boxes to help explain the different functions that may be implemented in connection with the audio signals.
- the invention is not limited in this manner as, optionally, several or all of these functions may be combined into a single FPGA or other programmable circuitry.
- the signals may be input to a single chip that performs pre-processing, selection, audio processing, and audio mixing to output a plurality of individually mixed audio channels to the several participants to the communication session.
- Many ways of implementing the communication bridge are possible, including software running on dedicated processors optimized for signal processing or on general purpose microprocessors.
- the context interface and inference engine 46 provides context information 44 to the audio enhancer 42 to instruct the audio enhancer as to the type of processing that should be performed on particular channels and, optionally, parameters that should be used in connection with processing particular audio channels.
- the context interface collects information about each participant in the communication session. For example, in the illustrated embodiment the context interface and inference engine 46 receives input about the voice encoder (vocoder) 52 in use by the participant, the type of network connection 54, the type of device 56, and the communication client in use by the device 58. These parameters enable the context interface and inference engine 46 to learn about physical characteristics of the connection and device that may affect how signals provided by the user device should be processed in the audio processor.
- the context interface also collects social context information about the communication session as a whole as well as about the user's participation in the communication session.
- the context interface and inference engine 46 may receive input from the user's calendar 60 to learn the social context of the communication session.
- This enables the communication bridge to implement different processing for business calls than it does for personal calls.
- the organization and the person's role in the organization 62 may impact the quality of service provided by the bridge on the communication session.
- priority may be given to particular participants, such as customers on a sales conference call, to increase the quality of experience for those participants, make it easier for a particular participant to break into the conversation, or otherwise adjust the manner in which the participant is treated during the communication session.
- the location that the person is calling from may also be relevant to the communication bridge 12. For example, if the person is calling from outside, the amount of ambient background noise may be higher than if the person is calling from a quieter indoor location. Similarly, if the person is calling from home rather than from an office the background noise characteristics on the audio provided by that person may be different.
- the audio bridge may also look at the service quality 65 to determine how to process audio received from a particular user. For example, if the user is calling from home and has a relatively static-riddled connection, the communication bridge may want to filter the signal to try to eliminate some of the static from the connection. Other service quality factors may be determined as well.
- the audio bridge may also use Session Priority 66 with Business Factors rules 70 to determine how to allocate the resources of the bridge to optimize the quality, costs and capacity. For example, conference calls with customers may take priority for compute resources over internal conference calls in a business environment. In a conference bridge running as a service, priority may be given to customers with premium subscriptions versus others paying lower fees.
- the audio bridge may keep a record of optimizations, inferences and connection issues and context in the context history 80.
- the context history can be used as the starting point settings for audio processing. For example, a user who consistently has high gain can have gain reduction applied automatically when they call in to the bridge.
- the other context inputs such as user device, connection type, codec, etc. can be kept in the context history. To optimize storage, the context of only the most frequent and high priority users may be stored.
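- A context history that seeds processing settings and keeps only the most frequent callers might look like the sketch below. The class name `ContextHistory`, the `gain_db` setting, and the evict-the-least-frequent policy are assumptions for illustration; the application does not specify a storage scheme, and priority-based retention is omitted here for brevity.

```python
from collections import Counter
from typing import Dict, Optional


class ContextHistory:
    """Remember per-user processing settings for a bounded number of users."""

    def __init__(self, capacity: int = 2) -> None:
        self.capacity = capacity
        self.settings: Dict[str, Dict[str, float]] = {}
        self.call_counts: Counter = Counter()

    def record(self, user: str, settings: Dict[str, float]) -> None:
        """Store the settings that worked on this call; evict the least frequent caller if full."""
        self.call_counts[user] += 1
        self.settings[user] = dict(settings)
        if len(self.settings) > self.capacity:
            victim = min(self.settings, key=lambda u: (self.call_counts[u], u))
            del self.settings[victim]

    def initial_settings(self, user: str,
                         default: Optional[Dict[str, float]] = None) -> Dict[str, float]:
        """Starting-point settings for a user who is joining a new session."""
        fallback = default if default is not None else {"gain_db": 0.0}
        return dict(self.settings.get(user, fallback))
```

- A user recorded with `{"gain_db": -6.0}` on earlier calls starts the next session with that reduction already applied, while an unknown or evicted user starts from the default.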
- the audio enhancer receives input from the context interface and inference engine 46 and combines that with quality of experience factors 68 and business factors 70 to determine how to process the signals in audio processor 50.
- Quality of experience factors 68 are factors that describe user perception of communication session properties.
- echo cancellation or suppression may be important to implement to prevent excessive echo from interfering with sound fidelity.
- a quality of experience factor for echo suppression may specify an optimal Total Echo Loudness Ratio (TELR) value as well as an acceptable TELR value. These TELR values may depend on the particular context of the conference call and other factors.
- a business conference call may be less tolerant of echo and, hence, a first set of optimal and acceptable TELR values may be specified for business conference calls. Teenagers may have a different tolerance for echo and, hence, a second set of optimal and acceptable TELR values may be specified for this class of users. Similarly, relatives talking amongst themselves to discuss family matters may have a different tolerance for echo and, hence, a third set of optimal and acceptable TELR values may be specified for this class of users.
- optimal and acceptable thresholds may be specified for other audio properties such as noise levels, overall loudness values, and other similar properties as well.
- the quality of experience factors thus give the audio enhancer target values to prevent the audio enhancer from over-processing signals to increase a particular property (e.g. echo cancellation) where doing so would not perceptibly increase the overall sound quality to the end users but may take unnecessary compute resources.
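- The role of the optimal and acceptable targets can be sketched as a three-way decision. The example assumes that a higher TELR means less audible echo, and the decibel values, call-type labels, and the name `echo_processing_level` are hypothetical placeholders rather than figures from the application.

```python
# Hypothetical per-context targets in dB; real values would come from the QoE factors 68.
QOE_TARGETS = {
    "business": {"optimal": 65.0, "acceptable": 55.0},
    "family": {"optimal": 55.0, "acceptable": 45.0},
}


def echo_processing_level(measured_telr_db: float, call_type: str) -> str:
    """Decide how much echo processing a channel needs for the given call context."""
    targets = QOE_TARGETS[call_type]
    if measured_telr_db >= targets["optimal"]:
        return "none"   # already at target: more processing would only waste compute
    if measured_telr_db >= targets["acceptable"]:
        return "light"  # tolerable, so improve only with inexpensive processing
    return "full"       # below the acceptable threshold: apply full echo control
```

- The same measured value can lead to different decisions in different contexts: 50 dB maps to `"full"` for a business call but only `"light"` for a family call under these placeholder targets.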
- the business factors enable cost and session priority to be factored into determining how signals should be processed by the communication bridge.
- Particular processes may be computationally intensive and, hence, occupy a greater percentage of the processing capabilities of the communication bridge. Since the communication bridge has finite computational resources, implementing computationally intensive processes limits the number of communication sessions that the communication bridge can handle. Where the owner of the communication bridge is paid based on the number of communication sessions, implementing computationally intensive processes may affect the revenue generated by the communication bridge.
- the business factors enable business decisions to be implemented so that the communication bridge is able to optimize not only the quality of experience for participants on the communication session, but is also able to optimize the amount of revenue the bridge is able to generate on the network.
- the business factors may enable the communication bridge to implement higher quality processing for communication sessions while the bridge is lightly loaded, and then substitute lesser quality processing for less important communication sessions as the bridge becomes more congested. This enables the bridge to adjust to the load conditions to maximize revenue by adjusting how the audio enhancer processes extant communication sessions.
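- The load-dependent substitution described above can be sketched as a small policy function. The utilisation threshold, the integer priority scale, and the name `choose_processing_tier` are assumptions for illustration only.

```python
def choose_processing_tier(load: float, session_priority: int,
                           congestion_threshold: float = 0.8,
                           high_priority: int = 5) -> str:
    """Pick a processing tier from bridge utilisation (0.0 to 1.0) and session priority."""
    if load < congestion_threshold:
        return "high_quality"  # lightly loaded: every session gets the better processing
    if session_priority >= high_priority:
        return "high_quality"  # congested, but this session is important enough to keep it
    return "reduced"           # congested: less important sessions get cheaper processing
```

- Re-evaluating this choice as load changes lets extant sessions move between tiers without being dropped.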
- the context interface and inference engine receives these types of inputs and possibly other inputs and determines appropriate audio processing algorithms for the signal. This enables the conference bridge to enhance conference user experience by providing superior audio performance, tunable to the social context and the individual participants, to increase collaboration effectiveness by integrating business intelligence over a traditional audio bridge.
- the audio processor may implement many different types of processing techniques for particular individual participants, to optimize the sound quality for that participant on the communication session.
- One example type of processing may be to determine whether a linear or a non-linear approach to echo control should be implemented. In particular, if a linear approach to echo processing is selected, an echo canceller may be used, whereas a non-linear approach would require the use of echo suppression rather than echo cancellation.
- Echo cancellation is a process by which the audio processor 50 may learn which part of a received signal is the actual signal and which part is the echo. An adaptive filter may then be built to subtract the echo from the signal. This enables the echo to be subtracted or cancelled from the signal so that, in theory, the echo may be removed from the signal with minimal impact on the non-echo signal.
- Echo suppression does not remove only the echo portion of the signal but rather can block the entire reverse signal. Since echo travels back towards the speaker, one common approach is to block audio in the reverse direction to reduce the perception of echo on the line. While this is simpler than echo cancellation, since an adaptive filter does not need to be created, it prevents both people from talking at the same time. In particular, when a first person is talking and a second person starts to talk, the echo suppressor may treat the audio traveling from the second person toward the first person as echo and suppress it. Accordingly, with echo suppression it is difficult to have a full duplex conversation.
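The adaptive-filter idea behind echo cancellation can be illustrated with a normalized least-mean-squares (NLMS) filter. The source does not name a specific adaptation algorithm, so NLMS is an assumption chosen here for brevity, and the echo path, step size, and signal in the simulation are invented for the example.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic.

    Returns the residual signal and the learned filter weights.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what remains after removing the echo estimate
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


def simulate(n=2000, seed=0):
    """Build a far-end signal and a microphone signal containing only its echo."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    echo_path = [0.5, 0.3, -0.2]  # hypothetical echo impulse response
    mic = [
        sum(h * far_end[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
        for i in range(n)
    ]
    return far_end, mic


far_end, mic = simulate()
residual, weights = nlms_echo_cancel(far_end, mic)
echo_energy = sum(v * v for v in mic[-200:])
residual_energy = sum(v * v for v in residual[-200:])
print(residual_energy < 0.01 * echo_energy)
```

The filter learns the echo path from the far-end reference, so near-end speech would pass through in the residual. A suppressor, by contrast, would attenuate the whole return path.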
- vocoders such as G.729, EVRC and variants, AMR, G.723.1, and G.722.2 are typically non-linear and, accordingly, echo suppression may have to be used where one of these vocoders is in use by a communication session participant.
- mobile users may be using a vocoder such as Enhanced Variable Rate Codec (EVRC) or Adaptive Multi-Rate Compression (AMR).
- business users often use vocoders such as G.702/G.711/G.722.
- home based residential users often will use a G.729 or G.711 vocoder. Accordingly the type of network connection may impact the particular vocoder in use by that person.
- network impairments may also indicate a need to deploy non-linear echo suppression.
- Example network impairments that may be detected include packet loss and jitter, which may be further characterized according to patterns, rate of spikes, burst size, frequency of occurrence, etc. Measured jitter characteristics such as the rate of spikes may indicate frequent changes in network jitter. If the packet loss rate exceeds the rate below which a standard packet loss concealment algorithm operates with only minor artifacts, then echo suppression should be used instead of echo cancellation.
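The choice between echo cancellation and echo suppression described above can be sketched as a simple rule. The function name, the 3% packet loss limit, and the exact membership of the vocoder set are illustrative assumptions. The set lists only an illustrative subset of the non-linear vocoders named in the text.

```python
# Illustrative subset of vocoders the text describes as typically non-linear.
NONLINEAR_VOCODERS = {"EVRC", "AMR", "G.723.1", "G.722.2"}


def choose_echo_control(vocoder: str, packet_loss_rate: float,
                        plc_loss_limit: float = 0.03) -> str:
    """Return "cancellation" (linear) or "suppression" (non-linear).

    plc_loss_limit is a hypothetical stand-in for the loss rate below which
    packet loss concealment leaves only minor artifacts.
    """
    if vocoder.upper() in NONLINEAR_VOCODERS:
        return "suppression"  # a non-linear path defeats a linear adaptive filter
    if packet_loss_rate > plc_loss_limit:
        return "suppression"  # network impairment makes the echo path unreliable
    return "cancellation"


print(choose_echo_control("G.711", 0.0))
print(choose_echo_control("AMR", 0.0))
```

In a bridge, the packet loss figure would presumably come from measurements on the participant's connection rather than from a constant.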
- the audio enhancer can also implement noise control on a per-user basis. Where there is background noise on a channel, it is desirable to implement some form of noise suppression to reduce the background noise. However, it is important to not be too aggressive in suppressing noise as the noise suppression may also destroy the signal that is of interest on the channel.
- the level and type of noise suppression can be adjusted depending on the particular characteristics of the communication channel with the particular user and the location of the user and context info such as type of conversation (business/casual).
- the noise reduction engine may be implemented by the audio enhancer, although the invention is not limited in this manner.
- the level and type of noise suppression can be adjusted based on the vocoder type between the talker and noise reduction engine.
- There are two general types of vocoders: waveform vocoders, which preserve the original waveform, and parametric vocoders, which decompose the original signal into components and then compress the components individually. If a waveform vocoder is used, such as G.711 or G.726, then the noise suppression algorithm can be more aggressive. If a parametric vocoder is used, then, depending on the compression rate, noise suppression should be less aggressive.
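The vocoder-dependent noise suppression rule above might be expressed as a cap on attenuation. Everything numeric here is hypothetical, as is the function name. The sketch also assumes that a lower bit rate (heavier compression) calls for gentler suppression, which is one reading of "depending on the compression rate" rather than something the text states.

```python
# Waveform vocoders named in the text.
WAVEFORM_VOCODERS = {"G.711", "G.726"}


def max_noise_attenuation_db(vocoder: str, bitrate_kbps: float) -> float:
    """Return an illustrative upper bound on noise attenuation in dB."""
    if vocoder in WAVEFORM_VOCODERS:
        return 18.0  # waveform preserved, so suppression can be more aggressive
    # Parametric vocoder: be less aggressive, and more cautious still when the
    # signal is heavily compressed (assumed here to mean a low bit rate).
    return 12.0 if bitrate_kbps >= 12.0 else 6.0


print(max_noise_attenuation_db("G.711", 64.0))
print(max_noise_attenuation_db("AMR", 4.75))
```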
- noise floor measurements may be used to determine the ratio of the noise to the signal. If the ambient noise floor is above a particular threshold, a notice may be sent to the participant via the application interface 32 to enable the participant to modify conditions in the area of the user device to help reduce the noise level.
- the participant may be on speaker phone and the microphone of the device may be located too close to a noise source such as a computer fan, projector fan, or other type of noise source. The participant may not be aware that the fan is causing significant noise on the call. Providing the participant with a notification may enable the participant to move the phone or switch to a headset rather than a speaker phone.
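The noise floor check that triggers a participant notification can be sketched as follows. The estimator (the quietest frame's RMS level), the 160-sample frame length, and the -40 dBFS threshold are all assumptions made for illustration. The source does not describe how the noise floor is measured.

```python
import math


def noise_floor_dbfs(samples, frame_len=160):
    """Estimate the noise floor as the RMS level of the quietest frame, in dBFS.

    Samples are assumed to be floats in the range -1.0 to 1.0.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("need at least one full frame of samples")
    min_rms = min(math.sqrt(sum(s * s for s in f) / frame_len) for f in frames)
    return 20.0 * math.log10(max(min_rms, 1e-10))  # clamp to avoid log of zero


def should_notify(samples, threshold_dbfs=-40.0):
    """Return True when the estimated noise floor exceeds the threshold."""
    return noise_floor_dbfs(samples) > threshold_dbfs


quiet_line = [0.001] * 1600  # constant low-level background
noisy_line = [0.1] * 1600    # e.g. a fan close to the microphone
print(should_notify(quiet_line), should_notify(noisy_line))
```

A positive result would be the cue to send the notice through the application interface, for example suggesting that the participant move the phone or switch to a headset.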
- the context of each participant is collected and processed by the context interface and inference engine.
- the context may include the participant's name, role in the company, and the conversation type (business, casual, relatives, teens), which help the context interface and inference engine to determine the required quality of the audio on the communication session.
- a business conference call may need to be supported differently, and have different audio qualities, than a conference call between a group of teenaged people or a family conference call between a group of relatives.
- the reason behind the communication session may be used to determine required quality of experience thresholds based on the expectations associated with the social context. These quality of experience factors may then be used to adjust processing of the call, in terms of echo suppression, noise reduction, volume balancing, etc., that is implemented by the audio processor on audio streams to be mixed together on the communication session.
- the communication bridge uses the context information available about the participants and the context of the call, as well as physical information about the type of device, type of network connection, and other properties associated with how the participants are connected to the communication bridge, to determine whether improvement to some factor that affects quality of experience is possible. For example, the communication bridge may determine whether it is possible to improve echo cancellation, noise reduction, loudness ratios, or another factor. The communication bridge may then determine whether the available mechanism will improve the factor sufficiently to alter the end user quality of experience. If not, there is no reason to apply the available mechanism. Even if the communication bridge can use the available mechanism to improve the end user quality of experience, the communication bridge may look at the social context associated with the communication session to determine whether doing so is worthwhile from a business perspective.
- Mechanism for Dynamic Coordination of Signal Processing Network Equipment (MDCSPNE) is a draft ITU-T Recommendation to coordinate signal processing features for voice quality enhancement.
- application of different voice quality enhancements at different places on the network may cause undesirable degradations due to unintended interference between the processes.
- these enhancements may be coordinated to avoid this type of unintended interference and the attendant potential audio quality degradation.
- the communication bridge includes a service assurance interface 72 that receives input from the network as to the state of the network and through which the communication bridge may take remedial actions.
- the service assurance interface also provides the state of the bridge and the audio functioning to the service assurance system to enable the service assurance system to know how the bridge is functioning over time.
- the service assurance system may provide operational information as to the state of the network to enable the communication bridge to learn how the network is operating. For example, as noted above, the packet loss rate and jitter characteristics of the network may help the communication bridge to determine which type of echo processing to use.
- the service assurance interface 72 can obtain information as to the operational state of the network to help the context interface and inference engine determine these parameters when implementing echo processing for particular signals.
- control logic may be implemented as a set of program instructions that are stored in a computer readable memory within the network element and executed on a microprocessor.
- a programmable logic device such as a Field Programmable Gate Array (FPGA) or microprocessor, or any other device including any combination thereof.
- Programmable logic can be fixed temporarily or permanently in a tangible medium such as a read-only memory chip, a computer memory, a disk, or other storage medium. All such embodiments are intended to fall within the scope of the present invention.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
- Mobile Radio Communication Systems (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2758194A CA2758194A1 (en) | 2009-04-09 | 2010-04-09 | Enhanced communication bridge |
AU2010234200A AU2010234200A1 (en) | 2009-04-09 | 2010-04-09 | Enhanced communication bridge |
EP10761160.0A EP2417756A4 (en) | 2009-04-09 | 2010-04-09 | Enhanced communication bridge |
CN201080025387.6A CN102461139B (en) | 2009-04-09 | 2010-04-09 | Enhanced communication bridge |
JP2012503840A JP5523551B2 (en) | 2009-04-09 | 2010-04-09 | Extended communication bridge |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/420,976 US9191234B2 (en) | 2009-04-09 | 2009-04-09 | Enhanced communication bridge |
US12/420,976 | 2009-04-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010115285A1 (en) | 2010-10-14 |
Family
ID=42934312
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CA2010/000534 WO2010115285A1 (en) | 2009-04-09 | 2010-04-09 | Enhanced communication bridge |
Country Status (7)
Country | Link |
---|---|
US (1) | US9191234B2 (en) |
EP (1) | EP2417756A4 (en) |
JP (1) | JP5523551B2 (en) |
CN (1) | CN102461139B (en) |
AU (1) | AU2010234200A1 (en) |
CA (1) | CA2758194A1 (en) |
WO (1) | WO2010115285A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9191234B2 (en) | 2009-04-09 | 2015-11-17 | Rpx Clearinghouse Llc | Enhanced communication bridge |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8060563B2 (en) * | 2008-12-29 | 2011-11-15 | Nortel Networks Limited | Collaboration agent |
US20130301415A1 (en) * | 2011-09-29 | 2013-11-14 | Avvasi Inc. | Methods and systems for managing media traffic based on network conditions |
CN103475633A (en) * | 2012-08-06 | 2013-12-25 | 苏州沃通信息科技有限公司 | Voice and video communication engine and extensible communication service framework based such engine |
US9293148B2 (en) * | 2012-10-11 | 2016-03-22 | International Business Machines Corporation | Reducing noise in a shared media session |
KR20140067512A (en) * | 2012-11-26 | 2014-06-05 | 삼성전자주식회사 | Signal processing apparatus and signal processing method thereof |
US9602571B2 (en) | 2013-10-29 | 2017-03-21 | International Business Machines Corporation | Codec selection and usage for improved VoIP call quality |
US20150186317A1 (en) * | 2014-01-02 | 2015-07-02 | Lsi Corporation | Method and apparatus for detecting the initiator/target orientation of a smart bridge |
US20150327035A1 (en) * | 2014-05-12 | 2015-11-12 | Intel Corporation | Far-end context dependent pre-processing |
US20170134447A1 (en) * | 2015-11-05 | 2017-05-11 | International Business Machines Corporation | Real time voice participate self-assessment |
US10791026B2 (en) * | 2016-11-10 | 2020-09-29 | Ciena Corporation | Systems and methods for adaptive over-the-top content quality of experience optimization |
US10862771B2 (en) | 2016-11-10 | 2020-12-08 | Ciena Corporation | Adaptive systems and methods enhancing service quality of experience |
CN108874355B (en) * | 2017-05-16 | 2021-07-27 | 宏碁股份有限公司 | Game platform and audio processing method thereof |
CN108882116A (en) * | 2017-05-16 | 2018-11-23 | 宏碁股份有限公司 | Audio Switch Assembly and its operating method |
CN108874354B (en) * | 2017-05-16 | 2021-07-23 | 宏碁股份有限公司 | Game platform and audio processing method thereof |
DE102019111365B4 (en) | 2019-05-02 | 2024-09-26 | Johannes Raschpichler | Method, computer program product, system and device for modifying acoustic interaction signals generated by at least one interaction partner with respect to an interaction goal |
CN113516991A (en) * | 2020-08-18 | 2021-10-19 | 腾讯科技(深圳)有限公司 | Audio playing and equipment management method and device based on group session |
US11381410B1 (en) * | 2021-03-19 | 2022-07-05 | Avaya Management L.P. | Dynamic media switching between different devices of same user based on quality of service and performance |
JP2022182019A (en) * | 2021-05-27 | 2022-12-08 | シャープ株式会社 | Conference system, conference method, and conference program |
US20240282324A1 (en) * | 2021-06-29 | 2024-08-22 | Hewlett-Packard Development Company, L.P. | Noise Removal on an Electronic Device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080160977A1 (en) * | 2006-12-27 | 2008-07-03 | Nokia Corporation | Teleconference group formation using context information |
US20080226051A1 (en) * | 2007-03-14 | 2008-09-18 | Microsoft Corporation | Techniques for managing a multimedia conference call |
Family Cites Families (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4475190A (en) * | 1982-05-27 | 1984-10-02 | At&T Bell Laboratories | Method and apparatus for controlling ports in a digital conference arrangement |
JP3027793B2 (en) | 1994-08-03 | 2000-04-04 | 日本電信電話株式会社 | Virtual space sharing device |
US6097820A (en) * | 1996-12-23 | 2000-08-01 | Lucent Technologies Inc. | System and method for suppressing noise in digitally represented voice signals |
DE19852091C1 (en) * | 1998-11-12 | 2000-05-25 | Deutsche Telekom Mobil | Method and device for improving the audio quality in a mobile radio network |
US6463414B1 (en) * | 1999-04-12 | 2002-10-08 | Conexant Systems, Inc. | Conference bridge processing of speech in a packet network environment |
US6556817B1 (en) * | 1999-12-13 | 2003-04-29 | Motorola, Inc. | Method and apparatus for selectively communicating in a wireless communication system based on varying time incremental costs of communication |
US7830824B2 (en) | 2000-03-01 | 2010-11-09 | Polycom, Inc. | System and method for providing reservationless third party meeting rooms |
CA2446085C (en) * | 2001-04-30 | 2010-04-27 | Octave Communications, Inc. | Audio conference platform with dynamic speech detection threshold |
US8223942B2 (en) | 2001-12-31 | 2012-07-17 | Polycom, Inc. | Conference endpoint requesting and receiving billing information from a conference bridge |
US20050276234A1 (en) * | 2004-06-09 | 2005-12-15 | Yemeng Feng | Method and architecture for efficiently delivering conferencing data in a distributed multipoint communication system |
ES2406942T3 (en) * | 2005-02-22 | 2013-06-10 | France Telecom | Procedure and information system of the participants in a telephone conversation |
US20060244813A1 (en) | 2005-04-29 | 2006-11-02 | Relan Sandeep K | System and method for video teleconferencing via a video bridge |
US20060288096A1 (en) * | 2005-06-17 | 2006-12-21 | Wai Yim | Integrated monitoring for network and local internet protocol traffic |
US20070117508A1 (en) | 2005-11-07 | 2007-05-24 | Jack Jachner | Conference presence based music-on-hold suppression system and method |
CN101313484B (en) * | 2005-11-21 | 2012-01-11 | 艾利森电话股份有限公司 | Method and apparatus for improving call quality |
US7668304B2 (en) | 2006-01-25 | 2010-02-23 | Avaya Inc. | Display hierarchy of participants during phone call |
US7738643B1 (en) * | 2006-06-29 | 2010-06-15 | At&T Corp. | Method for troubleshooting echo on teleconference bridge |
US9065667B2 (en) | 2006-09-05 | 2015-06-23 | Codian Limited | Viewing data as part of a video conference |
US8041017B2 (en) * | 2007-03-16 | 2011-10-18 | Alcatel Lucent | Emergency call service with automatic third party notification and/or bridging |
US7970603B2 (en) * | 2007-11-15 | 2011-06-28 | Lockheed Martin Corporation | Method and apparatus for managing speech decoders in a communication device |
US8310937B2 (en) * | 2008-05-28 | 2012-11-13 | Centurylink Intellectual Property Llc | Voice packet dynamic echo cancellation system |
US9191234B2 (en) | 2009-04-09 | 2015-11-17 | Rpx Clearinghouse Llc | Enhanced communication bridge |
2009
- 2009-04-09: US application US12/420,976, publication US9191234B2 (en), not active (Expired - Fee Related)
2010
- 2010-04-09: JP application JP2012503840A, publication JP5523551B2 (en), not active (Expired - Fee Related)
- 2010-04-09: EP application EP10761160.0A, publication EP2417756A4 (en), not active (Withdrawn)
- 2010-04-09: AU application AU2010234200A, publication AU2010234200A1 (en), not active (Abandoned)
- 2010-04-09: CN application CN201080025387.6A, publication CN102461139B (en), not active (Expired - Fee Related)
- 2010-04-09: CA application CA2758194A, publication CA2758194A1 (en), not active (Abandoned)
- 2010-04-09: WO application PCT/CA2010/000534, publication WO2010115285A1 (en), active (Application Filing)
Non-Patent Citations (1)
Title |
---|
See also references of EP2417756A4 * |
Also Published As
Publication number | Publication date |
---|---|
CN102461139B (en) | 2015-01-14 |
AU2010234200A1 (en) | 2011-11-03 |
EP2417756A1 (en) | 2012-02-15 |
US20100260074A1 (en) | 2010-10-14 |
JP5523551B2 (en) | 2014-06-18 |
CN102461139A (en) | 2012-05-16 |
EP2417756A4 (en) | 2014-06-18 |
CA2758194A1 (en) | 2010-10-14 |
JP2012523720A (en) | 2012-10-04 |
US9191234B2 (en) | 2015-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9191234B2 (en) | Enhanced communication bridge | |
US8433050B1 (en) | Optimizing conference quality with diverse codecs | |
US8218751B2 (en) | Method and apparatus for identifying and eliminating the source of background noise in multi-party teleconferences | |
US8670537B2 (en) | Adjusting audio volume in a conference call environment | |
US8462931B2 (en) | Monitoring signal path quality in a conference call | |
RU2398361C2 (en) | Intelligent method, audio limiting unit and system | |
US7689568B2 (en) | Communication system | |
US20110044474A1 (en) | System and Method for Adjusting an Audio Signal Volume Level Based on Whom is Speaking | |
US20060248210A1 (en) | Controlling video display mode in a video conferencing system | |
US8289362B2 (en) | Audio directionality control for a multi-display switched video conferencing system | |
US20130029649A1 (en) | Automatic Mute Detection | |
US20130097333A1 (en) | Methods and apparatuses for unified streaming communication | |
US7885396B2 (en) | Multiple simultaneously active telephone calls | |
US7986644B2 (en) | Multi-fidelity conferencing bridge | |
US8548146B2 (en) | Method and system to manage connections on a conference bridge | |
US9042535B2 (en) | Echo control optimization | |
US8830056B2 (en) | Intelligent music on hold | |
US7925503B2 (en) | Method and apparatus for dynamically providing comfort noise | |
US20170078338A1 (en) | Systems and methods for establishing and controlling conference call bridges | |
US20240251039A1 (en) | Hybrid digital signal processing-artificial intelligence acoustic echo cancellation for virtual conferences | |
CN107317944A (en) | A kind of meeting room member audio diversification control method | |
Mani et al. | DSP subsystem for multiparty conferencing in VoIP | |
US10356247B2 (en) | Enhancements for VoIP communications | |
TW201906394A (en) | Method, apparatus and computer readable storage medium for call management | |
Chrin et al. | Performance of soft phones and advances in associated technology |
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 201080025387.6; Country of ref document: CN |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10761160; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 2010761160; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 2758194; Country of ref document: CA. Ref document number: 2012503840; Country of ref document: JP |
NENP | Non-entry into the national phase | Ref country code: DE |
ENP | Entry into the national phase | Ref document number: 2010234200; Country of ref document: AU; Date of ref document: 20100409; Kind code of ref document: A |