US20060248210A1 - Controlling video display mode in a video conferencing system - Google Patents

Controlling video display mode in a video conferencing system

Info

Publication number
US20060248210A1
US20060248210A1 (application US11/348,217)
Authority
US
United States
Prior art keywords
audio signal
display
video conferencing
audio
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/348,217
Inventor
Michael Kenoyer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lifesize Inc
Original Assignee
Lifesize Communications Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lifesize Communications Inc
Priority to US11/348,217
Publication of US20060248210A1
Assigned to LIFESIZE COMMUNICATIONS, INC. reassignment LIFESIZE COMMUNICATIONS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KENOYER, MICHAEL L.
Assigned to LIFESIZE, INC. reassignment LIFESIZE, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: LIFESIZE COMMUNICATIONS, INC.

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066: Session management
    • H04L65/1101: Session protocols
    • H04L65/40: Support for services or applications
    • H04L65/403: Arrangements for multi-party communication, e.g. for conferences
    • H04L65/4038: Arrangements for multi-party communication, e.g. for conferences with floor control
    • H04L65/4046: Arrangements for multi-party communication, e.g. for conferences with distributed floor control
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/141: Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142: Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H04N7/147: Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H04N7/15: Conference systems
    • H04N7/152: Multipoint control units therefor

Definitions

  • the present invention relates generally to video conferencing and, more specifically, to automatically switching between display modes within a video conference.
  • Video conferencing may be used to allow two or more people to communicate using both video and audio.
  • a video conferencing system may include a camera and microphone at each participant's location to collect video and audio from a respective participant to send to the other participant(s).
  • a speaker and display at each respective participant location may reproduce the audio and video, respectively, from the other participant(s).
  • the video conferencing system may also allow for use of a computer system to allow additional functionality into the video conference, such as data conferencing (including displaying and/or modifying a document for participants during the conference).
  • a video conferencing system may support multiple video display modes.
  • in a continuous presence mode, a plurality or all of the participants may be presented on the display at a respective location, as shown in FIG. 1 a .
  • continuous presence mode allows a viewer to see a plurality or all of the participants, whose images are typically tiled on the display as shown in FIG. 1 a .
  • in a single speaker mode, a participant may view video of the currently talking speaker, as shown in FIG. 1 b.
  • U.S. Pat. No. 6,744,460 (the '460 Patent) titled “Video Display Mode Automatic Switching System and Method” relates to a system that uses a timer to determine how long a participant has been speaking. When a respective participant has been speaking for a length of time greater than a threshold, as determined by the timer, the system may switch to single speaker mode displaying that respective participant. When no participants are speaking for greater than a time threshold, then the system displays video signals of all of the participants in continuous presence mode.
  • the '460 Patent teaches the “duration of the signals from each of the endpoints are continuously monitored by the timer . . . ” Based on the duration of these signals, the system switches between single speaker mode and multiple speaker mode.
  • the method described in the '460 Patent has several disadvantages.
  • the system of the '460 patent only considers speaking time, and does not consider the intensity or amplitude of the participants' voices. For example, if one of the participants begins talking more loudly or shouting during the conference, the system of the '460 Patent will take as long to switch to that person as to switch to someone who is quietly talking. It would be desirable to provide a video conferencing system that more intelligently switches between single speaker and continuous presence mode.
  • a video conferencing system switches between single speaker and continuous presence mode based on the amount of accumulated audio signal of various ones of the participants. For example, when a first speaker begins speaking, the method may begin accumulating, e.g., via integration, the audio signal of the first speaker. When the accumulated audio signal of the first speaker becomes greater than a certain accumulation threshold, the video conferencing system may automatically switch to single speaker mode presenting the video image of the first speaker. Thus, if the first speaker is speaking more loudly or even yelling during the video conference, the system may switch to single speaker mode faster than if the first speaker were talking normally. Conversely, if the first speaker begins speaking softly, the system may switch to single speaker mode after a greater amount of time has passed. Thus, the method does not switch between video display modes based on time, but rather switches based on the amount of accumulated audio signal of respective participants.
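The accumulate-and-switch behavior described above can be sketched as follows. This is an illustrative reconstruction, not the patent's implementation; the function name, the thresholds, and the frame duration are all assumptions.

```python
# Hypothetical sketch of accumulation-based display-mode switching.
# All names and constants below are illustrative assumptions.

AUDIO_THRESHOLD = 0.05          # minimum level before accumulation starts
ACCUMULATION_THRESHOLD = 10.0   # accumulated signal needed to claim the screen
FRAME_SECONDS = 0.02            # duration of one audio frame

def update_display_mode(accumulated, frame_levels):
    """Accumulate each participant's audio and pick a display mode.

    accumulated  -- dict: participant id -> accumulated signal so far
    frame_levels -- dict: participant id -> mean |amplitude| of this frame
    Returns ("single", participant) or ("continuous", None).
    """
    for pid, level in frame_levels.items():
        if level > AUDIO_THRESHOLD:
            # Louder frames accumulate faster, so a shouting participant
            # reaches the accumulation threshold sooner than a quiet one.
            accumulated[pid] += level * FRAME_SECONDS
        else:
            accumulated[pid] = 0.0  # signal stopped: restart accumulation
    speakers = [p for p, a in accumulated.items()
                if a >= ACCUMULATION_THRESHOLD]
    if len(speakers) == 1:
        return ("single", speakers[0])
    return ("continuous", None)
```

Because the accumulation integrates level over time rather than counting seconds, a participant speaking at twice the level crosses the threshold in roughly half the time.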
  • the system may receive audio signals from a plurality of participants in a video conference.
  • An audio signal may be generated by a single speaker at a respective participant location or by multiple speakers at that participant location.
  • the signal metric may be constrained to utilize certain types of audio signals, such as human voices and/or to reject other types of audio signals, such as fan noise or paper shuffling.
  • the system may operate to analyze incoming signals in order to determine the accumulated amount of audio signal for each participant or participant location.
  • the signal may be manipulated through various available methods to provide desirable processed signals. For example, incoming audio signals may be processed such that they are always positive.
  • the signals may be integrated using any suitable methods for determining an accumulated amount of audio signal.
  • the signals may only be processed and/or integrated when exceeding a minimum audio level.
  • the level above which the signal may be integrated is herein referred to as an audio threshold.
  • determining the accumulated amount of the audio signal may occur after the audio signal has exceeded an audio threshold.
  • the audio signal may be accumulated only while the audio signal is continuous and uninterrupted, or substantially uninterrupted.
  • the accumulation of a respective audio signal may be restarted each time the audio signal stops, e.g., when the level of the respective audio signal goes below the audio threshold for a certain time period or accumulation amount.
  • the system may begin accumulating an audio signal when the speaker begins to talk and end the accumulation of the audio signal when the respective speaker stops speaking or is interrupted.
  • the system may remain in continuous presence mode.
  • an interruption may have to exceed an interruption threshold to end the accumulation of the audio signal of the currently speaking participant.
  • an interruption threshold may be based on the accumulated audio signal of the interruption or may be time based.
  • a display mode from two or more possible display modes for at least one of the video conferencing system locations may be determined based on the accumulated amount of the audio signal from each of one or more of the audio signals.
  • the system may choose from a plurality of display modes for each of the participants based on the uninterrupted accumulated amount of audio signal being generated by the participants.
  • the possible display modes comprise a single window display mode and a multiple window (continuous presence) display mode.
  • the multiple window display mode may comprise a display with a subset or all of the participants in the video conference as will be described in more detail below.
  • the method may also include comparing an accumulated amount of the audio signal from one or more of the audio signals with at least one accumulation threshold, where the display mode may be determined based on the comparing. For example, if a participant begins to talk, the system may switch the other participants' displays to the speaking participant only after the speaking participant has accumulated enough audio signal to exceed the accumulation threshold.
  • the accumulation threshold will be discussed in more detail hereinbelow.
  • video signals from the first location may be displayed on each of a plurality of video conferencing systems in the single window mode.
  • when a participant's accumulated signal exceeds some value, e.g., if the participant speaks enough to surpass his respective accumulation threshold, each of the other participants, i.e., the listening participants, may view that single speaker.
  • the talking participant may view a continuous presence mode, e.g., he may see all of the other participants or, alternatively, a subset therefrom.
  • video signals from a plurality of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode.
  • when no participant has accumulated a certain threshold amount of audio signal, e.g., energy of the audio signal, the participants may view a continuous presence display mode comprising a subset or all of the participants on their display.
  • video signals from that respective subset of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode.
  • this subset of the talking participants may be displayed on each of the participants' displays.
  • the participants' displays may show each of the talking participants singly, and intelligently switch between each of the talking participants throughout the conversation.
  • the method may also include modifying, e.g., raising, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has not exceeded the respective accumulation threshold within a predetermined amount of time.
  • the method may also modify, e.g., lower, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has recently exceeded the respective accumulation threshold within a predetermined amount of time.
  • the accumulation thresholds may be variable, i.e. may dynamically change, throughout the duration of the video conference.
  • the accumulation threshold variables may vary differently depending on whether the respective participant has spoken within some predetermined amount of time.
  • the accumulation thresholds may also vary with respect to each participant, i.e., each participant may have his own threshold that may vary independently from the other participants' thresholds.
  • each participant's threshold may be normalized with respect to the average audio level of each participant. For example, quieter participants may have lower thresholds than louder participants. Such an example will be described in more detail hereinbelow.
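A per-participant, normalized threshold of the kind described above might look like the following sketch. The class, the constants, and the use of an exponential moving average for the participant's typical loudness are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch: each participant's accumulation threshold is scaled
# by a running average of that participant's audio level, so quieter
# talkers get lower thresholds, and the threshold is lowered further for
# anyone who has spoken recently. All names and constants are assumptions.

BASE_THRESHOLD = 10.0
RECENT_SPEAKER_FACTOR = 0.5   # recently active participants switch in faster
SMOOTHING = 0.1               # weight of the newest level in the running mean

class ParticipantThreshold:
    def __init__(self):
        self.avg_level = 1.0      # running average audio level
        self.spoke_recently = False

    def observe(self, level, spoke_recently):
        # Exponential moving average of the participant's typical loudness.
        self.avg_level = (1 - SMOOTHING) * self.avg_level + SMOOTHING * level
        self.spoke_recently = spoke_recently

    def threshold(self):
        t = BASE_THRESHOLD * self.avg_level   # normalize to typical loudness
        if self.spoke_recently:
            t *= RECENT_SPEAKER_FACTOR        # easier to regain the screen
        return t
```

With these numbers, a participant whose average level settles at half the nominal level gets a threshold of 5.0 instead of 10.0, so a habitually quiet talker is not locked out of single speaker mode.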
  • FIGS. 1 a and 1 b illustrate examples of continuous presence and single speaker modes for video conference displays
  • FIG. 2 illustrates a video conferencing system, according to one embodiment
  • FIG. 3 illustrates a participant location or conferencing unit, according to one embodiment
  • FIG. 4 illustrates a network and local system for use in video conferencing, according to one embodiment
  • FIG. 5 is a flowchart illustrating an exemplary method for controlling video display modes in a video conferencing system, according to one embodiment
  • FIG. 6 illustrates an audio signal integrated above a threshold, according to one embodiment
  • FIG. 7 illustrates two respective audio signals integrated above a fixed threshold, according to one embodiment
  • FIG. 8 illustrates a display mode according to the integrated audio signals, according to one embodiment
  • FIG. 9 illustrates two respective audio signals integrated above a variable audio threshold, according to one embodiment.
  • FIGS. 10 a - c illustrate various embodiments of continuous presence screens.
  • FIG. 2 Video Conferencing System
  • FIG. 2 illustrates an embodiment of a video conferencing system 100 .
  • Video conferencing system 100 may include a network 101 , endpoints 103 A- 103 H (e.g., audio and/or video conferencing systems), gateways 130 A- 130 B, a service provider 107 (e.g., a multipoint control unit (MCU)), a public switched telephone network (PSTN) 120 , conference units 105 A- 105 D, and plain old telephone system (POTS) telephones 106 A- 106 B.
  • Endpoints 103 C and 103 D- 103 H may be coupled to network 101 via gateways 130 A and 130 B, respectively, and gateways 130 A and 130 B may each include a firewall, a network address translator (NAT), a packet filter, and/or proxy mechanisms, among others.
  • Conference units 105 A- 105 B and POTS telephones 106 A- 106 B may be coupled to network 101 via PSTN 120 .
  • conference units 105 A- 105 B may each be coupled to PSTN 120 via an Integrated Services Digital Network (ISDN) connection, and each may include and/or implement H.320 capabilities.
  • video and audio conferencing may be implemented over various types of networked devices.
  • endpoints 103 A- 103 H, gateways 130 A- 130 B, conference units 105 C- 105 D, and service provider 107 may each include various wireless or wired communication devices that implement various types of communication, such as wired Ethernet, wireless Ethernet (e.g., IEEE 802.11), IEEE 802.16, paging logic, RF (radio frequency) communication logic, a modem, a digital subscriber line (DSL) device, a cable (television) modem, an ISDN device, an ATM (asynchronous transfer mode) device, a satellite transceiver device, a parallel or serial port bus interface, and/or other type of communication device or method.
  • the methods and/or systems described may be used to implement connectivity between or among two or more participant locations or endpoints, each having voice and/or video devices (e.g., endpoints 103 A- 103 H, conference units 105 A- 105 D, POTS telephones 106 A- 106 B, etc.) that communicate through various networks (e.g., network 101 , PSTN 120 , the Internet, etc.).
  • Endpoints 103 A- 103 C may include voice conferencing capabilities and include or be coupled to various audio devices (e.g., microphones, audio input devices, speakers, audio output devices, telephones, speaker telephones, etc.).
  • Endpoints 103 D- 103 H may include voice and video communications capabilities (e.g., video conferencing capabilities) and include or be coupled to various audio devices (e.g., microphones, audio input devices, speakers, audio output devices, telephones, speaker telephones, etc.) and include or be coupled to various video devices (e.g., monitors, projectors, displays, televisions, video output devices, video input devices, cameras, etc.).
  • endpoints 103 A- 103 H may comprise various ports for coupling to one or more devices (e.g., audio devices, video devices, etc.) and/or to one or more networks.
  • Conference units 105 A- 105 D may include voice and/or video conferencing capabilities and include or be coupled to various audio devices (e.g., microphones, audio input devices, speakers, audio output devices, telephones, speaker telephones, etc.) and/or include or be coupled to various video devices (e.g., monitors, projectors, displays, televisions, video output devices, video input devices, cameras, etc.).
  • endpoints 103 A- 103 H and/or conference units 105 A- 105 D may include and/or implement various network media communication capabilities.
  • endpoints 103 A- 103 H and/or conference units 105 C- 105 D may each include and/or implement one or more real time protocols, e.g., session initiation protocol (SIP), H.261, H.263, H.264, H.323, among others.
  • a codec may implement a real time transmission protocol.
  • a codec (which may be short for “compressor/decompressor”) may comprise any system and/or method for encoding and/or decoding (e.g., compressing and decompressing) data (e.g., audio and/or video data).
  • communication applications may use codecs to convert an analog signal to a digital signal for transmitting over various digital networks (e.g., network 101 , PSTN 120 , the Internet, etc.) and to convert a received digital signal to an analog signal.
  • codecs may be implemented in software, hardware, or a combination of both.
  • Some codecs for computer video and/or audio may include MPEG, Indeo, and Cinepak, among others.
  • At least one of the participant locations may include a camera for acquiring high resolution or high definition (e.g., HDTV compatible) signals. At least one of the participant locations may include a high definition display (e.g., an HDTV display), for displaying received video signals in a high definition format.
  • in one embodiment, the bandwidth of the connection to network 101 may be 1.5 Mbps or less (e.g., a T1 line or less). In another embodiment, the bandwidth is 2 Mbps or less.
  • FIG. 3 Participant Location
  • FIG. 3 illustrates an embodiment of a participant location, also referred to as an endpoint or conferencing unit (e.g., a video conferencing system).
  • the video conference system may have a system codec 209 to manage both a speakerphone 205 / 207 and a video conferencing system 203 .
  • a speakerphone 205 / 207 and a video conferencing system 203 may be coupled to the integrated video and audio conferencing system codec 209 and may receive audio and/or video signals from the system codec 209 .
  • the participant location may include a high definition camera 204 for acquiring high definition images of the participant location.
  • the participant location may also include a high definition display 201 (e.g., a HDTV display). High definition images acquired by the camera may be displayed locally on the display and may also be encoded and transmitted to other participant locations in the video conference.
  • the participant location may also include a sound system 261 .
  • the sound system 261 may include multiple speakers including left speakers 271 , center speaker 273 , and right speakers 275 . Other numbers of speakers and other speaker configurations may also be used.
  • the video conferencing system may include a camera 204 for capturing video of the conference site.
  • the video conferencing system may include one or more speakerphones 205 / 207 which may be daisy chained together.
  • the video conferencing system components may be coupled to a system codec 209 .
  • the system codec 209 may receive audio and/or video data from a network.
  • the system codec 209 may send the audio to the speakerphone 205 / 207 and/or sound system 261 and the video to the display 201 .
  • the received video may be high definition video that is displayed on the high definition display.
  • the system codec 209 may also receive video data from the camera 204 and audio data from the speakerphones 205 / 207 and transmit the video and/or audio data over the network to another conferencing system.
  • the conferencing system may be controlled by a participant through the user input components (e.g., buttons) on the speakerphone and/or remote control 250 . Other system interfaces may also be used.
  • FIG. 4 illustrates an exemplary embodiment of a video conferencing system comprising a plurality of participants located at respective endpoints.
  • the video conferencing system includes a local participant 407 and one or more remote participants 401 , 403 and 405 .
  • Each participant 401 - 407 may be at a respective location or endpoint.
  • Each location may include video conferencing equipment, such as the equipment described regarding FIG. 3 .
  • the various participants in the video conference may communicate over a transmission medium or network 409 .
  • the network 409 may be any of various types suitable for transmission of video and audio data between the participant locations.
  • the network is or includes a wide area network, such as the Internet.
  • the network 409 may also include various other types of communication systems, such as ISDN (Integrated Services Digital Network), the PSTN (Public Switched Telephone Network), LANs (local area networks) and/or other types of WANs.
  • Each of the participants may be coupled to a control unit, e.g., a multipoint control unit (MCU).
  • the MCU may comprise processor 417 and memory 419 .
  • the MCU may be coupled to memory 419 via transmission media.
  • the system and method described herein may utilize suitable types of control units other than the MCU; the MCU is exemplary only, and in fact, other control units are envisioned.
  • the MCU may be comprised in a server.
  • Each of the participant's endpoints may be coupled to the MCU via a network such as network 101 .
  • the server may be an Internet-hosted web server capable of providing video conferencing services to end users.
  • At least one of the participant locations may comprise the MCU.
  • the MCU may operate to receive audio and video signals from each of the participant locations and selectively combine the signals for output to the various participant locations.
  • the MCU may operate to selectively provide different combinations of signals for different display modes. For example, in a single speaker display mode, where a participant from one location is talking, the MCU may operate to send the video signal of that participant to each of a subset or all of the participant locations. In a continuous presence display mode, where multiple participants are conversing, the MCU may operate to combine the video signals of a subset of the participants and provide this combined signal to each of the participant locations.
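The MCU's per-mode stream selection described above could be sketched as below. The function name, the mode strings, and the participant identifiers are hypothetical.

```python
# Hypothetical sketch of MCU stream selection: which video streams each
# participant location receives under each display mode.

def select_streams(mode, active_speaker, participants):
    """Return, for each participant, the list of video streams to send.

    In single speaker mode, everyone but the speaker sees the speaker,
    while the speaker sees the other participants (continuous presence).
    In continuous presence mode, everyone sees all the other participants.
    """
    layout = {}
    for p in participants:
        others = [q for q in participants if q != p]
        if mode == "single" and p != active_speaker:
            layout[p] = [active_speaker]
        else:
            layout[p] = others
    return layout
```

Note the asymmetry the specification later describes: the talking participant himself is shown a continuous presence view rather than his own image.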
  • the system is operable to intelligently select a video display mode based on the received audio signals from one or more of the participant locations.
  • FIG. 5 is a flowchart illustrating an exemplary method for controlling video display modes in a video conferencing system, according to one embodiment. It should be noted that in various embodiments of the methods described below, one or more of the elements described may be performed concurrently, in a different order than shown, or may be omitted entirely. Other additional elements may also be performed as desired.
  • an audio signal from each of a plurality of video conferencing system locations may be received.
  • the audio signal may be from a single speaker at a respective party location or from multiple speakers at that party location.
  • the audio signals may be received by an MCU, and the MCU may be operable to perform the reception via network cables or other transmission media as described above.
  • the MCU may receive audio signals from each of local participant 407 and remote participants 401 , 403 , and 405 .
  • an accumulated amount of the audio signal may be determined from each of one or more of the audio signals. Determining the accumulated amount of audio signal may be performed by determining a signal metric for each of one or more of the respective audio signals using an integrated form of the respective signal. More specifically, determining the accumulated amount of audio signal may include integrating each of the one or more audio signals from the plurality of video conferencing systems to generate respective accumulated amounts of audio signal. In some embodiments, the MCU may implement signal integrator 411 to perform the determination of the accumulated amount of the audio signal.
  • the MCU and coupled components may operate to analyze incoming audio signals in order to determine the accumulated amount of audio signal for each participant or participant location.
  • the signal may be manipulated through various available methods to provide desirable processed signals. For example, incoming audio signals may be processed such that they are always positive. FIG. 6 illustrates such a signal.
  • the absolute value, the root-mean-square (RMS), or the square of the signal (providing the signal's energy) may be taken to provide positively valued signals.
  • the signals may be smoothed to facilitate integration or accumulation computations.
  • the processed or unprocessed signals may be integrated using any suitable methods for integration.
  • the signal might be sampled at given lengths or intervals, and the integral approximated using Riemann, trapezoidal, or Simpson sums, or computed using other appropriate techniques as desired.
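As one concrete example of the numeric integration mentioned above, a trapezoidal-rule approximation over evenly spaced samples might look like this (the function name and sample spacing are illustrative):

```python
# Minimal trapezoidal-rule sketch for approximating the integral of a
# sampled audio signal with uniform sample spacing dt.

def trapezoid_integral(samples, dt):
    """Approximate the integral of evenly spaced samples with spacing dt."""
    if len(samples) < 2:
        return 0.0
    interior = sum(samples[1:-1])
    # Endpoints carry half weight under the trapezoidal rule.
    return dt * (0.5 * samples[0] + interior + 0.5 * samples[-1])
```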
  • the accumulated amount of audio signal may be determined using other methods.
  • the volume or intensity of the signal may be measured via averaging methods, e.g., average amplitude or decibels.
  • integration is not limited to those methods described above, and in fact, may refer to any suitable methods for measuring accumulated audio signal.
  • determining the accumulated amount of audio signal may comprise performing various other signal processing methods on the received audio signal.
  • determining the accumulated amount of audio signal may include integrating (or approximating the integration of) various forms of the signal to provide accumulated energy, power, rms, absolute value, intensity, or other desirable signal metrics of the audio signal.
  • changes in amplitude may be integrated and/or tracked (e.g., the changes in amplitude of a person's voice may be integrated).
  • the signals may only be processed and/or integrated when exceeding an audio level. More specifically, the signal integrator may begin measuring (or accumulating) the accumulated audio signal once a minimum audio level has been reached.
  • the level above which the signal may be integrated is herein referred to as an audio threshold.
  • FIG. 6 illustrates an exemplary signal exceeding an audio threshold. The signal, shown in FIG. 6 in a signal level 607 versus time 609 plot, exceeds audio threshold 603 and may be integrated over the area 605 . As FIG. 6 further shows, signals below the audio threshold, such as 601 , may not be integrated. Thus, determining the accumulated amount of the audio signal may occur after the audio signal has exceeded an audio threshold. In some embodiments, the audio signal may only continue to be accumulated while the audio signal remains above the audio threshold without “significant” interruption.
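The FIG. 6 behavior, accumulating only the area above the audio threshold, can be approximated for sampled, non-negative audio as in this sketch (the function name, threshold, and spacing values are assumptions):

```python
# Sketch of accumulating only the area above an audio threshold, in the
# spirit of FIG. 6: samples at or below the threshold contribute nothing.

def accumulate_above_threshold(samples, threshold, dt):
    """Sum the portion of each non-negative sample above threshold."""
    return sum((s - threshold) * dt for s in samples if s > threshold)
```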
  • the accumulated amount of audio signal from each of one or more of the audio signals may be an uninterrupted accumulated amount of audio signal.
  • the accumulation of a respective audio signal may be restarted each time the audio signal stops, e.g., when the level of the respective audio signal goes below the audio threshold for a certain time period.
  • the system may begin accumulating an audio signal when the speaker begins to talk and end the accumulation of the audio signal when the respective speaker stops speaking or is interrupted.
  • the system may remain in continuous presence mode.
  • an interruption may have to exceed an interruption threshold to end the accumulation of the audio signal of the currently speaking participant.
  • the audio signal may continue to be accumulated as long as no “significant” interruption occurs.
  • the system may continue to integrate the speaking participant's signal because the noise or comment from the other participant did not exceed the interruption threshold.
  • interjections below the interruption threshold may not hinder the system from switching from the previous display mode, e.g., continuous presence mode, to the new display mode, e.g., the single window display of the currently speaking participant.
  • the system may intelligently filter interruptions and integrate audio signals in a desirable manner.
  • the interruption threshold may be based on the accumulated audio signal of the interruption, or may be time based. Thus in one embodiment if the accumulated audio signal of the “interruption” is less than an interruption threshold then the “interruption” is ignored, and the audio signal currently being accumulated continues to be accumulated. In another embodiment, if the “interruption” is less than an interruption threshold time period, then the “interruption” is ignored, and the audio signal currently being accumulated continues to be accumulated.
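A time-based version of this interruption filtering could be sketched as follows. This is illustrative only: the (speaker, level) frame representation, the names, and the frame-count interruption limit are assumptions, not the patent's implementation.

```python
def filter_interruptions(frames, audio_threshold, interruption_limit):
    """Accumulate one speaker's signal, resetting only when an
    interruption persists for more than `interruption_limit`
    consecutive above-threshold frames from another speaker.

    `frames` is a sequence of (speaker_id, level) pairs; returns the
    current speaker and that speaker's accumulated amount."""
    total = 0.0
    current = None
    interrupt_run = 0
    for speaker, level in frames:
        if level <= audio_threshold:
            continue  # below the audio threshold: nothing accumulates
        if current is None or speaker == current:
            current = speaker
            total += level
            interrupt_run = 0
        else:
            interrupt_run += 1
            if interrupt_run > interruption_limit:
                # significant interruption: restart accumulation
                current, total, interrupt_run = speaker, level, 0
    return current, total
```

With a limit of one frame, a single interjected frame from another participant is ignored and the original speaker's accumulation continues uninterrupted; with a limit of zero, any interjection restarts the accumulation.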
  • the term “significant interruption” may refer to an interruption whose accumulated amount exceeds a certain percentage (2%, 4%, 5%, 7%, etc.) of the accumulation threshold for determining display mode. Alternatively, a “significant interruption” may refer to an amount of accumulated energy equivalent to 2 seconds of a normal talking voice, or 1.5 seconds of a raised talking voice.
  • rules may be used (e.g., predetermined and/or provided by a conference participant) to determine when to accumulate energy.
  • Rules may be threshold based. For example, when the audio is below a first threshold, no energy is integrated. When above the first threshold but below a second threshold, a percentage of the audio is integrated, etc.
  • Rules may also be based on how quickly (or slowly) the audio is fluctuating between various thresholds. For example, if a participant's voice suddenly shifts above a high threshold, the audio may be integrated at a higher percentage (which may exceed 100% in some embodiments). This may allow more emphasis to be given to a participant who suddenly begins shouting.
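The threshold-based rules above might be sketched like this; the specific percentages, the 125% boost, and all names are illustrative assumptions rather than values taken from the disclosure.

```python
def tiered_rate(level, first_threshold, second_threshold, boost=1.25):
    """Fraction of the audio level to integrate under tiered rules:
    nothing below the first threshold, a partial rate between the two
    thresholds, and a boosted rate (possibly exceeding 100%) above the
    second threshold, giving extra emphasis to a suddenly loud
    participant."""
    if level < first_threshold:
        return 0.0
    if level < second_threshold:
        return 0.5
    return boost

def integrate_with_rules(samples, first_threshold, second_threshold):
    """Accumulate a sequence of levels under the tiered rules."""
    return sum(level * tiered_rate(level, first_threshold, second_threshold)
               for level in samples)
```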
  • audio exceeding a threshold may not be integrated above the threshold. For example, the audio may be integrated under the threshold but not over it. This may prevent the system from switching too quickly to naturally loud speakers.
  • the system may adapt the rules throughout the conference based on factors such as time-averaged participant audio levels.
  • the signal metric may be constrained to utilize certain types of audio signals (such as human voices) and/or to reject other types of audio signals (such as fan noise or paper shuffling).
  • the audio may be processed to detect human voices, and the corresponding signal metric may comprise the human voice component. This may allow human voices to be tracked and integrated without including extraneous noise. For example, a loud air conditioner switching on at a remote conference site may be ignored by the system because the dominant frequencies of the air conditioner noise do not match human voice frequencies.
  • the system may integrate only audio of frequencies in a certain range (e.g., a range dominated by human voice).
  • the system may integrate audio that comprises fundamental harmonics (e.g., characteristic of human voice).
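One crude way to realize such voice-selective integration is to accumulate only frames whose dominant frequency falls inside a speech band. This is a sketch: the band limits and the per-frame (dominant frequency, energy) representation are assumptions, and a practical system would use a proper voice activity detector rather than a single dominant frequency.

```python
VOICE_BAND_HZ = (85.0, 3400.0)  # illustrative speech band, not from the patent

def voice_energy(frames, band=VOICE_BAND_HZ):
    """Sum energy only from frames whose dominant frequency lies inside
    a human-voice band, so that e.g. low-frequency air conditioner
    rumble or high-frequency hiss is rejected.

    Each frame is a (dominant_freq_hz, energy) pair."""
    lo, hi = band
    return sum(energy for freq, energy in frames if lo <= freq <= hi)
```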
  • the system may identify and track the voices of different participants.
  • different weights may be given to different voices during the integration. For example, the voice of the leader of the conference may be weighted more heavily during integration so that the system switches to (or stays on) the leader more often.
  • a display mode from two or more possible display modes for at least one of the video conferencing system locations may be determined based on the accumulated amount of the audio signal from each of one or more of the audio signals.
  • the system may choose from a plurality of display modes for each of the participants based on the accumulated amount of audio signal being generated by the participants.
  • the possible display modes comprise a single window display mode and a multiple window display mode.
  • the multiple window display mode may comprise the continuous presence display mode described hereinabove.
  • the continuous presence display mode may comprise a display with a subset or all of the participants in the video conference as will be described in more detail below.
  • the method may compare an accumulated amount of the audio signal from one or more of the audio signals with at least one accumulation threshold, where the display mode may be determined based on the comparing.
  • the MCU may use signal integrator 411 to determine if a participant has accumulated audio signal above a certain level, i.e., an accumulation threshold.
  • the accumulation threshold corresponds to the level of accumulated audio signal after which the display mode is changed. For example, if a participant begins to talk, the system may switch the other participants' displays to the speaking participant only after the speaking participant has accumulated enough audio signal to exceed the accumulation threshold.
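The comparison step just described might look like the following sketch. The names are hypothetical, and the fallback is one possible design choice: the display reverts to continuous presence unless exactly one location has exceeded the accumulation threshold.

```python
def choose_display_mode(accumulated, accumulation_threshold):
    """Pick a display mode from per-location accumulated audio:
    single-window on the one location whose accumulation exceeds the
    accumulation threshold, continuous presence otherwise.

    `accumulated` maps a location id to its accumulated audio signal."""
    over = [loc for loc, amount in accumulated.items()
            if amount > accumulation_threshold]
    if len(over) == 1:
        return ("single_window", over[0])
    return ("continuous_presence", None)
```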
  • the accumulation threshold will be discussed in more detail below with regard to FIGS. 7 and 8 .
  • the value of the accumulation threshold may be static, may be set by an administrator or moderator, or may be set by one participant, or may be set by each participant. In one embodiment, the value of the accumulation threshold may be set to approximate a normal talking voice with an 8 second time duration, or a loud talking voice with a 6 second time duration.
  • video signals from the first location may be displayed on each of a plurality of video conferencing systems in the single window mode.
  • if a participant's accumulated signal exceeds some value, e.g., if the participant speaks enough to surpass his respective accumulation threshold, each of the other participants, i.e., the listening participants, may view that single speaker.
  • the other participants may view that speaker in combination with a subset of the other participants.
  • the talking participant may view a continuous presence mode, e.g., he may see all of the other participants or, alternatively, a subset therefrom.
  • a subset or any of the participants may be able to choose the subset of the participants that may be viewed or may set a desired display mode independent of any determination of the accumulated audio signal.
  • the MCU may utilize a mode switch 415 function to implement the display change for each of the participants.
  • video signals from a plurality of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode. Said another way, if no one in the video conference is speaking in an uninterrupted manner for a certain threshold amount of audio signal (e.g. energy), the participants may view a continuous presence display mode comprising a subset or all of the participants on their display.
  • video signals from that respective subset of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode.
  • this subset of the talking participants may be displayed on each of the participants' displays.
  • the participants' displays may show each of the talking participants singly, and intelligently switch between each of the talking participants throughout the conversation.
  • the talking participants and the listening participants may view different displays.
  • the talking participants may view all of listening participants, the other talking participants, or all of the participants in the video conference.
  • the listening participants may view all of the talking participants, the currently talking participant, or a subset or all of the participants in the video conference.
  • the displays for the talking and listening participants are not limited to the displays described above, and in fact, other displays are contemplated.
  • the talking and listening participants may be able to manually choose between a plurality of views to be displayed.
  • only one audio signal may exceed the accumulation threshold at any given time.
  • the method may also include modifying, e.g., raising, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has not exceeded the respective accumulation threshold within a predetermined amount of time.
  • the method may also modify, e.g., lower, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has recently exceeded the respective accumulation threshold within a predetermined amount of time.
  • the accumulation thresholds may be variable, i.e., may dynamically change, throughout the duration of the video conference. Additionally, the accumulation thresholds may vary differently depending on whether the respective participant has spoken within some predetermined amount of time.
  • the accumulation thresholds may vary with respect to each participant, i.e., each participant may have his own threshold that may vary independently from the other participants' thresholds.
  • each participant's threshold may be normalized with respect to each participant. For example, quieter participants may have lower thresholds than louder participants. Such an example will be described in more detail hereinbelow.
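The raising and lowering of per-participant accumulation thresholds described above could be sketched as a simple multiplicative update; the step size and clamping bounds are illustrative assumptions, not values from the disclosure.

```python
def adjust_threshold(threshold, spoke_recently, step=0.1, floor=1.0, ceiling=100.0):
    """Nudge one participant's accumulation threshold: lower it when the
    participant has recently exceeded it (making the system more
    responsive to frequent speakers), raise it otherwise, clamped to a
    fixed range so no participant is locked in or out entirely."""
    factor = (1.0 - step) if spoke_recently else (1.0 + step)
    return min(ceiling, max(floor, threshold * factor))
```

Applied per participant, this naturally normalizes the thresholds: a quiet, frequent speaker drifts toward a low threshold while a silent participant's threshold drifts upward.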
  • FIGS. 7 and 8 illustrate an example where the use of accumulated audio signal 605, rather than time 609, provides improvements to display mode switching, as outlined below.
  • the system may determine when a single speaker is presumed to be talking, e.g., when the volume or amplitude level 607 of the audio signal from one participant location is above a certain audio threshold 603 , or greater than the other locations by a certain threshold or ratio.
  • the system may begin to integrate or sample the audio or voice signal received from that user or that location.
  • once a certain amount of audio signal has been generated or accumulated by the integration, such as the integrated area before 702 for participant 151 in FIG. 7, the system may presume that the user has been talking for a sufficient amount (e.g., of accumulated audio signal) and that he may be a single talking user.
  • the system may switch from continuous presence mode, illustrated in FIG. 8 during time segment A as 801, where a subset or all of the participants are displayed, to a single speaker mode 803, where only the single speaker, in this case participant 151, may be displayed.
  • the display of the talking participant may remain in continuous presence mode to allow the talking participant to view a plurality of other participants. However, the displays of the other participants may be switched to the location of the talking participant.
  • this method does not measure the amount of time that a participant has been speaking, but rather measures the amount of accumulated audio signal generated by the remote location.
  • the system may switch the other participants' displays to the talking participant faster than if the talking participant was speaking more softly.
  • Such a situation is illustrated in the transition 704 to time segment C in FIGS. 7 and 8 .
  • participant 157 generates the threshold amount of accumulated audio signal in a smaller amount of time than that of participant 151 during time segment A.
  • the single window display mode transfers to participant 157 more quickly than it had previously for participant 151 because of participant 157's louder speaking volume.
  • if a participant begins shouting, the system will switch the other participants' displays to the shouting participant even faster. This occurs because the system measures the accumulated audio signal, essentially the amount of audio signal produced, as opposed to the prior art method, which simply measures the length of time a participant speaks.
  • the system may switch the participants' displays back to continuous presence mode.
  • the present method provides a significant improvement over prior time based methods, in that the method switches the participants' displays to a participant speaking loudly more quickly.
  • the system may adjust the accumulation threshold of each participant based on the participant's total accumulated audio signal, i.e., the sum of all the accumulated audio signals from that participant.
  • participants who are speaking more in the video conference may have their accumulation threshold lowered, while other participants who are speaking less or not at all in the video conference may have their accumulation threshold raised.
  • the system may switch to, i.e., switch a plurality of participants' displays to, those participants who are speaking more or more often in a video conference in a faster or more responsive manner than participants who are speaking less in the video conference.
  • the system may switch to those participants who are speaking less in the video conference in a slower or less responsive manner, presuming that these less-talking participants may not be speaking very long or often.
  • the accumulation thresholds may be adjusted each time the system switches to a new speaker.
  • the thresholds may also be adjusted after a predetermined amount of time for each participant, e.g., long enough to predict the participant's long-term behavior.
  • conversation focus tends to go back and forth between those two people or locations.
  • the system tracks which participants are talking.
  • the system may lower the accumulation threshold required to display a single talking participant.
  • the system may show the first participant more quickly.
  • the system may switch to that second participant in single presentation mode more quickly.
  • the system may essentially ping-pong back and forth between each of the two talking participants. In other words, after one of the participants stops talking and the other participant starts talking, the system may switch to the single presentation mode of the talking participant substantially immediately, e.g., within a second or two.
  • when the system detects that two (or a subset) of the participants are doing all of the talking (i.e., only their audio signals are exceeding the accumulation threshold), the system may show this subset of participants in continuous presence mode.
  • when the system detects that two of the participants are having a dialog, the system may display these two participants in a dual split display mode.
  • the accumulation threshold for participants A and B may be lowered. Therefore, when either of participants A or B begins speaking, the system may quickly switch to this dual display mode.
  • participants A and B may have two associated accumulation thresholds.
  • the system may display the dual display mode using the lowered accumulation threshold, and then later switch to the single speaker mode for one of the participants upon reaching a second higher accumulation threshold.
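The two-threshold dialog behavior might be sketched as follows. This is a hypothetical illustration of the scheme: the names and the precedence of the checks are assumptions layered on the description above.

```python
def dialog_mode(speaker, dominant_pair, accumulated, low_threshold, high_threshold):
    """Display-mode selection during a detected two-person dialog:
    either dominant participant triggers the dual split display at the
    lowered threshold, while only a sustained solo run past the higher
    threshold collapses the display to single speaker mode."""
    if accumulated >= high_threshold:
        return ("single_speaker", speaker)
    if speaker in dominant_pair and accumulated >= low_threshold:
        return ("dual_display", dominant_pair)
    return ("continuous_presence", None)
```

A third participant outside the dominant pair never benefits from the lowered threshold, so the display does not switch to that participant until a larger amount of audio signal has been accumulated.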
  • the system may intelligently switch to a single speaker mode if the second participant in the two-person dialog is no longer responding to the conversation.
  • a first (greater) accumulation threshold may be required to switch from the dual-display mode to a continuous presence mode (where the three talking participants or all six participants are displayed).
  • a second (and even greater) accumulation threshold may be required to switch from continuous presence mode to single speaker mode for participant C.
  • the algorithm may be intelligent, e.g., using heuristics, to know that, for example, two speakers in the past ten minutes have been the dominant speakers; so, when one accumulates even a small amount of accumulated audio signal, the system may switch to that single speaker, or switch to a dual-speaker mode, much more quickly.
  • a third participant may not have this lowered accumulation threshold.
  • this third participant must generate a greater amount of audio signal energy before the display switches to either a continuous presence mode or single speaker mode view of this third participant.
  • participants other than the two dominant participants may also have two different accumulation thresholds, a first to go from dual display mode of the two dominant speakers to continuous presence mode, and a second accumulation threshold to go to single speaker mode for that participant.
  • each participant may have independent audio thresholds.
  • participant 151 may have audio threshold 603A
  • participant 157 may have audio threshold 603B.
  • Independent thresholds may be desirable for situations when a first participant is in a noisy environment. In such environments, a larger audio threshold may allow the MCU to properly determine when the first participant is speaking; i.e., the larger audio threshold may prevent the MCU from mistaking background noise for the participant's voice. However, if a second participant is in a quiet environment, it may be desirable for the second participant to have a much lower audio threshold than the first participant.
  • each participant's independent audio threshold may be normalized with respect to each participant.
  • a first participant may have a louder normal speaking volume than a second participant.
  • the first participant may have a higher audio threshold than the second participant.
  • the quieter participants such as the second participant, may not have to speak louder than normal to exceed their respective audio thresholds.
  • independent audio thresholds may be desirable.
  • the audio threshold for each participant may vary throughout a video conference.
  • FIG. 9 illustrates two respective audio signals integrated above a variable audio threshold, according to one embodiment.
  • the thresholds may be continuous.
  • the thresholds may be defined as a piece-wise function such as that in FIG. 9 .
  • the threshold may vary with respect to whether the participant is speaking; for example, participant 151 's threshold may decrease while participant 151 is speaking, as in 903 , and may increase while participant 151 is listening, as in 905 . Similarly, participant 157 's threshold may also decrease while he is speaking, 913 , and increase while listening, 911 and 915 .
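The piece-wise threshold variation of FIG. 9 might be approximated per frame as follows; the ramp rate and clamp bounds are illustrative assumptions.

```python
def update_audio_threshold(threshold, is_speaking, rate=0.05, lo=0.5, hi=5.0):
    """Per-frame drift of a participant's audio threshold, after FIG. 9:
    it ramps down while the participant speaks (as in 903, 913) and
    ramps back up while the participant listens (as in 905, 911, 915),
    clamped to a fixed range."""
    threshold += -rate if is_speaking else rate
    return min(hi, max(lo, threshold))
```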
  • FIGS. 10a-c illustrate various embodiments of continuous presence screens.
  • the system may determine a dominant speaker 1003, 1113, or 1123 for display in a central and/or larger area of the display than the corresponding other participants (e.g., other participants 1001a-h, 1111a-l, and 1121a-g).
  • Other continuous presence displays are also contemplated.
  • various embodiments of the systems and methods described above may facilitate intelligent control of video display modes in a video conferencing system.
  • a memory medium may include any of various types of memory devices or storage devices.
  • the term “memory medium” is intended to include an installation medium, e.g., a CD-ROM, floppy disks, or tape device; a computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; or a non-volatile memory such as a magnetic media, e.g., a hard drive, or optical storage.
  • the memory medium may comprise other types of memory as well, or combinations thereof.
  • the memory medium may be located in a first computer in which the programs are executed, or may be located in a second different computer that connects to the first computer over a network, such as the Internet. In the latter instance, the second computer may provide program instructions to the first computer for execution.
  • the term “memory medium” may include two or more memory mediums that may reside in different locations, e.g., in different computers that are connected over a network.
  • a carrier medium may be used.
  • a carrier medium may include a memory medium as described above, as well as signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a bus, network and/or a wireless link.
  • a method may be implemented from memory medium(s) on which one or more computer programs or software components according to one embodiment may be stored.
  • the memory medium may comprise an electrically erasable programmable read-only memory (EEPROM), various types of flash memory, etc., which store software programs (e.g., firmware) that are executable to perform the methods described herein.
  • field programmable gate arrays may be used.
  • Various embodiments further include receiving or storing instructions and/or data implemented in accordance with the foregoing description upon a carrier medium.

Abstract

System and method for controlling video display modes in a video conferencing system. An audio signal from each of a plurality of video conferencing system locations may be received. An accumulated amount of audio signal may be determined from each of one or more of the audio signals. Subsequently, a display mode of two or more possible display modes may be determined for at least one of the video conferencing system locations based on the determined accumulated audio signal. Determining the accumulated audio signal may comprise determining a signal metric for each of one or more of the audio signals using an integrated form of the signal. The method may include comparing accumulated amounts of audio signal from one or more audio signals with at least one accumulation threshold. The display mode may also be determined based on the comparison between the accumulated audio signal and at least one accumulation threshold.

Description

    PRIORITY CLAIMS
  • This application claims priority to U.S. Provisional Application No. 60/676,918, titled “Audio and Video Conferencing”, which was filed May 2, 2005, whose inventors are Michael L. Kenoyer, Wayne Mock, and Patrick D. Vanderwilt, and which is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates generally to video conferencing and, more specifically, to automatically switching between display modes within a video conference.
  • 2. Description of the Related Art
  • Video conferencing may be used to allow two or more people to communicate using both video and audio. A video conferencing system may include a camera and microphone at each participant's location to collect video and audio from a respective participant to send to the other participant(s). A speaker and display at each respective participant location may reproduce the audio and video, respectively, from the other participant(s). The video conferencing system may also allow a computer system to be used to add functionality to the video conference, such as data conferencing (including displaying and/or modifying a document for participants during the conference).
  • A video conferencing system may support multiple video display modes. In a continuous presence mode, a plurality or all of the participants may be presented on the display at a respective location, with their images typically tiled on the display as shown in FIG. 1a. In a single speaker display mode, a participant may view video of the currently talking speaker, as shown in FIG. 1b.
  • It may be desirable for a video conferencing system to automatically switch the display between a single speaker mode and a continuous presence mode. For example, U.S. Pat. No. 6,744,460 (the '460 Patent) titled “Video Display Mode Automatic Switching System and Method” relates to a system that uses a timer to determine how long a participant has been speaking. When a respective participant has been speaking for a length of time greater than a threshold, as determined by the timer, the system may switch to single speaker mode displaying that respective participant. When no participants are speaking for greater than a time threshold, then the system displays video signals of all of the participants in continuous presence mode. The '460 Patent teaches the “duration of the signals from each of the endpoints are continuously monitored by the timer . . . ” Based on the duration of these signals, the system switches between single speaker mode and multiple speaker mode.
  • The method described in the '460 Patent has several disadvantages. For example, the system of the '460 Patent considers only speaking time, and does not consider the intensity or amplitude of the participants' voices. Thus, if one of the participants begins talking more loudly or shouting during the conference, the system of the '460 Patent will take as long to switch to that person as to someone who is talking quietly. It would be desirable to provide a video conferencing system that more intelligently switches between single speaker and continuous presence mode.
  • SUMMARY OF THE INVENTION
  • In various embodiments, a video conferencing system switches between single speaker and continuous presence mode based on the amount of accumulated audio signal of various ones of the participants. For example, when a first speaker begins speaking, the method may begin accumulating, e.g., via integration, the audio signal of the first speaker. When the accumulated audio signal of the first speaker becomes greater than a certain accumulation threshold, the video conferencing system may automatically switch to single speaker mode presenting the video image of the first speaker. Thus, if the first speaker is speaking more loudly or even yelling during the video conference, the system may switch to single speaker mode faster than if the first speaker were talking normally. Conversely, if the first speaker begins speaking softly, the system may switch to single speaker mode after a greater amount of time has passed. Thus, the method does not switch between video display modes based on time, but rather switches based on the amount of accumulated audio signal of respective participants.
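Put together, the accumulate-and-switch behavior summarized above can be sketched as a per-frame loop. This is illustrative only: the names, the restart-on-silence rule, and the single-leader condition are assumptions layered on the summary, not the claimed implementation.

```python
def run_conference(levels_per_frame, audio_threshold, accumulation_threshold):
    """Frame-by-frame sketch of the overall method: integrate each
    location's audio level while it exceeds the audio threshold, reset
    a location's accumulation when it falls silent, and switch every
    display to single speaker mode once exactly one location's
    accumulation crosses the accumulation threshold.

    `levels_per_frame` is a sequence of {location: level} dicts; returns
    the display mode chosen after each frame."""
    accumulated = {}
    modes = []
    for frame in levels_per_frame:
        for loc, level in frame.items():
            if level > audio_threshold:
                accumulated[loc] = accumulated.get(loc, 0.0) + level
            else:
                accumulated[loc] = 0.0  # restart when the speaker falls silent
        leaders = [l for l, a in accumulated.items() if a > accumulation_threshold]
        modes.append(("single", leaders[0]) if len(leaders) == 1
                     else ("continuous", None))
    return modes
```

Because the loop accumulates signal level rather than counting frames, a louder participant crosses the accumulation threshold in fewer frames, which is the behavior that distinguishes this method from the time-based prior art.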
  • In some embodiments, the system may receive audio signals from a plurality of participants in a video conference. An audio signal may be generated by a single speaker at a respective participant location or by multiple speakers at that participant location. The accumulated amount of the audio signal may then be determined from each of one or more of the audio signals. Determining the accumulated amount of audio signal may be performed by determining a signal metric for each of one or more of the respective audio signals using an integrated form of the respective signal. More specifically, determining the accumulated amount of audio signal may include integrating each of the one or more audio signals from the plurality of video conferencing systems to generate respective accumulated amounts of audio signal. In some embodiments, the signal metric may be constrained to utilize certain types of audio signals, such as human voices and/or to reject other types of audio signals, such as fan noise or paper shuffling.
  • Said another way, the system may operate to analyze incoming signals in order to determine the accumulated amount of audio signal for each participant or participant location. In some embodiments, the signal may be manipulated through various available methods to provide desirable processed signals. For example, incoming audio signals may be processed such that they are always positive. The signals may be integrated using any suitable methods for determining an accumulated amount of audio signal.
  • In some embodiments, the signals may only be processed and/or integrated when exceeding a minimum audio level. The level above which the signal may be integrated is herein referred to as an audio threshold. Thus, determining the accumulated amount of the audio signal may occur after the audio signal has exceeded an audio threshold.
  • In one embodiment, the audio signal may be accumulated only while the audio signal is continuous and uninterrupted, or substantially uninterrupted. In other words, the accumulation of a respective audio signal may be restarted each time the audio signal stops, e.g., when the level of the respective audio signal goes below the audio threshold for a certain time period or accumulation amount. Said another way, the system may begin accumulating an audio signal when the speaker begins to talk and end the accumulation of the audio signal when the respective speaker stops speaking or is interrupted. Thus, in a video conference with a lot of “back and-forth” talking, where the participants do not exceed their respective accumulation thresholds before being interrupted, the system may remain in continuous presence mode.
  • In some embodiments, an interruption may have to exceed an interruption threshold to end the accumulation of the audio signal of the currently speaking participant. For example, in a video conference where one participant begins to speak, and another participant coughs or interjects a brief comment, e.g., “yes”, “I agree”, etc., the system may continue to integrate the speaking participant's signal because the noise or comment from the other participant did not exceed the interruption threshold. Thus, interjections below the interruption threshold may not hinder the system from switching from the previous display mode, e.g., continuous presence mode, to the new display mode, e.g., the single window display of the currently speaking participant. In this manner, the system may intelligently filter interruptions and integrate audio signals in a desirable manner. The interruption threshold may be based on the accumulated audio signal of the interruption or may be time based.
  • In some embodiments, a display mode from two or more possible display modes for at least one of the video conferencing system locations may be determined based on the accumulated amount of the audio signal from each of one or more of the audio signals. In other words, the system may choose from a plurality of display modes for each of the participants based on the uninterrupted accumulated amount of audio signal being generated by the participants. In some embodiments, the possible display modes comprise a single window display mode and a multiple window (continuous presence) display mode. The multiple window display mode may comprise a display with a subset or all of the participants in the video conference as will be described in more detail below.
  • The method may also include comparing an accumulated amount of the audio signal from one or more of the audio signals with at least one accumulation threshold, where the display mode may be determined based on the comparing. For example, if a participant begins to talk, the system may switch the other participants' displays to the speaking participant only after the speaking participant has accumulated enough audio signal to exceed the accumulation threshold. The accumulation threshold will be discussed in more detail hereinbelow.
  • In some embodiments, if the accumulated amount of audio signal corresponding to a first location exceeds an accumulation threshold, video signals from the first location may be displayed on each of a plurality of video conferencing systems in the single window mode. In other words, if a participant's accumulated signal exceeds some value, e.g., if the participant speaks enough to surpass his respective accumulation threshold, each of the other participants, i.e., the listening participants, may view that single speaker. The talking participant, however, may view a continuous presence mode, e.g., he may see all of the other participants or, alternatively, a subset therefrom.
  • Alternatively, if the accumulated amount of audio signal corresponding to any location does not exceed the accumulation threshold, video signals from a plurality of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode. Said another way, if no one in the video conference is speaking in an uninterrupted manner for a certain threshold amount of audio signal (e.g., energy of the audio signal), the participants may view a continuous presence display mode comprising a subset or all of the participants on their display.
	• In one embodiment, if the accumulated amount of audio signal corresponding to a subset of locations repeatedly exceeds the accumulation threshold, video signals from that respective subset of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode. In other words, if participants from a certain subset of participant locations are doing all of the talking, i.e., exceeding a common accumulation threshold (or respective accumulation thresholds), this subset of the talking participants may be displayed on each of the participants' displays. Alternatively, the participants' displays may show each of the talking participants singly, and intelligently switch between each of the talking participants throughout the conversation.
	• In embodiments utilizing an accumulation threshold, the method may also include modifying, e.g., raising, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has not exceeded the respective accumulation threshold within a predetermined amount of time. The method may also modify, e.g., lower, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has recently exceeded the respective accumulation threshold within a predetermined amount of time. In other words, the accumulation thresholds may be variable, i.e., may dynamically change, throughout the duration of the video conference. For example, the accumulation thresholds may vary differently depending on whether the respective participant has spoken within some predetermined amount of time.
  • The accumulation thresholds may also vary with respect to each participant, i.e., each participant may have his own threshold that may vary independently from the other participants' thresholds. In one embodiment, each participant's threshold may be normalized with respect to the average audio level of each participant. For example, quieter participants may have lower thresholds than louder participants. Such an example will be described in more detail hereinbelow.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • A better understanding of the present invention may be obtained when the following detailed description is considered in conjunction with the following drawings, in which:
  • FIGS. 1 a and 1 b illustrate examples of continuous presence and single speaker modes for video conference displays;
  • FIG. 2 illustrates a video conferencing system, according to one embodiment;
  • FIG. 3 illustrates a participant location or conferencing unit, according to one embodiment;
  • FIG. 4 illustrates a network and local system for use in video conferencing, according to one embodiment;
  • FIG. 5 is a flowchart illustrating an exemplary method for controlling video display modes in a video conferencing system, according to one embodiment;
  • FIG. 6 illustrates an audio signal integrated above a threshold, according to one embodiment;
  • FIG. 7 illustrates two respective audio signals integrated above a fixed threshold, according to one embodiment;
  • FIG. 8 illustrates a display mode according to the integrated audio signals, according to one embodiment;
  • FIG. 9 illustrates two respective audio signals integrated above a variable audio threshold, according to one embodiment; and
  • FIGS. 10 a-c illustrate various embodiments of continuous presence screens.
  • While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. Note, the headings are for organizational purposes only and are not meant to be used to limit or interpret the description or claims. Furthermore, note that the word “may” is used throughout this application in a permissive sense (i.e., having the potential to, being able to), not a mandatory sense (i.e., must). The term “include”, and derivations thereof, mean “including, but not limited to”. The term “coupled” means “directly or indirectly connected”.
	• DETAILED DESCRIPTION OF THE EMBODIMENTS
	• INCORPORATION BY REFERENCE
  • U.S. Pat. No. 6,744,460 titled “Video Display Mode Automatic Switching System and Method” is hereby incorporated by reference as though fully and completely set forth herein.
	• U.S. Patent Application titled “Speakerphone”, Ser. No. 11/251,084, which was filed Oct. 14, 2005, whose inventor is William V. Oxford, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
  • U.S. Patent Application titled “Video Conferencing System Transcoder”, Ser. No. 11/252,238, which was filed Oct. 17, 2005, whose inventors are Michael L. Kenoyer and Michael V. Jenkins, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
	• U.S. Patent Application titled “Speakerphone Supporting Video and Audio Features”, Ser. No. 11/251,086, which was filed Oct. 14, 2005, whose inventors are Michael L. Kenoyer, Craig B. Malloy, and Wayne E. Mock, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
  • U.S. Patent Application titled “High Definition Camera Pan Tilt Mechanism”, Ser. No. 11/251,083, which was filed Oct. 14, 2005, whose inventors are Michael L. Kenoyer, William V. Oxford, Patrick D. Vanderwilt, Hans-Christoph Haenlein, Branko Lukic and Jonathan I. Kaplan, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
  • FIG. 2—Video Conferencing System
  • FIG. 2 illustrates an embodiment of a video conferencing system 100. Video conferencing system 100 may include a network 101, endpoints 103A-103H (e.g., audio and/or video conferencing systems), gateways 130A-130B, a service provider 107 (e.g., a multipoint control unit (MCU)), a public switched telephone network (PSTN) 120, conference units 105A-105D, and plain old telephone system (POTS) telephones 106A-106B. Endpoints 103C and 103D-103H may be coupled to network 101 via gateways 130A and 130B, respectively, and gateways 130A and 130B may each include a firewall, a network address translator (NAT), a packet filter, and/or proxy mechanisms, among others. Conference units 105A-105B and POTS telephones 106A-106B may be coupled to network 101 via PSTN 120. In some embodiments, conference units 105A-105B may each be coupled to PSTN 120 via an Integrated Services Digital Network (ISDN) connection, and each may include and/or implement H.320 capabilities. In various embodiments, video and audio conferencing may be implemented over various types of networked devices.
  • In some embodiments, endpoints 103A-103H, gateways 130A-130B, conference units 105C-105D, and service provider 107 may each include various wireless or wired communication devices that implement various types of communication, such as wired Ethernet, wireless Ethernet (e.g., IEEE 802.11), IEEE 802.16, paging logic, RF (radio frequency) communication logic, a modem, a digital subscriber line (DSL) device, a cable (television) modem, an ISDN device, an ATM (asynchronous transfer mode) device, a satellite transceiver device, a parallel or serial port bus interface, and/or other type of communication device or method.
  • In various embodiments, the methods and/or systems described may be used to implement connectivity between or among two or more participant locations or endpoints, each having voice and/or video devices (e.g., endpoints 103A-103H, conference units 105A-105D, POTS telephones 106A-106B, etc.) that communicate through various networks (e.g., network 101, PSTN 120, the Internet, etc.).
  • Endpoints 103A-103C may include voice conferencing capabilities and include or be coupled to various audio devices (e.g., microphones, audio input devices, speakers, audio output devices, telephones, speaker telephones, etc.). Endpoints 103D-103H may include voice and video communications capabilities (e.g., video conferencing capabilities) and include or be coupled to various audio devices (e.g., microphones, audio input devices, speakers, audio output devices, telephones, speaker telephones, etc.) and include or be coupled to various video devices (e.g., monitors, projectors, displays, televisions, video output devices, video input devices, cameras, etc.). In some embodiments, endpoints 103A-103H may comprise various ports for coupling to one or more devices (e.g., audio devices, video devices, etc.) and/or to one or more networks.
  • Conference units 105A-105D may include voice and/or video conferencing capabilities and include or be coupled to various audio devices (e.g., microphones, audio input devices, speakers, audio output devices, telephones, speaker telephones, etc.) and/or include or be coupled to various video devices (e.g., monitors, projectors, displays, televisions, video output devices, video input devices, cameras, etc.). In some embodiments, endpoints 103A-103H and/or conference units 105A-105D may include and/or implement various network media communication capabilities. For example, endpoints 103A-103H and/or conference units 105C-105D may each include and/or implement one or more real time protocols, e.g., session initiation protocol (SIP), H.261, H.263, H.264, H.323, among others.
  • In various embodiments, a codec may implement a real time transmission protocol. In some embodiments, a codec (which may be short for “compressor/decompressor”) may comprise any system and/or method for encoding and/or decoding (e.g., compressing and decompressing) data (e.g., audio and/or video data). For example, communication applications may use codecs to convert an analog signal to a digital signal for transmitting over various digital networks (e.g., network 101, PSTN 120, the Internet, etc.) and to convert a received digital signal to an analog signal. In various embodiments, codecs may be implemented in software, hardware, or a combination of both. Some codecs for computer video and/or audio may include MPEG, Indeo, and Cinepak, among others.
	• At least one of the participant locations may include a camera for acquiring high resolution or high definition (e.g., HDTV compatible) signals. At least one of the participant locations may include a high definition display (e.g., an HDTV display), for displaying received video signals in a high definition format. In one embodiment, the network 101 may provide a bandwidth of 1.5 Mbps or less (e.g., T1 or less). In another embodiment, the network bandwidth is 2 Mbps or less.
  • FIG. 3—Participant Location
  • FIG. 3 illustrates an embodiment of a participant location, also referred to as an endpoint or conferencing unit (e.g., a video conferencing system). In some embodiments, the video conference system may have a system codec 209 to manage both a speakerphone 205/207 and a video conferencing system 203. For example, a speakerphone 205/207 and a video conferencing system 203 may be coupled to the integrated video and audio conferencing system codec 209 and may receive audio and/or video signals from the system codec 209.
  • In some embodiments, the participant location may include a high definition camera 204 for acquiring high definition images of the participant location. The participant location may also include a high definition display 201 (e.g., a HDTV display). High definition images acquired by the camera may be displayed locally on the display and may also be encoded and transmitted to other participant locations in the video conference.
  • The participant location may also include a sound system 261. The sound system 261 may include multiple speakers including left speakers 271, center speaker 273, and right speakers 275. Other numbers of speakers and other speaker configurations may also be used. In some embodiments, the video conferencing system may include a camera 204 for capturing video of the conference site. In some embodiments, the video conferencing system may include one or more speakerphones 205/207 which may be daisy chained together.
  • The video conferencing system components (e.g., the camera 204, display 201, sound system 261, and speakerphones 205/207) may be coupled to a system codec 209. The system codec 209 may receive audio and/or video data from a network. The system codec 209 may send the audio to the speakerphone 205/207 and/or sound system 261 and the video to the display 201. The received video may be high definition video that is displayed on the high definition display. The system codec 209 may also receive video data from the camera 204 and audio data from the speakerphones 205/207 and transmit the video and/or audio data over the network to another conferencing system. In some embodiments, the conferencing system may be controlled by a participant through the user input components (e.g., buttons) on the speakerphone and/or remote control 250. Other system interfaces may also be used.
  • FIG. 4 illustrates an exemplary embodiment of a video conferencing system comprising a plurality of participants located at respective endpoints. As shown, the video conferencing system includes a local participant 407 and one or more remote participants 401, 403 and 405. Each participant 401-407 may be at a respective location or endpoint. Each location may include video conferencing equipment, such as the equipment described regarding FIG. 3.
  • The various participants in the video conference may communicate over a transmission medium or network 409. The network 409 may be any of various types suitable for transmission of video and audio data between the participant locations. In one embodiment, the network is or includes a wide area network, such as the Internet. The network 409 may also include various other types of communication systems, such as ISDN (Integrated Services Digital Network), the PSTN (Public Switched Telephone Network), LANs (local area networks) and/or other types of WANs.
  • Each of the participants may be coupled to a control unit, e.g., a multipoint control unit (MCU). The MCU may comprise processor 417 and memory 419. In one embodiment, the MCU may be coupled to memory 419 via transmission media. Note that the system and method described herein may utilize suitable types of control units other than the MCU; the MCU is exemplary only, and in fact, other control units are envisioned.
	• In some embodiments, the MCU may be comprised in a server. Each of the participants' endpoints may be coupled to the MCU via a network such as network 101. In one embodiment, the server may be an Internet-hosted web server capable of providing video conferencing services to end users.
  • Alternatively, at least one of the participant locations may comprise the MCU. The MCU may operate to receive audio and video signals from each of the participant locations and selectively combine the signals for output to the various participant locations. In some embodiments, the MCU may operate to selectively provide different combinations of signals for different display modes. For example, in a single speaker display mode, where a participant from one location is talking, the MCU may operate to send the video signal of that participant to each of a subset or all of the participant locations. In a continuous presence display mode, where multiple participants are conversing, the MCU may operate to combine the video signals of a subset of the participants and provide this combined signal to each of the participant locations.
  • In one embodiment, the system is operable to intelligently select a video display mode based on the received audio signals from one or more of the participant locations.
  • FIG. 5 is a flowchart illustrating an exemplary method for controlling video display modes in a video conferencing system, according to one embodiment. It should be noted that in various embodiments of the methods described below, one or more of the elements described may be performed concurrently, in a different order than shown, or may be omitted entirely. Other additional elements may also be performed as desired.
  • In 502, an audio signal from each of a plurality of video conferencing system locations may be received. The audio signal may be from a single speaker at a respective party location or from multiple speakers at that party location. In one embodiment, the audio signals may be received by an MCU, and the MCU may be operable to perform the reception via network cables or other transmission media as described above. For example, in FIG. 4, the MCU may receive audio signals from each of local participant 407 and remote participants 401, 403, and 405.
  • In 504, an accumulated amount of the audio signal may be determined from each of one or more of the audio signals. Determining the accumulated amount of audio signal may be performed by determining a signal metric for each of one or more of the respective audio signals using an integrated form of the respective signal. More specifically, determining the accumulated amount of audio signal may include integrating each of the one or more audio signals from the plurality of video conferencing systems to generate respective accumulated amounts of audio signal. In some embodiments, the MCU may implement signal integrator 411 to perform the determination of the accumulated amount of the audio signal.
  • Said another way, the MCU and coupled components may operate to analyze incoming audio signals in order to determine the accumulated amount of audio signal for each participant or participant location. In some embodiments, the signal may be manipulated through various available methods to provide desirable processed signals. For example, incoming audio signals may be processed such that they are always positive. FIG. 6 illustrates such a signal. As further examples, the absolute value, the root-mean square (rms), or the square of the signal (providing the signal's energy), may be taken to provide positively valued signals. As another example, the signals may be smoothed to facilitate integration or accumulation computations.
	• The processed or unprocessed signals may be integrated using any suitable methods for integration. For example, the signal might be sampled at given lengths or intervals, approximated using Riemann sums, the trapezoidal rule, or Simpson's rule, or processed using other appropriate techniques as desired. In some embodiments, the accumulated amount of audio signal may be determined using other methods. For example, the volume or intensity of the signal may be measured via averaging methods, e.g., average amplitude or decibels. Note that in the systems and methods disclosed herein, integration is not limited to those methods described above, and in fact, may refer to any suitable methods for measuring accumulated audio signal. In other words, determining the accumulated amount of audio signal may comprise performing various other signal processing methods on the received audio signal.
  • Thus, determining the accumulated amount of audio signal may include integrating (or approximating the integration of) various forms of the signal to provide accumulated energy, power, rms, absolute value, intensity, or other desirable signal metrics of the audio signal. As another example, changes in amplitude may be integrated and/or tracked (e.g., the changes in amplitude of a person's voice may be integrated).
  • In some embodiments, the signals may only be processed and/or integrated when exceeding an audio level. More specifically, the signal integrator may begin measuring (or accumulating) the accumulated audio signal once a minimum audio level has been reached. The level above which the signal may be integrated is herein referred to as an audio threshold. FIG. 6 illustrates an exemplary signal exceeding an audio threshold. The signal, shown in FIG. 6 in a signal level 607 versus time 609 plot, exceeds audio threshold 603 and may be integrated over the area 605. As FIG. 6 further shows, signals below the audio threshold, such as 601, may not be integrated. Thus, determining the accumulated amount of the audio signal may occur after the audio signal has exceeded an audio threshold. In some embodiments, the audio signal may only continue to be accumulated while the audio signal remains above the audio threshold without “significant” interruption.
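	• The above-threshold accumulation illustrated in FIG. 6 can be sketched as a simple discrete integration. The following is a minimal, hypothetical sketch; the function name, sample values, threshold, and sample period are illustrative assumptions, not values from this disclosure:

```python
# Hypothetical sketch: Riemann-sum accumulation of the portion of a rectified
# (always-positive) audio signal above an audio threshold, as in FIG. 6.
# Sample values, threshold, and the 10 ms sample period are illustrative.

def accumulate_above_threshold(samples, audio_threshold, sample_period):
    accumulated = 0.0
    for level in samples:
        # Only signal above the threshold contributes (cf. area 605 of FIG. 6).
        if level > audio_threshold:
            accumulated += (level - audio_threshold) * sample_period
    return accumulated

levels = [0.1, 0.2, 0.9, 1.1, 1.3, 0.8, 0.2]  # rectified signal levels
total = accumulate_above_threshold(levels, audio_threshold=0.5, sample_period=0.01)
```

In this sketch, samples below the audio threshold (like signal 601 of FIG. 6) contribute nothing to the accumulated amount.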
	• In one embodiment, the accumulated amount of audio signal from each of one or more of the audio signals may be an uninterrupted accumulated amount of audio signal. In other words, the accumulation of a respective audio signal may be restarted each time the audio signal stops, e.g., when the level of the respective audio signal goes below the audio threshold for a certain time period. Said another way, the system may begin accumulating an audio signal when the speaker begins to talk and end the accumulation of the audio signal when the respective speaker stops speaking or is interrupted. Thus, in a video conference with a lot of back and forth talking where the participants do not exceed their respective accumulation thresholds before being interrupted, the system may remain in continuous presence mode.
	• In some embodiments, an interruption may have to exceed an interruption threshold to end the accumulation of the audio signal of the currently speaking participant. In other words, the audio signal may continue to be accumulated as long as no “significant” interruption occurs. For example, in a video conference where one participant begins to speak, and another participant coughs or interjects a brief comment, e.g., “yes”, “I agree”, etc., the system may continue to integrate the speaking participant's signal because the noise or comment from the other participant did not exceed the interruption threshold. Thus, interjections below the interruption threshold may not hinder the system from switching from the previous display mode, e.g., continuous presence mode, to the new display mode, e.g., the single window display of the currently speaking participant. Thus, the system may intelligently filter interruptions and integrate audio signals in a desirable manner.
	• The interruption threshold may be based on the accumulated audio signal of the interruption, or may be time based. Thus, in one embodiment, if the accumulated audio signal of the “interruption” is less than an interruption threshold, then the “interruption” is ignored, and the audio signal currently being accumulated continues to be accumulated. In another embodiment, if the “interruption” is shorter than an interruption threshold time period, then the “interruption” is ignored, and the audio signal currently being accumulated continues to be accumulated. As used herein, the term “significant interruption” may refer to an interruption whose accumulated amount exceeds a certain percentage (e.g., 2%, 4%, 5%, 7%) of the accumulation threshold for determining display mode. Alternatively, the term “significant interruption” may refer to an interruption whose accumulated energy exceeds an amount equivalent to 2 seconds of a normal talking voice, or 1.5 seconds of a raised talking voice.
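	• The interruption handling described above can be sketched with a time-based interruption threshold, where accumulation restarts only when the signal stays below the audio threshold longer than the interruption threshold. All names and values below are illustrative assumptions:

```python
# Hypothetical sketch: accumulation that restarts only after a "significant"
# interruption, modeled here as continuous time below the audio threshold
# exceeding an interruption threshold. All names and values are illustrative.

def accumulate_with_interruptions(samples, audio_threshold,
                                  interruption_threshold, sample_period):
    accumulated = 0.0
    below_time = 0.0  # continuous time spent below the audio threshold
    for level in samples:
        if level > audio_threshold:
            accumulated += (level - audio_threshold) * sample_period
            below_time = 0.0
        else:
            below_time += sample_period
            if below_time > interruption_threshold:
                accumulated = 0.0  # significant interruption: restart
    return accumulated

# A brief dip (one 10 ms sample) is ignored; a long pause restarts accumulation.
brief_dip = [1.0, 1.0, 0.1, 1.0]
long_pause = [1.0] + [0.1] * 6 + [1.0]
```

With these example values, a brief cough or interjection leaves the accumulation intact, while a pause longer than the interruption threshold restarts it, mirroring the filtering behavior described above.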
	• In some embodiments, rules may be used (e.g., predetermined and/or provided by a conference participant) to determine when to accumulate energy. Rules may be threshold based. For example, when the audio is below a first threshold, no energy is integrated. When above the first threshold but below a second threshold, a percentage of the audio is integrated, etc. Rules may also be based on how quickly (or slowly) the audio is fluctuating between various thresholds. For example, if a participant's voice suddenly shifts above a high threshold, the audio may be integrated at a higher percentage (which may exceed 100% in some embodiments). This may allow more emphasis to be given to a participant who suddenly begins shouting. In some embodiments, audio exceeding a threshold may not be integrated above the threshold. For example, the audio may be integrated under the threshold but not over it. This may prevent the system from switching too quickly to naturally loud speakers. In some embodiments, the system may adapt the rules throughout the conference based on factors such as time-averaged participant audio levels.
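	• The threshold-based rules above might be sketched as a piecewise weighting of the audio level before integration; the thresholds, percentage, and boost factor below are illustrative assumptions:

```python
# Hypothetical sketch of tiered accumulation rules: nothing is integrated
# below a first threshold, a percentage between the first and second
# thresholds, and a boosted weight (exceeding 100%) above the second,
# e.g., for sudden shouting. All thresholds and weights are illustrative.

def rule_based_increment(level, t1=0.3, t2=0.8, partial=0.5, boost=1.25):
    if level < t1:
        return 0.0              # below the first threshold: ignore
    if level < t2:
        return partial * level  # between thresholds: integrate a percentage
    return boost * level        # above the second threshold: weight > 100%

sample_period = 0.01
accumulated = sum(rule_based_increment(x) * sample_period
                  for x in [0.1, 0.5, 0.9])
```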
	• In some embodiments, the signal metric may be constrained to utilize certain types of audio signals (such as human voices) and/or to reject other types of audio signals (such as fan noise or paper shuffling). For example, the audio may be processed to detect human voices and the corresponding signal metric may comprise the human voice component. This may allow human voices to be tracked and integrated without including extraneous noise. For example, a loud air conditioner switching on at a remote conference site may be ignored by the system because the dominant frequencies of the air conditioner noise do not match human voice frequencies. In some embodiments, the system may integrate only audio of frequencies in a certain range (e.g., a range dominated by human voice). In some embodiments, the system may integrate audio that comprises fundamental harmonics (e.g., characteristic of human voice). In some embodiments, the system may identify and track the voices of different participants. In some embodiments, different weights may be given to different voices for the integration. For example, the voice of the leader of the conference may be weighted during integration so the system switches to him/her (or stays on them) more often.
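	• As one highly simplified sketch of constraining the signal metric to voice-like audio, a window's dominant frequency may be estimated from its zero-crossing rate and compared against an assumed human voice band; a real system would likely use proper spectral analysis, and the band limits, sample rate, and test tones below are illustrative assumptions:

```python
import math

# Hypothetical sketch: reject non-voice audio (e.g., machinery hum) by
# estimating a window's dominant frequency from its zero-crossing rate.
# Band limits and sample rate are illustrative assumptions.

def zero_crossing_frequency(window, sample_rate):
    # Each full cycle of a roughly periodic signal yields two zero crossings.
    crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    return crossings * sample_rate / (2.0 * len(window))

def is_voice_like(window, sample_rate, low_hz=85.0, high_hz=3400.0):
    return low_hz <= zero_crossing_frequency(window, sample_rate) <= high_hz

sample_rate = 8000
# A 200 Hz tone stands in for a voice; a 50 Hz hum for machinery noise.
voice = [math.sin(2 * math.pi * 200 * n / sample_rate + 0.1) for n in range(400)]
hum = [math.sin(2 * math.pi * 50 * n / sample_rate + 0.1) for n in range(400)]
```

Only windows classified as voice-like would then be passed to the integration step; everything else (such as the air conditioner example above) would be excluded from the signal metric.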
  • In 506, a display mode from two or more possible display modes for at least one of the video conferencing system locations may be determined based on the accumulated amount of the audio signal from each of one or more of the audio signals. In other words, the system may choose from a plurality of display modes for each of the participants based on the accumulated amount of audio signal being generated by the participants. In some embodiments, the possible display modes comprise a single window display mode and a multiple window display mode. The multiple window display mode may comprise the continuous presence display mode described hereinabove. As noted above, the continuous presence display mode may comprise a display with a subset or all of the participants in the video conference as will be described in more detail below.
  • In some embodiments, the method may compare an accumulated amount of the audio signal from one or more of the audio signals with at least one accumulation threshold, where the display mode may be determined based on the comparing. Said another way, the MCU may use signal integrator 411 to determine if a participant has accumulated audio signal above a certain level, i.e., an accumulation threshold. As used herein, the accumulation threshold corresponds to the level of accumulated audio signal after which the display mode is changed. For example, if a participant begins to talk, the system may switch the other participants' displays to the speaking participant only after the speaking participant has accumulated enough audio signal to exceed the accumulation threshold. The accumulation threshold will be discussed in more detail below with regard to FIGS. 7 and 8.
  • The value of the accumulation threshold may be static, may be set by an administrator or moderator, or may be set by one participant, or may be set by each participant. In one embodiment, the value of the accumulation threshold may be set to approximate a normal talking voice with an 8 second time duration, or a loud talking voice with a 6 second time duration.
  • In some embodiments, if the accumulated amount of audio signal corresponding to a first location exceeds an accumulation threshold, video signals from the first location may be displayed on each of a plurality of video conferencing systems in the single window mode. In other words, if a participant's accumulated signal exceeds some value, e.g., if the participant speaks enough to surpass his respective accumulation threshold, each of the other participants, i.e., the listening participants, may view that single speaker. Alternatively, the other participants may view that speaker in combination with a subset of the other participants. The talking participant, however, may view a continuous presence mode, e.g., he may see all of the other participants or, alternatively, a subset therefrom. In one embodiment, a subset or any of the participants may be able to choose the subset of the participants that may be viewed or may set a desired display mode independent of any determination of the accumulated audio signal. The MCU may utilize a mode switch 415 function to implement the display change for each of the participants.
  • Alternatively, if the accumulated amount of audio signal corresponding to any location does not exceed the accumulation threshold, video signals from a plurality of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode. Said another way, if no one in the video conference is speaking in an uninterrupted manner for a certain threshold amount of audio signal (e.g. energy), the participants may view a continuous presence display mode comprising a subset or all of the participants on their display.
	• In one embodiment, if the accumulated amount of audio signal corresponding to a subset of locations is determined to repeatedly exceed the accumulation threshold, video signals from that respective subset of locations may be displayed on each of a plurality of video conferencing systems in a continuous presence mode. In other words, if participants from a certain subset of participant locations are doing all of the talking, i.e., exceeding a common (or respective) accumulation threshold, this subset of the talking participants may be displayed on each of the participants' displays. Alternatively, the participants' displays may show each of the talking participants singly, and intelligently switch between each of the talking participants throughout the conversation. In some embodiments, the talking participants and the listening participants may view different displays. For example, the talking participants may view all of the listening participants, the other talking participants, or all of the participants in the video conference. Similarly, the listening participants may view all of the talking participants, the currently talking participant, or a subset or all of the participants in the video conference. Note that the displays for the talking and listening participants are not limited to the displays described above, and in fact, other displays are contemplated. In some embodiments, the talking and listening participants may be able to manually choose between a plurality of views to be displayed. In one embodiment, only one audio signal may exceed the accumulation threshold at any given time.
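	• The display mode selection described above can be sketched as a simple decision over per-location accumulated amounts; the function name, location identifiers, and threshold value are illustrative assumptions:

```python
# Hypothetical sketch: an MCU-style decision choosing between a single window
# display mode and a continuous presence display mode from each location's
# accumulated audio signal. Names and values are illustrative.

def select_display_mode(accumulated, accumulation_threshold):
    """accumulated: dict mapping a location id to its uninterrupted
    accumulated amount of audio signal."""
    talkers = [loc for loc, amount in accumulated.items()
               if amount > accumulation_threshold]
    if len(talkers) == 1:
        # Listening locations see the single talker; the talker may keep
        # a continuous presence view of the other participants.
        return ("single_window", talkers[0])
    # No location (or more than one) exceeded the threshold:
    # continuous presence for everyone.
    return ("continuous_presence", None)

mode = select_display_mode({"A": 2.4, "B": 0.3, "C": 0.1},
                           accumulation_threshold=1.0)
```

In a fuller sketch the result would drive the MCU's mode switch function (e.g., mode switch 415 of FIG. 4), composing either the single speaker's video or a combined continuous presence layout for each endpoint.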
	• In embodiments utilizing an accumulation threshold, the method may also include modifying, e.g., raising, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has not exceeded the respective accumulation threshold within a predetermined amount of time. The method may also modify, e.g., lower, a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has recently exceeded the respective accumulation threshold within a predetermined amount of time. In other words, the accumulation thresholds may be variable, i.e., may dynamically change, throughout the duration of the video conference. Additionally, the accumulation thresholds may vary differently depending on whether the respective participant has spoken within some predetermined amount of time.
  • The accumulation thresholds may vary with respect to each participant, i.e., each participant may have his own threshold that may vary independently from the other participants' thresholds. In one embodiment, each participant's threshold may be normalized with respect to each participant. For example, quieter participants may have lower thresholds than louder participants. Such an example will be described in more detail hereinbelow.
  • As described above, the accumulated amount of audio signal from each participant may be measured to determine when the video conferencing system should switch between two speakers and/or switch between single speaker mode and continuous presence (multiple speaker) mode. FIGS. 7 and 8 illustrate an example where the use of accumulated audio signal 605, rather than time 609, provides improvements to display mode switching as outlined below.
  • In some embodiments, the system may determine when a single speaker is presumed to be talking, e.g., when the volume or amplitude level 607 of the audio signal from one participant location is above a certain audio threshold 603, or greater than that of the other locations by a certain threshold or ratio. When a single speaker is determined to be talking, as illustrated in FIG. 7 in time segment A, the system may begin to integrate or sample the audio or voice signal received from that user or that location. When a certain amount of audio signal has been generated or accumulated by the integration, such as in the integrated area before 702 in FIG. 7 for participant 151, the system may presume that the user has generated a sufficient amount of accumulated audio signal and may be a single talking user. At this point, the system may switch from continuous presence mode, illustrated in FIG. 8 during time segment A as 801, where a subset or all of the participants are displayed, to a single speaker mode 803, where only the single speaker, in this case participant 151, may be displayed. The display of the talking participant may remain in continuous presence mode to allow the talking participant to view a plurality of other participants, while the displays of the other participants may be switched to the location of the talking participant.
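The integration step above, and the resulting property that a louder speaker triggers the switch sooner, can be sketched in a few lines. This is a simplified model under stated assumptions (discrete amplitude samples, integration of only the area above the audio threshold); the function name and parameters are illustrative:

```python
def accumulate_and_switch(samples, audio_threshold, accumulation_threshold):
    """Integrate the portion of each amplitude sample above the audio
    threshold and return the sample index at which the accumulation
    threshold is crossed (i.e., when the displays would switch to single
    speaker mode), or None if it is never crossed."""
    accumulated = 0.0
    for i, amplitude in enumerate(samples):
        if amplitude > audio_threshold:
            accumulated += amplitude - audio_threshold  # area above threshold
        if accumulated >= accumulation_threshold:
            return i
    return None
```

Because the integral grows with amplitude, a participant speaking at amplitude 5.0 crosses the same accumulation threshold in fewer samples than one speaking at 2.0, which is exactly the behavior contrasted with time-based switching in FIGS. 7 and 8.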
  • Note that this method does not measure the amount of time that a participant has been speaking, but rather the amount of accumulated audio signal generated by the remote location. Thus, if a participant is speaking very loudly (or, where the thresholds are normalized, more loudly than normal), the system may switch the other participants' displays to the talking participant faster than if the talking participant were speaking more softly. Such a situation is illustrated in the transition 704 to time segment C in FIGS. 7 and 8. In this instance, participant 157 generates the threshold amount of accumulated audio signal in a smaller amount of time than participant 151 did during time segment A. The single window display mode therefore transfers to participant 157 more quickly than it had previously for participant 151 because of participant 157's louder speaking volume. Moreover, if a participant begins shouting in the video conference, the system will switch the other participants' displays to the shouting participant even faster. This occurs because the system measures the accumulated audio signal, essentially the amount of audio signal produced, as opposed to prior art methods, which simply measure the length of time a participant speaks.
  • Finally, when no participant speaks, as illustrated in time segment D of FIGS. 7 and 8, the system may switch the participants' displays back to continuous presence mode. Thus, the present method provides a significant improvement over prior time-based methods, in that it switches the participants' displays to a loudly speaking participant more quickly.
  • In some embodiments, the system may adjust the accumulation threshold of each participant based on the participant's total accumulated audio signal, i.e., the sum of all the accumulated audio signals from that participant. Thus, participants who are speaking more in the video conference may have their accumulation thresholds lowered, while participants who are speaking less or not at all may have their accumulation thresholds raised. As a result, the system may switch a plurality of participants' displays to those participants who are speaking more, or more often, in a faster or more responsive manner than to participants who are speaking less. Conversely, the system may switch to those participants who are speaking less in a slower or less responsive manner, presuming that these less-talkative participants may not speak for very long or often. In some embodiments, the accumulation thresholds may be adjusted each time the system switches to a new speaker. The thresholds may also be adjusted after a predetermined amount of time for each participant, e.g., long enough to predict the participant's long-term behavior.
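One plausible (but purely illustrative) realization of this rebalancing scales each threshold inversely with how much total audio signal the participant has produced relative to the conference average; the function name, the inverse-mean scaling rule, and the fallback for silent participants are all assumptions:

```python
def rebalance_thresholds(base, totals):
    """Scale each participant's accumulation threshold inversely with the
    total accumulated audio signal that participant has produced so far.

    Frequent talkers end up below `base` (more responsive switching);
    quiet participants end up above it.  A participant with no signal at
    all is given double the base threshold as a conservative default.
    """
    mean = sum(totals.values()) / len(totals)
    return {p: base * mean / total if total > 0 else base * 2
            for p, total in totals.items()}
```

Re-running this at each speaker switch, or after a predetermined interval, yields the long-term behavior described above.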
  • In situations where two of the participants are having a dialog, e.g., two people or two participant locations are in a discussion, conversation focus tends to go back and forth between those two people or locations. In one embodiment, the system tracks which participants are talking.
  • If the system determines that two of the plurality of participants (or participant locations) are engaging in a conversation, the system may lower the accumulation threshold required to display a single talking participant. Thus, when a first participant of these two participants begins talking, the system may show the first participant more quickly. Similarly, after the first participant stops talking and the second participant begins to talk, the system may switch to that second participant in single presentation mode more quickly. Thus, the system may essentially ping-pong back and forth between each of the two talking participants. In other words, after one of the participants stops talking and the other participant starts talking, the system may switch to the single presentation mode of the talking participant substantially immediately, e.g., within a second or two.
  • In another embodiment, as noted above, when the system detects that two (or a subset) of the participants are doing all of the talking, i.e., only their audio signals are exceeding the accumulation threshold, the system may show this subset of participants in continuous presence mode. Thus, in some embodiments, when the system detects that two of the participants are having a dialog, the system may display these two participants in a dual split display mode. For example, if six different participant locations are participating in the video conference, but participants A and B are dominating the conversation, the accumulation thresholds for participants A and B may be lowered. Therefore, when either participant A or B begins speaking, the system may quickly switch to this dual display mode. In some embodiments, participants A and B may have two associated accumulation thresholds. The system may display the dual display mode using the lowered accumulation threshold, and then later switch to the single speaker mode for one of the participants upon reaching a second, higher accumulation threshold. In other words, the system may intelligently switch to a single speaker mode if the second participant in the two-person dialog is no longer responding to the conversation.
  • When another one of the participants (participant C) begins speaking, a first (greater) accumulation threshold may be required to switch from the dual-display mode to a continuous presence mode (where the three talking participants or all six participants are displayed). A second (and even greater) accumulation threshold may be required to switch from continuous presence mode to single speaker mode for participant C.
  • Thus, the algorithm may be intelligent, e.g., using heuristics, to know that, for example, two speakers in the past ten minutes have been the dominant speakers; so, when one accumulates even a small amount of accumulated audio signal, the system may switch to that single speaker, or switch to a dual-speaker mode, much more quickly.
  • As described above, others of the participants that are not engaged in this two-person dialog may not have this lowered accumulation threshold. Thus, when a third participant that is not part of this two-person dialog begins speaking, this third participant must generate a greater amount of audio signal energy before the display switches to either a continuous presence mode or single speaker mode view of this third participant. As noted above, participants other than the two dominant participants may also have two different accumulation thresholds, a first to go from dual display mode of the two dominant speakers to continuous presence mode, and a second accumulation threshold to go to single speaker mode for that participant.
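The two-tier threshold behavior described in the preceding paragraphs might be sketched as follows. The function name, the mode labels, and the convention that non-dominant speakers must reach the higher threshold before any switch are illustrative assumptions rather than the patented implementation:

```python
def dialog_display_mode(speaker, accumulated, dominant, low, high):
    """Apply two-tier accumulation thresholds during a two-person dialog.

    `dominant` is the pair of participants doing most of the talking;
    `low` < `high` are their two associated accumulation thresholds.
    A dominant speaker reaching `low` triggers the dual split display,
    and reaching `high` (the other party having gone quiet) triggers
    single speaker mode.  Any other speaker must reach `high` before
    the view changes, here to a continuous presence mode.
    """
    if speaker in dominant:
        if accumulated >= high:
            return 'single'      # the other dominant speaker stopped responding
        if accumulated >= low:
            return 'dual'        # quickly show the two-person dialog
    elif accumulated >= high:
        return 'continuous'      # a third speaker has broken into the dialog
    return 'unchanged'
```

This captures the "ping-pong" responsiveness for the dominant pair while keeping the display stable against brief interjections from other locations.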
  • In some embodiments, each participant may have an independent audio threshold. For example, participant 151 may have audio threshold 603A, and participant 157 may have audio threshold 603B. Independent thresholds may be desirable when, for example, a first participant is in a noisy environment. In such environments, a larger audio threshold may allow the MCU to properly determine when the first participant is speaking; i.e., the larger audio threshold may prevent the MCU from mistaking background noise for the participant's voice. If a second participant is in a quiet environment, however, it may be desirable for the second participant to have a much lower audio threshold than the first participant. In some embodiments, each participant's independent audio threshold may be normalized with respect to that participant. For example, a first participant may have a louder normal speaking volume than a second participant; in this case, the first participant may have a higher audio threshold than the second participant. Thus, quieter participants, such as the second participant, need not speak louder than normal to exceed their respective audio thresholds.
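A minimal sketch of such normalization, assuming each participant's typical speaking level can be estimated (e.g., from a running average), is shown below; the function name and the fixed fraction `ratio` are assumptions for illustration:

```python
def normalized_audio_thresholds(typical_levels, ratio=0.6):
    """Set each participant's audio threshold as a fixed fraction of that
    participant's typical speaking level, so a quiet speaker gets a lower
    threshold than a loud speaker (or one in a noisy environment) and
    neither has to speak louder than normal to be detected."""
    return {p: level * ratio for p, level in typical_levels.items()}
```

The per-participant thresholds produced here would then gate the integration step described earlier, so each location's accumulation starts from its own detection level.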
  • In some embodiments, similar to the accumulation threshold, the audio threshold for each participant may vary throughout a video conference. FIG. 9 illustrates two respective audio signals integrated above a variable audio threshold, according to one embodiment. In some embodiments, the thresholds may be continuous. In other embodiments, the thresholds may be defined as a piece-wise function such as that in FIG. 9. In one embodiment, the threshold may vary with respect to whether the participant is speaking; for example, participant 151's threshold may decrease while participant 151 is speaking, as in 903, and may increase while participant 151 is listening, as in 905. Similarly, participant 157's threshold may also decrease while he is speaking, 913, and increase while listening, 911 and 915.
  • FIGS. 10 a-c illustrate various embodiments of continuous presence screens. In some embodiments, the system may determine a dominant speaker 1003, 1113, or 1123 for display in a central and/or larger area of the display than corresponding other participants (e.g., other participants 1001 a-h; 1111 a-l; and 1121 a-g). Other continuous presence displays are also contemplated.
  • Thus, various embodiments of the systems and methods described above may facilitate intelligent control of video display modes in a video conferencing system.
  • Embodiments of these methods may be implemented from a memory medium. A memory medium may include any of various types of memory devices or storage devices. The term “memory medium” is intended to include an installation medium, e.g., a CD-ROM, floppy disks, or tape device; a computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; or non-volatile memory such as magnetic media, e.g., a hard drive, or optical storage. The memory medium may comprise other types of memory as well, or combinations thereof. In addition, the memory medium may be located in a first computer in which the programs are executed, or may be located in a second, different computer that connects to the first computer over a network, such as the Internet. In the latter instance, the second computer may provide program instructions to the first computer for execution. The term “memory medium” may include two or more memory mediums that may reside in different locations, e.g., in different computers that are connected over a network. In some embodiments, a carrier medium may be used. A carrier medium may include a memory medium as described above, as well as signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a bus, network, and/or a wireless link.
  • In some embodiments, a method may be implemented from memory medium(s) on which one or more computer programs or software components according to one embodiment may be stored. For example, the memory medium may comprise an electrically erasable programmable read-only memory (EEPROM), various types of flash memory, etc., which store software programs (e.g., firmware) that are executable to perform the methods described herein. In some embodiments, field programmable gate arrays may be used. Various embodiments further include receiving or storing instructions and/or data implemented in accordance with the foregoing description upon a carrier medium.
  • Further modifications and alternative embodiments of various aspects of the invention may be apparent to those skilled in the art in view of this description. Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the general manner of carrying out the invention. It is to be understood that the forms of the invention shown and described herein are to be taken as embodiments. Elements and materials may be substituted for those illustrated and described herein, parts and processes may be reversed, and certain features of the invention may be utilized independently, all as would be apparent to one skilled in the art after having the benefit of this description of the invention. Changes may be made in the elements described herein without departing from the spirit and scope of the invention as described in the following claims.

Claims (28)

1. A method, comprising:
receiving an audio signal from each of a plurality of video conferencing system locations;
determining an accumulated amount of the audio signal from each of one or more of the audio signals; and
determining a display mode for at least one of the video conferencing system locations based on said determining, wherein the display mode is determined from two or more possible display modes.
2. The method of claim 1, wherein said determining the accumulated amount of audio signal comprises:
determining a signal metric for each of one or more of the respective audio signals using an integrated form of the respective signal.
3. The method of claim 1,
wherein said determining the accumulated amount of the audio signal comprises integrating each of the one or more audio signals from the plurality of video conferencing systems to generate respective accumulated amounts of audio signal.
4. The method of claim 1,
wherein the two or more possible display modes comprise a single window display mode and a multiple window display mode.
5. The method of claim 1, further comprising:
comparing the accumulated amount of the audio signal from one or more of the audio signals with at least one accumulation threshold;
wherein the display mode is determined based on said comparing.
6. The method of claim 5, wherein:
if the accumulated amount of audio signal corresponding to a first location exceeds an accumulation threshold, displaying video signals from the first location on a display of each of a plurality of video conferencing systems in a single window mode;
if the accumulated amount of audio signal corresponding to any location does not exceed the accumulation threshold, displaying video signals from a plurality of locations on a display of each of a plurality of video conferencing systems in a continuous presence mode.
7. The method of claim 5, wherein:
if the accumulated amount of audio signal corresponding to a subset of locations exceeds the accumulation threshold, displaying video signals from that respective subset of locations on a display of each of a plurality of video conferencing systems in a continuous presence mode.
8. The method of claim 5, further comprising:
modifying a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has not exceeded the respective accumulation threshold within a predetermined amount of time.
9. The method of claim 5, further comprising:
modifying a respective accumulation threshold for a video conferencing system when an accumulated amount of audio signal has recently exceeded the respective accumulation threshold within a predetermined amount of time.
10. The method of claim 5,
wherein said comparing comprises comparing the accumulated amounts of audio signal from each of the audio signals with the at least one accumulation threshold.
11. The method of claim 1,
wherein said determining the accumulated amount of the audio signal comprises determining the accumulated amount of the audio signal after the audio signal has exceeded an audio threshold.
12. The method of claim 1,
wherein the accumulated amount of audio signal from each of one or more of the audio signals is an uninterrupted accumulated amount of audio signal.
13. A computer accessible memory medium comprising program instructions for determining a display mode in a video conferencing system, wherein the program instructions are executable to implement:
receiving an audio signal from each of a plurality of video conferencing system locations;
determining an accumulated amount of the audio signal from each of one or more of the audio signals; and
determining a display mode for at least one of the video conferencing system locations based on said determining, wherein the display mode is determined from two or more possible display modes.
14. The memory medium of claim 13,
wherein the accumulated amount of audio signal from each of one or more of the audio signals is an uninterrupted accumulated amount of audio signal.
15. The memory medium of claim 13,
wherein said determining the accumulated amount of the audio signal comprises integrating each of the one or more audio signals from the plurality of video conferencing systems to generate respective accumulated amounts of audio signal.
16. The memory medium of claim 13, wherein the program instructions are further executable to implement:
comparing the accumulated amount of the audio signal from one or more of the audio signals with at least one accumulation threshold;
wherein the display mode is determined based on said comparing.
17. A method for automatically determining a display mode for a display device comprising the steps of:
(a) receiving a signal from each of multiple endpoints;
(b) monitoring an amount of audio signal from each of the multiple endpoints;
(c) comparing the amount of audio signal from each of the multiple endpoints with predefined parameters; and
(d) determining a display mode from available display modes, wherein available display modes are single-window display and multiple-window display, based on step (c).
18. The method of claim 17, further comprising:
(e) wherein when the determined display mode is different than a current display mode of the display device, transmitting a display mode command signal based on a determination in step (d), the display mode command signal affecting the display mode of the display device.
19. The method of claim 18,
wherein step (e) comprises a command signal to specify the multiple-window display upon the duration from each of the multiple endpoints not exceeding a first predefined parameter.
20. The method of claim 18, wherein step (e) comprises a command signal to specify the single-window display to display video images originating from one of the multiple endpoints from which the duration exceeds a predefined parameter and upon none of the durations from the other multiple endpoints exceeding the predefined parameter.
21. The method of claim 18, wherein step (e) comprises a command signal to specify the multiple-window display upon the durations from at least two of the multiple endpoints exceeding a predefined parameter.
22. The method of claim 17, wherein the display device is coupled to a video conferencing device or application.
23. A system, comprising:
a plurality of video conferencing systems, wherein the plurality of video conferencing systems are coupled through a network and wherein the plurality of video conferencing systems provide video and audio signals of participants using the respective systems;
a signal integrator, wherein the signal integrator determines an amount of accumulated audio signal for each of the plurality of video conferencing systems; and
a mode switch coupled to the signal integrator and operable to select a display mode based on the amount of accumulated audio signal for each of the plurality of video conferencing systems.
24. The system of claim 23,
wherein if the amount of accumulated audio signal of a first video conferencing system exceeds an accumulation threshold, the mode switch directs a display coupled to at least one of the video conferencing systems to display the video signals provided by the first video conferencing system with the amount of accumulated audio signal that exceeds the accumulation threshold.
25. The system of claim 23, wherein if each of the amounts of accumulated audio signal of the plurality of video conferencing systems do not exceed the accumulation threshold, a display on at least one of the plurality of video conferencing systems displays at least two of the plurality of video conferencing systems' video signals.
26. The system of claim 23, wherein if two or more of the plurality of video conferencing systems each exceed the accumulation threshold, a display on at least one of the plurality of video conferencing systems displays the two or more of the plurality of video conferencing systems.
27. A switching system for automatically determining a display mode for a video display device comprising:
an integrator configured to determine an amount of audio signal of each of a plurality of audio signals, the signals being from a source at each of multiple endpoints; and
a switching processor coupled to the integrator and to a video switching module, configured to determine an appropriate display mode from the available display modes, wherein available display modes are single-window display and multiple-window display, based upon a comparison of the integrated audio signal energy of each of the signals with at least one predefined parameter.
28. The switching system of claim 27,
wherein upon a determination that the appropriate display mode is different than the current display mode the switching processor transmits to the video switching module a display mode command, the display mode command being chosen from a single-window display command to effect the single-window display and a multiple-window display command to effect the multiple-window display.
US11/348,217 2005-05-02 2006-02-06 Controlling video display mode in a video conferencing system Abandoned US20060248210A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/348,217 US20060248210A1 (en) 2005-05-02 2006-02-06 Controlling video display mode in a video conferencing system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US67691805P 2005-05-02 2005-05-02
US11/348,217 US20060248210A1 (en) 2005-05-02 2006-02-06 Controlling video display mode in a video conferencing system

Publications (1)

Publication Number Publication Date
US20060248210A1 true US20060248210A1 (en) 2006-11-02

Family

ID=37235746

Family Applications (3)

Application Number Title Priority Date Filing Date
US11/348,217 Abandoned US20060248210A1 (en) 2005-05-02 2006-02-06 Controlling video display mode in a video conferencing system
US11/405,371 Active 2029-10-11 US7990410B2 (en) 2005-05-02 2006-04-17 Status and control icons on a continuous presence display in a videoconferencing system
US11/405,372 Abandoned US20060259552A1 (en) 2005-05-02 2006-04-17 Live video icons for signal selection in a videoconferencing system

Family Applications After (2)

Application Number Title Priority Date Filing Date
US11/405,371 Active 2029-10-11 US7990410B2 (en) 2005-05-02 2006-04-17 Status and control icons on a continuous presence display in a videoconferencing system
US11/405,372 Abandoned US20060259552A1 (en) 2005-05-02 2006-04-17 Live video icons for signal selection in a videoconferencing system

Country Status (1)

Country Link
US (3) US20060248210A1 (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070206089A1 (en) * 2006-03-01 2007-09-06 Polycom, Inc. Method and system for providing continuous presence video in a cascading conference
US20070299981A1 (en) * 2006-06-21 2007-12-27 Cisco Technology, Inc. Techniques for managing multi-window video conference displays
US20080320158A1 (en) * 2007-06-20 2008-12-25 Mcomms Design Pty Ltd Apparatus and method for providing multimedia content
US20090187400A1 (en) * 2006-09-30 2009-07-23 Huawei Technologies Co., Ltd. System, method and multipoint control unit for providing multi-language conference
US20100066806A1 (en) * 2008-09-12 2010-03-18 Primax Electronics Ltd. Internet video image producing method
US20110074913A1 (en) * 2009-09-28 2011-03-31 Kulkarni Hrishikesh G Videoconferencing Using a Precoded Bitstream
US20110074910A1 (en) * 2009-09-28 2011-03-31 King Keith C Supporting Multiple Videoconferencing Streams in a Videoconference
US20110279632A1 (en) * 2010-05-13 2011-11-17 Kulkarni Hrishikesh G Multiway Telepresence without a Hardware MCU
US20120327180A1 (en) * 2011-06-27 2012-12-27 Motorola Mobility, Inc. Apparatus for providing feedback on nonverbal cues of video conference participants
CN102857732A (en) * 2012-05-25 2013-01-02 华为技术有限公司 Picture control method, device and system for multi-picture video conferences
US8436888B1 (en) * 2008-02-20 2013-05-07 Cisco Technology, Inc. Detection of a lecturer in a videoconference
US8681203B1 (en) * 2012-08-20 2014-03-25 Google Inc. Automatic mute control for video conferencing
EP2732622A1 (en) * 2011-07-14 2014-05-21 Ricoh Company, Limited Multipoint connection apparatus and communication system
CN104135638A (en) * 2013-05-02 2014-11-05 阿瓦亚公司 Optimized video snapshot
US20150052198A1 (en) * 2013-08-16 2015-02-19 Joonsuh KWUN Dynamic social networking service system and respective methods in collecting and disseminating specialized and interdisciplinary knowledge
US8976223B1 (en) * 2012-12-21 2015-03-10 Google Inc. Speaker switching in multiway conversation
US20150180919A1 (en) * 2013-12-20 2015-06-25 Avaya, Inc. Active talker activated conference pointers
US9077851B2 (en) 2012-03-19 2015-07-07 Ricoh Company, Ltd. Transmission terminal, transmission system, display control method, and recording medium storing display control program
US9077848B2 (en) 2011-07-15 2015-07-07 Google Technology Holdings LLC Side channel for employing descriptive audio commentary about a video conference
EP2373015A3 (en) * 2010-03-31 2015-09-23 Polycom, Inc. Method and system for adapting a continuous presence layout according to interaction between conferees
CN105009571A (en) * 2013-02-04 2015-10-28 汤姆逊许可公司 Dual telepresence set-top box
US9215395B2 (en) 2012-03-15 2015-12-15 Ronaldo Luiz Lisboa Herdy Apparatus, system, and method for providing social content
US20170149854A1 (en) * 2015-11-20 2017-05-25 Microsoft Technology Licensing, Llc Communication System
US20170168692A1 (en) * 2015-12-14 2017-06-15 Microsoft Technology Licensing, Llc Dual-Modality Client Application
EP2140608B1 (en) * 2007-04-27 2018-04-04 Cisco Technology, Inc. Optimizing bandwidth in a multipoint video conference
US10334205B2 (en) * 2012-11-26 2019-06-25 Intouch Technologies, Inc. Enhanced video interaction for a user interface of a telepresence network
US10742929B2 (en) 2015-11-20 2020-08-11 Microsoft Technology Licensing, Llc Communication system
US10880345B2 (en) * 2014-04-04 2020-12-29 Aleksandr Lvovich SHVEDOV Virtual meeting conduct procedure, virtual meeting conduct system, and virtual meeting member interface
US10887628B1 (en) * 2016-04-27 2021-01-05 United Services Automobile Services (USAA) Systems and methods for adaptive livestreaming
US10892052B2 (en) 2012-05-22 2021-01-12 Intouch Technologies, Inc. Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US10965963B2 (en) * 2019-07-30 2021-03-30 Sling Media Pvt Ltd Audio-based automatic video feed selection for a digital video production system
WO2022026842A1 (en) * 2020-07-30 2022-02-03 T1V, Inc. Virtual distributed camera, associated applications and system
US11453126B2 (en) 2012-05-22 2022-09-27 Teladoc Health, Inc. Clinical workflows utilizing autonomous and semi-autonomous telemedicine devices
US11468983B2 (en) 2011-01-28 2022-10-11 Teladoc Health, Inc. Time-dependent navigation of telepresence robots

Families Citing this family (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7349000B2 (en) * 2002-04-30 2008-03-25 Tandberg Telecom As Method and system for display of video device status information
US9030968B2 (en) * 2006-06-16 2015-05-12 Alcatel Lucent System and method for processing a conference session through a communication channel
US20080273078A1 (en) * 2007-05-01 2008-11-06 Scott Grasley Videoconferencing audio distribution
US20080316295A1 (en) * 2007-06-22 2008-12-25 King Keith C Virtual decoders
US8139100B2 (en) 2007-07-13 2012-03-20 Lifesize Communications, Inc. Virtual multiway scaler compensation
KR20090081616A (en) * 2008-01-24 2009-07-29 삼성전자주식회사 Method and device for managing shared software
US8325216B2 (en) * 2008-02-26 2012-12-04 Seiko Epson Corporation Remote control of videoconference clients
US9201527B2 (en) * 2008-04-04 2015-12-01 Microsoft Technology Licensing, Llc Techniques to remotely manage a multimedia conference event
BRPI0822250A2 (en) * 2008-04-30 2019-04-09 Hewlett Packard Development Co event management system, event management method and program product
US20110069141A1 (en) * 2008-04-30 2011-03-24 Mitchell April S Communication Between Scheduled And In Progress Event Attendees
GB2463110B (en) * 2008-09-05 2013-01-16 Skype Communication system and method
GB2463124B (en) 2008-09-05 2012-06-20 Skype Ltd A peripheral device for communication over a communications sytem
CN102165767A (en) * 2008-09-26 2011-08-24 惠普开发有限公司 Event management system for creating a second event
US20100091687A1 (en) * 2008-10-15 2010-04-15 Ted Beers Status of events
US7792901B2 (en) * 2008-10-15 2010-09-07 Hewlett-Packard Development Company, L.P. Reconfiguring a collaboration event
US20100110160A1 (en) * 2008-10-30 2010-05-06 Brandt Matthew K Videoconferencing Community with Live Images
KR101502365B1 (en) * 2008-11-06 2015-03-13 삼성전자주식회사 Three dimensional video scaler and controlling method for the same
KR101610705B1 (en) 2008-12-10 2016-04-11 삼성전자주식회사 Terminal having camera and method for processing image thereof
CN101534411B (en) * 2009-04-08 2012-12-12 华为终端有限公司 Control method for video conference, terminal and system based on image
US20100293469A1 (en) * 2009-05-14 2010-11-18 Gautam Khot Providing Portions of a Presentation During a Videoconference
US8782267B2 (en) * 2009-05-29 2014-07-15 Comcast Cable Communications, Llc Methods, systems, devices, and computer-readable media for delivering additional content using multicast streaming
WO2010141023A1 (en) * 2009-06-04 2010-12-09 Hewlett-Packard Development Company, L.P. Video conference
US8350891B2 (en) 2009-11-16 2013-01-08 Lifesize Communications, Inc. Determining a videoconference layout based on numbers of participants
US8456509B2 (en) * 2010-01-08 2013-06-04 Lifesize Communications, Inc. Providing presentations in a videoconference
US20110183654A1 (en) 2010-01-25 2011-07-28 Brian Lanier Concurrent Use of Multiple User Interface Devices
US9628722B2 (en) 2010-03-30 2017-04-18 Personify, Inc. Systems and methods for embedding a foreground video into a background feed based on a control input
US9516272B2 (en) 2010-03-31 2016-12-06 Polycom, Inc. Adapting a continuous presence layout to a discussion situation
US9264659B2 (en) * 2010-04-07 2016-02-16 Apple Inc. Video conference network management for a mobile device
US20120030595A1 (en) * 2010-07-29 2012-02-02 Seiko Epson Corporation Information storage medium, terminal apparatus, and image generation method
US8649592B2 (en) 2010-08-30 2014-02-11 University Of Illinois At Urbana-Champaign System for background subtraction with 3D camera
CN102572370B (en) * 2011-01-04 2014-06-11 Huawei Device Co., Ltd. Video conference control method and conference terminal
US8791911B2 (en) * 2011-02-09 2014-07-29 Robotzone, Llc Multichannel controller
US9390617B2 (en) * 2011-06-10 2016-07-12 Robotzone, Llc Camera motion control system with variable autonomy
WO2013026457A1 (en) * 2011-08-19 2013-02-28 Telefonaktiebolaget L M Ericsson (Publ) Technique for video conferencing
US8767034B2 (en) * 2011-12-01 2014-07-01 Tangome, Inc. Augmenting a video conference
US20130155171A1 (en) * 2011-12-16 2013-06-20 Wayne E. Mock Providing User Input Having a Plurality of Data Types Using a Remote Control Device
KR101910659B1 (en) * 2011-12-29 2018-10-24 Samsung Electronics Co., Ltd. Digital imaging apparatus and control method for the same
US9204099B2 (en) * 2012-02-01 2015-12-01 Magor Communications Corporation Videoconferencing system providing virtual physical context
EP2624581A1 (en) * 2012-02-06 2013-08-07 Research in Motion Limited Division of a graphical display into regions
US20130201305A1 (en) * 2012-02-06 2013-08-08 Research In Motion Corporation Division of a graphical display into regions
US8928726B2 (en) * 2012-04-20 2015-01-06 Logitech Europe S.A. Videoconferencing system with context sensitive wake features
US8947491B2 (en) * 2012-06-28 2015-02-03 Microsoft Corporation Communication system
US9131058B2 (en) * 2012-08-15 2015-09-08 Vidyo, Inc. Conference server communication techniques
CN103780741B (en) * 2012-10-18 2018-03-13 Tencent Technology (Shenzhen) Co., Ltd. Method and mobile device for prompting network speed
US9485433B2 (en) 2013-12-31 2016-11-01 Personify, Inc. Systems and methods for iterative adjustment of video-capture settings based on identified persona
US9414016B2 (en) 2013-12-31 2016-08-09 Personify, Inc. System and methods for persona identification using combined probability maps
US20150188970A1 (en) * 2013-12-31 2015-07-02 Personify, Inc. Methods and Systems for Presenting Personas According to a Common Cross-Client Configuration
US9736428B1 (en) * 2014-04-01 2017-08-15 Securus Technologies, Inc. Providing remote visitation and other services to non-residents of controlled-environment facilities via display devices
US9726463B2 (en) 2014-07-16 2017-08-08 Robotzone, LLC Multichannel controller for target shooting range
US9674243B2 (en) * 2014-09-05 2017-06-06 Minerva Project, Inc. System and method for tracking events and providing feedback in a virtual conference
US9671931B2 (en) * 2015-01-04 2017-06-06 Personify, Inc. Methods and systems for visually deemphasizing a displayed persona
CN104767910A (en) * 2015-04-27 2015-07-08 BOE Technology Group Co., Ltd. Video image stitching system and method
US9916668B2 (en) 2015-05-19 2018-03-13 Personify, Inc. Methods and systems for identifying background in video data using geometric primitives
US9563962B2 (en) 2015-05-19 2017-02-07 Personify, Inc. Methods and systems for assigning pixels distance-cost values using a flood fill technique
US9883155B2 (en) 2016-06-14 2018-01-30 Personify, Inc. Methods and systems for combining foreground video and background video using chromatic matching
US9881207B1 (en) 2016-10-25 2018-01-30 Personify, Inc. Methods and systems for real-time user extraction using deep learning networks
KR102271308B1 (en) * 2017-11-21 2021-06-30 Hyperconnect, Inc. Method for providing interactive visible object during video call, and system performing the same
DE102017128680A1 (en) * 2017-12-04 2019-06-06 Vitero GmbH - Gesellschaft für mediale Kommunikationslösungen Method and apparatus for conducting multi-party remote meetings
CN110109636B (en) * 2019-04-28 2022-04-05 Huawei Technologies Co., Ltd. Screen projection method, electronic device and system
KR20210055278A (en) * 2019-11-07 2021-05-17 LINE Plus Corporation Method and system for hybrid video coding
US11800056B2 (en) 2021-02-11 2023-10-24 Logitech Europe S.A. Smart webcam system
US11800048B2 (en) 2021-02-24 2023-10-24 Logitech Europe S.A. Image generating system with background replacement or modification capabilities

Family Cites Families (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5072412A (en) * 1987-03-25 1991-12-10 Xerox Corporation User interface with multiple workspaces for sharing display system objects
US4974173A (en) * 1987-12-02 1990-11-27 Xerox Corporation Small-scale workspace representations indicating activities by other users
IT1219727B (en) * 1988-06-16 1990-05-24 Italtel Spa BROADBAND COMMUNICATION SYSTEM
US5107443A (en) * 1988-09-07 1992-04-21 Xerox Corporation Private regions within a shared workspace
US5382972A (en) * 1988-09-22 1995-01-17 Kannes; Deno Video conferencing system for courtroom and other applications
JP2575197B2 (en) * 1988-10-25 1997-01-22 Oki Electric Industry Co., Ltd. 3D image forming device
US4953159A (en) * 1989-01-03 1990-08-28 American Telephone And Telegraph Company Audiographics conferencing arrangement
US5014267A (en) * 1989-04-06 1991-05-07 Datapoint Corporation Video conferencing network
US5003532A (en) * 1989-06-02 1991-03-26 Fujitsu Limited Multi-point conference system
US6400996B1 (en) * 1999-02-01 2002-06-04 Steven M. Hoffberg Adaptive pattern recognition based control system and method
US5375068A (en) * 1992-06-03 1994-12-20 Digital Equipment Corporation Video teleconferencing for networked workstations
US6675386B1 (en) * 1996-09-04 2004-01-06 Discovery Communications, Inc. Apparatus for video access and control over computer network, including image correction
US5444476A (en) * 1992-12-11 1995-08-22 The Regents Of The University Of Michigan System and method for teleinteraction
US5745161A (en) * 1993-08-30 1998-04-28 Canon Kabushiki Kaisha Video conference system
US7185054B1 (en) * 1993-10-01 2007-02-27 Collaboration Properties, Inc. Participant display and selection in video conference calls
JPH07114652A (en) * 1993-10-18 1995-05-02 Hitachi Medical Corp Device and method for moving picture display for three-dimensional image
CN1135823A (en) * 1993-10-20 1996-11-13 Videoconferencing Systems, Inc. Adaptive videoconferencing system
US6286034B1 (en) * 1995-08-25 2001-09-04 Canon Kabushiki Kaisha Communication apparatus, a communication system and a communication method
US6108704A (en) * 1995-09-25 2000-08-22 Netspeak Corporation Point-to-point internet protocol
US5786804A (en) * 1995-10-06 1998-07-28 Hewlett-Packard Company Method and system for tracking attitude
US5828838A (en) * 1996-06-20 1998-10-27 Intel Corporation Method and apparatus for conducting multi-point electronic conferences
US6151619A (en) * 1996-11-26 2000-11-21 Apple Computer, Inc. Method and apparatus for maintaining configuration information of a teleconference and identification of endpoint during teleconference
US6128649A (en) * 1997-06-02 2000-10-03 Nortel Networks Limited Dynamic selection of media streams for display
JP4056154B2 (en) * 1997-12-30 2008-03-05 Samsung Electronics Co., Ltd. 2D continuous video 3D video conversion apparatus and method, and 3D video post-processing method
US6195184B1 (en) * 1999-06-19 2001-02-27 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration High-resolution large-field-of-view three-dimensional hologram display system and method thereof
WO2001059551A2 (en) * 2000-02-08 2001-08-16 Sony Corporation Of America User interface for interacting with plural real-time data sources
JP3942789B2 (en) * 2000-02-22 2007-07-11 Japan Science and Technology Agency Stereoscopic image playback device with background
US6760750B1 (en) * 2000-03-01 2004-07-06 Polycom Israel, Ltd. System and method of monitoring video and/or audio conferencing through a rapid-update web site
US6938069B1 (en) * 2000-03-18 2005-08-30 Computing Services Support Solutions Electronic meeting center
US20020133247A1 (en) * 2000-11-11 2002-09-19 Smith Robert D. System and method for seamlessly switching between media streams
US7304985B2 (en) * 2001-09-24 2007-12-04 Marvin L Sojka Multimedia communication management system with line status notification for key switch emulation
AU2002343441A1 (en) * 2001-09-26 2003-04-07 Massachusetts Institute Of Technology Versatile cone-beam imaging apparatus and method
US20030071902A1 (en) * 2001-10-11 2003-04-17 Allen Paul G. System, devices, and methods for switching between video cameras
US7068299B2 (en) * 2001-10-26 2006-06-27 Tandberg Telecom As System and method for graphically configuring a video call
US20030105820A1 (en) * 2001-12-03 2003-06-05 Jeffrey Haims Method and apparatus for facilitating online communication
JP3664132B2 (en) * 2001-12-27 2005-06-22 Sony Corporation Network information processing system and information processing method
US7293243B1 (en) * 2002-05-22 2007-11-06 Microsoft Corporation Application sharing viewer presentation
US6967321B2 (en) * 2002-11-01 2005-11-22 Agilent Technologies, Inc. Optical navigation sensor with integrated lens
US8095409B2 (en) * 2002-12-06 2012-01-10 Insors Integrated Communications Methods and program products for organizing virtual meetings
US7278107B2 (en) * 2002-12-10 2007-10-02 International Business Machines Corporation Method, system and program product for managing windows in a network-based collaborative meeting
JP2004294477A (en) * 2003-03-25 2004-10-21 Dhs Ltd Three-dimensional image calculating method, three-dimensional image generating method and three-dimensional image display device
US7949116B2 (en) * 2003-05-22 2011-05-24 Insors Integrated Communications Primary data stream communication
US7133062B2 (en) * 2003-07-31 2006-11-07 Polycom, Inc. Graphical user interface for video feed on videoconference terminal
US7948448B2 (en) * 2004-04-01 2011-05-24 Polyvision Corporation Portable presentation system and methods for use therewith
US7870192B2 (en) * 2004-12-16 2011-01-11 International Business Machines Corporation Integrated voice and video conferencing management
US7528860B2 (en) * 2005-04-29 2009-05-05 Hewlett-Packard Development Company, L.P. Method and system for videoconferencing between parties at N sites

Patent Citations (70)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4449238A (en) * 1982-03-25 1984-05-15 Bell Telephone Laboratories, Incorporated Voice-actuated switching system
US6049694A (en) * 1988-10-17 2000-04-11 Kassatly; Samuel Anthony Multi-point video conference system and method
US6038532A (en) * 1990-01-18 2000-03-14 Matsushita Electric Industrial Co., Ltd. Signal processing device for cancelling noise in a signal
US5719951A (en) * 1990-07-17 1998-02-17 British Telecommunications Public Limited Company Normalized image feature processing
US6078350A (en) * 1992-02-19 2000-06-20 8 X 8, Inc. System and method for distribution of encoded video data
US6373517B1 (en) * 1992-02-19 2002-04-16 8X8, Inc. System and method for distribution of encoded video data
US5831666A (en) * 1992-06-03 1998-11-03 Digital Equipment Corporation Video data scaling for video teleconferencing workstations communicating by digital data network
US5594859A (en) * 1992-06-03 1997-01-14 Digital Equipment Corporation Graphical user interface for video teleconferencing
US5640543A (en) * 1992-06-19 1997-06-17 Intel Corporation Scalable multimedia platform architecture
US5684527A (en) * 1992-07-28 1997-11-04 Fujitsu Limited Adaptively controlled multipoint videoconferencing system
US5528740A (en) * 1993-02-25 1996-06-18 Document Technologies, Inc. Conversion of higher resolution images for display on a lower-resolution display device
US5649055A (en) * 1993-03-26 1997-07-15 Hughes Electronics Voice activity detector for speech signals in variable background noise
US5625410A (en) * 1993-04-21 1997-04-29 Kinywa Washino Video monitoring and conferencing system
US5398309A (en) * 1993-05-17 1995-03-14 Intel Corporation Method and apparatus for generating composite images using multiple local masks
US5534914A (en) * 1993-06-03 1996-07-09 Target Technologies, Inc. Videoconferencing system
US6292204B1 (en) * 1993-09-28 2001-09-18 Ncr Corporation Method and apparatus for display of video images in a video conferencing system
US6594688B2 (en) * 1993-10-01 2003-07-15 Collaboration Properties, Inc. Dedicated echo canceler for a workstation
US5689641A (en) * 1993-10-01 1997-11-18 Vicor, Inc. Multimedia collaboration system arrangement for routing compressed AV signal through a participant site without decompressing the AV signal
US5617539A (en) * 1993-10-01 1997-04-01 Vicor, Inc. Multimedia collaboration system with separate data network and A/V network controlled by information transmitting on the data network
US5859979A (en) * 1993-11-24 1999-01-12 Intel Corporation System for negotiating conferencing capabilities by selecting a subset of a non-unique set of conferencing capabilities to specify a unique set of conferencing capabilities
US5537440A (en) * 1994-01-07 1996-07-16 Motorola, Inc. Efficient transcoding device and method
US5453780A (en) * 1994-04-28 1995-09-26 Bell Communications Research, Inc. Continuous presence video signal combiner
US6654045B2 (en) * 1994-09-19 2003-11-25 Telesuite Corporation Teleconferencing method and system
US5572248A (en) * 1994-09-19 1996-11-05 Teleport Corporation Teleconferencing method and system for providing face-to-face, non-animated teleconference environment
US5767897A (en) * 1994-10-31 1998-06-16 Picturetel Corporation Video conferencing system
US5629736A (en) * 1994-11-01 1997-05-13 Lucent Technologies Inc. Coded domain picture composition for multimedia communications systems
US5821986A (en) * 1994-11-03 1998-10-13 Picturetel Corporation Method and apparatus for visual communications in a scalable network environment
US5751338A (en) * 1994-12-30 1998-05-12 Visionary Corporate Technologies Methods and systems for multimedia communications via public telephone networks
US5600646A (en) * 1995-01-27 1997-02-04 Videoserver, Inc. Video teleconferencing system with digital transcoding
US5896128A (en) * 1995-05-03 1999-04-20 Bell Communications Research, Inc. System and method for associating multimedia objects for use in a video conferencing system
US5737011A (en) * 1995-05-03 1998-04-07 Bell Communications Research, Inc. Infinitely expandable real-time video conferencing system
US5657096A (en) * 1995-05-03 1997-08-12 Lukacs; Michael Edward Real time video conferencing system and method with multilayer keying of multiple video images
US5768263A (en) * 1995-10-20 1998-06-16 Vtel Corporation Method for talk/listen determination and multipoint conferencing system using such method
US5991277A (en) * 1995-10-20 1999-11-23 Vtel Corporation Primary transmission site switching in a multipoint videoconference environment based on human voice
US6122668A (en) * 1995-11-02 2000-09-19 Starlight Networks Synchronization of audio and video signals in a live multicast in a LAN
US5764277A (en) * 1995-11-08 1998-06-09 Bell Communications Research, Inc. Group-of-block based video signal combining for multipoint continuous presence video conferencing
US5914940A (en) * 1996-02-09 1999-06-22 Nec Corporation Multipoint video conference controlling method and system capable of synchronizing video and audio packets
US5812789A (en) * 1996-08-26 1998-09-22 Stmicroelectronics, Inc. Video and/or audio decompression and/or compression device that shares a memory interface
US6526099B1 (en) * 1996-10-25 2003-02-25 Telefonaktiebolaget Lm Ericsson (Publ) Transcoder
US5870146A (en) * 1997-01-21 1999-02-09 Multilink, Incorporated Device and method for digital video transcoding
US6043844A (en) * 1997-02-18 2000-03-28 Conexant Systems, Inc. Perceptually motivated trellis based rate control method and apparatus for low bit rate video coding
US5995608A (en) * 1997-03-28 1999-11-30 Confertech Systems Inc. Method and apparatus for on-demand teleconferencing
US5838664A (en) * 1997-07-17 1998-11-17 Videoserver, Inc. Video teleconferencing system with digital transcoding
US6816904B1 (en) * 1997-11-04 2004-11-09 Collaboration Properties, Inc. Networked video multimedia storage server environment
US6243129B1 (en) * 1998-01-09 2001-06-05 8×8, Inc. System and method for videoconferencing and simultaneously viewing a supplemental video source
US6285661B1 (en) * 1998-01-28 2001-09-04 Picturetel Corporation Low delay real time digital video mixing for multipoint video conferencing
US6480823B1 (en) * 1998-03-24 2002-11-12 Matsushita Electric Industrial Co., Ltd. Speech detection for noisy conditions
US6288740B1 (en) * 1998-06-11 2001-09-11 Ezenia! Inc. Method and apparatus for continuous presence conferencing with voice-activated quadrant selection
US6101480A (en) * 1998-06-19 2000-08-08 International Business Machines Electronic calendar with group scheduling and automated scheduling techniques for coordinating conflicting schedules
US6453285B1 (en) * 1998-08-21 2002-09-17 Polycom, Inc. Speech activity detector for use in noise reduction system, and methods therefor
US6535604B1 (en) * 1998-09-04 2003-03-18 Nortel Networks Limited Voice-switching device and method for multiple receivers
US6025870A (en) * 1998-10-14 2000-02-15 Vtel Corporation Automatic switching of videoconference focus
US6564380B1 (en) * 1999-01-26 2003-05-13 Pixelworld Networks, Inc. System and method for sending live video on the internet
US6728221B1 (en) * 1999-04-09 2004-04-27 Siemens Information & Communication Networks, Inc. Method and apparatus for efficiently utilizing conference bridge capacity
US6744460B1 (en) * 1999-10-04 2004-06-01 Polycom, Inc. Video display mode automatic switching system and method
US7089285B1 (en) * 1999-10-05 2006-08-08 Polycom, Inc. Videoconferencing apparatus having integrated multi-point conference capabilities
US6646997B1 (en) * 1999-10-25 2003-11-11 Voyant Technologies, Inc. Large-scale, fault-tolerant audio conferencing in a purely packet-switched network
US6657975B1 (en) * 1999-10-25 2003-12-02 Voyant Technologies, Inc. Large-scale, fault-tolerant audio conferencing over a hybrid network
US6496216B2 (en) * 2000-01-13 2002-12-17 Polycom Israel Ltd. Method and system for multimedia communication control
US6300973B1 (en) * 2000-01-13 2001-10-09 Meir Feder Method and system for multimedia communication control
US6757005B1 (en) * 2000-01-13 2004-06-29 Polycom Israel, Ltd. Method and system for multimedia video processing
US6760415B2 (en) * 2000-03-17 2004-07-06 Qwest Communications International Inc. Voice telephony system
US6603501B1 (en) * 2000-07-12 2003-08-05 Onscreen24 Corporation Videoconferencing using distributed processing
US20020188731A1 (en) * 2001-05-10 2002-12-12 Sergey Potekhin Control unit for multipoint multimedia/audio system
US20040183897A1 (en) * 2001-08-07 2004-09-23 Michael Kenoyer System and method for high resolution videoconferencing
US20030174146A1 (en) * 2002-02-04 2003-09-18 Michael Kenoyer Apparatus and method for providing electronic image manipulation in video conferencing applications
US20040113939A1 (en) * 2002-12-11 2004-06-17 Eastman Kodak Company Adaptive display system
US7330541B1 (en) * 2003-05-22 2008-02-12 Cisco Technology, Inc. Automated conference moderation
US20040263610A1 (en) * 2003-06-30 2004-12-30 Whynot Stephen R. Apparatus, method, and computer program for supporting video conferencing in a communication system
US20060013416A1 (en) * 2004-06-30 2006-01-19 Polycom, Inc. Stereo microphone processing for teleconferencing

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070206089A1 (en) * 2006-03-01 2007-09-06 Polycom, Inc. Method and system for providing continuous presence video in a cascading conference
US7800642B2 (en) * 2006-03-01 2010-09-21 Polycom, Inc. Method and system for providing continuous presence video in a cascading conference
US20110018960A1 (en) * 2006-03-01 2011-01-27 Polycom, Inc. Method and System for Providing Continuous Presence Video in a Cascading Conference
US9035990B2 (en) 2006-03-01 2015-05-19 Polycom, Inc. Method and system for providing continuous presence video in a cascading conference
US8446451B2 (en) * 2006-03-01 2013-05-21 Polycom, Inc. Method and system for providing continuous presence video in a cascading conference
US20070299981A1 (en) * 2006-06-21 2007-12-27 Cisco Technology, Inc. Techniques for managing multi-window video conference displays
US7797383B2 (en) * 2006-06-21 2010-09-14 Cisco Technology, Inc. Techniques for managing multi-window video conference displays
US20090187400A1 (en) * 2006-09-30 2009-07-23 Huawei Technologies Co., Ltd. System, method and multipoint control unit for providing multi-language conference
US9031849B2 (en) * 2006-09-30 2015-05-12 Huawei Technologies Co., Ltd. System, method and multipoint control unit for providing multi-language conference
EP2140608B1 (en) * 2007-04-27 2018-04-04 Cisco Technology, Inc. Optimizing bandwidth in a multipoint video conference
US20080320158A1 (en) * 2007-06-20 2008-12-25 Mcomms Design Pty Ltd Apparatus and method for providing multimedia content
US8631143B2 (en) * 2007-06-20 2014-01-14 Mcomms Design Pty. Ltd. Apparatus and method for providing multimedia content
US8436888B1 (en) * 2008-02-20 2013-05-07 Cisco Technology, Inc. Detection of a lecturer in a videoconference
US20100066806A1 (en) * 2008-09-12 2010-03-18 Primax Electronics Ltd. Internet video image producing method
US8754922B2 (en) 2009-09-28 2014-06-17 Lifesize Communications, Inc. Supporting multiple videoconferencing streams in a videoconference
US8558862B2 (en) 2009-09-28 2013-10-15 Lifesize Communications, Inc. Videoconferencing using a precoded bitstream
US20110074913A1 (en) * 2009-09-28 2011-03-31 Kulkarni Hrishikesh G Videoconferencing Using a Precoded Bitstream
US20110074910A1 (en) * 2009-09-28 2011-03-31 King Keith C Supporting Multiple Videoconferencing Streams in a Videoconference
EP2373015A3 (en) * 2010-03-31 2015-09-23 Polycom, Inc. Method and system for adapting a continuous presence layout according to interaction between conferees
US20110279632A1 (en) * 2010-05-13 2011-11-17 Kulkarni Hrishikesh G Multiway Telepresence without a Hardware MCU
US8704870B2 (en) * 2010-05-13 2014-04-22 Lifesize Communications, Inc. Multiway telepresence without a hardware MCU
US11468983B2 (en) 2011-01-28 2022-10-11 Teladoc Health, Inc. Time-dependent navigation of telepresence robots
US8976218B2 (en) * 2011-06-27 2015-03-10 Google Technology Holdings LLC Apparatus for providing feedback on nonverbal cues of video conference participants
US20120327180A1 (en) * 2011-06-27 2012-12-27 Motorola Mobility, Inc. Apparatus for providing feedback on nonverbal cues of video conference participants
EP2732622A4 (en) * 2011-07-14 2014-12-24 Ricoh Co Ltd Multipoint connection apparatus and communication system
EP2732622A1 (en) * 2011-07-14 2014-05-21 Ricoh Company, Limited Multipoint connection apparatus and communication system
US9392224B2 (en) 2011-07-14 2016-07-12 Ricoh Company, Limited Multipoint connection apparatus and communication system
US9077848B2 (en) 2011-07-15 2015-07-07 Google Technology Holdings LLC Side channel for employing descriptive audio commentary about a video conference
US9215395B2 (en) 2012-03-15 2015-12-15 Ronaldo Luiz Lisboa Herdy Apparatus, system, and method for providing social content
EP2642753B1 (en) * 2012-03-19 2017-09-13 Ricoh Company, Ltd. Transmission terminal, transmission system, display control method, and display control program
US9077851B2 (en) 2012-03-19 2015-07-07 Ricoh Company, Ltd. Transmission terminal, transmission system, display control method, and recording medium storing display control program
US11453126B2 (en) 2012-05-22 2022-09-27 Teladoc Health, Inc. Clinical workflows utilizing autonomous and semi-autonomous telemedicine devices
US10892052B2 (en) 2012-05-22 2021-01-12 Intouch Technologies, Inc. Graphical user interfaces including touchpad driving interfaces for telemedicine devices
US11515049B2 (en) 2012-05-22 2022-11-29 Teladoc Health, Inc. Graphical user interfaces including touchpad driving interfaces for telemedicine devices
CN102857732B (en) * 2012-05-25 2015-12-09 华为技术有限公司 Picture control method, device and system for multi-picture video conferences
WO2013174115A1 (en) * 2012-05-25 2013-11-28 华为技术有限公司 Presence control method, device, and system in continuous presence video conferencing
CN102857732A (en) * 2012-05-25 2013-01-02 华为技术有限公司 Picture control method, device and system for multi-picture video conferences
US9247204B1 (en) 2012-08-20 2016-01-26 Google Inc. Automatic mute control for video conferencing
US8681203B1 (en) * 2012-08-20 2014-03-25 Google Inc. Automatic mute control for video conferencing
US10334205B2 (en) * 2012-11-26 2019-06-25 Intouch Technologies, Inc. Enhanced video interaction for a user interface of a telepresence network
US10924708B2 (en) 2012-11-26 2021-02-16 Teladoc Health, Inc. Enhanced video interaction for a user interface of a telepresence network
US11910128B2 (en) 2012-11-26 2024-02-20 Teladoc Health, Inc. Enhanced video interaction for a user interface of a telepresence network
US8976223B1 (en) * 2012-12-21 2015-03-10 Google Inc. Speaker switching in multiway conversation
CN105009571A (en) * 2013-02-04 2015-10-28 汤姆逊许可公司 Dual telepresence set-top box
US9609272B2 (en) * 2013-05-02 2017-03-28 Avaya Inc. Optimized video snapshot
US20140327730A1 (en) * 2013-05-02 2014-11-06 Avaya, Inc. Optimized video snapshot
CN104135638A (en) * 2013-05-02 2014-11-05 阿瓦亚公司 Optimized video snapshot
US20150052198A1 (en) * 2013-08-16 2015-02-19 Joonsuh KWUN Dynamic social networking service system and respective methods in collecting and disseminating specialized and interdisciplinary knowledge
US11082466B2 (en) * 2013-12-20 2021-08-03 Avaya Inc. Active talker activated conference pointers
US20150180919A1 (en) * 2013-12-20 2015-06-25 Avaya, Inc. Active talker activated conference pointers
US10880345B2 (en) * 2014-04-04 2020-12-29 Aleksandr Lvovich SHVEDOV Virtual meeting conduct procedure, virtual meeting conduct system, and virtual meeting member interface
US10742929B2 (en) 2015-11-20 2020-08-11 Microsoft Technology Licensing, Llc Communication system
US20170149854A1 (en) * 2015-11-20 2017-05-25 Microsoft Technology Licensing, Llc Communication System
US20170168692A1 (en) * 2015-12-14 2017-06-15 Microsoft Technology Licensing, Llc Dual-Modality Client Application
US11290753B1 (en) 2016-04-27 2022-03-29 United Services Automobile Association (Usaa) Systems and methods for adaptive livestreaming
US10887628B1 (en) * 2016-04-27 2021-01-05 United Services Automobile Association (USAA) Systems and methods for adaptive livestreaming
US10965963B2 (en) * 2019-07-30 2021-03-30 Sling Media Pvt Ltd Audio-based automatic video feed selection for a digital video production system
WO2022026842A1 (en) * 2020-07-30 2022-02-03 T1V, Inc. Virtual distributed camera, associated applications and system

Also Published As

Publication number Publication date
US7990410B2 (en) 2011-08-02
US20060256188A1 (en) 2006-11-16
US20060259552A1 (en) 2006-11-16

Similar Documents

Publication Publication Date Title
US20060248210A1 (en) Controlling video display mode in a video conferencing system
RU2398361C2 (en) Intelligent method, audio limiting unit and system
US7404001B2 (en) Videophone and method for a video call
RU2398362C2 (en) Connection of independent multimedia sources into conference communication
EP1868348B1 (en) Conference layout control and control protocol
US8599235B2 (en) Automatic display latency measurement system for video conferencing
US8730297B2 (en) System and method for providing camera functions in a video environment
US20070291108A1 (en) Conference layout control and control protocol
US9191234B2 (en) Enhanced communication bridge
US20070291667A1 (en) Intelligent audio limit method, system and node
US20070294263A1 (en) Associating independent multimedia sources into a conference call
US8736663B2 (en) Media detection and packet distribution in a multipoint conference
US20090174764A1 (en) System and Method for Displaying a Multipoint Videoconference
MX2007006912A (en) Conference layout control and control protocol.
MX2007006914A (en) Intelligent audio limit method, system and node.
MX2007006910A (en) Associating independent multimedia sources into a conference call.

Legal Events

Date Code Title Description
AS Assignment

Owner name: LIFESIZE COMMUNICATIONS, INC., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KENOYER, MICHAEL L.;REEL/FRAME:019219/0618

Effective date: 20060202

STCB Information on status: application discontinuation

Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION

AS Assignment

Owner name: LIFESIZE, INC., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LIFESIZE COMMUNICATIONS, INC.;REEL/FRAME:037900/0054

Effective date: 20160225