US20120327181A1 - Synchronization and Mixing of Audio and Video Streams in Network-Based Video Conferencing Call Systems - Google Patents
- Publication number
- US20120327181A1 (application US13/605,741)
- Authority
- US
- United States
- Prior art keywords
- mix
- audio
- sender
- senders
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/764—Media network packet handling at the destination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/1066—Session management
- H04L65/1083—In-session procedures
- H04L65/1089—In-session procedures by adding media; by removing media
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/1066—Session management
- H04L65/1083—In-session procedures
- H04L65/1093—In-session procedures by adding participants; by removing participants
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
- H04L65/4038—Arrangements for multi-party communication, e.g. for conferences with floor control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/04—Synchronising
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
- H04L65/4046—Arrangements for multi-party communication, e.g. for conferences with distributed floor control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/562—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities where the conference facilities are distributed
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/563—User guidance or feature selection
- H04M3/564—User guidance or feature selection whereby the feature is a sub-conference
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
- H04N7/155—Conference systems involving storage of or access to video conference sessions
Definitions
- the present invention relates generally to video conferencing over a network.
- the present invention is directed towards synchronization and/or mixing of audio and video streams during a networked video conference call.
- Conventional networking software for video and audio conferencing permits one-way, two-way and in some cases multi-way communication between participants. Because each participant may be in a different environment and at a different location on a network, the transmission and reception of audio and video packets between various participants and/or to a central server may vary among them. For example, a participant may receive packets from a nearby participant in a more reliable fashion and with less delay than those from a participant that is more remotely located on the network. Packets may also be received out of order.
- audio and video data must be synchronized and mixed during display in order to produce a good video conferencing experience. For example, if the video and audio of a participant are not synchronized, then his mouth movements will not match his speech. The result can be annoying at best and can hinder communications at worst. Similarly, if the audio and/or video of different participants are not synchronized, then the unexpected pauses and timing may be interpreted as hesitations or other gestures. This can also hinder efficient communications between the participants.
- the present invention overcomes limitations of the prior art by adding audio streams to a mix until the mix is either complete (i.e., all audio streams have been added) or the mix is closed early (i.e., before the mix is complete).
- audio streams from N senders are to be mixed.
- the N audio streams are received over a network.
- the audio streams are divided into portions that will be referred to as audio chunks (e.g., 40 ms audio chunks).
- the received audio chunks are buffered.
- a mix is opened and the process cycles through the N senders. If a sender's audio chunk has not yet been added to the mix and it is available from the buffer, then the sender's audio chunk is added to the mix. If the sender's audio chunk is already in the mix and the sender has at least one additional audio chunk buffered (i.e., waiting for use in a future mix), a wait counter is incremented for that sender.
- the mix is closed when audio chunks from all N senders have been added. It may also be closed early upon some predetermined condition based on the value of the wait counter(s) (e.g., if the wait counter reaches a maximum value).
- the process is driven by receipt of audio chunks.
- a mix is opened. As each sender's audio chunk is received, it is evaluated for inclusion in the mix. If the sender is not yet in the mix and the received audio chunk is the correct audio chunk for the mix, then it is added to the mix. Otherwise, it is buffered for a future mix.
- the mix is closed if audio chunks from all N senders are in the mix or if a predetermined early close condition is met. For example, a queue counter may be used to count the number of audio chunks in each sender's buffer. The mix may be closed early if the queue counter reaches some maximum value.
- the process attempts to use the audio chunks already stored in the buffers to create the next mix, rather than immediately creating a new mix based on a newly received audio chunk.
- the audio stream is played as a series of audio chunks.
- the video stream is considered one frame at a time.
- a time marker for the current video frame is compared to the expected time duration of the current audio chunk. If the current frame should occur during the current audio chunk, then it is displayed and the process moves to the next frame. If the current frame should occur after the current audio chunk, then the process checks again later. If the current frame should have occurred before the current audio chunk, then the frame is discarded and the process moves to a future frame.
- a client-server architecture can be used where the server performs most of the functionality described above.
- a client-server architecture can be used where the server routes the various streams from client to client but the clients perform most of the functionality described above.
- the functionality can also be split between client and server.
- Peer-to-peer architectures can also be used.
- a central server receives audio and video streams from each sender client. It sends the appropriate audio and video streams to each receiver client (recall that each client typically will act as both a sender client and a receiver client). Each receiver client mixes the audio streams and synchronizes the mixed audio stream with the video stream(s). In an alternate approach, the server mixes the audio streams to produce the appropriate composite audio stream for each receiver client. The server sends to each receiver client the mixed audio stream and any applicable video streams, and each receiver client synchronizes the received audio and video streams.
- FIG. 1 is a block diagram of a server-based architecture suitable for use with the invention.
- FIG. 2 is a screen shot of a participant's user interface for a video conference.
- FIG. 3 is a block diagram of an example client according to the invention.
- FIGS. 4-5 are flow diagrams of different methods for mixing audio streams.
- FIGS. 6A-6B are flow diagrams of another method for mixing audio streams.
- FIG. 7 is a flow diagram of a method for synchronizing audio and video streams.
- FIG. 1 is a block diagram of a server-based video conferencing architecture suitable for use with the invention.
- three participants 102 A-C are having a video conference.
- Each participant 102 is operating a client device 110 , which connects via a network 150 to a central server 120 .
- the server 120 coordinates the set up and tear down of the video conference and the collection and distribution of audio and video streams from the clients 110 .
- each client 110 is a computer that runs client software with video conferencing capability.
- each client 110 preferably includes at least one camera (for video capture), display (for video play back), microphone (for audio capture) and speaker (for audio play back).
- the clients 110 are connected via the Internet to the central server 120 .
- the central server 120 includes a web server 122 , a call management module 124 , an audio/video server 126 and an applications server 128 .
- the server 120 also includes user database 132 , call management database 134 and audio/video storage 136 .
- the participants 102 have previously registered and their records are stored in user database 132 .
- the web server 122 handles the web interface to the clients 110 .
- the call management module 124 and call management database 134 manage the video conference calls.
- the call management database 134 includes records of who is currently participating on which video conference calls. It may also include records of who is currently logged in and available for calls and/or their video conferencing capabilities.
- the audio/video server 126 manages the audio and video streams for these calls. Streaming technologies, as well as other technologies, can be used. Storage of audio and video at the server is handled by audio/video storage 136 .
- the application server 128 invokes other applications (not shown) as required.
- FIG. 2 is a screen shot of a participant 102 's user interface for the video conference.
- the invention can also be used for one-to-one situations (e.g., two-participant video call) or with more participants.
- FIG. 2 shows Gowreesh's screen as indicated by 200 .
- the top-level control for the user interface will be referred to as the main communicator element 210 . It includes top level controls for video conferencing. These controls typically are either displayed as graphical elements or implemented as part of pull-down menus (or other similar user interface components). Controls can be implemented as buttons, tabs, toolbars, arrows and icons, for example.
- the video conference is displayed in window 280 .
- the window 280 displays video of the other two participants: Alka and Lakshman. Gowreesh's audio system plays the corresponding audio.
- Ancillary window 290 lists the current participants and also provides for text chat. Files can also be shared by clicking on the attachment icon.
- the participants 102 A-B and their clients 110 A-B will be referred to as senders, and participant 102 C and its client 110 C will be referred to as the receiver.
- Alka and Lakshman are senders and Gowreesh is the receiver. These terms are used because Alka and Lakshman are sending audio and/or video data streams and Gowreesh is receiving these data (or derivatives of them).
- participants will act as both senders and receivers, sending audio and video of themselves and receiving audio and video of others.
- FIGS. 1-2 illustrate one example, but the invention is not limited to these specifics.
- client devices other than a computer running client software can be used. Examples include PDAs, mobile phones, web-enabled TV, and SIP phones and terminals (i.e., phone-type devices using the SIP protocol that typically have a small video screen and audio capability).
- not every device need have both audio and video and both input and output. Some participants may participate with audio only or video only, or be able to receive but not send audio/video or vice versa.
- the underlying architecture also need not be server-based. It could be peer-to-peer, or a combination of server and peer-to-peer. For example, participants that share a local network may communicate with each other on a peer-to-peer basis, but communicate with other participants via a server. Other variations will be apparent.
- Alka's audio and video should be synchronized to each other, and Lakshman's audio and video should be synchronized to each other.
- Alka's and Lakshman's audio/video streams preferably should also have some degree of synchronization. For example, if Alka asks a question, it is preferable that the video conference show Lakshman answering with his actual timing (i.e., avoiding too much relative delay or advance). This requires some synchronization of Alka's and Lakshman's audio and video streams.
- Alka's and Lakshman's audio streams typically would also be mixed together to form a composite audio stream for playback to Gowreesh. These tasks can be made more difficult if each of these data streams is sent as packets over network 150 since timing is not preserved in the transmission of packets. Some packets may propagate through the network 150 more quickly than others, thus arriving out of order or not arriving at all.
- for purposes of illustration, assume that each sender client 110 A-B creates the data streams for its respective participant 102 A-B, that these data streams are sent to server 120, which retransmits them to the receiver client 110 C, and that the receiver client 110 C is responsible for synchronizing and mixing the data streams to produce the appropriate data streams for display to the receiver 102 C. That is, in this example, all synchronization and mixing are performed locally at the client 110 C.
- the functionality might be divided in other ways. For example, some or all of the functionality can be shifted from the receiver client 110 C to the server 120 .
- the server 120 (e.g., A/V server 126 ) might mix the audio streams to form a composite audio stream and then send the composite audio stream and the original video streams to the receiver client 110 C.
- the server 120 might also mix video streams to form a composite video stream (e.g., one video stream that contains both Alka and Lakshman in FIG. 2 ) for transmission to the receiver client 110 C.
- the client 110 C may still be responsible for synchronizing received audio and video since transmission of packets over network 150 typically will not preserve their timing.
- the server 120 might also synchronize the audio stream and video stream, for example by combining the two data streams into a single data stream that contains both audio and video in the correct time relationship.
- any architecture which shifts computational burden from the clients 110 to the server 120 will require more powerful servers and may limit the scalability of the solution.
- the mixing of video streams at the server typically requires the server to decompress both video streams, combine them (often into a non-standard format) and then recompress the mixed video stream. If a video conference has four participants and each participant is viewing the three other participants, this requires the server to decompress the four video streams, combine them three at a time into four composite video streams, and then recompress the four composite video streams. If there are multiple video conferences active at the same time, the burden on the server scales accordingly and the server preferably would be sized to handle the worst case computational burden. On the other hand, if the functionality is implemented in the clients, then the computational resources available (i.e., the number of clients) naturally grows with the number of participants and number of video conferences.
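The four-participant count above can be restated as a small calculation. The sketch below generalizes that example to n participants under the assumption that every participant views all of the others; the function name `server_mixing_load` is hypothetical and is not part of the patent.

```python
def server_mixing_load(participants: int) -> dict:
    """Per-conference video work for server-side mixing.

    Assumes each participant views every other participant, as in the
    four-participant example in the text.
    """
    n = participants
    return {
        "decodes": n,                    # one decompression per incoming stream
        "streams_per_composite": n - 1,  # each viewer sees everyone but themselves
        "composites": n,                 # one composite stream per viewer
        "encodes": n,                    # one recompression per composite
    }


if __name__ == "__main__":
    print(server_mixing_load(4))
```

With client-side mixing, none of this work falls on the server, which is the scalability argument made in the text.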
- in a peer-to-peer architecture, each sender 110 A-B might send its audio and video streams directly to each receiver 110 C, which then is responsible for locally synchronizing and/or mixing the various data streams.
- FIG. 3 is a block diagram of one example of a client for synchronizing and mixing audio and video streams according to the invention.
- the client includes audio buffers 310 , audio stream decoders 320 , audio mixer 330 and audio output module 340 .
- the client also includes video buffers 350 , video stream decoders 360 , optional video mixer 370 and video output module 380 .
- the client receives audio streams 302 and video streams 304 from the various sender clients 110 A-B (via the server 120 ) and produces an output audio stream 392 (typically, only one) and output video stream(s) 394 (possibly, more than one) for display on the receiver client 110 C.
- the output data streams are synchronized by synchronization module 390 .
- the input data streams usually will not be received in a synchronized manner.
- the audio stream 392 displayed by Gowreesh's client typically will mix the audio from Alka and Lakshman.
- the video stream 394 typically would include two video streams, one of Lakshman and one of Alka.
- the audio and video streams 392 , 394 are synchronized.
- audio chunks are packetized and sent by the sender clients 110 A-B to the receiver client 110 C. For simplicity, assume that an entire audio chunk fits into a single data packet. If multiple packets are required, the packets can be reassembled into the original audio chunks.
- one sender's audio chunk may be available while another sender's chunk is not yet available (but should still be included in the mix to prevent distortion).
- the idea is to cycle through the senders putting one audio chunk from each sender into the mix. If the process reaches a sender but the sender's audio chunk is not available, then cycle through the remaining senders and, at the end of the cycle, come back and recheck whether the sender's audio chunk is now available. The sender may be rechecked a certain number of times before the process times out.
- if the process times out, the existing audio chunks may be mixed by audio mixer 330 without the missing audio chunks, which may be assumed to have been dropped.
- FIGS. 4-6 are flow diagrams showing three different implementations for mixing audio chunks.
- audio chunk size is expressed in milliseconds (ms). This will be the duration of audio that will be played before the next audio chunk is played.
- a “mix” is the set of all audio chunks that should be combined at a given instant. The mix may have the audio chunks combined using standard approaches or may be kept separate for playback in a player which will mix it. If there are n+1 participants in a video conference, then there typically will be n senders for each receiver. That is, the mix for the receiver at a time t should include the audio chunks for time t from the n senders.
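One standard way to combine the audio chunks of a mix is to add the decoded samples and clip the result to the sample range. The sketch below assumes decoded 16-bit signed PCM samples held in Python lists; the name `mix_chunks` and the choice of clipping are illustrative assumptions, since the text does not prescribe a particular combining method.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_chunks(chunks: Sequence[Sequence[int]]) -> List[int]:
    """Sum equal-length 16-bit PCM chunks sample by sample, clipping to int16."""
    if not chunks:
        return []
    length = len(chunks[0])
    if any(len(c) != length for c in chunks):
        raise ValueError("all chunks in a mix must have the same length")
    mixed = []
    for samples in zip(*chunks):  # one sample from each sender at a time
        total = sum(samples)
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

For example, `mix_chunks([[1, 2, 3], [10, 20, 30]])` adds the two senders' samples position by position.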
- a particular sender is “in the mix” if his audio chunk is available for mixing. The mix is “complete” when all audio chunks are available for mixing.
- Senders are sometimes referred to as users.
- Three example algorithms are described in FIGS. 4-6.
- In FIGS. 4 and 5, audio buffers are filled for each sender as packets arrive, and the mixing process independently accesses these buffers.
- In FIG. 6, as each packet arrives, it is sent to the mixing algorithm and processed immediately if possible or else stored in a buffer for future processing.
- the decoding of the packets is not directly relevant to the discussion and can take place at one of several different points.
- An important concept in all the algorithms is the wait count or queue count, which allows the handling of delays in when the packets are received.
- Audio chunks arrive over a network and are put into the appropriate audio buffer 310 , with different buffers 310 for each sender. This typically is an independent process and implemented as a separate thread.
- the mixing algorithm 330 is started 410 , 415 independently and accesses the audio buffers 310 in sequence (loop 470 ). For each audio buffer (sender), if there is no audio chunk available 422 , then the process proceeds 470 to the next audio buffer. If there is an audio chunk available 424 , then the process checks 430 whether that sender is already in the mix. If not 432 , then the audio chunk is added 440 into the mix (assuming the audio chunk is for the right time period).
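- If the sender is already in the mix and has at least one additional audio chunk buffered, the sender's wait count is incremented. The process then checks 460 whether the mix should be closed.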
- the mix is closed 464 , 465 , if the mix is now complete (i.e., all senders are in the mix) or if the process meets some other predetermined early close condition, for example if the process times out or, in this case, if the maximum wait count for any sender is reached. If the mix is not closed, the loop 470 increments to the next audio buffer. When the next mix is opened 415 , then as each sender's audio chunk is added 440 to the mix, the wait count, if positive, is decremented (last step in 440 ).
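The polling approach of FIG. 4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `build_mix`, `max_wait` and `max_passes` are hypothetical, `max_passes` stands in for the time-out mentioned in the text, the check that a chunk belongs to the right time period is omitted, and the early close is tested only on the wait count that was just incremented.

```python
from collections import deque
from typing import Deque, Dict, Tuple


def build_mix(buffers: Dict[str, Deque[bytes]],
              wait: Dict[str, int],
              max_wait: int = 3,
              max_passes: int = 10) -> Tuple[Dict[str, bytes], bool]:
    """Open one mix, cycle through the sender buffers, return (mix, complete)."""
    mix: Dict[str, bytes] = {}
    for _ in range(max_passes):              # assumed stand-in for a time-out
        for sender, chunks in buffers.items():
            if not chunks:
                continue                     # nothing buffered; go to next sender
            if sender not in mix:
                mix[sender] = chunks.popleft()
                if wait[sender] > 0:
                    wait[sender] -= 1        # sender got a turn in this mix
                if len(mix) == len(buffers):
                    return mix, True         # mix is complete
            else:
                wait[sender] += 1            # already in the mix, more chunks waiting
                if wait[sender] >= max_wait:
                    return mix, False        # early close
    return mix, False                        # timed out


if __name__ == "__main__":
    buffers = {"alka": deque([b"a0", b"a1"]), "lakshman": deque()}
    wait = {"alka": 0, "lakshman": 0}
    print(build_mix(buffers, wait))
```

In a real client another thread would keep filling `buffers` while this loop runs. Resetting every entry of `wait` to zero before each call would correspond to the FIG. 5 variation.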
- FIG. 5 is a variation of FIG. 4 . The difference is that each time a new mix is opened 515 , the wait count for all users is initialized to zero. Also compare step 540 to step 440 .
- FIGS. 4 and 5 typically are implemented as two threads because the audio chunks are received independently of when they are processed by the mixing algorithm.
- FIGS. 6A-6B show an example that coordinates the receiving of audio chunks with the mixing. It can be implemented as a single thread.
- In FIGS. 4 and 5, the process was driven by automatically buffering the audio chunks as they are received and then sequentially cycling through the audio buffers.
- In FIG. 6, the process is driven by the receipt of audio chunks.
- Audio chunks arrive over a network as mentioned before. This time, however, as each chunk is received 610 , it is evaluated for possible mixing. If a mix is not 622 currently open, then a new mix is opened 640 and the received audio chunk is added 650 to the mix (if for the correct time period). If a mix is already open 624 , then there are two possibilities. If this sender is not 632 in the mix, then the audio chunk is added 650 to the mix. If this sender is 634 in the mix, then the audio chunk is buffered 660 for use in a future mix and the queue count for the user is increased 660 by 1.
- In step 670, once each sender has an audio chunk in the mix or the queue count reaches its maximum (or another early close condition is met), the mix is closed 674, 675. Otherwise 672, the process waits to receive 610 the next audio chunk.
- When a mix is closed 676, there may be several audio chunks in the buffers (from step 660). If this is ignored, the buffers may overflow. Accordingly, in this example, when the mix is closed 675, a check 680 is performed to see if the queue count of any sender is greater than zero. If not 682, then the process waits 610 to receive the next audio chunk.
- If any queue count is greater than zero 684, then the process tries to use 690 these stored audio chunks. For example, a new mix could be opened in step 690 and any applicable stored audio chunks added to the mix (which could be from more than one sender), decrementing the corresponding queue counts. Various approaches can be used to do this. If the mix can be completed, then the process 680-690 repeats. Once the process 690 of trying to deplete the audio buffers is completed, the process returns to be driven by receiving 610 the next audio chunk. The process of trying to use stored audio chunks can also be used in the processes of FIGS. 4-5.
- FIG. 6B is a flow diagram of one approach to process 690 .
- a new mix is opened 691 .
- the process cycles 694 through the buffers for the senders. If a sender has an audio chunk available 692 , it is added to the mix 693 and the queue counter for that sender is decremented. If audio chunks are available for all senders, then the mix will be completed 695 . In that case, the mix is closed 696 . If any queue count is greater than zero 697 , then the process repeats. If the mix is not complete, then the process returns to receive 610 the next audio chunk.
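The receive-driven flow of FIGS. 6A-6B can be sketched as a single-threaded class. This is an illustration under stated assumptions: the class name `ReceiveDrivenMixer`, the `max_queue` parameter and the return convention (a list of closed mixes) are hypothetical, the queue count is taken to be the length of each sender's buffer, and the time-period check is omitted.

```python
from collections import deque
from typing import Deque, Dict, List


class ReceiveDrivenMixer:
    def __init__(self, senders: List[str], max_queue: int = 3) -> None:
        self.senders = list(senders)
        self.max_queue = max_queue
        self.buffers: Dict[str, Deque[bytes]] = {s: deque() for s in senders}
        self.mix: Dict[str, bytes] = {}          # the currently open mix

    def queue_count(self, sender: str) -> int:
        return len(self.buffers[sender])

    def on_chunk(self, sender: str, chunk: bytes) -> List[Dict[str, bytes]]:
        """Handle one received chunk; return any mixes closed as a result."""
        if sender not in self.mix:
            self.mix[sender] = chunk             # opens a mix if none is open
        else:
            self.buffers[sender].append(chunk)   # hold for a future mix
        complete = len(self.mix) == len(self.senders)
        early = any(len(q) >= self.max_queue for q in self.buffers.values())
        if not (complete or early):
            return []
        closed = [self.mix]
        self.mix = {}
        closed.extend(self._drain())
        return closed

    def _drain(self) -> List[Dict[str, bytes]]:
        """Build further mixes from buffered chunks while they can be completed."""
        closed: List[Dict[str, bytes]] = []
        while any(self.buffers.values()):
            for s in self.senders:
                if self.buffers[s]:
                    self.mix[s] = self.buffers[s].popleft()
            if len(self.mix) < len(self.senders):
                break                            # leave the partial mix open
            closed.append(self.mix)
            self.mix = {}
        return closed
```

A caller would feed every arriving chunk to `on_chunk` and hand each returned mix to the playback buffer.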
- the queue count has a slightly different meaning than the wait count in FIGS. 4-5 .
- the queue count for a sender is the number of audio chunks currently buffered waiting for a next mix.
- the wait count was the number of times a particular sender had to wait because he was already in the current mix and had additional audio chunks buffered for future mixes.
- the above algorithms do not address where the mixed audio is stored.
- the mix is stored in a buffer which is accessed by the playback process. Thus, the buffer may be full when a new mix is opened.
- one strategy is to check every few ms (for example, every S_A/8 ms) whether a slot has opened in the buffer (due to playback).
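The polling strategy can be sketched with a bounded queue from the Python standard library. The function name `store_mix` and the `max_polls` limit are assumptions added for illustration; the text only specifies checking periodically for a free slot.

```python
import queue
import time


def store_mix(playback: "queue.Queue", mix, chunk_ms: float = 40.0,
              max_polls: int = 64) -> bool:
    """Try to store a mix, polling every chunk_ms / 8 ms for a free slot."""
    poll_s = (chunk_ms / 8) / 1000.0
    for _ in range(max_polls):
        try:
            playback.put_nowait(mix)   # succeeds once playback frees a slot
            return True
        except queue.Full:
            time.sleep(poll_s)
    return False                       # gave up after max_polls attempts (assumed)
```

With 40 ms chunks this polls every 5 ms, so a slot freed by playback is noticed well within one chunk duration.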
- Alka's video should be synchronized to Alka's audio. If Alka's and Lakshman's audio streams have been mixed to produce a composite audio stream, then Alka's video should be synchronized to the composite audio stream. Audio-video synchronization is preferably achieved by playing the audio stream and synchronizing the video stream to the audio playback. This is in part because the audio stream has a tighter time tolerance (i.e., jitter tolerance) for playback.
- a time marker is added to each audio chunk or video frame captured.
- for audio, the marker is tracked as of the start of the audio sample.
- a 40 ms audio chunk will have many audio samples. The exact number is determined by the sampling frequency.
- Mixed audio streams also have time markers, preferably one for each sender's audio chunk in the mix.
- the original audio streams have time markers and, when they are mixed to form a composite audio stream, the time marker preferably is retained for the composite audio stream. Note that the time marker need not be an actual time stamp but can be any sort of relative counter.
- the differences between audio chunks and video frames can be explained in terms of how they are treated. For video, suppose 25 video frames per second (fps) are captured. Then each video frame is displayed and held for 40 ms (1000/25). At 30 frames per second, each video frame is held for 33⅓ ms on display. For audio, suppose audio is captured in 40 ms chunks. Then 40 ms worth of audio are played back at a time, but that 40 ms audio chunk includes many audio samples per the sampling rate. The audio playback is effectively continuous relative to the video playback because there are many audio samples per video frame. Thus, the synchronization problem is to match the video playback to the audio playback. This can be done by suitably marking the two data streams and then matching the marks within specified tolerances.
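The timing figures above follow from two short formulas, sketched below. The function names are hypothetical, and the sampling rates in the usage lines (8 kHz and 44.1 kHz) are illustrative assumptions that do not appear in the text.

```python
def frame_hold_ms(fps: float) -> float:
    """Time each video frame stays on display, in milliseconds."""
    return 1000.0 / fps


def samples_per_chunk(chunk_ms: int, sample_rate_hz: int) -> int:
    """Number of audio samples in one chunk at the given sampling rate."""
    return sample_rate_hz * chunk_ms // 1000


if __name__ == "__main__":
    print(frame_hold_ms(25))             # 25 fps video
    print(frame_hold_ms(30))             # 30 fps video
    print(samples_per_chunk(40, 8000))   # 40 ms chunk at an assumed 8 kHz
    print(samples_per_chunk(40, 44100))  # 40 ms chunk at an assumed 44.1 kHz
```

Even at a low sampling rate a 40 ms chunk holds hundreds of samples per video frame, which is why audio playback is treated as effectively continuous.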
- the audio playback is used to clock the video playback.
- synchronization occurs as follows.
- the decision as to whether the video is behind, at, or ahead of the audio is determined within a certain tolerance.
- FIG. 7 is a flow diagram of a specific implementation, using the following symbols:
- S_A is the size of the audio chunk in milliseconds. Audio is captured S_A ms at a time.
- T_A[i] is the time at which the ith audio chunk was captured, in milliseconds.
- T_V[k] is the time at which the kth video frame was captured, in milliseconds.
- f is the frame rate, in frames per second.
- the basic idea is that if T_V[k] falls within the time period calculated for the current audio chunk, then video frame k should be displayed.
- the nominal time period starts at time T_A[i] and ends at time T_A[i] + S_A. Tolerances tol_1 and tol_2 are used to add robustness, so that the calculated time period has a start time of T_A[i] - tol_1 and an end time of T_A[i] + S_A + tol_2. This assumes that the times T_V[k] and T_A[i] are measured relative to the same reference time.
- the sender client can start the clocks for audio and video capture at the same time. Equivalently, if the audio and video capture clocks use different time references, the offset between the two can be compensated.
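Compensating for different time references amounts to adding a fixed offset to every video time marker. The sketch below assumes both capture clocks tick in milliseconds and that their start times are known on a shared clock; the function and parameter names are hypothetical.

```python
def to_audio_timebase(t_v_ms: float,
                      video_clock_start_ms: float,
                      audio_clock_start_ms: float) -> float:
    """Re-express a video time marker relative to the audio capture clock."""
    offset_ms = video_clock_start_ms - audio_clock_start_ms
    return t_v_ms + offset_ms


if __name__ == "__main__":
    # Video capture started 50 ms after audio capture.
    print(to_audio_timebase(100.0, 1050.0, 1000.0))
```

After this conversion the video and audio time markers can be compared directly.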
- the process initializes 710 by initializing the video frame counter j and starting playback of the audio stream.
- In step 720, lower bound L and upper bound U are calculated for the current audio chunk being played. It is then determined 730 whether video frame j falls within the time period spanned by the current audio chunk. If it does 735, then the video frame is displayed 750, the counter j is incremented to move to the next video frame, and the process is repeated 725. If the video frame j occurs after 736 the current audio chunk (i.e., in the future), then nothing happens. The process waits 760 and repeats 725 the process at a later time.
- If the video frame j should have occurred before 734 the current audio chunk, then the video frame is discarded 740 and the next video frame is tested 742 to see if it occurs during the current audio chunk. This process can be repeated until the video stream catches up to the audio stream.
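The decision logic of FIG. 7 for a single audio chunk can be sketched as follows. This is a minimal illustration: the function name `frames_for_chunk` and the representation of frames as (time marker, frame) pairs are assumptions, and the bounds are treated as inclusive, which the text does not specify.

```python
from collections import deque
from typing import Deque, List, Tuple


def frames_for_chunk(frames: Deque[Tuple[float, str]],
                     t_a: float, s_a: float,
                     tol1: float = 0.0, tol2: float = 0.0
                     ) -> Tuple[List[str], int]:
    """Return (frames to display, number discarded) for the current audio chunk."""
    lower = t_a - tol1            # L: start of the calculated time period
    upper = t_a + s_a + tol2      # U: end of the calculated time period
    shown: List[str] = []
    dropped = 0
    while frames:
        t_v, frame = frames[0]
        if t_v > upper:
            break                 # frame is in the future; check again later
        frames.popleft()
        if t_v < lower:
            dropped += 1          # frame is late; discard and test the next one
        else:
            shown.append(frame)   # frame falls within the current audio chunk
    return shown, dropped
```

Calling this once per audio chunk as playback advances lets the audio clock the video: late frames are dropped until the video catches up, and future frames stay queued.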
- Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
- The present invention also relates to an apparatus for performing the operations herein.
- This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer.
- Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CDs, DVDs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
- The computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Telephonic Communication Services (AREA)
Abstract
In one aspect, audio streams are added to a mix until the mix is either complete (i.e., all audio streams have been added) or the mix is closed early (i.e., before the mix is complete). In another aspect, audio and video streams are synchronized by playing back the audio stream and then synchronizing display of the video frames to the playback of the audio stream.
Description
- This application is a continuation of U.S. application Ser. No. 12/242,358, filed on Sep. 30, 2008 and claims priority to U.S. Provisional Application No. 60/976,464, “Video Conference User Interface and Features” by Mukund Thapa filed on Sep. 30, 2007. All of these applications are incorporated by reference herein in their entirety.
- 1. Field of the Invention
- The present invention relates generally to video conferencing over a network. In particular, the present invention is directed towards synchronization and/or mixing of audio and video streams during a networked video conference call.
- 2. Description of Background Art
- Conventional networking software for video and audio conferencing permits one-way, two-way and in some cases multi-way communication between participants. Because each participant may be in a different environment and at a different location on a network, the transmission and reception of audio and video packets between various participants and/or to a central server may vary among them. For example, a participant may receive packets from a nearby participant in a more reliable fashion and with less delay than those from a participant that is more remotely located on the network. Packets may also be received out of order.
- However transmitted and received over a network, audio and video data must be synchronized and mixed during display in order to produce a good video conferencing experience. For example, if the video and audio of a participant are not synchronized, then his mouth movements will not match his speech. The result can be annoying at best and can hinder communications at worst. Similarly, if the audio and/or video of different participants are not synchronized, then the unexpected pauses and timing may be interpreted as hesitations or other gestures. This can also hinder efficient communications between the participants.
- Thus, there is a need for preferably simple approaches to synchronizing and mixing audio and/or video for networked participants in a video conference call.
- In one aspect, the present invention overcomes limitations of the prior art by adding audio streams to a mix until the mix is either complete (i.e., all audio streams have been added) or the mix is closed early (i.e., before the mix is complete).
- In one approach, audio streams from N senders are to be mixed. The N audio streams are received over a network. The audio streams are divided into portions that will be referred to as audio chunks (e.g., 40 ms audio chunks). The received audio chunks are buffered. A mix is opened and the process cycles through the N senders. If a sender's audio chunk has not yet been added to the mix and it is available from the buffer, then the sender's audio chunk is added to the mix. If the sender's audio chunk is already in the mix and the sender has at least one additional audio chunk buffered (i.e., waiting for use in a future mix), a wait counter is incremented for that sender. The mix is closed when audio chunks from all N senders have been added. It may also be closed early upon some predetermined condition based on the value of the wait counter(s) (e.g., if the wait counter reaches a maximum value).
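As an illustration only, the cycle-through-senders approach above might be sketched in Python as follows. The function name `build_mix`, the use of one `deque` per sender, the `max_cycles` time-out and the choice to start all wait counters at zero for each mix are assumptions, not details from the patent. The check that a chunk belongs to the time period of the current mix is omitted.

```python
from collections import deque

def build_mix(buffers, max_wait_count=3, max_cycles=1000):
    """Cycle through the senders' buffers until the mix is closed.

    buffers: list of deques, one per sender, holding received audio chunks.
    Returns (mix, reason): mix maps sender index -> chunk, and reason is
    "complete", "closed_early" or "timed_out".
    """
    n = len(buffers)
    mix = {}
    wait_count = [0] * n  # assumption: counters start at zero for each mix
    for _ in range(max_cycles):
        for k in range(n):
            if not buffers[k]:
                continue                      # nothing buffered for sender k
            if k not in mix:
                mix[k] = buffers[k].popleft() # add sender k's chunk to the mix
            else:
                wait_count[k] += 1            # sender k has a chunk waiting
            if len(mix) == n:
                return mix, "complete"
            if wait_count[k] >= max_wait_count:
                return mix, "closed_early"
    return mix, "timed_out"
```

In a real client the buffers would be filled concurrently by a receiving thread, so a sender that is missing on one pass may be present on the next. In this single-threaded sketch the buffers only change when chunks are removed.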
- In a different approach, the process is driven by receipt of audio chunks. A mix is opened. As each sender's audio chunk is received, it is evaluated for inclusion in the mix. If the sender is not yet in the mix and the received audio chunk is the correct audio chunk for the mix, then it is added to the mix. Otherwise, it is buffered for a future mix. Again, the mix is closed if audio chunks from all N senders are in the mix or if a predetermined early close condition is met. For example, a queue counter may be used to count the number of audio chunks in each sender's buffer. The mix may be closed early if the queue counter reaches some maximum value. In another aspect, once a mix is closed, the process attempts to use the audio chunks already stored in the buffers to create the next mix, rather than immediately creating a new mix based on a newly received audio chunk.
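The receipt-driven approach can be sketched in the same spirit. The class name `ReceiptDrivenMixer`, its attributes and the rule used to seed the next mix from the buffers are hypothetical choices for illustration. The check that a chunk is the correct one for the open mix is left out for brevity.

```python
from collections import deque

class ReceiptDrivenMixer:
    """Single-threaded sketch: each received chunk is mixed or queued."""

    def __init__(self, n, max_q_count=3):
        self.n = n                                  # number of senders
        self.max_q_count = max_q_count              # early-close threshold
        self.queues = [deque() for _ in range(n)]   # chunks held for future mixes
        self.mix = {}                               # currently open mix
        self.closed = []                            # mixes ready for playback

    def _close(self):
        self.closed.append(self.mix)
        self.mix = {}

    def _drain(self):
        # After a close, try to build further mixes from buffered chunks.
        while any(self.queues):
            for k, q in enumerate(self.queues):
                if q and k not in self.mix:
                    self.mix[k] = q.popleft()
            if len(self.mix) == self.n:
                self._close()   # completed from buffers alone: try again
            else:
                break           # incomplete: wait for the next received chunk

    def receive(self, k, chunk):
        if k not in self.mix:
            self.mix[k] = chunk           # sender not yet in the mix: add
        else:
            self.queues[k].append(chunk)  # already in the mix: queue for later
        if len(self.mix) == self.n or len(self.queues[k]) >= self.max_q_count:
            self._close()
            self._drain()
```

Here the length of each sender's queue plays the role of the queue counter, so no separate counter array is needed.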
- Another aspect concerns synchronizing audio streams and video streams. In one approach, the audio stream is played as a series of audio chunks. The video stream is considered one frame at a time. A time marker for the current video frame is compared to the expected time duration of the current audio chunk. If the current frame should occur during the current audio chunk, then it is displayed and the process moves to the next frame. If the current frame should occur after the current audio chunk, then the process checks again later. If the current frame should have occurred before the current audio chunk, then the frame is discarded and the process moves to a future frame.
- These mixing and synchronization processes can be divided between clients and/or servers in different ways. For example, a client-server architecture can be used where the server performs most of the functionality described above. Alternately, a client-server architecture can be used where the server routes the various streams from client to client but the clients perform most of the functionality described above. The functionality can also be split between client and server. Peer-to-peer architectures can also be used.
- In a preferred approach, a central server receives audio and video streams from each sender client. It sends the appropriate audio and video streams to each receiver client (recall that each client typically will act as both a sender client and a receiver client). Each receiver client mixes the audio streams and synchronizes the mixed audio stream with the video stream(s). In an alternate approach, the server mixes the audio streams to produce the appropriate composite audio stream for each receiver client. The server sends to each receiver client the mixed audio stream and any applicable video streams, and each receiver client synchronizes the received audio and video streams.
- Other aspects of the invention include software, systems and components of systems for implementing the techniques described above. Yet additional aspects include methods and applications for all of the foregoing.
-
FIG. 1 is a block diagram of a server-based architecture suitable for use with the invention. -
FIG. 2 is a screen shot of a participant's user interface for a video conference. -
FIG. 3 is a block diagram of an example client according to the invention. -
FIGS. 4-5 are flow diagrams of different methods for mixing audio streams. -
FIGS. 6A-6B are flow diagrams of another method for mixing audio streams. -
FIG. 7 is a flow diagram of a method for synchronizing audio and video streams. -
FIG. 1 is a block diagram of a server-based video conferencing architecture suitable for use with the invention. In this example, three participants 102A-C are having a video conference. Each participant 102 is operating a client device 110, which connects via a network 150 to a central server 120. In this server-based architecture, the server 120 coordinates the set up and tear down of the video conference and the collection and distribution of audio and video streams from the clients 110. In this particular example, each client 110 is a computer that runs client software with video conferencing capability. To allow full video and audio capability, each client 110 preferably includes at least one camera (for video capture), display (for video play back), microphone (for audio capture) and speaker (for audio play back).
- The clients 110 are connected via the Internet to the central server 120. In this example, the central server 120 includes a web server 122, a call management module 124, an audio/video server 126 and an applications server 128. The server 120 also includes user database 132, call management database 134 and audio/video storage 136. The participants 102 have previously registered and their records are stored in user database 132. The web server 122 handles the web interface to the clients 110. The call management module 124 and call management database 134 manage the video conference calls. For example, the call management database 134 includes records of who is currently participating on which video conference calls. It may also include records of who is currently logged in and available for calls and/or their video conferencing capabilities. The audio/video server 126 manages the audio and video streams for these calls. Streaming technologies, as well as other technologies, can be used. Storage of audio and video at the server is handled by audio/video storage 136. The applications server 128 invokes other applications (not shown) as required. -
FIG. 2 is a screen shot of a participant 102's user interface for the video conference. In this example, there are three participants: Gowreesh, Alka and Lakshman. This is a multi-point example since the three participants are at different network locations. However, the invention can also be used for one-to-one situations (e.g., two-participant video call) or with more participants. FIG. 2 shows Gowreesh's screen as indicated by 200. The top-level control for the user interface will be referred to as the main communicator element 210. It includes top level controls for video conferencing. These controls typically are either displayed as graphical elements or implemented as part of pull-down menus (or other similar user interface components). Controls can be implemented as buttons, tabs, toolbars, arrows and icons, for example.
- The video conference is displayed in window 280. In this example, the window 280 displays video of the other two participants: Alka and Lakshman. Gowreesh's audio system plays the corresponding audio. Ancillary window 290 lists the current participants and also provides for text chat. Files can also be shared by clicking on the attachment icon.
- For purposes of explaining aspects of the invention, the participants 102A-B and their clients 110A-B will be referred to as senders, and participant 102C and its client 110C will be referred to as the receiver. In the example shown in FIG. 2, Alka and Lakshman are senders and Gowreesh is the receiver. These terms are used because Alka and Lakshman are sending audio and/or video data streams and Gowreesh is receiving these data (or derivatives of them). Of course, in most video conferences, participants will act as both senders and receivers, sending audio and video of themselves and receiving audio and video of others. -
FIGS. 1-2 illustrate one example, but the invention is not limited to these specifics. For example, client devices other than a computer running client software can be used. Examples include PDAs, mobile phones, web-enabled TV, and SIP phones and terminals (i.e., phone-type devices using the SIP protocol that typically have a small video screen and audio capability). In addition, not every device need have both audio and video and both input and output. Some participants may participate with audio only or video only, or be able to receive but not send audio/video or vice versa. The underlying architecture also need not be server-based. It could be peer-to-peer, or a combination of server and peer-to-peer. For example, participants that share a local network may communicate with each other on a peer-to-peer basis, but communicate with other participants via a server. Other variations will be apparent.
- As described above, one challenge of network-based video conferencing is that the various data streams from the senders 110A-B should be synchronized and mixed for display at the receiver 110C. In FIG. 2, Alka's audio and video should be synchronized to each other, and Lakshman's audio and video should be synchronized to each other. In addition, Alka's and Lakshman's audio/video streams preferably should also have some degree of synchronization. For example, if Alka asks a question, it is preferable that the video conference show Lakshman answering with his actual timing (i.e., avoiding too much relative delay or advance). This requires some synchronization of Alka's and Lakshman's audio and video streams. Alka's and Lakshman's audio streams typically would also be mixed together to form a composite audio stream for playback to Gowreesh. These tasks can be made more difficult if each of these data streams is sent as packets over network 150 since timing is not preserved in the transmission of packets. Some packets may propagate through the network 150 more quickly than others, thus arriving out of order or not arriving at all.
- In the following example, it will be assumed that each sender client 110A-B creates the data streams for its respective participant 102A-B; that these data streams are sent to server 120 which retransmits them to the receiver client 110C, and that the receiver client 110C is responsible for synchronizing and mixing the data streams to produce the appropriate data streams for display to the receiver 102C. That is, in this example, all synchronization and mixing are performed locally at the client 110C.
- This division of functionality is assumed primarily for purposes of explanation. In alternate embodiments, the functionality might be divided in other ways. For example, some or all of the functionality can be shifted from the receiver client 110C to the server 120. For example, the server (e.g., A/V server 126) might mix the audio streams to form a composite audio stream and then send the composite audio stream and the original video streams to the receiver client 110C. Alternately, the server 120 might also mix video streams to form a composite video stream (e.g., one video stream that contains both Alka and Lakshman in FIG. 2) for transmission to the receiver client 110C. In these examples, the client 110C may still be responsible for synchronizing received audio and video since transmission of packets over network 150 typically will not preserve their timing. In another variant, the server 120 might also synchronize the audio stream and video stream, for example by combining the two data streams into a single data stream that contains both audio and video in the correct time relationship.
- However, any architecture which shifts computational burden from the clients 110 to the server 120 will require more powerful servers and may limit the scalability of the solution. For example, the mixing of video streams at the server typically requires the server to decompress both video streams, combine them (often into a non-standard format) and then recompress the mixed video stream. If a video conference has four participants and each participant is viewing the three other participants, this requires the server to decompress the four video streams, combine them three at a time into four composite video streams, and then recompress the four composite video streams. If there are multiple video conferences active at the same time, the burden on the server scales accordingly and the server preferably would be sized to handle the worst case computational burden. On the other hand, if the functionality is implemented in the clients, then the computational resources available (i.e., the number of clients) naturally grow with the number of participants and number of video conferences.
- In a peer-to-peer architecture, each sender 110A-B might send its audio and video streams directly to each receiver 110C, which then is responsible for locally synchronizing and/or mixing the various data streams. -
FIG. 3 is a block diagram of one example of a client for synchronizing and mixing audio and video streams according to the invention. The client includes audio buffers 310, audio stream decoders 320, audio mixer 330 and audio output module 340. The client also includes video buffers 350, video stream decoders 360, optional video mixer 370 and video output module 380. The client receives audio streams 302 and video streams 304 from the various sender clients 110A-B (via the server 120) and produces an output audio stream 392 (typically, only one) and output video stream(s) 394 (possibly, more than one) for display on the receiver client 110C. The output data streams are synchronized by synchronization module 390. The input data streams usually will not be received in a synchronized manner.
- Using FIG. 2 as an example, the audio stream 392 displayed by Gowreesh's client typically will mix the audio from Alka and Lakshman. The video stream 394 typically would include two video streams, one of Lakshman and one of Alka. The audio and video streams
- Consider first the mixing of different audio streams 302. Assume that audio data is captured and played back in “audio chunks” of a certain duration. Currently, the capture is done in audio chunks of 40 ms each. The number of samples in each audio chunk is determined by the sampling frequency (and possibly also the number of audio channels). These audio chunks are packetized and sent by the sender clients 110A-B to the receiver client 110C. For simplicity, assume that an entire audio chunk fits into a single data packet. If multiple packets are required, the packets can be reassembled into the original audio chunks.
- When packets of audio are received over a network, there can be loss and also delays. Thus, during mixing, for example, one sender's audio chunk may be available but another sender's chunk may not yet be available (but should still be included in the mix to prevent distortion). In one approach, the idea is to cycle through the senders putting one audio chunk from each sender into the mix. If the process reaches a sender but the sender's audio chunk is not available, then cycle through the remaining senders and, at the end of the cycle, come back and recheck whether the sender's audio chunk is now available. The sender may be rechecked a certain number of times before the process times out. In one approach, the existing audio chunks may be mixed by audio mixer 330 without the missing audio chunks, which may be assumed as dropped. -
FIGS. 4-6 are flow diagrams showing three different implementations for mixing audio chunks. In these flow diagrams, audio chunk size is expressed in milliseconds (ms). This will be the duration of audio that will be played before the next audio chunk is played. A “mix” is the set of all audio chunks that should be combined at a given instant. The mix may have the audio chunks combined using standard approaches or may be kept separate for playback in a player which will mix them. If there are n+1 participants in a video conference, then there typically will be n senders for each receiver. That is, the mix for the receiver at a time t should include the audio chunks for time t from the n senders. A particular sender is “in the mix” if his audio chunk is available for mixing. The mix is “complete” when all audio chunks are available for mixing.
- The following symbols are used in FIGS. 4-6. Senders are sometimes referred to as users:
- n is the number of audio streams that are to be mixed (i.e., number of senders).
- Typically, a value of n implies a video conference with n+1 participants. A complete mix will have n audio chunks, one from each sender.
- user_is_in_mix is an array of dimension n. Each element k of the array is either 0 or 1. If user_is_in_mix[k]=1, this means the audio chunk for sender k is in the mix. A value of 0 means it is not in the mix.
- num_users_in_mix is the total number of senders currently in the mix. This is the summation of the elements of the array user_is_in_mix. If num_users_in_mix=n, then that mix is complete. If num_users_in_mix<n, then it is incomplete.
- wait_count_for_user is an array of dimension n. wait_count_for_user[k] is the number of times that sender k, who is already in the mix, has an audio chunk available for some future mix, but must wait because the current mix is not yet complete.
- max_wait_count is the maximum value of wait_count_for_user for any sender k before the mix is closed (even though still incomplete). Analysis, confirmed by experimentation, suggests that the value 3 works well, although other values can also be used.
- q_count_for_user is an array of dimension n. q_count_for_user[k] is the number of audio chunks that sender k, who is already in the mix, has available for future mixes. The audio chunks are queued because the current mix is not yet complete.
- max_q_count is the maximum value of q_count_for_user for any sender k before the mix is closed (even though still incomplete).
- k is a counter that counts through the senders.
- Three example algorithms are described in FIGS. 4-6. In the first two, audio buffers are filled for each sender as packets arrive, and the mixing process independently accesses these buffers. In the third example, as each packet arrives, it is sent to the mixing algorithm and processed immediately if possible or else stored in a buffer for future processing. The decoding of the packets is not directly relevant to the discussion and can take place at one of several different points. An important concept in all the algorithms is the wait count or queue count, which allows the handling of delays in when the packets are received.
- The general idea behind FIG. 4 is as follows, with reference to FIG. 3. Audio chunks arrive over a network and are put into the appropriate audio buffer 310, with different buffers 310 for each sender. This typically is an independent process and implemented as a separate thread. The mixing algorithm 330 is started 410, 415 independently and accesses the audio buffers 310 in sequence (loop 470). For each audio buffer (sender), if there is no audio chunk available 422, then the process proceeds 470 to the next audio buffer. If there is an audio chunk available 424, then the process checks 430 whether that sender is already in the mix. If not 432, then the audio chunk is added 440 into the mix (assuming the audio chunk is for the right time period). If a sender is already in the mix 434, then his/her wait count is increased 450 by 1. The process then checks 460 whether the mix should be closed. The mix is closed 464, 465, if the mix is now complete (i.e., all senders are in the mix) or if the process meets some other predetermined early close condition, for example if the process times out or, in this case, if the maximum wait count for any sender is reached. If the mix is not closed, the loop 470 increments to the next audio buffer. When the next mix is opened 415, then as each sender's audio chunk is added 440 to the mix, the wait count, if positive, is decremented (last step in 440).
- FIG. 5 is a variation of FIG. 4. The difference is that each time a new mix is opened 515, the wait count for all users is initialized to zero. Also compare step 540 to step 440. -
FIGS. 4 and 5 typically are implemented as two threads because the audio chunks are received independently of when they are processed by the mixing algorithm. FIGS. 6A-6B are an example that coordinates the receiving of audio chunks with the mixing. It can be implemented as a single thread. In FIGS. 4 and 5, the process was driven by automatically buffering the audio chunks as they are received and then sequentially cycling through the audio buffers. In FIG. 6, the process is driven by the receipt of audio chunks.
- Referring to FIG. 6A, the general idea is as follows. Audio chunks arrive over a network as mentioned before. This time, however, as each chunk is received 610, it is evaluated for possible mixing. If a mix is not 622 currently open, then a new mix is opened 640 and the received audio chunk is added 650 to the mix (if for the correct time period). If a mix is already open 624, then there are two possibilities. If this sender is not 632 in the mix, then the audio chunk is added 650 to the mix. If this sender is 634 in the mix, then the audio chunk is buffered 660 for use in a future mix and the queue count for the user is increased 660 by 1. In step 670, once each sender has an audio chunk in the mix or the queue count reaches its maximum (or other early close condition is met), the mix is closed 674, 675. Otherwise 672, the process waits to receive 610 the next audio chunk.
- When a mix is closed 676, there may be several audio chunks in the buffers (from step 660). If this is ignored, the buffers may overflow. Accordingly, in this example, when the mix is closed 675, a check 680 is performed to see if the queue count of any sender is greater than zero. If not 682, then the process waits 610 to receive the next audio chunk.
- However, if any queue count is greater than zero 684, then the process tries to use 690 these stored audio chunks. For example, a new mix could be opened in step 690 and any applicable stored audio chunks added to the mix (which could be from more than one sender), decrementing the corresponding queue counts. Various approaches can be used to do this. If the mix can be completed, then the process 680-690 repeats. Once the process 690 of trying to deplete the audio buffers is completed, the process returns to be driven by receiving 610 the next audio chunk. The process of trying to use stored audio chunks can also be used in the processes of FIGS. 4-5.
- FIG. 6B is a flow diagram of one approach to process 690. In this example, a new mix is opened 691. The process cycles 694 through the buffers for the senders. If a sender has an audio chunk available 692, it is added to the mix 693 and the queue counter for that sender is decremented. If audio chunks are available for all senders, then the mix will be completed 695. In that case, the mix is closed 696. If any queue count is greater than zero 697, then the process repeats. If the mix is not complete, then the process returns to receive 610 the next audio chunk.
- In FIG. 6, the queue count has a slightly different meaning than the wait count in FIGS. 4-5. In FIG. 6, the queue count for a sender is the number of audio chunks currently buffered waiting for a next mix. In FIGS. 4-5, the wait count was the number of times a particular sender had to wait because he was already in the current mix and had additional audio chunks buffered for future mixes.
- The above algorithms do not address where the mixed audio is stored. Typically the mix is stored in a buffer which is accessed by the playback process. Thus, it may happen that when a new mix is opened, the buffer may be full. In this case, one strategy is to check every few ms (for example SA/8) if a slot is open in the buffer (due to playback).
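The polling strategy for a full playback buffer might look like the following Python sketch. The function name `wait_for_slot`, the `capacity` and `max_checks` parameters and the injectable `sleep` callable are assumptions for illustration. Only the idea of rechecking every SA/8 ms comes from the text.

```python
import time

def wait_for_slot(playback_buffer, capacity, s_a_ms=40.0, max_checks=80,
                  sleep=time.sleep):
    """Poll until the playback buffer has a free slot.

    playback_buffer: any sized container drained by the playback process.
    capacity: number of mixes the buffer can hold (assumed fixed).
    Returns True if a slot opened, False if max_checks polls were exhausted.
    """
    interval_s = (s_a_ms / 8.0) / 1000.0  # SA/8 ms, converted to seconds
    for _ in range(max_checks):
        if len(playback_buffer) < capacity:
            return True                   # playback has freed a slot
        sleep(interval_s)                 # check again a few ms later
    return False
```

Passing `sleep` as a parameter is only a convenience for testing. A production client might instead block on a condition variable signalled by the playback thread.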
- Now turn to video synchronization. With respect to FIG. 2, Alka's video should be synchronized to Alka's audio. If Alka's and Lakshman's audio streams have been mixed to produce a composite audio stream, then Alka's video should be synchronized to the composite audio stream. Audio-video synchronization is preferably achieved by playing the audio stream and synchronizing the video stream to the audio playback. This is in part because the audio stream has a tighter time tolerance (i.e., jitter tolerance) for playback.
- A time marker is added to each audio chunk or video frame captured. In the case of audio, if a 40 ms audio chunk is captured, then the marker is tracked as of the start of the audio sample. A 40 ms audio chunk, however, will have many audio samples. The exact number is determined by the sampling frequency. Mixed audio streams also have time markers, preferably one for each sender's audio chunk in the mix. The original audio streams have time markers and, when they are mixed to form a composite audio stream, the time marker preferably is retained for the composite audio stream. Note that the time marker need not be an actual time stamp but can be any sort of relative counter.
- The differences between audio chunks and video frames can be explained in terms of how they are treated. For video, suppose 25 video frames per second (fps) are captured. Then each video frame is displayed and held for 40 ms (1000/25). At 30 frames per second, each video frame is held for 33⅓ ms on display. For audio, suppose audio is captured in 40 ms chunks. Then 40 ms worth of audio is played back at a time, but that 40 ms audio chunk includes many audio samples per the sampling rate. The audio playback is effectively continuous relative to the video playback because there are many audio samples per video frame. Thus, the synchronization problem is to match the video playback to the audio playback. This can be done by suitably marking the two data streams and then matching the marks within specified tolerances.
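The arithmetic in this comparison can be checked with a short Python sketch. The function names and the sampling rates in the usage note are illustrative assumptions; the patent does not state a sampling rate.

```python
def frame_hold_ms(fps):
    """Display duration of one video frame in ms: 1000 / frame rate."""
    return 1000.0 / fps

def samples_per_chunk(sample_rate_hz, chunk_ms=40, channels=1):
    """Number of audio samples in one chunk, across all channels."""
    return sample_rate_hz * chunk_ms // 1000 * channels
```

For example, `frame_hold_ms(25)` gives 40.0 ms, while a 40 ms chunk at an assumed 16 kHz mono sampling rate holds `samples_per_chunk(16000)` = 640 samples, which is why audio playback is effectively continuous relative to video.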
- In some sense, the audio playback is used to clock the video playback. In one approach, synchronization occurs as follows.
-
- If the time marker of the video frame matches the time of the audio playback, then display the video frame.
- If the time marker of the video frame is ahead of that for the audio playback, then wait.
- If the time marker of the video frame is behind that for the audio playback, then skip the video frame.
- The decision as to whether the video is behind, at, or ahead of the audio is determined within a certain tolerance.
-
FIG. 7 is a flow diagram of a specific implementation, using the following symbols:
- SA is the size of the audio chunk in milliseconds. Audio is captured SA ms at a time.
- TA[i] is the time at which the ith audio chunk was captured, in milliseconds.
- Tv[k] is the time at which the kth video frame was captured, in milliseconds.
- f is the frame rate, in frames per second.
- fD is the frame display duration, in milliseconds. fD=(1/f)*1000.
- tol1 is a tolerance for the lower bound, in milliseconds. This can be zero or higher. In practice, tol1=2 appears to work well for SA=40 ms.
- tol2 is the tolerance for the upper bound. This can be zero or higher. In practice, tol2=0 appears to work well.
- In FIG. 7, the basic idea is that if Tv[k] falls within the time period calculated for the current audio chunk, then video frame k should be displayed. The nominal time period starts at time TA[i] and ends at time TA[i]+SA. Tolerances tol1 and tol2 are used to add robustness, so that the calculated time period has a start time of TA[i]−tol1 and an end time of TA[i]+SA+tol2. This assumes that the times Tv[k] and TA[i] are measured relative to the same reference time. This can be achieved, for example, by starting the audio and video capture threads at the same time relative to a common clock. Alternately, the sender client can start the clocks for audio and video capture at the same time. Equivalently, if the audio and video capture clocks use different time references, the offset between the two can be compensated.
- In more detail, the process initializes 710 by initializing the video frame counter j and starting playback of the audio stream. In step 720, lower bound L and upper bound U are calculated for the current audio chunk being played. It is then determined 730 whether video frame j falls within the time period spanned by the current audio chunk. If it does 735, then the video frame is displayed 750, the counter j is incremented to move to the next video frame, and the process is repeated 725. If the video frame j occurs after 736 the current audio chunk (i.e., in the future), then nothing happens. The process waits 760 and repeats 725 the process at a later time. If the video frame j was to have occurred before 734 the current audio chunk, then the video frame is discarded 740 and the next video frame is tested 742 to see if it occurs during the current audio chunk. This process can be repeated until the video stream catches up to the audio stream.
- The present invention has been described in particular detail with respect to a limited number of embodiments. One skilled in the art will appreciate that the invention may additionally be practiced in other embodiments. First, the particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Further, the system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements. Also, the particular division of functionality between the various system components described herein is merely exemplary, and not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.
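- The FIG. 7 flow described above (steps 710 through 760) can be sketched as an offline simulation. This is a hedged illustration rather than the implementation of the specification: the real process is driven by live audio playback, whereas this sketch steps through a list of audio chunk capture times and treats "wait" as "retry on the next audio chunk". The function name `schedule_frames`, the list-based inputs and the inclusive bounds are assumptions.

```python
def schedule_frames(ta, tv, sa=40.0, tol1=2.0, tol2=0.0):
    """Assign video frames to audio chunks following the FIG. 7 flow.

    ta: capture times TA[i] of the audio chunks, in ms, in playback order.
    tv: capture times Tv[j] of the video frames, in ms, in display order.
    Returns (shown, dropped): shown maps chunk index -> list of frame
    indices displayed during that chunk; dropped lists discarded frames.
    """
    shown = {i: [] for i in range(len(ta))}
    dropped = []
    j = 0  # step 710: initialize the video frame counter
    for i, ta_i in enumerate(ta):
        # step 720: bounds for the audio chunk currently being played
        lower = ta_i - tol1
        upper = ta_i + sa + tol2
        while j < len(tv):
            if tv[j] < lower:
                # 734 -> 740: frame is behind the audio, discard and test next
                dropped.append(j)
                j += 1
            elif tv[j] <= upper:
                # 735 -> 750: frame falls within the chunk, display it
                shown[i].append(j)
                j += 1
            else:
                # 736 -> 760: frame is in the future, wait for a later chunk
                break
    return shown, dropped
```

In this sketch, late video is handled by the first branch: frames whose capture time precedes the lower bound are dropped one after another until the video stream catches up to the audio stream.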
- Some portions of the above description present the features of the present invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are the means used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. These operations, while described functionally or logically, are understood to be implemented by computer programs. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules or code devices, without loss of generality.
- It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the present discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.
- Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
- The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CDs, DVDs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
- The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description above. In addition, the present invention is not described with reference to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any references to specific languages are provided for disclosure of enablement and best mode of the present invention.
- The figures depict preferred embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
- Finally, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention.
Claims (17)
1. A computer-implemented method for mixing audio streams from N senders, N at least two, the method comprising:
receiving the N audio streams over a network, the audio streams divided into audio chunks;
buffering the audio chunks;
opening a mix;
cycling through the N senders, for each sender on each cycle:
if the sender is not yet in the mix, adding the sender's audio chunk to the mix if the missing audio chunk is available from the buffer; and
closing the mix if either audio chunks from all N senders are in the mix or if a predetermined early close condition is met.
2. The computer-implemented method of claim 1 wherein:
the step of opening a mix comprises resetting a counter for the number of senders in the mix;
the step of cycling through the N senders further comprises:
if the sender's audio chunk is added to the mix in a cycle, incrementing the counter; and
the step of closing the mix further comprises closing the mix if the counter indicates N senders are in the mix.
3. The computer-implemented method of claim 1 wherein the step of cycling through the N senders further comprises:
on each cycle, if the sender's audio chunk is already in the mix and the sender has at least one additional audio chunk buffered, incrementing a wait counter for that sender;
wherein the predetermined early close condition is based on the value of the wait counter(s).
4. The computer-implemented method of claim 3 wherein the predetermined early close condition occurs if any wait counter reaches a predetermined maximum value.
5. The computer-implemented method of claim 4 wherein the predetermined maximum value is three.
6. The computer-implemented method of claim 3 wherein:
the step of cycling through the N senders further comprises:
if the sender's audio chunk is added to the mix in a cycle, decrementing the wait counter for that sender if the wait counter is greater than 0.
7. The computer-implemented method of claim 3 wherein:
the step of opening a mix comprises resetting the wait counters for all senders.
8. The computer-implemented method of claim 1 wherein each audio chunk contains audio of 40 ms duration.
9. The computer-implemented method of claim 1 wherein:
the step of receiving the N audio streams over a network comprises a central server receiving the N audio streams over a network from sender clients; and
the steps of buffering, opening a mix, cycling through the senders and closing the mix occur at the central server.
10. The computer-implemented method of claim 1 wherein:
the step of receiving the N audio streams over a network comprises a receiver client receiving the N audio streams over a network from a central server, the central server having received the N audio streams from sender clients; and
the steps of buffering, opening a mix, cycling through the senders and closing the mix occur at the receiver client.
11. The computer-implemented method of claim 1 wherein:
the step of receiving the N audio streams over a network comprises a receiver client receiving the N audio streams over a network from sender clients; and
the steps of buffering, opening a mix, cycling through the senders and closing the mix occur at the receiver client.
12. A computer-implemented method for mixing audio streams from N senders, N at least two, the method comprising:
opening a mix;
receiving the N audio streams over a network, the audio streams divided into audio chunks;
as each sender's audio chunk is received:
if the sender is not yet in the mix and the received audio chunk is the correct audio chunk for the mix, adding the sender's audio chunk to the mix;
otherwise, buffering the sender's audio chunk for a future mix; and
closing the mix if either audio chunks from all N senders are in the mix or if a predetermined early close condition is met.
13. The computer-implemented method of claim 12 wherein:
the step of opening a mix comprises resetting a counter for the number of senders in the mix;
the step of adding a sender's audio chunk to the mix further comprises incrementing the counter; and
the step of closing the mix further comprises closing the mix if the counter indicates N senders are in the mix.
14. The computer-implemented method of claim 12 wherein the step of buffering the sender's audio chunk for a future mix further comprises:
incrementing a queue counter for that sender; wherein the predetermined early close condition is based on the value of the queue counter(s).
15. The computer-implemented method of claim 14 wherein the predetermined early close condition occurs if any queue counter reaches a predetermined maximum value.
16. The computer-implemented method of claim 14 further comprising:
upon closing the mix, opening a new mix and determining whether any of the previously buffered audio chunks should be added to the new mix.
17-22. (canceled)
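The cycling mix recited in claims 1 through 5 and 7 can be illustrated with a minimal sketch. It is not the claimed implementation: the class name `CyclingMixer`, its methods, the use of sample-wise summation to combine chunks, and the choice to test the close conditions once per cycle are all assumptions made for illustration. Wait counters are reset when a mix opens (claim 7); the alternative of claim 6, decrementing a wait counter when a chunk is added, is not shown.

```python
from collections import deque


class CyclingMixer:
    """Hypothetical mixer: one buffer per sender, one open mix at a time."""

    def __init__(self, senders, max_wait=3):
        self.queues = {s: deque() for s in senders}  # buffered audio chunks
        self.max_wait = max_wait  # claim 5 gives three as one example value
        self._open_mix()

    def _open_mix(self):
        self.mix = {}    # sender -> chunk already in the open mix
        self.count = 0   # claim 2: counter of senders in the mix
        self.wait = {s: 0 for s in self.queues}  # claim 7: reset wait counters

    def receive(self, sender, chunk):
        """Buffer a chunk (a list of samples) received from a sender."""
        self.queues[sender].append(chunk)

    def cycle(self):
        """Run one pass over all senders; return the mix if it closed."""
        for s, q in self.queues.items():
            if s not in self.mix:
                if q:  # missing chunk is available from the buffer
                    self.mix[s] = q.popleft()
                    self.count += 1
            elif q:
                # claim 3: sender is in the mix and has more audio queued
                self.wait[s] += 1
        full = self.count == len(self.queues)
        early = any(w >= self.max_wait for w in self.wait.values())  # claim 4
        if not (full or early):
            return None
        # Assumption: chunks are combined by summing samples position-wise.
        length = max(len(c) for c in self.mix.values())
        mixed = [sum(c[k] for c in self.mix.values() if k < len(c))
                 for k in range(length)]
        self._open_mix()
        return mixed
```

A caller would invoke `receive` as chunks arrive from the network and call `cycle` repeatedly. In this sketch, a sender whose next chunks keep piling up while another sender stays silent triggers the early close after `max_wait` cycles, so one stalled stream does not hold back the whole mix.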
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/605,741 US20120327181A1 (en) | 2007-09-30 | 2012-09-06 | Synchronization and Mixing of Audio and Video Streams in Network-Based Video Conferencing Call Systems |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US97646407P | 2007-09-30 | 2007-09-30 | |
US12/242,358 US8583268B2 (en) | 2007-09-30 | 2008-09-30 | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US13/605,741 US20120327181A1 (en) | 2007-09-30 | 2012-09-06 | Synchronization and Mixing of Audio and Video Streams in Network-Based Video Conferencing Call Systems |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/242,358 Continuation US8583268B2 (en) | 2007-09-30 | 2008-09-30 | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
Publications (1)
Publication Number | Publication Date |
---|---|
US20120327181A1 (en) | 2012-12-27 |
Family
ID=40507748
Family Applications (10)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/130,787 Active 2031-03-17 US8243119B2 (en) | 2007-09-30 | 2008-05-30 | Recording and videomail for video conferencing call systems |
US12/130,998 Active 2031-09-07 US9060094B2 (en) | 2007-09-30 | 2008-05-30 | Individual adjustment of audio and video properties in network conferencing |
US12/131,749 Active 2030-04-09 US8881029B2 (en) | 2007-09-30 | 2008-06-02 | Systems and methods for asynchronously joining and leaving video conferences and merging multiple video conferences |
US12/242,358 Active 2031-08-29 US8583268B2 (en) | 2007-09-30 | 2008-09-30 | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US13/605,741 Abandoned US20120327181A1 (en) | 2007-09-30 | 2012-09-06 | Synchronization and Mixing of Audio and Video Streams in Network-Based Video Conferencing Call Systems |
US13/646,395 Active US8700195B2 (en) | 2007-09-30 | 2012-10-05 | Synchronization and mixing of audio and video streams in network based video conferencing call systems |
US14/196,703 Active 2029-07-17 US9654537B2 (en) | 2007-09-30 | 2014-03-04 | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US14/511,369 Active 2029-10-10 US9742830B2 (en) | 2007-09-30 | 2014-10-10 | Systems and methods for asynchronously joining and leaving video conferences and merging multiple video conferences |
US14/727,585 Active 2028-07-13 US10097611B2 (en) | 2007-09-30 | 2015-06-01 | Individual adjustment of audio and video properties in network conferencing |
US16/143,879 Active US10880352B2 (en) | 2007-09-30 | 2018-09-27 | Individual adjustment of audio and video properties in network conferencing |
Country Status (2)
Country | Link |
---|---|
US (10) | US8243119B2 (en) |
WO (4) | WO2009045971A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130051578A1 (en) * | 2011-08-31 | 2013-02-28 | Yuan-Jih Chu | Network signal receiving system and network signal receiving method |
WO2014137981A1 (en) * | 2013-03-04 | 2014-09-12 | Janus Technologies, Inc. | Securing computer video and audio subsystems |
US20160094603A1 (en) * | 2014-09-29 | 2016-03-31 | Wistron Corporation | Audio and video sharing method and system |
US9654537B2 (en) | 2007-09-30 | 2017-05-16 | Optical Fusion, Inc. | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
Families Citing this family (185)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040162637A1 (en) | 2002-07-25 | 2004-08-19 | Yulun Wang | Medical tele-robotic system with a master remote station with an arbitrator |
US7813836B2 (en) | 2003-12-09 | 2010-10-12 | Intouch Technologies, Inc. | Protocol for a remotely controlled videoconferencing robot |
US8077963B2 (en) | 2004-07-13 | 2011-12-13 | Yulun Wang | Mobile robot with a head-based movement mapping scheme |
US9198728B2 (en) | 2005-09-30 | 2015-12-01 | Intouch Technologies, Inc. | Multi-camera mobile teleconferencing platform |
US8849679B2 (en) | 2006-06-15 | 2014-09-30 | Intouch Technologies, Inc. | Remote controlled robot system that provides medical images |
US9160783B2 (en) | 2007-05-09 | 2015-10-13 | Intouch Technologies, Inc. | Robot system that operates through a network firewall |
US8446454B2 (en) * | 2007-05-21 | 2013-05-21 | Polycom, Inc. | Dynamic adaption of a continuous presence videoconferencing layout based on video content |
US7680154B2 (en) * | 2007-12-31 | 2010-03-16 | Intel Corporation | Methods and apparatus for synchronizing networked audio devices |
US9357164B2 (en) * | 2008-03-18 | 2016-05-31 | Cisco Technology, Inc. | Establishing a remotely hosted conference initiated with one button push |
US10875182B2 (en) | 2008-03-20 | 2020-12-29 | Teladoc Health, Inc. | Remote presence system mounted to operating room hardware |
US8896658B2 (en) * | 2008-04-10 | 2014-11-25 | Creative Technology Ltd | Interface for voice communications |
US8179418B2 (en) | 2008-04-14 | 2012-05-15 | Intouch Technologies, Inc. | Robotic based health care system |
US8170241B2 (en) | 2008-04-17 | 2012-05-01 | Intouch Technologies, Inc. | Mobile tele-presence system with a microphone system |
US8341533B2 (en) * | 2008-06-06 | 2012-12-25 | Microsoft Corporation | Storage and expedited retrieval of messages and responses in multi-tasking environments |
US8250147B2 (en) * | 2008-06-25 | 2012-08-21 | Microsoft Corporation | Remote call control and conferencing using paired devices |
US8862681B2 (en) | 2008-06-25 | 2014-10-14 | Microsoft Corporation | Multimodal conversation transfer |
JP4533976B2 (en) * | 2008-07-09 | 2010-09-01 | 独立行政法人 日本原子力研究開発機構 | Video conferencing system using network |
US9193065B2 (en) | 2008-07-10 | 2015-11-24 | Intouch Technologies, Inc. | Docking system for a tele-presence robot |
US9842192B2 (en) | 2008-07-11 | 2017-12-12 | Intouch Technologies, Inc. | Tele-presence robot system with multi-cast features |
US8340819B2 (en) | 2008-09-18 | 2012-12-25 | Intouch Technologies, Inc. | Mobile videoconferencing robot system with network adaptive driving |
US20100094686A1 (en) * | 2008-09-26 | 2010-04-15 | Deep Rock Drive Partners Inc. | Interactive live events |
US8347216B2 (en) * | 2008-10-01 | 2013-01-01 | Lg Electronics Inc. | Mobile terminal and video sharing method thereof |
US8514265B2 (en) * | 2008-10-02 | 2013-08-20 | Lifesize Communications, Inc. | Systems and methods for selecting videoconferencing endpoints for display in a composite video image |
US8996165B2 (en) | 2008-10-21 | 2015-03-31 | Intouch Technologies, Inc. | Telepresence robot with a camera boom |
US8463435B2 (en) | 2008-11-25 | 2013-06-11 | Intouch Technologies, Inc. | Server connectivity control for tele-presence robot |
US9138891B2 (en) | 2008-11-25 | 2015-09-22 | Intouch Technologies, Inc. | Server connectivity control for tele-presence robot |
KR101050642B1 (en) * | 2008-12-04 | 2011-07-19 | 삼성전자주식회사 | Watch phone and method of conducting call in watch phone |
US20100149301A1 (en) * | 2008-12-15 | 2010-06-17 | Microsoft Corporation | Video Conferencing Subscription Using Multiple Bit Rate Streams |
NO329739B1 (en) * | 2008-12-23 | 2010-12-13 | Tandberg Telecom As | Method, device and computer program for processing images in a conference between a plurality of video conferencing terminals |
CN101442654B (en) * | 2008-12-26 | 2012-05-23 | 华为终端有限公司 | Method, apparatus and system for switching video object of video communication |
JP5372536B2 (en) * | 2009-01-28 | 2013-12-18 | ソニー株式会社 | Information processing apparatus, information processing method, and program |
US8849680B2 (en) | 2009-01-29 | 2014-09-30 | Intouch Technologies, Inc. | Documentation through a remote presence robot |
US8938677B2 (en) | 2009-03-30 | 2015-01-20 | Avaya Inc. | System and method for mode-neutral communications with a widget-based communications metaphor |
US8897920B2 (en) | 2009-04-17 | 2014-11-25 | Intouch Technologies, Inc. | Tele-presence robot system with software modularity, projector and laser pointer |
US20110119597A1 (en) * | 2009-05-09 | 2011-05-19 | Vivu, Inc. | Method and apparatus for capability-based multimedia interactions |
US8384754B2 (en) * | 2009-06-17 | 2013-02-26 | Verizon Patent And Licensing Inc. | Method and system of providing lighting for videoconferencing |
US8929887B2 (en) * | 2009-08-20 | 2015-01-06 | T-Mobile Usa, Inc. | Shared book reading |
US8654952B2 (en) | 2009-08-20 | 2014-02-18 | T-Mobile Usa, Inc. | Shareable applications on telecommunications devices |
US11399153B2 (en) | 2009-08-26 | 2022-07-26 | Teladoc Health, Inc. | Portable telepresence apparatus |
US8384755B2 (en) | 2009-08-26 | 2013-02-26 | Intouch Technologies, Inc. | Portable remote presence robot |
US8843834B2 (en) * | 2009-08-28 | 2014-09-23 | Apple Inc. | Method and apparatus for initiating and managing chat sessions |
US8448073B2 (en) * | 2009-09-09 | 2013-05-21 | Viewplicity, Llc | Multiple camera group collaboration system and method |
JP2011070511A (en) * | 2009-09-28 | 2011-04-07 | Sony Corp | Terminal device, server device, display control method, and program |
CA2684678A1 (en) * | 2009-11-03 | 2011-05-03 | Research In Motion Limited | System and method for dynamic post-processing on a mobile device |
US8775948B2 (en) * | 2009-12-08 | 2014-07-08 | International Business Machines Corporation | Method for capturing collaborative, real-time feedback on socio-technical interactions in a virtual environment and graphically displaying the interaction patterns for later review |
US20110161836A1 (en) * | 2009-12-31 | 2011-06-30 | Ruicao Mu | System for processing and synchronizing large scale video conferencing and document sharing |
US20110173270A1 (en) * | 2010-01-11 | 2011-07-14 | Ricoh Company, Ltd. | Conferencing Apparatus And Method |
US9432372B2 (en) * | 2010-01-28 | 2016-08-30 | Adobe Systems Incorporated | Access policy based on collaboration participation |
US11154981B2 (en) | 2010-02-04 | 2021-10-26 | Teladoc Health, Inc. | Robot user interface for telepresence robot system |
US9615056B2 (en) * | 2010-02-10 | 2017-04-04 | Oovoo, Llc | System and method for video communication on mobile devices |
US8670017B2 (en) | 2010-03-04 | 2014-03-11 | Intouch Technologies, Inc. | Remote presence system including a cart that supports a robot face and an overhead camera |
US8786666B2 (en) * | 2010-04-27 | 2014-07-22 | Lifesize Communications, Inc. | Providing separate video and presentation streams to a recording server |
US20130198654A1 (en) * | 2010-04-30 | 2013-08-01 | American Teleconferencing Services, Ltd. | Systems, Methods, and Computer Programs for Controlling a Conference Interface Without Joining as a Participant |
US20130298040A1 (en) * | 2010-04-30 | 2013-11-07 | American Teleconferencing Services, Ltd. | Systems, Methods, and Computer Programs for Providing Simultaneous Online Conferences |
US10343283B2 (en) | 2010-05-24 | 2019-07-09 | Intouch Technologies, Inc. | Telepresence robot system that can be accessed by a cellular phone |
US10808882B2 (en) | 2010-05-26 | 2020-10-20 | Intouch Technologies, Inc. | Tele-robotic system with a robot face placed on a chair |
US8947492B2 (en) * | 2010-06-18 | 2015-02-03 | Microsoft Corporation | Combining multiple bit rate and scalable video coding |
US8576271B2 (en) * | 2010-06-25 | 2013-11-05 | Microsoft Corporation | Combining direct and routed communication in a video conference |
CN101877643B (en) * | 2010-06-29 | 2014-12-10 | 中兴通讯股份有限公司 | Multipoint sound-mixing distant view presenting method, device and system |
JP2012038210A (en) * | 2010-08-10 | 2012-02-23 | Sony Corp | Information processing unit, information processing method, computer program, and content display system |
US8838696B2 (en) * | 2010-09-15 | 2014-09-16 | Syniverse Technologies, Llc | Method and apparatus to provide an ecosystem for mobile video |
USD690712S1 (en) | 2010-11-29 | 2013-10-01 | Cisco Technology, Inc. | Display screen with a graphical interface |
USD684986S1 (en) | 2010-11-29 | 2013-06-25 | Cisco Technology, Inc. | Display screen with a graphical interface |
US9264664B2 (en) | 2010-12-03 | 2016-02-16 | Intouch Technologies, Inc. | Systems and methods for dynamic bandwidth allocation |
US12093036B2 (en) | 2011-01-21 | 2024-09-17 | Teladoc Health, Inc. | Telerobotic system with a dual application screen presentation |
US9323250B2 (en) | 2011-01-28 | 2016-04-26 | Intouch Technologies, Inc. | Time-dependent navigation of telepresence robots |
CN104898652B (en) | 2011-01-28 | 2018-03-13 | 英塔茨科技公司 | Mutually exchanged with a moveable tele-robotic |
US8438233B2 (en) | 2011-03-23 | 2013-05-07 | Color Labs, Inc. | Storage and distribution of content for a user device group |
US9049033B2 (en) * | 2011-03-28 | 2015-06-02 | Net Power And Light, Inc. | Information mixer and system control for attention management |
US8994779B2 (en) * | 2011-03-28 | 2015-03-31 | Net Power And Light, Inc. | Information mixer and system control for attention management |
US10769739B2 (en) | 2011-04-25 | 2020-09-08 | Intouch Technologies, Inc. | Systems and methods for management of information among medical providers and facilities |
US8786667B2 (en) | 2011-04-26 | 2014-07-22 | Lifesize Communications, Inc. | Distributed recording of a videoconference in multiple formats |
US8780166B2 (en) * | 2011-04-26 | 2014-07-15 | Lifesize Communications, Inc. | Collaborative recording of a videoconference using a recording server |
US9098611B2 (en) | 2012-11-26 | 2015-08-04 | Intouch Technologies, Inc. | Enhanced video interaction for a user interface of a telepresence network |
US20140139616A1 (en) | 2012-01-27 | 2014-05-22 | Intouch Technologies, Inc. | Enhanced Diagnostics for a Telepresence Robot |
US8949333B2 (en) * | 2011-05-20 | 2015-02-03 | Alejandro Backer | Systems and methods for virtual interactions |
US9923982B2 (en) * | 2011-06-24 | 2018-03-20 | Avaya Inc. | Method for visualizing temporal data |
US20130002532A1 (en) * | 2011-07-01 | 2013-01-03 | Nokia Corporation | Method, apparatus, and computer program product for shared synchronous viewing of content |
EP2557806B1 (en) * | 2011-08-08 | 2018-05-23 | Advanced Digital Broadcast S.A. | Method for improving channel change in a television appliance |
US9137086B1 (en) | 2011-08-25 | 2015-09-15 | Google Inc. | Social media session access |
US8473550B2 (en) * | 2011-09-21 | 2013-06-25 | Color Labs, Inc. | Content sharing using notification within a social networking environment |
CN103250198B (en) * | 2011-10-07 | 2016-08-10 | 松下知识产权经营株式会社 | Education system, teacher with information terminal, student with information terminal, integrated circuit and content display method |
US8611877B2 (en) | 2011-10-31 | 2013-12-17 | Blackberry Limited | Automatic management control of external resources |
US9020119B2 (en) | 2011-10-31 | 2015-04-28 | Blackberry Limited | Moderation control method for participants in a heterogeneous conference call |
US8605881B2 (en) | 2011-10-31 | 2013-12-10 | Blackberry Limited | Auto promotion and demotion of conference calls |
US8836751B2 (en) * | 2011-11-08 | 2014-09-16 | Intouch Technologies, Inc. | Tele-presence system with a user interface that displays different communication links |
EP2713551B1 (en) * | 2011-11-09 | 2016-02-03 | Huawei Technologies Co., Ltd. | Intercommunication method and system for multi-conference system |
US8788680B1 (en) * | 2012-01-30 | 2014-07-22 | Google Inc. | Virtual collaboration session access |
CN103248939B (en) * | 2012-02-03 | 2017-11-28 | 海尔集团公司 | A kind of method and system realized multi-screen synchronous and shown |
US8682974B2 (en) | 2012-02-24 | 2014-03-25 | Blackberry Limited | Methods and systems for pausing and resuming a meeting session |
US8902278B2 (en) | 2012-04-11 | 2014-12-02 | Intouch Technologies, Inc. | Systems and methods for visualizing and managing telepresence devices in healthcare networks |
US9251313B2 (en) | 2012-04-11 | 2016-02-02 | Intouch Technologies, Inc. | Systems and methods for visualizing and managing telepresence devices in healthcare networks |
US8970658B2 (en) | 2012-04-20 | 2015-03-03 | Logitech Europe S.A. | User interface allowing a participant to rejoin a previously left videoconference |
WO2013176758A1 (en) | 2012-05-22 | 2013-11-28 | Intouch Technologies, Inc. | Clinical workflows utilizing autonomous and semi-autonomous telemedicine devices |
US9361021B2 (en) | 2012-05-22 | 2016-06-07 | Irobot Corporation | Graphical user interfaces including touchpad driving interfaces for telemedicine devices |
US8947491B2 (en) * | 2012-06-28 | 2015-02-03 | Microsoft Corporation | Communication system |
US8477176B1 (en) | 2012-07-19 | 2013-07-02 | Google Inc. | System and method for automatically suggesting or inviting a party to join a multimedia communications session |
US8892079B1 (en) * | 2012-09-14 | 2014-11-18 | Google Inc. | Ad hoc endpoint device association for multimedia conferencing |
US9508058B2 (en) | 2012-10-15 | 2016-11-29 | Bank Of America Corporation | System providing an interactive conference |
US9001180B2 (en) * | 2012-10-15 | 2015-04-07 | Bank Of America Corporation | Multiple-participant customer service conference |
US9754320B2 (en) | 2012-10-15 | 2017-09-05 | Bank Of America Corporation | Providing a record of an interactive conference |
JP6051782B2 (en) * | 2012-10-31 | 2016-12-27 | 株式会社リコー | Communication system and program |
US9357165B2 (en) | 2012-11-16 | 2016-05-31 | At&T Intellectual Property I, Lp | Method and apparatus for providing video conferencing |
US9369670B2 (en) * | 2012-12-19 | 2016-06-14 | Rabbit, Inc. | Audio video streaming system and method |
GB2509323B (en) | 2012-12-28 | 2015-01-07 | Glide Talk Ltd | Reduced latency server-mediated audio-video communication |
US9077850B1 (en) * | 2013-01-15 | 2015-07-07 | Google Inc. | Recording multi-party video calls |
WO2014115147A1 (en) * | 2013-01-24 | 2014-07-31 | Telesofia Medical Ltd. | System and method for flexible video construction |
US9503485B1 (en) * | 2013-03-01 | 2016-11-22 | Whatsapp Inc. | Connecting communicating devices in a multi-server communication system |
US9485542B2 (en) | 2013-03-15 | 2016-11-01 | Arris Enterprises, Inc. | Method and apparatus for adding and displaying an inline reply within a video message |
US20170039867A1 (en) | 2013-03-15 | 2017-02-09 | Study Social, Inc. | Mobile video presentation, digital compositing, and streaming techniques implemented via a computer network |
US9113036B2 (en) * | 2013-07-17 | 2015-08-18 | Ebay Inc. | Methods, systems, and apparatus for providing video communications |
JP6221489B2 (en) * | 2013-08-09 | 2017-11-01 | 株式会社リコー | COMMUNICATION SYSTEM, MANAGEMENT DEVICE, COMMUNICATION METHOD, AND PROGRAM |
US9473627B2 (en) | 2013-11-08 | 2016-10-18 | Sorenson Communications, Inc. | Video endpoints and related methods for transmitting stored text to other video endpoints |
US9185211B2 (en) | 2013-11-08 | 2015-11-10 | Sorenson Communications, Inc. | Apparatuses and methods for operating a communication system in one of a tone mode and a text mode |
US9210378B2 (en) * | 2014-01-29 | 2015-12-08 | Google Inc. | Controlling access to meetings |
CN104836977B (en) * | 2014-02-10 | 2018-04-24 | 阿里巴巴集团控股有限公司 | Video communication method and system during instant messaging |
US20150244754A1 (en) * | 2014-02-25 | 2015-08-27 | Stephen Beckham, JR. | Technologies for multi-user interactive media streaming |
US10708328B2 (en) * | 2014-03-17 | 2020-07-07 | Intel Corporation | Hardware assisted media playback and capture synchronization |
US9318121B2 (en) * | 2014-04-21 | 2016-04-19 | Sony Corporation | Method and system for processing audio data of video content |
US10686940B2 (en) * | 2014-06-02 | 2020-06-16 | Lenovo (Singapore) Pte. Ltd. | Virtual conference room for telephonic conferencing |
US10305945B2 (en) | 2014-11-10 | 2019-05-28 | The Mitre Corporation | Providing survivable calling and conferencing |
CN105611219A (en) * | 2014-11-24 | 2016-05-25 | 中兴通讯股份有限公司 | Method and device for processing video conference |
US10244175B2 (en) * | 2015-03-09 | 2019-03-26 | Apple Inc. | Automatic cropping of video content |
US9912777B2 (en) * | 2015-03-10 | 2018-03-06 | Cisco Technology, Inc. | System, method, and logic for generating graphical identifiers |
CN106162039A (en) * | 2015-03-25 | 2016-11-23 | 中兴通讯股份有限公司 | A kind of method, terminal and system realizing video conferencing calling membership |
WO2016159984A1 (en) * | 2015-03-31 | 2016-10-06 | Hewlett-Packard Development Company, L.P. | Transmitting multimedia streams to users |
US10225814B2 (en) * | 2015-04-05 | 2019-03-05 | Qualcomm Incorporated | Conference audio management |
TWI604715B (en) * | 2015-05-28 | 2017-11-01 | 仁寶電腦工業股份有限公司 | Method and system for adjusting volume of conference call |
US9591141B1 (en) | 2015-08-11 | 2017-03-07 | International Business Machines Corporation | Controlling conference calls |
US9521173B1 (en) | 2015-09-29 | 2016-12-13 | Ringcentral, Inc. | System and method for managing calls |
CN105338275B (en) * | 2015-10-08 | 2018-10-26 | 天脉聚源(北京)教育科技有限公司 | A kind of method for processing video frequency and device |
US10389770B2 (en) | 2016-03-24 | 2019-08-20 | Intel Corporation | Concurrent network based collaboration sessions |
CN105812716A (en) * | 2016-04-01 | 2016-07-27 | 四川中科腾信科技有限公司 | Multi-party video conference system and method based on mobile device |
CN108111797A (en) * | 2016-11-24 | 2018-06-01 | 北京中创视讯科技有限公司 | Participant's treating method and apparatus |
CN108614829A (en) * | 2016-12-12 | 2018-10-02 | 中移(杭州)信息技术有限公司 | A kind of playback method and terminal |
US10979785B2 (en) * | 2017-01-20 | 2021-04-13 | Hanwha Techwin Co., Ltd. | Media playback apparatus and method for synchronously reproducing video and audio on a web browser |
US10193940B2 (en) | 2017-02-07 | 2019-01-29 | Microsoft Technology Licensing, Llc | Adding recorded content to an interactive timeline of a teleconference session |
US10171256B2 (en) | 2017-02-07 | 2019-01-01 | Microsoft Technology Licensing, Llc | Interactive timeline for a teleconference session |
US10070093B1 (en) | 2017-02-24 | 2018-09-04 | Microsoft Technology Licensing, Llc | Concurrent viewing of live content and recorded content |
US10367857B2 (en) * | 2017-03-30 | 2019-07-30 | International Business Machines Corporation | Managing conference-calls |
US11862302B2 (en) | 2017-04-24 | 2024-01-02 | Teladoc Health, Inc. | Automated transcription and documentation of tele-health encounters |
US10574662B2 (en) | 2017-06-20 | 2020-02-25 | Bank Of America Corporation | System for authentication of a user based on multi-factor passively acquired data |
US10360733B2 (en) | 2017-06-20 | 2019-07-23 | Bank Of America Corporation | System controlled augmented resource facility |
US10483007B2 (en) | 2017-07-25 | 2019-11-19 | Intouch Technologies, Inc. | Modular telehealth cart with thermal imaging and touch screen user interface |
US11636944B2 (en) | 2017-08-25 | 2023-04-25 | Teladoc Health, Inc. | Connectivity infrastructure for a telehealth platform |
US10205974B1 (en) | 2018-01-12 | 2019-02-12 | Ringcentral, Inc. | Systems and methods for providing shared memory pointers to a persistent video stream for use in a video communications session |
KR102051828B1 (en) | 2018-02-28 | 2020-01-08 | 주식회사 하이퍼커넥트 | Method of making video communication and device of mediating video communication |
US20190268387A1 (en) * | 2018-02-28 | 2019-08-29 | Avaya Inc. | Method and system for expanded participation in a collaboration space |
CN110324555B (en) * | 2018-03-28 | 2021-02-26 | 北京富纳特创新科技有限公司 | Video communication apparatus and method |
US11057444B1 (en) * | 2018-03-29 | 2021-07-06 | Facebook, Inc. | Systems and methods for shared broadcasting |
US10617299B2 (en) | 2018-04-27 | 2020-04-14 | Intouch Technologies, Inc. | Telehealth cart that supports a removable tablet with seamless audio/video switching |
CN109040644B (en) * | 2018-07-25 | 2020-12-04 | 成都鼎桥通信技术有限公司 | Video point calling and recording storage method and system |
US11087019B2 (en) * | 2018-08-14 | 2021-08-10 | AffectLayer, Inc. | Data compliance management in recording calls |
CN108712577B (en) * | 2018-08-28 | 2021-03-12 | 维沃移动通信有限公司 | Call mode switching method and terminal equipment |
CN109089059A (en) * | 2018-10-19 | 2018-12-25 | 北京微播视界科技有限公司 | Method, apparatus, electronic equipment and the computer storage medium that video generates |
CN111131640A (en) * | 2018-10-31 | 2020-05-08 | 苏州璨鸿光电有限公司 | Multi-party video call control system and method |
CN109994217B (en) * | 2019-03-08 | 2021-08-06 | 视联动力信息技术股份有限公司 | Method and device for checking pathological file |
US11171795B2 (en) | 2019-03-29 | 2021-11-09 | Lenovo (Singapore) Pte. Ltd. | Systems and methods to merge data streams from different conferencing platforms |
CN109889757B (en) * | 2019-03-29 | 2021-05-04 | 维沃移动通信有限公司 | Video call method and terminal equipment |
US10652655B1 (en) | 2019-04-30 | 2020-05-12 | International Business Machines Corporation | Cognitive volume and speech frequency levels adjustment |
US11308145B2 (en) * | 2019-05-23 | 2022-04-19 | Genesys Telecommunications Laboratories, Inc. | System and method for multimedia contact center interactions via an audiovisual asynchronous channel |
US10771740B1 (en) * | 2019-05-31 | 2020-09-08 | International Business Machines Corporation | Adding an individual to a video conference |
US10972301B2 (en) | 2019-06-27 | 2021-04-06 | Microsoft Technology Licensing, Llc | Displaying notifications for starting a session at a time that is different than a scheduled start time |
US11252206B2 (en) * | 2019-12-03 | 2022-02-15 | Microsoft Technology Licensing, Llc | Reducing setup time for online meetings |
EP3849175A1 (en) * | 2020-01-07 | 2021-07-14 | BenQ Corporation | Video conference system |
US11196869B2 (en) * | 2020-02-15 | 2021-12-07 | Lenovo (Singapore) Pte. Ltd. | Facilitation of two or more video conferences concurrently |
US11122321B1 (en) * | 2020-04-06 | 2021-09-14 | International Business Machines Corporation | Stream synchronization using an automated digital clapperboard |
FR3111504A1 (en) * | 2020-06-16 | 2021-12-17 | Orange | Access method and device for managing access to a secure communication session between participating communication terminals by a requesting communication terminal |
US11805227B2 (en) * | 2020-09-04 | 2023-10-31 | Mersive Technologies, Inc. | Video conferencing systems with meeting migration |
US11662975B2 (en) | 2020-10-06 | 2023-05-30 | Tencent America LLC | Method and apparatus for teleconference |
CN112272327B (en) * | 2020-10-26 | 2021-10-15 | 腾讯科技(深圳)有限公司 | Data processing method, device, storage medium and equipment |
WO2022125103A1 (en) * | 2020-12-10 | 2022-06-16 | Hewlett-Packard Development Company, L.P. | Voice conference audio data via process identifiers |
US11855793B2 (en) | 2020-12-11 | 2023-12-26 | Lenovo (Singapore) Pte. Ltd. | Graphical user interfaces for grouping video conference participants |
US12074721B2 (en) | 2021-01-29 | 2024-08-27 | Zoom Video Communications, Inc. | Merging a call with a virtual meeting |
US20220321617A1 (en) * | 2021-03-30 | 2022-10-06 | Snap Inc. | Automatically navigating between rooms within a virtual conferencing system |
CN113014850A (en) * | 2021-04-02 | 2021-06-22 | 浙江德维迪亚数字科技有限公司 | Anti-network-cut-off communication method of head-mounted computer |
US11522936B2 (en) * | 2021-04-30 | 2022-12-06 | Salesforce, Inc. | Synchronization of live streams from web-based clients |
US20220353304A1 (en) * | 2021-04-30 | 2022-11-03 | Microsoft Technology Licensing, Llc | Intelligent Agent For Auto-Summoning to Meetings |
US11457271B1 (en) | 2021-05-07 | 2022-09-27 | Microsoft Technology Licensing, Llc | Distributed utilization of computing resources for processing coordinated displays of video streams |
US12107696B2 (en) | 2021-06-29 | 2024-10-01 | International Business Machines Corporation | Virtual system for merging and splitting virtual meetings |
CN113489937B (en) * | 2021-07-02 | 2023-06-20 | 北京字跳网络技术有限公司 | Video sharing method, device, equipment and medium |
US11877130B2 (en) | 2022-03-17 | 2024-01-16 | Hewlett-Packard Development Company, L.P. | Audio controls in online conferences |
US11758083B1 (en) | 2022-03-31 | 2023-09-12 | Motorola Mobility Llc | Methods, systems, and devices for presenting demonstration objects without mirroring in a videoconference |
US11880560B1 (en) * | 2022-07-09 | 2024-01-23 | Snap Inc. | Providing bot participants within a virtual conferencing system |
US20240259451A1 (en) * | 2023-01-27 | 2024-08-01 | Zoom Video Communications, Inc. | Isolating videoconference streams |
Family Cites Families (127)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4449238A (en) * | 1982-03-25 | 1984-05-15 | Bell Telephone Laboratories, Incorporated | Voice-actuated switching system |
US4720850A (en) | 1986-03-14 | 1988-01-19 | American Telephone and Telegraph Company, AT&T Bell Laboratories | Communication system control arrangement |
US5594859A (en) * | 1992-06-03 | 1997-01-14 | Digital Equipment Corporation | Graphical user interface for video teleconferencing |
US6594688B2 (en) * | 1993-10-01 | 2003-07-15 | Collaboration Properties, Inc. | Dedicated echo canceler for a workstation |
US7185054B1 (en) * | 1993-10-01 | 2007-02-27 | Collaboration Properties, Inc. | Participant display and selection in video conference calls |
US5689641A (en) | 1993-10-01 | 1997-11-18 | Vicor, Inc. | Multimedia collaboration system arrangement for routing compressed AV signal through a participant site without decompressing the AV signal |
US5508942A (en) | 1993-11-24 | 1996-04-16 | Intel Corporation | Intra/inter decision rules for encoding and decoding video signals |
US5574934A (en) * | 1993-11-24 | 1996-11-12 | Intel Corporation | Preemptive priority-based transmission of signals using virtual channels |
US6972786B1 (en) * | 1994-12-30 | 2005-12-06 | Collaboration Properties, Inc. | Multimedia services using central office |
US5973724A (en) * | 1995-02-24 | 1999-10-26 | Apple Computer, Inc. | Merging multiple teleconferences |
US5729532A (en) * | 1995-05-26 | 1998-03-17 | Lucent Technologies Inc. | Selective participation in a multimedia communication conference call |
US5648814A (en) | 1995-09-27 | 1997-07-15 | Intel Corporation | Method and apparatus for optimizing camera function for video conferencing systems |
US6836295B1 (en) * | 1995-12-07 | 2004-12-28 | J. Carl Cooper | Audio to video timing measurement for MPEG type television systems |
US5854640A (en) * | 1996-01-02 | 1998-12-29 | Intel Corporation | Method and apparatus for byte alignment of video data in a memory of a host system |
US6898620B1 (en) * | 1996-06-07 | 2005-05-24 | Collaboration Properties, Inc. | Multiplexing video and control signals onto UTP |
US6405255B1 (en) * | 1996-07-01 | 2002-06-11 | Sun Microsystems, Inc. | Mixing and splitting multiple independent audio data streams in kernel space |
US6141033A (en) * | 1997-05-15 | 2000-10-31 | Cognex Corporation | Bandwidth reduction of multichannel images for machine vision |
US6288739B1 (en) | 1997-09-05 | 2001-09-11 | Intelect Systems Corporation | Distributed video communications system |
JP4151110B2 (en) * | 1998-05-14 | 2008-09-17 | ソニー株式会社 | Audio signal processing apparatus and audio signal reproduction apparatus |
JP2000056730A (en) * | 1998-06-05 | 2000-02-25 | Canon Inc | Device and method to form image |
US6119147A (en) | 1998-07-28 | 2000-09-12 | Fuji Xerox Co., Ltd. | Method and system for computer-mediated, multi-modal, asynchronous meetings in a virtual space |
EP1039752A4 (en) | 1998-10-09 | 2007-05-02 | Sony Corp | Communication apparatus and method |
US6233226B1 (en) * | 1998-12-14 | 2001-05-15 | Verizon Laboratories Inc. | System and method for analyzing and transmitting video over a switched network |
US6678250B1 (en) * | 1999-02-19 | 2004-01-13 | 3Com Corporation | Method and system for monitoring and management of the performance of real-time networks |
US6751228B1 (en) | 1999-03-23 | 2004-06-15 | Yamaha Corporation | Packet handler of audio data by isochronous mode |
US6549587B1 (en) * | 1999-09-20 | 2003-04-15 | Broadcom Corporation | Voice and data exchange over a packet based network with timing recovery |
US20020145610A1 (en) * | 1999-07-16 | 2002-10-10 | Steve Barilovits | Video processing engine overlay filter scaler |
EP1079511A1 (en) | 1999-08-26 | 2001-02-28 | Linak A/S | Power supply for DC-motors |
US20100145794A1 (en) | 1999-10-21 | 2010-06-10 | Sean Barnes Barger | Media Processing Engine and Ad-Per-View |
NL1016478C2 (en) * | 1999-10-28 | 2001-11-29 | Sennheiser Electronic | Device for sending two-way audio and/or video signals. |
US6675125B2 (en) * | 1999-11-29 | 2004-01-06 | Syfx | Statistics generator system and method |
US7920697B2 (en) * | 1999-12-09 | 2011-04-05 | Broadcom Corp. | Interaction between echo canceller and packet voice processing |
US20010052932A1 (en) | 1999-12-30 | 2001-12-20 | Young Robert S. | Digital film processing method and system |
US6577737B1 (en) * | 2000-02-17 | 2003-06-10 | Visteon Global Technologies, Inc. | Method of detecting a DC offset in an automotive audio system |
US6795740B1 (en) * | 2000-03-01 | 2004-09-21 | Apple Computer, Inc. | Rectifying overflow and underflow in equalized audio waveforms |
US7075919B1 (en) * | 2000-08-22 | 2006-07-11 | Cisco Technology, Inc. | System and method for providing integrated voice, video and data to customer premises over a single network |
JP2002077544A (en) | 2000-08-28 | 2002-03-15 | Nikon Corp | Image reader, medium storing its control procedure and data structure for transmitting computer program signal including control procedure while encoding |
US7236838B2 (en) * | 2000-08-29 | 2007-06-26 | Matsushita Electric Industrial Co., Ltd. | Signal processing apparatus, signal processing method, program and recording medium |
US6680745B2 (en) | 2000-11-10 | 2004-01-20 | Perceptive Network Technologies, Inc. | Videoconferencing method with tracking of face and dynamic bandwidth allocation |
US6975692B2 (en) * | 2000-12-04 | 2005-12-13 | Koninklijke Philips Electronics N.V. | Scaling of demodulated data in an interleaver memory |
ATE492076T1 (en) * | 2001-04-04 | 2011-01-15 | Quellan Inc | METHOD AND SYSTEM FOR DECODING MULTI-LEVEL SIGNALS |
US8934382B2 (en) | 2001-05-10 | 2015-01-13 | Polycom, Inc. | Conference endpoint controlling functions of a remote device |
US20020194054A1 (en) | 2001-06-18 | 2002-12-19 | Renee Frengut | Internet based qualitative research method and system |
US6782205B2 (en) * | 2001-06-25 | 2004-08-24 | Silicon Light Machines | Method and apparatus for dynamic equalization in wavelength division multiplexing |
US20030023672A1 (en) * | 2001-07-27 | 2003-01-30 | Arthur Vaysman | Voice over IP conferencing server system with resource selection based on quality of service |
US20030035371A1 (en) | 2001-07-31 | 2003-02-20 | Coke Reed | Means and apparatus for a scaleable congestion free switching system with intelligent control |
US20030025786A1 (en) * | 2001-07-31 | 2003-02-06 | Vtel Corporation | Method and system for saving and applying a video address from a video conference |
US20030043260A1 (en) * | 2001-08-29 | 2003-03-06 | Adrian Yap | Videophone answering device |
AU2002357144A1 (en) * | 2001-12-15 | 2003-06-30 | Thomson Licensing S.A. | Quality of service setup on a time reservation basis |
US6906741B2 (en) * | 2002-01-29 | 2005-06-14 | Palm, Inc. | System for and method of conferencing with a handheld computer using multiple media types |
US7046779B2 (en) * | 2002-02-15 | 2006-05-16 | Multimedia Telesys, Inc. | Video conference system and methods for use at multi-station sites |
US20040008249A1 (en) | 2002-07-10 | 2004-01-15 | Steve Nelson | Method and apparatus for controllable conference content via back-channel video interface |
US7362349B2 (en) | 2002-07-10 | 2008-04-22 | Seiko Epson Corporation | Multi-participant conference system with controllable content delivery using a client monitor back-channel |
US7317791B2 (en) * | 2002-08-08 | 2008-01-08 | International Business Machines Corporation | Apparatus and method for controlling conference call participants |
US7133934B1 (en) * | 2002-08-27 | 2006-11-07 | Mindspeed Technologies, Inc. | Adaptive error correction for communications over packet networks |
US7020207B1 (en) * | 2002-12-02 | 2006-03-28 | Hitachi, Ltd. | Video error concealment mechanism for block based video decompression |
US20040119814A1 (en) | 2002-12-20 | 2004-06-24 | Clisham Allister B. | Video conferencing system and method |
US7525918B2 (en) * | 2003-01-21 | 2009-04-28 | Broadcom Corporation | Using RTCP statistics for media system control |
US7664056B2 (en) | 2003-03-10 | 2010-02-16 | Meetrix Corporation | Media based collaboration using mixed-mode PSTN and internet networks |
US7762665B2 (en) | 2003-03-21 | 2010-07-27 | Queen's University At Kingston | Method and apparatus for communication between humans and devices |
EP1465193A1 (en) | 2003-04-04 | 2004-10-06 | Thomson Licensing S.A. | Method for synchronizing audio and video streams |
US7339982B2 (en) * | 2003-05-13 | 2008-03-04 | Agilent Technologies, Inc. | Modular, jitter-tolerant data acquisition and processing systems |
NO319422B1 (en) * | 2003-05-23 | 2005-08-08 | Tandberg Telecom As | Procedure for handling data rate changes |
US7450593B2 (en) | 2003-06-23 | 2008-11-11 | Intel Corporation | Clock difference compensation for a network |
EP1658709A1 (en) * | 2003-08-22 | 2006-05-24 | Koninklijke Philips Electronics N.V. | Backward compatible multi-carrier transmission system |
US7369699B1 (en) * | 2003-08-29 | 2008-05-06 | Apple Inc. | Methods and apparatuses for restoring color and enhancing electronic images |
ES2741016T3 (en) * | 2003-10-07 | 2020-02-07 | Librestream Tech Inc | Camera to communicate a continuous multimedia transmission to a Remote Client |
WO2005036878A1 (en) | 2003-10-08 | 2005-04-21 | Cisco Technology, Inc. | System and method for performing distributed video conferencing |
US8081205B2 (en) | 2003-10-08 | 2011-12-20 | Cisco Technology, Inc. | Dynamically switched and static multiple video streams for a multimedia conference |
US7590941B2 (en) | 2003-10-09 | 2009-09-15 | Hewlett-Packard Development Company, L.P. | Communication and collaboration system using rich media environments |
JP4251955B2 (en) * | 2003-10-15 | 2009-04-08 | パナソニック株式会社 | Audio data network device, amplifier device |
US20050099492A1 (en) * | 2003-10-30 | 2005-05-12 | Ati Technologies Inc. | Activity controlled multimedia conferencing |
US7084898B1 (en) | 2003-11-18 | 2006-08-01 | Cisco Technology, Inc. | System and method for providing video conferencing synchronization |
US8472792B2 (en) | 2003-12-08 | 2013-06-25 | Divx, Llc | Multimedia distribution system |
US7703104B1 (en) | 2004-03-12 | 2010-04-20 | West Corporation | Systems, methods, and computer-readable media for enrolling conferees for expeditied access to conferencing services |
US20050209012A1 (en) * | 2004-03-19 | 2005-09-22 | Jouppi Norman P | Illumination system |
US7567270B2 (en) * | 2004-04-22 | 2009-07-28 | Insors Integrated Communications | Audio data control |
US7552175B2 (en) * | 2004-04-30 | 2009-06-23 | Microsoft Corporation | Mechanism for controlling communication paths between conference members |
US7308476B2 (en) * | 2004-05-11 | 2007-12-11 | International Business Machines Corporation | Method and system for participant automatic re-invite and updating during conferencing |
US7668243B2 (en) | 2004-05-18 | 2010-02-23 | Texas Instruments Incorporated | Audio and video clock synchronization in a wireless network |
US7471337B2 (en) | 2004-06-09 | 2008-12-30 | Lsi Corporation | Method of audio-video synchronization |
US7433373B2 (en) | 2004-06-15 | 2008-10-07 | National Tsing Hua University | Actively Q-switched laser system using quasi-phase-matched electro-optic Q-switch |
US7400653B2 (en) | 2004-06-18 | 2008-07-15 | Dolby Laboratories Licensing Corporation | Maintaining synchronization of streaming audio and video using internet protocol |
US8190680B2 (en) | 2004-07-01 | 2012-05-29 | Netgear, Inc. | Method and system for synchronization of digital media playback |
US20060013504A1 (en) * | 2004-07-15 | 2006-01-19 | Edge Medical Devices Ltd. | Multi-resolution image enhancement |
US7526525B2 (en) * | 2004-07-22 | 2009-04-28 | International Business Machines Corporation | Method for efficiently distributing and remotely managing meeting presentations |
US20060050155A1 (en) | 2004-09-02 | 2006-03-09 | Ing Stephen S | Video camera sharing |
US7496188B2 (en) | 2004-09-20 | 2009-02-24 | International Business Machines Corporation | N-ways conference system using only participants' telephony devices without external conference server |
CA2581844A1 (en) * | 2004-09-27 | 2006-04-06 | David Coleman | Method and apparatus for remote voice-over or music production and management |
JP4411184B2 (en) | 2004-10-29 | 2010-02-10 | 株式会社ルネサステクノロジ | Broadcast station synchronization method and portable terminal |
US7400340B2 (en) | 2004-11-15 | 2008-07-15 | Starent Networks, Corp. | Data mixer for portable communications devices |
JP4434973B2 (en) | 2005-01-24 | 2010-03-17 | 株式会社東芝 | Video display device, video composition distribution device, program, system and method |
US20060170790A1 (en) * | 2005-01-31 | 2006-08-03 | Richard Turley | Method and apparatus for exposure correction in a digital imaging device |
US7672742B2 (en) | 2005-02-16 | 2010-03-02 | Adaptec, Inc. | Method and system for reducing audio latency |
US8290181B2 (en) * | 2005-03-19 | 2012-10-16 | Microsoft Corporation | Automatic audio gain control for concurrent capture applications |
US7692682B2 (en) * | 2005-04-28 | 2010-04-06 | Apple Inc. | Video encoding in a video conference |
US7817180B2 (en) * | 2005-04-28 | 2010-10-19 | Apple Inc. | Video processing in a multi-participant video conference |
US7576766B2 (en) | 2005-06-30 | 2009-08-18 | Microsoft Corporation | Normalized images for cameras |
US8614732B2 (en) * | 2005-08-24 | 2013-12-24 | Cisco Technology, Inc. | System and method for performing distributed multipoint video conferencing |
US20070047731A1 (en) * | 2005-08-31 | 2007-03-01 | Acoustic Technologies, Inc. | Clipping detector for echo cancellation |
US8335576B1 (en) | 2005-09-22 | 2012-12-18 | Teradici Corporation | Methods and apparatus for bridging an audio controller |
TWI280800B (en) * | 2005-10-12 | 2007-05-01 | Benq Corp | System for video conference, proxy and method thereof |
US8483098B2 (en) | 2005-11-29 | 2013-07-09 | Cisco Technology, Inc. | Method and apparatus for conference spanning |
US8214516B2 (en) | 2006-01-06 | 2012-07-03 | Google Inc. | Dynamic media serving infrastructure |
US8125509B2 (en) | 2006-01-24 | 2012-02-28 | Lifesize Communications, Inc. | Facial recognition for a videoconference |
KR101240261B1 (en) * | 2006-02-07 | 2013-03-07 | 엘지전자 주식회사 | The apparatus and method for image communication of mobile communication terminal |
US20070203596A1 (en) * | 2006-02-28 | 2007-08-30 | Accel Semiconductor Corporation | Fm transmission |
US7800642B2 (en) | 2006-03-01 | 2010-09-21 | Polycom, Inc. | Method and system for providing continuous presence video in a cascading conference |
EP1997315B1 (en) * | 2006-03-17 | 2016-02-10 | Sony Corporation | System and method for organizing group content presentations and group communications during the same |
US7822811B2 (en) | 2006-06-16 | 2010-10-26 | Microsoft Corporation | Performance enhancements for video conferencing |
TWI314424B (en) * | 2006-06-23 | 2009-09-01 | Marketech Int Corp | System and method for image signal contrast adjustment and overflow compensation |
US20080016156A1 (en) | 2006-07-13 | 2008-01-17 | Sean Miceli | Large Scale Real-Time Presentation of a Network Conference Having a Plurality of Conference Participants |
US8670537B2 (en) * | 2006-07-31 | 2014-03-11 | Cisco Technology, Inc. | Adjusting audio volume in a conference call environment |
US8432834B2 (en) | 2006-08-08 | 2013-04-30 | Cisco Technology, Inc. | System for disambiguating voice collisions |
TWI323134B (en) | 2006-08-16 | 2010-04-01 | Quanta Comp Inc | Image processing apparatus and method of the same |
TWI333383B (en) | 2006-09-25 | 2010-11-11 | Ind Tech Res Inst | Color adjustment circuit, digital color adjustment device and a multimedia apparatus using the same |
US7847815B2 (en) | 2006-10-11 | 2010-12-07 | Cisco Technology, Inc. | Interaction based on facial recognition of conference participants |
US20080100694A1 (en) | 2006-10-27 | 2008-05-01 | Microsoft Corporation | Distributed caching for multimedia conference calls |
US8885298B2 (en) | 2006-11-22 | 2014-11-11 | Microsoft Corporation | Conference roll call |
US8144186B2 (en) | 2007-03-09 | 2012-03-27 | Polycom, Inc. | Appearance matching for videoconferencing |
US8212856B2 (en) | 2007-05-15 | 2012-07-03 | Radvision Ltd. | Methods, media, and devices for providing visual resources of video conference participants |
US8102836B2 (en) | 2007-05-23 | 2012-01-24 | Broadcom Corporation | Synchronization of a split audio, video, or other data stream with separate sinks |
US8060366B1 (en) | 2007-07-17 | 2011-11-15 | West Corporation | System, method, and computer-readable medium for verbal control of a conference call |
US8243119B2 (en) | 2007-09-30 | 2012-08-14 | Optical Fusion Inc. | Recording and videomail for video conferencing call systems |
US8954178B2 (en) | 2007-09-30 | 2015-02-10 | Optical Fusion, Inc. | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US20090210491A1 (en) | 2008-02-20 | 2009-08-20 | Microsoft Corporation | Techniques to automatically identify participants for a multimedia conference event |
US8213503B2 (en) | 2008-09-05 | 2012-07-03 | Microsoft Corporation | Skip modes for inter-layer residual video coding and decoding |
- 2008
  - 2008-05-30 US US12/130,787 (US8243119B2): Active
  - 2008-05-30 US US12/130,998 (US9060094B2): Active
  - 2008-06-02 US US12/131,749 (US8881029B2): Active
  - 2008-09-29 WO PCT/US2008/078196 (WO2009045971A1): Application Filing
  - 2008-09-29 WO PCT/US2008/078206 (WO2009045974A1): Application Filing
  - 2008-09-29 WO PCT/US2008/078203 (WO2009045973A2): Application Filing
  - 2008-09-30 US US12/242,358 (US8583268B2): Active
  - 2008-09-30 WO PCT/US2008/078326 (WO2009046027A1): Application Filing
- 2012
  - 2012-09-06 US US13/605,741 (US20120327181A1): Abandoned
  - 2012-10-05 US US13/646,395 (US8700195B2): Active
- 2014
  - 2014-03-04 US US14/196,703 (US9654537B2): Active
  - 2014-10-10 US US14/511,369 (US9742830B2): Active
- 2015
  - 2015-06-01 US US14/727,585 (US10097611B2): Active
- 2018
  - 2018-09-27 US US16/143,879 (US10880352B2): Active
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9654537B2 (en) | 2007-09-30 | 2017-05-16 | Optical Fusion, Inc. | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US20130051578A1 (en) * | 2011-08-31 | 2013-02-28 | Yuan-Jih Chu | Network signal receiving system and network signal receiving method |
US8983091B2 (en) * | 2011-08-31 | 2015-03-17 | Realtek Semiconductor Corp. | Network signal receiving system and network signal receiving method |
WO2014137981A1 (en) * | 2013-03-04 | 2014-09-12 | Janus Technologies, Inc. | Securing computer video and audio subsystems |
US9232176B2 (en) | 2013-03-04 | 2016-01-05 | Janus Technologies, Inc. | Method and apparatus for securing computer video and audio subsystems |
US10489657B2 (en) | 2013-03-04 | 2019-11-26 | Janus Technologies, Inc. | Method and apparatus for securing computer video and audio subsystems |
US20160094603A1 (en) * | 2014-09-29 | 2016-03-31 | Wistron Corporation | Audio and video sharing method and system |
Also Published As
Publication number | Publication date |
---|---|
WO2009046027A1 (en) | 2009-04-09 |
WO2009045973A2 (en) | 2009-04-09 |
US20090088880A1 (en) | 2009-04-02 |
US8583268B2 (en) | 2013-11-12 |
US8881029B2 (en) | 2014-11-04 |
WO2009045971A1 (en) | 2009-04-09 |
US8243119B2 (en) | 2012-08-14 |
US10880352B2 (en) | 2020-12-29 |
US20090089683A1 (en) | 2009-04-02 |
US20090086013A1 (en) | 2009-04-02 |
US20090086012A1 (en) | 2009-04-02 |
US8700195B2 (en) | 2014-04-15 |
WO2009045974A1 (en) | 2009-04-09 |
US9654537B2 (en) | 2017-05-16 |
US20150264102A1 (en) | 2015-09-17 |
US10097611B2 (en) | 2018-10-09 |
US9742830B2 (en) | 2017-08-22 |
WO2009045973A3 (en) | 2009-12-30 |
US20140184735A1 (en) | 2014-07-03 |
US9060094B2 (en) | 2015-06-16 |
US20190052690A1 (en) | 2019-02-14 |
US20130027507A1 (en) | 2013-01-31 |
US20150022625A1 (en) | 2015-01-22 |
Similar Documents
Publication | Title |
---|---|
US9654537B2 (en) | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US8954178B2 (en) | Synchronization and mixing of audio and video streams in network-based video conferencing call systems |
US7133362B2 (en) | Intelligent buffering process for network conference video |
EP1773072A1 (en) | Synchronization watermarking in multimedia streams |
US9591262B2 (en) | Flow-control based switched group video chat and real-time interactive broadcast |
RU2392753C2 (en) | Method for sending instructions to device not to carryout synchronisation or delay synchronisation of multimedia streams |
KR101354793B1 (en) | Synchronizing media streams across multiple devices |
US7558221B2 (en) | Method and system for recording videoconference data |
US20170134831A1 (en) | Flow Controlled Based Synchronized Playback of Recorded Media |
US9143810B2 (en) | Method for manually optimizing jitter, delay and synch levels in audio-video transmission |
US20050262260A1 (en) | Two-way audio/video conference system |
EP2728830A1 (en) | Method and system for synchronizing audio and video streams in media relay conferencing |
Bielievtsov et al. | Network Technology for Transmission of Visual Information. |
US8571189B2 (en) | Efficient transmission of audio and non-audio portions of a communication session for phones |
US7769035B1 (en) | Facilitating a channel change between multiple multimedia data streams |
Loh et al. | Experience with implementation of multi-party video-conferencing application over packet networks |
Georganas | Synchronization issues in multimedia presentational and conversational applications |
KR101074073B1 (en) | A buffer structure for storing multimedia data, and a buffering method |
Chattopadhyay et al. | A low cost multiparty H.264 based video conference solution for corporate environment |
Rudkin et al. | REAL-TIME APPLICATIONS |
Igor Bokun et al. | The MECCANO Internet Multimedia Conferencing Architecture |
Mandowara | Live video streaming for handheld devices over an ad hoc network |
You et al. | HCP6: A high-quality conferencing platform based on IPv6 multicast |
Mohd Rashidi | DEVELOPMENT OF VIDEO CONFERENCE USING JMF |
Legal Events
Code | Title | Description |
---|---|---|
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |